Duration
28 hours (usually 4 days including breaks)
Requirements
- Experience with Linux, relational database systems, and SQL platforms
- Experience with Scala, Java, or Python programming
Overview
MemSQL is an in-memory, distributed SQL database management system for cloud and on-premises deployments. It serves as a real-time data warehouse that supports analysis of both live and historical data.
In this instructor-led, live training, participants will learn the essentials of MemSQL for development and administration.
By the end of this training, participants will be able to:
- Understand the key concepts and characteristics of MemSQL
- Install, design, maintain, and operate MemSQL
- Optimize schemas in MemSQL
- Improve queries in MemSQL
- Benchmark performance in MemSQL
- Build real-time data applications using MemSQL
Audience
- Developers
- Administrators
- Operations Engineers
Format of the course
- Part lecture, part discussion, with exercises and heavy hands-on practice
Course Outline
Introduction
Overview of MemSQL
Understanding the MemSQL Architecture
Quick Start with MemSQL Using MemSQL Ops
Understanding Essential MemSQL Concepts
- Overview of MemSQL Commands
- Working with Rowstore and Columnstore
- Implementing Data Distribution
- Using Shard Keys
- Implementing Distributed Joins
- Using Reference Tables
- Understanding Application Cluster Topologies
Installing and Upgrading MemSQL
- Designing a Cluster
- Doing Manual Installation
- Expanding a Cluster
- Implementing an Upgrade
- Securing MemSQL
Working with Schema Design and Query Optimization
- Working with Transactions
- Working with Geospatial Data
- Understanding Index Types
- Using Sparsity and Normalized Forms
- Hands-on: Using a Reference Table to Query JSON with Varying Array Lengths
- Working with Shard Key Strategies
- Identifying a Sharding Strategy
- Understanding Analyze, Explain, and Profile
- Implementing Schema Optimization for Query Performance
- Using Query Hints
Diving Deep into Administering MemSQL Operations
- Using the MemSQL Ops Command Line Interface
- Administering a Cluster
- Understanding Administrator Key Concepts
- Backing Up and Restoring Data
- Scaling Cluster Size
- Dealing with Cluster Failures
- Managing High Availability
- Monitoring MemSQL
- Working with the Trace Log
- Using Durability and Recovery
- Running Diagnostics
Working with MemSQL Procedural SQL (MPSQL)
- Using Table-Valued Functions
- Using User-Defined Functions
- Using User-Defined Aggregate Functions
- Using Stored Procedures
Implementing Performance Benchmarking and Fine-Tuning
- Using Experimental Metrics
- Performance Testing with dbbench
- Hands-on: Working with a Database Workload Generator
- Using Management Views
- Implementing Workload Profiling
- Hands-on: MemSQL Top
Working with MemSQL Pipelines and Real-Time Data Ingestion
- Using the MemSQL Connector for Apache Spark
- Using MemSQL Pipelines with Apache Kafka and AWS S3
Creating Real-Time Applications
- Working with Business Intelligence Dashboards
- Using MemSQL Pipelines for Machine Learning
- Implementing a Real-Time Dashboard
- Implementing Predictive Analytics
Troubleshooting MemSQL
Summary and Conclusion