Apache Cassandra 4.0 Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • Experience with database management

Audience

  • System administrators
  • Developers

Overview

Apache Cassandra is an open source NoSQL database management system for handling large volumes of data across multiple servers and clusters. The Cassandra 4.0 release introduces new features and improvements that focus on speed, scalability, and performance.

This instructor-led, live training (online or onsite) is aimed at system administrators and developers who wish to use Cassandra 4.0 to create, build, and manage large-scale databases with high availability and fewer failures.

By the end of this training, participants will be able to:

  • Install and configure Apache Cassandra.
  • Migrate from earlier versions to Cassandra 4.0.
  • Understand the new features in Cassandra 4.0.
  • Build, manage, and scale high-performing databases.
  • Monitor table activities and optimize internode communications.
  • Use zero copy streaming to quickly transfer data across a distributed database.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request customized training for this course, please contact us.

Course Outline

Introduction

  • What’s new in Cassandra 4.0?
  • Cassandra basics and core concepts
  • Migrating to Cassandra 4.0

Getting Started

  • Installing and configuring Cassandra
  • Using Cassandra Query Language (CQL)
  • Choosing a client driver
  • Production recommendations

Data Modeling and Management

  • Concepts, examples, and tools
  • Defining application queries
  • Physical and logical data modeling
  • Refining table designs
  • Defining database schema
  • Working with virtual tables

Administration

  • Configuring audit logging
  • Enabling full query logging
  • Optimizations and tuning
  • Enabling transient replication
  • Implementing zero copy streaming

Troubleshooting

Summary and Next Steps

Cassandra for Developers – Bespoke Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

  • Comfortable with the Java programming language
  • Comfortable in a Linux environment (navigating the command line, editing files with vi / nano)

Lab environment:

A working Cassandra environment will be provided for students. Students will need an SSH client and a browser to access the cluster.

Zero install: there is no need to install Cassandra on students’ machines.

Overview

This course introduces Cassandra, a popular NoSQL database. It covers Cassandra principles, architecture, and data model. Students learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. The course also discusses Cassandra internals and some administration topics.

Duration: 3 days

Audience: Developers

Course Outline

  • Section 1: Introduction to Big Data / NoSQL
    • NoSQL overview
    • CAP theorem
    • When is NoSQL appropriate
    • Columnar storage
    • NoSQL ecosystem
  • Section 2 : Cassandra Basics
    • Design and architecture
    • Cassandra nodes, clusters, datacenters
    • Keyspaces, tables, rows and columns
    • Partitioning, replication, tokens
    • Quorum and consistency levels
    • Labs: interacting with Cassandra using cqlsh
  • Section 3: Data Modeling – part 1
    • Introduction to CQL
    • CQL data types
    • Creating keyspaces and tables
    • Choosing columns and types
    • Choosing primary keys
    • Data layout for rows and columns
    • Time to live (TTL)
    • Querying with CQL
    • CQL updates
    • Collections (list / map / set)
    • Labs : various data modeling exercises using CQL ; experimenting with queries and supported data types
  • Section 4: Data Modeling – part 2
    • Creating and using secondary indexes
    • composite keys (partition keys and clustering keys)
    • Time series data
    • Best practices for time series data
    • Counters
    • Lightweight transactions (LWT)
    • Labs : creating and using indexes;  modeling time series data
  • Section 5: Data Modeling Labs: Group design session
    • Multiple use cases from various domains are presented
    • Students work in groups to come up with designs and models
    • Discuss the various designs and analyze the decisions behind them
    • Lab: implement one of the scenarios
  • Section 6: Cassandra drivers
    • Introduction to Java driver
    • CRUD (Create / Read / Update / Delete) operations using the Java client
    • Asynchronous queries
    • Labs : using Java API for Cassandra
  • Section 7 : Cassandra Internals
    • understand Cassandra design under the hood
    • sstables, memtables, commit log
    • read path / write path
    • caching
    • vnodes
  • Section 8: Administration
    • Hardware selection
    • Cassandra distributions
    • Installing Cassandra
    • Running benchmarks
    • Tooling for monitoring performance and node activities
      • DataStax OpsCenter
    • Diagnosing Cassandra performance issues
    • Investigating a node crash
    • Understanding data repair, deletion and replication
    • Other troubleshooting tools and tips
    • Cassandra best practices (compaction, garbage collection)
  • Section 9:  Bonus Lab (time permitting)
    • Implement a music service like Pandora / Spotify on Cassandra
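
The "Quorum and consistency levels" topic in Section 2 rests on a small piece of arithmetic that can be sketched in a few lines. The following Python is only an illustration, not course material: the function names `quorum` and `overlaps` are hypothetical helpers introduced here. It assumes the usual definition of a quorum as a majority of replicas, and the usual rule that a read and a write are guaranteed to share a replica when the read and write replica counts together exceed the replication factor.

```python
def quorum(replication_factor: int) -> int:
    """Replicas that must respond for a majority (QUORUM) operation."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # A majority is more than half of the replicas.
    return replication_factor // 2 + 1


def overlaps(read_replicas: int, write_replicas: int, replication_factor: int) -> bool:
    """True when any read set must share at least one replica with any write set."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    for rf in (1, 3, 5):
        q = quorum(rf)
        # rf - q replicas can be unavailable while a quorum is still reachable.
        print(f"RF={rf}: QUORUM={q}, tolerates {rf - q} replica(s) down")
```

With a replication factor of 3, reading and writing at quorum (2 + 2 > 3) overlap, while reading and writing against a single replica each (1 + 1) do not.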

Cassandra Administration Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • Comfortable in a Linux environment (navigating the command line, editing files with vi / nano)
  • For on-site courses, a laptop or desktop with 8 GB of RAM
  • For remote courses, a working Cassandra lab will be provided, and nothing is needed except a web browser

Overview

This course introduces Cassandra, a popular NoSQL database. It covers Cassandra principles, architecture, and data model. Students learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. The course also discusses Cassandra internals and some administration topics.

Course Outline

  • Section 1: Introduction to Big Data / NoSQL
    • NoSQL overview
    • CAP theorem
    • When is NoSQL appropriate
    • Columnar storage
    • NoSQL ecosystem
  • Section 2 : Cassandra Basics
    • Design and architecture
    • Cassandra nodes, clusters, datacenters
    • Keyspaces, tables, rows and columns
    • Partitioning, replication, tokens
    • Quorum and consistency levels
    • Labs: interacting with Cassandra using cqlsh
  • Section 3: Data Modeling – part 1
    • Introduction to CQL
    • CQL data types
    • Creating keyspaces and tables
    • Choosing columns and types
    • Choosing primary keys
    • Data layout for rows and columns
    • Time to live (TTL)
    • Querying with CQL
    • CQL updates
    • Collections (list / map / set)
    • Labs : various data modeling exercises using CQL ; experimenting with queries and supported data types
  • Section 4: Data Modeling – part 2
    • Creating and using secondary indexes
    • composite keys (partition keys and clustering keys)
    • Time series data
    • Best practices for time series data
    • Counters
    • Lightweight transactions (LWT)
    • Labs : creating and using indexes;  modeling time series data
  • Section 5 : Cassandra Internals
    • understand Cassandra design under the hood
    • sstables, memtables, commit log
  • Section 6: Administration
    • Hardware selection
    • Cassandra distributions
    • Cassandra Nodes Communication
    • Writing and Reading data to/from the storage engine
    • Data directories
    • Anti-entropy operations
    • Cassandra Compaction
    • Choosing and Implementing compaction strategies
    • Cassandra best practices (compaction, garbage collection)
    • Creating a test Cassandra instance with low memory footprint
    • Troubleshooting tools and tips
    • Lab : students install Cassandra, run benchmarks

Cassandra for Developers Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

  • Comfortable with the Java programming language
  • Comfortable in a Linux environment (navigating the command line, editing files with vi / nano)

Overview

This course introduces Cassandra, a popular NoSQL database. It covers Cassandra principles, architecture, and data model. Students learn data modeling in CQL (Cassandra Query Language) in hands-on, interactive labs. The course also discusses Cassandra internals and some administration topics.

Audience: Developers

Course Outline

  • Section 1: Introduction to Big Data / NoSQL
    • NoSQL overview
    • CAP theorem
    • When is NoSQL appropriate
    • Columnar storage
    • NoSQL ecosystem
  • Section 2 : Cassandra Basics
    • Design and architecture
    • Cassandra nodes, clusters, datacenters
    • Keyspaces, tables, rows and columns
    • Partitioning, replication, tokens
    • Quorum and consistency levels
    • Labs: interacting with Cassandra using cqlsh
  • Section 3: Data Modeling – part 1
    • Introduction to CQL
    • CQL data types
    • Creating keyspaces and tables
    • Choosing columns and types
    • Choosing primary keys
    • Data layout for rows and columns
    • Time to live (TTL)
    • Querying with CQL
    • CQL updates
    • Collections (list / map / set)
    • Labs : various data modeling exercises using CQL ; experimenting with queries and supported data types
  • Section 4: Data Modeling – part 2
    • Creating and using secondary indexes
    • composite keys (partition keys and clustering keys)
    • Time series data
    • Best practices for time series data
    • Counters
    • Lightweight transactions (LWT)
    • Labs : creating and using indexes;  modeling time series data
  • Section 5: Data Modeling Labs: Group design session
    • Multiple use cases from various domains are presented
    • Students work in groups to come up with designs and models
    • Discuss the various designs and analyze the decisions behind them
    • Lab: implement one of the scenarios
  • Section 6: Cassandra drivers
    • Introduction to Java driver
    • CRUD (Create / Read / Update / Delete) operations using the Java client
    • Asynchronous queries
    • Labs : using Java API for Cassandra
  • Section 7 : Cassandra Internals
    • understand Cassandra design under the hood
    • sstables, memtables, commit log
    • read path / write path
    • caching
    • vnodes
  • Section 8: Administration
    • Hardware selection
    • Cassandra distributions
    • Cassandra best practices (compaction, garbage collection)
    • Troubleshooting tools and tips
    • Lab: students install Cassandra and run benchmarks
  • Section 9:  Bonus Lab (time permitting)
    • Implement a music service like Pandora / Spotify on Cassandra
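
As a taste of the "Best practices for time series data" topic in Section 4, one commonly used idea is to bucket readings by time so that no single partition grows without bound. The Python below is a minimal sketch under stated assumptions, not the course's lab code: the names `day_bucket` and `partition_key`, the sensor identifier, and the choice of one bucket per UTC day are all hypothetical and would be tuned to the actual write rate.

```python
from datetime import datetime, timezone
from typing import Tuple


def day_bucket(ts: datetime) -> str:
    """Return the UTC calendar day of a timezone-aware timestamp as YYYY-MM-DD."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(sensor_id: str, ts: datetime) -> Tuple[str, str]:
    """Composite partition key: readings from one sensor on one day share a partition."""
    return (sensor_id, day_bucket(ts))


if __name__ == "__main__":
    reading_time = datetime(2024, 1, 2, 23, 59, tzinfo=timezone.utc)
    print(partition_key("sensor-1", reading_time))
```

In a table design, the two tuple elements would correspond to the partition key columns, with the full timestamp kept as a clustering column so that rows within a day stay ordered by time.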