Duration
7 hours (usually 1 day including breaks)
Requirements
- Experience with Python and Apache Kafka
- Familiarity with stream-processing platforms
Audience
- Data engineers
- Data scientists
- Programmers
Overview
Apache Spark Streaming is a scalable, open-source stream processing system that enables fault-tolerant processing of real-time data from supported sources such as Kafka.
This instructor-led, live training (online or onsite) is aimed at data engineers, data scientists, and programmers who wish to use Spark Streaming features in processing and analyzing real-time data.
By the end of this training, participants will be able to use Spark Streaming to process live data streams and push the results to databases, filesystems, and live dashboards.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request a customized training for this course, please contact us to arrange it.
Course Outline
Introduction
Overview of Spark Streaming Features and Architecture
- Supported data sources
- Core APIs
Preparing the Environment
- Dependencies
- Spark and streaming context
- Connecting to Kafka
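A minimal sketch of this setup using Spark's Java DStream API and the spark-streaming-kafka-0-10 integration is shown below; the broker address, topic name, and group id are placeholders, and participants working in Python will find the same concepts in PySpark.

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaInputDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import org.apache.spark.streaming.kafka010.ConsumerStrategies;
import org.apache.spark.streaming.kafka010.KafkaUtils;
import org.apache.spark.streaming.kafka010.LocationStrategies;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class StreamingSetup {
    public static void main(String[] args) throws Exception {
        // Spark configuration and a streaming context with a 5-second batch interval
        SparkConf conf = new SparkConf().setAppName("kafka-stream-demo").setMaster("local[2]");
        JavaStreamingContext ssc = new JavaStreamingContext(conf, Durations.seconds(5));

        // Kafka consumer settings; broker address and group id are placeholders
        Map<String, Object> kafkaParams = new HashMap<>();
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        kafkaParams.put("key.deserializer", StringDeserializer.class);
        kafkaParams.put("value.deserializer", StringDeserializer.class);
        kafkaParams.put("group.id", "demo-group");
        kafkaParams.put("auto.offset.reset", "latest");

        // Direct stream from a hypothetical "events" topic
        JavaInputDStream<ConsumerRecord<String, String>> stream =
            KafkaUtils.createDirectStream(
                ssc,
                LocationStrategies.PreferConsistent(),
                ConsumerStrategies.<String, String>Subscribe(Arrays.asList("events"), kafkaParams));

        JavaDStream<String> messages = stream.map(ConsumerRecord::value);
        messages.print();

        ssc.start();
        ssc.awaitTermination();
    }
}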
Processing Messages
- Parsing inbound messages as JSON
- ETL processes
- Starting the streaming context
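Continuing from the setup sketch above, message processing might look like the following: each inbound value is parsed as JSON with Jackson, malformed records and records missing the expected fields (the "user" and "amount" names are illustrative) are dropped, and the rest are flattened into simple strings.

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.apache.spark.streaming.api.java.JavaDStream;

public class MessageProcessing {

    // Parse raw Kafka values as JSON and keep only well-formed, complete records
    public static JavaDStream<String> parseAndClean(JavaDStream<String> messages) {
        return messages.map(value -> {
            try {
                JsonNode node = new ObjectMapper().readTree(value);
                if (node.has("user") && node.has("amount")) {
                    return node.get("user").asText() + "," + node.get("amount").asDouble();
                }
            } catch (Exception ignored) {
                // malformed JSON falls through and is filtered out below
            }
            return null;
        }).filter(line -> line != null);
    }
}

The cleaned stream replaces the raw one in the setup sketch; the streaming context is started with ssc.start() and ssc.awaitTermination() only after the whole pipeline has been declared.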
Performing Windowed Stream Processing
- Slide interval
- Checkpoint delivery configuration
- Launching the environment
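Building on the cleaned stream above, a windowed count might be sketched as follows: events are counted per key over a 60-second window that slides every 10 seconds, and the incremental form of reduceByKeyAndWindow used here requires a checkpoint directory (the path shown is a placeholder).

import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaPairDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;
import scala.Tuple2;

public class WindowedProcessing {

    public static JavaPairDStream<String, Long> windowedCounts(
            JavaStreamingContext ssc, JavaDStream<String> cleaned) {

        // Required for the incremental (inverse-function) window reduction below
        ssc.checkpoint("/tmp/spark-checkpoints");

        return cleaned
            .mapToPair(line -> new Tuple2<>(line.split(",")[0], 1L))
            .reduceByKeyAndWindow(
                (a, b) -> a + b,            // add counts entering the window
                (a, b) -> a - b,            // subtract counts leaving the window
                Durations.seconds(60),      // window length
                Durations.seconds(10));     // slide interval
    }
}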
Prototyping the Processing Code
- Connecting to a Kafka topic
- Retrieving JSON from a data source using Paw
- Variations and additional processing
Streaming the Code
- Job control variables
- Defining values to match
- Functions and conditions
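One possible shape for this step is sketched below: a hypothetical set of job-control values and a predicate function that split the cleaned stream into matched and non-matched substreams.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.apache.spark.streaming.api.java.JavaDStream;

public class Matching {

    // Values to match; in a real job these would come from configuration or a broadcast variable
    private static final Set<String> USERS_OF_INTEREST = new HashSet<>(Arrays.asList("alice", "bob"));

    // The matching condition applied to each cleaned "user,amount" record
    private static boolean matches(String line) {
        return USERS_OF_INTEREST.contains(line.split(",")[0]);
    }

    public static JavaDStream<String> matched(JavaDStream<String> cleaned) {
        return cleaned.filter(Matching::matches);
    }

    public static JavaDStream<String> unmatched(JavaDStream<String> cleaned) {
        return cleaned.filter(line -> !matches(line));
    }
}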
Acquiring Stream Output
- Counters
- Kafka output (matched and non-matched)
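A sketch of the output stage, assuming the matched and non-matched streams from the previous step: a Spark accumulator counts the records in each batch, and every partition writes its records back to a Kafka topic. Topic names and the broker address are placeholders.

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.util.LongAccumulator;

public class StreamOutput {

    public static void sendToKafka(JavaDStream<String> stream, String topic, LongAccumulator counter) {
        stream.foreachRDD(rdd -> {
            counter.add(rdd.count());          // counter is updated on the driver
            rdd.foreachPartition(records -> {
                Properties props = new Properties();
                props.put("bootstrap.servers", "localhost:9092");
                props.put("key.serializer", StringSerializer.class.getName());
                props.put("value.serializer", StringSerializer.class.getName());
                // One producer per partition keeps the sketch simple; production code would reuse producers
                try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                    while (records.hasNext()) {
                        producer.send(new ProducerRecord<>(topic, records.next()));
                    }
                }
            });
        });
    }
}

The helper would be called once for the matched stream and once for the non-matched one, with accumulators obtained on the driver via ssc.sparkContext().sc().longAccumulator(...).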
Troubleshooting
Summary and Conclusion
Duration
7 hours (usually 1 day including breaks)
Requirements
- An understanding of Apache Kafka
- Java programming experience
Overview
Kafka Streams is a client-side library for building applications and microservices whose data is passed to and from a Kafka messaging system. Traditionally, Apache Kafka has relied on Apache Spark or Apache Storm to process data between message producers and consumers. By calling the Kafka Streams API from within an application, data can be processed directly within Kafka, eliminating the need to send it to a separate cluster for processing.
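As a rough illustration of that point, the sketch below is a minimal Kafka Streams application that reads from one topic, transforms each value, and writes to another, entirely inside the application process; the application id, broker address, and topic names are placeholders.

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseApp {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // Read from an input topic, transform each value, and write to an output topic
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> input = builder.stream("input-topic");
        input.mapValues(value -> value.toUpperCase()).to("output-topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}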
In this instructor-led, live training, participants will learn how to integrate Kafka Streams into a set of sample Java applications that pass data to and from Apache Kafka for stream processing.
By the end of this training, participants will be able to:
- Understand Kafka Streams features and advantages over other stream processing frameworks
- Process stream data directly within a Kafka cluster
- Write a Java or Scala application or microservice that integrates with Kafka and Kafka Streams
- Write concise code that transforms input Kafka topics into output Kafka topics
- Build, package and deploy the application
Audience
Format of the Course
- Part lecture, part discussion, exercises and heavy hands-on practice
Notes
- To request a customized training for this course, please contact us to arrange it.
Course Outline
Introduction
- Kafka Streams vs Spark, Flink, and Storm
Overview of Kafka Streams Features
- Stateful and stateless processing, event-time processing, the DSL, event-time-based windowing operations, etc.
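For example, event-time windowing in the DSL can be sketched as a per-key count over tumbling 5-minute windows; the topic name and window size below are illustrative, and default String serdes are assumed as in the earlier sketch.

import java.time.Duration;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KTable;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.kstream.Windowed;

public class WindowedCounts {

    // Counts events per key over tumbling 5-minute windows, based on record timestamps (event time by default)
    public static KTable<Windowed<String>, Long> build(StreamsBuilder builder) {
        return builder.<String, String>stream("events")
            .groupByKey()
            .windowedBy(TimeWindows.ofSizeWithNoGrace(Duration.ofMinutes(5)))  // Kafka 3.x; older releases use TimeWindows.of(...)
            .count();
    }
}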
Case Study: Kafka Streams API for Predictive Budgeting
Setting up the Development Environment
Creating a Streams Application
Starting the Kafka Cluster
Preparing the Topics and Input Data
Options for Processing Stream Data
- High-level Kafka Streams DSL
- Lower-level Processor API
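Where the DSL is not flexible enough, the lower-level Processor API gives per-record control. A rough sketch using the org.apache.kafka.streams.processor.api package from recent Kafka releases (the class name and filtering logic are illustrative):

import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;

// Forwards only records with non-empty values to downstream nodes
public class NonEmptyValueProcessor implements Processor<String, String, String, String> {

    private ProcessorContext<String, String> context;

    @Override
    public void init(ProcessorContext<String, String> context) {
        this.context = context;   // access to forwarding, state stores, and punctuators
    }

    @Override
    public void process(Record<String, String> record) {
        if (record.value() != null && !record.value().isEmpty()) {
            context.forward(record);
        }
    }
}

Such a processor would typically be wired into a Topology with addSource/addProcessor/addSink, or attached to a DSL stream via process().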
Transforming the Input Data
Inspecting the Output Data
Stopping the Kafka Cluster
Options for Deploying the Application
- Classic ops tools (Puppet, Chef, and Salt)
- Docker
- WAR file
Troubleshooting
Summary and Conclusion