Duration
28 hours (usually 4 days including breaks)
Requirements
- Experience with Java or Scala.
Audience
- Developers
- Architects
- Data engineers
- Analytics professionals
- Technical managers
Overview
Apache Flink is an open-source framework for scalable stream and batch data processing.
This instructor-led, live training introduces the principles and approaches behind distributed stream and batch data processing, and walks participants through building a real-time data streaming application in Apache Flink.
By the end of this training, participants will be able to:
- Set up an environment for developing data analysis applications.
- Understand how Apache Flink’s graph-processing library (Gelly) works.
- Package, execute, and monitor fault-tolerant data streaming applications built on Flink.
- Manage diverse workloads.
- Perform advanced analytics.
- Set up a multi-node Flink cluster.
- Measure and optimize performance.
- Integrate Flink with other big data systems.
- Compare Flink capabilities with those of other big data processing frameworks.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Course Outline
Introduction
Installing and Configuring Apache Flink
Overview of Flink Architecture
Developing Data Streaming Applications in Flink
Managing Diverse Workloads
Performing Advanced Analytics
Setting up a Multi-Node Flink Cluster
Mastering the Flink DataStream API
Understanding Flink Libraries
Integrating Flink with Other Big Data Tools
Testing and Troubleshooting
Summary and Conclusion