Duration
21 hours (usually 3 days including breaks)
Requirements
- Experience with machine learning, DevOps, or data engineering.
Audience
- Data scientists
- DevOps or infrastructure engineers
- Software developers
Overview
Apache Airflow is a platform for authoring, scheduling and monitoring workflows.
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to use Apache Airflow to build and manage end-to-end data pipelines.
By the end of this training, participants will be able to:
- Install and configure Apache Airflow.
- Author, schedule and monitor complex data pipelines.
- Manage multiple ETL (Extract, Transform, Load) processes.
- Scale and secure Apache Airflow.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Course Outline
Introduction
Overview of Apache Airflow Features and Architecture
Setting up Apache Airflow
Navigating the Apache Airflow UI
Using the CLI
Reading Big Data Sets
Working with DAGs
Monitoring Apache Airflow
Customizing Apache Airflow
Securing Apache Airflow
Scaling Apache Airflow
Best Practices
Troubleshooting
Summary and Conclusion