Apache Druid for Real-Time Data Analysis Training Course – Bluechip AI Asia, AI Development Company

Duration

21 hours (usually 3 days including breaks)

Requirements

A basic understanding of data infrastructure.
A general knowledge of distributed systems.
Basic Linux command line familiarity.

Audience

Application developers
Software engineers
Technical consultants
DevOps professionals
Architecture engineers

Overview

Apache Druid is an open-source, column-oriented, distributed data store written in Java. It was designed to quickly ingest massive quantities of event data and execute low-latency OLAP queries on that data. Druid is commonly used in business intelligence applications to analyze high volumes of real-time and historical data. It is also well suited for powering fast, interactive analytic dashboards for end users. Druid is used by companies such as Alibaba, Airbnb, Cisco, eBay, Netflix, PayPal, and Yahoo.

In this instructor-led, live course, we explore some of the limitations of data warehouse solutions and discuss how Druid can complement those technologies to form a flexible and scalable streaming analytics stack. We walk through many examples, offering participants the chance to implement and test Druid-based solutions in a lab environment.

Format of the Course

Part lecture, part discussion, and heavy hands-on practice, with occasional tests to gauge understanding.

Course Outline

Introduction

Installing and Starting Apache Druid

Druid Architecture and Design

Real-Time Ingestion of Event Data

Sharding and Indexing

Loading Data

Querying Data

Visualizing Data

Running a Distributed Cluster

Druid + Apache Hive

Druid + Apache Kafka

Druid + Others

Troubleshooting

Administrative Tasks

Summary and Conclusion
