Duration
14 hours (usually 2 days including breaks)
Requirements
- Experience with SQL
Audience
- Data Scientists
Overview
Presto is a distributed SQL query engine for big data analytics. With Presto, users can query data where it lives and combine data from multiple systems in a single query.
This instructor-led, live training (online or onsite) is aimed at data scientists who wish to query big data sources with Presto.
By the end of this training, participants will be able to:
- Apply key Presto concepts to optimize modern big data systems.
- Use Presto to run exabyte-scale warehouses.
- Clone data to a proprietary data storage system.
- Work with existing analytics and BI tools such as R and Tableau.
Format of the Course
- Interactive lecture and discussion.
- Lots of exercises and practice.
- Hands-on implementation in a live-lab environment.
Course Customization Options
- To request customized training for this course, please contact us.
Course Outline
Introduction
Presto and Query Engines
- What is Presto?
- ANSI SQL
Preparing the Development Environment
- Setting up a sandbox and Presto
- Connecting Tableau
- Connecting R
In-Place Analysis
- Working with connectors
- Benchmarking with TPC-H
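To give a flavor of in-place analysis through connectors, the sketch below builds a query that joins tables living in two different catalogs. Presto addresses tables as catalog.schema.table; the catalog, schema, table, and column names used here (hive, mysql, web.orders, crm.customers, customer_id) are hypothetical and only illustrate the naming pattern.

```python
def qualified(catalog, schema, table):
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


def cross_catalog_join(left, right, key):
    """Build a SQL string joining two fully qualified tables on a shared key."""
    return (
        f"SELECT l.*, r.*\n"
        f"FROM {left} AS l\n"
        f"JOIN {right} AS r ON l.{key} = r.{key}"
    )


# Hypothetical names: a Hive catalog and a MySQL catalog, each exposed by a connector.
orders = qualified("hive", "web", "orders")
customers = qualified("mysql", "crm", "customers")
query = cross_catalog_join(orders, customers, "customer_id")
print(query)
```

The resulting string can be submitted through any Presto client; the data stays in its source systems and is joined at query time.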
SQL Concepts
- Retrieving data
- Combining data sources
- Using SQL functions
Advanced SQL Concepts
- Working with Bollinger Bands
- Accessing data
- Filtering data
- Migrating data sources
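For readers unfamiliar with Bollinger Bands, the sketch below shows the underlying arithmetic: a middle band that is the simple moving average over a window, with upper and lower bands placed k standard deviations above and below it. It is a plain-Python illustration, not the course's SQL exercise; the function name, the default window of 20, and k of 2 are assumptions that follow the common convention, and the population standard deviation is used.

```python
from statistics import mean, pstdev


def bollinger_bands(values, window=20, k=2.0):
    """Return a list of (lower, middle, upper) tuples, one per full window.

    middle = simple moving average of the window
    upper/lower = middle +/- k * population standard deviation of the window
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    bands = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        middle = mean(chunk)
        spread = k * pstdev(chunk)
        bands.append((middle - spread, middle, middle + spread))
    return bands


# Illustrative price series (made-up numbers) with a short window.
prices = [10.0, 11.0, 12.0, 11.0, 10.0]
for lower, middle, upper in bollinger_bands(prices, window=3):
    print(f"{lower:.3f} {middle:.3f} {upper:.3f}")
```

In SQL, the same calculation is typically expressed with window functions (a moving average and a moving standard deviation over a row frame), which is why it fits under advanced SQL concepts.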
Summary and Conclusion