Duration
21 hours (usually 3 days including breaks)
Requirements
- basic Linux administration skills
- basic programming skills
Overview
This course is intended for IT specialists who are looking for a solution to store and process large data sets in a distributed system environment.
Course goal:
Gaining knowledge of Hadoop cluster administration
Course Outline
- Introduction to Cloud Computing and Big Data solutions
- Apache Hadoop evolution: HDFS, MapReduce, YARN
- Installation and configuration of Hadoop in pseudo-distributed mode
- Running MapReduce jobs on a Hadoop cluster
- Hadoop cluster planning, installation and configuration
- Hadoop ecosystem: Pig, Hive, Sqoop, HBase
- Big Data future: Impala, Cassandra