PostgreSQL for Developers Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

A working knowledge of SQL

Overview

This course covers programmatic interaction with PostgreSQL databases. Participants learn the techniques, syntax, and structures needed to develop quality applications with a PostgreSQL backend. The training also covers SQL tuning, including best practices for writing efficient SQL.

The target audience includes developers who want to use or extend PostgreSQL, as well as database architects.
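
To give a flavor of the programmatic interaction the course covers, the sketch below runs a parameterized query over JDBC. It is a minimal sketch: the connection URL, credentials, and the customers table are illustrative, and it assumes the standard PostgreSQL JDBC driver is on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class PgExample {
        public static void main(String[] args) throws Exception {
            // Illustrative connection settings; adjust to your environment.
            String url = "jdbc:postgresql://localhost:5432/training";
            try (Connection conn = DriverManager.getConnection(url, "trainee", "secret");
                 PreparedStatement ps = conn.prepareStatement(
                         "SELECT id, name FROM customers WHERE country = ?")) {
                ps.setString(1, "PL");
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        System.out.println(rs.getLong("id") + " " + rs.getString("name"));
                    }
                }
            }
        }
    }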

Course Outline

Introduction to PostgreSQL

  • A Brief History of PostgreSQL
  • Features
  • Internals Summary
  • Limits and Terminology

Installation and Configuration

  • Pre-requisites
  • Installation from Packages and Creating Database
  • Installation from Source Code
  • Client Installation
  • Starting and Stopping a Database Server
  • Environment Setup

The SQL Language

  • SQL Syntax
  • Data Definition
  • Data Manipulation
  • Queries
  • Data Types
  • JSON
  • Functions and Operators
  • Type Conversion
  • Indexes

Transactions and Concurrency

  • Transactions and Isolation
  • Multi-Version Concurrency Control

Client Interfaces

  • Command Line Interface – psql
  • Graphical Interface – pgAdmin 4

Server Programming

  • Extending SQL
  • Triggers
  • The Rule System
  • Procedural Languages
  • PL/pgSQL – SQL Procedural Language
  • Error Handling
  • Cursors

Foreign Data Wrappers

  • Extensions in PostgreSQL
  • Adding FDW in a Database
  • postgres_fdw
  • file_fdw
  • Other FDWs

SQL Tuning

  • Logging in PostgreSQL
  • Query Plans (see the sketch after this list)
  • Optimizing Queries
  • Statistics
  • Planner Parameters
  • Parallel Query Scans
  • SQL Best Practices
  • Indexes
  • Table Partitioning
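
The Query Plans and Optimizing Queries topics above revolve around reading execution plans. Because PostgreSQL returns EXPLAIN output as ordinary rows of text, a plan can also be fetched from application code; a minimal sketch follows, in which the query, table, and connection details are illustrative.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class ExplainExample {
        public static void main(String[] args) throws Exception {
            String url = "jdbc:postgresql://localhost:5432/training"; // illustrative
            try (Connection conn = DriverManager.getConnection(url, "trainee", "secret");
                 Statement st = conn.createStatement();
                 // EXPLAIN ANALYZE runs the query and reports the actual plan and timings.
                 ResultSet rs = st.executeQuery(
                         "EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1)); // each row is one line of the plan
                }
            }
        }
    }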

MongoDB for Developers Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

Knowledge of a programming language (Java, PHP, C# or any other supported by MongoDB)

Overview

This course covers everything a database developer needs to know to successfully develop applications using MongoDB.
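
As a preview of the document-manipulation topics in the outline below, here is a minimal sketch using the MongoDB Java driver. The connection URI, database, collection, and field names are illustrative, and the synchronous driver is assumed to be on the classpath.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import org.bson.Document;

    import static com.mongodb.client.model.Filters.eq;
    import static com.mongodb.client.model.Updates.set;

    public class MongoExample {
        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> users =
                        client.getDatabase("training").getCollection("users");

                // Insert a document.
                users.insertOne(new Document("name", "Alice").append("age", 30));

                // Query it back.
                Document found = users.find(eq("name", "Alice")).first();
                System.out.println(found.toJson());

                // Update one field, then remove the document.
                users.updateOne(eq("name", "Alice"), set("age", 31));
                users.deleteOne(eq("name", "Alice"));
            }
        }
    }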

Course Outline

Manipulating Documents

  • Query
  • Insert
  • Update
  • Remove
  • Upsert
  • Removing databases, collections, and fields

Document Structure

  • Datatypes
  • References
  • ID
  • Keys
  • Embedded sub-documents
  • Tree structures
  • Tailable Cursor
  • Two Phase Commits
  • Auto-incrementing Sequence field (see the sketch after this list)
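
The auto-incrementing sequence field is worth a concrete sketch: MongoDB has no built-in auto-increment, so a common pattern keeps a counters collection and atomically increments it to hand out the next number. The collection and field names below are illustrative.

    import com.mongodb.client.MongoClient;
    import com.mongodb.client.MongoClients;
    import com.mongodb.client.MongoCollection;
    import com.mongodb.client.model.FindOneAndUpdateOptions;
    import com.mongodb.client.model.ReturnDocument;
    import org.bson.Document;

    import static com.mongodb.client.model.Filters.eq;
    import static com.mongodb.client.model.Updates.inc;

    public class SequenceExample {
        // Atomically increments the named counter and returns the new value.
        static long nextSequence(MongoCollection<Document> counters, String name) {
            Document updated = counters.findOneAndUpdate(
                    eq("_id", name),
                    inc("seq", 1L),
                    new FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER));
            return updated.getLong("seq");
        }

        public static void main(String[] args) {
            try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
                MongoCollection<Document> counters =
                        client.getDatabase("training").getCollection("counters");
                System.out.println(nextSequence(counters, "userid")); // 1 on the first call, then 2, ...
            }
        }
    }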

Aggregation 

  • Distinct
  • Aggregation Pipelines
  • Map-reduce

Indexes

  • Default _id
  • Single Field
  • Compound Index
  • Multikey Index
  • Geospatial Index
  • Hashed Index
  • Unique
  • Sparse

Heroku for Developers Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • A general understanding of cloud computing concepts.
  • Experience with web or mobile application development.
  • Programming experience in any of the languages supported by Heroku (e.g., Ruby, Python, PHP, Clojure, Go, Java, Scala, Node.js)

Audience

  • Web developers
  • Mobile developers

Overview

Heroku is a Platform-as-a-Service (PaaS) for building, running, operating and scaling containerized web and mobile applications in the cloud. It supports multiple programming languages, various development tools, pre-installed operating systems, and redundant servers.

This instructor-led, live training (online or onsite) is aimed at developers who wish to use Heroku to conveniently deploy web and mobile applications to the cloud, without grappling with infrastructure setup, configuration, management, etc.

By the end of this training, participants will be able to:

  • Understand the Heroku ecosystem and how it differs from IaaS offerings such as AWS EC2 and from other PaaS platforms.
  • Leverage Heroku features such as Git integration, Heroku CLI and Heroku Dashboard to push applications to the cloud with ease.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

Preparing Your Heroku Account

Overview of Heroku Features and Architecture

Architecting an App Using the Twelve-Factor App Methodology
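
A central twelve-factor idea on Heroku is that configuration comes from the environment; in particular, a web dyno is told which port to bind to via the PORT environment variable. Below is a minimal sketch of such a web process using the JDK's built-in HTTP server; the class name, local fallback port, and response text are illustrative.

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;
    import java.nio.charset.StandardCharsets;

    public class WebProcess {
        public static void main(String[] args) throws Exception {
            // Heroku supplies the listening port via PORT; fall back to 8080 for local runs.
            int port = Integer.parseInt(System.getenv().getOrDefault("PORT", "8080"));
            HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
            server.createContext("/", exchange -> {
                byte[] body = "Hello from Heroku".getBytes(StandardCharsets.UTF_8);
                exchange.sendResponseHeaders(200, body.length);
                try (OutputStream os = exchange.getResponseBody()) {
                    os.write(body);
                }
            });
            server.start();
        }
    }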

Navigating the Heroku Dashboard

Using the Heroku CLI

Creating a Simple Application

Uploading the Application through Git

Deploying the Application

Testing the Application

Provisioning Add-On Services

Implementing a CI/CD Workflow

Monitoring Application Uptime and Performance

Scaling Your Application

Troubleshooting

Summary and Conclusion

Docker for Developers and System Administrators Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

Some familiarity with the command line and Linux is an advantage.

Overview

Docker is a platform for developers and sysadmins to maintain distributed applications. It consists of a runtime to run containers and a service for sharing containers.

With Docker, the same app can run unchanged on laptops, dedicated servers, and virtual servers.

This course teaches the basic usage of Docker, useful for both developers and system administrators. It includes many hands-on exercises: participants work in their own Docker environment and build their own Docker images over the two days.

Course Outline

What is Docker?

  • Use cases
  • Major components of Docker
  • Docker architecture fundamentals

Docker architecture

  • Docker images
  • Docker registry
  • Docker containers

The underlying technology

  • Namespaces
  • Control groups
  • Union FS
  • Container format

Installation of Docker

  • Installation on Ubuntu via apt-get
  • Installation of a newer version of Docker

Dockerizing applications

  • The hello world example
  • Interactive container
  • Daemonizing programs

Container usage

  • Running a webapp in a container
  • Investigating a container
  • Port mapping
  • Viewing the logs
  • Looking at processes
  • Stopping and restarting
  • Removing a container

Managing images

  • Listing images
  • Downloading images
  • Finding images

Networking of containers

  • Port mapping details
  • Container linking and naming
  • Linking and environment variables

Data in containers

  • Data volumes
  • Host directories as data volume
  • Host file as data volume
  • Data volume containers
  • Backup, restore of data volumes

Contributing to the ecosystem

  • What is Docker Hub?
  • Registering on Docker Hub
  • Command line login
  • Uploading to Docker Hub
  • Private repositories
  • Automated builds

OKD (Origin Kubernetes Distribution) for Developers Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

  • A general understanding of containers and orchestration
  • Software development experience

Audience

  • Developers

Overview

OKD is an application development platform for deploying containerized applications using Kubernetes. OKD is the upstream code base upon which Red Hat OpenShift Online and Red Hat OpenShift Container Platform are built.

In this instructor-led, live training (onsite or remote), participants will learn how to create, update, and maintain containerized applications using OKD.

By the end of this training, participants will be able to:

  • Deploy a containerized web application to an OKD cluster on-premise or in the cloud.
  • Automate part of the software delivery pipeline.
  • Apply the principles of the DevOps philosophy to ensure continuous delivery of an application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • This course is based on OKD (Origin Kubernetes Distribution).
  • To customize the course or request training on a different version of OpenShift (e.g., OpenShift Container Platform 3 or OpenShift Container Platform 4), please contact us to arrange.

Course Outline

Introduction

The DevOps philosophy and Continuous Integration (CI) principles

Overview of OKD Features and Architecture

The Life Cycle of a Containerized Application

Navigating the OKD Web Console and CLI

Setting up the Development Environment

Defining a CI/CD Build Strategy

Developing an Application

Packaging an Application on Kubernetes

Running an Application in an OKD Cluster

Monitoring the Status of an Application

Debugging the Application

Updating an Application in Production

Managing Container Images

Customizing OKD with Custom Resource Definitions (CRDs)

Deploying Advanced Kubernetes Containers

Troubleshooting

Summary and Conclusion

Pivotal Greenplum for Developers Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

  • An understanding of database concepts.

Audience

  • Developers

Overview

Pivotal Greenplum is a Massively Parallel Processing (MPP) Data Warehouse platform based on PostgreSQL.

This instructor-led, live training (online or onsite) is aimed at developers who wish to set up a multi-node Greenplum database.

By the end of this training, participants will be able to:

  • Install and configure Pivotal Greenplum.
  • Model data in accordance with current needs and future expansion plans.
  • Apply different techniques for distributing data across multiple nodes (see the sketch after this list).
  • Improve database performance through tuning.
  • Monitor and troubleshoot a Greenplum database.
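
To make the data-distribution objective concrete: every Greenplum table declares how its rows are spread across the segment nodes. A minimal sketch of such DDL issued over JDBC is shown below; the host, credentials, and table are illustrative, and it relies on Greenplum's PostgreSQL-compatible wire protocol.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class GreenplumExample {
        public static void main(String[] args) throws Exception {
            // Greenplum speaks the PostgreSQL protocol, so the standard JDBC driver can be used.
            String url = "jdbc:postgresql://gp-master:5432/training"; // illustrative host
            try (Connection conn = DriverManager.getConnection(url, "gpadmin", "secret");
                 Statement st = conn.createStatement()) {
                // DISTRIBUTED BY chooses the column whose hash decides which segment stores each row;
                // a good key spreads rows evenly and lets joins on that key run locally on the segments.
                st.execute("CREATE TABLE sales (" +
                           "  sale_id bigint, customer_id bigint, amount numeric" +
                           ") DISTRIBUTED BY (customer_id)");
            }
        }
    }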

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

Setting up Pivotal Greenplum

Overview of Pivotal Greenplum Features and Architecture

Accessing Data

  • DDL, DML, and DQL

Implementing a Table Storage Model

  • Understanding tablespaces
  • Compressing table data

Distributing the Data

  • Distribution keys and partitioning
  • Managing joins and indexing

Loading Data

  • Table partitioning

OLAP Querying

  • Implementing Greenplum functions

Modeling the Data

  • Physical design considerations

Expanding the System

  • Adding nodes
  • Migrating data

Monitoring a Greenplum system

  • Database activity and performance

Performance Tuning

  • Optimizing queries
  • Optimizing SQL joins
  • Indexing optimization

Greenplum Best Practices

Troubleshooting

Summary and Conclusion

Apache Ignite for Developers Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • An understanding of databases.
  • An understanding of Java.

Audience

  • Developers

Overview

Apache Ignite is an in-memory computing platform that sits between the application and data layer to improve speed, scale, and availability.

In this instructor-led, live training, participants will learn the principles behind persistent and pure in-memory storage as they step through the creation of a sample in-memory computing project.
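
The first steps of such a sample project might look like the sketch below, which starts an Ignite node in the JVM and works with a distributed cache through the Java API; the cache name and values are illustrative.

    import org.apache.ignite.Ignite;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.Ignition;

    public class IgniteExample {
        public static void main(String[] args) {
            // Start (or join) an Ignite node in this JVM.
            try (Ignite ignite = Ignition.start()) {
                // A distributed key-value cache; entries are partitioned across the cluster nodes.
                IgniteCache<Integer, String> cache = ignite.getOrCreateCache("demoCache");
                cache.put(1, "Hello");
                cache.put(2, "Ignite");
                System.out.println(cache.get(1) + " " + cache.get(2));
            }
        }
    }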

By the end of this training, participants will be able to:

  • Use Ignite for in-memory and on-disk persistence, as well as a purely distributed in-memory database.
  • Achieve persistence without syncing data back to a relational database.
  • Use Ignite to carry out SQL and distributed joins.
  • Improve performance by moving data closer to the CPU, using RAM as storage.
  • Spread data sets across a cluster to achieve horizontal scalability.
  • Integrate Ignite with RDBMS, NoSQL, Hadoop and machine learning processors.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

Overview of Big Data Tools and Technologies

Installing and Configuring Apache Ignite

Overview of Ignite Architecture

Querying Data in Ignite

Spreading Large Data Sets across a Cluster

Understanding the In-Memory Data Grid

Writing a Service in Ignite

Running Distributed Computing with Ignite

Integrating Ignite with RDBMS, NoSQL, Hadoop and Machine Learning Processors

Testing and Troubleshooting

Summary and Conclusion

Apache NiFi for Developers Training Course

Duration

7 hours (usually 1 day including breaks)

Requirements

  • Java programming experience.
  • Experience with Maven.

Audience

  • Developers
  • Data engineers

Overview

Apache NiFi (Hortonworks DataFlow) is a real-time integrated data logistics and simple event processing platform that enables the movement, tracking, and automation of data between systems. It is based on flow-based programming and provides a web-based user interface for managing dataflows in real time.

In this instructor-led, live training, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi.
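
A custom processor, one of the extension points built during the course, is essentially a Java class that takes FlowFiles from a session and routes them to relationships. A minimal sketch is shown below; the class and relationship names are illustrative, and the nifi-api dependency is assumed to be available via Maven.

    import java.util.Collections;
    import java.util.Set;

    import org.apache.nifi.flowfile.FlowFile;
    import org.apache.nifi.processor.AbstractProcessor;
    import org.apache.nifi.processor.ProcessContext;
    import org.apache.nifi.processor.ProcessSession;
    import org.apache.nifi.processor.Relationship;
    import org.apache.nifi.processor.exception.ProcessException;

    public class PassThroughProcessor extends AbstractProcessor {

        // A single outgoing relationship to which every FlowFile is routed.
        static final Relationship REL_SUCCESS = new Relationship.Builder()
                .name("success")
                .description("All FlowFiles are routed here")
                .build();

        @Override
        public Set<Relationship> getRelationships() {
            return Collections.singleton(REL_SUCCESS);
        }

        @Override
        public void onTrigger(ProcessContext context, ProcessSession session) throws ProcessException {
            FlowFile flowFile = session.get();       // take one FlowFile from the incoming queue
            if (flowFile == null) {
                return;                              // nothing queued on this trigger
            }
            session.transfer(flowFile, REL_SUCCESS); // pass it along unchanged
        }
    }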

By the end of this training, participants will be able to:

  • Understand NiFi’s architecture and dataflow concepts.
  • Develop extensions using NiFi and third-party APIs.
  • Develop their own custom Apache NiFi processors.
  • Ingest and process real-time data from disparate and uncommon file formats and data sources.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Data at rest vs data in motion

Overview of Big Data Tools and Technologies

  • Hadoop (HDFS and MapReduce) and Spark

Installing and Configuring NiFi

Overview of NiFi Architecture

Development Approaches

  • Application development tools and mindset
  • Extract, Transform, and Load (ETL) tools and mindset

Design Considerations

Components, Events, and Processor Patterns

Exercise: Streaming Data Feeds into HDFS

Error Handling

Controller Services

Exercise: Ingesting Data from IoT Devices using Web-Based APIs

Exercise: Developing a Custom Apache NiFi Processor using JSON

Testing and Troubleshooting

Contributing to Apache NiFi

Summary and Conclusion

Hadoop for Developers and Administrators Training Course

Duration

21 hours (usually 3 days including breaks)

Overview

Hadoop is the most popular Big Data processing framework.

Course Outline

Module 1. Introduction to Hadoop

  • The Hadoop Distributed File System (HDFS)
  • The Read Path and The Write Path
  • Managing Filesystem Metadata
  • The Namenode and the Datanode
  • The Namenode High Availability
  • Namenode Federation
  • The Command-Line Tools
  • Understanding REST Support

Module 2. Introduction to MapReduce

  • Analyzing the Data with Hadoop
  • Map and Reduce Pattern
  • Java MapReduce (see the sketch after this list)
  • Scaling Out
  • Data Flow
  • Developing Combiner Functions
  • Running a Distributed MapReduce Job
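
To make the Java MapReduce item concrete, here is a minimal word-count mapper using the org.apache.hadoop.mapreduce API; the class name and tokenization are illustrative, and the reducer and job driver are omitted.

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (word, 1) for every token in the line; the framework groups
            // these pairs by key before they reach the reducer.
            for (String token : value.toString().split("\\s+")) {
                if (!token.isEmpty()) {
                    word.set(token);
                    context.write(word, ONE);
                }
            }
        }
    }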

Module 3. Planning a Hadoop Cluster

  • Picking a Distribution and Version of Hadoop
  • Versions and Features
  • Hardware Selection
  • Master and Worker Hardware Selection
  • Cluster Sizing
  • Operating System Selection and Preparation
  • Deployment Layout
  • Setting up Users, Groups, and Privileges
  • Disk Configuration
  • Network Design

Module 4. Installation and Configuration

  • Installing Hadoop
  • Configuration: An Overview
  • The Hadoop XML Configuration Files
  • Environment Variables and Shell Scripts
  • Logging Configuration
  • Managing HDFS
  • Optimization and Tuning
  • Formatting the Namenode
  • Creating a /tmp Directory
  • Thinking About Namenode High Availability
  • The Fencing Options
  • Automatic Failover Configuration
  • Format and Bootstrap the Namenodes
  • Namenode Federation

Module 5. Understanding Hadoop I/O

  • Data Integrity in HDFS  
  • Understanding Codecs
  • Compression and Input Splits
  • Using Compression in MapReduce
  • The Serialization mechanism
  • File-Based Data Structures
  • The SequenceFile format
  • Other File Formats and Column-Oriented Formats

Module 6. Developing a MapReduce Application

  • The Configuration API 
  • Setting Up the Development Environment
  • Managing Configuration
  • GenericOptionsParser, Tool, and ToolRunner
  • Writing a Unit Test with MRUnit
  • The Mapper and Reducer
  • Running Locally on Test Data 
  • Testing the Driver
  • Running on a Cluster
  • Packaging and Launching a Job
  • The MapReduce Web UI
  • Tuning a Job

Module 7. Identity, Authentication, and Authorization

  • Managing Identity
  • Kerberos and Hadoop
  • Understanding Authorization

Module 8. Resource Management

  • What Is Resource Management?
  • HDFS Quotas
  • MapReduce Schedulers
  • Anatomy of a YARN Application Run
  • Resource Requests
  • Application Lifespan
  • YARN Compared to MapReduce 1
  • Scheduling in YARN
  • Scheduler Options
  • Capacity Scheduler Configuration
  • Fair Scheduler Configuration
  • Delay Scheduling
  • Dominant Resource Fairness

Module 9. MapReduce Types and Formats

  • MapReduce Types
  • The Default MapReduce Job
  • Defining the Input Formats
  • Managing Input Splits and Records
  • Text Input and Binary Input
  • Managing Multiple Inputs
  • Database Input (and Output)
  • Output Formats
  • Text Output and Binary Output
  • Managing Multiple Outputs
  • The Database Output

Module 10. Using MapReduce Features

  • Using Counters
  • Reading Built-in Counters
  • User-Defined Java Counters
  • Understanding Sorting
  • Using the Distributed Cache

Module 11. Cluster Maintenance and Troubleshooting

  • Managing Hadoop Processes
  • Starting and Stopping Processes with Init Scripts
  • Starting and Stopping Processes Manually
  • HDFS Maintenance Tasks
  • Adding a Datanode
  • Decommissioning a Datanode
  • Checking Filesystem Integrity with fsck
  • Balancing HDFS Block Data
  • Dealing with a Failed Disk
  • MapReduce Maintenance Tasks 
  • Killing a MapReduce Job
  • Killing a MapReduce Task
  • Managing Resource Exhaustion

Module 12. Monitoring

  • The available Hadoop Metrics
  • The role of SNMP
  • Health Monitoring
  • Host-Level Checks
  • HDFS Checks
  • MapReduce Checks

Module 13. Backup and Recovery

  • Data Backup
  • Distributed Copy (distcp)
  • Parallel Data Ingestion
  • Namenode Metadata

HBase for Developers Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

  • Comfortable with the Java programming language
  • Comfortable in a Linux environment (able to navigate the Linux command line, edit files with vi / nano)
  • A Java IDE like Eclipse or IntelliJ

Lab environment:

A working HBase cluster will be provided for students. Students will need an SSH client and a browser to access the cluster.

Zero Install: There is no need to install HBase software on students’ machines!

Overview

This course introduces HBase – a NoSQL store on top of Hadoop. It is intended for developers who will be using HBase to develop applications and for administrators who will manage HBase clusters.

We will walk developers through HBase architecture, data modelling, and application development on HBase. The course also covers using MapReduce with HBase and administration topics related to performance optimization. It is very hands-on, with lots of lab exercises.
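
As a taste of the Java API work in the labs, the sketch below writes one cell and reads it back; the table, column family, and row key are illustrative, and an HBase client configuration (hbase-site.xml) is assumed to be on the classpath.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml from the classpath
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("sensor_readings"))) {

                // Write one cell: row key, column family "d", qualifier "temp".
                Put put = new Put(Bytes.toBytes("device1-20240101"));
                put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("temp"), Bytes.toBytes("21.5"));
                table.put(put);

                // Read it back.
                Result result = table.get(new Get(Bytes.toBytes("device1-20240101")));
                byte[] temp = result.getValue(Bytes.toBytes("d"), Bytes.toBytes("temp"));
                System.out.println(Bytes.toString(temp));
            }
        }
    }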


Audience: Developers & Administrators

Course Outline

  • Section 1: Introduction to Big Data & NoSQL
    • Big Data ecosystem
    • NoSQL overview
    • CAP theorem
    • When is NoSQL appropriate
    • Columnar storage
    • HBase and NoSQL
  • Section 2 : HBase Intro
    • Concepts and Design
    • Architecture (HMaster and Region Server)
    • Data integrity
    • HBase ecosystem
    • Lab : Exploring HBase
  • Section 3 : HBase Data model
    • Namespaces, Tables and Regions
    • Rows, columns, column families, versions
    • HBase Shell and Admin commands
    • Lab : HBase Shell
  • Section 4 : Accessing HBase using Java API
    • Introduction to Java API
    • Read / Write path
    • Time Series data
    • Scans
    • Map Reduce
    • Filters
    • Counters
    • Co-processors
    • Labs (multiple) : Using the HBase Java API to implement time series, MapReduce, filters, and counters
  • Section 5 : HBase Schema Design : Group Session
    • students are presented with real world use cases
    • students work in groups to come up with design solutions
    • discuss / critique and learn from multiple designs
    • Labs : implement a scenario in HBase
  • Section 6 : HBase Internals
    • Understanding HBase under the hood
    • MemStore / HFile / WAL
    • HDFS storage
    • Compactions
    • Splits
    • Bloom Filters
    • Caches
    • Diagnostics
  • Section 7 : HBase Installation and Configuration
    • hardware selection
    • install methods
    • common configurations
    • Lab : installing HBase
  • Section 8 : HBase Ecosystem
    • developing applications using HBase
    • interacting with the rest of the Hadoop stack (MapReduce, Pig, Hive)
    • frameworks around HBase
    • advanced concepts (co-processors)
    • Labs : writing HBase applications
  • Section 9 : Monitoring and Best Practices
    • monitoring tools and practices
    • optimizing HBase
    • HBase in the cloud
    • real world use cases of HBase
    • Labs : checking HBase vitals