Kubeflow Fundamentals Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • Familiarity with Python syntax
  • Experience with TensorFlow, PyTorch, or another machine learning framework
  • A public cloud provider account (optional) 

Audience

  • Developers
  • Data scientists

Overview

Kubeflow is a toolkit for making Machine Learning (ML) on Kubernetes easy, portable and scalable.

This instructor-led, live training (online or onsite) is aimed at developers and data scientists who wish to build, deploy, and manage machine learning workflows on Kubernetes.

By the end of this training, participants will be able to:

  • Install and configure Kubeflow on-premises and in the cloud.
  • Build, deploy, and manage ML workflows based on Docker containers and Kubernetes.
  • Run entire machine learning pipelines on diverse architectures and cloud environments.
  • Use Kubeflow to spawn and manage Jupyter notebooks.
  • Build ML training, hyperparameter tuning, and serving workloads across multiple platforms.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.
  • To learn more about Kubeflow, please visit: https://github.com/kubeflow/kubeflow

Course Outline

Introduction

Overview of Kubeflow Features and Components

  • Containers, manifests, etc.

Overview of a Machine Learning Pipeline

  • Training, testing, tuning, deploying, etc.
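
As a rough illustration of the stages listed above, a pipeline can be thought of as a set of named steps that run in dependency order. The sketch below is not Kubeflow's API; it uses only the Python standard library, and the step names and the `run_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
PIPELINE = {
    "train": set(),
    "test": {"train"},
    "tune": {"test"},
    "deploy": {"tune"},
}


def run_order(steps):
    """Return the step names in an order that respects their dependencies."""
    return list(TopologicalSorter(steps).static_order())


if __name__ == "__main__":
    for name in run_order(PIPELINE):
        print("running step:", name)
```

In a real Kubeflow pipeline each step would typically run in its own container; here the steps are only names, so the example shows the ordering idea and nothing more.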

Deploying Kubeflow to a Kubernetes Cluster

  • Preparing the execution environment (training cluster, production cluster, etc.)
  • Downloading, installing and customizing.

Running a Machine Learning Pipeline on Kubernetes

  • Building a TensorFlow pipeline.
  • Building a PyTorch pipeline.

Visualizing the Results

  • Exporting and visualizing pipeline metrics

Customizing the Execution Environment

  • Customizing the stack for diverse infrastructures
  • Upgrading a Kubeflow deployment

Running Kubeflow on Public Clouds

  • AWS, Microsoft Azure, Google Cloud Platform

Managing Production Workflows

  • Running with GitOps methodology
  • Scheduling jobs
  • Spawning Jupyter notebooks

Troubleshooting

Summary and Conclusion

Kubeflow on OpenShift Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • An understanding of machine learning concepts.
  • Knowledge of cloud computing concepts.
  • A general understanding of containers (Docker) and orchestration (Kubernetes).
  • Some Python programming experience is helpful.
  • Experience working with a command line.

Audience

  • Data science engineers.
  • DevOps engineers interested in machine learning model deployment.
  • Infrastructure engineers interested in machine learning model deployment.
  • Software engineers wishing to automate the integration and deployment of machine learning features with their application.

Overview

Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is one of the most popular machine learning libraries. Kubernetes is an orchestration platform for managing containerized applications. OpenShift is a cloud application development platform that uses Docker containers, orchestrated and managed by Kubernetes, on a foundation of Red Hat Enterprise Linux.

This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to an OpenShift on-premise or hybrid cloud.

By the end of this training, participants will be able to:

  • Install and configure Kubernetes and Kubeflow on an OpenShift cluster.
  • Use OpenShift to simplify the work of initializing a Kubernetes cluster.
  • Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
  • Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
  • Call public cloud services (e.g., AWS services) from within OpenShift to extend an ML application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Kubeflow on OpenShift vs public cloud managed services

Overview of Kubeflow on OpenShift

  • CodeReady Containers
  • Storage options

Overview of Environment Setup

  • Setting up a Kubernetes cluster

Setting up Kubeflow on OpenShift

  • Installing Kubeflow

Coding the Model

  • Choosing an ML algorithm
  • Implementing a TensorFlow CNN model

Reading the Data

  • Accessing a dataset

Kubeflow Pipelines on OpenShift

  • Setting up an end-to-end Kubeflow pipeline
  • Customizing Kubeflow Pipelines

Running an ML Training Job

  • Training a model

Deploying the Model

  • Running a trained model on OpenShift

Integrating the Model into a Web Application

  • Creating a sample application
  • Sending prediction requests

Administering Kubeflow

  • Monitoring with TensorBoard
  • Managing logs

Securing a Kubeflow Cluster

  • Setting up authentication and authorization

Troubleshooting

Summary and Conclusion

Kubeflow Training Course

Duration

35 hours (usually 5 days including breaks)

Requirements

  • Familiarity with Python syntax
  • Experience with TensorFlow, PyTorch, or another machine learning framework
  • An AWS account with necessary resources

Audience

  • Developers
  • Data scientists

Overview

Kubeflow is a toolkit for making Machine Learning (ML) on Kubernetes easy, portable and scalable. AWS EKS (Elastic Kubernetes Service) is an Amazon managed service for running Kubernetes on AWS.

This instructor-led, live training (online or onsite) is aimed at developers and data scientists who wish to build, deploy, and manage machine learning workflows on Kubernetes.

By the end of this training, participants will be able to:

  • Install and configure Kubeflow on-premises and in the cloud using AWS EKS (Elastic Kubernetes Service).
  • Build, deploy, and manage ML workflows based on Docker containers and Kubernetes.
  • Run entire machine learning pipelines on diverse architectures and cloud environments.
  • Use Kubeflow to spawn and manage Jupyter notebooks.
  • Build ML training, hyperparameter tuning, and serving workloads across multiple platforms.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Introduction to Kubernetes
  • Overview of Kubeflow Features and Architecture
  • Kubeflow on AWS vs on-premise vs on other public cloud providers

Setting up a Cluster using AWS EKS

Setting up an On-Premise Cluster using MicroK8s

Deploying Kubernetes using a GitOps Approach

Data Storage Approaches

Creating a Kubeflow Pipeline

Triggering a Pipeline

Defining Output Artifacts

Storing Metadata for Datasets and Models

Hyperparameter Tuning with TensorFlow

Visualizing and Analyzing the Results

Multi-GPU Training

Creating an Inference Server for Deploying ML Models

Working with JupyterHub

Networking and Load Balancing

Auto Scaling a Kubernetes Cluster

Troubleshooting

Summary and Conclusion

Kubeflow on IBM Cloud Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • An understanding of machine learning concepts.
  • Knowledge of cloud computing concepts.
  • A general understanding of containers (Docker) and orchestration (Kubernetes).
  • Some Python programming experience is helpful.
  • Experience working with a command line.

Audience

  • Data science engineers.
  • DevOps engineers interested in machine learning model deployment.
  • Infrastructure engineers interested in machine learning model deployment.
  • Software engineers wishing to automate the integration and deployment of machine learning features with their application.

Overview

Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is one of the most popular machine learning libraries. Kubernetes is an orchestration platform for managing containerized applications.

This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to IBM Cloud Kubernetes Service (IKS).

By the end of this training, participants will be able to:

  • Install and configure Kubernetes, Kubeflow and other needed software on IBM Cloud Kubernetes Service (IKS).
  • Use IKS to simplify the work of initializing a Kubernetes cluster on IBM Cloud.
  • Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
  • Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
  • Leverage other IBM Cloud services to extend an ML application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Kubeflow on IKS vs on-premise vs on other public cloud providers

Overview of Kubeflow Features on IBM Cloud

  • IKS
  • IBM Cloud Object Storage

Overview of Environment Setup

  • Preparing virtual machines
  • Setting up a Kubernetes cluster

Setting up Kubeflow on IBM Cloud

  • Installing Kubeflow through IKS

Coding the Model

  • Choosing an ML algorithm
  • Implementing a TensorFlow CNN model

Reading the Data

  • Accessing the MNIST dataset

Pipelines on IBM Cloud

  • Setting up an end-to-end Kubeflow pipeline
  • Customizing Kubeflow Pipelines

Running an ML Training Job

  • Training an MNIST model

Deploying the Model

  • Running TensorFlow Serving on IKS

Integrating the Model into a Web Application

  • Creating a sample application
  • Sending prediction requests
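
As a minimal sketch of what a prediction request might look like, the example below builds (but does not send) a request in the shape used by the TensorFlow Serving REST predict endpoint. The host, the model name `mnist`, the port 8501, and the flattened 784-value input are assumptions for illustration; the actual input shape depends on how the model was exported.

```python
import json
from urllib import request


def build_predict_request(host, model_name, instances, port=8501):
    """Build (but do not send) a TensorFlow Serving REST predict request."""
    url = f"http://{host}:{port}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host and model name; one all-zero 28x28 image, flattened.
    req = build_predict_request("localhost", "mnist", [[0.0] * 784])
    print(req.full_url)
    # Sending it would be: request.urlopen(req).read()
```

Separating request construction from sending keeps the example runnable without a live model server.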

Administering Kubeflow

  • Monitoring with TensorBoard
  • Managing logs

Securing a Kubeflow Cluster

  • Setting up authentication and authorization

Troubleshooting

Summary and Conclusion

Kubeflow on GCP Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • An understanding of machine learning concepts.
  • Knowledge of cloud computing concepts.
  • A general understanding of containers (Docker) and orchestration (Kubernetes).
  • Some Python programming experience is helpful.
  • Experience working with a command line.

Audience

  • Data science engineers.
  • DevOps engineers interested in machine learning model deployment.
  • Infrastructure engineers interested in machine learning model deployment.
  • Software engineers wishing to automate the integration and deployment of machine learning features with their application.

Overview

Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is one of the most popular machine learning libraries. Kubernetes is an orchestration platform for managing containerized applications.

This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to Google Cloud Platform (GCP).

By the end of this training, participants will be able to:

  • Install and configure Kubernetes, Kubeflow and other needed software on GCP and GKE.
  • Use GKE (Google Kubernetes Engine) to simplify the work of initializing a Kubernetes cluster on GCP.
  • Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
  • Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
  • Leverage other GCP services to extend an ML application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Kubeflow on GKE vs on-premise vs on other public cloud providers

Overview of Kubeflow Features on GCP

  • Declarative management of resources
  • GKE autoscaling for machine learning (ML) workloads
  • Secure connections to Jupyter
  • Persistent logs for debugging and troubleshooting
  • GPUs and TPUs to accelerate workloads

Overview of Environment Setup

  • Virtual machine preparation
  • Kubernetes cluster setup
  • Kubeflow installation

Deploying Kubeflow

  • Deploying Kubeflow on GCP
  • Deploying Kubeflow across on-premises and cloud environments
  • Deploying Kubeflow on GKE
  • Setting up a custom domain on GKE

Pipelines on GCP

  • Setting up an end-to-end Kubeflow pipeline
  • Customizing Kubeflow Pipelines

Securing a Kubeflow Cluster

  • Setting up authentication and authorization
  • Using VPC service controls and private GKE

Storing, Accessing, Managing Data

  • Understanding shared filesystems and Network Attached Storage (NAS)
  • Using managed file storage services in GCE

Running an ML Training Job

  • Training an MNIST model

Administering Kubeflow

  • Logging and monitoring

Troubleshooting

Summary and Conclusion

Kubeflow on Azure Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • An understanding of machine learning concepts.
  • Knowledge of cloud computing concepts.
  • A general understanding of containers (Docker) and orchestration (Kubernetes).
  • Some Python programming experience is helpful.
  • Experience working with a command line.

Audience

  • Data science engineers.
  • DevOps engineers interested in machine learning model deployment.
  • Infrastructure engineers interested in machine learning model deployment.
  • Software engineers wishing to automate the integration and deployment of machine learning features with their application.

Overview

Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is one of the most popular machine learning libraries. Kubernetes is an orchestration platform for managing containerized applications.

This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to Azure cloud.

By the end of this training, participants will be able to:

  • Install and configure Kubernetes, Kubeflow and other needed software on Azure.
  • Use Azure Kubernetes Service (AKS) to simplify the work of initializing a Kubernetes cluster on Azure.
  • Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
  • Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
  • Leverage other Azure managed services to extend an ML application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Kubeflow on Azure vs on-premise vs on other public cloud providers

Overview of Kubeflow Features and Architecture

Overview of the Deployment Process

Activating an Azure Account

Preparing and Launching GPU-enabled Virtual Machines

Setting up User Roles and Permissions

Preparing the Build Environment

Selecting a TensorFlow Model and Dataset

Packaging Code and Frameworks into a Docker Image

Setting up a Kubernetes Cluster Using AKS

Staging the Training and Validation Data

Configuring Kubeflow Pipelines

Launching a Training Job

Visualizing the Training Job in Runtime

Cleaning up After the Job Completes

Troubleshooting

Summary and Conclusion

Kubeflow on AWS Training Course

Duration

28 hours (usually 4 days including breaks)

Requirements

  • An understanding of machine learning concepts.
  • Knowledge of cloud computing concepts.
  • A general understanding of containers (Docker) and orchestration (Kubernetes).
  • Some Python programming experience is helpful.
  • Experience working with a command line.

Audience

  • Data science engineers.
  • DevOps engineers interested in machine learning model deployment.
  • Infrastructure engineers interested in machine learning model deployment.
  • Software engineers wishing to integrate and deploy machine learning features with their application.

Overview

Kubeflow is a framework for running Machine Learning workloads on Kubernetes. TensorFlow is a machine learning library and Kubernetes is an orchestration platform for managing containerized applications.

This instructor-led, live training (online or onsite) is aimed at engineers who wish to deploy Machine Learning workloads to an AWS EC2 server.

By the end of this training, participants will be able to:

  • Install and configure Kubernetes, Kubeflow and other needed software on AWS.
  • Use EKS (Elastic Kubernetes Service) to simplify the work of initializing a Kubernetes cluster on AWS.
  • Create and deploy a Kubernetes pipeline for automating and managing ML models in production.
  • Train and deploy TensorFlow ML models across multiple GPUs and machines running in parallel.
  • Leverage other AWS managed services to extend an ML application.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Kubeflow on AWS vs on-premise vs on other public cloud providers

Overview of Kubeflow Features and Architecture

Activating an AWS Account

Preparing and Launching GPU-enabled AWS Instances

Setting up User Roles and Permissions

Preparing the Build Environment

Selecting a TensorFlow Model and Dataset

Packaging Code and Frameworks into a Docker Image

Setting up a Kubernetes Cluster Using EKS

Staging the Training and Validation Data

Configuring Kubeflow Pipelines

Launching a Training Job using Kubeflow in EKS

Visualizing the Training Job in Runtime

Cleaning up After the Job Completes

Troubleshooting

Summary and Conclusion
