## Course content

• Part 1: Introduction
• The Field of Data Science – The Various Data Science Disciplines
• The Field of Data Science – Connecting the Data Science Disciplines
• The Field of Data Science – The Benefits of Each Discipline
• The Field of Data Science – Popular Data Science Techniques
• The Field of Data Science – Popular Data Science Tools
• The Field of Data Science – Careers in Data Science
• The Field of Data Science – Debunking Common Misconceptions
• Part 2: Probability
• Probability – Combinatorics
• Probability – Bayesian Inference
• Probability – Distributions
• Probability – Probability in Other Fields
• Part 3: Statistics
• Statistics – Descriptive Statistics
• Statistics – Practical Example: Descriptive Statistics
• Statistics – Inferential Statistics Fundamentals
• Statistics – Inferential Statistics: Confidence Intervals
• Statistics – Practical Example: Inferential Statistics
• Statistics – Hypothesis Testing
• Statistics – Practical Example: Hypothesis Testing
• Part 4: Introduction to Python
• Python – Variables and Data Types
• Python – Basic Python Syntax
• Python – Other Python Operators
• Python – Conditional Statements
• Python – Python Functions
• Python – Sequences
• Python – Iterations
• Python – Advanced Python Tools
• Part 5: Advanced Statistical Methods in Python
• Advanced Statistical Methods – Linear Regression with StatsModels
• Advanced Statistical Methods – Multiple Linear Regression with StatsModels
• Advanced Statistical Methods – Linear Regression with sklearn
• Advanced Statistical Methods – Practical Example: Linear Regression
• Advanced Statistical Methods – Logistic Regression
• Advanced Statistical Methods – Cluster Analysis
• Advanced Statistical Methods – K-Means Clustering
• Advanced Statistical Methods – Other Types of Clustering
• Part 6: Mathematics
• Part 7: Deep Learning
• Deep Learning – Introduction to Neural Networks
• Deep Learning – How to Build a Neural Network from Scratch with NumPy
• Deep Learning – TensorFlow 2.0: Introduction
• Deep Learning – Digging Deeper into NNs: Introducing Deep Neural Networks
• Deep Learning – Overfitting
• Deep Learning – Initialization
• Deep Learning – Digging into Gradient Descent and Learning Rate Schedules
• Deep Learning – Preprocessing
• Deep Learning – Classifying on the MNIST Dataset
• Deep Learning – Business Case Example
• Deep Learning – Conclusion
• Appendix: Deep Learning – TensorFlow 1: Introduction
• Appendix: Deep Learning – TensorFlow 1: Classifying on the MNIST Dataset
• Appendix: Deep Learning – TensorFlow 1: Business Case
• Software Integration
• Case Study – What’s Next in the Course?
• Case Study – Preprocessing the ‘Absenteeism_data’
• Case Study – Applying Machine Learning to Create the ‘absenteeism module’
• Case Study – Analyzing the Predicted Outputs in Tableau
• Appendix – Additional Python Tools
• Appendix – pandas Fundamentals
• Appendix – Working with Text Files in Python
• Bonus Lecture

## Course content

• Welcome to ChatGPT!
• ChatGPT Fundamentals
• Elements of an Effective Prompt
• Prompt Engineering Essentials
• ChatGPT Plus & GPT-4
• Advanced Data Analysis (Code Interpreter)
• ChatGPT Plugins
• The Technology Behind ChatGPT
• ChatGPT for Every Business Professional
• ChatGPT for Data Science
• ChatGPT for Programming
• ChatGPT for Social Media
• OpenAI APIs: An Essential Guide
• Midjourney: Essentials
• Midjourney Deep Dive
• Bing Chat Essentials
• Bonus

## Python with Plotly and Dash Training Course

Introduction

Data Science in Depth

• What is Plotly? What is Dash?
• Pandas overview
• NumPy overview

Plotly Basics

• Plots
• Heatmaps

Preparing the Development Environment

• Installing and configuring Plotly
• Installing and configuring Dash

Dash Core Components

• Using dropdown and slider components
• Working with Dash layouts
• Converting Plotly plots to dashboards
• Using callbacks
• Working with inputs and outputs

Dash Dashboards

• Pulling API data
• Building a Binance dashboard
• Connecting Dash components
• Using Alpha Vantage
• Cleaning data
• Controlling callbacks
• Updating graphs
• Working with layout updating

Deployment

• Working with app authorization
• Deploying with Heroku

Summary and Conclusion

## Data Science with Tableau and R Programming Training Course

Introduction

Core Programming and Syntax in R

• Variables
• Loops
• Conditional statements

Fundamentals of R

• What are vectors?
• Functions and packages in R

Preparing the Development Environment

• Installing and configuring R and RStudio
• Setting up Rserve

Classifying Data

• Moving data between R and Tableau
• Preparing and cleaning data
• Modeling and scripting in R

Regressions in R and Tableau

• Creating a regression model
• Visualizing regressions
• Predicting and comparing values

Clustering and Models

• Working with clustering algorithms
• Creating clusters
• Visualizing clustered data

Advanced Analytics with R and Tableau

• Using CRISP-DM
• Working with TDSP models
• Summarizing data

## Data Analysis with Tableau and Python Training Course

• Introduction
• Overview of Tableau and the TabPy API
• Exploring Use Cases of TabPy for Data Scientists
• Installing and Setting Up TabPy
• Setting Up Tableau Desktop with Python
• Configuring a TabPy Connection on Tableau
• Passing Expressions to Python
• Running Python Scripts on Tableau
• Estimating the Probability of Customer Churn Using Logistic Regression
• Getting Sentiment Scores for Reviews of Products Sold
• Scoring User Behavior using a Predictive Model
• Using Findings to Create an Efficient Conversion Funnel

## Data Analytics With R Training Course

### Day One: Language Basics

• Course Introduction
• Data Science Definition
• Process of Doing Data Science
• Introducing R Language
• Variables and Types
• Control Structures (Loops / Conditionals)
• R Scalars, Vectors, and Matrices
• Defining R Vectors
• Matrices
• String and Text Manipulation
• Character data type
• File IO
• Lists
• Functions
• Introducing Functions
• Closures
• lapply/sapply functions
• DataFrames
• Labs for all sections

### Day Two: Intermediate R Programming

• DataFrames and File I/O
• Data Preparation
• Built-in Datasets
• Visualization
• Graphics Package
• plot() / barplot() / hist() / boxplot() / scatter plot
• Heat Map
• ggplot2 package (qplot(), ggplot())
• Exploration With dplyr
• Labs for all sections

### Day Three: Advanced Programming With R

• Statistical Modeling With R
• Statistical Functions
• Dealing With NA
• Distributions (Binomial, Poisson, Normal)
• Regression
• Introducing Linear Regressions
• Recommendations
• Text Processing (tm package / Wordclouds)
• Clustering
• Introduction to Clustering
• KMeans
• Classification
• Introduction to Classification
• Naive Bayes
• Decision Trees
• Training using caret package
• Evaluating Algorithms
• R and Big Data
• Connecting R to databases
• Big Data Ecosystem
• Labs for all sections

## Duration

35 hours (usually 5 days including breaks)

## Requirements

• An understanding of data structures
• Experience with programming

Audience

• Programmers
• Data Scientists
• Engineers

## Overview

This training course prepares participants for web application development using Python programming with data analytics. The resulting data visualizations are a useful decision-making tool for top management.

## Course Outline

Day 1

1. Data Science
2. Data Science Team Composition (Data Scientist, Data Engineer, Data Visualizer, Process Owner)
3. Business Intelligence and Data Visualization
4. Data Visualization
1. Importance of Data Visualization
2. The Visual Data Presentation
3. The Data Visualization Tools (infographics, dials and gauges, geographic maps, sparklines, heat maps, and detailed bar, pie and fever charts)
4. Painting by Numbers and Playing with Colors in Making Visual Stories
5. Activity

Day 2

1. Data Visualization in Python Programming
1. Data Science with Python
2. Review of Python Fundamentals
1. Variables and Data Types (str, numeric, sequence, mapping, set types, Boolean, binary, casting)
2. Operators, Lists, Tuples, Sets, Dictionaries
3. Conditional Statements
4. Functions, Lambda, Arrays, Classes, Objects, Inheritance, Iterators
5. Scope, Modules, Dates, JSON, RegEx, PIP
6. Try / Except, Command Input, String Formatting
7. File Handling
8. Activity

Day 3

1. Python and MySQL
1. Creating Database and Table
2. Manipulating Database (Insert, Select, Update, Delete, Where Statement, Order by)
3. Drop Table
4. Limit
5. Joining Tables
6. Removing List Duplicates
7. Reverse a String
1. Data Visualization with Python and MySQL
1. Using Matplotlib (Basic Plotting)
2. Dictionaries and Pandas
3. Logic, Control Flow and Filtering
4. Manipulating Graph Properties (Font, Size, Color Scheme)
2. Activity

Day 4

1. Plotting Data in Different Graph Formats
• Histogram
• Line
• Bar
• Box Plot
• Pie Chart
• Donut
• Scatter Plot
• Area
• 2D / 3D Density Plot
• Dendrogram
• Map (Bubble, Heat)
• Stacked Chart
• Venn Diagram
• Seaborn
2. Activity

Day 5

1. Data Visualization with Python and MySQL
1. Group Work: Create a Top Management Data Visualization Presentation Using ITDI Local ULIMS Data
2. Presentation of Output

## Duration

14 hours (usually 2 days including breaks)

## Requirements

• An understanding of big data concepts (HDFS, Hive, etc.)
• An understanding of relational databases (MySQL, etc.)
• Experience with the Linux command line

## Overview

Sqoop is an open source software tool for transferring data between Hadoop and relational databases or mainframes. It can be used to import data from a relational database management system (RDBMS) such as MySQL or Oracle, or from a mainframe, into the Hadoop Distributed File System (HDFS). Thereafter, the data can be transformed in Hadoop MapReduce, and then re-exported back into an RDBMS.

In this instructor-led, live training, participants will learn how to use Sqoop to import data from a traditional relational database to Hadoop storage such as HDFS or Hive and vice versa.

By the end of this training, participants will be able to:

• Install and configure Sqoop
• Import data from MySQL to HDFS and Hive
• Export data from HDFS and Hive to MySQL
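
As a rough illustration of the import and re-export flow described in the overview, the sketch below assembles Sqoop command lines in Python. The JDBC URL, user name, table names, and HDFS paths are hypothetical placeholders, not values from this course. The script only builds and prints the commands; actually running them assumes a working Sqoop and Hadoop installation. Note that Sqoop's own command for the HDFS-to-database direction is `sqoop export`.

```python
import shlex

# Hypothetical connection details; replace with values for your environment.
JDBC_URL = "jdbc:mysql://dbhost/sales"
DB_USER = "etl_user"


def build_import(table, target_dir=None, hive=False):
    """Build the argument list for importing one MySQL table into Hadoop."""
    cmd = [
        "sqoop", "import",
        "--connect", JDBC_URL,
        "--username", DB_USER,
        "-P",                      # prompt for the password on the console
        "--table", table,
    ]
    if hive:
        cmd.append("--hive-import")            # load the data into a Hive table
    if target_dir:
        cmd += ["--target-dir", target_dir]    # HDFS directory for the files
    return cmd


def build_export(table, export_dir):
    """Build the argument list for exporting HDFS files into a MySQL table."""
    return [
        "sqoop", "export",
        "--connect", JDBC_URL,
        "--username", DB_USER,
        "-P",
        "--table", table,          # destination table, which must already exist
        "--export-dir", export_dir,
    ]


# Print the commands instead of executing them, so the sketch runs anywhere.
for command in (
    build_import("orders", target_dir="/data/orders"),
    build_import("orders", hive=True),
    build_export("orders_summary", "/data/orders_summary"),
):
    print(shlex.join(command))
```

To execute one of these commands on a machine where Sqoop is installed, the returned list could be passed to `subprocess.run`.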

Audience

• Data engineers

Format of the Course

• Part lecture, part discussion, exercises and heavy hands-on practice

## Course Outline

Introduction

• Moving data from legacy data stores to Hadoop

Installing and Configuring Sqoop

Overview of Sqoop Features and Architecture

Importing Data from MySQL to HDFS

Importing Data from MySQL to Hive

Exporting Data from HDFS to MySQL

Exporting Data from Hive to MySQL

Importing Incrementally with Sqoop Jobs

Troubleshooting

Summary and Conclusion

## Duration

14 hours (usually 2 days including breaks)

## Requirements

Good R knowledge.

## Overview

R is a free, open-source programming language for statistical computing, data analysis, and graphics. R is used by a growing number of managers and data analysts inside corporations and academia. R has a wide variety of packages for data mining.

## Course Outline

### Sources of methods

• Artificial intelligence
• Machine learning
• Statistics
• Sources of data

### Preprocessing of data

• Data Import/Export
• Data Exploration and Visualization
• Dimensionality Reduction
• Dealing with missing values
• R Packages

• Automatic or semi-automatic analysis of large quantities of data
• Extracting previously unknown interesting patterns
• groups of data records (cluster analysis)
• unusual records (anomaly detection)
• dependencies (association rule mining)

### Data mining

• Anomaly detection (Outlier/change/deviation detection)
• Association rule learning (Dependency modeling)
• Clustering
• Classification
• Regression
• Summarization
• Frequent Pattern Mining
• Text Mining
• Decision Trees
• Neural Networks
• Sequence Mining

Data dredging, data fishing, data snooping