Deep Learning NLP: Build and Deploy a BERT COVID Q&A System

Course content

  • Welcome to the Course
  • Accessing and Saving the COVID Dataset (OPTIONAL)
  • Data Pre-Processing (OPTIONAL)
  • Exploratory Data Analysis
  • Elasticsearch Document Store
  • Question Answering Engine
  • Streamlit User Interface Design
  • Deployment

Machine Learning, Deep Learning + AWS Sagemaker

Course content

  • Introduction
  • Basic Python + Pandas + Plotting
  • Machine Learning: Numpy + Scikit Learn
  • Machine Learning: Classification + Time Series + Model Diagnostics
  • Unsupervised Learning
  • Natural Language Processing + Regularization
  • Deep Learning
  • Deep Learning (TensorFlow) – Convolutional Neural Nets
  • Deep Learning: Recurrent Neural Nets
  • Deep Learning: PyTorch Introduction
  • Deep Learning: Transfer Learning with PyTorch Lightning
  • Pixel Level Segmentation (Semantic Segmentation) with PyTorch
  • Deep Learning: Transformers and BERT
  • Bayesian Learning and probabilistic programming
  • Model Deployment
  • AWS Sagemaker (for Model Deployment)
  • Final Thoughts

NLP with Python and TextBlob Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • An understanding of NLP concepts
  • Python programming experience

Audience

  • Data scientists
  • Developers

Overview

TextBlob is a Python NLP library for processing textual data. It provides a simple API that makes it easy to perform NLP tasks, such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, etc.

This instructor-led, live training (online or onsite) is aimed at data scientists and developers who wish to use TextBlob to implement and simplify NLP tasks, such as sentiment analysis, spelling corrections, text classification modeling, etc.

By the end of this training, participants will be able to:

  • Set up the necessary development environment to start implementing NLP tasks with TextBlob.
  • Understand the features, architecture, and advantages of TextBlob.
  • Learn how to build text classification systems using TextBlob.
  • Perform common NLP tasks (tokenization, WordNet integration, sentiment analysis, spelling correction, etc.).
  • Execute advanced implementations with simple APIs and a few lines of code.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Overview of TextBlob features and architecture
  • NLP fundamentals

Getting Started

  • Installing TextBlob
  • Importing libraries and data

Building Text Classification Models

  • Loading data and creating classifiers
  • Evaluating classifiers
  • Updating classifiers with new data
  • Using feature extractors

Performing NLP Tasks using TextBlob

  • Tokenization  
  • WordNet integration  
  • Noun phrase extraction  
  • Part-of-speech tagging  
  • Sentiment analysis  
  • Spelling correction
  • Translation and language detection

APIs and Advanced Implementations

  • Sentiment analyzers  
  • Tokenizers
  • Noun phrase chunkers  
  • POS taggers  
  • Parsers  
  • Blobber

Troubleshooting

Summary and Next Steps

Scaling Data Pipelines with Spark NLP Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • Familiarity with Apache Spark
  • Python programming experience

Audience

  • Data scientists
  • Developers

Overview

Spark NLP is an open source library, built on Apache Spark, for natural language processing with Python, Java, and Scala. It is widely used for enterprise and industry verticals, such as healthcare, finance, life science, and recruiting.

This instructor-led, live training (online or onsite) is aimed at data scientists and developers who wish to use Spark NLP, built on top of Apache Spark, to develop, implement, and scale natural language text processing models and pipelines.

By the end of this training, participants will be able to:

  • Set up the necessary development environment to start building NLP pipelines with Spark NLP.
  • Understand the features, architecture, and benefits of using Spark NLP.
  • Use the pre-trained models available in Spark NLP to implement text processing.
  • Learn how to build, train, and scale Spark NLP models for production-grade projects.
  • Apply classification, inference, and sentiment analysis on real-world use cases (clinical data, customer behavior insights, etc.).

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.

Course Outline

Introduction

  • Spark NLP vs NLTK vs spaCy
  • Overview of Spark NLP features and architecture

Getting Started

  • Setup requirements
  • Installing Spark NLP
  • General concepts

Using Pre-trained Pipelines

  • Importing required modules
  • Default annotators
  • Loading a pipeline model
  • Transforming texts

Building NLP Pipelines

  • Understanding the pipeline API
  • Implementing NER models
  • Choosing embeddings
  • Using word, sentence, and universal embeddings

Classification and Inference

  • Document classification use cases
  • Sentiment analysis models
  • Training a document classifier
  • Using other machine learning frameworks
  • Managing NLP models
  • Optimizing models for low-latency inference

Troubleshooting

Summary and Next Steps

Natural Language Processing (NLP) with Python spaCy Training Course

Duration

14 hours (usually 2 days including breaks)

Requirements

  • Python programming experience.
  • A basic understanding of statistics
  • Experience with the command line

Audience

  • Developers
  • Data scientists

Overview

This instructor-led, live training (online or onsite) is aimed at developers and data scientists who wish to use spaCy to process very large volumes of text to find patterns and gain insights.

By the end of this training, participants will be able to:

  • Install and configure spaCy.
  • Understand spaCy’s approach to Natural Language Processing (NLP).
  • Extract patterns and obtain business insights from large-scale data sources.
  • Integrate the spaCy library with existing web and legacy applications.
  • Deploy spaCy to live production environments to predict human behavior.
  • Use spaCy to pre-process text for deep learning.

Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.
  • To learn more about spaCy, please visit: https://spacy.io/

Course Outline

Introduction

  • Defining “Industrial-Strength Natural Language Processing”

Installing spaCy

spaCy Components

  • Part-of-speech tagger
  • Named entity recognizer
  • Dependency parser

Overview of spaCy Features and Syntax

Understanding spaCy Modeling

  • Statistical modeling and prediction

Using the spaCy Command Line Interface (CLI)

  • Basic commands

Creating a Simple Application to Predict Behavior 

Training a New Statistical Model

  • Data (for training)
  • Labels (tags, named entities, etc.)

Loading the Model

  • Shuffling and looping 

Saving the Model

Providing Feedback to the Model

  • Error gradient

Updating the Model

  • Updating the entity recognizer
  • Extracting tokens with rule-based matcher

Developing a Generalized Theory for Expected Outcomes

Case Study

  • Distinguishing Product Names from Company Names

Refining the Training Data

  • Selecting representative data
  • Setting the dropout rate

Other Training Styles

  • Passing raw texts
  • Passing dictionaries of annotations

Using spaCy to Pre-process Text for Deep Learning

Integrating spaCy with Legacy Applications

Testing and Debugging the spaCy Model

  • The importance of iteration

Deploying the Model to Production

Monitoring and Adjusting the Model

Troubleshooting

Summary and Conclusion

Natural Language Processing (NLP) – AI/Robotics Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

Knowledge and awareness of NLP principles and an appreciation of AI applications in business.

Overview

This classroom-based training session will explore NLP techniques in conjunction with the application of AI and robotics in business. Delegates will work through computer-based examples and case study exercises using Python.

Course Outline

Detailed training outline

  1. Introduction to NLP
    • Understanding NLP
    • NLP Frameworks
    • Commercial applications of NLP
    • Scraping data from the web
    • Working with various APIs to retrieve text data
    • Working with and storing text corpora, saving content and relevant metadata
    • Advantages of using Python, and an NLTK crash course
  2. Practical Understanding of a Corpus and Dataset
    • Why do we need a corpus?
    • Corpus Analysis
    • Types of data attributes
    • Different file formats for corpora
    • Preparing a dataset for NLP applications
  3. Understanding the Structure of a Sentence
    • Components of NLP
    • Natural language understanding
    • Morphological analysis – stem, word, token, speech tags
    • Syntactic analysis
    • Semantic analysis
    • Handling ambiguity
  4. Text data preprocessing
    • Corpus: raw text
      • Sentence tokenization
      • Stemming for raw text
      • Lemmatization of raw text
      • Stop word removal
    • Corpus: raw sentences
      • Word tokenization
      • Word lemmatization
    • Working with Term-Document/Document-Term matrices
    • Text tokenization into n-grams and sentences
    • Practical and customized preprocessing
  5. Analyzing Text data
    • Basic features of NLP
      • Parsers and parsing
      • POS tagging and taggers
      • Named entity recognition
      • N-grams
      • Bag of words
    • Statistical features of NLP
      • Concepts of Linear algebra for NLP
      • Probabilistic theory for NLP
      • TF-IDF
      • Vectorization
      • Encoders and Decoders
      • Normalization
      • Probabilistic Models
    • Advanced feature engineering and NLP
      • Basics of word2vec
      • Components of word2vec model
      • Logic of the word2vec model
      • Extension of the word2vec concept
      • Application of word2vec model
    • Case study: Application of bag of words: automatic text summarization using simplified and true Luhn’s algorithms
  6. Document Clustering, Classification and Topic Modeling
    • Document clustering and pattern mining (hierarchical clustering, k-means clustering, etc.)
    • Comparing and classifying documents using TF-IDF, Jaccard and cosine distance measures
    • Document classification using Naïve Bayes and Maximum Entropy
  7. Identifying Important Text Elements
    • Reducing dimensionality: Principal Component Analysis, Singular Value Decomposition, and non-negative matrix factorization
    • Topic modeling and information retrieval using Latent Semantic Analysis
  8. Entity Extraction, Sentiment Analysis and Advanced Topic Modeling
    • Positive vs. negative: degree of sentiment
    • Item Response Theory
    • Part of speech tagging and its application: finding people, places and organizations mentioned in text
    • Advanced topic modeling: Latent Dirichlet Allocation
  9. Case studies
    • Mining unstructured user reviews
    • Sentiment classification and visualization of Product Review Data
    • Mining search logs for usage patterns
    • Text classification
    • Topic modelling
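
As an illustration of the document comparison measures listed in module 6, the following is a minimal sketch that uses only the Python standard library. The function names (`tokenize`, `tf_idf_vectors`, `cosine`, `jaccard`) and the three sample sentences are hypothetical, and the weighting shown (term frequency times log(N / document frequency)) is one common TF-IDF variant, not necessarily the formulation used in the course.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def jaccard(a, b):
    """Jaccard similarity between the word sets of two texts."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical sample documents for illustration only.
docs = [
    "the battery life is great",
    "great battery and great screen",
    "delivery was slow",
]
vectors = tf_idf_vectors(docs)

print(cosine(vectors[0], vectors[1]))  # shares "battery" and "great"
print(cosine(vectors[0], vectors[2]))  # no shared terms
print(jaccard(docs[0], docs[1]))
```

Both measures score the first two documents as more alike than the first and third, which share no words. Jaccard ignores how often a word occurs, whereas the TF-IDF weighting gives less weight to terms that appear in many documents.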

Natural Language Processing (NLP) with Deep Dive in Python and NLTK Training Course

Duration

35 hours (usually 5 days including breaks)

Requirements

There are no specific requirements needed to attend this course.

Overview

By the end of the training, delegates are expected to be sufficiently equipped with the essential Python concepts and able to use NLTK to implement most NLP and ML-based operations. The training aims to give not just executional knowledge but also the logical and operational knowledge of the underlying technology.

Course Outline

Introduction to Python

Introduction

1 – Installing Python

2 – Numbers

3 – Strings

4 – Slicing up Strings

5 – Lists

6 – Installing PyCharm

Conditional Statements

7 – if elif else

Iterations

8 – for

9 – Range and While

10 – Comments and Break

11 – Continue

Functions

12 – Functions

13 – Return Values

14 – Default Values for Arguments

15 – Variable Scope

16 – Keyword Arguments

17 – Flexible Number of Arguments

18 – Unpacking Arguments

19 – My trip to Walmart and Sets

20 – Dictionary

21 – Modules

Playing with Requests and Files

22 – Download an Image from the Web

23 – How to Read and Write Files

24 – Downloading Files from the Web

Exceptions

28 – Exceptions

Object Oriented Programs

29 – Classes and Objects

30 – init

31 – Class vs Instance Variables

32 – Inheritance

33 – Multiple Inheritance

34 – threading

Playing around with Python

35 – Unpack List or Tuples

36 – Zip (and yeast infection story)

37 – Lambda

38 – Min, Max, and Sorting Dictionaries

39 – Pillow

40 – Cropping Images

41 – Combine Images Together

42 – Getting Individual Channels

43 – Awesome Merge Effect

44 – Basic Transformations

45 – Modes and Filters

46 – struct

47 – map

48 – Bitwise Operators

49 – Finding Largest or Smallest Items

50 – Dictionary Calculations

51 – Finding Most Frequent Items

52 – Dictionary Multiple Key Sort

53 – Sorting Custom Objects

Add Ons:

54 – Database Connectivity and Querying for MySQL

55 – Quick look into Regular Expressions

56 – Playing around with REST API

Writing a Web Crawler

Natural Language Processing and NLTK

Introduction to NLP (examples in Python of course)

  1. Simple Text Manipulation
    1. Searching Text
    2. Counting Words
    3. Splitting Texts into Words
    4. Lexical dispersion
  2. Processing complex structures
    1. Representing text in Lists
    2. Indexing Lists
    3. Collocations
    4. Bigrams
    5. Frequency Distributions
    6. Conditionals with Words
    7. Comparing Words (startswith, endswith, islower, isalpha, etc…)
  3. Natural Language Understanding
    1. Word Sense Disambiguation
    2. Pronoun Resolution
  4. Machine translations (statistical, rule based, literal, etc…)
  5. Exercises
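
The word counting, bigram and frequency distribution topics above can be previewed with a short sketch. It assumes nothing beyond the Python standard library and uses `collections.Counter` as a stand-in for a dedicated frequency distribution class. The sample sentence and variable names are hypothetical.

```python
import re
from collections import Counter

# Hypothetical sample text for illustration only.
text = "the cat sat on the mat and the cat slept"

# Splitting text into words: keep runs of letters and apostrophes.
words = re.findall(r"[a-z']+", text.lower())

# Frequency distribution: how often each word occurs.
freq = Counter(words)
print(freq.most_common(1))  # [('the', 3)]

# Bigrams: pairs of adjacent words.
bigrams = list(zip(words, words[1:]))
bigram_freq = Counter(bigrams)
print(bigram_freq.most_common(1))  # [(('the', 'cat'), 2)]

# Conditionals with words: filter using string comparison methods.
s_words = [w for w in words if w.startswith("s")]
print(s_words)  # ['sat', 'slept']
```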

NLP in Python in examples

  1. Accessing Text Corpora and Lexical Resources
    1. Common sources for corpora
    2. Conditional Frequency Distributions
    3. Counting Words by Genre
    4. Creating own corpus
    5. Pronouncing Dictionary
    6. Shoebox and Toolbox Lexicons
    7. Senses and Synonyms
    8. Hierarchies
    9. Lexical Relations: Meronyms, Holonyms
    10. Semantic Similarity
  2. Processing Raw Text
    1. printing
    2. truncating
    3. extracting parts of string
    4. accessing individual characters
    5. searching, replacing, splitting, joining, indexing, etc…
    6. using regular expressions
    7. detecting word patterns
    8. stemming
    9. tokenization
    10. normalization of text
    11. Word Segmentation (especially in Chinese)
  3. Categorizing and Tagging Words
    1. Tagged Corpora
    2. Tagged Tokens
    3. Part-of-Speech Tagset
    4. Python Dictionaries
    5. Words to Properties mapping
    6. Automatic Tagging
    7. Determining the Category of a Word (Morphological, Syntactic, Semantic)
  4. Text Classification (Machine Learning)
    1. Supervised Classification
    2. Sentence Segmentation
    3. Cross Validation
    4. Decision Trees
  5. Extracting Information from Text
    1. Chunking
    2. Chinking
    3. Tags vs Trees
  6. Analyzing Sentence Structure
    1. Context Free Grammar
    2. Parsers
  7. Building Feature Based Grammars
    1. Grammatical Features
    2. Processing Feature Structures
  8. Analyzing the Meaning of Sentences
    1. Semantics and Logic
    2. Propositional Logic
    3. First-Order Logic
    4. Discourse Semantics
  9. Managing Linguistic Data
    1. Data Formats (Lexicon vs Text)
    2. Metadata

Artificial Intelligence – the most applied stuff – Data Analysis + Distributed AI + NLP Training Course

Duration

21 hours (usually 3 days including breaks)

Overview

This course is aimed at developers and data scientists who wish to understand and implement AI within their applications. Special focus is given to Data Analysis, Distributed AI and NLP.

Course Outline

  1. Distributed big data
    1. Data mining methods (training single systems + distributed prediction: traditional machine learning algorithms + Mapreduce distributed prediction)
    2. Apache Spark MLlib
  2. Recommendations and Advertising:
    1. Natural language
    2. Text clustering, text categorization (labeling), synonyms
    3. User profile restore, labeling system
    4. Recommendation algorithms
    5. Ensuring the accuracy of “lift” between and within categories
    6. How to create closed loops for recommendation algorithms
  3. Logistic regression, RankingSVM
  4. Feature recognition (deep learning and automatic feature recognition for graphics)
  5. Natural language
    1. Chinese word segmentation
    2. Theme model (text clustering)
    3. Text classification
    4. Extract keywords
    5. Semantic analysis, semantic parser, word2vec (word to vector)
    6. RNN long short-term memory (LSTM) architecture

Natural Language Processing (NLP) Training Course

Duration

21 hours (usually 3 days including breaks)

Requirements

No background in NLP is required.

Required: Familiarity with any programming language (Java, Python, PHP, VBA, etc…).

Expected: Reasonable maths skills (A-level standard), especially in probability, statistics and calculus.

Beneficial: Familiarity with regular expressions.

Overview

This course has been designed for people interested in extracting meaning from written English text, though the knowledge can be applied to other human languages as well.

The course will cover how to make use of text written by humans, such as blog posts, tweets, etc.

For example, an analyst can set up an algorithm that automatically reaches a conclusion based on an extensive data source.

Course Outline

Short Introduction to NLP methods

  • word and sentence tokenization
  • text classification
  • sentiment analysis
  • spelling correction
  • information extraction
  • parsing
  • meaning extraction
  • question answering

Overview of NLP theory

  • probability
  • statistics
  • machine learning
  • n-gram language modeling
  • naive Bayes
  • maxent classifiers
  • sequence models (Hidden Markov Models)
  • probabilistic dependency
  • constituent parsing
  • vector-space models of meaning
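
As a taste of the theory topics above, the following is a minimal sketch of a naive Bayes text classifier with add-one (Laplace) smoothing, written with only the Python standard library. The function names `train` and `predict`, the labels and the four training sentences are hypothetical. It is a teaching sketch under those assumptions rather than a production implementation.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Collect the counts naive Bayes needs from (text, label) pairs."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        # Log prior: share of training documents with this label.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data for illustration only.
examples = [
    ("great phone love it", "pos"),
    ("love the screen", "pos"),
    ("terrible battery hate it", "neg"),
    ("hate the slow delivery", "neg"),
]
model = train(examples)

print(predict(model, "love the phone"))     # pos
print(predict(model, "hate slow battery"))  # neg
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which matters once documents are longer than a few words.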