Duration
21 hours (usually 3 days including breaks)
Requirements
No background in NLP is required.
Required: Familiarity with any programming language (Java, Python, PHP, VBA, etc.).
Expected: Reasonable maths skills (A-level standard), especially in probability, statistics and calculus.
Beneficial: Familiarity with regular expressions.
Overview
This course has been designed for people interested in extracting meaning from written English text, though the knowledge can be applied to other human languages as well.
The course covers how to make use of text written by humans, such as blog posts and tweets.
For example, an analyst can set up an algorithm that automatically reaches a conclusion from an extensive data source.
Course Outline
Short Introduction to NLP methods
- word and sentence tokenization
- text classification
- sentiment analysis
- spelling correction
- information extraction
- parsing
- meaning extraction
- question answering
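As a rough illustration of the first topic above, word and sentence tokenization can be sketched with regular expressions alone. This is a minimal sketch, not course material: the function names `split_sentences` and `tokenize_words` are hypothetical, and the patterns are deliberately naive (abbreviations such as "Dr." would be split incorrectly).

```python
import re

def split_sentences(text):
    # Hypothetical helper: split on whitespace that follows ., ! or ?
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize_words(sentence):
    # Hypothetical helper: a token is either a word (optionally with an
    # internal apostrophe, e.g. "Isn't") or a single punctuation mark.
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", sentence)

text = "NLP is fun. Isn't it? Tweets are short!"
sentences = split_sentences(text)
tokens = [tokenize_words(s) for s in sentences]

for sentence, sentence_tokens in zip(sentences, tokens):
    print(sentence, "->", sentence_tokens)
```

Production tokenizers handle many more edge cases (abbreviations, URLs, hashtags), which is why tokenization is treated as a topic in its own right.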
Overview of NLP theory
- probability
- statistics
- machine learning
- n-gram language modeling
- naive Bayes
- maximum entropy (maxent) classifiers
- sequence models (Hidden Markov Models)
- probabilistic dependency and constituent parsing
- vector-space models of meaning
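To give a flavour of the theory topics above, n-gram language modeling can be sketched as a bigram model estimated by simple counting. This is a minimal sketch under stated assumptions: the toy corpus, the whitespace tokenization, the `<s>` and `</s>` boundary markers, and the function names are all illustrative, and no smoothing is applied, so unseen word pairs get probability zero.

```python
from collections import Counter

def train_bigram_model(sentences):
    # Count how often each token appears as a context (unigrams)
    # and how often each adjacent pair appears (bigrams).
    unigrams = Counter()
    bigrams = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(tokens[:-1])  # every token except </s> is a context
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def bigram_probability(unigrams, bigrams, prev, word):
    # Maximum likelihood estimate: P(word | prev) = count(prev, word) / count(prev)
    if unigrams[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / unigrams[prev]

# Toy corpus, invented for illustration only.
corpus = ["the cat sat", "the cat ran", "the dog sat"]
unigrams, bigrams = train_bigram_model(corpus)

p_cat = bigram_probability(unigrams, bigrams, "the", "cat")
p_sat = bigram_probability(unigrams, bigrams, "cat", "sat")
p_unseen = bigram_probability(unigrams, bigrams, "dog", "ran")
print(p_cat, p_sat, p_unseen)
```

The zero probability for the unseen pair is the usual motivation for smoothing techniques in language modeling.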