21 hours (usually 3 days including breaks)
- Some familiarity with programming.
- Linguists and programmers
It is estimated that unstructured data accounts for more than 90 percent of all data, much of it in the form of text. Blog posts, tweets, social media, and other digital publications continuously add to this growing body of data.
This instructor-led, live course centers on extracting insights and meaning from this data. Using the R language and Natural Language Processing (NLP) libraries, we combine concepts and techniques from computer science, artificial intelligence, and computational linguistics to algorithmically understand the meaning behind text data. Data samples are available in various languages per customer requirements.
By the end of this training, participants will be able to prepare data sets (large and small) from disparate sources, then apply the right algorithms to analyze the data and report on its significance.
Format of the Course
- Part lecture, part discussion, heavy hands-on practice, occasional tests to gauge understanding
- NLP and R vs Python
Installing and Configuring RStudio
Installing R Packages Related to Natural Language Processing (NLP)
An Overview of R’s Text Manipulation Capabilities
Getting Started with an NLP Project in R
Reading and Importing Data Files into R
Text Manipulation with R
Document Clustering in R
Part-of-Speech Tagging in R
Sentence Parsing in R
Working with Regular Expressions in R
Named-Entity Recognition in R
Topic Modeling in R
Text Classification in R
Working with Very Large Data Sets
Visualizing Your Results
Integrating R with Other Languages (Java, Python, etc.)
Summary and Conclusion