Introduction
Overview of Data Cleaning
- Why Is Data Cleaning Important?
Case Study: When Big Data Is Dirty
Developing a Thorough Data Cleaning Strategy
Common Data Cleaning Tools
- Drake
- OpenRefine
- pandas (for Python)
- dplyr (for R)
Achieving High Data Integrity
- Complete
- Correct
- Accurate
- Relevant
- Consistent
Automating the Data Cleaning Process
Monitoring Your Data Cleaning System
Summary and Conclusion