Duration
21 hours (usually 3 days including breaks)
Requirements
Working knowledge of relational data structures and SQL
Overview
The course can be delivered using any tools, including free and open-source data mining software and applications.
Course Outline
Introduction
- Data mining as the analysis step of the KDD process (“Knowledge Discovery in Databases”)
- Subfield of computer science
- Discovering patterns in large data sets
Sources of methods
- Artificial intelligence
- Machine learning
- Statistics
- Database systems
What is involved?
- Database and data management aspects
- Data pre-processing
- Model and inference considerations
- Interestingness metrics
- Complexity considerations
- Post-processing of discovered structures
- Visualization
- Online updating
Data mining main tasks
- Automatic or semi-automatic analysis of large quantities of data
- Extracting previously unknown, interesting patterns:
  - Groups of data records (cluster analysis)
  - Unusual records (anomaly detection)
  - Dependencies (association rule mining)
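As an illustration of the "unusual records" task above, here is a minimal sketch of anomaly detection using z-scores. The function name `find_outliers`, the threshold of 2.0, and the `daily_orders` sample are assumptions made for this example; they are not part of the course material, and real data sets usually call for more robust methods.

```python
from statistics import mean, stdev

def find_outliers(values, threshold=2.0):
    """Return the values whose absolute z-score exceeds the threshold.

    The threshold of 2.0 is an illustrative assumption, not a standard.
    """
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]

# Hypothetical sample: order counts per day with one unusual day.
daily_orders = [102, 98, 105, 97, 101, 99, 250, 103]
outliers = find_outliers(daily_orders)
print(outliers)
```

With this sample, only the day with 250 orders lies more than two standard deviations from the mean.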
Common classes of data mining tasks
- Anomaly detection (Outlier/change/deviation detection)
- Association rule learning (Dependency modeling)
- Clustering
- Classification
- Regression
- Summarization
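To make association rule learning from the list above more concrete, the sketch below computes two common interestingness metrics, support and confidence, for a rule such as "bread implies milk". The helper names `support` and `confidence` and the `baskets` sample are hypothetical and chosen only for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0

# Hypothetical market-basket data: one set of items per purchase.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
print(rule_support, round(rule_confidence, 3))
```

Here bread and milk appear together in 2 of 4 baskets (support 0.5), and 2 of the 3 baskets that contain bread also contain milk (confidence about 0.667).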
Use and applications
- Able Danger
- Behavioral analytics
- Business analytics
- Cross Industry Standard Process for Data Mining
- Customer analytics
- Data mining in agriculture
- Data mining in meteorology
- Educational data mining
- Human genetic clustering
- Inference attack
- Java Data Mining
- Open-source intelligence
- Path analysis (computing)
- Reactive business intelligence