Duration
35 hours (usually 5 days including breaks)
Requirements
An analytical way of thinking.
Basic knowledge of statistics and mathematical analysis.
Overview
KNIME is a free and open-source data analytics, reporting and integration platform. It integrates various components for machine learning and data mining through its modular data pipelining concept. A graphical user interface and the use of JDBC allow users to assemble nodes that blend different data sources, including preprocessing (ETL: Extraction, Transformation, Loading), for modeling, data analysis and visualization with little or no programming. As an advanced analytics tool, KNIME can to some extent be considered an alternative to SAS.
Since 2006, KNIME has been used in pharmaceutical research. It is also used in other areas such as CRM customer data analysis, business intelligence and financial data analysis.
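The modular pipelining idea described above can be pictured with a small sketch. This is not KNIME code (KNIME workflows are assembled graphically, without programming); it is only a Python analogy in which each function plays the role of a node and its output is passed to the next one. The sample data and all function names are hypothetical and chosen for illustration.

```python
import csv
import io
from functools import reduce

# Hypothetical sample data standing in for a file or database source.
RAW_CSV = """customer,region,amount
Alice,North,120.50
Bob,South,80.00
Carol,North,42.25
"""

def read_rows(text):
    """Extraction step: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def convert_amounts(rows):
    """Transformation step: convert the amount column from text to float."""
    return [{**row, "amount": float(row["amount"])} for row in rows]

def total_by_region(rows):
    """Aggregation step: sum the amounts per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["amount"]
    return totals

def run_pipeline(data, nodes):
    """Feed the output of each step into the next, like connected nodes."""
    return reduce(lambda current, node: node(current), nodes, data)

result = run_pipeline(RAW_CSV, [read_rows, convert_amounts, total_by_region])
print(result)
```

Because each step only depends on the output of the previous one, steps can be swapped or reused independently, which is the property the node-based approach relies on.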
Course Outline
- Introduction to data processing and data analysis
- Fundamentals of the KNIME platform
- Installation and configuration
- Overview of the interface
- Discussion of tool integration
- Building workflows
- Methodology of creating business models and data modeling
- Documentation
- import and export workflows
- Basic nodes
- Design ETL processes
- Data mining
- Data import
  - from files
  - from relational databases using SQL
  - creating SQL queries
- Advanced nodes
- Data analysis:
  - data preparation
  - data check-up
  - statistical data examination
  - data modeling
- Introduction to Flow Variables and Loops
- Advanced process automation
- Visualization Features
- Open source data sources
- Data mining basics
  - selected types of data mining tasks and processes
- Getting more knowledge from data
  - Web mining
  - Social network analysis (SNA)
  - Text mining
  - Data visualization on graphs
- Installing extensions and integrations
  - R
  - Java
  - Python
  - Gephi
  - Neo4j
- Reporting
  - Overview
  - BIRT integration
  - KNIME WebPortal
- Conclusion and Q&A session