Duration
28 hours (usually 4 days including breaks)
Requirements
- An understanding of data warehousing concepts
- An understanding of database and data modeling concepts
Audience
- Data modelers
- Data warehousing specialists
- Business Intelligence specialists
- Data engineers
- Database administrators
Overview
Data Vault modeling is a database modeling technique designed for long-term historical storage of data from multiple source systems. A Data Vault stores a single version of the facts, or “all the data, all the time”. Its flexible, scalable, consistent and adaptable design combines the best aspects of third normal form (3NF) and the star schema.
In this instructor-led, live training, participants will learn how to build a Data Vault.
By the end of this training, participants will be able to:
- Understand the architecture and design concepts behind Data Vault 2.0, and its interaction with Big Data, NoSQL and AI.
- Use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse.
- Develop a consistent and repeatable ETL (Extract, Transform, Load) process.
- Build and deploy highly scalable and repeatable warehouses.
Format of the course
- Part lecture, part discussion, exercises and heavy hands-on practice
Course Outline
Introduction
- The shortcomings of existing data warehouse data modeling architectures
- Benefits of Data Vault modeling
Overview of Data Vault architecture and design principles
- SEI/CMM/Compliance
Data Vault applications
- Dynamic Data Warehousing
- Exploration Warehousing
- In-Database Data Mining
- Rapid Linking of External Information
Data Vault components
- Hubs, Links, Satellites
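As a rough illustration of the three component types, the sketch below creates one of each in an in-memory SQLite database. All table and column names (hub_customer, customer_hk, record_source and so on) are hypothetical choices made for this example, not part of the course material. A Hub holds a business key, a Link relates two Hubs, and a Satellite holds descriptive attributes versioned by load date.

```python
import sqlite3

# Hypothetical schema: every table and column name here is an illustrative assumption.
SCHEMA = """
CREATE TABLE hub_customer (
    customer_hk   TEXT PRIMARY KEY,      -- surrogate (hash) key
    customer_bk   TEXT NOT NULL UNIQUE,  -- business key from the source system
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE hub_order (
    order_hk      TEXT PRIMARY KEY,
    order_bk      TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE link_customer_order (
    customer_order_hk TEXT PRIMARY KEY,  -- one row per distinct relationship
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    order_hk      TEXT NOT NULL REFERENCES hub_order(order_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
CREATE TABLE sat_customer_details (
    customer_hk   TEXT NOT NULL REFERENCES hub_customer(customer_hk),
    load_date     TEXT NOT NULL,         -- each change adds a new row
    record_source TEXT NOT NULL,
    name          TEXT,
    city          TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# List the tables that were created.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

Keeping descriptive attributes out of the Hub means a new source attribute can be added as another Satellite without altering existing tables.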
Building a Data Vault
Modeling Hubs, Links and Satellites
Data Vault reference rules
How components interact with each other
Modeling and populating a Data Vault
Converting 3NF OLTP to a Data Vault Enterprise Data Warehouse (EDW)
Understanding load dates, end-dates, and join operations
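One common way to think about load dates and end-dates is that a Satellite row is valid from its load date until the load date of the next row for the same key. The sketch below derives end-dates that way and uses them for a point-in-time lookup. The function names, the tuple layout, and the half-open interval convention are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def add_end_dates(rows):
    """Turn (key, load_date, payload) rows into (key, load_date, end_date, payload).

    The end date of a row is the load date of the next row for the same key;
    the most recent row gets None, meaning "still current".
    """
    result = []
    ordered = sorted(rows, key=itemgetter(0, 1))
    for _, group in groupby(ordered, key=itemgetter(0)):
        versions = list(group)
        for current, following in zip(versions, versions[1:] + [None]):
            end_date = following[1] if following is not None else None
            result.append((current[0], current[1], end_date, current[2]))
    return result


def as_of(timeline, key, date):
    """Return the payload valid for `key` on `date`, or None if no row applies."""
    for row_key, load_date, end_date, payload in timeline:
        if row_key == key and load_date <= date and (end_date is None or date < end_date):
            return payload
    return None


# ISO-formatted date strings compare correctly as plain text.
rows = [
    ("C-1", "2024-01-03", "Bergen"),
    ("C-1", "2024-01-01", "Oslo"),
]
timeline = add_end_dates(rows)
print(timeline)
print(as_of(timeline, "C-1", "2024-01-02"))
```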
Business keys, relationships, link tables and join techniques
Query techniques
Load processing and query processing
Overview of Matrix Methodology
Getting data into data entities
Loading Hub Entities
Loading Link Entities
Loading Satellites
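The three loading steps above can be sketched as insert-only routines that are safe to rerun. This is a minimal sketch under stated assumptions: the table names, the MD5-based hash_key helper, and the hash_diff change-detection column are hypothetical choices for this example, not a prescribed implementation.

```python
import hashlib
import sqlite3

# Hypothetical, simplified schema used only for this example.
DDL = """
CREATE TABLE hub_customer (hk TEXT PRIMARY KEY, bk TEXT NOT NULL,
    load_date TEXT NOT NULL, record_source TEXT NOT NULL);
CREATE TABLE hub_order (hk TEXT PRIMARY KEY, bk TEXT NOT NULL,
    load_date TEXT NOT NULL, record_source TEXT NOT NULL);
CREATE TABLE link_customer_order (hk TEXT PRIMARY KEY, customer_hk TEXT NOT NULL,
    order_hk TEXT NOT NULL, load_date TEXT NOT NULL, record_source TEXT NOT NULL);
CREATE TABLE sat_customer (customer_hk TEXT NOT NULL, load_date TEXT NOT NULL,
    record_source TEXT NOT NULL, city TEXT, hash_diff TEXT NOT NULL,
    PRIMARY KEY (customer_hk, load_date));
"""


def hash_key(*business_keys):
    """Hash one or more normalized business keys (MD5 is an assumption)."""
    normalized = "||".join(key.strip().upper() for key in business_keys)
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


def load_hub(conn, hub_table, business_key, load_date, source):
    # hub_table must come from trusted code; new business keys only.
    conn.execute(
        f"INSERT OR IGNORE INTO {hub_table} VALUES (?, ?, ?, ?)",
        (hash_key(business_key), business_key, load_date, source),
    )


def load_link(conn, customer_bk, order_bk, load_date, source):
    # One row per distinct relationship; repeats are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO link_customer_order VALUES (?, ?, ?, ?, ?)",
        (hash_key(customer_bk, order_bk), hash_key(customer_bk),
         hash_key(order_bk), load_date, source),
    )


def load_satellite(conn, customer_bk, city, load_date, source):
    # Insert a new version only when the attributes differ from the latest row.
    hk = hash_key(customer_bk)
    hash_diff = hashlib.md5(city.encode("utf-8")).hexdigest()
    latest = conn.execute(
        "SELECT hash_diff FROM sat_customer WHERE customer_hk = ? "
        "ORDER BY load_date DESC LIMIT 1",
        (hk,),
    ).fetchone()
    if latest is None or latest[0] != hash_diff:
        conn.execute(
            "INSERT INTO sat_customer VALUES (?, ?, ?, ?, ?)",
            (hk, load_date, source, city, hash_diff),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(DDL)
# Three daily batches for the same customer; the city changes on the third day.
for day, city in [("2024-01-01", "Oslo"), ("2024-01-02", "Oslo"), ("2024-01-03", "Bergen")]:
    load_hub(conn, "hub_customer", "C-1", day, "crm")
    load_hub(conn, "hub_order", "O-9", day, "crm")
    load_link(conn, "C-1", "O-9", day, "crm")
    load_satellite(conn, "C-1", city, day, "crm")

hub_rows = conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0]
link_rows = conn.execute("SELECT COUNT(*) FROM link_customer_order").fetchone()[0]
sat_rows = conn.execute("SELECT COUNT(*) FROM sat_customer").fetchone()[0]
print(hub_rows, link_rows, sat_rows)
```

Rerunning the same batch leaves the Hub and Link unchanged, while the Satellite gains a row only when an attribute value actually changes.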
Using SEI/CMM Level 5 templates to obtain repeatable, reliable, and quantifiable results
Developing a consistent and repeatable ETL (Extract, Transform, Load) process
Building and deploying highly scalable and repeatable warehouses
Closing remarks