Getting Certified as a Data Vault Engineer – Day 1

Today I started a course on the data modelling technique Data Vault by Dan Linstedt. As a teacher of Dimensional Modelling (Ralph Kimball) for the Capgemini Academy, I find this an interesting course, because Data Vault claims to combine the advantages of both Dimensional Modelling and 3NF modelling. Besides teaching the dimensional model, I am currently working for a customer that is implementing the IBM Banking Data Warehouse Model for their Enterprise Data Warehouse.

Dan Linstedt himself and Hans Hultgren, from the Genesee Academy, are in The Netherlands this week to spread the word about "THE VAULT". They are teaching the methodology / vision to the Dutch Tax Authority, which is going to use the Data Vault architecture for its Enterprise Data Warehouse (thanks to Ronald Damhof), and I had the opportunity to attend this course as well. Officially the course consists of two parts in the US, but it has been slightly altered to meet the Dutch requirements. After three days of courses, those of us who pass the certification test will be certified Data Vault Engineers and mentioned on the Genesee Academy website.


The course started with a business view on "The Vault". Dan is clear about its ancestry: the Data Vault is based on two established methodologies ("I stand on the shoulders of two giants."). Data Vault is not meant to replace 3NF or Dimensional Modelling; it complements these methods at the Enterprise Data Warehouse level. They all have their advantages and disadvantages, and in my opinion they all have their own target audience.

 

Defining Data Vault:

The Data Vault is a:

  • Detail Oriented
  • Historical Tracking
  • Uniquely Linked Set of
  • Normalised Tables, that
  • Supports one or more Functional Business Areas.

 

The Data Vault is designed specifically for the Enterprise Data Warehouse, and the DV architecture is flexible, scalable, consistent and adaptable to the needs of the enterprise (taken from a TDAN article).

The Data Vault Architecture relies on only three basic elements:

  1. HUB
    • Any business concept of value to this business identified by a key.
  2. LINK
    • Physical representation of a many-to-many relationship aka Transaction / Events.
  3. SATELLITE
    • Descriptive information about a HUB / LINK key. Everything you would track as a Type 2 dimension goes in here.
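
A minimal sketch of how the three elements could relate to each other, written in Python. The field names (load_date, record_source, attributes), the sample keys and dates, and the helper current_version are assumptions made for illustration; they are not taken from the course material.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Hub:
    """A business concept, identified only by its business key."""
    business_key: str
    load_date: datetime      # assumed bookkeeping column
    record_source: str       # assumed bookkeeping column


@dataclass(frozen=True)
class Link:
    """A many-to-many relationship (transaction / event) between hubs."""
    hub_keys: Tuple[str, ...]
    load_date: datetime
    record_source: str


@dataclass(frozen=True)
class Satellite:
    """Descriptive attributes of a hub or link key, one row per change."""
    parent_key: str
    load_date: datetime
    record_source: str
    attributes: Dict[str, str]


def current_version(rows: List[Satellite], parent_key: str) -> Optional[Satellite]:
    """Return the most recently loaded satellite row for a key (hypothetical helper)."""
    matching = [r for r in rows if r.parent_key == parent_key]
    return max(matching, key=lambda r: r.load_date, default=None)


# Hypothetical sample data: two hubs, one link, and a satellite history.
customer = Hub("CUST-123", datetime(2008, 1, 1), "SALES")
product = Hub("PROD-9", datetime(2008, 1, 1), "LOGISTICS")
order = Link((customer.business_key, product.business_key),
             datetime(2008, 1, 5), "SALES")

history = [
    Satellite("CUST-123", datetime(2008, 1, 1), "SALES", {"city": "Utrecht"}),
    Satellite("CUST-123", datetime(2008, 3, 1), "SALES", {"city": "Amsterdam"}),
]

latest = current_version(history, "CUST-123")
print(latest.attributes["city"])  # Amsterdam
```

The hub and link rows never change once loaded; history builds up only in the satellite, where every change adds a new row instead of overwriting the old one.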

During this first day of the course we looked at the Data Vault Architecture and Concept from a business perspective. In the coming days we’ll dive into the technical details of building the Data Vault (and thereby the Enterprise Data Warehouse), and of how information is extracted from this Enterprise Data Warehouse. My ears and eyes are wide open for this ‘new’ vision and "One integrated statement of the facts".

My questions after reading the material on Dan’s website and listening to him on the first day:

  • How would Data Vault handle / cope with REVIVAL / REUSE of business keys?
  • Why is there not a future-dated ENDDATE in the SATELLITES?
  • How would Data Vault handle / cope with multiple identifiers?
    • SALES uses 123ABC
    • MARKETING uses ABD467
    • LOGISTICS uses DEFx175
    • And all are referring to the same person…
  • How would Data Vault handle / cope with multiple source systems delivering the same information?
  • How would Data Vault handle / cope with versioning of the EDW data model?
  • How would Data Vault handle / cope with currency conversions?
  • Does Data Vault have a solution for / is it recommended to change the source system to the primary one?
  • How does Model Driven Architecture fit into place with Data Vault?

I hope all these questions and maybe more will be answered in the next two days…

 

 


4 Responses to “Getting Certified as a Data Vault Engineer – Day 1”

  1. This sounds very interesting. I am just getting familiar with the concepts of DataVault (the popularity seems to be rising), but it is a very promising concept.

  2. Walter says:

    It sure is a nice architectural view on how enterprise data warehouses should connect / follow the business and by that represent the business value of all the information kept within the organisation.

  3. Rob says:

    Nice blog.
    Are all your questions from the first day answered yet?
    Well, some need some more discussion (like Model Driven Architecture).
    Let's find some way to discuss those.

  4. Daniel says:

    I couldn’t understand some parts of this article Certified as a Data Vault Engineer – Day 1 | Dutch Business Intelligence Blog, but I guess I just need to check some more resources regarding this, because it sounds interesting.
