Modelling Business Data the Vault Way

In 2007 I wrote a post about modelling the RECORD SOURCE in a Data Vault-like way (see POST), and I got a reaction from Rob Mol (see the comments on the POST). I responded by posting my answers and conclusions on his views (see POST).

One of my colleagues in the business, Ronald Damhof, posted a reaction to this POST on his BLOG (see his article). His view is that business metadata is data, just like the ‘normal’ data about, for example, Marketing or Finance. It can even be handled like the ‘normal’ data and stored in a Data Vault-like model (TEMPORAL, EXTENSIBLE, NORMALIZED). I agree with him on that, but one of the questions that remains is how these information streams should be linked :-)

We are able to build Data Vault models for Business Intelligence needs and visualise the CSFs and dashboards for the end users. We can even build Data Vault models for storing the Business Metadata, which reminds me of another project of mine, where the customer wanted to store this Business Metadata in a central spot. I did a (very small) tool investigation, and we found that the Microsoft Sharepoint solution was the cheapest and best-fitting option, since the other reports were also built towards a Microsoft Sharepoint Portal (using Cognos Cubes / Reports). Unfortunately this business metadata was not modelled in the Data Vault way, so … the solution was not used (until recently, when they started to ask for it again). Writing down this experience brings up another thought / insight / question: how do we handle unstructured data in the Data Vault model?

So, in my opinion, there are multiple questions to answer:

  1. Do we need to link ;-) the Business Metadata to the ‘normal’ data?
  2. If the answer to the first question is yes, how would we relate the Business Metadata to the ‘normal’ data model?
  3. How does the Data Vault model cope with unstructured data, e.g. WORD, PDF, Pictures, Sound, etc.?

To answer the first question from my own perspective: I think it would be wise to do so (and I think Ronald agrees with me on that). It gives context to the ‘normal’ data and provides valuable information about ownership, maintenance, definition changes, etc. In other words, Data Lineage: how did the element I am looking at originate from the source?

The second question is very tricky in my opinion. If we take the RECORD SOURCE as an example, we are not allowed to LINK to the SATs in which the RECORD SOURCE is used, and linking to a LINK would result in a recursive LINK (the RECORD SOURCE of the RECORD SOURCE of the LINK…). So, if we follow the discussion about RECORD SOURCES on this BLOG and on the BLOG by Ronald Damhof, the conclusion is to use the natural key for RECORD SOURCES instead of LINKING to a RECORD SOURCE HUB. Thinking about this, I conclude that the DATA VAULT model for the ‘normal’ data is itself a source for the Business Metadata about this item. If we want to know more about the RECORD SOURCE, we just create a standalone HUB in another DATA VAULT model and LOAD it with all unique natural keys found in the DATA VAULT model for the ‘normal’ data.
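To make that last step a bit more concrete, here is a minimal sketch of how such a standalone HUB could be loaded. Everything in it is an assumption for illustration only: the table names (sat_customer, link_customer_order, hub_record_source), the column names, the sample source names and the helper load_record_source_hub are hypothetical, and SQLite is used only because it runs anywhere. For brevity both models live in one in-memory database; in practice the HUB would sit in a separate model.

```python
import sqlite3


def load_record_source_hub(conn, vault_tables, load_dts):
    """Load every distinct RECORD SOURCE natural key found in the given
    'normal' vault tables into the standalone hub, skipping known keys."""
    for table in vault_tables:
        # Table names come from our own trusted list, not from user input.
        conn.execute(
            "INSERT OR IGNORE INTO hub_record_source "
            "(record_source, load_dts, loaded_from) "
            f"SELECT DISTINCT record_source, ?, ? FROM {table}",
            (load_dts, table),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical 'normal' Data Vault tables: the RECORD SOURCE is a plain
-- natural key column here, not a foreign key to a hub.
CREATE TABLE sat_customer (
    customer_key TEXT, name TEXT, load_dts TEXT, record_source TEXT);
CREATE TABLE link_customer_order (
    customer_key TEXT, order_key TEXT, load_dts TEXT, record_source TEXT);

-- Standalone hub of the business metadata model (hypothetical layout).
CREATE TABLE hub_record_source (
    record_source TEXT PRIMARY KEY,
    load_dts      TEXT NOT NULL,
    loaded_from   TEXT NOT NULL);
""")

conn.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?, ?)",
    [("C1", "Alice", "2009-03-01", "CRM"),
     ("C2", "Bob", "2009-03-01", "CRM"),
     ("C3", "Carol", "2009-03-02", "ERP")],
)
conn.executemany(
    "INSERT INTO link_customer_order VALUES (?, ?, ?, ?)",
    [("C1", "O1", "2009-03-02", "WEBSHOP"),
     ("C3", "O2", "2009-03-02", "ERP")],
)

vault_tables = ["sat_customer", "link_customer_order"]
load_record_source_hub(conn, vault_tables, "2009-03-13")
# Running the load again adds nothing, because known keys are ignored.
load_record_source_hub(conn, vault_tables, "2009-03-14")

hub_keys = [row[0] for row in conn.execute(
    "SELECT record_source FROM hub_record_source ORDER BY record_source")]
```

Note that nothing in the ‘normal’ model points at this HUB: the relation runs only through the shared natural key, so the recursive LINK problem described above does not arise.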

2 COMMENTS

Good discussion, Walter. Especially your observation: ‘But these are all metadata and perhaps do not belong in the Data Vault Model. Let’s ask the question to the user / customer and let them decide (I got the feeling it’s only maintenance that can benefit from this information, so maybe it’s better to skip this requirement…)’

I absolutely see this information as vital, but not vital in the sense that we should connect it to a record-source hub. Aside from the fact that loading this meta information is cumbersome, I feel that it does not serve the purpose.

So how is this meta-data vital? Well, every object in your warehouse should be made meaningful in terms of definition, domain-values, data-owner, ‘whatever-the-business-needs-in-terms-of-context-to-understand-the-meaning-of-the-objects’.

I call this Business metadata and I propose a Data Vault-like model for this kind of data (peeps need to report/query on it). However, this business metadata should ideally come from the source systems, but source systems nowadays generally do not provide this kind of metadata (although with SOA-like architectures this should be the case). So the business metadata-owner should maintain this data.

The Data Vault is excellent in storing this kind of metadata:
- it’s temporal (e.g. what was the definition a year ago? who altered it? what was the version of the record source then?)
- it’s extensible (you wanna have a model that can evolve; you wanna add attributes, construct new taxonomies)
- it’s normalized (you wanna define each object once and be able to link it to various locations in the warehouse)
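As a rough illustration of the ‘temporal’ point, the sketch below keeps every version of a definition and answers ‘what was the definition on a given date, and who altered it?’. It is only a sketch under assumptions: the class DefinitionVersion, the function definition_as_of, the term net_revenue and the owner names are all made up for the example, and a real model would store this in satellite-style tables rather than a Python list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DefinitionVersion:
    term: str         # natural key of the described object (the hub key)
    valid_from: date  # load date of this version
    definition: str
    changed_by: str


def definition_as_of(history, term, as_of):
    """Return the latest version of `term` loaded on or before `as_of`,
    or None when the term had no definition yet at that date."""
    candidates = [v for v in history
                  if v.term == term and v.valid_from <= as_of]
    return max(candidates, key=lambda v: v.valid_from, default=None)


# Versions are only ever appended, never updated in place.
history = [
    DefinitionVersion("net_revenue", date(2008, 1, 1),
                      "Gross revenue minus returns", "finance_owner"),
    DefinitionVersion("net_revenue", date(2008, 9, 1),
                      "Gross revenue minus returns and discounts",
                      "marketing_owner"),
]

a_year_ago = definition_as_of(history, "net_revenue", date(2008, 3, 13))
today = definition_as_of(history, "net_revenue", date(2009, 3, 13))
```

Because old versions are never overwritten, the question ‘what did this mean a year ago?’ stays answerable after every change. Adding a new attribute would mean adding another such history next to this one, which is the ‘extensible’ point.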

So… my point: you are right in saying it’s vital information. You are also right in concluding that relating it to the hubs ain’t the solution.

But we must store it… Business metadata… there are no silver bullets here; you have to design it as an integral part of the warehouse. The Data Vault is truly excellent for this purpose.

Oh by the way…..to make it even more complex (or not…). Handle your business metadata the same way as you handle your ‘normal’ data…

Like your post Walter…

March 13, March at 20:55

Just wanted to let you guys know that we’ve put up a site where we will be posting all the information about anchor modeling: http://www.anchormodeling.com

September 10, September at 17:09
