ETL, ELT or ELTL, what’s the big difference

Last monday, my collegue blogger, Ronald Damhof, wrote two article about the T (Transformation) and his view / vision on the location of this (big) T in a data warehouse architecture (article 1). The second article also emphasized the role of the TOOLs in this market (article 2) and how they might contribute to the big T theory.

First I must admit that reading this posts was a recognition of a lot Data Vault principles. Ronald is a true Data Vault adapt in that way (Dan will be proud of you I guess, but don’t mistake the words with the archormodelling principles ;-) ). That makes it funny, but makes me wonder as well; are you using the Anchormodelling principles on your job, or are you using the DV words? Why did you choose to use these words in the BLOG, if you had the opportunity to use DV words as well…

I fully agree with Ronald that the T should be after the L of the Enterprise Data Warehouse. Like Dan would like to say, “I want the single version of the fact, because there is NO single version of the Truth”. This is because every person might have a slightly different approach to the same e.g. KPI or subject. Registrations / Operational data on the other hand don’t have that kind of intepretation / context problems. The Operational System just has to work 24/7 for the Business and e.g. firstname or DOB (date of birth) have no other meaning in diffent contexts.

I also recognise the nice architectural approach taken by Ronald on the EDW+ layer/instance. From the source landing area, data is flowing towards a data distribution and staging layer and consolidated in the stable layer. I see the staging layer as the source structured layer where the first step in Change Data Capture is done, before it is consolidated in the stable layer (which holds all unTransformed data). To have re-usable pieces of information, you could indeed use an ETL+ layer. This way we would create a efficient and effective layer which would combine the SINGLE VERSION of THE FACT with the GENERAL VERSION OF THE TRUTH !!!

The only thing I disagree on with Ronald is the part about the tools. When creating the layer model you will see the target (DV) model is almost always the same (at least for the HUBs and LINKs), which means they can be generated by ETL tools if configurated the right way. We could put the pressure on the developers of the tools to create these DV templates, but the most ETL developers in the field should have at least one or two DV templates for e.g. SAS / Informatica / OWB / AbInitio. This process untill the EDW (or even EDW+) isn’t that difficult it needs nifty ETL tools, it can even be done with the simplest SQL scripts i.m.h.o. The BIG T between EDW (or EDW+) and the datamart(s) is where the ETL tools deliver their strength for creating individual T processes easily by using the Graphical User Interface.

Maybe my vision is something to discuss in a nice new topic on Business Intelligence and Data Warehousing. In my opinion we could ‘build’/develop standard ETL processes for contributing to the DV thought of Dan Linstedt and benefit of these standard templates for Data Warehousing in the 21st century (using the Data Vault principles)

You can leave a response, or trackback from your own site.

4 Responses to “ETL, ELT or ELTL, what’s the big difference”

  1. He Walter – Just read your post.

    Dont think we disagree. ETL vendors should wake up and smell the (standard modelling) roses….They dont for now. Just supplement ETL libraries with a HUB, link and SAT load…is it that hard?

    Even creating a kimball like dimension can be generated when the big T is minimal. ETL vendors…wake up.

    The thing I mention is about model driven generation tools that seem to generate models and process instructions (e.g. ETL). It’s amazing to see where these tools position the big T (with me I position it always between the hub and the spoke). They just act like the Big T is not there and request for the customer to put in already-transformed data before the model driven tooling can start with it….

    Well Duh – I say…..That’s just plain nasty selling.

  2. Ronald,

    We fully agree on the TOOLS (after reading your comment(s)). But the thing I want to emphasise is the fact that the TOOL developers are lazy and we, as the true developers, should indeed wake up the TOOL developers to deliver the DV loads as STANDAARD templates and even working KIMBALL processes (at the moment we use OWB WITHOUT the SCD mechnism because it’s not bugfree and it is TOO hard to use).

    I was talking to developers the other day about the strength of ETL tools. The strength of these tools is not within the tools in my opinion. At the moment my feeling is the developer that works with the tool and underlying layer has the power to create GOOD ETL processes / mappings / procedures. Without the knowledge what the tool does underneath the surface, you are creating crappy ETL. That’s why I mentioned the fact about using plain SQL statements. You have control over these statements without the overhead of the ETL tool (did you ever create an Oracle Warehouse Builder mapping and deployed it … You should see the difference in real code (needed for the ETL) and the overhead (needed for the ETL tool to do monitoring and control on the mapping / job).

    So concluding; we need to shake the ETL tools awake and let them smell the power of a GOOD ETL developer. They need to understand the business and not dictate the business. In this example it is again the old fashioned IT versus Business discussion… Who is in charge? The business should be able to work with the product and benefit from it, IT (ETL tools) should only be facilitating to this prupose / goal and not dictating the rules!!!

    BACK TO THE DRAWING BOARD !!! or like we use to say in the Netherlands “Vroegah was alles beter, toen had je nog krachtige programmeertalen die deden wat je zei dat ze moesten doen…”

  3. Matt Casters says:

    I find it interesting that you figured out the main drawback of a number of ELT tools (OWB & all): they are indeed technology driven, technology enabled and technology limited when in fact they should help you to deploy your data modeled through requirements.
    To go from that insight to pointing out that it’s better to just code everything manually is a bridge too far IMHO.
    Please don’t let your understandably (!) negative experiences with OWB lead you to the wrong conclusions.

  4. Shahid says:

    Hi Matt,
    Nice discussion, I will stay tuned to your website for further on this….

    Regards,
    Shahid.

Leave a Reply

Powered by WordPress & The Best MLM Companies