Dimensional Modeling to Data Vault Evolution

Поділитися
Вставка
  • Опубліковано 4 лис 2024

КОМЕНТАРІ • 63

  • @avhilash
    @avhilash Місяць тому

    best explanation of moving from dimensional modelling towards data vault in a step by step manner..

  • @bvenkateshx
    @bvenkateshx 4 роки тому +4

    The moment you reverse the business key to reverse the relation.. it is a pragmatic way ti show evolution. Loved this.

  • @vieworacle
    @vieworacle 7 років тому +2

    Crisp, clear and short introductory explanation of the how Data Vault has evolved from Dimensional Modeling. Personal thanks to the engaging and excellent presenter.

    • @TheYourbox
      @TheYourbox 2 роки тому

      Dimensional modelling was SAP. A complete company without any idea how data models woks at all.,

  • @JeFFiNat0R
    @JeFFiNat0R 2 роки тому +2

    Wow this was so easy to understand. Great story telling and explanation. Thank you!!

  • @harfenueglebarfenuegle8192
    @harfenueglebarfenuegle8192 Рік тому

    Thank you. This is the best explanation of the problem that Data Vault actually solves.

  • @EMB3D
    @EMB3D 3 роки тому

    this is one of the most important video for one who comes from a the kimbal world ...

  • @habibjanjua
    @habibjanjua 3 роки тому

    Best video on adding new source attributes

  • @pamelasprague5814
    @pamelasprague5814 4 роки тому +3

    Brilliant! So helpful in helping me to understand *both* modeling techniques.

  • @bhanu4j
    @bhanu4j 11 років тому +10

    Thanks for sharing the concept. My queries are:
    How is this different from Snowflake?
    The detail lies is implementation because of more joins and hence performance for faster read.

  • @DataVaultAcademy
    @DataVaultAcademy  11 років тому +6

    Data Vault is typically used for the main DW layer. This layer is integrated, non-volatile, time-variant, and subject oriented (@ Inmon) and auditable, agile, atomic etc. The business meaning implied by the "conformed" dimension in this example represents the common business rule transformations. These are intended to be completed within the DW layer as part of the target or BDW area. In the same layer there is also a RAW area (traceability).
    The Data Marts remain Dimensional Models.

  • @boseashish
    @boseashish 17 днів тому

    Mr. Hans, your explanations and videos are always better than those of other experts in DV. However, I have a question - how does it impact the ETL in case of dimension model appends vs. DV satellite additions?

  • @paulohsgoes1959
    @paulohsgoes1959 7 років тому

    Short, straightforward and objective as every great work should be done. Congrats

  • @robwozniak9912
    @robwozniak9912 2 роки тому

    Thank you. This made it real and practical for me.

  • @rupanag
    @rupanag 8 місяців тому

    Can you please help me with these . Would be great if you can explain with some examples.
    1. Why link to link relationship is not recommended in RDV?
    2. In BDV bridge table, if we are storing only hash keys( not natural keys), then how in fact/dimension we are going to get natural keys?

  • @SolucaoEmAberto
    @SolucaoEmAberto 10 років тому

    Fantastic! It just solved the DV to DM on the Dimensions!!! The dimension is a plain cross product between the satellites on the hub!

  • @chucka6415
    @chucka6415 2 роки тому

    I love this great presentation.

  • @swaminaidu9454
    @swaminaidu9454 8 місяців тому

    Awesome explanation. Great. Thank you

  • @Nesikme
    @Nesikme 8 місяців тому

    Brilliant explanation !!!! Thank you very much

  • @AnVinhNguyen
    @AnVinhNguyen 8 років тому +4

    absolutely stunning explanation. Thank you a lot

  • @MartijnEuyen
    @MartijnEuyen 10 років тому

    Very good! Very clear story on DV modeling - thanks a lot for sharing.

  • @SathishKumar-gj6te
    @SathishKumar-gj6te 8 років тому +1

    Thanks for the Great Introduction into Data Vault. The videos in the academy are great. Appreciate if you could put a logical structure of Videos that can be read on a sequential basis

  • @emilagaev5201
    @emilagaev5201 Рік тому

    great explanation, thanks a lot

  • @phaninany
    @phaninany 4 роки тому

    How can we design the already connected dimension tables to the fact table in Data vault ? You were just explaining about the extended dimensions with vault

  • @aztec9424
    @aztec9424 4 роки тому

    Pretty good explanation on DV. Though it talks about Hub and Satellites it doesn't seem to talk about the Links. Is there a separate video on that? Covering Links would make this topic complete in my opinion.

    • @HansHultgren_GA
      @HansHultgren_GA 4 роки тому +1

      This particular topic didn't concern links however there are other videos in this channel that do. Also you can find more on the Genesee Academy youtube channel.

  • @gitanjalikhanna3009
    @gitanjalikhanna3009 7 років тому +1

    Thanks very much for this clear explanation. Could you please also create a video explaining how FACTS are derived from DV or vice versa? Thanks

  • @kimsung2384
    @kimsung2384 4 роки тому +3

    I could see this becoming a performance nightmare ...

  • @SSSoM3
    @SSSoM3 6 років тому

    Great and easy to watch video, big thanks

  • @PCoetzee24
    @PCoetzee24 6 років тому

    Very clear explanation. Thanks! Simple Question: If I aggregate Sales upwards, via the link tables, to a unique customer contained within the hub, how do I know what version of a Satellite record to use. A unique Hub record can now have multiple Satellite records by having different Load Dates.

    • @PCoetzee24
      @PCoetzee24 6 років тому

      An SCD2 is painful to maintain and use, but very effective at tracking history, as new versions have unique surrogates that can be associated with transactions. The hub-Sat model does not associate different satellite versions for the same hub record to a transaction. Only to the unique hub record.
      So If I use a Sat table to track changes to a specific customer contained in the hub, I cannot see what transactions occurred when during the life of the customer.
      Is my understanding correct??

  • @haniattar
    @haniattar 5 років тому +1

    very well said.. I really enjoyed this sessions, and learned a lot.

  • @dsinghr
    @dsinghr 4 роки тому

    best 11.23 mins i spent today

  • @nickr2957
    @nickr2957 4 роки тому

    Very good. Thank you.

  • @beemac79
    @beemac79 7 років тому

    Really well explained, great example!

  • @shubhamg931
    @shubhamg931 3 роки тому

    Awesome!

  • @barrybatsma8947
    @barrybatsma8947 6 років тому

    Great explanation!

  • @4suc6
    @4suc6 5 років тому +1

    This approach is no way fundamentally different from dimensional modeling. Just a variation. If a new hub comes, you still have to change all the link tables . Small changes may be are easier but Ralph never said you must prop everything in a giant dimension. This is more or less what DV improves, isolating attributes based on the same frequency of changes (satellites) and linking this to a dimension (hub) instead of direct link to the facts tables like classical MDM does. Makes sense, but packaging this like new modeling paradigm and calling it all different 😂😂😂😂. Presentation would benefit from talking about frequency of changes grouping. This is the essential part

  • @gauravnist438
    @gauravnist438 9 років тому

    It is really a great lecture

  • @aktharhussain6342
    @aktharhussain6342 11 років тому

    v good. i have just started reading on data vaults
    akthar

  • @farwanqv
    @farwanqv 6 років тому

    I didn't get if all or any of the smaller customer dimensions are type 2 dimensions?

  • @anishgopinath6801
    @anishgopinath6801 6 років тому

    wowow!!! simple and perfect explanation

  • @nareshbuchuk
    @nareshbuchuk 6 років тому

    This is Great quickly digestible

  • @Palthewall
    @Palthewall 6 років тому

    It's very similar to EAV ( Entity Attribute Value) Model

  • @alexanderyelich817
    @alexanderyelich817 3 роки тому

    THANKS!!!

  • @terminalrecluse
    @terminalrecluse 8 місяців тому

    Is this really the evolution of star or Snowflake schema’d DWH design? A data vault modeled DWH would require far more joins and likely perform poorly. What about that? Whereas a single dimensional model has fewer tables is easier to partition faster to query.

  • @eugenea8264
    @eugenea8264 3 роки тому +5

    No, no, no, no... No on so many levels. And people cannot see the lecturer is making mistakes because he is giving no examples. The final structure you are making is turning dimension table into a fact table and you would have a record there for every customer. So basically if you have a million customers, you would have a million records there. If you have half a million customers from Melbourne you would end up writing "Melbourne" half a million times. Good luck with aggregating data by value.
    But that is only half a problem.
    Say you want to add a "gender" value to your customer. On your initial structure you would need to go through a million records of customers and add a new value to a new field. And you say it is difficult. And you offer an easier way, you create a new table and you add the gender there. So you create a new table with a key for every customer and then and a value of their gender, that would take exactly the same amount of calculation - it would be 1 million records processed. Where is a simplification?

    • @romeoalexander6175
      @romeoalexander6175 2 роки тому

      Melbourne would not be written a million times. What this video did not mention is that there would be an intermediate link table. That is, there would be a customer-location link table with a million records linking those customers to the same (integer) id. Then there is a final location SAT table that contains a single record associating that id with “Melbourne”.
      As for your second point, the benefits come from modularity. Instead of having to modify a “dim_customers” file every time you want to add new attributes, you can write brand new “sat_customer_location” and “link_customer_location” files without modifying the original files. In a fast paced agile environment, writing new files is much easier than modifying existing files.

  • @saravanan2011YOUable
    @saravanan2011YOUable 5 років тому

    Perfect but It seems 3NF like hybrid approach.

  • @botrrun9399
    @botrrun9399 4 роки тому

    are there any modeling tools that support Data Vault?

  • @kshitijpathak8646
    @kshitijpathak8646 2 роки тому +1

    I am not convinced that data vault modeling is the evolution of dimensional modeling. The main purpose for dimensional modeling ( denormalizing) is to reduce the number of joins, make it less expensive and hence get performance gain while querying reports in realtime. With the above data vault approach, you are again increasing the number of joins ( seems like you are again somewhat normalizing). Please let me know if my logic is correct.

    • @rh.m6660
      @rh.m6660 3 місяці тому

      Really late answer but the dimensional vs vault discussion is not relevant. Many times if not always we need a semantic dimensional layer on a vault model. Vault is not just a modelling technique but also an architectural direction. It only works best with cdc and business keys on source. The beautiful thing about Vault is that it somewhat integrates mdm into the model part, stores history almost by default.

  • @ranurobin
    @ranurobin 5 років тому

    Good one

  • @maverickrider4591
    @maverickrider4591 5 років тому

    Thank You

  • @amitshrestha2165
    @amitshrestha2165 11 років тому

    Well done.

  • @TheYourbox
    @TheYourbox 2 роки тому

    Don't you understand how ridiculous dimensional modelling is compared to 3rdNF. Dimensional modelling is simple nonsense. Data Vault is a fair good concept, but very related to 3rdNF. I like Data Vault, because you can do things in parallel, but dimensional modelling is simple pure nonsense. One of those unprecedented failures in IT.

  • @vickysharma2949
    @vickysharma2949 Рік тому

    wonderfully explained!!