Comparing 3 Types of Data Modeling (Normalized vs Star Schema vs Data Vault)

Поділитися
Вставка
  • Опубліковано 20 жов 2024

КОМЕНТАРІ • 20

  • @KahanDataSolutions
    @KahanDataSolutions  Рік тому

    Get my Modern Data Essentials training (for free) & start building more reliable data architectures
    www.ModernDataCommunity.com

  • @Milhouse77BS
    @Milhouse77BS Рік тому +2

    You had me at star schema. Thanks, a nice and short introduction.

  • @Mirminman
    @Mirminman Рік тому +2

    Nice video! I have a a question:
    Is it necessary to use FK in the fact tables of dimensional or denormalized models?
    I saw this question in many forums on the web and many say "YES", arguing that this allows the data models to not load primary keys into the fact tables that are not in the dimension tables, but others say "NO ", arguing that if the OLTP data model is well built (or should be), the validation of the transactional model itself (non-null PK) should be sufficient and the tables of the dimensional model can be focused on the speed of loading the Data Warehouse. without the need to put primary/foreign keys in the tables, just paying attention to defining default values in each dimension (Dummy value) so that the fact table takes it.
    The only reason I really use primary keys, is for my dimension tables where I have business keys and surrogate keys and have to deal with SCD1 /SCD2.
    Is it necessary to add said FK restriction to those dimensions in the fact?
    Greetings,

    • @ivani3237
      @ivani3237 9 місяців тому

      no need, especially maybe you know maybe not - snowflake for example doesn't even have a enabled FK constraints (only disabled)

  • @IronmanMahesh
    @IronmanMahesh Рік тому +1

    Hope you make more videos on data vault topic covering with good examples

  • @kbdub16
    @kbdub16 Рік тому

    @KahanDataSolutions - great videos, thank you for them. Question per Link tables, are they natively handled by graph databases? Link tables in a data-vault seem like something a graph database would naturally be designed for, no? and further, one could relatively easily conceptualize mapping document based structures in a doc db to vector based structures in a vector/graph db, no?

  • @luvincste
    @luvincste Рік тому

    the characteristic of the data vault model is not clear to me, would these elements be each in a separate database? like what is distinguishing it from a normalized db

  • @saltrocklamp199
    @saltrocklamp199 9 місяців тому

    What's the "previous video" mentioned? I see the one specifically about the Kimball method, but not the others.

    • @KahanDataSolutions
      @KahanDataSolutions  9 місяців тому +1

      ua-cam.com/video/IdCmMkQLvGA/v-deo.htmlsi=3YjxTwRk7BQlwvi5

  • @dominhquanho9319
    @dominhquanho9319 2 місяці тому

    So whats the difference between Inkon data warehouse vs traditional relational db?

  • @rachellougee6687
    @rachellougee6687 Рік тому

    I like the data vault example, you explained it well. But I am confused about the Normalized example as I don't see how each of these tables can relate to each other, e.g. Order to OrderItem table, no key to link them. Overall it's a good introduction

    • @KahanDataSolutions
      @KahanDataSolutions  Рік тому +1

      Thanks Rachel! In hindsight, I probably could have found a better example for normalized. But in the real world, each table would (or at leas should) have a unique primary key. Therefore Order would have an OrderId column, OrderItem would have OrderItemId. However, if it does not have one in the source system, you could create what's called a surrogate key and essentially generate a unique column based on all of the other columns in the row.
      In terms of relating to each other - best case scenario is there are ids in the tables. For example, OrderItem table has a column for OrderId that you can join the two together. However if that doesn't exist, you'll have to work with the business to understand what ideally should be matched together so that you can join the tables together accordingly. As an example, maybe you find you need to join on something like Product and Location and Date.
      This can get tricky, but that's the lovely game of data modeling.

    • @rachellougee6687
      @rachellougee6687 Рік тому

      @@KahanDataSolutions thanks Michael. I think I got sidetracked by that specific example. What you just said makes sense. Thank you for the clarification.

    • @KahanDataSolutions
      @KahanDataSolutions  Рік тому

      @@rachellougee6687 You bet, thanks for watching and for the comment!

  • @muralichiyan
    @muralichiyan 4 місяці тому

    Kimbel and star schema both are same..?

    • @saraschandrabaru3618
      @saraschandrabaru3618 12 днів тому

      I see where your confusion comes. The speaker mis-spoke when describing Inmon design. His design has a centralized base which is normalized. From this marts are built. Kimball says marts first and link them via confirmed dimensions. Bottom line, the consumption layer is dimensional both in Inmon and Kimall.

  • @andynelson2340
    @andynelson2340 Рік тому

    I wish there was a more interactive way to learn/practice modeling other than read 10 books and build out 2 models per year at work

    • @KahanDataSolutions
      @KahanDataSolutions  Рік тому +7

      Haha that's too real. The next best option IMO is to create your own mini-version to get the basics.
      For example, I remember when I was first learning, I created my own local SQL Server database w/ diff tables based on my favorite hip hop artists. I created tables for their albums, hometowns, names, etc. Tried to keep it separate as if they were "data source tables". Probably only about 5 or so tables with less than 20 records in each. Literally wrote insert scripts with hardcoded values. But it was enough to play around w/ concepts, form relationships and get the gist. Plus it doubled to help me learn how to actually create tables, add primary/foreign keys/etc.
      Oftentimes you can learn 80-90% of what you need w/ just a small dataset like that vs trying to mimic an enterprise volume - which is what I think most of us think we need and get hung up on. But it's the edge cases & optimizations that you need the work experience for, and cant get in a book. Plus every company has their own nuances.
      Anyways - that was long winded but hope that was helpful or inspires somebody with new ideas

  • @walterwine
    @walterwine Рік тому

    Normalized can also be identified as snowflake schema, no?

    • @ecozturk
      @ecozturk Рік тому

      not really. Snowflake is an extension of the star schema, with a fact table in the center but the dimensions tables around can have ramifications. A simple example to illustrate this : (FACT_table)