Data Modeling in the Modern Data Stack

  • Published 25 Nov 2024

COMMENTS • 49

  • @KahanDataSolutions
    @KahanDataSolutions  2 years ago +11

    Deliver more impact w/ modern data tools, without getting overwhelmed
    See how in The Starter Guide for Modern Data → www.kahandatasolutions.com/guide

  • @nlishivha1005
    @nlishivha1005 6 months ago

    I just started out in my career. Watching the video 4 months back gave me a high-level overview. I came back after a session with my seniors, and now I see the granular details. Thanks!

  • @johncosgrove7727
    @johncosgrove7727 1 year ago +9

    Hey Mike, this video was straight 🔥🔥🔥 - This is such an important topic for all businesses taking the journey right now and it really blows my mind how much understanding of modelling has been lost. Your overview of the pros and cons of each was a well balanced and to-the-point summary - I especially love that you correctly call out the source data and scale risks of OBT. I think too many peeps in the MDS are working for SaaS companies where they don't realise that 1) source systems are way less static in enterprise and 2) most enterprises outside of SaaS aren't just okay with massive pipeline code spaghettis. Great video!

  • @shreshti82
    @shreshti82 1 year ago +1

    Addresses the challenges and considerations involved in moving to the cloud, but there is not enough detail on modelling the data itself.

  • @nlopedebarrios
    @nlopedebarrios 1 year ago +3

    Great summary, thank you! Personally, I'm more comfortable with the Kimball approach, but the "modern data stack" part in the title caught my attention, and I think it's important to know which trends align better with modern tech. Also, I couldn't agree more that design is a key part of the process and, if done correctly, is going to prevent a lot of headaches. It would be nice to get deeper into the hybrid model you presented.

  • @firefoxmetzger9063
    @firefoxmetzger9063 1 year ago +4

    Just because storage is cheap doesn't mean it's unimportant. I/O is often the slowest part of OLAP and ETL queries and directly dictates compute cost. The two current OLAP pricing models (pay by minute or pay by bytes "scanned") both depend on how much data you load per query: the former because you pay for an idle cluster that is waiting for data, the latter because it literally bills you for how much data you load. Data modeling helps you get the compression and partitioning you need to control and minimize that cost.
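
To make the bytes-scanned point above concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an illustrative assumption: the price per TB, the table size, the partition and column counts, and the helper names `estimate_bytes_scanned` and `scan_cost_usd` are hypothetical and do not reflect any vendor's actual pricing or API. It also assumes evenly sized partitions and columns, which real tables rarely have.

```python
# Hypothetical numbers throughout: the price, table size, and layout are
# illustrative assumptions, not any vendor's actual pricing.
PRICE_PER_TB_USD = 5.0   # assumed pay-per-bytes-scanned rate
TB = 1024 ** 4


def estimate_bytes_scanned(table_bytes, partitions_total, partitions_read,
                           columns_total, columns_read):
    """Approximate bytes read, assuming evenly sized partitions and columns."""
    partition_fraction = partitions_read / partitions_total
    column_fraction = columns_read / columns_total
    return int(table_bytes * partition_fraction * column_fraction)


def scan_cost_usd(bytes_scanned, price_per_tb=PRICE_PER_TB_USD):
    """Cost of one query under a pay-per-bytes-scanned model."""
    return bytes_scanned / TB * price_per_tb


table_bytes = 10 * TB  # assumed size of a daily-partitioned, 50-column table

# Unpartitioned full scan: every partition and every column is read.
full_scan = estimate_bytes_scanned(table_bytes, 365, 365, 50, 50)

# Modeled table: the query prunes to 7 daily partitions and 5 columns.
pruned_scan = estimate_bytes_scanned(table_bytes, 365, 7, 50, 5)

print(f"full scan:   ${scan_cost_usd(full_scan):.2f}")
print(f"pruned scan: ${scan_cost_usd(pruned_scan):.2f}")
```

Under these assumed numbers the pruned query reads about 0.2% of the table, so the same reasoning applies to per-minute pricing: less data loaded means less time the cluster spends waiting on I/O.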

  • @nicky_rads
    @nicky_rads 2 years ago +12

    Solid overview! I’ve mainly used dimensional modeling with fact/dim tables.
    A good data model goes a long way within the analytics pipeline, important stuff. Thanks

    • @KahanDataSolutions
      @KahanDataSolutions  2 years ago +1

      Definitely. Appreciate you sharing your experience

    • @datasleek7950
      @datasleek7950 2 years ago +2

      I second that. As someone who started out as a DBA, I find that proper DB modeling (in either OLTP or OLAP) provides peace of mind in the future.

    • @summer_xo
      @summer_xo 2 years ago +1

      Get the data model right and the rest falls into place IMO 👍

  • @Joeymbryan
    @Joeymbryan 1 year ago

    Wow, this is spot on. I've been using the modern data stack for the last ~8 years or so. Starting out, I was definitely focusing on a star schema/denormalized approach with the MPP databases, but as I learned that they do best with fewer joins and can handle wide tables, I started striving for the OBT approach. In practice, the hybrid is what typically happens: there are so many dimensional tables, which vary in need from team to team, so especially in a larger enterprise the hybrid is almost guaranteed.
    Great video! Love the work you did here.

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago

      Appreciate the feedback! What you describe is really similar to my journey as well.

  • @theconfusedchannel6365
    @theconfusedchannel6365 1 year ago +1

    Good one. I think taking an actual dataset and walking it through each model would be great.

  • @davemeech
    @davemeech 1 year ago +1

    Oh man, it's so nice to get something substantial on YouTube for data modeling! Awesome awesome stuff.
    What do you think of Unistore? I have yet to dive in, and I'm also very fresh into the industry, but the prospect of combining OLTP and OLAP capabilities is certainly compelling!

  • @ArmstrongNigere
    @ArmstrongNigere 2 years ago +2

    Awesome video, really enjoyed it. Very clear and straight to the point.

  • @wingnut29
    @wingnut29 1 year ago +1

    Great overview! Thanks for educating us on the newer approaches. We currently are using Denormalized Modeling. This is due to the fact that all our current needs are around our ERP. The Marketing Dept wants to start collecting more website and estore analytics, which I believe will lead us to a Hybrid model.

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago

      Glad it was helpful! I still think denormalized is a great strategy.

  • @mrcool4uall
    @mrcool4uall 1 year ago

    Good one, Michael. Well summarized and to the point. I also think the hybrid approach is the best for the MDS.

  • @drewwolin3162
    @drewwolin3162 2 months ago

    This was great

  • @Rex_793
    @Rex_793 2 years ago +2

    great stuff - when are you going on Joe Reis and Matt Housley's podcast?

  • @datasleek7950
    @datasleek7950 2 years ago +2

    Great Job. Great Presentation.

  • @zulkhaireesulaiman8575
    @zulkhaireesulaiman8575 2 years ago +2

    thank you, incredibly helpful.

  • @Billbillbillhahagdvdve
    @Billbillbillhahagdvdve 3 months ago

    Excellent video!

  • @amitsaha7756
    @amitsaha7756 1 year ago +1

    In some cases, we observe another pattern where an industry-standard data model is used after the raw layer.

  • @bradleymiller437
    @bradleymiller437 2 years ago +1

    I literally hit this video so fast sincerely thinking I was going to learn "How to date a model." My brain went faster than my eyes and lost the game.

  • @johnytheripper
    @johnytheripper 2 years ago +2

    Thanks for addressing this often overlooked but important topic!
    I'm looking for some good sources on dimensional data modeling. Of course I have the Kimball books, but something more practical (books, courses...) and hands-on, perhaps with exercises on various business scenarios / sources would be great. Any ideas?

    • @deltagamma1442
      @deltagamma1442 1 year ago

      Any luck?

    • @johnytheripper
      @johnytheripper 1 year ago

      @deltagamma1442 Not really :/. I took some inspiration from the GitLab data handbook, as I'm mostly looking for SaaS use cases.

  • @Lima3578user
    @Lima3578user 1 year ago

    Great video, thank you. Can you upload a vlog on speech-to-text transcripts using AI?

  • @sunil-de
    @sunil-de 1 year ago

    Hey, the video was top tier. Can you suggest a good course for getting an in-depth understanding of DM?

  • @jimgillespie3540
    @jimgillespie3540 1 year ago +1

    *slow clap* thank you, fantastic.

  • @renvils
    @renvils 8 months ago

    Hey, your explanation is really calm and clear! Do you have a Udemy course?

  • @okj4521
    @okj4521 1 year ago +1

    Next: How to date a model!

  • @theukulelegod
    @theukulelegod 8 months ago

    What are the downsides of doing all three in one? Pull all the source systems' raw data (Inmon), then model fact and dim tables (Kimball), then build data marts? If storage is getting cheaper, wouldn't this be the best way?

  • @venkatvaddula6343
    @venkatvaddula6343 1 month ago

    Can someone please tell me what website/document was shown at the 13-second mark?

  • @SnowFlake-h4y
    @SnowFlake-h4y 1 year ago

    Could you please explain the differences between the different data models (Inmon, Kimball, 3NF, Dimensional Modeling, Data Vault)?

  • @S_B_S1
    @S_B_S1 7 months ago

    How do Data Products in their various guises fit with these data modelling concepts?

  • @nlopedebarrios
    @nlopedebarrios 1 year ago

    Hi Michael, the hybrid model is an interesting approach. What could be used to transition from the star schema to the OBT data marts? For example, views or materialized views? Or are they separate schemas? Also, in what scenario would this make sense?
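
One possible answer to the question above, sketched rather than taken from the video: keep the star schema as the source of truth and expose the OBT as a view that pre-joins the fact table to its dimensions. The example uses Python's built-in `sqlite3` module so it runs anywhere; all table, column, and view names (`fact_sales`, `dim_customer`, `dim_product`, `obt_sales`, `obt_sales_snapshot`) are hypothetical. SQLite has no materialized views, so a `CREATE TABLE ... AS SELECT` snapshot stands in for one here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A tiny, hypothetical star schema: one fact table and two dimensions.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key  INTEGER PRIMARY KEY,
    customer_name TEXT,
    region        TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
CREATE TABLE fact_sales (
    sale_id      INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    quantity     INTEGER,
    amount       REAL
);

INSERT INTO dim_customer VALUES (1, 'Acme', 'EMEA'), (2, 'Globex', 'APAC');
INSERT INTO dim_product  VALUES (10, 'Widget', 'Hardware'),
                                (20, 'Support plan', 'Services');
INSERT INTO fact_sales   VALUES (100, 1, 10, 3, 30.0),
                                (101, 2, 10, 1, 10.0),
                                (102, 1, 20, 1, 99.0);

-- The OBT layer as a plain view: one wide row per sale, joins done once.
CREATE VIEW obt_sales AS
SELECT f.sale_id, c.customer_name, c.region,
       p.product_name, p.category, f.quantity, f.amount
FROM fact_sales f
JOIN dim_customer c ON c.customer_key = f.customer_key
JOIN dim_product  p ON p.product_key  = f.product_key;

-- Stand-in for a materialized view: persist the joined result as a table.
CREATE TABLE obt_sales_snapshot AS SELECT * FROM obt_sales;
""")

# Consumers query the wide table without writing any joins.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM obt_sales GROUP BY region ORDER BY region"
).fetchall()
snapshot_count = conn.execute(
    "SELECT COUNT(*) FROM obt_sales_snapshot"
).fetchone()[0]
print(rows, snapshot_count)
```

The trade-off in this sketch is the usual one: the view always reflects the current star schema but pays for the joins on every query, while the snapshot is cheaper to read but must be rebuilt when the underlying tables change.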

  • @rpelegrini
    @rpelegrini 1 year ago +2

    Hello, really nice video about data modeling. I was looking for this.
    It's very difficult to define the right approach for data modeling; each case is different. In my experience, I built a lot of star schemas during my career, but nowadays I see a trend toward one big table in modern data warehouses like BigQuery, Redshift, or Synapse.
    Do you have the same impression?

    • @KahanDataSolutions
      @KahanDataSolutions  1 year ago +1

      Yep I'm seeing the same thing. Truly case by case. I think the concept of star schema/data models are still very applicable today mainly b/c of the organization and structure it brings rather than for any performance gains.

  • @medhatatef7737
    @medhatatef7737 2 years ago

    Thank you very much :))

  • @SujitA-h9d
    @SujitA-h9d 1 year ago

    How do I create an LDM in MagicDraw?

  • @largpack
    @largpack 9 months ago +4

    Just theoretical blah blah, in my opinion. Great for COOs to talk about stuff they have no idea about.

    • @moonfire5069
      @moonfire5069 8 months ago

      Until you are being grilled about these in an interview with companies like Airbnb, Netflix, and Facebook, you look like a completely clueless idiot, and you get shown the door.