Dimensional Modeling

  • Published 26 Jan 2025

COMMENTS • 352

  • @chadrob
    @chadrob 5 years ago +158

    I've seen hours of videos on data warehousing. These are the most valuable 56 minutes.

    • @MightyMike55
      @MightyMike55 4 years ago +3

      Well said

    • @vaibhav8382
      @vaibhav8382 4 years ago +1

      53 minutes*

    • @sndselecta
      @sndselecta 3 years ago

      100% agreed

    • @maryjones2175
      @maryjones2175 2 years ago +1

      Could not agree more, this is the best video on data warehousing I have watched so far!!

    • @P_Belle
      @P_Belle 2 years ago

      Agreed! Conversational tone, appropriate visuals, and well-paced. I'm in a $4k EDW graduate school class that teaches from bulleted PowerPoint slides. Doesn't compare 🙆😟

  • @DanielGuzman-i5h
    @DanielGuzman-i5h 8 days ago

    I watched this video to prep for an interview. Not only did I get the job but here I am a year later rewatching this with more confidence and still learning. Thank you, sir

    • @BryanCafferky
      @BryanCafferky 7 days ago

      That's great to hear. Glad it helped and thanks for letting me know.

  • @avenuech39
    @avenuech39 4 years ago +65

    The most understandable, detailed & overall introduction to Dimensional Modeling, with clear key words explanation and logically sequential arrangement on the slide content. Thanks Bryan!

  • @bradjohnson9022
    @bradjohnson9022 2 years ago +2

    This is by far the best guide to dimensional modeling I've found on the internet!

  • @prithvikumar3751
    @prithvikumar3751 3 years ago +18

    This is the most informative session I have ever seen! 54 minutes of pure knowledge. If only everything I had an interest in learning was taught by Bryan!

    • @BryanCafferky
      @BryanCafferky 3 years ago +1

      Thanks. What are your tech learning interests?

  • @markparee99
    @markparee99 6 years ago +18

    Using for an interview prep. Great refresher. Perfect length and depth of content...

  • @paoloogr
    @paoloogr 1 year ago +2

    My current favorite channel on YouTube, being a senior data engineer. I would like to have a video created by you telling more about what cubes are, OLAP, ROLAP, etc. This kind of nomenclature is getting more and more rare and people joining data engineering in this modern data stack area should understand them well so that they can understand how we got here and also talk to people familiar with this data engineering nomenclature.

    • @BryanCafferky
      @BryanCafferky 1 year ago +1

      ROLAP and MOLAP are only used in the legacy SSAS Multidimensional Models. They were cool but the Tabular model does not use them b/c everything is in memory. Thanks for the suggestion. Are you using SSAS Multidimensional Models?

  • @johnlamarre7330
    @johnlamarre7330 3 years ago +4

    I come from an OLTP software data architect side of the house and only had to hand off streaming/replication to an ODS. I just took a role that will require dimensional modeling. So glad I found your video. This clicks.

  • @risebyliftingothers
    @risebyliftingothers 2 years ago +1

    By far the MOST lucid and practical no-nonsense explanation of key terms. Loved it!

    • @BryanCafferky
      @BryanCafferky 2 years ago

      Glad it is helpful! Thank you for the kind words.

  • @terryxu1481
    @terryxu1481 3 years ago +4

    Hi Bryan, many thanks for providing this tutorial. The most solid foundation tutorial on data warehouse & dimensional modelling I have ever seen. For those who want to build their career path as a data analyst, data architect, etc., this is the best starting point. I really regret that I did not see this video 2 years ago when I started off my career.

  • @kasiatuszynska486
    @kasiatuszynska486 4 years ago +5

    Perfect combination of experience and theoretical handle on the subject. Thank you for the time it took to record it.

  • @jilanimohammed4656
    @jilanimohammed4656 4 years ago +2

    These are the best 53 minutes I've spent recently. Crisp and Clear. I feel much more confident and I request more videos on Dimensional Modeling. Thank you for your effort.

    • @BryanCafferky
      @BryanCafferky 4 years ago

      Thanks. I did do a video on Slowly Changing Dimensions which is a big area for interview questions. What topics are you most interested in regarding dimensional modeling?

  • @manarlab84
    @manarlab84 2 years ago +2

    I'm grateful for the time and effort you put into explaining one of the key concepts for anyone who deals with enterprise data needs. Clear and comprehensive. Thanks a lot.

  • @pranshumishra
    @pranshumishra 4 years ago +1

    This is premium content on the topic. Simple yet effective explanation shows your understanding of the subject. Thank you Bryan!

  • @nandk98
    @nandk98 4 years ago +1

    A very comprehensive and fully understandable video during my very first viewing itself. I felt like having read a high quality book on the topic within an hour. Bryan never wastes time on unnecessary talks and very effectively concentrates on the core through and through. Thank you!!

    • @BryanCafferky
      @BryanCafferky 4 years ago

      Thanks

    • @nandk98
      @nandk98 4 years ago +1

      @@BryanCafferky My statement was true. Not just a passing compliment. I got a full perspective on what dimensional modelling is and how it is different from data warehousing.
      Let alone warehousing, it actually helped me in designing various reports using data from database tables. I was able to see the common threads that run through them all.

  • @akshaybaura
    @akshaybaura 2 years ago

    I have been working in data warehousing for the last 5 years and this video gave me the answer I have been looking for for so long: why is there so much normalisation in the data warehouses we see today? Nobody ever gave me a satisfactory answer, but you did, sir. Big thanks!! This video deserves to be on the billboard.

  • @kcmalik5992
    @kcmalik5992 2 years ago +5

    Good sir, I typically never like or comment on YouTube videos. This was a must. You are masterful, and could quite possibly mint some people into data engineers off of this video. God bless you. I hope to become a teacher like you one day.

  • @stalsams179
    @stalsams179 2 years ago

    I don't usually spend time watching a 53 min learning video unless the course and the orator are really worth it. Trust me sir, you kept me glued every minute. God bless you, Sir!

  • @marcosoliveira8731
    @marcosoliveira8731 4 years ago +3

    Lovely way of teaching.
    Looking for more of your material about data warehousing on the web.

  • @sshfromhere
    @sshfromhere 3 years ago +2

    Great presentation!👍 Thank you Bryan ! 👏

  • @TheMadMagician87
    @TheMadMagician87 3 years ago +1

    I'm about 4 or 5 years late to this party. You've still done a great job in this video compared to many other sources. Thanks, and well done!

  • @Molaa21
    @Molaa21 4 years ago +7

    Amazing video. Well explained. Uploaded 3 years ago, still valid for educational and professional training for 2020. Thanks Bryan!

  • @noelslater458
    @noelslater458 2 months ago

    Thank you for taking the time to put this together. I found it very educational and helpful!

    • @BryanCafferky
      @BryanCafferky 2 months ago +1

      You're welcome and thanks for the kind words.

  • @himanshityagi5055
    @himanshityagi5055 4 years ago +1

    I wish I had watched it earlier when I was struggling to understand Data Warehousing and Dimensional Modeling. Very informative video. Every second was worth spending.

    • @BryanCafferky
      @BryanCafferky 4 years ago +1

      Thanks. Yeah. I struggled understanding this too for some time. Glad it helped.

  • @Vermitude
    @Vermitude 3 years ago +1

    A very useful introduction to data warehousing and the common terminologies, presented in an interesting and easy to understand manner. It was a very quick 53 minutes - I only meant to get an idea of the content and then watch it later - but got completely absorbed.

  • @Lego-tech
    @Lego-tech 5 years ago +4

    Many thanks!! This is the best video I have ever seen on this subject. Simple explanation of all complicated areas.

  • @billrosmus6734
    @billrosmus6734 4 years ago +1

    This is an excellent primer. The best one I've seen. I'll recommend it if anyone asks for advice on a good starting point. One point however: USER STORY. In 99% of Agile Software Development, the example from that book is NOT a User Story. A user story actually describes something that needs to be done and why. e.g. As a business manager I need to be able to drill down into the types of sales across my book stores to see the types of books being purchased, how they are being purchased, how they are being paid for, down to the level of the book title, so that I can understand what is being sold and better stock my stores. What the book you mention is actually doing is closer to a Use Case. There is semantic impedance going on. But it's just data. :) Anyway, I really do like this video and think the approach of using use cases is very good. I think though that there will be some need to translate between different groups, kind of like what a location is to different systems. :)

  • @individualassignment2661
    @individualassignment2661 2 years ago

    This is the greatest Dimensional Modeling tutorial ever!! Thank you so much, sir.

  • @SanjuG-k4y
    @SanjuG-k4y 1 year ago

    The best video I have ever watched till date. Very well explained and neatly presented. This video helped a lot!!!

  • @anjaneynaik3080
    @anjaneynaik3080 2 years ago +1

    The concept has been explained thoroughly with real time examples. Thanks Bryan.

  • @creativeluf
    @creativeluf 2 years ago

    Great explanation of dimensional modelling. Highly insightful.

  • @paulm3106
    @paulm3106 4 years ago +2

    An excellent video on data warehousing, easily the best I've seen.

  • @stuieblack
    @stuieblack 4 years ago +2

    Quality. Seeing your videos, I realise that this is the subsection of my role that I enjoy the most and need to learn more about. Great video.

  • @terribradshaw4366
    @terribradshaw4366 3 years ago +3

    Thank you for this excellent presentation on Dimensional Modeling. I'm a student and it was so easy to follow because you made it interesting and you provided some excellent examples to support your slides.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Thanks. Glad it helped. Hope you find my other videos helpful too.

  • @sau002
    @sau002 4 years ago +1

    Excellent video. I do have a question at 15:57 . I was unable to understand how the design of the SalesFact table differs from what the OLTP table for Sales would have been. My OLTP Sales table would have been almost identical to SalesFact shown in this presentation, with the exception of a SalesLineItem.

    • @BryanCafferky
      @BryanCafferky 4 years ago +1

      Yeah. I had a hard time seeing the difference in the beginning too. Good question. In the OLTP table at about 23:09, notice the sample table has descriptive columns like CustomerFirstName, CustomerLastName, Product, SalesDate, Order Number, and OrderLineNumber. These are Dimension attributes. It also has Quantity and Price which are Facts or Measures. In a Star Schema, these cannot be in the same table. The Dimension attributes are stored in a separate table that has its own Primary Key. The Facts are stored in a Fact table with a foreign key to the Dimension table. The Dimension table Primary Key is called a Surrogate key and is usually an auto-generated identity column. There is no effort to reduce data redundancy for Dimensions, i.e. you could have Product Category values in the same table with Product Model values. It is not efficient to maintain the data that way which is why OLTP design would not do this. But it is fine for a Star Schema, i.e. Dimensional Modeled design. Make sense?
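A minimal sketch of the split described in the reply above, using Python's built-in `sqlite3`. The column names follow the reply; the sample rows, the table names `DimCustomer` and `DimProduct`, and the shortcut of identifying a customer by name are assumptions made only for illustration.

```python
import sqlite3

# Hypothetical flat, OLTP-style rows: descriptive columns and measures side by side.
# (CustomerFirstName, CustomerLastName, Product, SalesDate, Quantity, Price)
flat_sales = [
    ("Ann", "Lee", "Bike", "2024-01-05", 1, 500.0),
    ("Ann", "Lee", "Helmet", "2024-01-05", 2, 40.0),
    ("Bob", "Ray", "Bike", "2024-01-06", 1, 500.0),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
    CustomerFirstName TEXT,
    CustomerLastName TEXT,
    UNIQUE (CustomerFirstName, CustomerLastName)    -- simplification for the demo
);
CREATE TABLE DimProduct (
    ProductKey INTEGER PRIMARY KEY AUTOINCREMENT,   -- surrogate key
    Product TEXT UNIQUE
);
CREATE TABLE SalesFact (
    CustomerKey INTEGER REFERENCES DimCustomer(CustomerKey),
    ProductKey  INTEGER REFERENCES DimProduct(ProductKey),
    SalesDate   TEXT,
    Quantity    INTEGER,  -- measure
    Price       REAL      -- measure
);
""")

for first, last, product, sales_date, quantity, price in flat_sales:
    # Dimension attributes go to their own tables; repeats are ignored.
    conn.execute(
        "INSERT OR IGNORE INTO DimCustomer (CustomerFirstName, CustomerLastName) "
        "VALUES (?, ?)", (first, last))
    conn.execute("INSERT OR IGNORE INTO DimProduct (Product) VALUES (?)", (product,))
    customer_key = conn.execute(
        "SELECT CustomerKey FROM DimCustomer "
        "WHERE CustomerFirstName = ? AND CustomerLastName = ?",
        (first, last)).fetchone()[0]
    product_key = conn.execute(
        "SELECT ProductKey FROM DimProduct WHERE Product = ?",
        (product,)).fetchone()[0]
    # The fact row keeps only the measures plus foreign keys to the dimensions.
    conn.execute("INSERT INTO SalesFact VALUES (?, ?, ?, ?, ?)",
                 (customer_key, product_key, sales_date, quantity, price))

revenue_by_product = conn.execute("""
    SELECT p.Product, SUM(f.Quantity * f.Price)
    FROM SalesFact AS f
    JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
    GROUP BY p.Product
    ORDER BY p.Product
""").fetchall()
print(revenue_by_product)
```

The fact table ends up with nothing but keys, a date, and numbers, while each descriptive value is stored once per dimension row.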

  • @cannonkalra
    @cannonkalra 3 years ago +1

    This is what I subscribe to the internet for; it's a beautiful piece, 53 mins straight.

    • @BryanCafferky
      @BryanCafferky 3 years ago +1

      Thanks. Hope you check out my other videos like the ones on Databricks and Python too.

  • @C_G_1962
    @C_G_1962 3 years ago +1

    The ideal video for a dba trying to reach the dw world (using mssql server and also Azure) . Thanks a lot for the video !

  • @2001july06
    @2001july06 3 years ago +1

    Amazing simple and focused explanations.
    Thanks Bryan

  • @mirdhapuneet
    @mirdhapuneet 4 years ago +1

    Hi Bryan, this video is one of the best videos I have watched; it has all the information required, is to the point, and is well described. Hats off. Thank you.
    Highly recommended, a must-watch for everyone who is working in the DWH domain.

  • @consumer323
    @consumer323 5 years ago +6

    This was an excellent primer. I was alert and it really grounded me on a number of key points. Thank you so much for this contribution.

  • @tregatregs8804
    @tregatregs8804 3 years ago +1

    This is a must watch video for anyone having a hard time understanding Dimensional Modeling. Wish you could do a full series on Database systems and Warehousing.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Thanks. What specific topics are you thinking of?

  • @mohammedansari818
    @mohammedansari818 3 years ago +1

    This is the first time I loved blue screen on my computer. :) Very good advice.

  • @erb6411
    @erb6411 4 years ago +2

    This is so helpful. I'm modeling a data warehouse for my Org and only have experience with OLTP. This has saved so much headache

    • @BryanCafferky
      @BryanCafferky 4 years ago

      Great! Yeah. Dimensional Modeling is a very different mindset. Good luck!

  • @simondavidvgm
    @simondavidvgm 4 years ago +4

    AMAZING lecture, Bryan - thanks so much! Exactly what I was looking for and an extremely well articulated 56 minutes.

  • @houstonfirefox
    @houstonfirefox 1 year ago

    Very well presented! Clear and concise with real-world use cases!

  • @vikaschoudhary6904
    @vikaschoudhary6904 3 years ago +1

    Great tutorial sir, thank you so much for such relevant information.

  • @aiikaiik
    @aiikaiik 4 years ago +1

    Great job Bryan, great content. Thanks for sharing.

  • @hgiang100
    @hgiang100 4 years ago +1

    Thank you Bryan for this video. You did an excellent job of explaining the concepts of data warehouse design.

  • @GokulShanth
    @GokulShanth 3 years ago +1

    For someone just getting started, this was amazing thank you so much!

  • @FightAndFunHub
    @FightAndFunHub 2 years ago

    I am listening to you for the first time and I found out that you are a great teacher.

    • @BryanCafferky
      @BryanCafferky 2 years ago +1

      Thanks!

    • @FightAndFunHub
      @FightAndFunHub 11 months ago

      @@BryanCafferky After a year I am listening again to refresh the concepts.

  • @rishabhbhatt7373
    @rishabhbhatt7373 10 months ago

    Great content Bryan. Great level of detail and insights (from actual experience). Please keep it up !

  • @wentingzhu343
    @wentingzhu343 6 years ago +2

    Among several videos I watched on dimensional modeling, this is the one with the most insight and experience sharing!

  • @aaragon0902
    @aaragon0902 4 years ago +2

    Thank you so much!! Struggling through my data modeling/structuring course and your video was incredibly helpful in understanding dimensional modeling.

    • @BryanCafferky
      @BryanCafferky 4 years ago

      Wow! Really glad to hear that. Thanks for letting me know.

  • @saurabhjain2005
    @saurabhjain2005 4 years ago +1

    You are amazing!! Thank you so much for this. Best summary you can get and which can make you talk like a pro..

  • @ragacbe
    @ragacbe 3 years ago +1

    It's a great content and presentation. Thank you very much for this wonderful work!

  • @sendilbm
    @sendilbm 2 years ago

    Amazing video, very detailed. Thanks so much.

  • @Martin-lf9se
    @Martin-lf9se 4 years ago +1

    Very well done and explained. Thank you for sharing your knowledge.

  • @stephenhordes4244
    @stephenhordes4244 3 years ago +1

    Thanks Bryan, great video.

  • @astersathya
    @astersathya 6 years ago +2

    Excellent video, and I loved it. Being an OLTP modeler, this gave me a very nice idea about dimension modeling. The one thing I wanted to bring to your attention is that when you talked about the 7 W's of dimensional modeling, it only had 6 W's and I was searching for the 7th one :)

    • @BryanCafferky
      @BryanCafferky 6 years ago

      Hi Sathya, Great observation! Actually, the 'How Many?' is one of the 7 but yeah, it is an H not a W. The book 'Agile Data Warehouse Design' presents it that way so I did too. It should be the 6 W's and 1 H of Data Warehouse Design but that is harder to say. :-) Actually, it helps me remember the How because it is different. Thanks!

    • @msdew9885
      @msdew9885 5 years ago

      @@BryanCafferky
      hoW?
      What?
      When?
      Where?
      Who?
      hoW many?
      Why?
      makes 7.
      Thanks for the video. Very informative!!!👍

  • @4rmtinc
    @4rmtinc 4 years ago +1

    A nice and concise presentation of dimensional modeling for data warehouse.

  • @manankashyap7726
    @manankashyap7726 2 years ago

    One of the best videos I’ve seen on DM!!

  • @pradeepnagaraj7347
    @pradeepnagaraj7347 3 years ago +1

    Excellent explanation Bryan!!

  • @dunlapww
    @dunlapww 4 months ago +2

    This is a phenomenal presentation on dimensional modeling but I don’t understand the implementation of surrogate keys. I feel like I’m missing an obvious and low compute way of maintaining all the surrogate keys on your facts. No videos I’ve seen discuss this. But it seems every time a new fact record is generated you have to join every related dim on the foreign natural key and update the fact with the dim’s related surrogate key, so that you can later perform joins using the surrogate key. Am I thinking through this correctly?

    • @BryanCafferky
      @BryanCafferky 4 months ago +1

      Yeah. It does add complexity but you have the gist of it. Surrogate keys are particularly important when you want SCD history since natural keys would result in duplicate keys on the dim tables. Also, they isolate changes from the backend systems to the dw. But they do add some extra work.

    • @dunlapww
      @dunlapww 4 months ago

      Thank you for confirming my understanding and great presentation!
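The lookup discussed in this thread can be sketched as follows, using Python's built-in `sqlite3`. Everything here is an assumption for illustration: the table and column names, the `valid_from`/`valid_to` columns (one common way to keep Type 2 SCD history), the `-1` "unknown" key, and the sample rows. The point is only that the natural key `customer_id` repeats across versions in the dimension, so the fact row has to carry the surrogate key of the version that was current on the sale date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,  -- surrogate key, unique per version
    customer_id  TEXT,                 -- natural key, repeats across versions
    last_name    TEXT,
    valid_from   TEXT,
    valid_to     TEXT                  -- '9999-12-31' marks the current version
);
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    sale_date    TEXT,
    amount       REAL
);
""")

# Two versions of the same customer, e.g. after a name change.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?, ?, ?)", [
    (1, "C100", "Smith", "2020-01-01", "2023-06-30"),
    (2, "C100", "Jones", "2023-07-01", "9999-12-31"),
])

def lookup_customer_key(customer_id, sale_date):
    """Return the surrogate key of the dimension version valid on sale_date."""
    row = conn.execute(
        "SELECT customer_key FROM dim_customer "
        "WHERE customer_id = ? AND ? BETWEEN valid_from AND valid_to",
        (customer_id, sale_date)).fetchone()
    return row[0] if row else -1  # -1 stands in for an 'unknown' dimension row

# Incoming source rows carry only the natural key.
incoming = [("C100", "2022-03-15", 20.0), ("C100", "2024-02-01", 35.0)]
for customer_id, sale_date, amount in incoming:
    conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?)",
                 (lookup_customer_key(customer_id, sale_date), sale_date, amount))

loaded = conn.execute(
    "SELECT customer_key, amount FROM fact_sales ORDER BY sale_date").fetchall()
print(loaded)
```

A per-row lookup keeps the sketch short; a real load would more likely resolve keys with one set-based join per dimension.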

  • @dylankelly3318
    @dylankelly3318 4 years ago +1

    Great video Bryan.

  • @richardogujawa-oldaccount1336

    Great lecture, definitely worth the watch!

  • @ohnotoyota4692
    @ohnotoyota4692 3 years ago +1

    Excellent, especially the focus on the process model. Thank you.

  • @valentinussofa4135
    @valentinussofa4135 2 years ago

    Amazing lecture. Thanks Sir. 🙏🙏🙏

  • @saisanikommu8551
    @saisanikommu8551 4 years ago +1

    I listed out some topics (after my failed interview) to gain a clear-cut understanding, and this video answered all my questions in detail. Sir, big thumbs up to you. If possible, please do a video on interview questions and how to answer them (DW, DBMS concepts).

    • @BryanCafferky
      @BryanCafferky 4 years ago +1

      Hi Sai, That's a great idea for a video. Do you have any specific questions in mind? BTW: Interviews always have questions to stump you. But I'd like to help with a video that helps.
      Thanks! Bryan.

  • @thehouse2620
    @thehouse2620 5 years ago +1

    Thanks, This is very helpful. When you discussed the scenarios regarding a person getting married, it triggered a bunch of other questions I can ask for my project. I enjoyed the descriptions of what is fact vs what is dimension.

  • @sau002
    @sau002 4 years ago

    I have a question about the Time_Dimension table at 19:02. This captures Day, Month, Quarter, Year, Fiscal_year, Day_of_week. I understand how this would be beneficial. Consider the following scenario, where the requirement now is to capture the time (hour+minute) of the Sales transaction: how would this dimension table change?

    • @BryanCafferky
      @BryanCafferky 4 years ago

      Hi Saurabh,
      Did you see my answer to your last question? Is that answered now?
      Thanks,
      Bryan
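For readers unfamiliar with the table being asked about, here is a minimal sketch of generating rows for a date dimension with the columns listed in the question (Day, Month, Quarter, Year, Fiscal_year, Day_of_week). The `date_key` format, the fiscal year starting in July, and the function names are assumptions for illustration only; the sketch does not address the hour and minute question.

```python
from datetime import date, timedelta

FISCAL_YEAR_START_MONTH = 7  # assumption: fiscal year N runs July (N-1) to June N
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def date_dim_row(d):
    """Build one date-dimension row for the calendar date d."""
    fiscal_year = d.year + 1 if d.month >= FISCAL_YEAR_START_MONTH else d.year
    return {
        "date_key": d.year * 10000 + d.month * 100 + d.day,  # e.g. 20240629
        "day": d.day,
        "month": d.month,
        "quarter": (d.month - 1) // 3 + 1,
        "year": d.year,
        "fiscal_year": fiscal_year,
        "day_of_week": DAY_NAMES[d.weekday()],
    }

def build_date_dim(start, end):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append(date_dim_row(current))
        current += timedelta(days=1)
    return rows

rows = build_date_dim(date(2024, 6, 29), date(2024, 7, 2))
for row in rows:
    print(row)
```

Because every attribute is precomputed per day, reports can group by quarter or fiscal year without date arithmetic at query time.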

  • @maniji5756
    @maniji5756 4 years ago +2

    Thank you, loved the content and how well it was structured and presented. Looking forward to your other tutorials!

  • @lyreco7910
    @lyreco7910 3 years ago

    Absolutely amazing video, thanks Bryan!

  • @LyAn215
    @LyAn215 3 years ago +1

    I learned so much from this one video. Thank you! Also, 23:51 "snowflake is something you may get questioned in an interview, so wake up" I feel personally attacked lol. I wasn't sleeping (your video was long but not boring at all) but I DID get asked about this in a recent interview and I totally flopped. At least now I can answer that question :)

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Yes. I find I always remember answers to interview questions I missed.

  • @joaorataoo
    @joaorataoo 4 years ago +1

    Thank you so much for sharing your knowledge, and for your skill in teaching it in such a clean, comprehensible way.

  • @tobman781
    @tobman781 4 years ago +1

    Great presentation. Very clear and to the point!

  • @clintp3504
    @clintp3504 3 years ago +1

    Excellent video! Thanks for sharing

  • @jimpanging87
    @jimpanging87 4 years ago +1

    Great video, great explanation!

  • @AnshumanSingh-gk2md
    @AnshumanSingh-gk2md 3 years ago +1

    Amazing explanation

  • @YeetYeetYe
    @YeetYeetYe 2 years ago

    Absolutely amazing explanations.

  • @sumit12345yadav
    @sumit12345yadav 4 years ago +1

    @Bryan Cafferky - thank you for creating this great video. It's really a marvel: a simple, realistic approach to understanding.

  • @barbararibeiromaia1502
    @barbararibeiromaia1502 4 years ago +1

    This video was perfect to answer my questions! Thank you!

  • @vijayd15
    @vijayd15 1 year ago

    Best video on DW design ever!

  • @Seamonkey1981
    @Seamonkey1981 2 years ago

    outstanding overview, well done

  • @torque6389
    @torque6389 4 years ago +1

    Excellent job! Thank you for this wonderful video!

  • @mathinsovie9954
    @mathinsovie9954 1 year ago

    Wow. Well detailed and well explained. Thanks Bryan.

  • @SuperAerodragon
    @SuperAerodragon 4 years ago +1

    @BryanCafferky Thank you for taking the time to put this together. This is a great foundational video for anyone getting started and presents the subject in a very relatable way.

  • @m.lmoore
    @m.lmoore 4 months ago

    Hey Bryan! This is amazing, thank you for the great video. Could we get a download link for the slide deck?

  • @00EagleEye00
    @00EagleEye00 3 years ago +1

    Good day Sir Bryan.
    I have a question regarding data changes.
    I usually compare incoming/ingested data to the staging db records via the 'dateUpdated' column (the records are not equal and the incoming dateUpdated is greater than the one in staging). If that condition is satisfied, the incoming records are processed. My question is: what if one of the columns has been updated via a script without also updating the 'dateUpdated' column? Should I consider this scenario and still process the record (which would mean comparing the records column by column)?
    What is the best practice for detecting data changes, whether they are made normally or intentionally via a script?
    Looking forward to your advice. Thank you.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Hi, Well, the best practice is not to update data outside the application, i.e. should not update DW tables via scripts other than your ETL. If this is unavoidable, you could add a table trigger to automatically update the dateUpdated column whenever a change to a table row occurs. Will that work in your scenario?

    • @00EagleEye00
      @00EagleEye00 3 years ago +1

      @@BryanCafferky Thanks for your response. I agree that changes should be made the normal way, especially when there are applications that maintain the original data. I brought it up because there are testers who modify the records via scripts and then run the ETL process, so it was tagged as a failed process.
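The trigger idea suggested earlier in this thread could look roughly like the sketch below. It uses SQLite through Python's built-in `sqlite3` purely so it can run anywhere; the table `staging_customer`, its columns other than `dateUpdated`, and the trigger name are hypothetical, and trigger syntax differs between database engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staging_customer (
    id          INTEGER PRIMARY KEY,
    name        TEXT,
    dateUpdated TEXT
);

-- Whenever 'name' is updated, stamp the row, even if the script forgot to.
CREATE TRIGGER trg_touch_dateUpdated
AFTER UPDATE OF name ON staging_customer
FOR EACH ROW
BEGIN
    UPDATE staging_customer
    SET dateUpdated = strftime('%Y-%m-%d %H:%M:%f', 'now')
    WHERE id = NEW.id;
END;
""")

before = "2000-01-01 00:00:00.000"
conn.execute("INSERT INTO staging_customer VALUES (1, 'Ann', ?)", (before,))

# A manual script changes a column without touching dateUpdated...
conn.execute("UPDATE staging_customer SET name = 'Anne' WHERE id = 1")

# ...but the trigger has refreshed the timestamp anyway.
after = conn.execute(
    "SELECT dateUpdated FROM staging_customer WHERE id = 1").fetchone()[0]
print(before, "->", after)
```

With the timestamp maintained by the database, the existing `dateUpdated` comparison picks up scripted changes without a column-by-column diff.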

  • @Ari-lu5ve
    @Ari-lu5ve 2 years ago +2

    Thank you so much for this! Very organized lecture, and I love how you included the time stamps.

  • @vinyasshetty4042
    @vinyasshetty4042 5 years ago +1

    Thank you for this wonderful session. Very clear and informative. Really enjoyed it.

  • @jplee123
    @jplee123 3 years ago +1

    I love this video, the facts, and color commentary you present with it. But what is the relevance of star schema (vs wide flat) for consumption by analysis tools such as Tableau which implicitly and automatically creates a high performing dimensional model from a flat view without a human needing to do any dimensional data development? I would love to hear your thoughts on this.

    • @BryanCafferky
      @BryanCafferky 3 years ago +1

      Thanks. Sure.
      First, a Star Schema is arranged for efficient and easy data analysis for Tableau, Power BI, or any other tool. There is more to the world than Tableau. The use of surrogate keys supports inclusion of dimension history, i.e. slowly changing dimensions. Thinking things out like conformed dimensions builds in extensibility. The dimensional model creates the foundation of your data which will sustain your organization long term through many changes in the reporting tools that consume the data.
      The second reason to use a Star Schema is performance when used by reporting tools. You need to load the data and that process can be slow if a lot of joins are needed to get the data together. The Star Schema connects the fact to dimension tables in one join. No need to join dimension to dimension tables. See medium.com/data-ops/why-do-i-need-a-star-schema-338c1b029430
      Many organizations don't want to spend the effort to build a star schema but then run into problems and build hacks to solve them. It's pay now or pay later.

  • @haiderali-uf4gy
    @haiderali-uf4gy 4 years ago +1

    best video on data warehousing on youtube..

  • @ghazitozri4989
    @ghazitozri4989 3 years ago +1

    Thank you sir, you saved my life.

  • @somerandomname1985
    @somerandomname1985 3 years ago +1

    Hi Bryan,
    Thank you so much for helping me understand dimensional modeling. I had a question regarding fact tables. Is it an acceptable practice to create a separate fact table that reports on a different grain?
    So say for example we have an orders fact table that consists of billions of rows. There are requests to create reports on the lowest grain possible, so in this case it would be the order_id, but there are other reports where the business wants to do their analysis at a higher grain, for example total number of orders by day and country in the past 3 years. Due to the number of records, the query to perform this takes a lot of time and eats into costs.
    If so, would it make sense to script the ETL to create this other fact table by utilizing the original fact table as the base table?
    I hope my question made sense.
    Thanks!

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Yes. Aggregated fact tables are a way to do what you are saying. See www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/aggregate-fact-table-cube/#:~:text=Aggregate%20fact%20tables%20are%20simple,aggregate%20level%20at%20query%20time.
      If there is a need for the more detailed grain too, you can have that as a fact table too.
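A minimal sketch of the aggregated fact table idea from this thread, using Python's built-in `sqlite3`. The table names, columns, and sample rows are assumptions; in a full star schema the date and country would normally be foreign keys into dimension tables, which is skipped here to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Base fact table at the lowest grain: one row per order.
CREATE TABLE fact_orders (
    order_id   INTEGER PRIMARY KEY,
    order_date TEXT,
    country    TEXT,
    amount     REAL
);
INSERT INTO fact_orders VALUES
    (1, '2024-01-01', 'US', 10.0),
    (2, '2024-01-01', 'US', 15.0),
    (3, '2024-01-01', 'DE', 7.5),
    (4, '2024-01-02', 'US', 20.0);

-- Aggregate fact table at a higher grain: one row per day and country,
-- built by the ETL from the base fact table.
CREATE TABLE fact_orders_daily_country AS
SELECT order_date,
       country,
       COUNT(*)    AS order_count,
       SUM(amount) AS total_amount
FROM fact_orders
GROUP BY order_date, country;
""")

daily = conn.execute("""
    SELECT order_date, country, order_count, total_amount
    FROM fact_orders_daily_country
    ORDER BY order_date, country
""").fetchall()
print(daily)
```

Reports at the day and country level can then read the small aggregate table, while order-level reports keep using the base fact table.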

  • @vibhaskashyap8247
    @vibhaskashyap8247 1 year ago

    Thanks Bryan for the awesome presentation. Can you please also cover topics like how to handle late-arriving dimensions, and whether that is relevant in the Modern Data Warehouse?

  • @shivamahirao5377
    @shivamahirao5377 2 years ago

    You're a savior, thank you Bryan.

  • @craigdubin6325
    @craigdubin6325 2 years ago +1

    Hi Bryan, great tutorial! I've been dimensional modeling for quite some time but it's always good to review the basics again. I was wondering if you might have some thoughts on a fact at the atomic grain for an order header/detail fact where your order lines have different dimension attributes. For example, imagine a fact for vehicle maintenance where 2 or 3 line numbers might be for labor with 3 different employees, and 5 line numbers for different parts. This is all easy to rollup, but getting it to the atomic grain leaves a lot of zero keys for the dimensions that aren't applicable. Would be curious to see your thoughts on this. Thanks!

    • @BryanCafferky
      @BryanCafferky 2 years ago +1

      My thought is to consolidate the data (employee, parts, etc.) into a single dimension table and add a column, charge type with values like 'Emp', 'Part'. Assign a surrogate key and point the fact table to this. The line detail points to the related charges which is now in one table.

    • @craigdubin6325
      @craigdubin6325 2 years ago +1

      @@BryanCafferky Hi Bryan, thanks for the reply. Interesting. So if you had 1 line item with 1 employee and 3 parts, then another line item with the same employee and 2 parts, I'm guessing that you'd have to calculate ratios of hours/costs for the labor to each row that's got a part...otherwise, you'd have repeating values (i.e., double-counting) of the labor. In my original thought, it avoids that by having a row for each record type, but I get why that's a problem. Another approach is to just create a fact for each type of line item detail, but then a work order (similar to a sales order) would be broken up among multiple line item detail facts and make drill down impossible.

    • @BryanCafferky
      @BryanCafferky 2 years ago

      @@craigdubin6325 If feasible, creating the DW at the lowest grain you may ever need gives you a lot of future flexibility. Your second reply indicates each line item has both an employee(s) and part(s) so you could just aggregate for reporting. Maybe aggregate to the employee/part level for drill down or whatever the business needs.

    • @craigdubin6325
      @craigdubin6325 2 years ago

      @@BryanCafferky Yup, I always go down to the atomic grain if possible. And I think I probably made it more confusing. A work order is comprised of basically 2 different types of records, 1) labor, 2) parts. So imagine a quote or invoice from a mechanic. You might have the summary costs at the top with total labor and total parts. But there will be n number of lines with only labor charges, and n number of lines with only parts. They're never on the same line. So quantity on a labor line might be hours for the employee on that labor line, unit cost would be dollars per hour, and total cost would be hours x unit cost. On the parts lines, the quantity would be the number of the specific part number, the unit cost would be the cost of the particular part, and the total cost would be quantity x unit cost. My concern with this method was that the dim_employee_key will always be 0 (or possibly could make it -1 for N/A) on the parts line...while the dim_part_key will always be 0 (or -1 as above) for the part number on the labor lines. Hopefully that's making more sense!
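One way to read the consolidated-dimension suggestion from earlier in this thread is sketched below with Python's built-in `sqlite3`. The table names `dim_charge` and `fact_work_order_line`, the columns other than the charge type values 'Emp' and 'Part', and the sample work order are all assumptions. Each line, labor or part, points at exactly one row of the combined dimension, so no line carries a placeholder key for a dimension that does not apply.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Employees and parts consolidated into one dimension, told apart by charge_type.
CREATE TABLE dim_charge (
    charge_key  INTEGER PRIMARY KEY,  -- surrogate key
    charge_type TEXT,                 -- 'Emp' or 'Part'
    charge_name TEXT
);
CREATE TABLE fact_work_order_line (
    work_order_id INTEGER,
    line_number   INTEGER,
    charge_key    INTEGER REFERENCES dim_charge(charge_key),
    quantity      REAL,  -- hours on labor lines, units on parts lines
    unit_cost     REAL   -- dollars per hour, or cost per part
);

INSERT INTO dim_charge VALUES
    (1, 'Emp',  'Mechanic A'),
    (2, 'Part', 'Oil filter'),
    (3, 'Part', 'Brake pad');

INSERT INTO fact_work_order_line VALUES
    (1001, 1, 1, 2.0, 80.0),  -- labor line: 2 hours at 80 per hour
    (1001, 2, 2, 1.0, 12.0),  -- parts line
    (1001, 3, 3, 4.0, 25.0);  -- parts line
""")

totals = conn.execute("""
    SELECT d.charge_type, SUM(f.quantity * f.unit_cost)
    FROM fact_work_order_line AS f
    JOIN dim_charge AS d ON d.charge_key = f.charge_key
    GROUP BY d.charge_type
    ORDER BY d.charge_type
""").fetchall()
print(totals)
```

The trade-off is that employee-specific and part-specific attributes share one table, so columns that apply to only one charge type stay empty for the other.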

  • @helovesdata8483
    @helovesdata8483 3 years ago +1

    I'm getting into data engineering and I really enjoyed this content.

  • @WhosShamouz
    @WhosShamouz 3 years ago

    Amazing. The only thing I don’t understand is to „declare the grain“ 😪 any tip on what to research or how to think about it?

  • @HerdingDogRescuer
    @HerdingDogRescuer 3 years ago +1

    Great video, and very helpful. I've been struggling with the dryness of the Kimball book.

    • @BryanCafferky
      @BryanCafferky 3 years ago

      Thanks. Yeah. The book can be a bit dry.