MSHGQM - Star Schemas are Stupid

Поділитися
Вставка
  • Опубліковано 17 тра 2024
  • Description:
    What is Deckler's Maxim? Find out in this video!
    Download MSHGQM: github.com/gdeckler/Microsoft...
    Books!
    The Definitive Guide to Power Query (M) - www.amazon.com/Definitive-Gui...
    The Definitive Guide to Power Query (M) - www.amazon.com/Definitive-Gui...
    Mastering Power BI 2nd Ed. - www.amazon.com/Mastering-Micr...
    Learn Power BI 2nd Ed. - www.amazon.com/Learn-Power-co...
    DAX Cookbook - www.amazon.com/DAX-Cookbook-a...
    Power BI Cookbook 2nd Ed. - www.amazon.com/Microsoft-Powe...
  • Наука та технологія

КОМЕНТАРІ • 24

  • @MDevion
    @MDevion 27 днів тому +7

    Few notes:
    1. A flat table is slower for example with DISTINCTCOUNT vs a SUMMARIZE with a star schema.
    2. The whole reason for a star schema is to have shared dimensions between facts. If you ever have a second fact up popping up you need to redesign everything, costing more times. And thats is very likely, even in simple models.
    3. Because its a bit more complex, doesnt mean its worse. Yes a 12 year old without a license can drive a car, doesnt mean its a good idea.
    4. Time Intelligence is hell with a flat table.
    5. You dont have to use BIDIRECTIONAL relationships. The dimensions visual can be filtered on COUNTROWS of fact table is not empty, and you will get the same result. Also, you dont always want this behaviour anyway.
    6. A star schema is easily explained and ignorance should not be an excuse not to do it.
    7. If you hide all non measures in a fact, you will show up as a fact, making it easier to discern dimensions from facts. That is not possible with a flat table if you still want to be able to select attributes.
    While Im a big fan of Keep It Simple Stupid, a star schema is a bare necessity for Power BI developers, new or experienced.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  27 днів тому

      Well, one could quibble with a star schema with multiple fact tables, I would call that multiple stars in a single semantic model. And if that's the entire reason for a star schema and you only have one fact table then I guess we agree that a star schema isn't necessary in most cases.
      The rest is pretty much missing the entire point of the video, but I'm out at a bar and don't feel like responding point by point. Maybe later.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  26 днів тому +1

      Back to this, these are all fine IT arguments. But the video was not made from an IT perspective. It was made from a business user perspective. This is why IT people and business people never see eye-to-eye on things. IT people can only think and talk IT. Unfortunately, when it comes to tech, the narrative is always driven by IT people and not business people and so the business person's perspective is generally lost. That's the point I'm making in this video and, of course, because you can only think IT, you totally whiffed on the entire message of the video. Very IT of you.

    • @MDevion
      @MDevion 26 днів тому +1

      @@MicrosoftHatesGreg @MicrosoftHatesGreg Responding to my comment in a bar. Cheers!
      Anyway, these are not IT arguments. There functional comments. Im not even an IT person, I graduated in Business Ecomics. I consult from big to small business and everyone understands a star schema.
      These users are usually accountants and controllers. Making a Star Schema is not IT, people were already doing in Excel with PowerPivot.
      The question you pose is, is Star Schema really necessary because of development time and understanding of the model. In a single pivot table, maybe not, but experience has taught me, it never stays at 1 pivot table. Thats not an IT argument, its common sense.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  25 днів тому

      Eh, I just posted an analysis of Power BI feature releases and I don't see that stretching beyond a single table data model for any reason. There are lots of cases where this is true, probably the majority of the cases. Maybe not the majority of enterprise cases, but the majority of normal, everyday, run-of-the-mill, hey I just want to get a quick answer to something. Like with Fabric Killed the Power BI Star. Nothing to be gained with a star schema.
      As far as "most users are accountants and controllers", that may be who you interact with but that is not the majority of Power BI users, I assure you.
      Also, back to the original post, I do TI on single table data models all the time. You just don't use CALCULATE or DAX's TI functions, which I tend not to use anyway.

    • @WorldWideGarvin-2023
      @WorldWideGarvin-2023 25 днів тому +1

      @@MicrosoftHatesGreg seconding the issue with IT perspective - if they did THEIR job better, we wouldn't have to be diddling around with cheesy desktop schemas.

  • @vdj6298
    @vdj6298 28 днів тому +4

    This is a tiny 1 thousand row dataset , not even 1 million let alone thousands. Star schema is far more performant than a single table & it is extremely easy to setup.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  28 днів тому

      Exactly, the vast majority of business user cases are extremely simple. And "extremely easy" is relative. Compared to a single table data model, star schemas are far, far more complex.

  • @douglascory
    @douglascory 24 дні тому +1

    It would be interesting to see one of these but comparing the Incremental Refresh Performance in the PBI Service (With vs Without)

  • @Henrik.Vestergaard
    @Henrik.Vestergaard 2 дні тому

    Awesome points, no nonsense approach thanks Greg. Nb, dont you hate the gifs on the forum, I sure do?

  • @WorldWideGarvin-2023
    @WorldWideGarvin-2023 25 днів тому +3

    It is heresy to refuse to drink the Star Schema Kool Aid. And yet, as Greg points out, most Power BI reporting runs will relatively small datasets, which is obviously true: Simple Pareto Analysis. In my experience, any dimensions under five million rows work very well when partially denormalized. Of course, maybe I am just a very, very good architect. And the Schema Fanatics? Often full of hot air (the optimistic substance). It is easy to forget Classic YAGNI - You Ain't Gonna Need It, at least for the first few iterations of any app or reporting requirement. And that's about as far as 99% of them go. Complexity is often High Tech Onanism - yeah, a feel good ain't I really clever - love me & my magic tricks, rather than getting to the nut of the matter.
    Regarding those who are concerned about scaling to "entereprise grade", certainly a concern for a handful of critical apps, note that you only need to scale UP because you are working with local rinky-dink reporting solutions rather than feeding DOWN from well-designed and professionally curated Data Marts.

  • @hamishmaxa6509
    @hamishmaxa6509 28 днів тому +2

    Just a side note. You need to add 2-3 seconds per field you add to the table visual. Because, you know, the default 'helpful' behaviour of Power BI is to go ahead and aggregate columns to Count of or Sum of, because........

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  28 днів тому

      Yeah, in my original run through I set the order id to not summarize but forgot when recording the video. Doesn't really impact the performance tests at all.

  • @richardmartin7904
    @richardmartin7904 28 днів тому +4

    I guess my counter is that you learn a lot more thinking at an enterprise level. And a lot of business users should be trying to think at a larger scale because it's good for them :). I also find a proper semantic model easy to understand whereas a flat table is just one big mess.
    I would like to see more discussion on this. I also thought it was understood that vertipaq engine is so efficient even flat tables get nice performance for simple use cases. But at scale flat tables can falter.
    If you have evidence contradicting the Sqlbi guys I am sure they would love to see it. I don't think they would disagree about small data size.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  28 днів тому +1

      This is absolutely about scale. My guess is that 80% or more of semantic models in Power BI don't ever need to be more complex than a single table data model and similarly 80%+ of Power BI users are business users that just want to get do some data analysis and visualization without having to know all the complexity of the product.
      Also, consider this, up until 4 years ago, the data model in Tableau was essentially a single, flat table. All tables added to the data model ended up as a single, flat table in the physical layer. And that worked just peachy for a long, long time and ushered in a huge wave of data analytics and visualization which Power BI simply rode and struggled to catch up to for years.

    • @richardmartin7904
      @richardmartin7904 28 днів тому

      @@MicrosoftHatesGreg I meant to ask. I do not know Tableau but I have heard it falters with more complex models because its just a flat table. And may also struggle at scale. Do you have knowledge of this? I am curious how Tableau has done so well since it violates the data warehouse toolkit.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  28 днів тому

      The new data model for Tableau actually introduces a logical semantic model layer that keeps the data normalized instead of de-normalizing it. So it's essentially the equivalent of Power BI's semantic model. There are some complex data model scenarios that are "unsupported", like multiple fact tables related to multiple shared dimension tables and directly relating 3 or more fact tables on shared dimensions but both have workarounds or just caveats to keep in mind. This is the best write-up I have found on the topic: help.tableau.com/current/pro/desktop/en-us/datasource_datamodel.htm

    • @WorldWideGarvin-2023
      @WorldWideGarvin-2023 25 днів тому +1

      One thing you rarely find is an enterprise-grade Star Schema. Every discussion I see starts with a core Fact Table and 4-5 points on the star called Dimensions. And yet, in any significant enterprise-grade application, more than half of those "dimensions" are actually significant and data-intense Facts. So how does one actually scale a Star Schema? Bus Architecture (Kimball)? Multiple focused Datamarts (Inmon)? Or just common sense (slightly denormalized 3NF databases Referential, Domain and Entity Integrity enforced)? Aside from rapid prototyping or proof of concept experiments, I don't see why scaling UP should ever be an issue. If truly "enterprise" then start with the enterprise and work DOWN to a manageable Semantic Model. It is far easier to start with a solid source of data and denormalize a bit with simple SQL Views than build-up from a lot of Power Query and Dax mumbo-jumbo. (shades of Roche's Maxim for sure)

  • @danielnichols3594
    @danielnichols3594 28 днів тому +4

    Deckler's Maxim? LOL. Personally, I create a lot of semantic models in a user-led, self-service environment but I am always thinking about how my models could be deployed at Enterprise scale. I solve the slicer issue by putting a COUNTROWS DAX function in my fact table, then adding it as a filter to the slicer visual with filter value > 0. It solves the issue & I use this technique all the time. I won't speak to size & query performance issues since you are quite adept at that & I am still learning query timings. But I will say that a star schema w/ relatively uncluttered fact tables, with dimensions pushed out as 'lookup' tables, makes for a simpler data model from a design perspective and user comprehension standpoint. I definitely concede your point that Power BI newbies can get by with a single table model and, hopefully, later on migrate to more complex model structures.

    • @MicrosoftHatesGreg
      @MicrosoftHatesGreg  28 днів тому

      That's a good work-a-round but even more technically complex than just using bi-directional relationships. I mean, code. And, to be clear and I hope I made it clear in the video, star schemas are fine for advanced users and IT type people, but it's overkill for the vast majority of end user business scenarios. It's like hitting a gnat with a sledge hammer. The point I'm trying to make in this video is that, similar to CALCULATE, the common narrative out there simply doesn't apply to the vast majority of people using this product. It's unfortunate that the, by far, largest persona is almost never considered in discussion about Power BI be that CALCULATE, star schemas, etc.

    • @danielnichols3594
      @danielnichols3594 7 днів тому +1

      @@MicrosoftHatesGreg Absolutely. I’m a bit of a purist but it’s really hard when you have a relational data warehouse & no SQL background to develop star schemas. Just get started & pick up more skills along the way. I learned SQL over a few years and it has helped my modeling enormously. Still have some annoying org IT restrictions that complicate things.

  • @UlyssesHaq
    @UlyssesHaq 28 днів тому +2

    "Straight to Hell" 😅😅

    • @WorldWideGarvin-2023
      @WorldWideGarvin-2023 25 днів тому +1

      Straight to Hell on board the Deckler Express - The Ultimate Power BI Contrarian Experience