Can't create a relationship? Time for a surrogate key in Power BI!

Поділитися
Вставка
  • Опубліковано 30 вер 2024

КОМЕНТАРІ • 45

  • @Naryan17
    @Naryan17 5 місяців тому +24

    I usually create a new column in both tables to use for the relationship. Is there a reason why a merged column like EntityID:ChildID for the relationship is worse than the presented Solution?

    • @douglascory
      @douglascory 5 місяців тому

      Would also be interested to know

    • @yuuzhang
      @yuuzhang 4 місяці тому

      The compression for a integer (consecutive and Starting from 1) ist much better than for the string.
      Could reduce your dataset size and Refresh significantly (from 10gb to

    • @markwallstrom9994
      @markwallstrom9994 3 місяці тому

      @@yuuzhang I am not so sure. Is it worth doing a join operation in PQ instead of just concatenate in each table? PQ is notoriously slow at joining from my experience. Besides, if you join you break the folding, whereas with a concatenated key in a SQL statement, for example, you do not. Your example of 10gb -> 1gb would not be true for a "normal" dimension where this is most common (say a table with 2m records).

  • @j-town4408
    @j-town4408 5 місяців тому +27

    I'm confused. Why wouldn't you just combine the entity id and child ID into a unique value/surrogate key (i.e. 1-1, 1-2, 1-3, etc.)?

    • @mrcosmojones880
      @mrcosmojones880 5 місяців тому +4

      My initial thoughts too, Definitely would love to know if this isn't the optimal solution!

    • @skv4611
      @skv4611 5 місяців тому +1

      Reason could be indexing

    • @Tyberes
      @Tyberes 5 місяців тому +2

      There's three answers that I can give you here.
      1. It's just good practice for all of your records to have their own unique identifier anyways.
      2. Ideally you'd have this done upstream so when the system create the record it's coming in with a unique key in both the parent and child. (This would be on the software dev team to implement)
      3. You'd need to do the compound key generating steps in every individual table that references the parent in this case, and that can be a hassle.

    • @Tyberes
      @Tyberes 5 місяців тому +1

      But yea, just the steps of duplicating and then merging all of the relevant "key" columns on both sides of the relationship would achieve the same result.

    • @MrSparkefrostie
      @MrSparkefrostie 5 місяців тому +2

      Two things in my mind, your key would be hyphenated and your key would be hyphenated, ok first one is just about how it looks and how you can easily say add the key on both sides so a 1 and a 2 is 3 and a 2 and a 1 is a 3 so you will need to pad the child key. So 1 becomes 1000 so now you have 1002 or 2001 in the examples above, the issue is if 999 suddenly is no longer enough. That's the one potential issue, the other issue is the text value, if you keep everything integer then you gain some efficiency, each space only has 10 available characters, for text it's 36 if there are no special characters, but I am sure it's far more with all the special characters. Apologies for the long answer and no bullet point, phone not doing line breaks

  • @kevindoherty8788
    @kevindoherty8788 5 місяців тому +13

    But wouldn't it be nice if Semantic Model supported multi-column relationships? We can dream.

  • @Tyberes
    @Tyberes 5 місяців тому +3

    Gosh I wish they'd let us just do joins on multiple columns at once. Skip this whole process entirely.

  • @Baldur1005
    @Baldur1005 4 місяці тому +1

    Well somehow the ChildId and EntityId has made it to the Fact Table, so I guess Star Schema is working. Definitely what is advised here is bad practice (you have your disclaimer in the video). The unique key need to be created for dimension table - period and repopulated in DATA WAREHOUSE during ETL.

  • @opod84
    @opod84 5 місяців тому +2

    For measures, you can also use the TREATAS function for virtual relationships.

  • @peterdaniels3428
    @peterdaniels3428 5 місяців тому +2

    Such a clean and effective description, Adam. It gets around using a term like "non-semantic", and clearly shows the importance of retaining the actual "semantic" (or "business") composite key, too. I appreciate your teaching style!

  • @dragobrumen
    @dragobrumen 4 місяці тому +1

    I tried this approach on a fact table with several million records. Initially, it was too slow and even failed during refreshes. To resolve this, I added an Index column for ProductID in Power Query and applied the changes. Next, I created a calculated column in the fact table using both CategoryID and ProductID to properly integrate ProductID. After setting up these changes and establishing the relationships, the process ran much more smoothly. In fact, this was part of the optimizations I implemented for a customer.

  • @Blog-igorbelovRu
    @Blog-igorbelovRu 4 місяці тому

    Здравствуйте
    Подскажите пожалуйста, а как можно синхронизировать визуальный элемент в Power BI?
    К примеру я создал фигуру и поместил в нее три фильтра, затем все это сгруппировал. Затем этот элемент я помешаю на другие созданные вкладки и если я вношу изменение в эту фигура, то эти изменения автоматически распространяются на на остальные вкладки где содержится данный объект.
    --
    Hello
    Please tell me, how can I synchronize a visual element in Power BI?
    For example, I created a shape and put three filters in it, then grouped it all. Then I will add this element to other created tabs and if I make a change to this shape, then these changes are automatically propagated to the other tabs where this object is contained.

  • @Milhouse77BS
    @Milhouse77BS 5 місяців тому +1

    Microsoft Fabric will make it easier to do surrogate key generation and integration with fact tables in the "back room" as Kimball would like.

  • @sauravtomar341
    @sauravtomar341 4 місяці тому

    Anyone who reads this comment can reply ......If i am creating an power BI dashboard and publish it as app. Now my user wants to create new graphs or KPIs in the app in a blank page. Is it possible to have this capability in Power BI.

  • @Lebkuecher
    @Lebkuecher 4 місяці тому

    Like several other people have mentioned, I have done stuff like this before, but I would have created a custom column to combine the Entity and Child Ids, using a dash or hyphen or something. This would ensure consistency across the two tables and avoid the Merge step, essentially being more effiecent.....because I'm what...I'm not lazy, I'm efficient, lol. Thanks!

  • @Chiz_1
    @Chiz_1 5 місяців тому +1

    In similar scenarios I create a new key column in both tables that is a merge of child id and product id. That way all the information is in the actual key and easier to validate.

    • @davejl8982
      @davejl8982 5 місяців тому +1

      Definitely. This way, it provides greater data transparency and traceability during data troubleshooting..

  • @wojciechjaniszewski9086
    @wojciechjaniszewski9086 5 місяців тому +2

    I didn't know, PQ merge operation supports composite key :) Every day is a school day! Thank you

  • @HarshithaNagaraj-lv7vc
    @HarshithaNagaraj-lv7vc 4 місяці тому

    hi, I am facing issue in refreshing a dataset (api) in services, i am getting SSL certificate issue but in local am able to refresh please if anyone has any idea please help me out

  • @ammarahmed5981
    @ammarahmed5981 2 місяці тому

    Adam, you are awesome.

  • @akki4718
    @akki4718 4 місяці тому

    Q: can’t we set parameters rule for postgre sql in deployment pipeline

  • @roshannawaz1977
    @roshannawaz1977 4 місяці тому

    if we refresh will the surrogate key update as well or should we update it manually?
    can ayone help

  • @Acheiropoietos
    @Acheiropoietos 4 місяці тому

    You can do this in your sleep, can’t you? 😂

  • @alvarorodriguezlasso
    @alvarorodriguezlasso 4 місяці тому

    Great, regards from Cali-Colombia

  • @He3nt6
    @He3nt6 5 місяців тому

    спасибо вам , что вы есть

  • @paullevchuk
    @paullevchuk 5 місяців тому

    Nothing new