Working with Slowly Changing Dimensions in Power BI

Поділитися
Вставка
  • Опубліковано 5 жов 2024

КОМЕНТАРІ • 73

  • @evnpresson7258
    @evnpresson7258 3 роки тому +5

    I literally was just talking to our data engineer about how to model with SCD2 data when I saw this video was posted. So timely!

  • @koenverbeeck9514
    @koenverbeeck9514 3 роки тому +7

    Great video and interesting solution with the calculation group. What I typically do is add another column to the store dimension called "current manager". It has the advantage of not modifying the fact and not adding extra measures (or calculation groups). Users just have to remember to use the correct column to view the manager.

    • @GuyInACube
      @GuyInACube  3 роки тому

      Yep, another great solution Koen.

  • @BryanCampbell888
    @BryanCampbell888 3 роки тому +5

    Probably one of the most useful videos you guys have done thus far... It'll be over the head of some folks, and it's a lot to take in, but you keep the length short enough that you can "skim watch" and then "study watch" later... Kudos 👍👍👍

  • @mytwospence
    @mytwospence 3 роки тому +2

    Happy to see the Data warehouse toolkit getting a shout-out! Great book for building DW's.

  • @dizzyharris2658
    @dizzyharris2658 3 роки тому

    This was super helpful. I was able to apply this same sort of logic to a different report for "Total IT Tickets open more than 10 days, month over month"

  • @thomasivarsson2468
    @thomasivarsson2468 3 роки тому +2

    It was fun to see how you can use Power Query here. I sent Parick some examples of how to solve this in a DWH with Anchor modelling that makes it possible to use several SCD-types. You can also avoid putting the manager in the same dimension table as the store and create two dimensions. This is easy and will record all changes but you need to do this in the ETL.

    • @GuyInACube
      @GuyInACube  3 роки тому +1

      Yes. Thanks @Thomas for all the information that you shared.

    • @thomasivarsson2468
      @thomasivarsson2468 3 роки тому

      @@GuyInACube Happy to help!

  • @rehanahmed6052
    @rehanahmed6052 3 роки тому +2

    I have implemented Scd using SSIS and that was quite easy to do.
    Anyways thank you Patrick for the great video!
    Surely I'll implement the same in Power BI too..
    Cheers

  • @mdhidayat5706
    @mdhidayat5706 3 роки тому +4

    Love this Patrick! Hope to see more DW concepts videos that can be used in PowerBI!
    Thank you, as always :)

  • @dirkuecker6145
    @dirkuecker6145 3 роки тому +2

    Thanks Patrick. I love your way of thinking in terms of where the changes need to happen. Depending on the requirements not all could be solved in the Front end tool, even if it is Power BI. Some things are better to be fixed in the DWH. I love your videos and would love to work with you in a project.

  • @DaveRuijter
    @DaveRuijter 3 роки тому +1

    Love the video, Patrick! In your query examples in the first minutes you're using StoreID where that should be StoreAK.

    • @GuyInACube
      @GuyInACube  3 роки тому

      Great point. The queries are actually point at the Source where it is actually StoreID, which is the StoreAK in the DW. It is kind of confusing.

  • @alexandermilland6674
    @alexandermilland6674 9 місяців тому

    Would love a video about doing some other calculations, like how long was Adam manager of X store?
    Especially for an example where you might have multiple columns you are doing SCD2 logic on.

  • @dimitridelizee4032
    @dimitridelizee4032 Рік тому +1

    Amazing video. I can't download the examples. I have this message: Account is cancelled and can not accept new subscribers.
    Can you make them available again. Thanks

  • @TheJustinjet
    @TheJustinjet 2 роки тому

    Really great work! Excellent communicator.

  • @Negi_From_Pauri
    @Negi_From_Pauri 2 роки тому

    Dear Patrick bro you are the collest american power bi God.

  • @SoheilBakhshinz
    @SoheilBakhshinz 3 роки тому +1

    Very good topic Patric.
    Just one thing, in minute 2:51 you mention type 2 is over-write, I suppose you meant type 1 is over-write. So SCD0, fixed dimension, SCD1 over-write, SCD2 historical.

    • @jessmattingly6098
      @jessmattingly6098 2 роки тому +1

      I alsso noticed this and wanted to see if anyone else mentioned it. Glad to see someone else noticed it as well.

  •  3 роки тому

    Very good explanation for a difficult topic to put in place in Power Bi

  • @fabiendelaloye6370
    @fabiendelaloye6370 3 роки тому +2

    Hi from Swizerland! Nice video!
    Question or idea for a new video:
    With SCD2, How to exploit the startDate and endDate (as shown in 4:17) in power bi ?
    For example display a list of all active rows in a date (between startDate and endDate)
    -> and, challenge... change the reference date with a slicer...
    Bye
    Fabien

  • @Adam-hy9ou
    @Adam-hy9ou Рік тому +1

    The steps taken to get the surrogate key into the fact table is the only thing missing. So adding the surrogate key based on the From and To dates in the SCD table. Lots of solutions via google but none particularly elegent.

    • @Milhouse77BS
      @Milhouse77BS 10 місяців тому

      ua-cam.com/video/MZGme1YTEtA/v-deo.htmlsi=i1OpS-5bcvgkhTy5

  • @Mim_BI
    @Mim_BI 3 роки тому

    Patrick, you nail it !!!!

  • @nigeltufnel4031
    @nigeltufnel4031 3 роки тому

    SCD, great for DW, challenge for BI. Great stuff here!

  • @stephiesobgoum
    @stephiesobgoum Рік тому +1

    Merci beaucoup pour cette vidéo Patrick ainsi qu'à la team @GuyInACube
    Moi j'ai une autre préoccupation toujours avec la gestion du SCD Type 2 dans POWERBI.
    En fait, pour chaque dimension, j'ai un StartDate , un EndDate et un Statut (Actif / Expiré).
    Je souhaite sur PowerBI disposer un filtre de date (DateExtraction): lorsqu'une date sera sélectionner, PowerBI devra présenter uniquement les données tels qu'elle se trouvait à cette Date.
    Donc pour chaque Mesure dans le table de fait j'ai rajouter à la fonction calculate un filtre qui sélectionne une ligne lorsque la startDate

  • @albertobaraza
    @albertobaraza 3 роки тому

    Awesome video, easy to follow and great idea to implement on future reports

  • @mirinda101
    @mirinda101 3 роки тому

    Hello Patrick, i just want to mention that for SCD2 (historical) the start date and end date is overlapping. What i mean is that if first manager Adam EndDate is 20210116 for the new manager Patrick it has to be from the next day. I guess you cannot be managers in the same time both of you. Imagine that you query the manager in act for the 20210116 - the result will be both of you - if you query on a range of dates - for example - "select Manager from table where today() between StartDate and EndDate" - and today is 20210116. Anyway, nice videos and useful aswell, great job!

  • @leonidiakovlev
    @leonidiakovlev 3 роки тому

    Thanks! It will come handy for me to track the employees moving from one team to another .)

  • @StuartGreenBinkyBMF
    @StuartGreenBinkyBMF 3 роки тому +1

    Hi Patrick, how would power BI handle an SCD for trending changes, such as an item is either rented, unrented or being repaired? The SCD would log the times in and out of the shop, So you could trend over the year what status the item was in?

  • @fpa89
    @fpa89 Рік тому

    Great work, thanks

  • @neverGrowup1224
    @neverGrowup1224 3 роки тому

    great method to use ''userelationship()' function

  • @ysrfan2345
    @ysrfan2345 3 роки тому

    Great video Patrick! As always thank you.:)

  • @sachin.tandon
    @sachin.tandon 2 роки тому

    Excellent video. I really liked the introduction and context you gave at the beginning of the video, and the way you explained things with clear examples. I'm doing a similar project, and wanted to try and create from source data, a collection of fact and dimension tables using Power Query. However one of the tables I need to create, (or which I think I should be creating, and would like to create) is a Type 2 - S.C.D.. But after doing some web research to find out whether this is possible, and reading some of the best practice documentation on STAR Data Models, on the official Power B.I. site, I've concluded that it's just not possible. So I think I'm going to have to create a dimension table whose primary key is just a composite key of all the attribute fields within it, and then link it the Dimension Table that way to the Fact Table. It would be good to understand though, how to do such a thing properly. But thanks for the video anyway.

  • @VikingGuard
    @VikingGuard 3 роки тому +1

    Great video Patrick :)

  • @martijnvermeulen3982
    @martijnvermeulen3982 Рік тому

    Does the use of views rather than straight tables not break direct-query models?

  • @Hinkakan
    @Hinkakan Рік тому

    What if you don't have the surrogate key in the Fact table? How do you do the join? As far as I can tell, you cannot do joins in PowerBI on GREATER/LESS than?

  • @mehdifadhli6413
    @mehdifadhli6413 3 роки тому

    Thanx Patrick !!

  • @peterfarkas672
    @peterfarkas672 Рік тому

    The title is wrong it is not "SCD IN PB" it is "SCD IN SQL". All the issue which pop up with SCD was solved in the source. I have a project where the source is excel, and no chance to make changes before porwer bi. :(

  • @WilliamGardnerOrderedSteps
    @WilliamGardnerOrderedSteps 3 роки тому

    Yaa Patrick, super video and a great help, where do we get the code or resources for this please.

  • @김차수-k2u
    @김차수-k2u 3 роки тому

    great!! and function is littlcel compliacated but amazing

  • @geirberge8971
    @geirberge8971 2 роки тому

    Thank you very much, very helpful. I am struggling to find a good solution with scd2 and hierarchy with the path function. Do you have any suggestions?

  • @rqn2274
    @rqn2274 3 роки тому

    nice video. Btw does anyone know why we are still missing Power BI update from last month?

  • @ediths1784
    @ediths1784 2 роки тому

    How do you handle SCD2 with Dataverse as data source?

  • @adanyoshimoto7504
    @adanyoshimoto7504 2 роки тому

    Amazing!!!

  • @arunt2007
    @arunt2007 3 роки тому

    Thank you. Excellent video. Column masking based on user login. Any thoughts?

  • @felipefrancisco01
    @felipefrancisco01 3 роки тому

    that's amazing

  • @santoshraghunath8133
    @santoshraghunath8133 3 роки тому

    You are awesome

  • @hrz3942
    @hrz3942 3 роки тому

    Your Company consists of the regulatory and non regulatory business divisions for which segregated data sets are required due to compliance reasons but high level management, Finance & HR requires combined access to both the divisions. Some of the HR data elements like salary & personal information columns are required to be accessed by HR only. Further for operational reporting, line of business managers require access to their respective areas, general managers to their geographical regions, team leads to their teams and personnel to their own activity reports.
    Considering above organizational structure and requirements what are the security models and methods you will suggest to apply for securing reporting and data access?
    Anyone have an answer for this?

  • @shafa7668
    @shafa7668 2 роки тому

    I gave up mate. I will watch later.

  • @neverGrowup1224
    @neverGrowup1224 3 роки тому

    Really nice video! and just wonder if anyhow Guy in a Cube could make more data transformation videos! Now, I am facing a data cleaning question, the business needs to match the first six characters of a data field to another one, after the comparing, the unmatched ones could be modified by hand directly. I have tried PowerBI to extract the characters and use the fuzzy match to filter out the values in data transformation, but then it seems like that Powerbi cannot change the original value, so does it mean the unmatched value has to be changed from the source?

  • @sravankumar1767
    @sravankumar1767 3 роки тому

    it is new option in power bi But i didn't able to understand. I had seen SCD in SSIS . I was worked on that. It's simple to understand. This is little bit confusion.

  • @podamaire
    @podamaire 2 роки тому

    why would you need to modify Fact to add new custom store surr key ,instead couldnt you handle in the measure by putting a Case statement ? also custom surr key is needed in dikension, i think we could live only with the Current manager flag ?

  • @StephenEngine
    @StephenEngine 2 місяці тому

    Does anyone know how to use the PATH dax formula on a Type 2 scd table such as dim_emp to implement Row Level Security? For example:
    Yesterday, John reported to Dan who reported to Jack. John's chain of command using PATH would be "Jack|Dan|John".
    However, today, a new record was inserted into dim_emp where John's reporting remained to Dan but now Dan reports Maddy making John's PATH as "Jack|Dan|Maddy". Because a new row was inserted into dim_emp for John, his old row is now flagged as "N" for the "Active Record" boolean. This would be the same for Dan|John to Dan|Maddy but only focusing one exaple.
    Anyway, that said, when running the PATH dax function on this type 2 scd, I get an error that says "each value in dim_emp[name] must have the same value in dim_emp[supervisor_name]. the value 'John' has multiple values.
    The issue is that PowerBI is not able to write the PATH command because there are multiple rows for John. If I filter for active records only, PATH works just fine but when showing the historical and most recent records together, PATH breaks. I even tried running an IF(active_record = "Y", PATH, blank) and that still doesn't want to do it.
    Any suggestions? Thanks!

  • @riazuddin7493
    @riazuddin7493 3 роки тому

    I have a column in my table name reason for leaving and based on that I want to add another column which is Volunteer attrition or Involunteer attrition i.e. if reason for leaving for an employee appears as Retired then the new column should show Involunteer attrition whereas for resignation it should show as volunteer attrition. Can you help me add this column please? Also how can I calculate the percentage of volunteer and involunteer attrition based on the total attrition for the whole year which is 1157.

  • @lipidsled
    @lipidsled 3 роки тому

    Same video but for Tableau please?

  • @micahdail
    @micahdail 3 роки тому

    What if I really really don't want to add another column to my fact? My fact table is already massive as is.
    How upset would you be if I kept my SCD as a table related to the fact using a surrogate key, but then calculated my 'current' dimension (current indicator = 1) as an additional table to add to the model as a snowflake? I would use the SCD alternate key to relate to the (now primary key of the) 'current' table snowflake. Then users that need point-in-time reporting can use the SCD table, and users that need current reporting can use the 'current' table, which will push the alternate key filter from the 'current' table to the SCD which will push the surrogate key filter to the fact.
    Again, I know that in ideal world we should avoid snowflakes, but this seems like the most straightforward way of accomplishing it without monkeying my fact. Thoughts?

    • @stevecoleman9522
      @stevecoleman9522 Рік тому

      I know this is 2 years old now, but I'm researching how to do SCD2 in a model, and I'm with you...I don't think we can have a "Current Key" column on the fact table as this value could change for old facts, and you don't want to have to reprocess the entire fact table in case the Current Key column is different.
      But you could introduce a kind of bridge table between the Store table and the Fact table. This bridge table would be a table would have 2 columns in it: StoreSK (surrogate key) and CurrentSK (current key). I would think you could build a view that would associate each StoreSK with the current version of the store. Then create a relationship between the Store.StoreSK and the StoreBridge.CurrentSK columns (it would be many to one so it may require bi-direcitonal filtering!) , an then an inactive relationship between the StoreBridge.StoreSK and the Fact.StoreSK.
      No need for a separate dimension. And the Calc Group would disable the main relationship between Store and the Fact table, and activate the relationship between the StoreBridge and Fact table.
      I haven't tried this yet, but I like it! No need to refresh facts, and you always can access the current record from the Store table.

  • @tekezeshewa
    @tekezeshewa 3 роки тому

    Interesting

  • @pabeader1941
    @pabeader1941 3 роки тому +1

    oops! At 2:30 you say "Type 2" but you mean to say "Type 1"

  • @sushmita4713
    @sushmita4713 2 роки тому

    I just messaged about this and this shows up. Not sure if i should be thankful for spying on my data or not

  • @dariuszspiewak5624
    @dariuszspiewak5624 3 роки тому

    That's again foking bananas :)))

  • @Negi_From_Pauri
    @Negi_From_Pauri 2 роки тому

    very funny man