Great video and interesting solution with the calculation group. What I typically do is add another column to the store dimension called "current manager". It has the advantage of not modifying the fact and not adding extra measures (or calculation groups). Users just have to remember to use the correct column to view the manager.
This was super helpful. I was able to apply this same sort of logic to a different report for "Total IT Tickets open more than 10 days, month over month"
Probably one of the most useful videos you guys have done thus far... It'll be over the head of some folks, and it's a lot to take in, but you keep the length short enough that you can "skim watch" and then "study watch" later... Kudos 👍👍👍
It was fun to see how you can use Power Query here. I sent Parick some examples of how to solve this in a DWH with Anchor modelling that makes it possible to use several SCD-types. You can also avoid putting the manager in the same dimension table as the store and create two dimensions. This is easy and will record all changes but you need to do this in the ETL.
I have implemented Scd using SSIS and that was quite easy to do. Anyways thank you Patrick for the great video! Surely I'll implement the same in Power BI too.. Cheers
Thanks Patrick. I love your way of thinking in terms of where the changes need to happen. Depending on the requirements not all could be solved in the Front end tool, even if it is Power BI. Some things are better to be fixed in the DWH. I love your videos and would love to work with you in a project.
The steps taken to get the surrogate key into the fact table is the only thing missing. So adding the surrogate key based on the From and To dates in the SCD table. Lots of solutions via google but none particularly elegent.
Amazing video. I can't download the examples. I have this message: Account is cancelled and can not accept new subscribers. Can you make them available again. Thanks
Would love a video about doing some other calculations, like how long was Adam manager of X store? Especially for an example where you might have multiple columns you are doing SCD2 logic on.
Hi Patrick, how would power BI handle an SCD for trending changes, such as an item is either rented, unrented or being repaired? The SCD would log the times in and out of the shop, So you could trend over the year what status the item was in?
Very good topic Patric. Just one thing, in minute 2:51 you mention type 2 is over-write, I suppose you meant type 1 is over-write. So SCD0, fixed dimension, SCD1 over-write, SCD2 historical.
Hello Patrick, i just want to mention that for SCD2 (historical) the start date and end date is overlapping. What i mean is that if first manager Adam EndDate is 20210116 for the new manager Patrick it has to be from the next day. I guess you cannot be managers in the same time both of you. Imagine that you query the manager in act for the 20210116 - the result will be both of you - if you query on a range of dates - for example - "select Manager from table where today() between StartDate and EndDate" - and today is 20210116. Anyway, nice videos and useful aswell, great job!
What if you don't have the surrogate key in the Fact table? How do you do the join? As far as I can tell, you cannot do joins in PowerBI on GREATER/LESS than?
Thank you very much, very helpful. I am struggling to find a good solution with scd2 and hierarchy with the path function. Do you have any suggestions?
Hi from Swizerland! Nice video! Question or idea for a new video: With SCD2, How to exploit the startDate and endDate (as shown in 4:17) in power bi ? For example display a list of all active rows in a date (between startDate and endDate) -> and, challenge... change the reference date with a slicer... Bye Fabien
Excellent video. I really liked the introduction and context you gave at the beginning of the video, and the way you explained things with clear examples. I'm doing a similar project, and wanted to try and create from source data, a collection of fact and dimension tables using Power Query. However one of the tables I need to create, (or which I think I should be creating, and would like to create) is a Type 2 - S.C.D.. But after doing some web research to find out whether this is possible, and reading some of the best practice documentation on STAR Data Models, on the official Power B.I. site, I've concluded that it's just not possible. So I think I'm going to have to create a dimension table whose primary key is just a composite key of all the attribute fields within it, and then link it the Dimension Table that way to the Fact Table. It would be good to understand though, how to do such a thing properly. But thanks for the video anyway.
The title is wrong it is not "SCD IN PB" it is "SCD IN SQL". All the issue which pop up with SCD was solved in the source. I have a project where the source is excel, and no chance to make changes before porwer bi. :(
Really nice video! and just wonder if anyhow Guy in a Cube could make more data transformation videos! Now, I am facing a data cleaning question, the business needs to match the first six characters of a data field to another one, after the comparing, the unmatched ones could be modified by hand directly. I have tried PowerBI to extract the characters and use the fuzzy match to filter out the values in data transformation, but then it seems like that Powerbi cannot change the original value, so does it mean the unmatched value has to be changed from the source?
Your Company consists of the regulatory and non regulatory business divisions for which segregated data sets are required due to compliance reasons but high level management, Finance & HR requires combined access to both the divisions. Some of the HR data elements like salary & personal information columns are required to be accessed by HR only. Further for operational reporting, line of business managers require access to their respective areas, general managers to their geographical regions, team leads to their teams and personnel to their own activity reports. Considering above organizational structure and requirements what are the security models and methods you will suggest to apply for securing reporting and data access? Anyone have an answer for this?
I have a column in my table name reason for leaving and based on that I want to add another column which is Volunteer attrition or Involunteer attrition i.e. if reason for leaving for an employee appears as Retired then the new column should show Involunteer attrition whereas for resignation it should show as volunteer attrition. Can you help me add this column please? Also how can I calculate the percentage of volunteer and involunteer attrition based on the total attrition for the whole year which is 1157.
Merci beaucoup pour cette vidéo Patrick ainsi qu'à la team @GuyInACube Moi j'ai une autre préoccupation toujours avec la gestion du SCD Type 2 dans POWERBI. En fait, pour chaque dimension, j'ai un StartDate , un EndDate et un Statut (Actif / Expiré). Je souhaite sur PowerBI disposer un filtre de date (DateExtraction): lorsqu'une date sera sélectionner, PowerBI devra présenter uniquement les données tels qu'elle se trouvait à cette Date. Donc pour chaque Mesure dans le table de fait j'ai rajouter à la fonction calculate un filtre qui sélectionne une ligne lorsque la startDate
Does anyone know how to use the PATH dax formula on a Type 2 scd table such as dim_emp to implement Row Level Security? For example: Yesterday, John reported to Dan who reported to Jack. John's chain of command using PATH would be "Jack|Dan|John". However, today, a new record was inserted into dim_emp where John's reporting remained to Dan but now Dan reports Maddy making John's PATH as "Jack|Dan|Maddy". Because a new row was inserted into dim_emp for John, his old row is now flagged as "N" for the "Active Record" boolean. This would be the same for Dan|John to Dan|Maddy but only focusing one exaple. Anyway, that said, when running the PATH dax function on this type 2 scd, I get an error that says "each value in dim_emp[name] must have the same value in dim_emp[supervisor_name]. the value 'John' has multiple values. The issue is that PowerBI is not able to write the PATH command because there are multiple rows for John. If I filter for active records only, PATH works just fine but when showing the historical and most recent records together, PATH breaks. I even tried running an IF(active_record = "Y", PATH, blank) and that still doesn't want to do it. Any suggestions? Thanks!
it is new option in power bi But i didn't able to understand. I had seen SCD in SSIS . I was worked on that. It's simple to understand. This is little bit confusion.
why would you need to modify Fact to add new custom store surr key ,instead couldnt you handle in the measure by putting a Case statement ? also custom surr key is needed in dikension, i think we could live only with the Current manager flag ?
What if I really really don't want to add another column to my fact? My fact table is already massive as is. How upset would you be if I kept my SCD as a table related to the fact using a surrogate key, but then calculated my 'current' dimension (current indicator = 1) as an additional table to add to the model as a snowflake? I would use the SCD alternate key to relate to the (now primary key of the) 'current' table snowflake. Then users that need point-in-time reporting can use the SCD table, and users that need current reporting can use the 'current' table, which will push the alternate key filter from the 'current' table to the SCD which will push the surrogate key filter to the fact. Again, I know that in ideal world we should avoid snowflakes, but this seems like the most straightforward way of accomplishing it without monkeying my fact. Thoughts?
I know this is 2 years old now, but I'm researching how to do SCD2 in a model, and I'm with you...I don't think we can have a "Current Key" column on the fact table as this value could change for old facts, and you don't want to have to reprocess the entire fact table in case the Current Key column is different. But you could introduce a kind of bridge table between the Store table and the Fact table. This bridge table would be a table would have 2 columns in it: StoreSK (surrogate key) and CurrentSK (current key). I would think you could build a view that would associate each StoreSK with the current version of the store. Then create a relationship between the Store.StoreSK and the StoreBridge.CurrentSK columns (it would be many to one so it may require bi-direcitonal filtering!) , an then an inactive relationship between the StoreBridge.StoreSK and the Fact.StoreSK. No need for a separate dimension. And the Calc Group would disable the main relationship between Store and the Fact table, and activate the relationship between the StoreBridge and Fact table. I haven't tried this yet, but I like it! No need to refresh facts, and you always can access the current record from the Store table.
I literally was just talking to our data engineer about how to model with SCD2 data when I saw this video was posted. So timely!
Nice.
Great video and interesting solution with the calculation group. What I typically do is add another column to the store dimension called "current manager". It has the advantage of not modifying the fact and not adding extra measures (or calculation groups). Users just have to remember to use the correct column to view the manager.
Yep, another great solution Koen.
This was super helpful. I was able to apply this same sort of logic to a different report for "Total IT Tickets open more than 10 days, month over month"
Probably one of the most useful videos you guys have done thus far... It'll be over the head of some folks, and it's a lot to take in, but you keep the length short enough that you can "skim watch" and then "study watch" later... Kudos 👍👍👍
It was fun to see how you can use Power Query here. I sent Parick some examples of how to solve this in a DWH with Anchor modelling that makes it possible to use several SCD-types. You can also avoid putting the manager in the same dimension table as the store and create two dimensions. This is easy and will record all changes but you need to do this in the ETL.
Yes. Thanks @Thomas for all the information that you shared.
@@GuyInACube Happy to help!
Happy to see the Data warehouse toolkit getting a shout-out! Great book for building DW's.
I have implemented Scd using SSIS and that was quite easy to do.
Anyways thank you Patrick for the great video!
Surely I'll implement the same in Power BI too..
Cheers
Thanks Patrick. I love your way of thinking in terms of where the changes need to happen. Depending on the requirements not all could be solved in the Front end tool, even if it is Power BI. Some things are better to be fixed in the DWH. I love your videos and would love to work with you in a project.
Love this Patrick! Hope to see more DW concepts videos that can be used in PowerBI!
Thank you, as always :)
The steps taken to get the surrogate key into the fact table is the only thing missing. So adding the surrogate key based on the From and To dates in the SCD table. Lots of solutions via google but none particularly elegent.
ua-cam.com/video/MZGme1YTEtA/v-deo.htmlsi=i1OpS-5bcvgkhTy5
Dear Patrick bro you are the collest american power bi God.
Love the video, Patrick! In your query examples in the first minutes you're using StoreID where that should be StoreAK.
Great point. The queries are actually point at the Source where it is actually StoreID, which is the StoreAK in the DW. It is kind of confusing.
Amazing video. I can't download the examples. I have this message: Account is cancelled and can not accept new subscribers.
Can you make them available again. Thanks
Would love a video about doing some other calculations, like how long was Adam manager of X store?
Especially for an example where you might have multiple columns you are doing SCD2 logic on.
Hi Patrick, how would power BI handle an SCD for trending changes, such as an item is either rented, unrented or being repaired? The SCD would log the times in and out of the shop, So you could trend over the year what status the item was in?
Patrick, you nail it !!!!
Really great work! Excellent communicator.
Very good topic Patric.
Just one thing, in minute 2:51 you mention type 2 is over-write, I suppose you meant type 1 is over-write. So SCD0, fixed dimension, SCD1 over-write, SCD2 historical.
I alsso noticed this and wanted to see if anyone else mentioned it. Glad to see someone else noticed it as well.
Thanks! It will come handy for me to track the employees moving from one team to another .)
Very good explanation for a difficult topic to put in place in Power Bi
Glad you liked it
Awesome video, easy to follow and great idea to implement on future reports
Glad you liked it!
SCD, great for DW, challenge for BI. Great stuff here!
Hello Patrick, i just want to mention that for SCD2 (historical) the start date and end date is overlapping. What i mean is that if first manager Adam EndDate is 20210116 for the new manager Patrick it has to be from the next day. I guess you cannot be managers in the same time both of you. Imagine that you query the manager in act for the 20210116 - the result will be both of you - if you query on a range of dates - for example - "select Manager from table where today() between StartDate and EndDate" - and today is 20210116. Anyway, nice videos and useful aswell, great job!
What if you don't have the surrogate key in the Fact table? How do you do the join? As far as I can tell, you cannot do joins in PowerBI on GREATER/LESS than?
Great video Patrick! As always thank you.:)
Thank you very much, very helpful. I am struggling to find a good solution with scd2 and hierarchy with the path function. Do you have any suggestions?
Hi from Swizerland! Nice video!
Question or idea for a new video:
With SCD2, How to exploit the startDate and endDate (as shown in 4:17) in power bi ?
For example display a list of all active rows in a date (between startDate and endDate)
-> and, challenge... change the reference date with a slicer...
Bye
Fabien
Does the use of views rather than straight tables not break direct-query models?
great method to use ''userelationship()' function
Yaa Patrick, super video and a great help, where do we get the code or resources for this please.
nice video. Btw does anyone know why we are still missing Power BI update from last month?
Thank you. Excellent video. Column masking based on user login. Any thoughts?
Great video Patrick :)
Many thanks!
Excellent video. I really liked the introduction and context you gave at the beginning of the video, and the way you explained things with clear examples. I'm doing a similar project, and wanted to try and create from source data, a collection of fact and dimension tables using Power Query. However one of the tables I need to create, (or which I think I should be creating, and would like to create) is a Type 2 - S.C.D.. But after doing some web research to find out whether this is possible, and reading some of the best practice documentation on STAR Data Models, on the official Power B.I. site, I've concluded that it's just not possible. So I think I'm going to have to create a dimension table whose primary key is just a composite key of all the attribute fields within it, and then link it the Dimension Table that way to the Fact Table. It would be good to understand though, how to do such a thing properly. But thanks for the video anyway.
Great work, thanks
How do you handle SCD2 with Dataverse as data source?
The title is wrong it is not "SCD IN PB" it is "SCD IN SQL". All the issue which pop up with SCD was solved in the source. I have a project where the source is excel, and no chance to make changes before porwer bi. :(
Thanx Patrick !!
great!! and function is littlcel compliacated but amazing
Really nice video! and just wonder if anyhow Guy in a Cube could make more data transformation videos! Now, I am facing a data cleaning question, the business needs to match the first six characters of a data field to another one, after the comparing, the unmatched ones could be modified by hand directly. I have tried PowerBI to extract the characters and use the fuzzy match to filter out the values in data transformation, but then it seems like that Powerbi cannot change the original value, so does it mean the unmatched value has to be changed from the source?
Your Company consists of the regulatory and non regulatory business divisions for which segregated data sets are required due to compliance reasons but high level management, Finance & HR requires combined access to both the divisions. Some of the HR data elements like salary & personal information columns are required to be accessed by HR only. Further for operational reporting, line of business managers require access to their respective areas, general managers to their geographical regions, team leads to their teams and personnel to their own activity reports.
Considering above organizational structure and requirements what are the security models and methods you will suggest to apply for securing reporting and data access?
Anyone have an answer for this?
oops! At 2:30 you say "Type 2" but you mean to say "Type 1"
Yep. You caught me!
I have a column in my table name reason for leaving and based on that I want to add another column which is Volunteer attrition or Involunteer attrition i.e. if reason for leaving for an employee appears as Retired then the new column should show Involunteer attrition whereas for resignation it should show as volunteer attrition. Can you help me add this column please? Also how can I calculate the percentage of volunteer and involunteer attrition based on the total attrition for the whole year which is 1157.
Q: is there a current working link to the data files in this video? Will be good to have access to it and practice. Can anyone send to me pls?
Merci beaucoup pour cette vidéo Patrick ainsi qu'à la team @GuyInACube
Moi j'ai une autre préoccupation toujours avec la gestion du SCD Type 2 dans POWERBI.
En fait, pour chaque dimension, j'ai un StartDate , un EndDate et un Statut (Actif / Expiré).
Je souhaite sur PowerBI disposer un filtre de date (DateExtraction): lorsqu'une date sera sélectionner, PowerBI devra présenter uniquement les données tels qu'elle se trouvait à cette Date.
Donc pour chaque Mesure dans le table de fait j'ai rajouter à la fonction calculate un filtre qui sélectionne une ligne lorsque la startDate
Does anyone know how to use the PATH dax formula on a Type 2 scd table such as dim_emp to implement Row Level Security? For example:
Yesterday, John reported to Dan who reported to Jack. John's chain of command using PATH would be "Jack|Dan|John".
However, today, a new record was inserted into dim_emp where John's reporting remained to Dan but now Dan reports Maddy making John's PATH as "Jack|Dan|Maddy". Because a new row was inserted into dim_emp for John, his old row is now flagged as "N" for the "Active Record" boolean. This would be the same for Dan|John to Dan|Maddy but only focusing one exaple.
Anyway, that said, when running the PATH dax function on this type 2 scd, I get an error that says "each value in dim_emp[name] must have the same value in dim_emp[supervisor_name]. the value 'John' has multiple values.
The issue is that PowerBI is not able to write the PATH command because there are multiple rows for John. If I filter for active records only, PATH works just fine but when showing the historical and most recent records together, PATH breaks. I even tried running an IF(active_record = "Y", PATH, blank) and that still doesn't want to do it.
Any suggestions? Thanks!
I gave up mate. I will watch later.
Amazing!!!
that's amazing
it is new option in power bi But i didn't able to understand. I had seen SCD in SSIS . I was worked on that. It's simple to understand. This is little bit confusion.
why would you need to modify Fact to add new custom store surr key ,instead couldnt you handle in the measure by putting a Case statement ? also custom surr key is needed in dikension, i think we could live only with the Current manager flag ?
Same video but for Tableau please?
You are awesome
What if I really really don't want to add another column to my fact? My fact table is already massive as is.
How upset would you be if I kept my SCD as a table related to the fact using a surrogate key, but then calculated my 'current' dimension (current indicator = 1) as an additional table to add to the model as a snowflake? I would use the SCD alternate key to relate to the (now primary key of the) 'current' table snowflake. Then users that need point-in-time reporting can use the SCD table, and users that need current reporting can use the 'current' table, which will push the alternate key filter from the 'current' table to the SCD which will push the surrogate key filter to the fact.
Again, I know that in ideal world we should avoid snowflakes, but this seems like the most straightforward way of accomplishing it without monkeying my fact. Thoughts?
I know this is 2 years old now, but I'm researching how to do SCD2 in a model, and I'm with you...I don't think we can have a "Current Key" column on the fact table as this value could change for old facts, and you don't want to have to reprocess the entire fact table in case the Current Key column is different.
But you could introduce a kind of bridge table between the Store table and the Fact table. This bridge table would be a table would have 2 columns in it: StoreSK (surrogate key) and CurrentSK (current key). I would think you could build a view that would associate each StoreSK with the current version of the store. Then create a relationship between the Store.StoreSK and the StoreBridge.CurrentSK columns (it would be many to one so it may require bi-direcitonal filtering!) , an then an inactive relationship between the StoreBridge.StoreSK and the Fact.StoreSK.
No need for a separate dimension. And the Calc Group would disable the main relationship between Store and the Fact table, and activate the relationship between the StoreBridge and Fact table.
I haven't tried this yet, but I like it! No need to refresh facts, and you always can access the current record from the Store table.
Interesting
That's again foking bananas :)))
I just messaged about this and this shows up. Not sure if i should be thankful for spying on my data or not
very funny man