Excellent breakdown and I cannot stress the importance of this enough. Many data issues/errors can be tracked back to this. It's usually one of the first things I check when working within unfamiliar tables.
Kimball have type 1 to type 7 for SCD. TSQL have two good Windows functions for setting start and end dates for a dimension member with many versions. If you have an anchor dimension table(same dimension) , with only the source key and the surrogate key, and the changing attributes in a separate dimension table(same dimension) you can create SCD 1 to 7. Good video on a subject that can be a full day.
Brilliant! But yeah, I understand what you did here, but what is happening on the fact table and with the relationship to the fact table while you handle the dimension?
Hey Patrick, When merging a dimension with many type 1 or 2 attributes, do you have an opinion on whether you should do the comparison directly in the merge statement vs pre-calculating a hash of the values and comparing that? Thanks!
Great video Patrick. However I'd like to know why didn't you subtract 1 (mili)second from the Expiration Date of the old record? The way how you did it you have overlapping in the pair of effective date and expiration date. Let me explain, if I'd like to know what was the price of the product BI-5679 exactly in '2023-01-15 16:16:03.880' both records will be returned, instead of just one. Keep up the good work!
Thanks Patrick. Useful dim generator. However I always find it difficult to have a correct surrogate key for the lookup in the facttable. Loading the dimension looks straightforward. Loading the facttable is still a bit obscured for me
Hmmm... What if you would like to have a complete history? If you are loading/samplig the source data once a day you would not catch changes that cancel out on the same day. I guess you would need a push dataset for that. If you have a push dataset. How do you handle the relation between the date/time dimmention and the fact when the fact could be stamped with any date and time down to the micro second? Do you deside to set the time granularity to lets say one hour and use the latest value from that or previous hours?
This is all from the Data Warehouse side. My assumption is that you are referring to Power BI datasets. If that is the case, it will all be handled in the load of the fact table. We will publish a video on this topic soon.
Patrick´s recent videos clearly have lost the loved clicky-clicky-powerbi-vibe. But I am open to it. Did I know at the end what SCD are? no, but maybe I have to view it more times...
Hi Patrick..... Is there any way to show the actual value of a column on a line chart without doing any aggregation like count, sum, average,etc.... I also changed the summarization to " don't summarize" and let when the column in added on a axis it does the aggregation........ Is there any way to over come this ? Please do a video on this.
Excellent breakdown and I cannot stress the importance of this enough. Many data issues/errors can be tracked back to this. It's usually one of the first things I check when working within unfamiliar tables.
Excellent ! Would be indeed very interested in videos about insert, Update, Upsert, Merge in t SQL !! Thank you !
Great video Patrick. More in depth SQL videos would be awesome!
Loved it! Excellent explaining great samples. Did you a Video about the fact tables?
Kimball have type 1 to type 7 for SCD. TSQL have two good Windows functions for setting start and end dates for a dimension member with many versions. If you have an anchor dimension table(same dimension) , with only the source key and the surrogate key, and the changing attributes in a separate dimension table(same dimension) you can create SCD 1 to 7. Good video on a subject that can be a full day.
Yes there are many types. A full day course would be interesting. Thanks for watching
Oh nice, that's a great script for type 2 dimensions! Definitely going to save that in my folder of random useful code lol
BAM! Glad it was helpful! 👊
Brilliant! But yeah, I understand what you did here, but what is happening on the fact table and with the relationship to the fact table while you handle the dimension?
Interesting video, thanks Patrick, looking forward to the Fact Table video….
Hi Patrick, nice video for Understand the SCD 2 attribute. Can you share the script for your fans?
It is in the notes.
it is in the description of the video. Thanks for watching.
@@GuyInACube Thank you!
Hey Patrick,
When merging a dimension with many type 1 or 2 attributes, do you have an opinion on whether you should do the comparison directly in the merge statement vs pre-calculating a hash of the values and comparing that?
Thanks!
Great video Patrick. However I'd like to know why didn't you subtract 1 (mili)second from the Expiration Date of the old record?
The way how you did it you have overlapping in the pair of effective date and expiration date. Let me explain, if I'd like to know what was the price of the product BI-5679 exactly in '2023-01-15 16:16:03.880' both records will be returned, instead of just one. Keep up the good work!
Instead of Identity key I see people using sequence keys , what's your preference?
Thanks Patrick. Useful dim generator. However I always find it difficult to have a correct surrogate key for the lookup in the facttable. Loading the dimension looks straightforward. Loading the facttable is still a bit obscured for me
Thanks. We will be publishing a video on this topic soon.
Hmmm... What if you would like to have a complete history?
If you are loading/samplig the source data once a day you would not catch changes that cancel out on the same day. I guess you would need a push dataset for that.
If you have a push dataset. How do you handle the relation between the date/time dimmention and the fact when the fact could be stamped with any date and time down to the micro second? Do you deside to set the time granularity to lets say one hour and use the latest value from that or previous hours?
This is all from the Data Warehouse side. My assumption is that you are referring to Power BI datasets. If that is the case, it will all be handled in the load of the fact table. We will publish a video on this topic soon.
Why not use Temporal Tables? That does the same without the extra effort of additional columns or merging statements.
Patrick´s recent videos clearly have lost the loved clicky-clicky-powerbi-vibe. But I am open to it. Did I know at the end what SCD are? no, but maybe I have to view it more times...
Hi Patrick..... Is there any way to show the actual value of a column on a line chart without doing any aggregation like count, sum, average,etc.... I also changed the summarization to " don't summarize" and let when the column in added on a axis it does the aggregation........ Is there any way to over come this ? Please do a video on this.