Note that doing this approach could result in the underlying table being queried multiple times on your data source to satisfy the multiple Power Query queries.
Could you create one query of the source in Power Query and then "refer" to it multiple times rather than copy? Would this reduce the number of times the source is queried? I've done this many times for consistency's sake, but am not sure whether it still results in the source being hit multiple times.
I usually then disable load of the "source" query so it does not become a table in my data model.
@@danneubauer6474 That won't help. Each query is evaluated in isolation. Also, for a cleaner reference to just a single column use MyTable[[MyColumn]]. That way you get a one-column table and don't have to convert a list to a table, as you would with MyTable[MyColumn].
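In M, the difference looks like this (a minimal sketch; the table name and values are made up):

let
    MyTable = #table({"MyColumn"}, {{"Delta"}, {"United"}, {"Delta"}}),
    ColumnAsList  = MyTable[MyColumn],      // returns a list of values
    ColumnAsTable = MyTable[[MyColumn]],    // returns a one-column table
    DistinctValues = Table.Distinct(ColumnAsTable)  // no list-to-table conversion needed
in
    DistinctValues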
From what I can tell, the only way to truly avoid this is with Premium capacity or Premium Per User dataflows, which do allow true query chaining, with the results of each query stored to disk for use as a source by downstream queries.
Rather than creating a new item "Not Supplied", I use "" or "". That way those items appear first in an ascending sorted list or a slicer. Plus they stand out when looking at many rows in a table. Now that it is clear they exist in my data, I can take steps to address them.
Thanks for sharing that John! nice trick to get them to the top.
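For anyone wanting to try that, a minimal sketch of the replace step in M; the placeholder text and the assumption that the blanks come through as nulls are mine, so use whatever value sorts first for you:

let
    Source = #table({"AirlineName"}, {{"Delta"}, {null}, {"United"}}),
    // Swap missing names for a placeholder so they sort to the top of a slicer
    Filled = Table.ReplaceValue(Source, null, "(blank)", Replacer.ReplaceValue, {"AirlineName"})
in
    Filled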
Dude I can't tell you how much your videos have helped me. I inherited a mess of a database in my new position and had no one to really learn from. You rock at teaching
That's awesome to hear. Thanks for watching! 👊
What I like is how the star schema changes how you look at your data. It organizes your thoughts. Segments your perspective.
So thank you for this video. There was not too much talking, only the amount needed to support the steps. I love how you broke the task down into easy-to-follow steps and explained why it was done that way. 🙂
Dude, you just saved my life! I was looking for it and all the results I got were like "how to replace ID with the name", and I wanted the opposite! You just got a subscriber from Brazil! Great content! Cheers!
Excellent video! Exactly what I needed with no unnecessary filler. As a budding data engineer, this was a huge help! You are both a scholar and a gentleman! :-D
Thanks for confirming I wasn't that nuts when doing exactly this. The merge step can take a LONG time for bigger tables though. Nice video!
Indeed
That is why I doubt whether this technique is really useful. I see huge cons (long merging in PQ) and tiny pros (a little more usability).
If we speak about a small model, there should not be any noticeable difference in performance, but if we consider a big one, the merging in PQ would be painful.
So why should we do all of that and pay more than we receive? Or when does this approach really matter and help?
@@hukumka2601 I have the same struggle currently, as I need to decide which approach to take for generating relationship keys (creating integers vs concatenated keys). I like the integer approach; however, it slows down refresh time significantly. Luckily I have a Premium PBI capacity, so I am considering moving most of the data transformations to dataflows.
I’m just learning Power BI and your videos are so helpful and fun to watch! Thank you so much!
Hi Patrick! I had to create IDs and I did a very similar process, but instead of 'right click -> add as a new query', I duplicated the entire table ('right click on main table -> duplicate') and from there performed exactly the same steps as you. What is the difference? Thanks for creating such great videos!
Thanks Patrick, when comparing this method with using DAX to combine values, is there a performance difference?
I can’t thank you enough for this video! I think this will solve our problem at work! 🙌🏻🙌🏻
Awesome! 👊
First of all, mapping tables are cool:)
When your end user is someone who knows how to use PBI, this method may come in handy to clean up the main table. However, most people at the end of the chain probably just know the front end, and the only things they will change are the filters. Therefore there is no need to create an artificial (in this particular example) mapping table.
Nevertheless great video, I really enjoy your content
I get what you say. But will this improve performance or the size of the data? Would that be a valid reason to create this kind of table?
@@impala4641 I find star schemas useful for speeding up report performance; however, when you need to build your star schema from your fact table, this can really reduce refresh performance, so you need to balance these two points. If you're scheduling refreshes you might be able to offload the refresh cost to off-peak hours. Win win.
@@impala4641 it’ll reduce the RAM required to hold the data model, and also make some DAX calculations easier. Power BI is designed to use Star Schemas.
Thanks dude. I didn't know about this method which doesn't use the "duplicate function". Much easier!
Love it! 👊
You can't even imagine how much I learn watching your videos, a thousand thanks for your great job
Your instructions are certainly stepping stones towards becoming "a big deal", keep 'em comin'
BAM! 👊 Thanks Alexius!
Hey Patrick, your videos are just awesome. Thank you so much for such easy to understand and accurate explanations ! Great Job
Appreciate that! Thanks for watching 👊
This is great! Thank you so much! You guys make this fun to learn. Keep up the good work!
Thanks, I actually used this yesterday and your steps worked like a champ, YA HIGH FIVE! Thanks for explaining what trim does too, that tool is very helpful. Going to put my gloves back on, clean clean clean data haha good thing I have a janitorial degree from the Corp. haha
Awesome!! I am going to try this method! Thank you for walking through it.
Nice video. I like these quick and useful data-wrangling type videos. Please keep them up.
I have been searching for you. Great video🥰🥰 thanks a lot
Patrick, very cool video. Normally you receive multiple tables and have to do something with them; now you give an example of receiving one big (= wide) table. Thank you for this interesting example.
Again, I came back here to refresh on the ideas you shared, Patrick. It really helped me a lot with my stuff! How about using this method with an Import-mode connection when the data is updated. Will the other table also be updated together with your keys? Thanks a lot man!
Brilliant way to generate look up tables. Thank you
Thanks man, one hour looking for this
Thanks! Excellent advice at 6:45!
That is seriously good stuff! I've been thinking about something similar and now I have the ultimate solution to make this work. My only more-burning question at the moment is how do I get one of those Power BI coffee mugs... lusting after that!
This is incredibly helpful. Thank you so much!
Happy to help. Thanks for watching! 👊
Thanks, Patrick. Good stuff as always!
Thank you! 👊
I use this method all the time, works very well! Cool to see you guys use the same methods!
What's the best practice for troubleshooting the data once you've broken everything out?
For example, if you need to sift through that fact table by airline name, it becomes rather tedious to go back and forth between the tables matching keys. Worse yet, if you have multiple dimensions that are filtering the fact table, it can be difficult to identify the proper keys to look through the fact table.
If the source is a relational database, this could be done in the database, but in this situation, the source is a CSV or other file, so that type of out-of-Power BI querying is not possible.
Thanks!
I usually create a table visual in report space with the columns I need, and just add some slicers for the dimensions I care about QCing. Then I browse the data in report space rather than in query space.
This was so helpful! Now I’m trying to add more columns from my two fact tables to the new tables 😅 without my PK’s yet.
Yep. All tallies with what I do ! Thanks for confirming !
Thanks for watching! 👊
Good Stuff Patrick!!
Appreciate that Nelson! 👊
Come back to the UK Nelson 😊
Wolfstar eheh I will eventually! But for now I’m enjoying this February’s - almost summer weather - in Lisbon :)
Really useful tutorial for messy data. Thanks!
The most beautiful part is that it makes that column disappear from original table!
Hi Patrick, thanks for the video.
I have one question: why don't we use AirlineName directly as the "Key"? We could skip the merge step and it should be faster, shouldn't it? Or am I missing something?
Yeah it was just the example that was used. Definitely different ways you can do it.
Guy in a Cube thanks for confirming! 🤗
Hi Patrick, I love your explanation very much. I'm actually a beginner, please help me with the below: I want to look up one particular product in another table, but that product was booked by two different customers and finally sold to one customer. How do I create a relationship for this from one table to the other?
Nice video, do we have other methods to remove many-to-many relationships in Power BI?
This video helped me a lot, thanks. I was getting low match percentages when merging tables.
Great video! If you get additional data, let’s say, with a new airline, will the refresh process take care of everything? Meaning add the ID to your airline table?
Just what I needed. Thank you
Great one Patrick. I'd add one more step to yours and hide the airline ID in the transactions table.
This was so helpful! Now I’m trying to add more columns from my two fact tables to the new tables 😅 without my PK’s yet and having some difficulty 😢
Thanks for this, now I know how to narrow my fact tables down
I do this also, but instead of a merge I do a transform with my buffered table. So if I have multiple columns, it's one step. I usually do:
TblID[ID]{List.PositionOf(TblID[Element], [ThingToReplace])}
I don't know if the merges would be faster.
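Spelled out in context, that pattern might look something like this (a rough sketch; the table and column names here are made up):

let
    // Buffer the lookup table once so it isn't re-evaluated for every row
    TblID = Table.Buffer(#table({"ID", "Element"}, {{1, "Delta"}, {2, "United"}})),
    Fact  = #table({"Airline"}, {{"United"}, {"Delta"}}),
    // Look up each airline's position in the buffered table and pull its ID
    WithKey = Table.AddColumn(Fact, "AirlineID",
        each TblID[ID]{List.PositionOf(TblID[Element], [Airline])}, Int64.Type)
in
    WithKey

One thing to watch: a value missing from the lookup table errors here (List.PositionOf returns -1), whereas a merge just leaves the unmatched rows null.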
Does having the relationship on ID give better performance than just having it on AirlineName in the airline helper table? Is the performance gain worth the performance overhead you mention for generating the keys? I would (until I saw this video) just have made the link on airline name....
Thanks a ton GuyInACube! Super useful video here :)
Great job, Patrick. This is helpful. I am going to use this technique in my dataflows so that it doesn't slow down the refresh. My question is about CamelCase. I heard (from the Tabular Editor Best Practices Analyzer) that CamelCase is not best practice. Why do people say that and what do you think?
Great video, I do this when I want to split up a column that has multiple values, such as a tags column that would have a list of tags delimited by semi-colon. That way the user can select a single tag and see all matching rows that have that tag.
Question: Why duplicate instead of reference if you are doing multiple columns?
Hi Zoe, as far as I know, you cannot merge referenced queries, only duplicated ones
Thank you Luis! That makes sense
This was perfect. Thank you!!
Cool video, thanks! Do you have a video about caveats of joining on strings? Tnx!
We do not. We should definitely do something about strings. Lots of things to consider.
Came looking for exactly this. Great stuff!!
looking for what?
Surrogate key?
Thank you so much ! This helped me a lot.
Maybe a newbie question, but still. I come from the SAP BW world. How do I ensure a new index is automatically created and a new entry automatically added to this Airline dimension table when a new unique airline name appears in the source data (Excel, CSV, table, ...)?
Can you still use this method if the incoming values for Airline are constantly changing? (e.g., new airlines are regularly being added to your original table)
This video helped me a LOT, thank you so much
Great video Patrick. Such a clean way to create lookup table and join. I have 2 related questions.
1. If I need to join 2 tables on multiple columns (e.g. composite keys), do I create a lookup table with those columns from the 1 side of the 1:N relationship?
2. If I need to join on BETWEEN clause, e.g. table1.date between table2.startdt and table2.enddt, what would be the best approach?
Thanks in advance.
@ANIRBAN PAL, did you solve your two challenges?
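In case it still helps, a hedged M sketch for both questions; every table and column name below is made up, and it assumes queries named Fact, DimCostCenter and DimDates already exist:

let
    // 1) Composite-key merge: pass a list of key columns on each side
    Merged = Table.NestedJoin(Fact, {"CostCenter", "Period"},
                              DimCostCenter, {"CostCenter", "Period"},
                              "Dim", JoinKind.LeftOuter),
    // 2) A BETWEEN-style lookup usually needs a row filter rather than a merge
    WithRange = Table.AddColumn(Merged, "DimDateMatch",
        (f) => Table.SelectRows(DimDates, (d) => f[Date] >= d[StartDt] and f[Date] <= d[EndDt]))
in
    WithRange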
How does this affect performance?
Your report should be faster when interacting with visuals. This is because joins are more performant when using integer data types instead of text strings (especially for bigger models with millions of records). On the negative side, it can make your dataset refreshes slower because of the additional steps needed to create these keys.
I wish that I had watched this video last week! I did this in a much more manual way.
Would be REALLY cool if power query had an automated way of doing this. Right click on column -> "Create Unique Dim Table" and it does this all automatically.
Best explanation ever.
Thanks for watching! 👊
Will a relationship between two integers in PowerBI perform faster than a relationship between two strings? In SQL I would say yes, but for Power BI - I don't know.
Love every your video, huge help to me. Thank you so much!
Thank you so much for the video.
How do I connect to a dynamic folder (the file name changes day to day)? Extracting the data from the new file on refresh keeps failing.
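One common workaround, sketched under the assumption that the daily files land in a single folder and you always want the newest one (the path is a placeholder):

let
    Files  = Folder.Files("C:\Data\DailyDrop"),
    // Sort by last-modified date and take the newest file's contents
    Sorted = Table.Sort(Files, {{"Date modified", Order.Descending}}),
    Newest = Sorted{0}[Content],
    Data   = Csv.Document(Newest)
in
    Data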
Does it work if new rows get added to the dimension table and the fact table? Will the new IDs automatically get mapped?
You must be a mind-reading Jedi. I needed this, as I am currently doing a similar approach through much more convoluted methods of duplicating tables and removing columns to get down to a basic ID table. Your method will save me much time, and my mind and emotional state are very appreciative.

I have a question about how I can get my company's IT department to give me access to view the relationships they have built through our data warehouse. Currently they have many, many tables with similar or identical names for different columns and attributes. It's making me go through a process of trying out different combinations to figure out where they have pulled the data from and what relationships they have built between the two. I relate it to shooting a target with an arrow, in the dark, blindfolded, and with one arm tied behind my back. I'm not lazy, I just want to be efficient and not waste my time with the guessing-game approach I find myself in.
ask them. Develop a relationship with someone in ICT.
Thanks for the video. It was really helpful.
Thank you for giving good information
Hey Patrick, great video! Does the Airline query update the names when new ones are added in your ERP system?
Great video, Patrick!
Great video, thanks for all the help
Hi Patrick, if a new airline is added to the original table, will it be auto-added to the new query?
Thanks
John
Hello all, is there a method of automating the process of merging two tabular models? I am using the manual method in BISM Normalizer.
Hi Patrick! Love the video! Just one question that I couldn't find an answer to: does matching on IDs rather than the airline name improve the performance of the model? Thanks a lot
How do you add the ID back into the fact table if you want to avoid merging (for query load time reasons)?
Yooow! Hi Patrick. You asked if I would do this a different way. Yes I would.
Up to 5:21 I do it just like you. But after that, I would add a custom step with "Table.Buffer()", and I wouldn't "add as new query". I would make a Reference, rename this new query "Airlines", remove the other columns, and do the same as you do up to 6:41.
Ok, ok, calm down, you are thinking "But Daniel, this would make 'a cyclic reference' and this won't work".
So, to resolve this, I make a new Reference to the 1st query, rename it "fData" (or something) and then Merge Queries with "Airlines".
To finish, I would hide my 1st table from my data model.
So let me justify: I would do all this work because if I find out that I forgot to make a replace, I would only need to add that new step in one query. The way you did it, if my file changes just a little bit, I would need to change it twice.
Do you think this would gain or lose processing time? (sorry for possible spelling errors, I don't speak English very well)
Oh, and maybe I would add a new step to Capitalize Each Word with Text.Proper.
@@biexbr I still can't understand what Table.Buffer() does. When I update the list with a new row (i.e. a new airline), the lookup does not update, so after merging those rows become null.
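A rough sketch of the reference-plus-buffer setup described above, with a placeholder file path and column names. One caveat worth stating: Table.Buffer only caches the table within a single query's evaluation; it does not stop separate queries from going back to the source.

// Query "Staging" (load disabled):
let
    Source   = Table.PromoteHeaders(Csv.Document(File.Contents("C:\Data\Claims.csv"))),
    Buffered = Table.Buffer(Source)
in
    Buffered

// Query "Airlines" (created as a Reference to Staging):
let
    Airlines = Table.Distinct(Table.SelectColumns(Staging, {"AirlineName"})),
    WithID   = Table.AddIndexColumn(Airlines, "AirlineID", 1, 1, Int64.Type)
in
    WithID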
How would this work with multiple columns? For example: cost center, cost center mapping and period.
Hey Patrick, I want to use a parameter to filter the Top N output of a matrix using a parameter slicer. Could you please show me how? Thanks
Cool T-Shirt Patrick !!
Thanks. 👊
I really liked the way you subtly took performance into consideration (looking for an int). Is there a visual form of execution plan? What would be the equivalent of a SQL execution plan to use with Power BI? Does one exist?
Very helpful - thank you!
This is a beautiful and relevant video Patrick. I've often found myself thinking about this business case where the dimension data is long text strings and doing joins on such dimensions is fraught with uncertainty at the best of times.

There is one use case I find myself thinking about which the video does not address, so I'm pinging to understand how you have thought about this. Imagine the flat file had different values for "TWA", "Transworld Airlines", "Trans World Airlines". This technique would create a different custom key for each of these entries - in reality, however, these should point to the same key. Therefore, using this technique in Power Query will not cover this particular use case. In my head, the only way to do this is through manual intervention, where the key is inserted through a manual scan of the table to ensure that "TWA", "Transworld Airlines" and "Trans World Airlines" all point to the same key.

Short question - is there a way to reject this "lazy" technique and become more "efficient"??!!!?!
I'm sure you've already found your answer, but creating a transformation table would solve that problem. Power Query's documentation shows how to do that.
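For reference, a hedged sketch of what that could look like using a fuzzy merge with a transformation table; the tables here are made up, and the "From"/"To" column names are the ones Power Query's fuzzy matching expects:

let
    Fact     = #table({"AirlineName"}, {{"TWA"}, {"Trans World Airlines"}, {"Transworld Airlines"}}),
    Airlines = #table({"AirlineName", "AirlineID"}, {{"Trans World Airlines", 1}}),
    // "From"/"To" pairs tell the fuzzy match which spellings mean the same thing
    Synonyms = #table({"From", "To"},
        {{"TWA", "Trans World Airlines"}, {"Transworld Airlines", "Trans World Airlines"}}),
    Merged   = Table.FuzzyNestedJoin(Fact, {"AirlineName"}, Airlines, {"AirlineName"},
        "Airline", JoinKind.LeftOuter, [IgnoreCase = true, TransformationTable = Synonyms])
in
    Merged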
What if we have to consider multiple columns for this approach?
Also, what if we have to use a table as a global filter across different pages built from different tables, i.e. a star schema?
Hey Patrick... what about joining on alphanumeric keys with the type set to "any"?
thanks this helped a lot!
Hi Patrick, can you please do a video on aggregations? I have created an aggregated table using DAX and want to create a dynamic filter for a column not included in the aggregation table.
Spectacular!! Thank you very much!
Appreciate it, thanks! 👊
Very dope video. It really helped me
Is there a way to automate this? I have 40 tables I need to break out of the flat file.
Thanks. In my real world, with no clean data warehouse, the data modelling and joining is the biggest hurdle to using Power BI.
This is beautiful
Excellent, just saved me hours of work!!
That's awesome! 👊
The problem is that with any big dataset this method will make the data refresh terribly slow.
Hey Patrick, what if new airlines get added to the flat table? Will it update the index table? Or do we have to do all this process again?
All the IDs get regenerated with each refresh so new values shouldn't present a problem. Because of this you definitely don't want to take a dependency on ID values in your reports since "US Air" could = 1 today but = 2 tomorrow. In dimensional modeling parlance these are "Surrogate Keys" and should never be exposed to users. It is best practice to hide surrogate ID columns in the model.
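For anyone curious what that regeneration looks like, here is a rough sketch of the pattern in M; the table and column names are placeholders rather than the exact steps from the video:

let
    Fact       = #table({"AirlineName", "Claims"}, {{"Delta", 10}, {"United", 4}, {"Delta", 7}}),
    Airlines   = Table.Distinct(Table.SelectColumns(Fact, {"AirlineName"})),
    // The index is assigned in whatever order the rows arrive, so the same airline
    // can get a different ID on the next refresh - hide it and never report on it
    AirlineDim = Table.AddIndexColumn(Airlines, "AirlineID", 1, 1, Int64.Type),
    // Merge the key back onto the fact table and drop the text column
    Merged     = Table.NestedJoin(Fact, {"AirlineName"}, AirlineDim, {"AirlineName"}, "Dim", JoinKind.LeftOuter),
    WithKey    = Table.ExpandTableColumn(Merged, "Dim", {"AirlineID"}),
    FactOut    = Table.RemoveColumns(WithKey, {"AirlineName"})
in
    FactOut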
@@krishorrocks639 Thanks for commenting on this. I’m new in this field and looking for further clarification.
If I do want to depend on an ID from refresh to refresh, how is this typically done?
What happens if with time a new airline name appears in the source file? Will it be added automatically to the "Airline" table? Or will it result in an airline name without a related key?
Added automatically. PBI will import the data then follow the transformations, one of which will create the new key.
What's the difference between this method and using a reference table? (ie right click / duplicate - then remove all other columns etc..?) - Great vid... as always !
From a data source perspective there isn't any difference between reference and duplicate. Both result in two independent queries that will query all the way back to the data source. When you reference, changes in the referenced query will affect the referencing query. Duplicate is just a one time copy/paste of query steps into a new query.
What if I want to make two columns as my primary key, I mean, instead of doing just Airline Name as a Primary Key, I want Primary Key and Claim Site, both of them as my primary key?
Patrick, I am using your solution but I am facing performance problems when I join (inner join) both tables by the text key column (approx. 1,000,000 rows). Thanks
Nice one Patrick! Quick question: assuming I am using this method with a DW, what happens if new data (an airline, in your case) gets added? Would the index capture the new rows? Cheers.
I second this question :)
@@Elkhamasi your DW should already have an index. But, if you reference the query rather than duplicate, it should be dynamic.