How we optimize Power Query and Power BI Dataflows

Поділитися
Вставка
  • Опубліковано 19 січ 2022
  • Let's go into Adam's mind as he optimizes Power Query to make some Power BI Dataflows run faster to avoid resource usage. The result is INSANE AMAZING!
    Power Query M function reference
    docs.microsoft.com/powerquery...
    📢 Become a member: guyinacu.be/membership
    *******************
    Want to take your Power BI skills to the next level? We have training courses available to help you with your journey.
    🎓 Guy in a Cube courses: guyinacu.be/courses
    *******************
    LET'S CONNECT!
    *******************
    -- / guyinacube
    -- / awsaxton
    -- / patrickdba
    -- / guyinacube
    -- / guyinacube
    -- guyinacube.com
    **Gear**
    🛠 Check out my Tools page - guyinacube.com/tools/
    #PowerBI #PowerQuery #GuyInACube
  • Наука та технологія

КОМЕНТАРІ • 78

  • @GuyInACube
    @GuyInACube  2 роки тому +14

    One thing not mentioned in the video is to be aware and careful about SQL Injection.

    • @Plackyu
      @Plackyu 2 роки тому +1

      Would love to see a video on SQL injection + Power BI and how to mitigate potential issues

    • @kkkkkkkkkkkkkkkk-k
      @kkkkkkkkkkkkkkkk-k 2 роки тому

      Can Best Practice Analyzer catch SQL injection?

  • @ItsNotAboutTheCell
    @ItsNotAboutTheCell 2 роки тому +13

    M-ind bending! Love the Advanced Editor and some neat tricks in here too for those performance folks! Keep up the PQ videos! One happy CAT!😻😻😻

  • @daryllynch998
    @daryllynch998 2 роки тому +9

    Hi Adam, thanks for the Video. I use this approach all the time. One thing that I don slightly different is to use the List.Buffer function because sometimes, I want PQ to read the List only once to get the full Set. I find it sometimes help performance. One suggestion for each time is consider Folding Query scenario. In this situation the following statement works:
    #"Filter Rows" = Table.SelectRows( Source, each List.Contains( CurrencyList, [Source Column] ) )
    This leads to folder query in most data sources. I also feel this can be more effective T-SQL because the filter is applied to the main table without the need to the Left or Inner Join.

  • @JCMCodaste
    @JCMCodaste 2 роки тому +2

    This is a great video! thank you, Adam. Good to see that you guys are playing more with dataflows now! I will take it as a good sign of their "health" within the Power BI roadmap! (at my own risk ;)
    It would be great to have more videos on how to optimize dataflows deployment when linking them across different workspaces (also where the latter are engaged in a deployment pipeline!) ..choosing between the two different connectors in those specific situations is still making me scratch my head, to be honest. Happy 2022!

  • @michelleleroux1728
    @michelleleroux1728 2 роки тому

    Learning to optimise is my goal for 2022! Thanks, Adam.

  • @pabeader1941
    @pabeader1941 2 роки тому +1

    Added this to my BI Goodies playlist. Good job and good timing for me. Have a project where this is going to come in handy. Not so much for the performance gain as an easy way to get excel into PQ.

  • @tkadosh
    @tkadosh 2 роки тому

    Really amazing and efficent way ... Congratulations

  • @louism.4980
    @louism.4980 Місяць тому

    Very insightful, thank you!

  • @klaustrampedach7312
    @klaustrampedach7312 2 роки тому

    Great video! 😊 The trick also works with a non-native query. Might come in handy 😎

  • @antoniogarrido3058
    @antoniogarrido3058 2 роки тому

    Ingenuos solution.. 👍 Really powerful dataflow trick!

  • @romualdulcyfer
    @romualdulcyfer 2 роки тому

    That's impressive !! I have some mergers in my PQ (eg. at data cleansing) which I need to somehow optimize after your video.

  • @michalmolka
    @michalmolka 2 роки тому

    Really great video!

  • @bijilabs7599
    @bijilabs7599 2 роки тому

    Wow😍 You guys are awesome!

  • @dbszepesi
    @dbszepesi 2 роки тому

    Nice, more of this please!

  • @manawaajirioghene6231
    @manawaajirioghene6231 Рік тому

    This is such a great video

  • @trevorc7734
    @trevorc7734 Рік тому

    Brilliant!!

  • @Anthony_Lecoq
    @Anthony_Lecoq 2 роки тому

    Great stuff Adam ;)

  • @jls14
    @jls14 2 роки тому

    Excellent video 👍. If advanced editor is scary, you can also do the text combine as a new step too. That way you don't have to do so much text editing. But SQL injection is pretty cool. Fantastic explanation as always 👍.

  • @matthieumamet3497
    @matthieumamet3497 2 роки тому

    Great video ❤

  • @cirilolhoycuenco6261
    @cirilolhoycuenco6261 2 роки тому

    This is so cool!

  • @blank3786
    @blank3786 2 роки тому

    You guys are amazing

  • @billcoleman2009
    @billcoleman2009 2 роки тому +2

    Nice. You could probably hook straight to the table, use selectcolumns() & filter steps and let query folding do its thing to keep it clean for incremental refresh etc - that should work, right?

  • @TheAalouis
    @TheAalouis 2 роки тому

    Great Job, explaining the refresh protocol columns. I am having an pretty expensive relative join in my dataflow script (find keyword in a text field and join that). Maybe this is something for your goal 2022! :)

  • @ekaterinazamosha4149
    @ekaterinazamosha4149 2 роки тому

    wow !!!!! Thank you!

  • @kebincui
    @kebincui 2 місяці тому

    Super ! 👍👍❤❤

  • @wmfexcel
    @wmfexcel 2 роки тому

    WOW! It makes a huge difference!
    May I know if there is any performance boost if we applied the Filters instead of doing it by Merging Queries?

  • @noahhadro8213
    @noahhadro8213 2 роки тому

    Awesome video. How would I do this if I wanted to filter that table but was not using a native query. so I connect to a table in sql and then I want to filter the table based upon a column IN a list of items from an excel spreadsheet?

  • @coolblue5929
    @coolblue5929 2 роки тому

    I’ve been doing this for a long time, initially in datasets (and excel) and then in dataflows. I wrote custom functions to convert from tables to SQL clauses (IN or VALUES) and a custom, multi-replace function to hack the SQL text. I also use parameters of the form :from and :to, for example, to fold back range filters into the native SQL also using the multi-replace function. Things change quickly with monthly updates but, the biggest problem with this approach has been finding the right pattern to satisfy the formula fire-wall. It would be great to compliment this video with a structured commentary on this, along with careful explanation of the potential problems with SQL injection.

    • @coolblue5929
      @coolblue5929 2 роки тому +2

      I would also suggest that, if doing such a trivial merge breaks M then M is seriously broken.

  • @sidharh100
    @sidharh100 2 роки тому +1

    Hey Adam, I was just wondering, if you could possibly make a video regarding Significance Testing in Power Bi?

  • @Kierslee
    @Kierslee 6 місяців тому

    Adam, where can I find the details regarding the time and resources to run the refresh (cpu, memory, etc.)?

  • @sirojiddinsobirov5508
    @sirojiddinsobirov5508 2 роки тому

    Insane Amazing!!!!!

  • @olivierfiliatrault6027
    @olivierfiliatrault6027 2 роки тому

    Amazing! One question: will the resulting query fold?

  • @emilymorrison4263
    @emilymorrison4263 2 роки тому

    How do you approach this if the merge is on two or more fields?

  • @visheshjjain
    @visheshjjain 2 роки тому

    Instead of filtering, If you want a column from your an external file in your fact table, is there any other way, apart from merge, to do it in PQ?
    Thank you!

  • @kasmirasmarzo
    @kasmirasmarzo 2 роки тому

    So useful. I’m doing merges everywhere and because I’m not a ‘real data scientist’ (I’m in regulation!) I have an itty bitty laptop. I’m constantly running out of memory.

  • @mikemagill68
    @mikemagill68 6 місяців тому

    Q. Awesome video. This is exactly the challenge I’m facing currently but m pulling in a SQL table with c. 54. million rows and then need to eliminate rows based on an external table with c. 10,000 rows. I totally get what you’ve done but can you inject a SQL ‘WHERE’ command with a comma separated list of 10,000 values? Is there an upper limit?

  • @Korallis1
    @Korallis1 2 роки тому

    Ok so this is all great and everything but what about when you have to merge in order to get the data you require for creating relationships? Is it better to just create a SQL query to provide the exact data your after? I have such a complex model structure and as such I have a lot of merges but the source is almost always sql

  • @MrSparkefrostie
    @MrSparkefrostie 11 місяців тому +1

    For anyone running into issues where your list converts to a table, this should help if your query is SQL, will need some tweaks if you use a differrent source. I am guessing it changes to table as its a live source and not sitting with load disabled
    Text.Combine(Sql.Database("Server", "Database", [Query = "SELECT CAST([NeededColumn] as VARCHAR) AS NeededColumn
    FROM [dbo].[NeededTable]
    WHERE Conditions = Conditions
    "])[NeededColumn], ", ")

    • @GuyInACube
      @GuyInACube  11 місяців тому

      Appreicate the call out on that. Will need to look at that more 👊

    • @MrSparkefrostie
      @MrSparkefrostie 11 місяців тому

      @@GuyInACube All good, this issue was driving me...bananas....

  • @fsanfo
    @fsanfo 11 місяців тому

    I have this query that takes 20m to run in my Oracle PL/SQL client. And I discovered that PQuery Online evaluates querys for 10 minutes only. So what do you recommend me? Retrieve the data by chunks or something else? I'm Pro User using Power BI service.

  • @danmeiss
    @danmeiss 2 роки тому +2

    Adam's version of "unplugged". See the thought process behind problem solving...

  • @LifeatAustralia
    @LifeatAustralia Рік тому

    Amazing Video Guy in a Cube 🙂 Would you please tell me how to manage merge queries when merged two tables through two different API calls. Is there any way to optimize API calls?

  • @bbangel90
    @bbangel90 Рік тому

    I have a Power App Dataflows which transforms and load data to a Dataverse table . This operation now takes around 50 minutes to complete, and I am speculating that the issue is not in the calculation in Power Query, but rather in the loading operation to the Dataverse table. How can I go about optimizing/improving the data loading performance?

  • @ReadySteadyExcel
    @ReadySteadyExcel 2 роки тому

    Awesome thumbnail!!

  • @9zQx86LT
    @9zQx86LT 2 роки тому

    Can this work with cosmos db?

  • @rakeshverma-fi2ju
    @rakeshverma-fi2ju 2 роки тому

    Hey everyone,
    We are refreshing a dataset against a dataflow that is using an incremental refresh and our C drive of gateway servers is hitting 0 MB space( from 400GBs). Is there anything we are doing wrong…

  • @gravestoner2488
    @gravestoner2488 Рік тому

    All my data comes in excel sheets, i have 1 sheet per month for 2 different data fact sheets. Is there a good way to combine these automatically without merging?

  • @remyschrader9286
    @remyschrader9286 10 місяців тому

    In Advanced Editor: ‘PromotionKeysAsList = Promotion[PromotionKey],
    CurrencyKeysAsList = Currency[CurrencyKey],’
    Values from a column in a table return as a list from ‘NeedValuesAsListType = TableName[ColumnName]’

  • @fluffigverbimmelt
    @fluffigverbimmelt 2 роки тому

    Nice, I might be able to use this on a db where DBA only let's us access via NOLOCK hint (yeah, don't get me started on that)

  • @DanielWeikert
    @DanielWeikert 2 роки тому +1

    Where can I see which operation is "expensive" and should be avoided? Is there a reference doc for that? Thanks

    • @Narses3
      @Narses3 2 роки тому +1

      Dataflow = slow,
      Understand what each step is doing, and what isn't folding to the source,
      An inner merge (filter) isn't folding to the source = all the data is being queried before then being filtered by the mashup engine
      There is no silver bullet , but if you understand what each step does (and isnt doing) then you can optimize, as with a lot of things it comes with experience.

  • @vikramanmohan121
    @vikramanmohan121 2 роки тому +1

    I have done this for multiple SQL queries where one SQL statement executed using a parameter and becomes an input for the next.
    One thing I learnt will make things easier is to store the quietly in a variable and pass it to source and you can paste your SQL query from your code editor directly on to the variable in Power Query

  • @Narses3
    @Narses3 2 роки тому

    Nice video concept to explain the mashup containers, also a nice showcase of other people Frankensteining queries/dataflows potentially over time from different people .
    Not sure if the exact problem you solved for the customer was as simple as this problem, but if it was then the team that manages dataflows/powerbi really should have seen this a mile off , but I guess not everyone knows everything to do with powerbi. I appreciate things are obvious in hindsight but I would be really surprised if a problem like this really did make its way to your desk.

    • @davidlopez-fe2lb
      @davidlopez-fe2lb 2 роки тому

      To your point Jonathan, I feel like the "architect" team should've have guidelines on how to build things, so monsters like this don't get created. We just brought on Power BI Desktop + Service onto our org and we've set no guidelines on how to build, so we'll see plenty of monsters like this. It's just a classic org problem of migrating processes over to the new platform asap, "we don't care how it gets done as long as its done" scenario. Luckily I watch Guy in a Cube weekly, and so far have minimized the monsters I've built.

    • @ItsNotAboutTheCell
      @ItsNotAboutTheCell 2 роки тому +3

      Can confirm, with the constant release of new features, it's often difficult to keep up with each nuance of the product, especially for those who aren't in Power BI for 40 hours a week (as simple or as complex as the topic may be). I always appreciate the opportunity to see the product with "fresh eyes" though and see some light bulb moments when people learn something new that can help them in the future.

  • @WojennyMlotek
    @WojennyMlotek 2 роки тому

    Yep, more optimazation please

  • @mohammedameen9477
    @mohammedameen9477 2 роки тому

    Awesome Video,
    How can we solve this when we have two tables which we want to merge and the data is coming from a flat file like csv or txt. I would want to perform left outer join on these two table based on some condition like
    IF (Table1[Column] = Table2[Column], Table2[Column], 0)
    Suppose if we have only one column from Table2 which we want to bring in Table1. We can convert that column in a list and use List.Select and write a condition but when we want to bring two columns or more how will do that.
    Can you help me with this.
    Thanks :)

    • @olemew
      @olemew 2 роки тому

      csv file is not sql based, you cant do that

  • @biexbr
    @biexbr 2 роки тому

    Mannn, I do this a Lot! And I mean a lot lot.

  • @asjones987
    @asjones987 2 роки тому

    Nice example, but curious on the syntax in the where clause. What were the “&” doing around PromotionList and CurrencList?

    • @dipeshbhundia1290
      @dipeshbhundia1290 2 роки тому

      Concatenation to access the variables

    • @pabeader1941
      @pabeader1941 2 роки тому +1

      They are there to 'build' the actual string that is being sent to the data source. & is the concatenate operator. It's used to combine strings together.

  • @jenniferpavey1
    @jenniferpavey1 2 роки тому

    This!! Omg

  • @MrSparkefrostie
    @MrSparkefrostie 11 місяців тому

    What I am looking for is to apply predicates against an upstream dataflow, maybe it does this already, time will tell

  • @matthiask4602
    @matthiask4602 2 роки тому +1

    finally a GIAC M-Code video. #avoidthewait

  • @pawewrona9749
    @pawewrona9749 2 роки тому

    Was laughing really hard when I heard that Adam and Patrick had a "race competition"

  • @matthiask4602
    @matthiask4602 2 роки тому

    This almost like in Inception. Just with a list.

  • @Baldur1005
    @Baldur1005 2 роки тому

    I know this against Roche's Maxim, but refresh time-outs are not worth it. I do "Group Bys" and Merges in DAX and if I want to to enrich table I use relationship or lookup function, doing those in DAX hurts performance by little margin and increase model size, but time-outs are much much worse.

  • @Phoenixspin
    @Phoenixspin Рік тому

    The thumbnail makes me think of monkey brains. (Indiana Jones reference)

  • @PaulBailey3
    @PaulBailey3 2 роки тому

    That's 🍌s

  • @Paul-pv8mo
    @Paul-pv8mo 2 роки тому

    🌈 P r o m o S M!!!