I've setup aggregations in only one project a while ago, more or less a testproject. Going through this process once again is really helpful. Looking forward to the follow-up videos.
Right now I work exclusively with SQL Server and the PowerPlatform. I've found managing aggregations within SQL to be the cleanest and easiest route, but as you stated, if you don't have the permissions to create ETL tables or Views, it's great to see you can achieve the exact same output within PowerBI. It's important too as you build out the database and BI report to determine the best place for this level of work as the overhead can have consequences in either environment. Great intro to aggregations within PowerBI!
Hi Patrick Great video! Actually I have done some aggregations before I saw this video but I wasn't sure if this was the right way of doing this. Now I have more confidence! Thanks! Looking forward to next posts!!
Watched the whole thing, it is good to know that the grain is important. It will accommodate the query with the visuals, that it helps in the etl proces. There is no query folding in native query. But can somebody explain to me what aggregation is?
I believe to finish off the excellent feature, it would nice use the "Manage Aggregations" feature to enable transparent performance improvements wherever possible.
Can I set up aggregations with two import tables. My detailed import FACT table has 50M rows. Then I have a product AGG which has 15M. Lastly I have a HRCHY_Agg that has 5M rows. My dax would be so much faster if I could do this. I can't set the 50M row table to direct query because it is frequently referenced and import is way faster. But when I set the detailed fact table to Import I can't seem to set up aggregations.
In Tableau, after the dashboard is done, we can create an extract and select "hide unused fields", then "aggregate to visible dimensions". If Power BI can't make it that user friendly, they should work on it! If I need to change to aggregation later, I select "show hidden fields", then I use the newly required field in a chart, then "hide unused fields", then "aggregate to visible dimensions". This has many advantages, but especially for beginners who don't know exactly where they will end up / what fields they will end up using in the long run.
I have a project where we need to aggregate hourly meter readings by customer up to 10 years of data. when we created an initial aggregated view in the DB (Snowflake) at the customer level (450K) and by date (1 year data ONLY). This comes to a 150 M records and that seem to be a lot of data to cache it. We are planning have more similar "aggregated" tables for this report. it is Power BI Aggregations is still a good choice or should do DQ all the time? I've told that Snowflake should be able to return data in a matter of a couple seconds? help please
Thank you. If we use native query then how can we configure the incremental refresh? IR needs a datetime column filter applied by rangestart and rangeend
Once you deselect the fields you dont need, you cannot re-add them again, correct? You would have to start over. We query Salesforce and often we create new fields in Salesforce and then we need those within Power Bi.
I'm sure you'll bring this up in another video. If I aggregate data by date key am I going to loose a lot of dimensions I slice data by? eg a sales table aggregated by day is going to loose the ability to filter by territory or product. Or was this the point of the 'grain' note? If so it might be worth expanding on that going forward...
Hi yooooooooo Good day.. need your help do we have any options to show excel pivot tabular view in PowerBI. If we drag multiple columns in Matrix it's coming at the bottom I want it to come as next.. do we have any options. Need your help.
I learned something new. I been using power query to filter rows and columns in a 12 mio row table. The option you showed to create the new data source and write a sql statement has improve my refresh time from 30 min. to 30 sec. #GuyInACube - I'm your biggest fan :-)
Nice! 👊 Just be careful as using the native query option breaks query folding. You could end up seeing performance issues as a result. Also, if you go the sql statement route, make sure all of your transformations are in that sql statement.
I loved this video too. However I am curious, what if I need the same date in 2 aggregation levels, due to special columns, which aggregate up differently in daily and monthly view, and it is good. I did add the data 2 times to my modell, but it would be nice to make it only once. Any tips???
I think I would need to be quite careful how I used this. One of the reasons for moving towards Power BI from a death by excel graph situation was to be able to quickly answer the "why is the graph showing that" question by drilling into the graphic. Using aggregations seems to return me back to excel method. I understand your reasoning, just need to be careful where I would use it.
it would be nice to find the "next" video you say in the description of the video -- if you have a the full detail table and the aggregated table you can define the aggregated table as an aggregated version of the full detail table so that dax queries that can be resolved only with the aggregated table do not scan the full detail table -- I guess you have a video on that too, right? I saw one form adam not long ago, maybe you have some other one?
Goodday Patrick, thanks for your helpful video. I have a wonder, I had followed your instructions, using Aggregations, my dataset reduces from 25 mil rows to 6 mil rows. So when I refresh data (from desktop or services), I just need to refresh 6 mil rows, right?
I created a simple view and when I call this in Power bi it takes very long and still does not give the preview in power query. Show 'Preview is evaluating' If I click on load and start building a report I can see the data. View has only 2500 rows. It takes under 15 secs to run the query in SQL server. Data set is based on Contoso.
Patrick, you rock! I have a question that persists after watching this and next vid in the series several times. When you apply the Group By to your dim table, that dim table disappears from the model and the new grouped table persist. That is not how the ongoing modeling and management of Aggregations work, though. You still need to keep the dim table, and ADD that Grouped table, do make this work - is that correct? Does that mean that you duplicated the dim table before applying the Group By? It would be so awesome if you could clarify that step for me/us. thank you!!!
Hi Patrick, Most of the time I import all the data, build the report. When the report is finished, I want to clean up the unused data. Is there a way to easily check which data (Tables/Columns) am I not using?
Hey Patric, Thank you so much for the great video. I have a small question. I have millions of rows, so I want to do aggregation but my partner need to give access to others to download row level data. So can I do aggregation In table and also keep one without. So that who ever wants to download they can write query in Q&A and download. So my data won’t take much time. I know I am not making much sense, but seeking some suggestions from you.
Nagesh, not sure if you've already found a solution, but I think you can have two options. 1)Include row level data in your dataset. Simply create a non-aggregated table and allow end-users to connect to it via Excel with "Power BI Datasets" as a source 2)If your partner already has database read access, simply set them up with the query in Excel and allow them to connect and refresh it from an Excel file. The first option only works if your partner only has access to Power BI, but there are significant drawbacks for dataset refresh time and overall system resources. Natively connecting to a database from Excel will allow for the PBI report to be more efficient and your partner to only refresh data when they need it. They will also have the option to apply filters before loading into excel to save time and limit row counts for analysis (example: only current year data instead of full database table)
My first reaction when I saw this title, “Prepare to confused for way too long why your data looks so weird, try some extremely complex DAX formulas, then face palm yourself when you realize it was just set to the wrong aggregate...” haha
What goes around comes around :-) we finally realise that vertipac wasn't the cure-all and multi-dimensional had it right all along! True performance comes from knowing the answers before time just like everyone realised in 1990 something. Got to smile hey? But as usual nice vid.... Wish you guys would come over to SQLBits
👍i learned a lot with your videos, just one question : last update (February) power bi desktop we can refresh just what we need do you think it help without doing aggregation
Hi Patrick, really great video! One question: How do you configure a monthly aggregation table, without making a many-to-many relationship to the dates table?
I think there's a gap on Power BI aggregations (unless I'm missing something). - Scenario 1: Power BI hits an aggregate table depending on the joins: having Sales joined to Time and Product, and SalesAgg joined only to Time, if the report uses Time+Sales, SalesAgg will be used; if you include Product, then it'll hit Sales. + Problem: it doesn't work for different levels (ProductCategory/Product) - Scenario 2: If you want to have an aggregation on different levels of Product (ProductCategory and Product), then you can have 2 product tables (ProductCategory joined to SalesAgg, Product joined to Sales). Then Power BI will use SalesAgg if you have Category in the report, and Sales if you have Product (detailed table). + Problem: you can't have a Product hierarchy (do you??) ++ Solution: Sales has its aggregate table SalesAgg, and Product has its aggregate table ProductCategory; the joins are: Sales > Product, SalesAgg > ProductCategory. With this approach, it's transparent for the user to choose any attribute from Product, and depending on the level (detail or category) Power BI will hit ProductCategory or Product, and the same will happen with Sales. And more importantly, we will still have the Product hierarchy (!!). Is this something that can be performed in Power BI or it's really a gap?? I posted a Power BI Idea (ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/39619291-extend-aggregation-feature-for-enterprise-data-usa) but I think it will never reach enough votes to be developed. BTW, Oracle BI has this functionality (for 15-20 years minimum). Thanks!!
There are things coming with new Azure technologies that will accommodate this. In addition, stay tuned for upcoming videos on aggregations that should you can create aggs using relationships, without relationships and a hybrid approach.
@@GuyInACube Hello Patrick, have you done the video yet ? I need to use aggregations from the same fact table as well as the dim tables ( data coming that way from synapse table but I have control on how to read it or if any views needs to be created )
I've setup aggregations in only one project a while ago, more or less a testproject. Going through this process once again is really helpful. Looking forward to the follow-up videos.
Thanks for watching!
Right now I work exclusively with SQL Server and the PowerPlatform. I've found managing aggregations within SQL to be the cleanest and easiest route, but as you stated, if you don't have the permissions to create ETL tables or Views, it's great to see you can achieve the exact same output within PowerBI. It's important too as you build out the database and BI report to determine the best place for this level of work as the overhead can have consequences in either environment. Great intro to aggregations within PowerBI!
Thanks for watching!
Brilliant! Pure Brilliance!! Value added again. Patrick, You Rock!
Love getting these comments. Thanks!! 👊
love your content and passion! thank you Patrick
Hi Patrick
Great video! Actually I have done some aggregations before I saw this video but I wasn't sure if this was the right way of doing this. Now I have more confidence! Thanks! Looking forward to next posts!!
Excellent!👊
If we are getting knowledge free of cost. Why not hit like button?
Patrick, you know how to keep us motivated ! Used it a couple of times and will continue :) Massive thanks.
Woot! Great to hear Nelson! 👊
This is such a great video that clears me out on what aggregation is. Thank you for sharing!
How can we decide the columns to use in Group by out of 30 columns, can we use all the column in group by
Loving the way you explain and present in all your videos.. Thanks
Watched the whole thing, it is good to know that the grain is important. It will accommodate the query with the visuals, that it helps in the etl proces. There is no query folding in native query.
But can somebody explain to me what aggregation is?
I believe to finish off the excellent feature, it would nice use the "Manage Aggregations" feature to enable transparent performance improvements wherever possible.
Hey Patrick Thank for sharing such a great video. And, I love the way you explain and specially your expression connects with the story part. Great !
Can I set up aggregations with two import tables. My detailed import FACT table has 50M rows. Then I have a product AGG which has 15M. Lastly I have a HRCHY_Agg that has 5M rows. My dax would be so much faster if I could do this. I can't set the 50M row table to direct query because it is frequently referenced and import is way faster. But when I set the detailed fact table to Import I can't seem to set up aggregations.
If you configure Group By settings for a Direct Query fact table, is it possible to set the storage mode of the (hidden) aggregate table to Import?
Hi, Will aggregation work with Inactive relationship and USERELATIONSHIP function?
In Tableau, after the dashboard is done, we can create an extract and select "hide unused fields", then "aggregate to visible dimensions". If Power BI can't make it that user friendly, they should work on it!
If I need to change to aggregation later, I select "show hidden fields", then I use the newly required field in a chart, then "hide unused fields", then "aggregate to visible dimensions".
This has many advantages, but especially for beginners who don't know exactly where they will end up / what fields they will end up using in the long run.
I have a project where we need to aggregate hourly meter readings by customer up to 10 years of data. when we created an initial aggregated view in the DB (Snowflake) at the customer level (450K) and by date (1 year data ONLY). This comes to a 150 M records and that seem to be a lot of data to cache it. We are planning have more similar "aggregated" tables for this report. it is Power BI Aggregations is still a good choice or should do DQ all the time? I've told that Snowflake should be able to return data in a matter of a couple seconds? help please
Thank you. If we use native query then how can we configure the incremental refresh? IR needs a datetime column filter applied by rangestart and rangeend
Once you deselect the fields you dont need, you cannot re-add them again, correct? You would have to start over. We query Salesforce and often we create new fields in Salesforce and then we need those within Power Bi.
I'm sure you'll bring this up in another video. If I aggregate data by date key am I going to loose a lot of dimensions I slice data by? eg a sales table aggregated by day is going to loose the ability to filter by territory or product. Or was this the point of the 'grain' note? If so it might be worth expanding on that going forward...
You are correct Tim. I will discuss this in an upcoming video. If I miss the point, please call it out in the comments below.
Hi yooooooooo Good day.. need your help do we have any options to show excel pivot tabular view in PowerBI. If we drag multiple columns in Matrix it's coming at the bottom I want it to come as next.. do we have any options. Need your help.
where can i find contosoDW data ?
I learned something new. I been using power query to filter rows and columns in a 12 mio row table. The option you showed to create the new data source and write a sql statement has improve my refresh time from 30 min. to 30 sec. #GuyInACube - I'm your biggest fan :-)
Nice! 👊 Just be careful as using the native query option breaks query folding. You could end up seeing performance issues as a result. Also, if you go the sql statement route, make sure all of your transformations are in that sql statement.
I loved this video too. However I am curious, what if I need the same date in 2 aggregation levels, due to special columns, which aggregate up differently in daily and monthly view, and it is good. I did add the data 2 times to my modell, but it would be nice to make it only once. Any tips???
These videos are awesome! Keep up the great work :)!
You are very welcome.
I think I would need to be quite careful how I used this. One of the reasons for moving towards Power BI from a death by excel graph situation was to be able to quickly answer the "why is the graph showing that" question by drilling into the graphic. Using aggregations seems to return me back to excel method. I understand your reasoning, just need to be careful where I would use it.
it would be nice to find the "next" video you say in the description of the video -- if you have a the full detail table and the aggregated table you can define the aggregated table as an aggregated version of the full detail table so that dax queries that can be resolved only with the aggregated table do not scan the full detail table -- I guess you have a video on that too, right? I saw one form adam not long ago, maybe you have some other one?
Goodday Patrick, thanks for your helpful video.
I have a wonder, I had followed your instructions, using Aggregations, my dataset reduces from 25 mil rows to 6 mil rows. So when I refresh data (from desktop or services), I just need to refresh 6 mil rows, right?
Your video was super helpful, thank you so much!
Really thanks for this video. you guys make us Zero to Hero.. :)
Love to hear that! Hearing people's success with Power BI really gets us excited. 👊
with aggregation, do you lose access to the detailed underlying data?
You don't. If the query does not hit the agg, they it will go to the detail table in Direct Query mode, which could be slow depending on the source.
Would love to see a video on this!
Nicely done.
but i have to make sure that the user wont ask any LoD later after publishing, correct? or there is another idea?
I created a simple view and when I call this in Power bi it takes very long and still does not give the preview in power query. Show 'Preview is evaluating'
If I click on load and start building a report I can see the data. View has only 2500 rows. It takes under 15 secs to run the query in SQL server. Data set is based on Contoso.
What about the negatives of using aggregations? I'm working on a model with over 250 measures on it along with some very large tables any tips ?
Interesting technique. How can I apply this if my data source are bunch of csv files in a folder?
Hi Patrick, Good video!
However, please advice how to deal with grouping on dynamic columns.
Really thanks for this video! And a question from me - which query works faster for this case - the saved view, direct query, or power query as well?
Patrick, you rock! I have a question that persists after watching this and next vid in the series several times. When you apply the Group By to your dim table, that dim table disappears from the model and the new grouped table persist. That is not how the ongoing modeling and management of Aggregations work, though. You still need to keep the dim table, and ADD that Grouped table, do make this work - is that correct? Does that mean that you duplicated the dim table before applying the Group By? It would be so awesome if you could clarify that step for me/us. thank you!!!
Hi Patrick,
Most of the time I import all the data, build the report. When the report is finished, I want to clean up the unused data. Is there a way to easily check which data (Tables/Columns) am I not using?
This tool, github.com/stephbruno/Power-BI-Field-Finder, will show you which fields are being use. Hope this helps.
I'm not using aggs yet, but I will start! Thanks!
Awesome!
Aggregate data is a great way to speed up the performance of your reports
Yes they can.
you tutorials are f***ng great! when I need to figure out something, I look for #GuyInACube
That's fabulous!
Hey Patric,
Thank you so much for the great video.
I have a small question. I have millions of rows, so I want to do aggregation but my partner need to give access to others to download row level data. So can I do aggregation In table and also keep one without. So that who ever wants to download they can write query in Q&A and download. So my data won’t take much time. I know I am not making much sense, but seeking some suggestions from you.
Nagesh, not sure if you've already found a solution, but I think you can have two options.
1)Include row level data in your dataset. Simply create a non-aggregated table and allow end-users to connect to it via Excel with "Power BI Datasets" as a source
2)If your partner already has database read access, simply set them up with the query in Excel and allow them to connect and refresh it from an Excel file.
The first option only works if your partner only has access to Power BI, but there are significant drawbacks for dataset refresh time and overall system resources. Natively connecting to a database from Excel will allow for the PBI report to be more efficient and your partner to only refresh data when they need it. They will also have the option to apply filters before loading into excel to save time and limit row counts for analysis (example: only current year data instead of full database table)
My first reaction when I saw this title, “Prepare to confused for way too long why your data looks so weird, try some extremely complex DAX formulas, then face palm yourself when you realize it was just set to the wrong aggregate...” haha
HAHAHAHA.
What goes around comes around :-) we finally realise that vertipac wasn't the cure-all and multi-dimensional had it right all along! True performance comes from knowing the answers before time just like everyone realised in 1990 something. Got to smile hey? But as usual nice vid.... Wish you guys would come over to SQLBits
Multi-dimensional still has its place. Yeah, we were bummed that we missed out on Bits this year. Hopefully next year will work out.
I was JUST wondering if this is conceptually similar to Multi-Dimensional. Not being a user of it, I wasn't sure.
👍i learned a lot with your videos, just one question : last update (February) power bi desktop we can refresh just what we need do you think it help without doing aggregation
Yes, but refreshing is only part of it. The aggregations provide many other benefits.
@@GuyInACube ok thanks
Hi Patrick, really great video!
One question: How do you configure a monthly aggregation table, without making a many-to-many relationship to the dates table?
I would say you create a table in between with the month-year and first date of the month and link those to both the date and aggregation table!
Geaux Tigers! Nice background!
YES! Thanks! 👊
Awesome content. Watched a lot of your content and love it all! BTW, Geaux Tigers!! :)
Love the lsu wallpaper! Geaux Tigers
Great option . looking forward for more videos on aggregation.. Thanks
Thanks for watching! 👊
Let's start with. What is Aggregate?
I think there's a gap on Power BI aggregations (unless I'm missing something).
- Scenario 1: Power BI hits an aggregate table depending on the joins: having Sales joined to Time and Product, and SalesAgg joined only to Time, if the report uses Time+Sales, SalesAgg will be used; if you include Product, then it'll hit Sales.
+ Problem: it doesn't work for different levels (ProductCategory/Product)
- Scenario 2: If you want to have an aggregation on different levels of Product (ProductCategory and Product), then you can have 2 product tables (ProductCategory joined to SalesAgg, Product joined to Sales). Then Power BI will use SalesAgg if you have Category in the report, and Sales if you have Product (detailed table).
+ Problem: you can't have a Product hierarchy (do you??)
++ Solution: Sales has its aggregate table SalesAgg, and Product has its aggregate table ProductCategory; the joins are: Sales > Product, SalesAgg > ProductCategory.
With this approach, it's transparent for the user to choose any attribute from Product, and depending on the level (detail or category) Power BI will hit ProductCategory or Product, and the same will happen with Sales. And more importantly, we will still have the Product hierarchy (!!).
Is this something that can be performed in Power BI or it's really a gap??
I posted a Power BI Idea (ideas.powerbi.com/forums/265200-power-bi-ideas/suggestions/39619291-extend-aggregation-feature-for-enterprise-data-usa) but I think it will never reach enough votes to be developed.
BTW, Oracle BI has this functionality (for 15-20 years minimum).
Thanks!!
There are things coming with new Azure technologies that will accommodate this. In addition, stay tuned for upcoming videos on aggregations that should you can create aggs using relationships, without relationships and a hybrid approach.
@@GuyInACube Hello Patrick, have you done the video yet ? I need to use aggregations from the same fact table as well as the dim tables ( data coming that way from synapse table but I have control on how to read it or if any views needs to be created )
that is really bananas
U guys are cool
Thanks!👊
I like the cup Power Bier..
This is Bananas!!
Yes it is!! 👊
What ! This is Bananas..
Yoooooooo