This was a brilliant video. I've struggled with understanding the differences between native query, query folding and what gets pushed back to the data source. Seeing how you used the Azure Data Studio profiling extension was very helpful too as I wasn't aware of running this in the background while you refresh your Power BI file. This video clears up all my confusion in this topic. Thank you Patrick for this wonderful tutorial.
The most important kind of step you definitely want to push to the source is filter. Using native query you must include filtering in the query's where clause. Otherwise the entire query result will be imported and filtering will take place on the Power BI side which is obviously bad.
If someone knows their RDMS well enough to write SQL to import data, there'd be no reason they couldn't add the SQL statement to add a field for those concatenated columns in their native query from the start, as opposed to importing the data then merging the columns in the Query Editor.
I wish all new all this when I used a native query before! But I saw something you didn't mention: when the native query is greyed out in your query steps, the original SQL query for view the still appears in the source step in the M code. It's really really ugly, but it CAN be accessed and edited in the formula bar or advanced editor. Not as good as what you recommended, but still there for emergency access :)
As soon as I have to do some kind of advanced data manipulation, creating custom columns, casting datatypes of similar, then I use the native query directly and manage all of the main structure through that. If im not doing that, then I very quickly end up with some quite large and heavy pbix files, as many data operations requires to import data, and that simply is not an option in many cases where I use Power Bi and need to publish the report. Could be a difference of a pbix with 600Kb vs. a pbix 1Gb. Then we will be getting calls from IT that our dataset is taking up to many resoures at the Bi data warehouse when processing. However, when using native query towards a SQL server, then all the processing are done at SQL server level, before the data is delivered to the Power Bi service - that has for me been the best of two worlds and also I've found it to be the best performing one. However, it comes down to the performance of the SQL server.
The problem is - if I have a report with some parameters that change what dates are pulled in queries, if I do it with native qeury there doesn't seem to be any way to reference parameters in the SQL? Or maybe build the sql query programatically on refresh so I can reference them? The native query part means once I build it anyone who maintains it has to know SQL instead of just learning power query basics.
Adam or Patrick, is it possible to change your Power BI file from Import to DirectQuery? I decide I want zero latency from my data source and I connect to a MPP like Azure SQL DW or Snowflake, as there are Microsoft reference architectures in their knowledge base recommending this, is this possible? The use case here is to create views off of the source system that Power BI connects to, then I re-connect to these re-defined views with the same exact metadata in my new MPP source system. Is there a way to change the settings in M to account for this, or do I need to recreate everything from scratch? I've looked around and I continually hear "re-create the report, they're fast to create." However, re-creating 800 Power BI reports for an enterprise client is extremely time consuming and expensive. Is your recommendation to move to Premium capacity since the model limitations are much higher? Thanks!
I am using "WITH(NOLOCK)" in my SQL statement because I am pulling data from an operational database. I want to switch to query folding so I can use incremental refresh but I do not want to risk locking the tables. Does PowerBI have a setting for this somewhere? Or better yet, does it not lock the tables by default?
Hey, thanks for the video ! Is it preferred to use stored procedures or view:s when giving users access to restricted columns when importing data to power bi ? Or is there any difference when the SP is only a select statement ?
I use native query for every source since i don't have access to create views. I perform my transformations and calc columns in the native query as much as possible. I also find it easier to give a query to other people with the same logic for our metrics.
I think it depends. If you have excellent SQL server than sure it makes sense to push back the burden onto it. But I have shitty SQL server so it is much faster to let PBI service do the job
whatif the tables i want from sql server contains related tables, should i import those in full or those that i need. And since i don't have access to create views in the database, it is still better that i generate sql statement and import them.
Usefull info. But for me it is wierd to see after several first steps of removing or other general stuff SELECT from multiple subqueries. Do you see sence in this subqueries instead of just giving a list of needed columns?
From a general sense, that is how the T-SQL auto generation works within Power Query. It may not be the most optimal way to do it, but from a generic sense it first most scenarios.
I have a hive database and am connecting it, but I want to use a query to pull data where do I input the query as while connecting to hive there is no advance option to write the query
Somesh Mk So I have been waiting for that to happen. Need to check if it true. Bad timing for this video maybe, but nice to know that even Adam and Patrik has a hard time catching up with all the new nice features :-)
I thought about this when I saw the blog, but take a look at the post, powerbi.microsoft.com/en-us/blog/power-bi-desktop-september-2019-feature-summary/#postgreSQL. Looks like this only works for PostGre.
Beware, I think... when I embedded a native query in power BI and tried to set up incremental refresh, I was stymied. Didn't see the limitations with incremental refresh and this whole folding business... I thought that the incremental refresh would just add a where statement but it appears not so.... Am I missing something Patrick? Cheers!
Yup. You will run into badness. Pretty sure Patrick mentioned that in the video. I know it was one of the driving items for this video. Incremental refresh relies on query folding for the different partitions. If you are using Native Query, it will not be effective.
Thanks for taking up my question Patrick, and as a bonus, I learned that it's called a 'Native Query' and not an 'Embedded Query' as I've been calling it. Just one comment though, why have you (Microsoft) named it 'Query Folding'? As a non-native English speaker that naming makes no sense to me. I fold a shirt, but how do I fold a query? Something like query source pushing/execution makes more sense to me, but I might be alone with that view...
Doesn’t really mean anything for native speakers either. It’s tidy like folding shirts? Fold a poker hand, you give up and let the native query take over? Fold egg whites into a meringue of data? Bring someone into the fold (join the cult of native query)?
I've got some large tables that are unindexed, several log tables, and multiple equations for fields. Normally creating a temp table is the best route via SQL to prevent timeout but now there's a 1 to many relationship I need to report on too. Any suggestions on structuring the input into PowerBI?
Hey Patrick, wouldn't it make sense for the Power Query engine to declare the native query as a CTE (common table expression) and then still use query folding even for native query data sources? Wouldn't that still perform better than doing all the heavy lifting in PQ as opposed to pushing it back to the SQL server?
Even if it is a Common Table Expression (CTE) query folding will not kick in because of the use of a Native Query. If you wrap the CTE in a view then query folding will kick in and all the work will be pushed back to the source.
@@GuyInACube I'm proposing that the PQ engineering team should implement query folding for native queries by taking the native query and declaring it as a CTE in the query it folds back to the source. The generated SQL would be identical to when using a table/view source except that it'll reference the CTE instead of the table/view.
Hi Patrick, First time visit to the channel and thanks for the video! I was struggling getting my sql script (with Drop query!) into Power BI and was trying to resolve the 'Incorrect syntax nears the keyword' error. You mentioned not to use Drop query... So is it then best to create a new db table with those query and then to simply grab that new table into Power BI or would there be any work around? Thanks in advance!
Confused by the term import. You say Always import, but native query is also import. How else are you to get data stored in Oracle if you can’t query for it. Cant always make a view in Oracle.
on a separate note, can you share the best way to import tables from sql server. I have difficulty to reconcile the related tables even with system data model and data dictionary. A lot of them is based on my understanding and not from the system maker or data administrator (which has no knowledge of the data structure).
I am trying to add native query while setting up connection with SQL server but after clicking on Ok button but it will be continuously loading for more than an hr. Can you please tell me what needs to be done for this.
I think what you are saying is that power bi is slow. Sure you can write your sql to do what the query folding does. BUT... most of value of power bi is in those steps that can't be done in sql either by the dev or by "query folding": Appending and merging alternative sources, pivoting and unpivoting columns, or any other step too complex to write as sql. To me this is what power bi is for. It is just too slow at it.
There are certain things mentioned here, I don’t agree with. Not all power query transformations are pushed to the source. If you know tour SQL you can beat power query performance. Specifically when using a gateway.
So you should put all your logic in your native query yourself and you can confidently assume no weird stuff will be added after. That is how I hear what you are saying.
This was a brilliant video. I've struggled with understanding the differences between native query, query folding and what gets pushed back to the data source. Seeing how you used the Azure Data Studio profiling extension was very helpful too as I wasn't aware of running this in the background while you refresh your Power BI file. This video clears up all my confusion in this topic. Thank you Patrick for this wonderful tutorial.
Great video man! You and Adam are truly inspirational.
Thanks Robert!
As always, amazing video. You have helped me to have more interest in Power BI every day.
The most important kind of step you definitely want to push to the source is filter.
Using native query you must include filtering in the query's where clause.
Otherwise the entire query result will be imported and filtering will take place on the Power BI side which is obviously bad.
If someone knows their RDMS well enough to write SQL to import data, there'd be no reason they couldn't add the SQL statement to add a field for those concatenated columns in their native query from the start, as opposed to importing the data then merging the columns in the Query Editor.
I wish all new all this when I used a native query before! But I saw something you didn't mention: when the native query is greyed out in your query steps, the original SQL query for view the still appears in the source step in the M code. It's really really ugly, but it CAN be accessed and edited in the formula bar or advanced editor. Not as good as what you recommended, but still there for emergency access :)
Great call out. Yes you can edit the native query from an M Syntax perspective, or by hitting the gear icon next to the Source step. 👊
Excellent explanation
As soon as I have to do some kind of advanced data manipulation, creating custom columns, casting datatypes of similar, then I use the native query directly and manage all of the main structure through that.
If im not doing that, then I very quickly end up with some quite large and heavy pbix files, as many data operations requires to import data, and that simply is not an option in many cases where I use Power Bi and need to publish the report. Could be a difference of a pbix with 600Kb vs. a pbix 1Gb. Then we will be getting calls from IT that our dataset is taking up to many resoures at the Bi data warehouse when processing. However, when using native query towards a SQL server, then all the processing are done at SQL server level, before the data is delivered to the Power Bi service - that has for me been the best of two worlds and also I've found it to be the best performing one. However, it comes down to the performance of the SQL server.
I love your energy
The problem is - if I have a report with some parameters that change what dates are pulled in queries, if I do it with native qeury there doesn't seem to be any way to reference parameters in the SQL? Or maybe build the sql query programatically on refresh so I can reference them?
The native query part means once I build it anyone who maintains it has to know SQL instead of just learning power query basics.
This T-shirt is amazing!
Very thorough Patrick. Thanks a mil!
You are welcome. Thanks for watching.
Adam or Patrick, is it possible to change your Power BI file from Import to DirectQuery? I decide I want zero latency from my data source and I connect to a MPP like Azure SQL DW or Snowflake, as there are Microsoft reference architectures in their knowledge base recommending this, is this possible? The use case here is to create views off of the source system that Power BI connects to, then I re-connect to these re-defined views with the same exact metadata in my new MPP source system. Is there a way to change the settings in M to account for this, or do I need to recreate everything from scratch? I've looked around and I continually hear "re-create the report, they're fast to create." However, re-creating 800 Power BI reports for an enterprise client is extremely time consuming and expensive. Is your recommendation to move to Premium capacity since the model limitations are much higher? Thanks!
I am using "WITH(NOLOCK)" in my SQL statement because I am pulling data from an operational database. I want to switch to query folding so I can use incremental refresh but I do not want to risk locking the tables. Does PowerBI have a setting for this somewhere? Or better yet, does it not lock the tables by default?
You should always make data manipulation as close to the source as possible (this time it does not depend).
1 Quick Question: When you say push back, does it mean pushing back to the cached data, right?
As you have selected Import while doing this.
Hey, thanks for the video !
Is it preferred to use stored procedures or view:s when giving users access to restricted columns when importing data to power bi ? Or is there any difference when the SP is only a select statement ?
What do u mean with pushing back to the source?
I use native query for every source since i don't have access to create views. I perform my transformations and calc columns in the native query as much as possible. I also find it easier to give a query to other people with the same logic for our metrics.
Native requied run permission for each different parameters value whats is solution
I think it depends. If you have excellent SQL server than sure it makes sense to push back the burden onto it. But I have shitty SQL server so it is much faster to let PBI service do the job
whatif the tables i want from sql server contains related tables, should i import those in full or those that i need. And since i don't have access to create views in the database, it is still better that i generate sql statement and import them.
Usefull info. But for me it is wierd to see after several first steps of removing or other general stuff SELECT from multiple subqueries. Do you see sence in this subqueries instead of just giving a list of needed columns?
From a general sense, that is how the T-SQL auto generation works within Power Query. It may not be the most optimal way to do it, but from a generic sense it first most scenarios.
Thank you for this
I have a hive database and am connecting it, but I want to use a query to pull data where do I input the query as while connecting to hive there is no advance option to write the query
Hi Adam and Patrick .. earlier today i read in power bi September updated "Support for folding over native database quarries" ..
what does it mean???
Somesh Mk So I have been waiting for that to happen. Need to check if it true. Bad timing for this video maybe, but nice to know that even Adam and Patrik has a hard time catching up with all the new nice features :-)
I thought about this when I saw the blog, but take a look at the post, powerbi.microsoft.com/en-us/blog/power-bi-desktop-september-2019-feature-summary/#postgreSQL. Looks like this only works for PostGre.
Beware, I think... when I embedded a native query in power BI and tried to set up incremental refresh, I was stymied. Didn't see the limitations with incremental refresh and this whole folding business... I thought that the incremental refresh would just add a where statement but it appears not so.... Am I missing something Patrick?
Cheers!
Yup. You will run into badness. Pretty sure Patrick mentioned that in the video. I know it was one of the driving items for this video. Incremental refresh relies on query folding for the different partitions. If you are using Native Query, it will not be effective.
Thanks for taking up my question Patrick, and as a bonus, I learned that it's called a 'Native Query' and not an 'Embedded Query' as I've been calling it. Just one comment though, why have you (Microsoft) named it 'Query Folding'? As a non-native English speaker that naming makes no sense to me. I fold a shirt, but how do I fold a query? Something like query source pushing/execution makes more sense to me, but I might be alone with that view...
Doesn’t really mean anything for native speakers either. It’s tidy like folding shirts? Fold a poker hand, you give up and let the native query take over? Fold egg whites into a meringue of data? Bring someone into the fold (join the cult of native query)?
how to select particular schema in oracle db, for custom sql statement
I've got some large tables that are unindexed, several log tables, and multiple equations for fields.
Normally creating a temp table is the best route via SQL to prevent timeout but now there's a 1 to many relationship I need to report on too.
Any suggestions on structuring the input into PowerBI?
Hey Patrick, wouldn't it make sense for the Power Query engine to declare the native query as a CTE (common table expression) and then still use query folding even for native query data sources? Wouldn't that still perform better than doing all the heavy lifting in PQ as opposed to pushing it back to the SQL server?
Even if it is a Common Table Expression (CTE) query folding will not kick in because of the use of a Native Query. If you wrap the CTE in a view then query folding will kick in and all the work will be pushed back to the source.
@@GuyInACube I'm proposing that the PQ engineering team should implement query folding for native queries by taking the native query and declaring it as a CTE in the query it folds back to the source. The generated SQL would be identical to when using a table/view source except that it'll reference the CTE instead of the table/view.
@@reuvainkrasner7879 Ok. Makes sense.
How to solve this native queries solution for redshift if am using a sp to call from redshift to powerbi
Hi Patrick, First time visit to the channel and thanks for the video!
I was struggling getting my sql script (with Drop query!) into Power BI and was trying to resolve the 'Incorrect syntax nears the keyword' error.
You mentioned not to use Drop query...
So is it then best to create a new db table with those query and then to simply grab that new table into Power BI or would there be any work around?
Thanks in advance!
Awesome T-Shirt😋😋😊😊😉😉😎😎😍😍😍😘😘🥰🥰👌👌👍👍. I am lovin' it.
Awesome!
Confused by the term import. You say Always import, but native query is also import. How else are you to get data stored in Oracle if you can’t query for it. Cant always make a view in Oracle.
Great!! Power packed video!!
Thanks Pratik! Lots of great stuff in this one. 👊
@@GuyInACube yes !! Certainly ✌️
on a separate note, can you share the best way to import tables from sql server. I have difficulty to reconcile the related tables even with system data model and data dictionary. A lot of them is based on my understanding and not from the system maker or data administrator (which has no knowledge of the data structure).
I am trying to add native query while setting up connection with SQL server but after clicking on Ok button but it will be continuously loading for more than an hr. Can you please tell me what needs to be done for this.
I think what you are saying is that power bi is slow. Sure you can write your sql to do what the query folding does. BUT... most of value of power bi is in those steps that can't be done in sql either by the dev or by "query folding": Appending and merging alternative sources, pivoting and unpivoting columns, or any other step too complex to write as sql. To me this is what power bi is for. It is just too slow at it.
Thanks for this great video! How I can send reports as an attachment when row level security is enabled?
How can I get the T “ I ❤️ Data”
It has been handed out two years in a row at the Gartner Data and AI Summit at the Microsoft booth. Haven't seen it handed out anywhere else.
make the same by yourself)))
Uff.... I've been using native query and then working stuff in power query without giving it two thoughts. Thanks
I tried to insert native sql query in direct query mode and I always get the same syntax error for every query I use. Is that a bug?
I still have no clue what a native query is.
Nice bro
Appreciate that! Thanks for watching 👊
So do everything in SQL, gotcha
There are certain things mentioned here, I don’t agree with.
Not all power query transformations are pushed to the source.
If you know tour SQL you can beat power query performance. Specifically when using a gateway.
So you should put all your logic in your native query yourself and you can confidently assume no weird stuff will be added after. That is how I hear what you are saying.
Sub-spanish please
has anyone worked out a way to sort a value.nativequery SQL statement?