What a way to avoid all those extra query steps when using files from a folder. NOT having to see all those steps Power Query adds is fantastic. You are a very skilled and intelligent guy. So keep digging into the Excel world to explain all those useful methods/tips. Thank you very much for sharing them.
Brilliant! Been using for a few years and always struggled with those helper files. Sometimes ending with 20 or even more. And it creates major issues if you just copy a query and maybe change the source only. It still works but I am always scared something goes wrong. Thanks so much. More of these please.
Holy Cow!! This was an incredible mind blowing bit of knowledge! I for one have always HATED the helper queries because when you swap out entire sets of files in a folder, the helper queries are still looking for the filename that they first referenced and when that file is gone, they break. I have some pretty heavy queries that have too many steps. I know I can clean those up much better now. Thank you.
@@bcti-bcti At work, I was asked to automate the updating and consolidation of spec files when part numbers change. Prior process was to assign someone to manually go through hundreds of files, with tens of thousands of rows and MANUALLY copy/paste/Q.C. the data. Whenever specs were updated (several times per year), this error prone process would take a full time user about a month. Using PQ, I got that down to about 7 minutes... Now that I've seen this, I can make that and other processes even better!
Great stuff, I'll use that next time I touch a folder full of files. The only thing I'll do different is to filter out "grand total" and surplus header rows first. Filtering just based on error may (silently) remove and hide bad entries in the csv
Simple, clear, focused --> understood and implemented in my work. Excelent presentation, clear use of words and examples are one of the best I have seen. Thank you for the tutorial.
Like the in depth explanation of tables, binary, errors. And the in depth explanation of all steps. You have opened my eyes to a new way of thinking about queries. Subscribed. Teach me more.
Great video! Thanks for sharing this very practical information. I have performed multi-file import operations similar to what was demonstrated in the lecture. However, I just retain the [Name] column from the initial metadata table and don't require the additional steps outlined in the presentation. I do remove the file extensions, of course. It propagates through the subsequent steps, as far as I can tell. I have been using and programming Microsoft Excel for 27 years but only started learning about Power Query some six months ago. It's a game changer! It's so cool that PQ can replace many of the VBA routines I have developed over the years to clean and transform data, all with just a few clicks. Throw in the added power of customizing the generated M Code script - I have created dozens of general-purpose PQ custom functions plus my own Excel VBA Add-in to help load multiple tables and ranges in one go - and my data cleansing capabilities have now reached the next level. I will *never* interact with Excel the same way again! Thank you kindly.
I, like you, retain the [Name] column from the meta data. This was to demonstrate how to achieve the goal if you didn't take that approach. I used to rely very heavily on VBA for me cleanup and automation solutions. Now it only gets used for situations where Power Query can't do the job. MY need for VBA has declined about 90% since Power Query came to being. Thanks for watching.
Superb explanation and now I can get rid of my ugly, confusing helper queries. Always used to drive me mad they did and now I can control the data much more easily EDIT: I would add though when I do the Folder.Contents thing I like to have the steps expanded out in the Applied Steps window so that I can easily see the path/folders I've drilled down into. To do this, you add a line in the advanced editor to set a variable e.g. file = "expand" and put a comma on the end and then you will find that the steps following that are expanded out when you drill down into folders
I hated these helper files from the beginning. So far I tried to avoid them by creating a function like let GetFiles=(Path,Name) => let Source = Excel.Workbook(File.Contents( Path & Name ), null, true), Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data], #"Promoted Headers" = …, #xxx = … in #"xxx" in GetFiles and applied this function to the files in a folder. Your solution seems smarter as it even avoids having the function code. Brilliant. I will certainly give it try. Thanks for sharing your knowledge.
Ive been wanting to get around these helper queries for ages. Your guide is incredibly straight forward, and i cant wait to apply this new methodology in my future queries!
Wonderful! Excellent tips, except now it's increased my workload, as I'm going to have to go back and rework some queries to avoid the "helpers" and streamline multiple steps. Well done, well explained and illustrated.
8:55 I'm on my phone so just typing out this line as it is what I'm referencing. I'll type out the full M when I get to my computer later today. Please forgive any typos. I'm currently on a crowded L heading into downtown Chicago. I think you will be able to understand from this. =Table.ExpandTableColumn{#"Removed Columns", "CSV_Data", ("Column1", "Column2", "Column3", "Column4"), ("TranDate", "Account", "Dept", "Sum of Amount")
I thought you were alluding to that, but I wasn't sure you were doing something else. Yes, that would be a great step-saver. I should have done that in the video. Thanks for the input, and thanks for watching. @@michaelt312
A gentleman and a scholar. Thank you so much, dude. Now I get to spend the day optimizing/rebuilding all my tools. Question for you: does this speed up the performance due to not having all that overhead?
I haven't run any scientific studies, but I can't help but think that things would run a bit quicker without the extra overhead. I'm sure someone could demonstrate a scenario where having the helper queries improves performance, so I guess it's just a matter of what you're willing to tolerate. Thanks for watching!
12:59 inside content there is another sheet come up first column is "record " the second column is "data" < the table here How can I do that if happen like my case another table inside content and the subject column dynamic not fixed?
@@bcti-bcti my bad sorry, check in 13:02 you extract the content tables in new column. In my case when I’m trying to extract the content I got table but it’s not belong to my data sheet, that table with 2 columns the first one referred "name" and sometimes "record" and second column is "data" in first row is the table that related to my sheet. How can I make expression to get the table that inside content. In my case usually comes second column inside the content and first row ? I hope that’s clear if not i can share screenshot to ur email.
@@bcti-bcti you ought to have an idea how much I and other users here appreciate the time you took to create content that is not only fun to watch - and learn from - but adds so much value to our lives.
Great view and excellent training! What do you do when your data is not in table form? I am using a date created column from the file selection to separate the data that is on Sheet1 of the individual files. When I click to expand the Data column containing the tables, I get "Column 1", "Column 2", etc. When I promote headers, I lose the first date of the date created because it is then promoted to a header as well. How do you correct for this?
Without seeing your data, I would think that you would just have to rename the date column to something more appropriate. Are you losing the entire record when you promote the header row, or just a single heading for a single column?
I love your channel and hope it thrives! Specific question from this video. I have a file that I've applied this technique. It works but the refresh rate is exceedingly slow. It's a very small file at this point. The only thing I can think of that would take a long time is filtering through the Sharepoint directory to get to the target folder. The root folder is parent to a lot of subfolders and sub-subfolders and so on. In any case, anything you can do to help me speed this up would be helpful. We're talking over 10 minutes or more every time a change is made.
Thank you for the nice encouragement. We hope it thrives, too. If you want to post here, or email me your full query (copied from the Advanced Editor), I can take a look and see if there's anything that may come to mind when it comes to optimizing the performance. training@bcti.com
Great job!! However, in my case it was not so simple since my CSV files have a more complex encoding. When I first tried to use the Csv.document command, it threw an error. To work around the problem, I first expanded using the normal process and copied the formula from the Transform Sample File located in the help folder. After that, just replace the expression “Parameter1” with “Content”. This was the original formula: Csv.Document(Parameter1,[Delimiter=";",Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None]) This is the new one: Csv.Document([Content],[Delimiter=";", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None])
@@bcti-bcti The crazy thing about those "helper queries" is: If you use them, it prevents you from building a Power BI template app. Power BI template apps don't allow parameters of the type "any" or "binary". So without eliminating those helper queries from the data model one is not able to use an app to publish Power BI reports built from that whole data model. That's a very annoying relation I had to uncover. 😅 Thanks for helping me solve this issue. My app is running now. 😁
@@bcti-bcti I made some progress with the PDF files with that Custom Column code you provided, thanks! I was able to dig out the tables without Sheet Kind. The problem is the Table.PromoteHeaders function. Is there any way I can make this dynamic without Hard coding column names? Table.PromoteHeaders( Table.ExpandTableColumn(#"Expanded Custom", "Data", {"Column1", {"Column2", {"Column3", {"Column4", {"Column5", {"Column5", {"Column7", {"Column8", {"Column9",{"Column10", {"Column11"})) It promotes the headers in this case but what if I get more columns or their names change? Column names as a list somehow?
Realy nice tutorial! I see Im not the only one that do not feel comfortable with those "helper" queries... I would like to know if this video is part of an on-line course that I can take ?
This isn't part of an online course, it's just me making videos that I find interesting. The company I work for does perform training for other companies, but they're typically local (i.e., within 150km), but we do have some students that take the class over the Internet. The class(s) are live and not recorded. It's the same experience for students in the classroom versus remote. Time zones are also an issue with live, distance-learning sessions. We're flattered that you would wish to take a full course. Where are you located?
what are the gains in speed? I'm struggling to the major benefit here. Can't you just ignore the helper queries? It's so much easier to click the button than go through these steps.
@@reggydavis You could, but it’s my experience that many users don’t like all the additional overhead and would rather take matters into their own hands. Thanks for watching.
Can I ask - why do you start by connecting to the parent folder and then go through the steps of filtering just to the sub folder? Is it in case people have their files across multiple folders? If not, wouldn't it easier to connect straight to the sub folder? Thanks for your informative videos.
Yes, exactly. I would just start at the sub-folder level, but this gives the operator the ability to "catch" more files that may be spread across multiple folders. It also allows the video to demonstrate "path filtering", something some users may have never thought to do. Thanks for watching.
I like to think I know more than most about PQ, but I still have a LOT to learn. It’s always a great feeling to learn something new. Thanks for watching.
hello, great video! i have a question, i can i extract a certain table if there are more than one in an excel workbook? should i [Kind] = "Table" & [Name] = "TableName" ?
You could do something like the following: let Source = Excel.Workbook(File.Contents("C:\Excel Data.xlsx"), null, true), // Filter the Excel file's metadata by the name of the desired table (ex: "MyData"). #"Filtered Rows" = Table.SelectRows(Source, each ([Item] = "MyData")), // Extract the data from the "MyData" table MyData_Table = #"Filtered Rows"{[Item="MyData",Kind="Table"]}[Data] in MyData_Table
You could keep the [Name] and [Content] columns, that way you could use the [Name] as a transaction-level "stamp" so you always know where each transaction came from. See the below M code for an example (this is done with the CSV files, but can easily be done the same way with any files.) ============================== let // Connect to the parent folder holding all files and subfolders Source = Folder.Files("C:\Camtasia Projects\Power Query - Avoid Scary Queries"), // Standardize the casing of the file extensions (using UPPERCASE) Uppercase_File_Extensions = Table.TransformColumns(Source,{{"Extension", Text.Upper, type text}}), // Filter to keep only files in the "CSV Files" folder and subfolders Filter_By_Path = Table.SelectRows(Uppercase_File_Extensions, each Text.Contains([Folder Path], "\CSV Files\")), // Filter to keep only CSV files Filter_By_CSV_Files = Table.SelectRows(Filter_By_Path, each Text.Contains([Extension], ".CSV")), // Remove all columns of folder metadata EXCEPT the [Name] and [Content] columns Keep_Only_Content = Table.SelectColumns(Filter_By_CSV_Files,{"Name", "Content"}), // Standardize the casing of the Filenames (UPPERCASE) #"Uppercased Text" = Table.TransformColumns(Keep_Only_Content,{{"Name", Text.Upper, type text}}), // Extract the tables from the Binary files Extract_Tables = Table.AddColumn(#"Uppercased Text", "CSV_Data", each Csv.Document([Content])), // Delete the [Content] column (column of Binaries) Delete_Binaries = Table.RemoveColumns(Extract_Tables,{"Content"}), // Expand the table columns of data from the [CSV Data] column Expand_Tables = Table.ExpandTableColumn(Delete_Binaries, "CSV_Data", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"}), // Promote the first row of data to a Header Row position Promoted_Headers = Table.PromoteHeaders(Expand_Tables, [PromoteAllScalars=true]), // Set the Data Types for the columns Set_Data_Types = Table.TransformColumnTypes(Promoted_Headers,{{"TranDate", type date}, {"Account", Int64.Type}, {"Dept", Int64.Type}, {"Sum of Amount", Currency.Type}}), // Remove any redundant headers and total rows (rows with text are rendered as errors by the previous step) Remove_Error_Rows = Table.RemoveRowsWithErrors(Set_Data_Types, {"TranDate", "Account", "Dept", "Sum of Amount"}) in Remove_Error_Rows
at 13.25 I want to extract a specific named range in the file, but it is not showing in the list at the bottom. How can i do that? So what I am doing is getting a file from Sharepoint folder, then go through all the steps to get a named range out of it, resulting in hundreds of Helper Queries. I want to do what you doing here. Get the NamedRange out of the file without the helper queries. The list does not show named ranges in the file???
@@ExcelWithChris I can assist with this, but it would be easier if I were to call you. We could share screens and I could walk you thru the process. That way, if there's a glitch, I can see what is happening and we can figure it out. You can email me at "training@bcti.com". cheers.
If there is no error, how can the header rows be removed for all the input files except for one? In this case, there was one field which was numeric. But there may be case where all the columnns may be of same type.
at 5:52 you'll select Content column first, just as on video, then you hold Ctrl key and click on Name column to add it to selection. then you follow the video and add "Remove other columns" step
All steps should be the same for .TXT files as they use the same connector for decoding. The only issue may be when you are using a TAB as the file's delimiter instead of a COMMA. If this is the case, you will have to modify the Csv.Document function as follows: Csv.Document([Content], [Delimiter=" "]) (There is a TAB character between the double-quotes.) NOTE: The "Custom Column" dialog box will not allow you to type a TAB character in the formula; it interprets TAB as an instruction to move to the next field in the dialog box. For this reason, you may have to type the above command in something like Notepad, then copy/paste it into the "Custom Column" dialog box. Hope this helps. Thanks for watching.
Pls get me a Formula to count meals (breakfasts, lunches & Diners) between 2 dates in different columns i.e 20/11/2024 Diner to 24/11/24 Breakfast BF (Column) 4nos Lunch (Column) 3nos Diner (Column) 4nos If you give me a solution i will be very greatful.
@@johank4361 If you can email me a sample file with an example of data spanning several days, I would be happy to write the formulas and send it back. Thanks.
This was unbelievably helpful! Thanks! I am however, now facing an issue...we're bringing in Ledger Accounts and due to the length of the numbers/codes used for these, it looks like PQ is literally making numbers up? Code 123.1234.123 in the Excel has been brought in as 123.1234.122999998 - has anyone else had this happen, and how do you counteract it? We can't just round as due to the .s they're coming in as text.
@@bcti-bcti I'm honestly not sure what to share, the steps are identical to what you've done above, but this also happens when using the Combine button, so it's not related to the method. I don't think I can share actual account codes, but to show their length (which might be the issue?) they are 1234567.1234 (with various numbers) and some have subledgers which show as 1234567.12345.123 (weirdly these show correctly in PQ). The codes with only one . in them seem to be rounded away from their actual values as shown in the original excel files (which appear to be stored as text as well), and become 1234567.1234000002 (five 0s) or 1234567.1233999998 (five 9s). The data first shows after the expanded columns step to show the actual table data so there are no transforms/promoted headers changing anything, the data seems to be being read this way? Really appreciate your help/insight on this, I can't really find anything about it online but it's very frustrating to have your ledger and subledger account codes being changed upon import!
@@bcti-bcti Seems my reply disappeared, oddly. In terms of the code, it's literally exactly what you're doing here to the letter, so I think the issue lies with the data perhaps. A text field is brought in from excel which contains the ledger codes in the format 1234567.1234 or 1234567.1234.123 (ledger and subledger). Subledgers seem to come in fine and remain correct, however the ledgers with only one . get weirdly rounded to either 1234567.1234000002 (with five 0s) or 1234567.123999998 (five 9s). In the Excel files which are being brought in the ledger just fixed at .1234, there's no numbers after that, no rounding etc. Hope that makes some more sense? In terms of clarity of the situation, the issue makes no sense! Thanks
What a way to avoid all those extra query steps when using files from a folder.
NOT having to see all those steps Power Query adds is fantastic.
You are a very skilled and intelligent guy.
So keep digging into the Excel world to explain all those useful methods/tips.
Thank you very much for sharing them.
Brilliant! Been using for a few years and always struggled with those helper files. Sometimes ending with 20 or even more. And it creates major issues if you just copy a query and maybe change the source only. It still works but I am always scared something goes wrong. Thanks so much. More of these please.
Thanks for taking the time to watch. Glad it helped. 👍👍
I like this video ❤❤❤❤
Fantastic
Holy Cow!! This was an incredible mind blowing bit of knowledge! I for one have always HATED the helper queries because when you swap out entire sets of files in a folder, the helper queries are still looking for the filename that they first referenced and when that file is gone, they break. I have some pretty heavy queries that have too many steps. I know I can clean those up much better now. Thank you.
So glad it will help. I prefer this method (in most cases.) Thanks for watching!!
@@bcti-bcti At work, I was asked to automate the updating and consolidation of spec files when part numbers change. Prior process was to assign someone to manually go through hundreds of files, with tens of thousands of rows and MANUALLY copy/paste/Q.C. the data. Whenever specs were updated (several times per year), this error prone process would take a full time user about a month. Using PQ, I got that down to about 7 minutes... Now that I've seen this, I can make that and other processes even better!
@@Adam_K_W FANTASTIC!!!! It’s always great to hear a PQ success story.
Great stuff, I'll use that next time I touch a folder full of files. The only thing I'll do different is to filter out "grand total" and surplus header rows first. Filtering just based on error may (silently) remove and hide bad entries in the csv
Simple, clear, focused --> understood and implemented in my work. Excelent presentation, clear use of words and examples are one of the best I have seen. Thank you for the tutorial.
Thank you so much for your VERY kind and supportive words. Thanks for watching!!!
Like the in depth explanation of tables, binary, errors. And the in depth explanation of all steps. You have opened my eyes to a new way of thinking about queries. Subscribed. Teach me more.
Thank you for the subscription. I do like to see that number increase!!! Here's hoping future videos server you just as well. Cheers.
Just what I was looking for - getting rid of helper queries and hard coded steps. I love the fool proofing tips. Thank you.
You're very welcome. 👍👍
Great video! Thanks for sharing this very practical information. I have performed multi-file import operations similar to what was demonstrated in the lecture. However, I just retain the [Name] column from the initial metadata table and don't require the additional steps outlined in the presentation. I do remove the file extensions, of course. It propagates through the subsequent steps, as far as I can tell.
I have been using and programming Microsoft Excel for 27 years but only started learning about Power Query some six months ago. It's a game changer! It's so cool that PQ can replace many of the VBA routines I have developed over the years to clean and transform data, all with just a few clicks. Throw in the added power of customizing the generated M Code script - I have created dozens of general-purpose PQ custom functions plus my own Excel VBA Add-in to help load multiple tables and ranges in one go - and my data cleansing capabilities have now reached the next level.
I will *never* interact with Excel the same way again!
Thank you kindly.
I, like you, retain the [Name] column from the meta data. This was to demonstrate how to achieve the goal if you didn't take that approach. I used to rely very heavily on VBA for me cleanup and automation solutions. Now it only gets used for situations where Power Query can't do the job. MY need for VBA has declined about 90% since Power Query came to being. Thanks for watching.
Thank you so very much for being so thorough in your explanation. You are a life saver!!
@@celestebenitez6688 You are very welcome. I’m glad the videos have helped. Thank you for your support.
Thank you. Didn't know about the "kept errors" functionality and "Remove errors".
I love those features for investigating oddities in data. Thanks for watching.
Superb explanation and now I can get rid of my ugly, confusing helper queries. Always used to drive me mad they did and now I can control the data much more easily
EDIT: I would add though when I do the Folder.Contents thing I like to have the steps expanded out in the Applied Steps window so that I can easily see the path/folders I've drilled down into. To do this, you add a line in the advanced editor to set a variable e.g. file = "expand" and put a comma on the end and then you will find that the steps following that are expanded out when you drill down into folders
So glad to hear it helps. Thank you so much for watching and commenting. It really helps.
I like this format of going through long examples with a lot of embedded tricks.
That Detect Data Types feature will save me a lot of time.
So glad you're enjoying them. I know it's not a format for everyone. Thanks for watching.
Great video, and very clearly explained, as always. I also use that technique to avoid those confusing Helper Queries!
I hated these helper files from the beginning. So far I tried to avoid them by creating a function like
let GetFiles=(Path,Name) =>
let
Source = Excel.Workbook(File.Contents( Path & Name ), null, true),
Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
#"Promoted Headers" = …,
#xxx = …
in
#"xxx"
in GetFiles
and applied this function to the files in a folder.
Your solution seems smarter as it even avoids having the function code. Brilliant. I will certainly give it try. Thanks for sharing your knowledge.
I'm not a fan either. Thanks for watching.
Ive been wanting to get around these helper queries for ages. Your guide is incredibly straight forward, and i cant wait to apply this new methodology in my future queries!
Thank you so much for saying nice things. Thanks for watching!!!
Wonderful! Excellent tips, except now it's increased my workload, as I'm going to have to go back and rework some queries to avoid the "helpers" and streamline multiple steps.
Well done, well explained and illustrated.
Yeah, but's a "good" kind of work. Very satisfying. Thanks for watching.
Hats up sir. I watched so many youtube views did not get expiation like you.
Appreciated for sharing such a useful information.
That is exceedingly nice of you to say. Thanks for taking the time to watch.
Honestly, this was very well explained, at a good pace, with lots of cool tips and tricks. Keep these vids coming, subscribed!
Thank you so much. We appreciate your viewership.
This has been super helpful. I didn't know about the combined function. This is saving me a bunch of time. Thank you!
Glad we could be of service. Thanks for your viewership!
Excellent - and a great introduction for anyone looking to learn how to 'program' with power query.
Thank you so much for taking the time to watch and comment.
In your Expand_CSV step, couldn't you change the header names there? This would avoid the need to promote the first row to name the columns?
Can you provide a sample of M-code to demonstrate your combined strategy? Thanks.
8:55 I'm on my phone so just typing out this line as it is what I'm referencing. I'll type out the full M when I get to my computer later today. Please forgive any typos. I'm currently on a crowded L heading into downtown Chicago. I think you will be able to understand from this.
=Table.ExpandTableColumn{#"Removed Columns", "CSV_Data", ("Column1", "Column2", "Column3", "Column4"), ("TranDate", "Account", "Dept", "Sum of Amount")
I thought you were alluding to that, but I wasn't sure you were doing something else. Yes, that would be a great step-saver. I should have done that in the video. Thanks for the input, and thanks for watching. @@michaelt312
@@bcti-bcti , your channel is awesome. I tripped on this process when I was trying to clean some code. It is now my go-to in every build.
this is AWESOME! and, learned fundamentals in the process! thanks!!
You are very much welcome. Thanks for watching.
Great video. Not sure how you keep putting out videos on projects I just finished using a better methodology, but keep them coming!
So what's the NEXT project I can assist with? 🤣 Thank you for taking the time to watch.
A gentleman and a scholar. Thank you so much, dude. Now I get to spend the day optimizing/rebuilding all my tools.
Question for you: does this speed up the performance due to not having all that overhead?
I haven't run any scientific studies, but I can't help but think that things would run a bit quicker without the extra overhead. I'm sure someone could demonstrate a scenario where having the helper queries improves performance, so I guess it's just a matter of what you're willing to tolerate. Thanks for watching!
Great Stuff...Very Informative...and easy to understand. Great work guys!!!🖐
Much appreciated!
12:59 inside content there is another sheet come up first column is "record " the second column is "data" < the table here
How can I do that if happen like my case another table inside content and the subject column dynamic not fixed?
I hate to sound dense, but I'm unclear as to your question. Can you rephrase it, or provide an example? Thanks.
@@bcti-bcti my bad sorry, check in 13:02 you extract the content tables in new column. In my case when I’m trying to extract the content I got table but it’s not belong to my data sheet, that table with 2 columns the first one referred "name" and sometimes "record" and second column is "data" in first row is the table that related to my sheet. How can I make expression to get the table that inside content. In my case usually comes second column inside the content and first row ? I hope that’s clear if not i can share screenshot to ur email.
@@Hemo2YoYo yes, send a screenshot to training@bcti.com and I’ll give a look. Thanks.
@@bcti-bcti i shared the video to your email.
I sincerely thank you for this awesome, well-done, informative, mind-blowing tutorial.
Thank you so much!!! We appreciate you taking the time to watch and comment.
@@bcti-bcti you ought to have an idea how much I and other users here appreciate the time you took to create content that is not only fun to watch - and learn from - but adds so much value to our lives.
@@AJ-fd3yj thank you for the encouraging words. It makes me want to try harder with each video. Thank you.
Great view and excellent training!
What do you do when your data is not in table form? I am using a date created column from the file selection to separate the data that is on Sheet1 of the individual files. When I click to expand the Data column containing the tables, I get "Column 1", "Column 2", etc. When I promote headers, I lose the first date of the date created because it is then promoted to a header as well. How do you correct for this?
Without seeing your data, I would think that you would just have to rename the date column to something more appropriate. Are you losing the entire record when you promote the header row, or just a single heading for a single column?
I love your channel and hope it thrives! Specific question from this video. I have a file that I've applied this technique. It works but the refresh rate is exceedingly slow. It's a very small file at this point. The only thing I can think of that would take a long time is filtering through the Sharepoint directory to get to the target folder. The root folder is parent to a lot of subfolders and sub-subfolders and so on. In any case, anything you can do to help me speed this up would be helpful. We're talking over 10 minutes or more every time a change is made.
Thank you for the nice encouragement. We hope it thrives, too. If you want to post here, or email me your full query (copied from the Advanced Editor), I can take a look and see if there's anything that may come to mind when it comes to optimizing the performance.
training@bcti.com
Great job!! However, in my case it was not so simple since my CSV files have a more complex encoding.
When I first tried to use the Csv.document command, it threw an error. To work around the problem, I first expanded using the normal process and copied the formula from the Transform Sample File located in the help folder.
After that, just replace the expression “Parameter1” with “Content”.
This was the original formula:
Csv.Document(Parameter1,[Delimiter=";",Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None])
This is the new one:
Csv.Document([Content],[Delimiter=";", Columns=17, Encoding=1252, QuoteStyle=QuoteStyle.None])
Great idea! As this is not a 100% perfect solution, there are always going to be scenarios that require creativity. Thanks for watching!!!
I have the exact same problem and your solution works for me. I can't say how much I want to thank you, saving me so much time and nerves! 😄🤗
@@ladytigereye6145 🤗
@@ladytigereye6145 so glad it helped. Thanks for taking the time to watch.
@@bcti-bcti The crazy thing about those "helper queries" is: If you use them, it prevents you from building a Power BI template app. Power BI template apps don't allow parameters of the type "any" or "binary". So without eliminating those helper queries from the data model one is not able to use an app to publish Power BI reports built from that whole data model. That's a very annoying relation I had to uncover. 😅 Thanks for helping me solve this issue. My app is running now. 😁
This is gold, thank you so much! :)
Thank you. We try to mine as much gold as possible. Thanks for watching.
This was excellent!. Thank you! Love your channel!
Thank you for saying so. And thanks for watching.
Amazing For any one dealing with multiples monthly files types 🤩
Thanks for taking the time to watch. 👍🏼
@@bcti-bcti no problem... I enjoyed it🤗
Thank you for sharing your knowledge.
It's a pleasure to serve. Glad you are enjoying the videos. Thanks for watching.
Great video again, thanks!
How about pdf-files and tables on them? Does combining them differ from csv/xslx?
You can use the following Custom Column code:
= Pdf.Tables([Content])
Let me know how it works out for you.
@@bcti-bcti I made some progress with the PDF files with that Custom Column code you provided, thanks! I was able to dig out the tables without Sheet Kind. The problem is the Table.PromoteHeaders function. Is there any way I can make this dynamic without Hard coding column names?
Table.PromoteHeaders( Table.ExpandTableColumn(#"Expanded Custom", "Data", {"Column1", {"Column2", {"Column3", {"Column4", {"Column5", {"Column5", {"Column7", {"Column8", {"Column9",{"Column10", {"Column11"}))
It promotes the headers in this case but what if I get more columns or their names change?
Column names as a list somehow?
Realy nice tutorial!
I see Im not the only one that do not feel comfortable with those "helper" queries...
I would like to know if this video is part of an on-line course that I can take ?
This isn't part of an online course, it's just me making videos that I find interesting. The company I work for does perform training for other companies, but they're typically local (i.e., within 150km), but we do have some students that take the class over the Internet. The class(s) are live and not recorded. It's the same experience for students in the classroom versus remote. Time zones are also an issue with live, distance-learning sessions. We're flattered that you would wish to take a full course. Where are you located?
I am located out side the U.S...
Which of the trainings is related to Power query?
what are the gains in speed? I'm struggling to the major benefit here. Can't you just ignore the helper queries? It's so much easier to click the button than go through these steps.
@@reggydavis You could, but it’s my experience that many users don’t like all the additional overhead and would rather take matters into their own hands. Thanks for watching.
This was fantastic! Thank you for a great video!
Thanks for the compliment. We appreciate you watching.
Looks Great - how would it be different if I had workbook files (xlsx, xlsb etc) in the folder ?
If you look at 12:08 in the video, I show how to use the strategy on Excel files using the Excel.Workbook() function. Thanks for watching.
Can I ask - why do you start by connecting to the parent folder and then go through the steps of filtering just to the sub folder? Is it in case people have their files across multiple folders? If not, wouldn't it easier to connect straight to the sub folder?
Thanks for your informative videos.
Yes, exactly. I would just start at the sub-folder level, but this gives the operator the ability to "catch" more files that may be spread across multiple folders. It also allows the video to demonstrate "path filtering", something some users may have never thought to do. Thanks for watching.
Great and clean method! Thanks a lot.
You're welcome. Thanks for watching.
Simply briliant. I have learned a trick or two, while considered myself as skilled in PQ
I like to think I know more than most about PQ, but I still have a LOT to learn. It’s always a great feeling to learn something new. Thanks for watching.
I can whole heartedly second this comment. I have reasonably good PQE skills but you sir are next level!!!
hello, great video! i have a question, i can i extract a certain table if there are more than one in an excel workbook? should i [Kind] = "Table" & [Name] = "TableName" ?
You could do something like the following:
let
Source = Excel.Workbook(File.Contents("C:\Excel Data.xlsx"), null, true),
// Filter the Excel file's metadata by the name of the desired table (ex: "MyData").
#"Filtered Rows" = Table.SelectRows(Source, each ([Item] = "MyData")),
// Extract the data from the "MyData" table
MyData_Table = #"Filtered Rows"{[Item="MyData",Kind="Table"]}[Data]
in
MyData_Table
@@bcti-bcti thank you i will try it... i was actually thinking about something like that, i didnt have the time yet to try iy! thank u!!!
Thanks, cool trick to avoid subfolders!
Yep. Sure is. Thanks for watching.
What if my files are XLS or XLSX in 8:06 ? What is the expression will be?
If you look at 12:08 in the video, I show how to use the strategy on Excel files using the Excel.Workbook() function. Thanks for watching.
@@bcti-bcti thanks so much
I like the cleaner combining of files, but I often need the name of the source file for each row. How would you approach that? Thanks.
You could keep the [Name] and [Content] columns, that way you could use the [Name] as a transaction-level "stamp" so you always know where each transaction came from. See the below M code for an example (this is done with the CSV files, but can easily be done the same way with any files.)
==============================
let
// Connect to the parent folder holding all files and subfolders
Source = Folder.Files("C:\Camtasia Projects\Power Query - Avoid Scary Queries"),
// Standardize the casing of the file extensions (using UPPERCASE)
Uppercase_File_Extensions = Table.TransformColumns(Source,{{"Extension", Text.Upper, type text}}),
// Filter to keep only files in the "CSV Files" folder and subfolders
Filter_By_Path = Table.SelectRows(Uppercase_File_Extensions, each Text.Contains([Folder Path], "\CSV Files\")),
// Filter to keep only CSV files
Filter_By_CSV_Files = Table.SelectRows(Filter_By_Path, each Text.Contains([Extension], ".CSV")),
// Remove all columns of folder metadata EXCEPT the [Name] and [Content] columns
Keep_Only_Content = Table.SelectColumns(Filter_By_CSV_Files,{"Name", "Content"}),
// Standardize the casing of the Filenames (UPPERCASE)
#"Uppercased Text" = Table.TransformColumns(Keep_Only_Content,{{"Name", Text.Upper, type text}}),
// Extract the tables from the Binary files
Extract_Tables = Table.AddColumn(#"Uppercased Text", "CSV_Data", each Csv.Document([Content])),
// Delete the [Content] column (column of Binaries)
Delete_Binaries = Table.RemoveColumns(Extract_Tables,{"Content"}),
// Expand the table columns of data from the [CSV Data] column
Expand_Tables = Table.ExpandTableColumn(Delete_Binaries, "CSV_Data", {"Column1", "Column2", "Column3", "Column4"}, {"Column1", "Column2", "Column3", "Column4"}),
// Promote the first row of data to a Header Row position
Promoted_Headers = Table.PromoteHeaders(Expand_Tables, [PromoteAllScalars=true]),
// Set the Data Types for the columns
Set_Data_Types = Table.TransformColumnTypes(Promoted_Headers,{{"TranDate", type date}, {"Account", Int64.Type}, {"Dept", Int64.Type}, {"Sum of Amount", Currency.Type}}),
// Remove any redundant headers and total rows (rows with text are rendered as errors by the previous step)
Remove_Error_Rows = Table.RemoveRowsWithErrors(Set_Data_Types, {"TranDate", "Account", "Dept", "Sum of Amount"})
in
Remove_Error_Rows
Cool, I hate all the helper junk, way too much clutter. Nicely done!
Thanks. I couldn't agree more.
Will the model load everything in the parent folder before filtering down or is it intelligent enough to only request the data it needs?
I can't be entirely certain as to your question. Can you rephrase it with a procedural example? Thanks for watching!
Amazing. Thank you for this, very helpful.
You are most welcome. Thanks for watching!!!
Good video. While I agree that those helper functions clutter any detail on performance for larger datasets with query?
Thanks. I haven’t done any large scale performance tests, so I’m uncertain as to the benefit/detriment of either approach. Thanks for watching.
Thank you! this is so valuable!
Glad it helps. Thanks for watching.
at 13.25 I want to extract a specific named range in the file, but it is not showing in the list at the bottom. How can i do that? So what I am doing is getting a file from Sharepoint folder, then go through all the steps to get a named range out of it, resulting in hundreds of Helper Queries. I want to do what you doing here. Get the NamedRange out of the file without the helper queries. The list does not show named ranges in the file???
I'm tied up for a couple days, but let me look into this and see what I can figure out.
@@bcti-bcti Thanks a million. Would help a lot. Currently I import 15 files and it creates a lot of Helper Queries.
@@ExcelWithChris I can assist with this, but it would be easier if I were to call you. We could share screens and I could walk you thru the process. That way, if there's a glitch, I can see what is happening and we can figure it out. You can email me at "training@bcti.com". cheers.
@@ExcelWithChrisdo you mean 15 files of different structure each?
If there is no error, how can the header rows be removed for all the input files except for one? In this case, there was one field which was numeric. But there may be case where all the columnns may be of same type.
Then a different strategy will need to be employed. Every scenario is different and requires its own special adjustments. Thanks for watching.
Before expanding table you can promote headers for each table at once!
Very true. Thanks for the input. And thanks for watching.
Thats a wonderful solution. What if the data type is in .xml?
Once you have the folder data reduced to just a [Content] column, you could create a custom column using the following formula:
=Xml.Tables([Content])
Do your input excel files have- tables named after region?
Yes; the tables within each Excel file contains a table that was named according to the state the date is related.
What a great video!!
Fabulous video❤
Excellent
Marvelous 🎉🎉
Thanks for watching!!!
How to retain the file names?
Can you provide additional information regarding your question? I am uncertain as to what you are asking. Thank you.
at 5:52 you'll select Content column first, just as on video,
then you hold Ctrl key and click on Name column to add it to selection. then you follow the video and add "Remove other columns" step
Nice 😃👍
Thanks for taking the time to watch.
Great tips
Thanks. We appreciate you taking the time to watch.
Follow up question for you-what if the data type is txt?
All steps should be the same for .TXT files as they use the same connector for decoding. The only issue may be when you are using a TAB as the file's delimiter instead of a COMMA. If this is the case, you will have to modify the Csv.Document function as follows:
Csv.Document([Content], [Delimiter=" "])
(There is a TAB character between the double-quotes.)
NOTE: The "Custom Column" dialog box will not allow you to type a TAB character in the formula; it interprets TAB as an instruction to move to the next field in the dialog box. For this reason, you may have to type the above command in something like Notepad, then copy/paste it into the "Custom Column" dialog box.
Hope this helps. Thanks for watching.
thank you very much
You are quite welcome!
Powerfull
Thanks for watching.
Pls get me a Formula to count meals (breakfasts, lunches & Diners) between 2 dates in different columns
i.e
20/11/2024 Diner to 24/11/24 Breakfast
BF (Column)
4nos
Lunch (Column)
3nos
Diner (Column)
4nos
If you give me a solution i will be very greatful.
@@johank4361 If you can email me a sample file with an example of data spanning several days, I would be happy to write the formulas and send it back. Thanks.
@@johank4361 You can send the file to “training@bcti.com”
@bcti-bcti great, thank s a lot
Kindly mention your email id
@ It was in the second reply “training@bcti.com”
Awesome video!
Thank you!!! That means a LOT to me.
This was unbelievably helpful! Thanks! I am however, now facing an issue...we're bringing in Ledger Accounts and due to the length of the numbers/codes used for these, it looks like PQ is literally making numbers up? Code 123.1234.123 in the Excel has been brought in as 123.1234.122999998 - has anyone else had this happen, and how do you counteract it? We can't just round as due to the .s they're coming in as text.
I am unable to replicate your issue with the example provided. Can you provide additional information? A copy of the M code would help.
@@bcti-bcti I'm honestly not sure what to share, the steps are identical to what you've done above, but this also happens when using the Combine button, so it's not related to the method. I don't think I can share actual account codes, but to show their length (which might be the issue?) they are 1234567.1234 (with various numbers) and some have subledgers which show as 1234567.12345.123 (weirdly these show correctly in PQ). The codes with only one . in them seem to be rounded away from their actual values as shown in the original excel files (which appear to be stored as text as well), and become 1234567.1234000002 (five 0s) or 1234567.1233999998 (five 9s). The data first shows after the expanded columns step to show the actual table data so there are no transforms/promoted headers changing anything, the data seems to be being read this way? Really appreciate your help/insight on this, I can't really find anything about it online but it's very frustrating to have your ledger and subledger account codes being changed upon import!
@@bcti-bcti Seems my reply disappeared, oddly. In terms of the code, it's literally exactly what you're doing here to the letter, so I think the issue lies with the data perhaps. A text field is brought in from excel which contains the ledger codes in the format 1234567.1234 or 1234567.1234.123 (ledger and subledger). Subledgers seem to come in fine and remain correct, however the ledgers with only one . get weirdly rounded to either 1234567.1234000002 (with five 0s) or 1234567.123999998 (five 9s). In the Excel files which are being brought in the ledger just fixed at .1234, there's no numbers after that, no rounding etc. Hope that makes some more sense? In terms of clarity of the situation, the issue makes no sense! Thanks