30 years ago I was the guy spending that hour on a payroll text file in excel every month to reconcile the bank account at university. Now I’m using Power Query to cleanse a csv file from the bank in seconds for a small municipality. Progress!
Before watching this video, I had a lot of trouble with reports like this, and didn't know how to use it with power query to automate it, many thanks Mr Excel
Even with this solution, we can't automate it. Because If we get the data with any different positions, It fails. Did you find any different way for that?
You can sort of do it in Power Query. Right click and select Duplicate on the name column and call the new column "Names" (just change in the formula bar). Then add another step in the applied steps window, and use: =Table.Combine({#"Duplicated Column", Table.FromList(List.Distinct(#"Duplicated Column"[#"Names"]), null, {"Names"})}) This step (and you can separate into a few if easier) takes a distinct list of all the names in the duplicated column and converts to a table, which is then combined with the original data, giving you a list of null lines with only the duplicate name field populated. You can then sort this by your name column, but first you have to replace values again in the actual column that you wish to sort by, replacing null with "" (replace null in all other columns created by the combined tables too). Then sort by duplicated name column and then delete the duplicated name column and you're left with the data with spaces. Obviously, the correct answer is that spaces are for satan.
Or a function might be better. Copy this into a blank query and you can call it from your own query. Pass the table (previous step name), the column name you want to create a blank row under and the column that you next want to sort by after your column name. Should work nicely (edited from first post because it didn't work nicely!): (sourceTable as table, columnName as text, nextSortColumnName as text) as table => let Source = Table.DuplicateColumn(sourceTable, columnName, columnName & "_"), columnNames = List.RemoveMatchingItems(Table.ColumnNames(Source),{columnName & "_"}), distinctRowsAsTable = Table.FromList(List.Distinct(Table.Column(Source,columnName & "_")),null,{columnName & "_"}), newTable = List.Accumulate(columnNames, distinctRowsAsTable, (state, current) => Table.AddColumn(state, current, each "")), sortedData = Table.Sort(Table.Combine({Source,newTable}),{columnName & "_",nextSortColumnName}), result = Table.RemoveColumns(sortedData, {columnName & "_"}) in result
Very cool! Thank you!! Working with the same kind of csv file, is there a way do delete all rows after a cell with a specific text value? The text may appear in several rows, but I'd like to delete everything after the first instance.
Hahaha this looks like the output from out 'new' system. I just spent an hour making a 'companion' table to the data using offsets and =MID only to find there's a secondary row "sometimes" that has important data as they have offset rows of data. I'm going to use these tips to import the sheet two ways, one for the main row, one for the offset rows, then recombine them I think..... Thanks MrExcel!
This was amazing . But my file has headers spread into 2 rows and data is 2 rows seperated by a blank row . it runs into many pages . how to clean it ?
Hey, Thanks for the Video, It was very helpful. But I couldn't automate the report with this solution, As I am getting different position of column every day, Any idea for that??
Empty row after each Name? No problem😁 WithEmptyRow = Table.Combine(Table.Group( MrExcelLastStep, {"Name"}, {{"tbl", each _ & #table({"Name"}, {{null}}), type table}})[tbl]) in WithEmptyRow
30 years ago I was the guy spending that hour on a payroll text file in excel every month to reconcile the bank account at university. Now I’m using Power Query to cleanse a csv file from the bank in seconds for a small municipality. Progress!
That is awesome. What is too bad is the lack of knowledge that Power Query exists. Once I can show people they will see the light.
Before watching this video, I had a lot of trouble with reports like this, and didn't know how to use it with power query to automate it, many thanks Mr Excel
Even with this solution, we can't automate it. Because If we get the data with any different positions, It fails. Did you find any different way for that?
You can sort of do it in Power Query. Right click and select Duplicate on the name column and call the new column "Names" (just change in the formula bar). Then add another step in the applied steps window, and use:
=Table.Combine({#"Duplicated Column", Table.FromList(List.Distinct(#"Duplicated Column"[#"Names"]), null, {"Names"})})
This step (and you can separate into a few if easier) takes a distinct list of all the names in the duplicated column and converts to a table, which is then combined with the original data, giving you a list of null lines with only the duplicate name field populated.
You can then sort this by your name column, but first you have to replace values again in the actual column that you wish to sort by, replacing null with "" (replace null in all other columns created by the combined tables too). Then sort by duplicated name column and then delete the duplicated name column and you're left with the data with spaces.
Obviously, the correct answer is that spaces are for satan.
Or a function might be better. Copy this into a blank query and you can call it from your own query. Pass the table (previous step name), the column name you want to create a blank row under and the column that you next want to sort by after your column name. Should work nicely (edited from first post because it didn't work nicely!):
(sourceTable as table, columnName as text, nextSortColumnName as text) as table =>
let
Source = Table.DuplicateColumn(sourceTable, columnName, columnName & "_"),
columnNames = List.RemoveMatchingItems(Table.ColumnNames(Source),{columnName & "_"}),
distinctRowsAsTable = Table.FromList(List.Distinct(Table.Column(Source,columnName & "_")),null,{columnName & "_"}),
newTable = List.Accumulate(columnNames, distinctRowsAsTable, (state, current) => Table.AddColumn(state, current, each "")),
sortedData = Table.Sort(Table.Combine({Source,newTable}),{columnName & "_",nextSortColumnName}),
result = Table.RemoveColumns(sortedData, {columnName & "_"})
in
result
Thanks! Thanks, hard to believe that Power Query can't figure out the fixed-width delimiters.
Thank you, this is great! I have been trying to solve this for over five years!
Thanks Mr Excel! Power Query continues to amaze me with all its capabilities
yeah, and this is one of the most basic barly automated solution with a lot of manual inputs.
Imagine what is possible when all is automated!
You just made my life so much easier. Thank you!!!!
Very cool! Thank you!! Working with the same kind of csv file, is there a way do delete all rows after a cell with a specific text value? The text may appear in several rows, but I'd like to delete everything after the first instance.
Hahaha this looks like the output from out 'new' system. I just spent an hour making a 'companion' table to the data using offsets and =MID only to find there's a secondary row "sometimes" that has important data as they have offset rows of data.
I'm going to use these tips to import the sheet two ways, one for the main row, one for the offset rows, then recombine them I think..... Thanks MrExcel!
Fill down with null values is cool feature
Right! Easier that Home, Find & Select, Go To Special, Blanks, OK, = UpArrow Ctrl+Enter!
Thank you Mr Excel for this amazing video. 👏
That is amazing. thank you for the new information.
Thank you this worked out ...
Nice
This was amazing . But my file has headers spread into 2 rows and data is 2 rows seperated by a blank row . it runs into many pages . how to clean it ?
Amazing ❤
Hey, Thanks for the Video, It was very helpful. But I couldn't automate the report with this solution, As I am getting different position of column every day, Any idea for that??
I do not. But Ken Puls might have a solution. Google him.
Excellent Thanks!!!! 🤩
I LOVE YOU, thank you
Empty row after each Name? No problem😁
WithEmptyRow = Table.Combine(Table.Group( MrExcelLastStep, {"Name"}, {{"tbl", each _ & #table({"Name"}, {{null}}), type table}})[tbl])
in
WithEmptyRow
Thanks Mr Excel!!! :)
Can I have this file?
I would love to play around with it in PQ
Yes. Just went to your About tab but no email there. Send an email to me (it’s on the About tab) and I will send you the txt file.
Hey Mr.Excel,
You can insert the blank row in Power Query quite easily:
let
Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
DuplicateName = Table.DuplicateColumn(Source, "Name", "Name2"),
Combine = Table.Combine({DuplicateName,Table.Distinct(DuplicateName[[Name2]])}),
SortRows = Table.Sort(Combine,{{"Name2", Order.Ascending}, {"Value", Order.Ascending}}),
RemoveColumnRow = Table.Skip(Table.RemoveColumns(SortRows,{"Name2"}),1)
in
RemoveColumnRow