Love it. I am so used to this sort of thing from writing in SQL (and adodb in Excel VBA). I've seen your example of duplicates on many an occasion. There are two options, the first is to do exactly as you did and the second is to leave the duplicate in place with the incorrect value, but with conditional formatting to highlight. Because what you've done isn't an xlookup, it's a table join and they're not the same thing (unfortunately). A table join requires a unique key, and your dataset didn't have one. That's because your second table isn't a list of customers (or accounts), it's a list of customers by region. The two are not interchangeable.
To add to the above. It's often very difficult to explain this data issue to an accountant, manager or other. They'll argue that it's perfectly possible that a customer can belong to multiple regions, and that's likely correct. That being the case, their customer revenue report, with added region will always be wrong unless you're splitting the revenue by region too. In one interpretation, you'll be assigning revenue twice, another is that you're assigning total revenue from one customer to a single region, thus overstating the region's revenue. The answer is to go back to the person asking for the report and ask what they want to happen. Perhaps the correct answer will be that your customer by region table is recalculated to be the customer by assigned region, where the assigned region is based on sales volumes, geographical location or other such measure. It also might be that you extract the data in your left hand table with more granular detail, that allows you to create a join on a different field that splits revenue by customer and region. Certainly just picking one is arbitrary (alphabetical, probably), but may also be sufficient.
Very nice! These days, if the lookup table has dupes I'd be tempted to use something like FILTER and TEXTJOIN to return both / all matches (so e.g. Access Analytic would should up as Consulting, Retail). Of course, you'll need to work with that column a little differently later if you do... If you wanted to do the same thing in Python, I'm fairly sure (haven't tried yet!) that you could group the lookup table by customer and return all sectors so you could then use it for a merge.
Joins always duplicate rows if multiple matches, be it SQL, PowerQuery, or Python. The expectation being that you know the granularity and modify it before joining whenever needed, like you did. I will sometimes do a group by instead of remove duplicates. This lets me aggregate (textjoin/concat) the adjacent text column rather than choose the first or last arbitrarily. Not always the right option, but an option. In PowerQuery this would look something like: =Table.Group(Table, "Customer", { {"Sectors", each Text.Combine(List.Sort([Sector]), ", "), type text}, {"Count", each Table.RowCount(_), Int64.Type} })
I would use a unique lookup value/table. This may save a lot of work. Also, I would write the name of the lookup column "customer" twice; (good practice) to ring a bill if someone changes the column's title. One option is to create a function to replace extra spaces or to replace undesired titles with the standard header titles). Another option is to try to create dynamic header names (create a function that determines the header and uses it), but it would be tedious, complicated, and have its own weak points.
Yes, something like: nums = [1,2,3,4] total = 0 for i in nums: total+= i total Obviously, not necessary to calculate the sum of a column by using a loop, but just as a sample to loop through rows and calculate the total of the Value columns in a DataFrame (range): df = xl("A1:B4", headers=True) total = 0 for i in df.index: total+= df["Value"][i] total
Hmmm... Python is completely new to me, and I'm slightly mystified about what the point of it is in this situation. If you're avoiding using Excel lookup functions (and Power Query) here, it seems to me that you're just using the Excel worksheet structure as a 'shell' for 'doing' Python? Which begs the question, where else can you 'do' Python?
We're opening a new chapter with this PY for excel ❤🎉😊 what a time to be alive 🙌🙏👏
🥳🥳🥳
The most important thing now ... is that "this excel file is a treasure for study and learn" .... thanks Mr.Excel
Love it. I am so used to this sort of thing from writing in SQL (and adodb in Excel VBA). I've seen your example of duplicates on many an occasion. There are two options, the first is to do exactly as you did and the second is to leave the duplicate in place with the incorrect value, but with conditional formatting to highlight.
Because what you've done isn't an xlookup, it's a table join and they're not the same thing (unfortunately). A table join requires a unique key, and your dataset didn't have one. That's because your second table isn't a list of customers (or accounts), it's a list of customers by region. The two are not interchangeable.
To add to the above. It's often very difficult to explain this data issue to an accountant, manager or other. They'll argue that it's perfectly possible that a customer can belong to multiple regions, and that's likely correct. That being the case, their customer revenue report, with added region will always be wrong unless you're splitting the revenue by region too. In one interpretation, you'll be assigning revenue twice, another is that you're assigning total revenue from one customer to a single region, thus overstating the region's revenue. The answer is to go back to the person asking for the report and ask what they want to happen. Perhaps the correct answer will be that your customer by region table is recalculated to be the customer by assigned region, where the assigned region is based on sales volumes, geographical location or other such measure. It also might be that you extract the data in your left hand table with more granular detail, that allows you to create a join on a different field that splits revenue by customer and region. Certainly just picking one is arbitrary (alphabetical, probably), but may also be sufficient.
Very nice!
These days, if the lookup table has dupes I'd be tempted to use something like FILTER and TEXTJOIN to return both / all matches (so e.g. Access Analytic would should up as Consulting, Retail). Of course, you'll need to work with that column a little differently later if you do...
If you wanted to do the same thing in Python, I'm fairly sure (haven't tried yet!) that you could group the lookup table by customer and return all sectors so you could then use it for a merge.
Joins always duplicate rows if multiple matches, be it SQL, PowerQuery, or Python. The expectation being that you know the granularity and modify it before joining whenever needed, like you did. I will sometimes do a group by instead of remove duplicates. This lets me aggregate (textjoin/concat) the adjacent text column rather than choose the first or last arbitrarily. Not always the right option, but an option. In PowerQuery this would look something like:
=Table.Group(Table, "Customer", {
{"Sectors", each Text.Combine(List.Sort([Sector]), ", "), type text},
{"Count", each Table.RowCount(_), Int64.Type}
})
I would use a unique lookup value/table. This may save a lot of work.
Also, I would write the name of the lookup column "customer" twice; (good practice) to ring a bill if someone changes the column's title.
One option is to create a function to replace extra spaces or to replace undesired titles with the standard header titles).
Another option is to try to create dynamic header names (create a function that determines the header and uses it), but it would be tedious, complicated, and have its own weak points.
Thanks Bill for this amazing video.
I need to learn more in depth about phyton excel
I am eagerly waiting for the python option in my excel. Thank you..
It just arrived in my Excel today. Update your Office, perhaps your Office is in this week's update batch.
Nice example. Thanks!
Hi great videо. Have a question-can df table be defined from several sheets, having same column titles but located in two or more sheets? Thanks
Interesting 🤔
can python in excel do looping like
for..
do ...while
while ..
etc..
Yes, something like:
nums = [1,2,3,4]
total = 0
for i in nums:
total+= i
total
Obviously, not necessary to calculate the sum of a column by using a loop, but just as a sample to loop through rows and calculate the total of the Value columns in a DataFrame (range):
df = xl("A1:B4", headers=True)
total = 0
for i in df.index:
total+= df["Value"][i]
total
is it worth it to do this type of x lookup? I don't see the benefits
All is revealed at the end of the video
Hmmm... Python is completely new to me, and I'm slightly mystified about what the point of it is in this situation. If you're avoiding using Excel lookup functions (and Power Query) here, it seems to me that you're just using the Excel worksheet structure as a 'shell' for 'doing' Python? Which begs the question, where else can you 'do' Python?