This video helped a lot. It was my first day trying pandas; I am no expert, but as a beginner I think this is very useful.
Very useful. Thanks.
Someone mentioned using Vectorize, but I suspect it might not be good for large DataFrames because of the overhead of converting a DataFrame into a NumPy array. I haven't tested this yet.
This is useful but you haven’t answered the fundamental question of when I should use each method! Which is most efficient?
If you search for "pandas iterate speed comparison," you can find info on the relative speeds of these methods. But the deeper point is that all of these methods are MUCH slower than vectorized (columnwise) operations, which you should use pretty much whenever possible when writing pandas code. This video does viewers a huge disservice by not pointing out that row-wise iteration should be treated as a last-resort option for when you really can't vectorize your particular task.
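To make the point above concrete, here is a minimal sketch comparing a row-wise loop with the equivalent columnwise expression. The DataFrame and its column names (price, qty) are made up for illustration and are not from the video.

```python
import pandas as pd

# Hypothetical example data; the column names are assumptions for illustration.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-wise: visit each row with iterrows and build the result one value at a time.
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["qty"])

# Vectorized: a single columnwise expression computes every row at once.
totals_vec = df["price"] * df["qty"]

# Both approaches produce the same values.
assert totals_vec.tolist() == totals_loop
```

Both give the same answer; the difference is that the vectorized version hands the whole column to pandas in one step instead of looping in Python. Actual speed differences depend on your data, so time it on your own DataFrame.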
So many problems with this video. One big one is that you never explain WHY you would want to iterate over the rows of a DF - but even bigger than that is that you never explain that you usually SHOULDN'T loop over the rows! It is WAY slower to iterate over rows than to do columnar (vectorized) operations, and indeed, learning to think in vectorized steps is a key to getting good at pandas.
Then there are sloppy details... for instance, in the .loc section, you iterate over range(len(df)) instead of df.index. This only works in your example because df happens to have a RangeIndex. To be safe with .loc, you should iterate over df.index itself, or else use .iloc if you're going to iterate over a range.
I suppose for someone getting started with pandas, this video could be helpful insofar as it shows the basic syntax and options for row-wise operations, but you are doing that inexperienced viewer a disservice by failing to discuss the WHY and the WHY NOT around the whole topic.
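Here is a small sketch of the .loc point above, assuming a DataFrame whose index is dates rather than a RangeIndex. The data and the column name (value) are hypothetical.

```python
import pandas as pd

# Hypothetical data with a DatetimeIndex instead of the default RangeIndex.
df = pd.DataFrame(
    {"value": [1, 2, 3]},
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
)

# Label-based access: iterate over df.index itself and use .loc.
by_label = [df.loc[label, "value"] for label in df.index]

# Position-based access: iterate over a range and use .iloc.
value_pos = df.columns.get_loc("value")
by_position = [df.iloc[i, value_pos] for i in range(len(df))]

# Mixing the two (integer positions with .loc) fails here because
# 0 is not a label in this index. The exact exception type may vary
# by pandas version, so both likely types are caught.
try:
    df.loc[0, "value"]
    loc_with_position_failed = False
except (KeyError, TypeError):
    loc_with_position_failed = True
```

With a default RangeIndex the labels happen to equal the positions, which is why range(len(df)) with .loc appears to work in the video's example.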
Hello, is there some kind of online documentation that can provide me with the info that you mentioned? I'm just a beginner.
@cxl2377 Good question! I tried to reply with some links but it looks like the comment got deleted. Search for "pandas iterrows vs apply vs vectorized" and you will find many articles discussing this topic. Hope this helps...
@DrewLevitt yes thank you! appreciate it!!
That might not be true, considering the overhead of converting a large DataFrame into a NumPy array.
Is lambda the fastest method? Can we get multiple columns using the lambda method?
Bro, in the 2nd method you can just write df.loc[index, ['col1', 'col2', 'col3']] rather than writing df.loc again and again
And, same in other methods too
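A quick sketch of that suggestion, assuming a DataFrame with columns named col1, col2 and col3 (placeholder names, as in the comment above):

```python
import pandas as pd

# Hypothetical data; col1/col2/col3 are placeholder column names.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6]})
cols = ["col1", "col2", "col3"]

rows = []
for index in df.index:
    # One .loc lookup returns a Series holding all three columns for this row.
    col1, col2, col3 = df.loc[index, cols]
    rows.append((col1, col2, col3))
```

Note the square brackets: .loc is an indexer, not a method, so it is used as df.loc[...] rather than df.loc(...).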
Please I can't see the fonts clearly
super
The 2nd method uses .loc, which looks up data by index label. In this example the index was a range; otherwise it would have raised an error. In my CSV file the index was a datetime object, so I got an error. The 2nd example should have used df.index in the loop ..... Not a proper video or explanation.
Hi