Different Ways to Iterate Over Rows in Pandas DataFrame | GeeksforGeeks

  • Published 10 Jan 2025

COMMENTS • 18

  • @GeeksforGeeksVideos
    @GeeksforGeeksVideos 3 years ago

    Apply For Video Internship Program - script.geeksforgeeks.org/on-boarding/youtube

  • @mansinghchauhan4389
    @mansinghchauhan4389 2 years ago +1

    This video helped a lot. It was my first day trying pandas; I am no expert, but as a beginner I think this is very useful.

  • @Sunil_Manohar
    @Sunil_Manohar 5 months ago

    Very useful. Thanks.

  • @maxmax0
    @maxmax0 1 year ago +1

    Someone mentioned using Vectorize, but I suspect it might not be good for large DataFrames because of the overhead of converting a DF into a NumPy array. Not tested yet.

  • @johngwheeler
    @johngwheeler 2 years ago +2

    This is useful but you haven’t answered the fundamental question of when I should use each method! Which is most efficient?

    • @DrewLevitt
      @DrewLevitt 2 years ago +1

      If you search for "pandas iterate speed comparison," you can find info on the relative speeds of these methods. But the deeper point is that all of these methods are MUCH slower than vectorized (columnwise) operations, which you should use pretty much whenever possible when writing pandas code. This video does viewers a huge disservice by not pointing out that row-wise iteration should be treated as a last-resort option for when you really can't vectorize your particular task.
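
      The contrast described above can be sketched as follows. This is a minimal illustration, not a benchmark: the DataFrame and its column names (`price`, `qty`) are made up for the example, and it only shows that the two styles compute the same result.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"price": [10.0, 20.0, 30.0], "qty": [1, 2, 3]})

# Row-wise: a Python-level loop over iterrows().
totals_loop = []
for _, row in df.iterrows():
    totals_loop.append(row["price"] * row["qty"])

# Vectorized: one columnwise expression, no explicit loop.
totals_vec = df["price"] * df["qty"]

print(totals_vec.tolist())  # [10.0, 40.0, 90.0]
```

      Both approaches give the same totals; the vectorized form simply leaves the looping to pandas instead of running it in Python.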

  • @DrewLevitt
    @DrewLevitt 2 years ago +6

    So many problems with this video. One big one is that you never explain WHY you would want to iterate over the rows of a DF - but even bigger than that is that you never explain that you usually SHOULDN'T loop over the rows! It is WAY slower to iterate over rows than to do columnar (vectorized) operations, and indeed, learning to think in vectorized steps is a key to getting good at pandas.
    Then there are sloppy details... for instance, in the .loc section, you iterate over range(len(df)) instead of df.index. This only works in your example because df happens to have a RangeIndex. To be safe with .loc, you should iterate over df.index itself, or else use .iloc if you're going to iterate over a range.
    I suppose for someone getting started with pandas, this video could be helpful insofar as it shows the basic syntax and options for row-wise operations, but you are doing that inexperienced viewer a disservice by failing to discuss the WHY and the WHY NOT around the whole topic.
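
    The .loc point can be sketched as below, assuming a hypothetical DataFrame whose index is a set of string labels rather than a RangeIndex (the `score` column and the labels are invented for illustration).

```python
import pandas as pd

# Hypothetical frame with a non-default (string) index.
df = pd.DataFrame({"score": [90, 85, 70]}, index=["a", "b", "c"])

# Label-based access: iterate over the index labels themselves.
by_label = [df.loc[label, "score"] for label in df.index]

# Position-based access: iterate over a range and use .iloc.
by_position = [df["score"].iloc[i] for i in range(len(df))]

# Mixing the two (range positions with .loc) fails here,
# because 0 is not one of the index labels.
range_with_loc_failed = False
try:
    df.loc[0, "score"]
except KeyError:
    range_with_loc_failed = True

print(by_label, by_position, range_with_loc_failed)
```

    With a default RangeIndex the labels and positions coincide, which is why the range-based loop appears to work in that one case.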

    • @cxl2377
      @cxl2377 2 years ago

      Hello, is there some kind of online documentation that can provide me with the info that you mentioned? I'm just a beginner.

    • @DrewLevitt
      @DrewLevitt 2 years ago +4

      @cxl2377 Good question! I tried to reply with some links but it looks like the comment got deleted. Search for "pandas iterrows vs apply vs vectorized" and you will find many articles discussing this topic. Hope this helps...

    • @cxl2377
      @cxl2377 2 years ago

      @DrewLevitt Yes, thank you! Appreciate it!!

    • @maxmax0
      @maxmax0 1 year ago

      It might not be true, considering the overhead of converting a large DF into a NumPy array.

  • @kishorep3668
    @kishorep3668 2 years ago

    Is lambda the fastest method? Can we get multiple columns using the lambda method?
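
    On the second question, a lambda passed to apply with axis=1 receives each row as a Series, so it can read several columns at once. The sketch below assumes a hypothetical DataFrame (the `first`, `last` and `year` columns are invented). It says nothing about speed: apply with axis=1 still calls the function once per row.

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "first": ["Ada", "Alan"],
        "last": ["Lovelace", "Turing"],
        "year": [1815, 1912],
    }
)

# axis=1 passes one row at a time; the lambda reads three columns from it.
labels = df.apply(
    lambda row: f"{row['first']} {row['last']} ({row['year']})",
    axis=1,
)

print(labels.tolist())  # ['Ada Lovelace (1815)', 'Alan Turing (1912)']
```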

  • @naajidsarosh2881
    @naajidsarosh2881 2 years ago +1

    Bro, in the 2nd method you can just write df.loc[index, ['col1', 'col2', 'col3']] rather than writing df.loc again and again.
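
    A minimal sketch of this suggestion, assuming a hypothetical DataFrame with columns `col1`, `col2` and `col3`. Note that .loc is indexed with square brackets rather than called with parentheses, and the loop runs over df.index so the labels are always valid.

```python
import pandas as pd

# Hypothetical data matching the column names in the comment.
df = pd.DataFrame({"col1": [1, 2], "col2": ["x", "y"], "col3": [0.5, 1.5]})

rows = []
for index in df.index:
    # One .loc lookup returns all three values for this row as a Series,
    # which can be unpacked directly.
    col1, col2, col3 = df.loc[index, ["col1", "col2", "col3"]]
    rows.append((col1, col2, col3))

print(rows)
```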

  • @dorapomaa2273
    @dorapomaa2273 1 year ago

    Please I can't see the fonts clearly

  • @muraliaxonifytech2160
    @muraliaxonifytech2160 2 years ago

    super

  • @Purity12
    @Purity12 1 year ago

    In the 2nd method, loc uses the index labels. In this example the index was a range; otherwise there would have been an error. In my CSV file the index was a datetime object, so I got an error. The 2nd example should have used df.index in the loop. Not a proper video and explanation.

  • @Harshji11
    @Harshji11 3 years ago

    Hi