Solving 100 Python Pandas Problems! (from easy to very difficult)

  • Published 31 Dec 2024

COMMENTS • 119

  • @KeithGalli
    @KeithGalli  8 months ago +74

    Woo!!! 5 hours of Pandas practice, what could be better. Hope you all enjoy!

    • @jonpounds1922
      @jonpounds1922 8 months ago +2

      Will watch this on repeat until I am an expert. Thank you.

    • @KeithGalli
      @KeithGalli  8 months ago

      @jonpounds1922 haha my man 💪

    • @simonmasters3295
      @simonmasters3295 8 months ago

      It will not make you an expert.
      Consider the examples trivial. Take the average age of each animal (a SQL "group by", for instance): try doing it this way with a million animals while computing the variance at the same time.

    • @PaYaMv2
      @PaYaMv2 8 months ago

      You are a sight for sore programming eyes Keith! We cannot thank you enough for this!!

    • @foland2619
      @foland2619 8 months ago

      Awesome work and skills Keith. Thank you, great effort

  • @JatinKumar-cn9wt
    @JatinKumar-cn9wt 8 months ago +84

    Are you crazy, man? A 5-hour+ course only for pandas. Man, your dedication to teaching is amazing.

    • @KeithGalli
      @KeithGalli  8 months ago +12

      I appreciate the support!

  • @JW-pu1uk
    @JW-pu1uk 8 months ago +7

    Dude I freaking LOVE your content. I am so stoked to see this video and have it bookmarked for the rest of my data science career lol

    • @KeithGalli
      @KeithGalli  8 months ago

      haha love that! Glad you like the content :)

  • @ngoclinhvu5381
    @ngoclinhvu5381 8 months ago +24

    5 hours of pandas puzzles??? Just what I need!

    • @ngoclinhvu5381
      @ngoclinhvu5381 8 months ago

      never thought I'd ever say that in my life tbh

    • @KeithGalli
      @KeithGalli  8 months ago +2

      @ngoclinhvu5381 Haha very fair. I found these exercises very educational for me personally, so I hope that you do as well!

  • @d.g0101
    @d.g0101 8 months ago +7

    This is what I was looking for.
    Thanks Keith

  • @raphaelmatthew5165
    @raphaelmatthew5165 8 months ago +27

    Please guys give this video a like if you haven't, it takes a lot of work to create such a masterpiece. Welcome back Keith🎉.

  • @rakshitshukla4205
    @rakshitshukla4205 1 month ago +1

    That small dance at 3:37:27 was a pleasant surprise out of nowhere. You are amazing :D

    • @KeithGalli
      @KeithGalli  1 month ago

      Hahahah glad you enjoyed that 😂

  • @rrrprogram8667
    @rrrprogram8667 5 months ago +1

    Subscribed.... the way you are doing is a genuine way of making mistakes and then learning

  • @klausditrich7323
    @klausditrich7323 8 months ago +3

    I'm too old for all the Minecraft or Fortnite streams, so here I am, and loving it :-)

  • @nabinkoirala5054
    @nabinkoirala5054 5 months ago +1

    you are so genuine and humble!

  • @soroushnazari5596
    @soroushnazari5596 8 months ago +1

    Great to have you back Keith! Going to watch it over the next couple of days and it’s gonna be my sort of bible I guess for future reference haha

    • @KeithGalli
      @KeithGalli  8 months ago

      Love that! I found the exercises very educational myself.

  • @Kidpambi
    @Kidpambi 8 months ago

    It is great to have you back teaching 🎉

  • @vedantlssj2
    @vedantlssj2 6 months ago

    This is the kind of content that makes YouTube the great source of learning it is!

  • @chetan8577
    @chetan8577 8 months ago

    This video would be really helpful.
    Keep up the great work!😊

  • @passportbro904
    @passportbro904 1 month ago

    I'm so happy you had to Google something easy so quickly and I knew the answer 😂 It made me feel that I'm really learning and that it's OK to seek help if needed lol. Amazing video. Thank you

  • @JudyLuHe
    @JudyLuHe 5 months ago

    Hi Keith, I genuinely appreciate you solving all these pandas problems. I am not sure if you already have, but I was wondering if you could also do one on the 100 NumPy problems? Again, thanks for your work.

  • @AgustinGonzalez-tz3yr
    @AgustinGonzalez-tz3yr 8 months ago +3

    19:30 I do this a lot by passing a dict to the agg function after grouping (it lets you assign multiple operators to several columns at once). E.g. df.groupby("animal").agg({"age": "mean"})

    • @KeithGalli
      @KeithGalli  8 months ago +2

      This is super useful, thank you for the tip!

    • @thiagosiqueira4690
      @thiagosiqueira4690 7 months ago

      this works too, df.groupby('animal').mean('age')

    • @AgustinGonzalez-tz3yr
      @AgustinGonzalez-tz3yr 7 months ago

      @thiagosiqueira4690 I think that doesn't work for grouping by multiple columns and applying a specific function to each column
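
Editor's sketch of the dict-based agg pattern discussed in this thread. The sample data and the visits column are made up for illustration; only the animal and age column names come from the comments.

```python
import pandas as pd

# Hypothetical sample data; "visits" is an invented second column.
df = pd.DataFrame({
    "animal": ["cat", "cat", "dog", "dog", "snake"],
    "age": [2.0, 4.0, 5.0, 7.0, 3.0],
    "visits": [1, 3, 2, 2, 1],
})

# A dict maps each column to its own aggregation (or list of aggregations).
summary = df.groupby("animal").agg({"age": "mean", "visits": ["sum", "max"]})

# For one column and one function, selecting the column first is enough.
mean_age = df.groupby("animal")["age"].mean()

print(summary)
print(mean_age)
```

Because one of the dict values is a list, the result columns form a two-level index such as ("visits", "sum").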

  • @conykuo4308
    @conykuo4308 7 months ago

    5 hours pandas video is crazyyyy. Must give a thumb up!

  • @ryandavis280
    @ryandavis280 7 months ago +1

    OMG keith you are a lifesaver! thank you!

  • @sarveshpadav2881
    @sarveshpadav2881 8 months ago +1

    24:27 We can use groupby to count the animals in the following way...
    df.groupby('animal')['animal'].count()
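
Editor's sketch comparing the groupby count above with value_counts on the same column. The data is hypothetical; only the animal column name comes from the comment.

```python
import pandas as pd

# Hypothetical data with a single "animal" column.
df = pd.DataFrame({"animal": ["cat", "dog", "cat", "snake", "dog", "cat"]})

# Count rows per animal via groupby, as in the comment above.
by_group = df.groupby("animal")["animal"].count()

# value_counts gives the same counts, ordered by frequency instead of by label.
by_value = df["animal"].value_counts()

print(by_group)
print(by_value)
```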

  • @abdulbasitnisar
    @abdulbasitnisar 7 months ago

    Thank you so much!! What you are doing is actually life-changing for people like me who are self-learning this! Thank you!!!

  • @Leomessii99
    @Leomessii99 1 month ago

    Just came here to say a big THANK YOU 🙌🏻🙌🏻

  • @renatolippi
    @renatolippi 8 months ago +1

    Excellent! Thank you very much for this video!! Please more with this format 👏

  • @chandrasekars8904
    @chandrasekars8904 8 months ago

    This is really an excellent channel on Python like "techie talkee"

  • @gamersgame43
    @gamersgame43 5 months ago

    1:23:56 Here's the updated pandas code for question 27: df.groupby(['grps'])['vals'].nlargest(3).groupby(level=0).sum()

  • @ameyb9241
    @ameyb9241 5 months ago

    Thanks Keith! This is so goodd

  • @rodrigo100kk
    @rodrigo100kk 5 months ago

    At 33:57 - "22) Filter duplicate integers" - Might as well try: pd.DataFrame(data=df['A'].unique(),columns=['A'])

  • @tdcode
    @tdcode 7 months ago

    Man, you're crazy 🤣🤣🤣🤣🤣🤣🤣🤣🤣. This is awesome! Thanks for a colossal and great video!!!🎉🎉

  • @paraglide01
    @paraglide01 6 months ago

    Thanks man, I was just looking for getting into Pandas.

  • @programmer1010
    @programmer1010 3 months ago

    50:08 In problem 23 they used df.sub so they could specify the axis and make the subtraction row-wise.
    I don't know how the - operator behaves: does it always work row-wise, always column-wise, or does it choose each time based on the input?
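
Editor's sketch for the question above, using a small made-up frame. With a DataFrame on the left and a Series on the right, the - operator matches the Series index against the column labels, which is why subtracting row means needs sub with an explicit axis.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x4 frame whose row means and column means differ.
df = pd.DataFrame(np.arange(12, dtype=float).reshape(3, 4), columns=list("abcd"))

col_means = df.mean()        # one value per column, indexed by column label
row_means = df.mean(axis=1)  # one value per row, indexed by row label

# The - operator aligns the Series with the columns,
# which is the same as sub(..., axis="columns").
same_as_operator = (df - col_means).equals(df.sub(col_means, axis="columns"))

# To subtract each row's own mean, align on the row index instead.
centered = df.sub(row_means, axis="index")

print(same_as_operator)
print(centered)
```

After the row-wise subtraction, every row of centered has a mean of zero, which is a quick way to check the result.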

  • @Soso65929
    @Soso65929 8 months ago

    more of this buddy enjoyed each second

  • @mubasshiraquraishi9328
    @mubasshiraquraishi9328 15 days ago

    For question 27, this is something I find easy to code and understand:
    df = df.sort_values(by=['grps', 'vals'], ascending=[True, False])
    df = df.groupby('grps').head(3).reset_index(drop=True)
    df.groupby('grps')['vals'].sum()
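
Editor's sketch checking that the two question 27 approaches from the comments (nlargest per group, and sort then head) give the same sums. The data is made up; the grps and vals column names follow the comments.

```python
import pandas as pd

# Hypothetical data; group "c" deliberately has fewer than three rows.
df = pd.DataFrame({
    "grps": list("aaabbbbcc"),
    "vals": [10, 40, 30, 5, 20, 15, 25, 7, 9],
})

# Option 1: take the three largest values per group, then sum per group label.
top3_nlargest = df.groupby("grps")["vals"].nlargest(3).groupby(level=0).sum()

# Option 2: sort descending within each group, keep the first three rows, sum.
ordered = df.sort_values(by=["grps", "vals"], ascending=[True, False])
top3_sorted = ordered.groupby("grps").head(3).groupby("grps")["vals"].sum()

print(top3_nlargest)
print(top3_sorted)
```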

  • @rohitsharma-mg7hd
    @rohitsharma-mg7hd 8 months ago

    Another solution to puzzle 26:
    for i in [0,1,2,3,4]:
        a=df.iloc[i].sort_values(ascending=True).index[7]
        print (a)
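
Editor's sketch of a loop-free alternative, assuming puzzle 26 asks for the column that holds the third NaN in each row (the puzzle text is not quoted in this thread). The frame below is made up. Counting NaNs along each row avoids depending on how sorting orders the missing entries.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Hypothetical data; every row holds at least three NaN values.
df = pd.DataFrame(
    [
        [0.1, nan, nan, 0.4, nan, 0.6],
        [nan, nan, 0.3, nan, 0.5, nan],
        [nan, 0.2, nan, 0.4, 0.5, nan],
    ],
    columns=list("abcdef"),
)

# Running count of NaNs along each row; idxmax returns the first column
# where that count reaches 3.
third_nan = (df.isna().cumsum(axis=1) == 3).idxmax(axis=1)

print(third_nan)
```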

  • @souravbarua3991
    @souravbarua3991 7 months ago

    Thank you for making such wonderful videos on Python. 🙏 Please make some videos on PySpark also.

  • @Ghost____Rider
    @Ghost____Rider 4 months ago +2

    Great video! However I believe your solution for q23 was wrong. You subtracted the mean of the entire DataFrame instead of the mean of each row. It worked for your example of np.ones because the entire DataFrame had the same mean as the mean of each row (a mean of 1). You want a solution that subtracts a different value for each row, namely the row's mean.
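
Editor's sketch of why an all-ones test frame can hide this kind of mistake. The helper names are hypothetical. On np.ones, centering by column means and centering by row means both produce all zeros, so the two only diverge on data where the means differ.

```python
import numpy as np
import pandas as pd

def column_centered(frame):
    # Subtracts each column's mean: what a bare frame - frame.mean() computes.
    return frame - frame.mean()

def row_centered(frame):
    # Subtracts each row's own mean from the elements of that row.
    return frame.sub(frame.mean(axis=1), axis="index")

ones = pd.DataFrame(np.ones((5, 3)))
varied = pd.DataFrame(np.arange(15, dtype=float).reshape(5, 3))

# On all-ones data both approaches agree; on varied data they do not.
ones_agree = column_centered(ones).equals(row_centered(ones))
varied_agree = column_centered(varied).equals(row_centered(varied))

print(ones_agree, varied_agree)
```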

  • @vishalcrazy5121
    @vishalcrazy5121 7 months ago

    Thank you for this, Keith.

  • @loganmclaughlin1288
    @loganmclaughlin1288 8 months ago +1

    Love the long form !

  • @balajijadhav6080
    @balajijadhav6080 5 months ago

    Thank you so much sir for this pandas session

  • @second1799
    @second1799 7 months ago

    This is awesome! can you do for other libraries too please!!!

  • @APP-ld6jf
    @APP-ld6jf 8 months ago

    Looking forward to numpy puzzles now!!

  • @rrrprogram8667
    @rrrprogram8667 5 months ago

    Blindly subscribed

  • @NuanceWebsites
    @NuanceWebsites 7 months ago

    Bro, you are a genius!!!

  • @NavjotSidhu-qo6ry
    @NavjotSidhu-qo6ry 4 months ago

    Man you are crazy....but amazing... !! love from india.. !!

  • @tameemalkhliefat3036
    @tameemalkhliefat3036 19 days ago

    Is it OK if I am struggling with the questions that are labeled hard, or am I supposed to be able to answer all of them without looking functions up?

    • @KeithGalli
      @KeithGalli  18 days ago +1

      Very normal! I'm always looking things up in real-world projects. I would say that you should be able to get to the point where you can answer the problems without looking up the specific answer, but by looking up the functions that will help you implement the logic to get to the answer.

  • @StalkedHuman
    @StalkedHuman 8 months ago

    What do you think is a good method to concatenate a string value from one DataFrame column to another DataFrame column by index key? For example, concatenate df_1 rows 10, 20, 26, 30, 40, column 5 (string) to df_2 rows 9, 19, 25, 29, 39, column 1?

  • @tasmisa6778
    @tasmisa6778 3 months ago

    Thanks dude🎉it is helpful

  • @chandanirohan
    @chandanirohan 4 months ago

    in your solution no 24, the operation df - df.mean(axis=1) won't work directly because the dimensions won't align

  • @AndyJagroom-ur7xh
    @AndyJagroom-ur7xh 5 months ago +1

    Can you do the numpy one also

  • @dianapestriaeva9853
    @dianapestriaeva9853 8 months ago

    pure gold 🤩

  • @GugheleLavoro
    @GugheleLavoro 8 months ago +2

    The answer to question 23 is incorrect: it says the ROW mean, while df.mean() gives you the column mean.

  • @Aya_Chagra
    @Aya_Chagra 8 months ago

    Very Nice ⚡thanks a lot ⚘

  • @chamaraweerasinghe5836
    @chamaraweerasinghe5836 8 months ago

    Thanks Lot Keith...😘

  • @jdmcivicrrr
    @jdmcivicrrr 8 months ago

    This is awesome. Thanks. ❤

    • @KeithGalli
      @KeithGalli  8 months ago

      You are very welcome!

  • @yaj_at8787
    @yaj_at8787 7 months ago

    thanks for this...needed

  • @AchuVlogs
    @AchuVlogs 8 months ago

    Awesome!! Can you please make one for Pyspark? :)

  • @jehfodelrey
    @jehfodelrey 8 months ago

    Thanks for this. Do you know some app or website, to practice or improve my scripting skills?

  • @ricklove8358
    @ricklove8358 5 months ago

    I think I have a better solution for filtering rows which contain the same integer as the row above:
    import pandas as pd
    df = pd.DataFrame({'A': [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7]})
    def precedingDuplicateCheck(row):
        # The first row has nothing above it, so it is never a duplicate.
        if row.name == 0:
            return False
        # Compare with the value in the row directly above.
        prev = df.loc[row.name - 1, 'A']
        return prev == row['A']
    df_new = df[~df.apply(precedingDuplicateCheck, axis=1)]
    df_new

  • @FIBONACCIVEGA
    @FIBONACCIVEGA 8 months ago

    Such good videos!!!

  • @aloSolo
    @aloSolo 8 months ago

    You look great 🎉 and thanks for posting this video.

  • @chalijutt3478
    @chalijutt3478 6 months ago

    Could somebody explain q23? I think the way he does it is wrong: df.mean() gives the mean of each column, not each row, and the subtraction then removes each column's mean from that column. We have to use df.mean(axis=1) and then take care of the axis in the subtraction as well.
    I did it like this: df.subtract(df.mean(axis=1), axis="index").multiply(-1). Please correct me if I am wrong.

    • @Ghost____Rider
      @Ghost____Rider 4 months ago

      His solution is wrong, I agree, but not for the reason you say. His solution calculated the mean of the entire DataFrame but he should've subtracted the mean for each row (axis=0).
      Edit: Never mind, I agree with you.

  • @nacef7606
    @nacef7606 8 months ago

    For the 22nd quiz it could be done as simply as:
    for i,k in df.items():
        df=set(k)

    • @KeithGalli
      @KeithGalli  8 months ago +3

      A couple of issues with the set solution.
      One is that a set is not guaranteed to preserve the order in which numbers are inserted into it, so even though it worked in this example, if we changed the numbers to [3,3,2,2,1,1,8,8,9], the output would show {1,2,3,8,9} instead of [3,2,1,8,9].
      Another issue with the set solution is that if a number reappears later on in the list, it will disregard it from the solution. So if we had [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7, 2, 2] as our input, your solution would output {1,2,3,4,5,6,7} while the correct solution would be [1,2,3,4,5,6,7,2].
      Hope this is helpful!
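
Editor's sketch of the point made in the reply above, using the same input list. The shift-based filter is one common way to drop only consecutive repeats; it is not necessarily the approach used in the video.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7, 2, 2]})

# Keep a row only when its value differs from the row directly above it.
deduped = df.loc[df["A"].shift() != df["A"]]

# unique() drops every repeat, including the 2 that reappears at the end.
unique_values = df["A"].unique()

print(deduped["A"].tolist())   # [1, 2, 3, 4, 5, 6, 7, 2]
print(unique_values.tolist())  # [1, 2, 3, 4, 5, 6, 7]
```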

  • @derekborders9647
    @derekborders9647 6 months ago

    Weird to see you open VSCode with the ui when the cli command is the easiest one to add to your other flow once you’re in that directory after cloning.
    code .
    Opens the current directory in VSCode.

  • @sofiarequena2616
    @sofiarequena2616 8 months ago

    Great video!!

  • @dsisimridijsbs1969
    @dsisimridijsbs1969 6 months ago

    Hi guys, is there something similar or equivalent for SQL and scikit-learn? Thank you in advance!

  • @priyatiwari234
    @priyatiwari234 5 months ago

    great work

  • @rrrprogram8667
    @rrrprogram8667 5 months ago

    I believe the answer for the 23rd problem is not the correct one because the average mean of the vertical column is also one

  • @sarvanandgaikwad3048
    @sarvanandgaikwad3048 8 months ago

    Do the same for all the popular libraries.

  • @panth5501
    @panth5501 7 months ago +1

    Great content, solution of 23 I believe is wrong.

    • @Ghost____Rider
      @Ghost____Rider 4 months ago

      I agree, he subtracted the mean of the entire DataFrame instead of each row. It worked for his example of np.ones because the entire DataFrame had the same mean as the mean of each row (1).

  • @Cynosure11
    @Cynosure11 8 months ago

    Thank you for your video, Keith!
    Question for you: do you think it's too late to get a data science job in 2024?

    • @KeithGalli
      @KeithGalli  8 months ago +2

      The job market is challenging right now, but data science positions aren't going anywhere. You definitely can still get a data science job in 2024.
      That being said, I wouldn't only look for data science positions. There are a lot of software engineering & data engineering roles that use a similar skillset that can be less competitive to land. I'd recommend keeping track of the most popular skills on job openings for all these types of roles, and tailoring what you learn moving forward based on that.
      I also recommend trying to network with people who are working at companies you find interesting. You'll give yourself a much better chance at landing a data science job if you are referred by someone already at a company you are applying to. With job postings on a site like LinkedIn, it can be really difficult to progress in the process because so many people apply.

  • @MrBeavis2014
    @MrBeavis2014 8 months ago

    thank you very much

  • @ikersanchez8222
    @ikersanchez8222 8 months ago

    I love your content

  • @realzeejay
    @realzeejay 8 months ago

    great video! however, regarding the usage of the terminal to create directories etc at 0:59 , can anyone recommend some youtube videos or sources to get more familiar with it? thanks a bunch! good luck getting good at pandas everybody :)

  • @weiwei2587
    @weiwei2587 8 months ago

    Great tutorial!

  • @starlordhero1607
    @starlordhero1607 7 months ago

    Bro, can you do it for other libraries like numpy, seaborn, and matplotlib. Please !!!!!

  • @sssimp4216
    @sssimp4216 8 months ago

    Thank you 😭🩵🩵

  • @saikumar7247
    @saikumar7247 7 months ago

    Sir, could you make a video like this for NumPy?

  • @DataScience-oj4hc
    @DataScience-oj4hc 6 months ago

    1:10:03

  • @brigitayantie
    @brigitayantie 8 months ago

    You really so serious learn and post this class

  • @riteshpatel9313
    @riteshpatel9313 4 months ago +1

    Plz make practice video on "NUMPY" and "MATPLOTLIB" .
    PEOPLE WHO WANT THIS VIDEO "LIKE THIS COMMENT"

  • @Intellectualmind4
    @Intellectualmind4 8 months ago +1

    Great job boss 🎉🎉🎉🎉🎉

  • @reach2puneeths
    @reach2puneeths 8 months ago

    Nice video and content. Can you also come up with a similar video on PySpark?

    • @KeithGalli
      @KeithGalli  8 months ago

      Thank you! As of now I don't have immediate plans to make a PySpark video, but I'll look more into it.

  • @sairajtrimbake2801
    @sairajtrimbake2801 7 months ago

    Legend

  • @ДмитрийКолышницын-с2л
    @ДмитрийКолышницын-с2л 8 months ago

    🎉🎉Cool!!!!

  • @meeFaizul
    @meeFaizul 8 months ago

    Finally❤😂

  • @brigitayantie
    @brigitayantie 3 months ago

    Is okay to collab

  • @edwinroman30
    @edwinroman30 8 months ago

    🎉🎉🎉🎉

  • @adamlongon53
    @adamlongon53 8 months ago

    Wow OMG ...

  • @andreogimenes
    @andreogimenes 7 months ago

    At 49 seconds there's a disgusting sound!

  • @sebastianalvarez1537
    @sebastianalvarez1537 8 months ago

    pants