Complete Python Pandas Data Science Tutorial! (2024 Updated Edition)

  • Published 5 Sep 2024

COMMENTS • 130

  • @Kevin-cy2dr
    @Kevin-cy2dr 2 months ago +50

    Back when the first iteration was released, I was in college with no idea what a dataframe was. Now I'm a developer and still watching your videos. Thanks, Keith, for being a part of my learning journey ❤

    • @la-dev
      @la-dev 1 month ago

      I'm totally new to Python and learned the basics from Corey Schafer. Now I've moved here to learn Pandas. Am I on the right track? My goal is to become a data engineer and then a data scientist.

  • @dabunnisher29
    @dabunnisher29 2 months ago +16

    Your last pandas tutorial helped save me hours and hours of work. Don't ever forget that you are AWESOME!!!!

  • @pierresorel28
    @pierresorel28 2 months ago +15

    People wait for new episodes on Netflix but legends wait for Keith's new tutorials 😎

  • @aflah7572
    @aflah7572 2 months ago +3

    Strongly resonating with another comment here.
    I recall watching your tutorials in my first year of college. I just graduated recently and became a research software engineer. Your videos have been pivotal for all the stuff I've done :)

    • @KeithGalli
      @KeithGalli 1 month ago +1

      Awesome stuff! Congrats on the new role. Keep up the good work 😎

    • @aflah7572
      @aflah7572 1 month ago

      @KeithGalli Thank you!!

  • @benjoanc
    @benjoanc 1 month ago +1

    I always love your content because of the ease of understanding ❤
    I've been hearing a lot about the Polars library, but there's limited content on it. If possible, please do something on it.

  • @rodrigo100kk
    @rodrigo100kk 2 months ago +4

    Absolutely amazing! A suggestion: make an advanced Python Pandas tutorial more focused on graphics.

    • @NewsChannel-y4g
      @NewsChannel-y4g 2 months ago +1

      Would love to have a follow-up video on seaborn from this guy using these same CSV files. The Parquet and Excel files do not seem to copy and paste from the browser when you select raw.

  • @RealBenBizman
    @RealBenBizman 2 months ago +4

    No way- I just watched your other video on this the other day! Crazy!

  • @utkarshkapil
    @utkarshkapil 24 days ago +1

    Bro's content is still the best out there after 5 years

  • @faugno-1516
    @faugno-1516 22 days ago

    I really appreciate your efforts. You are delivering some of the best content on Python and its libraries. I saw your first dataset-cleaning video with pandas and truly loved your live tutorial. Please make more real-world pandas dataset-cleaning live tutorials, which help junior developers like me a lot. Once again, thanks for sharing this type of content.

  • @jaideepsingh870
    @jaideepsingh870 29 days ago

    This is honestly the best tutorial I have ever seen. Really looking forward to learning more.

  • @ben_tyler5
    @ben_tyler5 2 months ago +2

    Did anyone notice how our Keith has been sneaking a quick peek to the right at the beginning of the last few videos? 😂 Seriously though, loving the content!

  • @udaynj
    @udaynj 1 month ago

    Awesome video, right speed and comprehensive. My thanks to you for taking the time to do this - am sure it was hours and hours of work and I truly appreciate your effort

  • @mikhailbandurist8652
    @mikhailbandurist8652 1 month ago +1

    It's an honour to me to be among the first viewers of this excellent tutorial!

  • @rimpan1556
    @rimpan1556 1 month ago

    Great tutorial. You keep teaching new things all the time with practical examples and speak just the right amount so it doesn't get boring. Good job. I'm waiting for sklearn, np, matplotlib, sns, and streamlit tutorials 😂

  • @skyeshwin
    @skyeshwin 2 months ago +3

    Hey Keith! Big fan of your work! Keep it going brother!

  • @JJGhostHunters
    @JJGhostHunters 1 month ago

    This is great content! Please make a similar tutorial or recommend one that relates to using vectorization via Numpy arrays. I have applications that do what I need them to do, but involve nested loops that iterate over millions of rows of data. I really need to move away from these loops to improve execution time.
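
A minimal sketch of the kind of rewrite the comment above is asking about, replacing a nested loop with NumPy broadcasting. The pairwise-difference task, the function names, and the random data are illustrative assumptions, not taken from the commenter's application or from the video.

```python
import numpy as np


def pairwise_abs_diff_loop(a, b):
    """Nested-loop version: out[i, j] = |a[i] - b[j]|."""
    out = np.empty((len(a), len(b)))
    for i in range(len(a)):
        for j in range(len(b)):
            out[i, j] = abs(a[i] - b[j])
    return out


def pairwise_abs_diff_vectorized(a, b):
    """Same result via broadcasting: shape (n, 1) minus shape (1, m) gives (n, m)."""
    return np.abs(a[:, None] - b[None, :])


# Hypothetical input data, kept small so the loop version finishes quickly.
rng = np.random.default_rng(0)
a = rng.random(200)
b = rng.random(300)

slow = pairwise_abs_diff_loop(a, b)
fast = pairwise_abs_diff_vectorized(a, b)
print(np.allclose(slow, fast))
```

The broadcast version builds the full (n, m) array in memory, so for millions of rows it may need to be applied in chunks.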

  • @gaumeuvlog2603
    @gaumeuvlog2603 1 month ago

    Thanks for uploading new video about Pandas. I learn a lot from you. Can't wait to watch your next videos 🤩

  • @masonhyde9411
    @masonhyde9411 2 months ago

    1:14:00 Yes this is true! I analyzed an Olympic dataset for a college final project, and we used the fact that the plurality of NHL players are born in Jan-March to pitch our analysis proposal.

    • @KeithGalli
      @KeithGalli 2 months ago +1

      Cool to hear that you have validated this with data! 💯

  • @bouallaguiali2906
    @bouallaguiali2906 1 month ago

    Well done, Keith. Please do more videos about data analysis.

  • @meeFaizul
    @meeFaizul 2 months ago

    Keith, your tutorial is a game-changer!
    Your content is top-notch. Can't wait for more!
    ❤️ from 🇵🇰

  • @ahillsavio5607
    @ahillsavio5607 1 month ago +1

    Good stuff man! Keep up the good work!

  • @corporate_guyfitness
    @corporate_guyfitness 7 days ago

    Thanks, Keith. Love from India. It is really helpful to new learners like me.

  • @VishnuChandran-zj7sq
    @VishnuChandran-zj7sq 1 month ago

    Thank you for making this video. Keep rocking!

  • @adarshravindran9137
    @adarshravindran9137 2 months ago +1

    00:01 Complete Python Pandas Data Science Tutorial
    02:12 Setting up virtual environment for data science project
    07:03 Exploring DataFrame Functions
    09:26 Learn how to load CSV files in pandas
    14:12 Accessing and filtering data in Pandas
    16:31 Understanding data slicing and indexing in Pandas
    21:08 Accessing and manipulating data in Pandas
    23:18 Iterating through rows in Pandas can be done but may affect performance.
    27:45 Advanced conditional filtering based on string operations
    29:59 Filtering data using regular expressions in pandas
    34:18 Adding and removing columns in Pandas data frame
    36:34 Using inplace parameter in Pandas for modifying data in place
    41:22 Extracting specific data fields from a Pandas dataframe
    43:40 Convert date objects to datetime type for easy manipulation
    48:14 Custom functions using Lambda for data manipulation
    50:40 Merging and concatenating data at scale.
    55:24 Data frame manipulation for filtering and combining data.
    57:52 Merging data frames and handling null values
    1:02:26 Handling missing data using pandas dropna method
    1:04:44 Analyzing Olympic athlete data using Pandas in Python
    1:09:03 Pivot tables convert data into a useful format.
    1:11:32 Analyzing Popular Birthdates of Olympic Athletes in Python Pandas
    1:16:54 Ranking heights of individuals using Python Pandas
    1:19:16 Utilizing rolling functions in Pandas for cumulative sums and other calculations
    1:24:22 Using specific data types in Pandas, such as PyArrow-backed string types, can optimize performance at scale.
    1:27:00 Using Pandas to filter and pivot data in Python
    1:32:11 Explore Olympics dataset and pandas functionalities
    1:33:52 Wrap up and thank viewers for watching
    Crafted by Merlin AI.

  • @abdouseck4894
    @abdouseck4894 4 days ago

    AWESOME VIDEO, BTW ITS SHOWING YOUR CHATGPT HISTORY 😄

    • @KeithGalli
      @KeithGalli 3 days ago

      Lol yeah I know. I thought about blurring it, but I figured people might get a kick out of seeing my chatgpt history xD.

  • @CesarSantosLopezYolo
    @CesarSantosLopezYolo 2 months ago

    Hey I love these vids... Keep them coming! Love from Mexico buddy

  • @francisco_ponce
    @francisco_ponce 1 month ago

    I find it incredible that there are people capable of retaining so much information. Thank you very much for the video!!

  • @ObinnaWGMI
    @ObinnaWGMI 1 day ago

    Would've been nice to have mentioned the shortcuts you used

  • @Hoan9duy
    @Hoan9duy 2 months ago +1

    Awesome content as always 🔥🔥🔥

  • @tobibaby
    @tobibaby 16 days ago

    Thank you for the video.

  • @aleksandrajovanovic2631
    @aleksandrajovanovic2631 1 month ago

    How do I split a dataframe? For example, I want a separate dataframe for every sport or country. Great video :)
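
One common way to get a separate dataframe per group is to iterate over `groupby`. This is a minimal sketch; the tiny `bios` frame and the `born_country` column name are assumptions made up for illustration, so substitute whichever column holds the sport or country in your data.

```python
import pandas as pd

# Hypothetical stand-in for the athlete data used in the video.
bios = pd.DataFrame(
    {
        "name": ["A", "B", "C", "D"],
        "born_country": ["USA", "FRA", "USA", "JPN"],
    }
)

# Iterating over a groupby yields (key, sub-dataframe) pairs,
# so a dict comprehension gives one dataframe per country.
by_country = {
    country: group.reset_index(drop=True)
    for country, group in bios.groupby("born_country")
}

print(by_country["USA"])
```

Rows with a missing value in the grouping column are dropped by default; `groupby(..., dropna=False)` keeps them as their own group.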

  • @MachineLearning-mv8zb
    @MachineLearning-mv8zb 2 months ago +1

    Great you're back!

  • @ayodejiisarinade857
    @ayodejiisarinade857 1 month ago

    You are doing a great job. Well done.

  • @ahmedbadal3795
    @ahmedbadal3795 2 months ago

    Nice 2:00 pm course for me, thanks a lot.

  • @random-drops
    @random-drops 2 months ago +1

    Thanks. While watching your introduction, I started to wonder if you're going to do a video on NumPy, especially now that a major version has been released. No hurry, please take your time. Thanks in advance.

    • @KeithGalli
      @KeithGalli 2 months ago +4

      Good suggestion. I need to do some more research into the new release, but an updated NumPy video is definitely a possibility!

    • @NewsChannel-y4g
      @NewsChannel-y4g 2 months ago

      @KeithGalli Dude, this video was exactly what I was looking for as someone relatively new to Python trying to get into data science. NumPy and Seaborn would be good follow-up videos if you used the same data. The CSV files seemed to copy and paste well from the browser, but the Parquet and Excel files did not and made me load them as .txt. At that point I just crossed my fingers hoping you would use the CSV, and 20 minutes in, so far you have. Great video so far: excellent attention to detail and great beginner-level examples and functions. I tried DataWars and DataCamp before coming here. Thank you, truly.

    • @KeithGalli
      @KeithGalli 2 months ago

      @NewsChannel-y4g Happy to hear that!! Yeah, I think that because Excel & Parquet files aren't human readable in their raw form, it doesn't let you copy & paste the URL in the same way as CSV. It is a good test to be able to read those files though, so I recommend that you try downloading them (there's a download raw file button on GitHub) and then reading them in locally with your code. You'll probably want to move the files from your downloads folder to the same location as your notebook file, and then you should be able to load them with commands like pd.read_excel('./olympics-data.xlsx') & pd.read_parquet('./results.parquet') respectively. That being said, I plan to continue using CSV files in most of my videos, so you should be fine with the method you have been using. Not sure if I'll use the same data, but I hope to do some videos that incorporate NumPy & Seaborn in the not-so-distant future. Keep up the good work!

  • @FIBONACCIVEGA
    @FIBONACCIVEGA 2 months ago

    Good, it's the update of the old video. Excellent!!!

  • @mandy6622
    @mandy6622 11 days ago

    Keith, please make a tutorial on PySpark.

  • @adjieaja23
    @adjieaja23 2 months ago

    I have been waiting for this. Thank you, teacher.

  • @somerandomdude-hoyeaaaaa
    @somerandomdude-hoyeaaaaa 1 month ago +1

    Tysm

  • @omsingh5525
    @omsingh5525 2 months ago

    Hey, thanks for the amazing tutorial.

  • @massimo5019
    @massimo5019 2 months ago

    Just WOW. Great tutorial!

  • @KumR
    @KumR 1 month ago

    Very nice, Mr. Galli. Can you please do one on Polars too?

  • @tristoneyang1255
    @tristoneyang1255 1 month ago

    very helpful, thx K.

  • @stu8924
    @stu8924 2 months ago

    Brilliant, thank you.

  • @BluesAndWater
    @BluesAndWater 2 months ago

    Very good, thank you for everything!

    • @KeithGalli
      @KeithGalli 2 months ago +1

      Of course! I'm happy you liked it 🙂

  • @AstroidegitaTech
    @AstroidegitaTech 2 months ago +1

    Well done, man.

  • @user-ss9nl9dm7j
    @user-ss9nl9dm7j 2 months ago

    Hey Keith, it was a nice promo. From Bangladesh 🇧🇩

    • @KeithGalli
      @KeithGalli 2 months ago

      Glad you liked the promo!!

  • @eu_dz8684
    @eu_dz8684 2 months ago

    Could you please tell me how to teach pandas after this course, what topics should be covered and what's the best way to teach that?

  • @karangoyal8646
    @karangoyal8646 1 month ago

    Great work, bro!! Where do you live in Boston? I am from Boston too.

  • @mandy6622
    @mandy6622 26 days ago

    Hi Keith, please make a video on PySpark.

  • @chiragsoni6990
    @chiragsoni6990 2 months ago

    Bro, axis 0 is the horizontal frame and axis 1 is the vertical frame, but the function works when applied vertically by using axis 1, which is weird, but that's how it works, I guess.
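
The axis convention trips up a lot of people. In pandas, `axis=0` refers to the index (rows) and `axis=1` to the columns, and a reduction collapses along the axis you name. So `sum(axis=0)` runs down the rows and returns one value per column, while `sum(axis=1)` or `apply(..., axis=1)` works across the columns and returns one value per row. A minimal sketch with a made-up two-column frame:

```python
import pandas as pd

# Hypothetical data, only to show which direction each axis collapses.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# axis=0: collapse the rows, leaving one result per column.
col_totals = df.sum(axis=0)

# axis=1: collapse the columns, leaving one result per row.
row_totals = df.sum(axis=1)

# apply with axis=1 hands each row to the function as a Series.
row_via_apply = df.apply(lambda row: row["a"] + row["b"], axis=1)

print(col_totals)
print(row_totals)
```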

  • @Divyansh-n3h
    @Divyansh-n3h 1 month ago

    continue from filtering data 24:12

  • @ericwang5126
    @ericwang5126 2 months ago

    Amazing video!

  • @msbeau5341
    @msbeau5341 1 month ago

    What did he say we should click to get Copilot to come up, please? I am using Windows.

  • @asfasdfsd8476
    @asfasdfsd8476 2 months ago +1

    Bro I got a job after your first video!

    • @KeithGalli
      @KeithGalli 2 months ago

      That's awesome!! Nice work 💪

  • @rushikeshkharat4022
    @rushikeshkharat4022 1 month ago

    I was asked in an interview how to import multiple files at once in pandas, instead of one by one, when there are many files. Is there a quicker way? How would you accomplish that in pandas?
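
One widely used answer is to collect the file paths with a glob pattern, read each file, and combine the results with `pd.concat`. The sketch below assumes all files are CSVs with the same columns; the helper name `load_all_csvs` and the temporary demo files are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_all_csvs(folder):
    """Read every *.csv in `folder` and stack them into one dataframe."""
    paths = sorted(Path(folder).glob("*.csv"))
    frames = [pd.read_csv(path) for path in paths]
    # pd.concat raises ValueError on an empty list, so the folder
    # is expected to contain at least one CSV file.
    return pd.concat(frames, ignore_index=True)


# Demo with two small files written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({"id": [1, 2]}).to_csv(Path(tmp) / "part1.csv", index=False)
    pd.DataFrame({"id": [3]}).to_csv(Path(tmp) / "part2.csv", index=False)
    combined = load_all_csvs(tmp)

print(combined)
```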

  • @ramarisonandry8571
    @ramarisonandry8571 2 months ago

    Love from Madagascar

  • @DanielValenzuelaPerez
    @DanielValenzuelaPerez 2 months ago

    🔥 Thanks!

  • @alisher.m
    @alisher.m 1 month ago

    Can you release a Polars course?

  • @shreyalalit1460
    @shreyalalit1460 2 months ago

    Awesome!

  • @crystalkishore4974
    @crystalkishore4974 1 month ago

    Thanks Man ❤

  • @lifewithrahi_inuk
    @lifewithrahi_inuk 2 months ago

    Amazing!

  • @AbdulVajid-fz3vs
    @AbdulVajid-fz3vs 2 months ago

    Please upload an end to end machine learning project

    • @KeithGalli
      @KeithGalli 2 months ago

      I recommend checking out this video:
      ua-cam.com/video/MeyVptCRubI/v-deo.htmlsi=RqO--khHDJdNRI0a
      A real-world project (an actual consulting project of mine) that you can follow along with that uses LLMs.

  • @KalmahRulez
    @KalmahRulez 2 months ago

    Thank you sir.

  • @lord_voldemort44
    @lord_voldemort44 2 months ago

    awesome video

  • @GenZdev
    @GenZdev 2 months ago

    would like to refresh numpy too with you

  • @sebastianalvarez1537
    @sebastianalvarez1537 2 months ago

    Beast mode

  • @abhinavawasthi1730
    @abhinavawasthi1730 1 month ago

    Where can I download the CSV file for practice?

  • @khalidlachhab7510
    @khalidlachhab7510 2 months ago

    OMG, besides this great video, you really are the young version of Christian Bale, man hhh

    • @KeithGalli
      @KeithGalli 1 month ago +1

      Haha I'll take it 😎

  • @user-my2zq6td8z
    @user-my2zq6td8z 1 month ago

    pointers 38:16

  • @Wilson5150Wilson
    @Wilson5150Wilson 27 days ago

    What is this workstation called? It seems ideal for experimentation. I'm currently using VS Code and can't test individual lines like you are. Or maybe you can in VS Code, I'm just new!

    • @KeithGalli
      @KeithGalli 26 days ago

      Make sure you use the ".ipynb" file extension and then in VSCode you will need to install the "Jupyter" extension. Hope this helps you get set up!

  • @SyedAbdulrazak-h8e
    @SyedAbdulrazak-h8e 1 month ago

    You mentioned at 15:00 in the video to press Ctrl + Enter to get a new random sample. I tried it in PyCharm but it did not work. What should I do?

    • @KeithGalli
      @KeithGalli 1 month ago

      When I said ctrl + enter, within a Jupyter notebook that just re-runs my current code cell thus producing a new sample row from the dataframe. In your pycharm editor you should be able to just re-run your code and if you print out the sample, you'll see it change.

    • @SyedAbdulrazak-h8e
      @SyedAbdulrazak-h8e 1 month ago

      @KeithGalli I am glad you replied, thank you.

  • @AnmolBajwa-pq2bm
    @AnmolBajwa-pq2bm 2 months ago

    Thanks boss

  • @garyphan-lo4vi
    @garyphan-lo4vi 2 months ago

    wake up babe new keith drop

  • @tanjumraisa-id4de
    @tanjumraisa-id4de 1 month ago

    Why can't I download the raw data file? Please help, that's why I'm stuck...

  • @user-me4pb8qs2t
    @user-me4pb8qs2t 2 months ago

    Cool!!!!

  • @skyeshwin
    @skyeshwin 2 months ago

    At 30:40, the code for athletes whose names start and end with the same letter throws an error. Can anyone suggest the correct solution? I tried str.extract, but I can't include na=False since it throws an error.
    Wrong code - bios[bios["name"].str.contains(r'^(.).*\1$', na=False)]
    Correct code - ??

    • @KeithGalli
      @KeithGalli 2 months ago +1

      I just double checked and I see that a warning pops up (but it's not actually an error). You can ignore the warning. That being said, you might not see results because the names start with an uppercase letter & end with a lowercase letter. You can fix this by passing case=False into your str.contains() method (see below).
      Correct code:
      start_end_same = bios[bios['name'].str.contains(r'^(.).*\1$', na=False, case=False)]
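
A self-contained version of the reply's fix that can be run without the video's dataset. The five-row `bios` frame is made up for illustration, and silencing the `UserWarning` about match groups is an optional extra that is not part of the reply.

```python
import warnings

import pandas as pd

# Hypothetical names; None stands in for a missing value.
bios = pd.DataFrame(
    {"name": ["Anna Garcia", "Bob", "Keith Galli", None, "Otto"]}
)

# (.) captures the first character and \1 requires the same character at the end.
pattern = r"^(.).*\1$"

with warnings.catch_warnings():
    # pandas warns that the pattern has match groups; the result is still valid.
    warnings.simplefilter("ignore", UserWarning)
    mask = bios["name"].str.contains(pattern, na=False, case=False)

start_end_same = bios[mask]
print(start_end_same)
```

Here `case=False` lets an uppercase first letter match a lowercase last letter, and `na=False` treats missing names as non-matches so the mask stays boolean.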

    • @skyeshwin
      @skyeshwin 2 months ago

      @KeithGalli Hey, thanks for the correction! One more thing I wanted to mention: at 48:22, when the result pops up, rows whose height_cm is NaN still show a height_category of 'Tall'. So I tweaked your code a little bit:
      Existing code: bios['height_category'] = bios['height_cm'].apply(lambda x: 'Short' if x < 165 else ('Average' if x < 185 else 'Tall'))
      New code: bios['height_category'] = bios['height_cm'].apply(lambda x: 'Short' if x < 165 else ('Average' if x < 185 else 'Tall' if x >= 185 else 'NA'))
      With this change, rows whose height_cm has no information (NaN) get a height_category of 'NA'.
      The same issue occurs again at 50:29.
      My intent is not to pinpoint your mistakes but just to help anyone who's a newbie to Python!
      Love your work always!
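
The NaN rows fall through to 'Tall' because every comparison with NaN evaluates to False. As an alternative to the nested conditional, `pd.cut` can bin the heights and leaves missing values missing. This is a sketch of a swapped-in technique, not the approach used in the video; the sample heights are made up, while the 165 and 185 thresholds come from the comment above.

```python
import numpy as np
import pandas as pd

# Hypothetical heights, including one missing value.
bios = pd.DataFrame({"height_cm": [160.0, 170.0, 185.0, 190.0, np.nan]})

# right=False makes each bin closed on the left: [-inf, 165), [165, 185), [185, inf),
# matching the "x < 165" and "x < 185" tests in the lambda.
bios["height_category"] = pd.cut(
    bios["height_cm"],
    bins=[-np.inf, 165, 185, np.inf],
    labels=["Short", "Average", "Tall"],
    right=False,
)

print(bios)
```

Rows with a missing height keep a missing category instead of being labeled 'Tall'.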

  • @TheWayHome-wb1uh
    @TheWayHome-wb1uh 2 months ago

    Can you do a course on langchain?

    • @KeithGalli
      @KeithGalli 2 months ago +1

      Not a dedicated course, but here's a video I did using Langchain in a real-world project:
      ua-cam.com/video/MeyVptCRubI/v-deo.htmlsi=CeGcaKvG6eSAbGpg

    • @TheWayHome-wb1uh
      @TheWayHome-wb1uh 2 months ago

      @KeithGalli Thank you for responding. I did go through this and it was pretty cool. Just wanted to know if you might consider a full tutorial on LangChain and LLMs.
      Also, love your channel.

  • @thunde7226
    @thunde7226 2 months ago

    Great video keith.......:) bye

  • @rodrigo100kk
    @rodrigo100kk 2 months ago

    1:16:15 - This was a 20% increase, not a 120% increase.

    • @KeithGalli
      @KeithGalli 2 months ago

      Good catch. It was 120% of the previous day, which is a 20% increase :).
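
The distinction is easy to check in pandas: dividing by the shifted series gives the ratio to the previous value (1.2, i.e. 120% of it), while `pct_change` gives the relative increase (0.2, i.e. 20%). The two-value series below is made up for illustration and is not the data shown in the video.

```python
import pandas as pd

# Hypothetical values for two consecutive days.
values = pd.Series([100.0, 120.0])

# Ratio to the previous day: 120 / 100.
ratio_to_previous = values / values.shift(1)

# Relative change from the previous day: (120 - 100) / 100.
increase = values.pct_change()

print(ratio_to_previous.iloc[1], increase.iloc[1])
```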

  • @NickMaverick4
    @NickMaverick4 24 days ago

    Can anybody help me with practicing pandas? Is there a website like W3Schools where I can practice the code?

  • @NormieDead
    @NormieDead 2 months ago

    Dude just woke up to give us another PANDA

  • @rogerroger14
    @rogerroger14 1 month ago

    Not feeling longterm compatibility chat LOL

  • @jonpounds1922
    @jonpounds1922 2 months ago +6

    you're telling me this isn't kung fu panda 4? bummer

  • @user-my2zq6td8z
    @user-my2zq6td8z 1 month ago

    46:44 Where is the cheat sheet? Does anyone know?

    • @KeithGalli
      @KeithGalli 1 month ago

      I got you! here is the cheat sheet: strftime.org/ (this link can also be found in the video description)
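
The format codes on that cheat sheet are the ones accepted by `pd.to_datetime(..., format=...)` and `Series.dt.strftime`. A minimal sketch with made-up date strings (the column contents are an assumption, not the video's data):

```python
import pandas as pd

# Hypothetical date strings in year-month-day form.
dates = pd.Series(["1969-12-05", "2001-04-20"])

# %Y = four-digit year, %m = zero-padded month, %d = zero-padded day.
parsed = pd.to_datetime(dates, format="%Y-%m-%d")

# The same codes control the output format when converting back to text.
formatted = parsed.dt.strftime("%d/%m/%Y")

print(formatted.tolist())
```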

    • @user-my2zq6td8z
      @user-my2zq6td8z 1 month ago

      @KeithGalli Thanks

  • @Darkdevil2000
    @Darkdevil2000 2 months ago

    Vote for NumPy, guys ❤❤❤❤

  • @roshanchhetri3252
    @roshanchhetri3252 1 month ago +1

    Bro be like "what does na=False do" in ChatGPT. Not here to troll, just thought it was interesting.

  • @djblaccs
    @djblaccs 1 month ago

    WHAT IS WITH THE SPEEDING OF VIDEOS...?!

    • @KeithGalli
      @KeithGalli 1 month ago

      I'm not sure if I understand the question, what are you seeing?

  • @gustavojuantorena
    @gustavojuantorena 2 months ago

    First

  • @aditya3david
    @aditya3david 2 months ago

    Age difference in 5 years😂

  • @Intellectualmind4
    @Intellectualmind4 2 months ago

    🎉🎉🎉🎉🎉 come on

  • @truptisriharshith5049
    @truptisriharshith5049 20 days ago

    Why not use the Pokémon data this time 🥲