Python Pandas Tutorial 5: Handle Missing Data: fillna, dropna, interpolate

  • Published 21 Jul 2024
  • In this tutorial we'll learn how to handle missing data in pandas using fillna, interpolate and dropna methods. You can fill missing values using a value or list of values or use one of the interpolation methods.
    Topics that are covered in this Python Pandas Video:
    0:00 Introduction
    2:30 Convert string column into the date type
    3:15 Use date as an index of dataframe using set_index() method
    4:10 Use fillna() method in dataframe
    7:35 Use fillna(method="ffill") method in dataframe
    8:57 Use fillna(method="bfill") method in dataframe
    9:56 "axis" parameter in fillna() method in dataframe
    11:18 "limit" parameter in fillna() method in dataframe
    13:46 interpolate() to do interpolation in dataframe
    15:34 interpolate() method "time"
    16:50 dropna() method: drop all the rows which have "na" in dataframe
    17:50 "how" parameter in dropna() method
    18:33 "thresh" parameter in dropna() method
    Code link: github.com/codebasics/py/tree...
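The methods in the topic list above can be sketched roughly as follows. This is a minimal illustration on a made-up weather table, not the code from the linked repository; the column names and values are assumptions. Newer pandas releases deprecate fillna(method="ffill") and fillna(method="bfill") in favor of ffill() and bfill(), so the sketch uses the latter.

```python
import numpy as np
import pandas as pd

# Hypothetical weather data (not the tutorial's CSV); "day" is the index.
df = pd.DataFrame(
    {
        "temperature": [32.0, np.nan, 28.0, np.nan, 40.0],
        "windspeed": [6.0, np.nan, np.nan, 7.0, np.nan],
        "event": ["Rain", np.nan, "Snow", np.nan, "Sunny"],
    },
    index=pd.to_datetime(
        ["2017-01-01", "2017-01-04", "2017-01-05", "2017-01-06", "2017-01-07"]
    ),
)
df.index.name = "day"

# fillna with a dict: a different replacement value per column.
filled = df.fillna({"temperature": 0, "windspeed": 0, "event": "No event"})

# Forward fill, copying each value into at most one following gap.
forward = df.ffill(limit=1)

# Interpolation weighted by the time between index entries,
# rather than by row position.
by_time = df["temperature"].interpolate(method="time")

# dropna variants: any NaN in the row, all NaN in the row,
# or fewer than `thresh` non-NaN values in the row.
complete_rows = df.dropna()
not_empty_rows = df.dropna(how="all")
at_least_two = df.dropna(thresh=2)

print(by_time)
print(len(complete_rows), len(not_empty_rows), len(at_least_two))
```

Each call returns a new object and leaves df unchanged, so the result has to be assigned to a variable to be kept.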
    Do you want to learn technology from me? Check codebasics.io/?... for my affordable video courses.
    Popular Playlist:
    Complete python course: • Python 3 Programming T...
    Data science course: • Data Science Full Cour...
    Machine learning tutorials: • Machine Learning Tutor...
    Pandas tutorials: • Pandas Tutorial (Data ...
    Git github tutorials: • Git/Github Tutorial
    Matplotlib course: • Matplotlib tutorial
    Data structures course: • Data Structures And Al...
    Data Science Project - Real Estate Price Prediction: • Machine Learning & Dat...
    To download csv and code for all tutorials: go to github.com/codebasics/py, click on a green button to clone or download the entire repository and then go to relevant folder to get access to that specific file.
    🌎 My Website For Video Courses: codebasics.io/?...
    Need help building software or data analytics and AI solutions? My company www.atliq.com/ can help. Click on the Contact button on that website.
    #️⃣ Social Media #️⃣
    🔗 Discord: / discord
    📸 Dhaval's Personal Instagram: / dhavalsays
    📸 Instagram: / codebasicshub
    🔊 Facebook: / codebasicshub
    📝 Linkedin (Personal): / dhavalsays
    📝 Linkedin (Codebasics): / codebasics
    📱 Twitter: / codebasicshub
    🔗 Patreon: www.patreon.com/codebasics?fa...

COMMENTS • 297

  • @codebasics
    @codebasics  2 years ago +6

    Check out our premium machine learning course with 2 Industry projects: codebasics.io/courses/machine-learning-for-data-science-beginners-to-advanced

    • @cromllo7162
      @cromllo7162 2 years ago

      This man is amazing ❤❤👍👍👌👌

    • @mixmaza6489
      @mixmaza6489 1 year ago

      @codebasics
      Can we replace missing values using na_values like we did in the previous video on cleaning messy data?
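One possible reading of the question above: na_values (a read_csv parameter) only marks chosen placeholders as NaN while the file is loaded; replacing those NaN values is a separate step done with fillna. A small sketch, assuming a made-up CSV in which "n.a." and -99999 stand for missing data:

```python
import io

import pandas as pd

# Hypothetical CSV content; "n.a." and -99999 are assumed placeholders.
csv_text = (
    "day,temperature,event\n"
    "1/1/2017,32,Rain\n"
    "1/2/2017,-99999,n.a.\n"
    "1/3/2017,28,Snow\n"
)

# Step 1: na_values turns the placeholders into NaN at read time.
df = pd.read_csv(io.StringIO(csv_text), na_values=["n.a.", "-99999"])

# Step 2: fillna decides what the NaN values become.
filled = df.fillna({"temperature": df["temperature"].mean(), "event": "No event"})

print(filled)
```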

  • @RHCIPHER
    @RHCIPHER 1 year ago +14

    It's been 5 years since you posted this video but still, we can't find a better video than this to understand the concepts. Not only this but your complete playlist is GEM. Thanks a ton.

  • @MrMaipeople
    @MrMaipeople 5 years ago +12

    I just don't know how exactly can I say THANK YOU enough to convey my thank you to you THANK YOU, THANK YOU, THANK YOU for all the tutorials in ALL your playlists, so awesome and helpful.
    ALL ARE SUPER EXCELLENT.

  • @PaulColclough47
    @PaulColclough47 7 years ago +11

    Thanks very much for this! I'm doing an edX course and I was doing fine until this very topic came up. Your video perfectly cleared everything up which is a great relief!

  • @codebasics
    @codebasics  5 years ago +2

    Step by step roadmap to learn data science in 6 months: ua-cam.com/video/H4YcqULY1-Q/v-deo.html
    Machine learning tutorials with exercises:
    ua-cam.com/video/gmvvaobm7eQ/v-deo.html

  • @codebasics
    @codebasics  3 years ago +4

    Step by step roadmap to learn data science in 6 months: ua-cam.com/video/H4YcqULY1-Q/v-deo.html
    5 FREE data science projects for your resume with code: ua-cam.com/video/957fQCm5aDo/v-deo.html

  • @juanliu5778
    @juanliu5778 5 years ago

    Best course that I could find after 2 months of research; you are the only guy who knows why we need pandas to do Excel operations

  • @adnanhowlader143
    @adnanhowlader143 3 years ago +30

    I think your playlist has more rich content than paid courses

  • @amandaahringer7466
    @amandaahringer7466 2 years ago

    I love the way you used the graph/visual to explain the different interpolation techniques!

  • @koumospecial
    @koumospecial 6 years ago +1

    What an amazing presentation!! You did a fantastic job there! Very professional approach! You just made me a subscriber with that! :) Big thanks!

  • @MyChusko
    @MyChusko 1 year ago +1

    great tutorial. A very comprehensive explanation of the topic. I have a good command of pandas but I still learnt some new tricks I didn't know.
    This video has convinced me to buy some of your courses. Serious and in depth approach.
    Totally recommended!

  • @sangorilla1
    @sangorilla1 6 years ago +2

    Excellent presentation of these tools. I like the practical way you have demonstrated HOW to use the tools. Thanks.

  • @StevenSmith68828
    @StevenSmith68828 5 years ago

    Finally! Thank you so much I was trying figure out how to drop only the columns that have NaN. I had done it by hand it took me so long Pandas is amazing.
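Dropping only the columns that contain NaN, as described in the comment above, can be sketched with the axis parameter of dropna. The table is a made-up example, not the tutorial's data:

```python
import numpy as np
import pandas as pd

# Hypothetical table: only column "b" contains a missing value.
df = pd.DataFrame(
    {"a": [1, 2, 3], "b": [1.0, np.nan, 3.0], "c": ["x", "y", "z"]}
)

# axis="columns" drops columns (instead of rows) that contain any NaN.
no_nan_columns = df.dropna(axis="columns")

print(list(no_nan_columns.columns))
```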

  • @sippu9722
    @sippu9722 5 years ago +11

    LIFE SAVER BRO... THANKS A LOT
    THIS IS WHAT I EXACTLY WANTED

  • @purushothamgowda6449
    @purushothamgowda6449 5 years ago +3

    Good presentation in simple words, easy to understand. Great job Sir!

  • @jasonwong8315
    @jasonwong8315 3 years ago

    the threshold parameter is so useful! never heard about this before! thank you!

  • @sep69
    @sep69 5 years ago +1

    Thanks for your great pandas tutorials. Just what I was looking for. You explain it very clear. Thanks again :)

  • @matthiasmusterman5139
    @matthiasmusterman5139 5 years ago +2

    Very well organized and intuitive tutorial. Great job. Thanks a lot

  • @saeedbaig4249
    @saeedbaig4249 7 years ago +6

    OMG this is exactly what I was looking 4!
    Thanks so much man u just earnt yourself a new sub

  • @NidhiSharma-vd6wm
    @NidhiSharma-vd6wm 2 years ago

    This tutorial should be awarded as the best tutorial, understood all the concepts ............thankyou so much.

  • @vatsalsingh4668
    @vatsalsingh4668 5 months ago

    Thankyou so much. I love how you have given all ipynb files with such good markdowns and details. Love you for it. Youre a life saver...got my Python exam exam

  • @vishaljhaveri6176
    @vishaljhaveri6176 3 years ago

    Awesome video. Loved all the concepts and mostly your teaching style. :) Thank you, sir.

  • @pramodpoudel6020
    @pramodpoudel6020 4 years ago

    This guy is a great teacher. Very simple way to teach. Very basics. great dude. Thanks for sharing you skill for free.

    • @codebasics
      @codebasics  4 years ago

      I am glad you liked it pramod 😊

  • @vaibhavkumar38
    @vaibhavkumar38 5 years ago

    Simple, lucid explanations with examples. Wonderful!!!

    • @codebasics
      @codebasics  5 years ago +1

      Vaibhav, I am glad you liked it

  • @prateekkachoria301
    @prateekkachoria301 4 years ago

    completeness of the material is commendable. keep it up thanks a lot :D

  • @MasterCoder99
    @MasterCoder99 3 years ago +1

    Interpolate will definitely boost my kaggle score! Thanks so Much!

  • @himaniagarwal3671
    @himaniagarwal3671 4 years ago +2

    Very well explained sir. I appreciate that you suggested those little tricks rather than just sticking to the concept.

    • @codebasics
      @codebasics  4 years ago +1

      Thanks Himani, glad you liked it. 😀

  • @saurabh9416
    @saurabh9416 5 years ago

    You need more subscribers buddy !
    Simply awesome

  • @Lukas7360
    @Lukas7360 3 years ago +2

    clean and simple, yet effective, understandable and efficient. Thanks !

  • @surewin3006
    @surewin3006 3 years ago

    Thanks! Great tutorial! Just finished python and now continue right away with pandas...still am enjoying and learning...

  • @azmaryzannataurin2844
    @azmaryzannataurin2844 3 years ago

    The only video that makes sense to me :')
    Thanks a lot for this tutorial!

  • @leythecg
    @leythecg 4 years ago +1

    As always very well and understandably explained! Many thanks for that!

  • @vishwa4908
    @vishwa4908 5 years ago +1

    the perfect example had taken to explaining all the scenarios and nicely presented

  • @codebasics
    @codebasics  4 years ago

    How to learn coding for beginners | Learn coding for free: ua-cam.com/video/CptrlyD0LJ8/v-deo.html

  • @garylanigan1
    @garylanigan1 3 years ago

    Simple and clear demonstrations coupled with simple and clear explanations. Very well presented!!
    kudos to you sir!!

  • @julesboileau
    @julesboileau 5 years ago +1

    This is one of the best-explained step-by-step tutorials.

  • @minhazarnab8660
    @minhazarnab8660 4 years ago +1

    Better explanation than most paid courses. Thanks a lot.

  • @iloveno3
    @iloveno3 5 years ago

    You have done an amazing job. Thanks a lot!

  • @emailamit08
    @emailamit08 2 years ago

    Awesome method of teaching code. it is so clear to the beginners, you even tell minute things like what shortcuts you are using and how to find them. Really boss you deserve a respectful SALUTE. I am so satisfied with your teaching method that i am writing this pausing the video in between.

  • @alexandrelucas5621
    @alexandrelucas5621 6 years ago

    Sir, you are top! Keep up the good work!

  • @SrisairamRajasekar
    @SrisairamRajasekar 4 months ago

    When you googled, I saw the pangolin valentine Google doodle. I remember that vividly, as I played that game in my old office in 2017. I can't believe it's been 7 years and wish I had known these videos 7 years back.

  • @jennythedancer5139
    @jennythedancer5139 1 year ago

    Nice one, I watched it second time and finds very helpful for EDA starters.

  • @ijeffking
    @ijeffking 6 years ago

    I have been learning a lot from watching your videos. Thank you so much.

  • @ashutoshranjan4644
    @ashutoshranjan4644 1 year ago

    It is a real gem🤩 for those who wants to learn pandas

  • @bouseuxlatache4140
    @bouseuxlatache4140 5 years ago

    man I hope you are lecturing at the best university where you are. you have such a pedagogical approach.

  • @curious8338
    @curious8338 2 months ago

    Thank you for the great explanation!!!!

  • @debashissahoo5031
    @debashissahoo5031 6 years ago

    thank you sir for clear explanations. one small doubt: after setting up day as index value, there is a gap in the first row, why is it so and how to avoid that.
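If the "gap in the first row" in the question above refers to the extra header line that appears after set_index, that line is most likely the index name ("day") printed on its own row. A minimal sketch of removing it, assuming a two-row made-up table:

```python
import pandas as pd

# Hypothetical table with a "day" column (not the tutorial's CSV).
df = pd.DataFrame(
    {
        "day": pd.to_datetime(["2017-01-01", "2017-01-04"]),
        "temperature": [32, 28],
    }
)

# After set_index, the index keeps the name "day", which pandas
# displays on a separate header row below the column names.
indexed = df.set_index("day")
print(indexed)

# Clearing the index name removes that extra row from the display.
unnamed = indexed.rename_axis(None)
print(unnamed)
```

The data is the same in both cases; only the printed header differs.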

  • @fadimehdibourarach3666
    @fadimehdibourarach3666 6 years ago

    Superb Explanation !
    Thank you :)

  • @optimizacioneningenieria3385
    @optimizacioneningenieria3385 2 years ago

    Great video. Thanks!

  • @glauberbrito8685
    @glauberbrito8685 4 years ago +1

    Great job. Congrats!!

  • @sakshidevi353
    @sakshidevi353 1 year ago

    you are a magician sir...thanks a lot

  • @oscarquintero6981
    @oscarquintero6981 6 years ago

    Thank you, you got yourself a new subscriber.

  • @kunwar_divyanshu
    @kunwar_divyanshu 3 years ago

    🔥🔥🔥🔥🙏Thank you sir for these all videos ! Really helpful .

  • @tkrath8348
    @tkrath8348 4 years ago

    Really a life saver bro!! Thanks a ton!!

  • @armagaan007
    @armagaan007 5 years ago

    Beautifully explained! :)

  • @chitrajain5059
    @chitrajain5059 6 years ago

    Your tutorials are great. Thanks so much. While executing the df in the jupyter notebook, why I can't see the table outline as I can see the video?

  • @samuelnikhade5612
    @samuelnikhade5612 3 years ago +1

    Thanks a lot it was helpful !!

  • @nithyagowda9800
    @nithyagowda9800 2 years ago

    I was using fill('np.nan') that changed the dtype to 'object' from 'float64' that did not allow me to interpolate. I was able to pick-up on that because of your video! Now, I have to try and use linear interpolation.
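The situation in the comment above can be reproduced in a small sketch, assuming the column ended up holding the literal string 'np.nan' next to real numbers. A column that mixes floats and strings has dtype object, and converting it back with pd.to_numeric is one way to make interpolation usable again:

```python
import pandas as pd

# Assumed end state: a float column that now contains the *string*
# "np.nan" (in quotes) instead of a real missing value.
s = pd.Series([1.0, "np.nan", 3.0], dtype=object)
print(s.dtype)

# errors="coerce" converts anything non-numeric into a real NaN
# and gives back a float64 column.
restored = pd.to_numeric(s, errors="coerce")

# With a numeric dtype, the default linear interpolation works.
smooth = restored.interpolate()
print(smooth)
```

Filling with the unquoted np.nan (after import numpy as np) would not have introduced a string in the first place.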

  • @abhinandansingh39
    @abhinandansingh39 5 years ago

    really nice videos and much better than paid courses

  • @michaeltwiton2276
    @michaeltwiton2276 5 years ago

    Thank you sir. This video is fantastic!

  • @vedanthbaliga7686
    @vedanthbaliga7686 3 years ago

    This is a very useful video! Thank you🙂

  • @rohinijadhav744
    @rohinijadhav744 3 years ago

    Just wow.... Thank u so much!!!!

  • @cihangiraydoner7962
    @cihangiraydoner7962 3 years ago

    Thank you for step by step explanation. Good job!

  • @evitaooo7
    @evitaooo7 3 years ago +1

    Thank you so much! So helpful!

  • @talamuslu
    @talamuslu 5 years ago +1

    perfect job. keep going

  • @geekyprogrammer4831
    @geekyprogrammer4831 5 years ago

    This is much better than Coursera Intro to Data Science course

  • @JustPython
    @JustPython 1 year ago

    Best clearing understand video

  • @erfanebrahimi9748
    @erfanebrahimi9748 5 years ago +1

    Great tutorial. Thank you.

  • @kmnm9463
    @kmnm9463 4 years ago +1

    Hi,
    Excellent examples and explanation.
    I am facing an issue , after using dictionary with fillna method for replacing 0 values in 'event' column , the df still has only 0s.
    Krish
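A likely explanation for the issue above, offered as an assumption since the commenter's code is not shown: fillna only touches missing values (NaN), and a literal 0 is not missing, so it is left alone. A sketch with a made-up table, using replace for the zeros:

```python
import pandas as pd

# Hypothetical table where "event" holds a literal 0 instead of NaN.
df = pd.DataFrame(
    {"day": ["1/1/2017", "1/2/2017", "1/3/2017"], "event": ["Rain", 0, "Sunny"]}
)

# fillna leaves the 0 untouched because 0 is a real value, not NaN.
unchanged = df.fillna({"event": "No event"})

# replace targets specific values, so it can swap out the 0.
fixed = df.assign(event=df["event"].replace(0, "No event"))

print(unchanged["event"].tolist())
print(fixed["event"].tolist())
```

Another common cause is calling df.fillna(...) without assigning the result; the method returns a new dataframe and does not modify df by default.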

  • @shockey3084
    @shockey3084 4 years ago

    Good One bro keep it up.

  • @EcExplorer
    @EcExplorer 2 years ago

    TBH, this is better than the paid course contents like datacamp

  • @AnimeshSinghIITM
    @AnimeshSinghIITM 2 years ago

    Thank you very much, it was really helpful.

  • @chiragmakwana7577
    @chiragmakwana7577 3 years ago

    Sir thank u so much...
    With one dataset i am facing problems but by seeing u video i got to know sir ...
    Thanksss sir !

  • @mostafaserag4035
    @mostafaserag4035 3 years ago

    Can you please make a series of videos for the datetime and os libraries?
    Thanks for your awesome vids!

  • @abdulmalekwalayeh4790
    @abdulmalekwalayeh4790 5 years ago

    what a great video
    thanks a lot!!

  • @swamyabhishek2393
    @swamyabhishek2393 4 years ago

    Excellent work!!

  • @ireneashamoses4209
    @ireneashamoses4209 4 years ago

    Excellent video!! Thank you!! 😊👍👍

    • @codebasics
      @codebasics  4 years ago +1

      Irene, I am glad you liked it

  • @anmol4390
    @anmol4390 4 years ago

    Very very helpful.

  • @joseluisbeltramone599
    @joseluisbeltramone599 3 years ago

    Very good explanation! Thank you very much.

  • @pranitbhisade3174
    @pranitbhisade3174 6 years ago

    wait for it....................Awsome..:)

  • @debashissahoo5031
    @debashissahoo5031 6 years ago

    thank you sir, really nice explanation.

  • @TheMarComplex
    @TheMarComplex 2 years ago

    You're the best!

  • @HawkingMerchant
    @HawkingMerchant 3 years ago

    This is the best rich in content best for free you are a man of god

  • @yusuffarah5602
    @yusuffarah5602 4 years ago

    Brother, what method do you use to record these videos so clearly and professionally? Please let me know; I am trying to make videos for data analysis. Thanks.

  • @nagavidya6780
    @nagavidya6780 1 year ago

    Superb!!!

  • @bhairavidhakras7933
    @bhairavidhakras7933 1 year ago +1

    Have a few questions:
    (1) your data is already sorted by the date column, hence using just the interpolate method makes sense. But if my data is not sorted, should I first create a data frame by sorting on the dates and then use that data frame as an input to the interpolate method?
    (2) when you use interpolate with "time", how does the program know that it has to use the date column for the time? What if I had date1 like you do and had another column date2 with some other dates, how would the program know that it has do a time-based interpolation on date1 and not date2?
    (3) can interpolation be done for specific columns only? What if I wanted to do interpolate for temperature and forward fill for windspeed? what would be the syntax like?
    Thank you! Great video!
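The questions above can be approached roughly as follows. interpolate(method="time") takes its time values from the dataframe's index (which should be a DatetimeIndex), so sorting the index first is a reasonable precaution, and each column can be treated separately by selecting it. The data is a made-up example:

```python
import numpy as np
import pandas as pd

# Hypothetical, deliberately unsorted data indexed by date.
df = pd.DataFrame(
    {"temperature": [38.0, 32.0, np.nan], "windspeed": [7.0, 6.0, np.nan]},
    index=pd.to_datetime(["2017-01-04", "2017-01-01", "2017-01-02"]),
)

# (1) Put the rows in chronological order before interpolating.
df = df.sort_index()

# (2) method="time" uses the index, not any ordinary date column.
# (3) Select a column to apply a different strategy to each one.
df["temperature"] = df["temperature"].interpolate(method="time")
df["windspeed"] = df["windspeed"].ffill()

print(df)
```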

  • @arashalizade9583
    @arashalizade9583 4 years ago

    Thank you very much it is very efficient to me , Really helpful !!

  • @elysel9424
    @elysel9424 6 years ago

    really well done!

  • @manoj9
    @manoj9 6 years ago

    Very nicely explained

  • @sumit121285
    @sumit121285 3 years ago

    thanks for all your videos...

  • @frankservant5754
    @frankservant5754 4 years ago

    Assuming from your data that you have all the events, how can you fill in the temperature based on the event, eg if the event was "sunny" fill in 32. etc ?
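Filling temperature based on the event, as asked above, could be sketched in two ways: with a lookup table of typical values per event, or with the mean temperature observed for each event. The table and the numbers in the lookup are made-up assumptions:

```python
import numpy as np
import pandas as pd

# Hypothetical data: every row has an event, some temperatures are missing.
df = pd.DataFrame(
    {
        "temperature": [32.0, np.nan, 28.0, np.nan],
        "event": ["Sunny", "Sunny", "Snow", "Snow"],
    }
)

# Option A: map each event to an assumed typical temperature.
typical = {"Sunny": 32.0, "Snow": 28.0}  # hypothetical values
filled_lookup = df["temperature"].fillna(df["event"].map(typical))

# Option B: use the mean of the known temperatures for the same event.
per_event_mean = df.groupby("event")["temperature"].transform("mean")
filled_mean = df["temperature"].fillna(per_event_mean)

print(filled_lookup.tolist())
print(filled_mean.tolist())
```

fillna accepts a Series and aligns it on the index, which is what makes both options work row by row.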

  • @skkkks2321
    @skkkks2321 4 years ago

    Wow ,learned a lot to handle datasets. Thank you Sir

  • @vishnujatav6329
    @vishnujatav6329 3 years ago

    Very helpful. Thank you

  • @stevemungai3542
    @stevemungai3542 2 years ago

    You are a smooth talking pro

  • @karthikc7288
    @karthikc7288 5 years ago

    Good job..thank u . My request is make some videos for "seaborn" it will be more useful..

  • @hanaaghaouti5069
    @hanaaghaouti5069 4 years ago

    the perfect tutorial thanks a lot

  • @mehmetkaya4330
    @mehmetkaya4330 4 years ago

    Great! Thank you!

  • @ayushy2537
    @ayushy2537 4 years ago

    At 20:20 , you passed dt in DatetimeIndex() to make it DateIndex type. But when we will create a date range from pd.date_range it itself is DatetimeIndex type and we can skip the pd.DatetimeIndex function part.
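The point in the comment above can be checked in a short sketch: pd.date_range already returns a DatetimeIndex, so it can be passed to reindex directly to insert rows for missing dates. The two-row table is a made-up example:

```python
import pandas as pd

# Hypothetical data with 2017-01-02 missing from the index.
df = pd.DataFrame(
    {"temperature": [32.0, 28.0]},
    index=pd.to_datetime(["2017-01-01", "2017-01-03"]),
)

# date_range returns a DatetimeIndex; no extra wrapping is needed.
dt = pd.date_range("2017-01-01", "2017-01-03")
print(type(dt).__name__)

# reindex adds the missing date as a row filled with NaN.
full = df.reindex(dt)
print(full)
```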

  • @arunkumar-us8ei
    @arunkumar-us8ei 6 years ago

    Lovely explanation

  • @rahulsailwal4025
    @rahulsailwal4025 4 years ago

    BTW - i have also subscribed. Thank you once again.
    Wow. Thank you for uploading series on pandas. Currently going through each and every video and it seems to be a better video.
    Could you please help me to understand below scenario -
    16:45 - Let's assume we have two dates, e.g. invoice pay date and invoice received date. Is it possible to specify which date interpolate uses for its estimate?
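One way to approach the question above: since interpolate(method="time") works from the index, the date column to use can be chosen by making it the index before interpolating. The column names and amounts below are hypothetical:

```python
import numpy as np
import pandas as pd

# Hypothetical invoices with two date columns and one missing amount.
df = pd.DataFrame(
    {
        "invoice_rec_date": pd.to_datetime(
            ["2017-01-01", "2017-01-02", "2017-01-05"]
        ),
        "invoice_pay_date": pd.to_datetime(
            ["2017-01-10", "2017-01-18", "2017-01-20"]
        ),
        "amount": [100.0, np.nan, 200.0],
    }
)

# Whichever date column becomes the index drives the time weighting.
by_rec = df.set_index("invoice_rec_date")["amount"].interpolate(method="time")
by_pay = df.set_index("invoice_pay_date")["amount"].interpolate(method="time")

print(by_rec)
print(by_pay)
```

The same gap gets a different estimate in each case because the missing row sits at a different relative position between its neighbors on each date axis.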