Python Pandas Tutorial (Part 9): Cleaning Data - Casting Datatypes and Handling Missing Values

  • Published Dec 3, 2024

COMMENTS • 181

  • @coreyms
    @coreyms  4 years ago +86

    Hey everyone. Hope you all had a great weekend! I will be traveling to Vancouver this week to visit a Quantum Computing company and learn more about the work they're doing, so I'm not sure when the next Pandas video will be ready for release. I will be working on it while I'm there, but I likely won't have it recorded and released until midway through next week. Let me know if anyone has any questions they would like me to ask them about Quantum Computing!

    • @harshvardhan1156
      @harshvardhan1156 4 years ago +7

      Hey, Corey. Thank you for everything. I am not from a Computer Science background. Out of curiosity I started learning to code, and here I am now, having done more than 20 data science projects. Your videos are literally the best. I have taken some high-priced courses, and I can undoubtedly say that your way of teaching is far more interactive, complete, and easy to grasp.
      I just want to know how you plan for any course, like in 1st or 2nd video You said that I will cover this topic in later videos. So do you make whole content, practice it? deepdive in it and make your own order and then start teaching?
      It would be very helpful for me if you share about how you prepare for any topic.
      Thank you very much
      Love from INDIA

    • @harjotsinghbaidwan2204
      @harjotsinghbaidwan2204 4 years ago +4

      I have many times seen while using dataframe that column names are not at same level and this creates an issue during extraction of values.
      Do you have any idea about it?

    • @JiminPark-ld2xx
      @JiminPark-ld2xx 2 years ago

      How do I download dataset after cleaning my data using Jupyter notebook online? Plzz ans..
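A minimal sketch for the question above, assuming the cleaned data lives in a DataFrame: write it to a CSV file with `to_csv`, then download that file through the notebook's file browser. The column names and the file name `cleaned_survey.csv` are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical cleaned data; substitute your own DataFrame.
df = pd.DataFrame({"Country": ["India", "Canada"], "YearsCode": [5.0, 12.0]})

# Write the DataFrame to disk. index=False leaves out the row labels.
out_path = os.path.join(tempfile.gettempdir(), "cleaned_survey.csv")
df.to_csv(out_path, index=False)

# Read it back to confirm the file holds what we expect.
round_trip = pd.read_csv(out_path)
print(round_trip)
```

In a hosted notebook the file usually appears in the file panel, where it can be downloaded; the exact steps depend on the service.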

  • @ahammadshawki8
    @ahammadshawki8 4 years ago +140

    Please make a playlist on numpy after pandas.

  • @malikdiallo9976
    @malikdiallo9976 4 years ago +65

    I like this series in pandas. thank you so much Corey.

  • @gauravmarwaha8466
    @gauravmarwaha8466 4 years ago +5

    this series on pandas is the most complete and informative series ive found till date...!!!

  • @corben3348
    @corben3348 4 years ago +37

    Good teaching is an art... This playlist is so helpful ! Thank you for your work !

  • @saravanannatarajan6515
    @saravanannatarajan6515 4 years ago +64

    Corey, your teaching is awesome!!! Much appreciated!!!
    Expecting series on Machine Learning/Deep Learning in the near future...

  • @kuls43
    @kuls43 4 years ago +6

    11:36 we can use df.replace(['NA', 'Missing'], np.nan, inplace=True) instead

    • @AtlasIndustries101
      @AtlasIndustries101 4 years ago +2

      could've used in other df.replace(...) line too. But I think he is trying to keep it simple for us to understand it easily.
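A short sketch of the list form suggested in this thread. The sample DataFrame is hypothetical, and the result is assigned to a new variable instead of using `inplace=True` so the original stays available for comparison.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; 'NA' and 'Missing' are placeholder strings, not real NaN values.
people = pd.DataFrame({
    "first": ["Corey", "Jane", "NA", "Missing"],
    "email": ["corey@example.com", "Missing", "NA", "jane@example.com"],
})

# Passing a list replaces every listed placeholder with NaN in a single call.
cleaned = people.replace(["NA", "Missing"], np.nan)

# Count the missing values per column after the replacement.
print(cleaned.isna().sum())
```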

  • @sayantanchakraborty75
    @sayantanchakraborty75 4 years ago +13

    Best series on Python Pandas . Thank you so much Mate. Love from India

  • @ashishdeora8522
    @ashishdeora8522 4 years ago +13

    Thank you Corey for this. My parents urged me to join your community. They are saying you are doing wonderful job. Thank you Corey for enabling us

  • @ishanpand3y
    @ishanpand3y 4 years ago +12

    This is the most amazing series on Pandas ever. I just finished watching number 9th. Sir thank you so much providing such great content. 🧡🤍💚

  • @adamgdev
    @adamgdev 4 years ago +2

    You never disappoint!! And I never have to speed you up because you keep a great pace with no BS! Thank you!!

  • @benhancock1541
    @benhancock1541 4 years ago +7

    Thanks for this Corey - your tutorials are always great! I've been using pandas for almost 2 years and still learned stuff 👍

  • @minxxdia1132
    @minxxdia1132 4 years ago +2

    wow, this is the best playlist for python pandas. thankyou so much!

  • @gagansoni9665
    @gagansoni9665 4 years ago +3

    i understand your pandas tutorials very clearly. this is helping me a lot. thank you so much corey. i wish to see your tutorials on machine learning using python.

  • @codegeek8256
    @codegeek8256 4 years ago +3

    Hi @ Corey Schafer
    I am very with your teachings, these are great building blocks towards data science, i hope one day we arrive there.

  • @srivathsgondi191
    @srivathsgondi191 11 months ago

    Now that's a lovely explanation, I like how you showed the function can be used in different scenarios!

  • @YeekyYeeky
    @YeekyYeeky 3 years ago

    can't wait for your numpy series , this channel is gold , Thank you Corey

  • @njgaming4422
    @njgaming4422 10 months ago +1

    Instead of replacing separately, you can just pass the list of strings that you want to replace.
    E.g.: df['YearsCode'].replace(['Less than 1 year','More than 50 years'],[0,51],inplace=True)

  • @mapa5000
    @mapa5000 1 year ago

    You really care about making a video addressing many scenarios and possible issues … that’s phenomenal !! … I really appreciate it … thank you so much!!

  • @Al-Ahdal
    @Al-Ahdal 4 years ago +2

    Boss, it is requested to kindly make videos on comprehensive data analysis series, covering all aspects in much detail, and covering all possible areas for data analysis. Your channel and vdos are awesome. Great work indeed...... 👍

  • @zixinlee2165
    @zixinlee2165 4 years ago +2

    Thank you so much for creating these videos!! They're really valuable for self-learners like me.

  • @stanislawjarzynski6133
    @stanislawjarzynski6133 3 years ago +1

    You're a great teacher, Corey!

  • @kirannagar8295
    @kirannagar8295 4 years ago +1

    Hey , truly glad for your all series . If possible , please do make a course video on Pyspark .

  • @ABDULKARIMHOMAIDI
    @ABDULKARIMHOMAIDI 1 month ago

    Thanks man for such valuable series of videos, please add more video on new features on pandas !!!

  • @kingjoshuamanatad2140
    @kingjoshuamanatad2140 4 years ago

    At 27:28 in the video, a one-liner: df['YearsCode'].replace(['Less than 1 year','More than 50 years'],[0,51], inplace=True). Correct me if I'm wrong, I'm new to Python. But great video again, Corey! Hats off!

    • @vladimirwimmer11
      @vladimirwimmer11 6 months ago

      this does not work anymore as Corey was mentioning, rather like this>> df.replace({'YearsCode': {'Less than 1 year':0, 'More than 50 years':51} },inplace= True)
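A hedged sketch of the two forms discussed in this thread. My understanding, which you should check against your own pandas version, is that `inplace=True` on a column selected with `df[...]` may act on a temporary copy in newer releases, so both options below assign the result instead. The sample column is hypothetical, and the replacements are written as strings (`"0"`, `"51"`) rather than the integers used in the comments so the column stays uniformly typed until the final cast.

```python
import pandas as pd

# Hypothetical stand-in for the survey's YearsCode column (all values are strings).
df = pd.DataFrame({"YearsCode": ["5", "Less than 1 year", "12", "More than 50 years"]})

# Option 1: nested dict on the DataFrame, in the form {column: {old: new}}.
by_dict = df.replace({"YearsCode": {"Less than 1 year": "0", "More than 50 years": "51"}})

# Option 2: call replace on the column with two lists and assign the result back.
df["YearsCode"] = df["YearsCode"].replace(
    ["Less than 1 year", "More than 50 years"], ["0", "51"]
)

# With only numeric strings left, the cast to float succeeds.
df["YearsCode"] = df["YearsCode"].astype(float)
print(df["YearsCode"].mean())
```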

  • @gayatriwaghmare6293
    @gayatriwaghmare6293 4 years ago +1

    The series is very helpful to me. Thank you sir.

  • @zzzorgjanbatist564
    @zzzorgjanbatist564 4 years ago +2

    As usual Corey best of the best!!!

  • @anubhavrauniyar3192
    @anubhavrauniyar3192 2 years ago

    We love you Corey Schafer!!!! Lots of love from India🥰

  • @Ian-bb7vv
    @Ian-bb7vv 3 years ago

    I had to say, thank you!! I think you guys are really helping to close the gap in educational resources between the rich and the poor. Great job, and I hope you know that what you are doing is really meaningful.

  • @andreykaok9497
    @andreykaok9497 4 years ago +2

    Brilliant tutorials on Pandas!
    Very much looking forward to the time series lessons.

  • @saraghafelehbashi5808
    @saraghafelehbashi5808 2 years ago

    Much appreciated! Could you please make more videos like this, cleaning data and showing the different errors that come with it?
    It would be really helpful for juniors.

  • @analyticswithothello8213
    @analyticswithothello8213 2 years ago

    Corey, you are teaching the best!

  • @finncollins5696
    @finncollins5696 1 year ago +1

    Learnt a lot so far. Thanks so much Corey,.

  • @alexthewebdesigner1856
    @alexthewebdesigner1856 2 years ago

    @Corey Schafer
    Something told me that I'd better watch this video. Just when I thought that I'd sanitized a large data set, I realize now that there could potentially be some data (or missing data) that could crash my application. Great video. Thank you Sir!

  • @rockeyvalley
    @rockeyvalley 4 years ago +1

    Great stuff Corey!!! Keep up the good work!

  • @darkmaraux
    @darkmaraux 4 years ago +1

    This video was so smooth! Right in the point! Thanks!!!

  • @juancarcelen3437
    @juancarcelen3437 4 years ago +2

    Hi Corey thank you so much for posting these videos. Your tutorials have helped me transition the concepts I know into actual useful code. I would like to test my progress and would really appreciate if you can put out a link with some data analysis projects (i.e. a database to download, questions to answer using data analysis, and the code that was written to answer those questions).
    Thank you so much and keep the videos coming you're an amazing teacher!!

  • @rauberhozenplotz7009
    @rauberhozenplotz7009 4 years ago +1

    Great content - great style of speaking and explaining - thank you!

  • @kameshinipillay4587
    @kameshinipillay4587 2 years ago +1

    Thank you, learning so much :)

  • @002_priyanshugoswami5
    @002_priyanshugoswami5 4 years ago +3

    love you coreyyyyy best channel

  • @teetanrobotics5363
    @teetanrobotics5363 4 years ago +3

    I love your tutorials. Could you also make tutorials for scipy and scikit learn?

  • @dadoll1660
    @dadoll1660 4 years ago +4

    This is gold.

  • @stressfreetrading1341
    @stressfreetrading1341 4 years ago +1

    Love the way u teach. thanks a lot... Love from India

  • @samratsengupta8881
    @samratsengupta8881 4 years ago +2

    Thanks Corey, I have no words to say. As an aspiring data scientist, your pandas videos were really cool.
    I don't know if you will ever read this but this has helped and has put a smile on my 'confused about pandas' face.
    i have subscribed and will watch your videos for becoming a self taught data scientist.
    God Bless You

  • @lucasbartomioli7861
    @lucasbartomioli7861 7 months ago

    Man, i love you! Thanks a lot from Argentina!

  • @danielflorea3001
    @danielflorea3001 3 years ago

    Simple and clear explanations. Great job.

  • @FakeAccount
    @FakeAccount 4 years ago +2

    You're a legend, my guy.

  • @davebeckham5429
    @davebeckham5429 4 years ago +1

    Many thanks for sharing excellent tutorials Corey.

  • @codewithluq
    @codewithluq 4 years ago +2

    Thank you Corey again. My resume is getting more interesting everyday. Viva

  • @VikasGuptacherie
    @VikasGuptacherie 4 years ago +1

    Very helpful series with nice explanations !!!

  • @mikkybricks
    @mikkybricks 4 years ago +2

    Thanks Corey

  • @aegystierone8505
    @aegystierone8505 4 years ago +1

    Please do a video about your visit to the Quantum Computing trip in Vancouver!

  • @Shkkmj6868
    @Shkkmj6868 4 years ago

    It's very useful .You are great at articulating . Thank you so much .

  • @TopicalAuthority
    @TopicalAuthority 4 years ago +1

    Great lesson!

  • @haiderali2050
    @haiderali2050 4 years ago

    Thank you so much, i have learnt a lot and able to automize my daily Excel routine work

  • @ahmedhosny3855
    @ahmedhosny3855 1 year ago

    such a great work done by you , hope you all the best man

  • @ajinzrathod
    @ajinzrathod 3 years ago

    Corey you are great.❤️
    Love from India ❤️

  • @arkahm
    @arkahm 4 years ago +1

    Great video! How about a video on splitting data and passing the split into a function? That would be great!

  • @muntadher8087
    @muntadher8087 2 years ago

    Using df.fillna("Unfilled", inplace=True) to replace the missing values is good practice, I believe. For me it's easier than replace and more dynamic.

  • @interestingstudies4422
    @interestingstudies4422 3 years ago

    Amazing video...solved my problems ☺️☺️🙏🏻

  • @bharaths1396
    @bharaths1396 4 years ago +1

    Your content is awesome....!
    How do replace nan values with other values only in a particular column?
    Please Help
    Thank You
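One way to approach the question above, sketched on a hypothetical DataFrame: call `fillna` on the single column and assign the result back, or pass a dict to `DataFrame.fillna` to give specific columns their own fill values. The column names and fill values are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in two columns.
df = pd.DataFrame({
    "age": [33.0, np.nan, 63.0],
    "email": ["a@example.com", np.nan, np.nan],
})

# Fill a single column by assigning the result back to it.
df["age"] = df["age"].fillna(0)

# Or pass a dict to fill specific columns, each with its own value.
filled = df.fillna({"email": "unknown"})
print(filled)
```

Columns not named in the dict are left untouched, so `df["email"]` still has its NaN values while `filled["email"]` does not.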

  • @quoit99training83
    @quoit99training83 4 years ago

    amazing series - hi Corey, how many PARTS u think will end up in this playlist? Thank you for helping the community :)

  • @robertmnganya7533
    @robertmnganya7533 3 years ago

    Excellent teaching. Thank you.

  • @ironpolux
    @ironpolux 3 years ago

    Great vid, pls do one on multiple indexes!

  • @athas12
    @athas12 1 year ago

    for the last part of the video, you can actually create two lists and use these lists in replace method to change all values at once. It is slightly easier especially if the df has multiple values to replace

  • @saqibhussain1354
    @saqibhussain1354 4 years ago +1

    Great video - I wonder if you can do a few on the business side like freelancing and how to get clients as python developers?

  • @manishgpt25
    @manishgpt25 3 years ago

    thanks a ton for this series..helped a lot in clearing concepts!!

  • @nikhilb3880
    @nikhilb3880 4 years ago

    I love this series man, more than you could expect.
    If I may ask, what state and country are you from? Because I saw snow on your 2nd channel and now I'm confused about whether you live in the USA or in a European country.
    Thanks again for this series

    • @coreyms
      @coreyms  4 years ago +1

      Hey there. I currently live in Greenville SC in the United States. The snow videos were likely from Boulder Colorado where I lived for several years.

  • @maheryagub
    @maheryagub 1 month ago

    Plus 1 for using Vimium plugin at 18:55

  • @KimJennie-fl3sg
    @KimJennie-fl3sg 4 years ago

    This also works if we want to drop a column when the row at index 0 or 1 has NaN:
    df.dropna(axis='columns', how='any', subset=[0, 1])
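A small sketch of the call above on made-up data. With `axis='columns'`, `subset` names row labels, so only rows 0 and 1 are inspected when deciding which columns to drop.

```python
import numpy as np
import pandas as pd

# Hypothetical data with the default integer row labels 0, 1, 2.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],   # NaN in row 1, so this column is dropped
    "b": [1.0, 2.0, np.nan],   # NaN only in row 2, which is not in the subset
    "c": [1.0, 2.0, 3.0],      # no NaN at all
})

# Drop any column that has a NaN in row 0 or row 1.
trimmed = df.dropna(axis="columns", how="any", subset=[0, 1])
print(trimmed.columns.tolist())
```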

  • @mohammedkaifmirza7585
    @mohammedkaifmirza7585 2 years ago

    Amazing tutorial 😍👌

  • @andr101
    @andr101 4 years ago +1

    great series, thanks!

  • @pandeyUkislay
    @pandeyUkislay 5 months ago +1

    Hi Corey, at around 16:14 you say that "age column is a string, because it says object!" This is confusing me because, IIRC, everything in Python is an object. Please correct me here.
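On the question above: as far as I know, `object` is pandas' catch-all dtype for columns that hold arbitrary Python objects, and in practice such a column most often holds strings, which is probably what the remark in the video is shorthand for. The sketch below sets `dtype=object` explicitly (an assumption made so it behaves the same on versions that default to a dedicated string dtype) and then inspects the Python type of each element.

```python
import numpy as np
import pandas as pd

# dtype=object is set explicitly so the example behaves the same across pandas versions.
age = pd.Series(["33", "55", np.nan], dtype=object)

print(age.dtype)      # the column-level dtype
print(age.map(type))  # the Python type of each individual element

# Collect the distinct element types: strings for the values, float for NaN.
element_types = set(age.map(type))
```

So the dtype describes the column as a whole, while the elements inside it keep their own Python types.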

  • @NikitaSharma-bs4gg
    @NikitaSharma-bs4gg 3 years ago

    That was such a good video- thank you for sharing

  • @noureddineettayyeby5210
    @noureddineettayyeby5210 4 years ago +2

    Thank you

  • @stephanie_ong
    @stephanie_ong 4 years ago

    Thanks again for such a helpful video.

  • @mohamedikbalguetout32
    @mohamedikbalguetout32 3 years ago

    Hey bro, I always find the solutions in your videos. Thanks man

  • @litan5006
    @litan5006 2 years ago

    Good pandas video. Thank you

  • @shivavijaya1537
    @shivavijaya1537 4 years ago +1

    Hi Corey, please post a video on python sys module

  • @JoKaR80-d5r
    @JoKaR80-d5r 3 years ago

    These are awesome! Thanks a million!

  • @harishrudroju1379
    @harishrudroju1379 4 years ago +1

    Hi Corey, can you please make a video on how to bypass captcha while scraping a web site?

  • @prasad1686
    @prasad1686 4 years ago +2

    Hi Corey, your videos are great. I am a beginner. Can you please tell me how to get a "cheat sheet" or the ".py scripts" for your video playlist "Python Tutorials" "1 to 136", to speed up learning, as I am slow at typing? Thank you.

  • @ADNANAHMED-eo5xx
    @ADNANAHMED-eo5xx 4 years ago

    Please continue the series sir

  • @gauravmarwaha8466
    @gauravmarwaha8466 4 years ago +1

    good video again..!! thanks a lot

  • @varunkrishnaKyathanpally
    @varunkrishnaKyathanpally 4 years ago

    Thank you , excellent tutorial as always :)

  • @anshuldwivedi7210
    @anshuldwivedi7210 3 years ago +1

    Great series, but I must say I find R much easier to understand since there are not as many exceptions as there are in python. Everything is a function and there is no concept of method (at least that I'm aware of)

  • @kinjalvora256
    @kinjalvora256 4 years ago +1

    Hi Corey,
    Thanks for the awesome series. While I have not yet finished the series, I would like to know, how we can deal with duplicates.
    If you have a column let's say with duplicate apps and the apps have reviews, size, installations and you want to let's say get a mean for the reviews, take the first size and sum of the installations and merge the rest of the columns for those apps as they were, like Ratings. How would one do that?
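One possible approach to the question above is `groupby` followed by `agg` with a per-column rule. This is only a sketch: the DataFrame and its column names (`App`, `Reviews`, `Size`, `Installs`, `Rating`) are hypothetical stand-ins for the dataset being described.

```python
import pandas as pd

# Hypothetical app data in which "Maps" appears twice.
apps = pd.DataFrame({
    "App": ["Maps", "Maps", "Notes"],
    "Reviews": [100, 300, 50],
    "Size": ["20M", "25M", "5M"],
    "Installs": [1000, 4000, 200],
    "Rating": [4.5, 4.5, 4.0],
})

# Collapse duplicates: mean of reviews, first size, sum of installs, first rating.
merged = (
    apps.groupby("App", as_index=False)
        .agg({"Reviews": "mean", "Size": "first", "Installs": "sum", "Rating": "first"})
)
print(merged)
```

Any column that should simply be carried over as it was can use `"first"`, as `Rating` does here.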

  • @stayinawesum
    @stayinawesum 4 years ago +1

    can you make video explaining:
    primitive data types vs data types vs adt vs data structure

  • @dhanraj112
    @dhanraj112 4 years ago +3

    Does Brilliant give certificates after completion of a course?

  • @muntadher8087
    @muntadher8087 2 years ago

    Thank you so much!! You are the best

  •  9 months ago

    14:57 - Casting / fixing data types

  • @abhishekgupta1060
    @abhishekgupta1060 4 years ago

    Hi Corey, great video! Learned a lot! And have a Happy trip , and also I would like you to ask one question - "Is time travel possible with Quantum computing as depicted in the movie 'Avengers:Endgame' ".

  • @kunal_tajane
    @kunal_tajane 2 years ago

    more than 50 years. is crazy 😁

  • @SusanAmberBruce
    @SusanAmberBruce 4 years ago +2

    Corey, do you happen to know what Linux distro's ship with python 3 currently?

  • @ADNANAHMED-eo5xx
    @ADNANAHMED-eo5xx 4 years ago +1

    Thanx a lot

  • @adarshtiwari7395
    @adarshtiwari7395 3 years ago

    That is BRILLIANT

  • @temitayoadedipe
    @temitayoadedipe 4 years ago +1

    "And that is brilliant!" 😁

  • @bobsalita3417
    @bobsalita3417 4 years ago +1

    Can you use join() or merge() to do multiple replacements?
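It looks possible with `merge`, though `replace` with a dict is usually shorter. The sketch below is one way to do it, assuming the replacements are kept in a small lookup table. The `lookup` DataFrame and its `replacement` column are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical column containing two values that need replacing.
df = pd.DataFrame({"YearsCode": ["5", "Less than 1 year", "More than 50 years"]})

# Hypothetical lookup table mapping each old value to its replacement.
lookup = pd.DataFrame({
    "YearsCode": ["Less than 1 year", "More than 50 years"],
    "replacement": ["0", "51"],
})

# A left merge keeps every row of df; rows without a match get NaN in 'replacement'.
merged = df.merge(lookup, on="YearsCode", how="left")

# Use the replacement where one exists, otherwise keep the original value.
merged["YearsCode"] = merged["replacement"].fillna(merged["YearsCode"])
merged = merged.drop(columns="replacement")
print(merged)
```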

  • @pc0riginal870
    @pc0riginal870 3 years ago

    we can convert using apply(int) also .
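A quick sketch of where `apply(int)` works and where it does not, on hypothetical data. It converts a column of numeric strings, but it raises as soon as the column contains NaN, which is why casting to float is the safer route when values are missing.

```python
import numpy as np
import pandas as pd

# Hypothetical columns: one complete, one with a missing value.
complete = pd.Series(["33", "55", "63"], dtype=object)
with_gap = pd.Series(["33", np.nan, "63"], dtype=object)

# Works: every element is a numeric string.
as_int = complete.apply(int)

# Fails: int() cannot convert NaN, so a ValueError is raised.
try:
    with_gap.apply(int)
    apply_failed = False
except ValueError:
    apply_failed = True

# Casting to float handles the missing value, because NaN is itself a float.
as_float = with_gap.astype(float)
print(as_int.tolist(), apply_failed, as_float.tolist())
```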

  • @nayeemuddinmoinuddin2186
    @nayeemuddinmoinuddin2186 4 years ago

    @Corey Schafer - Please do a video series on PySpark.