Introduction to Data Processing in Python with Pandas | SciPy 2019 Tutorial | Daniel Chen

  • Published Nov 25, 2024

COMMENTS • 74

  • @geeebeeez
    @geeebeeez 2 years ago +18

    1:03 Intro
    5:37 Intro to Pandas
    47:29 Tidying Dataset
    1:39:17 Apply method on a DataFrame
    2:45:05 Modelling and Data Preparation for ML

  • @aghileslounis
    @aghileslounis 4 years ago +10

    Daniel, best teacher in the world! Nothing is better than teaching with live examples; it is very intuitive!

  • @xt.7933
    @xt.7933 4 years ago +8

    This is really awesome. I just started as an absolute beginner at coding and have only finished Dojo's tutorial for absolute beginners, yet I am able to keep up with most of what you have taught so far (1:39:00)!! Thank you!!!

  • @siddiqkhan246
    @siddiqkhan246 4 years ago +3

    This video explains pandas so well. Great job, Daniel, this is by far the best pandas video on YouTube.

  • @zoexu3997
    @zoexu3997 4 years ago +27

    This is hands down the best pandas tutorial I've ever watched. Thank you, Daniel :)

    • @Don_Modern_Ancestor
      @Don_Modern_Ancestor 3 years ago +1

      His book, Pandas for Everyone, is the best out there. Really in-depth.

  • @MehdiZouaoui
    @MehdiZouaoui 1 year ago

    That was a long video but I managed to complete it. I liked the honesty of the guy and he was doing things on the go. Chapeau bas!

  • @PP-im6lu
    @PP-im6lu 3 years ago

    I've watched a bunch of pandas tutorial videos and this is definitely the best one so far.

  • @Barry_L
    @Barry_L 3 years ago +1

    Sweet! All this for freeeee.... I'm a true believer that information should be free, and I say a BIG THANK YOU for this, Daniel.

  • @zkinguk
    @zkinguk 5 years ago +7

    Watched the entire video - really helpful stuff as a pseudo beginner.

  • @zzhou3894
    @zzhou3894 4 years ago +6

    Best pandas tutorial I have found so far. Thanks.

  • @semrana1986
    @semrana1986 4 years ago +1

    one of the best tutorials on pandas

  • @vigneshpadmanabhan
    @vigneshpadmanabhan 3 years ago

    Best pandas tutorial… glad I found this talk.

  • @bbyum7618
    @bbyum7618 3 years ago

    Very useful class for understanding some basic aspects of pandas that are often not explained in other tutorials: long data, applying functions to DataFrames, and using accessors. Thank you!!

  • @jmyable4
    @jmyable4 3 years ago +1

    that mitigated my pandas headache! Thanks!

  • @tonypendletoniii3209
    @tonypendletoniii3209 5 years ago +5

    @1:15:40 it is:
    ebola_long['cd_country'].str.split('_').str.get(0)

  • @marialaustsen9016
    @marialaustsen9016 5 years ago +3

    Great video for beginners. Thanks for sharing.

  • @hashimkhan4731
    @hashimkhan4731 3 years ago +1

    Nice tutorial indeed. Can you point to a similarly good tutorial for ML beginners?

  • @dhananjaywalunj3652
    @dhananjaywalunj3652 4 years ago +2

    Well explained... Thank you, Daniel.

  • @narendraful
    @narendraful 3 months ago

    Great lecture! Thanks.
    I have just one question: at 2:05:39 we use the avg_2 function without needing to vectorise it, whereas avg_2_mod needed vectorisation. I can't understand the difference between the two functions, i.e. why does one need vectorisation and the other doesn't for the same inputs?
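
A likely explanation for the question above, sketched under assumptions: the function bodies below are hypothetical stand-ins (the exact definitions from the video are not reproduced in this thread), with avg_2 as plain arithmetic and avg_2_mod containing an `if` test. Arithmetic works elementwise on a whole Series, whereas `if x == 20` needs a single True/False value, so passing a Series raises a ValueError unless the function is wrapped with np.vectorize.

```python
import numpy as np
import pandas as pd

# Illustrative data; not the DataFrame used in the video.
df = pd.DataFrame({"a": [10, 20, 30], "b": [20, 30, 40]})

def avg_2(x, y):
    # Pure arithmetic: pandas applies it elementwise to whole Series.
    return (x + y) / 2

def avg_2_mod(x, y):
    # Assumed shape of the modified function: the `if` expects one
    # boolean, but `Series == 20` produces a Series of booleans.
    if x == 20:
        return np.nan
    return (x + y) / 2

plain = avg_2(df["a"], df["b"])  # works without vectorising

try:
    avg_2_mod(df["a"], df["b"])
    raised = False
except ValueError:
    # "The truth value of a Series is ambiguous"
    raised = True

# np.vectorize calls the function once per pair of scalar elements.
avg_2_mod_vec = np.vectorize(avg_2_mod)
vectorized = avg_2_mod_vec(df["a"], df["b"])
```

Note that np.vectorize is a convenience wrapper around a loop, not a speed optimisation; for a condition this simple, np.where on the whole columns would be another option.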

  • @yasseralkindi7350
    @yasseralkindi7350 3 years ago

    Is there a single place where we can find these datasets, like a shared drive perhaps? It would be good to follow along with them as well.

  • @rohitpurkait4046
    @rohitpurkait4046 4 years ago +6

    Sir, at 1:15:45 we need to call str twice to get the desired value, like:
    ebola_long['cd_country'].str.split('_').str.get(0)

    • @vijaypalmanit
      @vijaypalmanit 4 years ago

      True, I know it works by calling it twice, but does it make intuitive sense to call it twice?

    • @MouradBENKADOUR
      @MouradBENKADOUR 2 years ago

      Excellent, he forgot to do it this time, but he did it at the PyData conference in 2018
      ua-cam.com/video/iYie42M1ZyU/v-deo.html
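
A minimal sketch of why the accessor appears twice in the expression discussed in this thread. The column values are illustrative assumptions, not the real Ebola data. The first `.str.split` returns a new Series whose elements are Python lists, and a Series has no direct way to index into each element, so `.str` is needed again to reach into every list.

```python
import pandas as pd

# Hypothetical values standing in for the melted column names.
ebola_long = pd.DataFrame({"cd_country": ["Cases_Guinea", "Deaths_Liberia"]})

# Step 1: .str.split works on each string and yields a Series of lists.
parts = ebola_long["cd_country"].str.split("_")

# Step 2: parts is again a Series, so .str is used a second time to
# pick element 0 out of each list.
first = parts.str.get(0)
```

Naming the intermediate result makes the two steps visible: each `.str` call belongs to a different Series.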

  • @thegreatgreenpea835
    @thegreatgreenpea835 5 years ago +1

    It was very helpful and informative. Thank you very much for posting this video!

  • @sidhantmahipal9934
    @sidhantmahipal9934 1 year ago

    Where can I access the datasets being used in this video?

  • @souhamahmoudi7745
    @souhamahmoudi7745 2 years ago

    Where can I find the data that has been used in this video, please?

  • @_asim_ktk
    @_asim_ktk 4 years ago +1

    @1:18 How can I be sure that the new columns correspond to the correct rows?
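
One way to check the concern raised above, assuming the question is about assigning a derived Series back as a new column (the data and index labels here are made up for illustration). pandas matches a Series to the DataFrame by index label rather than by position, so the values land on the right rows even if the Series is in a different order.

```python
import pandas as pd

# Illustrative frame with non-default index labels.
df = pd.DataFrame(
    {"cd_country": ["Cases_Guinea", "Deaths_Liberia", "Cases_Mali"]},
    index=[10, 20, 30],
)

# The derived Series keeps the index of the column it came from.
status = df["cd_country"].str.split("_").str.get(0)

# Reverse the row order on purpose; the labels travel with the values.
shuffled = status.iloc[::-1]

# Assignment aligns on index labels, so each value returns to its own row.
df["status"] = shuffled
```

This alignment applies when the right-hand side is a Series; a plain list or NumPy array is assigned by position instead.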

  • @shereenkhanzada7953
    @shereenkhanzada7953 4 years ago +1

    I have a question about running my Python code in Jupyter Notebook. Sometimes, in the middle of running code, the cursor jumps to the next cell instead of running the code. I have tried many things, e.g. restarting the notebook and rewriting the code, but the result is the same. Can anybody help me with this issue?

    • @puar6124
      @puar6124 4 years ago

      Check if your kernel shut off due to inactivity or something

    • @shereenkhanzada7953
      @shereenkhanzada7953 4 years ago

      @puar6124 Checked that too... but still the same :(

  • @rohscx
    @rohscx 4 years ago +1

    This is awesome. Thank you.

  • @AjayKumar-mh9um
    @AjayKumar-mh9um 5 years ago +1

    Recommended for beginners

  • @radyoalmikyel6881
    @radyoalmikyel6881 4 years ago

    You dropped total_bill in X = tips_dummy, no?

  • @steveoshaughnessy3736
    @steveoshaughnessy3736 4 years ago +5

    Excellent tutorial. Very detailed. I have one gripe though, and it's not about Daniel. EVERYONE/EVERY tutorial does this: they name their DataFrame df. That's like naming your spreadsheet "spreadsheet" or "ss", or naming a variable by its datatype. No one ever names age "i" or "int". They name their variables after the real-world things they are. And a DataFrame is a variable. DataFrames should be named like we name spreadsheets (their tabs) or database tables.

    • @rje4242
      @rje4242 3 years ago

      Hungarian notation has a place in Python. Including the type in the name tells you what type it should be, though you need type checking and asserts to guarantee that.

  • @srinivasdasari6614
    @srinivasdasari6614 4 years ago

    At 2:10:49 you split the Series directly without using .str.split('/'). How does it split the DataFrame Series? In the previous example we used .str.split when splitting. Please explain.

    • @woOpPerjr
      @woOpPerjr 4 years ago +1

      I think you're talking about the "function" example/question.
      I'm not using .str.split there, because that is how you call split on a pandas Series. Here we're writing a function that takes in a single string, so we have direct access to the string methods; it's really just regular "my_string".split("_") in base Python.
      We then apply the function to our data.
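
The point made in the reply above can be sketched as follows. The function name `first_part` and the sample values are hypothetical, chosen only for illustration. Inside the function, `value` is an ordinary Python string, so the built-in `str.split` method is available directly; `Series.apply` then calls the function once per element.

```python
import pandas as pd

def first_part(value, sep="_"):
    # `value` is a single Python str here, not a Series,
    # so plain str.split works without the .str accessor.
    return value.split(sep)[0]

# Illustrative data, not the tutorial's dataset.
s = pd.Series(["Cases_Guinea", "Deaths_Liberia"])

# apply passes each element to first_part one at a time.
via_apply = s.apply(first_part)

# The accessor form operates on the whole Series at once.
via_accessor = s.str.split("_").str.get(0)
```

Both forms give the same values; the accessor is needed only when the method is called on the Series itself.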

  • @surajviswakarma254
    @surajviswakarma254 4 years ago +1

    Can I have access to the notes you have, please?
    Or does anyone else have them?

  • @rahularanger407
    @rahularanger407 2 years ago

    Why does my output include NaN values, unlike the table shown in ua-cam.com/video/5rNu16O3YNE/v-deo.html? For example, for the day "Thur" it shows both Lunch and Dinner (Dinner has NaN), but in the video there's only Lunch.
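
One possible cause of the extra NaN rows, offered as an assumption rather than a confirmed diagnosis: if the grouping columns are categorical (seaborn's copy of the tips data loads `day` and `time` that way), a groupby with `observed=False` reports every combination of categories, including ones that never occur in the data. The small frame below is made up to reproduce that effect.

```python
import pandas as pd

# Made-up rows; "Thur" never occurs together with "Dinner".
tips = pd.DataFrame({
    "day": pd.Categorical(["Thur", "Thur", "Fri", "Fri"],
                          categories=["Thur", "Fri"]),
    "time": pd.Categorical(["Lunch", "Lunch", "Lunch", "Dinner"],
                           categories=["Lunch", "Dinner"]),
    "tip": [2.0, 3.0, 1.5, 4.0],
})

# observed=False: every category combination gets a row,
# so the unseen (Thur, Dinner) pair shows up as NaN.
all_combos = tips.groupby(["day", "time"], observed=False)["tip"].mean()

# observed=True: only combinations present in the data are kept.
seen_only = tips.groupby(["day", "time"], observed=True)["tip"].mean()
```

The default value of `observed` differs between pandas versions, which could explain why the same code produces different tables on different installations; passing it explicitly avoids the ambiguity.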

  • @vittorio8087
    @vittorio8087 4 years ago +1

    Great tutorial, great Daniel :) Thanks

  • @adds5257
    @adds5257 4 years ago +2

    Here I need to remember the syntax, whereas Excel shows you the average value: just drag over your data and the average is shown.

    • @vijaypalmanit
      @vijaypalmanit 4 years ago

      Yeah, but you can't automate any reporting in Excel; with pandas you only need to write the code once for any report, and from then on you can reproduce it.

  • @maxbart1353
    @maxbart1353 4 years ago +1

    i need an extra tutorial for that

  • @DanWhalen
    @DanWhalen 4 years ago +1

    What OS is that? Is he on KDE neon?

    • @ankushm3t
      @ankushm3t 4 years ago

      KDE for sure.

    • @woOpPerjr
      @woOpPerjr 4 years ago

      I run/ran arch (antergos) with KDE.

  • @tunkyi7162
    @tunkyi7162 5 years ago

    The coding window should be full screen; I can't see it very well.

    • @woOpPerjr
      @woOpPerjr 5 years ago +1

      thanks for letting me know. I just realized the other day that I can get a little more screen real estate by hitting F11 so I'll be sure to do this in the future.

  • @MrEstate
    @MrEstate 5 years ago

    Is the Slack Channel still working? I can't find it.

    • @enthought
      @enthought 5 years ago

      Sorry, Robb. The SciPy 2019 Slack Channel is no longer active.

    • @KukaKaz
      @KukaKaz 4 years ago

      @enthought Hi! Where can I find the dataset and the code to follow along? I couldn't find it on Daniel Chen's GitHub page. Could you please send me the link or email me at tolekbaeva@bk.ru? Thank you!!

    • @elliottscott666
      @elliottscott666 4 years ago

      @KukaKaz I just found it today by searching on GitHub using the description in the video

  • @calluma8472
    @calluma8472 5 years ago

    Does the audio artifact on this video ever stop? Driving me crazy.

    • @RAL2010
      @RAL2010 4 years ago

      It's his phone; he should have switched it off.

    • @woOpPerjr
      @woOpPerjr 4 years ago +2

      @RAL2010 Oh, I did not know that's what caused it. :\ I use my phone for my teaching notes. Since it's a live coding session, it would be super disruptive to tab back and forth on the screen...
      It might also be that the phone was plugged in and charging. Would the interference be just from the charger? Or does putting it in airplane mode help?

  • @hamoda510
    @hamoda510 2 years ago

    Thanks Daniel

  • @mohammadghanatian114
    @mohammadghanatian114 4 years ago

    He was really a nice guy

  • @habrom1000
    @habrom1000 2 years ago

    2:00:00 vectorize: this is useful for me

  • @da_ta
    @da_ta 4 years ago

    Excellent, thank you

  • @yasseralkindi7350
    @yasseralkindi7350 3 years ago +1

    great content, annoying crackling noise :(

  • @pavansaitadepalli6097
    @pavansaitadepalli6097 5 years ago

    excellent

  • @kamilwhite8139
    @kamilwhite8139 3 years ago

    million likes

  • @anveshicharuvaka4650
    @anveshicharuvaka4650 4 years ago

    Melt around 50:00

  • @inhyeokbaek6258
    @inhyeokbaek6258 4 years ago

    24:35

  • @codingwithjoyk
    @codingwithjoyk 5 years ago

    you think. wowl

  • @Imran_et_al
    @Imran_et_al 3 years ago

    Pandas was never that easy elsewhere

  • @maxbart1353
    @maxbart1353 4 years ago

    In Excel the pivot table stuff is much easier (for me at least)

  • @Dev-yv5xl
    @Dev-yv5xl 4 years ago +1

    Again you make video. Put that Mobile phone away from your mic.

    • @Michael-ur3zs
      @Michael-ur3zs 4 years ago

      He has great content, but that phone interference is so distracting