I'm teaching myself Python and getting from basic knowledge to machine learning level is made so much easier by little explanations like this. I have to comb through code to make sure I understand exactly what is happening and where, and your efforts speed up my learning to no end! thanks!
That's awesome to hear!
I like your videos very much because they are very clear. They make me love Python and data science.
Please don't stop making videos.
Thank you :)
Thanks for your kind words and encouragement! :)
I really liked the way you made your videos in the form of questions. So when I have a specific question I just have to look for that video
I don't know why I didn't find your channel till now...your videos are super helpful; these help me understand things that I always wanted to ask someone in person. Thanks for putting your simple videos out here! superlike!!
Thank you! 🙏
Omg I've been looking everywhere for an explanation on the inplace function and I don't know why everyone else just can't explain it properly when the explanation is so simple!!! Thank you so much, for teaching like a normal person.
So glad to hear that I was able to provide you with some insight! 😄
I now understand the point of the inplace parameter and why it is set to False by default. Great one again. Thanks!
You're welcome!
Very clear explanations. You are demystifying not only the 'inplace' parameter but many other pandas methods as well. Thank you so much.
You're very welcome! Thanks for your kind words! 🙏
Exceptionally well-presented course for Python. Great job!
Thanks!
Watched 3 videos until I finally found this one with a clear explanation. ty
You're welcome!
This is the most clear explanation for "inplace". Thank you so much!
You're very welcome!
Your video is awesome! I was looking for a quick explanation of this puzzling parameter, and you explained so much more than that! I will definitely check out more of your videos ^^ Thank you so much!!!!
You're welcome! And thanks for your kind comment :)
Lucky to see your video! Very clear and helpful! Especially for beginners.
Thank you!
A big thanks to you! I finally understand what it does. By the way, I really love the suggestions that point to another video when a concept is mentioned but not explained.
You're very welcome! And, I'm glad the suggested videos are helpful to you! I have been wondering whether it is worth the time to add those suggestions to each video :)
Please don't stop making videos.
I won't! :)
@dataschool bro, it's already been 5 months, please go on
You are great man! Not all heroes wear capes.
You are too kind! :)
Phenomenal explanation! Liked and subscribed, thank you for making this so easy to understand!
Awesome! Glad it was helpful to you!
Superb! You are a great teacher and I instantly got this info that you shared in my head...Thank YOU!
Thank you so much! 🙏
Thank you very much for your videos, they are very useful and easy to understand, they are helping me a lot!
Your presentation, voice and videos are excellent. Very informative. I have a request: if you could kindly make a "Comprehensive Video Playlist on Data Analysis", it would be awesome. Thank you Kevin for an awesome channel.
Thanks! Here's the playlist: ua-cam.com/play/PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y.html
no question here.. just a big thanks for your time!
You're very welcome! I enjoy creating these videos :)
These videos are great. I hope you will keep making them!
Thanks! This is video 20, and I'll be making at least 30 :)
That's a very clear explanation. Thank you!
Glad it was helpful!
I very much congratulate you for sharing code used in video with us. Many thanks for that. It is very much useful to me. My warm regards to you.
Thanks! Glad I could be of help!
Thanks for this useful explanation man!! Greetings from Perú :D
My pleasure!
Thank you so much for your effortless teaching style. I just started learning data analysis with Python and started watching your videos on a friend's recommendation. I see that most of your videos are 2 years old. By any chance, could you upload a video on how pandas has changed over these 2 years: what's new in pandas and what's going to become obsolete in the near future? That would be a great help. Also, could you recommend any book or study reference where I can learn more about data science with Python?
I have a few pandas resources on this page that may be helpful to you: www.dataschool.io/start/
I'm interested in learning more about the pandas melt() function. Could you recommend anything on that topic, or could you perhaps do a video on it?
Thanks!
Thanks for the suggestion, I'll see if I can cover it in a future video. I don't have any good resources on it off-hand, sorry!
Thank you very much. You gave me a point of understanding. It's really a cool option. :)
You're welcome!
Great! I was just looking for this everywhere before youtube suggested you. Thanks ;)
Great! You can watch the whole pandas series here: ua-cam.com/play/PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y.html
;)
Thank you for another great video!
Thank you Brenden! 🙌
Thanks for the info. Could you please explain whether we can save the cleaned data, once it is ready for the model, to a new CSV or Excel file for future use?
You would use the DataFrame method .to_csv()
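As a minimal sketch of that answer: the DataFrame below and its column names are made up for illustration, and the data is written to an in-memory buffer so the example runs without touching the disk.

```python
import io

import pandas as pd

# Hypothetical cleaned data; the column names are invented for this sketch.
cleaned = pd.DataFrame({"age": [22, 35, 58], "fare": [7.25, 53.5, 26.75]})

# index=False leaves the row labels out of the output.
buffer = io.StringIO()
cleaned.to_csv(buffer, index=False)
csv_text = buffer.getvalue()

# Reading the text back gives the same columns and rows.
restored = pd.read_csv(io.StringIO(csv_text))
```

In practice you would pass a file name instead of a buffer, for example cleaned.to_csv("cleaned.csv", index=False), where "cleaned.csv" is a placeholder. There is also a .to_excel() method for Excel files, which needs an Excel writer package such as openpyxl to be installed.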
Thank you Kevin, your videos are really informative. I have a question about using pandas for unstructured or semi-structured data, since most of your videos deal with structured data. Are there any examples showing how to use pandas for this, or does it provide any data structures for handling unstructured or semi-structured data?
Generally speaking, pandas is most useful when the data is already structured, but you can also use pandas to add structure to your data. However, other tools might be better for this task - it really depends upon the particulars of the data.
Hi Kevin, I have a question: what's the main difference between df = df.drop() and df.drop(inplace=True) in terms of efficiency?
Not sure off-hand, sorry!
Very helpful, thank you!
Glad it was helpful!
Hello Kevin! I have a question that came up when working on an ML problem. Sometimes, when you use sklearn's functions, you have to convert the df into a NumPy ndarray, which, as far as I know, will make you lose the column names of the df.
Now suppose I did something with a NumPy array that was previously a df. How do I convert it back, with the proper column names? For example, suppose you did a feature selection procedure that returned a boolean vector for which columns it used. How do I proceed if:
1. I just want to know which columns are marked with True or False and subset those.
2. I want to transform the NumPy array (like "unnormalizing" it or something) and name the columns properly (with or without the boolean conditions).
Lots of great questions! You are correct that NumPy arrays don't have column headers.
Let's pretend that you had a pandas DataFrame (df) and a NumPy array (arr) with the same contents. To recreate the DataFrame from the array, you can just use:
df2 = pd.DataFrame(arr, columns=df.columns)
Let's then say that you created a list of booleans (bool) that tells you which columns you're interested in. To select those columns, you can just use:
df2.loc[:, bool]
If you just want to know the names of those columns, you can use:
df2.columns[bool]
Does that help?
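Here is a runnable sketch of the three steps above. The small DataFrame and the boolean values are invented for illustration, and the boolean list is called mask rather than bool so that it does not shadow Python's built-in bool.

```python
import numpy as np
import pandas as pd

# Toy DataFrame standing in for the original data.
df = pd.DataFrame({"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [5.0, 6.0]})
arr = df.to_numpy()  # the column names are lost at this point

# Hypothetical feature-selection result: True marks a column that was kept.
mask = np.array([True, False, True])

# Rebuild the DataFrame, reusing the original column names.
df2 = pd.DataFrame(arr, columns=df.columns)

# Keep only the flagged columns, or just list their names.
selected = df2.loc[:, mask]
selected_names = df2.columns[mask].tolist()
```

This only works if the array still has the same columns in the same order as df. If a transformation dropped or reordered columns, the names have to be adjusted to match first.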
Love this series!
Awesome! Feel free to let me know if you have any suggestions for future videos.
OMG!!! AMAZING AS ALWAYS!!!!!!
I am becoming a fan of you man!
Thanks! Feel free to subscribe to the Data School newsletter if you haven't already :)
www.dataschool.io/subscribe/
Amazing videos. Thank you so much!
You're very welcome!
That was helpful, thanks a lot!
You're welcome!
Thank you, very well explained :)
Thank you!
Hey, I'm using a MacBook, and when I press the Shift button, the text box that appears in your videos doesn't show up. Is there any other way to display it on a Mac?
I'm sorry, I don't understand your question. Could you clarify? Thanks!
Amazing video.
Thank you!
Hi, how can I export a data set after I manipulate it?
For example:
df['Zoom_Name'].value_counts()
I want this data to be exported to CSV without affecting the main data set.
Great question! Convert the results to a DataFrame and then use the .to_csv() method.
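A small sketch of that suggestion, with made-up names in the Zoom_Name column. The output column name "count" is my own choice, and the CSV is written to an in-memory buffer so the example is self-contained.

```python
import io

import pandas as pd

# Toy data; only the Zoom_Name column name comes from the question.
df = pd.DataFrame({"Zoom_Name": ["Ann", "Bob", "Ann", "Cy", "Ann", "Bob"]})

# value_counts() returns a new Series, so df itself is not modified.
counts = df["Zoom_Name"].value_counts()

# Turn the Series into a two-column DataFrame with explicit column names.
counts_df = counts.rename_axis("Zoom_Name").reset_index(name="count")

# Write the counts as CSV text; a file path would work the same way.
buffer = io.StringIO()
counts_df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Because the counts live in their own object, exporting them leaves the main data set untouched.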
Could you do a video about handling large CSV files in Pandas (ex: over 3 gb)?
For example, is there a practical way to randomly sample rows?
Thank you for your videos
Great question! If you have already read the file into a DataFrame, there is a "sample" method you can use for random sampling. If you're trying to randomly sample rows as you read them in, I'd have to think about whether that can be done. I'll consider that for an upcoming video!
I featured your question in a new video that I just posted... hope it helps! ua-cam.com/video/oH3wYKvwpJ8/v-deo.html
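For readers who want a quick sketch before watching: the first case below uses the sample method mentioned above. The second case is one possible approach and not necessarily the one shown in the video. It relies on read_csv accepting a function for its skiprows parameter. The in-memory CSV, the seed and the 10% keep fraction are all assumptions made for the example.

```python
import io
import random

import pandas as pd

# Stand-in for a large CSV file: 1,000 data rows built in memory.
csv_text = "id,value\n" + "\n".join(f"{i},{i * 2}" for i in range(1000))

# Case 1: the file is already in a DataFrame, so sample() does the work.
df = pd.read_csv(io.StringIO(csv_text))
sample_after = df.sample(n=50, random_state=0)

# Case 2: sample while reading. skiprows can be a function that receives
# each row number and returns True for rows to skip. Row 0 is the header,
# so it is always kept.
rng = random.Random(0)
keep_fraction = 0.1
sample_during = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=lambda i: i > 0 and rng.random() > keep_fraction,
)
```

The second approach keeps roughly 10% of the rows rather than an exact number, but skipped rows are never loaded into the DataFrame, which is the point for files too large for memory.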
I just watched that video! Thank you for answering and featuring my question :)
You're welcome :)
Thank you Sheldon!
😄
You are a great instructor.
Thanks!
Could you do a video on pd.melt, please?
Thanks for the suggestion! I'll consider it for the future.
You are a gift sent from God, my dear friend. Thank you!
Wow, thank you!
Hi, I tried the same thing to impute missing values in the Titanic data set, but it is not working:
df[(df['Age'].isnull()) & (df['Pclass']==2)].fillna(29,inplace=True)
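A likely reason this has no effect: selecting rows with a boolean condition returns a new object, so fillna(inplace=True) modifies that temporary copy and not df. A minimal sketch of one way around it, using a tiny made-up stand-in for the Titanic data, is to assign through .loc:

```python
import numpy as np
import pandas as pd

# Tiny made-up stand-in for the Titanic data set.
df = pd.DataFrame({
    "Pclass": [1, 2, 2, 3],
    "Age": [38.0, np.nan, 30.0, np.nan],
})

# Rows where Age is missing and the passenger is in class 2.
mask = df["Age"].isnull() & (df["Pclass"] == 2)

# Assigning through .loc writes into df itself.
df.loc[mask, "Age"] = 29
```

Only the missing age in class 2 is filled. The missing age in class 3 is left as it was.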
So, when you set the inplace parameter to True, does it change the data at the source (the bit.ly file) or just the DataFrame that pandas created from it? Say I have an Excel file I'm working on, and I drop a column with inplace=True: will it alter the original Excel file?
Love your videos, by the way, I have learned so much!
inplace just changes the DataFrame, and not the source file. Glad you like the videos!
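A small sketch that checks this with a temporary CSV file. The file name and the two columns are made up for the example; the same holds for an Excel file.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical source file with two columns.
    path = os.path.join(tmp, "source.csv")
    pd.DataFrame(
        {"City": ["Ithaca"], "Colors Reported": ["RED"]}
    ).to_csv(path, index=False)

    # Load the file and drop a column in place.
    ufo = pd.read_csv(path)
    ufo.drop("Colors Reported", axis=1, inplace=True)

    # Read the file again: it still has both columns.
    reread = pd.read_csv(path)

columns_in_memory = list(ufo.columns)
columns_on_disk = list(reread.columns)
```

The file only changes if you write the DataFrame back out yourself, for example with .to_csv().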
Hi, thanks for your videos! I have a test case in which I have to read a file with a year/date column and split it by year. The requirement is that the DataFrame names should be like sales_2018, sales_2019, sales_2020. This will help me iterate over them in a for loop. Also, is there any way to parameterize Python code? E.g. I have a variable named Year=2018, and in the DataFrame statement I write sales_&year and it should get resolved to sales_2018, and so on. Thanks in advance. Ashish
I'm sorry, this is beyond what I can answer in a YouTube comment! If you want to ask a detailed question, you're welcome to join Data School Insiders and ask it in a webcast or in the forum: www.patreon.com/dataschool
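For readers with the same question, one common pattern (not from the video) is to keep the per-year DataFrames in a dictionary keyed by year rather than creating variables named sales_2018, sales_2019 and so on. The sketch below assumes a date column called "date" and an "amount" column, both invented for the example.

```python
import pandas as pd

# Made-up sales data; the column names are assumptions for this sketch.
sales = pd.DataFrame({
    "date": pd.to_datetime(
        ["2018-03-01", "2019-07-15", "2019-11-30", "2020-01-05"]
    ),
    "amount": [100, 250, 75, 310],
})

# One DataFrame per year, stored under the year as the dictionary key.
sales_by_year = {
    int(year): group
    for year, group in sales.groupby(sales["date"].dt.year)
}

# "Parameterizing" then becomes a dictionary lookup with a variable.
year = 2019
sales_for_year = sales_by_year[year]
```

Looping over sales_by_year.items() then visits every year in turn, which covers the for-loop requirement.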
Sir, I was looking for videos with step-by-step guidance on learning all the machine learning algorithms along with their use cases. Can you please help me with that?
I don't have such a video, sorry!
Dear Kevin, thanks for the great videos on pandas. Would you please also create a video on DataFrame.corr() and VIF?
Thanks
Thanks for the suggestion!
This video was helpful and very easy to understand, indeed.
While watching, a question came up which is not related to this topic but to python in general.
You were using Python from a web interface i.e. you were server side scripting and executing.
So my question is what is the name of this web interface?
And moreover, do you have some tips/hints/tutorial how to set up Python on server side and using a database and/or server directory?
I was thinking of using Python over the web interface for ad hoc data analysis, and my local Python client (Spyder/Anaconda) for routine calls.
I would be very happy if anyone can help me.
Thanks a lot in advance!
Ok, I got it: It is called Jupyter. Easiest installation: Download Anaconda.
Glad the video was helpful to you! Regarding your question, this interface is known as the Jupyter notebook (and was previously known as the IPython notebook). The second part of this video explains the notebook: ua-cam.com/video/IsXXlYVBt1M/v-deo.html
Thank you so much sir
You're very welcome!
But I have a question: why do some pandas methods have inplace while others don't?
It's hard to generalize, but it makes sense for some methods and not for others.
Does pandas have any function where inplace=True is the default?
Not that I can think of.
I didn't understand the difference between 'ffill', 'bfill' and 'pad'.
Does this help? pandas.pydata.org/pandas-docs/stable/missing_data.html#filling-missing-values-fillna
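As a quick illustration on a made-up Series: 'ffill' carries the last valid value forward, 'bfill' pulls the next valid value backward, and 'pad' is simply another name for 'ffill'. The sketch uses the .ffill() and .bfill() methods because recent pandas versions deprecate passing method= to fillna().

```python
import numpy as np
import pandas as pd

# Toy Series with two missing values in the middle.
s = pd.Series([1.0, np.nan, np.nan, 4.0])

# Forward fill: each gap takes the last valid value before it.
forward = s.ffill()

# Backward fill: each gap takes the next valid value after it.
backward = s.bfill()
```

Here forward becomes 1, 1, 1, 4 and backward becomes 1, 4, 4, 4.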
I am so used to his voice at 2x speed that normal speed feels weird now.
Ha! 😆
When I run ufo.tail, why does it show KeyError: "None of ['Time'] are in the columns"? Please, someone explain.
Are you using the tail method like this? ufo.tail()
Because you have already executed the statement before, so Time is not a column now. I made the same mistake. You have to restart the kernel and run all the cells from the beginning.
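A sketch of how this can happen, assuming the statement in question is a set_index call with inplace=True (my assumption about the cell that was run twice). The two rows of data are made up.

```python
import pandas as pd

# Made-up rows standing in for the ufo data set.
ufo = pd.DataFrame({
    "City": ["Ithaca", "Holyoke"],
    "Time": ["6/1/1930 22:00", "6/30/1930 20:00"],
})

# First run: Time moves from the columns into the index.
ufo.set_index("Time", inplace=True)
first_run_columns = list(ufo.columns)

# Second run of the same statement: Time is no longer a column,
# so pandas raises a KeyError.
try:
    ufo.set_index("Time", inplace=True)
    second_run_error = None
except KeyError as exc:
    second_run_error = exc
```

Re-reading the data, or restarting the kernel and running the cells from the top as suggested above, restores the original columns.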