Starting in pandas version 0.18.1, you can create a new datetime column directly from a DataFrame, based solely on the column names! It's a useful trick, which I explain in this video: ua-cam.com/video/-NbY7E9hKxk/v-deo.html
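A minimal sketch of that trick, assuming a DataFrame whose columns are literally named `year`, `month`, and `day` (the data here is made up for illustration):

```python
import pandas as pd

# Hypothetical data: the column names must be recognizable date parts
df = pd.DataFrame({'year': [2015, 2016], 'month': [2, 3], 'day': [4, 5]})

# Assemble a single datetime column from the year/month/day columns
df['date'] = pd.to_datetime(df[['year', 'month', 'day']])
print(df)
```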
Hello!
I have a table with the date column. I want to group the data by month / year how do I do this?
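One possible way to group by month or by year, sketched with a made-up `date` and `sales` column (both names are assumptions, not from the question):

```python
import pandas as pd

# Hypothetical table with a datetime column
df = pd.DataFrame({
    'date': pd.to_datetime(['2019-01-05', '2019-01-20', '2019-02-03', '2020-01-07']),
    'sales': [10, 20, 30, 40],
})

# Group by calendar month (year and month together)
by_month = df.groupby(df['date'].dt.to_period('M'))['sales'].sum()

# Group by year only
by_year = df.groupby(df['date'].dt.year)['sales'].sum()
print(by_month)
print(by_year)
```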
I love you Bruh.. 😂.. No homo..thanks a lot!!!!
I am an aspiring data scientist. I just found a series of your videos. Thank you for doing this for all of us. Keep doing great work!
Thanks for your kind words, and good luck to you!
These are the first videos I look for when I have pandas questions.
Thanks!
2022 and you're still saving us. Thanks for the excellent content
You're welcome!
You've saved my job on multiple occasions sir, thank you.
That's awesome to hear! 🙌
I'm impressed. Simple explanations with examples. subbed and hit that bell. Thanks for the vid!
Awesome, thank you!
A 5-year-old video, but the best on UA-cam!!!
Thanks very much for your kind words!
All your videos are worth watching. I have learned a lot about pandas just from your videos. Thanks a ton :)
Thanks Aarti!
I love the way you teach, easy to follow and to understand. Many Thanks.
You're welcome!
I've only recently stumbled onto your videos. Very clear and concise delivery. Good job!
Thanks so much! Glad you are enjoying them.
Hi DataSchool, thank you for the great video on working with dates and times. I am trying to work out how to group data by days, months and years in the same plot, e.g. bar graphs for months with different colours for the years.
So many great videos . Absolutely guidness. thanks from GREECE !
Thank you!
I've been watching your videos for a while now but never got the chance to comment on it i just want to say keep up the great work! You are just awesome!
Thanks very much for your kind words! Much appreciated :)
Thanks for this! You explain things very clearly and concisely.
Thanks!
That bonus is what I needed. Thank you so much!
You are so welcome!
This is EXACTLY what I was looking for. I love you.
Awesome!
You're king of this area man!!!!
Thank you!
Your video is really good. It would be really helpful if you made some more videos on dates and times.
Thank you.
I cover it a bit more in this series: ua-cam.com/play/PL5-da3qGB5IBITZj_dYSFqnd_15JgqwA6.html
thank you, Now I am planning to cover this series too.
Excellent videos.Please consider giving tutorials on time series forecasting ( with various statistical models ) with Pandas.
Thanks for the suggestion! I'll consider it for the future.
Excellent class! As always! Cheers!
Thank you!
Hello, I have a dataset with a datetime index column, and it is weekly data. Do I need to set freq='W' to apply forecasting models such as Holt-Winters? I tried df.index.freq = 'W' and got this error: "OverflowError: int too big to convert". Please help me fix this. Thank you
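The cause of that OverflowError can't be determined from the comment alone. As a hedged sketch, one common route is to let pandas infer the weekly frequency and then apply it with `asfreq`, rather than assigning to `index.freq` directly. The dates below are made up and happen to fall on Sundays:

```python
import pandas as pd

# Hypothetical weekly series whose index carries no frequency yet
dates = pd.to_datetime(['2020-01-05', '2020-01-12', '2020-01-19', '2020-01-26'])
s = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates)

# Infer the anchored weekly frequency from the timestamps themselves
inferred = pd.infer_freq(s.index)

# Conform the series to that frequency (missing weeks would become NaN)
s = s.asfreq(inferred)
print(inferred, s.index.freqstr)
```

Plain `'W'` means weeks ending on Sunday, so data anchored on another weekday needs the matching alias (for example `'W-MON'`).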
DataFrame.resample() is also a very, very useful feature in Pandas for working with time series.
Agreed!
When I try to convert my date column to a datetime type, it shows an "unknown string format" error.
I want to show the week of the month for each value in a datetime column. How can I do that? Please advise.
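"Week of the month" has no single definition. A minimal sketch, assuming the simplest one (days 1 to 7 are week 1, days 8 to 14 are week 2, and so on):

```python
import pandas as pd

# Hypothetical datetime column
s = pd.Series(pd.to_datetime(['2021-03-01', '2021-03-07', '2021-03-08', '2021-03-31']))

# Assumption: weeks are counted in blocks of 7 days from the 1st of the month
week_of_month = (s.dt.day - 1) // 7 + 1
print(week_of_month.tolist())
```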
wow.. your tutorials are just so awesome !!!!!
Thank you!
Very nice, just what I was looking for . Thanks!
You're welcome!
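I wrote this code after this video
import pandas as pd
date = input("Input your birthday:")
dt = pd.to_datetime(date)
today = pd.to_datetime('today')
# convert the elapsed time to hours so the message matches the number
hours = (today - dt).total_seconds() / 3600
print("Since you were born,", round(hours), "hours have passed")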
Cool!
How can I export this "datetime64" (YYYY-MM-DD HH:MM:SS) to a CSV file with the format "DD/MM/YYYY HH:MM" ?
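A small sketch of one way to do this: `to_csv` accepts a `date_format` string that is applied to datetime columns. The column names are made up, and the in-memory buffer only stands in for a real file path:

```python
import io
import pandas as pd

# Hypothetical frame with a datetime64 column
df = pd.DataFrame({'when': pd.to_datetime(['2019-07-31 23:21:06']), 'value': [1]})

buffer = io.StringIO()  # stands in for a file path such as 'out.csv'
df.to_csv(buffer, index=False, date_format='%d/%m/%Y %H:%M')
csv_text = buffer.getvalue()
print(csv_text)
```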
I have a series of " 23:21:06 31/07/2019" format how to convert this to a time series
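For strings shaped like that, passing an explicit `format` is probably the most reliable option. A sketch with made-up values, stripping the leading space first:

```python
import pandas as pd

# Hypothetical strings in 'HH:MM:SS DD/MM/YYYY' form with a leading space
s = pd.Series([' 23:21:06 31/07/2019', ' 08:00:00 01/08/2019'])

# Strip whitespace, then spell out the time-first, day-first layout
parsed = pd.to_datetime(s.str.strip(), format='%H:%M:%S %d/%m/%Y')
print(parsed)
```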
Thanks for the tutorial. I have a question.
How can i find unique items under a given column as some could have been repeated?
thank you for this, I have a lot to learn
You're welcome!
I hope you could add some time series data mining analysis for us in the future. I really want to know how to mine time series data in Panda. Thank you very much!
Thanks for the suggestion, I'll consider it for the future!
Kevin, could we define a ts_min and ts_max, and select the events during this interval?
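Something along these lines should work, assuming a datetime column called `Time` (as in the video's ufo data) and made-up events:

```python
import pandas as pd

# Hypothetical events with a datetime column
df = pd.DataFrame({
    'Time': pd.to_datetime(['2000-01-01', '2000-06-15', '2001-03-01']),
    'event': ['a', 'b', 'c'],
})

ts_min = pd.Timestamp('2000-03-01')
ts_max = pd.Timestamp('2000-12-31')

# between() is inclusive on both ends by default
selected = df[df['Time'].between(ts_min, ts_max)]
print(selected)
```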
Hi, Very useful basics covered in your videos, thank you very much!
You're very welcome!
Thank you for all the videos - you are the best!
Can you please explain what is dt? is that a class? I am not sure how to think about it.
I believe it's called an "accessor". I just think of it as a way to organize related attributes! Does that help?
@dataschool yes, thank you! so it's basically a method, right?
Bro can we use this date to split data into train and test date if yes how we can do that
Python is really not intuitive compared to R. Datetime comparison makes me tear my hair out. I'm still trying to figure out how to compare a date in a DataFrame (which is in datetime format) to today's date and do some action if they match. Any help is appreciated.
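A rough sketch of one approach: drop the time-of-day part with `normalize()` on both sides before comparing. The `due` column and its values are hypothetical:

```python
import pandas as pd

# Midnight at the start of the current day
today = pd.Timestamp.today().normalize()

# Hypothetical data: one timestamp from today, one from three days ago
df = pd.DataFrame({'due': [today + pd.Timedelta(hours=9), today - pd.Timedelta(days=3)]})

# Compare dates only, ignoring the time of day
df['is_today'] = df['due'].dt.normalize() == today

# Do some action for the matching rows
for due in df.loc[df['is_today'], 'due']:
    print('due today:', due)
```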
Hello, I have 2 large datasets and want to compare time differences by seconds for instance. I want to Group-by a certain column first, and then see the time differences or duration for a certain action. Can I do this in Python
I'm sure you can, but it's hard for me to say how off-hand. Sorry!
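One way this might look, assuming a grouping column `user` and a datetime column `time` (both names invented here): sort within each group, then take the difference between consecutive timestamps.

```python
import pandas as pd

# Hypothetical event log
df = pd.DataFrame({
    'user': ['a', 'a', 'b', 'b'],
    'time': pd.to_datetime([
        '2020-01-01 10:00:00', '2020-01-01 10:00:30',
        '2020-01-01 11:00:00', '2020-01-01 11:02:00',
    ]),
})

# Order events within each group, then diff consecutive timestamps per group
df = df.sort_values(['user', 'time'])
df['seconds_since_prev'] = df.groupby('user')['time'].diff().dt.total_seconds()
print(df)
```

The first row of each group has no previous event, so its value is NaN.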
This is an awesome video, but to_datetime is not working for me. It keeps giving me an error like "hour must be in 0..23: 10/11/2006 24:00"
I've tried everything
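The parser rejects "24:00" because valid hours run from 0 to 23. A possible workaround, assuming "24:00" means midnight at the end of that day and that the strings are month/day/year (both are assumptions about this data):

```python
import pandas as pd

# Hypothetical strings, one of which uses the invalid hour 24
s = pd.Series(['10/11/2006 24:00', '10/11/2006 13:30'])

# Remember which rows used 24:00, then rewrite them as 00:00
is_midnight = s.str.endswith('24:00')
cleaned = s.str.replace('24:00', '00:00', regex=False)

# Parse, then push the rewritten rows forward by one day
parsed = pd.to_datetime(cleaned, format='%m/%d/%Y %H:%M')
parsed = parsed + pd.to_timedelta(is_midnight.astype(int), unit='D')
print(parsed)
```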
how can i calculate median for column of type datetime64 ns
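A short sketch using `quantile(0.5)`, which works on datetime Series. Recent pandas versions also accept `.median()` directly, but that is not assumed here:

```python
import pandas as pd

# Hypothetical datetime64[ns] column with an odd number of values
s = pd.Series(pd.to_datetime(['2020-01-01', '2020-01-10', '2020-01-03']))

# The 0.5 quantile is the median
median_date = s.quantile(0.5)
print(median_date)
```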
How do i subtract dates? I need to get a range between my last date time and 6 months before. Thanks! Your videos are great.
I'm not sure off-hand, sorry!
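One possible approach is `pd.DateOffset`, which understands calendar months. A sketch with a made-up `Time` column:

```python
import pandas as pd

# Hypothetical datetime column
df = pd.DataFrame({'Time': pd.to_datetime(['2019-01-15', '2019-08-01', '2019-12-31'])})

end = df['Time'].max()
start = end - pd.DateOffset(months=6)  # calendar-aware: clamps to a valid day

# Rows within the last six months of the data
recent = df[df['Time'].between(start, end)]
print(start, len(recent))
```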
Big thanks, Kevin! Great job!
You're welcome!
thank you so much for this video, it saved me so much time, thank you. Wow. so good.
@Data School Do you have any video on how to fill missing dates with zero (desired number) in a large csv file?
ua-cam.com/video/fCMrO_VzeL8/v-deo.html
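Independently of the linked video, a minimal sketch of filling missing calendar days with zero, assuming one row per date and invented column names:

```python
import pandas as pd

# Hypothetical data with gaps on 2020-01-02 and 2020-01-03
df = pd.DataFrame({
    'date': pd.to_datetime(['2020-01-01', '2020-01-04']),
    'count': [5, 7],
})

# Reindex to a daily frequency; newly created rows get the fill value
filled = df.set_index('date').asfreq('D', fill_value=0)
print(filled)
```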
The video has a very nice explanation. My time data is only in hour:min:sec format, and when I convert it from object to datetime, the result also includes the current date in the timestamp. I want to fill the missing seconds values, so is there any other function available?
I'm not sure off-hand, I'm sorry!
Very helpful. Thank you so much bro
You're welcome!
I have a CSV file which consists of placeid and date. I want to calculate how many times a particular place is visited in a particular month. How could this be done in Python code? Please help @ Data School
Perhaps you could store the 'month' attribute from the 'date' column as a new column, then filter the DataFrame by 'placeid' column, and then check the shape of the DataFrame. This video might be helpful to you: ua-cam.com/video/2AFGPdNn4FM/v-deo.html
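As a variation on that idea, a sketch that counts visits for every place and month at once with `groupby` (the data is made up; the column names follow the question):

```python
import pandas as pd

# Hypothetical visit log
df = pd.DataFrame({
    'placeid': [1, 1, 2, 1],
    'date': pd.to_datetime(['2020-01-03', '2020-01-20', '2020-01-05', '2020-02-01']),
})

# Count rows per place and calendar month
visits = df.groupby(['placeid', df['date'].dt.to_period('M')]).size()
print(visits)
```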
Basically, how can I do a simple math operation like adding 2 months to the month of any date?
When I do it as follows, I get an error like:
'datetime.date' objects is not writable
***
dt = pd.to_datetime('2016/9/28')
dt.month = dt.month + 2
Great question! Here's how I would do it:
dt = pd.to_datetime('2016/9/28')
td = pd.to_timedelta('60 days')
dt + td
I don't think you can specify time units as months because of the ambiguity. It makes sense when you say 9/28 plus 2 months (11/28), but what about 12/31 plus 2 months (2/31?)
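For calendar months specifically, `pd.DateOffset` is one option. It resolves the ambiguity by clamping to the last valid day of the target month, as this short sketch shows:

```python
import pandas as pd

# 9/28 plus two calendar months lands on 11/28
a = pd.Timestamp('2016-09-28') + pd.DateOffset(months=2)

# 12/31 plus two months would be "2/31", so it is clamped to the end of February
b = pd.Timestamp('2016-12-31') + pd.DateOffset(months=2)
print(a, b)
```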
Holly teacher! :D
How should I answer "How many trips occurred per month?" and "Which hour of the day had the most trips?"
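A sketch of both questions, assuming a datetime column named `start` (a made-up name) holding each trip's start time:

```python
import pandas as pd

# Hypothetical trip start times
trips = pd.DataFrame({
    'start': pd.to_datetime(['2020-01-01 08:10', '2020-01-15 08:45', '2020-02-02 17:30']),
})

# Trips per calendar month
trips_per_month = trips.groupby(trips['start'].dt.to_period('M')).size()

# Hour of the day with the most trips
busiest_hour = trips['start'].dt.hour.value_counts().idxmax()
print(trips_per_month)
print(busiest_hour)
```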
Thank you, it helped a lot
Thank you. But I have a query:
if I want to get the data between two dates,
how can I do that?
I don't know the code off-hand, I'm sorry!
@dataschool OK, thanks. But according to some of my research, to get the data within a period we should set the DataFrame index to a timestamp type.
Thanks a lot Sir ! It was very Helpful.
Another thing i am stuck with is how to prepare a dataset for a Machine Learning Model which contains GPS Co-ordinates.
Any Guidance or Suggestions?
That's really a question of feature engineering, meaning that you need to decide how to best represent the data so that a model can learn from it. The best approach depends on the problem you are solving, but my default approach would be to keep the latitude and longitude as separate features, and represent them in decimal format (not degrees/minutes/seconds). Hope that helps!
Exactly what I wanted!! Thanks.
You're welcome!
Hi Kevin
I have a doubt..
I have 1 column in which time is 42368.149155
When I convert it into year, days, month, hrs, min, sec I am getting
1970-01-01, 17:22:13.453068
I read that 1970 is the default year. How can I convert it into some other year, say 2016 or any other year. Kindly help.
You might need to adjust the "unit", see examples here: pandas.pydata.org/docs/reference/api/pandas.to_datetime.html
@dataschool Thank You Kevin. Got it now.
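A value like 42368.149155 looks like a spreadsheet serial date (days since 1899-12-30 in Excel's convention). That is only a guess about this data, but under that assumption a sketch of the conversion would be:

```python
import pandas as pd

serial = 42368.149155  # value from the question

# Assumption: this is an Excel-style serial date, i.e. days counted from 1899-12-30
ts = pd.to_datetime(serial, unit='D', origin=pd.Timestamp('1899-12-30'))
print(ts)
```

Without `unit` and `origin`, the number is read as nanoseconds since 1970-01-01, which would explain the 1970 result.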
You made my day, thank you
You're very welcome!
i have a question panda time series where all my columns are dates and i want to find the hrs spent by employee. Pls help me with your Github id so can i can post details there. please help
I'm sorry, I won't be able to help. Good luck!
Please do a video on Kaggle market basket analysis
Thanks for your suggestion!
You are gem bro. Thank you
Thank you!
what do you do if your dates are in a dd/mm/yyyy format?
If pandas doesn't recognize the format, you have to tell it the format explicitly.
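For example, a minimal sketch for dd/mm/yyyy strings (the values are made up):

```python
import pandas as pd

# Hypothetical day-first date strings
s = pd.Series(['03/04/2019', '25/12/2019'])

# Spell out the layout so 03/04 is read as 3 April, not March 4
parsed = pd.to_datetime(s, format='%d/%m/%Y')
print(parsed)
```

Passing `dayfirst=True` instead of `format` is a looser alternative when the layout varies slightly.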
You have the best tutorials for Python.
Thank you so much!
Thanks for the tutorials. I want to compute the difference between two dates and return the result in integer. Much like the last example you showed. can that number of days be returned as integer? thanks
Actually, this already returns the result as an integer:
(ufo.Time.max() - ufo.Time.min()).days
Thank you so much, wonderful explanation
You're welcome!
AB['Date']=pd.to_datetime(AB.Date)
i am getting error Unknown string format: TOTAL
my format is dd/mm/yyyy in column Date of a data frame named AB showing dtype as object .
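The message suggests the column contains a non-date entry, the literal text "TOTAL", perhaps a summary row at the bottom of the file. That is an assumption, but if so, one sketch is to coerce unparseable values to NaT and drop them:

```python
import pandas as pd

# Hypothetical frame where the last row is a summary line, not a date
AB = pd.DataFrame({'Date': ['01/02/2019', '15/03/2019', 'TOTAL']})

# errors='coerce' turns anything that does not match the format into NaT
AB['Date'] = pd.to_datetime(AB['Date'], format='%d/%m/%Y', errors='coerce')

# Drop the rows that could not be parsed
AB = AB.dropna(subset=['Date'])
print(AB)
```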
THANK YOU! Very helpful video!! :)
You're very welcome! :)
Thanks for the video. Quick question: How can I change the data type of several columns at the same time?
Great question! See part 3 of this video: ua-cam.com/video/-NbY7E9hKxk/v-deo.html
Thank you very much. This has been quite helpful. :D
Excellent! You're very welcome :)
HI Kevin, I met a problem when I read the csv file with date time.
I use the following code to read the csv file, but got 2 warnings said PST and PDT can't be understood... Can you please help me solve this problem? thank you!
data = pd.read_csv("datetime.csv",parse_dates = ['date/time'])
Mar 3, 2019 12:16:44 AM PST
UnknownTimezoneWarning: tzname PST identified but not understood. Pass `tzinfos` argument in order to correctly return a timezone-aware datetime. In a future version, this will raise an exception.
category=UnknownTimezoneWarning)
It's hard to say without investigating, I'm sorry!
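The warning comes from dateutil, which does not know what "PST" or "PDT" mean. A hedged sketch of what the warning asks for: parse the strings with an explicit `tzinfos` mapping. The fixed UTC offsets below are an assumption that these abbreviations mean US Pacific time:

```python
import pandas as pd
from dateutil import parser

# Assumption: PST/PDT are US Pacific standard/daylight time, as offsets in seconds
tzinfos = {'PST': -8 * 3600, 'PDT': -7 * 3600}

# Hypothetical values in the same layout as the question
raw = ['Mar 3, 2019 12:16:44 AM PST', 'Jul 4, 2019 09:00:00 PM PDT']

# Parse each string with the mapping, then normalize everything to UTC
parsed = [parser.parse(text, tzinfos=tzinfos) for text in raw]
as_utc = pd.to_datetime(parsed, utc=True)
print(as_utc)
```

In practice the column would be read as plain strings first (without `parse_dates`) and converted afterwards.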
Thank you for the video!
You're welcome!
How do I subtract two times and output only the time, without days?
Not sure off-hand, sorry!
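One reading of "without days" is to fold the days into the hour count. A sketch with a hypothetical helper, `format_hms` (not a pandas function), assuming a non-negative difference:

```python
import pandas as pd

def format_hms(delta):
    """Hypothetical helper: format a non-negative Timedelta as HH:MM:SS, days folded into hours."""
    total = int(delta.total_seconds())
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f'{hours:02d}:{minutes:02d}:{seconds:02d}'

# Subtracting two timestamps gives a Timedelta of 1 day 02:30:00
delta = pd.Timestamp('2020-01-02 10:30:00') - pd.Timestamp('2020-01-01 08:00:00')
text = format_hms(delta)
print(text)
```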
As always (literally), Great video! Quick question, Is it trivial to combine the times from two independent time series' then resample both time series' (the data at those times) to the new combined times? I can't quite figure out how to do this pythonically. pandas.resample doesn't seem to be too friendly when using irregular timestamps.
Thanks for your kind comment! Regarding your question, it's hard for me to envision exactly how to do this... I'm sorry! Feel free to let me know if you figure out a good strategy.
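One strategy that might fit: take the union of both indexes, reindex each series onto it, and interpolate by time. This is only a sketch with made-up data, and it assumes linear-in-time interpolation is acceptable for the values:

```python
import pandas as pd

# Two hypothetical series sampled at different, irregular times
a = pd.Series([1.0, 3.0], index=pd.to_datetime(['2020-01-01 00:00', '2020-01-01 02:00']))
b = pd.Series([10.0, 20.0], index=pd.to_datetime(['2020-01-01 01:00', '2020-01-01 03:00']))

# Combined set of timestamps from both series
combined = a.index.union(b.index)

# Put each series on the combined index and fill the gaps by time-weighted interpolation
a2 = a.reindex(combined).interpolate(method='time')
b2 = b.reindex(combined).interpolate(method='time')
print(pd.DataFrame({'a': a2, 'b': b2}))
```

Timestamps before a series' first observation stay NaN, since there is nothing to interpolate from.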
What about if I want to know some rows for a time interval? For example, between 2000 and 2001
I think something like this would work:
ufo[(ufo.Time.dt.year >= 2000) & (ufo.Time.dt.year < 2001)]
Really helpful! Thanks :)
Great, Sir. And please don't ask us to like the videos; our hands automatically click on like after watching this.
Ha! :)
I am getting a "no attribute" error on weekday_name. Why is that?
try.... ufo.Time.dt.day_name() in new version of Jupyter
Hi Kevin,
Just like we have 'weekday_name' to know whether it is 'Sunday', 'Monday' etc., what should we use to find the month name?
I searched the pandas docs for help, but was not able to find it.
Kindly suggest.
Great question! I'm not sure if there is a built-in way in pandas to do this. I would probably write my own code to do this using the map method, explained here: ua-cam.com/video/P_q0tkYqvSk/v-deo.html
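In newer pandas versions there is a built-in for this: the `dt` accessor offers `month_name()`, alongside `day_name()`, which replaced `weekday_name`. A short sketch with made-up dates:

```python
import pandas as pd

# Hypothetical datetime column
s = pd.Series(pd.to_datetime(['2019-03-03', '2019-12-25']))

# Both are methods, so the parentheses are required
month_names = s.dt.month_name()
day_names = s.dt.day_name()
print(month_names.tolist(), day_names.tolist())
```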
awesome videos. been watching quite a few now.
So, I'm playing with my gpx running data. And I'm trying to convert the duration of my runs so I can plot them. But I just fail. How would you convert ints like 33:28 and 01:44:42 so it would be understood as 33 minutes and 1 hour 44 minutes and so on?
Glad you like the videos! As for your question, it seems like extracting the datetime attributes (hours, minutes, seconds) and then doing the math with those attributes would solve your problem. Hope that helps!
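Another possible route is to treat the values as durations rather than datetimes. A sketch with a hypothetical helper, `to_duration`, that pads "MM:SS" to "HH:MM:SS" before calling `pd.to_timedelta`:

```python
import pandas as pd

def to_duration(text):
    """Hypothetical helper: parse 'MM:SS' or 'HH:MM:SS' into a Timedelta."""
    if text.count(':') == 1:
        text = '00:' + text  # no hour part given, so assume zero hours
    return pd.to_timedelta(text)

# Run durations from the question
runs = pd.Series(['33:28', '01:44:42'])
durations = runs.apply(to_duration)

# A plain number (minutes) is usually easier to plot
minutes = durations.dt.total_seconds() / 60
print(durations.tolist(), minutes.tolist())
```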
This is helpful, thank you.
You're welcome!
thank you this was really helpful !
You're welcome!
Thanks for the awesome video.....
I want to create a DateTimeIndex that contains each business day of year 2017 and use it to index a series of random numbers..
I'm not sure how to do this off-hand, I'm sorry!
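A minimal sketch using `pd.bdate_range`, which generates Monday-to-Friday dates. Note that it does not exclude public holidays, and the seed is arbitrary:

```python
import numpy as np
import pandas as pd

# Every Monday-to-Friday date in 2017 (holidays are NOT removed)
idx = pd.bdate_range('2017-01-01', '2017-12-31')

# Random numbers indexed by those business days
rng = np.random.default_rng(0)
s = pd.Series(rng.random(len(idx)), index=idx)
print(s.head())
```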
Hi there, congrats for the videos.
I'm new in Python and starting working with financial data.
So, I'm looking for plotting chart skipping/hiding missing dates/weekends. Can you help on how to do that?
I'm sorry, I won't be able to help... good luck!
For some reason this throws an error:
ufo.Time.dt.weekday_name
but
ufo.Time.dt.weekday
works. Any idea why?
Thank you
Hello, my date column is missing the day of the month. How do I add the day to the existing column?
Ex: My column is 04-1982 (which is not in date format) and I want to make it 30-04-1982, and repeat that for all the other rows. Please help. And how do I add a date if there is no date available?
I'm sure there's a string method that can help: ua-cam.com/video/bofaC0IckHo/v-deo.html&index=12&list=PL5-da3qGB5ICCsgW1MxlZ0Hq8LL5U3u9y
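An alternative to string methods, sketched under the assumption that the goal is the last day of each month (as in the 30-04-1982 example): parse "MM-YYYY" with an explicit format, then roll forward to the month end.

```python
import pandas as pd

# Hypothetical month-year strings
s = pd.Series(['04-1982', '12-1990'])

# Parsing without a day gives the 1st of each month
first_of_month = pd.to_datetime(s, format='%m-%Y')

# MonthEnd(0) rolls each date forward to the last day of its own month
month_end = first_of_month + pd.offsets.MonthEnd(0)

# Back to day-month-year text, if strings are needed
as_text = month_end.dt.strftime('%d-%m-%Y')
print(as_text.tolist())
```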
Is there any way to separate the time (h,m,s) that's on the same column of an imported excel data frame and pass it all to seconds?
I'm sorry, I don't understand your question. Could you clarify? Thanks!
Easy made. smart tutor
Thanks!
Thanks for great videos! You are great tutor! Can you tell how I can change date representation in Jupyter lab or notebook ?My columns with dates already have type as datetime64 but representation when I use head() looks like year-month-day. How I can change it to day-month-year ?
Not sure off-hand, sorry!
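One workaround, sketched with a made-up `date` column: keep the datetime64 column for calculations and add a string column formatted with `strftime` for display.

```python
import pandas as pd

# Hypothetical datetime64 column
df = pd.DataFrame({'date': pd.to_datetime(['2019-07-31'])})

# The new column holds strings, so it is for display only
df['date_display'] = df['date'].dt.strftime('%d-%m-%Y')
print(df.head())
```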
Exception has occurred: AttributeError
'DataFrame' object has no attribute 'Time'
I keep getting this error when I try to use: df['date'] = pd.to_datetime(df.Time)
Perhaps you read the file incorrectly or have a typo somewhere?
I know im late but what if i want to change the years of the dates in my data frame, how would i do that?
Perhaps you can overwrite the year attribute of each item? There might be a better way, however.
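The `year` attribute itself is read-only, so here is a sketch of two alternatives with made-up dates: `Timestamp.replace` to set a specific year, or `pd.DateOffset` to shift by a number of years.

```python
import pandas as pd

# Hypothetical datetime column
s = pd.Series(pd.to_datetime(['2014-03-15', '2015-07-01']))

# Set every date to the same year (Feb 29 would fail for a non-leap target year)
same_year = s.apply(lambda ts: ts.replace(year=2016))

# Or shift every date forward by two calendar years
shifted = s + pd.DateOffset(years=2)
print(same_year.tolist(), shifted.tolist())
```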
Great tutorial!
Thanks! Glad you liked it.
Great video, save a lot of code for me. Very good :)
Thanks!
Hello, suppose that the "Time" column is a date-of-birth column. How can age in years be calculated? Thanks in advance for your help
I think you would just put the current time in a new column, and then subtract, and then only use the years part of the result.
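A sketch of that idea that avoids dividing by 365: subtract the birth year from the current year, then take one off if this year's birthday has not happened yet. The dates are made up and "today" is fixed so the result is reproducible:

```python
import pandas as pd

# Hypothetical birth dates
dob = pd.Series(pd.to_datetime(['1990-06-15', '1990-06-16']))
today = pd.Timestamp('2020-06-15')  # swap in pd.Timestamp.today() for real use

# Has this year's birthday already happened (or is it today)?
had_birthday = (dob.dt.month < today.month) | (
    (dob.dt.month == today.month) & (dob.dt.day <= today.day)
)

# Year difference, minus one where the birthday is still to come
age = today.year - dob.dt.year - (~had_birthday).astype(int)
print(age.tolist())
```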
Thanks. The time you find to answer every time is really appreciated.
You're very welcome! :)
Hi Kevin,
It was a wonderful session. Is there a method to convert timedelta64[ns] to time (hh:mm:ss)?
Thanks
Are you asking how to change the datatype, or the way in which the data is displayed?
while plotting it in x axis, its shows as decimal value(1.2,3.5 etc...). i want it show in hh:mm . However, currently i am retrieving hours(using components,hours) and plotting it. Is there a better way?
I'm not sure the best way to do that. Sorry!
there is no ad, like it.
Thanks!
Hello, what if I want to convert two date objects variables to datetime format? I tried, but keep getting error.
Give an example?
got it. thanks for asking anyway
Thanks a lot
YOU'RE THE BEST
Thank you!
Thank you
You're welcome!
I've created a dataframe in python using pandas. The index used is a series of timestamp of type int64. However, for time series analysis, the index need to be type dates. Can somebody help me to do the conversion ?
first few rows of the dataset is
'Elapsed time','ECG I'
'hh:mm:ss.mmm','mV'
'0:00.000',-0.08
'0:00.002',-0.08
'0:00.004',-0.07
'0:00.006',-0.07
'0:00.008',-0.09
'0:00.010',-0.09
'0:00.012',-0.10
'0:00.014',-0.10
'0:00.016',-0.10
thanks in advance :)
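Since these are elapsed times rather than calendar dates, a TimedeltaIndex may be a reasonable fit. A sketch assuming the values are minutes:seconds.milliseconds with the quotes already stripped (the third row is invented to show the minutes part):

```python
import pandas as pd

# Hypothetical elapsed-time strings in 'm:ss.mmm' form and their readings
elapsed = pd.Series(['0:00.000', '0:00.002', '1:30.500'])
values = [-0.08, -0.08, -0.07]

# Split into minutes and seconds, then combine into total seconds
parts = elapsed.str.split(':', expand=True).astype(float)
seconds = parts[0] * 60 + parts[1]

# Build a TimedeltaIndex and use it as the index of the readings
index = pd.TimedeltaIndex(pd.to_timedelta(seconds, unit='s'))
ecg = pd.Series(values, index=index)
print(ecg)
```

If real dates are required, adding the index to a known recording start time (a `pd.Timestamp`) would give a DatetimeIndex.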
How to convert date and time to seconds so that i can use date time as continous variable
I'm not sure, sorry!
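One common approach is seconds since a fixed reference point. A sketch using 1970-01-01 as the reference, which is a conventional but arbitrary choice:

```python
import pandas as pd

# Hypothetical datetime column
s = pd.Series(pd.to_datetime(['1970-01-02 00:00:00', '2020-01-01 12:00:00']))

# Elapsed seconds since the reference timestamp, as floats
seconds = (s - pd.Timestamp('1970-01-01')).dt.total_seconds()
print(seconds.tolist())
```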
How do I find the difference between the current date and a series of dates? Also, I don't have any DataFrame of dates. How do I make one?
I think these two pages might be helpful to you:
pandas.pydata.org/pandas-docs/stable/timeseries.html
pandas.pydata.org/pandas-docs/stable/timedeltas.html
Good luck!
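As a starting point, a sketch that builds a series of dates with `pd.date_range` and subtracts it from a reference date. The reference is fixed here so the output is reproducible:

```python
import pandas as pd

# Make a series of dates from scratch
dates = pd.Series(pd.date_range('2020-01-01', periods=3, freq='D'))

# Fixed reference date; use pd.Timestamp.today() for the current date
reference = pd.Timestamp('2020-01-10')

# Difference from each date, as whole days
days_ago = (reference - dates).dt.days
print(days_ago.tolist())
```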
Thank you.
Hello, I was wondering if you have done any videos of matplotlib.
Not yet! But thanks for the suggestion!
and timeseries?
Nope!
I think you should also have an example of a full analysis, from scratch to prediction, using pandas and scikit-learn