This is my 2nd project im doing along with you. Looking forward to the next one! Looks fun so far
Wow, this is an incredible dive into solving real-world data science problems! 🔥 Loved how you broke down complex concepts like regression so clearly at 0:47, and the step-by-step coding walkthrough at 2:29 was spot on. 💻 Task #4 (41:32) on fitting a linear regression model with sklearn was especially helpful! Great work on making data science approachable for everyone. Looking forward to seeing more content like this! 👍 #DataScience #Python #RegressionModeling
Thank you so much for these uploads, hope you continue them in the future
Please do more of these, this is a perfect way to learn ML. Please please please do one for each algorithm
Keith loved this. Very useful for refreshing on linear regression.
Please do more of those! So helpful!
Thank you so much for the real-time project, it's helpful
Awesome, was waiting for It!
Thanks @Keith
thank you so much for this tutorial!
Loved it🎉. 13 children moment was awesome 😂
Thanks keith ❤ !!
You're welcome!
@ 20:52 and 21:21 there's a null value in charges. I checked the raw csv and found some '$nan' entries which didn't get dropped coz we first did .dropna, and then .strip(), I think?
Yeah retroactively looking at what happened, those entries didn't get dropped when we did our first dropna(). And then when we stripped the '$' and converted to a float type, they became a new null value. We handle this later in the video with an additional dropna(). Good catch!
@@KeithGalli yes! this walkthrough was insightful. Thank you for the content.
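For anyone following along, here is a minimal sketch of the ordering issue described above. The four-row frame is made up for illustration (it is not the video's dataset); only the `charges` column name and the '$nan' string come from the thread.

```python
import pandas as pd

# Hypothetical miniature of the raw CSV: charges arrive as strings.
raw = pd.DataFrame({"charges": ["$1200.50", "$nan", None, "980.25"]})

# The first dropna() only removes true missing values; the string "$nan" survives.
step1 = raw.dropna()

# Stripping "$" leaves the string "nan", which the float conversion turns into NaN.
step1 = step1.assign(charges=step1["charges"].str.strip("$").astype(float))
nulls_after_convert = int(step1["charges"].isna().sum())

# A second dropna() after the conversion removes the newly created null.
clean = step1.dropna(subset=["charges"])
print(nulls_after_convert, len(clean))
```

Converting first and dropping nulls once afterwards would avoid needing the second dropna().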
Always helpful and educational. I have a folder on my laptop with your name on it, containing the things I have learned from you. Just a quick question: how do you record your desktop?
Great video.
I'm curious what presenting a model to stakeholders looks like.
I can’t seem to find that
It varies, but I often like to spin up either a Streamlit or Shiny app that is connected to my model and can show what the model outputs for different input values. Stakeholders often like this because they can interactively understand the types of values the model produces.
@@KeithGalli That’s great. Thank you.
Hi Keith, thank you for the detailed walkthrough. One question, please: in real life, how are these models maintained and run each month? For example, if I'm running a linear regression on similar monthly data at my company, should I just run it in a Jupyter notebook linked to Git? Please share any best practices. Thanks again!
Get a task scheduler like Apache Airflow. Enqueue tasks that do the calculations and dump the results into a database on a schedule. Wake up on Monday and retrieve the already prepared data from your database.
Hi, it was a very nice explanation with a real-world dataset application.
I appreciate your effort, your very clear programming skills, and your thinking
about affordable medical charges for the population of the United States.
Congratulations, good job on the regression model analysis with machine learning.
I like it
🥰👍
Hey Keith,
great video as always. I have 2 questions:
First: Are Jupyter Notebooks used a lot in a professional setting, especially for problems involving creating models?
Second: Why did you only dummy encode 3 of the regions? Is there any advantage to excluding one of them, or is this just an efficiency thing?
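On the second question, a small sketch may help. With one dummy column per category, the columns always sum to 1 in every row, so they duplicate the model's intercept and the coefficients are not uniquely determined; dropping one category makes it the baseline. The region names below are assumed for illustration, and `drop_first=True` is one common way to do this in pandas, not necessarily how the video did it.

```python
import pandas as pd

# Four assumed region labels, one row each.
df = pd.DataFrame({"region": ["northeast", "northwest", "southeast", "southwest"]})

# One column per region: every row sums to 1, which is redundant with an intercept.
full = pd.get_dummies(df["region"], dtype=int)
row_sums = full.sum(axis=1).tolist()

# Dropping the first category leaves 3 columns; "northeast" becomes the baseline
# and is represented by all three dummies being 0.
reduced = pd.get_dummies(df["region"], drop_first=True, dtype=int)
print(row_sums)
print(list(reduced.columns))
```

Each remaining coefficient then reads as the difference from the baseline region.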
I live in Leeds 🇬🇧 I'm glad that this isn't a real-world problem for me
So in order to follow along, we have to purchase a subscription to the website, right? Because you chose a premium-user-only dataset?
Nope! I added the data to my Github and linked that in the description :)
@ oh my bad I didn’t see that, thank you!!
24:06
So cool!
It would be nice to see how to make it available on an HTML website
Watched till the end.
I am curious, why didn't you use ChatGPT? Also, do we have to create pipelines all the time?
n-dimensional hyperplane*
Thank you for the correction! You're right, 'n-dimensional hyperplane' is the proper terminology I should have used when describing fitting a linear regression model in a space with more than two dimensions 🙂
love u
Why is it that every "data scientist" has only rudimentary statistics and econometrics knowledge? The model that you are building is highly biased. You're not even checking for heteroscedasticity?
Bro this is clearly a beginner video made to get people started. Heteroscedasticity would require another 30 minutes explaining.
Then what is the intention of this video? Teaching people how to build biased models? There should be at least a mention of the possibility of bias.
@@Kurtosis3- Can you please share your teaching videos?
@@Kurtosis3 you are right, brother. What's the point of throwing biased data analysis at people?
Isn't that CNN's job?