Learn Exploratory Data Analysis (EDA) from Scratch | EDA in 5 hours | Satyajit Pattnaik
- Published 15 Sep 2023
- Are you ready to unleash the power of data analysis and gain valuable insights in just 5 hours? Look no further! In this comprehensive video course, we'll guide you through the fundamentals of Exploratory Data Analysis (EDA) from scratch, helping you unlock the potential of your data like never before.
📊 Dive into the world of EDA: You'll learn how to navigate and understand your data, identify patterns, and reveal hidden trends. From data visualization techniques to statistical analysis, this course covers it all!
📈 Gain practical knowledge: Our hands-on approach ensures that you not only grasp the theoretical concepts but also apply them to real-world scenarios. Through interactive exercises and examples, you'll develop the skills to clean, preprocess, and analyze data with confidence.
🔍 Uncover valuable insights: EDA is all about uncovering the story behind the data. We'll teach you how to ask the right questions, interpret your findings, and communicate your insights effectively. Whether you're a beginner or an experienced professional, this course will take your data analysis skills to the next level.
Timeline:
✅ 2:35 Agenda
✅ 5:20 DA/DS Process
✅ 11:58 What is EDA?
✅ 15:16 What is Visualization?
✅ 20:17 Steps in EDA
✅ 20:20 Data Sourcing
✅ 24:50 Data Cleaning
✅ 47:23 Feature Scaling
✅ 1:19:25 Outlier Treatment
✅ 1:42:42 Invalid Data
✅ 1:47:43 Types of Data
✅ 1:50:36 Types of Analysis
✅ 1:51:00 Univariate Analysis
✅ 2:02:26 Bivariate Analysis
✅ 2:07:47 Multivariate Analysis
✅ 2:43:38 Derived Metrics
✅ 2:48:19 Feature Binning
✅ 3:06:03 Feature Encoding
✅ 3:37:41 Case Study
🔴 Code: github.com/pik1989/EDA ❤️❤️
Subscribe to our YouTube channel and press the bell icon to get regular updates👇: bit.ly/3tt2eNY
Join our Telegram Channel For Exclusive Data Science Resources👍 :
bit.ly/3FzObsk
Check Out Our Other Helpful Videos:😍
➮ Data Analyst vs Data Scientist vs Data Engineer - Roles || Responsibilities & Skills
bit.ly/3qsg2pL
➮Live Implementation of End To End Machine Learning Project With Deployment | Customer Churn
bit.ly/3FqPQjF
➮ Build your own Alexa in 30 minutes using Python | NLP | Data Science
bit.ly/3FFBWdP
✅✅✅ Follow us on: ✌️
👥 FACEBOOK: bit.ly/3I89SB4
📸 INSTAGRAM: bit.ly/3GLSvFn
💬 LINKEDIN: bit.ly/3fmwmCa
🔴 TELEGRAM: bit.ly/3MjBODX
🔴 TELEGRAM Discussion Group: bit.ly/3mcusay
Hello friends, I am Satyajit Pattnaik. On my channel you will find information about Data Science & Analytics that will help you become an expert Data Scientist or Data Analyst, along with loads of interesting and useful projects.
More great stuff coming soon, keep supporting & learning 🎓
THANKS FOR WATCHING 😊
Music by: www.bensound.com
License code: OSLSCHN8ZUGZL0W2
Files, Codes & Data: github.com/pik1989/EDA (Don't forget to provide stars in the Github repo)
The churn_modelling file has no missing values in the codebase, can you please update the file.
The dataset you provided for handling missing values shows no null values.
@pdivyanshupandey104 you have to use the CustomerChurn.csv file.
@SatyajitPattnaik None of the CSV files in the data folder have missing values.
Do I need to build any model, such as linear regression, in the EDA part?
1:19:25 OUTLIER TREATMENT
1:28:51 OUTLIER TREATMENT PRACTICAL
1:42:40 INVALID DATA
1:47:35 TYPES OF DATA
1:58:38 7 TYPES OF ANALYSIS
2:02:26 BIVARIATE ANALYSIS
2:07:48 MULTIVARIATE ANALYSIS
2:09:20 NUMERICAL ANALYSIS
2:13:12 PRACTICALS
2:43:38 DERIVED METRICS
2:48:21 FEATURE BINNING THEORY
2:55:10 PRACTICALS
3:06:06 FEATURE ENCODING
3:16:03 PRACTICALS
3:37:44 CASESTUDY
This is the only proper EDA video available on YouTube, very simple and clear explanation. Thank you for your hard work.
✌️
Thank you so much! Really needed an in-depth, but concise overview of EDA and this was just the video :)). Much thanks.
Never saw a proper explanation of EDA on a YouTube channel. Great content 👍 and thanks for sharing it.
Not sure why this video has so few views. This is one of the most comprehensive videos for learning a lot of important concepts, along with practical Python and pandas implementation, which is crucial for a data scientist.
Hello Satyajeet, I have found so many videos on EDA, but no one explains it correctly. I asked so many data scientists, but they did not explain it properly. Thank you very much for explaining it so smoothly.
Great work, thank you very much for the simple and clear explanation.
No one can teach EDA better than youuu for sure 👍
🔥🔥
Very informative video about EDA. Looking forward to the dataset.
Great content never seen such a content in youtube from anyone.
very simple and clear explanation. Thank you
NEAT CLEAN SIMPLE UNDERSTANDING OF EDA
You are simply amazing... I didn't get bored while watching this entire video. I completed it within 2 days, which shows how amazing the whole video is. Looking forward to watching more videos from your channel👍
Very well explained. Thanks
Really awesome content as well explanation
The video is insightful and I learned a good number of topics. But a heads up to people looking into this video: before checking it out, please have a working knowledge of scikit-learn along with Matplotlib and Seaborn. Overall, thank you for your efforts and contribution to the data community's betterment.
thank you satyajit.......The only video on You tube that explains EDA in depth. Thank you so much for your efforts!
So nice of you
Amazing work done .
very informative.... thank you sir!.....👌👍
Very informative and helpful
The video is posted one year ago but it is still totally worth watching..
This is the best video on EDA to date. So much depth and clarity. Thank you @Satyajit
Thanks, does that deserve a shoutout on Linkedin via a post 😝
Thanks for this video. Please try to do entire video on excel, sql separately and a video on inferential statistics and hypothesis testing
Outstanding and Neat and Simple way of teaching with Details..Love From Pakistan....
Thank you very much sir 🎉
"Perfect" is too small a word for this video... Absolutely... What to say, brother... Amazing video. You kept the structure excellent. Love from Pakistan
Awesome ❤
Great effort sir❤
Great job! You give bigger picture and it is easier to understand this topic.
3:04:07 checkout bar chart. 0-20 group doesn't have 6146
thnks buddy , very informative ,👌👌
nice video ... keep doing these kinds of video
Brother, very good 😊
This video is fantastic! The concepts are explained so clearly, covering all the basic topics in such an easy-to-understand manner. I really enjoyed it and found it extremely helpful. Thank you for such a great lecture!
Welcome 💪
big fan sir 😍
Thank You Sir❣
Succinct and very helpful. Please be regular. I would like to see you cover all the topics to become a data Scientist.
write
Thanks for this video! It was really helpful in my learning process.
Brother hats off to you man.... best video for EDA i would say.... all the very best inshallah you will get millions of subscribers soon
I wish God fulfils your request ✌️🤣
Definitely, definitely 😂😇 @SatyajitPattnaik
trust me guys this video is the best way to learn about EDA. dont forget to take notes .🤩
💪
Soon you will get the recognition for your talent and your effective teaching and presenting skills. This channel will reach 500k subs for sure.
☺️
Thanks, Satyajit. The video has been a blessing to me. It is so practical and beginner-friendly
Glad to hear that
Awesome lecture on point with examples too
I am glad you liked 😀
nice video
Watched full video. What a work mahnn!!!.
Thanks 🙏
Thank you so much, thanks a lotttt🎉
You're welcome 😊
Sir, I am getting a "File not found" error when I run the code on the dataset, even though I saved the file and used "Copy as path".
good
Very insightful ! Thank you.
yes its true
It was very helpful
@ 5:09:00 a row should be deleted if it has an insignificant number of missing values? Is this right? Shouldn't a row be deleted if it has a significant number of missing values?
Can you please drop a csv file for handling data which is scratch one.
I opened the GitHub repo and loaded the file in pandas, and it didn't have any columns with missing values
In the feature scaling part, the standard deviation for the income is 7874.6 in one place and 9643.65 in another. How do we get the number?
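One possible source of two different standard deviations for the same column, offered here as an assumption rather than something the video confirms: pandas' `Series.std()` divides by n - 1 by default (sample standard deviation), while NumPy's `np.std` and scikit-learn's `StandardScaler` divide by n (population standard deviation). With only a handful of rows the gap is large. The income values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical income values; not taken from the video's dataset.
income = pd.Series([30000.0, 40000.0, 50000.0])

# pandas default: ddof=1, i.e. divide the squared deviations by n - 1.
sample_std = income.std()

# ddof=0 divides by n, which matches np.std's default behaviour.
population_std = income.std(ddof=0)
numpy_std = np.std(income.to_numpy())

print(sample_std, population_std, numpy_std)
```

With three rows the two results differ by a factor of sqrt(3/2), so it is worth checking which convention a tool uses before comparing numbers.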
Sir do you have EDA and Data Visualization notebooks done for any other different dataset. If yes please provide the link.
Very informative ✨
Glad it was helpful!
Want EDA on Bank Financial Loan project.. pls upload this type EDA .. thank you ❤❤❤❤.. and thank you for this video❤❤❤
Can you keep the link of data which is before performing EDA?
Why are we using log scale at 1:36:30? pls explain
Very informative compilation of complete EDA process.Thank you. Please do upload data files & python file so that one can practise while watching videos.
200 likes and 50 comments, just 13 comments away 😀
I am just 15 minutes in and many of the concepts have been cleared. In most of the courses they didn’t tell why we do EDA. Hats off to this instructor….
Thanks 💪
@@SatyajitPattnaikmy pleasure
@@fiza_Aslam Is the initial 15-20 minutes enough to do the EDA, I'm too impatient, can't complete this 5 hour video😢
@shivam586gupta I didn't say that you can be a master in 15 minutes. I just said that after watching the first 15 minutes I got to know the main reasons why we do EDA, which I think is the most important thing. You will have to take the whole course to learn complete EDA.
@fiza_Aslam I understood, but still, 5 hours 😵💫, I even finish movies within an hour 😅.
So, just wanted to know the content else profiling report is also an option.
How u thought I can be a master?🤣🤣
which dataset is to be used for handling null values part? the dataset you provide doesn't have any null value in them
@@johnxina7496 let me cross check again
Where is the dataset having missing values for Churn Modelling? All the values in the given dataset are filled.
Thank you. I benefited very much from the video.
sir Do you have any courses on other topics that I can learn from?
Yes, i have bunch of courses, pls reach out to me over whatsapp: +91 8237040802
Hi sir, where can I find a similar churn dataset that is not used by many people? Most people use the easily available datasets, and to stand out in this competition as a fresher I want to work on less-used datasets.
You won't find one :)
If i send a dataset, 100s will see and use it :)
Do your own research on this, or else you can use chatGPT to generate csv files for practice
Love from odisha 🎉❤
Can u plz tell me did u access the code ?
It was amazing EDA video.. can you please suggest some great and unique data science project which i can build to create my POC for the resume selections.
hi anurag , i was too looking for the same. do you have any. wanna disscus?
There can be many project ideas. If you are looking for NLP: resume parsers, text-to-speech, speech-to-text, text summarization, subtitle generation etc. can be really good projects.
Into DL, you can work on any image classification, voice classification or image captioning projects :)
Hi, thanks for the video. Can you tell me when we should actually remove the rows that have null values in a column? Or is it necessary to always replace missing values with something like the average or SD?
Also, I am working on www.kaggle.com/datasets/claytonmiller/cubems-smart-building-energy-and-iaq-data but am unable to find out the relation of temperature, relative humidity and light with energy consumption. If you have worked on it, can you please provide some insights? Thanks in advance
Sir there are no missing values for Gender and Age in the churn modeling file. can u provide the updated file
Use the Data/CustomerChurn.csv file in the Github repository
Amazing I am also a Data Science Student was facing issues in EDA. But your videos just clear all my doubts.
Thank you Very much.
& One more thing any source to practice EDA by yourself?
😀
There are already few EDA projects on my channel, else just search EDA datasets on google, you will find tons of Kaggle repos
Can you suggest me a dataset for my college final year project
Do I need to build any model, such as linear regression or KNN, in the EDA part?
No
thanku so much for this video
Welcome :)
can i get the notes of this session
@@SatyajitPattnaik
@@rajaryan6792 check pinned comments, files and codes are given
sir at the timestamp 3:05:44 i saw that the graph is wrong the range of 0-20=87 but it display the 21-40 age count can u check it
@@SatyajitPattnaik
Seen EDA videos. Great. Do you have any online EDA course? Interested.
Yes, i have an end to end DA program, pls ping me on whatsapp: +918237040802
Thanks for your EDA videos. Where did you learn Data Analysis course? Can you suggest me a good online platform to learn EDA?
Yes, i have an end to end DA program, pls ping me on whatsapp: +918237040802
1:38 hr - anything below 40 and greater than 80 considered as outlier- what is 40 and 80 here ? are these the length found from the anomaly function?
Based on ±3 standard deviations, we wrote that anything below 40 or above 80 is an outlier
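The ±3 standard deviation rule mentioned in the reply can be sketched as follows. This is a minimal illustration on made-up numbers, not the notebook from the video, so the bounds it produces are not the 40 and 80 discussed above; the names `values`, `lower` and `upper` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 200 typical values between 55 and 65 plus two extreme ones.
typical = np.tile([55.0, 58.0, 60.0, 62.0, 65.0], 40)
values = pd.Series(np.concatenate([typical, [5.0, 140.0]]))

mean, std = values.mean(), values.std()

# Anything further than 3 standard deviations from the mean is flagged.
lower, upper = mean - 3 * std, mean + 3 * std

is_outlier = (values < lower) | (values > upper)
outliers = values[is_outlier]
cleaned = values[~is_outlier]

print(f"bounds: {lower:.1f} to {upper:.1f}, flagged: {len(outliers)}")
```

Note that the extreme values themselves inflate the mean and standard deviation, which is one reason some analysts prefer an IQR-based rule for skewed data.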
Gem for datascience Aspirant
Stop using this fucked up word aspirant for everything
5:03:00 done ✔️
5point summary
"Churn" has multiple meanings depending on the context. Here are the two most common interpretations:
1. Customer Churn: In business, churn typically refers to the rate at which customers stop using a company's services or products. This can be measured in various ways, such as the percentage of customers who cancel subscriptions, close accounts, or stop making purchases within a specific period.
A high churn rate can be alarming for a business, as it indicates lost revenue and potentially dissatisfied customers. Businesses often analyze churn data to identify reasons why customers are leaving and implement strategies to reduce churn and retain customers.
2. Employee Churn: Churn can also refer to the rate at which employees leave a company. This metric is often used to assess employee satisfaction and company culture. High employee churn can be costly for businesses due to training new employees and the loss of institutional knowledge.
Both your interpretations have the same meaning 😀
Hi @SatyajitPattnaik the ChurnModelling.csv dataset you provided on your github has no null values please upload corrected Dataset.
Use the Data/CustomerChurn.csv file in the Github repository
Hi @SatyajitPattnaik, the CustomerChurn.csv file also has no null values, and its columns are different from ChurnModelling.csv. Please upload ChurnModelling.csv with null values. Thanks!
@AmbarGharat It's the same dataset. If you feel it's not, just open it, create some null values and practice
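The suggestion to create some null values and practice could look like the sketch below. The DataFrame is a small made-up stand-in for a churn file (column names and values are assumptions); on a real file you would start from `pd.read_csv(...)` instead. It also shows the two usual ways of handling the gaps: dropping rows and imputing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a churn dataset.
df = pd.DataFrame({
    "Age": [42.0, 41.0, 39.0, 43.0, 44.0, 50.0],
    "Gender": ["Female", "Female", "Male", "Female", "Male", "Male"],
    "Balance": [0.0, 83807.86, 159660.80, 0.0, 125510.82, 113755.78],
})

# Blank out a few cells on purpose so there is something to clean.
df.loc[[1, 4], "Age"] = np.nan
df.loc[[2], "Gender"] = np.nan

print(df.isnull().sum())  # number of missing values per column

# Option 1: drop every row that contains at least one missing value.
dropped = df.dropna()

# Option 2: impute. Median for a numeric column, mode for a categorical one.
imputed = df.copy()
imputed["Age"] = imputed["Age"].fillna(imputed["Age"].median())
imputed["Gender"] = imputed["Gender"].fillna(imputed["Gender"].mode()[0])
```

Dropping is usually reasonable when only a small share of rows is affected; imputing keeps the rows at the cost of introducing estimated values.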
Hi Satyajit,
I got an error while working with the dataset.
In Types of Analysis, numerical analysis, while running corr() on the dataset, I got the error "could not convert string to float".
The dataset has the column "Surname".
How do I fix it?
Thank you ❤
Drop that column before doing corr()
convert categorical columns to numerical
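Dropping the text column before calling `corr()`, as suggested in the replies, can be sketched like this. The rows are made up and only shaped like a churn dataset; `Surname` is the column named in the question, and the other column names are assumptions.

```python
import pandas as pd

# Hypothetical rows with one text column that corr() cannot handle.
df = pd.DataFrame({
    "Surname": ["Hargrave", "Hill", "Onio", "Boni"],
    "Age": [42, 41, 39, 43],
    "Tenure": [2, 1, 8, 1],
    "Exited": [1, 0, 1, 0],
})

# Option 1: drop the text column explicitly.
corr_dropped = df.drop(columns=["Surname"]).corr()

# Option 2: keep only the numeric columns, whatever they are called.
corr_numeric = df.select_dtypes(include="number").corr()

print(corr_dropped)
```

On pandas 1.5 or later, `df.corr(numeric_only=True)` should give the same result without an explicit drop.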
1:38:09 what is 40 and 80? Where did these numbers come from? You did not explain this.
2 months and no reply. Yet Satyajit has time to heart the more recent comments.
Sir, please try to provide the slides and code files; with those our understanding will improve more. In the end I just want to say I have never seen teaching ability and content like what you are providing.
I understand EDA but I still have one doubt. How does understanding relationships between variables in a dataset help us in machine learning? We can just run feature selection and build a model. Can you explain?
EDA mostly helps in identifying important KPIs. In ML, yes, we have feature selection techniques that are far more advanced, but EDA helps too
I have a question.. are numpy and pandas included in this 5 hour video?
No, you can find separate videos on NumPy and Pandas on my channel, or simply search “numpy satyajit” or “pandas satyajit” and you will find those videos
Sir, when I use the correlation method I get the error "could not convert string to float". How can I solve this problem?
That means some of the columns contain strings. Convert them to numerical values using feature encoding, then run corr()
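Encoding the string columns before `corr()` might look like the sketch below. It uses a simple 0/1 mapping for a two-valued column and one-hot encoding for a multi-valued one; the column names and values are made up, and the video may use a different encoder.

```python
import pandas as pd

# Hypothetical rows with two categorical (string) columns.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Male", "Female"],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Age": [42, 41, 39, 43],
    "Exited": [1, 0, 1, 0],
})

# Two-valued category: map each label to 0 or 1.
df["Gender"] = df["Gender"].map({"Female": 0, "Male": 1})

# Multi-valued category: one-hot encode into indicator columns.
encoded = pd.get_dummies(df, columns=["Geography"], dtype=int)

# Every column is numeric now, so corr() runs without the conversion error.
corr = encoded.corr()
print(corr.round(2))
```

One-hot encoding avoids implying an order between categories, which a plain integer label per country would do.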
Okay thankyou sir
to learn Time series analysis is important for data analyst?
@@aamirgaming4475 not mandatory
anyone help me with the csv file plss i am not able to find it
Check video description
Sir, I am not able to load the CSV after saving it. It shows me an error. Sir, please help with how to download it on a laptop and run the file to practice
@sugandhakashyap9672 what's the error?
@SatyajitPattnaik Actually I am practicing in Jupyter, on Missing Values at the very beginning. I saved the folder and tried "Copy as path", but it shows me that no file exists
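One common cause of a "no file exists" error after "Copy as path" on Windows, given here as a guess since the actual error message is not shown: the pasted path is wrapped in double quotes, and backslashes inside a normal Python string can be read as escape sequences. The helper below is a hypothetical sketch (`load_csv` is not part of pandas) that strips the quotes and reports the notebook's working directory when the file is missing.

```python
from pathlib import Path

import pandas as pd


def load_csv(path_text: str) -> pd.DataFrame:
    """Read a CSV after cleaning up a pasted path (hypothetical helper)."""
    # "Copy as path" wraps the path in double quotes; remove them.
    path = Path(path_text.strip().strip('"'))
    if not path.exists():
        raise FileNotFoundError(
            f"No file at {path}. The notebook is running in {Path.cwd()}"
        )
    return pd.read_csv(path)


# Usage with a hypothetical path. The r prefix keeps the backslashes literal:
# df = load_csv(r'"C:\Users\me\Downloads\CustomerChurn.csv"')
```

Placing the CSV in the same folder as the notebook and passing only the file name is often the simplest fix.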
38:57 i can't find any null values
can i get this ppt for interview preparation?
If you go through my Linkedin posts, you will find it 😀
does he teach pandas numpy etc in this video?
There are separate videos on Pandas, NumPy and Matplotlib on my channel, just search “Pandas Satyajit” or “numpy Satyajit” you will get those videos
at 1:38 hr what are 35 and 75 value. please explain
Is it 1:38 or 1:38:00?
@@SatyajitPattnaik 1:38:00
wonderful session ⭐👏
sir please add the comments
Hello sir. I want to do my project in eda with python
How can I help you?
Sir one doubt is necessary to handle missing values while doing data analysis
100%
2:34:27
Why will anyone go through 5 hours if data set is not provided
Any 1 hour video which provides me data to practice with it is more beneficial to me
Its upto you to go through or not, i have worked hard for this and theres a small request for viewers to share this video as much as possible and get 200 likes (its not a huge ask) and post which i will make the data public
And btw this is a part of my paid content which i made public for free, dont i deserve 200 likes?
Its not like i am not going to share the data, right, i have shared data in all of my videos in past!!
I understand and I can see the hard work. I am only saying that according to this current rule you are only punishing your first viewers.
And as a viewer it is not possible for me to watch a 5 hour long video if at the end I am gonna have nothing to add to my resume.
Your channel deserves more likes and views but this is not the right way as far as my knowledge goes.
Will wait for 200 likes. Best wishes
@user-gq1ij you can put in some minor effort to spread the video; if 10 of you do the same, getting 100 likes is not a big deal. In any case, the concepts I have taught don't need my code: you can go through the video and write the code yourself. I understand that providing the code would make it easier for you, but I can only share it once I get what I want, so just wait a few days or try spreading this video 😀
@SatyajitPattnaik I was not asking for code. Only the dataset.
Sir can u share the dataset. Likes already surpassed 200 please go easy on comments this time
Codes will be provided on Monday 🙏💪
hello satyajit i want to talk with you in one subject is it possible to talk with you
Please reach out on Linkedin: www.linkedin.com/in/satyajitpattnaik