Tutorial 23-Univariate, Bivariate and Multivariate Analysis- Part2 (EDA)-Data Science
- Published 7 Oct 2024
- If you are looking for career transition advice and a real-life data scientist journey, please check the link below
Spring board India YouTube url: / channel
Please join as a member in my channel to get additional benefits like materials in Data Science, live streaming for Members and many more
/ @krishnaik06
github url: github.com/kri...
Connect with me here:
Twitter: / krishnaik06
Facebook: / krishnaik06
instagram: / krishnaik06
Your teaching skills are damn good, man. Keep it up, lots of respect.
One small correction. That Hue is pronounced "Hiu" instead of "Hui". You are making absolutely great content. Love them all. Keep growing. (Y)
But I like how he pronounced 'HUII' :D
slave mindset ?
Thank you so much for this. I don't know why I was unable to understand this concept before. Thanks for this.
The best explanation about these variates ...
Great job. Your sincerity shows. Wonderful effort.
Just one tiny correction: for the univariate plot, the x label should be Sepal Length. All the rest is good. Thanks Krish
You are great, sir. I am really grateful for your videos. Thank you so much, sir.
I love when krish calls Hue as Huiii
Thank you
X lab should have been 'Sepal length' instead of 'Petal Length'
I came in comment box to check same
Thank you so much sir . Great explanation
Really helpful. Thanks
Wow, what a nice explanation! 👌 👋
Another easy way to do the bivariate plot at 11:20 is sns.scatterplot(x=df['sepal_length'], y=df['sepal_width'], hue=df['species'])
Thanks for the tutorial. Please arrange related tutorials in proper sequence.
Pretty badass :) Thanks!
@Quincy Sebastian please provide me an account :/
So here are the objectives you can achieve using this statistical method:
1) Which features have a good impact on your model
2) Which type of algorithm you should choose
You need to have x label as sepal length in univariate analysis.
Wow...
Thanks for the excellent tutorial..!
But this works well for classification problems. How should we perform a similar analysis for a regression problem?
Should univariate, bivariate and multivariate analysis be done before or after data pre-processing? Please reply.
after
Thanks Sir!
Interesting method to plot univariate, I generally create scatterplots to make similar deductions in terms of what kind of classifier will make sense.
Here's some sample code:
import matplotlib.pyplot as plt
from sklearn import datasets
iris = datasets.load_iris()
X = iris.data
y = iris.target
F = iris.feature_names
fig, ax = plt.subplots(1, len(F), figsize=(15,2))
for i,f in enumerate(F):
    ax[i].scatter(X[:,i],y, c=y)
    ax[i].set(xlabel=f)
    ax[i].get_yaxis().set_visible(False)
Maybe I am wrong, but based on the feature you used for the univariate analysis, shouldn't the xlabel be "sepal length" instead of "petal length"?
Yes, it's sepal length. There may be a mistake.
Sir, can you make a video on EDA using only Python? I mean, what are the necessary steps in EDA?
In the uni-variate analysis, why do you put all data points on the same level? By putting them onto different levels, e.g. by setting np.zeros_like()+0, np.zeros_like()+1 and np.zeros_like()+2, it will be very clear that these 3 data sets overlap very heavily as opposed to what you say @9:00 (unless I have misunderstood what you said there). Otherwise great lectures, thanks a lot!
great suggestion!
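A minimal sketch of the suggestion above, under some assumptions: it uses the iris data bundled with scikit-learn rather than the CSV from the video, and the variable names are illustrative only. Each species gets its own horizontal level via `np.zeros_like(values) + level`, which makes any overlap in sepal length easier to see.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

# Assumption: scikit-learn's bundled iris data stands in for the video's DataFrame
iris = datasets.load_iris()
sepal_length = iris.data[:, 0]  # column 0 is "sepal length (cm)"

fig, ax = plt.subplots(figsize=(8, 2))
for level, name in enumerate(iris.target_names):
    values = sepal_length[iris.target == level]
    # Shift each species onto its own horizontal line: y = 0, 1, 2
    ax.plot(values, np.zeros_like(values) + level, 'o', label=name)

ax.set_xlabel('Sepal length (cm)')
ax.set_yticks(range(len(iris.target_names)))
ax.set_yticklabels(iris.target_names)
```

In a notebook the figure renders on its own; in a script, call `plt.show()` afterwards.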
The code on line 17 needs to be modified as follows:
sns.FacetGrid(df,hue="species").map(plt.scatter,"petal_length","sepal_width").add_legend();
plt.show()
Hello Sir, could you please help me out with multivariate correlation through SPSS??
Isn't multivariate analysis a consolidated representation of bivariate analysis, where all possible pairwise combinations are shown together?
Hi, I have a doubt. These plots are fine for small datasets and interesting while learning, but do such graphs help when handling real-time data or working on real data science projects?
Use the DataExplorer package in R
Thank you very much for your great videos.
However, this is the first video of your playlist that I could not understand. The dataset was not clear and you did not explain much.
Sir, can you please make a video on univariate, bivariate and multivariate analysis using SPSS?
Just use the graph node and plot your histograms and scatter plots for all the variables you require.
Question: is it possible to use categorical features to make predictions for a numerical target variable?
can you also include link to dataset used
Why not just plot histograms for every feature for univariate analysis?
Sir, I think the xlabel should be 'sepal length' instead of 'petal length'. Am I right or wrong?
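For comparison, here is a rough sketch of what per-feature histograms could look like. It is not the method from the video; it assumes scikit-learn's bundled iris data, and the bin count and transparency are arbitrary choices.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

# Assumption: scikit-learn's bundled iris data, not the CSV used in the video
iris = datasets.load_iris()
X, y = iris.data, iris.target

fig, axes = plt.subplots(1, X.shape[1], figsize=(15, 3))
for i, ax in enumerate(axes):
    for level, name in enumerate(iris.target_names):
        # Overlay one semi-transparent histogram per species
        ax.hist(X[y == level, i], bins=15, alpha=0.5, label=name)
    ax.set_xlabel(iris.feature_names[i])

axes[0].set_ylabel('count')
axes[0].legend()
fig.tight_layout()
```

Unlike the points-on-a-line plot, a histogram also shows how many samples fall in each range, so heavily overlapping species are harder to miss.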
sir can you provide some practice dataset
Hi Krish, what would the R code be for the same analysis?
Hi Krish, why are you keeping the y-axis at 0? It was not explained in the previous lecture either. In the graph you just kept it at 0.
Please reply.
Hey, he's just trying to visualize the dependency of the output feature on that particular feature, i.e. "petal_width", so there is no need for a y-axis. If you want, you can put x = 0 and plot the values on the y-axis instead, and you end up with a vertical stack :)
If we have more than 10 or 20 features, how can we do multivariate analysis? Will it be clearly visible in a pairplot?
why put semicolons after your lines of code?
❤❤❤❤❤❤❤❤❤❤
Are those 4 plots along with the diagonal density plots?
So in the multivariate plots, if we see graphs with overlapping variables like sepal length and sepal width, can we ignore one of them in any further analysis? Please help here
What if we have dimensions in the order of 100s?
Sir, what are virginica and versicolor?
Sir, what web address are you using, and is it free or paid? Please give some details about that as well.
How do we do EDA when we have many features, say 20+, and all are uncorrelated?
Hello sir, huge fan, following your ML playlist. I'm getting an error with StringIO. I also watched a YouTube video but I'm not able to solve the error; it says "No module" something. Can you please guide me? I'm stuck in your 7th playlist. Please let me know, sir, it will be helpful.
How did the orange and green colours come into the picture, since we didn't pass any colour parameters like palette or color?
Colors are automatically assigned if you don't mention them in the parameters
After executing the same code for univariate analysis, my output is not coloured by species as shown in the video. Can anyone help?
Sir, how can we get the data?
Hello sir, how do we find the categories of given data in Python? For example, here we want to know the species categories.
If you are talking about getting the unique values in species, then the following code will help:
for unique numbers of species - iris_data['Species'].nunique()
for names of those unique species - iris_data['Species'].unique()
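A small self-contained sketch of the two calls above. The tiny `iris_data` frame here is a made-up stand-in for the real dataset, and `value_counts()` is an extra call added for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the iris DataFrame discussed in the thread
iris_data = pd.DataFrame({
    'Species': ['setosa', 'setosa', 'versicolor',
                'virginica', 'virginica', 'virginica'],
})

n_species = iris_data['Species'].nunique()            # number of distinct categories
species_names = iris_data['Species'].unique()         # distinct labels, in order of first appearance
species_counts = iris_data['Species'].value_counts()  # number of rows per category

print(n_species)
print(list(species_names))
print(species_counts)
```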
Sir, how much do we need to know to get a job in data science? (Are there any bounds?)
My personal recommendation would be to start with Python, the basics of SQL and a couple of ML algorithms, e.g. regression. It all comes down to how many projects you have actually created. Good luck 👍
When I import iris in Python, no commands work. I get errors such as "AttributeError: info" and "AttributeError: describe". Please help: why am I getting this error?
Sir, every time I run the code I get the error message "name df is not defined". Can you please help me?
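One plausible cause, assuming the data was loaded with scikit-learn's `load_iris()`: that function returns a dict-like `Bunch`, not a pandas DataFrame, so it has no `info()` or `describe()` methods. The sketch below builds a DataFrame from it; the column name `species` is an illustrative choice.

```python
import pandas as pd
from sklearn import datasets

iris = datasets.load_iris()   # a Bunch (dict-like container), not a DataFrame
print(type(iris).__name__)

# Build a DataFrame so that .info() and .describe() become available
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = pd.Categorical.from_codes(iris.target, iris.target_names)

df.info()
print(df.describe())
```

If the data was loaded some other way, the cause may be different; checking `type(...)` of the loaded object is a quick first step.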
Try loading the data once again.
Hi Krish, when I execute plt.plot(df_setosa['Sepal.Length'], np.zeros(df_setosa['Sepal.Length']), 'o') it raises a ValueError that reads 'sequence too large; cannot be greater than 32'. How did you run it without getting this error? How can I resolve it?
You haven't written _like after np.zeros; it should be np.zeros_like.
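To illustrate the difference, here is a short sketch with a few made-up values standing in for `df_setosa['Sepal.Length']`. `np.zeros` interprets its first argument as a shape, while `np.zeros_like` returns zeros with the same shape as the array it is given.

```python
import numpy as np

# Hypothetical values standing in for df_setosa['Sepal.Length']
sepal_length = np.array([5.1, 4.9, 4.7, 4.6, 5.0])

# np.zeros expects a shape, e.g. the number of points
y_from_shape = np.zeros(len(sepal_length))

# np.zeros_like copies the shape (and dtype) of an existing array
y_like = np.zeros_like(sepal_length)

print(y_like)

# With matplotlib imported as plt, the corrected call would be:
# plt.plot(sepal_length, np.zeros_like(sepal_length), 'o')
```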
In the univariate analysis, you have taken sepal length and labelled it as petal length. Can you explain that?
It's a mistake.
How are you reading a URL or internet file in pandas? It seems impossible for me to do. Please tell me how.
Switching on your internet connection should make it work.
coaching institutes just looted me
taught nothing like this
I can't believe you pronounced it as hueee....😂😂
Thank you