Multiple Linear Regression using python and sklearn

  • Published 30 Jan 2019
  • Multiple linear regression is the most common form of linear regression analysis. As a predictive analysis, the multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables.
    References: Kirill Eremenko's projects on linear regression. This video is dedicated to him.
    Please subscribe and share the channel
    Simple Linear Regression link: • Simple Linear Regressi...
    Github link: github.com/krishnaik06/Multip...
    You can buy my book, where I have provided a detailed explanation of how we can use Machine Learning and Deep Learning in Finance using Python.
    Packt url : prod.packtpub.com/in/big-data...
    Amazon url: www.amazon.com/Hands-Python-F...

COMMENTS • 134

  • @somjitdas6034
    @somjitdas6034 2 years ago +6

    I was struggling with Linear and Multiple regression for over 1 month. Finally, the puzzle is solved. Thanks a million, Krish. You are simply outstanding.

  • @ShahidIqbal-sq7bf
    @ShahidIqbal-sq7bf 4 years ago +1

    You are a genius. I have spent so much on a course that offered me nothing. Thank you again, Sir; may God bless you.

  • @ijeffking
    @ijeffking 5 years ago +2

    Very nice pointers. Thank you, Krish. Keep up the good work. Looking forward to your deep learning videos. Learning a lot from you.

  • @annalyticsannalizaramos5890
    @annalyticsannalizaramos5890 3 years ago +3

    I like how this comprehensively explains the details. Thank you for this content. Excellent job!

  • @Rafian1924
    @Rafian1924 4 years ago

    Multiple linear regression made so simple, Krish sir. I am highly indebted to you. Great job!!

  • @drvk999
    @drvk999 5 years ago +2

    You described multiple regression very well... Thank you. I would appreciate it if you could do a detailed video on R² and adjusted R² (intuition- and concept-wise).

  • @akilaj
    @akilaj 3 years ago

    This is a good one. Plain and simple for a beginner. Keep up your work

  • @nanditasahu2358
    @nanditasahu2358 3 years ago

    Sir, your content is brisk and clear, and the content is accurate, as is the explanation. Thanks for the effort.

  • @noahrubin375
    @noahrubin375 3 years ago +1

    Yep, this was the video I was looking for!

  • @vedantmhatre4731
    @vedantmhatre4731 5 years ago

    Nice Explanation. You deserve more viewers.

  • @meetmeraj2000
    @meetmeraj2000 4 years ago +4

    Sir, don't we have to first check the assumptions of linear regression before fitting the model? And shouldn't adjusted R² be a better option for multiple linear regression?

  • @deepakkumarsingh9781
    @deepakkumarsingh9781 5 years ago +1

    Thanks Sir for this video..
    It's really unique and helpful..

  • @manishhedau119
    @manishhedau119 4 years ago

    You are doing a great service to the world by teaching, and it will be helpful for so many people. I personally thank you a lot for doing this kind of good thing for us.
    Salute you, man

  • @abhishek-hb1vg
    @abhishek-hb1vg 5 years ago +7

    Please make a video on multiple linear regression using statsmodels, with the forward and backward elimination techniques.

  • @tanvisharma8346
    @tanvisharma8346 5 years ago +1

    Thank you. This is Very Helpful!!

  • @harisjoseph117
    @harisjoseph117 3 years ago

    Very nice Explanation. Keep it up Krish.

  • @rajsuraaj2125
    @rajsuraaj2125 5 years ago +1

    Hi Krish, thanks for adding all these videos; they are very helpful. Please also plot a pair plot for this MLR model. Thank you.

  • @shilpikulshrestha9487
    @shilpikulshrestha9487 5 years ago +1

    Helpful video, thank you sir

  • @sohaibzq9649
    @sohaibzq9649 4 years ago +6

    You missed the most important part: plotting the best-fit line.
    Second question: when we have n dimensions (n variables in linear regression), can we apply PCA?

  • @shivtripathi2843
    @shivtripathi2843 4 years ago

    Very useful tutorial. Keep it up!

  • @dr.nafeesahamad8567
    @dr.nafeesahamad8567 3 years ago

    This is a really good video, Sir. Thanks

  • @gorenekli
    @gorenekli 2 years ago

    Thank you for your detailed explanation.

  • @shakyasarkar7143
    @shakyasarkar7143 4 years ago +1

    Sir, this was super useful!!! Hats off!
    But since we are deleting the California feature, how are we going to find the coefficient for that particular dropped-out California independent feature?

  • @gaznavie8420
    @gaznavie8420 4 months ago

    Whatever I have learnt from the theory, today I finally came to know how to implement it. Thank you, Sir.

  • @amazon628
    @amazon628 4 years ago

    Hello Krish. Really good work. We do not get this in-depth knowledge in high-end paid courses.

  • @sowjanyay3424
    @sowjanyay3424 1 year ago

    Extraordinary explanation 👌👌👌

  • @nehasehta7762
    @nehasehta7762 3 years ago +2

    Sir, how do we decide that we should go for linear regression? There may be non-linear relationships between the dependent and independent features.

  • @SatendraYadav-cs1yh
    @SatendraYadav-cs1yh 4 years ago +1

    This video was really helpful to me, thank you so much bro

  • @shaileshmallya9857
    @shaileshmallya9857 5 years ago

    Good video. Thanks Krish.

  • @ThobelaGoge
    @ThobelaGoge 2 months ago

    Spyder looks cool...I might switch to it😃. Great video man👌

  • @gokuljith
    @gokuljith 4 years ago +1

    Kudos, Krish bro. With LinearRegression from sklearn.linear_model, how do we reduce the error? Is there no gradient descent to reduce the error, or does LinearRegression itself give you the output?

  • @riteshshrimali1358
    @riteshshrimali1358 2 years ago

    excellent explanation of MLR with python coding

  • @okamasnr4891
    @okamasnr4891 4 years ago

    Good presentation. Kindly do a video on logistic regression where you use it to make predictions.

  • @middle_class_Me
    @middle_class_Me 4 years ago +2

    1) Sir, take some big datasets and explain them like real day-to-day work.
    2) How do we identify the algorithm by looking at the dataset? Please explain that too, sir.
    3) Make some videos on the analysis part: how to analyse the data by looking at those datasets.

  • @mmouhnari
    @mmouhnari 3 years ago

    Good explanation, even if the sound wasn't very good :) But we would like to know how we could make a data visualisation with 3 or more explanatory (independent) variables before the regression, and the surface that could result once we get our model. Please share if you have any idea on this. Thank you!

  • @user-wj9nc2yh3i
    @user-wj9nc2yh3i 11 months ago

    When you come to the board it is easy to understand, sir. You are an excellent teacher.

  • @bobbyreynaldo7266
    @bobbyreynaldo7266 3 years ago

    I like your explanation

  • @jongcheulkim7284
    @jongcheulkim7284 2 years ago

    Thank you so much.

  • @haridaasan
    @haridaasan 1 year ago

    Thanks for the video, sir, but could you say why we have not scaled the values using StandardScaler?

  • @ssalvi28
    @ssalvi28 2 years ago +1

    That was a great explanation Krish, thank you!
    Doubt: if the number of states had been 20 (or greater), how would we proceed in such a case?

  • @SANYOG41
    @SANYOG41 4 years ago

    @krish Amazing explanation

  • @shubhendusingh5143
    @shubhendusingh5143 3 years ago

    This was a good start to explain the basics of regression, but it doesn't seem complete, as there could have been a visualization piece as well to show how the regression worked. Also, what did we do with the train dataset? Is there a follow-up video on this?

  • @bhalchandrakolekar8176
    @bhalchandrakolekar8176 4 years ago

    Hey Krish, why didn't you use feature scaling for the independent variables?

  • @BhupinderSingh-rg9be
    @BhupinderSingh-rg9be 4 years ago

    Sir, it's a request that you please upload more and more datasets on your GitHub with code so that we can practice more. Thank you sir!

  • @hridoyahmed9964
    @hridoyahmed9964 4 years ago

    It's a good video, but sir, please ensure good sound quality in the video.

  • @kaveenjayamanna1509
    @kaveenjayamanna1509 2 years ago

    Hi Krish,
    Don't we have to find out the p-value for this model? Or is R-squared just good enough?

  • @ManpreetSingh-ew8qs
    @ManpreetSingh-ew8qs 5 years ago

    Hey, can we use the backward elimination method? And what's its purpose?

  • @chakree100
    @chakree100 5 years ago

    Thanks a lot krish

  • @jogindersingh4281
    @jogindersingh4281 5 years ago

    What made you use the MLR model on this dataset? Why not another model? I have understood the concept of MLR, but how do we know when to use it?

  • @ballesulaimonolanrewaju6451
    @ballesulaimonolanrewaju6451 2 years ago

    Hi @Krish Niak,
    1. Why didn't you use the Jupyter notebook IDE for multiple linear regression?
    2. Why did you decide to use the Spyder IDE?

  • @2010aurnob
    @2010aurnob 4 years ago +1

    Great video!!!
    What does test_size=0.2 imply? Is it going to take 20% of the data from the dataframe for random testing?
    Also, is it possible to do multivariate non-linear regression in python?

    • @rajdeeproy5264
      @rajdeeproy5264 2 years ago +1

      We are dividing the dataset into an 80:20 ratio of train and test splits respectively. So test_size=0.2 implies 20% test and 80% train.
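
A quick sketch of what the reply above describes, on stand-in data (the array sizes here are illustrative, not the video's dataset):

```python
# test_size=0.2 -> hold out 20% of the rows for testing, train on the other 80%.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(50).reshape(50, 1)
y = np.arange(50)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)  # random_state fixes the shuffle

print(len(X_train), len(X_test))  # 40 10
```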

  • @rbwebcom1658
    @rbwebcom1658 4 years ago

    Sir, how to check with a simple example and get a prediction? Thank you.

  • @akhiljose7539
    @akhiljose7539 4 years ago

    good video! Thank you

  • @raviteja2475
    @raviteja2475 5 years ago +1

    Thanks for the explanation....
    Sir, if R² comes out near zero, what do we need to do in that case? How do we check which attributes are spoiling the regression line?

    • @aashishdagar3307
      @aashishdagar3307 4 years ago

      @raviteja2475 There are multiple methods for feature selection (attributes), like forward selection, backward elimination, etc. You can use whichever you are comfortable with.

  • @ShahidIqbal-sq7bf
    @ShahidIqbal-sq7bf 4 years ago

    If I have to plot a graph to visually understand the difference between y_pred and y_test, is there code to do it? I have checked multiple sites but none has answered my question.
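
One common way to answer this question is an actual-vs-predicted scatter plot. A minimal sketch with matplotlib, using made-up stand-ins for y_test and y_pred (good predictions cluster around the diagonal):

```python
# Plot predictions against actuals; the red dashed diagonal marks y_pred == y_test.
import matplotlib
matplotlib.use("Agg")                 # render off-screen (no display needed)
import matplotlib.pyplot as plt
import numpy as np

# Stand-ins for the y_test and y_pred arrays from the video.
y_test = np.array([10.0, 20.0, 30.0, 40.0])
y_pred = np.array([11.0, 19.0, 31.0, 38.0])

fig, ax = plt.subplots()
ax.scatter(y_test, y_pred)
lims = [min(y_test.min(), y_pred.min()), max(y_test.max(), y_pred.max())]
ax.plot(lims, lims, "r--", label="perfect prediction")  # diagonal reference
ax.set_xlabel("y_test (actual)")
ax.set_ylabel("y_pred (predicted)")
ax.legend()
fig.savefig("pred_vs_actual.png")
```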

  • @arunxavier502
    @arunxavier502 4 years ago

    Hi, when you compared the data of y_test and y_pred, the indexes were different. Is that OK?

  • @raghuvamsi8762
    @raghuvamsi8762 5 years ago

    Can you explain visualization of multiple linear regression?

  • @cypheranalytica6066
    @cypheranalytica6066 5 years ago

    Nice Video!

  • @devendarreddydev8545
    @devendarreddydev8545 3 years ago

    Thank you sir for this video, and please also provide a graph for it.

  • @osamazafar7350
    @osamazafar7350 5 years ago

    Sir. A big thanks

  • @shubhamkundu2228
    @shubhamkundu2228 3 years ago

    Can we use Ridge and Lasso regression models as well where we use multiple linear regression?

  • @prabuddh_mathur
    @prabuddh_mathur 3 years ago

    Hey! I was wondering why you didn't use OneHotEncoder from sklearn.preprocessing?
    It would have been a nice two-step conversion, as follows:
    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder
    cl = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [*column index which needs to be encoded*])], remainder='passthrough')
    x = np.array(cl.fit_transform(x))

  • @user-fh3gr7sq1z
    @user-fh3gr7sq1z 4 роки тому +1

    this tutorial help me, but it missimg few thinsgs : how to find PVALUE of each x + how to calculte cost function + how to do prediction on new dataset with the model we made .... do you have this kind of tutorial to ?

    • @ShahidIqbal-sq7bf
      @ShahidIqbal-sq7bf 4 years ago

      I believe scikit-learn automatically calculates the best model and the best value for each variable and makes the prediction, so you don't have to do it manually.

  • @santosh_Benkiee
    @santosh_Benkiee 3 years ago

    Thank you 🙏 sir

  • @gauravtak9787
    @gauravtak9787 4 years ago

    Sir, you predicted on the X_test data; if we want to predict on some random data given by a user, how do we predict that? How do we give random data for each variable for a new prediction?
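
A sketch of predicting on user-supplied data: once fitted, `model.predict` accepts any 2-D array whose columns match the training features (including the same dummy encoding). The toy numbers below are invented for illustration:

```python
# After fitting, predict on any new row shaped (1, n_features).
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy training data following y = 2*a + 3*b exactly.
X_train = np.array([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0], [3.0, 2.0]])
y_train = np.array([5.0, 7.0, 8.0, 12.0])

model = LinearRegression().fit(X_train, y_train)

# A user-supplied row must be 2-D: shape (1, n_features), same column order.
new_row = np.array([[4.0, 5.0]])
print(model.predict(new_row))  # close to 2*4 + 3*5 = 23
```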

  • @sangamithrajen
    @sangamithrajen 4 years ago

    Hi Krish, what is the purpose of converting categorical predictors into indicators like 0, 1, or 2? Does it mean we can do manipulation with quantitative values only?

    • @theshishir24
      @theshishir24 3 years ago

      ML algorithms always take numerical values.
      Hope it helped.

  • @sauravksingh
    @sauravksingh 3 years ago

    Can you upload a video on multiple-variable/feature logistic regression?

  • @omduttpandey8201
    @omduttpandey8201 4 years ago

    Hey Krish, a humble request: please explain the conversion of categorical features practically, as I am having problems with it even though I have watched several tutorials, such as the one by Kirill Eremenko of SuperDataScience. Even pd.get_dummies isn't working for me, and I am getting errors like IndexError and ValueError.

  • @noteuler314
    @noteuler314 3 years ago

    Please also cover regularisation and MCA with multiple linear regression, sir.

  • @sarvjeetbhardwaj6964
    @sarvjeetbhardwaj6964 2 years ago

    In this problem, why didn't we scale the features using StandardScaler or MinMax?

  • @syedmuhammadaskarizaidi2294
    @syedmuhammadaskarizaidi2294 3 years ago +1

    Can the non-numeric feature problem be solved by label encoding?

  • @mukulmishra2296
    @mukulmishra2296 5 years ago

    please do make some videos on Target encoding.

  • @abhisknowledge5514
    @abhisknowledge5514 5 months ago

    Sir, just one small doubt: instead of x = dataset.iloc[:, :-1], can I use dataset.drop("price")?
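
Almost: `drop` needs `columns=` (or `axis=1`), since `dataset.drop("price")` alone tries to drop a row label. A sketch on a hypothetical dataframe using the commenter's "price" column name; the two forms agree only when the target is the last column:

```python
# iloc[:, :-1] drops the last column positionally;
# drop(columns="price") drops it by name, wherever it sits.
import pandas as pd

dataset = pd.DataFrame({
    "area": [100, 150, 200],
    "rooms": [2, 3, 4],
    "price": [250, 340, 430],   # hypothetical target column, last position
})

x_iloc = dataset.iloc[:, :-1]
x_drop = dataset.drop(columns="price")

print(x_iloc.equals(x_drop))  # True here, because "price" is last
```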

  • @MrJaga121
    @MrJaga121 5 years ago +1

    Hi,
    Can you explain when to go for linear regression? What are the prerequisites to check whether an input and output will fit a linear regression model or not?

  • @learnforfuture2611
    @learnforfuture2611 3 years ago

    Sir, will the random state affect our model if we increase it?

  • @rohitlalwani8462
    @rohitlalwani8462 3 years ago

    If, while doing get_dummies, we had not dropped the California state, would it have had an impact?

  • @tejasahuonly4u325
    @tejasahuonly4u325 4 years ago

    Sir, after importing the sklearn library a "module not found" error occurs. What do I need to do, sir?

  • @ammanh
    @ammanh 3 years ago

    If my R² value is not close to 1, then what should I do?

  • @shubhamkundu2228
    @shubhamkundu2228 3 years ago

    What is the disadvantage of the dummy variable trap? What if I don't drop any dummy variable; what's the impact of that on the ML model? What is the reason to drop one dummy variable?
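
A sketch of the trap itself, on a made-up state column: with one dummy per category, the dummies always sum to 1, which duplicates the intercept column and causes perfect multicollinearity. `drop_first=True` removes one dummy, and that state becomes the baseline absorbed into the intercept:

```python
# Dummy-variable trap demo: full dummies sum to 1 in every row.
import pandas as pd

states = pd.Series(["California", "New York", "Florida", "California"])

full = pd.get_dummies(states)                      # one column per state
reduced = pd.get_dummies(states, drop_first=True)  # baseline column dropped

print(full.sum(axis=1).tolist())   # every row sums to 1 -> redundant with intercept
print(list(reduced.columns))
```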

  • @sidduhedaginal
    @sidduhedaginal 4 years ago

    Hey Krish, good explanation.
    I have a doubt here: why do you take X_train, X_test, y_train, and y_test? I am confused. Kindly clarify.

    • @GagandeepSingh-qs2vh
      @GagandeepSingh-qs2vh 4 years ago

      We usually reserve some data for testing purpose. Let's say you trained your model on 80% data. Now, it might be possible that your model says it has accuracy of 90% on training data but that doesn't mean it is a good model. So, you'll have to test it on testing data which is unseen for model. In simple words it is going to tell you how your model is going to perform in real world scenario (the data which it has never seen).

  • @abhinavm9685
    @abhinavm9685 3 years ago

    You are god!
    thanks a lot for this!!

  • @ssshanmugam4514
    @ssshanmugam4514 5 years ago

    Nice

  • @marishakrishna
    @marishakrishna 1 year ago

    How do we calculate beta-zero, i.e. the intercept?
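
In sklearn there is no need to compute β₀ by hand: with the default `fit_intercept=True`, the fitted intercept is exposed as `model.intercept_` and the remaining betas as `model.coef_`. A sketch on toy data with a known intercept:

```python
# beta_0 lives in model.intercept_, the other betas in model.coef_.
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = 10.0 + 2.0 * X[:, 0]          # true intercept 10, slope 2

model = LinearRegression().fit(X, y)
print(model.intercept_)  # 10.0
print(model.coef_)       # [2.]
```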

  • @praveenkumarsingh8723
    @praveenkumarsingh8723 4 years ago

    Sir, can you please provide videos on neural network implementation using Python?

  • @HariPrasad-et6ci
    @HariPrasad-et6ci 4 years ago

    Sir, please explain how to test on new data...

  • @akashgayakwad9550
    @akashgayakwad9550 4 years ago

    You used the index as b0; is that right?

  • @dilippradhan94
    @dilippradhan94 5 years ago

    Bro make a video on Ridge regression

  • @tejassutar4198
    @tejassutar4198 4 years ago +1

    If more than 5 cities are present in the State column, how do we handle that in Python?

    • @AK-ws2yw
      @AK-ws2yw 3 years ago

      I think linear regression handles numeric data; if your aim is to predict a categorical variable, then go for logistic regression.

  • @badalsingh3733
    @badalsingh3733 5 years ago +1

    Would you please visualize it?

  • @jayeshjadhav3024
    @jayeshjadhav3024 4 years ago

    Why didn't you plot a graph?

  • @tuhindas6745
    @tuhindas6745 3 years ago

    Should we not perform feature scaling (min-max or standard) before fitting the model on the training and test sets? Is it not an important step? If anyone from the viewers can help...

    • @bobbygajbhiye3139
      @bobbygajbhiye3139 3 years ago

      Not sure, but the first 3 features are on the same scale; that might be why he didn't apply feature scaling.

    • @tuhindas6745
      @tuhindas6745 3 years ago

      @@bobbygajbhiye3139 No, the first 3 features are fine, but there are dummy variables in the range of 0 and 1, so feature scaling should be performed so that everything is on a similar scale. I am still not sure about this..
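
One way to settle this recurring scaling question: for plain OLS, scaling only reparameterizes the coefficients and leaves the predictions unchanged, so skipping it is harmless here; it does matter for regularized models (Ridge/Lasso) and gradient-descent solvers. A sketch on synthetic data with wildly different column scales, using a Pipeline so the scaler is fitted on training data only:

```python
# Plain OLS predictions are identical with or without StandardScaler.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3)) * [1.0, 100.0, 0.01]   # very different scales
y = X @ np.array([1.0, 0.5, 2.0]) + rng.normal(scale=0.1, size=100)

plain = LinearRegression().fit(X, y)
scaled = make_pipeline(StandardScaler(), LinearRegression()).fit(X, y)

X_new = rng.normal(size=(5, 3)) * [1.0, 100.0, 0.01]
print(np.allclose(plain.predict(X_new), scaled.predict(X_new)))  # True
```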

  • @kanyadharani6844
    @kanyadharani6844 3 years ago

    Can we use mapping instead of getting dummies and concatenating them?

  • @anirudhsrivastava3530
    @anirudhsrivastava3530 2 years ago

    Sir, please also do the decision tree regressor, AdaBoost, and XGBoost models.

  • @abhishekpawar3845
    @abhishekpawar3845 4 years ago

    How do we plot it on a graph?

  • @sairampenjarla
    @sairampenjarla 3 years ago

    Guys, we can remove test_size and put random_state=10 to get 98% accuracy.

  • @zaedgtr6910
    @zaedgtr6910 10 months ago

    Why does the r2_score change each time I run this code?
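
The usual cause: `train_test_split` reshuffles the rows on every call, so each run trains and scores on a different split. Fixing `random_state` makes the split, and therefore the score, reproducible. A sketch on stand-in arrays:

```python
# Same seed -> same split every run; no seed -> a new shuffle each call.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(100, 1)
y = np.arange(100)

_, X_a, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)
_, X_b, _, _ = train_test_split(X, y, test_size=0.2, random_state=42)
_, X_c, _, _ = train_test_split(X, y, test_size=0.2)  # unseeded: varies per run

print((X_a == X_b).all())  # True: identical splits with the same seed
```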

  • @abhijotsingh8502
    @abhijotsingh8502 2 роки тому

    Found input variables with inconsistent numbers of samples: [100, 50]. Sir this error is coming can you help me solve it

  • @skkAI
    @skkAI 1 year ago

    One question: can't we implement cross-validation on it?
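
Yes, cross-validation drops straight in. A sketch on synthetic data: `cross_val_score` refits the model on k folds and returns one score per fold (R² is the default scorer for regressors):

```python
# 5-fold cross-validated R^2 for a linear regression.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=100)

scores = cross_val_score(LinearRegression(), X, y, cv=5)  # one R^2 per fold
print(scores.round(3), scores.mean().round(3))
```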

  • @jithperingathara1443
    @jithperingathara1443 5 years ago

    Why don't we drop the index column instead of considering it as beta 0?

  • @praveenbhatt3127
    @praveenbhatt3127 3 years ago

    I am getting an R-squared value of 0.95, it's so cool!