Testing Assumptions Linear Regression in R

  • Published 17 Dec 2024

COMMENTS • 25

  • @timmytesla9655 9 months ago +3

    This is gold. I searched everywhere and this is the best video I have seen on this topic. Thank you!

  • @SmartLearn-Everything 10 months ago +1

    Thank you very much. I am from Uzbekistan. I am learning R. This video is very useful to me.

  • @salmaaita8967 2 years ago +1

    This is by far one of the best videos addressing the validation of the regression assumptions. Thank you so much for making it easy and simple to understand!

  • @matteoframba1854 2 years ago +1

    Your explanation is INCREDIBLE, exactly how it's supposed to be.
    You have a new sub.
    Great job!

  • @ahmetjeyhunov4435 1 year ago

    Thank you for the excellent video. Great explanation. I have a question. You ran diagnostic tests for an entire multiple regression model, which has multiple IVs. Aren't we supposed to run separate diagnostic tests for each IV-DV relationship in a model, instead of running diagnostic tests for the entire set of variables in a model at once? I have checked other tutorials online, and they ran separate tests for each IV-DV relationship. Or is it a matter of preference?

    • @RegorzStatistik 1 year ago +1

      When it comes to assumptions concerning the residuals (normality, homoskedasticity), you have to test the entire multiple regression model that you ultimately use for your hypothesis tests. For linearity, though, bivariate scatterplots can make sense.
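
A minimal sketch of what this reply describes, not the code from the video: the residual checks are run once, on the full model. The data are simulated and the names `dat`, `y`, `x1`, `x2` are placeholders; `reg.fit` is borrowed from the comments on this page. The Breusch-Pagan test is one possible homoskedasticity test (an assumption here) and only runs if the `lmtest` package is installed.

```r
# Simulated data; y, x1, x2 are placeholder names, not the video's data set.
set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 - 0.3 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2)

# Fit the full multiple regression that the hypothesis tests are based on.
reg.fit <- lm(y ~ x1 + x2, data = dat)

# Normality: test the residuals of the full model, not of single-IV models.
res <- residuals(reg.fit)
normality <- shapiro.test(res)
print(normality)

# Homoskedasticity: Breusch-Pagan test, only if lmtest is available.
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(reg.fit))
}
```

In an interactive session, `plot(reg.fit, which = 1:2)` adds the residuals-vs-fitted and normal Q-Q plots for the same model.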

    • @ahmetjeyhunov4435 1 year ago

      Thank you for the quick reply. My multivariate model (including control variables) doesn't violate any of the assumptions. The moment I regress the DV on the individual IVs from the multivariate model and run diagnostics for them, the results come out awful. The residuals are not normal, and the variance is not constant. This work is for my graduation thesis. Do you think an inferential analysis in such a case would still be reliable?

    • @RegorzStatistik 1 year ago +1

      @ahmetjeyhunov4435 One crucial implicit assumption of a regression analysis is that you have included all relevant predictors. So it is quite possible that a regression with a single predictor violates assumptions that the full model meets.
      Example:
      Gender (m/f) as a covariate has a significant influence. If you don't control for gender, then it is quite likely that the distribution of the residuals will be nonnormal, because omitting gender could lead to a bimodal distribution of the residuals (one peak for males, the other for females).
      I haven't seen any academic literature yet that suggests that you should test the assumptions for a multiple regression by looking at the single regressions using only one IV each.
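
The gender example can be reproduced with a small simulation. This is a hedged sketch on made-up data: the effect size of 4, the sample size, and all variable names are assumptions chosen so that the two peaks are easy to see.

```r
# Simulated data with a strong effect of a 0/1 coded covariate.
set.seed(42)
n      <- 400
gender <- rbinom(n, 1, 0.5)                    # 0/1 coded covariate
x      <- rnorm(n)
y      <- 2 + 0.5 * x + 4 * gender + rnorm(n)  # errors are normal by construction

fit_full    <- lm(y ~ x + gender)  # covariate included
fit_reduced <- lm(y ~ x)           # covariate omitted

# Shapiro-Wilk p-values for the residuals of both models.
p_full    <- shapiro.test(residuals(fit_full))$p.value
p_reduced <- shapiro.test(residuals(fit_reduced))$p.value
print(c(full = p_full, reduced = p_reduced))
```

The reduced model's residuals are a mixture of two shifted normal distributions, so the normality test rejects even though the errors of the full model are normal. `hist(residuals(fit_reduced), breaks = 30)` shows the two peaks.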

    • @ahmetjeyhunov4435 1 year ago +1

      @RegorzStatistik Awesome! Thank you so much!

  • @hannahantonia2540 2 years ago +1

    Very helpful and well explained video! Dankeschön :)

  • @lisakrijnen2453 3 years ago +1

    Very useful video! Thanks! I have one question: I am running a moderated mediation model using the Hayes PROCESS method (model 7). Do I have to check these assumptions as well? And if so, do I make a reg.fit model using lm with (DV ~ IV 1 + IV 2....+ W + M1 +M2 +M3)? So, would I add my mediators and moderators as independent variables and run lm so that I can check all assumptions? Many thanks in advance!

    • @RegorzStatistik 3 years ago +1

      Here is a video about assumption checks with a PROCESS model:
      ua-cam.com/video/D6t5TZ0-j6g/v-deo.html
      So, yes, you could replicate the different regression models PROCESS calculates in R and check the regression assumptions there. With 3 mediators I think you would get 4 regression models (one for the moderated prediction of each mediator and one for the prediction of the DV). You would have to check each of those against the regression assumptions.
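
A rough sketch of the four regressions mentioned in this reply, assuming the usual model 7 layout in which W moderates the paths from X to each mediator. The data are simulated, the names `X`, `W`, `M1` to `M3`, `Y` are placeholders, and covariates are left out; this is not PROCESS output, only an `lm()` replication under those assumptions.

```r
# Simulated data; all variable names and coefficients are illustrative.
set.seed(7)
n   <- 300
dat <- data.frame(X = rnorm(n), W = rnorm(n))
dat$M1 <- 0.4 * dat$X + 0.2 * dat$W + 0.3 * dat$X * dat$W + rnorm(n)
dat$M2 <- 0.3 * dat$X + rnorm(n)
dat$M3 <- 0.2 * dat$X * dat$W + rnorm(n)
dat$Y  <- 0.3 * dat$X + 0.4 * dat$M1 + 0.2 * dat$M2 + 0.1 * dat$M3 + rnorm(n)

# One moderated model per mediator, plus one model for the DV.
models <- list(
  M1 = lm(M1 ~ X * W, data = dat),
  M2 = lm(M2 ~ X * W, data = dat),
  M3 = lm(M3 ~ X * W, data = dat),
  Y  = lm(Y ~ X + M1 + M2 + M3, data = dat)
)

# Run the same residual check on each of the four models.
shapiro_p <- sapply(models, function(m) shapiro.test(residuals(m))$p.value)
print(round(shapiro_p, 3))
```

Keeping the models in a named list makes it easy to apply any further diagnostic to all four with `lapply()` or `sapply()`.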

    • @lisakrijnen2453 3 years ago

      @RegorzStatistik Many thanks! That video is very useful as well! One thing I still wonder about is the linearity assumption. Is this supposed to be checked by plotting the IV on the x-axis and the DV on the y-axis, or do you have to look at the residuals? On some websites I see the first, but sometimes it's only about the residuals.

    • @RegorzStatistik 3 years ago +1

      @lisakrijnen2453 For a simple regression you could look at the residuals, too. But for a multiple regression I prefer bivariate scatterplots (one predictor + criterion), because with the residuals alone it's hard to spot which predictor has a nonlinear relationship to the criterion variable.
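
One way to draw such bivariate scatterplots in base R, as a sketch on simulated data. The names `dat`, `x1`, `x2`, `y` are placeholders, and `x1` is made nonlinear on purpose so that the smoothed trend visibly departs from the straight line.

```r
# Simulated data: y depends on x1 quadratically and on x2 linearly.
set.seed(3)
n   <- 200
dat <- data.frame(x1 = runif(n, -2, 2), x2 = rnorm(n))
dat$y <- 1 + dat$x1^2 + 0.5 * dat$x2 + rnorm(n, sd = 0.5)

predictors <- c("x1", "x2")

# One panel per predictor: criterion on the y-axis, predictor on the x-axis.
op <- par(mfrow = c(1, length(predictors)))
for (p in predictors) {
  plot(dat[[p]], dat$y, xlab = p, ylab = "y", main = paste("y vs.", p))
  lines(lowess(dat[[p]], dat$y), col = "red", lwd = 2)  # smoothed trend
  abline(lm(dat$y ~ dat[[p]]), lty = 2)                 # straight-line fit
}
par(op)
```

Where the smoothed curve and the dashed line diverge systematically, that predictor is a candidate for a nonlinear relationship.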

    • @lisakrijnen2453 3 years ago +1

      @RegorzStatistik Ok thanks! I have some trouble interpreting my scatterplots, as they are not clearly linear. Do you know whether there are tests or other ways to assess whether a relationship is linear?

    • @RegorzStatistik 3 years ago +1

      There is the rainbow test (0:05:47 in this video), a significance test for linearity. It is part of the lmtest package in R.
      cran.r-project.org/web/packages/lmtest/lmtest.pdf
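
A small sketch of the rainbow test with `lmtest::raintest()` on simulated data. The variable names are placeholders, and `order.by = ~ x` is an assumption made here so that the observations are sorted by the predictor before the test compares the middle of the data with the whole; the call used in the video may differ. The block only runs the test if `lmtest` is installed.

```r
# Simulated data: one linear and one curved relationship with the same x.
set.seed(5)
n   <- 200
dat <- data.frame(x = runif(n, -2, 2))
dat$y_lin   <- 1 + 0.8 * dat$x + rnorm(n, sd = 0.5)
dat$y_curve <- 1 + dat$x^2 + rnorm(n, sd = 0.5)

if (requireNamespace("lmtest", quietly = TRUE)) {
  # A significant result suggests a departure from linearity.
  rt_lin   <- lmtest::raintest(y_lin ~ x, order.by = ~ x, data = dat)
  rt_curve <- lmtest::raintest(y_curve ~ x, order.by = ~ x, data = dat)
  print(rt_lin)
  print(rt_curve)
}
```

Without `order.by`, the test treats the rows as already ordered, which suits time series but not randomly ordered cross-sectional data.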

  • @haoranxi3319 2 years ago

    Terrific video! Thanks!! Do you know how to label those outliers in the plots? I ran the same code, but the outliers in my plots are not labelled as shown in this video.

  • @jacekbuczny4567 3 years ago +1

    Excellent! Thank you.

  • @popps6402 1 year ago

    Hey, great video, and basically everything worked. However, I have a model with gender as the independent variable (0 and 1) and rapport level as the dependent variable (scale data). That makes some of the graphs look kind of weird, since all the dots sit exclusively at 0 and 1, and this code does not work with a categorical IV:
    ols_vif_tol(reg.fit)
    Do you have advice on that?

    • @RegorzStatistik 1 year ago

      For a binary IV you don't have to check the linearity assumption. Nonlinearity can only be an issue with more than two values for the IV.
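
The reason can be checked numerically: with a 0/1 predictor the fitted line only has to pass through two group means, which a straight line always can. A sketch on simulated data, where `gender` and `rapport` follow the question above and the numbers are made up:

```r
# Simulated data with a 0/1 coded predictor.
set.seed(9)
n       <- 100
gender  <- rep(c(0, 1), each = n / 2)
rapport <- 3 + 0.8 * gender + rnorm(n)

fit <- lm(rapport ~ gender)
b   <- coef(fit)

# Group means of the DV for gender = 0 and gender = 1.
group_means <- tapply(rapport, gender, mean)

# Intercept equals the mean of group 0; slope equals the difference in means.
print(c(intercept = unname(b[1]), mean_group0 = unname(group_means["0"])))
print(c(slope = unname(b[2]),
        mean_diff = unname(group_means["1"] - group_means["0"])))
```

Because the line reproduces both group means exactly, there is no room for a nonlinear pattern between only two predictor values.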

  • @krishnaiyer2556 3 years ago

    Sir, this video uses a data set. Where can I find the data?

    • @RegorzStatistik 3 years ago

      Unfortunately, that data set is not publicly available.

  • @SmartLearn-Everything 10 months ago

    I want to write an article for Scopus. If someone knows Econometrics/analysis well, we will write together. I will write the rest.