Multivariable Linear Regression in R: Everything You Need to Know!
- Published Jul 23, 2024
- The world is complex and messy because multiple factors constantly affect each other. That’s why univariable models fail to describe complex relationships. In this video, we’ll explore multivariable models, which provide a more accurate representation of reality. Expect to learn how to effectively visualize model results, how to extract the most knowledge out of multivariable models, how to interpret the model correctly, and much more.
If you only want the code (or want to support me), consider joining the channel (join button below any of the videos), because I provide the code upon members' request.
Enjoy! 🥳
Welcome to my VLOG! My name is Yury Zablotski & I love to use R for Data Science = "yuzaR Data Science" ;)
This channel is dedicated to data analytics, data science, statistics, machine learning and computational science! Join me as I dive into the world of data analysis, programming & coding. Whether you're interested in business analytics, data mining, data visualization, or pursuing an online degree in data analytics, I've got you covered. If you are curious about Google Data Studio, data centers & certified data analyst & data scientist programs, you'll find the necessary knowledge right here. You'll greatly increase your odds to get online master's in data science & data analytics degrees. Boost your knowledge & skills in data science and analytics with my engaging content. Subscribe to stay up-to-date with the latest & most useful data science programming tools. Let's embark on this data-driven journey together!
I am a PhD candidate and I am using these models in my data analysis. Great video so far on this topic!
Glad it was helpful! Thanks for watching!
Hands down some of the most informative videos that exist on Linear models. Very concise too ! Thank you 🙏
You're very welcome! And thanks for such generous feedback! ☺️
Excellent work! Thanks YuzaR
Glad you like it! Thanks for watching!
Wow! One of the best videos I have ever seen. Very informative.
Wow, thanks for such generous feedback! If you know some folks who would also benefit from it, feel free to share it! I wish I had had something like this video when I started to learn R. I hope the other videos are helpful too! Thanks again! Cheers!
Very good explanations! Very good visualizations! Great - thank you!
Glad you enjoyed it! Thanks for positive feedback! 🙏
Outstanding! Thank you so much for putting so much effort in these educational videos!
So nice of you! Greatly appreciate your positive feedback!
Fantastic and concise content as usual with elegant explanation! Keep up your great work!
Much appreciated! I always enjoyed creating content, but such warm feedback as yours tells me that it's also useful! Thanks so much!
Incredible work! Thanks :) Could you please deepdive into tidymodels too?
Great suggestion! :) And I actually tried to do that for the linear model twice, but gave up due to the unnecessary complexity of tidymodels for simple models, like LM or GLM. Paradoxically, the functions I use here are much more insightful :). But I still have it in the back of my mind to dive deeper into tidymodels. I just created one video on resampling in tidymodels, and it did not perform as well as one might think in the age of machine learning. The emmeans video is better, maybe because it's more useful and pragmatic. Anyway, thanks for watching and for your nice feedback!
I'm not surprised by your observation. Why would anyone do machine learning using the complicated recipes in TidyModels when SciKitLearn in Python is much tidier and straightforward?
well, one day I have to learn Python too I am afraid ;)
Thanks for the great video!
Glad you liked it! Thanks for watching!
thanks for sharing
Thanks for watching!
Thanks so much!
You're very welcome! Thanks for watching and commenting! That's the best support for the channel!
Beautiful
Thanks 🙏
Hi Prof/Dr. Yury, I really love your videos because they are very intuitive. Please share how to analyse Likert scale questions using R. Thank you.
Great suggestion! It's actually already on my to-do list! Time is the limiting resource, because I have a normal job to pay the bills. But I'll do my best to produce more content. Thank you very much for the feedback and for watching!
Great work. Please do videos on Survival Analysis in R as well😊
Thank you very much! Will definitely do. Just made some videos on linear regression in R. Logistic will follow and then the rest of the models, including Survival and ML one day. 3 years ago I made two videos on survival already, but they are old, theoretical and low quality. I'll redo them in a more concise and R-focused way. Thanks for the nice feedback and for watching!
Great and to the point. Thanks!
I'd like to see you explain some topics like propensity score matching, multiple imputation, and principal component analysis.
I guess that would be fantastic if you present them in such an elegant way.
Thanks, Ahmed! I already have an older video on imputation, and it still works; I use the {missRanger} package for imputation to this day. Feel free to check it out! Thanks for the suggestions, I'll put them on the list! I plan to cover the most common basic models first though, like logistic regression, before coming to PCA. So, please, stay tuned. Cheers!
1:12 It gets worse. In the context of ANOVA you have 'factors' for categorical independent variables and 'covariate' for numerical ones. 😢
1:50 yeah, performance::check_model() is the bomb.
yeah, factor is another useless name. drives me nuts. especially, when ML folks come and name old good known things in a new fancy manner :), just complicating things for everyone while feeling great. check_model rocks 🤘
Thanks
Welcome! As always! Thanks for watching!
This is super cool! Best rstats content on YouTube! Will you show multivariate (multiple outcomes) next?
Wow, thanks for such generous feedback! 🙏 Love it! Multivariate? Eventually, but I never understood what the difference is between a multivariate model with 3 different outcomes and 3 separate models of the same outcomes. Is there any connection between outcomes, are they somehow weighted or so? Because if not, I already created a video which is much better than that: "Many Models At Once". If yes, please, let me know, and, if you can, recommend some good literature. Cheers!
@yuzaR-Data-Science my understanding is that if the outcomes are correlated but still different, then, not accounting for this could bias the estimate, and on the other hand, accounting for it could reduce the standard error, providing more precision (and statistical power). For example, if you want to study the effect of a medicine on some sort of mental health phenomenon, you could ask for patients to self-report their state, but in the same study, you could also ask their doctors to give their ratings, and on top of that you could maybe use their smartwatch to collect some behavioral data. All three are indicators of the same thing but also a bit different.
that's interesting, I work in the vet-med area, and we rarely do multivariate models, but I will have a look at them, and in the end, if I find them useful, I'll definitely make a video on them. For now I am concerned about the assumptions. What if the assumptions for one response hold, but the assumptions for the other do not? What model do you use then? Quantile regression? Is there something like a non-parametric MANOVA in R anyway? soooo many questions :)
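A small sketch of the question discussed above, using only base R and the built-in mtcars data. The choice of mpg and qsec as the two outcomes is purely illustrative and not taken from the video. With ordinary least squares and the same predictors for every outcome, lm() with cbind() gives the same coefficients as separate models; what the joint fit adds is the residual correlation between the outcomes and a multivariate test.

```r
# Two outcomes, same predictors (illustrative choice, not from the video)
m_multi <- lm(cbind(mpg, qsec) ~ wt + hp, data = mtcars)
m_mpg   <- lm(mpg  ~ wt + hp, data = mtcars)
m_qsec  <- lm(qsec ~ wt + hp, data = mtcars)

# Point estimates match the separate models column by column
coef(m_multi)
cbind(mpg = coef(m_mpg), qsec = coef(m_qsec))

# What the joint model adds: correlation of the residuals across outcomes
cor(resid(m_multi))

# ... and a multivariate (MANOVA-type) test per predictor (Pillai's trace by default)
anova(m_multi)
```

If the outcomes had different predictor sets, the estimates would no longer coincide, and other approaches would be needed; that case is not covered by this sketch.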
Thank you sir
So nice of you! Thank you for watching and commenting!
Great video
Glad you enjoyed it! Thanks for watching!
Hi, thanks again for this great video.
I really enjoy your explanations, but your blog was much more useful for quick access to concentrated information.
Videos are very helpful to understand from scratch, but seeing important screenshots in your blog quickly during a time-sensitive project is very important.
Plus, Koji is also down.
Thanks a lot for your nice feedback! Unfortunately, my blog was shut down, since Netlify wanted me to start paying for increasing traffic. Since R is open source and I do not earn anything from my blog, I refuse to pay for trying to do something useful for the world. I know the blog is useful, thus I am working on a solution to find the alternative for Netlify. But since I am not the IT guy, it might take time. If you have any suggestions what service is really free for static websites, don't hesitate to inform me.
As for Koji, this nice startup was sold, thus I took the link down from the last video. But if you would like to support me somehow, the best free support is likes, comments and watch-time. If you would like to support me financially, you can Thank the video (although YouTube takes ca. 50% of that), send me something via PayPal, or become a member of the channel. I still have not set up the membership, because I don't believe anyone would become a member even for $1 per month. But maybe I am wrong? Anyway, thanks for watching and being part of my journey on this channel, mate!
I'm so into your channel for your clear concise explanation. Watching and learning from each of your videos one by one. Thank you very much.
[One problem I have been facing when using the check_model() command: it shows "Plot area is too small". I've tried the solutions shown in the console panel, but couldn't make them work, and didn't find any solution online.]
Thank you so much for such nice feedback! I greatly appreciate that!
Sure, it's normal. There are several solutions for it. First, just increase your plot area via dragging the plot window higher and wider. Then, you can click on the + in the plot area to open an extra window, like I do in my videos. And lastly you can use ggsave command to save the last plot like I do in videos. Let me know whether one of those worked.
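One more way around the small plot pane, sketched under assumptions: instead of ggsave, draw the diagnostic plot straight into a large PNG file with base R's png() device. The model, the file name check_model.png and the 12 x 10 inch size are hypothetical choices for illustration; the plot itself needs the {performance} and {see} packages, so the sketch skips that step if they are not installed.

```r
# Any fitted model will do; this one is only an illustration
m <- lm(mpg ~ hp + wt, data = mtcars)

if (requireNamespace("performance", quietly = TRUE) &&
    requireNamespace("see", quietly = TRUE)) {
  # Open a file device that is large enough for all diagnostic panels
  png("check_model.png", width = 12, height = 10, units = "in", res = 150)
  print(performance::check_model(m))  # printing draws the panels into the file
  dev.off()                           # close the device to finish writing the file
}
```

Because the plot goes to a file, the size of the RStudio plot pane no longer matters.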
i like equatiomatic package (EDIT i found it)
Thanks again for your efforts !
Excellent! :) you are very welcome!
It is no longer in cran. Where did you install it from?
try this: remotes::install_github("datalorax/equatiomatic")
As usual: excellent class! Best content about stats on yt! But it is not possible to access the scripts via the link in the description (it is broken)
Thanks for such generous feedback, Alex! Much appreciated!
Sorry, Netlify shut down my blog since they want me to pay for increased traffic. I refuse to pay for doing something useful for the world (without earning anything), especially since R is open source. But I want to reopen it ASAP, as soon as I find an alternative to Netlify. It'll take some time though, because I am not an IT guy. But since YouTube is still free, please use the videos till the blog is up and running again, since my blog is actually the script for the video, word by word, code by code. Thanks for understanding!
Excélsior!
thanks mate!
thanks a bunch!! I have a random question: how to add a proportional weightage to a numeric variable on the outcome variable in a lm/glm and visualise. It is much like giving the weightage to the sample size of each study “n” to the outcome. Much like meta analysis but it is a pooled analysis. Thanks in advance
hmm, before I say something wrong or stupid, I'd rather say I don't know.
There is a "weights" argument in the "lm/glm" (lm(data = mtcars, am ~ mpg, weights = )), but I never used it to be honest.
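A minimal sketch of how the weights argument could be used, assuming each row carries a sample size that should determine its influence. The column n is hypothetical (mtcars has no such variable, so it is simulated here), and the outcome and predictors are illustrative only. With weights, lm() minimizes the weighted sum of squared residuals, so rows with larger n pull the fit harder.

```r
set.seed(1)
d <- mtcars
# Hypothetical per-row sample size, simulated purely for illustration
d$n <- sample(10:100, nrow(d), replace = TRUE)

fit_unweighted <- lm(mpg ~ hp + wt, data = d)
fit_weighted   <- lm(mpg ~ hp + wt, data = d, weights = n)

# Compare the coefficients side by side
cbind(unweighted = coef(fit_unweighted), weighted = coef(fit_weighted))
```

Whether such weighting is appropriate for a pooled analysis depends on the study design; this only shows the mechanics of the argument.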
@@yuzaR-Data-Science thanks!
you are always welcome!
First here. Good job as usual.
:) Thanks mate! Greatly appreciate your support!
Now that I'm not rushing to be the first commenter, I have to ask:
Why use ggeffects instead of emmeans? And although you pointed out that the result is not averaged over all levels of all factors, you should have mentioned that emmeans (or maybe other libraries) should be used in that case. What I use is sjPlot's tab_model just to tell which factors have significant effects, and then emmeans with contrast/pairwise, which I like to plot out.
By the way, what do you think about showing both in a publication?
Another question: does it not matter that your multivariate model has only + and not * between all the predictors?
Sorry, I know these may not all be short answer stuff, but I'll appreciate your response as always. Thanks.
Hi mate, first, using ggeffects instead of emmeans is just a convenience for visualization. In fact, ggeffects uses emmeans in the background. tab_model is fine, but does not produce contrasts; emmeans and tbl_regression(add_pairwise_contrasts = T) do. I always show contrasts in my publications! Don't know what you mean by "both". You can't use interactions between all the predictors; your model would most likely collapse and be hardly interpretable. Interactions are cool, and I use them, but they can be a pain in the ass, so I try to use only bivariable interactions (between only two predictors), not multiple interactions in the same model. Cheers!
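A short sketch of the + versus * point, using base R and the built-in mtcars data (the variables are an illustrative choice, not the model from the video). It fits an additive model and a model with a single two-way interaction, compares them with an F-test, and, if the {emmeans} package is installed, prints pairwise contrasts for the categorical predictor.

```r
d <- mtcars
d$cyl <- factor(d$cyl)  # treat the number of cylinders as categorical

m_add <- lm(mpg ~ wt + cyl, data = d)  # additive: only "+"
m_int <- lm(mpg ~ wt * cyl, data = d)  # one two-way interaction: "*"

# F-test: does the interaction improve the fit over the additive model?
anova(m_add, m_int)

# Pairwise contrasts between cylinder groups, adjusted for weight
if (requireNamespace("emmeans", quietly = TRUE)) {
  emm <- emmeans::emmeans(m_add, pairwise ~ cyl)
  print(emm$contrasts)
}
```

With more predictors, writing * between all of them would add every higher-order interaction as well, which is why restricting the model to one interaction between two predictors keeps it interpretable.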
You set the bar high sir, I am new to this channel but I found it highly informative and educational. Please can you share the code in pdf or some other means. How can I find you on GitHub? I am YBMengist from Ethiopia.
Thank you again for your nice feedback. Have a look at my responses to your two previous questions. Kind regards!
Which version of RStudio are you using? I'm running into errors often.
The latest one. But RStudio is most likely not the problem. Just update everything: R, RStudio and packages 📦 Let me know whether it worked.
@@yuzaR-Data-Science It's mostly when I do Multiple LR with many features. I already made sure all were numeric and no categorical but it doesn't plot for check_model(). For some other lr it still works..?
Still, RStudio is not the problem; it could be too many predictors, bad data quality, total separation or something else. Get it to work with a few predictors first, then you can select good predictors and go with them. check_model works fine on my PC with most models.
Is there any way .. from where i can get the data used by you in this video
of course!
install and load ISLR package, the data is in there ;)
library(ISLR)
Hi yuzaR, is your website down? I'm trying to get the R code from some of your videos, but am unable to visit your site.
Hi Benjiz, unfortunately it was blocked for too much traffic. I’ll try to reopen it ASAP with a free alternative, but in the meantime please just rewatch the videos, because my blog is the script for them, so you won’t miss anything. If you wanna get the whole code now, consider joining my channel as a member, because I send the code to members. Cheers
what happened to your website?
Netlify shut down my blog since they want me to pay for increased traffic. I refuse to pay for doing something useful for the world (without earning anything), especially since R is open source. But I want to reopen it ASAP, as soon as I find an alternative to Netlify. It'll take some time though, because I am not an IT guy. But since YouTube is still free, please use the videos till the blog is up and running again, since my blog is actually the script for the video, word by word, code by code. Thanks for understanding!
@@yuzaR-Data-Science I see, too bad, I really enjoyed your work! Maybe GitHub Pages is an alternative?