Multivariable Logistic Regression in R: The Ultimate Masterclass (4K)!

  • Published 20 Jan 2025

COMMENTS • 74

  • @hikeaway1596 3 months ago +3

    Finally! I waited a long time for that video! Thanks!

    • @yuzaR-Data-Science 3 months ago

      Glad I did it! Took me a long time to make. Hopefully it’s useful!

  • @eliasmoonen9992 2 months ago +1

    What a great video, wow! Even the small section on the ROC curve taught me more than all the other videos out there! Would love a video in which you break down the metrics of the curve in more detail. Thank you so much!!!

    • @yuzaR-Data-Science 2 months ago

      Glad you enjoyed it, Elias! I am working on a video about the ROC curve and optimal cutpoints right now. Hope it will deliver the things you are interested in. Stay tuned. Kind regards from my holidays in Australia

  • @MarcoBozzo-mj9uw 3 months ago +1

    man your presentation is staggering. keep doing your thing, do not lose an inch

    • @yuzaR-Data-Science 3 months ago

      Thanks a ton, Marco 🙏 I’ll do my best to keep the content going 😉 hope you like other videos too. Kind regards

  • @BonesFrielinghaus 3 months ago +1

    I like how you explain everything. And a clear, easy-to-understand voice (some non-native English speakers are SO MUCH WORK to understand, way too much cognitive load for me). You're easy to parse... and thanks for the non-YT-generated caption text 💙💙💙

    • @yuzaR-Data-Science 3 months ago

      Awesome! I’m really stoked you find it easy to understand! Makes all the work worth it! The subtitles were also suggested by one of my regular viewers. So don't hesitate to suggest any improvements I can make to increase the quality of the content. Thanks for watching! 🙌

  • @muhammadahmadkhalid364 1 month ago

    I really like your videos and want to follow along with my own data. You have great expertise and know how to code one's own data in the best way, so that one can do everything you teach on your channel. I think this is the only hurdle left for me: I want to apply what you taught to my own data. Your way of teaching, and the fact that your videos are oriented towards real research and article writing, makes me ask for a video on coding data the right way in R, so that it goes through the tools you teach (flextable, gtsummary, sjPlot, etc.) without any issue, and covering some common pitfalls. The main problem I am facing is coding the levels and labels of factors and ordering them. In SPSS we give each category a number and a label. I think most of us are trying to move from SPSS to R, so a video contrasting R with SPSS would also be a good idea. I really can't find a video on YouTube that teaches this in a research-oriented way. Love your content. The best channel for teaching what you need to know in R.

    • @yuzaR-Data-Science 1 month ago +1

      Thank you very much, Muhammad, for such nice feedback! Sure, in the beginning we all had difficulties switching to R. I came to R from Matlab and NCSS, and also had to fight my way through the error messages. The good news is that the error messages are finite: there are only a few (20-50) of them, and you quickly learn how to deal with them. After that, error messages become a help. Levels are easy, you can determine the order yourself:
      library(dplyr)
      library(forcats) # install the packages, if they don't load
      df
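The code in the reply above is cut off after `df`, so the author's actual example is not recoverable. As a minimal sketch of the idea (attach labels to numeric codes and choose the level order yourself, much like SPSS value labels), here is a base R version. The data frame `df`, the column `severity`, and its codes and labels are hypothetical and not taken from the video; the forcats package loaded in the reply offers `fct_relevel()` for the same reordering job.

```r
# Hypothetical example data: numeric codes as they might arrive from SPSS
df <- data.frame(severity = c(2, 1, 3, 2, 1))

# Attach a label to each code and fix the level order in one step:
# 'levels' lists the codes in the desired order, 'labels' names them
df$severity <- factor(df$severity,
                      levels = c(1, 2, 3),
                      labels = c("mild", "moderate", "severe"))
levels(df$severity)

# Move "severe" to the front, e.g. to use it as the reference
# category in a regression model
df$severity <- relevel(df$severity, ref = "severe")
levels(df$severity)
```

The first level of a factor is the reference category in `glm()`, which is why the order matters for how odds ratios are reported.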

    • @muhammadahmadkhalid364 1 month ago

      @yuzaR-Data-Science Thank you very much. I will also be looking forward to more videos.

  • @ferhat5157 3 months ago +5

    Great video. Could you please record mixed (random) effects models as well? I know it’s a big ask, but at least linear and logistic would be great!

    • @yuzaR-Data-Science 3 months ago +5

      Of course! The mixed-models content will come in the near future. I just need to cover some other basic models and topics, which I hope will also be useful to you, and then I plan to cover most of the spectrum of mixed models beyond linear and logistic and present the best (in my opinion) packages for mixed models. So, please, stay tuned! Kind regards! Yury

    • @Dhallager 3 months ago

      Another great video! My R output has become 1000% easier and better following your videos. Will you include interpretation and visualization of interactions also?

    • @yuzaR-Data-Science 3 months ago

      Hey man, thanks a lot for such nice feedback! Yes, I plan a video on interactions in logistic regression. Stay tuned, and if you think my content could also be helpful for someone you know, please share my videos with them :) Cheers

  • @kellycriterion1019 1 month ago +1

    Great video❤
    Please make some videos on Survival Analysis as well.

    • @yuzaR-Data-Science 1 month ago

      Thank you very much for the feedback! I will do survival analysis with R in a video similar to this one! I have two very old videos on survival analysis on this channel; they are not very good, not in R, and a bit theoretical. I don't think they are helpful, but if you want, you could check them out.

  • @Abdulaziz-yj1ns 29 days ago

    Thank you so much very informative

  • @GreigR 3 months ago

    That's an amazing summary - thank you so much and yes please to a mixed effects and a linear model video

    • @yuzaR-Data-Science 3 months ago

      Sure thing! They were on my radar anyway, but now I am getting serious about them! The content will come in the near future. I just need to cover some other basic models and topics, which I hope will also be useful to you, and then I plan to cover most of the spectrum of mixed models beyond linear and logistic and present the best (in my opinion) packages for mixed models. So, please, stay tuned! Kind regards! Yury

    • @yuzaR-Data-Science 3 months ago +1

      Hi Greig, if you mean a usual linear regression (not mixed-effects linear), then I have recently done 4 videos on it. Besides, I have content on quantile, robust, and bootstrapped regressions, since I use them too in my everyday work life. Hope other videos will also resonate with you. And hope you'll stick around until I create a mixed-effects series ;) Kind regards! Yury

  • @SUNILYADAV-tv5ze 3 months ago

    Nicely delivered lecture and the best explanation of multivariable logistic regression through an example. Thanks

  • @yurisilvadesouza3059 3 months ago

    The best explanation I have ever seen!

    • @yuzaR-Data-Science 3 months ago

      Thank you so much for the kind words! Your support really motivates me to keep creating! If it helped you, please, share it with somebody, who also might benefit from it! That would mean the world to me! Cheers, Yury

  • @Undstoppablecricket 1 month ago

    Excellent work

  • @Ange-y1k 2 months ago

    Thanks a lot for this piece of work 👌

  • @alexisdosis5524 2 months ago

    Your video is amazing and so explanatory!!!
    Thanks for posting!!!
    Could I ask something, please, as I see conflicting information: if you have several independent variables (predictors) and you want to assess which ones are more important for your logistic regression (as in univariate analysis), is it appropriate to check each one with a separate logistic regression?
    What would you recommend? I read that it is an outdated approach? But in medicine I have seen several authors using it?

    • @yuzaR-Data-Science 2 months ago +1

      no, you can sort them out via p-values, e.g.

    • @alexisdosis5524 2 months ago

      @yuzaR-Data-Science Thanks for replying! Just to clarify, would you put all of the available predictors in a multivariate model and then based on p-values
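Both the reply and the follow-up question above are cut off, so the exact recommendation is not recoverable. One reading of "sort them out via p-values" is to fit a single model with all candidate predictors and look at a per-predictor test, rather than running one univariable model per predictor. The sketch below shows that reading in base R; the built-in `mtcars` data and the predictors `hp` and `wt` are illustrative assumptions, not the video's example.

```r
# Built-in data; 'am' (0/1) serves as the binary outcome.
# The choice of predictors here is purely illustrative.
full_model <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Likelihood-ratio test for dropping each predictor from the full model,
# so every predictor is judged while adjusting for the others
lrt <- drop1(full_model, test = "Chisq")

# Order the rows by p-value, smallest first (the "<none>" row has NA
# and is placed last)
lrt[order(lrt$`Pr(>Chi)`), ]
```

Whether p-values are a good basis for choosing predictors is a separate question that the truncated exchange does not settle.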

  • @robertc2121 3 months ago

    Thank you so much. Love your content. This is incredibly helpful. I hope you do linear too :)

    • @yuzaR-Data-Science 3 months ago

      Thanks, Robert! If you mean linear regression, then I have recently done 4 videos on it. Besides, I have content on quantile, robust, and bootstrapped regressions, since I use them too in my everyday work life. Hope other videos will also resonate with you. Kind regards! Yury

  • @warrenmalambo578 3 months ago

    Great video. Looking forward to a separate video on ROC curve and confusion matrix.

  • @edinsondelgado4895 3 months ago

    Please do a mixed-effects (random) model video! Your videos are the best on YouTube so far.

    • @yuzaR-Data-Science 3 months ago

      Thank you so much! I'll definitely do videos on mixed models in R! Stay tuned ;)

  • @RaoniDominguesMD 3 months ago

    Great video! Please, can you make one on variable selection for multivariate models? Excellent content!

    • @yuzaR-Data-Science 3 months ago

      :) I actually already did ;) check out my video on {glmulti} package and let me know whether it's what you wanted. Thanks for feedback and for watching!

  • @nikeforo2612 2 months ago +1

    Terrific video, very detailed yet clear. I don't know if you covered it already, but if you plan to cover cross-tabulation analysis, would you consider giving my 'chisquare' package a try?

    • @yuzaR-Data-Science 2 months ago +2

      Hi Nike, thanks for the positive feedback. I am interested in your 'chisquare' package. Unfortunately, I did not find much info on it online. I have actually already made one video on the chi-squared test. If you have seen that one, what does your package do better and differently? If you send me the code for what your package can do, with explanations of why it is useful and why it is better than the usual chi-square function or ggbarstats, I would love to make a video on your package!

    • @nikeforo2612 2 months ago

      @yuzaR-Data-Science Hello, and thanks for your reply. The package is on CRAN, and it's currently at version 1.1.1 (it started from vers 0.1 in 2022). In a few words, the package is meant to provide a one-stop shop for chi-square analysis of cross-tabs, and provides a number of facilities that are not coherently integrated in existing packages (to the best of my knowledge). For example, it provides (in just one simple line of code) different types of chi-sq residuals (with adjustments for multiple comparisons, and color-coded for easy visual interpretation) and an extensive suite of association coefficients (for both 2x2 and larger tables), some of which are not currently implemented elsewhere (maximum-corrected versions of the phi and Cramer's V coeff, corrected version of Goodman-Kruskal's lambda, both asymmetric and symmetric). Also, it provides different versions of the chi-sq test itself, like the N-1-corrected version, which (again) is not currently provided elsewhere. As for post-hoc analysis, it provides measures not currently available elsewhere, like the so-called Quetelet index and the IJ association factor. Further, it computes independent odds ratios for tables larger than 2x2, while for 2xK tables it can optionally produce a plot of pairwise odds ratios (plus confidence intervals). Also, it provides suggestions as to a 'viable' chi-sq test given the input table characteristics. A verbal articulation of effect size for the relevant association coefficients (both chi-square-based and margin-free) is also reported. Finally, all the outputs are nicely formatted via the 'gt' table package. I think that should be pretty much all. Everything can be obtained by just running: chisquare(mytable). Cheers.

    • @yuzaR-Data-Science 2 months ago +1

      Hey, your package is impressive, and I found the visualization of odds ratios good. I have two questions:
      - First, do you have more info, like an article, on post hoc pairwise tests with all the significance levels, for when we have a 4x4 or 3x5 table, so that all categories (percentages) are checked automatically? Until now I have used the pairwise_fishers_test() function, which is cool, but it is extra code. It would be amazing if we could just use your function and get all we need: an ORs plot with significance levels and all the pairwise 2x2 tests from a bigger contingency table in some form of a table.
      - Second, and maybe more important: I could not get the chisquare() function to work with a simple table() call:
      > chisquare(table(mtcars$cyl, mtcars$am) )
      Error in `gt::tab_style()`:
      ! Failed to style the body of the table.
      Caused by error in `cells_body()`:
      ! Can't select columns that don't exist.
      ✖ Column `0` doesn't exist.
      Run `rlang::last_trace()` to see where the error occurred.
      So, if this could be allowed and we could do bigger tables, like this one: chisquare(table(ISLR::Wage$jobclass, ISLR::Wage$education) ), that would be awesome!

    • @nikeforo2612 2 months ago

      @yuzaR-Data-Science Hello. Thanks for taking the time to check that and for replying. I do not want to hijack your comments section here. If you want to contact me at the email you find in the package documentation, I will be more than happy to discuss things further. Looking forward. Cheers.

    • @yuzaR-Data-Science 2 months ago +2

      Hey mate, no worries, you're not hijacking the comments section! :) I am actually glad to read and answer the comments. I'll be on holidays for the next few weeks, but we can talk about your package next month. Generally, as I said before, I would love to be able to apply your chisquare function to a simple cross table, like "chisquare(table(mtcars$cyl, mtcars$am) )". Do you think it's possible?

  • @rcanjino 2 months ago

    Fantastic intro to a whole analysis pipeline for logistic regression. Do you have something similar for survival regression? ❤

    • @yuzaR-Data-Science 2 months ago +1

      Unfortunately not. Only two older theoretical videos on survival, but they are low quality and have no programming. I plan to do a similar one in the future. So, please, stay tuned.

    • @rcanjino 2 months ago

      Looking forward to that. Thanks for this great vid nonetheless!

    • @yuzaR-Data-Science 2 months ago

      welcome!

  • @CROscarAbrahamJosePadillaSolis 1 month ago

    Excellent video! I came here from the recommendation of the video on simple linear regression, and it's great. I have a question that I haven't been able to resolve. When using performance, I understand that categorical variables are analyzed by creating dummies, but I don't know how the VIF is calculated. Is there a formula, or how could we check multicollinearity for non-quantitative variables?

    • @yuzaR-Data-Science 1 month ago

      Sure, VIF works for both numeric and categorical variables. How it's calculated, I don't know exactly, just the basic formula, something like 1/(1 - summary(model)$r.squared). But I treat it like a car: I don't know how the engine works, but I know how to drive. So, if your VIF is below 5, or in some cases below 10, you can accept the results. When VIF is above 10, you'll find some multicollinear variables (both numeric and categorical).
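For readers who want to see what the formula in the reply above does: in the classical definition, the R² in 1/(1 - R²) comes from an auxiliary regression of one predictor on all the other predictors, not from the outcome model itself. The sketch below computes it by hand in base R. The `mtcars` predictors `wt`, `hp`, and `disp` are an illustrative assumption, and the sketch covers numeric predictors only; categorical predictors with several dummy columns call for the generalized VIF mentioned elsewhere in this thread.

```r
# Illustrative numeric predictors from the built-in mtcars data
predictors <- c("wt", "hp", "disp")

vif_by_hand <- sapply(predictors, function(p) {
  others <- setdiff(predictors, p)
  # Auxiliary regression: predictor p explained by the remaining predictors
  aux <- lm(reformulate(others, response = p), data = mtcars)
  # VIF = 1 / (1 - R^2) of that auxiliary regression
  1 / (1 - summary(aux)$r.squared)
})

round(vif_by_hand, 2)
```

A VIF of 1 means the predictor is uncorrelated with the others; the value grows as the other predictors explain more of its variance.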

  • @wasafisafi612 2 months ago

    Thank you for the video

  • @emredunder9108 1 month ago

    I have a short contribution: if the model contains any categorical variable, classical VIF values are not appropriate. In that case, it is best to use generalized VIF (GVIF) values.

    • @yuzaR-Data-Science 1 month ago

      Nice contribution! Yeah, the gtsummary package uses GVIF by default:
      tbl_regression(model) |> add_vif()

  • @r.hainez2131 2 months ago

    That is another great video, thank you so much! For the ROC curve, the performance package provides a function which produces a similar result: performance_roc(x = m) %>% plot(). Is there a difference from pROC::roc()?

    • @yuzaR-Data-Science 2 months ago +1

      Glad it was helpful! Sure, there are several functions for ROC curves in R. Several packages provide good results, but I like two of them more than the rest:
      Epi::ROC(form = survived ~ predicted_glm, data = d, plot = "ROC", grid = F, MX = T, MI = F, lwd = 3)
      cutpointr() - I am working on a whole video about this one, it's just amazing
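The packages named in this exchange differ mainly in their extras (plots, cutpoints, confidence intervals); the curve itself is the same construction everywhere. As a rough sketch of that shared core, the base R code below sweeps a threshold over the fitted probabilities of a logistic model and integrates the curve with the trapezoid rule. The `mtcars` model is an illustrative assumption and is not the model from the video.

```r
# Illustrative logistic model on built-in data
fit <- glm(am ~ hp + wt, data = mtcars, family = binomial)
p <- fitted(fit)   # predicted probabilities
y <- mtcars$am     # observed 0/1 outcome

# Thresholds from "nobody predicted positive" (Inf) to "everybody" (-Inf)
thr <- sort(unique(c(Inf, p, -Inf)), decreasing = TRUE)

# True and false positive rates at each threshold
tpr <- sapply(thr, function(t) mean(p[y == 1] >= t))
fpr <- sapply(thr, function(t) mean(p[y == 0] >= t))

# Area under the curve by the trapezoid rule
auc <- sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
auc

plot(fpr, tpr, type = "l",
     xlab = "False positive rate (1 - specificity)",
     ylab = "True positive rate (sensitivity)")
abline(0, 1, lty = 2)  # reference line of a useless classifier
```

Because the AUC here is computed on the same data the model was fitted to, it is an optimistic estimate.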

  • @123eorl 2 months ago

    amazing!!

  • @joshstat8114 3 months ago

    Where (or when) could be the "Multivariate" Linear Regression one, since you covered the Multivariable Linear (this time, logistic) Regression?

    • @yuzaR-Data-Science 3 months ago

      By multivariate, do you mean several outcomes? The term is used often, but people define it differently.

    • @joshstat8114 3 months ago

      @yuzaR-Data-Science And I don't like it that way. It should be defined consistently. That way, much more literature could be produced and reproduced.

    • @yuzaR-Data-Science 3 months ago

      Oh man, the more I do science, the more I see its imperfections. Different definitions of the same thing are the norm. Unfortunately. But still, I think science is the best thing people can do.

  • @aram5704 3 months ago

    You are a magician!

    • @yuzaR-Data-Science 3 months ago

      Really appreciate your feedback 🙏 Thanks for watching!

  • @GreenManXY 3 months ago

    I tried using the performance package on various models and unfortunately it seems a bit limited to lm and glm. Doesn't work with glmnet, for example. Doesn't work with KNN or RandomForest. I'm assuming it's because it checks for linear assumptions only... Bit of a shame, I had hoped it could be a go to tool for all model types.
    For now, I find that the tidymodels packages (parsnip, tune) have more standardized functions like collect_metrics(). But they're not as visually cool as check_model()...

    • @yuzaR-Data-Science 3 months ago +1

      Well, yes, the performance package doesn't work well with machine learning models, but it works with almost all "important" statistical models, from frequentist to Bayesian. I use more stats than ML, so I can't suggest a better alternative than collect_metrics() at the moment. But I'll get into ML one day and will see what I find. In the meantime, I hope you enjoy the rest of the videos :) Cheers

  • @ramigiopololo2912 1 month ago

    Do you have a website where you share your code?

    • @yuzaR-Data-Science 1 month ago

      Of course, when you join my channel, I send you the PDF with code and explanations (transcripts) of any video. But, please, don't feel like you have to join! You can just pause the video and type up the code; it's free and there is not much code. Please only join if you want to support my work, and you'll get the benefit of receiving the transcripts. Kind regards, Yury

  • @rubyamanda9009 3 months ago

    I really like how you are patient and make the interpretations so understandable! I also love the memes 😂
    Please do you have a website where you share the codes?
    Please can you make a video explaining the basic assumptions, visualisations and interpretations of the outcomes from the nearest neighbour matching outcome?

    • @yuzaR-Data-Science 3 months ago +2

      Thank you so much for such nice feedback! :) I am never sure whether people like my memes, but I always enjoy similar memes in other videos :) Nearest neighbour matching is actually new to me; I checked it out and find it really interesting. I'll put it on the list ;) Thanks for watching!