R demo | Mann-Whitney U Test = Wilcoxon Rank Sum Test | How to conduct, visualise, interpret & more😉

  • Published 18 Nov 2024

COMMENTS • 42

  • @bartoszkedziora3256 1 year ago

    You are the best you can find on youtube! Thank you so much

  • @117chris9 1 year ago

    Brilliant thank you so much

    • @yuzaR-Data-Science 1 year ago

      Thanks 🙏 If you liked this one, you might like the package reviews; gtsummary, for example, is one of the most useful.

  • @ismailabdelli7287 2 years ago

    thank you so much! that was really a helpful and accurate explanation

  • @martinglhf 2 years ago

    Very well explained, thanks!

  • @angvl8793 2 years ago

    Great video! Thank you very much!

  • @so4ragb 2 years ago +1

    Dear Yuri,
    Thanks for your great videos, which I have been following and recommending to my fellow physicians. These are so great!
    Please consider making some tutorials on univariable and multivariable analyses in oncology, with independent variables like age, cancer stage, treatment, baseline lab values, ECOG scores, etc., and outcomes like time to event, or death vs. no death. That would be great!

    • @yuzaR-Data-Science 2 years ago +1

      Dear so4ragb, thank you very much for your feedback! And thanks for the suggestion. Interestingly, I am already in the process of making a video about a cool package for quick uni- and multivariate analyses in the medical field ... although statistics is truly field-agnostic. So, please, stay tuned ;)

    • @so4ragb 2 years ago

      @@yuzaR-Data-Science that's fantastic news. Very much looking forward to watching it. Hoping for more clinical stats 😉. Thanks for all your efforts.

    • @yuzaR-Data-Science 2 years ago

      You are very welcome!

  • @moviezone8130 6 months ago

    Thanks for the wonderful explanation. As I said before, you set the bar high. By the way, I want to move into data science. I have a bachelor's degree in Chemistry and a master's degree in Environmental Science from Addis Ababa University. I have been learning data science with R for the last 6 months. What would you advise? I live in Ethiopia, so I can't take online courses because we don't have an international bank payment system, so I depend on YouTube and freely available books. Data science really excites me a lot. Thanks.

    • @yuzaR-Data-Science 5 months ago +1

      Hi man, glad to hear that you are excited about data science! The good news is, with the internet you can learn anything! There are more than enough books and free resources to learn about data science and R or Python! Please, don't pay for courses, they are usually crap. YouTube, blogs and free books will be enough. If you want to really learn R, here are some free books: R4DS, Tidy Modeling with R and ISLR. If you focus on those (+ some practice and real work + learning from other resources) you'll be a better data scientist in a year than 90% of those who finished a fancy university. So, keep up the learning energy and I hope my YouTube channel helps you on the way there! Cheers

    • @moviezone8130 5 months ago

      Thanks for your prompt reply. I will do as you said, I'm into R first so I will stay in it for a while to master it. I will also stay in touch with your channel. I am on LinkedIn so we can be friends there too. Thanks.

    • @yuzaR-Data-Science 5 months ago

      sure, just send me the invite ;)

  • @syhusada1130 2 years ago

    Amazing

  • @alelust7170 1 year ago

    Very well explained!
    I have a question: in your example, the two groups have the same number of observations (15). Can I compare groups of different sizes using the same settings as in the video? Thanks

    • @yuzaR-Data-Science 1 year ago

      Sure, since the MWU test is for independent samples, the groups do not need the same number of observations. For dependent samples (the Wilcoxon signed-rank test) they do, because the observations are paired. Thanks for the feedback and thanks for watching!
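
The point about group sizes can be checked in base R. This is a minimal sketch, assuming simulated data; the names `group_a` and `group_b` and the sizes 10 and 15 are hypothetical and not taken from the video.

```r
# Hypothetical data: two independent groups of different sizes (10 vs 15)
set.seed(1)
group_a <- rnorm(10, mean = 5)
group_b <- rnorm(15, mean = 6)

# Mann-Whitney U (Wilcoxon rank-sum) test: unequal group sizes are fine
mwu <- wilcox.test(group_a, group_b)
print(mwu$p.value)

# The paired (signed-rank) version needs one-to-one pairs,
# so vectors of different lengths are rejected with an error
paired_attempt <- try(
  wilcox.test(group_a, group_b, paired = TRUE),
  silent = TRUE
)
print(inherits(paired_attempt, "try-error"))
```

`try()` is used only so that the script keeps running after the paired call fails.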

  • @luisa1551 6 months ago

    Thanks for the video! I have a question: the ranking makes it unnecessary to know the distribution. However, how would you approach the same problem using Generalized Linear Models as a base? From what I understand, all the classical hypothesis tests can be done with Generalized Linear Models or Linear Mixed Models. For GLMs, I would need a link function, but how do I decide which one? I am not sure what the advantage of ranking is, apart from getting around the assumption of normality.

    • @yuzaR-Data-Science 6 months ago

      Dear Luisa, answering which link function to choose would need a whole new video, and I am planning to make one in the future. While the ranking resolves normality and heterogeneity of variances, I am not a big fan of ranking, because it kills the real data we have measured. It was just important to describe, so that people don't think that they compare medians. The median, by the way, is the better choice for addressing many problems in the data, so I would recommend diving into quantile regression first, before getting to link functions. I have two videos on Quantile Regression on the channel, so feel free to check them out. Cheers!
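
The remark that ranking discards the measured values can be illustrated in base R. This is a minimal sketch, assuming hypothetical simulated positive data: the statistic W reported by `wilcox.test()` depends only on the pooled ranks, so a monotone transformation such as `log()` leaves it unchanged.

```r
# Hypothetical skewed, positive data for two independent groups
set.seed(42)
x <- rexp(12, rate = 1)
y <- rexp(15, rate = 0.5)

# W as reported by wilcox.test() for the first sample
w_raw <- unname(wilcox.test(x, y)$statistic)

# The same statistic computed by hand from the pooled ranks:
# sum of the ranks of x minus n_x * (n_x + 1) / 2
r <- rank(c(x, y))
w_manual <- sum(r[seq_along(x)]) - length(x) * (length(x) + 1) / 2

# A monotone transformation changes the values but not the ranks
w_log <- unname(wilcox.test(log(x), log(y))$statistic)

print(c(raw = w_raw, manual = w_manual, log = w_log))
```

All three values agree, which is why the test says something about the ordering of the observations rather than about their actual magnitudes.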

  • @ednacossa8863 1 year ago

    Hello @Yuzar, thank you for sharing all this knowledge. I'm working on some data about changes in soil organic carbon after conversion of forest into agriculture.
    The data were collected at different depths (five depths). Besides making a plot comparing forest and agriculture at each depth, is there any way, using this package (ggbetweenstats), to plot all the depths in the same plot and see the changes among the groups?

    • @yuzaR-Data-Science 1 year ago +1

      Sure, you either use grouped_ggbetweenstats to produce subplots for the different depths, or you put all the depths into one column, set the order of the categories via "factor" and "levels", and then put that variable on the x-axis. Then you'll be able to get post-hoc tests.

    • @ednacossa8863 1 year ago

      @@yuzaR-Data-Science Thank you. I'm gonna do that.

    • @yuzaR-Data-Science 1 year ago

      @@ednacossa8863 you are welcome!
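
The "factor" and "levels" step from the reply above can be sketched in base R. The data frame `soil`, its columns, and the depth labels are hypothetical, chosen only to show how the category order is set.

```r
# Hypothetical long-format data: one row per measurement,
# all depths stored in a single column
set.seed(10)
depth_levels <- c("0-5", "5-10", "10-20", "20-40", "40-60")
soil <- data.frame(
  depth    = rep(depth_levels, each = 6),
  land_use = rep(c("forest", "agriculture"), times = 15),
  soc      = runif(30, min = 5, max = 40)
)

# Without explicit levels, factor() sorts the labels as text,
# which usually does not match the physical depth order
print(levels(factor(soil$depth)))

# Setting the levels explicitly fixes the order used on the x-axis
soil$depth <- factor(soil$depth, levels = depth_levels)
print(levels(soil$depth))
```

With the order fixed, `depth` can be put on the x-axis of `ggbetweenstats()`, or passed as `grouping.var` to `grouped_ggbetweenstats()` with `land_use` on the x-axis to get one subplot per depth.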

  • @Alex-gw6pm 9 months ago

    Thank you so much! Can you tell me please: I have 4 animal groups, with 5 animals in each group. The groups are: 1) intact animals, 2) a group exposed to the first factor, 3) a group exposed to a second, different factor, and 4) a control group without exposure to the second factor. I'm interested in comparing the 3rd and 4th groups, and at the same time I want to compare the 4th group with the 1st group. In your opinion, which test should I choose: Mann-Whitney, to compare first the 4th group with the 1st group and then the 3rd with the 4th group, or Kruskal-Wallis, to compare all the groups together? I tried both tests; Kruskal-Wallis shows no differences while Mann-Whitney does. I guess the results of Mann-Whitney are more trustworthy, but I am not sure, so I decided to ask you as a statistician.
    P.S. I didn't apply any correction method for Mann-Whitney.

    • @yuzaR-Data-Science 9 months ago

      Hi Alex, the short answer: ggbetweenstats(mtcars, x = cyl, y = mpg, p.adjust.method = "none", pairwise.display = "all"). The longer answer is: you have to correct for multiple comparisons! Or at least explicitly state in your paper that you did not. I have a video on Kruskal-Wallis on my channel, in case you have not discovered it yet. Hope that helps!
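
The correction for multiple comparisons can be sketched in base R. This is a minimal example, assuming simulated data for four groups of five animals; the group labels, the values, and the choice of the Holm method are illustrative assumptions, not part of the original answer.

```r
# Hypothetical data: 4 groups with 5 animals each
set.seed(7)
animals <- data.frame(
  group = factor(rep(c("intact", "factor1", "factor2", "control"), each = 5)),
  value = c(rnorm(5, 10), rnorm(5, 11), rnorm(5, 13), rnorm(5, 10))
)

# Omnibus test across all four groups
kw <- kruskal.test(value ~ group, data = animals)
print(kw$p.value)

# All pairwise Mann-Whitney tests, without and with Holm correction
raw  <- pairwise.wilcox.test(animals$value, animals$group,
                             p.adjust.method = "none")
holm <- pairwise.wilcox.test(animals$value, animals$group,
                             p.adjust.method = "holm")
print(raw$p.value)
print(holm$p.value)

# Only the two planned comparisons, corrected for two tests
v <- split(animals$value, animals$group)
p_planned <- c(
  control_vs_intact  = wilcox.test(v$control, v$intact)$p.value,
  factor2_vs_control = wilcox.test(v$factor2, v$control)$p.value
)
p_adjusted <- p.adjust(p_planned, method = "holm")
print(p_adjusted)
```

Restricting the analysis to comparisons planned in advance means the correction covers two tests rather than all six pairs, which matters with only five animals per group.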

  • @DarshiPatel-x6u 11 months ago

    Great explanation!! However, I am getting an error while using the ggstatsplot function. Can you please suggest an alternative, or a way to solve this error?

    • @yuzaR-Data-Science 11 months ago

      Sure. Since ggstatsplot works on top of many other packages, there might be version discrepancies between them. So, update R, then update RStudio, then update all the packages.
      If you still get the error message, read it carefully: maybe one package is missing. Also check whether your data is in the right format, or just google the error message. Tons of people have had it before too, and most of those cases are already solved. Cheers

  • @syhusada1130 2 years ago

    The lowest p-value in one of the groups I want to test is 0.06. Is it low enough to call the data not normally distributed?

    • @yuzaR-Data-Science 2 years ago +1

      I would still go with the normal distribution. If you are not sure, you can use plot_density() or ggqqplot() for this group and check normality visually. When it is approximately normal (nobody knows what approximately means ;) everyone decides for themselves), use a parametric test.

    • @syhusada1130 2 years ago

      @@yuzaR-Data-Science Okay, so I did use ggqqplot, and the data sit within the grey area, so they're normally distributed?

    • @yuzaR-Data-Science 2 years ago

      Yes

    • @syhusada1130 2 years ago

      @@yuzaR-Data-Science Okay, I guess I'll use the Welch t-test, since the variances are not equal.

    • @yuzaR-Data-Science 2 years ago

      Yes, and this is pretty sure, no guessing ;) The two tests (Shapiro and Levene's) are useful because they help you decide which final test to use.
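
The decision path discussed in this thread can be sketched in base R. This is a rough illustration on hypothetical simulated data, not a fixed rule: the 0.05 cutoffs are the conventional assumption, and because Levene's test is not part of base R, the F test `var.test()` is swapped in here for the variance check.

```r
# Hypothetical data: two groups with clearly different spread
set.seed(123)
g1 <- rnorm(15, mean = 20, sd = 2)
g2 <- rnorm(15, mean = 23, sd = 5)

# Normality check per group (Shapiro-Wilk)
shapiro_p <- c(g1 = shapiro.test(g1)$p.value,
               g2 = shapiro.test(g2)$p.value)

# Variance check: base-R F test, used here in place of Levene's test
variance_p <- var.test(g1, g2)$p.value

if (all(shapiro_p > 0.05)) {
  # Approximately normal: Student's t-test if the variances look equal,
  # otherwise the Welch t-test (var.equal = FALSE)
  result <- t.test(g1, g2, var.equal = variance_p > 0.05)
} else {
  # Otherwise fall back to the Mann-Whitney U test
  result <- wilcox.test(g1, g2)
}

print(shapiro_p)
print(variance_p)
print(result$method)
```

As the thread notes, a borderline Shapiro-Wilk p-value is better judged together with a Q-Q plot than by the cutoff alone.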

  • @so4ragb 2 years ago

    Thank you. When I run ggbetweenstats, I get the following error message. Any idea where the problem lies?
    Error in `mutate()`:
    ! Problem while computing `n_label = paste0(one_drug1, "
    (n = ", .prettyNum(n), ")")`.
    Caused by error in `vapply()`:
    ! values must be length 1,
    but FUN(X[[1]]) result is length 3
    > rlang::last_error()
    Error in `mutate()`:
    ! Problem while computing `n_label = paste0(one_drug1, "
    (n = ", .prettyNum(n), ")")`.
    Caused by error in `vapply()`:
    ! values must be length 1,
    but FUN(X[[1]]) result is length 3
    ---
    Backtrace:
    1. ggstatsplot::ggbetweenstats(...)
    15. statsExpressions:::.prettyNum(n)
    16. base::prettyNum(x, big.mark = ",", scientific = FALSE)
    17. base::vapply(...)
    Run `rlang::last_trace()` to see the full context.
    > rlang::last_trace()
    Error in `mutate()`:
    ! Problem while computing `n_label = paste0(one_drug1, "
    (n = ", .prettyNum(n), ")")`.
    Caused by error in `vapply()`:
    ! values must be length 1,
    but FUN(X[[1]]) result is length 3

    • @yuzaR-Data-Science 2 years ago +1

      Hey, try to update all packages 📦, that should solve it.

  • @yasinnabi 2 years ago

    This is a wonderful and interesting channel. I found it very useful. Worth subbing, and liked! From a fellow creator.