FDR, q-values vs p-values: multiple testing simply explained!

  • Published 23 Nov 2022
  • Why is multiple testing a big issue in biostatistics? In this video, we will explain why multiple testing is so dangerous when analysing large datasets, and how to correct for it. We will cover some of the most common methods: Bonferroni correction, Benjamini-Hochberg (BH) and q-values.
    Don't let the monster of multiple testing eat your data!
    --------------------------------------------------------------------------------------------------------------------
    Watched it already? If you liked this video or found it useful, please let me know! Your comments and feedback are very much appreciated 😊 If you have questions, don't hesitate to leave me a comment down below; I will answer as soon as I can :)
    --------------------------------------------------------------------------------------------------------------------
    For more biostatistics tools and resources, visit biostatsquid.com/ for:
    • simple and clear explanations of biostatistics methods
    • computational biology tools
    • easy step-by-step tutorials in R and Python
    to analyse and visualise your biological data!
    Or follow me on Instagram at @biostatsquid
    Don’t forget to subscribe if you don’t want to miss another video from me!
    --------------------------------------------------------------------------------------------------------------------
    More multiple testing resources:
    Compare different multiple testing corrections in R:
    www.stat.berkeley.edu/~mgoldm...
    A really cool explanation of the FDR from Statquest!
    • False Discovery Rates,...
    ------------------------------------------------------------------------------------------------------------
    Trash FM by Alexander Nakarada | www.serpentsoundstudios.com
    Music promoted by www.free-stock-music.com
    Attribution 4.0 International (CC BY 4.0)
    creativecommons.org/licenses/...
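
To make the description's point about multiple testing concrete, here is a minimal sketch of the arithmetic behind it. It assumes independent tests in which no real effect exists and a significance threshold of 0.05; both are assumptions for illustration, and the function names are hypothetical rather than taken from the video.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n independent tests
    when every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold under the Bonferroni correction: alpha / n."""
    return alpha / n_tests


if __name__ == "__main__":
    for n in (1, 10, 100):
        fwer = familywise_error_rate(n)
        cutoff = bonferroni_threshold(n)
        print(f"{n:>3} tests: P(at least one false positive) = {fwer:.3f}, "
              f"Bonferroni cutoff = {cutoff:.5f}")
```

With a single test the chance of a false positive is 5%, but it grows quickly as tests are added, which is why the per-test cutoff has to shrink (Bonferroni) or the false discovery rate has to be controlled instead (Benjamini-Hochberg).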

COMMENTS • 31

  • @jorgea.servert9490 1 year ago +5

    This video is brilliant! You are a natural at explaining statistics. Thank you so much!

    • @biostatsquid 1 year ago

      Thanks for your kind words Jorge! Glad it was useful:)

  • @ankushjamthikar9780 1 year ago +1

    This video is very good! You explained it in a nice way. Thank you for the video. Keep posting more videos on biostatistics.

    • @biostatsquid 1 year ago

      Thanks Ankush! I am glad you found it helpful:)

  • @user-fw2kc6iv1f 8 months ago

    Thanks, you made me truly understand q-values.

  • @mihacerne7313 1 year ago +2

    Multiple thanks for the video!

  • @sanjaisrao484 1 year ago

    Thanks, I finally understood something about p value and FDR

  • @svetlanavasileva8961 1 year ago

    It’s gorgeous !!! Please do more about biostatistics !!!!!

  • @biotales371 1 year ago +1

    simply brilliant...

  • @anmolpardeshi3138 1 year ago

    This is an awesome video! I applaud the simple and fun explanation. Just two things: (a) knowing that "coffee" is NOT associated (among the significant outcomes) comes from prior knowledge, but we might not always have this prior knowledge. Then what do we do? (b) It's not shown how the adjusted p-values were calculated; could you please clarify that? Otherwise this is a good video! Thanks.

  • @mocabeentrill 1 year ago +1

    Thank you Biosquidee!

  • @carlosdomingues3551 3 months ago

    Thank you for this great, concise video; you can tell you put a lot of work into it =] Any follow-up on red smarties linked to baldness??

  • @anphan7526 1 year ago +2

    Shouldn't it be 1/16 at 7:07, since we have 16 objects being marked as significant?

    • @tgc7053 10 months ago

      I think so.

  • @ZLYang 11 months ago +2

    At 1:29, if you find a link, why is p still larger than 0.05?

    • @biostatsquid 11 months ago

      Oh nicely spotted! That's a typo, sorry for the confusion! It's p < 0.05. Thanks for noticing and commenting about it!

    • @ZLYang 11 months ago

      @biostatsquid 😁

  • @enzolong9085 11 months ago

    Thank you so much!!!

  • @artarz9542 4 months ago

    How do you determine the number of false positives? What are the criteria?

  • @emotaph5709 1 year ago +1

    Thank you for this video and the effort that must've gone into this. Everything you explained was very easy to understand.
    I had a question:
    You spoke about "correlations" in the video, but what about directional relationships, such as regressions, where we speak in terms of "dependent" and "independent" variables? In the examples you shared, the genes would be independent variables and we want to see their relation with the "dependent" variable of being a morning person. Now, if we were to check whether 1 gene in particular (independent variable) affects different things (different dependent variables), such as blindness, baldness, wakefulness, color blindness, etc., would the same logic of q-values hold?
    It would be lovely if you get the time to get back to this. If not, thanks anyway for the great video!

    • @biostatsquid 1 year ago +1

      Hi! Thank you so much for your comment! That was a great question. My answer is... yes and no :)
      So, in general, q-values are not typically used for linear regression. Let's see why.
      As we saw in the video, q-values are commonly used in the context of multiple hypothesis testing, specifically in controlling the false discovery rate (FDR). They are used to adjust p-values for multiple comparisons when conducting hypothesis tests on a large number of variables or features simultaneously (for example, gene expression studies).
      Linear regression, on the other hand, is a statistical method used to model the relationship between a dependent variable (let's take one of the ones you mentioned, for example, blindness) and one or more independent variables (genes). It aims to estimate the coefficients of the independent variables to predict the value of the dependent variable. We then see how good our model is by evaluating the overall goodness of fit (e.g., using R-squared or the RMSE).
      However, this is where the 'yes' comes in. We usually assess the statistical significance of the coefficients through p-values. In the context of linear regression, if you are performing multiple hypothesis tests (for example, when evaluating the significance of multiple coefficients because you have multiple genes, or when conducting variable selection), it may be appropriate to use q-values to adjust the p-values associated with each coefficient.
      Hope this cleared things up a bit. However, I recommend consulting a statistician or reading the literature to ensure you're applying the q-values correctly in the specific context of your analysis :)
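
The point about several coefficients each carrying its own p-value can be illustrated with a small sketch. It assumes five made-up coefficient p-values (the gene names and numbers are hypothetical, not from the video or any real model) and compares which ones survive a Bonferroni correction versus the Benjamini-Hochberg step-up procedure at a 0.05 level.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject a null hypothesis only if its p-value is <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def bh_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: find the largest rank k whose sorted
    p-value satisfies p_(k) <= (k / m) * alpha, then reject ranks 1..k."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


# Hypothetical p-values for five regression coefficients (illustration only).
coef_p = {"gene_A": 0.001, "gene_B": 0.012, "gene_C": 0.020,
          "gene_D": 0.300, "gene_E": 0.700}

if __name__ == "__main__":
    names = list(coef_p)
    p_vals = list(coef_p.values())
    rows = zip(names, bonferroni_reject(p_vals), bh_reject(p_vals))
    for name, bonf, bh in rows:
        print(f"{name}: Bonferroni={bonf}, BH={bh}")
```

On these made-up numbers Bonferroni keeps only the smallest p-value, while Benjamini-Hochberg keeps the three smallest. That is the usual trade-off: controlling the false discovery rate is less strict than controlling the chance of any false positive at all.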

    • @emotaph5709 11 months ago +1

      @biostatsquid Thank you for the swift reply and the detailed explanation. And yes, good idea to keep reading the literature before making a decision!

  • @SmiladaXD 10 months ago

    Thank you so much for this video. Could you please clarify how you calculated the p-adjusted values/q-values? I've been looking everywhere for that and would truly appreciate it if you could explain that to me.

    • @biostatsquid 10 months ago

      Hi Claire, thanks for your feedback! I don't think I've ever calculated p-adjusted values myself; usually the output of a statistical test gives you both p-values and p-adjusted values. But I did a quick search and found this article: Why, When and How to Adjust Your P Values? It explains how to calculate p-adjusted values from your p-values. Hope it helps!
      www.ncbi.nlm.nih.gov/pmc/articles/PMC6099145/
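
For readers with the same question, the calculation can be sketched in a few lines. This is a minimal sketch of the Benjamini-Hochberg adjustment (the rule that R's `p.adjust(method = "BH")` applies), not the calculation used in the video; the function name `bh_adjust` and the example p-values are made up for illustration. Each p-value is multiplied by m / rank (m = number of tests, rank = its position when sorted ascending), and the results are then forced to be non-decreasing by taking a running minimum from the largest p-value downwards.

```python
def bh_adjust(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.20]  # made-up p-values
    for p, p_adj in zip(raw, bh_adjust(raw)):
        print(f"p = {p:.2f} -> BH-adjusted p = {p_adj:.4f}")
```

A test is then called significant at a chosen FDR level (say 0.05) if its adjusted value is at or below that level. Note that q-values in Storey's sense additionally estimate the proportion of true null hypotheses, which this sketch does not do.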

  • @dome1844 7 months ago

    I did not get whether the q-value is more stringent than the FDR. I ran an analysis in which I used FDR for gene expression, but I think the results are too stringent compared with the differences I observed in experiments, and too stringent to get a good GO analysis that represents the biological process going on. What should I do in this case?

    • @biostatsquid 7 months ago

      Hi, thanks for your comment! Not sure I understand your question - could you rephrase? Perhaps this answer helps? www.biostars.org/p/462897/
      In any case, when you are doing GO analysis it's good practice to correct for multiple testing and use p-adjusted values, even when there may not be many significant results.

  • @cowboycatranch 13 days ago

    The P value for the red smarties still says P > 0.05 (1:28), whereas it should be P < 0.05. Same for 2:12.

  • @Nikolaj-qz9kw 1 year ago +1

    The person in red was asking if smarties cause *blindness, not *baldness :)

  • @willychrosnik1925 7 months ago

    Now I want smarties

  • @shiyiyin3403 2 months ago

    Start watching at 7:00; the intro is too long.