Very clear explanation! Thanks!
This video is brilliant! You are a natural at explaining statistics. Thank you so much!
Thanks for your kind words Jorge! Glad it was useful:)
This is an awesome video! Applaud the simple and fun explanation. Just two things: (a) the "coffee" being NOT associated (among the significant outcomes) comes from prior knowledge, but we might not always have this prior knowledge - then what do we do? (b) It's not shown how the adjusted p-values were calculated; could you please make that clarification? Otherwise this is a good video! Thanks.
It's gorgeous!!! Please do more about biostatistics!!!
This video is very good! You explained it in a nice way. Thank you for the video. Keep posting more videos on biostatistics.
Thanks Ankush! I am glad you found it helpful:)
Thanks, you made me truly understand the q-value.
At 1:29, if you find a link, why is p still larger than 0.05?
Oh nicely spotted! That's a typo, sorry for the confusion! It's p < 0.05. Thanks for noticing and commenting about it!
@biostatsquid 😁
Multiple thanks for the video!
Thanks, I finally understood something about p-values and FDR.
Thank you for this video and the effort that must've gone into this. Everything you explained was very easy to understand.
I had a question:
You spoke about "correlations" in the video, but what about one-way relations such as regressions, where we speak in terms of "dependent" and "independent" variables? In the examples you shared, the genes would be independent variables and we want to see their relation with the "dependent" variable of being a morning person. Now, if we were to check whether one gene in particular (independent variable) affects different things (different dependent variables) - blindness, baldness, wakefulness, color blindness, etc. - would the same logic of q-values hold?
It would be lovely if you get the time to get back to this. If not, thanks anyway for the great video!
Hi! Thank you so much for your comment! That was a great question. My answer is... yes and no:)
So, in general, q-values are not typically used for linear regression. Let's see why.
As we saw in the video, q-values are commonly used in the context of multiple hypothesis testing, specifically in controlling the false discovery rate (FDR). They are used to adjust p-values for multiple comparisons when conducting hypothesis tests on a large number of variables or features simultaneously (for example, gene expression studies).
Linear regression, on the other hand, is a statistical method used to model the relationship between a dependent variable (let's take one of the ones you mentioned, for example, blindness) and one or more independent variables (genes). It aims to estimate the coefficients of the independent variables to predict the value of the dependent variable. We then see how good our model is by evaluating the overall goodness of fit (e.g., using R-squared or the RMSE).
However, this is where the 'yes' comes in. We usually assess the statistical significance of the coefficients through p-values. In the context of linear regression, if you are performing multiple hypothesis tests, for example when evaluating the significance of multiple coefficients (because you have multiple genes) or conducting variable selection, it may be appropriate to use q-values to adjust the p-values associated with each coefficient.
Hope this cleared things up a bit. However, I recommend consulting a statistician or reading the literature to ensure you're applying the q-values correctly in the specific context of your analysis:)
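To make that last point concrete, here is a minimal Python sketch (not from the video; the data are simulated and purely illustrative, and it assumes you have statsmodels installed): fit one simple regression per gene, collect the coefficient p-values, and adjust them with the Benjamini-Hochberg FDR procedure, which is what "q-values" usually refer to in this context.

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.multitest import multipletests

# Simulated (hypothetical) data: 100 samples, 20 genes, outcome driven by gene 0 only
rng = np.random.default_rng(0)
n_samples, n_genes = 100, 20
X = rng.normal(size=(n_samples, n_genes))
y = 0.8 * X[:, 0] + rng.normal(size=n_samples)

# One simple linear regression per gene, keeping the p-value of the gene's coefficient
pvals = []
for g in range(n_genes):
    fit = sm.OLS(y, sm.add_constant(X[:, g])).fit()
    pvals.append(fit.pvalues[1])  # index 0 is the intercept, index 1 the gene coefficient

# Benjamini-Hochberg adjustment: the adjusted p-values play the role of q-values here
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(p_adj.round(3))
```

The same multipletests call works on any vector of p-values, whether they come from correlations, t-tests or regression coefficients.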
@biostatsquid Thank you for the swift reply and the detailed explanation. And yes, good idea to keep reading the literature before making a decision!
Thank you Biosquidee!
simply brilliant...
Thank you so much!!!
You're very welcome:)
Shouldn't it be 1/16 at 7:07, since we have 16 objects being marked as significant?
I think so.
Thank you for this great, concise video, you can tell you put a lot of work into it =] ... Any follow-up on red smarties linked to baldness??
Thank you so much for this video. Could you please just clarify how you calculated the P-adjusted values/Q-values? I've been looking everywhere for that and would truly appreciate if you can explain that to me.
Hi Claire, thanks for your feedback! I don't think I've ever calculated p-adjusted values myself; usually when you get the output of a statistical test, you get both p-values and p-adjusted values. But I did a quick search and found this article: "Why, When and How to Adjust Your P Values?" It explains how to calculate p-adjusted values from your p-values. Hope it helps!
www.ncbi.nlm.nih.gov/pmc/articles/PMC6099145/
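For anyone who wants to see the actual calculation, here is a minimal Python sketch of the Benjamini-Hochberg adjustment (just one of several methods described in that article, with made-up p-values for illustration): sort the p-values, multiply each by the number of tests divided by its rank, and make sure the adjusted values never decrease as the rank increases.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values (FDR correction)."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                        # ranks, smallest p first
    scaled = p[order] * m / np.arange(1, m + 1)  # p * m / rank
    # Enforce monotonicity: running minimum from the largest rank down
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(scaled, 0, 1)
    return adjusted

# Made-up example p-values, e.g. from testing several genes
print(bh_adjust([0.001, 0.008, 0.039, 0.041, 0.27, 0.60]))
```

In practice you rarely do this by hand: R's p.adjust(p, method = "BH") and Python's statsmodels.stats.multitest.multipletests give the same numbers.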
How do you determine the number of false positives? What are the criteria?
The P value for the red smarties still says P > 0.05 (1:28), whereas it should be P < 0.05. Same for 2:12.
I did not get whether the q-value is more stringent than FDR. I had an analysis in which I used FDR for gene expression, but I think the results are too stringent compared to the differences I observed experimentally, and too stringent to get a good GO (Gene Ontology) analysis that represents the biological process going on. What to do in this case?
Hi, thanks for your comment! Not sure I understand your question - could you rephrase? Perhaps this answer helps? www.biostars.org/p/462897/
In any case, when you are doing GO analysis it's good practice to correct for multiple testing and use p-adjusted values, even when there may not be many significant results.
Thank you!
The person in red was asking if smarties cause *blindness, not *baldness :)
This is not the point here
p > 0.05?
Now i want smarties
Start watching at 7:00, the intro is too long.