I really appreciate this video, which simplifies the basic concepts of enrichment analysis. I'm looking forward to the topology-based methods.
Thank you for your comment! I'm glad you liked the video. Some more coding tutorials coming up but I will definitely write down topology-based methods in my to-do list:)
Thank you! Really helped in my understanding so much better compared to trying to read articles :') your hard work is much appreciated by us all here!
BiostatSquidee!!! My enduring gratitude as always! You're the best.
This is a very impressive video for better understanding GSEA results, thank you for your effort
Your videos are a lifesaver! Thank you for making these
Thanks a bunch! Wonderful video describing how to interpret GSEA!
I love mountains but I love the ones in your video even more!
amazing teaching!! best GSEA tutorial on YouTube omg this helped me so much, thank u!
Nice explanation. Very inspiring. Thanks a lot.
Thank you for the clear explanation!! Great help!! Looking forward to upcoming videos:)
This is the best video ever
Thank you! Glad you found it useful:)
Great explanation!
Thank you for this video! Helped me out a lot!
Beautifully explained! Keep up the good work, I'm a fan and will be spreading your tutorials :)
this is simply amazing, can't wait for new videos!
2:14 I am not clear on what contributes to the magnitude of the increase/decrease of the running statistic (i.e. what number specifically is the input for the running statistic calculation). Is it the rank value? In the video you focus explicitly on fold change, but in the previous video you mentioned that rank is determined by both fold change AND significance.
Great video by the way :)
Hey Daniel, thanks for your comment, great question! I tend to use -log10(pval)*sign(FC) to get a combination of both, but there is no consensus in the community as far as I know. There are a few blogs/papers that discuss it: www.biostars.org/p/375584/
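To make the ranking metric above concrete, here is a minimal Python sketch. The gene names, p-values and fold changes are made up for illustration, and `rank_metric` is a hypothetical helper, not a function from any GSEA package. It assumes p-values are strictly greater than 0 (a p-value of exactly 0 would need to be capped first).

```python
import math


def rank_metric(pval, log2_fc):
    """Signed significance score: -log10(p-value) * sign(fold change)."""
    # sign is +1 for up-regulated, -1 for down-regulated, 0 for no change
    sign = (log2_fc > 0) - (log2_fc < 0)
    return -math.log10(pval) * sign


# Hypothetical differential expression results: (gene, p-value, log2 fold change)
genes = [
    ("geneA", 0.001, 2.0),
    ("geneB", 0.5, -0.3),
    ("geneC", 0.0001, -1.5),
    ("geneD", 0.05, 0.8),
]

# Rank from most significantly up-regulated to most significantly down-regulated
ranked = sorted(
    ((name, rank_metric(p, fc)) for name, p, fc in genes),
    key=lambda item: item[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{name}\t{score:.3f}")
```

With this metric, the magnitude of the score comes from the significance and the direction comes from the fold change, so a gene with a large but non-significant fold change ends up near the middle of the list.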
@@biostatsquid ah makes sense! So it sounds like there are a number of different ways of doing this. Thanks for clarifying and the quick reply! I will have a look at the link.
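On the question of what drives the size of each step in the running statistic: in the weighted running sum described in the original GSEA method, each gene in the set pushes the sum up in proportion to the absolute value of its ranking metric, and each gene outside the set pushes it down by a constant amount. The sketch below is a simplified illustration under that assumption. The function names, the example scores and the gene set are hypothetical, and real tools may differ in details.

```python
def running_sum(ranked_genes, scores, gene_set, weight=1.0):
    """Running enrichment statistic over a ranked gene list.

    Hits (genes in the set) step up by |score|**weight / sum of hit weights.
    Misses step down by 1 / number of misses.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    hit_total = sum(abs(s) ** weight for s, h in zip(scores, hits) if h)
    n_miss = len(ranked_genes) - sum(hits)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("need at least one weighted hit and one miss")

    running, total = [], 0.0
    for score, is_hit in zip(scores, hits):
        if is_hit:
            total += abs(score) ** weight / hit_total
        else:
            total -= 1.0 / n_miss
        running.append(total)
    return running


def enrichment_score(running):
    """Maximum deviation from zero of the running sum (sign preserved)."""
    return max(running, key=abs)


# Hypothetical ranked list (already sorted by ranking metric, high to low)
ranked_genes = ["geneA", "geneD", "geneB", "geneC"]
scores = [3.0, 1.3, -0.3, -4.0]
gene_set = {"geneA", "geneD"}

running = running_sum(ranked_genes, scores, gene_set)
print([round(x, 3) for x in running])
print(round(enrichment_score(running), 3))
```

So the input to each upward step is the ranking metric itself (whatever you chose it to be), which is why the choice of metric matters for the shape of the curve.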
This is such a great video thank you so much!!!
Thank you, very useful !
You are genius !!
elite explanation. ELITE I TELL YOU. thanks very much
Amazing
Thank you! you are the best!
thank you a lot
Thanks a lot for this!
Great thanks
Amazing work
thank u so much for these videos!! 😍
you are amazing. please keep doing what you do. I am grateful. 😍
Amazing!! Thank you
Amazing!
thank you!
Thanks! 🙂👍
Great. Have you ever tried plotting the p-value distribution, just to see how the p-values relate to the corresponding FDR output?
I'm confused at the end of the video. You said the q-value is the probability of the p-value for the test being wrong, ok but which p-value? The nominal or the adjusted one? Also, isn't the q-value just the adjusted p-value for multiple testing?
Hi Jesse, that is a great question: p-values, q-values and p-adjusted values can be confusing. Yes, as you say, the q-value is an adjusted p-value for multiple testing.
So, in simple words:
p-val = chance of a false positive (i.e., if you use a p-val cutoff of 0.05, you are accepting a 5% chance of calling a result significant when there is actually no effect)
Problem of multiple testing - the more tests you run, the higher the chance of observing at least one significant result that is actually a false positive. We need to correct for this, and there are different methods to do so:
p-adjusted values: p-values corrected using the (most commonly) Bonferroni correction. Usually too stringent.
q-values: p-values corrected based on the False Discovery Rate (FDR) - now we are not talking about 5% of all results being false positives, but 5% of SIGNIFICANT results being false positives.
Hopefully that made sense. If you want a more exhaustive explanation you might want to check my video on multiple testing correction: ua-cam.com/video/LVKLyt1B35w/v-deo.html&embeds_euri=https%3A%2F%2Fbiostatsquid.com%2F&source_ve_path=Mjg2NjY&feature=emb_logo
or blog post: biostatsquid.com/multiple-testing-correction-fdr/
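A quick bit of arithmetic shows why multiple testing is a problem. The sketch below assumes the tests are independent and that no test has a real effect, which is a simplification. The function name `prob_at_least_one_false_positive` is made up for this example.

```python
def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives across independent null tests."""
    # Each test avoids a false positive with probability (1 - alpha)
    return 1.0 - (1.0 - alpha) ** n_tests


for n_tests in (1, 10, 100):
    prob = prob_at_least_one_false_positive(n_tests)
    print(f"{n_tests:>4} tests: {prob:.3f}")
```

With a single test the chance is 5%, but under these assumptions it grows to about 40% with 10 tests and to over 99% with 100 tests, which is why uncorrected p-values are not enough when testing many gene sets.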
@@biostatsquid Thank you for the in-depth reply. Ok, if the q-value is the adjusted p-value then the only thing I don't get is why there is a column for the adjusted p-value and a column for the q-value in the chart near the end of the video. Furthermore, their values are different for each row? Ohhh wait I just accidentally skipped over the adjusted p-value section of your reply, sorry. Ok I see, so the difference between the q-values and adjusted p-values is the method used to correct for multiple testing (adjusted p-value from the Bonferroni correction and q-value from FDR). Thanks, that really clears it up!
@@jessehines4044 Exactly! Glad it helped:)
@@biostatsquid Hold on. Now I'm even more confused. I thought that the adjusted p-value was a correction using Benjamini-Hochberg (as said in the video), not Bonferroni. Besides, I thought that Benjamini-Hochberg was an FDR-controlling method. So that means that both adjusted p-values and q-values are FDR corrected? Please help
Hi @@chusty93 thanks for pointing that out! So p-adjusted values are just p-values corrected for multiple testing. This adjustment can be made using various methods: Bonferroni, BH, and others.
q-values are adjusted p-values that control the False Discovery Rate. There are several FDR-controlling methods but the most common one is Benjamini-Hochberg. Therefore, when you see "p-adjusted," it often implies FDR correction, just because BH is much more common than Bonferroni, but it's essential to check the specific method used. In the same way, if you have q-values, you know they are FDR-corrected p-values, and chances are they were corrected using BH since it's the most common method, but you should always check. Hope this clarified things, let me know!
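To see how much the choice of method matters, here is a minimal Python sketch that applies both corrections to the same p-values. The p-values are made up, and the two functions are plain illustrative implementations of the standard Bonferroni and Benjamini-Hochberg adjustments, not the code used by any particular GSEA tool.

```python
def bonferroni(pvals):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """BH-adjusted p-values: p * m / rank, made monotone from the largest down."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, keeping a running minimum
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical nominal p-values for five gene sets
pvals = [0.001, 0.008, 0.02, 0.03, 0.6]

bonf = bonferroni(pvals)
bh = benjamini_hochberg(pvals)

for p, b, q in zip(pvals, bonf, bh):
    print(f"p={p:<6} bonferroni={b:.4f} BH={q:.4f}")

print("significant at 0.05 (Bonferroni):", sum(b < 0.05 for b in bonf))
print("significant at 0.05 (BH):", sum(q < 0.05 for q in bh))
```

In this toy example the stricter Bonferroni correction keeps fewer results below 0.05 than BH does, which is why it is worth checking which method produced the "adjusted" column in your output.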