You mean I actually get to say I got something done tomorrow at work?! Killer tutorial, thank you so much for this
You're so smart! You made it look so easy! It took me 2 full weeks to complete this analysis. Picking the right soft threshold for SFT was helpful for me. I thank you profusely.
I am glad my video was helpful! Thank you!
Please make a tutorial on WGCNA with TCGA samples.
This would be great! Please make one!
Great tutorial - I'm working through it slowly. One piece of advice I have is to avoid the tidyverse route when you rename columns. If you used a simple indexed `match(gsub())` call instead of pivoting longer, inner joining, then pivoting back wider, you would not touch the data at all, just the vector of colnames. Saves a lot of memory that way.
Thanks!
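A minimal base R sketch of the `match(gsub())` renaming idea from the comment above. The data frame `counts`, the lookup table `pheno`, the `_counts` suffix and the `GSM...` IDs are all made up for illustration; they are not the tutorial's actual data.

```r
# Hypothetical wide count table: column names are sample titles plus a suffix.
counts <- data.frame(
  gene = c("g1", "g2"),
  sampleA_counts = c(10, 0),
  sampleB_counts = c(5, 7)
)

# Hypothetical lookup table mapping sample titles to the IDs we want as names.
pheno <- data.frame(
  title = c("sampleB", "sampleA"),
  id    = c("GSM002", "GSM001")
)

# Strip the suffix from the column names, then look each one up in pheno.
old_names <- colnames(counts)[-1]
idx <- match(gsub("_counts$", "", old_names), pheno$title)

# Guard against names without a match before overwriting anything.
stopifnot(!anyNA(idx))

# Only the vector of column names is modified; the data itself is untouched.
colnames(counts)[-1] <- pheno$id[idx]
```

Only the character vector of column names is processed here, so no reshaped copy of the count table is created.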
The package "janitor" is excellent for cleaning up column names if you do not want to do it manually at 19:20
Hi, thanks for this tutorial and your other videos. I followed your tutorial step by step; the only difference was that when I got to the point where you used 14000 genes, I had to use 7000 because of RAM. Now I get this error, any idea how to fix it?
plotDendroAndColors(bwnet$dendrograms[[1]], cbind(bwnet$unmergedColors, bwnet$colors),
+ c("unmerged", "merged"),
+ dendroLabels = FALSE,
+ addGuide = TRUE,
+ hang= 0.03,
+ guideHang = 0.05)
Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
Length of colors vector not compatible with number of objects in 'order'.
I encountered the same error as you, did you solve it?
Happy Teachers' Day, ma'am! Thank you for providing these amazing tutorials that help me a lot 🎉🎉🎉
I am really glad to hear my videos have been helpful! Thank you!
amazing!! i was struggling with this
This is absolutely AMAZING! GREAT job and Many thanks!
Your videos are so amazing
Excellent. Thank you!
can you please make a tutorial video for de novo RNA seq assembly and its annotation
I will surely plan a video covering this. Thanks for the suggestion!
thank you very much for making these tutorial videos
Thanks! Your suggestions were very helpful.
I used maxBlockSize = 7000, and when I tried to plot the last dendrogram I got this error:
"Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
Length of colors vector not compatible with number of objects in 'order'."
Any idea why?
I think I worked it out: my RAM is limited, so the data gets split into blocks. When I try to add the colors it fails, because I have about 3 blocks of data but only 1 color vector (for the whole dataset), so the lengths don't match. That's as far as I've figured out; let's see how I can fix it.
Hi, I have the same issue, may I ask how you fixed it?@@aytacoksuzoglu2975
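The explanation above (the data is split into several blocks, but there is one color vector for everything) points to a possible fix: give each block's dendrogram only the colors of the genes in that block. A minimal sketch, assuming `bwnet` is the list returned by `blockwiseModules` and that it has the components `blockGenes`, `dendrograms`, `unmergedColors` and `colors`. The helper names `colors_for_block` and `plot_all_blocks` are made up for illustration, and this has not been run against the tutorial's data.

```r
# Return the module color columns for the genes of one block.
# `bwnet` is assumed to be the list returned by WGCNA::blockwiseModules().
colors_for_block <- function(bwnet, block) {
  genes <- bwnet$blockGenes[[block]]   # indices of the genes in this block
  cbind(unmerged = bwnet$unmergedColors[genes],
        merged   = bwnet$colors[genes])
}

# Draw one dendrogram per block, each with its own slice of the colors.
# Requires the WGCNA package; nothing is plotted until this is called.
plot_all_blocks <- function(bwnet) {
  for (b in seq_along(bwnet$dendrograms)) {
    WGCNA::plotDendroAndColors(
      bwnet$dendrograms[[b]],
      colors_for_block(bwnet, b),
      groupLabels = c("unmerged", "merged"),
      dendroLabels = FALSE,
      addGuide = TRUE,
      hang = 0.03,
      guideHang = 0.05,
      main = paste("Block", b)
    )
  }
}
```

With a single block (maxBlockSize at least as large as the number of genes), `blockGenes[[1]]` covers every gene and the original call from the video works unchanged; the subsetting only matters once the genes are split across blocks.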
Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
Length of colors vector not compatible with number of objects in 'order'. madam getting this error. please help me.
running into the same issue just now, did you ever find a solution?
Thanks for providing link of the tutorial, it was very useful
Do you use count data to perform PCA? Shouldn't it be normalized data (TPM or CPM)?
can you please elaborate which data should we use here? raw seq count data from featureCount/htseq-count or the expression data generated from stringtie??
Same doubt. I hope you get an answer
Thank you again for an excellent video. May you please explain how we have to choose the numbers for minModuleSize and maxBlockSize in blockwiseModules? thank you in advance, looking forward to hearing from you!
When I read the data I downloaded into R, it says it is empty. Any solution? The data is a tar file. I also have it as a TXT file, but when I read it into R using read.table or the read.delim function, it is read into only two variables, with all the details in two columns. I am a beginner at R and not good with coding, so any kind of help will be appreciated.
Kindly make a tutorial of GWAS and eQTL analysis.
Hi there, does WGCNA work with TPM values? How should one proceed if all they have is TPM values? Regards
it says " Error in data %>% gather(key = "samples", value = "counts") %>% data % : could not find function "%>%
Thank you, ma'am. I want to know: is it essential to have phenotypic data for using this on my transcriptomics data?
Hello ma'am, I am facing a problem. In my case, the author provided a normalized count matrix that has decimal points. Should I work with that one, since they did not provide any raw data?
Hi madam, are the significant genes from all the modules or only from the modules associated with the trait?
Thank you for the amazing video. I was wondering if I want to start from Seurat object of single cell data how should I process the data to follow your tutorial?
Hi,
Thank you for the informative videos,
due to my ram (4 GB) I had to define '5000' instead of '14000' that you used in one block. as a result I'm having problems in the plotDendroAndColors, which does not show me the merged & unmerged part under the dendrogram. I've searched and I could not find a solution. Do you have any suggestions?
I have similar issues! Did you figure it out? 😢
The same question
please let me know if you could solve it
Plz, make a video on WGCNA with microarray dataset. plz plz plz
Very nice video. I have a couple of quick questions. 1) Is finding the module-trait relationship compulsory for WGCNA? If yes, then what is a trait file? That is, what information should be included in the trait file?
2) How do I find/identify the hub genes after network modeling?
Hello ma'am, it would be really useful if you made a video on how to interpret the results (images) obtained from WGCNA.
I will surely plan on making a video on this. Thanks for the suggestion :)
Why did you set TOMType = "signed"? I am trying to understand the difference between adjacency type and TOM type. See "Signed vs. Unsigned Topological Overlap Matrix",
a technical report by Langfelder: "The take-home message from these notes is this: signed TOM takes into account possible anti-reinforcing connection strengths that may occur in unsigned networks. Since the anti-reinforcing connection strengths (practically) cannot occur in signed networks, in signed networks the signed and unsigned TOM are (practically) identical".
Since you are using blockwiseModules instead of constructing the network step by step, I believe the adjacency type is "unsigned" by default. I think you want networkType to equal "signed".
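To make the adjacency part of this question concrete, here is a small base R sketch of the usual WGCNA definitions: unsigned adjacency is |cor|^power and signed adjacency is ((1 + cor) / 2)^power. The correlation values and the power of 6 are arbitrary illustration values, and the sketch does not call WGCNA or compute any TOM.

```r
# Arbitrary soft-threshold power and example correlations for illustration.
power <- 6
r <- c(negative = -0.9, none = 0, positive = 0.9)

# Unsigned network: only the magnitude of the correlation matters.
unsigned_adj <- abs(r)^power

# Signed network: strong negative correlations are pushed towards zero.
signed_adj <- ((1 + r) / 2)^power

# Side-by-side comparison of the two adjacency definitions.
round(rbind(unsigned = unsigned_adj, signed = signed_adj), 4)
```

In an unsigned network a correlation of -0.9 is connected as strongly as +0.9, while in a signed network it is treated as essentially unconnected. In `blockwiseModules` this choice is made with `networkType`, which is a separate argument from `TOMType`.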
Effective video
During my analysis I am getting a lot of genes in ME0 (they are not in any network), and when I compare to a trait I get the maximum correlation with this group. But I also have good correlation with other modules. I am tweaking the number of genes selected for the analysis and the threshold parameters. Is there anything I am blatantly missing?
Ma'am In the hclust plot I am getting the height scale as 20,40,60 so is there any parameter to set the height scale as 200000, 600000?
Does an equal number or matched Healthy and diseased patients matter for this analysis? Scientifically?
It is really helpful, thank you. I have a question: what if the maxBlockSize is 5000? How should I change the rest of the code?
Hello, thanks for providing a good video. I just wonder: I have raw RNA-seq data, but the company that generated the data did not provide any metadata. Does this mean I should create the metadata myself?
Hello, thank you very much for making these tutorial videos. However, I have encountered an error while plotting the dendrogram with module colors mentioned at the end of this video. Previously, when I tried with the same dataset that you used in your analysis, it worked fine. But now I'm trying with one of my microarray datasets and I got the following error:
Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
Length of colors vector not compatible with number of objects in 'order'.
Due to this error, it is not generating the panel of colors at the bottom of the dendrogram. Please help me to sort out this problem.
It is hard for me to recreate this error and troubleshoot it without code and data.
I can look into this if you can send me your code and normalized data.
@@Bioinformagician Thanks. Should I email it to you?
@@sonaaritra yes please
Were you able to fix the error? if yes could you please tell the solution
I encountered the same error. How was this solved??
Hello ma'am can you please explain how we can download data from GEO and convert the read count values to logfold and p-value
Hello, thanks for the amazing tutorial. But I am getting an error after performing WGCNA. Could you please help me solve it? The error is: too few genes with valid expression levels in the required number of samples.
Hi, nice video. May I ask why using vst(counts) rather than the actual DESeq2 normalization process?
Thank you so much for this! Do you recommend doing anything different for longitudinal data?
It depends on the question you are asking. If you are interested in identifying genes that are significantly associated with a particular time point, then building a network for each time point individually would make sense. Otherwise, analyzing them all together would be the right way to go about it.
For my analysis, 0.4 is the highest R² value that I found, so can I go with that value to choose the power and mean connectivity?
I was running into the same problem. Check that your expression matrix is in the right format. It should have samples in rows and genes in columns.
Please release "GWAS" tutorial videos.
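A small base R sketch of the orientation check described in the reply above. The matrix `norm_counts` and its values are invented for illustration; the only assumption is that the normalized data starts out with genes in rows and samples in columns, as count tables usually do.

```r
# Hypothetical normalized matrix: genes in rows, samples in columns.
norm_counts <- matrix(
  c(5.1, 6.2, 7.3,
    4.0, 4.4, 4.8),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3"))
)

# WGCNA expects samples in rows and genes in columns, so transpose.
dat_expr <- t(norm_counts)

# Orientation check: first the number of samples, then the number of genes.
dim(dat_expr)
rownames(dat_expr)
```

If the row names printed at the end are gene IDs rather than sample names, the matrix still needs transposing before it goes into the network functions.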
Please can you do a tutorial on Gene set Enrichment Analysis. (Idea behind that) Like you did for WGCNA?
Sure, I'll definitely plan a video covering GSEA.
@@Bioinformagician Ok Thank you very much
I also have some videos on my channel related to that. Please do check out and see :)
Thank you so much for providing this video. I have a query: the dataset you selected has a single text file, but if we have a dataset with multiple text files, how do we deal with it? Please help, as I am new to this field.
Does each individual text file represent an individual sample?
Can you perform WGCNA on a pre-filtered set of differentially expressed genes, as a more downstream analysis approach?
WGCNA is an unsupervised method. It is NOT recommended for data that has been pre-filtered for differentially expressed genes.
Thank you genius, can you please make a video about mitch and how to use it in R :)
I will surely plan a video on covering this :)
@@Bioinformagician Thank you so much really can't wait to watch it !!!
Thanks for your effort. Do we have to batch correct before Deseq2? I read that Deseq2 does batch correction like this: design = ~ condition + batch.
Batch effects need to be corrected for before DESeq2. If you have batch information in your colData in a column called "batch", then you could provide it in your design like you mentioned.
@@Bioinformagician If I have done "design = ~condition + batch", then I don't need to use ComBat to remove batch effect?
Hi, thanks for your video. It's really helpful! I have a question, however, what is randomSeed and what's the effect of changing it? I see the WGCNA manual also use 54321. What's the difference between that and 1234? Thanks very much.
A random seed makes the output of our R code reproducible. By setting a specific seed, the random processes in our script always start at the same point and hence lead to the same result. The result will not change as long as the seed stays the same. You might want to set a different seed for your analysis; however, to ensure your results are reproducible, you should always use the same seed for that particular analysis.
@@Bioinformagician okay. Thanks very much.
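A tiny base R illustration of what a seed does, using `set.seed` and `sample` rather than WGCNA itself. The seed values 1234 and 54321 are simply the two mentioned in the question; any integer works the same way.

```r
# Same seed, same starting point: the two draws are identical.
set.seed(1234)
draw_a <- sample(1:100, 5)

set.seed(1234)
draw_b <- sample(1:100, 5)

identical(draw_a, draw_b)

# A different seed starts the generator elsewhere, so the draw will
# usually differ, but it is just as reproducible when that seed is reused.
set.seed(54321)
draw_c <- sample(1:100, 5)
```

Neither seed is better than the other; what matters is that the value is written down and reused when the analysis is rerun.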
Thank you so much - can you recommend any packages for batch correction?
you can use ComBat-seq for batch correction
Hi! Can I also use RSEM normalized gene expression data for WGCNA?
You mean RPKM normalized gene expression data?
Hello ma'am! It would be so helpful if you would provide your script for WGCNA as a file. It becomes difficult to note down every command
You can get all my scripts from github: github.com/kpatel427/UA-camTutorials/blob/main/WGCNA.R
What do you mean by merged and unmerged? Do you mean data merged with phenodata?
Can you provide timestamp?
@@Bioinformagician 35:05. Thank you!
@@athenanguyen442 Oh I meant modules before merging and modules after merging.
Is it necessary for the supplementary file to be rawcounts.txt.gz? Please reply. Also, can I do co-expression analysis if the file is in raw.tar?
Thank you for this very informative video ! I was applying your tutorial on my dataset, however, I kept receiving this error when running the blockwiseModules :
Error in colSums(!is.na(datExpr[useSamples, useGenes])) :
'x' must be an array of at least two dimensions
I searched for it online but couldn't find an explanation, could you help me please ?
I got the same error! Maybe it is because norm.counts is no longer 2-dimensional, e.g. lists within lists.
So I removed the previous step that converts them to numeric, and it worked for me.
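One way this can happen, sketched in base R: calling `as.numeric` on a whole matrix drops its dimensions, and `colSums` then fails with exactly this message. The matrix `m` below is invented for illustration, and this may not be the cause in every case.

```r
# Hypothetical expression matrix that was read in as character values.
m <- matrix(c("1", "2", "3", "4"), nrow = 2,
            dimnames = list(c("s1", "s2"), c("g1", "g2")))

# as.numeric() on a matrix returns a plain vector: the dimensions are lost.
flat <- as.numeric(m)
is.null(dim(flat))

# colSums() on that vector reproduces the error from the comment above.
err <- tryCatch(colSums(!is.na(flat)), error = function(e) conditionMessage(e))

# Changing the storage mode converts the values but keeps dim and dimnames.
num <- m
storage.mode(num) <- "double"
colSums(!is.na(num))
```

Checking `dim()` on the expression object right before `blockwiseModules` is a quick way to see whether this is what went wrong.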
Thanks!
Thanks!