Weighted Gene Co-expression Network Analysis (WGCNA) Step-by-step Tutorial - Part 1

  • Published 27 Aug 2024

COMMENTS • 100

  • @waffles764
    @waffles764 1 month ago

    You mean I actually get to say I got something done tomorrow at work?! Killer tutorial, thank you so much for this

  • @mocabeentrill
    @mocabeentrill 2 years ago +5

    You're so smart! You made it look so easy! It took me 2 full weeks to complete this analysis. Picking the right soft threshold for SFT was helpful for me. I thank you profusely.

  • @hemangininaik0998
    @hemangininaik0998 2 years ago +9

    Please make a tutorial on WGCNA with TCGA samples.

    • @bobby5625
      @bobby5625 1 year ago

      This would be great! Please make one!

  • @RamakrishnanRS
    @RamakrishnanRS 11 months ago +2

    Great tutorial - I'm working through it slowly. One piece of advice I have is to avoid the tidyverse route when you rename columns. If you used a simple indexed `match(gsub())` call instead of pivoting longer, inner joining, then pivoting back wider, you wouldn't touch the data at all, just the vector of colnames. It saves a lot of memory that way.
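
The renaming approach described in this comment can be sketched in base R as follows. This is a minimal illustration, not the tutorial's code: the toy `counts` table, the `lookup` table, and the `.tsv` suffix are all hypothetical stand-ins for the real sample titles and IDs.

```r
# Hypothetical toy data: columns are named by sample title, and a lookup
# table maps each title to the ID we want as the new column name.
counts <- data.frame(
  ctrl_rep1.tsv = c(10, 0, 5),
  case_rep1.tsv = c(3, 8, 1),
  row.names = c("geneA", "geneB", "geneC")
)
lookup <- data.frame(
  title = c("case_rep1", "ctrl_rep1"),
  id    = c("S2", "S1")
)

# Strip the suffix from the column names, then find each cleaned name's
# position in the lookup table. Only the vector of names is touched;
# the expression values are never reshaped or copied.
cleaned <- gsub("\\.tsv$", "", colnames(counts))
idx <- match(cleaned, lookup$title)
stopifnot(!anyNA(idx))  # fail early if a column has no entry in the lookup
colnames(counts) <- lookup$id[idx]
```

Because `match()` returns positions in the lookup table, the lookup rows do not need to be in the same order as the columns.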

  • @mocabeentrill
    @mocabeentrill 2 years ago +5

    Thanks!

  • @PortleyPortions
    @PortleyPortions 1 year ago +1

    The package "janitor" is excellent for cleaning up column names if you do not want to do it manually at 19:20

  • @amaliamurgueitio473
    @amaliamurgueitio473 9 months ago +3

    Hi, thanks for this tutorial and your other videos. I followed your tutorial step by step; the only difference was that where you used 14000 genes, I had to use 7000 because of RAM. Now I get this error. Any idea how to fix it?
    plotDendroAndColors(bwnet$dendrograms[[1]], cbind(bwnet$unmergedColors, bwnet$colors),
    + c("unmerged", "merged"),
    + dendroLabels = FALSE,
    + addGuide = TRUE,
    + hang= 0.03,
    + guideHang = 0.05)
    Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
    Length of colors vector not compatible with number of objects in 'order'.

    • @user-gf4qt9mt4r
      @user-gf4qt9mt4r 9 months ago

      I encountered the same error as you, did you solve it?

  • @jaykishansolanki2935
    @jaykishansolanki2935 2 years ago

    Happy Teacher's Day, ma'am. Thank you for providing these amazing tutorials that help me a lot 🎉🎉🎉

    • @Bioinformagician
      @Bioinformagician 1 year ago

      I am really glad to hear my videos have been helpful! Thank you!

  • @nataliagarcia5404
    @nataliagarcia5404 1 year ago +1

    amazing!! i was struggling with this

  • @asiyazhao3820
    @asiyazhao3820 1 year ago

    This is absolutely AMAZING! GREAT job and Many thanks!

  • @user-mh7iv1rb9m
    @user-mh7iv1rb9m 3 months ago

    Your videos are so amazing

  • @dennisscheper1
    @dennisscheper1 2 months ago

    Excellent. Thank you!

  • @learnersseekers904
    @learnersseekers904 2 years ago +3

    can you please make a tutorial video for de novo RNA seq assembly and its annotation

    • @Bioinformagician
      @Bioinformagician 1 year ago

      I will surely plan a video covering this. Thanks for the suggestion!

  • @amarjeetyadav5661
    @amarjeetyadav5661 1 year ago

    thank you very much for making these tutorial videos

  • @sonaaritra
    @sonaaritra 1 year ago

    Thanks! Your suggestions were very helpful.

  • @aytacoksuzoglu2975
    @aytacoksuzoglu2975 1 year ago +2

    i used maxBlockSize = 7000 and when i tried to plot last dendrogram i got that error.
    "Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
    Length of colors vector not compatible with number of objects in 'order'."
    got any idea why ?

    • @aytacoksuzoglu2975
      @aytacoksuzoglu2975 1 year ago

      I think I worked out the cause. I have little RAM, so the data gets divided into blocks. When I try to colour the dendrogram it fails, because I have about 3 blocks of data but only 1 colour vector for the whole dataset, so the lengths don't match. That's as far as I've got; let's see how I can fix it.

    • @amaliamurgueitio473
      @amaliamurgueitio473 9 months ago

      @aytacoksuzoglu2975 Hi, I have the same issue. May I ask how you fixed it?
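
A likely explanation, consistent with the diagnosis in this thread: when `maxBlockSize` is smaller than the number of genes, the genes are split into several blocks, each with its own dendrogram, while the colour vectors still cover every gene. The sketch below reproduces that length mismatch with base R only; `expr`, `block_genes`, and `colors` are hypothetical toy objects, not real WGCNA output. The commented call at the end shows how the same subsetting would look with a real `bwnet` result, assuming it carries the per-block gene indices in `blockGenes`.

```r
# Base-R stand-in for a blockwise result: 10 genes split into two blocks.
set.seed(1)
expr <- matrix(rnorm(6 * 10), nrow = 6)            # 6 samples x 10 genes
block_genes <- list(c(1, 3, 5, 7, 9, 10), c(2, 4, 6, 8))
colors <- rep(c("blue", "grey"), length.out = 10)  # one colour per gene, all blocks

# Each block gets its own dendrogram, built from that block's genes only.
dendro1 <- hclust(dist(t(expr[, block_genes[[1]]])))

n_in_dendro <- length(dendro1$order)  # genes in block 1 only
n_colors    <- length(colors)         # genes in the whole dataset: mismatch

# Subset the colours to the genes of the block being plotted.
colors_block1 <- colors[block_genes[[1]]]
stopifnot(length(colors_block1) == n_in_dendro)

# With a real result `bwnet`, the equivalent subsetting would be (not run):
# plotDendroAndColors(
#   bwnet$dendrograms[[1]],
#   cbind(bwnet$unmergedColors, bwnet$colors)[bwnet$blockGenes[[1]], ],
#   c("unmerged", "merged"),
#   dendroLabels = FALSE, addGuide = TRUE, hang = 0.03, guideHang = 0.05
# )
```

Plotting the other blocks would follow the same pattern with index 2, 3, and so on in both `dendrograms` and `blockGenes`.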

  • @drgutharajasekar6275
    @drgutharajasekar6275 1 year ago +3

    Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
    Length of colors vector not compatible with number of objects in 'order'.
    Madam, I am getting this error. Please help me.

    • @stevendlg_
      @stevendlg_ 1 month ago

      running into the same issue just now, did you ever find a solution?

  • @sanjaisrao484
    @sanjaisrao484 1 year ago

    Thanks for providing link of the tutorial, it was very useful

  • @user-oe2qd9oq5i
    @user-oe2qd9oq5i 28 days ago +1

    Do you use count data to perform PCA? Shouldn't it be normalized data (TPM or CPM)?

  • @harkhabarparnazar4499
    @harkhabarparnazar4499 22 days ago +1

    Can you please elaborate on which data we should use here? Raw sequence count data from featureCounts/htseq-count, or the expression data generated by StringTie?

    • @ruben8355
      @ruben8355 7 days ago

      Same doubt. I hope you get an answer

  • @saraalidadiani5881
    @saraalidadiani5881 1 year ago +1

    Thank you again for an excellent video. Could you please explain how to choose the values for minModuleSize and maxBlockSize in blockwiseModules? Thank you in advance, looking forward to hearing from you!

  • @user-cz4qr4ot9x
    @user-cz4qr4ot9x 8 months ago

    When I read the data I downloaded into R, it says it is empty. Any solution? The data is a tar file. I also have it as a TXT file; when I read it into R using the read.table or read.delim function, it is read into only two variables, with all the details in two columns. I am a beginner at R and not good with coding, so any kind of help will be appreciated.

  • @ps_scholar3407
    @ps_scholar3407 1 year ago

    Kindly make a tutorial of GWAS and eQTL analysis.

  • @saadzaheer3451
    @saadzaheer3451 4 months ago

    Hi there, does WGCNA work with TPM values? How should one proceed if all they have is TPM values? Regards

  • @narens8511
    @narens8511 8 months ago

    It says: Error in data %>% gather(key = "samples", value = "counts") %>% data % : could not find function "%>%"
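
This error usually means no package providing the pipe has been attached in the current session. The sketch below assumes that is the cause here; loading tidyverse (or dplyr/tidyr, which also provide `gather`'s companions) before running the tutorial code would have the same effect as attaching magrittr directly. The toy vector is only for illustration.

```r
# `%>%` is not part of base R. It is defined by the magrittr package and
# re-exported by dplyr and tidyr, so one of them must be attached first.
if (requireNamespace("magrittr", quietly = TRUE)) {
  library(magrittr)
  total_magrittr <- c(1, 4, 9) %>% sqrt() %>% sum()
}

# Base R (4.1 or later) ships a native pipe that needs no package.
total_native <- c(1, 4, 9) |> sqrt() |> sum()
```

Note that `library()` calls must be re-run in every new R session, which is a common reason for this error appearing after a restart.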

  • @grace-426
    @grace-426 2 months ago

    Thank you, ma'am. I want to know: is it essential to have phenotypic data for using this on my transcriptomics data?

  • @nazifahumaira4762
    @nazifahumaira4762 8 months ago

    Hello ma'am, I am facing a problem. In my case, the authors provided a normalized count matrix that has decimal values. Should I work with that one, since they did not provide any raw data?

  • @drgutharajasekar6275
    @drgutharajasekar6275 5 months ago

    Hi madam, are the significant genes from all the modules, or only from the modules associated with the trait?

  • @pariaalipour61
    @pariaalipour61 3 months ago

    Thank you for the amazing video. I was wondering if I want to start from Seurat object of single cell data how should I process the data to follow your tutorial?

  • @user-pz5cb4zx3t
    @user-pz5cb4zx3t 1 year ago +2

    Hi,
    Thank you for the informative videos,
    due to my ram (4 GB) I had to define '5000' instead of '14000' that you used in one block. as a result I'm having problems in the plotDendroAndColors, which does not show me the merged & unmerged part under the dendrogram. I've searched and I could not find a solution. Do you have any suggestions?

  • @user-hb5zf7ze4q
    @user-hb5zf7ze4q 5 months ago

    Plz, make a video on WGCNA with microarray dataset. plz plz plz

  • @mehwishwahid183
    @mehwishwahid183 1 year ago

    Very nice video. I have a couple of quick questions: 1) Is finding the trait-module relationship compulsory for WGCNA? If yes, then what is a trait file, i.e. what information should be included in the trait file?
    2) How do I find/identify the hub genes after network modelling?

  • @suhasinivr5614
    @suhasinivr5614 1 year ago

    Hello ma'am, it would be really useful if you made a video on how to interpret the results (images) obtained from WGCNA.

    • @Bioinformagician
      @Bioinformagician 1 year ago

      I will surely plan on making a video on this. Thanks for the suggestion :)

  • @Kaaaaaaaam
    @Kaaaaaaaam 1 year ago

    Why did you set TOMType = "signed"? I am trying to understand the difference between adjacency type and TOM type. See "Signed vs. Unsigned Topological Overlap Matrix",
    a technical report by Langfelder: "The take-home message from these notes is this: signed TOM takes into account possible anti-reinforcing connection strengths that may occur in unsigned networks. Since the anti-reinforcing connection strengths (practically) cannot occur in signed networks, in signed networks the signed and unsigned TOM are (practically) identical".
    Since you are using blockwiseModules instead of constructing the network step by step, I believe the adjacency type is "unsigned" by default. I think you want networkType to equal "signed".
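
To make the distinction in this comment concrete, the sketch below computes unsigned and signed adjacencies from the same correlations, using the standard WGCNA definitions (unsigned: |r|^β; signed: ((1 + r)/2)^β). The correlation values and the power `beta` are hypothetical and chosen only for illustration.

```r
# Correlations between one gene and three others (toy values).
r    <- c(0.8, 0, -0.8)
beta <- 6  # hypothetical soft-threshold power, for illustration only

# Unsigned network: the sign is discarded, so r = -0.8 and r = 0.8
# yield the same connection strength.
adj_unsigned <- abs(r)^beta

# Signed network: negative correlations map to near-zero adjacency,
# so anti-correlated genes are treated as unconnected.
adj_signed <- ((1 + r) / 2)^beta

round(cbind(r, adj_unsigned, adj_signed), 6)
```

In `blockwiseModules` this choice is made through the `networkType` argument, which is separate from `TOMType`; setting only the latter leaves the adjacency itself at its default.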

  • @merajulislam6179
    @merajulislam6179 1 year ago

    Effective video

  • @akshayavs3776
    @akshayavs3776 1 year ago

    During my analysis I am getting a lot of genes in ME0 (they are not in any network), and when I compare to a trait I get the maximum correlation with this group. But I also have good correlation with other modules. I am tweaking the number of genes selected for the analysis and the threshold parameters. Is there anything obvious that I am missing?

  • @AAK00419
    @AAK00419 1 year ago

    Ma'am In the hclust plot I am getting the height scale as 20,40,60 so is there any parameter to set the height scale as 200000, 600000?

  • @divyaagrawal6740
    @divyaagrawal6740 1 year ago

    Does an equal number or matched Healthy and diseased patients matter for this analysis? Scientifically?

  • @marziyehsalehi2290
    @marziyehsalehi2290 7 months ago

    It is really helpful, thank you. I have a question: what if the maxBlockSize is 5000? How should I change the rest of the code?

  • @freezingtolerance7493
    @freezingtolerance7493 1 year ago

    Hello, thanks for providing a good video. I just wonder: I have raw RNA-seq data, but the company that generated it did not provide any metadata. Does this mean I should create the metadata myself?

  • @sonaaritra
    @sonaaritra 1 year ago +1

    Hello, thank you very much for making these tutorial videos. However, I have encountered an error while plotting the dendrogram with module colors, shown at the end of this video. Previously, when I tried with the same dataset that you used in your analysis, it worked fine. But now I'm trying with one of my microarray datasets and I got the following error:
    Error in .plotOrderedColorSubplot(order = order, colors = colors, rowLabels = rowLabels, :
    Length of colors vector not compatible with number of objects in 'order'.
    Due to this error, it is not generating the panel of colors at the bottom of the dendrogram. Please help me to sort out this problem.

    • @Bioinformagician
      @Bioinformagician 1 year ago +1

      It is hard for me to recreate this error and troubleshoot it without code and data.
      I can look into this if you can send me your code and normalized data.

    • @sonaaritra
      @sonaaritra 1 year ago

      @Bioinformagician Thanks. Should I email it to you?

    • @Bioinformagician
      @Bioinformagician 1 year ago

      @sonaaritra yes please

    • @kartiksachdeva4323
      @kartiksachdeva4323 1 year ago +1

      Were you able to fix the error? if yes could you please tell the solution

    • @SwedishRagers
      @SwedishRagers 1 year ago

      I encountered the same error. How was this solved??

  • @harshitasharma3675
    @harshitasharma3675 1 year ago

    Hello ma'am, can you please explain how we can download data from GEO and convert the read count values to log fold change and p-value?

  • @namratasahu4247
    @namratasahu4247 1 year ago

    Hello, thanks for the amazing tutorial. But I am getting an error after performing WGCNA. Could you please help me solve it: "too few genes with valid expression levels in the required number of samples"?

  • @MasMariusb
    @MasMariusb 1 year ago

    Hi, nice video. May I ask why using vst(counts) rather than the actual DESeq2 normalization process?

  • @athenanguyen442
    @athenanguyen442 1 year ago

    Thank you so much for this! Do you recommend doing anything different for longitudinal data?

    • @Bioinformagician
      @Bioinformagician 1 year ago

      It depends on the question you are asking. If you are interested in identifying genes that are significantly associated with a particular time point, then building a network for each time point individually would make sense. Otherwise, analyzing them all together would be the right way to go about it.

  • @anithabavikatte192
    @anithabavikatte192 1 year ago

    For my analysis, 0.4 is the highest R² value that I found. Can I go with that value to choose the power and mean connectivity?

    • @PoulomiChatterjee-me7oc
      @PoulomiChatterjee-me7oc 5 months ago

      I was going through the same problem. Check that your expression matrix is in the right format. It should have samples in rows and genes in columns.
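
The orientation check suggested in this reply can be sketched as follows. WGCNA functions expect samples in rows and genes in columns, whereas count tables are commonly distributed the other way round. The toy matrix and its gene and sample names are hypothetical.

```r
# Toy expression matrix as commonly distributed: genes in rows,
# samples in columns.
expr <- matrix(
  c(5, 7, 6,
    0, 1, 0,
    9, 8, 9,
    2, 2, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("sample", 1:3))
)

# Transpose so that samples are in rows and genes are in columns.
dat_expr <- t(expr)
dim(dat_expr)

# Inspect the names to confirm the orientation before fitting anything.
head(rownames(dat_expr))  # should be sample identifiers
head(colnames(dat_expr))  # should be gene identifiers
```

With real data, a matrix that has thousands of rows and only a handful of columns is a sign that it still needs transposing.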

  • @ramachandran8106
    @ramachandran8106 2 years ago

    Please release "GWAS" tutorial videos...

  • @abelardnsangou2794
    @abelardnsangou2794 1 year ago

    Please can you do a tutorial on Gene set Enrichment Analysis. (Idea behind that) Like you did for WGCNA?

    • @Bioinformagician
      @Bioinformagician 1 year ago

      Sure, I'll definitely plan a video covering GSEA.

    • @abelardnsangou2794
      @abelardnsangou2794 1 year ago

      @Bioinformagician Ok Thank you very much

    • @SaniyaKhullar
      @SaniyaKhullar 1 year ago

      I also have some videos on my channel related to that. Please do check out and see :)

  • @sonialamba2767
    @sonialamba2767 1 year ago

    Thank you so much for providing this video. I have a query: the dataset you selected has a single text file, but if we have a dataset with multiple text files, how do we deal with it? Please help, as I am new to this field.

    • @Bioinformagician
      @Bioinformagician 1 year ago

      Does each individual text file represent an individual sample?

  • @nataliagarcia5404
    @nataliagarcia5404 1 year ago

    can you perform WGCNA analysis on a pre-filtered set of differentially expressed genes, in a more downstream analysis approach?

    • @Bioinformagician
      @Bioinformagician 1 year ago

      WGCNA is an unsupervised method. It is NOT recommended for use on data that has been pre-filtered for differentially expressed genes.

  • @amrsalaheldinabdallahhammo663

    Thank you genius, can you please make a video about mitch and how to use it in R :)

  • @abdullahaltulea142
    @abdullahaltulea142 1 year ago

    Thanks for your effort. Do we have to batch correct before DESeq2? I read that DESeq2 does batch correction like this: design = ~ condition + batch.

    • @Bioinformagician
      @Bioinformagician 1 year ago

      Batch effects need to be corrected for before DESeq2. If you have batch information in your colData in a column called "batch", then you could provide it in your design like you mentioned.

    • @user-ej1lh5wl8f
      @user-ej1lh5wl8f 1 year ago

      @Bioinformagician If I have done "design = ~condition + batch", then I don't need to use ComBat to remove batch effect?

  • @yipan3694
    @yipan3694 1 year ago

    Hi, thanks for your video. It's really helpful! I have a question, however: what is randomSeed and what's the effect of changing it? I see the WGCNA manual also uses 54321. What's the difference between that and 1234? Thanks very much.

    • @Bioinformagician
      @Bioinformagician 1 year ago

      A random seed makes the output of our R code reproducible. By setting a specific seed, the random processes in our script always start at the same point and hence lead to the same result. The result should not change materially if the seed is changed. You might want to set a different seed for your analysis; however, to ensure your results are reproducible, you should always use the same seed for a particular analysis.

    • @yipan3694
      @yipan3694 1 year ago

      @Bioinformagician okay. Thanks very much.
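
The reproducibility point made in this thread can be checked with a few lines of base R. This is a generic illustration with `set.seed`, not WGCNA code; the seed values 1234 and 54321 are the ones mentioned above and carry no special meaning.

```r
# Same seed, same starting point: the two draws are identical.
set.seed(1234)
draw_a <- sample(1:100, 5)
set.seed(1234)
draw_b <- sample(1:100, 5)
identical(draw_a, draw_b)

# A different seed starts the generator elsewhere, so the draw will
# generally differ, but it is just as reproducible when the seed is reused.
set.seed(54321)
draw_c <- sample(1:100, 5)
set.seed(54321)
draw_d <- sample(1:100, 5)
identical(draw_c, draw_d)
```

What matters for a given analysis is recording the seed that was used, not which number was picked.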

  • @adampassman
    @adampassman 1 year ago

    Thank you so much - can you recommend any packages for batch correction?

  • @bobby5625
    @bobby5625 1 year ago

    Hi! Can I also use RSEM normalized gene expression data for WGCNA?

  • @quinattasneemrafique536
    @quinattasneemrafique536 9 months ago

    Hello ma'am! It would be so helpful if you would provide your script for WGCNA as a file. It becomes difficult to note down every command

    • @Bioinformagician
      @Bioinformagician 9 months ago

      You can get all my scripts from github: github.com/kpatel427/UA-camTutorials/blob/main/WGCNA.R

  • @athenanguyen442
    @athenanguyen442 1 year ago

    What do you mean by merged and unmerged? Do you mean data merged with phenodata?

  • @ritikasingh8809
    @ritikasingh8809 1 year ago

    Is it necessary for the supplementary file to be rawcounts.txt.gz? Please reply. Also, can I do co-expression analysis if the file is in raw.tar?

  • @fatimafarhan531
    @fatimafarhan531 1 year ago

    Thank you for this very informative video! I was applying your tutorial to my dataset; however, I kept receiving this error when running blockwiseModules:
    Error in colSums(!is.na(datExpr[useSamples, useGenes])) :
    'x' must be an array of at least two dimensions
    I searched for it online but couldn't find an explanation. Could you help me, please?

    • @nanditapuri1916
      @nanditapuri1916 1 year ago

      I got the same error! Maybe it is because norm.counts is not 2-dimensional, e.g. a list of lists.

    • @nanditapuri1916
      @nanditapuri1916 1 year ago

      So I removed the previous step that converts them to numeric, and it worked for me.
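
One plausible way the numeric conversion step can trigger this error is by silently dropping the matrix dimensions. The sketch below demonstrates that mechanism with a hypothetical toy matrix `m`; whether it matches the exact code each commenter ran is an assumption.

```r
# Toy character matrix: 2 samples x 3 genes.
m <- matrix(c("1", "2", "3", "4", "5", "6"), nrow = 2,
            dimnames = list(c("s1", "s2"), c("g1", "g2", "g3")))

# as.numeric() strips attributes, so the result is a plain vector.
flat <- as.numeric(m)
is.null(dim(flat))

# colSums() needs at least two dimensions, hence the error in this thread.
failed <- inherits(try(colSums(flat), silent = TRUE), "try-error")

# Changing the storage mode converts the values and keeps dim and dimnames.
kept <- m
storage.mode(kept) <- "double"
colSums(kept)
```

Checking `dim()` on the expression object right before the network-building call is a quick way to confirm it is still a matrix.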

  • @emilyzhang2755
    @emilyzhang2755 8 months ago

    Thanks!

  • @RajeshKumarDutta
    @RajeshKumarDutta 11 months ago

    Thanks!