DESeq basics

  • Published 21 Jan 2025

COMMENTS • 80

  • @iketutgunarta760
    @iketutgunarta760 3 years ago +18

    Hi Mike, may God bless you for making this video (and your other videos). Your channel gives the best explanation I've seen so far.

  • @marcolicalzi3841
    @marcolicalzi3841 3 years ago +5

    Probably the best RNA-seq analysis I've ever seen. Thank you!!!

  • @sebastianmihail6840
    @sebastianmihail6840 2 years ago +1

    I am doing an RNA-seq analysis on sugarcane and I've never touched R, and today I made my first graph with DESeq2 thanks to you!!!!

  • @lucast3006
    @lucast3006 2 years ago +2

    Our lab always had money to pay bioinformaticians to do all of our RNAseq analysis. Now we missed out on a grant and don’t have money to pay them, so I have to learn how to do it myself. Thanks a lot, this was very helpful.

  • @milanbhattarai6472
    @milanbhattarai6472 2 years ago +1

    Wonderful Brother!! Simply Wonderful!! Love from Nepal

  • @matthewseasock9636
    @matthewseasock9636 2 years ago +1

    Amazing tutorial. I have been really trying to start learning R and this is the first tutorial I have found that starts with read counts and walks through DESeq2 in an accessible way. I really appreciate the education

  • @jakebezzina6729
    @jakebezzina6729 2 years ago +1

    Amazing tutorial!!!
    For those who have a dataset with gene IDs such as Entrez IDs as the row names and also want to annotate them before plotting, run this code after the res <- res[order(res$padj),] step and the write.csv function (using the same variables as in the tutorial, but pretending that the gene names aren't provided):
    # Read the .csv file again so RStudio displays it as a table (make sure to enable headers)
    res.annotated

  • @dr.suryanarayanrath6613
    @dr.suryanarayanrath6613 2 years ago +1

    Hello Mike, excellent tutorial. Your video is a great contribution to students, faculty, and researchers who want to work in this field. Thanks, I have learned much from you. God bless you.

  • @grayneo
    @grayneo 2 years ago

    Thanks Mike, this is the best explanation I have come across for a beginner like me

  • @yeeelwy
    @yeeelwy 3 years ago

    Hi Mike, your video saved my life as a beginner bioinformatician who will be giving a presentation in two weeks, lol.

  • @derekyuen5284
    @derekyuen5284 2 years ago +1

    Hi Mike! This video helped me a lot! Thank you!

  • @asiyazhao3820
    @asiyazhao3820 3 years ago

    The best best tutorial ever!!!!! Thank you sooooo much!

  • @genomicsandbioinformatics9628
    @genomicsandbioinformatics9628 3 years ago

    Thank you, please keep sharing your knowledge. The most detailed tutorial I have ever watched.

  • @vishweshbharadiya6758
    @vishweshbharadiya6758 2 years ago

    I cannot tell you how great this was!!!
    I'm a medical student currently doing research and this tutorial just made it so easy. Thanks a lot, Mike, for putting it on YouTube! I feel guilty having access to this for free, lol.

  • @GovardhanKS123
    @GovardhanKS123 2 years ago

    Thanks very much, really good explanation there Mike.

  • @mehdihjamadi3225
    @mehdihjamadi3225 2 years ago

    Please make a video on heatmaps of the differentially expressed genes using the ComplexHeatmap package. Thanks.

  • @NazaninSekhavat
    @NazaninSekhavat 10 months ago +1

    Very useful, thank you. Also, could you please give us the link to the dataset so we can practice on it?

    • @mikevandewege3007
      @mikevandewege3007  10 months ago

      Thanks for watching, but I don't have access to that dataset. We mapped public reads to transcripts to create it. This was 2 or 3 years ago for a class that I'm no longer teaching. Sorry about that.

  • @souravchakraborty9680
    @souravchakraborty9680 3 years ago

    How are you setting up a Notepad file to get two variables in the coldata file? Please help me with this.

  • @amrsalaheldinabdallahhammo663
    @amrsalaheldinabdallahhammo663 2 years ago

    How does DESeq2 calculate the p-value? I know how the mean and LFC are calculated, but how is the p-value calculated? Thanks.

  • @ghanbarmahmoodi7584
    @ghanbarmahmoodi7584 3 years ago +1

    Hi Mike, thank you so much for your great explanation. How can I download RNA-seq data from GEO to analyze with your method? Thanks.

  • @khanmohdsarim
    @khanmohdsarim 2 years ago

    Hi Mike, please advise. I have 4 sample groups (control, treatment 1, 2, and 3) under two temperature conditions, with 3 replicates each. How should I put replicates 1, 2, 3 in columns? Should I use R1, R2, R3 of the control for one condition and R1, R2, R3 of the control for the other condition? Thanks.

  • @danielalondonoserna3101
    @danielalondonoserna3101 3 years ago

    Awesome, thank you eternally

  • @farnazzahedifard3806
    @farnazzahedifard3806 2 years ago

    YOU are Fantastic. Thanks a lot.

  • @Zineb74
    @Zineb74 3 years ago +1

    Hi, thank you so much.
    I have a question: how did you make the first column the row names? I get the "duplicate 'row.names' are not allowed" error every time!!

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      Yeah, I know that error. It's because you have the same gene name more than once and R won't allow duplicate row names. The best idea may be to identify the duplicates and rename them, like gene.1 or gene.2.

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      Or use a more specific gene identifier.
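
      The renaming suggested above can be sketched in base R. This is only an illustration: the `counts_raw` table and its column names are made up, and `make.unique()` is one possible way to do the renaming (it keeps the first occurrence unchanged and adds .1, .2, ... to the repeats).

```r
# Toy count table with a duplicated gene name (hypothetical data)
counts_raw <- data.frame(
  gene = c("geneA", "geneB", "geneA", "geneC"),
  s1   = c(10, 0, 3, 25),
  s2   = c(12, 1, 4, 30),
  stringsAsFactors = FALSE
)

# List the names that occur more than once
dup_names <- unique(counts_raw$gene[duplicated(counts_raw$gene)])
print(dup_names)

# make.unique() leaves the first occurrence alone and suffixes the rest
rownames(counts_raw) <- make.unique(counts_raw$gene)
counts <- counts_raw[, -1]   # drop the now-redundant name column
print(counts)
```

      Whether the duplicated rows should instead be summed or dropped depends on where the gene names came from, so it is worth looking at `dup_names` before renaming.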

  • @sagek7949
    @sagek7949 3 years ago

    Hi! Where can I find this read count file?

  • @Molpath-t4f
    @Molpath-t4f 1 year ago

    Thank you so much. It helped me a lot.

  • @ZahidHussain-xb8it
    @ZahidHussain-xb8it 3 years ago

    Hi Mike, I have a problem in the command dds

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      I think you can round() your dataset.

    • @ZahidHussain-xb8it
      @ZahidHussain-xb8it 3 years ago

      @@mikevandewege3007 Thanks for your valuable suggestion. I will try the rounding.
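
      A minimal sketch of the rounding step, assuming the problem was a count table with non-integer values (the original question is cut off, so this is a guess). The `cts` matrix is made-up data.

```r
# Hypothetical matrix of estimated (non-integer) counts
cts <- matrix(c(10.4, 0.0, 3.6, 25.7, 12.2, 1.49),
              nrow = 3,
              dimnames = list(c("geneA", "geneB", "geneC"), c("s1", "s2")))

# Round to the nearest whole number, then store the values as integers
cts_int <- round(cts)
storage.mode(cts_int) <- "integer"
print(cts_int)
```

      Rounding only makes sense for values that are already on the count scale; it is not a fix for normalized values such as TPM or FPKM.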

  • @polycarpnalela2315
    @polycarpnalela2315 3 years ago

    Very good explanation. Would you please make another video on edgeR?

  • @johirislam8174
    @johirislam8174 3 years ago

    My data set is SARS-2 and control, like your water and 15psi. In that case, which one corresponds to your condition? I am confused about how to interpret the MA plot. Kindly help me with that.

  • @lucast3006
    @lucast3006 2 years ago

    Do you know of any good tutorials that show how to use Deseq for 4 groups at once instead of just 2? For instance, if I had A, B, C, D and I wanted to compare A vs B, A vs C, and A vs D. Is that pretty tricky?

  • @Maryashahere
    @Maryashahere 3 years ago

    Thank you for your video, sir. In your video, when the write.csv command creates the file, it just prints the number without "+enumber na". Is it correct without this?

  • @asdasdchen2336
    @asdasdchen2336 1 year ago

    Thank you!😊

  • @phoenix-z55
    @phoenix-z55 1 year ago

    What is the GSE ID of your data?

  • @SK-fn5fc
    @SK-fn5fc 2 years ago

    Beautifully explained. Thank you so much. A question from someone who doesn't know R: why is the row name 1 in the first command when we have so many genes (13K)?

    • @mikevandewege3007
      @mikevandewege3007  2 years ago

      In R you can label column names and row names. I think I'm setting the names of the rows equal to what's in the first column (row.names = 1). So if I want to extract info for a specific gene or label genes in a graph, I can use the row names. Default row names are 1-n.
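
      To make the effect of `row.names = 1` concrete, here is a small base-R sketch. The CSV text and sample names are invented for illustration; the tutorial reads a real file instead.

```r
# A tiny stand-in for the read count file (hypothetical contents)
csv_text <- "gene,water_1,water_2,psi15_1
geneA,10,12,30
geneB,0,1,2
geneC,25,30,5"

# Without row.names: genes stay in a regular column, rows are labelled 1..n
default_rows <- read.csv(text = csv_text)
print(rownames(default_rows))

# With row.names = 1: the first column becomes the row labels
counts <- read.csv(text = csv_text, row.names = 1)
print(rownames(counts))

# Rows can now be pulled out by gene name
print(counts["geneB", ])
```

      The `1` is the position of the column to use as labels, not the number of rows, so it works the same for 3 genes or 13K.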

  • @jeremiahtrinidad6916
    @jeremiahtrinidad6916 3 years ago

    thanks for making this video man!

  • @johirislam8174
    @johirislam8174 3 years ago

    Some kinds of data are not like that. How can I analyze that kind of data?

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      For example?

    • @johirislam8174
      @johirislam8174 3 years ago

      Like this GEO data set, GSE138252. In the supplementary files there are 3 files in txt format. I want to analyze the differentially expressed genes from them, but I cannot tell which samples are control and which are infected. Also, how can I generate the phenodata for that data set?

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      That's probably when I'd ask the authors for help, if available. I tend not to query GEO, for no good reason though.

    • @johirislam8174
      @johirislam8174 3 years ago

      @@mikevandewege3007 So how can I analyze the DEGs easily with DESeq in R without any obstacles?


  • @abdou-samadkone6397
    @abdou-samadkone6397 3 years ago

    Hi Mike. Thank you very much for this tutorial. I have a question: should RNA-seq data normalized by RSEM follow the same method for DESeq2 normalization? Thank you.

    • @mikevandewege3007
      @mikevandewege3007  3 years ago +1

      I would refer to RSEM's manual about diff exp and normalization.

    • @abdou-samadkone6397
      @abdou-samadkone6397 3 years ago

      @@mikevandewege3007 Thank you. I am having some trouble applying DESeq2 to my RNA-seq data.

  • @Rydaholic
    @Rydaholic 3 years ago

    Thank you for this!

  • @constanzaandreani7811
    @constanzaandreani7811 3 years ago

    Great video, great mohawk! Thanks.

  • @aqsamuzammil7791
    @aqsamuzammil7791 3 years ago

    Hi, in summary(res) what do outliers mean?

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      I think it means: for these genes, a sample had an expression value that was an outlier of the expected distribution and a difference may or may not be trusted. I.e. I have 6 samples and for a gene I see expression values like 0,0,0,0,0,100. An outlier may be driving an observed difference. Could be wrong though.

  • @sabahedayati-t2r
    @sabahedayati-t2r 1 year ago +1

    It was very useful

  • @Stella-o3w1z
    @Stella-o3w1z 3 years ago

    Hi Mike, thank you so much for making such a great video!! I have one question about using apeglm. In the beginning you loaded the apeglm package, but I didn't see where you actually ran it in this example. Could you please explain how to use the package: on what basis I should choose apeglm, the code to run it, and any other settings needed (e.g. beta prior, fit type, and test type)? (Sorry for the lengthy question, btw.)

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      It's probably just a formality, and I may not have used apeglm in this example. It can be a dependency used by a couple of DESeq2 commands like lfcShrink(), which shrinks the log fold change for lowly expressed genes. Can't tell you any more about it than that, lol.

    • @Stella-o3w1z
      @Stella-o3w1z 3 years ago

      @@mikevandewege3007 Thanks!

  • @mailchippull
    @mailchippull 3 years ago

    Hi Mike, thanks a lot for the tutorial. Can I please ask you a question about choosing factor levels for DESeq2? I have 3 sets of samples: (1) infected 1, (2) infected 2, (3) control.
    I would like to compare
    (1) control against infected 1,
    (2) control against infected 2,
    (3) infected 1 against infected 2, and
    (4) control against infected 1 and infected 2 combined.
    I am not sure how to set the factors and factor levels for this. Could you please give me a suggestion? Many thanks, GG

    • @mikevandewege3007
      @mikevandewege3007  3 years ago +1

      All of that would be described in your coldata file. I believe DESeq2 will do pairwise comparisons of your factors, so you'd have 1 column with factors: control, infected1, infected2, and it'll do all the pairwise comparisons (I think, never tried). You can also have another column with just control and infected as the factors and run a separate experiment just comparing control and infected samples. Hope that makes sense.

    • @mailchippull
      @mailchippull 3 years ago

      @@mikevandewege3007 Thanks a lot for your quick reply!!!
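
      The two-column coldata layout described in the reply can be sketched in base R as below. Sample names, group sizes, and the `status` column name are assumptions made for illustration. The DESeq2 calls are shown only as comments: `results()` with a `contrast` argument returns one comparison per call, so each pair is requested by name.

```r
# Hypothetical sample sheet: 3 groups x 2 replicates
coldata <- data.frame(
  condition = factor(rep(c("control", "infected1", "infected2"), each = 2)),
  row.names = paste0("s", 1:6)
)

# Second column that pools both infections into a single level
coldata$status <- factor(ifelse(coldata$condition == "control",
                                "control", "infected"))

print(coldata)
print(table(coldata$condition, coldata$status))

# With DESeq2 loaded and a fitted object `dds` built with design = ~ condition,
# each pairwise comparison would be requested explicitly (not run here):
#   results(dds, contrast = c("condition", "infected1", "control"))
#   results(dds, contrast = c("condition", "infected2", "control"))
#   results(dds, contrast = c("condition", "infected2", "infected1"))
# The pooled comparison would be a second run with design = ~ status.
```

      In each `contrast` vector the second level is the numerator and the third is the denominator of the reported fold change.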

  • @cherylhong2085
    @cherylhong2085 3 years ago

    very useful. Thanks!!

  • @noereyna2553
    @noereyna2553 2 years ago

    THANK YOU!

  • @arwamu7050
    @arwamu7050 3 years ago

    Hi,
    Thank you so much for this, finally a good explanation that works :)
    I have one question, please. I am more interested in the differentially expressed genes in the test group, not the control, so I am trying to get "condition_15psi_vs_water". I tried to change the sample order in the sample sheet, but it did not work.

    • @mikevandewege3007
      @mikevandewege3007  3 years ago

      The sample order has to match exactly in the dataset and the column data sheet. In reality though, the order doesn't matter, like I just talk about control vs test, but the interpretation is the same if it's test vs control.
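
      On the direction of the comparison: DESeq2 reports fold changes relative to the first level of the condition factor, and R orders factor levels alphabetically unless told otherwise, so reordering rows in the sample sheet does not change it. A base-R sketch of setting the reference level, assuming the levels are `water` and `15psi` as in the tutorial:

```r
# Condition labels as they might appear in the sample sheet
condition <- factor(c("water", "water", "water", "15psi", "15psi", "15psi"))

# Default level order is sorted, regardless of the order of the samples
levels_before <- levels(condition)
print(levels_before)

# Make "water" the reference (first) level
condition <- relevel(condition, ref = "water")
print(levels(condition))

# In a DESeq2 workflow the same idea would be applied before running DESeq():
#   dds$condition <- relevel(dds$condition, ref = "water")
```

      With `water` as the reference, the comparison is named 15psi versus water, and positive log fold changes mean higher expression in the 15psi samples.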

  • @nehagoel7405
    @nehagoel7405 3 years ago

    > dds_new

    • @achishasaikia4781
      @achishasaikia4781 2 years ago

      The number of variables in your info and cts files is not the same. It has to be equal. Maybe you forgot to index the first column of your data frame. He indexed his genes column, and his variable count became 6 in both files.
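
      A quick way to check this before building the DESeq2 object, sketched in base R. The `cts` and `coldata` objects below are toy stand-ins with invented sample names.

```r
# Toy count matrix: 2 genes x 6 samples
cts <- matrix(1:12, nrow = 2,
              dimnames = list(c("geneA", "geneB"),
                              c("w1", "w2", "w3", "p1", "p2", "p3")))

# Toy sample sheet: one row per sample, row names are the sample names
coldata <- data.frame(condition = rep(c("water", "15psi"), each = 3),
                      row.names = c("w1", "w2", "w3", "p1", "p2", "p3"))

# Columns of the counts must correspond one-to-one to rows of the sample sheet
same_count <- ncol(cts) == nrow(coldata)
same_order <- same_count && all(colnames(cts) == rownames(coldata))
print(c(same_count = same_count, same_order = same_order))

# If the names match but the order differs, reorder the counts to follow coldata
cts <- cts[, rownames(coldata)]
```

      If `same_count` is FALSE because the counts have one extra column, the gene names are probably still sitting in a regular column instead of the row names.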