How to rarefy community data in R with vegan and the tidyverse (CC200)

Riffomonas Project

Published 12 Jan 2025

COMMENTS • 29

@silviagonzalezcolino858 8 months ago ⁺¹
Thank you so much for your tutorials!! You make complex things look easier, which is very helpful (especially when analysing data).
@Riffomonas 8 months ago
Thanks!
@charleslehnen9636 2 years ago
Check out the parameter of vegan's function:
"Instead of drawing a plot, return a “tidy” data frame than can be used in ggplot2 graphics. The data frame has variables Site (factor), Sample and Species."
@Riffomonas 2 years ago ⁺¹
Thanks - yeah, I think that's new since I made the video.
@de.bora.bora.yt-chan 2 years ago ⁺¹
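The quoted help text above does not name the function or the argument. A minimal sketch, assuming it refers to the `tidy` argument of vegan's `rarecurve()` and using the `BCI` example data that ships with vegan (the step size and axis labels are illustrative choices):

```r
library(vegan)
library(ggplot2)

data(BCI)  # example community matrix shipped with vegan

# Assumption: the quoted parameter is `tidy`. With tidy = TRUE, rarecurve()
# returns a data frame instead of drawing the base-graphics plot.
curves <- rarecurve(BCI, step = 20, tidy = TRUE)

# One line per site, built from the Site, Sample and Species columns
p <- ggplot(curves, aes(x = Sample, y = Species, group = Site)) +
  geom_line() +
  labs(x = "Number of individuals sampled", y = "Number of species observed")

print(p)
```

This needs a vegan release recent enough to include the argument.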
Thank you so much for this detailed explanation! It was really useful for my own data.
@Riffomonas 2 years ago
Wonderful - I'm glad it was useful! Thanks for watching
@brantainman 2 years ago ⁺¹
@ 5:00 it is suggested that tibbles do not allow row names. I think this is incorrect and the following code is the tidy way to do it:
shared %>%
  pivot_wider(names_from = name, values_from = values, values_fill = 0) %>%
  column_to_rownames('Group')
@brantainman 2 years ago ⁺¹
Also, vegan has a great new feature that avoids all the data manipulation for getting tidy data: my_curves
@Riffomonas 2 years ago
That’s great to see!
@Riffomonas 2 years ago
This actually creates a data frame rather than a tibble. A tibble is a special kind of data frame.
@meseretmuche6984 2 years ago ⁺¹
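A small sketch of the point in this reply, assuming the tibble and tidyr packages are installed. The toy `shared` table and its column names (`Group`, `name`, `value`) are hypothetical stand-ins, not the data from the video:

```r
library(tibble)
library(tidyr)

# Hypothetical long-format counts: one row per sample/taxon pair
shared <- tibble(
  Group = c("s1", "s1", "s2", "s2"),
  name  = c("otu1", "otu2", "otu1", "otu2"),
  value = c(10, 5, 8, 0)
)

# pivot_wider() keeps the result as a tibble
wide <- pivot_wider(shared, names_from = name, values_from = value, values_fill = 0)
is_tibble(wide)

# column_to_rownames() moves Group into the row names and returns a plain data frame
with_rownames <- column_to_rownames(wide, "Group")
is_tibble(with_rownames)
is.data.frame(with_rownames)
rownames(with_rownames)
```

The row names survive only because the object is no longer a tibble, which is what vegan's matrix-oriented functions need anyway.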
Thank you so much for your unlimited help.
Dr., if you have a lecture on Hill numbers (q=0, q=1, q=2) for diversity analysis of vegetation, please share it with me.
@Riffomonas 2 years ago
My pleasure! Unfortunately I don’t have anything about Hill numbers.
@cristianvillenaalemany7972 2 years ago ⁺³
Thank you very much for all the material, very useful!
I have different library sizes in my microbiome data and I would like to normalize them by rarefying to min_n_seqs, since the smallest sample contains more than 12000 reads, as you explained so well. If I use vegan::rrarefy, I obtain the specified subset of reads from my original OTU table. A single random subset might not be representative enough for each sample since there is high diversity. Is there a way you would recommend to carry out a multiple-iteration rarefaction and produce a final OTU table in which the values are the average of the multiple subsets?
Thanks for your attention.
@Riffomonas 2 years ago ⁺²
Thanks for watching! Running rrarefy a bunch of times and then averaging the counts is effectively the same as using the relative abundance, which I showed in an earlier episode causes problems. I would suggest running whatever test you're doing on a single subsampling and then repeating it a few times to see if the results change at all. In my experience, the low relative abundance taxa are what change the most, and for most OTU-based analyses they don't come up as significant. If they do, I generally discount them because they're so rare.
@cristianvillenaalemany7972 2 years ago
@@Riffomonas I will try as you suggest, it makes a lot of sense. Again, thank you very much for your help! Your videos are awesome!
@oluwafemioyedele 2 years ago ⁺¹
@Pat, I think there is a function to deal with either row names or column names in the tibble package.
@Riffomonas 2 years ago ⁺¹
Ah - you're right! Thanks :) tibble.tidyverse.org/reference/rownames.html
@sven9r 2 years ago ⁺¹
We rarely see this face @20:44! Pat thinking longer than a nanosecond about one of his 2198321673213 variables.
@Riffomonas 2 years ago ⁺¹
Lol. Plus I think it was the end of a long day at the end of a long week 😂🤓
@edwinimfumu3221 1 year ago
Hi Sir, thanks for your videos. I rarefied my data using iNEXT. Now I am having a problem plotting the data. Can you show how to resize the plot, legend, etc. when using iNEXT?
@bugslutt 1 year ago
If I want to loop the rrarefy command on my data matrix 1000 times and save all the output (to ultimately calculate an average), what code would I use? I've been trying to figure it out and am struggling!
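One way to sketch the loop asked about here, assuming vegan is installed. The toy matrix, the seed, and the object names are hypothetical. As noted in the reply earlier in the thread, averaging many rarefied tables ends up close to working with relative abundances, so this shows only the mechanics:

```r
library(vegan)

set.seed(19760620)  # arbitrary seed so the random draws are reproducible

# Hypothetical community matrix: 3 samples (rows) x 4 taxa (columns)
shared_mat <- matrix(
  c(10, 5, 0, 5,
     8, 8, 2, 2,
    20, 0, 5, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), c("otu1", "otu2", "otu3", "otu4"))
)

min_n_seqs <- min(rowSums(shared_mat))  # size of the smallest sample
n_iter <- 1000

# Keep every rarefied matrix: `draws` is a list of n_iter matrices
draws <- replicate(
  n_iter,
  rrarefy(shared_mat, sample = min_n_seqs),
  simplify = FALSE
)

# Element-wise mean across all of the rarefied matrices
mean_counts <- Reduce(`+`, draws) / n_iter
```

Keeping the draws in a list means the same downstream test can also be run on each element, for example with `lapply()`, before deciding whether an average is needed at all.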
@aabidhussain2138 1 year ago
Using this code on my data, the shared file produced is empty, with only column names in it. What could be the problem?
@nendinosaurus 1 year ago
Ok nice, but what do you use for bar plots then for example? When you need a single dataframe. Do you use a single subsampling for things like that?
@Riffomonas 10 months ago
I don't usually use barplots 😂 I would take the average value for each sample and plot that as a jittered plot.
@lisakelly4921 2 years ago ⁺¹
If you rarefy to min_n_seq, is there a risk of removing significance between two groups of samples when you do downstream statistical analysis?
@Riffomonas 2 years ago ⁺¹
Hi Lisa thanks for watching! I think there’s a trade off. If you increase the min_n_seqs value you will have a better limit of detection but fewer samples. With fewer samples you’ll have less statistical power to detect differences. It might be worth running an analysis at multiple levels and see what happens
@gimanibe 1 year ago
Thanks Pat. How would you plot the output of drarefy?
@GabrielYan-r3g 1 year ago
Thanks Pat. In QIIME2, your taxa will reach a plateau once the sampling depth increases to a certain level. Is there a similar approach to find that sampling depth when plotting the rarefaction curves in vegan?
@Riffomonas 10 months ago
Sorry, I don't use QIIME and am not really familiar with why you see that. Perhaps because they're using closed-reference clustering and it is saturating all of the available taxa in the reference?