Hi! I just wanted to thank you for these videos. Every time I have to use Primer and Permanova, I end up coming back here to remember some details!
Glad to help! And thank you for saying so. (I am now retired but still see comments.)
Thank you very much for your step-by-step tutorials! These have been hugely helpful.
It was my pleasure! (I am now retired, so no updates.)
Very helpful videos! Wondering if you are planning a video on biodiversity measures and species curves anytime soon!
NOTE: My description of the results of CAP lacks some important information. I will be doing an updated video soon to fix this.
No updated video: retired.
Hi Keith, I am struggling to analyse my biological data, which were collected at 5 sites with 10 replicates. I want to plot a scatter plot to show the distribution of the species, but every time I try, the plot shows variation among sites and not biota.
Sorry; retired with no access to software so I cannot help. Apologies ( but I have a vague memory that you might have to swap rows and columns; also may depend on what you are doing; as in, what type of analysis).
@@kamcgnt
Hi Keith
I want to use a scatter plot to analyse the distribution of the biota across the different sampling stations. Thank you for your response and assistance
I went straight to the wizard, selected MDS, then went to the plot option and selected a scatter plot. The scatter plot is showing sites and not the distribution of the different species?
Hi keith thank you for the tutorial but can I contact you personally regarding data arrangement for primer permanova? as I'm facing a lot of difficulties here
I am sorry for your problems but I am retired and also have no access to the software or support.
But there are other videos which may help.
@@kamcgnt do you perhaps know data from CPCE? How can I arrange it into a form that can be used in PRIMER? 😅
Hi Keith, thank you for your fantastic tutorials. I'm trying to do a PERMANOVA test with genera of diatoms obtained from 4 different substrates (rocks, sediment, plants and surface water). When I import my excel spreadsheet it doesn't automatically import my factors. I tried to add 'substrate' or 'genus' as a factor but it didn't work when I ran the PERMANOVA. Any ideas on how to fix this? Also, the PCO was 63.2% (PCO1) and 23.8% (PCO2) giving 87%, are these values too low? (I know you said most data wouldn't be as good as the 95% and 3% you had in your PCO). The environmental data I have can't be applied to all substrates.
Thanks for another good tute. Could you comment a little on how to interpret the sections on the PCO that are minus?
+loosealliance It can occur with semimetric or non-metric resemblance measures and occurs because the distance matrix does "not allow a full representation of the relationships among objects in Euclidean space". The "representation is meaningful as long as the largest negative eigenvalue is smaller, in absolute value, than any of the m positive eigenvalues of interest for representation in a reduced space (usually, the first two or three)."
That's from Legendre and Legendre: Numerical Ecology.
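The rule quoted above can be written as a small check. This is only a sketch: the function name and the eigenvalues are made up for illustration, and PRIMER itself is not involved.

```python
def negative_eigenvalues_ok(eigenvalues, m=2):
    """Apply the quoted rule to the first m PCO axes.

    Returns True when the largest negative eigenvalue is smaller, in
    absolute value, than each of the m largest positive eigenvalues.
    """
    positives = sorted((v for v in eigenvalues if v > 0), reverse=True)
    negatives = [v for v in eigenvalues if v < 0]
    if len(positives) < m:
        return False  # not enough positive axes to plot
    if not negatives:
        return True  # fully Euclidean representation, nothing to check
    largest_negative = max(abs(v) for v in negatives)
    # The m-th positive eigenvalue is the smallest among the axes of interest.
    return largest_negative < positives[m - 1]


# Hypothetical eigenvalues from a PCO on a semimetric measure.
example = [12.4, 6.1, 2.3, 0.8, -0.5, -1.9]
print(negative_eigenvalues_ok(example, m=2))  # first two axes
print(negative_eigenvalues_ok(example, m=4))  # first four axes
```

With these made-up values, the first two axes pass the check but the fourth axis (0.8) is smaller than the largest negative eigenvalue (1.9 in absolute value), so a four-axis representation would not.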
Hi Keith. Thank you so much for the really helpful tutorials. I am trying to run the PERMANOVA and I am running into 2 problems. Firstly, when I try to run the test I get a 'WARNING No replication at the lowest level', and secondly, my output does not generate the P(perm) for the interaction of my two factors. The output mentions 'excluded terms'. I checked the terms before running the test and it shows that all terms are included. Please help :(
Thank you in advance
Apologies, retired and no access to any of this software.
Hello Keith,
Once again thanks for the very insightful videos you have published on the PRIMER/PERMANOVA tools. However, I have a question on how to edit the labels in the ordination. For example, in this video, if I want to see just the different colours of the labels and not (colour + label) as you have displayed in the graph output now, how can I correct that?
Right-click on the graph and select the top option: 'Samp. labels & symbols'. On the left, there is a check box for 'Labels', 'Plot'. Uncheck it and labels are gone. On the right, are the options for the symbols. This is for PRIMER 7.
Keith, when I run my PERMANOVA design, I get an error warning me there was no replication at the lowest level. In my results It tells me that the combination of my variables was excluded (in my case it would be sitexdate). how can I fix that and see the effects of the combination of the two variables?
Your two factors are site and date. If you do not have more than one observation for each site on each date, then the design is not replicated, so there is no replication at the lowest level. In such cases, PERMANOVA will use the two-factor interaction as the error term (Residual SS) because there is no other way to estimate the error. With no separate estimate of error, there can be no test for the interaction term.
Also, if date is a factor, this may be a repeated measures design and should be analysed as such. If you are specifying the design correctly, PERMANOVA should do this.
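Before running the design, it can help to count the observations in each site x date cell outside PRIMER. A minimal sketch, assuming the factor labels have been copied into two Python lists (the names and data here are hypothetical):

```python
from collections import Counter


def cell_counts(sites, dates):
    """Count the observations in every site x date combination."""
    return Counter(zip(sites, dates))


def unreplicated_cells(sites, dates):
    """Return the cells that have fewer than two observations."""
    counts = cell_counts(sites, dates)
    return sorted(cell for cell, n in counts.items() if n < 2)


# Hypothetical labels: one sample per site on each date.
sites = ["A", "A", "B", "B"]
dates = ["d1", "d2", "d1", "d2"]
print(unreplicated_cells(sites, dates))
```

If every cell appears in the output, there is only one observation per combination and the interaction cannot be tested separately from the residual.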
Thanks! I found the video you made explaining this too. Thanks for all the videos, youre saving my thesis!
@@robertsheridan8224 hi there. your question to keith literally saved my life too. may i please ask where did you find that video explaining this?
I have seen this kind of apparently contradictory result and I cannot really explain it. CAP is designed to give groups, so it does! To make sure that these are "real" you use PERMANOVA. CAP and PA both use the actual distances, whereas BEST is using ranks and this may account for some of the differences. You might want to try distLM, which also uses actual distances, to see if that gives results more consistent with CAP.
When you created "Design 2", how did you allow yourself to use the Environmental data as a factor? Mine doesn't have anything under the drop box
+Peter Tran Hi Peter. PRIMER/PERMANOVA+ recognise factors when importing the spreadsheet. If samples are in the rows, then leave a blank column after the last data column and then have the factor labels in the following columns. So, for a 2 factor design, I have a blank column after the last data column and then two columns with labels for the two factors: the text in the first row for each column will be the factor name.
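The layout described above can be sketched by building the rows in Python and writing them as CSV. The species names, factor names and counts are invented, and the leading "Sample" label column is an assumption; only the blank column followed by the factor columns comes from the description.

```python
import csv
import io


def build_rows(species, factors, samples):
    """Lay out samples in rows: data columns, one blank column, then factors."""
    header = ["Sample"] + list(species) + [""] + list(factors)
    rows = [header]
    for name, counts, levels in samples:
        rows.append([name] + list(counts) + [""] + list(levels))
    return rows


# Hypothetical data for a 2-factor design.
species = ["Sp1", "Sp2", "Sp3"]
factors = ["Site", "Season"]
samples = [
    ("S1", [12, 0, 3], ["Inshore", "Wet"]),
    ("S2", [7, 2, 0], ["Inshore", "Dry"]),
    ("S3", [0, 9, 5], ["Offshore", "Wet"]),
]

rows = build_rows(species, factors, samples)
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue())
```

The printed text can be pasted into a spreadsheet; the empty cell in each row is the blank column that separates the data from the factors.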
Hi again Keith,
Is the p-value of the PERMANOVA test based on the similarity index created and not on the actual means of the samples? The reason I ask is that in the bar graph with standard errors that I created from the means of my samples, it shows a significant difference (because the error bars didn't overlap), but in the PERMANOVA results the p-value is > 0.05 (no significant difference). Many thanks again.
Dex
It is not easy to judge significant differences from error bars and it is easy to make an incorrect conclusion.
The answer to this question depends on whether you are doing a multivariate PERMANOVA or using the PA routine to get p-values by permutation for an ordinary ANOVA. The latter will use Euclidean distances (possibly of transformed data), so is run from the similarity matrix. But, because PA in this case gives the same ANOVA table (except with p-values from permutation), it must be doing, effectively, the same calculations as ordinary univariate ANOVA. As a consequence, the SS are, effectively, derived from calculations using means.
The situation is different for multivariate PERMANOVA. Here the calculations are based on the group/treatment centroids and not the means.
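For the univariate case, the equivalence can be checked directly: the sum of squares computed from Euclidean distances between pairs of observations matches the sum of squares computed around the mean. A small sketch with made-up values and hypothetical function names:

```python
from itertools import combinations


def ss_from_means(values):
    """Sum of squared deviations from the group mean (ordinary ANOVA)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def ss_from_distances(values):
    """Same quantity from pairwise Euclidean distances:
    sum of squared distances between all pairs, divided by n."""
    n = len(values)
    return sum((a - b) ** 2 for a, b in combinations(values, 2)) / n


# Made-up univariate data for two groups.
groups = {"control": [4.0, 5.5, 6.1], "treated": [7.2, 8.0, 9.4, 6.9]}
for name, values in groups.items():
    print(name, ss_from_means(values), ss_from_distances(values))
```

The two columns agree up to floating-point rounding, which is why a permutation ANOVA on Euclidean distances reproduces the usual ANOVA table.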
thanks again for the very useful info, Keith :-)
any suggestion to turn the trial version of Primer 7 into full version?? crack or something else
Hi Keith , Thanks a million for these videos - they are life savers. Can I ask you one question in relation to SIMPER analysis. Should SIMPER analysis be carried out on transformed data or raw data.
My inkling is that I should do it on raw data, as the mean abundances that PRIMER will present are what I actually want to know, as opposed to the log-transformed or 4th-root-transformed mean abundances. If you have any insights I would greatly appreciate them.
Matt
As you know, SIMPER is done on a data sheet, raw or transformed. But many other analyses are done on the resemblance matrix derived from the data sheet. For abundances, the transformations are to down weight the very abundant species, so that multivariate analyses reflect the patterns in species composition of the entire community. Having said that, the most abundant species will still be the most abundant species after transformation and, as a consequence, species characteristic of a group (e.g. site) will probably stay the same but the species distinguishing pairs of groups may change. I just did a little test and this was the case. The species characteristic of a group were the same, in the same order, for the raw data or fourth root. The species distinguishing the pair changed completely, with only one species in both lists. For this reason, if you do an ordination on transformed data, a SIMPER on the raw data may not match what the ordination displays.
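The down-weighting effect can be illustrated with two samples and the per-species terms of the Bray-Curtis dissimilarity. This is a simplified sketch with invented abundances, not a reimplementation of SIMPER, which averages these terms over all pairs of samples between groups.

```python
def bray_curtis_contributions(a, b):
    """Per-species terms of the Bray-Curtis dissimilarity between two samples."""
    total = sum(a) + sum(b)
    return [abs(x - y) / total for x, y in zip(a, b)]


def fourth_root(sample):
    """Fourth-root transform of the abundances."""
    return [v ** 0.25 for v in sample]


def percent_share(contributions):
    """Express each species' term as a percentage of the total dissimilarity."""
    total = sum(contributions)
    return [100 * c / total for c in contributions]


# Invented abundances for three species at two sites.
species = ["dominant", "common", "rare"]
site1 = [1000, 20, 0]
site2 = [400, 60, 5]

raw = percent_share(bray_curtis_contributions(site1, site2))
transformed = percent_share(
    bray_curtis_contributions(fourth_root(site1), fourth_root(site2))
)
for name, r, t in zip(species, raw, transformed):
    print(f"{name}: raw {r:.1f}%  fourth root {t:.1f}%")
```

On the raw data the dominant species accounts for almost all of the dissimilarity; after the fourth-root transform the rare species contributes the most, so the list of distinguishing species changes.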
Thanks a million Keith.👍👍👍
I see that people sometimes transform in an attempt to make PERMDISP non significant and then carry out the PERMANOVA. Do you have a take on this and the arguments around issues with dispersion in permanova? and if you have a video dealing with PERMDISP could you point me in the right direction? Thanks again. These videos are great!
Hi there. Thanks for the perfect explanation. I have one small question. Why do your factor levels appear on the Y axis (on CAP)? How can I use them instead of CAP2?
That only happens with very simple data sets where CAP1 accounts for most of the variation.
Ah ok! Thanks again!
hi...
I have some data relating to gut microbiota structure. There are two treatments and, for each treatment, there are baseline and end values for microbiota structure. How can I compare the effects of these two treatments with one another using PERMANOVA?
It sounds like PERMANOVA with two factors: treatment and time.
Thanks for the videos, Keith. PERMANOVAs are pretty resource intensive, but my computer is only allocating 25% of the CPU to Primer. Know any way of making primer more of a CPU hog?
Tomer Czaczkes The only thing I know is to change the priority of the process. This is for Windows:
www.sevenforums.com/tutorials/83361-priority-level-set-applications-processes.html
Thanks for the informative tutorial! How did you select the interaction term in the MDS? When I try to plot 2 factors in the nMDS, I am only able to select and show 1 using PRIMER v7.
Hi Emily. In Primer, if you go to 'Edit factors', one option is 'Combine factors'. Usually, we do a PERMANOVA to test for differences then use the appropriate ordination option to review the results. Using 'Combine factors' we can highlight the relevant interaction, if there is a significant interaction.
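Conceptually, combining factors gives each sample a single label built from its levels on both factors. A rough sketch of that idea follows; the function and the underscore separator are assumptions for illustration and do not describe the label format PRIMER produces.

```python
def combine_factors(first, second, sep="_"):
    """Build one combined label per sample from two factor columns."""
    if len(first) != len(second):
        raise ValueError("both factors need one level per sample")
    return [f"{a}{sep}{b}" for a, b in zip(first, second)]


# Hypothetical factor levels for four samples.
site = ["Reef", "Reef", "Lagoon", "Lagoon"]
season = ["Wet", "Dry", "Wet", "Dry"]
print(combine_factors(site, season))
```

Plotting symbols by the combined label shows every site-by-season cell separately, which is what makes an interaction visible in the ordination.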
Thanks! I have run a 2-factor permanova with 3 levels in my first factor (Size) and 4 levels in my second factor (Year), but am a bit concerned because one of my group only has a sample size of 1 (eg. only 1 small size was sampled in 2011). Would this be an issue for PERMANOVA and for testing the assumptions with PERMDISP?
Conclusions about groups with few samples must be tentative. If n = 1 for one or more groups, then they cannot be used in PERMDISP as they will have no dispersion. The authors of PERMANOVA, however, say (in the manual) that they do not think violation of the equal dispersions assumption is likely to create problems.
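Why a group with n = 1 has no dispersion can be seen from a simple distance-to-centroid calculation. This sketch uses plain Euclidean coordinates with made-up points; PERMDISP works in the space of the chosen resemblance measure, so it is only an analogy.

```python
import math


def mean_distance_to_centroid(points):
    """Average Euclidean distance from each point to the group centroid."""
    n = len(points)
    centroid = [sum(column) / n for column in zip(*points)]
    return sum(math.dist(p, centroid) for p in points) / n


# Made-up coordinates: a group of three samples and a group of one.
group_of_three = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
group_of_one = [(5.0, 5.0)]
print(mean_distance_to_centroid(group_of_three))
print(mean_distance_to_centroid(group_of_one))
```

The single sample is its own centroid, so its distance to the centroid is zero and the group contributes no information about spread.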
sorry what is the full name of PCO?
Principal Coordinates Ordination.
@@kamcgnt That's such a fast reply thankyou!
can you do a video on ANOSIM?
Thank you very much for the videos. I am running PERMANOVAs in PRIMER 7 on some data with three factors. Only one of them works, in that it generates a p-value (location, with ten levels); the others do not (inside or outside a building, and inside or outside a system). What would be the reason(s) that I do not see any PERMANOVA results, i.e. no test, for those two factors? Both have df of 0 in the summary. Please let me know if you need any other information. Thank you for your time and help.
If you have 0 df, there is some issue with the way in which you have entered the design or data. If there is 0 df, PRIMER will be unable to estimate SS, so no MS or F tests.
I just entered the data into an excel worksheet and had the factors separated from the physical variables and biological data on both their sheets by a blank column. Similar to your first video. Is there any special way that the data needs to be put into Excel or saved? I could not find anything in the manuals talking about that. Thank you again for your help.
If 0 df, then PRIMER is only finding one category for the factor.
Under PERMANOVA results they read each of the factors at the correct levels i.e. 2 , 10, and 2. So it seems like it is recognizing that there are different categories. They are all fixed types but then in the results lists the df as 0, 7, and 0. Is there an options section to change the amount of change between levels and df? In your video it only goes down by one between the levels and df. Thank you for your help.
How could we add years into this? if we wanted to say measure something over time?
Time is tricky to work with and I do not have enough information to respond.
Hi Keith, Thanks a million for these videos, I am doing my thesis and this tutorial is very helpful. I have some questions:
1) When you do the PCO, what does the % of total variation mean?
2) Can you modify it in any way?
3) What is the difference between a PCO, nMDS and a CAP?
regards
1) It is the percentage variation among the samples that can be represented in the ordination. Suppose there are 20 variables. The relationships among the samples can be perfectly represented in a 20 dimension display but there is no way to plot that. So ordination methods attempt to display the relationships in 2 or 3 dimensions. The % variation is the % of the total variation that is accurately represented in the 2D or 3D plot.
2) Modify the % variation? Transforming the data or using a different distance measure would change the % variation, and a 3D plot will display more than a 2D plot, but how much would depend on the data set.
3) PCO uses actual distances, has a user-selected distance measure and is calculated directly, like PCA. nMDS also has a user-selected distance measure but uses the ranks and an iterative procedure with random starting configurations, which may give different results if repeated, especially with small data sets. CAP is an ordination method which has constraints imposed, such as maximising the variation due to factors in an experimental design.
Again a million thanks for your help, one last question: reviewing the video again, I saw that apart from the P (perm) is the Pseudo-F, what is ?, What does it mean and what criteria should I use to determine if it is Acceptable or not?
regards
The Pseudo-F is the actual test statistic. In ordinary ANOVA, we can get the p-value associated with this from the F distribution but here the p-value is obtained by permutation. If the null is true, the F will be about 1 (one). If it is significantly larger, the null can be rejected.
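To make the idea concrete, here is a minimal one-factor sketch of a pseudo-F computed from a distance matrix, with the p-value obtained by permuting the group labels. It is not PRIMER's implementation: the function names, the made-up data, the 999 permutations and the fixed seed are all assumptions for illustration.

```python
import random
from itertools import combinations


def pseudo_f(dist, labels):
    """One-way pseudo-F from a square distance matrix and group labels."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total SS: sum of squared distances over all pairs, divided by n.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group SS: the same calculation inside each group.
    ss_within = 0.0
    for g in groups:
        idx = [i for i, lab in enumerate(labels) if lab == g]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p(dist, labels, n_perm=999, seed=0):
    """Observed pseudo-F and its p-value from random label permutations."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            count += 1
    # The observed arrangement counts as one of the permutations.
    return observed, (count + 1) / (n_perm + 1)


# Made-up univariate data: two clearly separated groups of four samples.
values = [1.0, 2.0, 3.0, 4.0, 10.0, 11.0, 12.0, 13.0]
labels = ["A"] * 4 + ["B"] * 4
dist = [[abs(x - y) for y in values] for x in values]

observed_f, p_value = permutation_p(dist, labels)
print(observed_f, p_value)
```

When the labels are shuffled, the pseudo-F values mostly fall near 1, so an observed value far above that range is rarely matched and the permutation p-value is small.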
I think I understand the foundation. In my case, I analysed two samples and obtained the following results: Sample 1: F = 7.9418 and P = 0.001. Sample 2: F = 4.8779 and P = 0.085. So my analysis tells me that they have no significant differences. Is that right?
Thanks and regards
Thanks Keith - I'll give that a go!
Thanks for nice demonstration.
ANOSIM video now available: see link in description.
Thank you! Very useful
Does anyone have PRIMER 7?
I'll try.
Hola