Finally AV plots are clear to me! Thank you very much!
This video is great! Explanations are very clear and helpful
Thank you so much! Clearly explained! Logical and concise
Glad it helped!
You are a good teacher and you know your stuff. Find another way to present your material; this one is tiring. Good effort, and thank you.
Thanks for the comment.
Nice video 👍👍👍
You are welcome!
Thanks for your teaching! It is really helpful!
You are welcome.
Excellent explanation
Glad it was helpful!
Hi, thank you! Why are the variables centered in the plots?
The variables are centered because the plot is based on residuals. Residuals are centered by definition because otherwise the intercepts would not be identified.
@@mronkko thank you very much (once more)! So, I'm used to plotting a scatter plot for my canonical correlations using the code I've posted in the other comment (basically, using stat_smooth with lm, putting my two variables as X and Y). Is it possible to plot a partial correlation maintaining the original variables' scale? Or is the residuals' plot the only alternative for this visualization?
@@larissacury7714 The added variable plot maintains the scale but centers the data. If you want to center the data differently, you can recenter around the original means. However, I would not do that myself because the data in an AV plot are residuals, not the original variables. If you recenter around the original means, some readers might be confused and think they are looking at the original data.
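For instance, recentring is just adding the original means back to the residuals. A minimal sketch using the Prestige data from the car package (assuming a model of prestige on education controlling for income):

```r
library(car)  # provides the Prestige data

# Residuals that make up the AV plot for education
res_y <- resid(lm(prestige  ~ income, data = Prestige))
res_x <- resid(lm(education ~ income, data = Prestige))

# Recentre around the original means (not something I would do myself,
# because these are still residuals, not the original variables)
plot(res_x + mean(Prestige$education),
     res_y + mean(Prestige$prestige),
     xlab = "education (recentred residuals)",
     ylab = "prestige (recentred residuals)")
```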
Great explanation and plot. Only a minor recommendation: can you change the color of the independent variables? The current white color makes it very hard to see.
Good point. Unfortunately YouTube does not allow editing published videos, but I will adjust the colors on my slide set so that they are better if I ever re-record this.
Hi, thank you! Is it equivalent to doing a partial correlation plot? I'm having some trouble finding an easy way to plot this partial correlation: pcor.test(data$X1, data$X2, data$Z, method = "spearman") .... Z is the controlled variable, which influences both X1 and X2
The added variable plot and the partial correlation plot differ in scaling. In an AV plot, you plot two residuals. In a partial correlation plot, you standardize the residuals before plotting.
The following R code shows how partial correlation can be calculated with regression and is probably helpful in explaining the idea.
library(ppcor)
N
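A minimal sketch of the idea with simulated data (all variable names here are illustrative):

```r
library(ppcor)

set.seed(1)
N  <- 1000
Z  <- rnorm(N)
X1 <- 0.5 * Z + rnorm(N)
X2 <- 0.5 * Z + rnorm(N)

# Partial correlation with ppcor
pcor.test(X1, X2, Z)$estimate

# The same number from regressions: partial Z out of both
# variables, then correlate the residuals
res1 <- resid(lm(X1 ~ Z))
res2 <- resid(lm(X2 ~ Z))
cor(res1, res2)
```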
@@mronkko Hi, thank you very much! I guess I was able to reproduce it with my code (I'm adding a reproducible dataset on git):
library(ppcor)
library(RVAideMemoire) ## alternative corr function
library(tidyverse)
### filter Nas (pairwise)
data1 <- data %>%
  filter(!is.na(X1) & !is.na(X2))
### run partial corr:
# rho = 0.096 (Pearson)
# rho = 0.11 (Spearman)
### Question: how can I plot the Spearman correlation?
pcor.test(data1$X1, data1$X2, data1$Z, method = "spearman")
### I want to see the corr between X1 and X2 controlling for Z:
# m1
Thank you so much for the great explanation. Do you have any additional references that I can consult?
@@mronkko Thanks for your answer.
I did it using R and it works very well.
I meant that I would like to show the plots in my paper, because I want to show the effect of each variable in my model. So I'm looking for some references to cite.
Thank you my good sir, great video
Glad to help
Thank you very very much!
You are welcome
thank you!
You're welcome!
Hey, in the video the intercept of the regression of residuals is shown to be 0, but it doesn't look like 0 in the plots. Why is that?
Are you sure you are interpreting the plot correctly? The intercept tells the value of the residual of prestige when the residual of education is zero. At least to me, it looks like that value is zero in the plot.
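You can also check this numerically. A sketch with the Prestige data from the car package (assuming the model in the video regresses prestige on education, controlling for income): both residual vectors have mean zero, so the intercept of the residual-on-residual regression is zero up to floating-point error.

```r
library(car)  # provides the Prestige data

res_y <- resid(lm(prestige  ~ income, data = Prestige))
res_x <- resid(lm(education ~ income, data = Prestige))

# Intercept of the residual-on-residual regression
coef(lm(res_y ~ res_x))["(Intercept)"]
```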
@@mronkko Oh yeah, my bad, thanks a lot. This video has been incredibly helpful.
If only there was an option to give 1000 likes.... Thank you! :D
You always have the option to create 1000 YouTube accounts ;)
@@mronkko Haha. But seriously, thank you so much. I was struggling to understand this for the past two days. The Venn diagram you used made it so clear! God bless you, sir!
I guess that your video has shed light on a problem I've been stuck on all week! Let me explain it briefly (it's on Cross Validated, but YouTube deletes my comments with links), basically:
I have a problem in which: (this one is different from the previous comment)
I have two different "Zs"/third variables. The idea is the following: I want to correlate X1 and X2, but I know that Z1 is extremely correlated to X1 and that Z2 is extremely correlated to X2.
So I'm not assuming that Z1 and Z2 are correlated to both X1 and X2 (as it would be via a partial correlation, right?), and I'm not assuming that Z1 and Z2 only affect X1 or X2 (as it would be for a semi-partial).
To put it simpler:
Z1 is correlated to X1
Z2 is correlated to X2
I want to see if X1 and X2 are correlated controlling for the effects of Z1 on X1 and of Z2 on X2
Hence, if I apply the same logic, I guess it would be all right:
res1 = the portion of X1 that is not explained by Z1
res2 = the portion of X2 that is not explained by Z2
so,
cor(res1, res2) => the correlation between X1 (accounting for Z1's effect on it) and X2 (accounting for Z2's effect on it), right?
Question: I'm assuming this would be fine for a Pearson's correlation, but my data is non-parametric. Would it be okay to apply a Spearman's correlation cor(res1, res2) ?
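In code, what I mean is something like this (simulated data; all names are illustrative):

```r
set.seed(1)
n  <- 200
Z1 <- rnorm(n)
Z2 <- rnorm(n)
X1 <- Z1 + rnorm(n)
X2 <- Z2 + rnorm(n)

res1 <- resid(lm(X1 ~ Z1))  # part of X1 not explained by Z1
res2 <- resid(lm(X2 ~ Z2))  # part of X2 not explained by Z2

cor(res1, res2, method = "spearman")
```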
I do not understand what it means that your "data is non-parametric". Non-parametric is a term used for models that describe or summarise data, not for the data themselves. I assume the data are ordinal. If yes, the question becomes: what harm is there in treating them as metric?
@@mronkko , thank you again, let me give you some background: I was doing canonical correlations between X1 and X2, and I was pretty happy with that. But then I noticed that X1 is extremely correlated to a third variable Z1 and that X2 is extremely correlated to a third variable Z2. The problem is that neither a partial correlation nor a semi-partial correlation seemed to account for this issue, because I'm not assuming that Z1/Z2 influences BOTH X1 and X2 (partial) nor one or the other (semi); rather, they influence X1 and X2 respectively. And then I got stuck on how to account for this. What I mean by non-parametric is that X1 and X2 follow a non-normal distribution (Shapiro-Wilk test, histogram). So I was running canonical Spearman correlations between X1 and X2. Does that make sense now? After watching your video and discussing my other problem, I started to guess that using the residuals may account for this "respective" influence of Z1 and Z2... ps: all data are continuous
@@mronkko , may I send you an email?
@@larissacury7714 Unfortunately I have so many things on my plate now that I cannot give you much more guidance. (I need to do 7 reviews and I have 5 paper revisions to write, so time is really tight.)
@@larissacury7714 As general advice, if you want to work with linear models: 1) do a regression; 2) do diagnostics and think through the regression assumptions. Only use something other than regression if you know that 1) a specific regression assumption fails (MLR1-MLR6, see my video on assumptions and diagnostics) and 2) the consequences of that violation have substantial and important implications for your research question. (Correlation and canonical correlation are both linear.)
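In R, the default regression diagnostics are one line. A sketch with simulated data (variable names are placeholders for your own):

```r
set.seed(1)
n  <- 200
Z1 <- rnorm(n)
Z2 <- rnorm(n)
X1 <- Z1 + Z2 + rnorm(n)

fit <- lm(X1 ~ Z1 + Z2)  # step 1: the regression

# Step 2: diagnostics - residuals vs fitted, Q-Q plot,
# scale-location, and residuals vs leverage
par(mfrow = c(2, 2))
plot(fit)

summary(fit)
```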
Thank you so much!
You are welcome.