Granger causality (prediction)

  • Published 20 Dec 2019
  • This video lesson is part of a complete course on neuroscience time series analyses.
    The full course includes
    - over 47 hours of video instruction
    - lots and lots of MATLAB exercises and problem sets
    - access to a dedicated Q&A forum.
    You can find out more here:
    www.udemy.com/course/solved-c...
    For more online courses about programming, data analysis, linear algebra, and statistics, see
    sincxpress.com/
  • Science & Technology

COMMENTS • 68

  • @TheFk042 · 2 years ago · +9

    Wow, I am amazed by your comprehensible explanations. Thanks a lot for that! You really can get people hooked on statistics :-)

  • @kropinskipawel · 2 years ago · +2

    One of the best explanations of time series I've heard so far - thanks for sharing

  • @HumaninWM · 1 year ago

    Hi Mike, it really helped me a lot to get used to the terms and concepts! Thanks!

  • @pranjalgarg702 · 1 year ago

    I cannot believe something so descriptive on the topic exists. You are awesome!!

  • @lucydauphin4487 · 2 years ago

    Thanks a lot for this great content, super helpful!

  • @user-xc9ih8gv4h · 1 year ago

    This is by far the best presentation of what Granger causality is that I’ve come across. Bravo

    • @mikexcohen1 · 1 year ago

      Thank you, kind internet stranger.

  • @pedram4967 · 4 years ago

    your explanation was great, thank you

  • @bonavich24 · 4 years ago

    Very nice lecture. Thank You!

  • @joseguzman9170 · 3 years ago · +1

    The clearest explanation I've ever seen. I was looking for the thumbs up (like) x100 button, thanks bro!

    • @mikexcohen1 · 3 years ago

      Thanks Jose. I don't do it for the likes, so I'm happy with just one :P

  • @CamiloCastelblancoRiveros · 1 year ago

    You make things look so easy and fun! Thanks again Mike :)

  • @tatendamagodora584 · 4 years ago

    Well explained, thank you

  • @nikitrianta9896 · 10 months ago

    Thank you very much for the effort you put into this video. Thanks to you, I finally understand what Granger causality is.

  • @jimk.6379 · 4 years ago

    Great video. Thank you for that! :-)

  • @SantoshKumar-gf4kg · 4 years ago

    Great! Sir!

  • @jindan9874 · 4 years ago

    nice lecture

  • @josephkomolgorov651 · 3 years ago

    Impressive explanation of all concepts! Thank you for sharing these! But I am still curious: do you have any recommended Python software for implementing connectivity methods?

    • @mikexcohen1 · 3 years ago

      In general, I recommend using whatever language most people in your field use. I use and teach Python quite a bit, but for neuroscience analyses, I stick to MATLAB. Anyway, my guess is that a Python implementation of Granger causality exists somewhere, but I've never looked for it. That said, I suspect the algorithms are better developed in MATLAB.

  • @esquire9445 · 9 months ago

    The warning at the end about spectral Granger causality might have saved me. I was going to do that with wavelets on a project I’m working on. I now no longer have a plan for how to do it right. Haha.

  • @sumanachetia8914 · 1 year ago

    Thank you! Really helpful!

  • @SocialPrime · 3 years ago

    Oh man I suck at math (even more when it is explained in English which is not my native language), but this was great and very easy to understand, thank you very much. I subscribed.

    • @mikexcohen1 · 3 years ago

      Nice, I'm glad you found it useful.

  • @alidanandehmehr7147 · 2 years ago

    Great explanation. I have a question as well. Shall both the x and y time series be stationary? I mean, if the series are stationary in the selected window, can we apply the bivariate model to non-stationary series?

    • @mikexcohen1 · 2 years ago · +1

      Yes, definitely the data should be stationary within the window. There are some tests to assess stationarity, but in practice with large datasets, it's common to apply a few strategies like detrending, z-scoring, and using smaller time windows where data are more likely to be stationary. It's also wise not to interpret GC in time windows when you know the data are highly non-stationary.
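
      A minimal sketch of that per-window workflow (detrend, z-score, then check stationarity), written in Python with scipy/statsmodels rather than the course's MATLAB; the signal and window length here are made up for illustration:

      ```python
      import numpy as np
      from scipy.signal import detrend
      from statsmodels.tsa.stattools import adfuller

      rng = np.random.default_rng(0)
      # made-up signal: a slow drift plus noise
      x = 0.01 * np.cumsum(rng.normal(size=1000)) + rng.normal(size=1000)

      winlen = 200  # hypothetical window length, in samples

      for start in range(0, len(x) - winlen + 1, winlen):
          seg = detrend(x[start:start + winlen])    # remove the linear trend in this window
          seg = (seg - seg.mean()) / seg.std()      # z-score within the window
          pval = adfuller(seg)[1]                   # augmented Dickey-Fuller test of stationarity
          print(f"window {start}-{start + winlen}: ADF p = {pval:.3f} (small p suggests stationary)")
      ```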

  • @phdnida989 · 2 years ago

    Thanks Prof!

  • @TheBlackjovi · 4 years ago · +1

    Dude!!! This is amazing, thank you for such a good lecture

  • @kanejiang2938 · 1 year ago

    Mike, come on. We need more amazing videos for the world like this one.

    • @mikexcohen1 · 1 year ago

      Thank you kindly, Kane. More coming out!

  • @FreeMarketSwine · 7 months ago

    Is there a way to convert a Bayesian Information Criterion value to a p-value? Or maybe modify an existing Z-score or p-value for the number of parameters?

    • @mikexcohen1 · 7 months ago

      Interesting question. Not directly that I know of. The BIC values themselves are entirely dependent on the scale and units of the data, so there is no universal H0 distribution to compare against. I suppose you could generate a p-value empirically based on permutation testing (shuffling) to get an empirical H0 distribution. All that said, BIC is generally used to compare various models, not to evaluate the significance of one model.
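
      To make the permutation idea concrete, here is a rough sketch (my own illustration, not anything from the video): compute the BIC difference between a full and a reduced regression model, then shuffle the predictor many times to build an empirical null distribution for that difference. All variable names and data are hypothetical; it uses Python's statsmodels.

      ```python
      import numpy as np
      import statsmodels.api as sm

      rng = np.random.default_rng(1)
      # hypothetical data: y is partially predicted by x
      x = rng.normal(size=500)
      y = 0.4 * x + rng.normal(size=500)

      def delta_bic(x, y):
          """BIC(reduced) - BIC(full); larger values favour the full model."""
          full = sm.OLS(y, sm.add_constant(x)).fit()
          reduced = sm.OLS(y, np.ones_like(y)).fit()   # intercept-only model
          return reduced.bic - full.bic

      observed = delta_bic(x, y)
      # shuffling x destroys the x-y relationship but keeps the scale/units of the data
      null = np.array([delta_bic(rng.permutation(x), y) for _ in range(1000)])
      p_empirical = (np.sum(null >= observed) + 1) / (null.size + 1)
      print(f"observed dBIC = {observed:.1f}, empirical p = {p_empirical:.4f}")
      ```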

  • @jyanguas3251 · 1 year ago

    About spectral Granger causality, you said that the way to compute it is using a Fourier transform over the autoregressive coefficients. Does that mean we can use the STFT to get a time-frequency plot of Granger causality?
    Thank you for the explanation

    • @mikexcohen1 · 1 year ago

      Hi J. That's kind of correct. It's not formally a Fourier transform; it is a decomposition of the autoregressive coefficients that is conceptually similar to the Fourier transform, in that it involves computing the dot product between the coefficient time series and a transfer function that contains complex-valued sine waves.
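
      As a concrete (if oversimplified) illustration of evaluating AR coefficients against complex sine waves, here is a univariate sketch in Python; the coefficients and sampling rate are made up, and this is the ordinary AR spectrum, not the full multivariate spectral-GC machinery:

      ```python
      import numpy as np

      fs = 100                       # made-up sampling rate (Hz)
      a = np.array([1.3, -0.75])     # hypothetical AR(2) coefficients
      sigma2 = 1.0                   # innovation (residual) variance

      freqs = np.linspace(0, fs / 2, 200)
      k = np.arange(1, len(a) + 1)
      # A(f) = 1 - sum_k a_k * exp(-2*pi*i*f*k/fs): the coefficients dotted with complex sine waves
      A = 1 - np.exp(-2j * np.pi * np.outer(freqs / fs, k)) @ a
      spectrum = sigma2 / np.abs(A) ** 2
      print(f"spectral peak of this AR(2) model: {freqs[np.argmax(spectrum)]:.1f} Hz")
      ```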

  • @hamidehesmailpour5288 · 2 years ago

    I was checking chapter 28 of the book "Analyzing Neural Time Series Data" to learn more about time-frequency domain Granger, but unfortunately there was not much, just a few pages, and nothing about non-parametric Granger. I was wondering if you have other videos about these topics? I couldn't find any myself...

    • @mikexcohen1 · 2 years ago

      Hi Hamideh. I don't have any videos about non-parametric Granger, and I think this video is the only one I have about Granger causality. Sorry to disappoint you ;)

    • @hamidehesmailpour5288 · 2 years ago

      @mikexcohen1 Thank you for replying to my comment. May I kindly ask you to answer my other question as well?

  • @hamidehesmailpour5288 · 2 years ago

    Thanks for the useful video. I am confused about model order and would be thankful if you could elaborate on that.
    Around 18:56, you mentioned that in statistics, if we have 2k parameters compared to k parameters, then we are necessarily able to explain more variance in the data and explain the data better. So the model with 2k parameters fits the data better.
    What is k? The number of data points in the past of one of the signals.
    On the other hand, around 25:22, when you are talking about the model order k and its pros/cons, you say that a higher-order model has more AR parameters to estimate, which can lead to poor model estimation.
    My impression is that these two statements contradict each other: a higher k should necessarily lead to a better fit and better estimation, yet having more parameters leads to poor estimation!
    Could you please clarify?

    • @mikexcohen1 · 2 years ago · +1

      Hi. Yes, 'k' here is the total number of parameters estimated by the model. The fact that more parameters means more variance explained comes from statistics, and is related to overfitting data. I see what you mean about the apparent discrepancy, but explaining more variance in the data does not mean that the individual parameters are more accurate; it means the model overall captures more variance. What we care about with GC (and AR modeling more generally) is the quality of the estimated coefficients. Each coefficient is more accurately estimated when there is a sufficient amount of high-quality data. Furthermore, we don't want to overfit the data, so having fewer parameters is preferred over having more parameters.
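
      In practice this trade-off (more variance explained vs. less reliably estimated coefficients) is usually handled by an information criterion that penalizes the parameter count. A hypothetical sketch of picking the AR order by BIC in Python (illustration only, not the course code):

      ```python
      import numpy as np
      from statsmodels.tsa.ar_model import AutoReg

      rng = np.random.default_rng(2)
      # simulate a made-up AR(3) process
      x = np.zeros(2000)
      for t in range(3, len(x)):
          x[t] = 0.5 * x[t-1] - 0.3 * x[t-2] + 0.2 * x[t-3] + rng.normal()

      # BIC rewards goodness of fit but penalizes every extra parameter
      bics = {p: AutoReg(x, lags=p).fit().bic for p in range(1, 16)}
      best_order = min(bics, key=bics.get)
      print("BIC-preferred model order:", best_order)
      ```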

    • @hamidehesmailpour5288 · 2 years ago

      @mikexcohen1 Thanks for your explanation.

  • @bokkieyeung504 · 3 years ago

    At 25:32, when you talked about model order, why is it that the higher the order (more parameters in the model), the poorer the model estimation? But in general, as you mentioned earlier in this video, a model can almost always be fit better if we add parameters, even if they're not very useful.

    • @PedroNariyoshi · 3 years ago · +1

      A larger number of parameters to estimate leads to larger errors in the parameter estimation. This can lead to overfitting and bias in the variance estimates.

    • @mikexcohen1 · 2 years ago

      Exactly, thanks for the clear reply.

  • @mettikhoramshahi · 3 years ago

    Another very important parameter is the sampling rate of the signals.

  • @sherifffruitfly · 3 years ago

    Suppose I sell a bunch of different types of cars. This gives me a sales time series for each car type. I want to know which car types Granger-cause which others. Does this lecture mean I just run the GC process you describe for all combinations of predictor car types vs. target car types, and see what the biggest GC numbers are?

    • @mikexcohen1 · 3 years ago · +1

      huh, interesting application. You can run GC on any multivariable time series. But you have to be careful with the interpretation and using the term "causal". That's why I prefer the term "prediction."
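
      A minimal sketch of that all-pairs loop using statsmodels' grangercausalitytests in Python (the car types and sales numbers are invented, and, per Mike's point, the output should be read as prediction rather than causation):

      ```python
      import numpy as np
      import pandas as pd
      from statsmodels.tsa.stattools import grangercausalitytests

      rng = np.random.default_rng(3)
      n = 300
      # made-up weekly sales; 'sedan' is partly predicted by last week's 'suv' sales
      sales = pd.DataFrame({
          "sedan": rng.normal(size=n),
          "suv":   rng.normal(size=n),
          "truck": rng.normal(size=n),
      })
      sales["sedan"] += 0.5 * sales["suv"].shift(1).fillna(0)

      order = 2
      for target in sales.columns:
          for predictor in sales.columns:
              if target == predictor:
                  continue
              # tests whether the second column helps predict the first column
              res = grangercausalitytests(sales[[target, predictor]], maxlag=order)
              pval = res[order][0]["ssr_ftest"][1]
              print(f"{predictor} -> {target}: p = {pval:.4f}")
      ```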

    • @sherifffruitfly · 3 years ago

      @mikexcohen1 Was just making sure I understood what was going on with a made-up use case. :) Another thing: those single-variable linear models - those aren't required, are they? I mean, you could have ANY model g(time-lagged data points) and still play the same game with the log-variance-ratio number, right? Or am I missing a condition that must be fulfilled which forces linear models?

    • @mikexcohen1 · 3 years ago

      Mathematically, yes. But for the correct interpretation, one model needs to be a subset of another. So, you have the "full" model and the "reduced" model, and the reduced model is the same as the full model but with one or a few parameters missing. If the reduced model has a different set of parameters (i.e., it's a different model), then you're comparing apples to oranges.
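
      To spell out the full-versus-reduced comparison in the nested, linear case, here is a bare-bones sketch (my own Python illustration; the data and variable names are made up): the reduced model predicts x from its own past, the full model adds y's past, and GC is the log ratio of the residual variances.

      ```python
      import numpy as np

      def granger_y_to_x(x, y, order):
          """Log ratio of residual variances: reduced model (x's past) vs. full model (x's and y's past)."""
          n = len(x)
          target = x[order:]
          # lagged copies of x form the reduced design matrix
          Xr = np.column_stack([x[k:n - order + k] for k in range(order)])
          # the full model appends lagged copies of y, so the reduced model is nested inside it
          Xf = np.column_stack([Xr] + [y[k:n - order + k] for k in range(order)])
          resid_r = target - Xr @ np.linalg.lstsq(Xr, target, rcond=None)[0]
          resid_f = target - Xf @ np.linalg.lstsq(Xf, target, rcond=None)[0]
          return np.log(resid_r.var() / resid_f.var())

      rng = np.random.default_rng(4)
      y = rng.normal(size=5000)
      x = 0.7 * np.r_[0, y[:-1]] + rng.normal(size=5000)   # x is partly driven by y's past

      print("y -> x:", granger_y_to_x(x, y, order=3))   # clearly above zero
      print("x -> y:", granger_y_to_x(y, x, order=3))   # near zero
      ```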

  • @user-se5kp6lu5d · 1 year ago

    Thanks so much. Amazing video. Would you please share code for implementing Granger causality in MATLAB?

    • @mikexcohen1 · 1 year ago

      I have this code as part of my ANTS book and my online signal-processing courses. You can find it on my GitHub site.

    • @user-se5kp6lu5d · 1 year ago

      @mikexcohen1 Thank you for your prompt reply. I couldn't find the code; would you please share the link?

    • @mikexcohen1 · 1 year ago

      It's this repository: github.com/mikexcohen/AnalyzingNeuralTimeSeries
      I don't remember which chapter; it's been too long since I've read my own book, lol.

  • @bokkieyeung504 · 3 years ago

    Sorry, I'm going to ask 2 questions about the previous lecture (two methods of power-based connectivity), whose comment function is disabled: 1) For the amplitude envelope correlations, you seem to mention that the envelope time series is obtained by time-frequency analysis like wavelet convolution, etc., but I think that will give us a different amplitude/power time series for each frequency. So is the envelope X / envelope Y in your plot a sort of averaged amplitude/power time series or a frequency-specific one? 2) For the trial-to-trial power coupling, why do you use "Spearman" instead of "Pearson" for the correlation analysis (shown in the code)? Does that mean "connectivity" refers to a monotonic pattern regardless of the exact strength of correlation?

    • @mikexcohen1 · 3 years ago

      Thanks for pointing that out -- the comments were disabled by accident (the video was listed as "appropriate for kids", which auto-disables the comments). It's fixed now -- apologies to all the kids who were trying to learn about narrowband power time series correlations as an index of brain functional connectivity...
      1) Yes, the time series and their temporal characteristics may differ across frequency bands, though this is not trivially the case. The relevant source paper is pubmed #10841367 if you want to read about this method in more detail.
      2) You can also use Pearson. Raw, single-trial power values can have large outliers, so either a Spearman correlation or a robust regression analysis is recommended. That said, Spearman correlation is still a measure of the strength of the relationship; it just doesn't distinguish between linear and nonlinear-monotonic relationships. Of course, you can fit any other model to the data that is appropriate for your hypothesis; correlations are useful because they are simple, robust, fast, and require no parameters.
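
      A tiny illustration of the outlier point (made-up power values), comparing the two correlations with Python's scipy:

      ```python
      import numpy as np
      from scipy.stats import pearsonr, spearmanr

      rng = np.random.default_rng(5)
      # made-up single-trial power values from two channels, positively related
      a = rng.gamma(2.0, size=100)
      b = a + rng.normal(scale=0.5, size=100)
      a[0] = 50   # one huge power value, as can happen with raw single-trial power

      print("Pearson :", pearsonr(a, b)[0])    # pulled around by the single outlier
      print("Spearman:", spearmanr(a, b)[0])   # rank-based, largely unaffected
      ```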

    • @bokkieyeung504 · 3 years ago

      @mikexcohen1 Thanks. Will keep watching your videos ~ I'm still a "kid" in this field :)

  • @younique9710 · 2 years ago

    I do not quite understand why the window is different from the order. It seems like they mean the same thing. How is the order different from the window? Maybe the reason we separate the order from the window is for the case where the order parameter is taken at every second or third sample (e.g., order 5 = 1, 3, 5, 7, 9 or 1, 4, 7, 10, 13) rather than at every sample (1, 2, 3, 4, 5)?

    • @mikexcohen1 · 2 years ago

      The order is how many time points are parameterized by the model, whereas the window is the total amount of data that the model has access to when fitting the parameters. For example, if the order=1 and the window size is 100, then the model fits one parameter using 99 data values. If the order=1 and the window size is 5, then the model is trying to fit the same parameter, but has only 4 data values to use. Hence, a larger window size will lead to more reliable estimates of the autoregression parameter.
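
      A quick numerical illustration of that last point (all numbers hypothetical): estimate the same AR(1) coefficient from many short windows versus many long windows and compare how much the estimates scatter.

      ```python
      import numpy as np

      rng = np.random.default_rng(6)

      def ar1_estimate(seg):
          """Least-squares estimate of the single AR(1) coefficient within one window."""
          return np.linalg.lstsq(seg[:-1, None], seg[1:], rcond=None)[0][0]

      # simulate a long AR(1) series with true coefficient 0.6
      n = 20000
      x = np.zeros(n)
      for t in range(1, n):
          x[t] = 0.6 * x[t-1] + rng.normal()

      for winlen in (5, 100):
          estimates = [ar1_estimate(x[s:s + winlen]) for s in range(0, n - winlen, winlen)]
          print(f"window = {winlen:3d} samples: mean = {np.mean(estimates):.3f}, "
                f"spread across windows (sd) = {np.std(estimates):.3f}")
      ```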

    • @younique9710 · 2 years ago

      @mikexcohen1 I understood (order = lag). Thank you!