You can find the spreadsheets for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7
Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation
Hey man, great!!!! Was waiting for this topic! Apart from ML and R-squared, this is the most important topic for the selection of variables. There are two additional versions of this, AICc and HQIC, which I am sure you will show us in the future, Sava!
Hi, and glad you enjoyed the video! Thanks for the suggestion, will definitely cover the Hannan-Quinn and the corrected Akaike criteria in a future video.
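For readers following the thread, here is a minimal sketch of how the four criteria mentioned above relate to each other for a least-squares model. It assumes Gaussian errors and drops the additive constant shared by all four, so the numbers may differ by a constant from those in the video's spreadsheet. The function name `information_criteria` and the input values are made up for illustration.

```python
import math

def information_criteria(rss, n, k):
    """Return AIC, AICc, BIC and HQIC for a least-squares model.

    rss: residual sum of squares
    n:   number of observations
    k:   number of estimated parameters (including the constant)
    Additive constants common to all four criteria are dropped.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    fit = n * math.log(rss / n)  # -2 * log-likelihood, up to a constant
    aic = fit + 2 * k
    return {
        "AIC": aic,
        # AICc adds a small-sample correction that vanishes as n grows
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
        # BIC (also called SIC) penalizes each parameter by ln(n)
        "BIC": fit + k * math.log(n),
        # Hannan-Quinn penalizes each parameter by 2 * ln(ln(n))
        "HQIC": fit + 2 * k * math.log(math.log(n)),
    }

# Hypothetical inputs, for illustration only
crit = information_criteria(rss=12.5, n=100, k=3)
for name, value in crit.items():
    print(f"{name}: {value:.3f}")
```

In each case the model with the lowest value is preferred. The criteria differ only in how heavily they penalize extra parameters.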
@@NEDLeducation definitely man, i am always waiting for your next upload!
Excellent! Thanks Savva!
Sava for President
Thanks for the video, super helpful! I've been struggling to find a decent way to determine the optimal # of observations in a linear regression. Sure, there are plenty of tools to help with selecting the correct # of variables (RMSE, adjusted R-squared, predicted R-squared, Mallows' Cp, etc.). However, as I understand it, NONE of those really provides a decent apples-to-apples way to compare different models with different # of observations.
Given that understanding, is it correct to apply the AIC/BIC here, similar to how you have constructed it (removing 1 observation, then re-running it), to select the model with the optimal # of observations? Then of course, you could ensure that it passes the rule of thumb test (minimum # of observations must be at least equal to 10x the # of independent variables).
Lastly, I had heard that a topic called "Statistical Distance" or "Cosine Distance" is helpful in selecting the best model, given differing # of observations. Are you familiar with those topics, and do you agree/disagree?
Apologies for the long note, really loving all your vids.
Hi! Many thanks for the video!
What if AIC and SIC suggest using different lag lengths? Is there any criterion for deciding which one to use?
Thanks! But what about ACF and PACF? Aren't they also important to see whether AR or MA is appropriate and also to see how many lags to include?
Hi, and thanks for the question! Yes, it is important if you are using a Box-Jenkins heuristic for model selection. I would say Box-Jenkins is most useful to select AR versus MA, and then information criteria can be applied to determine how many lags one should use. Alternatively, you can check whether an AR model is stable and when it is not, use MA instead. By the way, a video on Hannan-Quinn and AICc is in the pipeline and will be released soon so stay tuned!
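As a rough illustration of the second step in this reply (using information criteria to pick the number of lags), here is a pure-Python sketch. It is not the spreadsheet method from the video. It assumes AR(p) models with a constant fitted by OLS and uses simulated AR(2) data with made-up coefficients. The helper names `solve`, `fit_ar` and `select_lag` are hypothetical. All candidate models are fitted on the same observations, so that the criteria are comparable across lag lengths.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ar(y, p, max_lag):
    """OLS fit of AR(p) with a constant on the sample t = max_lag .. len(y) - 1."""
    rows = [[1.0] + [y[t - j] for j in range(1, p + 1)]
            for t in range(max_lag, len(y))]
    target = y[max_lag:]
    k = p + 1
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((v - sum(b * x for b, x in zip(beta, r))) ** 2
              for r, v in zip(rows, target))
    return beta, rss, len(rows)

def select_lag(y, max_lag):
    """Return {p: (AIC, BIC)} plus the lag lengths minimizing each criterion."""
    table = {}
    for p in range(1, max_lag + 1):
        _, rss, n = fit_ar(y, p, max_lag)
        k = p + 1  # p lag coefficients plus the constant
        fit = n * math.log(rss / n)
        table[p] = (fit + 2 * k, fit + k * math.log(n))
    best_aic = min(table, key=lambda p: table[p][0])
    best_bic = min(table, key=lambda p: table[p][1])
    return table, best_aic, best_bic

# Simulated AR(2) series; the coefficients 0.6 and -0.3 are arbitrary
random.seed(0)
y = [0.0, 0.0]
for _ in range(500):
    y.append(0.6 * y[-1] - 0.3 * y[-2] + random.gauss(0, 1))

table, best_aic, best_bic = select_lag(y, max_lag=6)
print("lag chosen by AIC:", best_aic, "| lag chosen by BIC:", best_bic)
```

Because BIC penalizes each extra lag more heavily than AIC once the sample has more than about 8 observations, the lag it selects is never longer than the one AIC selects.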
@@NEDLeducation Thank you so much. I really hope that you will soon share a vid on how to implement ARMA. It would be great if you could also explain how to check the predictive power of AR vs. MA vs. ARMA and then show us how to choose between these models.
Hey Sava, after some research of my own, what do you think about the following "recipe", where I try to combine some of your videos? First of all, one should check whether the time series under investigation is stationary, for instance using the Dickey-Fuller test. If it is stationary, then one should estimate some AR models and determine the appropriate number of lags by checking the AIC or BIC. After that, one should check whether the residuals of the AR model are autocorrelated, for instance by applying the Durbin-Watson test from multiple linear regression. If there is significant autocorrelation, then it could be a good idea to model these "shocks" by expanding the simple AR model to an ARMA model. If, in the first place, the Dickey-Fuller test shows that the time series is non-stationary, then modeling it with AR models is not appropriate. Instead, one could use a simple MA model, because these models are stationary by definition, or use an ARIMA model. What do you think? Is that summary correct? Thanks so much
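The residual check in this recipe can be sketched as follows. This is only an illustration of the Durbin-Watson statistic itself, computed on made-up residual series rather than on residuals from a fitted model. One caveat worth keeping in mind: the statistic is known to be biased toward 2 when the regressors include lags of the dependent variable, as they do in an AR model, so it is best read as a rough indication.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: squared first differences over squared residuals.

    Values near 2 suggest no first-order autocorrelation, values toward 0
    suggest positive autocorrelation, values toward 4 negative autocorrelation.
    """
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Made-up residual series, for illustration only
alternating = [1.0, -1.0] * 10          # sign flips every period
persistent = [1.0] * 10 + [-1.0] * 10   # long runs of the same sign

print(durbin_watson(alternating))  # well above 2: negative autocorrelation
print(durbin_watson(persistent))   # well below 2: positive autocorrelation
```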
Dear Sava, could you tell me the reason for the small differences between the coefficient values from Excel and EViews? Thank you so much!
Please why did we use a constant value of 1?