5 years later, and you still have the best content I could find on YT. You have no idea how much you are helping me in my final year and what impact you have!
i wish my professor had explained it exactly like u just did
Thank you very much for making a vague concept so clear.
This was extremely helpful!! Between my 3 econometrics textbooks (Griffiths, Greene, and Wooldridge), the information on MA models was sparse. This really cleared up the mindset behind this model!
Never seen a better explanation of MA models. Immediate subscription!
Same here! I knew I would subscribe after 1 minute into the video. Very clear and very useful video. Thank you very much.
Oh damn!! This is wonderful, simplified and explained pretty nicely. Keep spreading your knowledge!!
Thank you! Will do!
God Bless You! I needed a fast way to get some concepts on time series forecasting and you saved me.
Easy, Fast, Complete.
Thank you so much for explaining this so well! My professor and textbook explain this concept very mathematically which is hard to understand for beginners, they should really give a simple example and then dive into the details as you did.
Glad it helped!
I really don't know how to thank you for that great demonstration! I've been trying to understand MA process for years!
This was the best video on MA. The crazy prof made our life easier 😂😂😂
Wow! Great explanation. The professor's example was very intuitive. Thanks for the content!
Thank you Sir. You have a great way of explaining things, something I sadly rarely find from my coding/statistics teachers.
This man's explanation is way better than the profs' at university.
You are spectacularly GOOD in the explanation of the ARIMA! Cheers
I appreciate that!
I was stuck on where the "error" term comes from. Now I know... it is the error from the past. You explained it! I wish you were my professor.
Thank you so much for making this fun video! Makes so much more sense now (after struggling through my not-so-crazy professor's stats class)
It couldn't have been expressed more handsomely! Thanks!
Thank you so much for your very intelligent explanation of this model!!! I felt so confused about this model before.
Thank you so much, I have been reading about this concept in an econometrics book... but this is easy to comprehend.
Glad it was helpful!
So simple yet easy to understand. Thank you!
I was terrified of the mathematical symbols, but you made it so easy to understand! Thank you!
Thanks for existing in this world bro.
So nice of you
Great explanation! I've learned everything that I looked for. Thank you.
Fantastic, got too caught up in the math in my macroeconometrics course and had no idea what these things actually were. Super helpful conceptually
This explanation gives a better understanding of why we need to avoid unit roots in time series predictions.
A year trying to understand this, and I've just needed 15 minutes. Thanks!!
Simple Explanation is a Talent - Thanks for this
ALWAYS GRATEFUL, THANK YOU FOR THE WONDERFUL CONTENT
Simple and clear explanation, thank you !
Finally ❤️ a video with an applicable and relevant example ❤️🙏
I still don't think this makes sense to me: why does incorporating past error somehow give us a better prediction of the future in this case? Since this crazy professor will randomly choose an acceptable # of cupcakes, your past error shouldn't help in predicting the future.
I think the student naively believes the crazy professor will stick to his prior t-1 position (the student is unaware of the professor's craziness)
Everything in time series assumes that you can use past info to predict future info
Even though the professor selects a different number every time, in the end the average is stable. Assume you have a time series of images. Images, due to the unstable environment they're taken in or all the other factors that alter an image's nature, are not always the same, even though they are taken of the same scene. So what is the goal here? To find the mutual information in the images and ignore the noise. That noise is how crazy the professor is, and the importance of the error, which we can handle via its coefficient. By handling these factors, we can get close to recognising the mutual information. Remember, these are unsupervised models. There are no labels to rely on.
OMG, this is brilliant , amazing ,wonderful ,thank you
Finally understood this, thank you so much. Highly recommend!
How is the average moving though? It was fixed for each prediction! Wouldn't it have to be recalculated each time for it to be moving?
Also we didn't seem to use anything related to the error being normally distributed... is there a reason for that? why was it mentioned in the first place?
Exactly right, I am also having the same query: the average is not moving.
Did you find any other source where this is explained clearly?
Many thanks for your clear explanation of the mathematical moving average formula.
of course!
Gemini 1.5 Pro: This video is about the moving average model in time series analysis. The speaker uses a cupcake example to explain the concept.
The moving average model is a statistical method used to forecast future values based on past values. It is a technique commonly used in time series analysis.
The basic idea of the moving average model is to take an average of the past observations. This average is then used as the forecast for the next period. There are different variations of moving average models, and the speaker introduces the concept with the moving average one (MA(1)) model.
In the video, a grad student is used as an example. The grad student needs to bring cupcakes to a professor's dinner party every month. The number of cupcakes the grad student should bring is the forecast. The professor is known to be crazy and will tell the grad student each month by how many cupcakes he was off. This is the error term.
The moving average model is used to adjust the number of cupcakes the grad student brings based on the error term from the previous month. The coefficient is a weight given to the error term. In the example, the coefficient is 0.5, meaning the grad student will adjust the number of cupcakes he brings by half of the error term from the previous month.
For example, if the grad student brings 10 cupcakes in the first month, and the professor says the grad student brought 2 too many, then the grad student will bring 9 cupcakes in the second month (10 cupcakes - 0.5*2 error term).
The video shows how the moving average model works through a table and a graph. The speaker also mentions that there are other variations of moving average models, such as the moving average two (MA(2)) model, which would take into account the error terms from the two previous months.
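If it helps, here is a minimal Python/numpy sketch of the recipe summarized above. The mu = 10 and theta = 0.5 come from the video's example; the simulated "professor verdicts" are just random draws I made up, so treat this as an illustration rather than the video's exact numbers.

```python
import numpy as np

rng = np.random.default_rng(0)

mu, theta = 10.0, 0.5                 # long-run average and MA(1) weight from the example
n_months = 6
errors = rng.normal(0, 1, n_months)   # the professor's monthly verdicts (random shocks)

forecasts, actuals = [], []
prev_error = 0.0                      # no verdict exists before the first month
for t in range(n_months):
    f_hat = mu + theta * prev_error              # cupcakes to bring: mean adjusted by half of last month's error
    f_t = mu + errors[t] + theta * prev_error    # what the MA(1) process actually realizes this month
    forecasts.append(round(f_hat, 2))
    actuals.append(round(f_t, 2))
    prev_error = errors[t]

print(forecasts)
print(actuals)
```

The forecast column hugs the mean of 10, shifted by half of last month's error, while the realized column also carries the new shock.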
How do we know what the "error" is if there is no "true value", given a random realization of the data?
The idea is that you're trying to predict the next value. You get told what the next value is by the professor. If it's random then there is no signal in there & the results are still meaningless.
Great explanation. Keep up the good work!
Explained with the cupcakes it makes perfect sense, thumbs up!
Nice example, super easy to understand the concept!
Great videos, thank you! I have a question. The period 1 value is our mean value, but we don't know what the mean is since we just started from point 0. How do we calculate the residual then? We know the true observation but we don't know the mean. Is it just a guess? But when we use any statistical package, it does not ask us to input a guessed mean value.
perfect explanation. Thank you!
Does the MA model assume e_t (lagged residuals) are pure white noise? Mean = 0, constant variance, and no autocorrelation of the residuals?
Hi, great explanation! One question: how do you guess the mu value (the average number of cupcakes you bring) for the first time?
Let's use an example that is slightly more natural to us -- so here's this crazy professor. :D
Great video. I think the calculation of the 3rd row is wrong. It should've been 9+0.5 = 9.5
No... the constant term is 10, not 9.
Thanks man. You're doing a superb job.
I love this video, so simple but effective
Exceptionally useful videos for actuarial exams. Thanks for helping me pass🙂(hopefully)
Had I watched your series earlier, it would have saved me $3000 :(
Thank you for the video. How should we choose the 0.5 coefficient in front of the error term from the last period in the regression model?
Great video! Thanks for sharing!
Great Presentation...
Glad you liked it!
Thanks!!! Perfect explanation :)
this is really helpful and so easy to understand!!!
Great explanation! Third row shouldn't it be 9.5 rather than 10.5?
No, 10+1/2=10.5
@@wenzhang5879 Yeah, got it. Thanks
Thank you very much! Such a clear explanation!
How come some MA(1) formulas have x_t = mu + (phi1) error_t + (phi2) error_(t-1)...? If you're predicting at time t, then how would you know the error at time t (error_t)? Why are some formulas like this?
Extremely well explained
Excellent explanation
LOVE IT. Thank you.
Of course!
Where does the noise in the equation come from? In our data we only have time on the x-axis and Y as the target variable. There is no error term. What I mean to ask is: does the MA model first regress y on its lagged terms, like the AR model, and then calculate the error between the actual and predicted y values? Then regress y against the calculated error terms (residuals)?
The error is white noise coming from random shocks whose distribution is i.i.d. ~ (0, 1). Fitting the MA estimates is more complicated than it is in autoregressive (AR) models, because the lagged error terms are not observable. This means that iterative non-linear fitting procedures need to be used in place of linear least squares. Hope this helps :).
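To make the "lagged error terms are not observable" point concrete, here is a rough Python sketch. The grid search is just a stand-in for the iterative optimizers real packages use, and all names and numbers are my own illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
mu_true, theta_true, n = 10.0, 0.5, 500
e = rng.normal(0, 1, n)
y = mu_true + e + theta_true * np.concatenate(([0.0], e[:-1]))  # simulated MA(1) series

def css(theta, y):
    """Conditional sum of squares: the residuals must be rebuilt recursively for each candidate theta."""
    mu = y.mean()
    resid = np.zeros_like(y)
    for t in range(len(y)):
        prev = resid[t - 1] if t > 0 else 0.0
        resid[t] = y[t] - mu - theta * prev
    return np.sum(resid ** 2)

grid = np.linspace(-0.95, 0.95, 381)                  # candidate theta values
theta_hat = grid[np.argmin([css(th, y) for th in grid])]
print(theta_hat)                                      # should land close to 0.5
```

Because the residuals depend recursively on theta itself, there is no fixed design matrix of observed regressors, which is exactly why ordinary linear least squares doesn't apply.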
Really good explanation!
Maybe I'm stupid for asking this...
If one was to write an MA filter, how do you determine M?
Brilliant explanation, thank you!
This is a great explanation, but many equations also add the current error (epsilon_t). I just don't get how we are supposed to know our current error if we are trying to forecast a value. Do we simply neglect that current error term when forecasting?
Amazing explanation man
Thanks you so much.
Great video. Do you always start with the mean as your first guess for f hat? Also, how do you fit an MA(q) model?
What is the difference between taking the average of the first 3 values and calculating the centered average at time period 2, versus this method (average + error at time t + error at the previous time period)?
What you are describing is MA smoothing, which is used to describe the trend-cycle of past data
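A tiny Python sketch of that contrast, using the c(10, 9, 10.5, 10, 11) values quoted in another comment (purely illustrative):

```python
import numpy as np

y = np.array([10.0, 9.0, 10.5, 10.0, 11.0])

# MA *smoothing*: a 3-point centered moving average, e.g. the value at period 2
# is the mean of periods 1-3. It summarizes the past trend-cycle; it is not the MA(1) model.
smoothed = np.convolve(y, np.ones(3) / 3, mode="valid")
print(smoothed)   # approximately [9.83, 9.83, 10.5]
```

The smoothed values only describe the past; the MA(1) model in the video instead forecasts the next value as the mean plus a weighted previous error.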
Thank you. Love your video tutorials! Just one question: shouldn't the curve at 5'58'' be f_t? And c(10,9,10.5,10,11) be f_(t-1)?
Why is it that in some models the prediction (f hat) is the average of the previous f values, but in other models it is the errors from previous periods that predict f hat?
I have the same doubt; sometimes he added half of the error to f, and sometimes to f-hat.
Does mu have to be a constant? Can we use a rolling window to calculate the average? Will this yield better predictions?
how do we find the coefficient for the moving average model?
Algorithms use the entire time series to get as close as possible to the true value of the coefficient (often with a maximum likelihood estimator).
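For example, this is roughly how the fit looks with statsmodels in Python; I'm assuming statsmodels is installed, and the exact parameter labels can differ across versions:

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(2)
e = rng.normal(0, 1, 500)
y = 10.0 + e + 0.5 * np.concatenate(([0.0], e[:-1]))   # simulated MA(1) with mu = 10, theta = 0.5

res = ARIMA(y, order=(0, 0, 1)).fit()   # (p, d, q) = (0, 0, 1) is a pure MA(1)
print(res.params)                       # constant, MA(1) coefficient, and innovation variance
print(res.forecast(steps=3))            # step 1 uses the last residual; later steps revert to the mean
```

The one-step-ahead forecast uses the last estimated residual; forecasts two or more steps ahead revert to the estimated mean, which is a general property of an MA(1).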
Hey, amazing content, bravo!
Can you add to that a video talking about the random walk?
That would be great.
Thanks, this is a really clear explanation. My only question is: when you are calculating your f_t column, why are you including the error from the current time period? Shouldn't you only be including the 0.5*e_(t-1)?
Greatly explained!!! Thanks
Hi. The mean of e_t is not 0. For time interval 5, you need to write -1.
So not natural... that is why you are so good at teaching.
Hi... I have one doubt.. shouldn't you have plotted the values for ft^ instead of ft in the graph?
P.S: Thank you for taking the time to make these videos. It's really helpful.
I was about to ask the same thing but I don't think the instructor responds to questions.
@@isabellaexeoulitze6544 yeah... I kinda expected that since it's an old video... nevertheless I commented my doubt, hoping that someone else watching the video might clarify...
Like, he drew the f_t line to show that the time series data is kind of centered around the mean, but I also have a doubt: why didn't he draw the predicted f_t along with the real f_t?
Hello, thanks for this video, but I wonder about \theta_0. Could it be something different from 1?
Amazing explanation
What does it mean when the MA(1) estimated parameter = 1? For AR(1) that would mean there's a unit root. Any particular corollary for MA models?
God Bless you.
how is it possible you can explain this stuff so easily!
Great, now I understand moving averages but I have a sudden craving for cupcakes...
Wonderful example.
thanks!
I really like your videos. They work very well for me, someone without any background in time series. However, this one is somewhat confusing. You are demonstrating the concept of *moving average* with an example where the average stays the same. I get that the estimate moves around, but that is due to the error variance, right? The average itself is not moving anywhere. Both mu and mu_epsilon are assumed to be constant, so what's moving here?
you are just amazing
How is the mean determined?
BTW, it was a great video! Thanks a lot!
How can I use such a model for forecasting?? I can forecast for one day into the future but how about 2 or more days into the future?
thanks! Really helpful
Fantastic!
How do you find the error terms for the last time period in a real-world univariate series?
Are the mean 0 and SD 1 of error_t assumptions?
Thank you❤❤❤
You can see how the crazy professor gets hungrier month by month
Sir please make videos on restricted Boltzmann machine
Well explained ❤
Thank you 🙂
THANK YOU SO MUCH
This looks like exponential smoothing. Please correct me if I'm wrong!
no. not the same.