(Stata16): Heteroskedasticity and Weighted (Generalised) Least Squares
- Published 9 Feb 2025
- @CrunchEconometrix This video explains how to correct heteroskedasticity with weighted (generalised) least squares. The term is coined from the Greek words hetero (different or unequal) and skedastic (spread or scatter). So homoskedasticity means equal spread and heteroskedasticity means unequal spread. The measure of spread is the variance, hence heteroskedasticity deals with unequal variances. The spellings heteroskedasticity and heteroscedasticity mean the same thing; just be consistent. Yes! It is the longest word in the econometrics dictionary, with 18 letters.
One of the assumptions of ordinary least squares (OLS) is that the model must be homoskedastic. This assumption is needed to justify the usual t tests, F tests, and confidence intervals for OLS estimation of the linear regression model, even in large samples. In general, heteroskedasticity is more likely to occur in cross-sectional analysis. This does not imply that heteroskedasticity in time series models is impossible.
What are the causes of heteroskedasticity?
(1) A poor data sampling method may lead to heteroskedasticity, particularly when collecting primary data.
(2) Wrong data transformation. For instance, over-differencing a variable may lead to heteroskedasticity.
(3) Wrong model specification, related to the functional form: log-log, log-level, and level-level models.
(4) The presence of outliers, that is, bogus figures that stand out and are very obvious to the prying eye, can make your model heteroskedastic.
(5) Skewness of one or more regressors (closely related to outliers being evident in the data).
Consequences of heteroskedasticity:
(1) The OLS estimators, β̂_OLS, are still linear, unbiased and consistent. Hence the regression estimates remain unbiased and consistent.
(2) But the estimators, β̂_OLS, are inefficient (that is, they do not have minimum variance) in the class of linear unbiased estimators.
(3) Therefore, OLS is no longer BLUE (Best Linear Unbiased Estimator).
(4) Predictions from the regression are likewise inefficient, though consistent.
(5) The usual OLS standard errors cannot be relied on to construct confidence intervals or to draw inferences.
(6) Heteroskedasticity affects the variances (and standard errors) of the estimated β̂s.
(7) The OLS method often under-estimates the variances (and standard errors).
(8) When it does, the standard errors are too low.
(9) This leads to higher than expected values of the t and F statistics.
(10) Coefficients then appear statistically significant more often than they should.
(11) The null hypothesis is rejected too often.
(12) This raises the risk of Type I error.
(13) Both the t and the F statistics are no longer reliable for hypothesis testing.
Some heteroskedasticity tests are: Breusch-Pagan LM Test; Glejser LM Test; Harvey-Godfrey LM Test; Park LM Test; Goldfeld-Quandt Test; White’s Test; Engle’s ARCH Test; and Koenker-Bassett Test.
Heteroskedasticity can be resolved by: (1) Functional Forms; (2) Generalised (Weighted) Least Squares (GLS/WLS); and (3) White’s Robust Standard Errors.
How to detect heteroskedasticity? The truth is that there is no hard and fast rule for detecting heteroskedasticity. More often than not, suspecting it is a matter of educated guesswork, prior empirical experience or mere speculation. However, informal and formal approaches can be used to detect the presence of heteroskedasticity:
Informal approach: Plot the residuals from the regression against the fitted values of the dependent variable.
Formal approach: Perform econometric tests. There are several tests of heteroskedasticity, each based on certain assumptions. The interested reader may want to consult the references listed at the end of the video.
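The detection and WLS steps described above can be sketched in a few lines. This is a minimal Python illustration using only the standard library, not the Stata commands from the video. The data are simulated, and the sketch assumes the error standard deviation is proportional to the single regressor x (so the weights are 1/x²). The helper names wls, ols, lm_stat and simulate are hypothetical, and the test statistic is the n·R² form of the LM test, taken from regressing the squared residuals on x.

```python
import random


def wls(x, y, w):
    """Weighted least squares of y on a constant and x.

    Minimises sum(w_i * (y_i - a - b * x_i) ** 2); w_i = 1 for all i gives OLS.
    Returns (intercept, slope, unweighted residuals).
    """
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw  # weighted mean of x
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw  # weighted mean of y
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return intercept, slope, resid


def ols(x, y):
    """OLS is the special case of WLS with equal weights."""
    return wls(x, y, [1.0] * len(x))


def r_squared(x, y):
    """R-squared from the OLS regression of y on a constant and x."""
    _, _, resid = ols(x, y)
    my = sum(y) / len(y)
    sst = sum((yi - my) ** 2 for yi in y)
    return 1.0 - sum(e * e for e in resid) / sst


def lm_stat(x, resid):
    """LM = n * R^2 from regressing squared residuals on x.

    Under homoskedasticity it is approximately chi-square with 1 degree of
    freedom (5% critical value about 3.84).
    """
    return len(x) * r_squared(x, [e * e for e in resid])


def simulate(n=500, seed=0):
    """Simulated data (an assumption): y = 10 + 2x + u, with sd(u) = x."""
    rng = random.Random(seed)
    x = [rng.uniform(1.0, 10.0) for _ in range(n)]
    y = [10.0 + 2.0 * xi + xi * rng.gauss(0.0, 1.0) for xi in x]
    return x, y


x, y = simulate()

# Step 1: OLS, then test its residuals for heteroskedasticity.
a_ols, b_ols, e_ols = ols(x, y)
lm_ols = lm_stat(x, e_ols)

# Step 2: WLS with weights 1/x^2, assuming Var(u) is proportional to x^2.
w = [1.0 / xi ** 2 for xi in x]
a_wls, b_wls, e_wls = wls(x, y, w)

# Step 3: re-test using the weighted residuals e / x.
e_weighted = [ei / xi for ei, xi in zip(e_wls, x)]
lm_wls = lm_stat(x, e_weighted)

print(f"OLS: slope={b_ols:.3f}, LM={lm_ols:.1f}")
print(f"WLS: slope={b_wls:.3f}, LM={lm_wls:.1f}")
```

If the weights match the true variance pattern, the LM statistic on the weighted residuals should fall well below its OLS value. With wrongly chosen weights it need not, which is why the choice of the weighting variable matters.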
Link to the A&H_hprice.xlsx data (free) and the do-file (subject to payment): cruncheconomet...
Note: You have to add the item to your cart and check out.
References and Readings: Asteriou, D. and Hall, S. G. (2016) Applied Econometrics, 3rd ed.; Wooldridge, J. M. (1995) Econometric Analysis of Cross Section and Panel Data. London, England: The MIT Press, Cambridge, Massachusetts; Baltagi, B. H. (1995) Econometric Analysis of Panel Data. New York, NY: John Wiley and Sons; Hsiao, C. (1986) Analysis of Panel Data, Econometric Society Monographs No. 11. Cambridge, United Kingdom: Cambridge University Press; Gujarati, D. N. and Porter, D. C. (2009) Basic Econometrics, International Edition; Fox, J. (1997) Applied Regression Analysis, Linear Models, and Related Methods. California: Sage Publications, p. 306; Mankiw, N. G. (1990) “A Quick Refresher Course in Macroeconomics,” Journal of Economic Literature, Vol. XXVIII, p. 1648
Follow up with soft-notes and updates from CrunchEconometrix:
Playlists: / cruncheconometrix
Website: cruncheconomet...
Blog: cruncheconomet...
Facebook: / cruncheconometrix
YouTube Custom URL: / cruncheconometrix
Twitter: / crunchmetrix
Reddit: / crunchmetrix
I want to appreciate all my subscribers from across the globe (Africa, Asia, Europe, the Middle East, the Americas, and the Pacific). Thank you all for your support. I am encouraged by your comments, questions, likes and critiques. They keep me focussed and poised to do better. I will continue to contribute my little quota so that every student and researcher can independently analyse his/her data. My teaching approach is very practical. I adopt a do-as-I-do style. Many thanks to those who have supported me by telling others. Once again, CrunchEconometrix loves to teach. Support my Channel with your subscription, likes and feedback, and by sharing my videos with your cohorts. Please do not keep me to yourself (lol); inform your friends, students and academic networks about my Channel. Tell them CrunchEconometrix breaks down econometric jargon and teaches with simplicity. Follow me on Facebook, Twitter and Reddit. Love you all, greatly!!!
I struggled with this subject throughout the semester, and finding this video made everything clear in 10 minutes. Thanks for taking the time to upload it!!!
Great to hear, Emiliano!
@CrunchEconometrix, Thank you, ma'am. It helped me a lot.
Glad to hear, Ismat😊
Thanks for this, Ma. I want to understand something. Is changing the variables into log form to correct heteroskedasticity what GLS is all about? I understood the correction of heteroskedasticity using the functional form as you taught. For this particular one, does it mean that when you change the variables to log form, run the regression again and then check for heteroskedasticity, it is corrected, and that this is GLS?
Hi Kezia, thanks for the positive feedback. The video description says it all. Answers your query.
Hello once again, can you please explain how to add the weight variable? If I simply put (weight/1/myIV), Stata gives an error. Can you please elaborate? I would be highly grateful. Thanks in advance.
Hi Fawa, why not follow the steps shown?
Hi,
Is there any rule for selecting the known Z variable (the independent variable)? Or can I choose it randomly?
Hi, kindly watch the clip again. Thanks.
@CrunchEconometrix When correcting for heteroskedasticity in an RE model, which is the better estimator: RE with robust standard errors or FGLS?
Hi Kingsley, either approach works well insofar as the heteroskedasticity is addressed.
@CrunchEconometrix thanks so much
Teacher, how do we know that sqfeet is causing the heteroscedasticity?
Hi Fahim, I explained how to detect heteroscedasticity so kindly watch the videos again.