Choosing Fixed-Effects, Random-Effects or Pooled OLS Models in Panel Data Analysis using Stata
- Published 30 Jun 2024
- Full text: phantran.net/choosing-fixed-e...
Database: drive.google.com/file/d/1G3NF...
Data in excel: docs.google.com/spreadsheets/...
In this video, we walk step by step through the process of selecting a regression model for panel data (random effects, fixed effects or pooled OLS), as discussed by Dougherty (2011) and Torres-Reyna (2007). The process begins by considering whether the observations are a random sample from a given population, that is, a subset of individuals randomly selected by the researchers to represent the whole group. In the first step, we determine whether the observations are a random sample: if they are, we move on to the next step; otherwise the fixed-effects model is the final choice. In the second step, we estimate both the fixed-effects and the random-effects models and compare them with the Hausman test, also known as the Durbin-Wu-Hausman (DWH) test, where the null hypothesis is that the preferred model is random effects and the alternative is fixed effects (see Greene, 2008, chapter 9). It essentially tests whether the unique errors (u_i) are correlated with the regressors; the null hypothesis is that they are not. If the Hausman test indicates significant differences in the coefficients, the final choice is the fixed-effects model. Otherwise, in the third step, the Breusch-Pagan Lagrange multiplier (LM) test is used to decide between the random-effects model and the pooled OLS model. The null hypothesis of the LM test is that the variance across entities is zero, that is, there is no significant difference across units. If the LM test indicates the presence of random effects, the random-effects model is chosen; otherwise pooled OLS is the final decision.
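The three steps described above can be sketched in Stata roughly as follows. This is only an illustration under stated assumptions: it uses Stata's `nlswork` example panel (downloaded by `webuse`, so it needs an internet connection) and the variables `ln_wage`, `age`, `msp` and `ttl_exp` rather than the dataset from the video, and the stored names `fixed` and `random` are arbitrary labels.

```stata
* Assumed example data: Stata's nlswork panel, NOT the video's dataset
webuse nlswork, clear
xtset idcode year

* Step 2: fit both models and store the estimates
xtreg ln_wage age msp ttl_exp, fe
estimates store fixed
xtreg ln_wage age msp ttl_exp, re
estimates store random

* Hausman test: H0 = random effects is the preferred model
hausman fixed random

* Step 3 (only relevant if the Hausman test does not reject H0):
* Breusch-Pagan LM test; xttest0 must follow a random-effects fit
quietly xtreg ln_wage age msp ttl_exp, re
xttest0

* If the LM test does not reject H0 either, fall back to pooled OLS
regress ln_wage age msp ttl_exp
```

With your own data, replace the `webuse` line with your file and adapt the `xtset` and variable names.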
Thanks. This is the first channel explaining everything clearly and step by step by using a real, published study. Please upload more models, tests, and commands
Thanks for watching and comments!
That's how you clearly explain the steps. Much appreciated
This was a really helpful video that breaks it down clearly! Thanks!
There are a few mistakes, but it is a very helpful video to understand. You are a legend, thank you!
Thank you so much for using data from your empirical study to teach how to carry out the analysis. It's been very helpful.
It's my pleasure
the concepts were explained really well. Thank you!
That was amazingly explained with interesting examples. Thanks a lot
I found your presentation very helpful and explained so clearly thank you :)
Thank you sir, for the excellent video. I have run the FE and RE for research data.
Very good explanations! Thank you so much!
Glad you enjoyed it!
Excellent explanation 👍
this video is so useful, thank you very much!
Glad it was helpful!
Thanks for the video. Having the snapshot of the dataset and the do file would help understand the content better.
This method is hot, thanks !
Good explication of this method
Thank you very much sir
Great tutorial
Thank you! Cheers!
Good job
thank you for this video
Welcome!
Thank you for this video. My question is whether there is any test of endogeneity for the fixed-effects model. Thank you very much.
pretty Splendid. But I have a request please use command mode and speak slowly to elaborate more in-depth.
Please see Torres-Reyna (2007) at www.princeton.edu/~otorres/Panel101.pdf
Hi thank you for the video which is very helpful. One comment on the first hausman test result to be sure: wouldn't it be that we "reject" the null hypothesis (H0: random effect is a preferred model) as the p-value is 0.000, and accept the H1: fixed effect is a preferred model? The video says that the hausman test allows us to "accept" the null hypothesis and choose to use fixed effects.
Thanks for your comment. The correct interpretation must be "the hausman test allows us to reject the null hypothesis and choose to use fixed effects".
Random effects (RE) is preferred under the null hypothesis due to higher efficiency, while under the alternative Fixed effects (FE) is at least as consistent and thus preferred.
If the p-value is small (less than 0.05), reject the null hypothesis.
Hey, I have a question (please, I need help very urgently). I did the same, but a note appears that says:
the rank of the differenced variance matrix (9) does not equal the number of coefficients being tested (10); be sure this is what you expect, or there may be problems computing the test. Examine the output of your estimators for anything unexpected and possibly consider scaling your variables so that the coefficients are on a similar scale.
But I don't see any problems. What should I do?
Try with sigmamore option!
. hausman fe re,sigmamore
Please see www.stata.com/manuals13/rhausman.pdf and www.albany.edu/faculty/kretheme/PAD705/SupportMat/PanelData.pdf (page 8)
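A minimal sketch of the `sigmamore` suggestion, assuming Stata's `nlswork` example panel and arbitrary stored names `fixed` and `random` (neither comes from the commenter's data). According to the Stata manual, `sigmamore` bases both covariance matrices on the disturbance variance estimated by the efficient (random-effects) estimator. The last line simply prints the rank and degrees of freedom that `hausman` leaves in `r()`, so you can see whether they still differ.

```stata
* Assumed example data: Stata's nlswork panel, NOT the commenter's dataset
webuse nlswork, clear
xtset idcode year

quietly xtreg ln_wage age msp ttl_exp, fe
estimates store fixed
quietly xtreg ln_wage age msp ttl_exp, re
estimates store random

* Both stored results are passed to hausman, plus the sigmamore option
hausman fixed random, sigmamore

* Compare the rank of the differenced variance matrix with the test's df
display "rank = " r(rank) "   df = " r(df)
```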
Thanks for your great work!...At the end, you explain the meaning of "rho", and the slide says "97.466% of the variance is due to differences across panels"...Did you want to say "entities" instead of "panels"?...Thanks again!
I based it on Torres-Reyna (2007), but I think you are right ... across entities or observations of the panel
Please can I apply same to my RE model, if my Hausman test results allows me to pick RE model... I wanna eliminate heteroskedasticity
Yes, you can; you can also use the robust option so that the standard errors of your RE model are robust to heteroskedasticity
Dougherty's knowledge at 6:17? In the ones I can find the page 421 isn't providing your given information/flowchart. Thank you.
Dougherty, C. Introduction to Econometrics; OUP Oxford: Oxford, UK, 2011. You can find it in the reference of our article at doi.org/10.3390/su11174569
Hi, I have 2 dependent variables that I want to test separately, Can I run the Hausman test for both of them separately?
You must run two selection processes separately, one for each dependent variable; in each process, run the Hausman test to select the right model for that dependent variable. Good luck!
Hi, Thank you for such a fantastic video, in my case I got a p-value of more than .05 for both the Hausman test and LM. What should I do?
It means that your appropriate model is OLS regression. Thanks for watching and for your comment!
@@HKTStata Simple OLS or pooled OLS?
In my opinion, multiple OLS and pooled OLS are the same; let's see the regression equation at the end of phantran.net/different-regression-models-with-panel-data-fixed-effects-random-effects-and-pooled-ols/
@@HKTStata yes I also understand it in the same way. Thank you so much for your responses and help on this.
Does storing the estimates change or affect the data if we want to run other regressions later?
No, storing the estimates does not change your original data.
Can u please provide the link for this excel data. The drive link u provided is not opening
Please use Stata 14 or later! You can also download data in excel here docs.google.com/spreadsheets/d/135p2zph7SL6I5Y5UyCu7eKFzcOitrBLb/edit?usp=sharing&ouid=100029919331612689631&rtpof=true&sd=true
As you know robust model may fix the heteroskedasticity problem and also Pooled OLS fixes the serial correlation problem. Which model is preferable if both (heteroskedasticity and serial correlation) tests are failed?
For pooled OLS, you just test for heteroskedasticity; if the test indicates heteroskedasticity, you should use the robust option. You do not need the serial correlation test, which is only for panel data models such as the fixed- and random-effects ones; pooled OLS ignores the time effect.
@@HKTStata Thanks for your prompt response. But I wanna comprehend the issues around my problem. My data has both heteroskedasticity and autocorrelation problems. Which model is feasible for this type of problem sir?
You can see www.homepages.ucl.ac.uk/~uctpsc0/Teaching/GR03/Heter&Autocorr.pdf
Note that heteroskedasticity and/or autocorrelation need not change your choice of model; you can use the robust option to deal with them. So, choose the appropriate model according to the process presented in my video, then add the robust option to the chosen one; that is your final model.
Thanks for your helpful video. Can I ask if xttest0 indicates that POLS is suitable (p_value=1.000) and after running xtserial for POLS model, autocorrelation exists. what could I do to fix it? Thanks.
For the POLS model, you should use VIF indicators to check for multicollinearity. If you have VIF > 4, you should eliminate the variable with the highest VIF score. Good luck!
@@HKTStata thanks so much for your reply. But I would like to ask about autocorrelation appeared in POLS model.
After using Breusch & Pagan LM (xttest0), I found that the POLS is preferred.
Then I performed xtserial to test autocorrelation. P_value
You can use robust option:
regress ...., robust
or
regress ...., vce(robust)
Good luck!
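As a hedged aside on the autocorrelation question: `vce(robust)` on `regress` gives heteroskedasticity-robust standard errors, while a commonly used alternative for serial correlation within each panel unit is to cluster the standard errors on the panel identifier. The sketch below is not from the video; it assumes Stata's `nlswork` example panel, where `idcode` is the panel identifier, and you would substitute your own variables.

```stata
* Assumed example data: Stata's nlswork panel, NOT the commenter's dataset
webuse nlswork, clear

* Pooled OLS with heteroskedasticity-robust standard errors
regress ln_wage age msp ttl_exp, vce(robust)

* Pooled OLS with standard errors clustered on the panel identifier,
* which also allows for serial correlation within each unit
regress ln_wage age msp ttl_exp, vce(cluster idcode)
```

The coefficients are identical in both fits; only the standard errors change.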
@@HKTStata thanks for your help.
How do we decide between Pooled OLS and and fixed effects? If I fail to reject the null of the Breush-Pagan Lagrange multiplier result
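The LM test chooses between pooled OLS and random effects: if p < 0.05 in your LM test, you choose the random-effects model; if you fail to reject the null, pooled OLS is appropriate. To compare against fixed effects, use the Hausman test.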
Modern presentation
If the Hausman test provides chi2(9) = 7.76 and p = 0.5586, while the LM test xttest0 provided chi-square statistics of 58.46, based on this, is the random effects model more appropriate?
Also, if I have several independent variables, should I repeat the process for each one of them, and if the results indicate that different models be used for different independent variables, should I do it, or use the same model across all regressions?
The Hausman test giving chi2(9) = 7.76 and p = 0.5586 > 0.05 indicates that random effects is appropriate (rather than fixed effects). The LM test xttest0 giving Prob > chi2 < 0.05 indicates that random effects is appropriate (rather than pooled OLS).
If you have several independent variables, you should include all of them in the model at the same time (do not repeat the process for each one of them) and choose the appropriate model.
@@HKTStata Thank you for your answer! And sorry I actually meant to say that I have several dependent variables, hence my question is whether I repeat the process for each one of them, or just stick with one model like the random effects one
Yes, you must repeat the process for each of the dependent variables.
In the Hausman test the p-value came out as 0, which means the fixed-effects model is appropriate, but in xttest0 the p-value came out as 1, which means pooled OLS is appropriate. So which one is appropriate for this analysis? Please can you explain?
You should see the figure 1 in our article doi.org/10.3390/su11174569
You can use only the hausman test for your decision.
You can see also ua-cam.com/video/AxFVb75QSf4/v-deo.html
Does that mean I have to run the Hausman test again, between pooled OLS and the fixed-effects model?
And if a significant p-value comes out, does that mean the fixed-effects model is appropriate?
The hausman test is for choosing between random or fixed effect model. The xttest0 is for choosing between pooled OLS or random effect model. If the hausman test is significant (p < 0.05), it means fixed effect model is appropriate; then you can ignore the xttest0 according to the figure 1 in our article doi.org/10.3390/su11174569
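The rule in the reply above can be written as a small do-file sketch. It assumes Stata's `nlswork` example panel, arbitrary stored names `fixed` and `random`, a 0.05 threshold, and that both tests leave their p-value in `r(p)` as documented for recent Stata versions; none of these come from the video's data.

```stata
* Assumed example data: Stata's nlswork panel, NOT the commenter's dataset
webuse nlswork, clear
xtset idcode year

quietly xtreg ln_wage age msp ttl_exp, fe
estimates store fixed
quietly xtreg ln_wage age msp ttl_exp, re
estimates store random

* LM test first, while the random-effects fit is still the active result
quietly xttest0
scalar p_lm = r(p)

quietly hausman fixed random
scalar p_hausman = r(p)

* The Hausman test decides first; the LM test matters only if it does not reject
if p_hausman < 0.05 {
    display "Hausman rejects H0: fixed effects (the LM test can be ignored)"
}
else if p_lm < 0.05 {
    display "Hausman does not reject, LM rejects: random effects"
}
else {
    display "Neither test rejects: pooled OLS"
}
```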
Thanks
Its really very informative. But I am unable to run sigmamore command. Plz explain
You must make sure that the fixed- and random-effects estimates have been stored before running the Hausman test. Please retry with the following steps:
1. Run the fixed-effects regression using the xtreg command with the fe option.
2. Store the fixed-effects estimates.
3. Run the random-effects regression using the xtreg command with the re option.
4. Store the random-effects estimates.
5. Run the Hausman test with the sigmamore option.
Please reply me, is xttest0 in choosing model and xttest0 in check heteroskedasticity for RE model the same? REM is appropriate model for my data. First I check xttest0 to choose between RE and POLS then I use hausman to choose between RE and FE. Now I want to check heteroskedasticity for RE. What should I do?
To check heteroskedasticity for RE, you should use 2 following commands:
xtgls Y X1 ...Xn, panels(hetero)
xtreghet Y X1 ... Xn, id(ID) it(Year) model(xtmln) mfx(lin) diag lmhet
If heteroskedasticity is present, run: xtreg Y X1 ... Xn, robust re
Good luck :)
@@HKTStata what is the name of this method? can you tell me I have to write on my paper
You can see www.stata.com/manuals/xtxtgls.pdf
and fmwww.bc.edu/RePEc/bocode/x/xtreghet.html
@@HKTStata thank you ^^ you save my day
Thank you very much! Could you provide the topic of your published paper ?
Yes, sure!
Nguyen, Hoang Viet, Thanh Tu Phan, and Antonio Lobo. 2019. "Debunking the Myth of Foreign Direct Investment toward Long-Term Sustainability of a Developing Country: A Transaction Cost Analysis Approach" Sustainability 11, no. 17: 4569. doi.org/10.3390/su11174569
@@HKTStata Thank very much!
hello, due to heteros I put robust, but my result doesn't show probability value and just a dot. any advice?
dots in Stata output mean missing values. The standard errors for those two coefficients could not be computed; presumably there is a problem of near-collinearity in the regressors (assuming this is regression output). See more www.stata.com/statalist/archive/2005-11/msg00282.html
You should remove such variables!
Hello,did you solve your problem? thanks
You should remove such variables!
@@rigao7533 hello, i figured my number of variables are bigger than my cross sectional unit. I have removed some of my variables and able to get p-value reading
Isn’t it standard OLS and not pooled since you didn’t add pooled at the end of the command?
Pooled OLS is normal OLS; but just for panel data; so, we don't need to add anything at the end of the command. Good luck!
very helpful video. Thanks you so much
Can you please help me explaining these results from stata. I need to explain result for my dissertation
Please see phantran.net/category/methodology/statistical-software/stata/
If p value of constant in FEM becomes more than 5%, would it be a problem?
No, there is no problem.
Another question related with this,
Considering FEM is my final regression as per hausman test, then can i/ do i need to incorporate GMM regression output in my panel data analysis?
If yes, how to interpret it, assuming 3 out of 4 independent variables got significant in GMM (whereas, 2 got significant in FEM),also how to interpret j-statistic and prob (j statistic)?
Please see Torres-Reyna (2007) at www.princeton.edu/~otorres/Panel101.pdf & phantran.net/different-regression-models-with-panel-data-fixed-effects-random-effects-and-pooled-ols/ & www.mdpi.com/2071-1050/11/17/4569
Regarding the hausman fixed random test, I get the error that estimation result fixed not found. However, I did a fe and re test beforehand. Someone who can help?
Try to use the sigmamore option of hausman test!
Good luck!
@@HKTStata
xtreg roaab gov env soc bet lto tdt lte, fe
estimates store fixed
xtreg roaab gov env soc bet lto tdt lte, re
hausman fixed., sigmamore
I did this, but at the last command it says again estimation result fixed not found
Retry the following commands:
xtreg roaab gov env soc bet lto tdt lte, fe
estimates store fixed
xtreg roaab gov env soc bet lto tdt lte, re
estimates store random
hausman fixed random
xttest0
The input of hausman test must be both fixed and random model.
Good luck!
@@HKTStata It worked, thank you!
b = Consistent under H0 and Ha; obtained from xtreg.
B = Inconsistent under Ha, efficient under H0; obtained from xtreg.
Test of H0: Difference in coefficients not systematic
    chi2(7) = (b-B)'[(V_b-V_B)^(-1)](b-B)
            = 32.55
Prob > chi2 = 0.0000

. xttest0
Breusch and Pagan Lagrangian multiplier test for random effects
    roaab[company_id,t] = Xb + u[company_id] + e[company_id,t]
Estimated results:
             |       Var     SD = sqrt(Var)
    ---------+-----------------------------
       roaab |  .0007081       .0266108
           e |   .000149       .0122071
           u |  .0002418       .0155483
Test: Var(u) = 0
       chibar2(01) = 355.07
    Prob > chibar2 = 0.0000
What does this mean? I'm struggling how to interpret it.
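A hedged reading of the output above, using the decision rules given in the replies in this thread: the Hausman test has Prob > chi2 = 0.0000 < 0.05, so the null (random effects is preferred) is rejected and fixed effects is indicated; the LM test has Prob > chibar2 = 0.0000 < 0.05, so Var(u) = 0 is rejected and pooled OLS is not appropriate. Following the selection process, the Hausman result decides, which points to the fixed-effects model. The two `display` lines below are only a sketch for re-checking the numbers by hand: they recompute the Hausman p-value from chi2(7) = 32.55 and the share of variance attributed to u, Var(u) / (Var(u) + Var(e)), which is about 0.62 and is not printed in the posted output.

```stata
* Upper-tail p-value of a chi-squared statistic of 32.55 with 7 df
display chi2tail(7, 32.55)

* Share of the variance due to u, from the xttest0 table above
display .0002418 / (.0002418 + .000149)
```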
Thanks for the video. Having the snapshot of the dataset and the do file would help understand the content better.
You can find it here:
Database: drive.google.com/file/d/1G3NF-jL6Eoz9zrOjad5dMZrv33-Sp_D2/view?usp=sharing
Data in excel: docs.google.com/spreadsheets/d/135p2zph7SL6I5Y5UyCu7eKFzcOitrBLb/edit?usp=sharing&ouid=100029919331612689631&rtpof=true&sd=true