* Comparison of Panel Data Models in Stata

clear all
set more off

use C:\data\panel_wage

global id id
global t t
global ylist lwage
global xlist exp exp2 wks ed

describe $id $t $ylist $xlist
summarize $id $t $ylist $xlist

* Set data as panel data
sort $id $t
xtset $id $t
xtdescribe
xtsum $id $t $ylist $xlist

* Pooled OLS estimator
reg $ylist $xlist
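
* A possible variant (a sketch, not part of the original analysis): the
* default OLS standard errors treat all observations as independent,
* which is doubtful when the same individual is observed repeatedly.
* Clustering on the panel identifier is one common way to allow for
* within-individual correlation.  The coefficients are unchanged; only
* the standard errors differ.
reg $ylist $xlist, vce(cluster $id)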

* Fixed effects or within estimator
* The fe output of xtreg automatically reports an F-test of the joint
* significance of the fixed effects ("F test that all u_i=0").  This is
* the test for pooling.  The null hypothesis is that all individual
* effects are equal, so the Pooled OLS model is adequate; the alternative
* is that they differ, so the fixed effects model is needed.
* Here the observed F-statistic is 53.12 with N - 1 = 594 numerator
* degrees of freedom and T*N - (N + k) = 3567 denominator degrees of
* freedom, where k is the number of time-varying regressors in the
* model, in this case k = 3 (exp, exp2 and wks; ed is time-invariant
* and is dropped by the within transformation).  The p-value is reported
* as 0.0000, that is, below 0.00005, so the Pooled OLS model is rejected
* in favor of the fixed effects model.
xtreg $ylist $xlist, fe
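
* A sketch for recovering the pooling F-test from the stored results of
* the fe fit above, so that the p-value can be displayed with more digits
* than the 0.0000 shown in the output.  It assumes the default
* (non-robust) VCE, for which xtreg, fe stores e(F_f), and that e(df_r)
* holds the denominator degrees of freedom.  The scalar names df1 and
* df2 are illustrative only.
scalar df1 = e(N_g) - 1
scalar df2 = e(df_r)
display "F test that all u_i = 0: F(" df1 ", " df2 ") = " %9.2f e(F_f)
display "p-value = " %12.4e Ftail(df1, df2, e(F_f))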

* Random effects estimator
* Theta is the GLS transformation (quasi-demeaning) parameter: each
* variable is transformed by subtracting theta times its individual mean.
* Here it is 0.8228.
* If theta = 0 we have the Pooled OLS estimator, whereas if
* theta = 1 we have the fixed effects (within) estimator.
* sigma_u is the standard deviation of the individual-specific random
* effects.  sigma_e is the standard deviation of the idiosyncratic
* component of the composite error in the RE model.  Rho is the share of
* the composite error's variance that is due to the individual-specific
* random effects.  Here it is 81.5%.
xtreg $ylist $xlist, re theta
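
* A sketch relating theta and rho to the variance components, using the
* stored results of the re fit above.  It assumes a balanced panel, so
* that every individual has e(g_max) observations; with unbalanced data
* theta varies across individuals and this single value does not apply.
* The scalar names below are illustrative only.
*   rho   = sigma_u^2 / (sigma_u^2 + sigma_e^2)
*   theta = 1 - sqrt(sigma_e^2 / (T*sigma_u^2 + sigma_e^2))
scalar s2_u = e(sigma_u)^2
scalar s2_e = e(sigma_e)^2
scalar rho_chk = s2_u / (s2_u + s2_e)
scalar theta_chk = 1 - sqrt(s2_e / (e(g_max)*s2_u + s2_e))
display "rho   = " %6.4f rho_chk
display "theta = " %6.4f theta_chk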

* Breusch-Pagan LM test for random effects versus Pooled OLS
* The postestimation command "xttest0" conducts the Breusch-Pagan test
* of the Pooled model (H0: Var(u) = 0) versus the RE model (H1).  The
* observed chi-square(1) test statistic is 5192.13 with a p-value
* reported as 0.0000.  Therefore, in this case, the Pooled OLS model is
* rejected in favor of the Random Effects model.
quietly xtreg $ylist $xlist, re
xttest0

* After the above two specification tests, we are down to two possible
* models: the FE and RE models.  Both are favored over the Pooled model. 

* Hausman test for fixed versus random effects model
* The null hypothesis is that the unobserved individual-specific 
* effects are uncorrelated with the explanatory variables of the model
* suggesting the random effects model is the appropriate model
* to use.  The alternative hypothesis is that the unobserved
* effects are correlated with the explanatory variables of
* the model suggesting the fixed effects model is the appropriate
* model to use.
* In the current problem the null hypothesis is rejected in favor of
* the alternative.  The chi-square(3) statistic is 6191.43 with a
* p-value reported as 0.0000, suggesting that the fixed effects
* model is to be preferred.
quietly xtreg $ylist $xlist, fe
estimates store fixed
quietly xtreg $ylist $xlist, re
estimates store random
hausman fixed random
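
* A possible variant (a sketch, not part of the original analysis): the
* sigmamore option of hausman bases both covariance matrices on the
* disturbance variance estimate from the efficient (random effects)
* estimator.  It is often used when the default difference of covariance
* matrices is not positive definite.  The statistic will generally
* differ from the one reported above.
hausman fixed random, sigmamore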

* An alternative test for fixed versus random effects model
* is the Mundlak test.  What is assumed in the Mundlak test
* is that there is a linear relationship (with noise) relating the 
* unobserved effects and the cross-section means of the time-varying
* variables in the model.  Below we conduct the Mundlak test.
* The time-varying explanatory variables are exp exp2 and wks.
* Notice that the Mundlak test equation has the original RE 
* explanatory variables in the model PLUS the cross-section means
* of the time-varying variables in the RE model.  Notice that 
* robust errors are used in the Mundlak test equation.     
* The null and alternative hypotheses are the same as those of the
* Hausman test.
* In the current problem the chi-square(3) test statistic is 1792.1 with
* a p-value reported as 0.0000.  The null hypothesis is rejected in
* favor of the alternative.  Therefore, the fixed effects
* model is preferred.
bysort $id: egen mean_exp = mean(exp)
bysort $id: egen mean_exp2 = mean(exp2)
bysort $id: egen mean_wks = mean(wks)
quietly xtreg $ylist $xlist mean_exp mean_exp2 mean_wks, re vce(robust)
test mean_exp mean_exp2 mean_wks

* In summary, the F-test for pooling in the fixed effects estimation using
* xtreg rejected the Pooled OLS model.  The Breusch-Pagan LM test
* for random effects versus Pooled OLS rejected the Pooled OLS model
* in favor of the random effects model.  With two separate tests
* rejecting it, the Pooled OLS model is eliminated from further
* consideration.  In addition, two separate tests (the Hausman and
* Mundlak tests) supported the fixed effects model over the random
* effects model.  Therefore, it seems appropriate to report our findings
* based on the fixed effects model results.