The PANEL Procedure

Example 27.5 Panel Study of Income Dynamics (PSID): Hausman-Taylor Models

Cornwell and Rupert (1988) analyze data from the Panel Study of Income Dynamics (PSID), an income study of 595 individuals over the seven-year period, 1976–1982 inclusive. Of particular interest is the effect of additional schooling on wages. The analysis here replicates that of Baltagi (2008, sec. 7.5), where it is surmised that covariate correlation with individual effects makes a standard random-effects model inadequate.

The following statements create the PSID data:

data psid;
   input id t lwage wks south smsa ms exp exp2 occ ind union fem blk ed;
   label id    = 'Person ID'
         t     = 'Time'
         lwage = 'Log(wages)'
         wks   = 'Weeks worked'
         south = '1 if resides in the South'
         smsa  = '1 if resides in SMSA'
         ms    = '1 if married'
         exp   = 'Years full-time experience'
         exp2  = 'exp squared'
         occ   = '1 if blue-collar occupation'
         ind   = '1 if manufacturing'
         union = '1 if union contract'
         fem   = '1 if female'
         blk   = '1 if black'
         ed    = 'Years of education';
datalines;
1    1  5.5606799126  32  1  0  1  3   9     0  0  0  0  0  9
1    2  5.7203102112  43  1  0  1  4   16    0  0  0  0  0  9
1    3  5.9964499474  40  1  0  1  5   25    0  0  0  0  0  9
1    4  5.9964499474  39  1  0  1  6   36    0  0  0  0  0  9
1    5  6.0614600182  42  1  0  1  7   49    0  1  0  0  0  9
1    6  6.1737899780  35  1  0  1  8   64    0  1  0  0  0  9
1    7  6.2441701889  32  1  0  1  9   81    0  1  0  0  0  9
2    1  6.1633100510  34  0  0  1  30  900   1  0  0  0  0  11
2    2  6.2146100998  27  0  0  1  31  961   1  0  0  0  0  11
2    3  6.2634000778  33  0  0  1  32  1024  1  1  1  0  0  11
2    4  6.5439100266  30  0  0  1  33  1089  1  1  0  0  0  11
2    5  6.6970300674  30  0  0  1  34  1156  1  1  0  0  0  11
2    6  6.7912201881  37  0  0  1  35  1225  1  1  0  0  0  11
2    7  6.8156399727  30  0  0  1  36  1296  1  1  0  0  0  11

   ... more lines ...   

You begin by fitting a one-way random-effects model:

proc panel data=psid;
   id id t;
   model lwage = wks south smsa ms exp exp2 occ
                           ind union fem blk ed / ranone;
run;

The output is shown in Output 27.5.1. The coefficient on the variable ED (years of education) implies that an additional year of schooling is associated with about a 10.7% increase in wages. However, the Hausman test for random effects (m = 5288.98, p < 0.0001) indicates a serious violation of the random-effects assumptions, namely that the regressors are independent of both error components.

Output 27.5.1: One-Way Random Effects Estimation

The PANEL Procedure
Fuller and Battese Variance Components (RanOne)
 
Dependent Variable: lwage (Log(wages))

Model Description
Estimation Method RanOne
Number of Cross Sections 595
Time Series Length 7

Variance Component Estimates
Variance Component for Cross Sections 0.100553
Variance Component for Error 0.023102

Hausman Test for Random Effects
Coefficients DF m Value Pr > m
9 9 5288.98 <.0001

Parameter Estimates
Variable DF Estimate Standard Error t Value Pr > |t| Label
Intercept 1 4.030811 0.1044 38.59 <.0001 Intercept
wks 1 0.000954 0.000740 1.29 0.1971 Weeks worked
south 1 -0.00788 0.0281 -0.28 0.7795 1 if resides in the South
smsa 1 -0.02898 0.0202 -1.43 0.1517 1 if resides in SMSA
ms 1 -0.07067 0.0224 -3.16 0.0016 1 if married
exp 1 0.087726 0.00281 31.27 <.0001 Years full-time experience
exp2 1 -0.00076 0.000062 -12.31 <.0001 exp squared
occ 1 -0.04293 0.0162 -2.65 0.0081 1 if blue-collar occupation
ind 1 0.00381 0.0172 0.22 0.8242 1 if manufacturing
union 1 0.058121 0.0169 3.45 0.0006 1 if union contract
fem 1 -0.30791 0.0572 -5.38 <.0001 1 if female
blk 1 -0.21995 0.0660 -3.33 0.0009 1 if black
ed 1 0.10742 0.00642 16.73 <.0001 Years of education
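
Because the dependent variable is the log of wages, the 10.7% figure reads the ED coefficient directly as a proportional change. As a rough check, the exact conversion of the estimate in Output 27.5.1 can be worked out as follows; the symbol for the estimate is shorthand introduced for this sketch, not notation used by the procedure:

```latex
\[
  100\,\bigl(e^{\hat\beta_{\mathrm{ED}}} - 1\bigr)
  = 100\,\bigl(e^{0.10742} - 1\bigr)
  \approx 11.3\%
\]
```

The two readings agree closely for small coefficients and drift apart as the coefficient grows.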



An alternative could be a fixed-effects (FIXONE) model, but that would not permit estimation of the coefficient for ED, which does not vary within individuals. A compromise is the Hausman-Taylor model, for which you stipulate a set of covariates that are correlated with the individual effects (but uncorrelated with the observation-level errors). You specify the correlated variables in the CORRELATED= option in the INSTRUMENTS statement:

proc panel data=psid;
   id id t;
   instruments correlated = (wks ms exp exp2 union ed);
   model lwage = wks south smsa ms exp exp2 occ
                           ind union fem blk ed / htaylor;
run;

The results are shown in Output 27.5.2. The table of parameter estimates has an added column, Type, that identifies which regressors are assumed to be correlated with individual effects (C) and which regressors do not vary within cross sections (TI). It was stated previously that the Hausman-Taylor model is a compromise between fixed-effects and random-effects models, and you can think of the compromise this way: You want to fit a random-effects model, but the correlated (C) variables make that model invalid. So you fall back to the consistent fixed-effects model, but then the time-invariant (TI) variables are the problem because they will be dropped from that model. The solution is to use the Hausman-Taylor estimator.

The estimation results show that an additional year of schooling is now associated with a 13.8% increase in wages. Also presented is a Hausman test that compares this model to the fixed-effects model. As was the case previously when you fit the random-effects model, you can think of the Hausman test as a referendum on the assumptions you are making. For this estimation, the test does not reject (m = 5.26, p = 0.1539), so your choice of variables to treat as correlated seems adequate. It also seems to hold true that any correlation is with the individual-level effects, and not the observation-level errors.

Output 27.5.2: Hausman-Taylor Estimation

The PANEL Procedure
Hausman and Taylor Model for Correlated Individual Effects (HTaylor)
 
Dependent Variable: lwage (Log(wages))

Variance Component Estimates
Variance Component for Cross Sections 0.886993
Variance Component for Error 0.023044

Hausman Test against Fixed Effects
Coefficients DF m Value Pr > m
9 3 5.26 0.1539

Parameter Estimates
Variable Type DF Estimate Standard Error t Value Pr > |t| Label
Intercept     1 2.912726 0.2837 10.27 <.0001 Intercept
wks C   1 0.000837 0.000600 1.40 0.1627 Weeks worked
south     1 0.00744 0.0320 0.23 0.8159 1 if resides in the South
smsa     1 -0.04183 0.0190 -2.21 0.0274 1 if resides in SMSA
ms C   1 -0.02985 0.0190 -1.57 0.1159 1 if married
exp C   1 0.113133 0.00247 45.79 <.0001 Years full-time experience
exp2 C   1 -0.00042 0.000055 -7.67 <.0001 exp squared
occ     1 -0.0207 0.0138 -1.50 0.1331 1 if blue-collar occupation
ind     1 0.013604 0.0152 0.89 0.3720 1 if manufacturing
union C   1 0.032771 0.0149 2.20 0.0280 1 if union contract
fem   TI 1 -0.13092 0.1267 -1.03 0.3014 1 if female
blk   TI 1 -0.28575 0.1557 -1.84 0.0665 1 if black
ed C TI 1 0.137944 0.0212 6.49 <.0001 Years of education

C: correlated with the individual effects
TI: constant (time-invariant) within cross sections
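
The fixed-effects benchmark that this Hausman test compares against can also be inspected directly. The following is a minimal sketch, assuming the same psid data set. The time-invariant regressors FEM, BLK, and ED are left out of the MODEL statement because a fixed-effects fit cannot separate them from the individual effects:

```sas
/* Sketch only: one-way fixed-effects benchmark, not part of the original example. */
/* FEM, BLK, and ED are omitted because they do not vary within individuals.       */
proc panel data=psid;
   id id t;
   model lwage = wks south smsa ms exp exp2 occ ind union / fixone;
run;
```

The nine time-varying regressors in this sketch appear to correspond to the nine coefficients reported in the Hausman test in Output 27.5.2.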




At its core, the Hausman-Taylor estimator is an instrumental variables regression, where the instruments are derived from regressors that are assumed to be uncorrelated with the individual effects. Technically it is the cross-sectional means of these variables that need to be uncorrelated, not the variables themselves.

The Amemiya-MaCurdy model is a close relative of the Hausman-Taylor model. The only difference between the two is that the Amemiya-MaCurdy model makes the added assumption that the regressors (and not just their means) are uncorrelated with the individual effects. By making that assumption, the Amemiya-MaCurdy model can take advantage of a more efficient set of instrumental variables.
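
The difference between the two instrument sets can be written compactly. The following sketch uses common textbook notation; the symbols are assumptions of this sketch and do not appear in the procedure output:

```latex
% X_1, X_2 : time-varying regressors, uncorrelated / correlated with \mu_i
% Z_1, Z_2 : time-invariant regressors, uncorrelated / correlated with \mu_i
\[
  y_{it} = X_{1,it}\beta_1 + X_{2,it}\beta_2
         + Z_{1,i}\gamma_1 + Z_{2,i}\gamma_2 + \mu_i + \nu_{it}
\]
% Q takes deviations from individual means; P replaces values by individual means.
\[
  A_{HT} = [\,QX_1,\; QX_2,\; PX_1,\; Z_1\,], \qquad
  A_{AM} = [\,QX_1,\; QX_2,\; X_1^{*},\; Z_1\,]
\]
% X_1^{*} holds, for each individual, the values of X_1 from every time period,
% so each period's value serves as a separate instrument.
```

In this example, X_1 would consist of SOUTH, SMSA, OCC, and IND; X_2 of WKS, MS, EXP, EXP2, and UNION; Z_1 of FEM and BLK; and Z_2 of ED. Under that reading, the Hausman-Taylor estimator needs at least as many X_1 variables as Z_2 variables (here 4 versus 1), and the surplus of 3 is consistent with the degrees of freedom reported in Output 27.5.2.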

The following statements fit the Amemiya-MaCurdy model:

proc panel data=psid;
   id id t;
   instruments correlated = (wks ms exp exp2 union ed);
   model lwage = wks south smsa ms exp exp2 occ
                           ind union fem blk ed / amacurdy;
run;

The results are shown in Output 27.5.3. Little is changed from the Hausman-Taylor model. The Hausman test compares the Amemiya-MaCurdy model to the Hausman-Taylor model (not to the fixed-effects model, as previously) and does not reject (m = 14.67, p = 0.3287), which suggests that the one additional assumption is acceptable. You even gain a bit of efficiency in the process: the standard error of the coefficient on the variable ED falls from 0.0212 to 0.0206.

Output 27.5.3: Amemiya-MaCurdy Estimation

The PANEL Procedure
Amemiya and MaCurdy Model for Correlated Individual Effects (AMaCurdy)
 
Dependent Variable: lwage (Log(wages))

Variance Component Estimates
Variance Component for Cross Sections 0.886993
Variance Component for Error 0.023044

Hausman Test against Hausman-Taylor
Coefficients DF m Value Pr > m
13 13 14.67 0.3287

Parameter Estimates
Variable Type DF Estimate Standard Error t Value Pr > |t| Label
Intercept     1 2.927338 0.2751 10.64 <.0001 Intercept
wks C   1 0.000838 0.000599 1.40 0.1622 Weeks worked
south     1 0.007282 0.0319 0.23 0.8197 1 if resides in the South
smsa     1 -0.04195 0.0189 -2.21 0.0269 1 if resides in SMSA
ms C   1 -0.03009 0.0190 -1.59 0.1127 1 if married
exp C   1 0.11297 0.00247 45.76 <.0001 Years full-time experience
exp2 C   1 -0.00042 0.000055 -7.72 <.0001 exp squared
occ     1 -0.02085 0.0138 -1.51 0.1299 1 if blue-collar occupation
ind     1 0.013629 0.0152 0.89 0.3709 1 if manufacturing
union C   1 0.032475 0.0149 2.18 0.0293 1 if union contract
fem   TI 1 -0.13201 0.1266 -1.04 0.2972 1 if female
blk   TI 1 -0.2859 0.1555 -1.84 0.0660 1 if black
ed C TI 1 0.137205 0.0206 6.67 <.0001 Years of education

C: correlated with the individual effects
TI: constant (time-invariant) within cross sections




Finally, you should realize that the Hausman-Taylor and Amemiya-MaCurdy estimators are not cure-alls for correlated individual effects. Estimation tacitly relies on the uncorrelated regressors being sufficient to predict the correlated regressors. Otherwise, you run into the problem of weak instruments. If you have weak instruments, you will obtain biased estimates that have very large standard errors. However, that does not seem to be the case here.
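
One informal way to gauge instrument strength is to see how well the regressors that are assumed uncorrelated predict a correlated regressor such as ED. The following is a rough sketch, not a formal weak-instrument test and not part of the original example. The data set name psid_means is hypothetical, and the choice of predictors is an assumption based on the variables left out of the CORRELATED= list:

```sas
/* Sketch only: per-person means of the regressors assumed uncorrelated, plus ED. */
/* MEAN= with no names keeps the original variable names in the output data set.  */
proc means data=psid noprint nway;
   class id;
   var south smsa occ ind fem blk ed;
   output out=psid_means mean=;
run;

/* Regress ED on those means; a very low R-square or a small overall F statistic */
/* would hint that the instruments carry little information about ED.            */
proc reg data=psid_means;
   model ed = south smsa occ ind fem blk;
run;
quit;
```

Because ED, FEM, and BLK do not vary within individuals, their per-person means equal their observed values, so the regression uses one observation per person.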