The PANEL procedure analyzes a class of linear econometric models that commonly arise when time series and cross-sectional data are combined. Data that pool time series and cross sections in this way are often referred to as panel data. Typical examples of panel data include observations over time on households, countries, firms, trade, and so on. For example, in the case of survey data on household income, the panel is created by repeatedly surveying the same households in different time periods (years).
The panel data models can be grouped into several categories depending on the structure of the error term. The PANEL procedure uses the following error structures and the corresponding methods to analyze data:
one-way and two-way models
fixed-effects, random-effects, and hybrid models
autoregressive models
moving average models
A one-way model depends only on the cross section to which the observation belongs. A two-way model depends on both the cross section and the time period to which the observation belongs.
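The distinction can be illustrated with a small numerical sketch. The Python code below is not PANEL procedure syntax; it builds a hypothetical, noise-free balanced panel whose values are the sum of a cross-sectional effect and a time effect (all numbers are assumed for illustration). Removing cross-section means alone leaves the time effects in the data, whereas the two-way transformation removes both.

```python
from statistics import mean

# Hypothetical balanced panel: 3 cross sections (rows) by 4 time periods (columns).
alpha = [10.0, 20.0, 30.0]      # cross-sectional effects (assumed values)
gamma = [1.0, 2.0, 3.0, 4.0]    # time effects (assumed values)

# Noise-free two-way data: y[i][t] = alpha[i] + gamma[t]
y = [[a + g for g in gamma] for a in alpha]

def demean_one_way(panel):
    """Subtract each cross section's own mean over time."""
    return [[v - mean(row) for v in row] for row in panel]

def demean_two_way(panel):
    """Subtract cross-section and time-period means, then add back the grand mean."""
    n, t = len(panel), len(panel[0])
    cs_mean = [mean(row) for row in panel]
    ts_mean = [mean(panel[i][j] for i in range(n)) for j in range(t)]
    grand = mean(cs_mean)  # equals the overall mean because the panel is balanced
    return [
        [panel[i][j] - cs_mean[i] - ts_mean[j] + grand for j in range(t)]
        for i in range(n)
    ]

one_way = demean_one_way(y)   # time effects remain, centered at zero
two_way = demean_two_way(y)   # both kinds of effects are removed
```

In this sketch every row of `one_way` still carries the centered time pattern, which is what a one-way specification would leave unmodeled if time effects were really present.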
Apart from whether the effect is one-way or two-way, the specifications also differ in the nature of the cross-sectional or time-series effect. The models are referred to as fixed-effects models if the effects are nonrandom and as random-effects models otherwise.
If the effects are fixed, the models are essentially regression models with dummy variables that correspond to the specified effects. For fixed-effects models, ordinary least squares (OLS) is the best linear unbiased estimator. Random-effects models use a two-stage approach. In the first stage, variance components are estimated by using the methods described by Fuller and Battese (1974), Wansbeek and Kapteyn (1989), Wallace and Hussain (1969), and Nerlove (1971). In the second stage, the variance components are used to standardize the data, and an OLS regression is performed on the standardized data.
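The following Python sketch illustrates both ideas for a one-way model with a single regressor. It is a textbook illustration under stated assumptions, not the PANEL procedure's implementation: the data are hypothetical, the panel is balanced, and the variance components are assumed values rather than estimates from any of the methods cited above. Subtracting the full cross-section mean (weight 1) gives the same slope as a regression with cross-section dummy variables; subtracting only a fraction `theta` of the mean is the usual standardization for a balanced one-way random-effects model.

```python
from math import sqrt
from statistics import mean

def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x with an intercept."""
    xb, yb = mean(x), mean(y)
    sxx = sum((a - xb) ** 2 for a in x)
    sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
    return sxy / sxx

def quasi_demean(panel, theta):
    """Subtract theta times each cross section's mean from its observations."""
    return [[v - theta * mean(row) for v in row] for row in panel]

def gls_weight(sigma_e2, sigma_u2, t):
    """Weight for a balanced one-way random-effects model with t periods."""
    return 1.0 - sqrt(sigma_e2 / (sigma_e2 + t * sigma_u2))

def flatten(panel):
    return [v for row in panel for v in row]

# Hypothetical data: 2 cross sections, 3 periods, true slope 2, no noise.
x = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
effects = [5.0, -3.0]           # assumed cross-sectional effects
beta = 2.0
y = [[effects[i] + beta * v for v in row] for i, row in enumerate(x)]

# Pooled OLS ignores the effects entirely.
pooled_slope = ols_slope(flatten(x), flatten(y))

# Fixed effects: theta = 1 removes the cross-section means (the dummy-variable fit).
fe_slope = ols_slope(flatten(quasi_demean(x, 1.0)), flatten(quasi_demean(y, 1.0)))

# Random effects, second stage only: the variance components are assumed here.
theta = gls_weight(sigma_e2=1.0, sigma_u2=1.0, t=3)
re_slope = ols_slope(flatten(quasi_demean(x, theta)), flatten(quasi_demean(y, theta)))
```

Because the assumed effects in this toy data set are correlated with the regressor, only the fixed-effects slope recovers the true value; the random-effects slope falls between the pooled and fixed-effects slopes.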
Random-effects models are more efficient than fixed-effects models, and they can estimate effects for variables that do not vary within cross sections. The cost of these added features is that random-effects models carry much more stringent assumptions than their fixed-effects counterparts. The PANEL procedure supports models that blend the desirable features of both random and fixed effects. These hybrid models are those of Hausman and Taylor (1981) and Amemiya and MaCurdy (1986).
Two types of models in the PANEL procedure accommodate an autoregressive structure: the Parks method estimates a first-order autoregressive model with contemporaneous correlation, and the dynamic panel estimator estimates an autoregressive model with a lagged dependent variable.
The Da Silva method estimates a mixed variance-component moving-average error process. The regression parameters are estimated by two-step generalized least squares (GLS).
The PANEL procedure enhances the features that were previously implemented in the TSCSREG procedure. The following list shows the most important additions.
You can fit models for dynamic panel data using the generalized method of moments (GMM).
The Hausman-Taylor and Amemiya-MaCurdy estimators offer a compromise between fixed-effects and random-effects estimation in models where some variables are correlated with individual effects.
The MODEL statement supports between and pooled estimation.
The variance components for random-effects models can be calculated for both balanced and unbalanced panels by using the methods described by Fuller and Battese (1974), Wansbeek and Kapteyn (1989), Wallace and Hussain (1969), and Nerlove (1971).
The CLASS statement enables you to include classification variables (and their interactions) directly in the analysis.
The TEST statement includes new options for Wald, Lagrange multiplier, and likelihood ratio tests.
The new RESTRICT statement specifies linear restrictions on the parameters.
The FLATDATA statement enables the data to be in a compressed (or wide) form.
Several methods are added that produce heteroscedasticity-consistent (HCCME) and heteroscedasticity- and autocorrelation-consistent (HAC) covariance matrices, because the presence of heteroscedasticity and autocorrelation can result in inefficient and biased estimates of the covariance matrix in an OLS framework.
Tests are added for poolability, panel stationarity, the existence of cross-sectional and time effects, autocorrelation, and cross-sectional dependence.
The LAG and related statements provide functionality for creating lagged variables from within the PANEL procedure. Using these statements is preferable to using the DATA step because creating lagged variables in a panel setting can prove difficult, often requiring multiple loops and careful consideration of missing values.
Working within the PANEL procedure makes the creation of lagged values easy. The LAG statement leaves the missing values that result from lagging as missing; the ZLAG, XLAG, SLAG, and CLAG statements replace them with zeros, the overall mean, the time mean, or the cross section mean, respectively.
The OUTPUT statement enables you to output data and estimates that can be used in other analyses.
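To show why lagging in a panel needs care, the following Python sketch lags a variable by one period within each cross section, so that the first period of one cross section never inherits the last value of another. It is an illustration only and does not reproduce the procedure's statements: the function name, the `fill` keywords, and the data are hypothetical, the periods are assumed to be consecutive integers, and the replacement means are computed from the unlagged variable, which is an assumption about what each mean refers to.

```python
from statistics import mean

def panel_lag(rows, fill=None):
    """Lag `value` by one period within each cross section.

    rows: iterable of (cs_id, period, value) tuples with integer periods.
    fill: None (leave missing), "zero", "overall", "time", or "cross".
    Returns a dict that maps (cs_id, period) to the lagged value.
    """
    rows = list(rows)
    lookup = {(cs, t): v for cs, t, v in rows}
    overall = mean(v for _, _, v in rows)
    time_mean = {t: mean(v for _, p, v in rows if p == t) for _, t, _ in rows}
    cross_mean = {cs: mean(v for c, _, v in rows if c == cs) for cs, _, _ in rows}

    lagged = {}
    for cs, t, _ in rows:
        prev = lookup.get((cs, t - 1))  # looks only inside the same cross section
        if prev is None:
            replacements = {
                None: None,
                "zero": 0.0,
                "overall": overall,
                "time": time_mean[t],
                "cross": cross_mean[cs],
            }
            prev = replacements[fill]
        lagged[(cs, t)] = prev
    return lagged

# Hypothetical panel: two cross sections observed in periods 1 and 2.
data = [("A", 1, 10.0), ("A", 2, 12.0), ("B", 1, 20.0), ("B", 2, 26.0)]

plain = panel_lag(data)                  # first periods stay missing (None)
zero_filled = panel_lag(data, "zero")    # first periods become 0.0
cross_filled = panel_lag(data, "cross")  # first periods become the cross-section mean
```

In this sketch the lag for cross section B in period 1 is missing rather than 12.0, which is the mistake a naive shift over the stacked data would make.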