The SSM Procedure

Example 27.11 Panel Data: Dynamic Panel Model for the Cigar Data

This example shows how you can use the SSM procedure to specify and fit the so-called dynamic panel model, which is commonly used to analyze a panel of time series. Suppose that a panel of time series $y_{t, i}$ follows the model

$y_{t, i} = \rho y_{(t-1),i} + \mu _{i} + \beta X_{t,i} + \zeta _{t} + \epsilon _{t,i}$

where $t$ denotes the time index (for example, $t = 1, \ldots , T$ ); $i$ denotes the panel index (for example, $i = 1, \ldots , P$ ); $\rho$ is the autoregression coefficient; $\mu _{i}$ are the panel-specific intercepts; $X_{t,i}$ are observations on a regression variable with regression coefficient $\beta$ (the same for all panels); $\zeta _{t}$ are unobserved, random time effects; and $\epsilon _{t,i}$ are the observation errors. The sequences $\zeta _{t}$ and $\epsilon _{t,i}$ are assumed to be independent, zero-mean Gaussian variables with variances $\sigma _{1}^{2}$ and $\sigma _{0}^{2}$ , respectively. This is an example of a dynamic panel model that contains one regressor variable. The model equation can be formulated as a state equation with state $\pmb {\alpha }_{t}$ of size $P$ , the number of panels. Taking $\pmb {\alpha }_{t}[i] = y_{t,i}$ , the states $\pmb {\alpha }_{t}$ evolve according to the equation

$\pmb {\alpha }_{t+1} = \mb{T} \pmb {\alpha }_{t} + \mb{W}_{t+1} \pmb {\beta } + \pmb {\eta }_{t+1}$

where $\mb{T} = \rho I_{P}$ (a P-dimensional, diagonal matrix with all its diagonal elements equal to $\rho$ ); $\mb{W}_{t} = (\mb{X}_{t} \; \; I_{P})$ is a $P \times (1+P)$ -dimensional matrix (in a block form) of state regression variables, where the first block is a column that contains all the values $X_{t,i}$ that are associated with a given time index $t$ and the second block is a P-dimensional identity matrix; $\pmb {\beta } = (\beta , \mu _{1}, \ldots , \mu _{P})^{'}$ is the $(1+P)$ -dimensional column vector of regression coefficients; and $\pmb {\eta }_{t} = (\zeta _{t}+\epsilon _{t,1}, \ldots , \; \zeta _{t}+\epsilon _{t,P})^{'}$ is a P-dimensional column vector of all the disturbances that are associated with time index $t$ . Because $\zeta _{t}$ and $\epsilon _{t,i}$ are independent, the covariance matrix of $\pmb {\eta }_{t}$ , denoted $\mb{Q}_{t}$ , is easy to calculate: $\mb{Q}_{t}[i, i] = \sigma _{0}^{2} + \sigma _{1}^{2}$ and, for $i \neq j$ , $\mb{Q}_{t}[i, j] = \sigma _{1}^{2}$ . This formulation extends easily to multiple regression variables, say $r$ of them, by modifying the term that is associated with the state regression variables, $\mb{W}_{t} \pmb {\beta }$ : the new $\mb{W}_{t}$ matrix becomes $P \times (r+P)$ -dimensional and the new regression vector $\pmb {\beta }$ becomes $(r+P)$ -dimensional.
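As a rough check of this formulation, the following Python sketch builds $\mb{T}$ , $\mb{W}_{t}$ , $\pmb {\beta }$ , and $\mb{Q}_{t}$ for a toy panel and confirms that one step of the stacked state equation agrees with the scalar model equation applied panel by panel. All numeric values and helper names (`matvec`, `step_matrix`, `step_scalar`) are made up for illustration; they are not taken from the Cigar data or from the SSM procedure.

```python
# Minimal sketch with assumed toy values: P = 3 panels, one regressor.
P = 3
rho, beta = 0.8, -0.25               # assumed autoregression / regression coefficients
mu = [0.4, 0.5, 0.6]                 # assumed panel-specific intercepts
sigma0_sq, sigma1_sq = 0.04, 0.01    # assumed Var(epsilon) and Var(zeta)

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# T = rho * I_P
T = [[rho * e for e in row] for row in identity(P)]

def W(x_t):
    """W_t = (X_t  I_P): the first column holds X_{t,i}, the rest is the identity."""
    eye = identity(P)
    return [[x_t[i]] + eye[i] for i in range(P)]

beta_vec = [beta] + mu               # (beta, mu_1, ..., mu_P)'

def step_matrix(alpha, x_next, zeta, eps):
    """alpha_{t+1} = T alpha_t + W_{t+1} beta + eta_{t+1}."""
    eta = [zeta + e for e in eps]
    Ta = matvec(T, alpha)
    Wb = matvec(W(x_next), beta_vec)
    return [Ta[i] + Wb[i] + eta[i] for i in range(P)]

def step_scalar(y_prev, x_next, zeta, eps):
    """y_{t,i} = rho y_{t-1,i} + mu_i + beta X_{t,i} + zeta_t + eps_{t,i}."""
    return [rho * y_prev[i] + mu[i] + beta * x_next[i] + zeta + eps[i]
            for i in range(P)]

# Q[i][i] = sigma0^2 + sigma1^2 and Q[i][j] = sigma1^2 for i != j
Q = [[sigma1_sq + (sigma0_sq if i == j else 0.0) for j in range(P)]
     for i in range(P)]

alpha0 = [4.5, 4.8, 4.6]
x1 = [3.3, 3.2, 3.3]
zeta1, eps1 = 0.02, [0.01, -0.03, 0.0]
alpha1_matrix = step_matrix(alpha0, x1, zeta1, eps1)
alpha1_scalar = step_scalar(alpha0, x1, zeta1, eps1)
```

The off-diagonal entries of `Q` are nonzero only because every panel shares the same time effect $\zeta _{t}$ ; setting `sigma1_sq` to zero would make the disturbances of different panels uncorrelated.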

The panel data set, Cigar, that is used in the section Getting Started: SSM Procedure is reused in this example. In order to use the SSM procedure to perform an analysis that is based on the dynamic panel model, the input data set must be reorganized so that it contains the variables that form the $P \times (r+P)$ -dimensional matrix $\mb{W}_{t}$ . For the Cigar data, the number of panels is $P = 46$ (the number of regions considered in the study), and the number of regression variables is $r = 3$ . Therefore, the input data set needs to be augmented by $46 \times (3+46) = 2{,}254$ variables that constitute the matrix $\mb{W}_{t} = (\mb{X}_{t} \; \; I_{46})$ , where the first $46 \times 3$ -dimensional block $\mb{X}_{t}$ contains the values of the three regression variables, lprice, lndi, and lpimin, at a given time index (a particular year in this case).

The following DATA steps accomplish this task in two steps. In the first step, the raw data that form the rows of the Cigar data set are read into a temporary data set, Tmp, such that all 6*46 = 276 values that are associated with a given year (the values of six variables, year, region, lsales, lprice, lndi, and lpimin, for 46 panels in a given year) are read in a single row that consists of 276 columns. In the second step, the final input data set is formed by rearranging Tmp so that it contains the necessary variables in the proper order: year (the time index), region (the panel index), lsales (the response variable), and the variables that form the $46 \times 49$ -dimensional $\mb{W}$ matrix (w1, . . ., w2254).

data Tmp;
    input u1-u276;
datalines;
63 1 4.54223 3.35341 7.3514 3.26194
63 2 4.82831 3.17388 7.5729 3.21487
63 3 4.63860 3.29584 7.3000 3.25037

   ... more lines ...

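data cigar(keep=year region lsales w1-w2254);
   array wmat{46, 49} w1-w2254;   /* W matrix, stored row by row */
   array ivar{46, 6} u1-u276;     /* row i: year, region, lsales, lprice, lndi, lpimin */
   set tmp;
   year = intnx( 'year', '1jan63'd, u1-63 );
   format year year.;
   /* Build W = (X  I) once per year; it is the same for all 46 regions */
   do j=1 to 46;
      do k=1 to 49;
         if k=1 then wmat[j,k] = ivar[j, 4];        /* lprice */
         else if k=2 then wmat[j,k] = ivar[j, 5];   /* lndi */
         else if k=3 then wmat[j,k] = ivar[j, 6];   /* lpimin */
         else wmat[j,k] = (k = j+3);                /* identity block */
      end;
   end;
   /* Write one observation per region for this year */
   do i=1 to 46;
      region = ivar[i, 2];
      lsales = ivar[i, 3];
      output;
   end;
run;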
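The layout that the second DATA step produces can be sketched in Python for a small case. The sketch relies on the fact that SAS fills a two-dimensional array row by row, so that `wmat[j,k]` corresponds to the variable numbered `(j-1)*49 + k`. The function names `build_w` and `w_index` and the toy regressor values are hypothetical and serve only as illustration.

```python
def build_w(x_rows):
    """Flatten W_t = (X_t  I_P) row by row.

    x_rows[j] holds the r regressor values of panel j (0-based) for one time index.
    """
    P = len(x_rows)
    flat = []
    for j in range(P):
        indicator = [1.0 if c == j else 0.0 for c in range(P)]
        flat.extend(list(x_rows[j]) + indicator)
    return flat

def w_index(j, k, r, P):
    """1-based position of wmat[j, k] in the list w1, w2, ... (row-major order)."""
    return (j - 1) * (r + P) + k

# Toy case: P = 2 panels and r = 2 regressors give 2 * (2 + 2) = 8 variables.
toy = build_w([[3.35, 7.35], [3.17, 7.57]])

# Cigar case: P = 46 panels and r = 3 regressors.
n_cigar = 46 * (3 + 46)
last = w_index(46, 49, r=3, P=46)
```

For the Cigar dimensions, `w_index(1, 4, 3, 46)` is 4 and `w_index(2, 5, 3, 46)` is 54: these are the positions of the first two diagonal entries of the identity block.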

The following statements specify and fit the dynamic panel model:

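 proc ssm data=Cigar opt(tech=dbldog maxiter=75);
     id year interval=year;
     /* rho: autoregression coefficient */
     parms rho / lower=-0.9999 upper=0.9999;
     /* sigma0, sigma1: variances of the observation errors and of the time effects */
     parms sigma0 sigma1 / lower=1.e-8;
     /* RegionArray[i] is 1 if the observation belongs to region i, 0 otherwise */
     array RegionArray{46} region1-region46;
     do i=1 to 46;
        RegionArray[i] = (region=i);
     end;
     /* Disturbance covariance: sigma0 + sigma1 on the diagonal, sigma1 elsewhere */
     array cov{46,46};
     do i=1 to 46;
         do j=1 to 46;
            if(i=j) then cov[i,j] = sigma0 + sigma1;
            else cov[i,j] = sigma1;
         end;
     end;
     state panelState(46) T(I)=(rho) W(g)=(w1-w2254)
       cov(g)=(cov) a1(46) checkbreak;
     /* The response for a region is the matching element of the state */
     comp dynPanel = (RegionArray)*panelState;
     model lsales = dynPanel;
     output out=for1 press;
 run;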
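The role of RegionArray in the COMP statement can be illustrated outside of SAS. In the following Python sketch, the inner product of a 0/1 indicator vector with the state picks out the state element of the observed region, which is how each lsales value is tied to $\pmb {\alpha }_{t}[i]$ . The helper names and the three-element state are assumptions made for the illustration; they are not part of the SSM procedure.

```python
def region_indicator(region, P):
    """Row vector with a 1 in position `region` (1-based) and 0 elsewhere."""
    return [1.0 if i == region else 0.0 for i in range(1, P + 1)]

def observe(region, state):
    """Inner product of the indicator with the state: returns state[region - 1]."""
    z = region_indicator(region, len(state))
    return sum(zi * si for zi, si in zip(z, state))

state = [4.54, 4.83, 4.64]          # illustrative state vector for P = 3 regions
picked = observe(2, state)          # selects the element for region 2
```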

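The estimates of the regression coefficients and the regional intercepts, all of which are statistically significant, are shown in Output 27.11.1. Elements 1 through 3 of the regression vector are the coefficients of lprice, lndi, and lpimin, and elements 4 through 49 are the 46 regional intercepts. In particular, the estimated coefficients of lprice, lndi, and lpimin are -0.26, 0.13, and 0.07, respectively.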

Output 27.11.1: Estimates of $\beta _1$ , $\beta _2$ , $\beta _3$ and the Regional Intercepts

Estimate of the State Equation Regression Vector
State	Element Index	Estimate	Standard Error	t Value	Pr > |t|
panelState	1	-0.2627	0.0178	-14.79	<.0001
panelState	2	0.1340	0.0130	10.30	<.0001
panelState	3	0.0748	0.0198	3.78	0.0002
panelState	4	0.4265	0.0581	7.35	<.0001
panelState	5	0.3825	0.0605	6.32	<.0001
panelState	6	0.4425	0.0582	7.61	<.0001
panelState	7	0.3471	0.0631	5.50	<.0001
panelState	8	0.3686	0.0635	5.81	<.0001
panelState	9	0.4357	0.0614	7.10	<.0001
panelState	10	0.3753	0.0655	5.73	<.0001
panelState	11	0.4249	0.0606	7.01	<.0001
panelState	12	0.4185	0.0604	6.92	<.0001
panelState	13	0.3824	0.0602	6.35	<.0001
panelState	14	0.3942	0.0644	6.12	<.0001
panelState	15	0.4154	0.0626	6.64	<.0001
panelState	16	0.3961	0.0610	6.49	<.0001
panelState	17	0.3765	0.0618	6.10	<.0001
panelState	18	0.4528	0.0608	7.44	<.0001
panelState	19	0.4316	0.0586	7.36	<.0001
panelState	20	0.4357	0.0601	7.25	<.0001
panelState	21	0.3771	0.0639	5.90	<.0001
panelState	22	0.3939	0.0629	6.26	<.0001
panelState	23	0.4122	0.0621	6.64	<.0001
panelState	24	0.3949	0.0605	6.52	<.0001
panelState	25	0.4386	0.0565	7.77	<.0001
panelState	26	0.4118	0.0627	6.57	<.0001
panelState	27	0.3898	0.0604	6.45	<.0001
panelState	28	0.3818	0.0613	6.23	<.0001
panelState	29	0.4343	0.0632	6.87	<.0001
panelState	30	0.4619	0.0625	7.39	<.0001
panelState	31	0.3730	0.0636	5.86	<.0001
panelState	32	0.3784	0.0589	6.43	<.0001
panelState	33	0.3825	0.0625	6.12	<.0001
panelState	34	0.3784	0.0598	6.32	<.0001
panelState	35	0.4093	0.0628	6.52	<.0001
panelState	36	0.4155	0.0597	6.96	<.0001
panelState	37	0.3960	0.0615	6.44	<.0001
panelState	38	0.4075	0.0602	6.77	<.0001
panelState	39	0.4045	0.0586	6.91	<.0001
panelState	40	0.3918	0.0599	6.55	<.0001
panelState	41	0.4350	0.0608	7.16	<.0001
panelState	42	0.4007	0.0602	6.65	<.0001
panelState	43	0.3196	0.0597	5.36	<.0001
panelState	44	0.4337	0.0609	7.12	<.0001
panelState	45	0.3790	0.0634	5.98	<.0001
panelState	46	0.3767	0.0618	6.10	<.0001
panelState	47	0.4392	0.0597	7.36	<.0001
panelState	48	0.3932	0.0603	6.51	<.0001
panelState	49	0.3938	0.0616	6.40	<.0001

Output 27.11.2 shows the estimates of the autoregression coefficient $\rho$ , the observation error variance $\sigma ^{2}_{0}$ , and the variance of the time effect (variance of $\zeta$ ) $\sigma ^{2}_{1}$ .

Output 27.11.2: Estimates of $\rho$ , $\sigma ^{2}_{0}$ , and $\sigma ^{2}_{1}$

Estimates of Named Parameters
Parameter	Estimate	Standard Error
rho	0.831679	0.0124338
sigma0	0.001231	0.0000491
sigma1	0.000213	0.0000662
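Because the disturbance covariance has $\sigma _{0}^{2} + \sigma _{1}^{2}$ on the diagonal and $\sigma _{1}^{2}$ elsewhere, the reported estimates imply a particular correlation between the disturbances of any two regions. The following Python sketch works through that arithmetic. It treats the reported sigma0 and sigma1 values as the variances $\sigma _{0}^{2}$ and $\sigma _{1}^{2}$ , in line with the way the cov array is filled in the PROC SSM statements; the variable names are illustrative.

```python
sigma0_sq = 0.001231   # reported estimate of sigma0 (observation error variance)
sigma1_sq = 0.000213   # reported estimate of sigma1 (time-effect variance)

q_diag = sigma0_sq + sigma1_sq     # Q[i, i]
q_offdiag = sigma1_sq              # Q[i, j] for i != j
corr = q_offdiag / q_diag          # implied correlation between two regions' disturbances

print(f"Q diagonal     = {q_diag:.6f}")
print(f"Q off-diagonal = {q_offdiag:.6f}")
print(f"correlation    = {corr:.4f}")
```

Under these assumptions the shared time effect accounts for roughly 15% of the disturbance variance in each region.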

Finally, you can compare the fit of the dynamic panel model with the fit of the model that is discussed in the section Getting Started: SSM Procedure. Output 27.11.3 shows the likelihood-based information criteria for the dynamic panel model, and Output 27.11.4 shows the same information for the other model.

Output 27.11.3: Likelihood-Based Information Criteria: Dynamic Panel Model

Information Criteria
Statistic	Diffuse Likelihood Based	Profile Likelihood Based
AIC (lower is better)	-4732.722	-4856.398
BIC (lower is better)	-4717.247	-4343.874
AICC (lower is better)	-4732.704	-4841.250
HQIC (lower is better)	-4726.913	-4664.667
CAIC (lower is better)	-4714.247	-4245.874

Output 27.11.4: Likelihood-Based Information Criteria: Getting Started Example

Information Criteria
Statistic	Diffuse Likelihood Based	Profile Likelihood Based
AIC (lower is better)	-4488.093	-4145.246
BIC (lower is better)	-4477.776	-3637.952
AICC (lower is better)	-4488.084	-4130.417
HQIC (lower is better)	-4484.220	-3955.472
CAIC (lower is better)	-4475.776	-3540.952
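The two tables can be cross-checked with a little arithmetic. The following Python sketch assumes the textbook definitions AIC $= -2\log L + 2k$ , BIC $= -2\log L + k \log n$ , and CAIC $= -2\log L + k(\log n + 1)$ , which are not stated in this example, and backs out the implied number of parameters $k$ and the effective sample size $n$ from the reported values. The function name `implied_k_and_n` is hypothetical.

```python
import math

# (AIC, BIC, CAIC) copied from Outputs 27.11.3 and 27.11.4
reported = {
    ("dynamic panel", "diffuse"): (-4732.722, -4717.247, -4714.247),
    ("dynamic panel", "profile"): (-4856.398, -4343.874, -4245.874),
    ("getting started", "diffuse"): (-4488.093, -4477.776, -4475.776),
    ("getting started", "profile"): (-4145.246, -3637.952, -3540.952),
}

def implied_k_and_n(aic, bic, caic):
    """Back out k and n, ASSUMING the textbook definitions named in the lead-in."""
    k = round(caic - bic)                  # CAIC - BIC = k
    n = math.exp((bic - aic) / k + 2.0)    # BIC - AIC = k * (log n - 2)
    return k, n

implied = {key: implied_k_and_n(*vals) for key, vals in reported.items()}
for (model, kind), (k, n) in implied.items():
    print(f"{model:16s} {kind:8s} k = {k:3d}   n ~ {n:7.1f}")
```

Under these assumptions the profile-likelihood columns imply $n$ close to 1380, the number of observations, and $k = 98$ for the dynamic panel model. That count is consistent with 3 named parameters, 49 regression coefficients, and 46 diffuse initial state elements, although the example does not spell this out.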

Similarly, Output 27.11.5 shows fit criteria based on the delete-one cross validation error for the dynamic panel model, and Output 27.11.6 shows the same information for the other model.

Output 27.11.5: Delete-One Cross Validation Criteria: Dynamic Panel Model

Delete-One Cross Validation Error Criteria
Variable	N	PRESS	Generalized Cross-Validation
lsales	1380	1.115309	5.62798E-7

Output 27.11.6: Delete-One Cross Validation Criteria: Getting Started Example

Delete-One Cross Validation Error Criteria
Variable	N	PRESS	Generalized Cross-Validation
lsales	1380	1.290420	6.18144E-7

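Both the likelihood-based information criteria and the delete-one cross validation criteria are lower for the dynamic panel model. On the basis of both of these considerations, the dynamic panel model appears to provide a better fit for the Cigar data than the model that is fit in the section Getting Started: SSM Procedure.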