The EXPAND Procedure

Example 15.1 Combining Monthly and Quarterly Data

This example combines monthly and quarterly data sets by interpolating monthly values for the quarterly series. The series are extracted from two small sample data sets stored in the SASHELP library. These data sets were contributed by Citicorp Data Base services and contain selected U.S. macro economic series.

The quarterly series gross domestic product (GDP) and implicit price deflator (GD) are extracted from SASHELP.CITIQTR. The monthly series industrial production index (IP) and unemployment rate (LHUR) are extracted from SASHELP.CITIMON. Only observations for the years 1990 and 1991 are selected. PROC EXPAND is then used to interpolate monthly estimates for the quarterly series, and the interpolated series are merged with the monthly data.

The following statements extract and print the quarterly data, shown in Output 15.1.1.

data qtrly;
   set sashelp.citiqtr;
   where date >= '1jan1990'd &
         date <  '1jan1992'd ;
   keep date gdp gd;
run;

title "Quarterly Data";
proc print data=qtrly;
run;

Output 15.1.1: Quarterly Data Set

Quarterly Data

Obs DATE GD GDP
1 1990:1 111.100 5422.40
2 1990:2 112.300 5504.70
3 1990:3 113.600 5570.50
4 1990:4 114.500 5557.50
5 1991:1 115.900 5589.00
6 1991:2 116.800 5652.60
7 1991:3 117.400 5709.20
8 1991:4 . 5736.60


The following statements extract and print the monthly data, shown in Output 15.1.2.

data monthly;
   set sashelp.citimon;
   where date >= '1jan1990'd &
         date <  '1jan1992'd ;
   keep date ip lhur;
run;

title "Monthly Data";
proc print data=monthly;
run;

Output 15.1.2: Monthly Data Set

Monthly Data

Obs DATE IP LHUR
1 JAN1990 107.500 5.30000
2 FEB1990 108.500 5.30000
3 MAR1990 108.900 5.20000
4 APR1990 108.800 5.40000
5 MAY1990 109.400 5.30000
6 JUN1990 110.100 5.20000
7 JUL1990 110.400 5.40000
8 AUG1990 110.500 5.60000
9 SEP1990 110.600 5.70000
10 OCT1990 109.900 5.80000
11 NOV1990 108.300 6.00000
12 DEC1990 107.200 6.10000
13 JAN1991 106.600 6.20000
14 FEB1991 105.700 6.50000
15 MAR1991 105.000 6.70000
16 APR1991 105.500 6.60000
17 MAY1991 106.400 6.80000
18 JUN1991 107.300 6.90000
19 JUL1991 108.100 6.80000
20 AUG1991 108.000 6.80000
21 SEP1991 108.400 6.80000
22 OCT1991 108.200 6.90000
23 NOV1991 108.000 6.90000
24 DEC1991 107.800 7.10000


The following statements interpolate monthly estimates for the quarterly series and merge the interpolated series with the monthly data. The resulting combined data set is then printed, as shown in Output 15.1.3.

proc expand data=qtrly out=temp from=qtr to=month;
   convert gdp gd / observed=average;
   id date;
run;

data combined;
   merge monthly temp;
   by date;
run;

title "Combined Data Set";
proc print data=combined;
run;

Output 15.1.3: Combined Data Set

Combined Data Set

Obs DATE IP LHUR GDP GD
1 JAN1990 107.500 5.30000 5409.69 110.879
2 FEB1990 108.500 5.30000 5417.67 111.048
3 MAR1990 108.900 5.20000 5439.39 111.367
4 APR1990 108.800 5.40000 5470.58 111.802
5 MAY1990 109.400 5.30000 5505.35 112.297
6 JUN1990 110.100 5.20000 5538.14 112.801
7 JUL1990 110.400 5.40000 5563.38 113.264
8 AUG1990 110.500 5.60000 5575.69 113.641
9 SEP1990 110.600 5.70000 5572.49 113.905
10 OCT1990 109.900 5.80000 5561.64 114.139
11 NOV1990 108.300 6.00000 5553.83 114.451
12 DEC1990 107.200 6.10000 5556.92 114.909
13 JAN1991 106.600 6.20000 5570.06 115.452
14 FEB1991 105.700 6.50000 5588.18 115.937
15 MAR1991 105.000 6.70000 5608.68 116.314
16 APR1991 105.500 6.60000 5630.81 116.600
17 MAY1991 106.400 6.80000 5652.92 116.812
18 JUN1991 107.300 6.90000 5674.06 116.988
19 JUL1991 108.100 6.80000 5693.43 117.164
20 AUG1991 108.000 6.80000 5710.54 117.380
21 SEP1991 108.400 6.80000 5724.11 117.665
22 OCT1991 108.200 6.90000 5733.65 .
23 NOV1991 108.000 6.90000 5738.46 .
24 DEC1991 107.800 7.10000 5737.75 .