The EXPAND procedure converts time series from one sampling interval or frequency to another and interpolates missing values in time series. Using PROC EXPAND, you can collapse time series data from higher frequency intervals to lower frequency intervals, or you can expand data from lower frequency intervals to higher frequency intervals. You can also interpolate missing values in time series, either without changing series frequency or in conjunction with expanding or collapsing series. You can also convert aperiodic series, observed at arbitrary points in time, into periodic estimates.
By default, the EXPAND procedure fits cubic spline curves to the nonmissing values of variables to form continuous-time approximations of the input series. Output series are then generated from the spline approximations.
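As a minimal sketch of overriding that default, the step below interpolates missing values without changing the series frequency (FROM= is given and TO= is omitted) and requests METHOD=JOIN, which connects successive nonmissing points with straight lines instead of fitting a cubic spline. The output data set name FILLED and the choice of the FM1 variable from SASHELP.CITIMON are assumptions made for illustration only.

```sas
/* Sketch: fill missing values at the original monthly frequency.  */
/* FILLED is a hypothetical output data set name.                  */
proc expand data=sashelp.citimon out=filled from=month;
   id date;
   /* METHOD=JOIN requests linear interpolation between points;    */
   /* omit the option to get the default cubic spline.             */
   convert fm1 / method=join;
run;
```

Whether linear or spline interpolation is preferable depends on how smooth the underlying series is expected to be.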
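This example illustrates two applications of frequency conversion for time series data. The first application converts quarterly and weekly series to a common monthly frequency. The following DATA steps create the three input data sets MONTHLY, QUARTER, and WEEKLY from the SASHELP library.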
data monthly;
set sashelp.citimon;
keep date fm1;
run;
data quarter;
set sashelp.citiqtr;
keep date gdp;
run;
data weekly;
set sashelp.citiwk;
keep date wspglt;
run;
The following statements convert the three data sets QUARTER, MONTHLY, and WEEKLY created above to a common frequency. The data sets QUARTER and WEEKLY are converted to monthly frequency using two PROC EXPAND steps. The OUT= option creates an output data set, and the FROM= and TO= options specify the input and output intervals. The ID statement specifies a SAS date or datetime variable that identifies the time of each input observation. The variables to be converted are listed in the CONVERT statement. The observation characteristics of the series are specified with the OBSERVED= option in the CONVERT statement. When OBSERVED=TOTAL or AVERAGE, as in this example, the interpolating curve is fitted to the data values so that the area under the curve within each input interval equals the value of the series. The WSPGLT=INTEREST specification in the CONVERT statement of the second step renames the variable WSPGLT to INTEREST.
proc expand data=quarter out=temp1 from=qtr to=month;
id date;
convert gdp / observed = total;
run;
proc expand data=weekly out=temp2 from=week to=month;
id date;
convert wspglt = interest / observed = average;
run;
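One way to see the effect of OBSERVED=TOTAL is to add the monthly GDP estimates back up by quarter. The sketch below does this with PROC SQL. The names CHECK, QTR_START, and GDP_SUM are hypothetical, and the step assumes that DATE in TEMP1 is a SAS date value.

```sas
/* Sketch: re-aggregate the monthly GDP estimates to quarters.     */
/* CHECK, QTR_START, and GDP_SUM are hypothetical names.           */
proc sql;
   create table check as
   select intnx('qtr', date, 0) as qtr_start format=yyq6.,
          sum(gdp) as gdp_sum
   from temp1
   group by calculated qtr_start;
quit;

/* List the first few quarterly sums for comparison with QUARTER.  */
proc print data=check(obs=8);
run;
```

If the conversion behaves as described, each GDP_SUM should be close to the corresponding quarterly GDP value in QUARTER, apart from rounding and any incomplete quarters at the ends of the series.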
The three data sets are then merged using a DATA step MERGE statement to produce the data set COMBINED.
data combined;
merge monthly temp1 temp2;
by date;
if interest=. then delete;
run;
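A quick check of the merged result can be sketched with PROC MEANS, which reports the number of nonmissing and missing values for each of the three series. This is an illustrative addition that assumes the variable names created in the steps above.

```sas
/* Sketch: count nonmissing and missing values after the merge.    */
proc means data=combined n nmiss min max;
   var fm1 gdp interest;
run;
```

Because observations with a missing INTEREST value are deleted in the DATA step, NMISS for INTEREST should be zero. Missing values in FM1 or GDP would indicate dates that the three series do not all cover.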
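The second application converts an aperiodic series, observed at irregular dates, into periodic estimates. The following DATA step creates the data set SAMPLES, which contains defect rates measured on irregularly spaced dates.
data samples;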
input date : date. defects @@;
label defects = "Defects per 1000 units";
format date date.;
datalines;
13jan92 55 27jan92 73 19feb92 84 8mar92 69
27mar92 66 5apr92 77 29apr92 63 11may92 81
25may92 89 7jun92 94 23jun92 105 11jul92 97
15aug92 112 29aug92 89 10sep92 77 27sep92 82
;
To compute the monthly estimates, use PROC EXPAND with the TO=MONTH option and specify OBSERVED=(BEGINNING,AVERAGE).
proc expand data=samples out=monthly to=month;
id date;
convert defects / observed=(beginning,average);
run;
title "Estimated Monthly Average Defect Rates";
proc print data=monthly;
run;
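The PROC PRINT step lists the estimated monthly average defect rates. The following statements interpolate the series to daily values, merge the interpolated values with the original samples, and plot the interpolated curve with the observed points overlaid.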
proc expand data=samples out=daily to=day;
id date;
convert defects = interpol;
run;
data daily;
merge daily samples;
by date;
run;
proc gplot data=daily;
plot interpol*date defects*date / vaxis=axis2 overlay cframe=ligr;
title1 "Plot of Interpolated Defect Rate Curve";
axis2 label=(angle=90);
symbol1 c=blue interpol=join value=none;
symbol2 c=red interpol=none value=star;
run;
quit;
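Where SAS/GRAPH is not available, a similar overlay can be sketched with PROC SGPLOT. This is an alternative to the PROC GPLOT step above, not part of the original example. It assumes the merged DAILY data set and repeats the Y axis label from the SAMPLES data step.

```sas
/* Sketch: overlay the interpolated curve and the observed samples. */
proc sgplot data=daily;
   /* Interpolated daily values drawn as a joined line.             */
   series  x=date y=interpol / lineattrs=(color=blue);
   /* Original irregular samples drawn as star markers.             */
   scatter x=date y=defects  / markerattrs=(color=red symbol=star);
   title "Plot of Interpolated Defect Rate Curve";
   yaxis label="Defects per 1000 units";
run;
```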
SAS Institute Inc. (1993), SAS/ETS User's Guide, Version 6, Second Edition, Cary, NC: SAS Institute Inc.