The EXPAND Procedure

Specifying Observation Characteristics

It is important to distinguish between variables that are measured at points in time and variables that represent totals or averages over an interval. Point-in-time values are often called stocks or levels. Variables that represent totals or averages over an interval are often called flows or rates.

For example, the annual series U.S. Gross Domestic Product represents the total value of production over the year and, equivalently, the yearly average rate of production in dollars per year. By contrast, a monthly inventory variable might represent the cost of a stock of goods as of the end of the month.

When the data represent periodic totals or averages, interpolation to a higher frequency is sometimes called distribution: the total values of the larger intervals are said to be distributed to the smaller intervals. Conversely, converting periodic total or average values to lower-frequency estimates is sometimes called aggregation.
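As a minimal sketch of aggregation, the following statements convert a monthly total series to quarterly totals. The data set names MSALES and QSALES and the variables DATE and SALES are hypothetical, chosen only for illustration. The OBSERVED=TOTAL option used here is described later in this section.

```sas
/* Hypothetical input: MSALES has a SAS date variable DATE and a */
/* monthly total SALES. All names here are illustrative only.    */
proc expand data=msales out=qsales
            from=month to=qtr;
   id date;
   /* SALES holds period totals, so the quarterly output */
   /* values are totals over each quarter.               */
   convert sales / observed=total;
run;
```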

By default, PROC EXPAND assumes that all time series represent beginning-of-period point-in-time values. If a series does not measure beginning-of-period point-in-time values, interpolating the data under this assumption is not appropriate, and you should specify the correct observation characteristics of the series. You specify them with the OBSERVED= option in the CONVERT statement.

For example, suppose that the data set ANNUAL contains variables A, B, and C that measure yearly totals, while the variables X, Y, and Z measure first-of-year values. The following statements estimate the contribution of each month to the annual totals in A, B, and C, and interpolate first-of-month estimates of X, Y, and Z.

   proc expand data=annual out=monthly 
               from=year to=month;
      id date;
      convert x y z;
      convert a b c / observed=total;
   run;
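To try the statements above, a small ANNUAL data set is needed. The following DATA step is only a sketch: every value in it is invented for illustration, and it assumes that DATE is a SAS date value marking the first day of each year.

```sas
/* Illustrative data only: the numbers below are invented. */
data annual;
   input year a b c x y z;
   /* DATE holds January 1 of each year, consistent with FROM=YEAR */
   date = mdy(1, 1, year);
   format date year4.;
   drop year;
   datalines;
2001 120 36 12 10.0 2.5 100
2002 132 40 15 11.2 2.7 104
2003 126 38 14 12.1 2.6 109
2004 144 45 18 12.9 3.0 115
2005 150 48 17 14.0 3.1 118
2006 162 50 20 15.3 3.4 124
;
run;
```

With this input, the twelve monthly values of A within a year should sum to that year's value of A, because OBSERVED=TOTAL preserves the period totals.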

The EXPAND procedure supports five observation characteristics. The corresponding values of the OBSERVED= option are as follows:

BEGINNING    beginning-of-period values

MIDDLE       period midpoint values

END          end-of-period values

TOTAL        period totals

AVERAGE      period averages

The interpolation of each series is adjusted appropriately for its observation characteristics. When OBSERVED=TOTAL or AVERAGE is specified, the interpolating curve is fit to the data values so that the area under the curve within each input interval equals the value of the series. For OBSERVED=MIDDLE or END, the curve is fit through the data points, with the time position of each data value placed at the specified offset from the start of the interval.
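As a sketch of how different observation characteristics can be mixed in one step, the following statements interpolate a quarterly data set to monthly values. The data set names QTRDATA and MONDATA and the variables DATE, INVENTORY, and RATE are hypothetical. The example assumes that INVENTORY is recorded at the end of each quarter and that RATE is a quarterly average.

```sas
/* Hypothetical input: QTRDATA has a SAS date variable DATE plus */
/* INVENTORY and RATE. All names here are illustrative only.     */
proc expand data=qtrdata out=mondata
            from=qtr to=month;
   id date;
   /* Point-in-time values positioned at the end of each quarter */
   convert inventory / observed=end;
   /* Interval averages: the curve is fit to match each average */
   convert rate / observed=average;
run;
```

Using a separate CONVERT statement for each group of variables keeps each OBSERVED= setting next to the series it describes.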

See the section "OBSERVED= Option" for details.
