Structure of a SAS Data Set That Contains Time Series Data
SAS requires time series data to be in a specific form that the SAS System recognizes. This form is a two-dimensional array, called a SAS data set, whose columns correspond to series variables and whose rows correspond to measurements of these variables at certain points in time. The time at which observations are recorded can be included in the data set as a time ID variable. Note that CRSP sets the date to the end of a time period rather than the beginning, and SASECRSP follows this convention. For example, the time ID variable for any particular month in a monthly time series is the last trading day of that month.
The SASECRSP engine provides several different time ID variables, depending on the data member opened. For most members, a time ID variable called CALDT is provided. CALDT provides a day-based calendar date in CRSP date format, which means the date is stored as an offset into an array of trading days, known as a trading day calendar. There are five different CRSP trading day calendars, and the one used depends on the frequency of the data member. For example, the CRSP date for a daily time series refers to the daily trading day calendar.
The five trading day calendars are annual, quarterly, monthly, weekly, and daily. For your convenience, the format and informat for this field are set so that the CRSP date is automatically converted to an integer date representation when viewed or printed. For data programming, the SASECRSP engine provides 23 different user functions for date conversions between CRSP, SAS, and integer dates.
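The idea of a CRSP date as an offset into a trading day calendar can be illustrated with a minimal Python sketch. This is a conceptual model only, not the SASECRSP conversion functions: the three-entry `MONTHLY_CALENDAR`, the function names, and the assumption that offsets start at 1 are all hypothetical and chosen for illustration.

```python
from datetime import date

# Hypothetical monthly trading day calendar. Each entry is the last trading
# day of a month, following the end-of-period convention described above.
# A real CRSP calendar is far longer; these three dates are illustrative.
MONTHLY_CALENDAR = [
    date(2002, 4, 30),
    date(2002, 5, 31),
    date(2002, 6, 28),
]


def crsp_to_integer_date(crsp_date, calendar):
    """Map an offset into the calendar to a YYYYMMDD integer date.

    Assumes offsets are 1-based, which is an assumption of this sketch.
    """
    day = calendar[crsp_date - 1]
    return day.year * 10000 + day.month * 100 + day.day


def integer_to_crsp_date(int_date, calendar):
    """Map a YYYYMMDD integer date back to its 1-based calendar offset."""
    target = date(int_date // 10000, int_date // 100 % 100, int_date % 100)
    return calendar.index(target) + 1


print(crsp_to_integer_date(3, MONTHLY_CALENDAR))
print(integer_to_crsp_date(20020531, MONTHLY_CALENDAR))
```

Because the offset is meaningful only relative to one calendar, the same number refers to different dates in the daily, weekly, monthly, quarterly, and annual calendars, which is why the conversion needs to know the frequency of the data member.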
The CCM database contains members whose dates are based on the fiscal calendar of the corresponding company, so a comprehensive set of time ID variables is provided. CRSPDT, RCALDT, and FISCALDT provide day-based dates, each with its own format.
CRSPDT provides a date in CRSP date format, similar to CALDT. CRSPDT differs only in that its format and informat are not set for automatic conversion to integer dates, because FISCALDT and RCALDT already provide integer dates. For fiscal members, CRSPDT is based on the fiscal calendar of the company.
FISCALDT provides the same date as CRSPDT, but in integer format. It is the result of performing a CRSP-to-integer date conversion on CRSPDT. Because the date that CRSPDT holds is fiscal, FISCALDT is also fiscal.
RCALDT is also an integer date, like FISCALDT, but it has been shifted so that the date is on calendar time rather than fiscal time.
For example, Microsoft’s fiscal year ends in June, so if you look at its annual period descriptor for the 2002 fiscal year, its time ID variables are 78 for CRSPDT, 20021231 for FISCALDT, and 20020628 for RCALDT. In summary, three time ID variables are provided for fiscal time series members. One is in CRSP date format, and the other two are in integer format. The only difference between the two integer dates is that FISCALDT is based on the fiscal calendar of the company, whereas RCALDT is based on calendar time.
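The relationship between the fiscal and calendar dates in the Microsoft example can be sketched in Python as a simple month shift. This is only an illustration of the idea, not the algorithm that SASECRSP uses: the function name `fiscal_to_calendar_month` is hypothetical, and the sketch assumes that a fiscal year is labeled by the calendar year in which it ends, which is consistent with the example but might not hold for every company.

```python
def fiscal_to_calendar_month(fiscal_year, fiscal_month, fiscal_year_end_month):
    """Shift a fiscal (year, month) onto calendar time.

    Assumption of this sketch: fiscal month 12 of a fiscal year corresponds
    to the fiscal year-end month in the same-numbered calendar year.
    """
    shift = 12 - fiscal_year_end_month          # months to move back
    months = fiscal_year * 12 + (fiscal_month - 1) - shift
    return months // 12, months % 12 + 1


# Microsoft, 2002 fiscal year, fiscal year ending in June:
# FISCALDT 20021231 is fiscal December 2002.
year, month = fiscal_to_calendar_month(2002, 12, 6)
print(year, month)
```

The sketch recovers only the calendar year and month. The day component of RCALDT (28 in the example) comes from the trading day calendar, because the date is set to the last trading day of the period.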
For more information about how CALDT, CRSPDT, and date conversions are handled, see the section Understanding CRSP Date Formats, Informats, and Functions.
The CCM database also contains fiscal array members, which are all the segment data members. Unlike the fiscal time series members, they are not associated with a calendar, and their time ID variables are embedded in the data as data fields. Generally, both fiscal and calendar time ID variables are embedded. However, the segment members segsrc, segcur, and segitm have only a fiscal time ID variable embedded. For your convenience, SASECRSP calculates and provides CALYR, the calendar version of the embedded fiscal time ID variable, for these three segment members. Note that because of limitations of the data, all segment member time ID variables are year-based.