Some data files group time series data with respect to cross-section identifiers. For example, International Financial Statistics files, distributed by the IMF, group data by country (COUNTRY). Within each country, data are further grouped by Control Source Code (CSC), Partner Country Code (PARTNER), and Version Code (VERSION).
If a data file contains cross-section identifiers, the DATASOURCE procedure adds them to the output data set as BY variables. For example, the data set in Table 13.2 contains three cross sections:
Cross-section one is identified by (COUNTRY='112' CSC='F' PARTNER=' ' VERSION='Z').
Cross-section two is identified by (COUNTRY='146' CSC='F' PARTNER=' ' VERSION='Z').
Cross-section three is identified by (COUNTRY='158' CSC='F' PARTNER=' ' VERSION='Z').
Table 13.2: The Form of a SAS Data Set Containing BY Variables
| BY Variables | | | | Time ID Variable | Time Series Variables | |
|---|---|---|---|---|---|---|
| COUNTRY | CSC | PARTNER | VERSION | DATE | EFFEXR | EXRINDEX |
| 112 | F | | Z | SEP1987 | 9326 | 12685 |
| 112 | F | | Z | OCT1987 | 9393 | 12813 |
| 112 | F | | Z | NOV1987 | 9626 | 13694 |
| 112 | F | | Z | DEC1987 | 9675 | 14099 |
| 112 | F | | Z | JAN1988 | 9581 | 13910 |
| 112 | F | | Z | FEB1988 | 9493 | 13549 |
| 146 | F | | Z | SEP1987 | 12046 | 16192 |
| 146 | F | | Z | OCT1987 | 12067 | 16266 |
| 146 | F | | Z | NOV1987 | 12558 | 17596 |
| 146 | F | | Z | DEC1987 | 12759 | 18301 |
| 146 | F | | Z | JAN1988 | 12642 | 18082 |
| 146 | F | | Z | FEB1988 | 12409 | 17470 |
| 158 | F | | Z | SEP1987 | 13841 | 16558 |
| 158 | F | | Z | OCT1987 | 13754 | 16499 |
| 158 | F | | Z | NOV1987 | 14222 | 17505 |
| 158 | F | | Z | DEC1987 | 14768 | 18423 |
| 158 | F | | Z | JAN1988 | 14933 | 18565 |
| 158 | F | | Z | FEB1988 | 14915 | 18331 |
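As an illustration only, and not part of the DATASOURCE procedure, the way the four BY variables identify cross sections can be sketched in Python. The sketch copies the first two observations of each country from Table 13.2; the names `ROWS`, `by_key`, and `cross_sections` are hypothetical and chosen for this example.

```python
from itertools import groupby

# Column order follows Table 13.2; only two months per country are copied here.
COLUMNS = ("COUNTRY", "CSC", "PARTNER", "VERSION", "DATE", "EFFEXR", "EXRINDEX")
BY_VARS = ("COUNTRY", "CSC", "PARTNER", "VERSION")
ROWS = [
    ("112", "F", " ", "Z", "SEP1987", 9326, 12685),
    ("112", "F", " ", "Z", "OCT1987", 9393, 12813),
    ("146", "F", " ", "Z", "SEP1987", 12046, 16192),
    ("146", "F", " ", "Z", "OCT1987", 12067, 16266),
    ("158", "F", " ", "Z", "SEP1987", 13841, 16558),
    ("158", "F", " ", "Z", "OCT1987", 13754, 16499),
]


def by_key(row):
    """Return the BY-variable values that identify the row's cross section."""
    return tuple(row[COLUMNS.index(name)] for name in BY_VARS)


def cross_sections(rows):
    """Map each BY-variable combination to its list of observations.

    groupby merges only adjacent rows, so the rows must already be grouped
    by the BY variables, as they are in Table 13.2.
    """
    return {key: list(group) for key, group in groupby(rows, key=by_key)}


sections = cross_sections(ROWS)
for key, observations in sections.items():
    print(dict(zip(BY_VARS, key)), "->", len(observations), "observations")
```

Each distinct combination of COUNTRY, CSC, PARTNER, and VERSION values yields one cross section, so the sample rows produce three groups.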
The data sets in Table 13.1 and Table 13.2 represent time series data for the same three countries in two different ways: the United Kingdom (COUNTRY='112'), Switzerland (COUNTRY='146'), and Japan (COUNTRY='158'). The first representation (Table 13.1) incorporates each country’s name into the series names, while the second representation (Table 13.2) represents countries as different cross sections by using the BY variable named COUNTRY. For more information, see "Time Series and SAS Data Sets" in Chapter 4: Working with Time Series Data.
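The relationship between the two representations can be sketched in Python, outside SAS. The sketch below is a minimal illustration, not the procedure's own method: it takes a few observations from Table 13.2 in cross-sectional form and folds the COUNTRY value into the series names, giving one record per date. The function `to_wide` and the suffixed names such as `EFFEXR_112` are assumptions made for this example. Table 13.1 uses country names rather than codes in its series names.

```python
# Cross-sectional (BY variable) form: one record per COUNTRY and DATE.
LONG = [
    {"COUNTRY": "112", "DATE": "SEP1987", "EFFEXR": 9326, "EXRINDEX": 12685},
    {"COUNTRY": "112", "DATE": "OCT1987", "EFFEXR": 9393, "EXRINDEX": 12813},
    {"COUNTRY": "146", "DATE": "SEP1987", "EFFEXR": 12046, "EXRINDEX": 16192},
    {"COUNTRY": "146", "DATE": "OCT1987", "EFFEXR": 12067, "EXRINDEX": 16266},
    {"COUNTRY": "158", "DATE": "SEP1987", "EFFEXR": 13841, "EXRINDEX": 16558},
    {"COUNTRY": "158", "DATE": "OCT1987", "EFFEXR": 13754, "EXRINDEX": 16499},
]


def to_wide(rows, by="COUNTRY", time_id="DATE", series=("EFFEXR", "EXRINDEX")):
    """Fold the BY value into each series name, giving one record per time ID.

    The "<series>_<by value>" naming is a hypothetical convention for this sketch.
    """
    wide = {}
    for row in rows:
        # Create the record for this date on first sight, then extend it.
        record = wide.setdefault(row[time_id], {time_id: row[time_id]})
        for name in series:
            record[f"{name}_{row[by]}"] = row[name]
    return list(wide.values())


wide = to_wide(LONG)
for record in wide:
    print(record)
```

In the wide form each date appears once and carries six series (two per country), whereas the cross-sectional form keeps two series and repeats each date once per country.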