The DATASOURCE Procedure

Example 12.4 DRI/McGraw-Hill Format CITIBASE Files

This example illustrates how to extract weekly series from a sample CITIBASE file. It also demonstrates how the OUTSELECT= option affects the contents of the auxiliary data sets.

The weekly series contained in the sample data file CITIDEMO are listed by the following statements:

options yearcutoff=1920;

filename datafile "%sysget(DATASRC_DATA)citidem.dat" RECFM=D LRECL=80;

proc datasource filetype=citibase interval=week
                outall=citiall outby=citikey;
run;

title1 'Summary Information on Weekly Data for CITIDEMO File';
proc print data=citikey;
run;

title1 'Weekly Series Available in CITIDEMO File';
proc print data=citiall( drop=label );
run;

Output 12.4.1: Listing of the OUTBY= CITIKEY Data Set

Summary Information on Weekly Data for CITIDEMO File

Obs ST_DATE END_DATE NTIME NOBS NSERIES NSELECT
1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 6 6



Output 12.4.2: Listing of the OUTALL= CITIALL Data Set

Weekly Series Available in CITIDEMO File

Obs NAME SELECTED TYPE LENGTH VARNUM BLKNUM FORMAT FORMATL FORMATD ST_DATE END_DATE NTIME NOBS CODE ATTRIBUT NDEC
1 FF142B 1 1 5 . 36   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 FF142B 1 2
2 WSPCA 1 1 5 . 37   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 WSPCA 1 2
3 WSPUA 1 1 5 . 38   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 WSPUA 1 2
4 WSPIA 1 1 5 . 39   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 WSPIA 1 2
5 WSPGLT 1 1 5 . 40   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 WSPGLT 1 2
6 FCPOIL 1 1 5 . 41   0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 FCPOIL 1 4



Note the following from Output 12.4.2:

  • The OUTALL= data set reports the time ranges of variables.

  • There are six observations in the OUTALL= data set, the same number as reported by the NSERIES and NSELECT variables in the OUTBY= data set.

  • The VARNUM variable contains only missing values, since no OUT= data set is created.
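
The points above can be checked directly against the auxiliary data sets. The following is a minimal sketch, assuming the CITIALL and CITIKEY data sets created by the preceding PROC DATASOURCE step are still in the WORK library; the column aliases N_OUTALL and N_MISSING_VARNUM are illustrative names and are not produced by the procedure.

```sas
proc sql;
   /* Number of series listed in OUTALL= and how many of them */
   /* have a missing VARNUM (no OUT= data set was requested). */
   select count(*)      as n_outall,
          nmiss(varnum) as n_missing_varnum
      from citiall;

   /* Counts reported by the OUTBY= data set, for comparison. */
   select nseries, nselect
      from citikey;
quit;
```

If the notes hold, N_OUTALL and N_MISSING_VARNUM should both equal the NSERIES value.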

Output 12.4.3 and Output 12.4.4 demonstrate how the OUTSELECT= option affects the contents of the OUTBY= and OUTALL= data sets when a KEEP statement is present. First, set the OUTSELECT= option to OFF.

filename citidemo "%sysget(DATASRC_DATA)citidem.dat" RECFM=D LRECL=80;

proc datasource filetype=citibase infile=citidemo interval=week
                outall=alloff outby=keyoff outselect=off;
   keep WSP:;
run;

title1 'Summary Information on Weekly Data for CITIDEMO File';
proc print data=keyoff;
run;

title1 'Weekly Series Available in CITIDEMO File';
proc print data=alloff( keep=name kept selected st_date
                             end_date ntime nobs );
run;

Output 12.4.3: Listing of the OUTBY= Data Set with OUTSELECT=OFF

Summary Information on Weekly Data for CITIDEMO File

Obs ST_DATE END_DATE NTIME NOBS NSERIES NSELECT
1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 6 4



Output 12.4.4: Listing of the OUTALL= Data Set with OUTSELECT=OFF

Weekly Series Available in CITIDEMO File

Obs NAME KEPT SELECTED ST_DATE END_DATE NTIME NOBS
1 FF142B 0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
2 WSPCA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
3 WSPUA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
4 WSPIA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
5 WSPGLT 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
6 FCPOIL 0 0 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271



Setting the OUTSELECT= option to ON gives the results shown in Output 12.4.5 and Output 12.4.6.

filename citidemo "%sysget(DATASRC_DATA)citidem.dat" RECFM=D LRECL=80;
proc datasource filetype=citibase infile=citidemo
                interval=week
                outall=allon outby=keyon outselect=on;
   keep WSP:;
run;

title1 'Summary Information on Weekly Data for CITIDEMO File';
proc print data=keyon;
run;

title1 'Weekly Series Available in CITIDEMO File';
proc print data=allon( keep=name kept selected st_date
                            end_date ntime nobs );
run;

Output 12.4.5: Listing of the OUTBY= Data Set with OUTSELECT=ON

Summary Information on Weekly Data for CITIDEMO File

Obs ST_DATE END_DATE NTIME NOBS NSERIES NSELECT
1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 6 4



Output 12.4.6: Listing of the OUTALL= Data Set with OUTSELECT=ON

Weekly Series Available in CITIDEMO File

Obs NAME KEPT SELECTED ST_DATE END_DATE NTIME NOBS
1 WSPCA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
2 WSPUA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
3 WSPIA 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271
4 WSPGLT 1 1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271



Comparison of Output 12.4.3 through Output 12.4.6 reveals the following:

  • The OUTALL= data set contains six (NSERIES) observations when OUTSELECT=OFF, and four (NSELECT) observations when OUTSELECT=ON.

  • The observations in OUTALL=ALLON are those for which SELECTED=1 in OUTALL=ALLOFF.

  • The time ranges in the OUTBY= data set are computed over all the variables (selected or not) for OUTSELECT=OFF, but only computed over the selected variables for OUTSELECT=ON. This corresponds to computing time ranges over all the series reported in the OUTALL= data set.

  • The variable NTIME is the number of time periods between ST_DATE and END_DATE, while NOBS is the number of observations the OUT= data set is to contain. NTIME can therefore differ depending on whether the OUTSELECT= option is set to ON or OFF, while NOBS stays the same. In this example all six series span the same time range, so NTIME is 271 in both cases.
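
One way to confirm the second point is to compare the two OUTALL= data sets directly. The following is a sketch, assuming the ALLOFF and ALLON data sets from the two preceding steps exist; it restricts ALLOFF to the selected series with a WHERE= data set option and compares only the variables printed above.

```sas
/* Compare the selected rows of ALLOFF with ALLON, observation  */
/* by observation, on the variables shown in the listings.      */
proc compare base=alloff( where=(selected=1) )
             compare=allon;
   var name kept selected st_date end_date ntime nobs;
run;
```

If the observations in ALLON are exactly those with SELECTED=1 in ALLOFF, PROC COMPARE should report no unequal values for these variables.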

The KEEP statement in the last two examples illustrates the use of an additional variable, KEPT, in the OUTALL= data sets of Output 12.4.4 and Output 12.4.6. KEPT, which reports the outcome of the KEEP statement, is added to the OUTALL= data set only when there is a KEEP statement.

Adding a RANGE statement to the last example, and requesting an OUT= data set in place of the OUTALL= data set, generates the data sets in Output 12.4.7 and Output 12.4.8:

filename citidemo "%sysget(DATASRC_DATA)citidem.dat" RECFM=D LRECL=80;
proc datasource filetype=citibase infile=citidemo interval=week
                outby=keyrange out=citiout outselect=on;
   keep WSP:;
   range from '01dec1990'd;
run;

title1 'Summary Information on Weekly Data for CITIDEMO File';
proc print data=keyrange;
run;

title1 'Weekly Data in CITIDEMO File';
proc print data=citiout;
run;

Output 12.4.7: Listing of the OUTBY=KEYRANGE Data Set for FILETYPE=CITIBASE

Summary Information on Weekly Data for CITIDEMO File

Obs ST_DATE END_DATE NTIME NOBS NINRANGE NSERIES NSELECT
1 Sun, 29 Dec 1985 Sun, 3 Mar 1991 271 271 15 6 4



Output 12.4.8: Printout of the OUT=CITIOUT Data Set for FILETYPE=CITIBASE

Weekly Data in CITIDEMO File

Obs DATE WSPCA WSPUA WSPIA WSPGLT
1 Sun, 25 Nov 1990 9.77000 9.66000 9.87000 8.62000
2 Sun, 2 Dec 1990 9.75000 9.64000 9.85000 8.47000
3 Sun, 9 Dec 1990 9.59000 9.48000 9.69000 8.22000
4 Sun, 16 Dec 1990 9.62000 9.51000 9.72000 8.35000
5 Sun, 23 Dec 1990 9.70000 9.60000 9.80000 8.48000
6 Sun, 30 Dec 1990 9.64000 9.53000 9.75000 8.31000
7 Sun, 6 Jan 1991 9.70000 9.59000 9.81000 8.62000
8 Sun, 13 Jan 1991 9.80000 9.70000 9.89000 8.58000
9 Sun, 20 Jan 1991 9.66000 9.57000 9.75000 8.36000
10 Sun, 27 Jan 1991 9.65000 9.56000 9.74000 8.38000
11 Sun, 3 Feb 1991 9.52000 9.43000 9.61000 8.16000
12 Sun, 10 Feb 1991 9.38000 9.29000 9.48000 8.14000
13 Sun, 17 Feb 1991 9.38000 9.29000 9.48000 8.21000
14 Sun, 24 Feb 1991 9.61000 9.53000 9.68000 8.50000
15 Sun, 3 Mar 1991 9.61000 9.53000 9.68000 8.50000



The OUTBY= data set in this last example contains an additional variable, NINRANGE. This variable is added because there is a RANGE statement. Its value, 15, is the number of observations in the OUT= data set. In this case, NOBS gives the number of observations the OUT= data set would contain if there were no RANGE statement.
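
The relationship between the date range and the NTIME and NINRANGE counts can be recomputed with the INTCK function. This is a minimal sketch, assuming the KEYRANGE data set from the preceding step exists and that ST_DATE and END_DATE hold SAS date values; WEEKS_TOTAL and WEEKS_INRANGE are hypothetical names introduced only for this check.

```sas
data _null_;
   set keyrange;

   /* Weekly periods from ST_DATE through END_DATE, inclusive. */
   weeks_total = intck('week', st_date, end_date) + 1;

   /* Weekly periods from the RANGE start date through END_DATE, */
   /* counting the week that contains 01DEC1990.                 */
   weeks_inrange = intck('week', '01dec1990'd, end_date) + 1;

   /* Write the reported and recomputed counts to the SAS log. */
   put ntime= weeks_total= ninrange= weeks_inrange=;
run;
```

The recomputed values should agree with NTIME and NINRANGE in Output 12.4.7, since the first observation in Output 12.4.8 is the week that contains the RANGE start date.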