The DATASOURCE Procedure

Example 12.8 Annual COMPUSTAT Data Files, V9.2 New Filetype CSAUC3

Annual COMPUSTAT data in Universal Character format are read for the price items from the year 2002 onward, so that the output shows PRICE (HIGH), PRICE (LOW), and PRICE (CLOSE) for each company.

filename datafile "%sysget(DATASRC_DATA)csaucy3.dat" RECFM=F LRECL=13612;
/*--------------------------------------------------------------*
 * create OUT=csauy3 data set with ASCII 2003 Industrial Data   *
 * compare it with the OUT=csauc data set created by DATA STEP  *
 *--------------------------------------------------------------*/

proc datasource filetype=csaucy3 ascii
                infile=datafile
                interval=year
                outselect=on
                outkey=y3key
                out=csauy3;

     keep data197-data199 label;
     range from 2002;
run;

proc sort
   data=csauy3 out=csauy3;
   by dnum cnum cic file zlist smbl xrel stk;
run;

title1 'Price, High, Low and Close for Range from 2002';
proc contents data=csauy3;
run;

proc print data=csauy3;
run;

Output 12.8.1 shows information on the contents of the CSAUY3 data set while Output 12.8.2 shows a listing of the CSAUY3 data set.

Output 12.8.1: Listing of the CONTENTS of OUT=CSAUY3 Data Set

Price, High, Low and Close for Range from 2002

The CONTENTS Procedure

Alphabetic List of Variables and Attributes
 #  Variable  Type  Len  Format  Label
 3  CIC       Char    3
 2  CNUM      Char    6
11  COUNTY    Num     5
13  CPSPIN    Char    1
15  CSSPII    Char    1
14  CSSPIN    Char    2
18  DATA197   Num     5          Price - Fiscal Year - High ($&c,NA)
19  DATA198   Num     5          Price - Fiscal Year - Low ($&c,NA)
20  DATA199   Num     5          Price - Close - Fiscal Year-End ($&c,NA)
17  DATE      Num     4  YEAR4.  Date of Observation
 1  DNUM      Num     5
 9  DUPFILE   Num     5
16  EIN       Char   10
 4  FILE      Num     5
12  FINC      Num     5
 6  SMBL      Char    8
10  STATE     Num     5
 8  STK       Num     5
 7  XREL      Num     5
 5  ZLIST     Num     5



Output 12.8.2: Listing of the OUT=CSAUY3 Data Set

Price, High, Low and Close for Range from 2002

Obs  DNUM  CNUM    CIC  FILE  ZLIST  SMBL  XREL  STK  DUPFILE  STATE  COUNTY  FINC  CPSPIN  CSSPIN  CSSPII  EIN         DATE  DATA197  DATA198  DATA199
  1  3089  899896  104    11      1  TUP    444    0        0     12      95     0  1       10              36-4062333  2002   24.990  14.4000  15.0800
  2  3089  899896  104    11      1  TUP    444    0        0     12      95     0  1       10              36-4062333  2003        .        .        .
  3  3674  032654  105    11      1  ADI    928    0        0     25      21     0  1       10              04-2348234  2002   48.840  17.8800  26.8000
  4  3674  032654  105    11      1  ADI    928    0        0     25      21     0  1       10              04-2348234  2003        .        .        .
  5  3842  053801  106     1      5  AVR      0    0        0     25      21     0                          06-1174053  2002    1.500   0.2200   0.2300
  6  3842  053801  106     1      5  AVR      0    0        0     25      21     0                          06-1174053  2003        .        .        .
  7  6035  149547  101     3     25  CAVB     0    0        0     47     149     0                          62-1721072  2002   14.000  11.5810  13.3400
  8  6035  149547  101     3     25  CAVB     0    0        0     47     149     0                          62-1721072  2003        .        .        .
  9  6211  617446  448    11      1  MWD    725    0        0     36      61     0  1       10      1       36-3145972  2002   60.020  28.8010  45.2400
 10  6211  617446  448    11      1  MWD    725    0        0     36      61     0  1       10      1       36-3145972  2003        .        .        .
 11  6726  09247M  105     1      4  BMN      0    0        0     34      13     0                                      2002   11.050  10.3700  11.0100
 12  6726  09247M  105     1      4  BMN      0    0        0     34      13     0                                      2003        .        .        .
 13  7011  54021P  205     1      5  LGN      0    0        0     13     121     0                          52-2093696  2002   13.894   1.0084  13.8940
 14  7011  54021P  205     1      5  LGN      0    0        0     13     121     0                          52-2093696  2003        .        .        .
 15  7370  35921T  108     1      5  FNT      0    0        0     36      87     0                          13-3950283  2002    0.440   0.1200   0.2600
 16  7370  35921T  108     1      5  FNT      0    0        0     36      87     0                          13-3950283  2003        .        .        .
 17  7370  459200  101    11      1  IBM    903    0        0     36     119     0  1       10      1       13-0871985  2002  126.390  54.0100  77.5000
 18  7370  459200  101    11      1  IBM    903    0        0     36     119     0  1       10      1       13-0871985  2003        .        .        .
 19  7812  591610  100     1      4  MGM      0    0        0      6      37     0                          95-4605850  2002   23.250   9.0000  13.0000
 20  7812  591610  100     1      4  MGM      0    0        0      6      37     0                          95-4605850  2003        .        .        .
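
The listing above keeps the generic COMPUSTAT item names DATA197-DATA199, and every 2003 observation in it has missing values for all three price items. If a more readable table is wanted, a follow-up DATA step along the following lines might help. This is only a minimal sketch and is not part of the original example: the data set name PRICES and the variable names HIGH, LOW, and CLOSE are hypothetical, chosen here for illustration.

```sas
/* Hypothetical follow-up step: PRICES, HIGH, LOW, and CLOSE are */
/* illustrative names that do not appear in the original example */
data prices;
   /* give the three price items descriptive names; labels carry over */
   set csauy3(rename=(data197=high data198=low data199=close));
   /* keep an observation only if at least one price is nonmissing */
   if nmiss(high, low, close) < 3;
run;

proc print data=prices;
   var dnum cnum smbl date high low close;
run;
```

The RENAME= data set option changes only the variable names, so the labels shown in Output 12.8.1 still identify each item. The subsetting IF statement uses the NMISS function to drop observations in which all three prices are missing.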



Note that annual COMPUSTAT data are available in either IBM 360/370 General format or Universal Character format. The first example expects an IBM 360/370 General format file because the FILETYPE= option is set to CSAIBM, whereas the second example uses a Universal Character format file (FILETYPE=CSAUC).