Starting with Raw Data: The Basics

Reading Unaligned Data


Understanding List Input

The simplest form of the INPUT statement uses list input. List input reads data values that are separated by a delimiter character (by default, a blank space). SAS reads a data value until it encounters a blank space or the end of the input record. At that point, SAS assumes that the value has ended and assigns it to the appropriate variable in the program data vector. SAS then continues to scan the record until it reaches the next nonblank character, which begins the next data value.
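As a minimal sketch of this scanning behavior, the following DATA step reads two records whose values are separated by different numbers of blanks. The data set name spacing and the choice of records are illustrative only. Because list input skips over consecutive blanks while it scans for the next value, both records should yield the same three variables.

```sas
/* Illustrative only: the data set name "spacing" is hypothetical. */
data spacing;
   /* List input: variables are listed in field order; $ marks character variables. */
   input IdNumber Name $ Team $;
   datalines;
1023 David red
1049    Amelia      yellow
;

proc print data=spacing;
run;
```

The extra blanks in the second record are treated as part of the gap between values, not as additional fields.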


Program: Basic List Input

This program uses the health and fitness club data from Introduction to DATA Step Processing to illustrate a DATA step that uses list input in an INPUT statement.

data club1;
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023 David red 189 165
1049 Amelia yellow 145 124
1219 Alan red 210 192
1246 Ravi yellow 194 177
1078 Ashley red 127 118
1221 Jim yellow 220 .
;

proc print data=club1;
   title 'Weight of Club Members';
run;

The following notes describe the key parts of the preceding program:

[1] The DATALINES statement marks the beginning of the data lines. The semicolon that follows the data lines marks the end of the data lines and the end of the DATA step.

[2] Each data value in the raw data record is separated from the next by at least one blank space. The last record contains a missing value, represented by a period, for the value of EndWeight.

[3] The variable names in the INPUT statement are specified in exactly the same order as the fields in the raw data records.

The output that follows shows the resulting data set. The PROC PRINT statement that follows the DATA step produces this listing.

Data Set Created with List Input

                             Weight of Club Members                            1

                      Id                           Start      End
             Obs    Number    Name      Team      Weight    Weight

              1      1023     David     red         189       165 
              2      1049     Amelia    yellow      145       124 
              3      1219     Alan      red         210       192 
              4      1246     Ravi      yellow      194       177 
              5      1078     Ashley    red         127       118 
              6      1221     Jim       yellow      220         . 

Program: When the Data Is Delimited by Characters, Not Blanks

This program also uses the health and fitness club data, but here the data values are delimited by commas instead of blank spaces, the default delimiter.

options pagesize=60 linesize=80 pageno=1 nodate;
data club1;
   infile datalines dlm=',';
   input IdNumber Name $ Team $ StartWeight EndWeight;
   datalines;
1023,David,red,189,165
1049,Amelia,yellow,145,124
1219,Alan,red,210,192
1246,Ravi,yellow,194,177
1078,Ashley,red,127,118
1221,Jim,yellow,220,.
;

proc print data=club1;
   title 'Weight of Club Members';
run;

The following notes describe the key parts of the preceding program:

[1] The data values in the data lines are separated by commas instead of blanks.

[2] List input, by default, scans the input records, looking for blank spaces to delimit each data value. The DLM= option enables list input to recognize a character, here a comma, as the delimiter.

[3] This example requires the DLM= option, which is available only in the INFILE statement. Usually, the INFILE statement is used only when the input data resides in an external file. Specifying DATALINES in the INFILE statement, however, enables you to take advantage of INFILE statement options when you read data records from the job stream.

Reading Data Delimited by Commas

                             Weight of Club Members                            1

                      Id                           Start      End
             Obs    Number    Name      Team      Weight    Weight

              1      1023     David     red         189       165 
              2      1049     Amelia    yellow      145       124 
              3      1219     Alan      red         210       192 
              4      1246     Ravi      yellow      194       177 
              5      1078     Ashley    red         127       118 
              6      1221     Jim       yellow      220         . 
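For comparison, here is a sketch of the more common case, in which the comma-delimited records reside in an external file rather than in the job stream. The file path and the data set name club_ext are hypothetical placeholders; substitute the location of your own file. The sketch assumes that the file holds the same five fields in the same order.

```sas
/* Hypothetical path and data set name: replace them with your own. */
data club_ext;
   /* DLM=',' tells list input to treat a comma as the delimiter. */
   infile 'your-directory/club.csv' dlm=',';
   input IdNumber Name $ Team $ StartWeight EndWeight;
run;

proc print data=club_ext;
   title 'Weight of Club Members';
run;
```

No DATALINES statement is needed here because the INFILE statement points to the external file.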

List Input: Points to Remember

The points to remember when you use list input are:

Note: List input requires the fewest specifications in the INPUT statement. However, the restrictions that are placed on the data may require that you learn to use other styles of input to read your data. For example, column input, which is discussed in the next section, is less restrictive. This section has introduced only simple list input. See Understanding How to Make List Input More Flexible to learn about modified list input.
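One restriction that often surfaces in practice is that a character variable read with simple list input has a default length of eight bytes, so longer values are truncated. The following sketch shows one common workaround, a LENGTH statement placed before the INPUT statement. The data set name club_names and the sample records are invented for illustration and are not part of the club data.

```sas
/* Illustrative only: data set name and records are hypothetical. */
data club_names;
   /* Declare Name as 12 bytes before INPUT; without this statement,   */
   /* "Christopher" would be stored as "Christop" (default 8 bytes).   */
   /* Because Name is declared first, it is also the first variable    */
   /* in the resulting data set.                                       */
   length Name $ 12;
   input IdNumber Name $ Team $;
   datalines;
1301 Christopher red
1302 Alexandria yellow
;

proc print data=club_names;
run;
```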
