
Starting with Raw Data: The Basics

Mixing Styles of Input


An Example of Mixed Input

When you begin an INPUT statement in a particular style (list, column, or formatted), you are not restricted to using that style alone. You can mix input styles in a single INPUT statement as long as you mix them in a way that appropriately describes the raw data records. For example, this DATA step uses all three input styles:

data club1;
   input IdNumber                 /* [1] list input      */
         Name $18.                /* [2] formatted input */
         Team $ 25-30             /* [3] column input    */
         StartWeight EndWeight;   /* [1] list input      */
   datalines;
1023 David Shaw         red    189 165
1049 Amelia Serrano     yellow 145 124
1219 Alan Nance         red    210 192
1246 Ravi Sinha         yellow 194 177
1078 Ashley McKnight    red    127 118
1221 Jim Brown          yellow 220   .
;

proc print data=club1;
   title 'Weight Club Members';
run;

The following list corresponds to the numbered items in the preceding program:

[1] The variables IdNumber, StartWeight, and EndWeight are read with list input.

[2] The variable Name is read with formatted input.

[3] The variable Team is read with column input.

The following output demonstrates that the data is read correctly.

Data Set Created with Mixed Styles of Input

                              Weight Club Members                              1

                  Id                                    Start      End
         Obs    Number    Name               Team      Weight    Weight

          1      1023     David Shaw         red         189       165 
          2      1049     Amelia Serrano     yellow      145       124 
          3      1219     Alan Nance         red         210       192 
          4      1246     Ravi Sinha         yellow      194       177 
          5      1078     Ashley McKnight    red         127       118 
          6      1221     Jim Brown          yellow      220         . 

Understanding the Effect of Input Style on Pointer Location


Why You Can Get into Trouble by Mixing Input Styles

CAUTION:
When you mix styles of input in a single INPUT statement, you can get unexpected results if you do not understand where the input pointer is positioned after SAS reads a value in the input buffer.

As the INPUT statement reads data values from the record in the input buffer, it uses a pointer to keep track of its position. Read the following sections so that you understand how the pointer movement differs between input styles before mixing multiple input styles in a single INPUT statement.


Pointer Location with Column and Formatted Input

With column and formatted input, you supply the instructions that determine the exact pointer location. With column input, SAS reads the columns that you specify in the INPUT statement. With formatted input, SAS reads the exact length that you specify with the informat. In both cases, the pointer moves as far as you instruct it and stops. The pointer is left in the column that immediately follows the last column that is read.

Here are two examples of input followed by an explanation of the pointer location. The first DATA step shows column input:

data scores;
   input Team $ 1-6 Score 12-13;
   datalines;
red        59
blue       95
yellow     63
green      76
;

The second DATA step uses the same data to show formatted input:

data scores;
   input Team $6. +5 Score 2.;
   datalines;
red        59    
blue       95    
yellow     63    
green      76    
;

The following figure shows that the pointer is located in column 7 after the first value is read with either of the two previous INPUT statements.

Pointer Position: Column and Formatted Input

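One way to check the pointer location yourself is the COLUMN= option of the INFILE statement, which assigns the current pointer column to a variable. The following is a minimal sketch, not part of the original example: the data set name pointer_check and the variable names col and AfterTeam are hypothetical. If the pointer behaves as described, AfterTeam should be 7 for every record.

```sas
/* Sketch: record where the pointer sits after Team is read.      */
/* pointer_check, col, and AfterTeam are hypothetical names.      */
data pointer_check;
   infile datalines column=col;   /* col holds the current pointer column */
   input Team $ 1-6 @;            /* trailing @ holds the record for the next INPUT */
   AfterTeam = col;               /* copy the value; col itself is not written to the data set */
   input Score 12-13;
   put Team= AfterTeam= Score=;   /* write the values to the SAS log */
   datalines;
red        59
blue       95
yellow     63
green      76
;
```

Replacing `Team $ 1-6` with `Team $6.` in this sketch should give the same AfterTeam value, because both styles stop in the column that follows the last column read.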

Unlike list input, column and formatted input rely totally on your instructions to move the pointer and read the value for the second variable, Score. Column input uses column specifications to move the pointer to each data field. Formatted input uses informats and pointer controls to control the position of the pointer.

This INPUT statement uses column input with the column specification 12-13 to move the pointer to column 12 and read the value for the variable Score:

input Team $ 1-6 Score 12-13;

This INPUT statement uses formatted input with the +5 column-pointer control to move the pointer to column 12. Then the value for the variable Score is read with the 2. numeric informat.

input Team $6. +5 Score 2.;

The pointer control moves the pointer to the column where the value begins. Without it, this INPUT statement would attempt to read the value for Score in columns 7 and 8, which are blank, and Score would be set to missing.
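The following sketch illustrates the point under stated assumptions. The first DATA step omits the pointer control, so Score should be missing in every observation. The second uses the @n column-pointer control, which moves the pointer to an absolute column, as an alternative to +5. The data set names scores_bad and scores_abs are hypothetical.

```sas
/* Sketch: the +5 pointer control is omitted.                     */
/* scores_bad is a hypothetical data set name.                    */
data scores_bad;
   input Team $6. Score 2.;       /* Score is read from columns 7-8, which are blank */
   datalines;
red        59
blue       95
yellow     63
green      76
;

/* Sketch: an absolute pointer control instead of a relative one. */
/* scores_abs is a hypothetical data set name.                    */
data scores_abs;
   input Team $6. @12 Score 2.;   /* @12 moves the pointer to column 12 */
   datalines;
red        59
blue       95
yellow     63
green      76
;
```

The +5 control depends on where the preceding informat leaves the pointer, whereas @12 does not. Either one works here because the Score field always begins in column 12.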


Pointer Location with List Input

List input, on the other hand, uses a scanning method to determine the pointer location. With list input, SAS reads a value until it reaches a blank, and the pointer stops in the column that follows that blank. To read the next variable value, the pointer moves automatically to the first nonblank column, discarding any leading blanks it encounters. Here is the same data read with list input:

data scores;
   input Team $ Score;
   datalines;
red        59
blue       95
yellow     63
green      76
;

The following figure shows that the pointer is located in column 5 after the value red is read. Because Score, the next variable, is read with list input, the pointer scans for the next nonblank column before it begins to read a value for Score. Unlike column and formatted input, list input does not require you to explicitly move the pointer to the beginning of the next field.

Pointer Position: List Input

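The difference in pointer movement matters when list input is followed by formatted input. In the sketch below, which is an illustration and not part of the original example, the first DATA step reads Team with list input and Score with the 2. informat. Formatted input does not scan, so it begins reading wherever list input left the pointer (column 5 for the first record), and Score should be missing for each record. The second DATA step adds the colon (:) format modifier, which makes SAS scan to the next nonblank column before applying the informat. The data set names mixed_bad and mixed_ok are hypothetical.

```sas
/* Sketch: list input followed by formatted input, no pointer control. */
/* mixed_bad is a hypothetical data set name.                          */
data mixed_bad;
   input Team $ Score 2.;     /* Score 2. starts where list input left the pointer */
   datalines;
red        59
blue       95
yellow     63
green      76
;

/* Sketch: modified list input with the colon modifier.                */
/* mixed_ok is a hypothetical data set name.                           */
data mixed_ok;
   input Team $ Score :2.;    /* colon: scan to the next nonblank column, then apply 2. */
   datalines;
red        59
blue       95
yellow     63
green      76
;
```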
