INPUT Statement, List

Scans the input data record for input values and assigns them to the corresponding SAS variables.
Valid in: DATA step
Category: File-handling
Type: Executable

Syntax

INPUT <pointer-control> variable <$> <&> <@ | @@>;
INPUT <pointer-control> variable <: | & | ~> <informat.> <@ | @@>;

Arguments

pointer-control
moves the input pointer to a specified line or column in the input buffer.
variable
specifies a variable that is assigned input values.
$
indicates to store a variable value as a character value rather than as a numeric value.
Tip: If the variable is previously defined as character, $ is not required.
&
indicates that a character value can have one or more single embedded blanks. This format modifier reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first.
Restriction: The & modifier must follow the variable name and $ sign that it affects.
Tip: If you specify an informat after the & modifier, the terminating condition for the format modifier remains two blanks.
:
enables you to specify an informat that the INPUT statement uses to read the variable value. For a character variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column, the defined length of the variable, or the end of the data line, whichever comes first. For a numeric variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column or the end of the data line, whichever comes first.
Tips: If the length of the variable has not been previously defined, then its value is read and stored with the informat length.

The pointer continues to read until the next blank column is reached. However, if the field is longer than the formatted length, then the value is truncated to the length of the variable.

~
indicates to treat single quotation marks, double quotation marks, and delimiters in character values in a special way. This format modifier reads delimiters within quoted character values as characters instead of as delimiters and retains the quotation marks when the value is written to a variable.
Restriction: You must use the DSD option in an INFILE statement. Otherwise, the INPUT statement ignores this modifier.
informat.
specifies an informat to use to read the variable values.
Tip: Decimal points in the actual input values always override decimal specifications in a numeric informat.
See: SAS Informats in SAS Formats and Informats: Reference
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
@@
holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
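As a minimal sketch of the two line-hold specifiers (the data set names and data values here are made up for illustration), the first step uses a trailing @ to test a value before reading the rest of the record, and the second uses a double trailing @ to read several observations from each line:
data redteam;
   input team $ @;        /* read TEAM and hold the record */
   if team = 'red';       /* subsetting IF: other records are dropped here */
   input name $ score;    /* continue reading from the held record */
   datalines;
red Joe 11
blue Mitchel 13
red Susan 14
;
data quiz;
   input name $ score @@; /* hold the line across DATA step iterations */
   datalines;
Ann 91 Bob 84 Cy 77
Dee 88 Eli 95
;
Under these assumptions, REDTEAM should contain two observations and QUIZ should contain five.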

Details

When to Use List Input

List input requires that you specify the variable names in the INPUT statement in the same order that the fields appear in the input data records. SAS scans the data line to locate the next value but ignores additional intervening blanks. List input does not require that the data be located in specific columns. However, you must separate each value from the next by at least one blank unless the delimiter between values is changed. By default, the delimiter for data values is one blank space or the end of the input record. List input will not skip over any data values to read subsequent values, but it can ignore all values after a given point in the data record. However, pointer controls enable you to change the order in which the data values are read.
There are two types of list input:
  • simple list input
  • modified list input
Modified list input makes the INPUT statement more versatile because you can use a format modifier to overcome several of the restrictions of simple list input. See Modified List Input.

Simple List Input

Simple list input places several restrictions on the type of data that the INPUT statement can read:
  • By default, at least one blank must separate the input values. Use the DLM= or DLMSTR= option or the DSD option in the INFILE statement to specify a delimiter other than a blank.
  • Represent each missing value with a period, not a blank. If you use the DSD option in the INFILE statement, two adjacent delimiters also indicate a missing value.
  • Character input values cannot be longer than 8 bytes unless the variable is given a longer length in an earlier LENGTH, ATTRIB, or INFORMAT statement.
  • Character values cannot contain embedded blanks unless you change the delimiter.
  • Data must be in standard numeric or character format. (See footnote 1.)
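A short sketch of how two of these restrictions look in practice, using a hypothetical data set named roster with made-up values: a LENGTH statement allows a name longer than 8 bytes, and a period stands in for a missing score.
data roster;
   length name $ 12;      /* without this, NAME would be truncated to 8 bytes */
   input name $ score1 score2;
   datalines;
Christopher 11 32
Mitchel . 29
;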

Modified List Input

List input is more versatile when you use format modifiers. The format modifiers are as follows:
Format Modifier    Purpose
&                  reads character values that contain embedded blanks.
:                  reads data values that need the additional instructions that informats can provide but that are not aligned in columns. (See note 1.)
~                  reads delimiters within quoted character values as characters and retains the quotation marks.
Note 1: Use formatted input and pointer controls to quickly read data values that are aligned in columns.
For example, use the : modifier with an informat to read character values that are longer than 8 bytes or numeric values that are in nonstandard form, such as values that contain commas.
Because list input interprets a blank as a delimiter, use modified list input to read values that contain blanks. The & modifier reads character values that contain single embedded blanks. However, the data values must be separated by two or more blanks. To read values that contain leading, trailing, or embedded blanks with list input, use the DLM= or DLMSTR= option in the INFILE statement to specify another character as the delimiter. See Reading Delimited Data with Modified List Input. If your input data use blanks as delimiters and the values themselves contain leading, trailing, or embedded blanks, you might need to use either column input or formatted input. If quotation marks surround the delimited values, you can use list input with the DSD option in the INFILE statement.
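As an illustration of changing the delimiter, this sketch assumes a hypothetical data set named teams whose values are separated by a vertical bar, so that the blank inside each team name is read as part of the value:
data teams;
   length team $ 14;            /* long enough for the longest team name */
   infile datalines dlm='|';    /* the blank is no longer a delimiter */
   input name $ team $ score;
   datalines;
Joe|Red Racers|11
Susan|Green Gazelles|14
;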

Comparisons

How Modified List Input and Formatted Input Differ
Modified list input has a scanning feature that can use informats to read data that are not aligned in columns. Formatted input causes the pointer to move like that of column input to read a variable value. The pointer moves the length that is specified in the informat and stops at the next column.
This DATA step uses modified list input to read the first data value and formatted input to read the second:
data jansales;
   input item : $10. amount comma5.;
datalines;
trucks 1,382
vans 1,235
sedans 2,391
;
The value of ITEM is read with modified list input. The INPUT statement stops reading when the pointer finds a blank space. The pointer then moves to the second column after the end of the field, which is the correct position to read the AMOUNT value with formatted input.
Formatted input, on the other hand, continues to read the entire width of the field. This INPUT statement uses formatted input to read both data values:
input item $10. +1 amount comma5.;
To read this data correctly with formatted input, the second data value must occur after the 10th column of the first value, as shown here:
----+----1----+----2
trucks     1,382
vans       1,235
sedans     2,391
Also, after the value of ITEM is read with formatted input, you must use the pointer control +1 to move the pointer to the column where the value AMOUNT begins.
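For comparison, here is a sketch that reads both values with modified list input, so that neither field has to be aligned in columns. The data set name jansales2 is made up for this illustration:
data jansales2;
   /* the : modifier lets COMMA5. read AMOUNT up to the next blank */
   input item : $10. amount : comma5.;
   datalines;
trucks 1,382
vans   1,235
sedans 2,391
;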
When Data Contains Quotation Marks
When you use the DSD option in an INFILE statement, which sets the delimiter to a comma, the INPUT statement removes quotation marks before a value is written to a variable. When you also use the tilde (~) modifier in an INPUT statement, the INPUT statement maintains quotation marks as part of the value.
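A minimal sketch of the difference, assuming a hypothetical data set named quoted that reads the same quoted value twice, once without and once with the tilde modifier:
data quoted;
   infile datalines dsd;
   /* PLAIN loses its quotation marks; KEPT retains them */
   input plain : $25. kept ~ $25.;
   datalines;
"Red Racers, Washington","Red Racers, Washington"
;
In both variables the comma inside the quotation marks is read as part of the value rather than as a delimiter.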

Examples

Example 1: Reading Unaligned Data with Simple List Input

The INPUT statement in this DATA step uses simple list input to read the input data records:
data scores;
   input name $ score1 score2 score3 team $;
   datalines;
Joe 11 32 76 red
Mitchel 13 29 82 blue
Susan 14 27 74 green
;
The next INPUT statement reads only the first four fields in the previous data lines, which demonstrates that you are not required to read all the fields in the record:
input name $ score1 score2 score3;

Example 2: Reading Character Data That Contains Embedded Blanks

The INPUT statement in this DATA step uses the & format modifier with list input to read character values that contain embedded blanks.
data list;
   infile file-specification;
   input name $ & score;
run;
It can read these input data records:
----+----1----+----2----+----3----+
Joseph   11 Joergensen  red
Mitchel  13 Mc Allister  blue
Su Ellen  14 Fischer-Simon  green
The & modifier follows the variable that it affects in the INPUT statement. Because this format modifier follows NAME, at least two blanks must separate the NAME field from the SCORE field in the input data records.
You can also specify an informat with a format modifier, as shown here:
    input name $ & +3 lastname & $15. team $;
This INPUT statement reads the same data and also demonstrates that you are not required to read all the values in an input record. The +3 column pointer control moves the pointer past the score value in order to read the values of LASTNAME and TEAM.

Example 3: Reading Unaligned Data with Informats

This DATA step uses modified list input to read data values with an informat:
data jansales;
   input item : $10. amount;
   datalines;
trucks 1382
vans 1235
sedans 2391
;
The $10. informat reads a character value of up to ten characters.

Example 4: Reading Comma-Delimited Data with List Input and an Informat

This DATA step uses the DELIMITER= option in the INFILE statement to read list input values that are separated by commas instead of blanks. The example uses an informat to read the date, and a format to write the date.
data scores2;
   length Team $ 14;
   infile datalines delimiter=',';
   input Name $ Score1-Score3 Team $ Final_Date:MMDDYY10.;
   format final_date weekdate17.;
   datalines;
Joe,11,32,76,Red Racers,2/3/2007
Mitchell,13,29,82,Blue Bunnies,4/5/2007
Susan,14,27,74,Green Gazelles,11/13/2007
;
proc print data=scores2;
   var Name Team Score1-Score3 Final_Date;
   title 'Soccer Player Scores'; 
run;
Output from Comma-Delimited Data

Example 5: Reading Delimited Data with Modified List Input

This DATA step uses the DSD option in an INFILE statement and the tilde (~) format modifier in an INPUT statement to retain the quotation marks in character data and to read a character in a string that is enclosed in quotation marks as a character instead of as a delimiter.
data scores;
   infile datalines dsd;
   input Name : $9. Score1-Score3 
         Team ~ $25. Div $;
   datalines;
Joseph,11,32,76,"Red Racers, Washington",AAA
Mitchel,13,29,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,14,27,74,"Green Gazelles, Atlanta",AA
;
proc print; run;
The output that PROC PRINT generates shows the resulting SCORES data set. The values for TEAM contain the quotation marks.
SCORES Data Set
FOOTNOTE 1: See SAS Language Reference: Concepts for information about standard and nonstandard data values.