Statements

INPUT Statement, List



Scans the input data record for input values and assigns them to the corresponding SAS variables.
Valid: in a DATA step
Category: File-handling
Type: Executable

Syntax
Arguments
Details
When to Use List Input
Simple List Input
Modified List Input
Comparisons
How Modified List Input and Formatted Input Differ
When Data Contains Quotation Marks
Examples
Example 1: Reading Unaligned Data with Simple List Input
Example 2: Reading Character Data That Contains Embedded Blanks
Example 3: Reading Unaligned Data with Informats
Example 4: Reading Comma-Delimited Data with List Input and an Informat
Example 5: Reading Delimited Data with Modified List Input
See Also

Syntax

INPUT <pointer-control> variable <$> <&> <@ | @@>;
INPUT <pointer-control> variable <:|&|~>
<informat.> <@ | @@>;


Arguments

pointer-control

moves the input pointer to a specified line or column in the input buffer.

See: Column Pointer Controls and Line Pointer Controls
Featured in: Reading Character Data That Contains Embedded Blanks
variable

specifies a variable that is assigned input values.

$

specifies that the variable value is stored as a character value rather than as a numeric value.

Tip: If the variable is previously defined as character, $ is not required.
Featured in: Reading Unaligned Data with Simple List Input
&

indicates that a character value can have one or more single embedded blanks. This format modifier reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first.

Restriction: The & modifier must follow the variable name and $ sign that it affects.
Tip: If you specify an informat after the & modifier, the terminating condition for the format modifier remains two blanks.
See: Modified List Input
Featured in: Reading Character Data That Contains Embedded Blanks
:

enables you to specify an informat that the INPUT statement uses to read the variable value. For a character variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column, the defined length of the variable, or the end of the data line, whichever comes first. For a numeric variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column or the end of the data line, whichever comes first.

Tip: If the length of the variable has not been previously defined, then its value is read and stored with the informat length.
Tip: The pointer continues to read until the next blank column is reached. However, if the field is longer than the formatted length, then the value is truncated to the length of the variable.
See: Modified List Input
Featured in: Reading Unaligned Data with Informats and Reading Delimited Data with Modified List Input
~

indicates to treat single quotation marks, double quotation marks, and delimiters in character values in a special way. This format modifier reads delimiters within quoted character values as characters instead of as delimiters and retains the quotation marks when the value is written to a variable.

Restriction: You must use the DSD option in an INFILE statement. Otherwise, the INPUT statement ignores this format modifier.
See: Modified List Input
Featured in: Reading Delimited Data with Modified List Input
informat.

specifies an informat to use to read the variable values.

Tip: Decimal points in the actual input values always override decimal specifications in a numeric informat.
See Also: Definition of Informats
Featured in: Reading Unaligned Data with Informats and Reading Delimited Data with Modified List Input
@

holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.

Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See: Using Line-Hold Specifiers
@@

holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.

Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
See: Using Line-Hold Specifiers
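
The two line-hold specifiers can be illustrated with a minimal sketch. The data set names, variable names, and data values below are hypothetical and chosen only for illustration. The first DATA step holds each record with a trailing @ so that a second INPUT statement can read the rest of the same record; the second DATA step uses a double trailing @ to build several observations from one input line.

```sas
/* Trailing @: read TEAM, hold the record, then decide how to read the rest. */
data byteam;
   input team $ @;
   if team = 'red' then input score1 score2;
   else input score1;
   datalines;
red 11 32
blue 13
;

/* Double trailing @: each input line holds values for several observations. */
data pairs;
   input name $ score @@;
   datalines;
Joe 11 Mitchel 13 Susan 14
;
```

In the first step, SCORE2 is missing for the blue record because only one score is read from it. The second step creates three observations from the single data line.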

Details


When to Use List Input

List input requires that you specify the variable names in the INPUT statement in the same order that the fields appear in the input data records. SAS scans the data line to locate the next value but ignores additional intervening blanks. List input does not require that the data are located in specific columns. However, you must separate each value from the next by at least one blank unless the delimiter between values is changed. By default, the delimiter for data values is one blank space or the end of the input record. List input will not skip over any data values to read subsequent values, but it can ignore all values after a given point in the data record. However, pointer controls enable you to change the order that the data values are read.

There are two types of list input: simple list input and modified list input.

Modified list input makes the INPUT statement more versatile because you can use a format modifier to overcome several of the restrictions of simple list input. See Modified List Input.


Simple List Input

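Simple list input places several restrictions on the type of data that the INPUT statement can read: by default, at least one blank must separate the input values; each missing value must be represented by a period, not a blank; character values cannot be longer than 8 bytes unless the variable is given a longer length earlier in the DATA step, for example in a LENGTH statement; character values cannot contain embedded blanks; and data must be in standard numeric or character format (see Footnote 1).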
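
As a minimal sketch of working within these restrictions, the DATA step below uses a LENGTH statement so that a name longer than 8 bytes is not truncated, and it marks a missing score with a period. The data set name and data values are hypothetical.

```sas
data scores;
   length name $ 12;   /* without this, NAME defaults to 8 bytes */
   input name $ score1 score2;
   datalines;
Christopher 11 32
Joe . 76
;
```

Here SCORE1 is missing for the second observation. If the period were replaced by blanks, list input would instead assign 76 to SCORE1 and look for SCORE2 on the next line.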


Modified List Input

List input is more versatile when you use format modifiers. The format modifiers are as follows:

Format Modifier   Purpose
&                 reads character values that contain embedded blanks.
:                 reads data values that need the additional instructions that informats can provide but that are not aligned in columns. **
~                 reads delimiters within quoted character values as characters and retains the quotation marks.

** Use formatted input and pointer controls to quickly read data values that are aligned in columns.

For example, use the : modifier with an informat to read character values that are longer than 8 bytes or numeric values that are in nonstandard form, such as values that contain commas.
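
A short sketch of both uses of the : modifier follows. The data values are hypothetical; the $12. and COMMA8. informats are chosen only to fit them.

```sas
data jansales;
   /* $12. permits item names longer than 8 bytes;          */
   /* COMMA8. removes the embedded commas from the amounts. */
   input item : $12. amount : comma8.;
   datalines;
motorcycles 1,382
vans 12,235
;
```

Because both variables are read with modified list input, the values do not need to be aligned in columns; each field ends at the next blank.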

Because list input interprets a blank as a delimiter, use modified list input to read values that contain blanks. The & modifier reads character values that contain single embedded blanks. However, the data values must then be separated by two or more blanks. To read values that contain leading, trailing, or embedded blanks with list input, use the DLM= or DLMSTR= option in the INFILE statement to specify another character as the delimiter. See Reading Delimited Data with Modified List Input. If your input data use blanks as delimiters and the values themselves contain leading, trailing, or embedded blanks, you might need to use either column input or formatted input. If quotation marks surround the delimited values, you can use list input with the DSD option in the INFILE statement.


Comparisons


How Modified List Input and Formatted Input Differ

Modified list input has a scanning feature that can use informats to read data which are not aligned in columns. Formatted input causes the pointer to move like that of column input to read a variable value. The pointer moves the length that is specified in the informat and stops at the next column.

This DATA step uses modified list input to read the first data value and formatted input to read the second:

data jansales;
   input item : $10. amount comma5.;
datalines;
trucks 1,382
vans 1,235
sedans 2,391
;

The value of ITEM is read with modified list input. The INPUT statement stops reading when the pointer finds a blank space. The pointer then moves to the second column after the end of the field, which is the correct position to read the AMOUNT value with formatted input.

Formatted input, on the other hand, continues to read the entire width of the field. This INPUT statement uses formatted input to read both data values:

input item $10. +1 amount comma5.;

To read this data correctly with formatted input, the second data value must begin in column 12: the $10. informat reads columns 1 through 10, and the +1 pointer control then skips column 11, as shown here:

----+----1----+----2
trucks     1,382
vans       1,235
sedans     2,391

Also, after the value of ITEM is read with formatted input, you must use the pointer control +1 to move the pointer to the column where the value AMOUNT begins.
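
For comparison, the formatted-input statement can be placed in a complete DATA step. This is a sketch that assumes every AMOUNT value begins in column 12, so that it lines up with the pointer position after $10. and +1.

```sas
data jansales;
   /* $10. reads columns 1-10, +1 skips column 11, COMMA5. reads columns 12-16 */
   input item $10. +1 amount comma5.;
   datalines;
trucks     1,382
vans       1,235
sedans     2,391
;
```

If an AMOUNT value started one column earlier or later, formatted input would still read columns 12 through 16 and could pick up only part of the value, which is the alignment requirement that modified list input avoids.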


When Data Contains Quotation Marks

When you use the DSD option in an INFILE statement, which sets the default delimiter to a comma, the INPUT statement removes quotation marks from a quoted value before the value is written to a variable. When you also use the tilde (~) modifier in an INPUT statement, the INPUT statement retains the quotation marks as part of the value.
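
The default DSD behavior, without the tilde modifier, can be sketched as follows. The data set name and data values are hypothetical.

```sas
data teams;
   infile datalines dsd;
   /* No ~ modifier: the quotation marks are removed, and the */
   /* comma inside the quoted string is kept as data.         */
   input name : $9. team : $25.;
   datalines;
Joseph,"Red Racers, Washington"
Mitchel,"Blue Bunnies, Richmond"
;
```

In this sketch, TEAM for the first observation is stored as Red Racers, Washington with no surrounding quotation marks.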


Examples


Example 1: Reading Unaligned Data with Simple List Input

The INPUT statement in this DATA step uses simple list input to read the input data records:

data scores;
   input name $ score1 score2 score3 team $;
   datalines;
Joe 11 32 76 red
Mitchel 13 29 82 blue
Susan 14 27 74 green
;

The next INPUT statement reads only the first four fields in the previous data lines, which demonstrates that you are not required to read all the fields in the record:

input name $ score1 score2 score3;


Example 2: Reading Character Data That Contains Embedded Blanks

The INPUT statement in this DATA step uses the & format modifier with list input to read character values that contain embedded blanks.

data list;
   infile file-specification;
   input name $ & score;
run;

It can read these input data records:

----+----1----+----2----+----3----+
Joseph   11 Joergensen  red
Mitchel  13 Mc Allister  blue
Su Ellen  14 Fischer-Simon  green

The & modifier follows the variable it affects in the INPUT statement. Because this format modifier follows NAME, at least two blanks must separate the NAME field from the SCORE field in the input data records.

You can also specify an informat with a format modifier, as shown here:

    input name $ & +3 lastname & $15. team $;

In addition, this INPUT statement reads the same data to demonstrate that you are not required to read all the values in an input record. The +3 column pointer control moves the pointer past the score value in order to read the values for LASTNAME and TEAM.
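
A self-contained variant of the first DATA step in this example is sketched below. It assumes that the records are supplied in-stream with a DATALINES statement instead of through the file-specification placeholder.

```sas
data list;
   /* & lets NAME contain single blanks; two blanks end the value */
   input name $ & score;
   datalines;
Joseph   11 Joergensen  red
Mitchel  13 Mc Allister  blue
Su Ellen  14 Fischer-Simon  green
;
```

Only NAME and SCORE are read from each record; the remaining fields are ignored.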


Example 3: Reading Unaligned Data with Informats

This DATA step uses modified list input to read data values with an informat:

data jansales;
   input item : $10. amount;
   datalines;
trucks 1382
vans 1235
sedans 2391
;

The $10. informat allows a character variable of up to ten characters to be read.


Example 4: Reading Comma-Delimited Data with List Input and an Informat

This DATA step uses the DELIMITER= option in the INFILE statement to read list input values that are separated by commas instead of blanks. The example uses an informat to read the date, and a format to write the date.

options pageno=1 nodate ls=80 ps=64;
data scores2;
   length Team $ 14;
   infile datalines delimiter=',';
   input Name $ Score1-Score3 Team $ Final_Date:MMDDYY10.;
   format final_date weekdate17.;
   datalines;
Joe,11,32,76,Red Racers,2/3/2007
Mitchell,13,29,82,Blue Bunnies,4/5/2007
Susan,14,27,74,Green Gazelles,11/13/2007
;

proc print data=scores2;
   var Name Team Score1-Score3 Final_Date;
   title 'Soccer Player Scores'; 
run;

Output from Comma-Delimited Data

                              Soccer Player Scores                             1

 Obs   Name            Team        Score1   Score2   Score3          Final_Date

  1    Joe        Red Racers         11       32       76      Sat, Feb 3, 2007
  2    Mitchell   Blue Bunnies       13       29       82      Thu, Apr 5, 2007
  3    Susan      Green Gazelles     14       27       74     Tue, Nov 13, 2007

Example 5: Reading Delimited Data with Modified List Input

This DATA step uses the DSD option in an INFILE statement and the tilde (~) format modifier in an INPUT statement to retain the quotation marks in character data and to read a delimiter (here, a comma) inside a quoted string as a character instead of as a delimiter.

data scores;
   infile datalines dsd;
   input Name : $9. Score1-Score3 
         Team ~ $25. Div $;
   datalines;
Joseph,11,32,76,"Red Racers, Washington",AAA
Mitchel,13,29,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,14,27,74,"Green Gazelles, Atlanta",AA
;

The output that PROC PRINT generates shows the resulting SCORES data set. The values for TEAM contain the quotation marks.

SCORES Data Set

                         The SAS System                        1

OBS Name      Score1 Score2 Score3           Team            Div

 1  Joseph      11     32     76   "Red Racers, Washington"  AAA
 2  Mitchel     13     29     82   "Blue Bunnies, Richmond"  AAA
 3  Sue Ellen   14     27     74   "Green Gazelles, Atlanta" AA 

See Also

Statements:

INFILE Statement

INPUT Statement

INPUT Statement, Formatted


FOOTNOTE 1:   See SAS Language Reference: Concepts for information about standard and nonstandard data values.
