Valid: in a DATA step
Category: File-handling
Type: Executable
Syntax
INPUT <pointer-control> variable <$> <&> <@ | @@>;
INPUT <pointer-control> variable <:|&|~> <informat.> <@ | @@>;
pointer-control
moves the input pointer to a specified line or column in the input buffer.
See: Column Pointer Controls and Line Pointer Controls
Featured in: Reading Character Data That Contains Embedded Blanks
$
indicates that the variable value is stored as a character value rather than as a numeric value.
Tip: If the variable is previously defined as character, $ is not required.
Featured in: Reading Unaligned Data with Simple List Input
&
indicates that a character value can have one or more single embedded blanks. This format modifier reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first.
Restriction: The & modifier must follow the variable name and $ sign that it affects.
Tip: If you specify an informat after the & modifier, the terminating condition for the format modifier remains two blanks.
See: Modified List Input
Featured in: Reading Character Data That Contains Embedded Blanks
:
enables you to specify an informat that the INPUT statement uses to read the variable value. For a character variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column, the defined length of the variable, or the end of the data line, whichever comes first. For a numeric variable, this format modifier reads the value from the next non-blank column until the pointer reaches the next blank column or the end of the data line, whichever comes first.
Tip: If the length of the variable has not been previously defined, then its value is read and stored with the informat length.
Tip: The pointer continues to read until the next blank column is reached. However, if the field is longer than the formatted length, then the value is truncated to the length of the variable.
See: Modified List Input
Featured in: Reading Unaligned Data with Informats and Reading Delimited Data with Modified List Input
~
indicates that single quotation marks, double quotation marks, and delimiters in character values are treated in a special way. This format modifier reads delimiters within quoted character values as characters instead of as delimiters and retains the quotation marks when the value is written to a variable.
Restriction: You must use the DSD option in an INFILE statement. Otherwise, the INPUT statement ignores this modifier.
See: Modified List Input
Featured in: Reading Delimited Data with Modified List Input
informat.
specifies an informat to use to read the variable values.
Tip: Decimal points in the actual input values always override decimal specifications in a numeric informat.
See Also: Definition of Informats
Featured in: Reading Unaligned Data with Informats and Reading Delimited Data with Modified List Input
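The tip about decimal points can be illustrated with a small sketch. The data set name DECIMALS, the variable X, and the data values are made up for illustration. Assuming the 5.2 informat behaves here as it does in formatted input, the first value has no decimal point, so the informat's decimal specification would be expected to apply (giving 1.23), while the second value carries its own decimal point and should be read as 12.5.

```sas
/* Hypothetical sketch: names and values are illustrative only. */
data decimals;
   input x : 5.2;   /* : modifier reads to the next blank, then applies 5.2 */
   datalines;
123
12.5
;

proc print data=decimals;
run;
```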
@
holds an input record for the execution of the next INPUT statement within the same iteration of the DATA step. This line-hold specifier is called trailing @.
Restriction: The trailing @ must be the last item in the INPUT statement.
Tip: The trailing @ prevents the next INPUT statement from automatically releasing the current input record and reading the next record into the input buffer. It is useful when you need to read from a record multiple times.
See: Using Line-Hold Specifiers
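As a minimal sketch of the trailing @, the following DATA step reads a record type first and then chooses how to read the rest of the same record. The data set MIXED, its variables, and the data values are assumptions made for illustration.

```sas
/* Hypothetical sketch: names and values are illustrative only. */
data mixed;
   input type $ @;                  /* hold the record after reading TYPE */
   if type = 'A' then input amount; /* finish reading the held record     */
   else if type = 'B' then input qty price;
   datalines;
A 250
B 3 19.99
;
```

Each record yields one observation. Variables that are not read for a given record type remain missing.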
@@
holds an input record for the execution of the next INPUT statement across iterations of the DATA step. This line-hold specifier is called double trailing @.
Restriction: The double trailing @ must be the last item in the INPUT statement.
Tip: The double trailing @ is useful when each input line contains values for several observations.
See: Using Line-Hold Specifiers
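The double trailing @ might be used as in the following sketch, where each data line holds values for several observations. The data set TEMPS, its variables, and the data values are made up for illustration.

```sas
/* Hypothetical sketch: names and values are illustrative only. */
data temps;
   input city $ temp @@;   /* keep the record across DATA step iterations */
   datalines;
Oslo 4 Cairo 31 Lima 19
Quito 14 Perth 22
;
```

Under these assumptions the step creates five observations, one for each CITY and TEMP pair.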
Details
List input requires that you specify the variable names in the INPUT statement in the same order that the fields appear in the input data records. SAS scans the data line to locate the next value but ignores additional intervening blanks. List input does not require that the data are located in specific columns. However, you must separate each value from the next by at least one blank unless the delimiter between values is changed. By default, the delimiter for data values is one blank space or the end of the input record. List input will not skip over any data values to read subsequent values, but it can ignore all values after a given point in the data record. However, pointer controls enable you to change the order that the data values are read.
There are two types of list input: simple list input and modified list input.
Modified list input makes the INPUT statement more versatile because you can use a format modifier to overcome several of the restrictions of simple list input. See Modified List Input.
Simple list input places several restrictions on the type of data that the INPUT statement can read:
By default, at least one blank must separate the input values. Use the DLM= or DLMSTR= option or the DSD option in the INFILE statement to specify a delimiter other than a blank.
Represent each missing value with a period, not a blank, or with two adjacent delimiters.
Character input values cannot be longer than 8 bytes unless the variable is given a longer length in an earlier LENGTH, ATTRIB, or INFORMAT statement.
Character values cannot contain embedded blanks unless you change the delimiter.
Data must be in standard numeric or character format (see footnote 1).
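A minimal sketch of working within these restrictions follows. The data set STAFF, its variables, and the data values are hypothetical. A LENGTH statement lets NAME hold more than 8 bytes, and a period stands in for a missing numeric value.

```sas
/* Hypothetical sketch: names and values are illustrative only. */
data staff;
   length name $ 12;          /* allow values longer than the 8-byte default */
   input name $ age dept $;
   datalines;
Christopher 34 sales
Ann . admin
;
```

The period in the second record keeps DEPT aligned with the correct field; a blank in its place would cause SAS to read admin as the value of AGE.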
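List input is more versatile when you use format modifiers. The format modifiers are as follows:
& reads character values that contain single embedded blanks.
: reads data values by using an informat that you specify.
~ reads delimiters within quoted character values as characters and retains the quotation marks.
For example, use the : modifier with an informat to read character values that are longer than 8 bytes or numeric values that are in nonstandard form.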
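The following sketch shows both uses of the : modifier in one INPUT statement. The data set JANSALES2 and the data values are made up for illustration: $12. accommodates a character value longer than 8 bytes, and COMMA8. reads numeric values that contain commas.

```sas
/* Hypothetical sketch: names and values are illustrative only. */
data jansales2;
   input item : $12. amount : comma8.;   /* each value ends at the next blank */
   datalines;
motorcycles 1,382
vans 12,235
;
```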
Because list input interprets a blank as a delimiter, use modified list input to read values that contain blanks. The & modifier reads character values that contain single embedded blanks. However, the data values must be separated by two or more blanks. To read values that contain leading, trailing, or embedded blanks with list input, use the DLM= or DLMSTR= option in the INFILE statement to specify another character as the delimiter. See Reading Delimited Data with Modified List Input. If your input data use blanks as delimiters and they contain leading, trailing, or embedded blanks, you might need to use either column input or formatted input. If quotation marks surround the delimited values, you can use list input with the DSD option in the INFILE statement.
Comparisons
Modified list input has a scanning feature that can use informats to read data which are not aligned in columns. Formatted input causes the pointer to move like that of column input to read a variable value. The pointer moves the length that is specified in the informat and stops at the next column.
This DATA step uses modified list input to read the first data value and formatted input to read the second:
data jansales;
   input item : $10. amount comma5.;
   datalines;
trucks 1,382
vans 1,235
sedans 2,391
;
The value of ITEM is read with modified list input. The INPUT statement stops reading when the pointer finds a blank space. The pointer then moves to the second column after the end of the field, which is the correct position to read the AMOUNT value with formatted input.
Formatted input, on the other hand, continues to read the entire width of the field. This INPUT statement uses formatted input to read both data values:
input item $10. +1 amount comma5.;
To read this data correctly with formatted input, the second data value must occur after the 10th column of the first value, as shown here:
----+----1----+----2
trucks     1,382
vans       1,235
sedans     2,391
Also, after the value of ITEM is read with formatted input, you must use the pointer control +1 to move the pointer to the column where the value AMOUNT begins.
When you use the DSD option in an INFILE statement, which sets the delimiter to a comma, the INPUT statement removes quotation marks before a value is written to a variable. When you also use the tilde (~) modifier in an INPUT statement, the INPUT statement maintains quotation marks as part of the value.
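For contrast with the tilde modifier, here is a minimal sketch of DSD without it. The data set name TEAMS is an assumption, and the data values are borrowed from the records used elsewhere on this page. The comma inside the quoted value is kept as data, and the quotation marks are removed before the value is stored.

```sas
/* Hypothetical sketch: data set name is illustrative only. */
data teams;
   infile datalines dsd;   /* comma delimiter; quotation marks are stripped */
   length team $ 25;
   input team $ div $;
   datalines;
"Red Racers, Washington",AAA
"Blue Bunnies, Richmond",AAA
;
```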
Examples
Reading Unaligned Data with Simple List Input
The INPUT statement in this DATA step uses simple list input to read the input data records:
data scores;
   input name $ score1 score2 score3 team $;
   datalines;
Joe 11 32 76 red
Mitchel 13 29 82 blue
Susan 14 27 74 green
;
The next INPUT statement reads only the first four fields in the previous data lines, which demonstrates that you are not required to read all the fields in the record:
input name $ score1 score2 score3;
Reading Character Data That Contains Embedded Blanks
The INPUT statement in this DATA step uses the & format modifier with list input to read character values that contain embedded blanks.
data list;
   infile file-specification;
   input name $ & score;
run;
It can read these input data records:
----+----1----+----2----+----3----+
Joseph  11  Joergensen  red
Mitchel  13  Mc Allister  blue
Su Ellen  14  Fischer-Simon  green
The & modifier follows the variable it affects in the INPUT statement. Because this format modifier follows NAME, at least two blanks must separate the NAME field from the SCORE field in the input data records.
You can also specify an informat with a format modifier, as shown here:
input name $ & +3 lastname & $15. team $;
In addition, this INPUT statement reads the same data to demonstrate that you are not required to read all the values in an input record. The +3 column pointer control moves the pointer past the score value in order to read the values for LASTNAME and TEAM.
Reading Unaligned Data with Informats
This DATA step uses modified list input to read data values with an informat:
data jansales;
   input item : $10. amount;
   datalines;
trucks 1382
vans 1235
sedans 2391
;
The $10. informat allows a character value of up to ten characters to be read.
Reading Delimited Data with Modified List Input
This DATA step uses the DELIMITER= option in the INFILE statement to read list input values that are separated by commas instead of blanks. The example uses an informat to read the date, and a format to write the date.
options pageno=1 nodate ls=80 ps=64;
data scores2;
   length Team $ 14;
   infile datalines delimiter=',';
   input Name $ Score1-Score3 Team $ Final_Date:MMDDYY10.;
   format final_date weekdate17.;
   datalines;
Joe,11,32,76,Red Racers,2/3/2007
Mitchell,13,29,82,Blue Bunnies,4/5/2007
Susan,14,27,74,Green Gazelles,11/13/2007
;
proc print data=scores2;
   var Name Team Score1-Score3 Final_Date;
   title 'Soccer Player Scores';
run;
Output from Comma-Delimited Data
                         Soccer Player Scores                          1

Obs    Name        Team              Score1    Score2    Score3    Final_Date

 1     Joe         Red Racers          11        32        76      Sat, Feb 3, 2007
 2     Mitchell    Blue Bunnies        13        29        82      Thu, Apr 5, 2007
 3     Susan       Green Gazelles      14        27        74      Tue, Nov 13, 2007
This DATA step uses the DSD option in an INFILE statement and the tilde (~) format modifier in an INPUT statement to retain the quotation marks in character data and to read a character in a string that is enclosed in quotation marks as a character instead of as a delimiter.
data scores;
   infile datalines dsd;
   input Name : $9. Score1-Score3 Team ~ $25. Div $;
   datalines;
Joseph,11,32,76,"Red Racers, Washington",AAA
Mitchel,13,29,82,"Blue Bunnies, Richmond",AAA
Sue Ellen,14,27,74,"Green Gazelles, Atlanta",AA
;
The output that PROC PRINT generates shows the resulting SCORES data set. The values for TEAM contain the quotation marks.
                            The SAS System                             1

OBS    Name         Score1    Score2    Score3    Team                         Div

 1     Joseph         11        32        76      "Red Racers, Washington"     AAA
 2     Mitchel        13        29        82      "Blue Bunnies, Richmond"     AAA
 3     Sue Ellen      14        27        74      "Green Gazelles, Atlanta"    AA
FOOTNOTE 1: See SAS Language Reference: Concepts for information about standard and nonstandard data values.
Copyright © 2011 by SAS Institute Inc., Cary, NC, USA. All rights reserved.