
Reading Raw Data

Types of Data


Definitions

data values

are character or numeric values.

numeric value

contains only numbers, and sometimes a decimal point, a minus sign, or both. When they are read into a SAS data set, numeric values are stored in the floating-point format native to the operating environment. Nonstandard numeric values can contain other characters, such as commas or dollar signs; you can use formatted input to enable SAS to read them.

character value

is a sequence of characters.

standard data

are character or numeric values that can be read with list, column, formatted, or named input. Examples of standard data include:

  • ARKANSAS

  • 1166.42

nonstandard data

is data that can be read only with the aid of informats. Examples of nonstandard data include numeric values that contain commas, dollar signs, or blanks; date and time values; and hexadecimal and binary values.
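As a minimal sketch of the difference between the two kinds of data, the DATA step below reads two standard values with plain list input and one nonstandard value with an informat. The data set name, the variable names, and the value $2,341 are assumptions chosen for illustration; they do not come from the reference text.

```sas
/* Hypothetical data set and variable names, for illustration only. */
data work.example;
   /* state and amount are standard data: list input reads them as is. */
   /* payment contains a dollar sign and a comma, so it is nonstandard */
   /* and is read here with the COMMA. informat.                       */
   input state $ amount payment :comma10.;
   datalines;
ARKANSAS 1166.42 $2,341
;
run;

proc print data=work.example;
run;
```

The colon modifier lets list input apply the informat to a blank-delimited field, which keeps the example short; formatted input in fixed columns would work as well.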


Numeric Data

Numeric data can be represented in several ways. SAS can read standard numeric values without any special instructions. To read nonstandard values, SAS requires special instructions in the form of informats. The table Reading Different Types of Numeric Data shows standard, nonstandard, and invalid numeric data values and the special tools, if any, that are required to read them. For complete descriptions of all SAS informats, see SAS Language Reference: Dictionary.

Reading Different Types of Numeric Data

| Example of Numeric Data | Description | Solution Required to Read |
|---|---|---|
| Standard Numeric Data | | |
| 23 | input right aligned | None needed |
| 23 | input not aligned | None needed |
| 23 | input left aligned | None needed |
| 00023 | input with leading zeros | None needed |
| 23.0 | input with decimal point | None needed |
| 2.3E1 | in E-notation, 2.30 x 10^1 | None needed |
| 230E-1 | in E-notation, 230 x 10^-1 | None needed |
| -23 | minus sign for negative numbers | None needed |
| Nonstandard Numeric Data | | |
| 2 3 | embedded blank | COMMA. or BZ. informat |
| - 23 | embedded blank | COMMA. or BZ. informat |
| 2,341 | comma | COMMA. informat |
| (23) | parentheses | COMMA. informat |
| C4A2 | hexadecimal value | HEX. informat |
| 1MAR90 | date value | DATE. informat |
| Invalid Numeric Data | | |
| 23 - | minus sign follows number | Put minus sign before number or solve programmatically. (table note 1) |
| .. | double instead of single periods | Code missing values as a single period or use the ?? modifier in the INPUT statement to code any invalid input value as a missing value. |
| J23 | not a number | Read as a character value, or edit the raw data to change it to a valid number. |

TABLE NOTE 1: It might be possible to use the S370FZDTw.d informat, but positive values require the trailing plus sign (+).
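The following sketch shows how several of the nonstandard and invalid values in the table might be read with formatted input. The data set name, variable names, column positions, and any sample values that do not appear in the table (00FF, 0010, 15JUN91, 3JAN92) are assumptions made for illustration.

```sas
/* Hypothetical layout: each field starts at a fixed column, so an     */
/* embedded blank stays inside its field instead of ending the value.  */
data work.numeric_demo;
   input @1  amount   comma7.   /* commas, parentheses, embedded blank */
         @9  hexval   hex4.     /* hexadecimal value                   */
         @14 saledate date7.    /* date value                          */
         @22 score ?? 5.;       /* ?? sets invalid input to missing    */
   format saledate date9.;
   datalines;
2,341   C4A2 1MAR90  23
(23)    00FF 15JUN91 ..
- 23    0010 3JAN92  J23
;
run;

proc print data=work.numeric_demo;
run;
```

In the last two records the score field holds invalid numeric data (.. and J23). With the ?? modifier, SAS assigns a missing value to score without writing invalid-data messages to the log; without the modifier the value is still set to missing, but the log reports the problem.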

Remember the following rules for reading numeric data:


Character Data

A value that is read with an INPUT statement is assumed to be a character value if one of the following is true:

Input data that you want to store in a character variable can include any character. Use the guidelines in the following table when your raw data includes leading blanks and semicolons.

Reading Instream Data and External Files Containing Leading Blanks and Semicolons

| Characters in the Data | What to Use | Reason |
|---|---|---|
| leading or trailing blanks that you want to preserve | formatted input and the $CHARw. informat | List input trims leading and trailing blanks from a character value before the value is assigned to a variable. |
| semicolons in instream data | DATALINES4 or CARDS4 statements and four semicolons (;;;;) to mark the end of the data | With the normal DATALINES and CARDS statements, a semicolon in the data prematurely signals the end of the data. |
| delimiters, blank characters, or quoted strings | DSD option, with DLM= or DLMSTR= option on the INFILE statement | These options enable SAS to read a character value that contains a delimiter within a quoted string; these options can also treat two consecutive delimiters as a missing value and remove quotation marks from character values. |
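The three guidelines in the table can be sketched with instream data as follows. All data set names, variable names, lengths, and sample values are assumptions chosen for illustration.

```sas
/* 1. $CHARw. with formatted input keeps the leading blanks in the value. */
data work.padded;
   input @1 name $char10.;
   datalines;
   Smith
Jones
;
run;

/* 2. DATALINES4 allows semicolons in the data; a line of four */
/*    semicolons marks the end of the data.                    */
data work.statements;
   input @1 text $char30.;
   datalines4;
x = 1; y = 2;
total = x + y;
;;;;
run;

/* 3. DSD with DLM= reads a quoted value that contains the delimiter, */
/*    removes the quotation marks, and treats two consecutive         */
/*    delimiters as a missing value.                                  */
data work.delimited;
   infile datalines dsd dlm=',';
   input name :$20. city :$20. amount;
   datalines;
"Smith, John",Little Rock,1166.42
Jones,,23
;
run;
```

In the third step the informats `:$20.` are there only to give the character variables room for values longer than the default length; the DSD and DLM= options do the work that the table describes.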

Remember the following when reading character data:
