Types of Data

Definitions

data values
are character or numeric values.
numeric value
contains only numbers, and sometimes a decimal point, a minus sign, or both. When they are read into a SAS data set, numeric values are stored in the floating-point format native to the operating environment. Nonstandard numeric values can contain other characters, such as commas or dollar signs; you can use formatted input to enable SAS to read them.
character value
is a sequence of characters.
standard data
are character or numeric values that can be read with list, column, formatted, or named input. Examples of standard data include:
  • ARKANSAS
  • 1166.42
nonstandard data
is data that can be read only with the aid of informats. Examples of nonstandard data include numeric values that contain commas, dollar signs, or blanks; date and time values; and hexadecimal and binary values.

Numeric Data

Numeric data can be represented in several ways. SAS can read standard numeric values without any special instructions. To read nonstandard values, SAS requires special instructions in the form of informats. The table Reading Different Types of Numeric Data shows standard, nonstandard, and invalid numeric data values and the special tools, if any, that are required to read them. For complete descriptions of all SAS informats, see SAS Formats and Informats: Reference.
Reading Different Types of Numeric Data

Example of Numeric Data | Description                      | Solution Required to Read
------------------------+----------------------------------+--------------------------
Standard Numeric Data
                     23 | input right aligned              | None needed
         23             | input not aligned                | None needed
23                      | input left aligned               | None needed
00023                   | input with leading zeros         | None needed
23.0                    | input with decimal point         | None needed
2.3E1                   | in E notation, 2.30 × 10^1       | None needed
230E-1                  | in E notation, 230 × 10^-1       | None needed
-23                     | minus sign for negative numbers  | None needed
Nonstandard Numeric Data
2 3                     | embedded blank                   | COMMA. or BZ. informat
- 23                    | embedded blank                   | COMMA. or BZ. informat
2,341                   | comma                            | COMMA. informat
(23)                    | parentheses                      | COMMA. informat
C4A2                    | hexadecimal value                | HEX. informat
1MAR90                  | date value                       | DATE. informat
Invalid Numeric Data
23 -                    | minus sign follows number        | Put minus sign before number or solve programmatically. (See footnote 1.)
..                      | double instead of single periods | Code missing values as a single period or use the ?? modifier in the INPUT statement to code any invalid input value as a missing value.
J23                     | not a number                     | Read as a character value, or edit the raw data to change it to a valid number.

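As a rough illustration of the nonstandard rows in the table, the following sketch reads a comma-formatted amount, a hexadecimal value, and a date with formatted input. The data set name, variable names, and column positions are hypothetical and chosen only for this example.

```sas
data nonstandard;
   /* Formatted input: each informat states the width of the field it reads. */
   input @1  amount   comma7.   /* removes commas; parentheses mark a negative value */
         @9  hexval   hex4.     /* field is interpreted as hexadecimal digits        */
         @14 saledate date7.;   /* ddMMMyy text becomes a SAS date value             */
   format saledate date9.;      /* display the stored date in readable form          */
   datalines;
2,341   C4A2 1MAR90
(23)    00FF 15JUN91
;
run;

proc print data=nonstandard;
run;
```

Without the informats, SAS would treat values such as 2,341 and (23) as invalid numeric data.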
Remember the following rules for reading numeric data:
  • Parentheses or a minus sign preceding the number (without an intervening blank) indicates a negative value.
  • Leading zeros and the placement of a value in the input field do not affect the value assigned to the variable. Leading zeros and leading and trailing blanks are not stored with the value. Unlike some languages, SAS does not read trailing blanks as zeros by default. To cause trailing blanks to be read as zeros, use the BZ. informat described in SAS Formats and Informats: Reference.
  • Numeric data can have leading and trailing blanks but cannot have embedded blanks (unless they are read with a COMMA. or BZ. informat).
  • To read decimal values from input lines that do not contain explicit decimal points, indicate where the decimal point belongs by using a decimal parameter with column input or an informat with formatted input. See the full description of the INPUT statement in SAS Formats and Informats: Reference for more information. An explicit decimal point in the input data overrides any decimal specification in the INPUT statement.
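The rules above about implied decimal points and blanks can be sketched as follows. This is a minimal example under assumed column positions; the data set and variable names are hypothetical.

```sas
data decimals;
   /* Column input with a decimal parameter: .2 supplies an implied decimal
      point two digits from the right when the field contains none.        */
   input price 1-6 .2
         @8 qty bz3.;   /* BZ. reads trailing and embedded blanks as zeros */
   datalines;
  2314 23
 23.14 2 3
;
run;

proc print data=decimals;
run;
```

Both records should yield a PRICE of 23.14: the first through the implied decimal, the second because the explicit decimal point in the data overrides the specification. With the BZ3. informat, the QTY fields "23 " and "2 3" are read as 230 and 203, because the blanks are treated as zeros.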

Character Data

A value that is read with an INPUT statement is assumed to be a character value if one of the following is true:
  • A dollar sign ($) follows the variable name in the INPUT statement.
  • A character informat is used.
  • The variable has been previously defined as character: for example, in a LENGTH statement, in a RETAIN statement, by an assignment statement, or in an expression.
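A small sketch of the three conditions, using hypothetical data set and variable names, might look like this:

```sas
data states;
   length region $ 12;     /* REGION is defined as character before INPUT   */
   input state $           /* dollar sign after the variable name           */
         code :$3.         /* character informat (with the colon modifier)  */
         region;           /* already character because of LENGTH           */
   datalines;
ARKANSAS AR1 South
OHIO OH2 Midwest
;
run;

proc contents data=states;
run;
```

PROC CONTENTS should report all three variables as character, each for a different one of the reasons listed above.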
Input data that you want to store in a character variable can include any character. Use the guidelines in the following table when your raw data includes leading blanks and semicolons.
Reading Instream Data and External Files Containing Leading Blanks and Semicolons

Characters in the Data | What to Use | Reason
-----------------------+-------------+-------
leading or trailing blanks that you want to preserve | formatted input and the $CHARw. informat | List input trims leading and trailing blanks from a character value before the value is assigned to a variable.
semicolons in instream data | DATALINES4 or CARDS4 statements and four semicolons (;;;;) to mark the end of the data | With the normal DATALINES and CARDS statements, a semicolon in the data prematurely signals the end of the data.
delimiters, blank characters, or quoted strings | DSD option, with DLM= or DLMSTR= option in the INFILE statement | These options enable SAS to read a character value that contains a delimiter within a quoted string; these options can also treat two consecutive delimiters as a missing value and remove quotation marks from character values.

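The three situations in the table can be sketched with short DATA steps. All data set names, variable names, and widths here are assumptions made for illustration.

```sas
/* 1. Preserve leading blanks with formatted input and $CHARw. */
data keepblanks;
   input @1 raw $char10.;
   datalines;
   ABC
;
run;

/* 2. Semicolons in instream data: DATALINES4 ends only at four semicolons. */
data notes;
   input note $char40.;
   datalines4;
first part; second part
;;;;
run;

/* 3. Quoted strings and consecutive delimiters: DSD with DLM= on INFILE. */
data quoted;
   infile datalines dsd dlm=',';
   input name :$20. city :$20. amount;
   datalines;
"Smith, Jane",Little Rock,23
Lee,,1166.42
;
run;
```

In the third step, the comma inside the quoted string is kept as part of NAME and the quotation marks are removed, while the two consecutive commas in the second record leave CITY missing.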
Remember the following when reading character data:
  • In a DATA step, when you place a dollar sign ($) after a variable name in the INPUT statement, character data that is read from data lines remains in its original case. If you want SAS to read data from data lines as uppercase, use the CAPS system option or the $UPCASE informat.
  • If the value is shorter than the length of the variable, SAS adds blanks to the end of the value to give the value the specified length. This process is known as padding the value with blanks.
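To make the points about case and padding concrete, here is a minimal sketch with hypothetical variable names. It reads the same text once as is and once through the $UPCASE informat, and then shows the blanks that pad a short value.

```sas
data cases;
   length padded $ 8;
   input asis $ upper :$upcase8.;   /* second value is converted to uppercase */
   padded = 'AB';                   /* stored as AB followed by six blanks    */
   marked = '*' || padded || '*';   /* concatenation makes the padding visible */
   datalines;
Arkansas Arkansas
;
run;

proc print data=cases;
run;
```

ASIS keeps the value Arkansas in its original case, UPPER holds ARKANSAS, and MARKED shows AB followed by six blanks between the asterisks.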
FOOTNOTE 1: It might be possible to use the S370FZDTw.d informat, but positive values require the trailing plus sign (+).