FORMAT Procedure

INVALUE Statement

Creates an informat for reading and converting raw data values.
See: SAS Formats and Informats: Reference for documentation on informats supplied by SAS.
Example: Converting Raw Character Data to Numeric Values

Syntax

INVALUE <$>name <(informat-option(s))>
   <value-range-set(s)>;

Summary of Optional Arguments

Control the attributes of the informat
DEFAULT=length
specifies the default length of the informat.
JUST
left-justifies all input strings before they are compared to ranges.
MAX=length
specifies a maximum length for the informat.
MIN=length
specifies a minimum length for the informat.
NOTSORTED
stores values or ranges in the order in which you define them.
UPCASE
converts all input strings to uppercase before they are compared to ranges.
Control the input template
value-range-set(s)
specifies the variable template for reading data.

Required Argument

name
names the informat that you are creating.
Restriction: A user-defined informat name cannot be the same as an informat name that is supplied by SAS.
Requirement: The name must be a valid SAS name. A numeric informat name can be up to 31 characters in length; a character informat name can be up to 30 characters in length and cannot end in a number. If you are creating a character informat, then use a dollar sign ($) as the first character. Adding the dollar sign to the name is why a character informat is limited to 30 characters.
Interaction: The maximum length of an informat name is controlled by the VALIDFMTNAME= system option. See SAS System Options: Reference for details.
Tips: Refer to the informat later by using the name followed by a period. However, do not use a period after the informat name in the INVALUE statement.

When SAS prints messages that refer to a user-written informat, the name is prefixed by an at sign (@). When the informat is stored, the at sign is prefixed to the name that you specify for the informat. The addition of the at sign to the name is why the name is limited to 31 or 30 characters. You need to use the at sign only when you are using the name in an EXCLUDE or SELECT statement; do not prefix the name with an at sign when you are associating the informat with a variable.
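The naming tips above can be sketched as follows. The informat name GRADE and the data set work.scores are hypothetical, chosen only for illustration; the point is where the period and the at sign appear.

```sas
/* Hypothetical names (GRADE, work.scores) chosen for illustration. */
proc format library=work;
   invalue grade 'A'=4          /* no period after the name in INVALUE */
                 'B'=3
                 'C'=2;
run;

data work.scores;
   input points : grade.;       /* a period follows the name when the informat is used */
   datalines;
A
C
;
run;

proc format library=work fmtlib;
   select @grade;               /* the at sign is needed only in SELECT or EXCLUDE */
run;
```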

Optional Arguments

DEFAULT=length
specifies the default length of the informat. The value for DEFAULT= becomes the length of the informat if you do not give a specific length when you associate the informat with a variable.
The default length of an informat depends on whether the informat is character or numeric. The default length of character informats is the length of the longest informatted value. The default of a numeric informat is 12 if you have numeric data to the left of the equal sign. If you have a quoted string to the left of the equal sign, then the default length is the length of the longest string.
Tip: As a best practice, if you specify an existing informat in a value-range-set, always specify the DEFAULT= option.
JUST
left-justifies all input strings before they are compared to ranges.
MAX=length
specifies a maximum length for the informat. When you associate the informat with a variable, you cannot specify a width greater than the MAX= value.
Default: 40
Range: 1–40
MIN=length
specifies a minimum length for the informat.
Default: 1
Range: 1–40
NOTSORTED
stores values or ranges for informats or formats in the order in which you define them. If you do not specify NOTSORTED, then values or ranges are stored in sorted order by default, and SAS uses a binary searching algorithm to locate the range that a particular value falls into. If you specify NOTSORTED, then SAS searches the ranges in the order in which you define them until a match is found.
Use NOTSORTED if one of the following is true:
  • You know the likelihood of certain ranges occurring, and you want your informat or format to search those ranges first to save processing time.
  • You want to preserve the order that you define ranges when you print a description of the informat or format using the FMTLIB option.
  • You want to preserve the order that you define ranges when you use the ORDER=DATA option and the PRELOADFMT option to analyze class variables in PROC MEANS, PROC SUMMARY, or PROC TABULATE.
Do not use NOTSORTED if the distribution of values is uniform or unknown, or if the number of values is relatively small. The binary searching algorithm that SAS uses when NOTSORTED is not specified optimizes the performance of the search under these conditions.
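As a minimal sketch of the first two cases, the following step defines an informat with NOTSORTED and then prints its description with FMTLIB. The informat name STATUS and the assumption that 'OK' is the most frequent raw value are hypothetical.

```sas
proc format library=work;
   invalue status (notsorted)
      'OK'   = 0      /* assumed to be the most frequent value, so it is listed first */
      'WARN' = 1
      'FAIL' = 2;
run;

/* Print the description; the ranges should appear in the order defined above. */
proc format library=work fmtlib;
   select @status;
run;
```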
SAS automatically sets the NOTSORTED option when you use the CPORT and the CIMPORT procedures to transport informats or formats between operating environments with different standard collating sequences. This automatic setting of NOTSORTED can occur when you transport informats or formats between ASCII and EBCDIC operating environments. If this situation is undesirable, then do the following:
  • Use the CNTLOUT= option in the PROC FORMAT statement to create an output control data set.
  • Use the CPORT procedure to create a transport file for the control data set.
  • Use the CIMPORT procedure in the target operating environment to import the transport file.
  • In the target operating environment, use PROC FORMAT with the CNTLIN= option to build the formats and informats from the imported control data set.
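The four steps above might look like the following sketch. The fileref TRANFILE, the file path, and the data set name work.fmtctl are assumptions made for illustration; adjust the library and path to your environments.

```sas
/* Source environment: write the stored formats and informats to a control data set. */
proc format library=work cntlout=work.fmtctl;
run;

/* Source environment: create a transport file for the control data set. */
filename tranfile 'fmtctl.xpt';      /* hypothetical path */
proc cport data=work.fmtctl file=tranfile;
run;

/* Target environment: import the transport file. */
filename tranfile 'fmtctl.xpt';      /* hypothetical path */
proc cimport infile=tranfile data=work.fmtctl;
run;

/* Target environment: rebuild the formats and informats from the control data set. */
proc format library=work cntlin=work.fmtctl;
run;
```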
UPCASE
converts all raw data values to uppercase before they are compared to the possible ranges. If you use UPCASE, then make sure the values or ranges that you specify are in uppercase.
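A short sketch of JUST and UPCASE used together follows. The informat name YESNO and the data set work.answers are hypothetical. Formatted input with an explicit width is used so that leading blanks in the raw data reach the informat, and the ranges are written in uppercase as UPCASE requires.

```sas
proc format;
   invalue yesno (just upcase)
      'YES', 'Y' = 1
      'NO',  'N' = 0
      other      = .;
run;

data work.answers;
   input @1 reply yesno5.;   /* read columns 1-5, including any leading blanks */
   datalines;
yes
  no
Y
maybe
;
run;
```

Under these options, the lowercase and indented values are intended to match the uppercase ranges, and the unmatched value falls into OTHER.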
value-range-set(s)
specifies raw data and values that the raw data will become. The value-range-set(s) can be one or more of the following:
value-or-range-1<..., value-or-range-n>=informatted-value | [existing-informat]
The informat converts the raw data to the values of informatted-value on the right side of the equal sign.
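value-or-range
is the raw data value or range of values, on the left side of the equal sign, that the informat matches.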
informatted-value
is the value that you want the raw data in value-or-range to become. Use one of the following forms for informatted-value:
'character-string'
is a character string up to 32,767 characters long. Typically, character-string becomes the value of a character variable when you use the informat to convert raw data. Use character-string for informatted-value only when you are creating a character informat. If you omit the single or double quotation marks around character-string, then the INVALUE statement assumes that the quotation marks are there.
For hexadecimal literals, you can use up to 32,767 typed characters, or up to 16,382 represented characters at two hexadecimal characters per represented character.
number
is a number that becomes the informatted value. Typically, number becomes the value of a numeric variable when you use the informat to convert raw data. Use number for informatted-value when you are creating a numeric informat. The maximum for number depends on the host operating environment.
_ERROR_
treats data values in the designated range as invalid data. SAS assigns a missing value to the variable, prints the data line in the SAS log, and issues a warning message.
_SAME_
prevents the informat from converting the raw data to any other value. For example, the following GROUP. informat converts values 01 through 20 and assigns the numbers 1 through 20 as the result. All other values are assigned a missing value.
   invalue group 01-20= _same_
                 other= .;
existing-informat
is an informat that is supplied by SAS or an existing user-defined informat. The informat that you are creating uses the existing informat to convert the raw data that match value-or-range on the left side of the equal sign. If you use an existing informat, then enclose the informat name in square brackets (for example, [date9.]) or with parentheses and vertical bars (for example, (|date9.|)). Do not enclose the name of the existing informat in single quotation marks.
Tip: As a best practice, if you specify an existing informat in a value-range-set, always specify a default value by using the DEFAULT= option.
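The following sketch combines an existing informat with the DEFAULT= option. The informat name VISITDT and the data set work.visits are hypothetical; DATE9. is the SAS-supplied informat used for every value that does not match the listed strings.

```sas
proc format;
   invalue visitdt (default=9 upcase)
      'UNKNOWN', 'N/A' = .           /* placeholder strings become missing values */
      other            = [date9.];   /* everything else is read with DATE9. */
run;

data work.visits;
   input visit : visitdt.;
   format visit date9.;
   datalines;
01JAN2024
unknown
15MAR2024
;
run;
```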

Examples

Example 1: Create a Character Informat for Raw Data Values

The $GENDER. character informat converts the raw data values F and M to character values '1' and '2':
invalue $gender 'F'='1'
                'M'='2';
The dollar sign prefix indicates that the informat converts character data.
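For context, a minimal sketch of the same informat in a complete program follows. The data set work.people and the variable name sex are hypothetical, and the LENGTH statement is an assumption added so that the variable length is explicit.

```sas
proc format;
   invalue $gender 'F'='1'
                   'M'='2';
run;

data work.people;
   length sex $ 1;            /* hold the one-character informatted value */
   input sex : $gender.;      /* a period follows the informat name here */
   datalines;
F
M
;
run;
```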

Example 2: Create Character and Numeric Values or a Range of Values

When you create numeric informats, you can specify character strings or numbers for value-or-range. For example, the TRIAL. informat converts any character string that sorts between A and M to the number 1 and any character string that sorts between N and Z to the number 2. The informat treats the unquoted range 1–3000 as a numeric range, which includes all numeric values between 1 and 3000:
invalue trial 'A'-'M'=1
              'N'-'Z'=2
               1-3000=3;

Example 3: Create an Informat Using _ERROR_ and _SAME_

The CHECK. informat uses _ERROR_ and _SAME_ to convert values of 1 through 4 and 99. All other values are invalid:
invalue check 1-4=_same_
               99=.
            other=_error_;
If you use a numeric informat to convert character strings that do not correspond to any values or ranges, then you receive an error message.
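A sketch of CHECK. applied in a DATA step follows. The data set work.responses and the sample values are hypothetical. Based on the definitions above, 3 should be kept as is, 99 should become missing, and 7 should be treated as invalid data and reported in the SAS log.

```sas
proc format;
   invalue check 1-4   = _same_
                 99    = .
                 other = _error_;
run;

data work.responses;
   input answer : check.;
   datalines;
3
99
7
;
run;
```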