Shared Concepts and Topics


Syntax: CODE Statement

  • CODE <options>;

Table 19.9 summarizes the options you can specify in the CODE statement.

Table 19.9: CODE Statement Options

Option      Description
CATALOG=    Names the catalog entry where the generated code is saved
DUMMIES     Retains the dummy variables in the data set
ERROR       Computes the error function
FILE=       Names the file where the generated code is saved
FORMAT=     Specifies the numeric format for the regression coefficients
GROUP=      Specifies the group identifier for array names and statement labels
IMPUTE      Imputes predicted values for observations with missing or invalid covariates
LINESIZE=   Specifies the line size of the generated code
LOOKUP=     Specifies the algorithm for looking up CLASS levels
RESIDUAL    Computes residuals


You cannot specify both the FILE= and CATALOG= options. If you specify neither, the SAS scoring code is written to the SAS log. You can specify the following options in the CODE statement.

CATALOG=library.catalog.entry.type
CAT=library.catalog.entry.type

specifies where to write the generated code in the form of library.catalog.entry.type. The compound name can have from one to four levels. The default library is determined by the USER= SAS system option, which by default is WORK. The default entry is SASCODE, and the default type is SOURCE.
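
As a minimal sketch of how the CATALOG= option might be used, the following statements save scoring code to a four-level catalog entry and then include it in a DATA step through a catalog fileref. PROC GLMSELECT and the Sashelp.Class data set serve only as convenient examples; the catalog name ScoreCat, the entry name ClassFit, the fileref sc, and the output data set Work.Scored are hypothetical names chosen for illustration.

```sas
/* Fit a model and save the scoring code to a catalog entry. */
/* ScoreCat and ClassFit are hypothetical names.             */
proc glmselect data=sashelp.class;
   class sex;
   model weight = sex height / selection=none;
   code catalog=work.scorecat.classfit.source;
run;

/* Point a fileref at the catalog entry so that it can be included */
filename sc catalog "work.scorecat.classfit.source";

/* Score a data set by including the generated code in a DATA step */
data work.scored;
   set sashelp.class;
   %include sc;
run;
```

Spelling out all four levels avoids relying on the default library, entry, and type.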

DUMMIES | NODUMMIES

specifies whether to keep dummy variables that represent the CLASS levels in the data set. The default is NODUMMIES, which specifies that dummy variables not be retained.

ERROR | NOERROR

specifies whether to generate code to compute the error function. The default is NOERROR, which specifies that the error function not be generated.

FILE=filename

names the external file that saves the generated code. When enclosed in a quoted string (for example, FILE="c:\mydir\scorecode.sas"), this option specifies the path for writing the code to an external file. You can also specify unquoted SAS filenames of no more than eight characters for filename. If the filename is assigned as a fileref in a Base SAS FILENAME statement, the file specified in the FILENAME statement is opened. The special filerefs LOG and PRINT are always assigned. If the specified filename is not an assigned fileref, the specified value for filename is concatenated with a .txt extension before the file is opened. For example, if FOO is not an assigned fileref, FILE=FOO causes FOO.txt to be opened. If filename has more than eight characters, an error message is printed.
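
The fileref form of FILE= can be sketched as follows. This example assumes a temporary fileref named score (a hypothetical name) and uses PROC GLMSELECT with Sashelp.Class purely for illustration. Because score is assigned in a FILENAME statement before the procedure runs, FILE=score writes to that file rather than to score.txt.

```sas
/* Assign a fileref first; TEMP requests a temporary external file */
filename score temp;

proc glmselect data=sashelp.class;
   class sex;
   model weight = sex height age / selection=none;
   code file=score;        /* unquoted name: resolved as a fileref */
run;

/* Apply the saved scoring code to a data set */
data work.scored;
   set sashelp.class;
   %include score;
run;
```

Replacing FILE=score with a quoted path writes the same code to a named file on disk instead.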

FORMAT=format

specifies the format for the regression coefficients and other numerical values that do not have a format from the input data set. The default format is BEST20.

GROUP=group-name

specifies the group identifier for group processing. The group-name should be a valid SAS name of no more than 16 characters. It is used to construct array names and statement labels in the generated code.

IMPUTE

imputes the predicted values according to an intercept-only model for observations with missing or invalid covariate values. For a continuous response, the predicted value is the mean of the response variable; for a categorical response, the predicted values are the proportions of the response categories. When the IMPUTE option is specified, the scoring code also creates a variable named _WARN_ that contains one or more single-character codes that indicate problems in computing predicted values. The character codes used in _WARN_ go in the following positions:

Table 19.10: _WARN_ Variable Codes

Code  Column  Meaning
M     1       Missing covariate value
U     2       Unrecognized covariate category
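
The following sketch shows one way the _WARN_ codes might be inspected after scoring. It assumes PROC GLMSELECT and Sashelp.Class as an example model; the fileref sc, the data sets Work.ToScore and Work.Scored, and the flag variables missing_flag and unknown_flag are hypothetical names introduced here.

```sas
filename sc temp;

/* Generate scoring code that imputes predictions for problem rows */
proc glmselect data=sashelp.class;
   class sex;
   model weight = sex height / selection=none;
   code file=sc impute;
run;

/* Three test rows: clean, missing covariate, unrecognized CLASS level */
data work.toscore;
   length sex $1;
   sex='F'; height=60; output;
   sex='F'; height=.;  output;
   sex='X'; height=60; output;
run;

data work.scored;
   set work.toscore;
   %include sc;
   /* Search for each code rather than assuming a fixed position */
   missing_flag = (index(_WARN_, 'M') > 0);
   unknown_flag = (index(_WARN_, 'U') > 0);
run;

proc print data=work.scored;
run;
```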


LINESIZE=value
LS=value

specifies the line size for the generated code. The default is 78. The permissible range is 64 to 254.

LOOKUP=lookup-method

specifies the algorithm for looking up CLASS levels. You can specify the following lookup-methods:

AUTO

selects the LINEAR algorithm if a CLASS variable has fewer than five categories; otherwise, the BINARY algorithm is used. This is the default.

BINARY

uses a binary search. This method is fast, but it might produce incorrect results if the normalized category values contain characters that collate in different orders in ASCII and EBCDIC and you generate the code on an ASCII machine and execute it on an EBCDIC machine, or vice versa.

LINEAR

uses a linear search with IF statements that have categories in the order of the class levels. This method is slow if there are many categories.

SELECT

uses a SELECT statement.

The default is LOOKUP=AUTO.
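
To make the difference between the LINEAR and SELECT methods concrete, the following hand-written DATA step sketches the two styles of lookup for a hypothetical CLASS variable c with levels A, B, and C. It is an illustration only and is not the code that the CODE statement generates; the variable names c, level_if, and level_sel are assumptions.

```sas
data _null_;
   length c $8;
   c = 'B';

   /* LINEAR style: a chain of IF statements tested in level order */
   if c = 'A' then level_if = 1;
   else if c = 'B' then level_if = 2;
   else if c = 'C' then level_if = 3;
   else level_if = .;               /* unrecognized category */

   /* SELECT style: one SELECT group with a WHEN clause per level */
   select (c);
      when ('A') level_sel = 1;
      when ('B') level_sel = 2;
      when ('C') level_sel = 3;
      otherwise  level_sel = .;     /* unrecognized category */
   end;

   put level_if= level_sel=;
run;
```

In the IF chain, the number of comparisons grows with the number of levels, which is why LINEAR is slow for CLASS variables that have many categories.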

RESIDUAL | NORESIDUAL

specifies whether to generate code to compute residual values. If you request code for residuals and then score a data set that does not contain target values, the residuals will have missing values. The default is NORESIDUAL, which specifies that the code for residuals not be generated.
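
A minimal sketch of the RESIDUAL option follows, again assuming PROC GLMSELECT and Sashelp.Class as the example model. The fileref sc and the data sets Work.PartialTarget and Work.Scored are hypothetical. The target is set to missing for the first three observations so that the effect on the residuals can be seen in the printed output.

```sas
filename sc temp;

/* Request scoring code that also computes residuals */
proc glmselect data=sashelp.class;
   class sex;
   model weight = sex height / selection=none;
   code file=sc residual;
run;

/* Blank out the target for a few rows to mimic unlabeled data */
data work.partialtarget;
   set sashelp.class;
   if _n_ <= 3 then weight = .;
run;

data work.scored;
   set work.partialtarget;
   %include sc;
run;

/* Rows without a target value should show missing residuals */
proc print data=work.scored;
run;
```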