The DQSCHEME Procedure

CREATE Statement


Creates a scheme or an analysis data set.
See also: Applying Schemes

CREATE ANALYSIS=<analysis-data-set>
INCLUDE_ALL
LOCALE=<locale-name>
MATCHDEF=<match-definition>
MODE=PHRASE|ELEMENT
SCHEME=<scheme-name>
SCHEME_LOOKUP=EXACT|IGNORE_CASE|USE_MATCHDEF
SENSITIVITY=<sensitivity-level>
VAR=<variable-name>;

Options

ANALYSIS=analysis-data-set

names the output data set that stores analytical data.

Restriction: This option is required if the SCHEME= option is not specified.
See: Create the Schemes
INCLUDE_ALL

specifies that the scheme is to contain all of the values of the input variable. This includes input values:

  • with unique match codes

  • that were not transformed

  • that did not receive a cluster number

Note: The INCLUDE_ALL option is not set by default.
LOCALE=locale-name

specifies the locale that contains the specified match definition. The value can be a locale name in quotation marks, the name of a variable whose value is a locale name, or an expression that evaluates to a locale name.

The specified locale must be loaded into memory as part of the locale list.

Default: The first locale in the locale list.
Restriction: If no value is specified, the default locale is used.
See: Load and Unload Locales
MATCHDEF=match-definition

names the match definition in the specified locale that is used to establish cluster numbers. You can specify any valid match definition.

The value of the MATCHDEF= option is stored in the scheme as a meta option. This provides a default match definition when a scheme is applied. This meta option is used only when SCHEME_LOOKUP=USE_MATCHDEF. The default value that is supplied by this meta option is superseded by match definitions specified in the APPLY statement or the DQSCHEMEAPPLY function or CALL routine.

Best Practice: Use definitions whose names end in (SCHEME BUILD) when using the ENUSA locale. These match definitions yield optimal results in the DQSCHEME procedure.
See: Meta Options
MODE= ELEMENT | PHRASE

specifies a mode of scheme application. This information is stored in the scheme as metadata, which specifies a default mode when the scheme is applied. The default mode is superseded by a mode specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine. See Applying Schemes.

ELEMENT

specifies that each element in each value of the input character variable is compared to the data values in the scheme. When SCHEME_LOOKUP=USE_MATCHDEF, the match code for each element is compared to the match codes that are generated for each element in each DATA variable value in the scheme.

PHRASE

this default value specifies that the entirety of each value of the input character variable is compared to the data values in the scheme. When SCHEME_LOOKUP=USE_MATCHDEF, the match code for the entire input value is compared to match codes that are generated for each data value in the scheme.
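The difference between the two modes can be illustrated with a small Python model. This is a sketch for intuition only, not SAS code and not how the procedure is implemented. It assumes that elements are whitespace-delimited words, that the scheme is a simple mapping from DATA values to standard values, and that unmatched values pass through unchanged. The function name and the sample scheme are hypothetical.

```python
def apply_scheme(value, scheme, mode="PHRASE"):
    """Toy model of MODE= using exact comparison (illustration only)."""
    if mode == "PHRASE":
        # Compare the entire input value to the DATA values in the scheme.
        return scheme.get(value, value)
    if mode == "ELEMENT":
        # Compare each element separately.
        # Assumption: an element is a whitespace-delimited word.
        return " ".join(scheme.get(word, word) for word in value.split())
    raise ValueError(f"unknown mode: {mode!r}")


# Hypothetical scheme: DATA value -> standard value.
scheme = {"Inc": "Incorporated", "Acme Inc": "Acme Incorporated"}

phrase_hit = apply_scheme("Acme Inc", scheme, mode="PHRASE")
phrase_miss = apply_scheme("Zenith Inc", scheme, mode="PHRASE")
element_hit = apply_scheme("Zenith Inc", scheme, mode="ELEMENT")
```

In this model, PHRASE mode transforms "Acme Inc" because the whole value appears in the scheme, but leaves "Zenith Inc" untouched. ELEMENT mode transforms the "Inc" element of "Zenith Inc" on its own.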

SCHEME=scheme-name

specifies the name or the fileref of the scheme that is created. The fileref must reference a fully qualified path with a filename that ends in .sch.bfd. Lowercase letters are required. To create a scheme data set in Blue Fusion Data format, specify the BFD option in the DQSCHEME procedure.

CAUTION:
In the z/OS operating environment, specify only schemes in SAS format. BFD schemes can be applied, but not created, in the z/OS operating environment.

To create a scheme in SAS format, specify the NOBFD option in the DQSCHEME procedure and specify a one-level or two-level SAS data set name.

Restriction: The SCHEME= option is required if the ANALYSIS= option is not specified.
See: Syntax: DQSCHEME Procedure
SCHEME_LOOKUP= EXACT | IGNORE_CASE | USE_MATCHDEF

specifies one of three mutually exclusive methods of applying the scheme to the values of the input character variable. Valid values are defined as follows:

EXACT

this default value specifies that the values of the input variable are to be compared to the DATA values in the scheme without changing the input values in any way. The transformation value in the scheme is written into the output data set only when an input value exactly matches a DATA value in the scheme. Any adjacent blank spaces in the input values are replaced with single blank spaces before comparison.

IGNORE_CASE

specifies that capitalization is to be ignored when input values are compared to the DATA values in the scheme. Any adjacent blank spaces in the input values are replaced with single blank spaces before comparison.

USE_MATCHDEF

specifies that comparisons are to be made between the match codes of the input values and the match codes of the DATA values in the scheme. A transformation occurs when the match code of an input value is identical to the match code of a DATA value in the scheme.

Specifying USE_MATCHDEF enables the options LOCALE=, MATCHDEF=, and SENSITIVITY=, which can be used to override the default values that might be stored in the scheme.

The value of the SCHEME_LOOKUP= option is stored in the scheme as a meta option. This specifies a default lookup method when the scheme is applied. The default supplied by this meta option is superseded by a lookup method that is specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine.
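The EXACT and IGNORE_CASE methods can be modeled in a few lines of Python. This is a sketch for intuition only, not SAS code. It assumes that the scheme is a simple mapping from DATA values to transformation values and that an unmatched input value is returned unchanged. USE_MATCHDEF is not modeled, because match codes are generated by the data quality software. All names are hypothetical.

```python
import re


def normalize_blanks(value):
    # Replace each run of adjacent blanks with a single blank.
    return re.sub(r" {2,}", " ", value)


def lookup(value, scheme, method="EXACT"):
    """Toy model of SCHEME_LOOKUP=EXACT and IGNORE_CASE (illustration only)."""
    key = normalize_blanks(value)
    if method == "EXACT":
        # Transform only when the input exactly matches a DATA value.
        return scheme.get(key, value)
    if method == "IGNORE_CASE":
        # Fold case on both sides before comparing.
        folded = {data.lower(): standard for data, standard in scheme.items()}
        return folded.get(key.lower(), value)
    raise ValueError(f"method not modeled: {method!r}")


# Hypothetical scheme: DATA value -> transformation value.
scheme = {"New York": "New York City"}

exact_hit = lookup("New   York", scheme)               # blanks collapsed first
exact_miss = lookup("new york", scheme)                # case differs, no match
nocase_hit = lookup("NEW  york", scheme, "IGNORE_CASE")
```

Under these assumptions, adjacent blanks never prevent a match, while a difference in capitalization prevents a match only under EXACT.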

See: Meta Options
SENSITIVITY=sensitivity-level

determines the amount of information that is included in the match codes that are generated during the creation of the scheme and, when SCHEME_LOOKUP=USE_MATCHDEF, during its application.

Higher sensitivity values generate match codes that contain more information. These match codes generally result in:

  • fewer matches

  • more clusters

  • fewer values in each cluster

The value of the SENSITIVITY= option is stored in the scheme as a meta option. This provides a default sensitivity value when the scheme is applied. This meta option is used at apply time only when SCHEME_LOOKUP=USE_MATCHDEF. The default value supplied by this meta option is superseded by a sensitivity value specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine.

Default: 85
See: Meta Options
Valid values: 50 to 95
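The way a stored meta option interacts with an apply-time value can be pictured with the following Python sketch. It is an illustration only, not SAS code. It assumes that the scheme's meta options behave like a small dictionary, and the function names are hypothetical. The default of 85 and the valid range of 50 to 95 come from the description above.

```python
def check_sensitivity(level):
    # Valid values are 50 to 95.
    if not 50 <= level <= 95:
        raise ValueError(f"sensitivity must be 50 to 95, got {level}")
    return level


def create_meta(sensitivity=85):
    """Model of what CREATE stores in the scheme (default sensitivity: 85)."""
    return {"SENSITIVITY": check_sensitivity(sensitivity)}


def effective_sensitivity(meta, apply_value=None):
    """An apply-time value supersedes the meta option stored in the scheme."""
    if apply_value is not None:
        return check_sensitivity(apply_value)
    return meta["SENSITIVITY"]


meta = create_meta()                          # nothing specified at create time
stored = effective_sensitivity(meta)          # falls back to the stored value
overridden = effective_sensitivity(meta, 70)  # apply-time value wins
```

In this model the stored value acts only as a fallback: any sensitivity given at apply time replaces it for that application, and the scheme itself is not changed.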
VAR=input-character-variable

specifies the input character variable that is analyzed and transformed. The maximum length of input values is 1024 bytes.
