The DQSCHEME Procedure
See also: | Applying Schemes |
Options
ANALYSIS=
names the output data set that stores analytical data.
Restriction: | This option is required if the SCHEME= option is not specified. |
See: | Create the Schemes |
INCLUDE_ALL
specifies that the scheme is to contain all of the values of the input variable. This includes input values:
with unique match codes
that were not transformed
that did not receive a cluster number
Note: | The INCLUDE_ALL option is not set by default. |
LOCALE=
specifies the locale that contains the specified match definition. The value can be a locale name in quotation marks, the name of a variable whose value is a locale name, or an expression that evaluates to a locale name.
The specified locale must be loaded into memory as part of the locale list.
Default: | The first locale in the locale list. |
Restriction: | If no value is specified, the default locale is used. |
See: | Load and Unload Locales |
MATCHDEF=
names the match definition in the specified locale that is used to establish cluster numbers. You can specify any valid match definition.
The value of the MATCHDEF= option is stored in the scheme as a meta option. This provides a default match definition when a scheme is applied. This meta option is used only when SCHEME_LOOKUP=USE_MATCHDEF. The default value that is supplied by this meta option is superseded by a match definition that is specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine.
Best Practice: | Use definitions whose names end in (SCHEME BUILD) when using the ENUSA locale. These match definitions yield optimal results in the DQSCHEME procedure. |
See: | Meta Options |
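The precedence rule for meta options can be modeled in a few lines. The following is a conceptual Python sketch, not SAS code and not how SAS implements schemes. The `resolve_option` helper, the dictionary layout, and the sample option values are all hypothetical; the sketch only illustrates that a value supplied at apply time supersedes the default stored in the scheme.

```python
from typing import Optional


def resolve_option(name: str, scheme_meta: dict, apply_time: Optional[dict] = None):
    """Return the apply-time value if one is given, else the scheme's stored default.

    Hypothetical helper: models the documented precedence only.
    """
    apply_time = apply_time or {}
    if apply_time.get(name) is not None:
        return apply_time[name]
    return scheme_meta.get(name)


# Illustrative meta options stored when the scheme was created (assumed values).
scheme_meta = {"MATCHDEF": "City (Scheme Build)", "SENSITIVITY": 85}

# No apply-time value: the stored default is used.
default_matchdef = resolve_option("MATCHDEF", scheme_meta)

# An apply-time value supersedes the stored default.
overridden_matchdef = resolve_option("MATCHDEF", scheme_meta, {"MATCHDEF": "City"})
```

In this model, an option that is not overridden at apply time (here SENSITIVITY) still falls back to the value stored in the scheme.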
MODE=
specifies a mode of scheme application. This information is stored in the scheme as metadata, which specifies a default mode when the scheme is applied. The default mode is superseded by a mode in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine. See Applying Schemes.
ELEMENT
specifies that each element in each value of the input character variable is compared to the data values in the scheme. When SCHEME_LOOKUP=USE_MATCHDEF, the match code for each element is compared to match codes generated for each element, in each DATA variable value in the scheme.
PHRASE
this default value specifies that the entirety of each value of the input character variable is compared to the data values in the scheme. When SCHEME_LOOKUP=USE_MATCHDEF, the match code for the entire input value is compared to match codes that are generated for each data value in the scheme.
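The difference between whole-value comparison (the default) and element-by-element comparison can be illustrated with a small model. This is a conceptual Python sketch, not SAS code. It assumes that an element is a whitespace-delimited word and that comparison is exact; both are simplifying assumptions, and the function names and sample scheme are hypothetical.

```python
def apply_whole_value(value: str, scheme: dict) -> str:
    """Compare the entire input value to the scheme's data values."""
    return scheme.get(value, value)


def apply_by_element(value: str, scheme: dict) -> str:
    """Compare each element (assumed here: each whitespace-delimited word) separately."""
    return " ".join(scheme.get(word, word) for word in value.split())


# Hypothetical scheme: data value -> transformation value.
scheme = {"Inc": "Incorporated", "Acme Inc": "ACME Incorporated"}

whole_result = apply_whole_value("Acme Inc", scheme)    # the whole value matches
element_result = apply_by_element("Acme Inc", scheme)   # only the element "Inc" matches
```

With this sample scheme, the same input is transformed differently depending on the mode, which is why the mode stored in the scheme matters when no mode is given at apply time.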
SCHEME=
specifies the name or the fileref of the scheme that is created. The fileref must reference a fully qualified path with a filename that ends in .sch.bfd. Lowercase letters are required. To create a scheme data set in Blue Fusion Data format, specify the BFD option in the DQSCHEME procedure.
To create a scheme in SAS format, specify the NOBFD option in the DQSCHEME procedure and specify a one-level or two-level SAS data set name.
Restriction: | The SCHEME= option is required if the ANALYSIS= option is not specified. |
See: | Syntax: DQSCHEME Procedure |
SCHEME_LOOKUP=
specifies one of three mutually exclusive methods of applying the scheme to the values of the input character variable. Valid values are defined as follows:
EXACT
this default value specifies that the values of the input variable are to be compared to the DATA values in the scheme without changing the input values in any way. The transformation value in the scheme is written into the output data set only when an input value exactly matches a DATA value in the scheme. Any adjacent blank spaces in the input values are replaced with single blank spaces before comparison.
IGNORE_CASE
specifies that capitalization is to be ignored when input values are compared to the DATA values in the scheme. Any adjacent blank spaces in the input values are replaced with single blank spaces before comparison.
USE_MATCHDEF
specifies that comparisons are to be made between the match codes of the input values and the match codes of the DATA values in the scheme. A transformation occurs when the match code of an input value is identical to the match code of a DATA value in the scheme.
Specifying USE_MATCHDEF enables the options LOCALE=, MATCHDEF=, and SENSITIVITY=, which can be used to override the default values that might be stored in the scheme.
The value of the SCHEME_LOOKUP= option is stored in the scheme as a meta option. This specifies a default lookup method when the scheme is applied. The default supplied by this meta option is superseded by a lookup method that is specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine.
See: | Meta Options |
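The three lookup methods can be contrasted with a small model. This is a conceptual Python sketch, not SAS code and not the SAS matching algorithm. The method strings are labels used by this sketch, and `toy_match_code` is a hypothetical stand-in for a real match code (it keeps only uppercase letters and digits); real match codes depend on the locale, match definition, and sensitivity.

```python
import re


def collapse_blanks(value: str) -> str:
    """Replace runs of adjacent blanks in an input value with a single blank."""
    return re.sub(r" {2,}", " ", value)


def toy_match_code(value: str) -> str:
    """Hypothetical stand-in for a match code: uppercase letters and digits only."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


def lookup(value: str, scheme: dict, method: str = "EXACT") -> str:
    """Return the transformation value for `value`, or `value` itself if nothing matches."""
    if method == "EXACT":
        prep_input, prep_data = collapse_blanks, (lambda s: s)
    elif method == "IGNORE_CASE":
        prep_input = lambda s: collapse_blanks(s).lower()
        prep_data = lambda s: s.lower()
    elif method == "USE_MATCHDEF":
        prep_input = prep_data = toy_match_code
    else:
        raise ValueError(f"unknown lookup method: {method}")

    target = prep_input(value)
    for data_value, transformation in scheme.items():
        if prep_data(data_value) == target:
            return transformation
    return value


# Hypothetical scheme: data value -> transformation value.
scheme = {"St. Louis": "Saint Louis"}

exact_hit = lookup("St.  Louis", scheme)                  # blanks collapsed, then exact match
exact_miss = lookup("st. louis", scheme)                  # case differs, no match
nocase_hit = lookup("st. louis", scheme, "IGNORE_CASE")   # case ignored
code_hit = lookup("ST LOUIS", scheme, "USE_MATCHDEF")     # toy match codes are identical
```

In this model, an input that finds no match is returned unchanged, which mirrors the statement that the transformation value is written only when a match occurs.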
SENSITIVITY=
determines the amount of information that is included in the match codes that are generated when the scheme is created and, when SCHEME_LOOKUP=USE_MATCHDEF, when the scheme is applied.
Higher sensitivity values generate match codes that contain more information. These match codes generally result in:
fewer matches
more clusters
fewer values in each cluster
The value of the SENSITIVITY= option is stored in the scheme as a meta option. This provides a default sensitivity value when the scheme is applied. This meta option is used at apply time only when SCHEME_LOOKUP=USE_MATCHDEF. The default value supplied by this meta option is superseded by a sensitivity value specified in the APPLY statement, or in the DQSCHEMEAPPLY function or CALL routine.
Default: | 85 |
See: | Meta Options |
Valid values: | 50 to 95 |
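The effect of sensitivity on clustering can be illustrated with a toy model. This is a conceptual Python sketch, not SAS code. The `toy_sensitive_code` function is entirely hypothetical: it simply keeps more leading characters as sensitivity rises, which is not how SAS builds match codes. It is meant only to show why codes that carry more information produce fewer matches and more, smaller clusters. The range check mirrors the documented valid values of 50 to 95 and the default of 85.

```python
from collections import defaultdict


def toy_sensitive_code(value: str, sensitivity: int = 85) -> str:
    """Hypothetical match code: higher sensitivity keeps more leading characters."""
    if not 50 <= sensitivity <= 95:
        raise ValueError("sensitivity must be between 50 and 95")
    letters = "".join(ch for ch in value.upper() if ch.isalnum())
    keep = sensitivity // 10 - 2  # 50 keeps 3 characters, 95 keeps 7
    return letters[:keep]


def cluster(values, sensitivity: int = 85) -> dict:
    """Group values whose toy match codes are identical."""
    clusters = defaultdict(list)
    for value in values:
        clusters[toy_sensitive_code(value, sensitivity)].append(value)
    return dict(clusters)


values = ["Raleigh", "Raliegh", "Ralston", "Rale"]

low = cluster(values, 50)   # short codes: every value shares one code
mid = cluster(values, 60)   # slightly longer codes split some values apart
high = cluster(values, 95)  # long codes: every value gets its own cluster
```

In this toy model the number of clusters grows as sensitivity rises, while the number of values in each cluster shrinks.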
VAR=
specifies the input character variable that is analyzed and transformed. The maximum length of input values is 1024 bytes.
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.