DQSCHEME Procedure
CREATE Statement
Creates a scheme or an analysis data set.
Syntax
Optional Arguments
- ANALYSIS=analysis-data-set
  names the output data set that stores analytical data.
  Restriction: This option is required if the SCHEME= option is not
  specified.
  See: Concepts for additional information.
- INCLUDE_ALL
  specifies that the scheme is to contain all of the values of the input
  variable. This includes input values that meet either of these
  conditions:
  - values that were not transformed
  - values that did not receive a cluster number
  Note: The INCLUDE_ALL option is not set by default.
- LOCALE=locale-name
  specifies the locale that contains the specified match definition. The
  value can be a locale name in quotation marks, the name of a variable
  whose value is a locale name, or an expression that evaluates to a
  locale name. The specified locale must be loaded into memory as part of
  the locale list.
  Default: The first locale in the locale list.
  Restriction: If no value is specified, the default locale is used.
- MATCHDEF=match-definition
  names the match definition in the specified locale that is used to
  establish cluster numbers. You can specify any valid match definition.
  The value of the MATCHDEF= option is stored in the scheme as a meta
  option. This provides a default match definition when a scheme is
  applied. This meta option is used only when
  SCHEME_LOOKUP=USE_MATCHDEF. The default value that is supplied by this
  meta option is superseded by match definitions specified in the APPLY
  statement or the DQSCHEMEAPPLY CALL routine.
  Tip: Use definitions whose names end in (SCHEME BUILD) when using the
  ENUSA locale. These match definitions yield optimal results in the
  DQSCHEME procedure.
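The precedence rule for meta options, where a value stored in the scheme acts only as a default, can be modeled in a few lines. The following Python sketch is an illustration, not SAS code. The function name `resolve_option`, the dictionary layout, and the sample option values are all hypothetical. It mirrors only the stated rule that a value given at apply time supersedes the meta option stored in the scheme.

```python
def resolve_option(name, apply_time_options, scheme_meta_options):
    """Return the effective value of an option when a scheme is applied.

    Hypothetical helper: a value given at apply time (for example, in the
    APPLY statement) wins; otherwise the meta option that the CREATE
    statement stored in the scheme is used as the default.
    """
    if apply_time_options.get(name) is not None:
        return apply_time_options[name]
    return scheme_meta_options.get(name)


# Illustrative meta options, as they might be stored at creation time.
scheme_meta = {"MATCHDEF": "City (Scheme Build)", "SENSITIVITY": 85}

# No override at apply time: the stored default is used.
print(resolve_option("MATCHDEF", {}, scheme_meta))
# An apply-time value supersedes the stored default.
print(resolve_option("MATCHDEF", {"MATCHDEF": "City"}, scheme_meta))
```

The same lookup order would apply to any option that the scheme stores as a meta option.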
- MODE= ELEMENT | PHRASE
  specifies a mode of scheme application. This information is stored in
  the scheme as metadata, which specifies a default mode when the scheme
  is applied. The default mode is superseded by a mode in the APPLY
  statement, or in the DQSCHEMEAPPLY function or CALL routine. See
  Applying Schemes for additional information.
  - ELEMENT
    specifies that each element in each value of the input character
    variable is compared to the data values in the scheme. When
    SCHEME_LOOKUP=USE_MATCHDEF, the match code for each element is
    compared to match codes generated for each element in each DATA
    variable value in the scheme.
  - PHRASE
    (default value) specifies that the entirety of each value of the
    input character variable is compared to the data values in the
    scheme. When SCHEME_LOOKUP=USE_MATCHDEF, the match code for the
    entire input value is compared to match codes that are generated for
    each data value in the scheme.
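The difference between the two modes can be illustrated with a toy model. The Python sketch below is not SAS code and makes two assumptions that the text does not state: a scheme is represented as a dictionary that maps DATA values to standard values, and the "elements" of a value are its whitespace-delimited words. Only exact comparison is modeled, and the function name `apply_scheme` is hypothetical.

```python
def apply_scheme(value, scheme, mode="PHRASE"):
    """Apply a scheme (DATA value -> standard value) to one input value.

    Toy model with exact comparison only. Treating whitespace-delimited
    words as the elements of a value is an assumption of this sketch.
    """
    if mode == "PHRASE":
        # The entire input value is compared to the DATA values.
        return scheme.get(value, value)
    if mode == "ELEMENT":
        # Each element is compared to the DATA values on its own.
        return " ".join(scheme.get(word, word) for word in value.split())
    raise ValueError(f"unknown mode: {mode!r}")


scheme = {"Inc": "Incorporated", "Acme Inc": "Acme Incorporated"}

# PHRASE: the whole value matches a DATA value and is transformed.
print(apply_scheme("Acme Inc", scheme, mode="PHRASE"))
# PHRASE: no DATA value equals the whole value, so it is left unchanged.
print(apply_scheme("Zenith Inc", scheme, mode="PHRASE"))
# ELEMENT: the element "Inc" matches on its own and is transformed.
print(apply_scheme("Zenith Inc", scheme, mode="ELEMENT"))
```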
- SCHEME=scheme-name
  specifies the name or the fileref of the scheme that is created. The
  fileref must reference a fully qualified path with a filename that ends
  in .sch.bfd. Lowercase letters are required. To create a scheme data
  set in Blue Fusion Data format, specify the BFD option in the DQSCHEME
  procedure. To create a scheme in SAS format, specify the NOBFD option
  in the DQSCHEME procedure and specify a one-level or two-level SAS data
  set name.
  Restriction: The SCHEME= option is required if the ANALYSIS= option is
  not specified.
  See: Syntax for additional information.
  CAUTION: In the z/OS operating environment, specify only schemes in SAS
  format. BFD schemes can be applied, but not created, in the z/OS
  operating environment.
- SCHEME_LOOKUP= EXACT | IGNORE_CASE | USE_MATCHDEF
  specifies one of three mutually exclusive methods of applying the
  scheme to the values of the input character variable. Valid values are
  defined as follows:
  - EXACT
    (default value) specifies that the values of the input variable are
    to be compared to the DATA values in the scheme without changing the
    input values in any way. The transformation value in the scheme is
    written into the output data set only when an input value exactly
    matches a DATA value in the scheme. Any adjacent blank spaces in the
    input values are replaced with single blank spaces before comparison.
  - IGNORE_CASE
    specifies that capitalization is to be ignored when input values are
    compared to the DATA values in the scheme.
    Interaction: Any adjacent blank spaces in the input values are
    replaced with single blank spaces before comparison.
  - USE_MATCHDEF
    specifies that comparisons are to be made between the match codes of
    the input values and the match codes of the DATA values in the
    scheme.
    Interactions: Specifying USE_MATCHDEF enables the options LOCALE=,
    MATCHDEF=, and SENSITIVITY=, which can be used to override the
    default values that might be stored in the scheme.
    A transformation occurs when the match code of an input value is
    identical to the match code of a DATA value in the scheme.
  The value of the SCHEME_LOOKUP= option is stored in the scheme as a
  meta option. This specifies a default lookup method when the scheme is
  applied. The default supplied by this meta option is superseded by a
  lookup method that is specified in the APPLY statement, or in the
  DQSCHEMEAPPLY function or CALL routine.
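The three lookup methods differ only in what is compared. The following Python sketch models that difference; it is not SAS code. The dictionary representation of a scheme, the names `collapse_blanks`, `lookup`, and `toy_match_code`, and above all the toy match-code function are assumptions made for illustration. Real match codes come from the match definition in the Quality Knowledge Base, which this sketch does not attempt to reproduce.

```python
import re


def collapse_blanks(value):
    """Replace runs of adjacent blanks with a single blank."""
    return re.sub(r" +", " ", value)


def lookup(value, scheme, method="EXACT", match_code=None):
    """Return the standard value for `value`, or `value` if nothing matches.

    `scheme` maps DATA values to standard values. `match_code` is a
    caller-supplied placeholder for match-code generation.
    """
    if method == "EXACT":
        probe, key = collapse_blanks(value), (lambda s: s)
    elif method == "IGNORE_CASE":
        probe, key = collapse_blanks(value), str.casefold
    elif method == "USE_MATCHDEF":
        if match_code is None:
            raise ValueError("USE_MATCHDEF needs a match_code function")
        probe, key = value, match_code
    else:
        raise ValueError(f"unknown lookup method: {method!r}")
    wanted = key(probe)
    for data_value, standard_value in scheme.items():
        if key(data_value) == wanted:
            return standard_value
    return value  # no match: the input value is left as it was


def toy_match_code(value):
    """Toy stand-in for a match code: uppercase letters and digits only."""
    return "".join(ch for ch in value.upper() if ch.isalnum())


scheme = {"New York": "New York, NY"}

print(lookup("New   York", scheme))                  # blanks collapsed, then exact match
print(lookup("NEW YORK", scheme))                    # case differs: no exact match
print(lookup("NEW YORK", scheme, "IGNORE_CASE"))     # case ignored: match
print(lookup("new-york", scheme, "USE_MATCHDEF", toy_match_code))
```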
- SENSITIVITY=sensitivity-level
  determines the amount of information that is included in the match
  codes that are generated when the scheme is created and, possibly, when
  it is applied. The value of the SENSITIVITY= option is stored in the
  scheme as a meta option. This provides a default sensitivity value when
  the scheme is applied.
  Higher sensitivity values generate match codes that contain more
  information. These match codes generally result in the following:
  - a greater number of clusters
  - fewer values in each cluster
  Default: 85
  Interactions: The default value supplied by this meta option is
  superseded by a sensitivity value specified in the APPLY statement, or
  in the DQSCHEMEAPPLY CALL routine.
  This meta option is used at apply time only when
  SCHEME_LOOKUP=USE_MATCHDEF.
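The effect of sensitivity on clustering can be shown with a deliberately simplified model. In the Python sketch below, `toy_match_code` is a hypothetical stand-in that keeps more leading characters of a normalized value as the sensitivity rises. It is not the algorithm that SAS uses to generate match codes. It only demonstrates the general relationship described above: codes that carry more information yield more clusters with fewer values in each.

```python
from collections import defaultdict


def toy_match_code(value, sensitivity):
    """Toy stand-in for a match code (NOT the SAS algorithm).

    Keeps more leading characters of the normalized value as the
    sensitivity rises, so higher values carry more information.
    """
    normalized = "".join(ch for ch in value.upper() if ch.isalnum())
    keep = max(1, sensitivity // 10)
    return normalized[:keep]


def cluster(values, sensitivity):
    """Group values whose toy match codes are identical."""
    clusters = defaultdict(list)
    for value in values:
        clusters[toy_match_code(value, sensitivity)].append(value)
    return list(clusters.values())


names = ["Acme Corp", "Acme Corporation", "Acme Company", "ACME Co."]

low = cluster(names, sensitivity=50)   # toy codes keep 5 characters
high = cluster(names, sensitivity=85)  # toy codes keep 8 characters

print(len(low), low)
print(len(high), high)
```

With these sample names, the lower setting places all four values in one cluster, while the higher setting splits them into three smaller clusters.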
- VAR=input-character-variable
  specifies the input character variable that is analyzed and
  transformed. The maximum length of input values is 1024 bytes.
Copyright © SAS Institute Inc. All rights reserved.