SAS Data Integration Studio supports data quality improvement with the
following transformations, which are listed under the Data Quality category
in the Transformations tree:
- Apply Lookup Standardization
- Create Match Code
- Standardize with Definition
- DataFlux Batch Job
- DataFlux Data Service
Like DataFlux Data Management
Studio, SAS Data Integration Studio is included in the SAS Data Management
bundle.
You can use DataFlux schemes in the Apply Lookup Standardization
transformation to standardize the format, casing, and spelling of
character columns in a source table. Similarly, you can use the
Standardize with Definition transformation to select DataFlux
standardization definitions and apply them to elements within a text
string. For example, you might want to change all instances of “Mister”
to “Mr.” but only when “Mister” is used as a salutation. Note that
applying schemes and definitions in this way requires SAS Data
Quality Server.
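The difference between the two kinds of standardization can be illustrated with a small Python sketch. This is a conceptual illustration only and not how SAS or DataFlux implements either transformation. The scheme contents, the function names `apply_scheme` and `standardize_salutation`, and the rule that a salutation is "Mister" at the start of the string are all assumptions made for this example.

```python
import re

# Hypothetical scheme: known variants (lowercased) mapped to one standard value.
STATE_SCHEME = {
    "n.c.": "NC",
    "nc": "NC",
    "north carolina": "NC",
}


def apply_scheme(value, scheme):
    """Scheme-style lookup: replace the whole column value if it is a known variant."""
    key = " ".join(value.split()).lower()  # collapse whitespace, ignore case
    return scheme.get(key, value)          # unknown values pass through unchanged


# Assumed context rule: "Mister" counts as a salutation only when it
# starts the string and is followed by more text.
SALUTATION = re.compile(r"^\s*mister\b\.?\s+(?=\S)", re.IGNORECASE)


def standardize_salutation(name):
    """Definition-style rule: change "Mister" to "Mr." only in salutation position."""
    return SALUTATION.sub("Mr. ", name, count=1)


if __name__ == "__main__":
    print(apply_scheme("  North   Carolina ", STATE_SCHEME))
    print(standardize_salutation("Mister John Smith"))
    print(standardize_salutation("The Mister Company"))
```

The scheme lookup treats the column value as a single unit, while the salutation rule looks at where an element sits inside the string. That contrast is the point of the sketch; a real standardization definition is considerably more sophisticated than a single regular expression.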
The Create Match Code
transformation enables you to analyze source data and generate match
codes based on common information shared by clusters of records. Comparing
match codes instead of actual data enables you to identify records
that are in fact the same entity, despite minor variations in the
data.
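To make the idea of comparing match codes concrete, the following Python sketch builds a deliberately crude code from a name and groups records that share it. It is a toy under stated assumptions and not the DataFlux match code algorithm: the function names `toy_match_code` and `cluster_by_match_code` and the normalization rules (ignore case, punctuation, and word order; drop vowels and the letters h, w, and y after the first letter) are invented for illustration.

```python
import re
from collections import defaultdict


def toy_match_code(name):
    """Build a crude match code (illustrative only, not the DataFlux algorithm)."""
    text = re.sub(r"[^a-z ]", "", name.lower())  # drop punctuation and digits
    parts = []
    for token in sorted(text.split()):           # ignore word order
        head = token[0]
        tail = re.sub(r"[aeiouhwy]", "", token[1:])        # drop weak letters
        squeezed = re.sub(r"(.)\1+", r"\1", head + tail)   # collapse doubled letters
        parts.append(squeezed.upper())
    return "".join(parts)


def cluster_by_match_code(records):
    """Group records whose toy match codes are identical."""
    clusters = defaultdict(list)
    for record in records:
        clusters[toy_match_code(record)].append(record)
    return dict(clusters)


if __name__ == "__main__":
    names = ["John Smith", "Jon Smith", "Smith, John", "Mary Jones"]
    for code, members in cluster_by_match_code(names).items():
        print(code, members)
```

Because the three spellings of the first name reduce to the same code, they land in one cluster without the raw strings ever being compared directly. A rule this coarse also produces false matches on real data, which is why it should be read only as an illustration of the principle.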
The DataFlux Batch Job and DataFlux Data Service transformations enable
you to select and execute DataFlux jobs and jobs configured as real-time
services from a DataFlux Data Management Server. In this way, you can run
DataFlux data quality activities such as data jobs, process jobs, and
profiles.
Many of the features
in SAS Data Quality Server and the DataFlux Data Management Platform
can be used in SAS Data Integration Studio jobs. For example, you
can use DataFlux standardization schemes and definitions in SAS Data
Integration Studio jobs. You can also execute DataFlux jobs, profiles,
and services from SAS Data Integration Studio.