SAS has fully integrated the DataFlux suite of data quality, data integration,
data governance, and master data management solutions into its offerings. This
helps customers build a more unified information management approach that goes
beyond data management and governance to support analytics and decision management.
Certain SAS software offerings, such as SAS Data Management, include SAS Data
Integration Studio, SAS Data Quality Server, and SAS/ACCESS interfaces, as well
as the DataFlux data management products. The SAS Data Quality offering, for
example, consists of SAS Data Quality Server, a Quality Knowledge Base (QKB),
and SAS language elements. When used together with SAS products, certain
DataFlux products also enable you to manage data profiling, quality,
integration, monitoring, and enrichment.
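For example, after a QKB is made available to a SAS session, the SAS language
elements in SAS Data Quality Server can be called directly in DATA step code.
The following is a minimal sketch only: it assumes a licensed QKB at a
hypothetical install path and an input table named work.customers with a NAME
column. The %DQLOAD autocall macro loads the ENUSA locale, and the
dqStandardize and dqMatch functions then draw on its definitions.

   /* Load the ENUSA locale from the QKB. The DQSETUPLOC= path is a
      hypothetical install location that varies by site. */
   %dqload(dqlocale=(ENUSA), dqsetuploc='C:\SAS\QKB\CI');

   /* Standardize names and build matchcodes for fuzzy matching. */
   data work.customers_std;
      set work.customers;
      length std_name $ 64 mc $ 255;
      std_name = dqStandardize(name, 'Name', 'ENUSA');
      mc       = dqMatch(name, 'Name', 85, 'ENUSA');
   run;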
Many of the features in SAS Data Quality Server and DataFlux Data Management
Studio can be used in SAS Data Integration Studio jobs. You can also execute
DataFlux jobs, profiles, and services from SAS Data Integration Studio.
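If you prefer to work in SAS code, SAS Data Quality Server also provides
language elements for reaching a DataFlux Data Management Server. The following
is a sketch only: it assumes the DMSRVDATASVC procedure for calling a deployed
real-time data service, and the host, port, service name, and table names are
all placeholders. Option names can vary by release, so verify the syntax
against the SAS Data Quality Server reference for your version.

   /* Call a deployed data service (hypothetical verify_address.ddf)
      on a DataFlux Data Management Server. Host, port, and table
      names are assumptions for this sketch. */
   proc dmsrvdatasvc
      host='dmserver.example.com' port=21036
      service='verify_address.ddf'
      data=work.addresses out=work.addresses_clean;
   run;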
If your site has licensed
the appropriate SAS offerings, you can take advantage of the following
components:
DataFlux Data Management Studio
a desktop client that
combines data quality and data discovery features. You can use this
client to create jobs, profiles, standardization schemes, and other
resources that can be included in SAS Data Integration Studio jobs.
DataFlux Data Management Server
a scalable server environment for large DataFlux Data Management Studio jobs.
Jobs can be uploaded from DataFlux Data Management Studio to a DataFlux Data
Management Server, where they are executed. SAS Data Integration Studio can
execute DataFlux jobs on this server.
DataFlux Web Studio
a web-based application
with separately licensed modules that enable you to perform data management
tasks.
data service
a data job that has been configured as a real-time service and deployed to a DataFlux
Data Management
Server.
process job
a DataFlux job that combines data processing with conditional processing. The process
flow in the
job supports logical decisions, looping, events, and other features that are not available
in a data job flow.
profile
a job that executes one or more data profiling operations and displays a report
based on the results of those operations. Data profiling encompasses discovery
and audit activities that help you assess the composition, organization, and
quality of databases.
Quality Knowledge Base (QKB)
a collection of files and reference sources that enable Blue Fusion, and
consequently all DataFlux software, to perform parsing, standardization,
analysis, matching, and other processes. A QKB includes locales, standardization
schemes, and other resources.
locale
a collection of data types and definitions that are pertinent to a particular
language or language convention. For example, a locale for English – UK has an
address parse definition that differs from the English – US parse definition.
The address formats are significantly different even though the language is
similar.
standardization scheme
a file that contains pairs of data values and standardized values. Schemes are
used to standardize columns by providing a set of acceptable values.
standardization definition
a set of logic used to standardize an element within a string. For example, a definition
could be used to expand all instances of “Univ.” to “University” without having to
specify every literal instance such as “Univ. Arizona” and “Oxford Univ.” in a scheme.
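To make the contrast with a scheme concrete, the following minimal sketch
applies a standardization definition through the dqStandardize function. The
'Organization' definition name is an assumption; the definitions that are
available vary by QKB release and locale, and the ENUSA locale is assumed to
have been loaded with %DQLOAD as shown earlier.

   /* Apply a standardization definition rather than a scheme; no
      lookup table of literal values is required. 'Organization' is
      an assumed definition name in the ENUSA locale. */
   data _null_;
      length org $ 60;
      org = dqStandardize('Univ. Arizona', 'Organization', 'ENUSA');
      put org=;   /* expected along the lines of: University Arizona */
   run;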