Prerequisites for Data Quality Transformations

DataFlux Software

Transformations in the Data Quality folder require either SAS Data Quality Server 9.3 or DataFlux Data Management Platform 2.1. If your site purchased an Enterprise bundle that included SAS Data Integration Studio, then SAS Data Quality Server and the DataFlux Data Management Platform were included. For more information about configuring DataFlux software for use with SAS Data Integration Studio, see the SAS Data Integration Studio chapter of the SAS Intelligence Platform: Desktop Administration Guide.
Note: SAS Data Integration Studio 4.3 users cannot use single sign-on to access profiles or jobs on a DataFlux Data Management Server. They can access profiles or jobs on an unsecured Data Management Server only. The single sign-on feature cannot connect to an unsecured Data Management Server.
Review the DataFlux components that are described in Overview. Identify the components that you want to use in SAS Data Integration Studio, and then configure or create these components. For example, if you want to use a DataFlux standardization scheme in a SAS Data Integration Studio job, you must first create the scheme in DataFlux software. For more information, see the DataFlux documentation, such as the DataFlux Data Management Studio User’s Guide.
Note: With the exception of the DataFlux Batch Job transformation, which can be used to execute DataFlux dfPower Studio Architect jobs that do not contain macros, the current version of SAS Data Integration Studio works only with the DataFlux Data Management Platform. Other DataFlux dfPower Studio objects must be migrated to the DataFlux Data Management Platform. For more information, see the DataFlux Migration Guide.

Global Options on the Data Quality Tab

After the DataFlux resources have been configured or created, you can specify some global data quality options in SAS Data Integration Studio. Select Tools, then select Options to display the Options window, and then click the Data Quality tab. The next figure shows some typical values in this tab.
Data Quality Tab
Paths specified in the Data Quality group box are relative to the current SAS Application Server. The group box contains the following items:
Default Locale
specifies the locale that is referenced by SAS data quality jobs when a different locale is not specified in those jobs. The default value, Use the value defined on the server, uses the value of the SAS system option DQLOCALE, which is set on the SAS Application Server that executes SAS data quality jobs.
In a standard deployment, the SAS Application Server is not configured to use any specific locale. There are three main ways to set the locale. You can configure the DQLOCALE option on the SAS Application Server that executes SAS data quality jobs. You can select a locale in the Default Locale field above. Also, you can select a locale for an individual data quality transformation in a SAS Data Integration Studio job.
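The three ways of setting the locale can be read as a fallback chain. The following Python sketch is illustrative only and is not part of SAS Data Integration Studio. It assumes that a locale selected on an individual transformation takes precedence over the Default Locale field, which in turn takes precedence over the server's DQLOCALE value. The function name resolve_locale, its parameters, and the locale names used in the usage note are hypothetical.

```python
from typing import Optional

# Text of the Default Locale field when it defers to the server (from the tab).
USE_SERVER_VALUE = "Use the value defined on the server"


def resolve_locale(transformation_locale: Optional[str],
                   default_locale_field: str,
                   server_dqlocale: Optional[str]) -> Optional[str]:
    """Return the locale that a data quality transformation would use.

    Assumed precedence (an illustration, not SAS code):
    transformation setting, then Default Locale field, then server DQLOCALE.
    """
    # 1. A locale selected on the individual transformation wins.
    if transformation_locale:
        return transformation_locale
    # 2. Otherwise use the Default Locale field, unless it defers to the server.
    if default_locale_field != USE_SERVER_VALUE:
        return default_locale_field
    # 3. Otherwise fall back to the server's DQLOCALE value, which may be unset.
    return server_dqlocale
```

For example, resolve_locale(None, USE_SERVER_VALUE, "ENUSA") returns the server value, and a result of None corresponds to a standard deployment in which no locale has been configured anywhere.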
DQ Setup Location
specifies the location of a DataFlux Quality Knowledge Base (QKB). In a standard deployment, the SAS Application Server is configured to use the sample QKB that is provided by SAS Data Quality Server. The sample QKB is typically located at the following path: C:\Program Files\SASHome\SASFoundation\9.3\dquality\sasmisc\QltyKB\sample
There are two main ways to set the QKB. You can configure the DQSETUPLOC option on the SAS Application Server that executes SAS data quality jobs. You can also select a QKB in the DQ Setup Location field above.
Scheme Repository Type
specifies that the scheme data sets in the specified scheme repository are stored in SAS format (option value NOBFD) or in DataFlux format (option value BFD, the default). The Apply Lookup Standardization transformation uses schemes to standardize data.
Note: If you change an existing value in the fields Scheme Repository Type or Scheme Repository, then you must replace any instances of the Apply Lookup Standardization transformation in any existing jobs that you intend to run using your current metadata profile. Replacement is required because scheme metadata is added to these jobs when they are run for the first time. To update a job to use a different scheme repository, add a new Apply Lookup Standardization transformation to the job, configure the new transformation, delete the old transformation, and move the new transformation into place.
Scheme Repository
specifies the location of the scheme data sets that are used by the Apply Lookup Standardization transformation. To display scheme filenames in the transformation, specify:
QKB-root/scheme
To display scheme descriptions in the transformation, specify:
QKB-root
QKB-root is the directory that was specified when the Quality Knowledge Base was installed. QKB-root contains approximately nine subdirectories, with names such as regex, locale, and scheme.
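The two Scheme Repository values differ only in whether the scheme subdirectory is appended to QKB-root. The following Python sketch shows that rule under stated assumptions. The function name scheme_repository_path, the show argument, and the example root in the usage note are hypothetical, and forward-slash paths are assumed, as in the values shown above.

```python
from pathlib import PurePosixPath


def scheme_repository_path(qkb_root: str, show: str) -> str:
    """Return the Scheme Repository value for the desired display mode.

    show="filenames"    -> QKB-root/scheme
    show="descriptions" -> QKB-root
    """
    root = PurePosixPath(qkb_root)
    if show == "filenames":
        # Point at the scheme subdirectory to list scheme filenames.
        return str(root / "scheme")
    if show == "descriptions":
        # Point at the QKB root itself to list scheme descriptions.
        return str(root)
    raise ValueError("show must be 'filenames' or 'descriptions'")
```

With a hypothetical root of /opt/qkb, the function returns /opt/qkb/scheme for filenames and /opt/qkb for descriptions.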
Paths that are specified in the DataFlux Data Management Platform Tools group box are relative to the SAS Data Integration Studio application. This group box contains the following item:
DataFlux Installation Folder
specifies the folder where DataFlux Data Management Studio is installed. Under the 64-bit version of Windows, the default path is C:\Program Files (x86)\DataFlux\DMStudio\release_number. Use the keyboard, drop-down list, or the Browse button to specify a different installation folder.
If you specify the path to DataFlux Data Management Studio and click OK to save your changes, then the next time you start SAS Data Integration Studio, you can run DataFlux Data Management Studio by selecting Tools, then selecting DataFlux Data Management Platform Tools, and then selecting Data Management Studio.