The data quality methods
referenced above are stored in the
dfs_serv_dq.xml
configuration file. The code in the configuration file is stored in an
MDS package ‘SYSPROC.DQ.DQUALITY’ at server start-up. Upon execution, the code is read
from the MDS package and executed with the provided parameters. The EXECUTE privilege
is granted to the SASUSERS group for the DQ schema. The following example uses
DQSTANDARDIZE
to
outline the steps for working with the data quality methods.
-
Implement a data quality
method by extracting the code from
dfs_serv_dq.xml
.
method dqstandardize(nvarchar(256) value, nvarchar(256) qkb_def, nvarchar(50) locale)
returns nvarchar(256);
/* Set the QKB path. */
dq.setQKB('&cfg.qkb_loc;');
if (check_err()) then return null;
/* Load locale. */
dq.loadLocale(locale);
if (check_err()) then return null;
/* cleanse data */
value = dq.standardize(qkb_def, value );
if (check_err()) then return null;
return value;
end;
-
Use a FedSQL procedure
to call the method that was implemented in step 1:
proc fedsql noprompt=&connectionString nolibs noerrorstop;
create table schema.cleansedTable as SELECT
SYSPROC.DQ.DQUALITY.DQSTANDARDIZE (
name ,
'Name',
'ENUSA' ) AS Standardized
FROM schema.inputTable;
quit;
The data quality methods on SAS Federation Server are designed to run with FedSQL.
When you run a data quality method from a SELECT statement in a FedSQL view, you are
dynamically connected to the SYSPROC catalog if the catalog is not present in your
DSN connection. However, if you issue a SELECT statement outside of a FedSQL view, you
are not dynamically connected to the SYSPROC catalog. To avoid an error, you must
create a
federated DSN and include the SYSPROC catalog.