The SAS Clinical Standards
Toolkit validation assesses the compliance of data, and the metadata
describing the data, with an accepted reference standard. It assesses
the consistency of values in a specific column, between columns, across
records in a specific data set, and across data sets. The primary
outputs are a Results data set that itemizes the process findings and
an optional Metrics data set that summarizes the results.
The SAS Clinical Standards
Toolkit provides a framework to build a process. The process uses
inputs or process controls to evaluate the compliance of source data
with a reference standard. Each SAS Clinical Standards Toolkit process
uses a SAS program file to point to a SASReferences control data set,
and to execute a primary action SAS macro (such as sdtm_validate).
This SAS program file is referred to as a driver module in this document.
Generally, validation
is performed by running SAS macros against the standard, which is
represented by SAS files. Validation of some standards, such as CDISC
CRT-DDS, might include validating files that are not SAS files (such
as define.xml).
This display shows a SAS Clinical Standards
Toolkit validation process. Each component is fully described in the
following sections.
Components of a SAS Clinical Standards Toolkit Validation Process
- Source Data is a set of SAS data sets in one or more libraries that collectively
represents a clinical study. These SAS data sets are referred to as
study domains or study data sets. One or more source data sets are
required by a typical SAS Clinical Standards Toolkit validation process.
However, it is possible to test only the structural compliance of
source metadata by limiting validation to a subset of validation checks.
- Source Metadata is a set of SAS data sets in one or more libraries that provide
metadata about the source data. The source metadata is typically in
a format specific to a standard. For example, metadata about source
data sets might be captured in a source_tables data set. Metadata
about columns in those source data sets might be captured in a source_columns
data set.
- Process Controls is the set of instructions that each SAS Clinical Standards Toolkit
process uses to perform a specific action. These instructions can be
provided in any number of files of various types. For a
SAS Clinical Standards Toolkit validation process, these files include:
  - Properties are a series of name-value pairs that are translated into SAS global
macro variables. These macro variables are available for the duration
of the SAS Clinical Standards Toolkit process. Properties can be
defined in any number of files. Both text file format and SAS
data set format are supported.
For information
about a sample validation.properties file, see Validation Check Metadata: Validation Master. For information
about the SAS Clinical Standards Toolkit global macro variables, see Global Macro Variables.
  - Set of Checks to Run is a set of checks that represent all
or some of the checks defined for a standard. Each check provides
metadata that is used by the validation code to perform a specific
compliance assessment.
  - Controlled Terminology is an optional set of lookup values against
which source data columns can be evaluated. These values can be in
the form of SAS format catalogs or SAS data sets.
- Results are presented in a Results data set that itemizes the process findings,
and in a Metrics data set that summarizes the results. For each check, the Results
data set usually contains either a record indicating that the check
ran without error or a record that itemizes
the errors detected. Information about the process also might be included.
A Metrics data set is generated only if property
file settings call for it.
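The way these components fit together can be pictured with a small sketch. It is written in Python purely as an illustration; the toolkit itself performs validation with SAS macros such as sdtm_validate. Every name in the sketch (the metrics_enabled property, the check IDs CHK001 and CHK002, the codelist names, and the record layouts) is hypothetical and is not taken from the SAS Clinical Standards Toolkit.

```python
from collections import Counter

def parse_properties(text):
    """Turn name=value lines into a dict (a stand-in for SAS global macro variables)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        props[name.strip()] = value.strip()
    return props

def run_validation(source_rows, checks, terminology, props):
    """Run each check against every source row and return (results, metrics)."""
    results = []
    for check in checks:
        allowed = terminology[check["codelist"]]
        failures = [
            (rownum, row[check["column"]])
            for rownum, row in enumerate(source_rows, start=1)
            if row[check["column"]] not in allowed
        ]
        if failures:
            for rownum, value in failures:
                results.append({"check": check["id"], "status": "FAIL",
                                "row": rownum, "value": value})
        else:
            # One record noting that the check ran and found nothing.
            results.append({"check": check["id"], "status": "OK",
                            "row": None, "value": None})
    # Metrics are optional and controlled by a (hypothetical) property.
    metrics = None
    if props.get("metrics_enabled") == "1":
        metrics = dict(Counter(r["status"] for r in results))
    return results, metrics

# Hypothetical process controls and source data.
props = parse_properties("# sample properties\nmetrics_enabled=1\n")
terminology = {"SEX": {"M", "F", "U"}, "YN": {"Y", "N"}}
checks = [
    {"id": "CHK001", "column": "SEX", "codelist": "SEX"},
    {"id": "CHK002", "column": "DTHFL", "codelist": "YN"},
]
source_rows = [
    {"USUBJID": "001", "SEX": "M", "DTHFL": "N"},
    {"USUBJID": "002", "SEX": "X", "DTHFL": "Y"},
]
results, metrics = run_validation(source_rows, checks, terminology, props)
```

In this sketch, the properties play the role of process-wide settings, the checks list corresponds to the set of checks to run, and the terminology dictionary stands in for controlled terminology. The results list mirrors the Results data set, and the metrics summary is produced only when the property requests it.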
The SAS Clinical Standards
Toolkit validation makes these basic assumptions:
- There is some combination of source data and metadata
available as SAS files that you want to validate.
- A reference standard has been defined with which the
source data and metadata are to be compared. The SAS Clinical Standards
Toolkit provides representative reference metadata for each supported
standard.
- The source data can be in any number of SAS files,
and those SAS files can have any form. However, the metadata describing
the source data must accurately represent the source data. The metadata
must be in a form specific to a supported standard and defined by
the SAS Clinical Standards Toolkit.
- A set of validation checks must be defined, and the
validation checks must conform to a generic SAS Clinical Standards
Toolkit SAS data set structure. The SAS Clinical Standards Toolkit
provides a representative set of validation checks for each supported
standard.
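The assumption that the metadata accurately represents the source data can be checked mechanically. The following Python sketch is an illustration only, not toolkit code: it compares a column-level metadata listing with the columns actually present in the data. The record layout (a "table" key and a "column" key) and the function name compare_metadata are assumptions made for this example and do not reflect the actual structure of a source_columns data set.

```python
def compare_metadata(source_columns, source_data):
    """Return findings where column metadata and the actual data disagree.

    source_columns: list of {"table": ..., "column": ...} records (hypothetical layout)
    source_data:    mapping of table name -> list of row dicts
    """
    # Group the described columns by table.
    described = {}
    for rec in source_columns:
        described.setdefault(rec["table"], set()).add(rec["column"])

    findings = []
    # Visit every table named in either the metadata or the data.
    for table in sorted(set(described) | set(source_data)):
        rows = source_data.get(table, [])
        actual = set().union(*(row.keys() for row in rows))
        expected = described.get(table, set())
        for col in sorted(expected - actual):
            findings.append((table, col, "in metadata, missing from data"))
        for col in sorted(actual - expected):
            findings.append((table, col, "in data, missing from metadata"))
    return findings

# Hypothetical metadata and data for a single domain.
source_columns = [
    {"table": "DM", "column": "USUBJID"},
    {"table": "DM", "column": "SEX"},
    {"table": "DM", "column": "AGE"},
]
source_data = {"DM": [{"USUBJID": "001", "SEX": "M", "RACE": "ASIAN"}]}
findings = compare_metadata(source_columns, source_data)
```

An empty findings list would indicate that, at the column-name level, the metadata and the data agree. A fuller comparison would also cover attributes such as type and length.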