Validating the CDISC ADaM Data Sets

Assumption

You have a library of CDISC ADaM SAS data sets (in this example, the library is derived from a library of CDISC SDTM 3.1.2 domains). Deriving ADaM analysis files from SDTM domains is not a supported function of the SAS Clinical Standards Toolkit. Products such as SAS Clinical Data Integration can be used to create these mapping and transformation processes.

Location of ADaM Driver Programs

The ADaM driver programs are located in the cdisc-adam-2.1-1.6/sascstdemodata/programs directory under the sample study library directory.

Step 1: Derive Metadata about Your Source Data

Before you can validate your ADaM data sets, you must derive a set of metadata that describes your library of analysis data sets. In the SAS Clinical Standards Toolkit, the metadata that describes your study data sets and columns is generally referred to as source metadata. To help derive this metadata, the SAS Clinical Standards Toolkit provides a sample driver program (create_sourcemetadata.sas) that calls the SAS macro adamutil_createsrcmetafromsaslib.sas, found in the standards/cdisc-adam-2.1-1.6/macros directory under the global standards library directory. This macro initializes the source metadata from two inputs: Base SAS metadata (PROC CONTENTS output) and reference metadata (provided by SAS) that describes the CDISC ADaM standard. You might need to augment or modify this approach based on other metadata that you have available or other processes that you adopt.
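The following is a minimal sketch of the Base SAS side of this step only. It is not the adamutil_createsrcmetafromsaslib macro and does not merge in the ADaM reference metadata. It shows the kind of table-level and column-level information that PROC CONTENTS exposes for a library. The libref adamdata, the directory path, and the output names work.adam_contents, work.column_meta, and work.table_meta are assumptions made for illustration; the structure of the Toolkit's actual source metadata data sets is defined by the Toolkit.

```sas
/* Hypothetical location of the ADaM analysis data sets; adjust to your study. */
libname adamdata "/path/to/your/adam/data" access=readonly;

/* PROC CONTENTS writes one row per column for every data set in the library. */
proc contents data=adamdata._all_ out=work.adam_contents noprint;
run;

proc sql;
  /* Column-level view: name, order, type, length, and label of each column.  */
  /* In PROC CONTENTS output, TYPE=1 is numeric and TYPE=2 is character.      */
  create table work.column_meta as
  select c.memname as table_name,
         c.name    as column_name,
         c.varnum  as column_order,
         case c.type when 1 then 'N' else 'C' end as column_type,
         c.length  as column_length,
         c.label   as column_label
  from work.adam_contents as c
  order by table_name, column_order;

  /* Table-level view: one row per data set with its data set label. */
  create table work.table_meta as
  select distinct c.memname  as table_name,
                  c.memlabel as table_label
  from work.adam_contents as c
  order by table_name;
quit;
```

Inspecting output like this before you run the sample driver can help you spot data sets or columns that lack labels or have unexpected types, which is where you are most likely to need to augment the derived metadata.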

Step 2: Build Your Own Driver Program

The sample driver that can be run to demonstrate the ADaM validation process is validate_data.sas. Use this driver as a sample to build your own driver program, specifying the locations of source data and metadata. Take note of the SASReferences data set created in the validate_data driver. It references both SDTM and ADaM data and metadata, as well as ADaM controlled terminology. Reference to the SDTM metadata supports comparison of ADaM column metadata with SDTM column metadata for those columns derived directly from SDTM. (See Assumption.)

Step 3: Submit the Modified Driver Program

Submit your modified driver program to run the validation process. The SAS Clinical Standards Toolkit validation processes generally create two types of output data sets: validation results and validation metrics. The names and locations of these SAS data sets depend on your SASReferences specifications for results management.
Here is a sample Results data set produced by the validation process:
Partial Sample Results Data Set (CDISC ADaM 2.1 Validation Process)
The validation results are representative of the range of outcomes that you might see: no reported errors (such as ADAM0001), multiple errors detected (such as ADAM0053), or an inability to run a specific check because of a lack of data or metadata (such as ADAM0102). The validation metrics output data set summarizes the validation results and provides a denominator for each check.
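As an illustration of the three outcomes described above, here is a small sketch that classifies each check from a results-style data set. Everything in it is an assumption made for the example: the data set work.toy_results, its rows, and the convention that a flag column named resultflag holds 0 for no problem, 1 for a detected problem, and a negative value for a check that could not run. Confirm the actual column names and codes in your own Results data set before adapting it.

```sas
/* Toy rows invented for illustration; this is not actual Toolkit output. */
data work.toy_results;
  length checkid $8;
  input checkid $ resultflag;
  datalines;
ADAM0001 0
ADAM0053 1
ADAM0053 1
ADAM0053 1
ADAM0102 -1
;
run;

proc sql;
  /* One row per check: count flagged records and assign an overall status. */
  /* The comparison (resultflag = 1) evaluates to 1 or 0, so SUM counts it. */
  create table work.toy_summary as
  select checkid,
         sum(resultflag = 1) as n_errors,
         case
           when min(resultflag) < 0     then 'Not run'
           when sum(resultflag = 1) > 0 then 'Errors detected'
           else 'No errors reported'
         end as status length=20
  from work.toy_results
  group by checkid
  order by checkid;
quit;
```

A summary like this is only a convenience view. The validation metrics data set produced by the Toolkit remains the source for per-check denominators.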
For a more thorough discussion of how validation is performed, see Chapter 8, “Internal Validation,” in the SAS Clinical Standards Toolkit: User's Guide.