The typical
SAS Clinical Standards Toolkit workflow in support of the CDISC standards
includes the definition and validation of SDTM submission data and
the creation and validation of a define.xml file based on the SDTM
domain data. This exercise illustrates how you can read a define.xml
file to extract the data and metadata for the purposes of recreating
the original source SDTM study. Recreating the original source study
has value as a standalone exercise, either to extract a new SDTM study
from a define.xml file or to create a new SDTM study using information
in a define.xml file as a template.
As a round-trip
exercise, this task validates the performance of the crtdds_write
and crtdds_read SAS Clinical Standards Toolkit macros and allows a
comparison of original and recreated SDTM metadata and data. The following
display details the high-level workflow for this exercise.
The following
steps describe the workflow in more detail. The first five steps describe
the derivation of the CDISC CRT-DDS 1.0 define.xml file.
-
Access
a study that contains valid CDISC SDTM data and metadata. This is
a study that contains domain data (AE, DM, CO, and so on) and SAS
Clinical Standards Toolkit metadata about that SDTM study, such as
source_tables and source_columns. SAS Clinical Standards Toolkit also
includes XSL style sheets, XML map files, and any metadata that is
provided by SAS during the SAS Clinical Standards Toolkit installation.
-
Use the
set of sample driver programs that are provided in the SAS Clinical
Standards Toolkit to define the input and output files for each process
task and to invoke the macros that support each standard-specific
task. The driver programs are designed to run with the sample studies,
but can be modified as needed. New custom drivers can also be created
and used.
-
Submit
the create_crtdds10_fromsdtm311.sas driver program to access the crtdds_sdtm311todefine10.sas
macro, and create the 39 data sets that comprise the SAS representation
of the CRT-DDS model. These 39 output data sets are written to the
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/data
directory.
-
Validate
the CRT-DDS data sets by submitting the validate_crtdds_data.sas driver
program. This step is optional.
-
Create
the define.xml file by submitting the create_crtdds_define.sas driver
program. This driver program generates the define.xml file from the
39 CRT-DDS data sets that were created in step 3. It also calls the
crtdds_xmlvalidate macro to validate the XML file structure. The define.xml
file is written to the
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/sourcexml
directory.
At this
point, a valid define.xml file has been created from the SAS representation
of the CRT-DDS model. In the next steps, the SDTM data and metadata
are recreated using the XML read process.
-
Submit
the create_sascrtdds_fromxml.sas driver program. This driver program
reads the define.xml file created in step 5, and generates the SAS
representation of the CRT-DDS model using the crtdds_read.sas macro.
The data sets created in this step should match the data sets created
in step 3. These data sets are written to the
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/deriveddata
directory. This driver program also generates the source_tables and source_columns
data sets in the
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/derivedmetadata
directory. Because the output is written to new target folders (deriveddata
and derivedmetadata), these data sets can be compared with the data
sets that were created or referenced in step 3.
-
SDTM domain
data sets are created from the set of reachable SAS transport files
that are specified in the define.xml file. Submit the create_sasdata_fromxpt.sas
SDTM driver program. For SDTM 3.1.2, the program is in the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs
directory. This driver program accesses the sdtmutil_createsasdatafromxpt.sas
macro to generate the SDTM domain data sets from the SAS transport
files. SAS Clinical Standards Toolkit does not create the SAS transport
files. These files would have been produced before the define.xml file
was generated, as part of the electronic Common Technical Document (eCTD)
preparation process. The sdtmutil_createsasdatafromxpt.sas
macro assumes that the SAS transport files are reachable from a folder
relative to the location of the referenced define.xml file. In the
create_sasdata_fromxpt.sas SDTM driver program, the XPT files are
read from the
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/transport
directory. The generated data sets are written to the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/data
directory. At this point, the SDTM domain data sets should contain
the same information as the original domain data sets that were accessed
at the beginning of this process. By specifying a new target folder
location, the SDTM data sets can be validated against those referenced
in steps 1 and 3 above.
-
Source
metadata that describes the SDTM domains and columns is derived using
information contained in the CRT-DDS data sets derived in step 6.
Submit the create_sourcemetadata.sas SDTM driver program. For SDTM
3.1.2, it is installed in the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs
directory. In this exercise, this driver program calls the sdtmutil_createsrcmetafromcrtdds
macro, which uses a library of SAS data sets that capture define.xml
metadata (typically derived using the crtdds_read macro). The output
of this step is a set of SDTM metadata in source_tables, source_columns,
and source_study data sets. These data sets are written to the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/metadata
directory. At this point, the SDTM metadata should contain the same
information as the original metadata that was accessed at the beginning
of this process. By specifying a new target folder location, the SDTM
metadata data sets can be validated against those referenced in steps
1 and 3 above.
-
SAS formats
that support SDTM controlled terminology are derived using information
contained in the CRT-DDS data sets that were derived in step 6. Submit
the create_formatsfromcrtdds.sas SDTM driver program. For SDTM 3.1.2,
this program is installed in the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs
directory. The driver program accesses the sdtmutil_createformatsfromcrtdds.sas
macro and generates the controlled terminology SAS formats catalog
based on codelists specified in the define.xml file. The derived SAS
format catalog is written to the
!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/formats
directory. These formats should match those formats that were referenced
by the SDTM columns at the beginning of this process. By specifying
a new target folder location, the SAS format catalog can be validated
against the catalog referenced in steps 1 and 3 above.
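The catalog comparison mentioned in the last step can be sketched in Base SAS as follows. This is a minimal sketch and not part of the toolkit: the librefs origfmt and newfmt, the placeholder paths, and the assumption that each folder holds a catalog named FORMATS are all illustrative. PROC FORMAT with the CNTLOUT= option writes the format definitions in each catalog to a data set, and PROC COMPARE then reports any differences between the two data sets.

```sas
/* Hypothetical librefs and placeholder paths: substitute your own folders. */
libname origfmt "<folder that contains the original format catalog>" access=readonly;
libname newfmt  "<folder that contains the derived format catalog>" access=readonly;

/* Unload each catalog to a data set.                        */
/* Assumption: the catalog in each folder is named FORMATS.  */
proc format library=origfmt.formats cntlout=work.orig_fmts;
run;
proc format library=newfmt.formats cntlout=work.new_fmts;
run;

/* Sort on format name, format type, and range start so that rows line up. */
proc sort data=work.orig_fmts;
  by fmtname type start;
run;
proc sort data=work.new_fmts;
  by fmtname type start;
run;

/* Report differences in labels and other format attributes. */
proc compare base=work.orig_fmts compare=work.new_fmts listall;
  id fmtname type start;
run;
```

Comparing the CNTLOUT= data sets rather than the catalogs themselves keeps the check independent of where each catalog is stored.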
Note: When running multiple driver programs:
The SAS
Clinical Standards Toolkit uses autocall macro libraries to contain
and reference standard-specific code libraries. Once the autocall
path is set, and one or more macros have been used in an autocall
macro library, deallocation or reallocation of the autocall file reference
cannot occur unless the autocall path is reset to exclude the specific
file reference.
This becomes
a problem with repeated calls to %cstutil_processsetup() or %cstutil_allocatesasreferences
in the same SAS session. Unless you reset the autocall path between calls,
you might receive SAS errors such as the following:
ERROR - At least one file associated with fileref SDTMAUTO is still in use.
ERROR - Error in the FILENAME statement.
If you
call %cstutil_processsetup() or %cstutil_allocatesasreferences more
than once in the same SAS session (typically with %let _cstReallocateSASRefs=1
set so that the SAS Clinical Standards Toolkit attempts reallocation),
submit the following code between code submissions:
%let _cstReallocateSASRefs=1;
%include "&_cstGRoot/standards/cst-framework-1.3/programs/resetautocallpath.sas";
In the
driver programs provided with the SAS Clinical Standards Toolkit,
this code is commented out so that it is not submitted at
run time.
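Put together, a session that submits two driver programs back to back might look like the following sketch. The driverdir macro variable and its path are placeholders introduced here for illustration, and the sketch assumes that the first driver has already defined the &_cstGRoot macro variable through its setup call.

```sas
/* Placeholder: folder that holds the CRT-DDS sample driver programs. */
%let driverdir=<path to the CRT-DDS driver programs>;

/* First driver: its setup allocates the autocall file references. */
%include "&driverdir/create_crtdds10_fromsdtm311.sas";

/* Reset the autocall path so that the file references can be reallocated. */
%let _cstReallocateSASRefs=1;
%include "&_cstGRoot/standards/cst-framework-1.3/programs/resetautocallpath.sas";

/* Second driver, submitted in the same SAS session. */
%include "&driverdir/validate_crtdds_data.sas";
```

The same reset can be repeated before each additional driver program in the session.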
Once the
round-trip exercise is complete, the data derived from the process should
match the original data. Some metadata might not match exactly
(particularly date and time fields that
record real-time information). You can detect differences by running
PROC COMPARE on any of the derived data and metadata data sets
against the corresponding original data and metadata data sets.
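As an illustration, the following sketch compares the original and derived source_columns data sets. The librefs origmeta and newmeta and the placeholder paths are assumptions; substitute the folders used in your own run. PROC COMPARE sets the automatic macro variable SYSINFO, and a value of 0 means that no differences were found.

```sas
/* Hypothetical librefs and placeholder paths: substitute your own folders. */
libname origmeta "<folder that contains the original metadata>" access=readonly;
libname newmeta  "<folder that contains the derived metadata>" access=readonly;

/* Compare the two data sets observation by observation. */
proc compare base=origmeta.source_columns
             compare=newmeta.source_columns
             listall;
run;

/* Capture SYSINFO immediately, because later steps reset it. */
%let comparerc=&sysinfo;
%put NOTE: PROC COMPARE return code for source_columns = &comparerc;
```

If the two data sets are not in the same row order, sort both on their key variables first and add an ID statement to the PROC COMPARE step. Fields that record run-time dates and times are expected to differ and can be excluded from the comparison with a DROP= data set option.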