Special Topic: A Round Trip Exercise Involving the CDISC SDTM and CDISC CRT-DDS Standards

The typical SAS Clinical Standards Toolkit workflow in support of the CDISC standards includes the definition and validation of SDTM submission data and the creation and validation of a define.xml file based on the SDTM domain data. This exercise illustrates how you can read a define.xml file to extract the data and metadata for the purposes of recreating the original source SDTM study. Recreating the original source study has value as a standalone exercise, either to extract a new SDTM study from a define.xml file or to create a new SDTM study using information in a define.xml file as a template.
As a round-trip exercise, this task validates the performance of the crtdds_write and crtdds_read SAS Clinical Standards Toolkit macros and allows a comparison of original and recreated SDTM metadata and data. The following display details the high-level workflow for this exercise.
Figure: Round Trip Process (flow of the round-trip XML process)
The following steps describe the workflow in more detail. The first five steps describe the derivation of the CDISC CRT-DDS 1.0 define.xml file.
  1. Access a study that contains valid CDISC SDTM data and metadata. This is a study that contains domain data (AE, DM, CO, and so on) and SAS Clinical Standards Toolkit metadata about that SDTM study, such as source_tables and source_columns. SAS Clinical Standards Toolkit also includes XSL style sheets, XML map files, and any metadata that is provided by SAS during the SAS Clinical Standards Toolkit installation.
  2. Use the set of sample driver programs that are provided in the SAS Clinical Standards Toolkit to define the input and output files for each process task and to invoke the macros that support each standard-specific task. The driver programs are designed to run with the sample studies, but can be modified as needed. New custom drivers can also be created and used.
  3. Submit the create_crtdds10_fromsdtm311.sas driver program to access the crtdds_sdtm311todefine10.sas macro, and create the 39 data sets that comprise the SAS representation of the CRT-DDS model. These 39 output data sets are written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/data directory.
  4. Validate the CRT-DDS data sets by submitting the validate_crtdds_data.sas driver program. This step is optional.
  5. Create the define.xml file by submitting the create_crtdds_define.sas driver program. This driver program generates the define.xml file from the 39 CRT-DDS data sets that were created in step 3. It also calls the crtdds_xmlvalidate macro to validate the XML file structure. The define.xml file is written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/sourcexml directory.
    At this point, a valid define.xml file has been created from the SAS representation of the CRT-DDS model. In the next steps, the SDTM data and metadata are recreated using the XML read process.
  6. Submit the create_sascrtdds_fromxml.sas driver program. This driver program reads the define.xml file created in step 5 and uses the crtdds_read.sas macro to generate the SAS representation of the CRT-DDS model. The data sets created in this step should match the data sets created in step 3. They are written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/deriveddata directory. The driver program also generates the source_tables and source_columns data sets in the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/derivedmetadata directory. Because the output is written to new target folders (deriveddata and derivedmetadata), these data sets can be compared with the data sets that were created or referenced in step 3.
  7. Submit the create_sasdata_fromxpt.sas SDTM driver program to create the SDTM domain data sets from the SAS transport files that are specified in the define.xml file. For SDTM 3.1.2, the program is in the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs directory. This driver program accesses the sdtmutil_createsasdatafromxpt.sas macro to generate the SDTM domain data sets from the SAS transport files. SAS Clinical Standards Toolkit does not create the SAS transport files. These files would have been produced before the define.xml file was generated, as part of the Electronic Common Technical Document (eCTD) preparation process. The sdtmutil_createsasdatafromxpt.sas macro assumes that the SAS transport files are reachable from a folder relative to the location of the referenced define.xml file. In the create_sasdata_fromxpt.sas SDTM driver program, the XPT files are read from the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/transport directory. The generated data sets are written to the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/data directory. At this point, the SDTM domain data sets should contain the same information as the original domain data sets that were accessed at the beginning of this process. Because the output is written to a new target folder, the SDTM data sets can be compared with those referenced in steps 1 and 3 above.
  8. Submit the create_sourcemetadata.sas SDTM driver program to derive the source metadata that describes the SDTM domains and columns from the CRT-DDS data sets that were derived in step 6. For SDTM 3.1.2, the program is installed in the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs directory. In this exercise, this driver program calls the sdtmutil_createsrcmetafromcrtdds macro, which uses a library of SAS data sets that capture define.xml metadata (typically derived using the crtdds_read macro). The output of this step is a set of SDTM metadata in the source_tables, source_columns, and source_study data sets. These data sets are written to the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/metadata directory. At this point, the SDTM metadata should contain the same information as the original metadata that was accessed at the beginning of this process. Because the output is written to a new target folder, the SDTM metadata data sets can be compared with those referenced in steps 1 and 3 above.
  9. Submit the create_formatsfromcrtdds.sas SDTM driver program to derive the SAS formats that support SDTM controlled terminology from the CRT-DDS data sets that were derived in step 6. For SDTM 3.1.2, this program is installed in the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/programs directory. The driver program accesses the sdtmutil_createformatsfromcrtdds.sas macro and generates the controlled terminology SAS format catalog based on the codelists specified in the define.xml file. The derived SAS format catalog is written to the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/derived/formats directory. These formats should match the formats that were referenced by the SDTM columns at the beginning of this process. Because the output is written to a new target folder, the SAS format catalog can be compared with the catalog referenced in steps 1 and 3 above.
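To illustrate what reading a SAS transport file involves in step 7, the following is a minimal sketch that uses the Base SAS XPORT engine directly. It is not the sdtmutil_createsasdatafromxpt.sas macro, and it does not read the define.xml file. The file name dm.xpt, the librefs, and both paths are placeholders chosen for illustration, and the sketch assumes that the transport file was written with the XPORT engine (not with PROC CPORT).

```sas
/* Placeholder paths: replace with the transport and output locations at your site. */
libname xptin   xport "/path/to/transport/dm.xpt" access=readonly;
libname sdtmout "/path/to/derived/data";

/* Copy every data set found in the transport file into the output library. */
proc copy in=xptin out=sdtmout memtype=data;
run;

/* Release the transport file when finished. */
libname xptin clear;
```

The toolkit macro performs this kind of conversion for each transport file that the define.xml file references, so a hand-written step like this one is mainly useful for spot-checking a single domain.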
Note: When running multiple driver programs:
The SAS Clinical Standards Toolkit uses autocall macro libraries to contain and reference standard-specific code libraries. After the autocall path is set and one or more macros from an autocall macro library have been used, the autocall file reference cannot be deallocated or reallocated unless the autocall path is first reset to exclude that file reference.
This becomes a problem with repeated calls to %cstutil_processsetup() or %cstutil_allocatesasreferences in the same SAS session. Unless you reset the autocall path between calls, you might receive SAS errors such as the following:
ERROR - At least one file associated with fileref SDTMAUTO is still in use.
ERROR - Error in the FILENAME statement. 
If you call %cstutil_processsetup() or %cstutil_allocatesasreferences more than once in the same SAS session, submit the following code between code submissions. Setting _cstReallocateSASRefs to 1 tells the SAS Clinical Standards Toolkit to attempt reallocation:
%let _cstReallocateSASRefs=1;
%include "&_cstGRoot/standards/cst-framework-1.3/programs/resetautocallpath.sas";
In the driver programs provided with the SAS Clinical Standards Toolkit, the previous code is commented out so that it is not submitted at run time.
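As a sketch of where the reset fits when two driver programs run in one SAS session, the following sequence places the two statements shown above between the driver submissions. The macro variables crtddsDrivers and sdtmDrivers and their paths are hypothetical placeholders for the folders that hold the driver programs. The sketch also assumes that &_cstGRoot has already been set by the toolkit before the reset is submitted.

```sas
/* Hypothetical driver locations: replace with the folders used at your site. */
%let crtddsDrivers = /path/to/crtdds/driver/programs;
%let sdtmDrivers   = /path/to/sdtm/driver/programs;

/* First driver: read define.xml into the CRT-DDS data sets (step 6). */
%include "&crtddsDrivers/create_sascrtdds_fromxml.sas";

/* Reset the autocall path so the next driver can reallocate its SAS references. */
%let _cstReallocateSASRefs=1;
%include "&_cstGRoot/standards/cst-framework-1.3/programs/resetautocallpath.sas";

/* Second driver: derive the SDTM source metadata (step 8). */
%include "&sdtmDrivers/create_sourcemetadata.sas";
```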
When the round-trip exercise is complete, the data derived from the process should match the original data. Some metadata might not match exactly, particularly date and time fields that record real-time information. To detect differences, run PROC COMPARE on any derived data or metadata data set against the corresponding original data set.
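The comparison can be sketched as follows for a single domain. The librefs, the paths, and the choice of the DM data set are assumptions made for illustration. Substitute the original and derived locations used in your run, and repeat the step for each data set of interest.

```sas
/* Placeholder paths: point these at the original and the derived libraries. */
libname origdata "/path/to/original/sdtm/data" access=readonly;
libname newdata  "/path/to/derived/data"       access=readonly;

/* Compare the original DM domain with the recreated one, observation by observation. */
proc compare base=origdata.dm compare=newdata.dm listall;
run;

/* SYSINFO holds the PROC COMPARE return code; capture it immediately after the step. */
%let compareRC = &sysinfo;
%put NOTE: PROC COMPARE return code for DM = &compareRC (0 means no differences were found);
```

A nonzero return code is a bit-coded summary of the kinds of differences found, so attribute-only differences (for example, labels or formats) can be distinguished from differences in data values.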