Writing XML Files

Overview

Support of CDISC XML-based standards such as CDISC CRT-DDS (define.xml) and CDISC ODM includes the ability to render these files in SAS data set format and the ability to create model-specific XML files from a SAS data set representation of those standards.
In SAS Clinical Standards Toolkit 1.3, you can create a CDISC CRT-DDS 1.0 define.xml file that references a CDISC SDTM 3.1.1 or 3.1.2 study. CDISC ODM write capabilities are under development. (For the latest updates, see the SAS Support Web site for SAS Clinical Standards Toolkit at http://support.sas.com/rnd/base/cdisc/cst/index.html.)
The next section outlines the basic workflow for the creation of model-specific XML files.

Basic Workflow

The following is the basic workflow for writing XML files:
  1. Build the SAS representation of a given XML-based standard by referencing an existing set of data and metadata about a clinical study, or by creating data and metadata about a new clinical study.
  2. Validate the SAS representation of the XML-based standard (to include foreign key relationships, value conformance to a set of expected values, and so on). This step is optional.
  3. Create a standardized intermediate cubeXML file using the data and metadata contained in the SAS representation of the standard.
  4. (Build and) reference a set of valid XSL style sheets for each target data set (such as ItemDefs.xsl).
  5. Use the SAS DATA step component JavaObj to read the cubeXML file using the XSL style sheets to create the target standard-specific XML file.
  6. Validate the structure and syntax of the XML file that was created. This step is optional.
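As a rough illustration of step 5 (applying an XSL style sheet to an XML file), the following sketch uses PROC XSL. This is a substitute technique and not the toolkit's own JavaObj-based mechanism. It assumes a SAS release that includes PROC XSL, and all three file paths are hypothetical placeholders.

```sas
/* Hypothetical file locations: replace with real paths. */
filename cubexml "C:/temp/cube.xml";          /* intermediate XML input */
filename xslin   "C:/temp/ItemDefs.xsl";      /* XSL style sheet        */
filename xmlout  "C:/temp/itemdefs_out.xml";  /* transformed output     */

/* Apply the style sheet to the input file and write the result. */
proc xsl in=cubexml xsl=xslin out=xmlout;
run;
```

In the toolkit workflow itself, the transformation is performed for you by the crtdds_write macro, so a manual step like this is useful only for experimenting with a single style sheet.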

Creating the CDISC CRT-DDS 1.0 define.xml File

Four key macros that are provided with the SAS Clinical Standards Toolkit support creation of the define.xml file. They are listed here in the order in which they are executed:
  • The crtdds_sdtm311todefine10.sas macro creates the 39 tables for the SAS representation of the CRT-DDS files from SDTM metadata. This macro, using SDTM table and column metadata as its source, populates a subset of 12 CRT-DDS data sets. Although the macro name implies that it is specific to SDTM 3.1.1, it operates on both CDISC SDTM 3.1.1 and 3.1.2 domains.
  • The crtdds_validate.sas macro submits a set of validation checks based on what is defined in the Validation Control data set to validate the referenced SAS representation of the CRT-DDS files.
  • The crtdds_write.sas macro creates the define.xml file from the SAS representation of the CRT-DDS files.
  • The crtdds_xmlvalidate.sas macro validates that the XML file is syntactically correct. This macro is important if you customize the define.xml file outside of the workflow. For example, if you edit the define.xml file to add links for annotated CRF pages, this macro validates the syntax.
These macros are called by driver programs, each of which sets up a SAS Clinical Standards Toolkit process to perform a specific task. Three sample driver programs are provided with the SAS Clinical Standards Toolkit CDISC CRT-DDS standard. The following lists the purpose of each of these drivers:
  1. The create_crtdds10_from_sdtm311.sas driver program sets up the required metadata and SASReferences data set for the sample study. It runs the crtdds_sdtm311todefine10.sas macro. It creates the SAS representation of the CRT-DDS define data sets from the sample study SDTM data sets.
  2. The validate_crtdds_data.sas driver program validates the SAS representation of the CRT-DDS define data sets based on the selected CRT-DDS validation checks. This driver program can be run multiple times until data validation has been reconciled.
  3. The create_crtdds_define.sas driver program creates the define.xml file. It runs the crtdds_write and crtdds_xmlvalidate macros. This driver program creates and validates the XML syntax for the define.xml file.
These three driver programs are examples that are provided with the SAS Clinical Standards Toolkit. You can use these driver programs or create your own. The names of these driver programs are not important. However, the content is important and demonstrates how the various SAS Clinical Standards Toolkit framework macros are used to generate the required metadata files.
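The order of the three drivers can be sketched as a sequence of %INCLUDE statements. This is only an illustration under two assumptions: that all three sample drivers reside in the SAS 9.2 programs directory used elsewhere in this chapter, and that they can be submitted in a single SAS session. Neither assumption is confirmed here, and the macro variable name progDir is hypothetical.

```sas
/* Assumption: all three sample drivers are stored in this directory. */
%let progDir=!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/programs;

/* 1. Build the SAS representation of CRT-DDS from SDTM metadata. */
%include "&progDir/create_crtdds10_from_sdtm311.sas";

/* 2. Validate the SAS representation (repeat until reconciled). */
%include "&progDir/validate_crtdds_data.sas";

/* 3. Write define.xml and validate its XML syntax. */
%include "&progDir/create_crtdds_define.sas";
```

Running each driver separately, and reviewing its results before moving on, is the more cautious approach, especially for the validation step.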

Sample Driver Program: create_crtdds10_from_sdtm311.sas

Overview

The create_crtdds10_from_sdtm311.sas driver program sets up the required environment variables and library references to initiate the crtdds_sdtm311todefine10.sas macro. This macro extracts data from the SDTM 3.1.1 or 3.1.2 metadata files. (For more information about the source_tables and source_columns data sets, see Source Metadata.) Depending on the available source information, the macro attempts to convert the information into the 39 tables that represent the SAS interpretation of the CDISC CRT-DDS 1.0 model. All 39 data sets are created, but only those data sets with the available data are populated. The other tables contain zero observations.
The following parameters must be set by the user before submitting the macro:
Parameters for the crtdds_sdtm311todefine10.sas Macro
| Parameter | Required | Description |
| --- | --- | --- |
| _cstOutLib | Yes | Identifies the library reference (LIBNAME) where the tables are created. |
| _cstSourceTables | Yes | A data set that contains the SDTM metadata for the domains to be included in the CRT-DDS file. |
| _cstSourceColumns | Yes | A data set that contains the SDTM metadata for the domain columns to be included in the CRT-DDS file. |
| _cstSourceStudy | Yes | A data set that contains the SDTM metadata for the studies to be included in the CRT-DDS file. |
The following is an example of a call to the crtdds_sdtm311todefine10.sas macro:
%crtdds_sdtm311todefine10(
_cstOutLib=srcdata,
_cstSourceTables=sampdata.source_tables,
_cstSourceColumns=sampdata.source_columns,
_cstSourceStudy=sampdata.source_study
);
In this example, _cstOutLib is set to srcdata, so all of the CRT-DDS tables are written to the Srcdata library. The _cstSourceTables, _cstSourceColumns, and _cstSourceStudy parameters reference the source_tables, source_columns, and source_study data sets in the Sampdata library (sampdata.source_tables, sampdata.source_columns, and sampdata.source_study).
The create_crtdds10_from_sdtm311.sas driver program is provided with SAS, and it is ready to run on any of the SDTM sample studies. Although the program name implies that it is specific to SDTM 3.1.1, it operates on both CDISC SDTM 3.1.1 and 3.1.2 domains. The driver program can be run interactively or in batch. To run the program interactively, start a SAS session, and load the driver program into the SAS editor.
For SAS 9.1.3, the driver program is located at:
!sasroot/../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/programs/create_crtdds10_from_sdtm311.sas
For SAS 9.2, the driver program is located at:
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/programs/create_crtdds10_from_sdtm311.sas
The value for !sasroot is the location of your SAS installation directory.

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are two input file references and one output reference that are key to successful completion of the create_crtdds10_from_sdtm311.sas driver program. The following table lists these files and data sets, and they are discussed in separate sections. In the sample create_crtdds10_from_sdtm311.sas driver program, the following values are set for &studyRootPath and &studyOutputPath and are specific to a SAS release.
SAS 9.1.3
&studyRootPath=!sasroot/../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata
&studyOutputPath=!sasroot/../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/
SAS 9.2
&studyRootPath=!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata
&studyOutputPath=!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0
Key Components of the SASReferences Data Set
| Input or Output | Metadata Type | SAS LIBNAME or Fileref to Use | Reference Type | Path | Name of File |
| --- | --- | --- | --- | --- | --- |
| Input | sourcemetadata | sampdata | LIBNAME | &studyRootPath/metadata | source_tables.sas7bdat |
| Input | sourcemetadata | sampdata | LIBNAME | &studyRootPath/metadata | source_columns.sas7bdat |
| Output | sourcedata | srcdata | LIBNAME | &studyOutputPath/data | |
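For ad hoc inspection outside a toolkit process, the library references described in the table can be assigned manually. The following is a minimal sketch, assuming the SAS 9.2 path values listed above. The toolkit normally allocates these libraries from the SASReferences data set, so this is not a replacement for that setup.

```sas
/* Values based on the SAS 9.2 settings listed above. */
%let studyRootPath=!sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata;
%let studyOutputPath=!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0;

/* Input: SDTM source metadata (read-only to protect the sample data). */
libname sampdata "&studyRootPath/metadata" access=readonly;

/* Output: SAS representation of the CRT-DDS data sets. */
libname srcdata "&studyOutputPath/data";

/* List the members of the input library without variable details. */
proc contents data=sampdata._all_ nods;
run;
```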

Process Inputs

The sourcemetadata type refers to two data sets that contain the SDTM domain metadata, source_tables and source_columns. Both data sets are stored in the same library. Because the sample create_crtdds10_from_sdtm311.sas driver program provided with the SAS Clinical Standards Toolkit references a source CDISC SDTM 3.1.2 study, the source_tables data set contains SDTM 3.1.2 metadata about each standard domain defined in the CDISC SDTM 3.1.2 Implementation Guide and includes any customizations that you have added. The source_columns data set contains similar metadata, but at the column level. This source metadata is read from the !sasroot/../../SASClinicalStandardsToolkitSDTM312/1.3/sample/cdisc-sdtm-3.1.2/sascstdemodata/metadata directory. This location is represented in the driver program by the Sampdata library name.
A source study data set (source_study.sas7bdat) is required by this macro. The following variables are required in this data set:
Variables Required in the Source Study Data Set (source_study.sas7bdat)
| Variable* | Required | Description |
| --- | --- | --- |
| StudyName | Yes | Name of the study. This value is used to populate the srcdata.study.studyname column. |
| DefineDocumentName | Yes | Name of the define document being created. This value is used to populate the srcdata.definedocument.description and srcdata.definedocument.id columns. |
| SASref | Yes | Reference that ties the study name to the corresponding domains that are associated with this study in the source_tables and source_columns data sets. |
| ProtocolName | Yes | Name of the protocol for the study. This value is used to populate the srcdata.study.protocolname column. |
| StudyDescription | Yes | Description of the study. This value is used to populate the srcdata.study.studydescription column. Note: You should not use commas, semicolons, or quotation marks in the description. |
*All variables are required to be non-blank.
Multiple studies can be referenced in the source_study, source_tables, and source_columns data sets by using a different SASref value for each study to link its records across the tables.
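A source study data set with the required variables might be built as in the following sketch. The variable lengths, the output location (work.source_study), and every value shown are illustrative assumptions and are not taken from the sample study. The second step checks the two documented constraints: no blank values, and no commas, semicolons, or quotation marks in the description.

```sas
/* Hypothetical one-row source study data set (all values illustrative). */
data work.source_study;
  length SASref $8
         StudyName DefineDocumentName ProtocolName $128
         StudyDescription $2000;   /* lengths are assumptions */
  SASref             = 'SRCDATA';
  StudyName          = 'Example Study 1';
  DefineDocumentName = 'Example Study 1 Define Document';
  ProtocolName       = 'Example Protocol 1';
  StudyDescription   = 'Illustrative description without special punctuation';
run;

/* Check that required variables are non-blank and the description is clean. */
data _null_;
  set work.source_study;
  length name $32;
  array req{*} SASref StudyName DefineDocumentName ProtocolName StudyDescription;
  do i = 1 to dim(req);
    if missing(req{i}) then do;
      name = vname(req{i});
      putlog 'WARNING: required variable ' name 'is blank in row ' _n_;
    end;
  end;
  if indexc(StudyDescription, ',;', '"', "'") > 0 then
    putlog 'WARNING: StudyDescription has a comma, semicolon, or quotation mark in row ' _n_;
run;
```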

Process Outputs

The sourcedata type is the library where the metadata files are created. These metadata files are the data sets that constitute the SAS representation of the CDISC CRT-DDS 1.0 standard. The create_crtdds10_from_sdtm311.sas driver program creates 39 data sets. Most of these data sets have zero observations because there is no default SDTM metadata source. In the SAS Clinical Standards Toolkit sample study, these data sets are written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/data directory. This location is represented in the driver program by the srcdata library name.
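To see which of the created data sets were populated, you can query the SAS dictionary tables. This is a minimal sketch that assumes the srcdata libref is still assigned in the current session.

```sas
/* List each data set in the Srcdata library with its observation count. */
proc sql;
  select memname, nobs
    from dictionary.tables
    where libname = 'SRCDATA' and memtype = 'DATA'
    order by nobs desc, memname;
quit;
```

Data sets with a count of zero are the ones for which no source metadata was available.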

Process Results

When the driver program finishes running, the work._cstresults.sas7bdat data set is created. This data set contains the informational, warning, and error messages that were generated by the submitted driver program. Because the sample SASReferences data set for create_crtdds10_from_sdtm311.sas does not include a results record, the process results data set is not saved after the SAS session ends.
[Figure: Example of a partial Results data set (work._cstresults) from the CRT-DDS sample study]
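If you want to keep these results beyond the session, one option is to copy the Work data set to a permanent library before the session ends. The following sketch assumes a hypothetical directory and libref (myres) and a hypothetical output data set name. Adding a results record to the SASReferences data set is the toolkit's own way to persist results.

```sas
/* Hypothetical permanent location for saved results. */
libname myres "C:/temp/cst_results";

/* Copy the session-scoped results data set to the permanent library. */
data myres.create_crtdds10_results;
  set work._cstresults;
run;

/* Review the first few result records. */
proc print data=myres.create_crtdds10_results(obs=10);
run;
```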

Sample Driver Program: create_crtdds_define.sas

Overview

The create_crtdds_define.sas driver program sets up the required environment variables and library references to initiate the crtdds_write.sas macro. This macro reads the 39 data sets that constitute the SAS representation of the CDISC CRT-DDS 1.0 model, and converts that information to the required define.xml structure. If source metadata or data are missing, then empty elements and attributes are not created in the define.xml file. The inputs and outputs are specified in the SASReferences data set. The following table lists the optional parameters that can be set by the user when submitting the macro:
Parameters for the crtdds_write.sas Macro
| Parameter | Required | Description |
| --- | --- | --- |
| _cstCreateDisplayStyleSheet | Optional | Identifies whether the macro should create a style sheet in the same directory as the output XML file. If the value is 1, then the macro looks in the provided SASReferences file for a record with a type and subtype of referencexml and stylesheet and uses that file. If the value is 0, then the macro does not create the XSL, even if one is specified in the SASReferences file. The default setting is 1. |
| _cstOutputEncoding | Optional | XML encoding to use for the CRT-DDS file that is created. By default, UTF-8 is used. |
| _cstHeaderComment | Optional | A short comment to add at the top of the CRT-DDS file. If no comment is provided, then a default comment is used. The default comment notes that the file was produced by SAS Clinical Standards Toolkit. |
| _cstResultsOverrideDS | Optional | Provides the opportunity to designate [LIBNAME.]member as the name of the Results data set. If this parameter is omitted (default setting), then the Results data set specified by the &_cstResultsDS global macro variable is used. |
| _cstLogLevel | Optional | Identifies the level of error reporting. Valid values are Info, Warning, Error, and Fatal Error. The default setting is Info. |
The following is an example of a call to the crtdds_write.sas macro:
%crtdds_write(_cstCreateDisplayStyleSheet=1, _cstOutputEncoding=UTF-16,
             _cstResultsOverrideDS=&_cstResultsDS);
In this example, a default style sheet is generated in the same directory as the XML output based on the information in the SASReferences data set. XML encoding is set to UTF-16, and process results are written to the default &_cstResultsDS data set.
The following is the call to the macro from the sample create_crtdds_define.sas driver program:
%crtdds_write(_cstCreateDisplayStyleSheet=1);
The call creates a display style sheet and uses the default values for all other parameters.
The create_crtdds_define.sas driver program is ready to run on any of the CDISC SDTM sample studies. The driver program can be run interactively or in batch.
For SAS 9.1.3, the driver program is located at:
!sasroot/../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/programs/create_crtdds_define.sas
For SAS 9.2, the driver program is located at:
!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/programs/create_crtdds_define.sas
The value for !sasroot is the location of your SAS installation directory.
Multiple tasks can be executed in any SAS Clinical Standards Toolkit driver program. The create_crtdds_define.sas driver program calls both the crtdds_write macro to create the define.xml file, and the crtdds_xmlvalidate macro to validate the syntax of the generated define.xml file. For more information about the crtdds_xmlvalidate macro, see Validation of XML-Based Standards.

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are two input file references and three output references that are key to successful completion of the create_crtdds_define.sas driver program. The following table lists these files and data sets, and they are discussed in separate sections. In the sample create_crtdds_define.sas driver program, the following values are set for &studyRootPath and &studyOutputPath and are specific to a SAS release.
SAS 9.1.3
&studyRootPath=!sasroot/../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0
&studyOutputPath=!sasroot/../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0
SAS 9.2
&studyRootPath=!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0
&studyOutputPath=!sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0
Key Components of the SASReferences Data Set
| Input or Output | Metadata Type | LIBNAME or Fileref to Use | Reference Type | Path | Name of File |
| --- | --- | --- | --- | --- | --- |
| Input | control | control | LIBNAME | &workpath | sasreferences.sas7bdat |
| Input | sourcedata | srcdata | LIBNAME | &studyRootPath/data | |
| Input or output | referencexml | xslt01 | filename | | |
| Output | results | results | LIBNAME | &studyOutputPath/results | write_results.sas7bdat |
| Output | externalxml | extxml | filename | &studyOutputPath/sourcexml | define.xml |

Process Inputs

The control library name points to the path stored in the &workpath macro variable. This illustrates a technique for documenting that the SASReferences data set is derived in the SAS Work library. The driver program initializes the &workpath macro variable with the following SAS code:
%let workPath=%sysfunc(pathname(work));
The sourcedata type is the library that contains the 39 data sets that might have been populated by the create_crtdds10_from_sdtm311.sas driver program. These metadata files are the data sets that constitute the SAS representation of the CDISC CRT-DDS 1.0 standard. In the SAS Clinical Standards Toolkit sample study, these data sets are read from the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/data directory. This location is represented in the driver program by the Srcdata library name.

Process Outputs

The externalxml type refers to the define.xml file. This file is accessed in the driver program through the extxml fileref, and is written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/sourcexml directory.
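After the driver program runs, a quick check that the output file exists can be sketched as follows. The macro name check_define_exists is hypothetical, and the sketch assumes that the extxml fileref is still assigned in the current session.

```sas
/* Hypothetical helper: report whether the file behind extxml exists. */
%macro check_define_exists;
  %if %sysfunc(fexist(extxml)) %then
    %put NOTE: define.xml found at %sysfunc(pathname(extxml));
  %else
    %put WARNING: The extxml fileref is unassigned or its file does not exist.;
%mend check_define_exists;

%check_define_exists
```

This confirms only that the file is present. Use the crtdds_xmlvalidate macro to check its syntax.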
The referencexml type can serve as either an input or output file reference. Because the path and filename are not provided, the crtdds_write macro interprets the _cstCreateDisplayStyleSheet=1 parameter to use the default style sheet that is provided by SAS Clinical Standards Toolkit in the Global Library. Had a path and filename been provided, the referencexml type would serve as an output file reference for the crtdds_write macro to copy the default style sheet from the Global Library to the path and filename that were specified.
The results type refers to the write_results data set that documents the create define process results. In the SAS Clinical Standards Toolkit CDISC CRT-DDS folder hierarchy, this information is written to the !sasroot/../../SASClinicalStandardsToolkitCRTDDS10/1.3/sample/cdisc-crtdds-1.0/results directory.

Process Results

Inclusion of the results record (row) in the SASReferences data set signals that the process results are to be copied to a write_results data set located in the specified SAS library.
[Figure: Example of a partial Results data set from the CRT-DDS sample study]