Writing XML Files

Overview

Support of CDISC XML-based standards, such as CDISC CRT-DDS 1.0, CDISC Define-XML 2.0, and CDISC ODM, includes the ability to render these files in SAS data set format and the ability to create model-specific XML files from a SAS data set representation of those standards.
In the SAS Clinical Standards Toolkit, you can create a CDISC CRT-DDS 1.0 define.xml file or CDISC Define-XML 2.0 file (including Analysis Results Metadata 1.0) that references a CDISC SDTM study, a SEND study, or a CDISC ADaM study. You can also create a CDISC ODM 1.3.0 XML file or a CDISC ODM 1.3.1 file.
The next section outlines the basic workflow for the creation of model-specific XML files.

Basic Workflow

Here is the basic workflow for writing XML files:
  1. Build the SAS representation of a given XML-based standard by referencing an existing set of data and metadata about a clinical study, or by creating data and metadata about a new clinical study in the standard-specific SAS format.
  2. (Optional) Validate the SAS representation of the XML-based standard (to include foreign key relationships, value conformance to a set of expected values, and so on).
  3. Create a standardized intermediate cubeXML file using the data and metadata contained in the SAS representation of the standard.
  4. (Build and) reference a set of valid XSL style sheets for each target data set (such as ItemDefs.xsl).
  5. Use the SAS DATA step component JavaObj to read the cubeXML file using the XSL style sheets to create the target standard-specific XML file.
  6. (Optional) Validate the structure and syntax of the XML file that was created against an XML schema.

Creating a CDISC CRT-DDS 1.0 define.xml File

There are four key macros that are provided with the SAS Clinical Standards Toolkit that support creation of a CDISC CRT-DDS 1.0 define.xml file. The four macros are listed in the order in which they are executed:
  1. The %CRTDDS_SDTMTODEFINE macro creates the 39 tables for the SAS representation of the CRT-DDS files from SDTM metadata. This macro, using SDTM table and column metadata as its source, populates a subset of 19 CRT-DDS data sets.
    The %CRTDDS_ADAMTODEFINE macro is similar to the %CRTDDS_SDTMTODEFINE macro but uses ADaM table and column metadata.
  2. The %CRTDDS_VALIDATE macro submits a set of validation checks based on what is defined in the Validation Control data set to validate the referenced SAS representation of the CRT-DDS files.
  3. The %CRTDDS_WRITE macro creates the define.xml file from the SAS representation of the CRT-DDS files.
  4. The %CSTUTILXMLVALIDATE macro validates that the XML file is structurally and syntactically correct according to the XML schema for the CRT-DDS 1.0 standard. This macro is important if you customize the define.xml file outside of the workflow. For example, if you edit the define.xml file to add links for annotated CRF pages, this macro validates the syntax.
These macros are called by driver programs that are responsible for properly setting up each SAS Clinical Standards Toolkit process to perform a specific SAS Clinical Standards Toolkit task. Several sample driver programs are provided with the SAS Clinical Standards Toolkit CDISC CRT-DDS standard related to the creation of the define.xml file.
Here is the purpose of each of these driver programs:
  • The create_crtdds_from_sdtm.sas driver program sets up the required metadata and SASReferences data set for the sample study. It runs the %CRTDDS_SDTMTODEFINE macro. It creates the SAS representation of the CRT-DDS data sets from the sample study SDTM data sets.
  • The validate_crtdds_data.sas driver program validates the SAS representation of the CRT-DDS define data sets based on the selected CRT-DDS validation checks. This driver program can be run multiple times until data validation has been reconciled.
  • The create_crtdds_define.sas driver program creates the CDISC CRT-DDS 1.0 define.xml file. It runs the %CRTDDS_WRITE and %CSTUTILXMLVALIDATE macros. This driver program creates and validates the XML syntax for the define.xml file.
These driver programs are examples that are provided with the SAS Clinical Standards Toolkit. You can use these driver programs or create your own. The names of these driver programs are not important. However, the content is important and demonstrates how the various SAS Clinical Standards Toolkit framework macros are used to generate the required metadata files.
The driver programs create a define.xml based on SDTM metadata. Similar programs are provided with the SAS Clinical Standards Toolkit for the creation of a define.xml based on ADaM metadata.

Sample Driver Program: create_crtdds_from_sdtm.sas

Overview

The create_crtdds_from_sdtm.sas driver program sets up the required environment variables and library references to initiate the %CRTDDS_SDTMTODEFINE macro. This macro extracts data from the SDTM metadata files. (For more information about the source_tables and source_columns data sets, see Source Metadata.) Depending on the available source information, the macro attempts to convert the information into the 39 tables that represent the SAS interpretation of the CDISC CRT-DDS 1.0 model. All 39 data sets are created, but only those data sets with available data are populated. The other tables contain zero observations.
The following table lists the parameters for the driver program:
Parameters for the create_crtdds_from_sdtm.sas Driver Program
Parameter | Required | Description
_cstOutLib | Yes | The library reference (LIBNAME) where the tables are created.
_cstSourceTables | Yes | The data set that contains the SDTM metadata for the domains to include in the CRT-DDS file.
_cstSourceColumns | Yes | The data set that contains the SDTM metadata for the domain columns to include in the CRT-DDS file.
_cstSourceStudy | Yes | The data set that contains the SDTM metadata for the studies to include in the CRT-DDS file.
_cstSourceValues | No | The data set that contains the SDTM metadata for the Value Level columns to include in the CRT-DDS file.
_cstSourceDocuments | No | The data set that contains the SDTM metadata for the Document references to include in the CRT-DDS file.
Here is an example of a call to the %CRTDDS_SDTMTODEFINE macro:
%crtdds_sdtmtodefine(
  _cstOutLib=srcdata,
  _cstSourceTables=sampdata.source_tables,
  _cstSourceColumns=sampdata.source_columns,
  _cstSourceValues=sampdata.source_values,
  _cstSourceDocuments=sampdata.source_documents,
  _cstSourceStudy=sampdata.source_study
  );
In the example, the %CRTDDS_SDTMTODEFINE macro writes all of the CRT-DDS 1.0 defined tables to the Srcdata library.
The create_crtdds_from_sdtm.sas driver program is provided with the SAS Clinical Standards Toolkit, and it is ready to run on any of the SDTM sample studies. The driver program can be run interactively or in batch. To run the driver program interactively, start a SAS session, and load the driver program into the SAS editor.
The driver program is located here:
sample study library directory/cdisc-crtdds-1.0-1.7/programs

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are five input file references and one output data set reference that are key to the successful completion of the create_crtdds_from_sdtm.sas driver program. Key Components of the SASReferences Data Set for the create_crtdds_from_sdtm.sas Driver Program lists these files and data sets, and they are discussed in separate sections. In the sample create_crtdds_from_sdtm.sas driver program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-sdtm-3.1.3-1.7/sascstdemodata
&studyOutputPath=sample study library directory/cdisc-crtdds-1.0-1.7
Key Components of the SASReferences Data Set for the create_crtdds_from_sdtm.sas Driver Program
Metadata Type | SAS LIBNAME or Fileref to Use | Reference Type | Path | Name of File
Input
sourcemetadata | sampdata | libref | &studyRootPath/metadata | source_tables.sas7bdat
sourcemetadata | sampdata | libref | &studyRootPath/metadata | source_columns.sas7bdat
sourcemetadata | sampdata | libref | &studyRootPath/metadata | source_study.sas7bdat
sourcemetadata | sampdata | libref | &studyRootPath/metadata | source_values.sas7bdat
sourcemetadata | sampdata | libref | &studyRootPath/metadata | source_documents.sas7bdat
Output
sourcedata | srcdata | libref | &studyOutputPath/data |

Process Inputs

The sourcemetadata type refers to three data sets that contain the SDTM domain metadata: source_tables, source_columns, and source_values. These data sets are stored in the same library.
The sample create_crtdds_from_sdtm.sas driver program provided with the SAS Clinical Standards Toolkit references a source CDISC SDTM 3.1.3 study. So, the source_tables data set contains SDTM 3.1.3 metadata about each standard domain defined in the Study Data Tabulation Model Implementation Guide: Human Clinical Trials (Version 3.1.3) and includes any customizations that you have added. The source_columns data set contains similar metadata but it is at the column level. The source_values data set contains Value Level metadata. The source metadata is read from this location:
sample study library directory/cdisc-sdtm-3.1.3-1.7/sascstdemodata/metadata
This location is represented in the driver program by the sampdata library name.
A source study data set (source_study) is required by this driver program. The following table lists the variables that are required in this data set:
Variables Required in the Source Study Data Set (source_study)
Variable* | Required | Description
StudyName | Yes | The name of the study. This value is used to populate the srcdata.study.studyname column.
DefineDocumentName | Yes | The name of the define document to create. This value is used to populate the srcdata.definedocument.FileOID column.
SASref | Yes | The reference that ties the study name to the corresponding domains that are associated with this study in the source_tables and source_columns data sets.
ProtocolName | Yes | The name of the protocol for the study. This value is used to populate the srcdata.study.protocolname column.
StudyDescription | Yes | The description of the study. This value is used to populate the srcdata.study.studydescription column. Note: You cannot use commas, semicolons, or quotation marks in the description.
Standard | Yes | The name of the standard in the SAS Clinical Standards Toolkit. (For example, CDISC-SDTM.)
StandardVersion | Yes | The version of the standard in the SAS Clinical Standards Toolkit. (For example, 3.1.3.)
FormalStandard | Yes | The formal name of the standard as used in CRT-DDS. (For example, CDISC SDTM.)
FormalStandardVersion | Yes | The formal version of the standard as used in CRT-DDS. (For example, 3.1.3.)
*All variables are required to be non-blank.
Only a single study can be referenced in the source study data set.
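The rules for the source study data set can be checked before the driver program is run. The following is a minimal sketch in Base SAS, not code that ships with the SAS Clinical Standards Toolkit. The work.source_study data set, the variable lengths, and every value in it are placeholders chosen for illustration; only the variable names come from the table above.

```sas
/* Hypothetical one-record source_study data set. All values and     */
/* lengths are illustrative placeholders, not toolkit defaults.      */
data work.source_study;
  length StudyName DefineDocumentName SASref ProtocolName $64
         StudyDescription $200
         Standard StandardVersion FormalStandard FormalStandardVersion $32;
  StudyName             = "Study 1";
  DefineDocumentName    = "define1";
  SASref                = "SRCDATA";
  ProtocolName          = "Protocol 1";
  StudyDescription      = "Sample study description without restricted characters";
  Standard              = "CDISC-SDTM";
  StandardVersion       = "3.1.3";
  FormalStandard        = "CDISC SDTM";
  FormalStandardVersion = "3.1.3";
  output;
run;

/* Report violations of the documented rules in the SAS log. */
data _null_;
  set work.source_study nobs=n_records;
  length varname $32;
  array req {*} StudyName DefineDocumentName SASref ProtocolName
                StudyDescription Standard StandardVersion
                FormalStandard FormalStandardVersion;

  /* Rule: only a single study can be referenced. */
  if _n_ = 1 and n_records ne 1 then
    put "WARNING: source_study must contain exactly one record, found " n_records;

  /* Rule: all variables are required to be non-blank. */
  do i = 1 to dim(req);
    if missing(req{i}) then do;
      varname = vname(req{i});
      put "WARNING: required variable is blank: " varname;
    end;
  end;

  /* Rule: no commas, semicolons, or quotation marks in the description. */
  if indexc(StudyDescription, ",;'""") > 0 then
    put "WARNING: StudyDescription contains a comma, semicolon, or quotation mark.";
run;
```

With the placeholder record shown here, the second step writes no warnings. Pointing the SET statement at a real source_study data set applies the same checks to study metadata.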

Process Outputs

The sourcedata type is the library where the metadata files are created. These metadata files are the data sets that comprise the SAS representation of the CDISC CRT-DDS 1.0 standard. The create_crtdds_from_sdtm.sas driver program creates 39 data sets. Most of these data sets have zero observations because there is no default SDTM metadata source. In the SAS Clinical Standards Toolkit sample study, these data sets are written to the sample study library directory/cdisc-crtdds-1.0-1.7/data directory. This location is represented in the driver program by the srcdata library name.

Process Results

When the driver program finishes running, the sdtmtodefine_results data set is created. This data set contains informational, warning, and error messages that were generated by the submitted driver program.
Figure: Example of a Partial Results Data Set from the CRT-DDS Sample Study

Sample Driver Program: create_crtdds_define.sas

Overview

The create_crtdds_define.sas driver program sets up the required environment variables and library references to initiate the %CRTDDS_WRITE macro. This macro reads the 39 data sets that comprise the SAS representation of the CDISC CRT-DDS 1.0 model, and it converts that information to the required define.xml structure. If source metadata or data are missing, then empty elements and attributes are not created in the define.xml file. The inputs and outputs are specified in the SASReferences data set.
Note: For more information about the %CRTDDS_WRITE macro, see the SAS Clinical Standards Toolkit: Macro API Documentation.
Here is an example of a call to the %CRTDDS_WRITE macro:
%crtdds_write(_cstCreateDisplayStyleSheet=1,
              _cstOutputEncoding=UTF-16,
              _cstResultsOverrideDS=&_cstResultsDS);
In this example, a default style sheet is generated in the same directory as the XML output based on the information in the SASReferences data set. XML encoding is set to UTF-16, and process results are written to the default &_cstResultsDS data set.
Here is the call to the macro from the sample create_crtdds_define.sas driver program:
%crtdds_write(_cstCreateDisplayStyleSheet=1);
The call creates a display style sheet and uses default values for the parameters.
The create_crtdds_define.sas driver program is ready to run on any of the CDISC SDTM sample studies. The driver program can be run interactively or in batch.
The driver program is located here:
sample study library directory/cdisc-crtdds-1.0-1.7/programs
Multiple tasks can be executed in any SAS Clinical Standards Toolkit driver program. The create_crtdds_define.sas driver program calls both the %CRTDDS_WRITE macro to create the define.xml file, and the %CSTUTILXMLVALIDATE macro to validate the syntax of the generated define.xml file. For more information about the %CSTUTILXMLVALIDATE macro, see Validation of XML-Based Standards.

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are two input file references and three output data set references that are key to the successful completion of the create_crtdds_define.sas driver program. Key Components of the SASReferences Data Set for the %CRTDDS_WRITE Macro lists these files and data sets, and they are discussed in separate sections. In the sample create_crtdds_define.sas driver program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-crtdds-1.0-1.7
&studyOutputPath=sample study library directory/cdisc-crtdds-1.0-1.7
Key Components of the SASReferences Data Set for the %CRTDDS_WRITE Macro
Metadata Type | LIBNAME or Fileref to Use | Reference Type | Path | Name of File
Input
control | control | libref | &workpath | sasreferences.sas7bdat
sourcedata | srcdata | libref | &studyRootPath/data |
Output
referencexml | xslt01 | filename | &studyOutputPath/sourcexml | define-v1-updated-html.xsl
results | results | libref | &studyOutputPath/results | write_results.sas7bdat
externalxml | extxml | filename | &studyOutputPath/sourcexml | define.xml

Process Inputs

The control library name points to the path in the &workpath macro variable. This demonstrates a technique for documenting that the SASReferences data set is derived in the SAS Work library. The driver program initializes the &workpath macro variable with this SAS code:
%let workPath=%sysfunc(pathname(work));
The sourcedata type is the library that contains the 39 data sets that might have been populated by the create_crtdds_from_sdtm.sas driver program. These metadata files are the data sets that constitute the SAS representation of the CDISC CRT-DDS 1.0 standard. In the SAS Clinical Standards Toolkit sample study, these data sets are read from the sample study library directory/cdisc-crtdds-1.0-1.7/data directory. This location is represented in the driver program by the Srcdata library name.

Process Outputs

The externalxml type refers to the define.xml file. This file is accessed in the driver program through the extxml fileref and is written to the sample study library directory/cdisc-crtdds-1.0-1.7/sourcexml directory.
The referencexml type can serve as either an input or output file reference. If the path and filename are not specified, the %CRTDDS_WRITE macro interprets the _cstCreateDisplayStyleSheet=1 parameter to indicate the default style sheet that is provided by the SAS Clinical Standards Toolkit in the global standards library. If a path and filename are specified, the referencexml type serves as an output file reference for the %CRTDDS_WRITE macro. The default style sheet is copied from the global standards library to the path and filename that are specified.
The results type refers to the write_results data set that documents the results of the create_crtdds_define.sas driver program. In the SAS Clinical Standards Toolkit CDISC CRT-DDS folder hierarchy, this information is written to the sample study library directory/cdisc-crtdds-1.0-1.7/results directory.

Process Results

Inclusion of the results record (row) in the SASReferences data set indicates that the process results are to be copied to a write_results data set located in the specified SAS library.
Figure: Example of a Partial Results Data Set from the CRT-DDS Sample Study

Creating a define.pdf File from the SAS Representation of the CDISC CRT-DDS 1.0 Standard

The CDER Data Standards Common Issues Document (Version 1.1/December 2011) states:
“A critical component of data submission is the define file. A properly functioning define.xml file is an important part of the submission of standardized electronic datasets and should not be considered optional. As a transition step, CDER prefers that sponsors submit both the define.pdf and define.xml formats. The define.pdf is primarily for printing purposes and need not include hyperlinks. CDER will advise when it is ready to only receive define.xml.”
The SAS Clinical Standards Toolkit has a macro that supports the creation of a define.pdf file from the SAS representation of a CDISC CRT-DDS 1.0 standard. This macro is called %CRTDDS_WRITEPDF and is located here:
global standards library directory/standards/cdisc-crtdds-1.0-1.7/macros
The %CRTDDS_WRITEPDF macro supports the creation of a define.pdf file for the CDISC ADaM, SDTM, and SEND standards. The contents of the sections (which attributes are printed) are based on the Study Data Tabulation Model Metadata Submission Guidelines (SDTM-MSG) (http://www.cdisc.org/sdtm, 2011-12-31).
The define.pdf file has an optional table of contents and these sections:
  • Dataset level metadata
  • Variable level metadata
  • Value level metadata
  • Algorithms (Computational Methods)
  • Controlled Terminology
The following parameters are the most important parameters for the %CRTDDS_WRITEPDF macro:
  • _cstCDISCStandard
    The CDISC standard for which the define.pdf is created. Valid values: SDTM, SEND, and ADAM. The default is SDTM.
  • _cstSourceLib
    The library that contains the CRT-DDS SAS data sets. If not provided, the code looks in SASReferences for type=sourcedata.
  • _cstReportOutput
    The name of the PDF to create. If not provided, the code looks in SASReferences for type=report.
  • _cstLinks
    Indicates whether the macro creates internal hyperlinks in the PDF. Valid values: Y or N. The default is N.
  • _cstTOC
    Indicates that the macro creates a table of contents in the PDF. Valid values: Y or N. The default is N.
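A call that sets the parameters listed above might look like the following sketch. It is an illustration under stated assumptions, not the call from the sample driver programs. It assumes that the SAS Clinical Standards Toolkit process has already been set up by a driver program, that the srcdata libref points to the CRT-DDS data sets, that &studyOutputPath is defined as in the other sample drivers, and that _cstReportOutput accepts a path and filename. The sourcexml output location is a hypothetical choice.

```sas
/* Sketch only: requires a configured toolkit environment.           */
/* The PDF location below is an assumed, illustrative path.          */
%crtdds_writepdf(
  _cstCDISCStandard=SDTM,   /* SDTM, SEND, or ADAM; SDTM is the default */
  _cstSourceLib=srcdata,    /* library with the CRT-DDS SAS data sets   */
  _cstReportOutput=&studyOutputPath/sourcexml/define.pdf,
  _cstLinks=Y,              /* create internal hyperlinks (default N)   */
  _cstTOC=Y                 /* create a table of contents (default N)   */
  );
```

If _cstSourceLib and _cstReportOutput are omitted, the macro falls back to the type=sourcedata and type=report records in the SASReferences data set, as described in the parameter list.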
Two sample driver programs are provided with the SAS Clinical Standards Toolkit to demonstrate the use of the %CRTDDS_WRITEPDF macro:
sample study library directory/cdisc-crtdds-1.0-1.7/programs/create_crtdds_define_pdf.sas
sample study library directory/cdisc-crtdds-1.0-1.7/programs/create_crtdds_define_pdf_adam.sas
The following displays show examples of define.pdf files that were created by the %CRTDDS_WRITEPDF macro:
Figure: Example define.pdf File for SDTM
Figure: Example define.pdf File for ADaM

Creating a CDISC Define-XML 2.0 define.xml File (Including Analysis Results Metadata 1.0)

There are three key macros that are provided with the SAS Clinical Standards Toolkit that support creation of a CDISC Define-XML 2.0 define.xml file. The three macros are listed in the order in which they are executed:
  1. The %DEFINE_SOURCETODEFINE macro creates the tables for the SAS representation of the CDISC Define-XML 2.0 files from study metadata. This macro, using SDTM or ADaM table metadata and column metadata as its source, populates a subset of the Define-XML 2.0 data sets.
  2. The %DEFINE_WRITE macro creates the define.xml file from the SAS representation of the CDISC Define-XML 2.0 files.
  3. The %CSTUTILXMLVALIDATE macro validates that the XML file is structurally and syntactically correct according to the XML schema for the CDISC Define-XML 2.0 standard.
These macros are called by driver programs that are responsible for properly setting up each SAS Clinical Standards Toolkit process to perform a specific SAS Clinical Standards Toolkit task. Several sample driver programs are provided with the SAS Clinical Standards Toolkit CDISC Define-XML 2.0 standard related to the creation of the define.xml file.
Here is the purpose of each of these driver programs:
  1. The create_sasdefine_from_source.sas driver program sets up the required metadata and SASReferences data set for the sample study. It runs the %DEFINE_SOURCETODEFINE macro. It creates the SAS representation of the CDISC Define-XML 2.0 data sets from the sample study data sets.
  2. The create_definexml.sas driver program creates the CDISC Define-XML 2.0 define.xml file. It runs the %DEFINE_WRITE and %CSTUTILXMLVALIDATE macros. This driver program creates and validates the XML syntax for the define.xml file.
Note: The create_definexml_from_source.sas and create_definexml_from_source_adam.sas driver programs combine the two purposes into one driver program.
These driver programs are examples that are provided with the SAS Clinical Standards Toolkit. You can use these driver programs or create your own. The names of these driver programs are not important. However, the content is important and demonstrates how the various SAS Clinical Standards Toolkit framework macros are used to generate the required metadata files.
The driver programs create a define.xml file based on SDTM or ADaM metadata.

Sample Driver Program: create_sasdefine_from_source.sas

Overview

The create_sasdefine_from_source.sas driver program sets up the required environment variables and library references to initiate the %DEFINE_SOURCETODEFINE macro. This macro extracts data from the SDTM or ADaM metadata files. (For more information about the source_tables and source_columns data sets, see Source Metadata.) Depending on the available source information, the macro attempts to convert the information into the tables that represent the SAS interpretation of the CDISC Define-XML 2.0 model.
When the macro parameter _cstFullModel has the value N, only the 31 Define-XML 2.0 core tables are created. Otherwise, all 46 tables in the Define-XML 2.0 reference standard are created, but only those tables with available data are populated. The other tables contain zero observations. When the macro parameter _cstCheckLengths has the value Y, the macro checks the actual value lengths of variables with DataType=text against the lengths defined in the metadata templates. If a defined length is too short for the actual values, a warning is written to the log file and the Results data set.
Note: For more information about the %DEFINE_SOURCETODEFINE macro, see the SAS Clinical Standards Toolkit: Macro API Documentation.
Here is an example of a call to the %DEFINE_SOURCETODEFINE macro:
%define_sourcetodefine(
   _cstOutLib=srcdata,
   _cstSourceStudy=sampdata.source_study,
   _cstSourceTables=sampdata.source_tables,
   _cstSourceColumns=sampdata.source_columns,
   _cstSourceCodeLists=sampdata.source_codelists,
   _cstSourceDocuments=sampdata.source_documents,
   _cstSourceValues=sampdata.source_values,
   _cstFullModel=N,
   _cstCheckLengths=Y,
   _cstLang=en
   );
In this example, because _cstFullModel=N, the %DEFINE_SOURCETODEFINE macro writes only the 31 Define-XML 2.0 core tables to the Srcdata library.
Here is an example that uses analysis results metadata:
%define_sourcetodefine(
   _cstOutLib=srcdata,
   _cstSourceStudy=sampdata.source_study,
   _cstSourceTables=sampdata.source_tables,
   _cstSourceColumns=sampdata.source_columns,
   _cstSourceCodeLists=sampdata.source_codelists,
   _cstSourceDocuments=sampdata.source_documents,
   _cstSourceValues=sampdata.source_values,
   _cstSourceAnalysisResults=sampdata.source_analysisresults,
   _cstFullModel=N,
   _cstCheckLengths=Y,
   _cstLang=en
   );
In this example, eight extra tables are created with metadata for analysis results.
The create_sasdefine_from_source.sas driver program is provided with the SAS Clinical Standards Toolkit, and it is ready to run on any of the SDTM or ADaM sample studies. The driver program can be run interactively or in batch. To run the driver program interactively, start a SAS session, and load the driver program into the SAS editor.
The driver program is located here:
sample study library directory/cdisc-definexml-2.0.0-1.7/programs

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are seven input file references and one output data set reference that are key to the successful completion of the create_sasdefine_from_source.sas driver program. Key Components of the SASReferences Data Set for the create_sasdefine_from_source.sas Driver Program lists these files and data sets, and they are discussed in separate sections. In the sample create_sasdefine_from_source.sas driver program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-definexml-2.0.0-1.7/sascstdemodata
&studyOutputPath=sample study library directory/cdisc-definexml-2.0.0-1.7
Here is the specification of &_cstSrcMetaDataFolder in the SASReferences data set in the create_sasdefine_from_source.sas driver program:
&_cstSrcMetaDataFolder=%lowcase(&_cstTrgStandard)-&_cstTrgStandardVersion/metadata
Here are the macro variable assignments in the sample driver program to work with the sample SDTM 3.1.2 metadata:
%let _cstTrgStandard=CDISC-SDTM;
%let _cstTrgStandardVersion=3.1.2;
Here is how to use the sample driver program create_sasdefine_from_source.sas for ADaM metadata:
%let _cstTrgStandard=CDISC-ADAM;
%let _cstTrgStandardVersion=2.1;
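To see how these assignments resolve into a metadata folder, the expression can be evaluated in open code. This is a small sketch for illustration; assigning &_cstSrcMetaDataFolder with %LET and writing it to the log with %PUT are assumptions made here, not necessarily how the sample driver program builds the value.

```sas
/* Resolve the source metadata folder for the SDTM 3.1.2 sample. */
%let _cstTrgStandard=CDISC-SDTM;
%let _cstTrgStandardVersion=3.1.2;
%let _cstSrcMetaDataFolder=%lowcase(&_cstTrgStandard)-&_cstTrgStandardVersion/metadata;

/* Write the resolved value to the SAS log. */
%put NOTE: _cstSrcMetaDataFolder=&_cstSrcMetaDataFolder;
```

With the SDTM values, the folder resolves to cdisc-sdtm-3.1.2/metadata. With the ADaM values shown above, the same expression resolves to cdisc-adam-2.1/metadata.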
Key Components of the SASReferences Data Set for the create_sasdefine_from_source.sas Driver Program
Metadata Type | SAS LIBNAME or Fileref to Use | Reference Type | Path | Name of File
Input
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_study
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_tables
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_columns
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_codelists
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_values
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_documents
sourcemetadata | sampdata | libref | &studyRootPath/&_cstSrcMetaDataFolder | source_analysisresults
Output
sourcedata | srcdata | libref | &studyOutputPath/data/%lowcase(&_cstTrgStandard)-&_cstTrgStandardVersion |

Process Inputs

The sourcemetadata type refers to the data sets that contain the SDTM study metadata: source_study, source_tables, source_columns, source_values, source_codelists, source_documents, and source_analysisresults. These data sets are stored in the same library.
The sample create_sasdefine_from_source.sas driver program provided with the SAS Clinical Standards Toolkit references a source CDISC SDTM 3.1.2 study. So, the source_tables data set contains SDTM 3.1.2 metadata about each standard domain defined in the CDISC SDTM Implementation Guide V3.1.2 and includes any customizations that you have added. The source_columns data set contains similar metadata but it is at the column level. The source_values data set contains Value Level metadata. The source_analysisresults data set would typically only be referenced in a CDISC ADaM study. The source metadata is read from this location:
sample study library directory/cdisc-definexml-2.0.0-1.7/sascstdemodata/cdisc-sdtm-3.1.2/metadata
This location is represented in the driver program by the sampdata library name.
A source study data set (source_study) can have only one record, and it is required by this macro. The following table lists the variables that are required in this data set:
Variables Required in the Source Study Data Set (source_study)
Variable* | Required | Description
SASref | Yes | The reference that ties the study name to the corresponding domains that are associated with this study in the source_tables and source_columns data sets.
StudyName | Yes | The name of the study. This value is used to populate the srcdata.study.studyname column.
StudyDescription | Yes | The description of the study. This value is used to populate the srcdata.study.studydescription column. Note: You cannot use commas, semicolons, or quotation marks in the description.
ProtocolName | Yes | The name of the protocol for the study. This value is used to populate the srcdata.study.protocolname column.
StudyVersion | Yes | The name of the define document to create. This value is used to populate the srcdata.metadataversion.oid column.
FormalStandardVersion | Yes | The formal version of the standard as used in Define-XML 2.0. This value is used to populate the srcdata.definedocument.standardversion column. (For example, 3.1.2.)
FormalStandardName | Yes | The formal name of the standard as used in Define-XML 2.0. This value is used to populate the srcdata.definedocument.standardname column. (For example, SDTM-IG.)
Standard | Yes | The name of the standard in the SAS Clinical Standards Toolkit. (For example, CDISC-SDTM.)
StandardVersion | Yes | The version of the standard in the SAS Clinical Standards Toolkit. (For example, 3.1.2.)
*All variables are required to be non-blank.
Only a single study can be referenced in a source study data set. From the source_tables, source_columns, source_codelists, source_values, source_documents, and source_analysisresults data sets, the %DEFINE_SOURCETODEFINE macro selects only those records whose StudyVersion column value is equal to the value of the StudyVersion column in the source_study data set.
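The StudyVersion selection rule can be pictured as a simple subsetting query. The following Base SAS sketch illustrates the rule only and is not the macro's actual implementation. The work data sets and the StudyVersion values MDV.1 and MDV.2 are toy placeholders, and only the columns needed for the illustration are included.

```sas
/* Toy stand-in for source_study: one record, one StudyVersion. */
data work.source_study;
  length StudyVersion $32;
  StudyVersion = "MDV.1"; output;
run;

/* Toy stand-in for source_tables with records for two versions. */
data work.source_tables;
  length StudyVersion $32 Table $8;
  StudyVersion = "MDV.1"; Table = "DM"; output;
  StudyVersion = "MDV.1"; Table = "AE"; output;
  StudyVersion = "MDV.2"; Table = "LB"; output;
run;

/* Keep only the records whose StudyVersion matches source_study. */
proc sql;
  create table work.selected_tables as
    select t.*
    from work.source_tables as t
    where t.StudyVersion in
          (select StudyVersion from work.source_study);
quit;
```

In this toy example, work.selected_tables keeps the DM and AE records and drops the LB record, because only MDV.1 appears in work.source_study. The same comparison applies to the other source metadata data sets.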

Process Outputs

The sourcedata type is the library where the metadata files are created. These metadata files are the data sets that constitute the SAS representation of the CDISC Define-XML 2.0 standard. The create_sasdefine_from_source.sas driver program creates 46 or 31 data sets, depending on the value of the _cstFullModel macro parameter. Most of these data sets have zero observations because there is no default SDTM metadata source. In the SAS Clinical Standards Toolkit sample driver program create_sasdefine_from_source.sas, these data sets are written to this location:
sample study library directory/cdisc-definexml-2.0.0-1.7/data/cdisc-sdtm-3.1.2
This location is represented in the driver program by the srcdata library name.

Process Results

When the driver program finishes running, the sourcetodefine_results data set is created in the Results library. This data set contains informational, warning, and error messages that were generated by the driver program.
Example of a Partial Results Data Set from Define-XML 2.0 Sample Study

Sample Driver Program: create_definexml.sas

Overview

The create_definexml.sas driver program sets up the required environment variables and library references to initiate the %DEFINE_WRITE macro. This macro reads the data sets that comprise the SAS representation of the CDISC Define-XML 2.0 model, and it converts that information to the required XML structure. If source metadata or data are missing, then empty elements and attributes are not created in the XML file. The inputs and outputs are specified in the SASReferences data set.
Note: For more information about the %DEFINE_WRITE macro, see the SAS Clinical Standards Toolkit: Macro API Documentation.
Here is an example of a call to the %DEFINE_WRITE macro:
%define_write(_cstCreateDisplayStyleSheet=1,
               _cstOutputEncoding=UTF-8,
               _cstResultsOverrideDS=&_cstResultsDS);
In this example, a default style sheet is generated in the same directory as the XML output based on the information in the SASReferences data set. XML encoding is set to UTF-8, and process results are written to the default &_cstResultsDS data set.
Here is the call to the macro from the sample create_definexml.sas driver program:
%define_write(_cstCreateDisplayStyleSheet=1);
The call creates a display style sheet and uses default values for the parameters.
The create_definexml.sas driver program is ready to run on any of the CDISC SDTM sample studies. The driver program can be run interactively or in batch.
The driver program is located here:
sample study library directory/cdisc-definexml-2.0.0-1.7/programs
Multiple tasks can be executed in any SAS Clinical Standards Toolkit driver program. The create_definexml.sas driver program calls both the %DEFINE_WRITE macro to create the Define-XML file and the %CSTUTILXMLVALIDATE macro to validate the syntax of the generated Define-XML file. For more information about the %CSTUTILXMLVALIDATE macro, see Validating an XML File against an XML Schema: %CSTUTILXMLVALIDATE Macro.

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there are two input references and four output references that are key to the successful completion of the create_definexml.sas driver program. The table Key Components of the SASReferences Data Set for the %DEFINE_WRITE Macro lists these files and data sets, and they are discussed in separate sections. In the sample create_definexml.sas driver program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-definexml-2.0.0-1.7
&studyOutputPath=sample study library directory/cdisc-definexml-2.0.0-1.7
Key Components of the SASReferences Data Set for the %DEFINE_WRITE Macro
Metadata Type  LIBNAME or Fileref to Use  Reference Type  Path                                    Name of File
-------------  -------------------------  --------------  --------------------------------------  ---------------------
Input
control        control                    libref          &workpath                               sasreferences
sourcedata     srcdata                    libref          &studyRootPath/data/&_cstSrcDataFolder
Output
referencexml   xslt01                     filename                                                define2-0-0.xsl
results        results                    libref          &studyOutputPath/results                write_results
externalxml    extxml                     filename        &studyOutputPath/sourcexml              &_cstDefineFile..xml
report         html                       filename        &studyOutputPath/sourcexml              &_cstDefineFile..html
Here is the specification of &_cstSrcDataFolder, which is used in the SASReferences data set in the create_definexml.sas driver program:
_cstSrcDataFolder=%lowcase(&_cstTrgStandard)-&_cstTrgStandardVersion
Here are the variable assignments in the sample driver program to work with the sample SDTM 3.1.2 metadata:
%let _cstTrgStandard=CDISC-SDTM;
%let _cstTrgStandardVersion=3.1.2;
%let _cstDefineFile=define-sdtm-3.1.2;
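As a quick check of how these macro variables resolve, the following sketch writes the derived folder name to the SAS log. It also illustrates the double period that appears in the Name of File column of the table: the first period ends the macro variable reference, and the second is the literal period of the file extension. The demoFile variable is hypothetical and is used only for illustration:

```sas
%let _cstTrgStandard=CDISC-SDTM;
%let _cstTrgStandardVersion=3.1.2;
%let _cstSrcDataFolder=%lowcase(&_cstTrgStandard)-&_cstTrgStandardVersion;

/* Writes: NOTE: _cstSrcDataFolder=cdisc-sdtm-3.1.2 */
%put NOTE: _cstSrcDataFolder=&_cstSrcDataFolder;

/* Hypothetical variable that illustrates the double period */
%let demoFile=mydefine;

/* Writes: NOTE: mydefine.xml */
%put NOTE: &demoFile..xml;
```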

Process Inputs

Use of the control library name that points to the path in the &workpath macro variable demonstrates a technique of documenting the derivation of the SASReferences data set in the SAS Work library. The driver program initializes the macro variable &workpath with this SAS code:
%let workPath=%sysfunc(pathname(work));
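To see what the control record amounts to, here is a minimal sketch that allocates a libref to the Work directory by hand. In the sample driver program the allocation is handled by the toolkit from the SASReferences data set, so the explicit LIBNAME statement below is an assumption made purely for illustration:

```sas
%let workPath=%sysfunc(pathname(work));

/* Assumption: allocate the control libref manually for illustration */
libname control "&workPath";

/* Confirm that control and Work point to the same physical location */
%put NOTE: control resolves to %sysfunc(pathname(control));
```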
The sourcedata type is the library that contains the Define-XML data sets that might have been populated by the create_sasdefine_from_source.sas driver program. These metadata files are the data sets that constitute the SAS representation of the CDISC Define-XML 2.0 standard. In the SAS Clinical Standards Toolkit sample study, these data sets are read from the sample study library directory/cdisc-definexml-2.0.0-1.7/data/cdisc-sdtm-3.1.2 directory. This location is represented in the driver program by the srcdata library name.

Process Outputs

The externalxml type refers to the define-sdtm-3.1.2.xml file. This file is accessed in the driver program from the extxml filename statement, and is written to the sample study library directory/cdisc-definexml-2.0.0-1.7/sourcexml directory.
The referencexml type can serve as either an input or output file reference. If the path and filename are not specified, the %DEFINE_WRITE macro interprets the _cstCreateDisplayStyleSheet=1 parameter to indicate the default style sheet that is provided by the SAS Clinical Standards Toolkit in the global standards library. If a path and filename are specified, the referencexml type serves as an output file reference for the %DEFINE_WRITE macro. The default style sheet is copied from the global standards library to the path and filename that are specified.
The results type refers to the write_results data set that documents the results of the create_definexml.sas driver program. In the SAS Clinical Standards Toolkit CDISC Define-XML folder hierarchy, this information is written to the sample study library directory/cdisc-definexml-2.0.0-1.7/results directory.
In Microsoft Windows, the define-sdtm-3.1.2.xml file can be viewed by double-clicking it in the SAS Program Editor. This renders the file in your default web browser or in any other application that has been associated with XML files.
On UNIX, if you have not set up your browser configuration in SAS, you need to copy define-sdtm-3.1.2.xml and define2-0-0.xsl to an environment where you can display the XML file in a web browser.
Note: The style sheet information in define2-0-0.xsl is not guaranteed to produce correct HTML in all browser types and versions. However, it does work with Internet Explorer 6.0 and higher. The Chrome browser, for example, does not allow local XML and XSLT processing.
The sample driver program also creates the HTML rendition in the same folder as the XML file using this code:
proc xsl
   in=extxml
   xsl=xslt01
   out=html;
run;
Instead of opening the XML file in a browser and letting the browser use the XSL file to render the HTML, you can directly open the HTML file.
Depending on your browser, you might see a security warning because the style sheet uses JavaScript.
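If you want to repeat the HTML rendition outside of the driver program, the three filerefs can be allocated by hand before calling PROC XSL. This is a sketch under stated assumptions: the value of studyOutputPath is a placeholder that you must replace, the style sheet is assumed to be in the same folder as the XML file, and the name of the HTML file is assumed. In the sample driver program, the filerefs are allocated from the SASReferences data set instead:

```sas
/* Placeholder: replace with the root of your own study output */
%let studyOutputPath=/path/to/study;

/* Assumed file locations, allocated manually for illustration */
filename extxml "&studyOutputPath/sourcexml/define-sdtm-3.1.2.xml";
filename xslt01 "&studyOutputPath/sourcexml/define2-0-0.xsl";
filename html   "&studyOutputPath/sourcexml/define-sdtm-3.1.2.html";

/* Apply the style sheet to the Define-XML file to produce HTML */
proc xsl
   in=extxml
   xsl=xslt01
   out=html;
run;

/* Release the filerefs */
filename extxml clear;
filename xslt01 clear;
filename html clear;
```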
The following display shows the define-sdtm-3.1.2.xml file in a web browser:
define-sdtm-3.1.2.xml File in a Web Browser
The following display shows the define-adam-2.1.xml file in a web browser:
define-adam-2.1.xml File in a Web Browser

Process Results

Inclusion of the results record (row) in the SASReferences data set indicates that the process results are to be copied to a write_results data set located in the specified SAS library.
Example of a Partial Results Data Set from the Define-XML 2.0 Sample Study
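After the driver program has run, the results data set can be reviewed with standard SAS procedures. The following sketch assumes that the results libref is still allocated in the current SAS session. It makes no assumptions about column names, so it lists the structure first and then prints the first few records:

```sas
/* List the columns of the results data set in creation order */
proc contents data=results.write_results varnum;
run;

/* Print the first 20 records for a quick review */
proc print data=results.write_results(obs=20);
run;
```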

Creating a CDISC ODM XML File

Note: The process to create a CDISC ODM XML file is the same for all ODM versions that are supported by the SAS Clinical Standards Toolkit. The process is explained using ODM version 1.3.0.
There are several key macros that are provided with the SAS Clinical Standards Toolkit that support the creation of an ODM XML file. The macros are listed in the order in which they are executed:
  1. The %ODM_VALIDATE macro submits a set of validation checks based on what is defined in the Validation Control data set to validate the referenced SAS representation of each ODM XML file.
  2. The %ODM_WRITE macro creates the ODM XML file from the SAS representation of the ODM files and validates that the XML file is structurally and syntactically correct. This macro is important if you customize the XML file outside of the workflow.
  3. The %CSTUTILXMLVALIDATE macro validates that the XML file is structurally and syntactically correct, according to the XML schema for the ODM standard. This macro is important if you customize the ODM XML file outside of the workflow.
These macros are called by driver programs that are responsible for properly setting up each SAS Clinical Standards Toolkit process to perform a specific SAS Clinical Standards Toolkit task. Two sample driver programs are provided with the SAS Clinical Standards Toolkit CDISC ODM standard related to the creation of the XML file.
Here is the purpose of each of these driver programs:
  1. The validate_odm_data.sas driver program validates the SAS representation of the ODM data sets based on the selected ODM validation checks. This driver program can be run multiple times until data validation has been reconciled.
  2. The create_odmxml.sas driver program calls the %ODM_WRITE macro to create the XML file. This driver program creates and validates the syntax for the XML file.
These driver programs are examples that are provided with the SAS Clinical Standards Toolkit. You can use these driver programs or create your own. The names of these driver programs are not important. However, the content is important and demonstrates how the various SAS Clinical Standards Toolkit framework macros are used to generate the required metadata files.

Sample Driver Program: create_odmxml.sas

Overview

The create_odmxml.sas driver program sets up the required environment variables and library references to initiate the %ODM_WRITE macro. This macro reads the 66 data sets that comprise the default SAS representation of the CDISC ODM 1.3.0 model, and it converts that information to the required ODM XML structure. If source metadata or data are missing, then empty elements and attributes are not created in the ODM XML file. The inputs and outputs are specified in the SASReferences data set.
For more information about the %ODM_WRITE macro, see the SAS Clinical Standards Toolkit: Macro API Documentation.
Here is an example of a call to the %ODM_WRITE macro:
%odm_write(_cstOutputEncoding=UTF-16, _cstResultsOverrideDS=&_cstResultsDS);
In this example, no default style sheet is generated for the XML output, XML encoding is set to UTF-16, and process results are written to the default &_cstResultsDS data set.
Here is the call to the macro from the sample create_odmxml.sas driver program:
%odm_write();
The call uses default values for the parameters. The create_odmxml.sas driver program is ready to run on the CDISC ODM sample study provided with the SAS Clinical Standards Toolkit. The driver program can be run interactively or in batch.
The driver program is located here:
sample study library directory/cdisc-odm-1.3.0-1.7/programs

The SASReferences Data Set

As a part of each SAS Clinical Standards Toolkit process setup, a valid SASReferences data set is required. It references the input files that are needed, the librefs and filenames to use, and the names and locations of data sets to be created by the process. It can be modified to point to study-specific files. For an explanation of the SASReferences data set, see SASReferences File.
In the SASReferences data set, there is one input reference and there are two output references that are key to the successful completion of the create_odmxml.sas driver program. The table Key Components of the SASReferences Data Set for the %ODM_WRITE Macro lists these files and data sets, and they are discussed in separate sections. In the sample create_odmxml.sas driver program, these values are set for &studyRootPath and &studyOutputPath:
&studyRootPath=sample study library directory/cdisc-odm-1.3.0-1.7
&studyOutputPath=sample study library directory/cdisc-odm-1.3.0-1.7
Key Components of the SASReferences Data Set for the %ODM_WRITE Macro
Metadata Type  SAS LIBNAME or Fileref to Use  Reference Type  Path                        Name of File
-------------  -----------------------------  --------------  --------------------------  ----------------------
Input
sourcedata     srcdata                        libref          &studyRootPath/data
Output
results        results                        libref          &studyOutputPath/results    write_results.sas7bdat
externalxml    extxml                         filename        &studyOutputPath/sourcexml  odm_sample_out.xml

Process Inputs

The sourcedata type is the library that contains the default 66 data sets that comprise the SAS representation of an ODM XML file. These data sets might have been populated by a previous odm_read task, or you might have processes in place that build these data sets from source files. In the SAS Clinical Standards Toolkit sample study, these data sets are read from the sample study library directory/cdisc-odm-1.3.0-1.7/data directory. This location is represented in the driver program by the srcdata library name.
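A quick way to confirm that the source library is complete before writing the XML file is to count its data sets. This sketch assumes that the srcdata libref has already been allocated in the current session. The macro variable name nODM is hypothetical:

```sas
/* Count the data sets in the srcdata library */
proc sql noprint;
   select count(*) into :nODM trimmed
   from dictionary.tables
   where libname="SRCDATA" and memtype="DATA";
quit;

/* The default ODM 1.3.0 representation is expected to have 66 data sets */
%put NOTE: srcdata contains &nODM data sets.;
```

If the libref is not allocated, the count is 0.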

Process Outputs

The externalxml type refers to the ODM XML file that is to be derived by the process. This file is accessed in the driver program from the extxml filename statement, and is written to the sample study library directory/cdisc-odm-1.3.0-1.7/sourcexml directory.
Note: Unlike CDISC CRT-DDS or CDISC Define-XML, CDISC does not supply a default style sheet for ODM and one is not provided as part of the SAS Clinical Standards Toolkit. However, you can use the %ODM_WRITE macro, which provides the _cstCreateDisplayStyleSheet parameter, to use information that you provide in the Metadata Type referencexml record of the SASReferences file.
The results type refers to the write_results data set that documents the results of the create_odmxml.sas driver program. In the SAS Clinical Standards Toolkit CDISC ODM folder hierarchy, this information is written to this location:
sample study library directory/cdisc-odm-1.3.0-1.7/results

Process Results

Inclusion of the results record (row) in the SASReferences data set indicates that the process results are to be copied to a write_results data set located in the specified SAS library.
Example of a Partial Results Data Set from the ODM Sample Data Hierarchy