CDISC Dataset-XML 1.0

Purpose

CDISC Dataset-XML defines a standard format, based on CDISC ODM XML, for transporting tabular data in XML between any two entities. In addition to supporting the transport of data sets as part of a submission to the FDA, Dataset-XML can be used to exchange data between two parties. For example, a CRO can use the Dataset-XML data format to transmit SDTM or ADaM data sets to a sponsor organization. Dataset-XML supports SDTM, ADaM, and SEND data sets but can also be used to exchange any other type of tabular data set.
The metadata for a data set in a Dataset-XML file must conform to the Define-XML standard. Each Dataset-XML file contains data for a single data set, but a single Define-XML file describes all of the data sets included in the folder. Both Define-XML 1.0 and Define-XML 2.0 are supported for use with Dataset-XML.
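As a rough illustration of the one-data-set-per-file layout described above, the following Python sketch reads the records of a small Dataset-XML document into a list of dictionaries. The sample content (study, OIDs, values) is hypothetical, and the element and attribute names (ItemGroupData, ItemData, ItemOID, Value) and the namespace URIs are assumptions taken from the ODM 1.3 and Dataset-XML 1.0 schemas rather than from this section. The function name read_rows is likewise only illustrative.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed from the ODM 1.3 schema that Dataset-XML extends.
ODM_NS = "http://www.cdisc.org/ns/odm/v1.3"

# Hypothetical two-record DM data set, trimmed to the parts read below.
SAMPLE = """\
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3"
     xmlns:data="http://www.cdisc.org/ns/Dataset-XML/v1.0"
     FileOID="example.dm" ODMVersion="1.3.2" data:DatasetXMLVersion="1.0.0">
  <ClinicalData StudyOID="STUDY1" MetaDataVersionOID="MDV.1">
    <ItemGroupData ItemGroupOID="IG.DM" data:ItemGroupDataSeq="1">
      <ItemData ItemOID="IT.DM.USUBJID" Value="STUDY1-001"/>
      <ItemData ItemOID="IT.DM.AGE" Value="34"/>
    </ItemGroupData>
    <ItemGroupData ItemGroupOID="IG.DM" data:ItemGroupDataSeq="2">
      <ItemData ItemOID="IT.DM.USUBJID" Value="STUDY1-002"/>
      <ItemData ItemOID="IT.DM.AGE" Value="41"/>
    </ItemGroupData>
  </ClinicalData>
</ODM>
"""


def read_rows(xml_text):
    """Return one dict per ItemGroupData element, keyed by ItemOID."""
    root = ET.fromstring(xml_text)
    rows = []
    # Each ItemGroupData element is treated as one record of the data set.
    for group in root.iter(f"{{{ODM_NS}}}ItemGroupData"):
        row = {}
        # Each ItemData child is treated as one variable value in the record.
        for item in group.findall(f"{{{ODM_NS}}}ItemData"):
            row[item.get("ItemOID")] = item.get("Value")
        rows.append(row)
    return rows


if __name__ == "__main__":
    for record in read_rows(SAMPLE):
        print(record)
```

All values come back as strings. The variable names, types, and labels are not carried in the data file itself, so a fuller reader would look up each ItemOID in the accompanying Define-XML file.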

Release Date

CDISC Dataset-XML Version 1.0 Specification, Production Version 1.0.0, April 22, 2014

Regulatory Basis

In the United States, the approval process for regulated human and animal health products requires the submission of data from clinical trials and other studies as expressed in the Code of Federal Regulations (CFR). The FDA established the regulatory basis for wholly electronic submission of data in 1997 with the publication of regulations on the use of electronic records in place of paper records (21 CFR Part 11). In 1999, the FDA standardized the submission of clinical and non-clinical data on the SAS Version 5 XPORT Transport Format and the submission of metadata on Portable Document Format (PDF). In 2005, the Study Data Specifications published by the FDA included the recommendation that data definitions (metadata) be provided as a Define-XML file.
On November 5, 2012, the FDA held a meeting entitled “Regulatory New Drug Review: Solutions for Study Data Exchange Standards”, the purpose of which was to solicit input regarding the advantages and disadvantages of current and emerging open, consensus-based standards for the exchange of regulated study data. CDISC Dataset-XML was presented as an alternative for consideration.
In 2014, the FDA conducted a pilot to evaluate CDISC Dataset-XML as a solution to the challenges of the SAS Version 5 XPORT transport.

CDISC Dataset-XML 1.0 SAS Data Set Construction

The SAS Clinical Standards Toolkit CDISC Dataset-XML 1.0 standard supports reading a Dataset-XML file, building a Dataset-XML file, and validating the structural integrity of a Dataset-XML file against an XML schema. To support this functionality, the standard provides these supplemental files in the global standards library:
  • The Messages data set in the messages folder provides unified error messaging for all Dataset-XML processes.
  • SAS code in the macros folder is specific to CDISC Dataset-XML 1.0 and augments the code in the primary SAS Clinical Standards Toolkit autocall library (!sasroot/cstframework/sasmacro).
  • The referencexml folder contains SAS XML map files, which are used to read XML files into SAS data sets.