What's New in the SAS Clinical Standards Toolkit

Overview

Here are some of the new capabilities in the SAS Clinical Standards Toolkit 1.4:
  • The CDISC ADaM 2.1 standard has been registered and contains all of the metadata for the ADSL and BDS data structures, based on the ADaM Implementation Guide and CDISC Controlled Terminology ADaM from the National Cancer Institute (NCI). Validation checks were created from the CDISC ADaM Validation Checks Version 1.1 Maintenance Release.
  • The CDISC ODM 1.3.0 standard, including all metadata and validation checks, has been fully implemented. In addition, ODM Read and Write functionality has been added to the SAS Clinical Standards Toolkit. For a description of the implementation and its limitations, see Supported Standards and XML-Based Standards.
  • Sample reports can output comma-separated values (*.csv) files, enabling you to view the output using Microsoft Excel.
  • The SAS Clinical Standards Toolkit global standards library folder structure for controlled terminology has been reconfigured to simplify implementation. In previous versions of the SAS Clinical Standards Toolkit, each new release of the controlled terminology dictionaries from NCI required an additional folder in the Standards directory. In the SAS Clinical Standards Toolkit 1.4, these separate folders have been replaced with a single folder named cdisc-terminology-1.4. This folder contains a folder for each standard, with a subfolder for each NCI release of the controlled terminology dictionaries. In addition, a current folder has been added so that you can save the latest version of the NCI controlled terminology dictionary without changing any references in your SAS Clinical Standards Toolkit code. The SAS Clinical Standards Toolkit provides the current folder populated with the latest version of the controlled terminology dictionary.
  • The SAS Clinical Standards Toolkit metadata and code base are updated.
  • Two new validation check macros are available. These macros are specifically designed to handle cross-standard validation checks that occur in ADaM 2.1 validation.

Changes to Metadata and Code Base

Framework Changes

The following autocall macros are new. These macros are in the !sasroot/cstframework/sasmacro directory:
  • cst_getstandardsubtypes generates a data set containing the installed Clinical Terminology subtypes (for example, SDTM, CDASH, ADaM, or any user customizations).
  • cstcheck_columnexists is used in validation processes to determine whether one or more of the columns defined in columnScope exist in each of the tables defined in tableScope.
  • cstcheck_columnvarlist is used in validation processes to support comparison of multiple columns within the same data set or across multiple data sets.
    Note: As a general rule, this macro expects a check metadata columnScope syntax of {_cstList:var1+var2+var3...varn} for within-data-set assessments and {_cstList:var1...varn}{_cstList:var1...varn} for multi-data set assessments.
    Note: This macro requires use of _cstCodeLogic at a DATA step level (for example, a full DATA step or PROC SQL invocation). _cstCodeLogic creates a work file (_cstproblems) containing records in error. _cstCodeLogic must handle any data set joins when multiple data sets are involved in the column comparisons.
  • cstcheck_crossstdcomparedomains is used in validation processes to compare values for one or more columns in one table with those same columns in another domain in another standard, or it compares values against metadata from the comparison standard. This macro allows validation across standards.
    Note: This macro requires the use of _cstCodeLogic as a full DATA step or PROC SQL invocation. This DATA step or PROC SQL invocation assumes as input a work copy of the column metadata data set returned by the cstutil_buildcollist macro. Any resulting records in the derived data set represent errors to be reported.
  • cstcheck_crossstdmetamismatch is used in validation processes to identify inconsistencies among metadata across registered standards. This macro allows validation across standards.
    Note: cstcheck_crossstdmetamismatch requires the use of _cstCodeLogic as a full DATA step or PROC SQL invocation. This DATA step or PROC SQL invocation assumes as input a work copy of the column metadata data set returned by the cstutil_buildcollist macro. Any resulting records in the derived data set represent errors to be reported.
  • cstcheck_java determines whether any Java or Java-related issues occurred in the previous DATA step. This macro must be called immediately after the DATA step that declares the Java object. It is called by the crtdds_read.sas, crtdds_write.sas, crtdds_xmlvalidate.sas, odm_read.sas, odm_write.sas, and odm_xmlvalidate.sas programs.
  • cstutil_buildformatsfromxml is designed for use with CDISC XML-based standards such as CRT-DDS and ODM. Those standards capture acceptable values in codelists. This module reads the codelist information to create one or more SAS format catalogs, based on the xml:lang language tags. This macro is called by the odm_read and crtdds_read macros.
  • cstutil_createsubdir creates subdirectories on a computer that is not running Microsoft Windows. The SAS Clinical Standards Toolkit sample drivers create output files, so they need Read and Write access to the subdirectories. This macro creates the subdirectories in the specified workspace. If no workspace value is specified, StudyOutputPath points to the Work directory, and any subdirectories are created under it. StudyOutputPath is referenced in SASReferences.
  • cstutil_createsublists creates the work._cstsublists data set, which contains the interpreted validation check metadata specified in the columnScope column in the expected form [var1][var2]. This macro is referenced in the codelogic field of the check metadata in the Validation Master data set. However, work._cstsublists is not always derived by calling this macro.
  • cstutil_readxmltags provides a proof-of-concept implementation of a tool to read the element tags and attributes of an XML file to identify those element tags and attributes that the SAS Clinical Standards Toolkit does not currently handle using the CDISC ODM odm_read macro. This macro relies on a defined set of XSLT modules, metadata that specifies a SAS representation of ODM, and a SAS XML map file that reads a derived cubexml file. Each of these makes assumptions about the XML content to be read.
    Assumptions:
    • The XML file has previously been defined with a SAS fileref.
    • ODM reference metadata is available as defined in SASReferences.
    Limitations:
    • The code does not work on a continuous-stream (no line returns) XML file.
    • The code might not work well on multi-element rows like <Study><MetaDataVersionOID=<..><...>.
    • The code might not handle PCDATA.
    cstutil_readxmltags is used by find_unsupported_tags.sas.
  • cstutil_writeodmcubexml is used to create an XML file to be used by the define.xml process. There is one input to this macro: the MDP SAS data set that contains the member names and library references needed for the define process.
In addition to the new macros, all of the preexisting check macros (cstcheck*.sas) except cstcheck_notimplemented have been modified. Most of the modifications were minor, but a few were more significant. You should compare the files and determine whether the changes affect any of your current processes. These macros are located in the !sasroot/cstframework/sasmacro directory.
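The columnScope syntax noted for cstcheck_columnvarlist can be easier to follow with a small illustration. The following Python sketch is not toolkit code: parse_column_scope is a hypothetical helper, and it assumes that column names within each {_cstList:...} sublist are separated by plus signs, as in the within-data-set form shown in the note. It shows only how one sublist (within-data-set) or two sublists (multi-data set) could be read from the string.

```python
import re

# Hypothetical helper for illustration; not part of the SAS Clinical Standards Toolkit.
# Assumes each sublist has the form {_cstList:name1+name2+...}.
_SUBLIST = re.compile(r"\{_cstList:([^{}]*)\}")


def parse_column_scope(column_scope):
    """Return one list of column names per {_cstList:...} sublist."""
    sublists = []
    for match in _SUBLIST.finditer(column_scope):
        names = [name.strip() for name in match.group(1).split("+")]
        sublists.append([name for name in names if name])
    if not sublists:
        raise ValueError(f"no _cstList sublist found in {column_scope!r}")
    return sublists


# One sublist: columns compared within a single data set.
within = parse_column_scope("{_cstList:var1+var2+var3}")

# Two sublists: columns compared across data sets.
across = parse_column_scope("{_cstList:var1+var2}{_cstList:var3+var4}")

print(within)
print(across)
```

In the toolkit itself, this interpretation is performed by the SAS macros, and the comparison logic is supplied through _cstCodeLogic.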

CDISC SDTM Changes

For SDTM 3.1.1 and 3.1.2, validation check SDTM0452 was modified to check only those variables that exist in your AE domain. This change is in the codelogic field of the Validation Master data set. In addition, the codesource field was changed from cstcheck_column to cstcheck_columnvarlist, with the corresponding values needed in columnScope. codetype was set to 2 to denote a DATA step or PROC SQL invocation, and reportingcolumns was set to blank.

CDISC CRT-DDS Changes

The validation.properties file was added to contain validation-specific settings for the global macro variables. These settings were previously in initialize.properties, which was modified by removing the validation-specific settings and by changing _cstSubjectColumns from none to _none_. These changes lessen the risk of using a customer variable. Both files are located in the /standards/cdisc-crtdds-1.0-1.4/programs folder.
One new CDISC CRT-DDS macro was created, and nine macros were modified. These macros are located in the /standards/cdisc-crtdds-1.0-1.4/macros folder.
The crtdds_sdtmtodefine macro replaces the crtdds_sdtm311todefine10 macro. The new macro is called by the create_crtdds_from_sdtm.sas driver program. It creates the 39 tables for the SAS representation of the CRT-DDS files and, using SDTM table and column metadata as its source, populates a subset of 12 of those data sets. The metadata source is specified via the sasreferences data set.
The crtdds_read, crtdds_write, and crtdds_validate macros now use the xmlv2 engine in SAS 9.3 and the xml92 engine in SAS 9.2.
The crtdds_codelistitems macro no longer tries to derive the Rank attribute, which is the numeric order of a CodedValue relative to a CodedValue for other CodeListItems in a CodeList. The reason for this change is that the Rank attribute should be used only where the relative value corresponding to an enumeration cannot or should not be determined by its lexical order. For example, assume that you have a list of enumerated text values including "Low", "Medium", and "High". You want to assign relative numeric values 1, 2, and 3, respectively, to these text values. You should include a Rank attribute for each EnumeratedItem defined. Without the applied Rank attribute, the normal lexical ordering would be "High", "Low", and "Medium".
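The difference between lexical order and Rank order in the example above can be checked with a few lines of Python. This is an illustration only, not toolkit code; the rank dictionary simply stands in for the Rank attribute values 1, 2, and 3.

```python
# Illustration only: plain Python standing in for the Rank attribute.
coded_values = ["Low", "Medium", "High"]

# Without Rank, the only order available is the lexical one.
lexical_order = sorted(coded_values)

# Rank values assigned explicitly, as in the example in the text.
rank = {"Low": 1, "Medium": 2, "High": 3}
ranked_order = sorted(coded_values, key=rank.__getitem__)

print(lexical_order)  # ['High', 'Low', 'Medium']
print(ranked_order)   # ['Low', 'Medium', 'High']
```

Because the intended order cannot be recovered from the text values alone, it has to be supplied explicitly rather than derived by the macro.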
The following macros were modified. Most of the modifications were minor. However, a few were more significant. You should compare the files, and determine whether the changes affect any of your current processes.
  • crtdds_codelistitems
  • crtdds_codelists
  • crtdds_computationmethods
  • crtdds_definedocument
  • crtdds_itemgroupdefitemrefs
  • crtdds_read
  • crtdds_sdtm311todefine10 (this macro is deprecated in the SAS Clinical Standards Toolkit 1.4)
  • crtdds_write
  • crtddsutil_buildchecktablelist

CDISC-Terminology Changes

The folder structure for CDISC controlled terminology has undergone significant changes. These changes were made to streamline the implementation of the SAS Clinical Standards Toolkit. In previous versions of the SAS Clinical Standards Toolkit, each new release of the NCI EVS controlled terminology dictionary required its own root-level folder in the cstGlobalLibrary/standards folder. In the SAS Clinical Standards Toolkit 1.4, there is now one folder in cstGlobalLibrary/standards named cdisc-terminology-1.4. This folder contains separate folders for each standard, and subfolders within each standard for each new release of the controlled terminology dictionary.
In the SAS Clinical Standards Toolkit 1.4, 201104 is the default CDISC-Terminology standard version for SDTM 3.1.1 and SDTM 3.1.2. For ADaM 2.1, the default CDISC-Terminology version is 201101. In addition, for the convenience of our customers, the 201104 release of the CDISC Controlled Terminology for CDASH dictionary is shipped with this release of the SAS Clinical Standards Toolkit.
A new column, codelist_extensible, has been added to all of the cterms.sas7bdat data sets. This column is provided by NCI and denotes whether the customer can add new values to the format. The value for this new column is Yes or No.
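As a rough illustration of the folder layout described in this section, the following Python sketch builds the path to one terminology release. It is not toolkit code: terminology_folder is a hypothetical helper, the standard folder names ("sdtm", "adam") are assumed, and the sketch also assumes that the current folder sits beside the release subfolders under each standard.

```python
from pathlib import PurePosixPath

# Hypothetical helper for illustration; not part of the SAS Clinical Standards Toolkit.
# The standard folder names passed in ("sdtm", "adam") are assumptions.
def terminology_folder(global_library, standard, release="current"):
    """Build <global_library>/standards/cdisc-terminology-1.4/<standard>/<release>."""
    return (
        PurePosixPath(global_library)
        / "standards"
        / "cdisc-terminology-1.4"
        / standard
        / release
    )


# Default terminology versions named in the text.
sdtm_default = terminology_folder("/cstGlobalLibrary", "sdtm", "201104")
adam_default = terminology_folder("/cstGlobalLibrary", "adam", "201101")

# Code that points at "current" does not change when a new release is saved there.
sdtm_current = terminology_folder("/cstGlobalLibrary", "sdtm")

print(sdtm_default)
print(adam_default)
print(sdtm_current)
```

The point of the layout is that only the last path component changes between releases, which is why a single top-level folder can replace one root-level folder per release.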
An existing column, CDISC_Preferred_Term, has been removed from all cterms data sets. This column was removed from the April 2011 (201104) NCI Controlled Terminology spreadsheet. CDISC_Preferred_Term was used by the SAS Clinical Standards Toolkit to determine whether duplicate values existed in the spreadsheet before generating the format catalog (cterms.sas7bcat). The NCI_Preferred_Term column is now used instead.
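The kind of duplicate check mentioned above can be sketched in Python. This is a generic illustration under stated assumptions, not the toolkit's actual logic: find_duplicate_terms is a hypothetical helper, the codelist key column is assumed, and the rows are placeholder values rather than real NCI terminology.

```python
from collections import Counter

# Hypothetical helper for illustration; the toolkit's own duplicate check may differ.
# The "codelist" key column is an assumption; NCI_Preferred_Term is named in the text.
def find_duplicate_terms(rows, key_columns=("codelist", "NCI_Preferred_Term")):
    """Return the key tuples that occur more than once in rows."""
    counts = Counter(tuple(row[column] for column in key_columns) for row in rows)
    return sorted(key for key, count in counts.items() if count > 1)


# Placeholder rows, not real controlled terminology.
rows = [
    {"codelist": "CL1", "NCI_Preferred_Term": "Term A"},
    {"codelist": "CL1", "NCI_Preferred_Term": "Term A"},
    {"codelist": "CL1", "NCI_Preferred_Term": "Term B"},
    {"codelist": "CL2", "NCI_Preferred_Term": "Term A"},
]

duplicates = find_duplicate_terms(rows)
print(duplicates)  # [('CL1', 'Term A')]
```

Running a check like this before building a format catalog would catch repeated entries that might otherwise produce conflicting format values.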