Model Usage

Usage Scenario: Creating Metadata That Maps Data

This usage scenario describes how to map a physical table to another physical table, and how to split a physical table into two tables.

Requirements

This usage scenario assumes that the user has a general understanding of the Transform Submodel of the SAS Metadata Model.

Mapping a Physical Table to Another Physical Table

A table is mapped to another table by defining a transformation. The transformation of one table to another table could be a simple replication process from one location to another location. The replication process is described with metadata types from the Transform submodel. The locations of the tables are described with metadata types from the Software Deployment submodel.

A transformation can be associated with a job that is scheduled to run as a batch job or that is triggered by some other event. This scenario does not discuss how to associate a transformation with a job.

Figure 1 shows the basic metadata types used when mapping one PhysicalTable to another PhysicalTable.

[Figure 1. The transformation of one table to another table.]

Figure 1. The transformation of one table to another table.

The TransformationActivity metadata type is the starting point when defining a transformation. The TransformationActivity metadata type groups metadata objects that describe the steps in a transformation. It links them to a metadata object that describes the output of the transformation activity. Each step in a transformation is represented by an object of the TransformationStep metadata type. A TransformationActivity object is associated with a TransformationStep object by defining a Steps association. The TransformationActivity object in this scenario has one TransformationStep defined with the Steps association (numbered 1 in Figure 1). The TransformationActivity metadata type has a TransformationTargets association (numbered 2) to the output. In this scenario, the output is a PhysicalTable object named TBL2.

The actual mappings between the source and target tables in a TransformationStep are represented by the ClassifierMap metadata type. The ClassifierMap is associated to the TransformationStep with a Transformations association (numbered 3). The ClassifierMap has a ClassifierSources association (numbered 4) to the input PhysicalTable, which is named TBL1. The ClassifierMap has a ClassifierTargets association (numbered 5) to the output PhysicalTable, which is named TBL2. In addition, the ClassiferMap has a FeatureMaps association (numbered 6) to two FeatureMap objects. The FeatureMap objects do the actual mappings of columns in PhysicalTable TBL1 to columns with the same names in PhysicalTable TBL2. The FeatureSources association (numbered 8) defines the input for the mappings. The FeatureTargets association (numbered 7) defines the output of the mappings.

A FeatureMap object could have a SourceCode association to a TextStore object that contains more information about how to transform the columns. That information is not shown here. This scenario does a simple copy and keeps all data the same in the target columns.

Splitting One PhysicalTable into Two

Figure 2 shows the metadata types used to represent the following SAS DATA step code:

 
data TBL2(keep=c1 c2 options=?) 
     TBL3(keep=c1 c3 c4 options=?)
  set TBL1;
  if c1 < 10 then output TBL3;
  output TBL2;
run;

[Figure 2. The splitting one table into two tables.]

Figure 2. The splitting one table into two tables.

The TransformationActivity object is defined and has a Steps association (numbered 1 in Figure 2) to a TransformationStep object. In this scenario, the TransformationStep object has a Transformations association (numbered 2) to two ClassifierMap objects. Each ClassifierMap object represents a process that creates a new table. Each ClassifierMap has a ClassifierSources association (numbered 3) to input. The input can be any type of Classifier object. In this scenario, the input for both ClassifierMap objects is PhysicalTable TBL1. This is the table that will be split into two tables. Each ClassifierMap has a ClassifierTargets association (numbered 4) to output. Target output can be any type of Classifier object. The target output objects in this scenario are the PhysicalTable TBL2 and PhysicalTable TBL3.

Each ClassifierMap object is associated with a set of FeatureMap objects. One ClassifierMap has two FeatureMap objects defined. The other has three FeatureMap objects defined. The FeatureMap metadata type maps any number of features to other features. For example, the FeatureMap object could combine multiple columns using a mathematical function. In this scenario, the columns that are selected are not transformed. Each FeatureMap has a FeatureSources association to a column in TBL1. Each FeatureMap has a FeatureTargets association to a column in TBL2 or TBL3. In this scenario, a simple copy of the input column to a column in the new output table is represented. For example, the FeatureSources association (numbered 6) illustrates the input of Column C1 in TBL1 to Column C1 in TBL3 with a FeatureTargets association (numbered 7) in the FeatureMap.

A ClassifierMap object can have an AssociatedRowSelector association (numbered 8). The RowSelector object specifies which rows should be included in the ClassifierTargets association. The RowSelector object has a SourceCode association (numbered 9) to a TextStore object. The TextStore object stores the actual text expression used to determine which rows to include. The expression can contain macro variables that the application needs to replace with values from the metadata. If this is the case, there would be associated Variable objects. Variable objects would contain information about the text string or Marker that is to be substituted in the expression. They would also contain associations to the metadata objects where substitution values are found. The Variable metadata type has an attribute, ValueType=, that can contain the name of the attribute in the associated metadata object that should be used to get the substitution value. The RowSelector object has a Columns association (numbered 10) to the Column object used by this RowSelector.

If there were options in this scenario, they would be stored as Property objects and grouped in a PropertySet object. They would be associated to the ClassifierMap object with a PropertySets association (number 11). A Property object is associated to a PropertySet object with a Properties association (number 12). A Property object contains the name of the property or option and the value for the property. A PropertyRole= attribute can also be assigned to the Property.

Top of Page