Previous Page | Next Page

Planning Your Data Stores and Processes

Processes

Most of the work in defining a Job is in defining its processes. A process is a routine that creates a warehouse data store or that extracts data, transforms data, or loads data into a data store. In the Process Editor, you define metadata records that are used to generate or retrieve the source code for processes. Mappings, User Exits, Data Transfers, Record Selectors, and Load Steps are all metadata records that generate or retrieve processes.

Each process that you define in the Process Editor generates or retrieves code. SAS/Warehouse Administrator can generate source code for any process except a User Exit, a Detail Logical Table Load Step, or an ODD Load Step. However, you can specify a user-written routine for any process.

The sections that follow give an overview of SAS/Warehouse Administrator processes.


Mappings

A Mapping process is automatically added when you specify an input for most data stores in the Process Editor. A Mapping is a metadata record used to generate or retrieve a routine that maps columns from one or more data sources into one or more Data Tables, Detail Tables, OLAP Tables, or OLAP MDDBs.

For example, in Process Flow With Mapping in the Process Editor, columns from a source table named Customer are mapped to columns in a target table named Customer.

Process Flow With Mapping in the Process Editor

[Process Flow With Mapping in the Process Editor]

Common mappings include one-to-one (one data source to a target table), joins (one or more data sources merged by one or more common columns), and unions (two or more data sources appended to a target table). For details, see the two Mapping examples in Maintaining Processes.


User Exits

A User Exit process is a metadata record used to retrieve a user-written routine. The routine must be stored in a SAS catalog with an entry type of SOURCE or SCL. A User Exit routine often extracts or transforms information for a warehouse data store, but it could do many other tasks. Process Flow With User Exit in the Process Editor shows a User Exit that could run a validation routine on customer income levels before loading them into a warehouse table.

Process Flow With User Exit in the Process Editor

[Process Flow With User Exit in the Process Editor]

For details, see Example: Defining User Exit Process Properties.


Data Transfers

A Data Transfer process is a metadata record used to generate or retrieve a routine that moves data from one host to another. Data Transfers are required when an input source and the target data reside on different hosts.

For example, in Process Flow With Data Transfer in the Process Editor, suppose that the Customer source table was on Host 1, and the Customer target table was on Host 2.

Process Flow With Data Transfer in the Process Editor

[Process Flow With Data Transfer in the Process Editor]

The Process Flow in Process Flow With Data Transfer in the Process Editor illustrates how a Data Transfer could be used to copy the Customer source data on Host 1 to an intermediate work table on a Host 2. The Mapping in the Process Flow would then map columns from the intermediate work table to the target table.

If SAS/Warehouse Administrator generates the code for a Data Transfer, it uses SAS/CONNECT software and PROC UPLOAD or PROC DOWNLOAD to move the data. This method is most easily applied to transfers between a local host (host where SAS/Warehouse Administrator is installed) and a remote host. For details, see Example: Defining Data Transfer Process Properties to Move Data from Remote Host to Local Host.

If you need a remote-to-remote transfer, one solution is to specify a user-written transfer routine in the metadata for the Data Transfer process.

Note:   Data Transfers always execute on the remote host (a host other than the host where SAS/Warehouse Administrator is installed).   [cautionend]


Record Selectors

A Record Selector process is a metadata record used to generate or retrieve a routine that subsets data prior to loading it to a specified table. In the current release, a Record Selector can be used only to subset the source data specified in an ODD or in a Data File (which is an input to an ODD).

For example, Process Flow With Record Selector in the Process Editor illustrates a Record Selector process that subsets information from the Customer source.

Process Flow With Record Selector in the Process Editor

[Process Flow With Record Selector in the Process Editor]

For details, see Example: Defining Record Selector Process Properties.


Load Processes

A Load process (also called a Load Step) is a metadata record used to generate or retrieve a routine that puts data into a specified target object. After you define the metadata for a given data store, you must define a Load process, which actually creates and loads the data store.

Unlike data preparation processes (Mappings, User Exits, Data Transfers, and Record Selectors), Load processes do not have their own icons in a Process Flow. They are associated with the icons of the tables they create. For example, in Process Flow With Record Selector in the Process Editor, the Load processes for the Customer source table and the Customer target table are not shown in the Process Flow.

To view the Load process metadata for a given data store:

  1. Display the Process Flow for the data store.

  2. In the Process Flow, click the right mouse button on the data store whose Load process you would like to inspect. Select Edit Load Step from the pop-up menu.

    A window will display the Load process metadata for the selected data store.

For details, see the Load process examples in Maintaining Processes.

Previous Page | Next Page | Top of Page