Registering Data Sources |
Overview |
This example describes how to create an ODD that executes user-written code that extracts information from a data source, and then saves the results to a SAS table or view. The location of the extraction table or view is then specified in the ODD. Use this approach when you want to transform source data before making it available in a Warehouse Environment.
Note: The following explanations describe the metadata and methods to achieve the desired results; it is assumed that the appropriate ODD Group exists.
Preparing to Create an ODD with a User-Written Load Step |
In addition to the preparation described in Preparing to Create ODDs, you must do the following preparation for ODDs with user-written Load Steps:
Input Sources | |
Output Targets | |
Processes |
Load Steps create tables or views. SAS provides a number of ways to create tables and views, including
The example in this section describes how to use the Query window to generate the majority of a Load Step routine. For more information about user-written source code, see the online Help. To display the relevant online Help, in the SAS System Help contents, select Help on SAS Software Products Using SAS/Warehouse Administrator Software Overview Overview of SAS/Warehouse Administrator user-written source code to display the topic User-Written Source Code.
Create a Load Step Routine (Query Window Method) |
This section summarizes how to use the Query window to generate the majority of a Load Step routine. The Query window is an interactive interface that enables you to build, save, and run queries without being an expert with SQL (Structured Query Language) or with the SAS SQL procedure. The query that you build in the Query window is passed to the SAS SQL procedure for processing when you run the query.
The code generated by the Query window can, with minor changes, be used to create a SAS table or view. If you know a little about PROC SQL syntax, this method might be easier than writing your own code to create a Load Step in SAS/Warehouse Administrator.
This section summarizes how to
use the Query window to generate a query against a data source in an input library
edit the query so that it creates a table or view and saves it to an output library.
Note: The input library, output library, and the code library must have library definitions in the Warehouse Environment that will contain the ODD.
For example, assume that the following SAS libraries have library definitions in the example Toy Store Environment:
OPERDATA (Operational Data library), where the input table for the query routine is stored
SOURCE (Source Code library), where the query routine will be saved
ODDOUT (ODD Output library), where the edited query will write its output.
Assign the input library (such as OPERDATA) and the code library (such as SOURCE) in the current SAS session.
From the SAS window, select Tools, then Query from the menu.
From the SQL Query Tables window, in the Table Sources panel, click the input library where the data source resides (such as OPERDATA).
The tables in the selected library are listed in the Available Tables panel.
In the Available Tables panel, click the table that contains the data for the ODD, and then click the right arrow.
In the Selected Tables window, click the table, and then click .
In the SQL Query Columns window, click on the columns you want to appear in the ODD, or select all columns if you want all columns from the source to appear in the ODD. Then click the right arrow.
From the SAS menu, select File Save Query Save Query as a SOURCE entry
Update the SOURCE entry specification window so that it will save your query in the appropriate code library (such as SOURCE), in the appropriate catalog (which could also be named SOURCE), and in the appropriate catalog entry (such as CUSTODD), as follows:
You have just generated a query that selects columns from a data source. You must now modify the query so that it will store the query results in another table--the table the ODD will point to.
The next task is to edit the query generated from the Query window so that it creates a table or view and saves it to the output library. You might only need to add a single line of code--an SQL CREATE TABLE statement or a CREATE VIEW statement--to the code that was generated by the Query window.
Note: The steps below assume that the code library is in SAS Version 8 format. If your code library is not in Version 8 format, the steps for editing a source code entry will be somewhat different than the following steps.
To edit a SAS Version 8 source code entry:
In SAS Version 8, if you have not done so already, display the tree view of the SAS Explorer by typing explorer on the Command line and pressing Return.
Assign the input library (such as OPERDATA), the code library (such as SOURCE), and the output library (such as ODDOUT) in the current SAS session.
In the Explorer, expand the code library until you can see the entry for the query that you saved from the Query window. Double-click that entry.
Modify the query so that it will store the query results in a table. Above the first line (PROC SQL), you might want to add comments that will describe the purpose of this source code. Between PROC SQL and Select, add an SQL CREATE TABLE or CREATE VIEW statement that saves its output to an output library (such as ODDOUT).
When you are finished making the modifications, click the X in the upper right-hand corner of the SAS Notepad window to save your changes.
You have just modified a query so that it will store the query results in another table--the table that the ODD will point to. The next step is to test the query in the SAS Editor window.
After you have edited and saved the query, submit it in the SAS Editor window in order to verify that the routine will in fact create a table. Also, if the table already exists, you can simply import its column metadata in the ODD that you will create later.
Assign the input library (such as OPERDATA), the code library (such as SOURCE), and the output library (such as ODDOUT) in the current SAS session.
Open the query routine in the SAS Notepad window, as described in Edit the Query So That It Creates a Table.
Copy the routine from the SAS Notepad window to the SAS Editor window.
Execute the routine. Check the SAS log to verify that the routine creates the table in the output library.
You have just verified that the modified query will create a table in the output library. Later, this routine will be specified in the metadata for the ODD's Load Step. Before you do that, however, you must first define the properties of the ODD.
Define ODD Properties |
After you have created a routine that extracts information from a data source, and then saves the results to a SAS table or view, you can specify the location of the extraction table or view in an ODD.
In the SAS/Warehouse Administrator Explorer, position the cursor on the parent ODD Group, for example, Sales Source Data, press the right mouse button, select Add Item, then ODD. In the Explorer window, a new ODD is added under the ODD Group.
To update the default metadata for the ODD, position the cursor on its icon, click the right mouse button, and select Properties. The Operational Data Definition Properties window displays for you to enter the appropriate information.
General Tab |
specifies the ODD's name, such as Customer, as well as a description, an owner, and an administrator. To specify an owner or an administrator, click the down arrow to select a name from a list. If you need to add a new Contact record, click the right arrow to display a properties window. For details about that window, click its button. The next task is to specify the location of the data source for this ODD. |
Data Location Tab |
specifies the definition for the host where the target library resides (such as local); specifies the target SAS library (such as ODD Output, the display name for the ODDOUT library); specifies the SAS table or view within that library (such as Customer). The next display shows an example of this metadata: To specify a host or SAS library, click the down arrow to select a name from a list. If you need to add a new host or library, click the right arrow to display a properties window. For details about that window, click its button. To specify a SAS table or view within the library, click the down arrow beside the SAS Table or View field to select a name from a list. Note: If you can select a SAS table by clicking the down arrow beside the SAS Table or View field, then SAS was able to assign the library that is specified in the SAS Library field. If you are unable to access a table, read the SAS log to identify the problem. |
Columns Tab |
specifies the column metadata to be included in the ODD. No column metadata has been specified yet. The column metadata for a data store must accurately specify the columns that you want to "map" to other data stores in a Process Flow, such as the one shown in ODD in a Process Flow. In the case of an ODD that executes user-written code that extracts information from a data source, and then saves the results to an output table or view, the metadata on this tab should match the columns in the output table or view. In the current example, the edited query routine has been executed at least once, as described in Test the Edited Query in the SAS Editor. Accordingly, the ODD's output table already exists. This means that you can use the "Import from Supplied Data Location" method to import the metadata for all columns in the output table. To import columns from an output table, click , and then select Supplied Data Location. All of the columns from the output table specified on the Data Location tab are imported. |
After you specify the column metadata, you have finished entering properties for this ODD. Click to save its metadata record and return to the Explorer.
You have just defined the properties of the ODD. You have specified the location of the output table and have specified metadata for the columns in the output table. The next step is to define a Process Editor Job for the ODD.
Define Process Editor Job |
In the current example, we have created a routine that extracts information from a data source, and then saves the results to a SAS table or view. We have defined an ODD that specifies the location of the SAS table or view. The next step is to create a Process Editor Job that includes
a Process Flow that specifies how data moves from the original data source to the ODD
an ODD Load Step that points to the routine that extracts information from a data source, and then saves the results to a SAS table or view.
As explained in the "Maintaining Jobs" chapter, data stores with user-written Load Steps--such as the ODD in our current example--do not need Process Flows in the Process Editor. However, creating a Process Flow for the example ODD has two advantages:
The flow of information from the input source to the ODD will be documented within SAS/Warehouse Administrator.
SAS/Warehouse Administrator can generate LIBNAME statements for the input source(s) specified in a Process Flow. In our example, this means that you will not have to assign the following SAS libraries manually, which have library definitions in the Toy Store Warehouse Environment:
OPERDATA (Operational Data library), where the input table for the Load Step routine is stored
SOURCE (Source Code library), where the Load Step routine is stored
ODDOUT (ODD Output library), where the Load Step routine will write its output.
In the current example, these libraries will be specified in the data stores and processes in a Process Flow. When the Process Flow is executed, SAS/Warehouse Administrator assigns the libraries.
Display the Customer ODD in the SAS/Warehouse Administrator Explorer.
Position the cursor on the Customer ODD, click the right mouse button and select Process.
A default Job will be created in the Process Editor, as follows:
In this display, the Job is represented by the icon with the rectangle around it in the left panel. The next task is to specify the input to the Customer ODD. Assume that the original data source in our example is the SAS table OPERDATA.CUSTOMER.
In the Process View of the Process Editor (right panel), position the cursor on the Customer ODD, click the right mouse button and select Add, then New Data File. (We selected New Data File because this input has not yet been specified for any other data store in the current Environment, and a Data File is the appropriate type for ODD inputs that are SAS tables or views. For details, see Valid Inputs and Outputs for Data Stores.)
A default Data File is added to the Process Flow, as follows:
In the Process View of the Process Editor, position the cursor on the Data File, click the right mouse button and select Properties. The Operational Data Definition Process Attributes window displays for you to enter the appropriate information.
When you are finished entering metadata for the Data File, click to save it.
You are returned to the Process Editor. The Data File has its new name, if you specified one.
The Process Flow now specifies what information will flow from the source (Operdata Customer) to the target (Customer). The final task is to specify a Load Step that actually moves data from the source to the target.
In the Process View of the Process Editor, position the cursor on the Customer ODD, click the right mouse button and select Edit Load Step. The Operational Data Definition Process Attributes window displays for you to enter the appropriate information.
After you specify Execution tab metadata, you are finished creating this Load Step. Click to save its metadata record and return to the Process Editor.
You are now ready to test the Job for this ODD.
Test the ODD |
If you have not done so already, open the ODD in the Process Editor.
For example, display the ODD (such as Customer) in the SAS/Warehouse Administrator Explorer, as follows:
Position the cursor on the ODD, click the right mouse button and select Process. The ODD will be opened in the Process Editor, as follows:
In the left panel of the Process Editor, the ODD's Job and output tables will be listed. In this display, the output table for the Job has a rectangle around it. Note that the output table has a parent. This parent is the Job for the Customer ODD.
(Optional) Position the cursor on the Job, and click the left mouse button to select it, as follows:
Position the cursor on the Job, click the right mouse button and select Run. The Load Generation/Execution Properties window displays.
Use this method to view the data that is accessed through an ODD. In order for this method to work, the Job for an ODD must have been successfully executed at least once.
Display the ODD (such as Customer) in the SAS/Warehouse Administrator Explorer, as follows:
Position the cursor on the ODD, click the right mouse button and select Data Utilities, and then Open. The ODD will be opened in a VIEWTABLE window, as follows:
After you have tested your ODD, it is ready to be used as an input to another data store. See also the maintenance note in Keeping ODD Column Metadata Current.
Copyright © 2012 by SAS Institute Inc., Cary, NC, USA. All rights reserved.