What's New in SAS ETL Studio 3.3
Overview
SAS ETL Studio 3.3 (renamed to SAS Data Integration Studio in March 2006) makes it much easier to perform the following tasks:
- create iterative jobs (processing loops)
- view, subset, sort, and search the data in a table
- use lookup tables to populate target tables
- view the differences between imported metadata and existing metadata, and
choose which changes to apply
- perform cluster analysis, in order to eliminate duplicate mailings to members
of the same household, for example
Note: SAS ETL Studio 3.3 adds new features to the version of SAS ETL
Studio that originally shipped with SAS 9.1.3.
Dynamic Grid Enablement and Parallel Processing
SAS ETL Studio uses a set of macros to enable parallel processing. These
macros can be enabled in two ways:
- by selecting YES in the Enable parallel
processing macros option on the Options tab of the
properties window for a job
- by including a Loop transformation in a job
When parallel processing is enabled for a SAS ETL Studio job, the
transformations in the job can be executed in parallel on a single machine that
supports symmetric multiprocessing (SMP) or by a grid computing networkif the
grid software is licensed.
Dynamic Grid Enablement and Parallel Processing for Iterative Jobs
You can create iterative jobs (processing loops) with the following new
components:
-
a control table that lists rows of information to be processed. You can use
the Library Contents transformation to create a control table.
-
an iterative job that contains a control table and a processing loop that
is implemented with the Loop transformation and the Loop End transformation.
The iterative job can be executed in parallel on a single machine that supports
SMP or by a grid computing network.
- a parameterized job that contains the transformations that process the rows
listed in the control table. This job is dropped between the Loop
transformation and the Loop End transformation in the iterative
job.
Change Analysis on Imported Metadata
When you import metadata to SAS ETL Studio, you can view the differences between
the imported metadata and existing metadata and choose which changes to apply.
Changes in tables, columns, indexes, and keys display in a Differences window.
To help you understand the impact of a given change, you can run impact analysis
or reverse impact analysis on imported tables and columns.
New Lookup Transformation
You can use the new Lookup transformation in a job to load a target with columns
that are taken from a source and from a number of lookup tables. The Lookup
transformation offers a number of advantages over the existing Fact Table Lookup
transformation:
- The Lookup transformation processes all lookups in a single DATA step,
using a hashing technique that is new in SAS 9.0.
- It supports multi-column key lookup.
- It has better error and exception handling, including missing value and
missing table support.
- User-configurable default settings can be persisted.
- Its lookup and error tables can be viewed in the Process Editor and in
impact analysis reports.
- It provides WHERE clause support in lookups to filter incoming lookup records.
Enhanced Support for SPD Server Tables
The SAS Scalable Performance Data (SPD) Server is a high performance,
multi-user, parallel-processing data server. It stores data in a special format
that facilitates parallel processing. You can now use the properties window for
an SPD Server table to specify options that are unique to those tables. You can
also use a new loader transformation, the SPD Server Table Loader, to specify
options that are unique to SPD Servers. The new loader generates code that is
appropriate for the special data format that the SPD Server uses.
New Support for Cluster Analysis
The Create Match Code transformation now supports cluster analysis as well
as match code analysis. Cluster analysis assigns cluster numbers based on multiple
conditions. For example, suppose you wanted to eliminate duplicate mailings
to members of the same household. When analyzing a demographic database, conditions
can be defined for names and for street address and zip code. Rows that match
either condition receive the same cluster number and would be considered as
one household.
Enhanced Change Management
The following improvements make it easier to use change management in SAS ETL
Studio:
- You do not have to check out a table in order to add it as a source or a
target in a job.
- When you check out a job, the tables in the job are no longer
automatically checked out. This makes it easier to avoid locking more
resources than you need.
- Check-out operations have fewer restrictions in regard to repository
dependencies.
- A new menu selection (Clear Project Repository) enables you to delete all
new objects and unlock all checked-out objects in a project repository.
A
check-in description can be longer than 200 characters. You can print the
check-in history as well.
You must have the latest metadata server for these features to be enabled.
Enhanced View Data Window
The View Data window now enables you to do the following tasks:
- Select which columns in the table to view.
- Use WHERE clause subsetting of the data with simple and complex filters.
- Find character and numeric data, through the entire file
- Display the total number of records in the table in the title bar of the
window (1 of
).
- Jump to beginning, jump to end, jump to line number.
- Select which column names appear in the header: Metadata Name, SAS Column
Name, and/or Description.
- Sort the data by column selection, ascending, descending.
- View formatted and unformatted data.
Enhanced Update Table Metadata Feature
This feature has been improved in the following ways:
- The METALIB procedure is used to make metadata updates. This method is
faster and more reliable than previous methods.
- You can update tables that are remote to the SAS application server.
- You can select multiple tables and run Update Table Metadata against them
all at once.
- You can view an improved update log which lists the tables that have been
successfully updated or which have failed. The log also provides details about
the updates to each table.
Your SAS application server must have the latest SAS Workspace Server
component for these features to be enabled.
Enhanced Support for Cubes
New options enable you to export and import cube metadata; save the OLAP procedure
code that is generated for a cube; maintain calculated members that are associated
with a cube; and tune aggregations.
General Enhancements
-
The Options window has a new Code Generation tab that
enables you to set global options for the code that will be generated for all
new jobs.
- The output table (an intermediate work table) for a number of
transformations can now be either a physical table or a view. The Extract,
Join, Lookup, and Data Validation transformations are among those that have
this option. For details, see the Options tab in the property
window for each transformation.
- After a job has been executed at least once, you can right-click a
transformation in the job and view the contents of its work table. This helps
you verify or debug the transformations in a job.
-
The SCD Type 2 Loader now handles duplicate records and reloads closed-out
records.
- The property window for the Extract transformation now has Order
By and Group By tabs. It also has an option that
determines whether the SELECT DISTINCT clause is inserted into the SQL
statement for the transformation.
-
You can now select, copy, and print the contents of the
Finish tab that is used in many wizards.
Some new features in SAS Data Integration Studio 3.3 require the latest servers. It is
strongly recommended that you upgrade your metadata for the
current release of SAS ETL Studio. See the installation instructions for further
information.
For details about upgrading metadata as required for SAS 9.1.3, as well as
installing and configuring servers, libraries, users, and other resources that
are required by SAS ETL Studio, see the SAS 9.1.3 Intelligence Platform:
Installation Guide and the SAS 9.1.3 Intelligence Platform:
Administration Guide.