What's New in SAS Data Integration Studio 4.2 and 4.21
Overview
SAS Data
Integration Studio versions 4.2 and 4.21 have many new features. The
main enhancements that are common to both versions include the following:
- enhanced Job Editor
- new Details pane for jobs
- new advanced debugging
- enhanced mapping and propagation
- enhanced workflow
- new options format for transformations
- enhanced data cleansing and enrichment
- new metadata reporting
- new tree structure
- new Basic Properties pane
- new security enhancements
The enhancements
that are new to SAS Data Integration Studio 4.21 include the following:
- a Teradata Loader transformation
- the ability to restart jobs from
checkpoints
- an enhanced ability to push some
job steps to a database server for processing
- an enhanced ability to set options
on tables, especially for transformation inputs and outputs
- an enhanced ability to redirect
the temporary output tables for transformations
- the ability to register Netezza
and Neoview tables
- the ability to convert a SAS program
into a SAS Data Integration Studio job
- the Forecasting transformation
is available again
- summary of changes to the interface
Enhanced Job Editor
The Job Editor, the window that is used for building and
maintaining jobs, has been completely redesigned. It has many new
features, including the following:
- integrated and customized design-time
checks for early notification of potential problems as you build the
process flow for a job.
- process flow layout is saved along
with the job. When a job is reopened, the layout appears in the same
state as when it was last saved.
- integrated overview pane for navigating
large process flows and a details panel to show node details.
- easy zoom and pan features to better
view the process flow.
- the ability to disable and enable
nodes and submit flows in any state.
- complete control over the order
in which transformations in a process flow are executed.
- integrated impact analysis.
- full Undo to reverse an action,
and Redo to reverse the Undo operation.
- tables can be included in a process
flow more than once.
- documentation available in the
form of notes on nodes and on the canvas.
- integrated debugging that includes
interactive submit features, performance monitoring and statistics
capture, and progress and status notification.
- log checking with the ability to
jump to errors, warnings, status, and the code that is generated for
each transformation in the job.
- enhanced mapping features that
include intelligent handling of data type conversions, easy and selectable
customized mappings, and controlled propagation of changes to mappings.
New Details Pane for Jobs
The Job Editor has an optional Details pane that enables
you to maintain column definitions and mappings, view the status of
each transformation when the job is executed, check errors and warnings
when the job is executed, view run-time statistics for the job, and
change the order in which transformations in the job are executed.
The Details pane includes the following tabs:
- a Columns tab that enables you to manipulate column metadata for tables or
external files that are selected in the current job.
- a Mappings tab that enables you to control column mapping and column propagation
settings. The Mappings tab is displayed only
when a transformation is selected in the current job.
- a Status tab that displays the status of each step (transformation) in a
submitted job.
- a Warnings and Errors tab that displays any warnings and errors that are generated when
a job is submitted.
- a Statistics tab that displays run-time and table statistics that are generated
by a submitted job. This tab includes tabular and graphical displays.
- a Control Flow tab that displays the order in which transformations in the job
are executed. This tab also enables you to validate and change the
order of execution.
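As a rough illustration of what validating an execution order involves, the following Python sketch checks that every step runs only after its input tables exist. It is not how SAS Data Integration Studio implements the Control Flow tab; the function name `validate_order`, the step names, and the table names are all hypothetical.

```python
from typing import Dict, List, Set


def validate_order(order: List[str],
                   inputs: Dict[str, Set[str]],
                   outputs: Dict[str, Set[str]],
                   source_tables: Set[str]) -> List[str]:
    """Return a list of problems; an empty list means the order is valid."""
    available = set(source_tables)  # tables that exist before the job starts
    problems = []
    for step in order:
        missing = inputs[step] - available
        if missing:
            problems.append(
                f"{step} runs before its input(s) exist: {sorted(missing)}"
            )
        available |= outputs[step]  # outputs become usable by later steps
    return problems


# Hypothetical three-step job: Extract -> Sort -> Load
inputs = {"Extract": {"ORDERS"}, "Sort": {"W1"}, "Load": {"W2"}}
outputs = {"Extract": {"W1"}, "Sort": {"W2"}, "Load": {"DW_ORDERS"}}

good = validate_order(["Extract", "Sort", "Load"], inputs, outputs, {"ORDERS"})
bad = validate_order(["Sort", "Extract", "Load"], inputs, outputs, {"ORDERS"})
print(good)
print(bad)
```

In this sketch, moving Sort ahead of Extract is reported as a problem because Sort would read a work table that has not been created yet.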
New Advanced Debugging
Advanced
debugging of jobs that are opened in the Job Editor window is supported by the following features:
- status indicators that identify
complete and incomplete transformations in a job.
- status messages that explain the
status issues with incomplete transformations.
- an integrated debugger toolbar
that includes the following functions: Run, Stop, Run From Selected Transformation, Run To Selected Transformation, Run Selected Transformations, Step, and Continue.
- the ability to easily identify
run-time errors in the Status and Warnings and Errors tabs of the Details pane.
- run-time progress and status indicators
that enable you to follow the progress of a job running in the Job Editor window. You can also review status messages
about each node on the Status tab of the
Details pane. These messages are displayed as each step is processed.
- the ability to click an error or
warning on the Warnings and Errors tab of
the Details pane to see it displayed on the Log tab of the Job Editor window. You can also
jump to the Code tab or the properties window
for any step.
Enhanced Mapping and Propagation
Mapping
and propagation controls enable you to manage the flow of data and
the propagation of changes in your jobs. This flexibility is supported
by the following features:
- rules-based mappings that support
pattern mapping, user customizations, and automatic numeric to character
or character to numeric mappings.
- mapping and propagation controls
on the Mappings tab in selected transformations
and the Details pane, the Diagram tab of
the Job Editor window, and the Diagram tab toolbar. These controls enable you to manage
the scope and direction of mapping and propagation in your jobs.
Enhanced Workflow
Workflow
management has been enhanced with the following features:
- the Control Flow tab on the Details pane, which enables you to select a node in the
process flow diagram and specify when it runs. Simply drag the row
for the node to the desired position in the flow.
- the Statistics tab on the Details pane, which enables you to easily capture and
display performance information such as real time, CPU time, memory
use, input/output, and record count data. This data can be displayed
as a table or as a graph.
- the ability to start job runs from
the source, the target, or the middle of the job.
New Transformation Features
Transformations
now support the following new features:
- A dynamic target table structure
enables you to replace a temporary output with a permanent target
table or register the temporary table in place. Both options support
existing mappings.
- Values for many options can be
selected from a drop-down menu.
- Enhancements to the Transformation
Generator wizard include more parameter types such as dates, calendars,
and sets of values; the ability to select more elements from a static
list or dynamically populated list; the ability to validate and test
features while building the transformation; and the ability to import
or export parameter sets between transformations by using the Import from XML button or the Export to
XML button on the Options page of the Transformation
Generator wizard.
- The Analysis transformations have
been rewritten and enhanced, including Correlations, Distribution
Analysis, the Frequency transformations, and the Summary transformations.
Enhancements include support for dynamic target updates, ODS integration,
and support for most options in the corresponding SAS procedures (FREQ,
SUMMARY, and so on).
- A fast change data capture technique
reads changes for Oracle, DB2, and Attunity data. You can also create
custom data formats for change data capture.
- The UPSERT load option, which simultaneously
updates and inserts during a load, is now supported for the Table
Loader transformation for Teradata.
- Slowly Changing Dimensions now
support multiple techniques in a row, Type 1 and Type 2 columns, and
many performance enhancements.
- SQL Join now supports the optional
ability to disable automatic joins, easy order swapping, and layout
persistence in the Designer window.
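The UPSERT semantics mentioned above (update the rows whose keys already exist, insert the rest) can be sketched in Python as follows. This is only an illustration of the concept: it uses an in-memory SQLite database as a stand-in for a Teradata target, and the table name `target`, its columns, and the `upsert` function are assumptions made for the example, not code that the Table Loader generates.

```python
import sqlite3


def upsert(conn, rows):
    """Update rows whose key already exists in the target; insert the rest."""
    for key, value in rows:
        cur = conn.execute(
            "UPDATE target SET value = ? WHERE id = ?", (value, key)
        )
        if cur.rowcount == 0:  # no existing row matched, so insert a new one
            conn.execute(
                "INSERT INTO target (id, value) VALUES (?, ?)", (key, value)
            )
    conn.commit()


# Hypothetical target table with one existing row
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO target VALUES (1, 'old')")

# One incoming row matches an existing key; the other is new
upsert(conn, [(1, "new"), (2, "added")])
result = conn.execute("SELECT id, value FROM target ORDER BY id").fetchall()
print(result)
```

A database-native upsert typically performs both operations in a single pass, which is the main reason to prefer it over separate update and insert steps.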
Enhanced Data Cleansing and Enrichment
Data cleansing
and enrichment have been enhanced by the ability to run DataFlux jobs and real-time
services on the DataFlux Integration Server. The following transformations
have been added to support this process:
- the DataFlux IS Job transformation,
which executes a DataFlux job on a DataFlux Integration Server
- the DataFlux IS Service transformation,
which executes a DataFlux real-time service on a DataFlux Integration
Server
New Metadata Reporting
The new Reports feature enables you to generate reports and
review the metadata for tables and jobs in a convenient format. You
can also generate your own reports by creating a Java report plug-in.
You can perform the following tasks with these reports:
- finding information about a table
or job quickly
- comparing information between different
tables or jobs
- obtaining a single file that contains
summary information of all tables or jobs in HTML, RTF, or PDF format
- performing custom behaviors that
are defined by user-created plug-in SAS code, Java code, or both
Enhanced Tree Structure
The Custom
tree is replaced by the Folders tree, which is a new, standard interface
in many SAS 9.2 applications. The Folders tree enables you to add
custom folders so that you can organize metadata in categories that
are meaningful to your organization. The Folders tree might be the
interface you use most often when you want to select metadata for
update. For more information about the Folders tree, see the "Getting Started" chapter in the SAS Data Integration Studio: User's Guide.
The Inventory
tree contains folders for more types of objects. Most of the time,
however, SAS Data Integration Studio users work with the same objects
as before, such as tables, libraries, and jobs. You can right-click
objects in the Inventory tree and select Find In
Folders to find them in the Folders tree.
The Process
Library tree is now called the Transformations tree.
The Project
tree, a special tree that was used under change management, is now
called the Checkouts tree. You can check in objects individually.
Setup and administration of change management are easier. There is
no need to set up a metadata repository dependency chain.
New Basic Properties Pane
When you
click an object such as a table, a transformation, or a job, a list
of the main properties for the object is displayed in the Basic Properties
pane in the bottom-left corner of the SAS Data Integration Studio
desktop. You can turn this feature on and off with the View menu.
Security Enhancements
SAS Data
Integration Studio can use the following new security features from
the 9.2 Server platform:
- support for single signon (for
Windows servers only). This feature is an option when you connect
to a metadata server. When single signon is enabled, user host credentials
are used to connect to all servers and users are not prompted for
connection information.
- the ability to honor DBMS login
settings. Some databases such as Oracle and DB2 provide optional support
for always prompting for user name and password. The client now honors
this setting. If you configure this setting on your database, you
are prompted to enter credentials that surface in your LIBNAME statement.
- site settings that you can configure
for your metadata server. If the metadata server is configured to not store passwords on the clients, then the client
login screen always appears and the client metadata server connection
profile does not store credentials.
- a run-time credential lookup for
libraries. When this option is set, LIBNAME statements never show
user names and passwords in generated code or the logs. Instead the
LIBNAME engine looks up credentials when the job is run and uses them.
No credentials are ever placed in the log.
Teradata Loader Transformation
The Teradata
Table Loader transformation can be added to a process flow when a
Teradata table is used as a target. This transformation has a unique Load Technique tab that provides different load options
depending on whether the source table is in the same Teradata database
as the target table. The Teradata Table Loader transformation also
supports the pushdown feature that enables you to process relational
database tables directly on the appropriate relational database server.
Restart Jobs from Checkpoints
The restart
feature enables you to restart a job at the beginning of a step (transformation)
when a job previously failed at that step or a subsequent step. The
code for the steps preceding the checkpoint is skipped, and the state
is restored from the save-state information that is preserved by the
checkpoint code. Then, processing can pick up from the specified transformation.
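The general checkpoint-and-restart idea can be sketched in Python as follows. This is a conceptual model only, not the code that SAS Data Integration Studio generates: the `run_job` function, the JSON checkpoint file, and the three step functions are hypothetical, and the sketch resumes at the step that failed.

```python
import json
import os
import tempfile


def run_job(steps, checkpoint_path, restart=False):
    """Run steps in order and save state after each completed step.

    On restart, restore the saved state and skip the completed steps.
    """
    state, done = {}, 0
    if restart and os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            saved = json.load(f)
        state, done = saved["state"], saved["done"]
    executed = []
    for index, (name, func) in enumerate(steps):
        if index < done:
            continue  # steps that precede the checkpoint are skipped
        func(state)
        executed.append(name)
        with open(checkpoint_path, "w") as f:
            json.dump({"state": state, "done": index + 1}, f)
    return state, executed


fail_once = {"armed": True}  # makes the second step fail on the first run only


def extract(state):
    state["rows"] = [1, 2, 3]


def transform(state):
    if fail_once["armed"]:
        fail_once["armed"] = False
        raise RuntimeError("simulated failure")
    state["total"] = sum(state["rows"])


def load(state):
    state["loaded"] = True


steps = [("extract", extract), ("transform", transform), ("load", load)]
path = os.path.join(tempfile.mkdtemp(), "checkpoint.json")

try:
    run_job(steps, path)  # first run fails in the transform step
except RuntimeError:
    pass

state, executed = run_job(steps, path, restart=True)
print(executed)
print(state)
```

On the restarted run, the extract step is not executed again; its results come from the saved state, and processing resumes at the transform step.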
Push Job Code down to a Database
When both
the inputs and outputs of the Extract, SQL Join, Teradata Table Loader,
and Table Loader transformations are stored in the same relational
database, the code for these transformations can be pushed down to
a database server for execution. This option increases performance
by shifting data transformation to the most appropriate processing
resource.
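The eligibility condition described above can be expressed as a small Python check. This is a sketch of the stated rule only; the `Table` class, the `can_push_down` function, and the database identifiers are hypothetical, and the product may apply additional conditions that are not described here.

```python
from dataclasses import dataclass
from typing import List, Optional

# Transformations that this section names as supporting pushdown
PUSHDOWN_TRANSFORMS = {
    "Extract", "SQL Join", "Teradata Table Loader", "Table Loader",
}


@dataclass(frozen=True)
class Table:
    name: str
    database: Optional[str]  # None stands for a table that is not in a DBMS


def can_push_down(transform: str,
                  inputs: List[Table],
                  outputs: List[Table]) -> bool:
    """True when all inputs and outputs are in the same relational database."""
    if transform not in PUSHDOWN_TRANSFORMS:
        return False
    databases = {t.database for t in inputs + outputs}
    return len(databases) == 1 and None not in databases


# Hypothetical tables: three in one Teradata database, one in SAS
a = Table("CUSTOMERS", "tera1")
b = Table("ORDERS", "tera1")
c = Table("CUST_ORDERS", "tera1")
d = Table("WORK_ORDERS", None)

print(can_push_down("SQL Join", [a, b], [c]))
print(can_push_down("SQL Join", [a, d], [c]))
```

The second call returns False because one input is outside the database, so the join could not run entirely on the database server.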
Specify Table Options
To display
most table options, display the properties window for a table and
select the new Options tab. The options that
are available vary according to the data format of the tables (SAS
or DBMS). You can specify table options for the inputs and outputs
of most transformations on the new Table Options tab of the properties window for the transformation. The options
that are available vary according to the data format of the tables
(SAS or DBMS) and whether the table is an input or an output.
Redirect Temporary Output Tables
Transformations
in a job typically create temporary work tables as they execute. The
default location for these temporary tables is the SAS WORK library.
You can now easily redirect these temporary tables to an alternative
location, including a DBMS. Redirecting this output can improve performance,
support the restart of jobs from a checkpoint, and support the pushdown
of work to a third-party database.
Register Netezza and Neoview Tables
You can
register Netezza and Neoview tables and include them in SAS Data Integration
Studio jobs.
Convert SAS Programs into SAS Data Integration Studio Jobs
The Import
SAS Code wizard enables you to analyze a SAS program and to automatically
create a SAS Data Integration Studio job that performs the same tasks
as the program.
Forecasting Transformation is Available Again
You can
use the Forecasting transformation to run the High-Performance Forecasting
procedure (PROC HPF) against a warehouse data store. PROC HPF provides
a quick and automatic way to generate forecasts for many sets of time-series
data or transactional data. The procedure can forecast millions of
series at a time, with the series organized into separate variables
or across BY groups. The Forecasting transformation provides a simple
interface for entering values for various options that are associated
with PROC HPF.
Summary of Changes to the Interface
Changes to the interface for SAS Data
Integration Studio versions 4.2 and 4.21
include the following:
Trees | The Custom tree is replaced by the Folders tab. The Process Library tree is replaced by the Transformations tab. The Quick Properties pane is now called the Basic Properties pane. The Metadata tree is no longer available.
File Menu | The File menu includes a New submenu which enables you to register new tables, jobs, and other objects; a Register Tables submenu which enables you to register existing tables; an Import submenu that includes a new Import SAS Code option; and a Connection Profile option (formerly called Metadata Profile).
Edit Menu | The Edit menu includes a new Connections option, which displays the Connections window. Use the Connections window to manage the input and output connections for tables or transformations in the Diagram tab of the Job Editor window.
View Menu | The View menu now contains Control Order, Layout, Zoom, and Grid options that are specific to jobs. The Comparison Results option is moved to the Tools menu. The View Libname option is moved to the Actions menu.
Check Outs Menu | The Check Outs menu replaces the Project menu. The Fetch option is no longer available. For more information, see the change management topic in the user's guide or the online Help.
Actions Menu | The Actions menu includes many new features that are mostly related to jobs. For more information, see the section about creating jobs in the user's guide or the online Help.
Debug Menu | The Debug menu is new. For more information, see the section about managing jobs in the user's guide or the online Help.
Tools Menu | The Tools menu has a number of changes. The Source Designer option is now Register Tables on the File menu. The Target Designer option is now available by selecting File > New > Table. The Transformation Generator option is now available by selecting File > New > Transformation. The Update Table Metadata option is now available by selecting Actions > Update Metadata. The Process Designer option is much simplified and is available by selecting File > New > Job. The following options are no longer available: Advanced Aggregation Tuning, Calculated Members, Import Cube, Transformation Importer, Configure Status Handling.
For more information about the main menus and windows, see the topics under "Windows and Other Components" in the table of contents in the online Help.
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.