Scalability and Performance Papers
The following papers were written by SAS Institute R&D staff, SAS customers, and SAS alliance partners
to provide detailed information about many different areas of scalability.
In addition to the papers provided here, you may want to browse the Service and Support Library for other papers available from SAS.
Topics Independent of SAS Version
A Survey of Shared File Systems: Determining the Best Choice for your Distributed Applications (a SAS White Paper)
A shared file system is a required and integral component of all SAS® Grid Manager deployments, Enterprise Business Intelligence deployments
with load balanced servers on multiple systems, and other types of distributed SAS applications. This paper examines the behavior of
several different shared file systems in the context of performance of a representative deployment.
Available in PDF
Best Practices for Data Sharing in a Grid Distributed SAS® Environment (a SAS White Paper)
Storage performance is the most critical component of implementing SAS in a distributed grid
environment. This paper provides an introduction to basic storage terminology and concerns.
It also describes the best practices used during successful testing with SAS and several clustered
file systems. This paper can be used as a reference guide when configuring a distributed
environment that will perform and scale to meet the needs of your organization.
Available in PDF
A Practical Approach to Solving Performance Problems with the SAS System (updated December 2007)
This paper presents some of the common causes of performance problems and provides a systematic approach for solving these
problems and improving performance.
Available in PDF
Solving SAS Performance Problems: Employing Host Based Tools (presented at SUGI 31)
The SAS White Paper, "A Practical Approach to Solving Performance Problems with the SAS System," detailed the role
of the FULLSTIMER option in diagnosing and solving performance problems. It introduced the usage of host-based performance
monitors for further investigation. This paper continues with that approach, detailing the use of the most commonly
available host-based performance monitors. It will discuss how to employ them in performance testing, interpret them with a
SAS mindset, and reconcile them to FULLSTIMER output to determine problem causes.
Available in PDF
SAS Performance Monitoring - A Deeper Discussion (presented at SAS Global Forum 2008)
This SAS white paper follows on from "Solving SAS Performance Problems: Employing Host Based Tools."
It continues that approach by describing easy-to-use tools such as nmon and perfmon®, investigating the most common
causes of performance problems, and suggesting what to look for in monitors first. The goal is to illustrate how you can
use graphical monitoring tools such as nmon and perfmon in conjunction with SAS FULLSTIMER log output to determine problem causes.
Available in PDF
How to Maintain Happy SAS Users (presented at NESUG 2007)
When working with the IT administration staff at a SAS customer's location, one common thread emerges about what they can do
to maintain happy SAS users: ensure that the underlying hardware is properly configured to support the SAS applications.
This is not a trivial task, because different SAS applications need the hardware configured differently, and how well you
understand how SAS will be used determines how well you can evaluate options for the hardware, operating system, and
infrastructure (mid-tier) configuration. This is easier for existing SAS customers and more difficult for new SAS customers
or new SAS applications at an existing SAS customer site.
Available in PDF
Topics Pertaining to SAS 9
SAS® Grid 101: How It Can Modernize Your Existing SAS Environment (presented at SAS Global Forum 2008)
Grid computing promises many benefits, including improved performance of applications, higher resource utilization,
lower cost of ownership, and flexibility for your IT infrastructure. This paper describes many of the business issues
that can be addressed by SAS Grid Computing and provides code examples of how to implement SAS
applications on the grid. Learn how you can use SAS Grid Computing to modernize your existing SAS® environment
and add new value to your existing applications with little or no change.
Available in PDF. Presentation available in PPS.
Data Integration in a Grid-Enabled Environment (presented at SAS Global Forum 2008)
SAS® Data Integration Studio and SAS® Grid Manager add capabilities to the SAS® product suite to distribute
workloads across a grid of computers and thereby allow large processes to complete more quickly than
previously possible. SAS Grid Manager has been incorporated into SAS Data Integration Studio to facilitate
using grid resources for any long-running task that can be processed in parallel to another task. This paper
discusses typical data integration workloads, how to scale them on typical grid computing hardware, and the
new capability to load balance multiple data integration tasks across grid resources.
Available in PDF. Presentation available in PPS.
Introducing the SAS® Code Analyzer (presented at SAS Global Forum 2008)
This paper introduces PROC SCAPROC, the SAS Code Analyzer, which is
new in Release 9.2 of Base SAS® Software. We will examine the advantages of using
the procedure, its syntax and phases of execution, and the output that the procedure
can produce. This procedure greatly facilitates grid-enabling your existing SAS programs.
Available in PDF. Presentation available in PPS.
Balancing the Load - SAS® Server Technologies for Scalability (presented at SAS Global Forum 2008)
This paper will address a variety of SAS® servers and how they can be used to balance workload and work together
to provide scalability in a SAS Enterprise deployment. We will discuss a variety of servers including stored process
servers, workspace servers, DATA step batch servers, and grid servers. We will also discuss the options for using
these servers to balance load and provide solutions that can leverage a scale-out architecture.
Available in PDF. Presentation available in PPS.
Architecting a Finely Tuned SAS® Grid Solution (presented at SAS Global Forum 2008)
SAS Grid Computing is a scale-out SAS solution that enables SAS applications to better utilize computing resources.
When architecting a SAS Grid Computing solution, it is important to understand the components required to ensure a
scalable and highly optimized solution. This paper details some of the components necessary to architect and tune a
SAS Grid Computing solution.
Available in PDF
ETL Performance Tuning Tips (2007)
This paper provides important ETL performance, tuning, and capacity information for SAS®9 and SAS Data Integration
Studio. Topics include how to analyze and debug flows, gain efficiency quickly, set up the system environment, and, when
necessary, customize advanced performance techniques. Many best practice recommendations are presented that will help
you to customize and enhance the performance of new and existing ETL processes.
Available in PDF
Best Practices for Configuring IO Subsystems for SAS9 Applications (presented at SAS Global Forum 2007)
The increased power of SAS®9 applications allows information and knowledge creation from very large amounts of data.
Analysis that used to consist of tens to hundreds of gigabytes (GB) of supporting data has rapidly grown into the tens of
terabytes (TB). This data expansion has resulted in more and larger SAS data stores. Setting up file systems to
support these large volumes of data, as well as ensuring adequate storage space for the SAS® temporary files, can be
very challenging. This paper will present some best practices for configuring the IO subsystem for your SAS9 applications,
ensuring adequate capacity, bandwidth, and performance to keep your SAS9 users moving.
Available in PDF
Ensuring you have the Proper Resources for your SAS9 Applications (presented at SAS Global Forum 2007)
Several previous SUGI papers address performance-problem troubleshooting with SAS®9 applications. This paper
contains information to help you manage and monitor your computer environment to ensure that you have adequate
resources to support your SAS9 applications. The paper briefly describes the infrastructure that is needed
for various SAS applications (from simple SAS jobs to the most complicated SAS9 Enterprise BI applications). It
then shows how to identify which system infrastructure resource areas are most under pressure and how to continually
monitor them by using simple operating-system tools and more complex third-party monitoring applications. The monitoring
and resulting resource management advice will help ensure that you can meet the demands of your SAS users.
Available in PDF
SAS Goes Grid - Managing the Workload Across Your Enterprise (presented at SUGI31)
Learn how grid computing and scheduling have been incorporated and automated to deliver value in a highly efficient manner
for SAS analytics, data integration (ETL), data mining and business intelligence. Also learn about advanced configuration
options that you can use to fine-tune your SAS grid environment and allow multiple applications to efficiently and
dynamically use a virtual IT infrastructure.
Available in PDF
SAS and Grid Computing - Maximize Efficiency, Lower Total Cost of Ownership (presented at SUGI29)
Grid computing is about leveraging your available resources and idle processor cycles to more quickly solve a problem while
at the same time maximizing efficiency and reducing your total cost of ownership. This paper will discuss how SAS works in a
grid, the types of applications that are well suited to grid computing and success stories using SAS in a grid.
Available in PDF
***CUSTOMER PAPER: Performance Enhancements SAS 9.0 HP-UX 11.00 64 bit
Charles Pollack of Suncorp Metway presented performance and scalability improvements that
he experienced with SAS 9.0. Charles details the improvements he got by running his V8 SAS jobs with SAS 9.0 and leveraging
some of the new scalable features, including PROC SORT, PROC SUMMARY, the new SPD Engine, and the new piping feature of MP
CONNECT. He was able to use piping in one scenario to reduce his execution time to less than half!
Slide presentation available in PDF
Developing Client/Server Applications to Maximize SAS 9 Parallel Capabilities (presented at SUGI28)
This paper discusses parallel processes and parallel threads and how they can be used together to create scalable
client/server applications. Three application scenarios are discussed in detail. The first scenario uses MP CONNECT in a
grid environment. The second scenario combines MP CONNECT with the threaded I/O in the new SPD Engine. The final scenario
combines MP CONNECT with threaded PROC SUMMARY. Each application is implemented with an iterative approach to maximize
scalability and minimize total elapsed execution time.
Available in PDF. The presentation for this paper is also available as PowerPoint.
Up and Out: Where We're Going with Scalability in SAS Version 9
To achieve scalability, you must have an application with scalable characteristics, hardware capable of providing
scalability, and a software solution that is capable of leveraging the available hardware. SAS 9 is a complete and flexible
solution to software scalability. This paper introduces several concepts related to software scalability and how SAS 9
addresses scalability in many of the SAS products and servers.
Available in PDF
"What's under the hood?" V9 Performance and Scalability
Scaling SAS Data Access to Oracle RDBMS
In SAS 9, several of the SAS/ACCESS products incorporated threading in order to provide parallel access to DBMS data when
available. The focus of this paper is on the SAS/ACCESS to Oracle product and the options that can be used to parallelize
access to data stored in an Oracle DBMS. In addition, key concepts of Oracle 9i that pertain to scalability are described.
Available in PDF. The presentation for this paper is also available as PowerPoint.
Version 9: Scaling the Future
This paper details some of the enhancements being made to PROC SORT and PROC SUMMARY to leverage SMP hardware and provide
better performance.
Available in PDF
An Inside Look at Version 9 and Release 9.1 Threaded Base SAS Procedures (presented at SUGI28)
Several key Base SAS procedures have incorporated threading to achieve significant performance improvements on SMP
architectures. This paper looks at how threading has been used to enhance scalability in the following SAS procedures: SORT,
SUMMARY, MEANS, TABULATE, REPORT, and SQL.
Available in PDF
SAS Meets Big Iron: High Performance Computing in SAS Analytic Procedures
This paper discusses enhancements to several SAS/STAT and Enterprise Miner procedures. These enhancements are aimed at
exploiting multiprocessor hardware and improving performance.
Available in PDF
Parallelization in Action with SAS Analytics Software (presented at SUGI28)
This presentation details the scalability that is now possible with several of the analytic procedures as a result of
parallel threading. Graphical results are presented for the following procedures: GLM, REG, DMREG, and LOESS.
Available in PDF
V9 OLAP: An Architectural Overview
This paper gives details about the new Version 9 OLAP Server architecture. Details of the architecture will show how the
OLAP server scales for multi-user access and how it can be tuned to provide the best performance for your applications.
Available in PDF
OLAP Server: Focusing on Performance (presented at SUGI28)
SAS OLAP server performance is composed of two primary metrics: query response time and number of transactions processed per
second (i.e., throughput). This paper covers several factors to consider in achieving the best performance from your OLAP
server, including hardware, cube design and implementation, and query formulation. Several tuning options are also presented
that can aid in improving performance.
Available in PDF
Designing V9 OLAP Structures for Optimum Performance and Scalability (presented at SUGI28)
This presentation covers the new 9.1 considerations for designing OLAP structures to maximize performance and scalability.
Several new tuning options are also available. In addition, the Application Response Measurement (ARM) tool can be used to
monitor and diagnose any performance problems.
Available in PowerPoint
SPD Server 4.1: Scalability Solution for SAS; Turning Big Data Into Business and Analytic Intelligence
This paper provides an overview of the new features and enhancements of SPD Server 4.0. These features can be used to
optimize performance for the SAS Enterprise Marketing Automation Solution.
Available in PDF
Scalability Solution for SAS Dynamic Cluster Tables: SAS Scalable Performance Data Server 4.3 and Later
This paper provides an overview of dynamic cluster tables in SAS Scalable Performance Data Server 4.3 as well as enhancements that have been included in later releases. Dynamic cluster tables enable both the partitioning of data based on
criteria in the data and parallel loading of the cluster tables.
Available in PDF
SAS Scalable Performance Data Server 4.3 tsm1: Parallel Join with Enhanced Group By
SAS SPD Server 4.3 TSM1 provides a new facility that executes SQL joins in parallel, reducing the total time required to
complete a query. This paper discusses the coverage, restrictions, tuning and performance benefits of the Parallel Join
Facility and the enhanced Group By feature.
Available in PDF
Topics Pertaining to SAS Version 8 and Later
Multiprocessing with Version 8 of the SAS System
This paper introduces MP CONNECT technology and how it can be used to run portions of your applications in parallel to
reduce the total elapsed time required to complete your job.
Available in PDF
The %Distribute System for Large-Scale Parallel Computation in the SAS System
This paper describes how to use MP CONNECT and the SAS macro facility to accomplish grid or high-performance computing. A
"divide and conquer" approach was taken to leverage the processing power of a variety of machines, using them in
parallel to dramatically reduce the processing time of a Monte Carlo simulation. You can also download
this ZIP file, which contains the SAS file used in the Monte Carlo
simulation discussed in this paper.
Available in PDF
Save Time Today Using SAS Views
This paper emphasizes the value of using the established SAS View technology to reduce execution time when processing large
amounts of data. Both SQL views and SAS DATA step views can not only save you time but also reduce disk space
requirements.
Available in PDF
Topics From Our Alliance Partners
ARMing SAS Jobs On HP-UX 11.x
This paper describes the steps necessary to ARM a SAS job and then monitor that job using HP GlancePlus. This document gives
a brief introduction to the ARM API and then outlines two different methods of ARMing a SAS job. Finally, it
describes how to view the ARM data with HP GlancePlus.
Available in PDF
Performance Tuning Guide for SAS Users and Sun System Administrators
This paper provides a basic understanding of how to analyze and apply tuning changes to SAS software running on the Sun
Solaris and Sun UltraSPARC hardware platform. It is based on Sun Solaris versions 8 and 9 along with Base SAS versions 8.2,
9.0, and 9.1. The information in this paper will help you plan for a new installation of SAS running on Sun. In addition,
tools and methods will be given to help identify and resolve any issues with an installed SAS System on Sun.
Available in PDF
Configuring EMC CLARiiON for SAS Business Intelligence Platform (June 2007)
This white paper covers guidelines for configuring and deploying EMC CLARiiON storage systems in typical environments deploying SAS
Business Intelligence data analysis applications. Deployments vary in how data is accessed and modified, so no single formula is
guaranteed to work for all environments. The goal of this paper is to explain how certain configuration choices in the CLARiiON affect
different I/O patterns directed against the system. By understanding the actual dominant I/O pattern in a specific SAS deployment
environment, SAS system administrators will be able to collaborate with their storage administrators to properly configure the CLARiiON
system layout to the best advantage for their particular deployment.
Available in PDF
Tuning Guide for SAS9 on AIX 5L (April 2006)
In this paper, we first provide a brief overview of SAS®9 and typical performance scenarios. Then, we focus on general best
practice suggestions and performance settings for tuning the AIX 5L Version 5.3 operating system for enhanced SAS®9 performance on
POWER5 processor-based servers. The tuning of the disk IO subsystem and SAS application will be outlined briefly. This paper
will also discuss general performance monitoring methodology and performance tools.
Available in PDF
IBM TotalStorage DS4000 Storage Considerations for SAS 9 on the IBM eServer p590 (September 2005)
This document presents the storage findings and recommendations for SAS 9.1 with IBM TotalStorage DS4000 disk arrays (formerly known as FAStT).
In addition to the DS4400, most of the storage findings and layout recommendations can be applied toward any of the members of the IBM TotalStorage
DS4000 family, including the DS4100, DS4300, DS4400, DS4500, and DS4800. Storage testing focused on file system configuration, host bus adapter quantity,
and disk array placement. In addition, the team tested the DS4000 Storage Management physical disk array placement software algorithm and completed
the TotalStorage Proven certification testing for SAS 9 with the stated IBM hardware.
Available in PDF
Taking SAS to the Enterprise: Kernel Configuration Guidelines for SAS 9 on HP-UX (July 2004)
The intended audience for this paper is the experienced HP-UX system administrator who is seeking information on the SAS-specific required and recommended
system configuration guidelines for an HP-UX server.
Available in PDF
SAS Parallel Scoring Optimization
As data proliferates, organizations are taking advantage of data mining techniques to develop tactical and strategic insight
into these vast data stores. Read how SAS parallel scoring can support an enterprise-class data mining operation.
Available in PDF
We at SAS have created the Scalability Community to make you aware of the connectivity and scalability features and
enhancements that you can leverage for your SAS installation. The success of this community depends on you. Send electronic
mail to scalability@sas.com with your comments, requirements, and suggestions.