
Scalability and Performance Papers

The following papers have been written by SAS Institute R&D staff, SAS customers, and SAS alliance partners to provide detailed information about many different areas of scalability. In addition to the papers provided here, you may want to browse the Service and Support Library for other papers available from SAS.

Topics Independent of SAS Version

A Survey of Shared File Systems: Determining the Best Choice for your Distributed Applications (a SAS White Paper)
A shared file system is a required and integral component of all SAS® Grid Manager deployments, Enterprise Business Intelligence deployments with load-balanced servers on multiple systems, and other types of distributed SAS applications. This paper examines the behavior of several different shared file systems in the context of performance of a representative deployment.    Available in PDF

Best Practices for Data Sharing in a Grid Distributed SAS® Environment (a SAS White Paper)
Storage performance is the most critical component of implementing SAS in a distributed grid environment. This paper provides an introduction to basic storage terminology and concerns. It also describes the best practices used during successful testing with SAS and several clustered file systems. This paper can be used as a reference guide when configuring a distributed environment that will perform and scale to meet the needs of your organization.    Available in PDF

A Practical Approach to Solving Performance Problems with the SAS System (updated December 2007)
This paper presents some of the common causes of performance problems and provides a systematic approach for solving these problems and improving performance.   Available in PDF

Solving SAS Performance Problems: Employing Host Based Tools (presented at SUGI 31)
The SAS White Paper, "A Practical Approach to Solving Performance Problems with the SAS System," detailed the role of the FULLSTIMER option in diagnosing and solving performance problems. It introduced the usage of host-based performance monitors for further investigation. This paper continues with that approach, detailing the use of the most commonly available host-based performance monitors. It will discuss how to employ them in performance testing, interpret them with a SAS mindset, and reconcile them to FULLSTIMER output to determine problem causes.   Available in PDF

SAS Performance Monitoring - A Deeper Discussion (presented at SAS Global Forum 2008)
This SAS white paper continues on from "Solving SAS Performance Problems: Employing Host Based Tools". It extends that approach by describing the use of easy-to-use tools such as nmon and perfmon®, investigating the most common causes of performance problems, and suggesting what to look for first in monitors. The goal is to illustrate how you can use graphical monitoring tools such as nmon and perfmon in conjunction with SAS FULLSTIMER log output to determine problem causes.   Available in PDF

How to Maintain Happy SAS Users (presented at NESUG 2007)
One common thread runs through our work with the IT administration staff at SAS customer sites: the way to maintain happy SAS users is to ensure that the underlying hardware is properly configured to support the SAS applications. This is not a trivial task, because different SAS applications need the hardware configured differently, and how well you understand how SAS will be used determines how well you can evaluate options for the hardware, operating system, and infrastructure (mid-tier) configuration. This is easier for existing SAS customers and more difficult for new SAS customers or for new SAS applications at an existing SAS customer site.    Available in PDF

Topics Pertaining to SAS 9

SAS® Grid 101: How It Can Modernize Your Existing SAS Environment (presented at SAS Global Forum 2008)
Grid computing promises many benefits, including improved performance of applications, higher resource utilization, lower cost of ownership, and flexibility for your IT infrastructure. This paper describes many of the business issues that can be addressed by SAS Grid Computing and provides code examples of how to implement SAS applications on the grid. Learn how you can use SAS Grid Computing to modernize your existing SAS® environment and add new value to your existing applications with little or no change.    Available in PDF. Presentation available in PPS.

Data Integration in a Grid-Enabled Environment (presented at SAS Global Forum 2008)
SAS® Data Integration Studio and SAS® Grid Manager add capabilities to the SAS® product suite to distribute workloads across a grid of computers and thereby allow large processes to complete more quickly than previously possible. SAS Grid Manager has been incorporated into SAS Data Integration Studio to facilitate using grid resources for any long-running task that can be processed in parallel to another task. This paper discusses typical data integration workloads, how to scale them on typical grid computing hardware, and the new capability to load balance multiple data integration tasks across grid resources.    Available in PDF. Presentation available in PPS.

Introducing the SAS® Code Analyzer (presented at SAS Global Forum 2008)
This paper introduces the PROC SCAPROC procedure, the SAS Code Analyzer that is new in Release 9.2 of Base SAS® Software. We will examine the advantages of using the procedure, its syntax and phases of execution, and the output that the procedure can produce. This procedure greatly facilitates grid enabling your existing SAS programs.    Available in PDF. Presentation available in PPS.

Balancing the Load - SAS® Server Technologies for Scalability (presented at SAS Global Forum 2008)
This paper will address a variety of SAS® servers and how they can be used to balance workload and work together to provide scalability in a SAS Enterprise deployment. We will discuss a variety of servers including stored process servers, workspace servers, data step batch servers, and grid servers. We also will discuss the options for using these servers to balance load and provide solutions that can leverage a scale-out architecture.   Available in PDF. Presentation available in PPS.

Architecting a Finely Tuned SAS® Grid Solution (presented at SAS Global Forum 2008)
SAS Grid Computing is a scale-out SAS solution that enables SAS applications to better utilize computing resources. When architecting a SAS Grid Computing solution, it is important to understand the components required to ensure a scalable and highly optimized solution. This paper details some of the components necessary to architect and tune a SAS Grid Computing solution.   Available in PDF

ETL Performance Tuning Tips (2007)
This paper provides important ETL performance, tuning, and capacity information for SAS®9 and SAS Data Integration Studio. Topics include how to analyze and debug flows, gain efficiency quickly, set up the system environment, and, when necessary, customize advanced performance techniques. Many best practice recommendations are presented that will help you to customize and enhance the performance of new and existing ETL processes.    Available in PDF

Best Practices for Configuring IO Subsystems for SAS9 Applications (presented at SAS Global Forum 2007)
The increased power of SAS®9 applications allows information and knowledge creation from very large amounts of data. Analysis that used to consist of 10s-100s of gigabytes (GBs) of supporting data has rapidly grown into the 10s of terabytes (TBs). This data expansion has resulted in more and larger SAS data stores. Setting up file systems to support these large volumes of data, as well as ensuring adequate storage space for the SAS® temporary files can be very challenging. This paper will present some best practices for configuring the IO subsystem for your SAS9 applications, ensuring adequate capacity, bandwidth, and performance to keep your SAS9 users moving.    Available in PDF

Ensuring you have the Proper Resources for your SAS9 Applications (presented at SAS Global Forum 2007)
Several previous SUGI papers address performance-problem troubleshooting with SAS®9 applications. This paper contains information to help you manage and monitor your computer environment to ensure that you have adequate resources to support your SAS9 applications. The paper briefly describes the infrastructure that is needed for various SAS applications (from simple SAS jobs to the most complicated SAS9 Enterprise BI applications). It then shows how to identify which system infrastructure resource areas are most under pressure and how to continually monitor them by using simple operating-system tools and more complex third-party monitoring applications. The monitoring and resulting resource management advice will help ensure that you can meet the demands of your SAS users.    Available in PDF

SAS Goes Grid - Managing the Workload Across Your Enterprise (presented at SUGI31)
Learn how grid computing and scheduling have been incorporated and automated to deliver value in a highly efficient manner for SAS analytics, data integration (ETL), data mining and business intelligence. Also learn about advanced configuration options that you can use to fine-tune your SAS grid environment and allow multiple applications to efficiently and dynamically use a virtual IT infrastructure.   Available in PDF

SAS and Grid Computing - Maximize Efficiency, Lower Total Cost of Ownership (presented at SUGI29)
Grid computing is about leveraging your available resources and idle processor cycles to more quickly solve a problem while at the same time maximizing efficiency and reducing your total cost of ownership. This paper will discuss how SAS works in a grid, the types of applications that are well suited to grid computing and success stories using SAS in a grid.   Available in PDF

Developing Client/Server Applications to Maximize SAS 9 Parallel Capabilities (presented at SUGI28)
This paper discusses parallel processes and parallel threads and how they can be used together to create scalable client/server applications. Three application scenarios are discussed in detail. The first scenario uses MP CONNECT in a grid environment. The second scenario combines MP CONNECT with the threaded I/O in the new SPD Engine. The final scenario combines MP CONNECT with threaded PROC SUMMARY. Each application is implemented with an iterative approach to maximize scalability and minimize total elapsed execution time.   Available in PDF. The presentation for this paper is also available as PowerPoint

Up and Out: Where We're Going with Scalability in SAS Version 9
In order to achieve scalability you must have an application with scalable characteristics, hardware capable of providing scalability, and a software solution that is capable of leveraging the available hardware. SAS 9 is a complete and flexible solution to software scalability. This paper introduces several concepts related to software scalability and how SAS 9 addresses scalability in many of the SAS products and servers.   Available in PDF

Scaling SAS Data Access to Oracle RDBMS
In SAS 9, several of the SAS/ACCESS products incorporated threading in order to provide parallel access to DBMS data when available. The focus of this paper is on the SAS/ACCESS to Oracle product and the options that can be used to parallelize access to data stored in an Oracle DBMS. In addition, key concepts of Oracle 9i that pertain to scalability are described.   Available in PDF. The presentation for this paper is also available as PowerPoint

Version 9: Scaling the Future
This paper details some of the enhancements being made to PROC SORT and PROC SUMMARY to leverage SMP hardware and provide better performance.  Available in PDF

An Inside Look at Version 9 and Release 9.1 Threaded Base SAS Procedures (presented at SUGI28)
Several key Base SAS procedures have incorporated threading to achieve significant performance improvements on SMP architectures. This paper looks at how threading has been used to enhance scalability in the following SAS procedures: SORT, SUMMARY, MEANS, TABULATE, REPORT, and SQL.   Available in PDF

SAS Meets Big Iron: High Performance Computing in SAS Analytic Procedures
This paper discusses enhancements to several SAS/STAT and Enterprise Miner procedures. These enhancements are aimed at exploiting multiprocessor hardware and improving performance.  Available in PDF

Parallelization in Action with SAS Analytics Software (presented at SUGI28)
This presentation details the scalability that is now possible with several of the analytic procedures as a result of parallel threading. Graphical results are presented for the following procedures: GLM, REG, DMREG, and LOESS.   Available in PDF

V9 OLAP: An Architectural Overview
This paper gives details about the new Version 9 OLAP Server architecture. Details of the architecture will show how the OLAP server scales for multi-user access and how it can be tuned to provide the best performance for your applications.   Available in PDF

OLAP Server: Focusing on Performance (presented at SUGI28)
SAS OLAP server performance is composed of two primary metrics: query response time and number of transactions processed per second (i.e., throughput). This paper covers several factors to consider in achieving the best performance from your OLAP server, including hardware, cube design and implementation, and query formulation. Several tuning options are also presented that can aid in improving performance.   Available in PDF

Designing V9 OLAP Structures for Optimum Performance and Scalability (presented at SUGI28)
This presentation covers the new 9.1 considerations for designing OLAP structures to maximize performance and scalability. Several new tuning options are also available. In addition, the Application Response Measurement (ARM) tool can be used to monitor and diagnose any performance problems.   PowerPoint

SPD Server 4.1: Scalability Solution for SAS; Turning Big Data Into Business and Analytic Intelligence
This paper provides an overview of the new features and enhancements of SPD Server 4.0. These features can be used to optimize performance for the SAS Enterprise Marketing Automation Solution.   Available in PDF

Scalability Solution for SAS Dynamic Cluster Tables: SAS Scalable Performance Data Server 4.3 and Later
This paper provides an overview of dynamic cluster tables in SAS Scalable Performance Data Server 4.3 as well as enhancements that have been included in later releases. Dynamic cluster tables enable both the partitioning of data based on criteria in the data and parallel loading of the cluster tables.   Available in PDF

SAS Scalable Performance Data Server 4.3 TSM1: Parallel Join with Enhanced Group By
SAS SPD Server 4.3 TSM1 provides a new facility that executes SQL joins in parallel, reducing the total time required to complete a query. This paper discusses the coverage, restrictions, tuning, and performance benefits of the Parallel Join Facility and the enhanced Group By feature.   Available in PDF

Topics Pertaining to SAS Version 8 and Later

Multiprocessing with Version 8 of the SAS System
This paper introduces MP CONNECT technology and how it can be used to run portions of your applications in parallel to reduce the total elapsed time required to complete your job.  Available in PDF

The %Distribute System for Large-Scale Parallel Computation in the SAS System
This paper describes how to use MP CONNECT and the SAS macro facility to accomplish grid or high performance computing. A "divide and conquer" approach was taken to leverage the processing power of a variety of machines, using them in parallel to dramatically reduce the processing time of a Monte Carlo simulation. You can also download this ZIP file, which contains the SAS file used in the Monte Carlo simulation discussed in this paper.   Available in PDF

Save Time Today Using SAS Views
This paper emphasizes the value of using the established SAS View technology to reduce execution time when processing large amounts of data. Both SQL views and SAS data step views can be used to not only save you time but also reduce disk space requirements.   Available in PDF

Topics From Our Alliance Partners

Configuring EMC CLARiiON for SAS Business Intelligence Platform (June 2007)
This white paper covers guidelines for configuring and deploying EMC CLARiiON storage systems in typical environments deploying SAS Business Intelligence data analysis applications. Deployments vary in how data is accessed and modified, so no single formula is guaranteed to work for all environments. The goal of this paper is to explain how certain configuration choices in the CLARiiON affect different I/O patterns directed against the system. By understanding the actual dominant I/O pattern in a specific SAS deployment environment, SAS system administrators will be able to collaborate with their storage administrators to properly configure the CLARiiON system layout to the best advantage for their particular deployment.    Available in PDF

Tuning Guide for SAS9 on AIX 5L (April 2006)
In this paper, we first provide a brief overview of SAS®9 and typical performance scenarios. Then, we focus on general best practice suggestions and performance settings for tuning the AIX 5L Version 5.3 operating system for enhanced SAS®9 performance on POWER5 processor-based servers. The tuning of the disk IO subsystem and SAS application will be outlined briefly. This paper will also discuss general performance monitoring methodology and performance tools.   Available in PDF

IBM TotalStorage DS4000 Storage Considerations for SAS 9 on the IBM eServer p590 (September 2005)
This document presents the storage findings and recommendations for SAS 9.1 with IBM TotalStorage DS4000 disk arrays (formerly known as FAStT). In addition to the DS4400, most of the storage findings and layout recommendations can be applied toward any of the members of the IBM TotalStorage DS4000 family, including the DS4100, DS4300, DS4400, DS4500, and DS4800. Storage testing focused on file system configuration, host bus adapter quantity, and disk array placement. In addition, the team tested the DS4000 Storage Management physical disk array placement software algorithm and completed the TotalStorage Proven certification testing for SAS 9 with the stated IBM hardware.   Available in PDF

Taking SAS to the Enterprise: Kernel Configuration Guidelines for SAS 9 on HP-UX (July 2004)
The intended audience for this paper is the experienced HP-UX system administrator who is seeking information on the SAS-specific required and recommended system configuration guidelines for an HP-UX server.   Available in PDF

SAS Parallel Scoring Optimization
As data proliferates, organizations are taking advantage of data mining techniques to develop tactical and strategic insight into these vast data stores. Read how SAS parallel scoring can support an enterprise-class data mining operation.   Available in PDF


We at SAS have created the Scalability Community to make you aware of the connectivity and scalability features and enhancements that you can leverage for your SAS installation. The success of this community depends on you. Send electronic mail to scalability@sas.com with your comments, requirements, and suggestions.