Scalability and Performance Papers

The following papers were written by SAS Institute R&D staff to provide detailed information about many different areas of scalability. In addition to the papers provided here, you may want to browse the Service and Support Library for other papers available from SAS.

Topics Independent of SAS Version

Best Practices for Data Sharing in a Grid Distributed SAS® Environment (a SAS White Paper)
Storage performance is the most critical component of implementing SAS in a distributed grid environment. This paper provides an introduction to basic storage terminology and concerns. It also describes the best practices used during successful testing with SAS and several clustered file systems. This paper can be used as a reference guide when configuring a distributed environment that will perform and scale to meet the needs of your organization.    Available in PDF

A Practical Approach to Solving Performance Problems with the SAS System
This paper presents some of the common causes of performance problems and provides a systematic approach for solving these problems and improving performance.   Available in PDF

Solving SAS Performance Problems: Employing Host Based Tools
The SAS White Paper, "A Practical Approach to Solving Performance Problems with the SAS System," detailed the role of the FULLSTIMER option in diagnosing and solving performance problems. It introduced the usage of host-based performance monitors for further investigation. This paper continues with that approach, detailing the use of the most commonly available host-based performance monitors. It will discuss how to employ them in performance testing, interpret them with a SAS mindset, and reconcile them to FULLSTIMER output to determine problem causes.   Available in PDF

Topics Pertaining to SAS 9

SAS® Grid 101: How It Can Modernize Your Existing SAS Environment (presented at SAS Global Forum 2008)
Grid computing promises many benefits, including improved performance of applications, higher resource utilization, lower cost of ownership, and flexibility for your IT infrastructure. This paper describes many of the business issues that can be addressed by SAS Grid Computing and provides code examples of how to implement SAS applications on the grid. Learn how you can use SAS Grid Computing to modernize your existing SAS® environment and add new value to your existing applications with little or no change.    Available in PDF. Presentation available in PPS.

Data Integration in a Grid-Enabled Environment (presented at SAS Global Forum 2008)
SAS® Data Integration Studio and SAS® Grid Manager add capabilities to the SAS® product suite to distribute workloads across a grid of computers and thereby allow large processes to complete more quickly than previously possible. SAS Grid Manager has been incorporated into SAS Data Integration Studio to facilitate using grid resources for any long-running task that can be processed in parallel to another task. This paper discusses typical data integration workloads, how to scale them on typical grid computing hardware, and the new capability to load balance multiple data integration tasks across grid resources.    Available in PDF. Presentation available in PPS.

Introducing the SAS® Code Analyzer (presented at SAS Global Forum 2008)
This paper introduces PROC SCAPROC, the SAS Code Analyzer, which is new in Release 9.2 of Base SAS® Software. We will examine the advantages of using the procedure, its syntax and phases of execution, and the output that the procedure can produce. This procedure greatly facilitates grid enabling your existing SAS programs.    Available in PDF. Presentation available in PPS.

Balancing the Load - SAS® Server Technologies for Scalability (presented at SAS Global Forum 2008)
This paper will address a variety of SAS® servers and how they can be used to balance workload and work together to provide scalability in a SAS Enterprise deployment. We will discuss a variety of servers including stored process servers, workspace servers, data step batch servers, and grid servers. We also will discuss the options for using these servers to balance load and provide solutions that can leverage a scale-out architecture.   Available in PDF. Presentation available in PPS.

Architecting a Finely Tuned SAS® Grid Solution (presented at SAS Global Forum 2008)
SAS Grid Computing is a scale-out SAS solution that enables SAS applications to better utilize computing resources. When architecting a SAS Grid Computing solution, it is important to understand the components required to ensure a scalable and highly optimized solution. This paper details some of the components necessary to architect and tune a SAS Grid Computing solution.   Available in PDF

SAS Goes Grid - Managing the Workload Across Your Enterprise (presented at SUGI31)
Learn how grid computing and scheduling have been incorporated and automated to deliver value in a highly efficient manner for SAS analytics, data integration (ETL), data mining and business intelligence. Also learn about advanced configuration options that you can use to fine-tune your SAS grid environment and allow multiple applications to efficiently and dynamically use a virtual IT infrastructure.   Available in PDF

SAS and Grid Computing - Maximize Efficiency, Lower Total Cost of Ownership (presented at SUGI29)
Grid computing is about leveraging your available resources and idle processor cycles to more quickly solve a problem while at the same time maximizing efficiency and reducing your total cost of ownership. This paper will discuss how SAS works in a grid, the types of applications that are well suited to grid computing and success stories using SAS in a grid.   Available in PDF

***CUSTOMER PAPER: Performance Enhancements SAS 9.0 HP-UX 11.00 64 bit
At a recent user group conference Charles Pollack of Suncorp Metway presented performance and scalability improvements that he experienced with SAS 9.0. Charles details the improvements he got by running his V8 SAS jobs with SAS 9.0 and leveraging some of the new scalable features including: PROC SORT, PROC SUMMARY, the new SPDE engine, and the new piping feature of MP CONNECT. He was able to use piping in one scenario to reduce his execution time to less than half!   Slide presentation available in PDF

Developing Client/Server Applications to Maximize SAS 9 Parallel Capabilities (presented at SUGI28)
This paper discusses parallel processes and parallel threads and how they can be used together to create scalable client/server applications. Three application scenarios are discussed in detail. The first scenario uses MP CONNECT in a grid environment. The second scenario combines MP CONNECT with the threaded I/O in the new SPD Engine. The final scenario combines MP CONNECT with threaded PROC SUMMARY. Each application is implemented with an iterative approach to maximize scalability and minimize total elapsed execution time.   Available in PDF. The presentation for this paper is also available as PowerPoint.

Up and Out: Where We're Going with Scalability in SAS Version 9
In order to achieve scalability you must have an application with scalable characteristics, hardware capable of providing scalability, and a software solution that is capable of leveraging the available hardware. SAS 9 is a complete and flexible solution to software scalability. This paper introduces several concepts related to software scalability and how SAS 9 addresses scalability in many of the SAS products and servers.   Available in PDF

"What's under the hood?" V9 Performance and Scalability
This presentation gives an overview of the ways that SAS is addressing performance through scalability in SAS V9.    Slide presentation available in PDF

Scalable Access to SAS Data
This paper introduces the new Scalable Performance Data Engine (SPDE) available in Version 9. The SPDE brings support for parallel access to partitioned data into Base SAS Software.   Available in PDF

Scaling SAS Data Access to Oracle RDBMS
In SAS 9, several of the SAS/ACCESS products incorporated threading in order to provide parallel access to DBMS data when available. The focus of this paper is on the SAS/ACCESS to Oracle product and the options that can be used to parallelize access to data stored in an Oracle DBMS. In addition, key concepts of Oracle 9i that pertain to scalability are described.   Available in PDF. The presentation for this paper is also available as PowerPoint.

Version 9: Scaling the Future
This paper details some of the enhancements being made to PROC SORT and PROC SUMMARY to leverage SMP hardware and provide better performance.  Available in PDF

An Inside Look at Version 9 and Release 9.1 Threaded Base SAS Procedures (presented at SUGI28)
Several key Base SAS procedures have incorporated threading to achieve significant performance improvements on SMP architectures. This paper looks at how threading has been used to enhance scalability in the following SAS procedures: SORT, SUMMARY, MEANS, TABULATE, REPORT, and SQL.   Available in PDF

SAS Meets Big Iron: High Performance Computing in SAS Analytic Procedures
This paper discusses enhancements to several SAS/STAT and Enterprise Miner procedures. These enhancements are aimed at exploiting multiprocessor hardware and improving performance.  Available in PDF

Parallelization in Action with SAS Analytics Software (presented at SUGI28)
This presentation details the scalability that is now possible with several of the analytic procedures as a result of parallel threading. Graphical results are presented for the following procedures: GLM, REG, DMREG, and LOESS.   PowerPoint

V9 OLAP: An Architectural Overview
This paper gives details about the new Version 9 OLAP Server architecture. Details of the architecture will show how the OLAP server scales for multi-user access and how it can be tuned to provide the best performance for your applications.   Available in PDF

OLAP Server: Focusing on Performance (presented at SUGI28)
SAS OLAP server performance is measured by two primary metrics: query response time and number of transactions processed per second (i.e., throughput). This paper covers several factors to consider in achieving the best performance from your OLAP server, including hardware, cube design and implementation, and query formulation. Several tuning options are also presented that can aid in improving performance.   Available in PDF

Designing V9 OLAP Structures for Optimum Performance and Scalability (presented at SUGI28)
This presentation covers the new 9.1 considerations for designing OLAP structures to maximize performance and scalability. Several new tuning options are also available. In addition, the Application Response Measurement (ARM) tool can be used to monitor and diagnose any performance problems.   PowerPoint

SPD Server 4.1: Scalability Solution for SAS; Turning Big Data Into Business and Analytic Intelligence
This paper provides an overview of the new features and enhancements of SPD Server 4.0. These features can be used to optimize performance for the SAS Enterprise Marketing Automation Solution.   Available in PDF

Scalability Solution for SAS Dynamic Cluster Tables: SAS Scalable Performance Data Server 4.3 and Later
This paper provides an overview of dynamic cluster tables in SAS Scalable Performance Data Server 4.3 as well as enhancements that have been included in later releases. Dynamic cluster tables enable both the partitioning of data based on criteria in the data and parallel loading of the cluster tables.   Available in PDF

SAS Scalable Performance Data Server 4.3 TSM1: Parallel Join with Enhanced Group By
SAS SPD Server 4.3 TSM1 provides a new facility that executes SQL joins in parallel, reducing the total time required to complete a query. This paper discusses the coverage, restrictions, tuning, and performance benefits of the Parallel Join Facility and the enhanced Group By feature.   Available in PDF

Topics Pertaining to SAS Version 8 and Later

Multiprocessing with Version 8 of the SAS System
This paper introduces MP CONNECT technology and how it can be used to run portions of your applications in parallel to reduce the total elapsed time required to complete your job.  Available in PDF

The %Distribute System for Large-Scale Parallel Computation in the SAS System
This paper describes how to use MP CONNECT and the SAS macro facility to accomplish grid or high performance computing. A "divide and conquer" approach was taken to leverage the processing power of a variety of machines, using them in parallel to dramatically reduce the processing time of a Monte Carlo simulation. You can also download this ZIP file, which contains the SAS file used in the Monte Carlo simulation discussed in this paper.   Available in PDF

Save Time Today Using SAS Views
This paper emphasizes the value of using the established SAS View technology to reduce execution time when processing large amounts of data. Both SQL views and SAS data step views can be used to not only save you time but also reduce disk space requirements.   Available in PDF

Topics From Our Alliance Partners

ARMing SAS Jobs On HP-UX 11.x
This paper describes the steps necessary to ARM a SAS job and then monitor that job using HP GlancePlus. It gives a brief introduction to the ARM API, outlines two different methods of ARMing a SAS job, and then describes how to view the ARM data with HP GlancePlus.  Available in PDF

Performance Tuning Guide for SAS Users and Sun System Administrators
This paper provides a basic understanding of how to analyze and apply tuning changes to SAS software running on the Sun Solaris and Sun UltraSPARC hardware platform. It is based on Sun Solaris versions 8 and 9 along with Base SAS versions 8.2, 9.0, and 9.1. The information in this paper will help you plan for a new installation of SAS running on Sun. In addition, tools and methods will be given to help identify and resolve any issues with an installed SAS System on Sun.   Available in PDF

SAS Parallel Scoring Optimization
As data proliferates, organizations are taking advantage of data mining techniques to develop tactical and strategic insight into these vast data stores. Read how SAS parallel scoring can support an enterprise-class data mining operation.   Available in PDF

Tuning WebHound 4.0 and SAS 8.2 for Enterprise Windows Systems
This paper presents performance results of running WebHound 4.0 and SAS Version 8.2 on a Unisys ES7000 Server. WebHound 4.0 has incorporated MP CONNECT to provide scalability when running on an SMP server. This paper documents the recommended system and application configuration changes to achieve maximum performance and scalability for WebHound 4.0 when run on the ES7000.   Available in PDF

We at SAS have created the Scalability Community to make you aware of the connectivity and scalability features and enhancements that you can leverage for your SAS installation. The success of this community depends on you. Send electronic mail to scalability@sas.com with your comments, requirements, and suggestions.