SAS Scalability & Performance

Scalability is all about reducing the time-to-solution for your critical tasks.

Scalability can be approached from two directions or from both directions simultaneously:

Scale up: Fully utilize SMP hardware

Scale out: Fully utilize distributed processors

Scale up & out: Fully utilize the new scalability features in SAS 9 to combine the two scalability choices

It is important to note that scaling up and scaling out are not mutually exclusive choices. Hardware vendors have responded to the need for scalability by creating SMP (symmetric multiprocessing) machines that provide increased horsepower for solving large, CPU-intensive problems.

Scaling up, from a hardware perspective, means increasing the number of processors, disk drives, I/O channels, and so on in a single server machine.

Scaling out means adding more hardware, not bigger hardware.

Successfully scaled performance cannot be obtained simply by installing more or faster processors or I/O devices. Scalability involves choosing among investing in SMP hardware, upgrading I/O configurations, making use of networked machines, and reorganizing your data, and deciding how much you are willing to modify your application. Achieving true scalability is a balancing act: it requires scalable hardware together with software that is specifically designed to leverage it. The portion of the original problem that can actually be processed in parallel determines the amount of scalability achievable from the software solution.
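The last point, that the parallelizable portion of a problem limits the achievable scalability, is commonly expressed as Amdahl's law (a name the text above does not use). The following is a minimal Python sketch of that relationship, not a SAS feature; the function name `amdahl_speedup` and the 75% figure in the usage loop are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Upper bound on speedup when only part of a job can run in parallel.

    parallel_fraction: share of the original run time that can be parallelized (0..1).
    workers: number of processors or machines applied to that share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part always takes its full time; only the parallel part shrinks.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Hypothetical job in which 75% of the run time can be parallelized.
    for workers in (2, 4, 8, 16):
        print(workers, round(amdahl_speedup(0.75, workers), 2))
```

Under this model, a job that is 75% parallelizable can never run more than four times faster, no matter how many processors are added, which is why adding hardware alone does not guarantee scaled performance.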

Scalability can be accomplished by performing two or more tasks in parallel (independent parallelization) or by overlapping two or more tasks (pipeline parallelization). This requires two things:

  • that there is at least some portion(s) of your task that can be overlapped or performed in parallel and
  • that you have hardware that is capable of multiprocessing.
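The two approaches can be sketched in Python as follows. This is an illustration of the structure only, not of how SAS implements either one: `summarize`, `run_independent`, and `run_pipeline` are hypothetical names, and Python threads are used here for brevity even though they do not run CPU-bound work on several processors at once.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def summarize(chunk):
    """Hypothetical unit of work: total one chunk of values."""
    return sum(chunk)


def run_independent(chunks, workers=4):
    """Independent parallelization: each chunk is processed separately."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(summarize, chunks))  # results keep input order


def run_pipeline(chunks):
    """Pipeline parallelization: stage two starts before stage one finishes."""
    handoff = queue.Queue(maxsize=2)  # small buffer between the two stages
    results = []
    done = object()  # sentinel marking the end of the stream

    def stage_one():
        for chunk in chunks:
            handoff.put([value * 2 for value in chunk])  # e.g. a transform step
        handoff.put(done)

    def stage_two():
        while True:
            item = handoff.get()
            if item is done:
                break
            results.append(summarize(item))

    threads = [threading.Thread(target=stage_one),
               threading.Thread(target=stage_two)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

In the independent version the chunks have no dependency on one another, so they can be spread across however many workers are available. In the pipeline version the second stage consumes each transformed chunk as soon as the first stage produces it, so the two stages overlap in time.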

It is important to understand that not every application lends itself to scalability and not every hardware configuration is capable of providing scalability.

How can I decide whether an application should be scaled?

Determine if it takes "too long" to run.
This may mean that the time required to run a job exceeds the batch window that you have available, or that it takes "too long" for you to get the information from your application in order to make timely decisions.

Identify the pieces of the application that seem to consume the most time.
Then you can determine if these portions of your task are compute intensive or if they are I/O bound. This will help you to understand how scalable a particular task may be.
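One rough way to tell a compute intensive step from an I/O bound one is to compare the CPU time it consumes with the elapsed (wall-clock) time. The sketch below shows the idea in Python; `profile_step` is a hypothetical helper, and the 0.8 threshold is an arbitrary assumption rather than an established cutoff.

```python
import time


def profile_step(label, func, *args):
    """Run one step and compare its CPU time with its elapsed time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    ratio = cpu / wall if wall > 0 else 0.0
    # Assumption: a step that keeps the CPU busy for most of its elapsed
    # time is compute intensive; otherwise it is mostly waiting.
    kind = "compute intensive" if ratio >= 0.8 else "likely I/O bound or waiting"
    return {"step": label, "wall": wall, "cpu": cpu, "kind": kind, "result": result}


if __name__ == "__main__":
    # time.sleep stands in for a step that waits on disk or network.
    for report in (profile_step("wait", time.sleep, 0.2),
                   profile_step("sum", sum, range(2_000_000))):
        print(report["step"], round(report["wall"], 3), report["kind"])
```

Steps that spend most of their elapsed time on the CPU are candidates for more processors, while steps that mostly wait are more likely to benefit from additional I/O channels or a reorganized data layout.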

Hardware that is capable of multiprocessing includes symmetric multiprocessing (SMP) machines and multiple machines on a network, each containing a single processor. In addition to the number of processors, it is important to have multiple I/O channels. Multiple machines on a network provide these inherently. For an SMP machine, multiple I/O channels can be provided with RAID arrays that allow you to stripe, or spread, your data across multiple physical disks. Even for a single-threaded application, this can improve I/O performance because the operating system is able to read data from multiple drives simultaneously and synchronize the result for the application. For a threaded application, not only can the reads be done in parallel, but threads can be used to process the data in parallel as well.
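The striping idea can be illustrated with a small in-memory simulation. This is a sketch under stated assumptions, not a description of any RAID implementation: the "disks" are plain Python lists, the names `stripe` and `read_striped` are hypothetical, and blocks are assigned round-robin.

```python
from concurrent.futures import ThreadPoolExecutor


def stripe(data: bytes, disks: int, block_size: int):
    """Distribute fixed-size blocks across `disks` lists in round-robin order."""
    layout = [[] for _ in range(disks)]
    for offset in range(0, len(data), block_size):
        block_number = offset // block_size
        layout[block_number % disks].append(data[offset:offset + block_size])
    return layout


def read_striped(layout):
    """Read each simulated disk in its own thread, then interleave the blocks."""
    with ThreadPoolExecutor(max_workers=len(layout)) as pool:
        per_disk = list(pool.map(list, layout))  # stand-in for real device reads
    blocks_in_order = []
    longest = max(len(blocks) for blocks in per_disk)
    for row in range(longest):
        for blocks in per_disk:
            if row < len(blocks):
                blocks_in_order.append(blocks[row])
    return b"".join(blocks_in_order)


if __name__ == "__main__":
    layout = stripe(b"abcdefghij", disks=3, block_size=2)
    print(layout)
    print(read_striped(layout))
```

Because consecutive blocks live on different disks, a sequential read touches every disk in turn, which is what lets the reads proceed at the same time and be reassembled in order for the application.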
