SUPPORT / SAMPLES & SAS NOTES
 

Support

Sample 26178: Efficient Data Access using SAS Enterprise Guide

DetailsAboutRate It
Data access can be mysterious, like drawing water from the kitchen sink. You can see the water come out of the tap — and touch it and taste it, which is instantly gratifying. But do you really know where the water comes from? And do you know exactly how much water there is beyond that magical water faucet? Probably not — that’s something that most of us simply take for granted.

However, when you have a problem — water doesn’t flow, or it flows too slowly — then you really notice. And diagnosing plumbing issues is something that most people are uncomfortable with. (If you’re lucky, it’s just a kink in a hose somewhere.)

Data access works the same way. When your data flows freely into your SAS Enterprise Guide session, it seems like you can do nothing wrong. You can view it in the data grid, create queries, and run tasks. However, when data access points are not defined efficiently, that data flow can feel like you’re trying to suck an elephant through a straw. Everything seems to take so much longer to accomplish.

Crash course in data plumbing

Diagnosing data access issues can be easier than household plumbing chores. You simply need be able to answer three main questions:

Where does the data source originate? For example, is it in a database system such as Oracle, or is it in a text file on your server file system?

How large is the data source? SAS and SAS Enterprise Guide can deal with data sources that are millions or even billions of records large. However, understanding the data size is important to understanding the tradeoffs of various data configurations.

What route does the data travel to get to your SAS session? Because SAS needs to process your data for analysis and reporting, your data needs to travel the shortest distance possible from its point of origin into your SAS session. Even though you might use SAS Enterprise Guide to select your data source, what counts is how many “hops” the data must make to get to the SAS session where the real work occurs.

All this leads to the golden rule of efficient data access with SAS:

Define your data sources in terms of your SAS server. Use SAS libraries to connect to your data sources, and route all of your data access through those libraries.

Passing Niagara Falls through a garden hose

Because SAS Enterprise Guide makes it easy to get to data in many ways, you can easily violate this rule (inadvertently, of course). For example, you can choose File->Open->ODBC and select a data source defined relative to your local PC. However, when you use that data within your project, SAS Enterprise Guide realizes that this data is not presently accessible to your SAS session, so it attempts to perform the great favor of copying it for you.

WARNING: SAS Enterprise Guide is a great tool for many things, but it can be a bottleneck in the process of copying data. Copying data from an external source to your SAS session with SAS Enterprise Guide as the go-between is very inefficient. If the data is large, this operation can take several minutes (or longer!). In technical terms, this called “going around your elbow to get to your thumb.”

SAS/ACCESS: The plumber’s helper

Fortunately, it is easy to avoid moving all those data records through SAS Enterprise Guide: Simply define access to the external data source on the SAS server. SAS provides a set of data access products — SAS/ACCESS — to make this easy to do. A SAS/ACCESS module exists for just about every major database type in use today. For any that are missing, you can use SAS/ACCESS to ODBC, which can be like a universal pipe fitting to connect you to any data source.

The SAS/ACCESS products allow you to define the data sources in terms of SAS libraries. And after a data source is in a SAS library, your SAS programs can access it just as if it were a native SAS data set.

SAS libraries can be defined in your environment by a SAS administrator, or they can be defined as needed within your SAS Enterprise Guide project. You can find a step-by-step example of how to define such a library in SAS for Dummies, Chapter 15, "Setting it All Up."

Efficient data access with SAS Enterprise Guide -- a performance in three movements

For the visual learner out there, here is a bonus feature that you won't find in SAS for Dummies. It's a video demonstration of what happens when you access data using various techniques in SAS Enterprise Guide. Because it's difficult to "see" data flowing across the network into your desktop machine, the video represents your data using a prop that we can all understand: M&Ms.
http://www.youtube.com/watch?v=OSTa1EUpKT8


About the Author
Chris Hemedinger is a senior software manager in the Business Intelligence Clients division at SAS. Chris began his career at SAS in 1993 as a technical writer, creating hits such as SAS Companion for the OS/2 Environment (remember OS/2?) and SAS Companion for the Microsoft Windows Environment. In 1997 he became involved in a prototype project to make SAS easier to use for non-programmers -- that project evolved into the hugely popular SAS Enterprise Guide, a product that Chris has worked with ever since.

He co-authored SAS for Dummies with Stephen McDaniel. The book is available from the online bookstore.


These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.