Bookstore Questions?

Related Links

Stay Informed


linkedin sasCommunity blog Facebook YouTube

SAS® Books - SAS Tips

Have questions about these tips? Contact us at: saspress@sas.com or subscribe to SAS® Learning Report to be notified when books are available.

William Benjamin author photo

Exchanging Data between SAS® and Microsoft Excel: Tips and Techniques to Transfer and Manage Data More Efficiently
By William E. Benjamin, Jr.

Tip #1: Using PROC IMPORT to Read Data from a Subset of Cells in an Excel Worksheet.

When using the SAS IMPORT procedure to read data from an Excel file, the DBMS option XLSX (or XLS for *.xls files) provides an alternative method to reading a small group of cells from an Excel spreadsheet. However, this method does not always provide reliable variable names when GETNAMES=YES. GETNAMES=YES looks for variable names in the first row of input cells.

In this example taken from my book (Example 3.7, p. 44), the GETNAMES=NO is used to turn off the search for variable names in the Excel file. The RANGE='shoes$C2:F4'n command selects only 12 cells from Sheet "shoes" from the Excel file.

 Click here for the code

This enables the user to read part of an Excel spreadsheet.

Tip #2: Using Data Set Options to Process Date and Time Values

SAS can apply changes to data formatting when the data is being processed without the need to use SAS code to post process after input or pre-process before output. The following code from my book (Example 4.7, p. 62) processes dates, which are one of the most common values that are hard to process. SAS data set options provide these pre/post processing features as part of the SET or DATA input/output processing. See the "SET-UP" section below. The file has a product name, amount, and three dates with time segments attached.

The first code line assigns a LIBNAME xls_data so we can read the data from the sales_data.xlsx file created from the CSV data file. Two DATA steps read the same Excel file producing different results. The SAS data set Sales_data1 was created with default options so that all of the date fields are formatted as SAS DATE9. values.

In the second SAS data set, fields are changed on input using the SAS data set option SASDATEFMT as shown in the list below.

 Click here for the code

Field Name    Processing option applied
  • Product    Default-character string
  • Amount    Default-numbericcureency value
  • Date_Ordered     Default - DATE9. value
  • Date_Shipped     SASDATEFMT option converted Excel date/time to a TIME8. value
  • Date_Delivered     SASDATEFMT option converted Excel date/time to a DATETIME21.2 value

SAS and Excel systems recognize dates in their own way. Each uses different base dates for their starting point. The SAS base date is 01/01/1960 and Excel uses the Dublin Julian Day (12/30/1899) as the base date.

The data for the CSV file can be downloaded from my author page, it is in Chapter 4. Save data beyond row 7 as a *.csv file and load into Excel and save in an xls/xlsx file format.

LIBNAME xls_data EXCEL 'C:\My_excel_files\sales_data.xlsx';

DATA Sales_data1;
   SET xls_data.'Sales_data$'n;

DATA Sales_data2;
  SET xls_data.'Sales_data$'n (SASDATEFMT=(Date_Shipped=TIME8. Date_Delivered=DATETIME21.2));

This enables the user to convert formats on SET or DATA statements without extra SAS code.

Tip #3: The New ODS Destination EXCEL for writing Workbooks (Page 147)

There is a new entry into the ODS arena of data transfer to Excel. ExcelXP is a Tagset that has many features. It is delivered as source code within the SASHELP.Tmplmst tagset files. This can be modified or a new version can be downloaded at any time. But, there is something new on the horizon. It is called the EXCEL ODS Destination.

This will be a process included within SAS ODS features that is not available for updating by users. It was first shipped as an "Experimental" version with SAS Version 9.4 TS Level 1M1. The outputs of this ODS destination are native format Excel workbooks for the version 2007 format and above. As with all things SAS sends out in an "Experimental" mode, this ODS destination is likely to change before it is in a production mode.

 Click here for the code
ODS EXCEL FILE='c:\temp\Excel\shoes_native.xlsx';

   PROC PRINT DATA=sashelp.shoes;
   BY region;


Ron Cody author photo

An Introduction to SAS® University Edition
By Ron Cody

Tip #1: Using the COUNTC and CATS Functions Together

The CATS function concatenates all of its arguments after it strips leading and trailing blanks. The COUNTC function counts characters. Together, they can let you operate on several separate variables as if they were a single string. To demonstrate, here is a program to count the number of Y's and N's in a five question survey.

 Click here for the code
*Old way

Make an array of the questions, initialize counters, count the Y's and N's in a DO loop;;
data survey;
   input (Ques1-Ques5)($1.);
   array Ques[5];
   Num_Yes = 0;
   Num_No = 0;
   do i = 1 to 5;
      if upcase(Ques[i]) = 'Y' then Num_Yes + 1;
      else if upcase(Ques[i]) = 'N' then Num_No + 1;
   drop i;
; *New way
Use the CATS function to concatenate the 5 questions. Then, use the COUNTC function to count the number of Y's or N's in the string, using the ignore case modifier.; data survey;
   input (Ques1-Ques5)($1.);
   Num_Yes = countc(cats(of Ques1-Ques5),'y','i');
   Num_No = countc(cats(of Ques1-Ques5),'n','i');

Tip #2: Using CALL SORTN to Save Lots of Coding

The CALL SORTN routine sorts values WITHIN an observation. One way to use this call routine is:

CALL SORTN(of Var1-Var10);

After the call, the values of Var1 to Var10 will be in ascending order. Here is an example where you use this call routine to compute the mean of the highest 8 quiz scores out of a total of 10.

 Click here for the code

*Tip 2;

*Use CALL SORTN to compute the mean of the 8 highest scores

(out of 10);

data grade;

  input Score1-Score10;
  call sortn(of Score1-Score10);
  Mean_8 = mean(of Score3-Score10);
  /* alternative program
  call sortn(of Score10-Score1); *Descending sort;
  Mean_8 = mean(of Score1-Score8);
80 90 100 50 92 90 79 98 100 55

Tip #3: Cody's Law of SAS Programming

If you find yourself writing a very tedious SAS program--STOP. Take a walk, talk to a colleague, then ask yourself, "Does SAS have some tool to make this program less tedious?" Perhaps an ARRAY, or a MACRO, or a FUNCTION can help. Sometimes brute force programming is necessary, but very often, there is a SAS tool that will make for an easier and more elegant program.

Lora Delwiche author photo

Exercises and Projects for The Little SAS® Book, Fifth Edition
By Rebecca A. Ottesen, Lora Delwiche and Susan Slaughter

Tip #1: Commenting Blocks of Code

This tip I learned from fellow SAS Press author Alan Wilson at SAS Global Forum 2008 in San Antonio. It might be a bit overly dramatic to say that this tip changed my life, but that's not far from the truth! So, I am paying this tip forward. Thank you Alan Wilson!

To comment out a whole block of code, simply highlight the lines of code, hold down the control key, and press the forward slash ( /). SAS will take those lines of code and turn them into comments by adding a /* to the beginning of each line and an */ at the end of each line.

To convert the commented lines back to code, highlight the lines again, hold down the control and shift keys, and press the forward slash ( /). This works in both the SAS Windowing environment (Display Manager) and SAS Enterprise Guide.

If you are using SAS Studio as your programming interface, you comment the same way, but to uncomment, just hold down the control key and then press the forward slash.

Rebecca Ottesen author photo

Exercises and Projects for The Little SAS® Book, Fifth Edition
By Rebecca A. Ottesen, Lora Delwiche and Susan Slaughter

Tip #2: Grouping Quantitative Variables

My favorite tip to share with students and SAS users is how to use PROC FORMAT to group quantitative variables into categories. A format can be created with a VALUE statement that specifies the ranges relevant to the category groupings.

Then, this format can be applied with a FORMAT statement during an analysis to group the variable accordingly (Don't forget the CLASS statement when applicable!) You can also create categorical variables in the DATA step by applying the format in an assignment statement with a PUT function.

Susan Slaughter author photo

The Little SAS® Book, Fifth Edition
By Lora Delwiche and Susan Slaughter

Tip #3: Amazing SAS Resources

This is the best time ever to learn SAS! When I first encountered SAS, there were only two ways that I could get help. I could either ask another graduate student who might or might not know the answer, or I could go to the computer center and borrow the SAS manual. (There was only one.) Today it's totally different.

I am continually AMAZED by the resources that are available now—many for FREE.

Here are four resources that every new SAS user should know about:

1. SAS Studio

This is a wonderful new interface for SAS that runs in a browser and has both programming and point-and-click features. SAS Studio is free for students, professors, and independent learners. You can download the SAS University Edition to run SAS Studio on your own computer, or use SAS OnDemand for Academics via the Internet.

2. Online classes

Two of the most popular self-paced e-learning classes are available for free: SAS Programming 1: Essentials, and Statistics 1. These are real classes which in the past people paid hundreds of dollars to take.

3. Videos

You can access hundreds of SAS training videos, tutorials, and demos at support.sas.com/training/tutorial. Topics range from basic (What is SAS?) to advanced (SAS 9.4 Metadata Clustering).

4. Community of SAS users

If you encounter a problem, it is likely that someone else faced a similar situation and figured out how to solve it. On communities.sas.com you can post questions and get answers from SAS users and developers. On the site, www.lexjansen.com, you can find virtually every paper ever presented at a SAS users group conference. The site www.sasCommunity.org is a wiki-style compendium of all things SAS.

Peter Eberhardt author photo

The DS2 Procedure: SAS® Programming Methods at Work
By Peter Eberhardt

Tip #1: Merge without a Sort

As we all know, if you want to do a merge (not an SQL join), your data have to be sorted on a common key; otherwise, the merge will fail. Well, that is no longer true. Using PROC DS2, you can perform a DATA step type merge without first sorting the data. Check my author page for The DS2 Procedure: SAS Programming Methods at Work for the example data.

 Click here for the code
proc ds2;
data mrgDS2(overwrite=yes) ;
   method run();
     merge plants (in=inP)
       animals (in=inA);
   by common

Tip #2: Simply IF The DATA step provides the IFN() and IFC() functions to allow a direct assignment from an IF type test. PROC DS2 takes this a step forward with the IF expression; the IF expression combines the flexibility of the DATA step IF/THEN/ELSE with the direct assignment to a variable:

 Click here for the code
accept =          if citycode = '1' and Income > aIncome then  1
       else if citycode = '2' and Income > bIncome then 1
       else if citycode = '3' and Income > cIncome then 1
       else 0;

Tip #3: A Stitch in Time -- or, you need thread for that

Indy cars run on all cylinders; otherwise, they could not run as fast as possible. Why should your data access leave unused CPU cycles and have slower performance? With PROC DS2, you can create threaded data access to take advantage of all the CPU available. It can be easy to convert your data access to a threaded data access; here, we create a thread process to access the data, then spawn four threads to read the data:

 Click here for the code

/* create the thread */
proc ds2;
thread score /overwrite=yes;
   method run();
      DCL int accept i;
      set incomes;
      accept = if citycode = '1' and income > 100000 then 1
         else if citycode = '2' and income > 200000 then 1
         else if citycode = '3' and income > 120000 then 1
         else 0;
      if accept then output;

/* use the thread */
proc ds2;
data results /overwrite=yes;
   declare thread score score;
   method run();
      set from score threads=4;

Jane Eslinger author photo

The SAS® Programmer's PROC REPORT Handbook: Basic to Advanced Reporting Techniques
By Jane Eslinger

Tip #1: Using Bright Colors and Thick Borders When Changing Style Attributes

As you are creating and testing your PROC REPORT code that changes style attributes, use bright colors and thick border widths. This will help you easily spot what changed, if anything, between runs. Once you have finalized the look of your report, you can change all of the colors to the appropriate color, like black, and the line thickness to the default, like 1pt.

Tip #2: Formats are Your Friend

Leverage the power of formats because they are your friends. Formats are used inside of PROC REPORT to apply foreground and background colors, to insert images, and change the sort order. Using formats to apply style attributes also means that you only have to change the value in one place, which saves time if updates must be made to the code.

Tip #3: The Order of the Items on the COLUMN Statement Matters

PROC REPORT processes the data and creates the report based on the order of the report items on the COLUMN statement. It reads them from left to right. Be sure to pay attention to the order when using multiple report items inside of a compute block; it matters!

Mark Jordan author photo

Mastering the SAS® DS2 Procedure: Advanced Data Wrangling Techniques
by Mark Jordan
Anticipated publication date: May 2016

Tip #1: Warp Speed Data Processing with DS2

DS2 gives you the power to beat the clock! It increases your CPU time, but cuts your elapsed time in half.
 Click here for the code
"/* Make some fake data to test with: */
data t;
   call streaminit(123456);
   do id=1 to 1000000;

/* Traditional DATA step process */
data t1;
  array score[0:100];
  set t END=LAST;
  do i=LBOUND(SCORE) to hbound(score);
      Score[i]= (SQRT(((id * ru * rn) / (id + rn + ru))*ID))*
         (SQRT(((id * ru * rn) / (id + rn + ru))*ID));
   if last then put 'Data step processed ' count 'observations.';
   drop i count;

/* DS2 threaded processing */
proc ds2;
/* The thread program describes what to do to each row of data */
thread th2/overwrite=yes;
  dcl bigint count;
   drop count;
  vararray double score[0:100] score0-score100;
   method run();
      dcl int i;
      set t;
     do i=LBOUND(SCORE) to hbound(score);
         Score[i]= (SQRT(((id * ru * rn) / (id + rn + ru))*ID))*
         (SQRT(((id * ru * rn) / (id + rn + ru))*ID));
method term();
  /*Make each thread report how many obs processed*/
  put 'Thread' _threadid_ ' processed' count 'observations.';

/* The DATA program executes multiple threads */
data th4/overwrite=yes;
   dcl thread th2 t;
   method run();
  set from t threads=4;

After executing the DATA step program, the SAS log shows execution real time was 5.20 seconds vs. a CPU time of 5.11 seconds. This process is likely CPU bound. Next, we rewrite the process in DS2 leveraging threads. After executing the DS2 DATA step program, the SAS log shows execution real time was 2.46 seconds vs. a CPU time of 8.37 seconds.

Tip #2: Clean Messy Data with the COMPRESS Function

Discover the power of the COMPRESS function modifiers! Rather than explicitly listing all the stuff you don't want, you can describe character classes using modifiers instead.

 Click here for the code
      	"/* Make messy data for testing */
data MixedUp;
   infile datalines dlm='*' dsd truncover;
   input Name:$20. ph:$15. ht:$15. wt:$15.;
  LABEL Name='Student Name'
     Ph='Primary Phone Number'
M. Jordan*(123)555-1234*178cm*81 kg
O. Kenobi*123.555.9999*175cm*75 kg
D. Vader#%%!!*1235556666*82in*200 lbs

TITLE ""Messy Data"";
proc sql;
   select * from MixedUp;

/* Custom picture format for displaying numeric phone numbers */
proc format;
   picture phone (default=13) low-high='000)000-0000' (prefix='(');

/* Clean up the data with COMPRESS */
data CleanedUp1;
   LENGTH Name $ 20 Phone 8;
   LABEL Name='Student Name'
     Phone='Primary Phone Number';
   keep Name Phone ht wt;
   format Phone phone. ;
   set MixedUp;
   /* Use COMPRESS to keep only alpabetics */
   Name=Compress(Name,'. ','ka');
   /* Use COMPRESS to keep only digits */
   /* Convert result to numeric using INPUT */

TITLE ""Pretty Data"";
proc sql;
   select * from CleanedUp1;

The 'k' modifier means 'KEEP' instead of remove, 'a' stands for alphabetical characters,'d' for digits. The DATA step uses these modifiers to quickly clean up the messy values in the original data. A custom picture format for phone numbers makes the final presentation even prettier!

Tip #3: More to LIKE in DS2

In the traditional DATA step, the LIKE operator is only valid in a WHERE clause. But in DS2, you can use it anywhere, making processing of character data easier than ever.

 Click here for the code
"libname test ""!SASROOT\core\sashelp"";
proc ds2;
title 'S-Type Vehicles';
   dcl varchar(50) ThisCar;
   Keep ThisCar;
   method run();
     set test.cars;
   ThisCar=CATX(' - ', Make, Model, Type);
   if ThisCar like '% S' then output;

Sharon Torrence Jones author photo

A Recipe for Success Using SAS® University Edition: How to Plan Your First Analytics Project By Sharon Torrence Jones
Anticipated publication date: August 2016

Tip #1: Start with Good Ingredients

You can relate data analytics to a casserole because you can understand the simplicity yet the depth of a good casserole. We get it, see it, think it, share it. In programming and cooking, starting with good ingredients and a good plan will create an ease of execution because you know where to begin and end. You can ensure that you have good ingredients (data!) by asking specific questions in a Google forms survey, cleaning and checking your data for completion, researching the area of interest so questions are applicable, and taking the time to understand your data. If you start with good data, you'll end up with great results!

Tip #2: Start with a Good Plan

I think about data like I think about food. If you watch a cooking show, the host always follows a plan to create a dish. There is a beginning, middle, and end to your analytics project, just like with the recipe process. Here are some helpful tips to create an outline for your project:

  • set an objective
  • plan a schedule
  • research the coding concepts that you will need to complete your project.

Tip #3: Tell the Story

Tell the story of your data by using easy-to-understand visuals and keeping it simple! Find what will communicate the most important aspects of your project, and showcase them. Is it from PROC MEANS or PROC FREQ? People want to be able to relate, so your story will bring the abstract of data to life!

kirk paul lafler author photo

PROC SQL: Beyond the Basics Using SAS®, Second Edition
By Kirk Paul Lafler

Tip #1: Application of the _METHOD PROC SQL Option

"PROC SQL's _METHOD option provides a wealth of information about the processes performed during specific PROC SQL operations, including query analysis and evaluation, any algorithms selected by the optimizer, and/or during testing and debugging operations. The following syntax illustrates the use of the _METHOD option.

 Click here for the code
PROC SQL  _METHOD;  /* Produces codes related to the SQL query process */
   < SELECT Query ; >

Code       Description
SQXCRTA      Create table as Select.
SQXSLCT      Select statement or clause.
SQXJSL       Step loop join (Cartesian).
SQXJM       Merge join operation.
SQXJNDX       Index join operation.
SQXJHSH       Hash join operation.
SQXSORT       Sort operation.
SQXSRC      Source rows from table.
SQXFIL       Rows filtration.
SQXSUMG      Summary stats (aggregates) with GROUP BY clause.
SQXSUMN      Summary stats with no GROUP BY clause.
SQXFIL      Rows filtration.
SQXSUMG      Summary stats (aggregates) with GROUP BY clause.
SQXSUMN      Summary stats with no GROUP BY clause.

Tip #2: Displaying Additional SAS Log Messages with the MSGLEVEL=System Option

SAS users are able to control how much information the SAS System writes to the SAS log by specifying the MSGLEVEL= SAS System option in an Options statement. The MSGLEVEL= option supports two possible values: N (which is the default) to print standard notes, warnings, and error messages; and I to print standard notes, warnings, error messages, plus additional information about sort, merge, and index processing.

When specifying MSGLEVEL=I in an options statement, SAS displays the sort product that was used in a sort operation, a warning when variables are overwritten during merge processing; and the name of the available index that was used in index processing (or helpful suggestions on what can be done to influence SAS to use an available index); along with the usual assortment of notes, warnings, and error messages.

The following example demonstrates how the MSGLEVEL=I system option statement is specified.

 Click here for the code
   < SELECT Query ; >

Tip #3: Determining the SAS Product(s) Your Organization Licenses and Has Installed

Have you ever wondered which SAS product(s) your organization licenses along with each product's expiration date? This tip shows an undocumented procedure, PROC SETINIT, familiar and frequently used by SAS Administrators and Technical Support staff and available only in SAS versions 8 and 9.

The SETINIT procedure displays the SAS product(s) that are currently licensed, site name and number, the expiration dates for each product licensed, system birthday, and grace and warning periods. Because PROC SETINIT is an undocumented procedure, users should enjoy the information it provides, but use with care.

 Click here for the code

PROC PRODUCT_STATUS is another SAS procedure used by Administrators, Technical Support staff, and SAS users to assist and support software deployment strategies, planning and scheduling upgrades, contacting SAS Technical Support, and applying fixes.

Available starting in SAS version 9.2, the PRODUCT_STATUS procedure provides important information about the specific SAS versions for each licensed and installed product.


sanjay matange author photo

Reuse a Graph with Different Data®
By Sanjay Matange

Tip #1 – Reuse a Graph with Different Data

Do you have a complex multi-cell graph created in ODS Graphics Designer that you’d like to reuse with different data? Ideally, you’d like to change the data without having to change the plots in each cell. The designer’s shared variable feature provides an easy way to build and reuse graphs with different data. With shared variables, you associate one or more shared variables with the data roles in your plots.

For detailed information, see this blog: Reuse a Graph with Different Data

Tip #2 – Managing Group Attributes in Your Graph

When rendering a graph with group classifications, the visual attributes for the different class levels are assigned from the GraphData1-GraphData12 style elements of the active style, in the order that the values are encountered in the data. The group value that is encountered first in the data gets attributes such as color, marker symbol, or line pattern from the GraphData1 element.

Say you are using the HTMLBlue style and your data set has three group classification values - "Mild," "Moderate," and "Severe" which occur in that sequence. Now, "Mild" is assigned the visual attributes from GraphData1, for which the color is "Blue." "Moderate" will get GraphData2 with the color "Red," and "Severe" will get GraphData3, with the color "Green." If the group values occur in a different order next week, with "Severe" being first, then "Severe" will get GraphData1, with the color "Blue." So, the colors that are assigned by group values will change from day to day.

To ensure that group classifications get consistent assignment of the visual attributes, regardless of ordering, use the SAS 9.3 Discrete Attributes Map feature in SGPLOT and GTL. A Discrete Attributes Map is defined as a data set with specific columns and multiple observations, like a format. For each "Value" expected in the data, you can add an observation to specify the visual attributes to be used for that specific value. An example is shown below.

matange graphic

In this case, for value='Severe', the Line and Marker colors are set to "Red." So, no matter where in the data this group value "Severe" is found, the associated graphical element will be the color "Red." Many visual attributes can be controlled, including fill color, line color, marker symbol, line pattern, etc. Multiple attribute "Maps" can be defined in one data set to be used in different cases. In this case only one map is defined with Id=Severity.

For detailed information, see my blog: Attribute Maps - 1

Tip #3 – Including Unicode Values in Your Graph

Often we want to introduce Greek characters in the graph to correctly represent some unit value or data. When the value to be displayed is in an inset, axis label, or title, this is relatively easy, and can be added to the text string using the special ODS escape characters and the special notation to display Unicode characters.

However, when the Greek or Unicode values are in the data itself, such as category or group values, the matter is not so trivial. With SAS 9.40M3, this is possible using a user-defined format that maps regular values to the Unicode representations. One special requirement is that, in this case, the original ODS escape character (*ESC*) must be used as shown below.

 Click here for the code
proc format;
  value agegroupUnicode
   0 -< 40 = '< 40'    40 -< 50 = '40 < 50'    50 -< 60 = '50 < 60'    60 -< high = "(*ESC*){unicode '2265'x} 60" ; run;

Now, the category values are displayed using the formatted strings, including the Unicode characters.
For detailed information, see my blog: Unicode in Formatted Data - SAS 9.40M3

Tip #4 – Managing the Size of Your Graph for Inclusion in Journals

One common use of the graphs created during analysis is to include these in submissions to technical journals. Often the natural tendency is to render the graph in its default size and resolution, and then insert it into one column in a two-column format. The default size of the graph is 640x480 pixels at 96 dpi. One column in a two-column format is 3.5" wide. Putting the graph in this space causes the graph image to be scaled down by about 50% to fit the space.

While in general this is OK for the graphical elements, the fonts take a hard hit. Everything is shrunk about 50%, and a default 8pt font becomes 4pt, and thus hard to read. A better way is to render the graph to the width of the destination (3.25") using the WIDTH option on the ODS GRAPHICS statement and at a high dpi, say 300. This causes the fonts to be scaled down nonlinearly by (Width Ratio)0.25 which only reduces the font to 80% of original. The graph is now 3.25" wide and when added to a Word doc, the fonts will be much more readable.

For detailed information, see my blog: Graph size for presentations