Writing Efficient and Portable Macros |
Use Macros Wisely |
An application that uses a macro to generate only constant text is inefficient. In general, for these situations consider using a %INCLUDE statement. Because the %INCLUDE statement does not have to compile the code first (it is executed immediately), it might be more efficient than using a macro (especially if the code is executed only once). If you use the same code repeatedly, it might be more efficient to use a macro because a macro is compiled only once during a SAS job, no matter how many times it is called.
However, using %INCLUDE requires you to know exactly where the physical file is stored and specify this name in the program itself. Because with the autocall facility all you have to remember is the name of the macro (not a pathname), the gain in human efficiency might more than offset the time gained by not compiling the macro. Also, macros provide additional programming features, such as parameters, conditional sections, and loops, as well as the ability to view macro variable resolution in the SAS log.
So, use a macro only when it is necessary, and balance the various efficiency factors (how many times you use the code, CPU time versus ease of use) to reach the solution that is best for your application.
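The trade-off described in this section can be sketched as follows. This is only an illustration: the file path, the data set WORK.SALES, its MONTH variable, and the macro name MONTHLY are hypothetical.

```sas
/* Constant text that runs once: %INCLUDE executes the file directly,  */
/* but the program must spell out the physical path (hypothetical).    */
%include '/projects/reports/standard_titles.sas';

/* Code that is reused with different values: a macro is compiled      */
/* once per SAS job and can then be called any number of times.        */
%macro monthly(dsn,month);
   proc print data=&dsn;
      where month=&month;
   run;
%mend monthly;

%monthly(work.sales,1)
%monthly(work.sales,2)
```

If MONTHLY were stored in an autocall library, the calling program would need only the macro name, not a pathname.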
Use Name Style Macros |
Macros come in three invocation types: name style, command style, and statement style. Of the three, name style is the most efficient because name style macros always begin with a percent sign (%), which immediately tells the word scanner to pass the token to the macro processor. With the other two types, the word scanner does not know immediately whether the token should be sent to the macro processor. Therefore, time is wasted while the word scanner determines whether the token should be sent.
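As a rough comparison, the following sketch defines the same macro in name style and in statement style. The macro names LISTCLASS and LISTCLASS2 are hypothetical, and the example assumes that the SASHELP.CLASS sample data set is available at your site.

```sas
/* Name style: the leading % sends the token straight to the macro processor. */
%macro listclass;
   proc print data=sashelp.class;
   run;
%mend listclass;

%listclass

/* Statement style, for comparison: the STMT option on %MACRO together with  */
/* the IMPLMAC system option lets the macro be invoked without the %.        */
/* While IMPLMAC is in effect, the first word of every SAS statement has to  */
/* be checked as a possible macro call.                                      */
options implmac;

%macro listclass2 / stmt;
   proc print data=sashelp.class;
   run;
%mend listclass2;

listclass2;

options noimplmac;   /* turn the extra checking off again */
```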
Avoid Nested Macro Definitions |
Nesting macro definitions inside other macros is usually unnecessary and inefficient. When you call a macro that contains a nested macro definition, the macro processor generates the nested macro definition as text and places it on the input stack. The word scanner then scans the definition and the macro processor compiles it. If you nest the definition of a macro that does not change, you cause the macro processor to compile the same macro each time that section of the outer macro is executed.
As a rule, you should define macros separately. If you want to nest a macro's scope, simply nest the macro call, not the macro definition.
As an example, the macro STATS1 contains a nested macro definition for the macro TITLE:
/* Nesting a Macro Definition--INEFFICIENT */
%macro stats1(product,year);
   %macro title;
      title "Statistics for &product in &year";
      %if &year>1929 and &year<1935 %then
         %do;
            title2 "Some Data Might Be Missing";
         %end;
   %mend title;

   proc means data=products;
      where product="&product" and year=&year;
      %title
   run;
%mend stats1;

%stats1(steel,2002)
%stats1(beef,2000)
%stats1(fiberglass,2001)
Each time the macro STATS1 is called, the macro processor generates the definition of the macro TITLE as text, recognizes a macro definition, and compiles the macro TITLE. In this case, STATS1 was called three times, which means the TITLE macro was compiled three times. With only a few statements, this task takes only microseconds; but in large macros with hundreds of statements, the wasted time could be significant.
The values of PRODUCT and YEAR are available to TITLE because its call is within the definition of STATS1; therefore, it is unnecessary to nest the definition of TITLE to make values available to TITLE's scope. Nesting the definition is also unnecessary because no values in the definition of the TITLE macro depend on values that change during the execution of STATS1. (Even if the definition of TITLE depended on such values, you could use a global macro variable to effect the changes, rather than nest the definition.)
The following program shows the macros defined separately:
/* Separating Macro Definitions--EFFICIENT */
%macro stats2(product,year);
   proc means data=products;
      where product="&product" and year=&year;
      %title
   run;
%mend stats2;

%macro title;
   title "Statistics for &product in &year";
   %if &year>1929 and &year<1935 %then
      %do;
         title2 "Some Data Might Be Missing";
      %end;
%mend title;

%stats2(cotton,1999)
%stats2(brick,2002)
%stats2(lamb,2001)
Here, because the definition of the macro TITLE is outside the definition of the macro STATS2, TITLE is compiled only once, even though STATS2 is called three times. Again, the values of PRODUCT and YEAR are available to TITLE because its call is within the definition of STATS2.
Note: Another reason to define macros separately is that it makes them easier to maintain, each in a separate file.
Assign Function Results to Macro Variables |
It is more efficient to resolve a variable reference than it is to evaluate a function. Therefore, assign the results of frequently used functions to macro variables.
For example, the following macro is inefficient because the length of the macro variable THETEXT must be evaluated at every iteration of the %DO %WHILE statement:
/* INEFFICIENT MACRO */
%macro test(thetext);
   %let x=1;
   /* %LENGTH is evaluated again at every iteration */
   %do %while (&x <= %length(&thetext));
      . . .
   %end;
%mend test;

%test(Four Score and Seven Years Ago)
A more efficient method would be to evaluate the length of THETEXT once and assign that value to another macro variable. Then, use that variable in the %DO %WHILE statement, as in the following program:
/* MORE EFFICIENT MACRO */
%macro test2(thetext);
   %let x=1;
   /* evaluate the length once, before the loop */
   %let length=%length(&thetext);
   %do %while (&x <= &length);
      . . .
   %end;
%mend test2;

%test2(Four Score and Seven Years Ago)
As another example, suppose you want to use the %SUBSTR function to pull the year out of the value of SYSDATE. Instead of using %SUBSTR repeatedly in your code, assign the value of the %SUBSTR(&SYSDATE, 6) to a macro variable, then use that variable whenever you need the year.
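A minimal sketch of that approach follows. The macro variable names YEAR2 and YEAR4 are hypothetical. The second assignment uses the SYSDATE9 automatic macro variable, which holds the date with a four-digit year, in case that form is more useful.

```sas
/* SYSDATE has the form DDMMMYY, so the year starts at position 6. */
%let year2=%substr(&sysdate,6);

/* SYSDATE9 has the form DDMMMYYYY; position 6 onward is the four-digit year. */
%let year4=%substr(&sysdate9,6);

/* Later references resolve a variable instead of calling %SUBSTR again. */
title "Report Prepared in &year4";
footnote "Run year: &year2";
```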
Turn Off System Options When Appropriate |
While the debugging system options, such as MPRINT and MLOGIC, are very helpful at times, it is inefficient to run production (debugged) macros with this type of system option set to on. For production macros, run your job with the following settings: NOMLOGIC, NOMPRINT, NOMRECALL, and NOSYMBOLGEN.
Even if your job has no errors, if you run it with these options turned on you incur the overhead that the options require. By turning them off, your program runs more efficiently.
Note: Another approach to deciding when to use MPRINT versus NOMPRINT is to match this option's setting with the setting of the SOURCE option. That is, if your program uses the SOURCE option, it should also use MPRINT. If your program uses NOSOURCE, then run it with NOMPRINT as well.
Note: If you do not use autocall macros, use the NOMAUTOSOURCE system option. If you do not use stored compiled macros, use the NOMSTORED system option.
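The settings recommended in this section can be applied with an OPTIONS statement at the top of a production job. The sketch below leaves the last two options commented out because they are appropriate only under the conditions stated in the note above.

```sas
/* Turn off the macro debugging options for a production job. */
options nomlogic nomprint nomrecall nosymbolgen;

/* Only if the job uses no autocall macros: */
/* options nomautosource; */

/* Only if the job uses no stored compiled macros: */
/* options nomstored; */
```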
Use the Stored Compiled Macro Facility |
The Stored Compiled Macro Facility reduces execution time by enabling macros compiled in a previous SAS job or session to be accessed during subsequent SAS jobs and sessions. Therefore, these macros do not need to be recompiled. Use the Stored Compiled Macro Facility only for production (debugged) macros. It is not efficient to use this facility when developing a macro application.
Because you cannot re-create the source code for a macro from the compiled code, you should keep a copy of the source code in a safe place, in case the compiled code becomes corrupted for some reason. Having a copy of the source is also necessary if you intend to modify the macro at a later time.
See Storing and Reusing Macros for more information about the Stored Compiled Macro Facility.
Note: The compiled code generated by the Stored Compiled Macro Facility is not portable. If you need to transfer macros to another host environment, you must move the source code and recompile and store it on the new host.
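As a rough illustration, a production macro might be compiled and stored as follows. The libref MYMACS, the path, and the macro name RUNIT are hypothetical, and the example assumes that the SASHELP.CLASS sample data set is available.

```sas
/* Point SAS at the library that holds the stored compiled macros. */
libname mymacs '/projects/macros/compiled';   /* hypothetical path */
options mstored sasmstore=mymacs;

/* The STORE option saves the compiled macro in the MYMACS library. */
%macro runit / store;
   proc print data=sashelp.class;
   run;
%mend runit;
```

In a later job or session, issuing the same LIBNAME and OPTIONS statements makes %RUNIT available without recompiling it. Keep the source file that contains the %MACRO definition, as described above.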
Centrally Store Autocall Macros |
When using the autocall facility, it is most efficient in terms of I/O to store all your autocall macros in one library and place that library name at the beginning of the SASAUTOS system option specification. You could store the autocall macros in as many libraries as you want, but each time you call a macro, each library is searched sequentially until the macro is found. Opening and searching only one library reduces the time SAS spends looking for macros.
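For example, a specification along the following lines puts a single site library ahead of the autocall library supplied by SAS. The directory name is hypothetical; SASAUTOS inside the parentheses is the fileref for the SAS-supplied autocall library.

```sas
/* Search the site's single autocall library first, then the SAS-supplied one. */
options mautosource
        sasautos=('/projects/macros/autocall' sasautos);
```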
However, it might make more sense, if you have hundreds of autocall macros, to have them separated into logical divisions according to purpose, levels of production, who supports them, and so on. As usual, you must balance reduced I/O against ease-of-use and ease-of-maintenance.
All autocall libraries in the concatenated list are opened and left open during a SAS job or session. Any library that fails to open the first time you call an autocall macro is tested again each time an autocall macro is used. Therefore, it is extremely inefficient to have invalid pathnames in your SASAUTOS system option specification. SAS issues no warning about this wasted effort unless none of the libraries can be opened.
There are two efficiency tips involving the autocall facility:
Do not store nonmacro code in autocall library files.
Do not store more than one macro in each autocall library file.
Although these two practices are used by SAS and do work, they contribute significantly to code-maintenance effort and therefore are less efficient.
Other Useful Efficiency Tips |
Here are some other efficiency techniques you can try:
Reset macro variables to null if the variables are no longer going to be referenced.
Use triple ampersands to force an additional scan of macro variables with long values, when appropriate. See Storing Only One Copy of a Long Macro Variable Value for more information.
Adjust the values of the MSYMTABMAX= System Option and MVARSIZE= System Option to fit your situation. In general, increase the values if disk space is in short supply; decrease the values if memory is in short supply. MSYMTABMAX affects the space available for storing macro variable symbol tables; MVARSIZE affects the space available for storing values of individual macro variables.
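The first and third tips might look like the following sketch. The variable name BIGLIST and the option values are hypothetical and chosen only for illustration; suitable values depend on your host and on how much memory and disk space are available.

```sas
/* A long value that is needed only for part of the job. */
%let biglist=alpha beta gamma delta;

/* ... statements that reference BIGLIST ... */

/* Reset the variable to null once it is no longer referenced. */
%let biglist=;

/* Check the current settings before changing them. */
proc options option=msymtabmax;
run;
proc options option=mvarsize;
run;

/* Illustrative values only: smaller limits when memory is in short supply. */
options msymtabmax=2M mvarsize=2048;
```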
Storing Only One Copy of a Long Macro Variable Value |
Because macro variables can have very long values, the way you store macro variables can affect the efficiency of a program. Indirect references using three ampersands enable you to store fewer copies of a long value.
For example, suppose your program contains long macro variable values that represent sections of SAS programs:
%let pgm=%str(data flights;
   set schedule;
   totmiles=sum(of miles1-miles20);
   proc print;
   var flightid totmiles;);
You want the SAS program to end with a RUN statement:
%macro check(val);   /* first version */
   &val
   %if %index(&val,%str(run;))=0 %then %str(run;);
%mend check;
First, the macro CHECK generates the program statements contained in the parameter VAL (a macro variable that is defined in the %MACRO statement and passed in from the macro call). Then, the %INDEX function searches the value of VAL for the characters run; (the %STR function causes the semicolon to be treated as text). If the characters are not present, the %INDEX function returns 0, the %IF condition is true, and the macro processor generates a RUN statement.
To use the macro CHECK with the variable PGM, assign the parameter VAL the value of PGM in the macro call:
%check(&pgm)
As a result, SAS sees the following statements:
data flights;
   set schedule;
   totmiles=sum(of miles1-miles20);
proc print;
   var flightid totmiles;
run;
The macro CHECK works properly. However, the macro processor assigns the value of PGM as the value of VAL during the execution of CHECK. Thus, the macro processor must store two long values (the value of PGM and the value of VAL) while CHECK is executing.
To make the program more efficient, write the macro so that it uses the value of PGM rather than copying the value into VAL:
%macro check2(val);   /* more efficient macro */
   &&&val
   %if %index(&&&val,%str(run;))=0 %then %str(run;);
%mend check2;

%check2(pgm)
The macro CHECK2 produces the same result as the macro CHECK:
data flights;
   set schedule;
   totmiles=sum(of miles1-miles20);
proc print;
   var flightid totmiles;
run;
However, in the macro CHECK2, the value assigned to VAL is simply the name PGM, not the value of PGM. The macro processor resolves &&&VAL into &PGM and then into the SAS statements contained in the macro variable PGM. Thus, the long value is stored only once.
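The two-step resolution can be observed outside a macro with %PUT statements. In this sketch, VAL is set with %LET to stand in for the macro parameter, and PGM is given a shortened value for brevity.

```sas
%let pgm=%str(data flights; set schedule;);
%let val=pgm;   /* VAL holds only the name of the variable */

/* One ampersand resolves VAL itself and writes the name it holds. */
%put &val;

/* Three ampersands: && resolves to & and &val resolves to pgm, producing */
/* &pgm, which a second scan resolves to the statements stored in PGM.    */
%put &&&val;
```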
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.