Some tasks require you to run a procedure or a macro many times. When you need to run a procedure on separate blocks of observations in a data set, most procedures let you simply add a BY statement. But many macros were not written with BY-group processing built in. Another task that needs repeated runs is fitting a series of regression models with a different response variable, different predictors, or both. In these situations you want to avoid writing a separate macro call or procedure step for each run.
A general macro, RunBY, is available that can run other macros, procedures, or specialized code successively for BY groups in your data.
The following presents another approach that is also quite flexible. It uses the CALL EXECUTE routine in the DATA step to generate the multiple procedure steps and run them. With this approach you can run the procedure or macro once for each observation in a data set, using the information that you place in that observation.
Three common uses of the CALL EXECUTE approach are illustrated below. For examples of doing similar tasks with the RunBY macro, see its documentation.
Run a procedure for each variable in a data set
Suppose you want to fit a series of single-predictor logistic regression models using the following data set. Each model will use one of the predictors to model the response variable, REMISS. When complete, you want a single data set that contains the intercepts and slopes for all of the models, along with tests and confidence intervals for the parameters. These statements create the data set.
data remiss;
   input remiss cell smear infil li blast temp;
   datalines;
1 .8 .83 .66 1.9 1.1 .996
1 .9 .36 .32 1.4 .74 .992
0 .8 .88 .7 .8 .176 .982
0 1 .87 .87 .7 1.053 .986
1 .9 .75 .68 1.3 .519 .98
0 1 .65 .65 .6 .519 .982
1 .95 .97 .92 1 1.23 .992
0 .95 .87 .83 1.9 1.354 1.02
0 1 .45 .45 .8 .322 .999
0 .95 .36 .34 .5 0 1.038
0 .85 .39 .33 .7 .279 .988
0 .7 .76 .53 1.2 .146 .982
0 .8 .46 .37 .4 .38 1.006
0 .2 .39 .08 .8 .114 .99
0 1 .9 .9 1.1 1.037 .99
1 1 .84 .84 1.9 2.064 1.02
0 .65 .42 .27 .5 .114 1.014
0 1 .75 .75 1 1.322 1.004
0 .5 .44 .22 .6 .114 .99
1 1 .63 .63 1.1 1.072 .986
0 1 .33 .33 .4 .176 1.01
0 .9 .93 .84 .6 1.591 1.02
1 1 .58 .58 1 .531 1.002
0 .95 .32 .3 1.6 .886 .988
1 1 .6 .6 1.7 .964 .99
1 1 .69 .69 .9 .398 .986
0 1 .73 .73 .7 .398 .986
;
Since PROC LOGISTIC will be run once for each of the predictor variables, we need a data set that contains a variable holding the names of the predictors. This could be done by specifying all of the names as data in a DATA step.
data names;
   input var $ @@;
   datalines;
cell smear infil li blast temp
;
The above approach might not be very practical for an existing data set with a large number of variables. Alternatively, use PROC TRANSPOSE to transpose the predictor variables and keep only the _NAME_ variable that contains the variable names. Since all the variables in the data set are predictors except for REMISS (the response), the DROP= option is used to ignore REMISS. The RENAME= option renames _NAME_ to VAR so that the output data set has the same variable name as the NAMES data set created above.
proc transpose data=remiss(drop=remiss) out=names(keep=_name_ rename=(_name_=var));
run;
If the predictors are a small, contiguous group of variables, you could instead use a VAR statement with a variable list specifying only the first and last variables in the group. PROC CONTENTS with the VARNUM option can be used to see the order of variables in a data set. As before, _NAME_ is renamed to VAR.
proc transpose data=remiss out=names(keep=_name_ rename=(_name_=var));
   var cell--temp;
run;
Regardless of which method creates the data set variable that contains the predictor names, the following step generates the statements for each PROC LOGISTIC step and then executes them. The ODS OUTPUT statement with the PERSIST option accumulates the Parameter Estimates tables in data set PE across all of the PROC LOGISTIC runs. Because this DATA step only produces code and runs it rather than creating a data set, DATA _NULL_ is specified. The SET statement ensures that this DATA step iterates once for each observation in the NAMES data set, that is, once for each predictor name.
The CALL EXECUTE statement submits the code in parentheses. The code is a simple concatenation of three text strings: the PROC LOGISTIC statement and the MODEL statement up to the equal sign, the predictor name in the current observation of the NAMES data set, and the end of the MODEL statement and the RUN statement. The double vertical bars ( || ) are the string concatenation operator. For each iteration of this DATA step, the variable VAR in the NAMES data set contains the next predictor name, which is inserted into the MODEL statement. CALL EXECUTE queues the resulting text string, and the generated PROC LOGISTIC steps run after the DATA step finishes, each fitting a model with one predictor. Each resulting Parameter Estimates table is added to the PE data set by the ODS OUTPUT statement.
ods output parameterestimates(persist)=pe;
data _null_;
   set names;
   call execute("proc logistic data=remiss; model remiss =" || var || "; run;");
run;
ods output close;
The SAS Log shows the execution of each PROC LOGISTIC step (notes produced by each executed step are also displayed but are omitted here for clarity):
1 + proc logistic data=remiss; model remiss =cell ; run;
2 + proc logistic data=remiss; model remiss =smear ; run;
3 + proc logistic data=remiss; model remiss =infil ; run;
4 + proc logistic data=remiss; model remiss =li ; run;
5 + proc logistic data=remiss; model remiss =blast ; run;
6 + proc logistic data=remiss; model remiss =temp ; run;
At completion, the accumulated parameter estimates tables are in data set PE.
proc print data=pe noobs; run;
Each pair of observations contains the intercept and slope for each of the six models fit by the preceding DATA step.
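Before running generated steps, it can help to look at the text that will be submitted. The following is a minimal sketch of that idea, not part of the original example: it assumes the NAMES data set with the variable VAR, builds the same PROC LOGISTIC step in a character variable named CODE (a name introduced here for illustration), and writes it to the log with a PUT statement. CATX is used in place of the || operator so that the trailing blanks in VAR are trimmed and the pieces are separated by single blanks.

```sas
data _null_;
   set names;
   length code $200;   /* CODE is a hypothetical helper variable */
   /* CATX strips each piece and joins the pieces with one blank */
   code = catx(' ', 'proc logistic data=remiss; model remiss =', var, '; run;');
   put code=;          /* preview the generated step in the log */
   /* Once the text looks right, submit it by adding: call execute(code); */
run;
```

Building the text in a variable first keeps the preview and the submitted code identical, since both come from the same value.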
In this simple example, only a single substitution (the predictor name) is made to the code of the procedure at each run. But you could use this method for much more complex situations requiring more substitutions. You would simply need more variables in the input data set to specify all the information needed for each run.
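As a sketch of a run with more than one substitution, the step below varies both the predictor and the link function. The data set RUNS and its variables PRED and LINK are names assumed here for illustration; they do not appear in the original example. Each observation supplies the values for one PROC LOGISTIC step that is fit to the REMISS data.

```sas
/* One observation per model to fit (hypothetical control data set) */
data runs;
   length pred $8 link $8;
   input pred $ link $;
   datalines;
cell  logit
li    logit
li    probit
;

data _null_;
   set runs;
   /* CATX trims each piece and joins the pieces with single blanks */
   call execute(catx(' ',
      'proc logistic data=remiss;',
      'model remiss =', pred, '/ link=', link, ';',
      'run;'));
run;
```

The same ODS OUTPUT statement with the PERSIST option could be placed around this step to collect the results in one data set.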
Replicating the BY statement in a procedure
One common application is the equivalent of BY processing. It is useful with a procedure that does not offer a BY statement, or when an error in one BY group terminates the procedure before all BY groups are processed. Unlike a BY statement, this method does not require the data set to be sorted.
See the documentation of the RunBY macro for a macro-based approach to doing BY group processing.
Using the Nonparametric Logistic Regression example in the PROC GAMPL documentation, the GAMPL procedure is run separately for the subjects in the training (TEST=0) and test (TEST=1) groups. PROC FREQ is used to produce a data set (GROUPS) containing the unique values of TEST. Then the DATA step iterates once for each unique TEST value and executes PROC GAMPL for that value. Note that each pair of double quotes inside the double-quoted string represents a single double quote in the code that is executed, which is how the TITLE text gets its enclosing quotes.
proc freq data=Pima;
   table test / out=groups;
run;
data _null_;
   set groups;
   call execute("proc gampl data=Pima seed=12345; where test=" || test ||
                "; model diabetes(event='1')=spline(Glucose) spline(Pressure)/dist=binary; title ""Test=" || test ||
                """; run;");
run;
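Because TEST is numeric, the || operator converts it to character with the BEST12. format, which pads the value with leading blanks and writes a conversion note to the log. The generated code still runs, but the padding appears in the title. The following sketch shows one way to make the conversion explicit, assuming the GROUPS data set created above; the variable T is a name introduced here for illustration.

```sas
data _null_;
   set groups;
   length t $12;                     /* T is a hypothetical helper variable */
   t = strip(put(test, best12.));    /* explicit numeric-to-character conversion */
   call execute("proc gampl data=Pima seed=12345; where test=" || strip(t) ||
                "; model diabetes(event='1')=spline(Glucose) spline(Pressure)/dist=binary; " ||
                "title ""Test=" || strip(t) || """; run;");
run;
```

STRIP is applied again at each use because T is stored with trailing blanks up to its declared length.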
Adding BY processing to a macro
Another common usage of this technique is to add the capability of a BY statement to a macro.
See the documentation of the RunBY macro for a macro-based approach to doing BY group processing.
In this example, the technique is used to do the equivalent of BY processing with the MAGREE macro. Data set A contains the ratings of five raters (R) on each of ten subjects (S) for each of two questions (QUESTION).
data a;
   do question=1 to 2;
      do s=1 to 10;
         do r=1 to 5;
            input y @@;
            output;
         end;
      end;
   end;
   datalines;
1 2 2 2 2 1 1 3 3 3 3 3 3 3 3 1 1 1 1 3 1 1 1 3 3
1 2 2 2 2 1 1 1 1 1 2 2 2 2 3 1 3 3 3 3 1 1 1 3 3
1 2 2 2 2 1 1 3 3 3 3 3 3 2 3 1 1 1 1 3 1 1 2 3 3
1 2 2 2 2 1 1 1 2 1 2 2 2 2 3 1 3 3 3 3 1 1 1 3 3
;
PROC FREQ is used to produce a data set (QUESTIONS) containing the unique values of QUESTION. Then the DATA step iterates once for each QUESTION, executing a subsetting DATA step followed by the MAGREE macro for that question. The subsetting DATA step creates data set B, which contains only the data for one question. Note that each pair of single quotes inside the single-quoted string represents one single quote in the code that is executed, which is how the TITLE text gets its enclosing quotes. Also, the %NRSTR macro function is needed because CALL EXECUTE would otherwise execute the MAGREE macro immediately, before the generated subsetting DATA step has run. With %NRSTR, the macro call is queued as text and runs in sequence after that DATA step.
proc freq data=a;
   table question / out=questions;
run;
data _null_;
   set questions;
   call execute('data b; set a; where question=' || question ||
                '; title ''Question =' || question ||
                ''';run; %nrstr(%magree(data=b, items=s, raters=r, response=y, stat=kappa))');
run;
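The timing issue that %NRSTR addresses can be seen with a small macro that mixes a DATA step with a macro statement. The sketch below uses a hypothetical macro, COUNT_ROWS, written only for this illustration (it is not part of MAGREE), and the SASHELP.CLASS sample data set. Without %NRSTR, the %PUT statement would execute as soon as CALL EXECUTE is called, before the macro's DATA step has created NROWS. With %NRSTR, the whole macro call is deferred until after the calling DATA step ends.

```sas
/* Hypothetical macro: stores the row count in NROWS, then reports it */
%macro count_rows(data=);
   data _null_;
      if 0 then set &data nobs=n;   /* NOBS= is filled in at compile time */
      call symputx('nrows', n);
      stop;
   run;
   %put &data has &nrows rows.;
%mend count_rows;

data _null_;
   /* %NRSTR delays the macro call until the generated code is run */
   call execute('%nrstr(%count_rows(data=sashelp.class))');
run;
```

Removing %NRSTR from the CALL EXECUTE argument is a quick way to observe the difference in the log.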
Product Family: SAS System
Products: SAS/STAT, SAS/ETS
Type: Usage Note
Topic: Analytics ==> analytics SAS Reference ==> Functions ==> Macro ==> CALL EXECUTE
Date Modified: 2020-07-09 11:31:22
Date Created: 2012-06-19 17:35:36