The SAS programming language has been enhanced so that SAS products, such as ETLS and EM, can produce grid-enabled SAS applications and so that you can develop your own SAS grid applications. SAS/CONNECT provides functions whose syntax lets applications use the grid capabilities of SAS Grid Manager. One goal behind these functions was to let user-written applications that already use the parallel capabilities of SAS/CONNECT take advantage of the new grid capabilities with minimal code modification.
The following capabilities for grid enabling SAS applications are supported:
The following functions provide the functionality described above: grdsvc_enable, grdsvc_getname, grdsvc_getaddr, and grdsvc_nnodes.
grdsvc_enable
Syntax
Macro Function: %let rc=%sysfunc(grdsvc_enable(identifier, options));
Data Step: rc=grdsvc_enable("identifier","options");
where "identifier" and "options" could also be character data step variables.
Arguments
identifier: remote-session-ID | _all_ | _list_ | _showid_
remote-session-ID
The remote-session-ID is the SAS/CONNECT remote session ID (that is, the server-ID). An option value must be specified with this identifier.
_all_
The value _all_ enables all sessions to execute on the grid except for those SAS/CONNECT remote session IDs specified in other grdsvc_enable calls. An option value must be specified with this identifier.
_list_
The value _list_ prints the metadata contents of the grid server component to the SAS log. An option value must be specified with this identifier.
_showid_
The value _showid_ prints the grid-enabled or grid-disabled remote-session-IDs to the SAS log. An option value should not be specified with this identifier.
options: resource=SASApplicationServer [;workload=workload-value] | ""
resource=SASApplicationServer
SASApplicationServer specifies the SAS application server that is used to look up the Logical Grid Server and obtain the properties defined in the Grid Server Component. These properties include the SAS command script used to invoke the remote SAS session and possibly other resources required by the application.
workload=workload-value
The workload-value indicates the type of workload being distributed and enables the grid to be partitioned according to workloads. For example, a specification of EM would indicate that the tasks associated with this identifier are Enterprise Miner tasks and should be directed to the grid resources designated to perform Enterprise Miner work.
""
The value "" disables grid execution for the specified identifier. Use this value when you have specified _all_ in a previous call but want to turn the grid off for a small number of exceptions.
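As a sketch of the exception pattern just described, the following enables the grid for all sessions and then opts one session out. The session name locjob is hypothetical, and SASMain is assumed to be the SAS application server, as in the other examples in this document.

```sas
/* Enable grid execution for every SAS/CONNECT session. */
%let rc_all=%sysfunc(grdsvc_enable(_all_, resource=SASMain));

/* Exception: the hypothetical session locjob does not use the grid */
/* and uses the normal SAS/CONNECT options and settings instead.    */
%let rc_loc=%sysfunc(grdsvc_enable(locjob, ""));

/* Write both return codes to the SAS log. */
%put NOTE: rc_all=&rc_all rc_loc=&rc_loc;
```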
Output
A zero return code indicates success.
A non-zero return code indicates that a distributed grid environment is not available and SMP execution is assumed. Grid execution is not an option if any of the following is true:
In this case, SMP execution is assumed with a default sascmd value of "!sascmd -noobjectserver" unless the global sascmd option has been specified.
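The return code can be checked before any signon is issued. The following is a minimal sketch, not a prescribed pattern: the macro name enable_grid, its appserver parameter, and the macro variable grid_rc are hypothetical names introduced for illustration.

```sas
/* Hypothetical helper: enable the grid for all sessions and report */
/* whether grid or SMP execution will be used.                      */
%macro enable_grid(appserver=SASMain);
  %global grid_rc;
  %let grid_rc=%sysfunc(grdsvc_enable(_all_, resource=&appserver));
  %if &grid_rc = 0 %then
    %put NOTE: Grid enabled through application server &appserver..;
  %else
    %put NOTE: grdsvc_enable returned &grid_rc.. SMP execution is assumed.;
%mend enable_grid;

%enable_grid(appserver=SASMain)
```

Because grid_rc is global, later code in the same session can test it to decide how much work to run in parallel.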
Note: The grdsvc_enable call does not resolve to a specific grid node or cause any execution in the grid. The mapping of the identifier to a specific grid node and the creation of a SAS session on that grid node happens when the signon statement is executed.
Note: The workload property in the SAS grid server metadata component is what actually controls the partitioning of the grid. If workload is specified on the grdsvc_enable call, the SAS/CONNECT code looks for a matching value in the workload property of the SAS grid server metadata component. If that property is blank, the grid has not been partitioned and any workload is accepted. If the property is non-blank, the value specified in the grdsvc_enable call must match one of the workloads specified in the property or the signon fails. If the value does match, it is passed to LSF for selection of a machine and must appear as a resource for one or more machines defined in the LSF cluster file. If the value matches a workload specified in the property but SAS Grid Manager is not licensed or is not properly installed, the signon fails.
Example code:
/* all connections go to the grid, look up grid server component */
/* under SASMain SAS application server */
%let rc=%sysfunc(grdsvc_enable(_all_, resource=SASMain));

/* turn off grid for all connections, use normal SAS/CONNECT */
/* options/settings for all connections */
%let rc=%sysfunc(grdsvc_enable(_all_, ""));

/* p1 connection will go to the grid, look up grid server component */
/* under SASMain SAS application server */
%let rc=%sysfunc(grdsvc_enable(p1, resource=SASMain));

/* turn off grid for the p1 connection, use normal SAS/CONNECT */
/* options/settings for this connection */
%let rc=%sysfunc(grdsvc_enable(p1, ""));
The following example causes p1, p2, and p4 to execute on grid nodes that have the resource "SASMain" associated with them and that have been designated to handle workload of type ETL. Because p3 is not grid-enabled, it signs on to the specified machine aaa.bbb.ccc.com.
%let p1_rc=%sysfunc(grdsvc_enable(p1,resource=SASMain; workload=ETL));
%let p2_rc=%sysfunc(grdsvc_enable(p2,resource=SASMain; workload=ETL));
%let p4_rc=%sysfunc(grdsvc_enable(p4,resource=SASMain; workload=ETL));
%let p3=aaa.bbb.ccc.com;
signon p1;
signon p2;
signon p3 user=xxx pass=yyy;
signon p4;
grdsvc_getname
Syntax
Macro Function: %let var=%sysfunc(grdsvc_getname(remote-session-ID));
Data Step: var=grdsvc_getname("remote-session-ID");
where "remote-session-ID" could also be a character data step variable.
Arguments
The remote-session-ID is the SAS/CONNECT remote session ID (i.e., server-ID).
Output
Hostname is the name of the machine that was chosen for the specified remote-session-ID.
Example code:
%let mynodea=%sysfunc(grdsvc_getname(task1));
grdsvc_getaddr
Syntax
Macro Function: %let var=%sysfunc(grdsvc_getaddr(remote-session-ID));
Data Step: var=grdsvc_getaddr("remote-session-ID");
where "remote-session-ID" could also be a character data step variable.
Arguments
The remote-session-ID is the SAS/CONNECT remote session ID (i.e., server-ID).
Output
IP address is the IP address of the machine that was chosen for the specified remote-session-ID.
Example code:
%let myip=%sysfunc(grdsvc_getaddr(task1));
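Because the mapping of an identifier to a grid node happens when the signon statement executes, the host name and address are meaningful only after signon. The following sketch shows one possible ordering. It assumes a SAS application server named SASMain and reuses the session ID task1 from the examples above.

```sas
/* Grid-enable the session, then sign on so that a node is chosen. */
%let rc=%sysfunc(grdsvc_enable(task1, resource=SASMain));
signon task1;

/* Report which machine was selected for task1. */
%put NOTE: task1 host=%sysfunc(grdsvc_getname(task1));
%put NOTE: task1 address=%sysfunc(grdsvc_getaddr(task1));

/* End the remote session when it is no longer needed. */
signoff task1;
```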
grdsvc_nnodes
Syntax
Macro Function: %let num=%sysfunc(grdsvc_nnodes(identifier));
Data Step: num=grdsvc_nnodes("identifier");
where "identifier" could also be a character data step variable.
Arguments
identifier: SASApplicationServer | resource=SASApplicationServer
SASApplicationServer specifies the SAS application server that contains the grid logical server definition pertaining to the grid environment.
Output
Num is the number of processors that exist for parallel or grid processing. This number does not represent "unused" vs. "busy" processors, rather all of the processors that are known.
Example code:
%let numnodes=%sysfunc(grdsvc_nnodes(SASMain));
Note: When a grid environment is available, the value returned is the number of processors having the specified identifier (regardless of the number of machines). This number is resolved at the time the function is called. Therefore, you may get a different number from one call to the next if processors are added or removed from the pool of resources. With the grid environment, the smallest value this function would ever return is 0.
Note: If a grid environment is not available, this function will assume an SMP environment and return the value of the CPUCOUNT option. In this case, the smallest value this function would return is 1.
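Since grdsvc_nnodes can return 0 in a grid environment, a program that sizes its parallelism from this value may want to guard against that case. The following is a sketch under stated assumptions: the macro name start_sessions, its max parameter, and the session IDs task1, task2, and so on are hypothetical, and SASMain is assumed to be the SAS application server.

```sas
/* Hypothetical helper: start at most &max sessions, and never more */
/* than the number of processors reported by grdsvc_nnodes.         */
%macro start_sessions(max=4);
  %local i n nnodes;
  %let nnodes=%sysfunc(grdsvc_nnodes(resource=SASMain));
  %if &nnodes < 1 %then %do;
    %put NOTE: No processors reported, no sessions started.;
    %return;
  %end;
  %let n=%sysfunc(min(&nnodes, &max));
  %do i=1 %to &n;
    signon task&i;
  %end;
%mend start_sessions;

%let rc=%sysfunc(grdsvc_enable(_all_, resource=SASMain));
%start_sessions(max=4)

/* Stop all remote sessions. */
signoff _all_;
```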
The following sample program is designed to verify the basic setup and configuration of a SAS grid environment. Be sure that you have completed the steps necessary to install and configure:
Note: On UNIX systems, make sure that you have initialized the LSF environment by running one of the following (where % and $ represent UNIX command line prompts):
/* The grdsvc_enable call will go out to the SAS Metadata Server and */
/* find the SAS Grid Server definition. A return code of 0 means that */
/* all signons will use the grid. A non-0 return code means that there */
/* is a problem that should be investigated. */

/* This program assumes a SAS application server of SASMain. If you */
/* are using SASApp or some other SAS application server, you must */
/* modify the enable and nodes function calls to specify your SAS */
/* application server. */
%let rc=%sysfunc(grdsvc_enable(_all_,resource=SASMain));
%put SAMPLE NOTE: Expecting rc to be 0.;
%put SAMPLE NOTE: Value of rc=&rc.;
%put SAMPLE NOTE: Do not proceed if rc is not 0.;

/* The grdsvc_nnodes call will provide the number of grid nodes */
/* available in the grid. */
%let nnodes=%sysfunc(grdsvc_nnodes(resource=SASMain));

/* You can view the progress of the signons using the Grid Manager. */
/* Watch for Job Names such as SASGrid:xxxx where xxxx is the value of */
/* the sysjobid of this SAS session. */
%put Job Name=SASGrid:&sysjobid;

/* Define a macro to loop to make sure that the grid nodes have been */
/* set up correctly. */
%macro loop;
  %do i=1 %to &nnodes;
    signon grdn&i;
    %put Session started on grid node %sysfunc(grdsvc_getname(grdn&i));
  %end;
%mend;

/* Monitor the progress of the signon to the nodes using SAS */
/* Management Console and the Grid Manager. */
/* Invoke the loop macro to issue the signon. */
%loop;

/* Stop SAS running on the grid nodes. */
signoff _all_;
An existing SAS program that uses the parallel processing capabilities of SAS/CONNECT requires only minimal modification to use SAS Grid Manager in a grid environment. Add the following statements either to your autoexec.sas file or to the beginning of your SAS program:
options metaserver='xxx.yyy.zzz.com';
options metaport=8561;
options metarepository='Foundation';
options metauser='userxyz';
options metapass='passwd';
%let rc=%sysfunc(grdsvc_enable(_all_, resource=SASMain));
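To illustrate the kind of program this applies to, here is a hedged sketch of a small SAS/CONNECT parallel job. It assumes the metadata options shown above are already in effect. The session IDs task1 and task2 and the choice of procedures are illustrative only, and the grdsvc_enable call is repeated so that the sketch is self-contained.

```sas
/* The only grid-specific line: direct all signons to the grid. */
%let rc=%sysfunc(grdsvc_enable(_all_, resource=SASMain));

/* Ordinary SAS/CONNECT parallel code follows, unchanged. */
signon task1;
signon task2;

/* Submit two independent units of work asynchronously. */
rsubmit task1 wait=no;
  proc means data=sashelp.class;
  run;
endrsubmit;

rsubmit task2 wait=no;
  proc freq data=sashelp.class;
    tables sex;
  run;
endrsubmit;

/* Wait for both tasks to finish, then end the remote sessions. */
waitfor _all_ task1 task2;
signoff _all_;
```

If grdsvc_enable returns a non-zero code, the same program still runs, with the sessions started under the SMP assumptions described earlier.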
The following steps describe how to set up such a key definition.
options noconnectpersist;
options noconnectwait;
options metaserver='dnnnn';
options metaport=8561;
options metarepository='Foundation';
options metauser='sasdemo';
options metapass='passwd'; /* should be encrypted password */
%let rc=%sysfunc(grdsvc_enable(grid, resource=SASMain));
signon grid;
gsubmit "%include 'c:\gpre.sas';"; rsubmit;
You can then type or include any SAS program in your program editor and press F12. The program is submitted to computing resources on your SAS grid instead of being executed locally.
Notes of interest: