Submitting Batch SAS Jobs to the Grid

Overview

The SAS Grid Manager Client Utility enables you to run SAS jobs on the grid in batch. You can also use the utility to check job status, end a job, and retrieve job output. Most of the options that are used by the SAS Grid Manager Client Utility are contained in the sasgsub.cfg file. This file is automatically created by the SAS Deployment Wizard. These options specify the information that the SAS Grid Manager Client Utility uses every time it runs.
Both the SAS Grid Manager Client Utility and Platform LSF must be installed on every machine from which you run the utility.

Grid Manager Client Utility File Handling

This is how files are handled by the SAS Grid Manager Client Utility when processing a job on the grid in batch mode:
  1. SASGSUB creates a job directory in the GRIDWORK directory under the directory of the user who is submitting the job. For example, if GRIDWORK is /grid/share and the submitting user is sasuser1, then a job directory is created in /grid/share/sasuser1 for the files.
  2. SASGSUB copies the SAS program and any files specified by GRIDFILESIN into the new directory.
  3. SASGSUB submits a job to the grid that includes information about the location of the job directory. It uses either GRIDWORK or GRIDWORKREM to specify the location of the job information to the grid. If you are staging files, SASGSUB also passes the stage file command specified by the GRIDSTAGECMD option to the grid.
  4. If the grid job is using staging when the job starts, the grid copies the files in the job directory under GRIDWORK to a temporary job directory. The temporary directory is in the grid's shared directory location that is specified during the SAS Deployment Wizard installation process.
  5. The grid runs the SAS program from the job directory and places the LOG and LST files back into the same job directory. For a shared file system, this directory is the one specified by the GRIDWORK option. This is also the directory that SASGSUB copied files into. If you are staging files, this directory is the job directory that is in the grid shared directory.
  6. If you are staging files, after the job is complete, the files in the job directory in the grid shared location are copied to the job directory that is specified by the GRIDWORK option.
  7. At this point in processing, the job directory in GRIDWORK contains all of the files that are required and produced by SAS batch processing. You can then retrieve the files using the SASGSUB -GRIDGETRESULTS command.
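
Steps 1 and 2 can be sketched in Python as follows. This only illustrates the directory layout and is not how SASGSUB is implemented. The function name stage_job_locally is hypothetical, and the job directory name follows the SASGSUB-<timestamp>_<program> pattern seen in the sample output later in this section, which is assumed here rather than documented as a contract.

```python
import shutil
from datetime import datetime
from pathlib import Path


def stage_job_locally(gridwork, user, program, files_in=(), now=None):
    """Mimic steps 1 and 2: create a job directory and copy files into it.

    Hypothetical helper for illustration; SASGSUB does this work itself.
    """
    now = now or datetime.now()
    program = Path(program)
    # Assumed naming pattern, taken from the sample output in this section:
    # SASGSUB-YYYY-MM-DD_HH.MM.SS.mmm_<program name>
    stamp = now.strftime("%Y-%m-%d_%H.%M.%S.") + f"{now.microsecond // 1000:03d}"
    # Step 1: the job directory lives under GRIDWORK/<submitting user>.
    job_dir = Path(gridwork) / user / f"SASGSUB-{stamp}_{program.stem}"
    job_dir.mkdir(parents=True)
    # Step 2: copy the SAS program and any GRIDFILESIN files.
    for source in (program, *map(Path, files_in)):
        shutil.copy2(source, job_dir / source.name)
    return job_dir
```

With GRIDWORK set to /grid/share and the user sasuser1, the returned directory sits under /grid/share/sasuser1, which matches the example in step 1.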

Submitting Jobs in Batch Using the SAS Grid Manager Client Utility

To submit a SAS job in batch mode to a grid using the SAS Grid Manager Client Utility, issue the following command from an operating system command line:
<path/>SASGSUB -GRIDSUBMITPGM sas-program-file
The path option specifies the path for the SASGSUB program. By default, the location is <configuration_directory>/Applications/SASGridManagerClientUtility/<version>.
The -GRIDSUBMITPGM option specifies the name and path of the SAS program that you want to submit to the grid.
In addition, you can specify other options that are passed to the grid or used when processing the job, including workload resource names. For a complete list of options, see SASGSUB Syntax: Submitting a SAS Program in Batch Mode.
Specifying the -GRIDWATCHOUTPUT argument displays the standard output and standard error of the submitted batch job on your machine.
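
If you submit jobs from a script, the command line can be assembled programmatically. The following Python sketch builds the argument list from the two options described above. The function name build_submit_command and the paths in the usage note are hypothetical, and the sketch covers only -GRIDSUBMITPGM and -GRIDWATCHOUTPUT.

```python
def build_submit_command(sasgsub_path, program, watch_output=False):
    """Return the argument list for submitting a SAS program in batch.

    Hypothetical helper; only the options named in this section are used.
    """
    args = [str(sasgsub_path), "-GRIDSUBMITPGM", str(program)]
    if watch_output:
        # Display the job's output on the submitting machine.
        args.append("-GRIDWATCHOUTPUT")
    return args
```

The list can be passed to subprocess.run, for example subprocess.run(build_submit_command("/opt/sas/SASGSUB", "/home/user1/testPgm.sas"), check=True), where both paths are placeholders. Passing a list rather than a single string avoids shell quoting problems when a path contains spaces.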

Running Commands in Batch Using the SAS Grid Manager Client Utility

To submit a command to a grid in batch mode using the SAS Grid Manager Client Utility, issue the following command from an operating system command line:
<path/>SASGSUB -GRIDRUNCMD command

Viewing Job Status Using the SAS Grid Manager Client Utility

After you submit a job to the grid, you might want to check the status of the job. To check the status of a job, issue the following command from a command line:
<path/>SASGSUB -GRIDGETSTATUS [job-ID | ALL]
-GRIDGETSTATUS specifies the ID of the job that you want to check, or ALL to check the status of all jobs submitted by your user ID. For a complete list of options, see SASGSUB Syntax: Viewing Job Status.
The following is an example of the output produced by the SASGSUB -GRIDGETSTATUS command.
Output Produced by SASGSUB -GRIDGETSTATUS Command
Current Job Information
  Job 1917 (testPgm) is Finished:  Submitted: 08Dec2008:10:28:57, Started: 08Dec2008:10:28:57 on Host host1, Ended: 08Dec2008:10:28:57
  Job 1918 (testPgm) is Finished:  Submitted: 08Dec2008:10:28:57, Started: 08Dec2008:10:28:57 on Host host1, Ended: 08Dec2008:10:28:57
  Job 1919 (testPgm) is Finished:  Submitted: 08Dec2008:10:28:57, Started: 08Dec2008:10:28:57 on Host host1, Ended: 08Dec2008:10:28:57
  Job information in directory U:\pp\GridSub\GridWork\user1\SASGSUB-2008-11-24_13.17.17.327_testPgm is invalid.
  Job 1925 (testPgm) is Submitted: Submitted: 08Dec2008:10:28:57
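
If a script needs to act on this output, the job lines can be picked out with a regular expression. The Python sketch below assumes that job lines look exactly like the sample above ("Job <ID> (<name>) is <state>:"). Only the Finished and Submitted states appear in the sample, so other states and formats are untested. The names JobStatus and parse_status are hypothetical.

```python
import re
from typing import NamedTuple


class JobStatus(NamedTuple):
    job_id: int
    name: str
    state: str


# Matches lines such as "Job 1917 (testPgm) is Finished: ...".
_JOB_LINE = re.compile(r"^\s*Job (\d+) \((.+?)\) is (\w+):")


def parse_status(output):
    """Extract (job_id, name, state) from -GRIDGETSTATUS style output.

    Lines that do not describe a job, such as the "is invalid" notice,
    are skipped.
    """
    jobs = []
    for line in output.splitlines():
        match = _JOB_LINE.match(line)
        if match:
            jobs.append(JobStatus(int(match[1]), match[2], match[3]))
    return jobs
```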

Ending Jobs Using the SAS Grid Manager Client Utility

If a job that has been submitted to the grid is causing problems or otherwise needs to be terminated, use the SAS Grid Manager Client Utility to end the job. Issue the following command from a command line:
<path/>SASGSUB -GRIDKILLJOB [job-ID | ALL]
-GRIDKILLJOB specifies the ID of the job that you want to end, or ALL to end all jobs submitted by your user ID. For a complete list of options, see SASGSUB Syntax: Ending a Job.

Retrieving Job Output Using the SAS Grid Manager Client Utility

After a submitted job is complete, use the SAS Grid Manager Client Utility to retrieve the output produced by the job. Issue the following command from a command line:
<path/>SASGSUB -GRIDGETRESULTS [job-ID | ALL] -GRIDRESULTSDIR directory
-GRIDGETRESULTS specifies the ID of the job whose results you want to retrieve, or you can specify ALL to retrieve the results from all jobs submitted by your user ID.
-GRIDRESULTSDIR specifies the directory to which the job's results are moved. When the results are retrieved, they are removed from the GRIDWORK directory, which keeps this directory from filling up with completed jobs. If you do not specify this parameter, the results are copied to a job subdirectory in the current directory.
A file named job.info is created along with the job output. This file contains information about the execution of the job, including the submit time, start time, end time, the machine on which the job ran, the job ID, and the return code from the SAS program.
The following is an example of the output produced by the SASGSUB -GRIDGETRESULTS command.
Output Produced by SASGSUB -GRIDGETRESULTS Command
Current Job Information
  Job 1917 (testPgm) is Finished:  Submitted: 08Dec2008:10:53:33, Started: 08Dec2008:10:53:33 on Host host1, Ended: 08Dec2008:10:53:33
    Moved job information to .\SASGSUB-2008-11-21_21.52.57.130_testPgm

  Job 1918 (testPgm) is Finished:  Submitted: 08Dec2008:10:53:33, Started: 08Dec2008:10:53:33 on Host host1, Ended: 08Dec2008:10:53:33
    Moved job information to .\SASGSUB-2008-11-24_13.13.39.167_testPgm

  Job 1919 (testPgm) is Finished:  Submitted: 08Dec2008:10:53:34, Started: 08Dec2008:10:53:34 on Host host1, Ended: 08Dec2008:10:53:34
    Moved job information to .\SASGSUB-2008-11-24_13.16.06.060_testPgm

  Job 1925 (testPgm) is Submitted: Submitted: 08Dec2008:10:53:34
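
To find out where each job's results ended up, a script can pair every "Moved job information to" line with the job line that precedes it. The Python sketch below assumes the layout shown in the sample above. The function name moved_directories is hypothetical.

```python
import re

_JOB = re.compile(r"^\s*Job (\d+) \(")
_MOVED = re.compile(r"^\s*Moved job information to (.+?)\s*$")


def moved_directories(output):
    """Map each job ID to the directory that its results were moved to.

    Jobs without a "Moved job information to" line (for example, jobs
    that are still only Submitted) are left out of the result.
    """
    moved = {}
    current = None
    for line in output.splitlines():
        job = _JOB.match(line)
        if job:
            current = int(job[1])
            continue
        target = _MOVED.match(line)
        if target and current is not None:
            moved[current] = target[1]
    return moved
```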

Retrieving a SAS Grid Manager Client Utility Log

After a submitted job is complete, you can find the SAS program log file for the job in this location: GRIDWORK/user-id/SASGSUB-YYYY-MM-DD_HH.MM.SS.mmm_job-name/program-name.log
The SAS Grid Manager Client Utility uses the standard SAS logging facility. Output from the SAS Grid Manager Client Utility is directed to the console unless you use the SAS logging facility to create a log. See the -LOGCONFIGLOC option in SASGSUB Syntax: Submitting a SAS Program in Batch Mode for a list of the supported logging keys.
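
Because job directory names begin with a timestamp, the SAS program logs for a user can be located with a simple glob. The Python sketch below assumes that job directories start with "SASGSUB-" as in the sample output in this section. The function name find_program_logs is hypothetical.

```python
from pathlib import Path


def find_program_logs(gridwork, user_id):
    """Return the .log files in a user's job directories, sorted by path.

    Assumes the layout GRIDWORK/<user id>/SASGSUB-<timestamp>_<job>/<program>.log.
    Sorting by path puts older job directories first because the
    directory name starts with the submission timestamp.
    """
    user_dir = Path(gridwork) / user_id
    return sorted(user_dir.glob("SASGSUB-*/*.log"))
```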

Monitoring Batch Processing Using the SAS Grid Manager Client Utility

When you use SASGSUB to submit a program or a command to the grid in batch mode, you can use the -GRIDWATCHOUTPUT argument to interactively monitor the processing on the grid. The option specifies that the output of what was submitted by the SASGSUB command is displayed on your machine. If you use this argument when submitting a SAS program using -GRIDSUBMITPGM, the SAS log and output are displayed. If you use the argument when submitting a command using -GRIDRUNCMD, the command’s standard output and standard error are displayed. While the output is being displayed, anything that you enter at the command prompt does not affect the processing on the grid.
If you terminate the SASGSUB session while in interactive monitoring mode, the batch job continues to run and does not terminate.

Using a Grid without a Shared Directory

If your grid configuration does not permit a directory structure to be shared between the grid client machines and the grid nodes, you can specify that the grid job move files into the grid before processing and move files out of the grid when the job is complete. The file movement (called file staging) is performed by the grid job using a remote copy program such as rcp, scp, or lsrcp. When using file staging, files are moved into and out of the grid using the GRIDWORK directory. The SAS Grid Manager Client Utility passes information to the grid that indicates which files need to be sent to the grid and where the files are located. After the grid processes the job, the results are copied back to the GRIDWORK directory. If the user is offline, the results are held in the shared file system until they are retrieved.
During the installation process, the SAS Deployment Wizard enables you to specify whether you use a shared directory or stage files. If you specify that you are staging files, you must also specify the staging command that you want to use to move the files (rcp, lsrcp, scp, pscp, or smbclient). If you are not using the current host to stage files to and from the grid, you can also specify the host to use.
To submit jobs to a grid without a shared file system, follow these steps:
  1. Use the GRIDSTAGECMD parameter on the SASGSUB command to specify the transfer method to use for moving the files from the staging directory to the grid.
  2. If the machine that stages the files is not the current host, use the GRIDSTAGEHOST parameter on the SASGSUB command to specify the host that is used to stage the files. For example, use this parameter if you are using a laptop to submit jobs to the grid and then disconnecting or shutting down the laptop before the jobs are completed or submitted. The laptop must have a GRIDWORK directory on a file server that is always available to the grid. Use the GRIDSTAGEHOST parameter to specify the file server host name.
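
The two steps above can be sketched as a small Python helper that produces the staging arguments for a SASGSUB command line. The function name staging_options is hypothetical, and the set of accepted transfer programs is taken from the list given for the SAS Deployment Wizard in this section. The exact value syntax that GRIDSTAGECMD accepts is an assumption, so check the SASGSUB syntax reference before relying on it.

```python
# Transfer programs listed for the SAS Deployment Wizard in this section.
STAGE_COMMANDS = {"rcp", "lsrcp", "scp", "pscp", "smbclient"}


def staging_options(stage_cmd, stage_host=None):
    """Return SASGSUB arguments that request file staging.

    Hypothetical helper; the value format for GRIDSTAGECMD is assumed.
    """
    if stage_cmd not in STAGE_COMMANDS:
        raise ValueError(f"unsupported staging command: {stage_cmd!r}")
    args = ["-GRIDSTAGECMD", stage_cmd]
    if stage_host:
        # Needed only when another machine, such as a file server,
        # stages the files instead of the current host.
        args += ["-GRIDSTAGEHOST", stage_host]
    return args
```

For the laptop scenario in step 2, staging_options("scp", "fileserver1") returns both parameters, where fileserver1 is a placeholder host name.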