Using the Grid Manager Plug-in for SAS Environment Manager

Overview

The Grid Manager plug-in for SAS Environment Manager enables you to monitor a SAS grid cluster. The plug-in provides some of the same functions as Platform RTM, so you can monitor your grid using the same application that you use to monitor your SAS environment. This plug-in enables you to manage grid resources by providing dynamic information and management functions for the following:
  • The defined LSF cluster
  • Hosts that make up the grid
  • Queues that control grid processing
  • Jobs submitted to the grid cluster
  • High availability applications defined for the grid
  • Audit log for any management actions on the plug-in
The management functions provided by the plug-in include the following:
Jobs
Kill (stop execution), suspend (pause execution), and resume (restart a suspended job).
Hosts
Close (prevents jobs from processing on the machine), open (enables jobs to be processed on the machine).
Queue
Close (the queue cannot accept jobs), open (accepts jobs), inactivate (the queue can accept jobs, but cannot process them), activate (reactivates an inactive queue).
High availability applications
Migrate (start the application on the failover host), stop, start, restart (start the application on the original host). Management of high availability applications is available only for LSF administrators.
You must install and configure Platform Web Services in order to use the Grid Manager plug-in for SAS Environment Manager. Use the SAS Deployment Wizard to install Platform Web Services. See Installing and Configuring SAS Environment Manager in a Grid Environment for more information.
Platform Web Services must run under the account of the LSF Administrator user. To change the account that the service runs under on Windows, follow these steps:
  1. In the Windows Services window, right-click the entry for the service SASServer14_1 and select Stop from the context menu.
  2. In the Properties dialog box for the SASServer14_1 service, click the Log On tab.
  3. Select the This account radio button and specify the user ID and password of the LSF Administrator user. Click OK to close the dialog box.
  4. Right-click the entry for the service and select Start from the context menu.

Using the Grid Manager Plug-in as the LSF Administrator

These Grid Manager functions are available only for the LSF Administrator user:
  • High Availability tab
  • Audit Log tab
  • Ability to view job information for all users
Users in SAS Environment Manager are mapped to users created in SAS metadata. In order for the LSF Administrator user identity to be available as a valid user in SAS Environment Manager, you must perform one of the following actions:
  • If the LSF Administrator identity is defined in metadata but not in SAS Environment Manager, then add the identity to the group SAS_EV_Super_User in SAS metadata and synchronize the users in SAS Environment Manager (select Synchronize Users under the Manage tab).
  • If the LSF Administrator identity is defined in metadata and in SAS Environment Manager, select Manage, then List Users, select the user identity, and then click Add to List to add the identity to the Super User Role.
  • If identities have been defined in the SAS_EV_Super_User group, then you can specify one of those identities as an LSF Administrator by adding the identity to the LSF configuration file. You do not have to synchronize the SAS metadata and SAS Environment Manager user information if you use this approach.
Adding the LSF Administrator user to the SAS_EV_Super_User group enables the user to sign on to SAS Environment Manager using the LSF Administrator’s credentials and assigns the user to the Super User role in SAS Environment Manager. See SAS Environment Manager: User's Guide for information about user management in SAS Environment Manager.

Viewing Grid Information

Start the SAS Environment Manager plug-in by selecting Analyze, then Grid Manager. The plug-in initially displays information about the defined LSF clusters. The management functions of the plug-in are identified by these tabs:
  • Cluster
  • Hosts
  • Queues
  • Jobs
  • Host Dashboard
  • High Availability
  • Audit Logs

Viewing LSF Cluster Information

The initial view of the SAS Environment Manager Grid Manager plug-in is the Cluster tab, which displays the Cluster Summary table. This table contains the status of the LSF cluster. A cluster is a group of hosts, organized and managed by an LSF administrator. Clusters are the basis for job sharing in Platform LSF.

Managing Hosts

The SAS Environment Manager Hosts tab enables you to close or reopen hosts on the grid, and to view information about each host. A closed host cannot process any jobs that are sent to the grid. Closing a host is useful when you want to remove the host from the grid for maintenance. You can also close the grid control server to prevent it from receiving work.
To perform an action on a host in SAS Environment Manager, follow these steps:
  1. Select the Hosts tab to display the Host Summary table.
  2. In the table, select the check box for the host that you want to close or open.
  3. In the Choose an action menu, select the action that you want to perform and click Submit.
Specify a value in the Auto Refresh field to set how often the information in the table is updated. By default, auto refresh is disabled. Click the refresh icon to manually refresh the table.
Click on an entry in the Host Name column in the Host Summary table to view the Host Details table, which contains detailed information about the selected machine.
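For administrators who also work at the LSF command line, the close and open actions on the Hosts tab have counterparts in the LSF `badmin` command (`badmin hclose` and `badmin hopen`). The following Python sketch only builds the command line for a chosen action and does not run it. The function name `host_action_command` and the host name `grid-node01` are hypothetical. The sketch does not describe how the plug-in itself performs the action, because the plug-in works through Platform Web Services.

```python
# Hypothetical helper: map Hosts tab actions to the matching LSF badmin subcommands.
HOST_ACTIONS = {
    "close": "hclose",  # the host stops processing jobs sent to the grid
    "open": "hopen",    # the host can process jobs again
}


def host_action_command(action, host_name):
    """Return the argument list for the LSF command that matches a host action."""
    try:
        subcommand = HOST_ACTIONS[action.lower()]
    except KeyError:
        raise ValueError(f"unsupported host action: {action!r}") from None
    return ["badmin", subcommand, host_name]


print(host_action_command("close", "grid-node01"))
# prints ['badmin', 'hclose', 'grid-node01']
```

The returned list could be passed to `subprocess.run` by an account that has LSF administrator rights, assuming the LSF commands are on that account's PATH.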

Managing Queues

You can use the Queues tab on the Grid Manager plug-in to close, open, activate, and inactivate queues. You can also view detailed information about each queue. A closed queue cannot accept any jobs that are sent to the grid. An inactive queue can still accept jobs, but none of the jobs in the queue can be processed. Closing a queue is useful when you need to make configuration changes to the queue.
To manage a queue in SAS Environment Manager, follow these steps:
  1. Select the Queues tab to display the Queue Summary table.
  2. In the table, select the check boxes for the queues on which you want to perform an action.
  3. In the Choose an action menu, select the action that you want to perform and click Submit. Choices for actions include the following:
    Open
    Opens a closed queue. The queue can accept new jobs and processes the jobs in the queue.
    Close
    Closes a queue. A closed queue cannot accept any jobs that are sent to the grid. Closing a queue is useful when you need to make configuration changes.
    Activate
    Activates an inactive queue. The queue can accept jobs, and the jobs in the queue are processed.
    Inactivate
    Makes a queue inactive. An inactive queue can still accept jobs, but none of the jobs in the queue can be processed.
Specify a value in the Auto Refresh field to set how often the information in the table is updated. By default, auto refresh is disabled. Click the refresh icon to manually refresh the table.
Click on an entry in the Queue Name column in the Queue Summary table to view the Queue Details tables, which contain detailed information about the selected queue.
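The difference between a closed queue and an inactive queue is easier to see as two independent settings. The following Python sketch models them that way, which matches how LSF reports queue status (for example, Open:Active or Open:Inact). The class name `QueueState` and its members are hypothetical and are used only for illustration. They are not part of the plug-in.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Illustrative model of a queue: open/closed and active/inactive are separate."""
    is_open: bool = True     # open queues accept submitted jobs
    is_active: bool = True   # active queues process the jobs that they hold

    def apply(self, action):
        """Apply one of the four Queues tab actions and return the updated state."""
        action = action.lower()
        if action == "open":
            self.is_open = True
        elif action == "close":
            self.is_open = False
        elif action == "activate":
            self.is_active = True
        elif action == "inactivate":
            self.is_active = False
        else:
            raise ValueError(f"unsupported queue action: {action!r}")
        return self

    @property
    def accepts_jobs(self):
        return self.is_open

    @property
    def processes_jobs(self):
        return self.is_active


queue = QueueState().apply("inactivate")
print(queue.accepts_jobs, queue.processes_jobs)
# prints True False
```

In this model, inactivating a queue leaves it able to accept jobs while nothing in it is processed, and closing a queue blocks new submissions without changing whether it is active.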

Managing Jobs

You can use the Jobs tab on the Grid Manager plug-in to terminate or suspend running jobs, terminate or resume suspended jobs, or requeue both running and suspended jobs. You can also view detailed information and the history for each job.
To manage jobs in SAS Environment Manager, follow these steps:
  1. On the Jobs tab, select the check box for the job on which you want to perform an action.
  2. In the Choose an action menu, select the action that you want to perform and click Submit. Choices for actions are:
    Terminate
    Stop execution of the selected job. If you log on to SAS Environment Manager using a user ID that has been identified as an LSF Administrator ID, you can terminate any jobs that have been submitted to the grid. Otherwise, you can terminate only your own jobs.
    Suspend
    Pause the execution of the selected job.
    Resume
    Resume processing of a suspended job.
Select a value in the Auto Refresh field to set how often the information in the table is updated. By default, auto refresh is disabled. Click the refresh icon to manually refresh the table.
To view detailed information about a job, click the entry in the Job ID column of the Job Summary table. The Job Property Details table displays complete information about the job. Click the job history icon to display the history of the job.
If you log on to SAS Environment Manager using a user ID that is defined as an LSF Administrator ID, you can terminate any job that has been submitted to the grid. Other users can terminate only their own jobs. If you are terminating a job on Windows, be sure to match the domain name exactly (including case). See Using the Grid Manager Plug-in as the LSF Administrator for more information.
Note: Job names that use double-byte characters are not supported by the Grid Manager plug-in.
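The termination rule described above can be sketched in a few lines of Python. This is an illustration of the rule only, not the plug-in's implementation. The function name `can_terminate`, the `lsf_admins` set, and the sample user IDs are hypothetical. The exact, case-sensitive comparison reflects the guidance to match the Windows domain name exactly.

```python
def can_terminate(requesting_user, job_owner, lsf_admins):
    """Return True if the requesting user may terminate the job.

    LSF Administrator IDs may terminate any job. Everyone else may
    terminate only jobs whose owner ID matches their own ID exactly.
    """
    if requesting_user in lsf_admins:
        return True
    # Exact comparison: "CORP\\alice" and "corp\\alice" are different IDs here.
    return requesting_user == job_owner


admins = {"CORP\\lsfadmin"}
print(can_terminate("CORP\\lsfadmin", "CORP\\bob", admins))  # prints True
print(can_terminate("CORP\\alice", "CORP\\alice", admins))   # prints True
print(can_terminate("corp\\alice", "CORP\\alice", admins))   # prints False
```

The last call shows why a domain name that differs only in case can prevent a user from terminating what appears to be that user's own job.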

Viewing Grid Host Information

The SAS Environment Manager Host Dashboard tab provides a graphical view of all of the machines on the grid. Each machine is displayed with an icon that indicates its status:
OK
The host is online and accepting jobs.
Unavailable
Either the host is down or Platform LSF is unreachable.
Unreachable
Platform LSF is running on the host, but sbatchd is unreachable.
Unlicensed
The Platform LSF license has expired on the machine.
Closed
The host is active and available, but has been closed and is not accepting jobs.
Other
The status of the host is a state not covered by one of the other defined states.
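The dashboard states closely resemble the host status values that the LSF `bhosts` command reports (`ok`, `unavail`, `unreach`, `unlicensed`, and `closed`). The following Python sketch shows one way that such values could be grouped into the six dashboard states. The mapping is an assumption made for illustration. The function name `dashboard_label` is hypothetical, and the sketch does not describe how the plug-in derives the icons.

```python
# Assumed correspondence between LSF host status values and dashboard states.
DASHBOARD_LABELS = {
    "ok": "OK",
    "unavail": "Unavailable",
    "unreach": "Unreachable",
    "unlicensed": "Unlicensed",
    "closed": "Closed",
}


def dashboard_label(lsf_status):
    """Return the dashboard state for an LSF host status string."""
    status = lsf_status.strip().lower()
    # Detailed closed states (for example, closed_Adm) are grouped under Closed.
    if status.startswith("closed"):
        return "Closed"
    # Any status that is not recognized falls into the Other state.
    return DASHBOARD_LABELS.get(status, "Other")


print(dashboard_label("unreach"))     # prints Unreachable
print(dashboard_label("closed_Adm"))  # prints Closed
```

Any status that the table does not list falls through to Other, which mirrors the catch-all state described above.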

Managing High Availability Applications

The SAS Environment Manager High Availability tab enables you to control and view information about high availability applications running on the grid. High availability applications are configured through Platform RTM, and are defined to have a primary host and a failover host. If the primary host fails, the application automatically starts on the failover host. The tab enables you to view the status of high availability applications running on a selected host, to stop applications, and to start applications on either the primary host or the failover host.
Note: The High Availability tab and its contents are visible only if you sign on to SAS Environment Manager using the credentials for an LSF Administrator. See Using the Grid Manager Plug-in as the LSF Administrator for more information.
The applications listed in the table in the High Availability tab must be configured using Platform RTM as high availability applications.
To perform an action on a high availability application, select the check box for the application in the table, select the action that you want to perform in the Choose an action menu, and then click Submit. You can perform these actions:
Migrate
Start the selected application on the failover host. Because migrating an application might take a long time, you should migrate only a few applications at one time in order to avoid time-out errors. This option is useful if you are performing maintenance on the primary host.
Stop
Stop the selected application.
Start
Start the selected application based on the application definition.
Restart
Restart the application on the original host if that host is running.
Specify a value in the Auto Refresh field to set how often the information in the table is updated. By default, auto refresh is disabled. Click the refresh icon to manually refresh the table.

Viewing Audit Log Records

The Audit Logs tab enables you to view information about records in the LSF audit log database. Select the tab to view the Audit Log Summary table, which contains a list of the audit log records.
Specify a value in the Auto Refresh field to set how often the information in the table is updated. By default, auto refresh is disabled. Click the refresh icon to manually refresh the table.
Click on an entry in the Audit Log ID column to view the Details table for the selected audit log record.
By default, audit logs are purged after 30 days. To change the amount of time that audit logs are kept, change the value of the AUDIT_LOG_KEEP parameter in the Platform Web Services database.
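To estimate which audit log records fall outside the retention period, the default of 30 days can be applied to a record's timestamp. The following Python sketch assumes that the value of AUDIT_LOG_KEEP is a number of days, which the text above implies but does not state. The function name `is_due_for_purge` and the sample dates are hypothetical, and the exact moment at which the service purges a record might differ.

```python
from datetime import datetime, timedelta

DEFAULT_KEEP_DAYS = 30  # default retention period stated in this section


def is_due_for_purge(record_time, now, keep_days=DEFAULT_KEEP_DAYS):
    """Return True if the record is older than the retention period."""
    return now - record_time > timedelta(days=keep_days)


now = datetime(2024, 2, 15)
print(is_due_for_purge(datetime(2024, 1, 1), now))                # prints True
print(is_due_for_purge(datetime(2024, 2, 1), now))                # prints False
print(is_due_for_purge(datetime(2024, 1, 1), now, keep_days=90))  # prints False
```

The third call shows the effect of a longer retention value: the same record that is due for purging after 30 days is still kept when the period is 90 days.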
Note: The Audit Logs tab and its contents are visible only if you sign on to SAS Environment Manager using the credentials for an LSF Administrator. See Using the Grid Manager Plug-in as the LSF Administrator for more information.