Ganglia — Accessing Raw Data

Preparing Ganglia Data for SAS IT Resource Management

Ganglia is a scalable distributed monitoring tool for high-performance computing systems such as clusters and grids. It enables the user to remotely view live or historical statistics (such as CPU load averages or network utilization) for all monitored machines.
Time series metrics from Ganglia are stored in a round-robin database (RRD) using the RRDtool. The SAS IT Resource Management adapter for Ganglia reads the RRD that was created using Ganglia.
Note: Before running the Adapter Setup wizard, install RRDtool. RRDtool, the round-robin database tool, is a free, open-source package that is available for download from its author, Tobias Oetiker.
Ganglia RRD data can be gathered at any interval (step), for any metrics, and with any consolidation function (CF). Because SAS IT Resource Management has its own aggregation process, the data read from the round-robin databases should be detail data, not consolidated data. To store detail data using the RRDtool, the RRDs should store the data with the CF set to Average. In addition, the average should be generated from one step of data, where a step is the interval of time that was specified when the RRD was generated. If the data in the RRD is consolidated, then SAS IT Resource Management requires that the CF staging parameter be set to match that consolidation. If this staging parameter is left blank, then the data for all of the CFs in the RRD is collected.
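As a rough illustration of the detail-data requirement above, the following Python sketch assembles an rrdtool create command whose only archive uses the AVERAGE CF with one step per row. The file name, the data source name (sum), the step, the heartbeat, and the row count are assumptions chosen for the example and are not values required by the adapter. The command is only built, not executed, and Ganglia normally creates its own RRDs, so treat this as a way to read an RRA definition, not as a required setup step.

```python
def build_create_command(rrd_path, step_seconds=15, rows=5760):
    """Return an argument list for `rrdtool create` that keeps detail data.

    The single RRA uses CF=AVERAGE with steps=1, so each stored row is the
    average over exactly one step (that is, unconsolidated detail data).
    """
    heartbeat = step_seconds * 8  # hypothetical heartbeat, chosen for illustration
    return [
        "rrdtool", "create", rrd_path,
        "--step", str(step_seconds),
        f"DS:sum:GAUGE:{heartbeat}:U:U",  # one numeric data source, no min/max limits
        f"RRA:AVERAGE:0.5:1:{rows}",      # xff=0.5, steps per row=1, rows kept
    ]


if __name__ == "__main__":
    print(" ".join(build_create_command("load_one.rrd")))
```

An RRA with a steps value greater than 1 (for example, RRA:AVERAGE:0.5:24:...) would hold consolidated data, which is the case where the CF staging parameter matters.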
The adapter can read a single round-robin database, or it can read all round-robin databases in a directory. If multiple round-robin databases are read, the data is combined into a single staging table.
Because a round-robin database can store only numeric data, users of the RRDtool generally store identifying character data in the name or path of the round-robin database. The Ganglia adapter saves the filename of each round-robin database in a field called filename. This field supplies the identifying information that is used to create Grid performance metrics and any relevant computed columns.
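To make the idea of identifying data carried in the path more concrete, here is a minimal Python sketch that splits an RRD path into its parts. It assumes a directory layout of <root>/<cluster>/<host>/<metric>.rrd, and the function name identify_rrd and the example path are hypothetical. It does not reproduce the adapter's own logic for the filename field.

```python
from pathlib import PurePosixPath


def identify_rrd(rrd_path):
    """Split an RRD path into cluster, host, and metric names.

    Assumes the layout <root>/<cluster>/<host>/<metric>.rrd (an assumption
    made for this example, not something the adapter is known to require).
    """
    path = PurePosixPath(rrd_path)
    if path.suffix != ".rrd" or len(path.parts) < 3:
        raise ValueError(f"unexpected RRD path: {rrd_path}")
    return {
        "cluster": path.parts[-3],   # third component from the end
        "host": path.parts[-2],      # directory that holds the file
        "metric": path.stem,         # file name without the .rrd suffix
        "filename": path.name,       # file name with the suffix
    }


if __name__ == "__main__":
    print(identify_rrd("/var/lib/ganglia/rrds/grid01/node17/load_one.rrd"))
```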
Note: You can backload data from the Ganglia adapter. To do so, use the rrdtool fetch --start option and rrdtool fetch --end option staging parameters to specify the date range of data to be read. For information about backloading, see How to Backload Raw Data.
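The date range in the note above corresponds to the --start and --end options of the rrdtool fetch command. The following Python sketch shows how such a command line might be assembled. The function name, the file name, and the timestamps are hypothetical, and the staging code that the adapter actually generates may build the command differently.

```python
def build_fetch_command(rrd_path, start, end, cf="AVERAGE"):
    """Return an argument list for `rrdtool fetch` limited to a time range.

    `start` and `end` are passed through unchanged, so they can be epoch
    seconds or any time specification that rrdtool accepts (for example
    "end-7d" or "now").
    """
    return [
        "rrdtool", "fetch", rrd_path, cf,
        "--start", str(start),
        "--end", str(end),
    ]


if __name__ == "__main__":
    # Hypothetical one-day backload window expressed in epoch seconds.
    print(" ".join(build_fetch_command("load_one.rrd", 1700000000, 1700086400)))
```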

Preparing the Ganglia Adapter for Staging

To read the raw data from the RRDtool database, Perl scripts are run as part of the staging process. This means that Perl must be installed on the machine that the SAS program runs on (that is, the SAS Workspace Server, the SAS batch server, or interactive SAS).
The Perl scripts that are used by the staging code are generated as part of the code generation of the staging job. This code can be viewed and modified as necessary in the Code tab of the job.
To run these scripts in batch mode, set the XCMD option for the Batch server. To do so, change set USERMODS_OPTIONS= to set USERMODS_OPTIONS=XCMD.
To run these scripts from within the SAS Workspace Server, the SAS option XCMD must be turned on. By default, this option is off in the SAS Workspace Server. For instructions on how to turn on the XCMD option, see How to Turn On the XCMD Option.

How to Turn On the XCMD Option

  1. Launch SAS Management Console. Log on as an administrator.
  2. Expand the Server Manager in the left panel.
  3. Expand the SAS Application Server that was specified when you configured SAS IT Resource Management. (This server is typically named SASApp or SASITRM.) Then expand <SAS Application Server> - Logical Workspace Server.
  4. Right-click the entry for the <SAS Application Server> - Logical Workspace Server and select Properties. The Workspace Server Properties dialog box appears.
  5. Select the Options tab and click Advanced Options.
  6. Select the Launch Properties tab to open the following dialog box.
    Launch Properties Tab of the Advanced Options Dialog Box
  7. Select the Allow XCMD check box.
  8. Click OK to close all the open dialog boxes.
  9. Stop and then restart your Object Spawner service.
    Windows Specifics: To stop your Object Spawner service, select Start > All Programs > SAS > SAS Configuration > <configuration-name> > Object Spawner > Stop. To restart your Object Spawner service, select Start > All Programs > SAS > SAS Configuration > <configuration-name> > Object Spawner > Start.
    UNIX Specifics: To stop and then restart your Object Spawner service, from the command line, change directories to SAS-config-dir/Lev1/SASMain/ObjectSpawner. Stop the Object Spawner by issuing this command: $ ./ObjectSpawner.sh stop. When you receive a confirmation that the Object Spawner has stopped, start it again by issuing this command: $ ./ObjectSpawner.sh start. You should receive a confirmation that the Object Spawner has started.

Notes about the SSH Host Command

The SSH host command is an executable object that is available as part of the functionality of the Ganglia, RRDtool, and SNMP adapters. This command specifies the RSH or SSH version of the command and the name of the host on which to run the rrdtool command. It enables the rrdtool command to read data from round-robin database files that are located on other hosts. Entering the SSH version of the command triggers this SSH functionality. This is the format of the command: ssh user@hostname. It is entered in the rsh/ssh host command field on the Staging Parameters tab of the Properties dialog box for the adapter’s staging transformation.
If you enter a value in the rsh/ssh host command field that begins with SSH, then SAS IT Resource Management assumes that this job is running in a UNIX environment. The Perl script is changed so that it uses the UNIX find command to get the list of round-robin database files from the other hosts.
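The generated Perl script is not shown here, but the idea of listing remote round-robin database files through SSH and find can be sketched in Python as follows. The function name, the find arguments, and the example directory are assumptions made for illustration. They are not taken from the generated script.

```python
import shlex


def build_remote_find(host_command, raw_data_dir):
    """Return an argument list that lists *.rrd files on the remote host.

    `host_command` is the value of the rsh/ssh host command field,
    for example "ssh user@hostname".
    """
    parts = shlex.split(host_command)
    if not parts or parts[0].lower() != "ssh":
        raise ValueError("host command must begin with ssh")
    # The remote shell receives one string, so quote the directory for it.
    remote = f"find {shlex.quote(raw_data_dir)} -type f -name '*.rrd'"
    return parts + [remote]


if __name__ == "__main__":
    print(build_remote_find("ssh user@hostname", "/var/lib/ganglia/rrds"))
```

The argument list can be passed to subprocess.run without a local shell, which leaves only the remote shell to interpret the quoted find expression.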
To enable this functionality, set up SSH key-based authentication by generating a key with ssh-keygen. Then copy the file that stores the public key to the host where the RRDtool executable and raw data are located. After the key file is copied, make sure that you can access the target host by issuing the SSH command from the source host. If you have alias names for your target host, make sure that you execute the SSH command manually from the source host with each alias. Doing so generates the host/RSA key entry for that alias hostname, which avoids a warning or error message during execution of the staging job. The following message is an example of the warning or error message:
Host key verification failed
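One way to check in advance whether a host name or alias would trigger this message is to run a non-interactive SSH test from the source host. The Python sketch below only assembles such a command and recognizes the message in the resulting error text. The function names and the user and host values are hypothetical. BatchMode and StrictHostKeyChecking are standard OpenSSH client options.

```python
def build_ssh_check(user, host):
    """Return an ssh command that fails instead of prompting.

    BatchMode=yes disables password and passphrase prompts, and
    StrictHostKeyChecking=yes refuses hosts whose key is not yet known.
    """
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", "StrictHostKeyChecking=yes",
        f"{user}@{host}",
        "true",  # harmless remote command; only the exit status matters
    ]


def host_key_problem(stderr_text):
    """Report whether ssh error output contains the host key message."""
    return "Host key verification failed" in stderr_text


if __name__ == "__main__":
    print(" ".join(build_ssh_check("user", "hostname")))
```

If the check fails for an alias, connecting once manually with that alias, as described above, adds the missing host key entry.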
To use this SSH method, go to the Staging Parameters tab of the Properties dialog box for the adapter’s staging transformation. Then fill in values for the following options:
  • Raw data input directory: Enter the location of the raw data on the target host.
  • rsh/ssh host command: Start this command with SSH (for example: ssh user@hostname).
  • For the SNMP adapter only, specify the Use snmpwalk to gather character data parameter. To do so, select Yes to use snmpwalk. A script is generated and executes when the staging code runs. This script executes the snmpwalk command to gather the RRDtool data for specified character metrics and adds that data directly to the staged tables.
    The default value for this option is No.
  • For the SNMP adapter, specify the Choose access command parameter. To do so, enter RRDTool.
  • rrdtool executable: Enter the location of the RRDtool executable on the target host machine.