RRDtool for Open-Source Management Tools

Overview

Many system administrators have begun to use open-source system management tools. Starting with SAS IT Resource Management 3.4, the solution provides documentation to gather and analyze measurements for these tools. Open-source system management tools reduce costs, increase flexibility, and provide quick and easy deployment. However, these tools do not offer the robust support that proprietary system management tools offer. In addition, many open-source tools are designed to run on Linux exclusively. Also, system administrators who work with open-source system management tools must have a good knowledge of scripting languages.
This document provides information about how SAS IT Resource Management 3.4 and later can work with several open-source system management tools. The SAS IT Resource Management RRDtool adapter can be used for processing the RRDs that contain the data that is collected from these system management tools. Perl scripts are used to update RRDs with the collected system management data. The primary focus is on the netstat and vmstat tools. However, the iostat, nmon, Nagios, Xymon, and Zenoss tools are also briefly discussed.

Preparing the Servers

Preparing the SAS IT Resource Management Server

The SAS IT Resource Management server requires Perl to be installed and the RRDtool to be accessible. You can either install the RRDtool adapter directly or use the RSH/SSH host command to connect to an RRDtool remotely. You must also enable XCMD both in the SAS Workspace Server and in batch mode. For more information about the RRDtool and the RSH/SSH command, see RRDtool — Accessing Raw Data and RRDtool Staging and Duplicate-Data Checking Parameters.

Preparing the Data Collection Server

The data collection server is the machine that is used to collect data into an RRD from the open-source system management tools. This machine does not have to be the SAS IT Resource Management server. The data collection server requires the installation of both Perl and RRDtool. (However, a remote installation of RRDtool can be used.)
Perl scripts that collect the data and write it to RRDs are available. The RRDs can be made available for processing on the SAS IT Resource Management server through a shared network-based file system. You can also copy the RRDs with the file transfer protocol (FTP) or the secure copy protocol (SCP). Sample Perl scripts are available for both netstat and vmstat data sources. The location of the scripts depends on the operating environment:
Windows Specifics: SASHome\SASFoundation\9.4\itmsmvadata\sasmisc
UNIX Specifics: SASHOME/SASFoundation/9.4/misc
z/OS Specifics: &prefix.ITRM.CPMISC
Note: These scripts are intended for use on a Linux server (RHEL 6.1) Kernel 2.6. The scripts use the RRD Perl module that is available in Perl-RRDtool, which is a component of the RRDtool software. In order to use the RRD module in the Perl scripts, locate its path in the report that is produced when the RRDtool is installed. Then add that path as an extra directory to the Perl search path at the top of the Perl scripts. (The paths in the supplied scripts might need to be modified based on the RRDtool installation.)

Collecting the Performance Data

Performance Data Sources

For correct processing by SAS IT Resource Management, measurements that are collected from open-source system management tools must be written to an RRD file. For netstat and vmstat data sources, sample Perl scripts are available. These scripts capture measurements and load them into an RRD file. The scripts can be modified to capture additional performance metrics. (The scripts are examples and can be used as a starting point when working with other performance data sources such as iostat.) Some open-source system management tools, such as nmon and Nagios, have free post-processing tools that create RRDs. Other open-source tools, such as Xymon, create RRDs for their own data as well as for netstat and vmstat data sources.

Working with the netstat Tool

The netstat command-line system monitor tool can be used to display network statistics. This tool is available on operating systems that are based on Windows NT and most types of UNIX operating systems. Information about network connections, routing tables, interface statistics, masquerade connections, and multicast memberships is available. The sample Perl script that is provided, Network.pl, displays network statistics using the netstat -s command. The script captures the network TCP statistics and loads them to an RRD file. If that file does not exist, the script creates it. You must update the rrdloc value in the script to the path where your RRD files are located.
TCP Network Statistics Displayed Using the netstat -s Command
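The capture step that the sample script performs can be illustrated with a short sketch. The sketch is written in Python rather than Perl, it is not the supplied Network.pl script, and it parses a made-up sample instead of running netstat. The sample text, the function name parse_tcp_section, and the way metric names are normalized are all assumptions for illustration.

```python
import re

# Sample text in the general shape that `netstat -s` prints on many Linux
# systems. The numbers are invented for illustration.
SAMPLE = """\
Ip:
    1200 total packets received
Tcp:
    35 active connections openings
    12 passive connection openings
    2 failed connection attempts
    4521 segments received
    4390 segments send out
Udp:
    88 packets received
"""

def parse_tcp_section(text):
    """Return {metric_name: value} for the lines under the 'Tcp:' header."""
    stats = {}
    in_tcp = False
    for line in text.splitlines():
        if not line.startswith((" ", "\t")):
            # An unindented line starts a new protocol section.
            in_tcp = line.strip() == "Tcp:"
            continue
        if in_tcp:
            match = re.match(r"\s+(\d+)\s+(.+)", line)
            if match:
                # Normalize the description into an identifier-like name.
                name = re.sub(r"\W+", "_", match.group(2).strip().lower())
                stats[name] = int(match.group(1))
    return stats

tcp = parse_tcp_section(SAMPLE)
print(tcp)
```

RRDtool limits data source names to 19 characters from the set [a-zA-Z0-9_], so names derived this way would need to be shortened before they are used in an RRD.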

Working with the vmstat Tool

The vmstat command-line system monitor tool can be used to display virtual memory statistics. This tool is available on most types of UNIX operating systems. Information about processes, memory, paging, block IO, traps, and CPU activity is available. The sample Perl script that is provided, VmStats.pl, displays various event counters and memory statistics using the vmstat -s command. The script captures the memory statistics and loads them to an RRD file. If that file does not exist, the script creates it. You must update the rrdloc value in the script to the path where your RRD files are located.
Virtual Memory Statistics Displayed Using the vmstat -s Command
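As a rough illustration of the capture-and-load step, the following Python sketch parses memory lines in the shape that vmstat -s prints and builds the argument list for an rrdtool update call. It is not the supplied VmStats.pl script. The sample values, the function names, and the RRD path are hypothetical, and the sketch only builds the command; it does not run rrdtool.

```python
# Sample lines in the shape of `vmstat -s` memory output. Values are invented.
SAMPLE = """\
      8167848 K total memory
      3120456 K used memory
      2954120 K active memory
      1201332 K inactive memory
      1893204 K free memory
"""

def parse_vmstat_s(text):
    """Return {metric_name: kilobytes} for lines of the form '<n> K <label>'."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0].isdigit() and parts[1] == "K":
            # "8167848 K total memory" becomes total_memory -> 8167848
            stats["_".join(parts[2:])] = int(parts[0])
    return stats

def update_args(rrd_path, stats, timestamp="N"):
    """Build an `rrdtool update` argument list that names each DS explicitly."""
    names = sorted(stats)
    template = ":".join(names)
    values = ":".join(str(stats[name]) for name in names)
    return ["rrdtool", "update", rrd_path,
            "--template", template, f"{timestamp}:{values}"]

stats = parse_vmstat_s(SAMPLE)
# Hypothetical path; in the sample scripts the directory comes from rrdloc.
args = update_args("/data/rrd/web01_vmstat.rrd", stats)
print(" ".join(args))
```

Using the --template option ties each value to a data source by name, so the order of the values does not have to match the order in which the data sources were defined. The timestamp N tells rrdtool to use the current time.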

Working with the iostat Tool

The iostat command-line system monitor tool can be used to display operating system storage input and output statistics. This tool is available on most types of UNIX operating systems. Information about CPU utilization, device utilization, and the network file system is available.
Device Utilization Statistics Displayed Using the iostat -x Command
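No sample script is supplied for iostat, so a script for this data source would have to be developed. The following Python sketch shows one possible way to turn the device section of iostat -x output into per-device metrics. The sample text uses an abbreviated, invented set of columns (the real column set varies by iostat version), and the function name is hypothetical.

```python
# Abbreviated sample in the shape of the device section of `iostat -x`.
# Real output has more columns, and the set differs between versions.
SAMPLE = """\
avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           2.10   0.00     0.90     0.30    0.00  96.70

Device:  r/s   w/s   rkB/s   wkB/s  await  %util
sda      1.50  3.25  24.00   96.50  0.80   1.20
sdb      0.00  0.10   0.00    0.40  0.50   0.01
"""

def parse_iostat_x(text):
    """Return {device: {column: value}} using the header line as the key list."""
    rows = {}
    header = None
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            # A blank line ends the current device section.
            header = None
            continue
        if parts[0].rstrip(":") == "Device":
            header = parts[1:]
        elif header and len(parts) == len(header) + 1:
            rows[parts[0]] = dict(zip(header, map(float, parts[1:])))
    return rows

devices = parse_iostat_x(SAMPLE)
print(devices["sda"]["%util"])
```

Reading the column names from the header line, instead of hard-coding positions, keeps the parser tolerant of version differences in the column set.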

Working with the nmon Tool

The nmon system monitor tool can be used to display key performance statistics. It can be operated in an online mode for real-time monitoring or in capture mode for processing at a later time. This tool is a free, downloadable tool that is available for the AIX and Linux operating systems. Information about CPU, memory, disks, adapters, networks, NFS, kernel statistics, file systems, and top processes is available. Workload Manager and Workload Partitions are also available on the AIX operating system. A post-processing tool named nmon2rrd is available at no cost. Nmon2rrd creates an RRD file and generates graphs using RRDtool.

Working with the Nagios Tool

The Nagios Core system monitor tool can be used to display key performance statistics for the entire IT infrastructure. This is a free, downloadable tool available on most Linux operating systems. Information about system metrics, network protocols, applications, services, servers, and network infrastructure is available. A Nagios add-on project named nagiosgraph is available to create an RRD file and generate graphs using RRDtool. Support is provided through Nagios XI, an enterprise-class solution that is built on Nagios Core.

Working with the Xymon Tool

The Xymon system monitor tool can be used to monitor servers, applications, and networks. This free, downloadable tool from SourceForge is available on most types of UNIX operating systems. It collects status information and presents it in a frequently updated web page that displays the status of all the systems. Much of the information is stored in RRDs. Xymon can generate many RRDs, including both vmstat and netstat RRDs. The RRDs can be processed directly by the SAS IT Resource Management RRDtool adapter, which eliminates the need to develop custom Perl scripts. Support is provided by means of mailing lists.

Working with the Zenoss Tool

The Zenoss Core product was developed to eliminate the need for multiple tools to perform availability monitoring, performance monitoring, event management, and more. A Zenoss Enterprise product can provide everything that you might need to establish and maintain awareness of the IT infrastructure.

Processing the Collected Performance Data

Consolidation of the Collected Performance Data

In most cases, open-source system management tools collect performance data from multiple servers. Therefore, it is best to consider a consolidation strategy to simplify the setup of the SAS IT Resource Management RRDtool adapter. For best results, use a shared network-based file system, or copy the RRDs with FTP or SCP.
Tip: Include the host name in the name of the RRDs to help identify the host source. In addition, the RRDs can be placed in a single directory, which enables the adapter to take advantage of directory-based processing of the raw data.

RRDtool Adapter Overview

The RRDtool adapter reads any RRD that was created with the RRDtool. The adapter creates staged table metadata that is based on the contents of the RRDs. In addition to the staged table metadata, the adapter also creates a basic set of Aggregation and Information Map transformations.
Tip: This metadata can be modified as needed to meet your site’s requirements.
An RRD can contain data that is already aggregated. For best results, the data should not be aggregated, so that SAS IT Resource Management can perform its own aggregation. RRDs that are read with the RRDtool adapter should have data that is stored with a consolidation function (CF) of AVERAGE. In addition, the average should be generated based on one step of data. If so, the data is essentially detail data (or data that is not aggregated). However, if the data in the RRD is consolidated, the adapter can still read it. The adapter has a staging parameter that specifies the CF that you want. If this parameter is blank, then data at all consolidation levels in the RRD is collected.
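To make the recommendation concrete, the following Python sketch builds an rrdtool create command whose only archive uses the AVERAGE consolidation function over one step, so each stored row corresponds to a single primary data point. The function name, the 60-second step, the heartbeat of twice the step, the GAUGE data source type, and the row count are assumptions chosen for illustration, not values taken from the sample scripts. The sketch builds the command without running it.

```python
import re

def create_args(rrd_path, ds_names, step=60, rows=1440):
    """Build an `rrdtool create` argument list that keeps unaggregated data."""
    heartbeat = step * 2  # assumed: tolerate one missed update
    args = ["rrdtool", "create", rrd_path, "--step", str(step)]
    for name in ds_names:
        # RRDtool DS names are 1 to 19 characters from [a-zA-Z0-9_].
        if not re.fullmatch(r"[A-Za-z0-9_]{1,19}", name):
            raise ValueError(f"invalid DS name: {name!r}")
        # DS:<name>:<type>:<heartbeat>:<min>:<max>
        args.append(f"DS:{name}:GAUGE:{heartbeat}:0:U")
    # RRA:<CF>:<xff>:<steps>:<rows>; steps=1 means no consolidation
    # across multiple primary data points.
    args.append(f"RRA:AVERAGE:0.5:1:{rows}")
    return args

cmd = create_args("/data/rrd/web01_vmstat.rrd", ["free_memory", "used_memory"])
print(" ".join(cmd))
```

With a 60-second step, 1440 rows hold one day of detail data. An RRA with a steps value greater than 1 would store averages over several steps, which is the already-aggregated case that the text advises against.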
RRDs can store numeric data only. Character data cannot be stored, but character information can be placed in the name of the RRD. For example, the host name can be included in the name of each RRD to aid in the consolidation of collected performance data: the RRD can be named hostname_vmstat.rrd, where hostname is the name of the host for the collected performance data and vmstat is the type of performance data. When the RRDtool adapter reads the data, it stores the path and filename in a column called filename. You can then create computed columns based on the filename. For example, you can create these columns:
  • a column named host with an expression of scan(filename, 1, "_")
  • a column named type with an expression of scan(filename, 2, "_.")
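The effect of these two expressions can be checked with a small Python emulation of the default behavior of the SAS SCAN function, in which any character in the delimiter list separates words and consecutive delimiters count as one. The scan function below is a simplified stand-in, not the SAS implementation, and the path is hypothetical. Because the filename column holds the path as well as the filename, the sketch strips the directory first; that step is an assumption about what a site would want.

```python
import posixpath
import re

def scan(text, n, delimiters):
    """Return the n-th word of text (1-based), or '' if there is none."""
    pattern = "[" + re.escape(delimiters) + "]+"
    words = [word for word in re.split(pattern, text) if word]
    return words[n - 1] if 1 <= n <= len(words) else ""

filename = "/data/rrd/web01_vmstat.rrd"   # hypothetical value of the column
base = posixpath.basename(filename)       # "web01_vmstat.rrd"

host = scan(base, 1, "_")
rrd_type = scan(base, 2, "_.")
print(host, rrd_type)

# Without stripping the directory, the first word keeps the path.
print(scan(filename, 1, "_"))
```

The naming scheme only works when the host name itself contains none of the delimiter characters. A host name such as web01.example.com would split at the periods when "_." is the delimiter list.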

The RRDtool Adapter Data Model

There is no data model for the RRDtool adapter.
The adapter reads the header information from the RRD and, based on its contents, creates metadata for the appropriate staged table, aggregation table, and information map. The Adapter Setup wizard, when used with the RRDtool adapter, creates jobs that contain transformations for a staged table, a set of aggregation tables, and a set of information maps.

The RRDtool Adapter Staged Table

To create a staged table from the Adapter Setup wizard or the New Staged Table wizard, you must specify the following information:
  • Rawdata: specify a directory that contains RRD files or a single RRD file.
  • rrdtool executable: specify the executable for accessing RRDs.
  • Consolidation Function (CF): specify the value of the consolidation function for which you want to collect the data. If it is left blank, columns are created in the staged table for all the CFs in the RRD.
Note: The staged table is named RRDstage.
For information about using the Adapter Setup wizard, see Using the Adapter Setup Wizard. For information about using the New Staged Table wizard, see Create Staged Tables.
For every staging table, a set of common datetime-related columns is included. In addition to these columns, there are columns for the metrics that are found in the RRDs. The process reads the header information from each RRD and gets a list of all the data sources (DSs or metrics). It also looks for the CF that the user requested. A column is added to the staging table with these attributes and values:
  • External Name: specify as the name of the DS in the RRD.
  • Name: specify as the name of the DS in the RRD.
  • Description: specify as the name of the DS in the RRD.
  • Type: specify as N.
  • Length: specify as 8.
  • Format: specify as NLNUM16.2.
If the requested CF is not in the RRD, then an error is displayed. If the user left the CF option blank, then each DS is combined with all the CFs. A column is added to the staging table with these attributes and values:
  • External Name: specify as DSName:CFValue (for example: active_mem:AVERAGE).
  • Name: specify as DSName_cfCFValue (for example: active_mem_cfAVERAGE).
  • Description: specify as DSName_cfCFValue (for example: active_mem_cfAVERAGE).
  • Type: specify as N.
  • Length: specify as 8.
  • Format: specify as NLNUM16.2.
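The two naming rules above can be summarized in a short Python sketch. The function name, the dictionary keys, and the error type are hypothetical; only the attribute values (the DSName:CFValue and DSName_cfCFValue patterns, type N, length 8, and format NLNUM16.2) come from the description above.

```python
def staged_columns(ds_names, cfs_in_rrd, requested_cf=""):
    """Return column metadata for the data sources found in an RRD header."""
    common = {"type": "N", "length": 8, "format": "NLNUM16.2"}
    if requested_cf:
        if requested_cf not in cfs_in_rrd:
            # The adapter reports an error when the requested CF is missing.
            raise ValueError(f"CF {requested_cf} is not in the RRD")
        # One column per DS; every name attribute is the DS name itself.
        return [dict(external_name=ds, name=ds, description=ds, **common)
                for ds in ds_names]
    columns = []
    for ds in ds_names:
        for cf in cfs_in_rrd:
            # Blank CF: one column per DS and CF combination.
            label = f"{ds}_cf{cf}"
            columns.append(dict(external_name=f"{ds}:{cf}", name=label,
                                description=label, **common))
    return columns

all_cfs = staged_columns(["active_mem"], ["AVERAGE", "MAX"])
one_cf = staged_columns(["active_mem"], ["AVERAGE", "MAX"], "AVERAGE")
print(all_cfs[0]["external_name"], one_cf[0]["external_name"])
```

The external_name value is the one that matters when the metadata is edited later, because it is the key that links a staged table column to the data in the RRD.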
After the staging table metadata is created, you can edit the metadata to remove columns, add new columns, or change the attributes of existing columns. When editing the metadata, make sure that the External Name field is correct. It must match the DS name from the RRD. In addition, if the CF option is set to blank, then it must also have a :CFValue after the DS name. If the CF option is not set to blank, then the External Name field should be set to the DS name. The staging code relies on the value in the External Name to match the data from the RRD with the staging table column. The following display shows the stage table RRDstage columns based on the memory statistics collected from the vmstat command-line system monitor tool.
vmstat RRDstage Stage Table

RRDtool Adapter Aggregation Table

You can use the Adapter Setup wizard to create aggregation table metadata. You can choose day, week, month, key metrics, and shift aggregations. Based on these selections, the Adapter Setup wizard creates the appropriate aggregation tables for the RRDtool adapter.
Tip: The tables created by the wizard can be modified to meet your site’s requirements.
The aggregation tables are based on the staged table that was created. Each aggregation table has the standard columns (TimePeriod, CompletedDay, LastUpdated, and ContribCount). The class columns are all character columns, in addition to the needed date columns. For each metric, a weighted mean statistic column is created using duration as the weight column. Duration is the only statistics column that has only a SUM statistic. In addition to the statistics columns, there are also some standard date rank columns that are created, depending on the aggregation table. No join columns are created by default. Only class, ID, statistic, and rank columns are created. The following display shows the aggregation table DayRRD columns based on the memory statistics collected from the vmstat command-line system monitor tool.
vmstat DayRRD Aggregation Table
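A weighted mean that uses duration as the weight can be illustrated with a minimal Python sketch. This is the standard weighted-mean formula, not the code that SAS IT Resource Management runs, and the function name and sample values are invented.

```python
def weighted_mean(samples):
    """samples: list of (value, duration_in_seconds) pairs."""
    total_duration = sum(duration for _, duration in samples)
    if total_duration == 0:
        return None  # no contributing time, so no mean
    weighted_sum = sum(value * duration for value, duration in samples)
    return weighted_sum / total_duration

# Three intervals of free memory (in K); the last interval is twice as long.
samples = [(100.0, 60), (200.0, 60), (400.0, 120)]
result = weighted_mean(samples)
print(result)
```

The last interval counts twice as much as each of the others because it covers twice the time. The denominator is the sum of the durations, which is consistent with duration being carried in the aggregation tables as a SUM statistic.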

Additional Resources

The following table lists documentation resources for the open-source tools that are discussed in this document:
Open-Source Tools and Documentation Resources
Tool           Description                                           Location
iostat         iostat tool                                           http://linux.die.net/man/1/iostat
Nagios         Nagios IT Infrastructure monitoring tool              http://www.nagios.org/
netstat        netstat command-line tool                             http://www.faqs.org/docs/linux_network/x-087-2-iface.netstat.html
nmon           nmon tool                                             http://nmon.sourceforge.net/pmwiki.php
Perl           Perl programming language                             http://www.perl.org/
RRDtool        RRD tool                                              http://oss.oetiker.ch/rrdtool/index.en.html
vmstat         vmstat command-line tool                              http://linuxcommand.org/man_pages/vmstat8.html
Xymon Monitor  Xymon system for monitoring servers and networks      http://xymon.sourceforge.net/
Zenoss         Zenoss Open Source Monitoring and Systems Management  http://community.zenoss.org/index.jspa