RMF Appendix 1: Working with Job Data
The primary table definition that IT Service Vision supplies for analyzing jobs data is XJOBS.
Note: Remember to customize IMACSPIN before collecting jobs data.
IMACSPIN is customized when you want to use SMF jobs data for the IT Service Vision XJOBS table and related tables (XNJPURG, XPRINT, XSMFINT, XSPNJOB, and XSTEPS).
IMACACCT defines how many accounting fields are kept, and the length of each, for your site. Copy IMACACCT from MXG.MXG.SOURCLIB to MXG.USERID.SOURCLIB. Edit this copy of IMACACCT according to the instructions (comments) in IMACACCT and the IMACACCT instructions, if any, in the most recent version of CMAPP2.
If you intend to collect data for the IT Service Vision table XSTEPS, copy EXPDBSPU from MXG.MXG.SOURCLIB to MXG.USERID.SOURCLIB. Add the following lines to the copy of EXPDBSPU:
DATA WORK.KEEPSTPS;
   SET WORK.STEPS;
RUN;
The dates on the detail-level observations in the XJOBS table (and the number of observations) are affected by the architecture of MXG spin data sets.
The following is a "reprint" of the information in the newsletter supplementary File 3.XJOBS:
Note: This article assumes that the reader has a thorough understanding
of the MXG JOBS and SPIN data set structures. For more information on
these, refer to Chapter 34 of Merrill's Expanded Guide to Computer
Performance Evaluation Using the SAS System and related chapters and
Chapter 34 of Merrill's Expanded Guide Supplement and related chapters.
Or search the MXG ACHAP* files for matches on 'spin' and 'spun.' (By
the way, 'spin' and 'spun' appear to be plays on the word 'spooled,'
and thus refer to data held temporarily for later action.)
DETAIL.XJOBS is the detail level of the XJOBS table in the SAS/CPE PDB.
The data in the XJOBS table come from the MXG PDB.JOBS data set. The
JOBS data set consists of summary records created by a complex process
that (described briefly) merges data from SMF type 6, 26, and 30
records. Thus a summary record contains information about a batch
job's resource usage and response time and is sometimes used for
billing.
Data from the JOBS data set is brought into the detail level of the
SAS/CPE PDB through the SAS/CPE process step. Incoming data that is
older than the most recent data minus the detail-level agelimit of the
XJOBS table is already out of date when it arrives, but it is kept in
the detail level until the next time process runs. At that point, the
out-of-date data is discarded. This one-process-cycle delay in purging
allows the out-of-date data to be picked up by the reduce step,
assuming that the reduce step runs in the period between the two
process steps.
So for example, if the detail level of the XJOBS table is set to keep
10 days of data and data that is 11 days older than the newest data is
processed into the PDB, the older data will be kept for one cycle
(because it is incoming data) and then purged the next time process
runs (because it is no longer protected as incoming data and it is out
of date).
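The one-cycle delay described above can be modeled with a short sketch. The Python below only illustrates the retention rule and is not SAS/CPE code: the function name run_process, the list-of-dates representation, and the sample dates are all hypothetical, and the real process step does far more than compare dates.

```python
from datetime import date, timedelta

def run_process(detail, incoming, age_limit_days):
    """Model one process run over lists of observation dates (hypothetical)."""
    if not detail and not incoming:
        return []
    # The cutoff is measured back from the newest data seen so far.
    cutoff = max(detail + incoming) - timedelta(days=age_limit_days)
    # Rows held over from earlier runs are purged once they are out of date.
    kept = [d for d in detail if d >= cutoff]
    # Incoming rows are protected for this cycle, even if already out of date.
    return kept + list(incoming)

newest = date(2024, 3, 20)               # sample dates, chosen arbitrarily
late = newest - timedelta(days=11)       # 11 days older than the newest data

after_first = run_process([], [newest, late], age_limit_days=10)
after_second = run_process(after_first, [newest + timedelta(days=1)],
                           age_limit_days=10)

print(late in after_first)    # the late row survives its first cycle
print(late in after_second)   # and is gone after the next process run
```

In this model a reduce step that runs between the two calls would still see the late row, which is the reason given above for delaying the purge.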
In the case of XJOBS, there are two data sources which can cause
old data to be brought into DETAIL.
1) JOBS observations that arrive after data waited in spin data
sets for several days until the job completed:
Sometimes a job takes several days to complete, especially if
it has HELD or unprinted output associated with it. In that case,
the lack of completion data causes the already-available records
to be held in spin data sets until all of the information about
the job is available. Then a number of observations are released
from the spin data sets and result in a single observation about
the job in JOBS.
2) JOBS observations that arrive after some of the data waited in
spin data sets until it was forced out, and JOBS observations that
arrive even later:
Sometimes, a job doesn't complete within the number of days
specified for holding records in the spin data sets. When this
happens, the already-available records are forced out even though
the remaining records have not yet been received. The already-
available records result in a single observation (of regular
length but with only partial data) in JOBS.
When the job's printing or purging of old output data sets does
complete, the remaining records are written and result in a
second, single observation (of regular length but with only partial
data) in JOBS.
Let's take the example of DETAIL data kept for 10 days and a test for
SPINCNT set to 5. Assuming that SAS/CPE %CMPROCES is run daily, then a
SPINCNT test of 5 means that spun data will be held for, at most, 5
MXG cycles (thus, generally, 5 days). If the rest of the job's data
has not arrived by then, the spun data is forced out at that point.
It then follows a path that ends at the JOBS data set. SAS/CPE then
processes it to the DETAIL level of the PDB. Since the DETAIL
retention is 10, ordinarily these observations fall late within the
retention range and they don't appear as out-of-date data in DETAIL.
However, let's say one of these jobs was forced out of the spin data
sets after 5 days but still had HELD or unprinted output associated
with it. If the output is printed a week later, then you will get a
second JOBS record at that time containing print and job purge
activity. Or, if the output is canceled a week later, then you will
get a second JOBS record reflecting the fact that the job was purged.
SAS/CPE processes that record into DETAIL.XJOBS. The record arrives
out of date for the retention period but, as an incoming record, is
kept until the next process step. So, during that time, it appears as
out-of-date data in DETAIL. It also occupies extra space in DETAIL
for that one cycle.
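The timeline in this example can be traced with a small model. The Python sketch below is an illustration under stated assumptions, not MXG logic: the function name jobs_observations is hypothetical, each observation is assumed to be dated by the cycle on which the job ran (cycle 0), one cycle is treated as one day, and the exact cycle on which MXG forces spun data out is simplified to the SPINCNT test value.

```python
def jobs_observations(purge_cycle, spin_limit, retention_days):
    """Return (arrival_cycle, kind, out_of_date) for a job that ran on cycle 0.

    purge_cycle is the cycle on which the job's last records (print or
    purge activity) become available. All names here are hypothetical.
    """
    if purge_cycle <= spin_limit:
        # Everything arrived in time: one complete observation.
        arrivals = [(purge_cycle, "complete")]
    else:
        # Spun data is forced out first; the remainder follows later.
        arrivals = [(spin_limit, "partial: forced out"),
                    (purge_cycle, "partial: print/purge")]
    # An observation is out of date if it arrives after the retention window.
    return [(cycle, kind, cycle > retention_days) for cycle, kind in arrivals]

# Job whose output prints on cycle 3: a single, in-date observation.
print(jobs_observations(purge_cycle=3, spin_limit=5, retention_days=10))

# Job forced out at cycle 5 whose output prints a week later (cycle 12):
# the second observation arrives out of date.
print(jobs_observations(purge_cycle=12, spin_limit=5, retention_days=10))
```

Raising spin_limit in this model turns more two-observation jobs into single-observation jobs, at the cost of a later arrival for the complete record.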
There are several ways to cut down on the volume of JOBS data.
1) You can change the test for SPINCNT to a higher value. More jobs
will complete in time for all of the data to be on a single record
in JOBS and XJOBS. Fewer jobs will be represented by pairs of
records. For details, see the Merrill books mentioned at the
beginning of this file.
However, if you test SPINCNT against a higher number (high
enough for most or all jobs to complete printing and purging), a
significant number of jobs will arrive in DETAIL.XJOBS much later
than the day they ran.
The main effect of this is that the partial data that is
already available is delayed a long time waiting for the remainder
of the data. Thus a report based on DETAIL.XJOBS during that
period would show only the data for jobs that have been printed and
purged, not the data for the jobs still waiting to print and purge.
2) If you're not interested in the number of lines SPOOLed (as
opposed to actually printed) then you can use the MXG exit EXPDBJOB
to delete records with an INBITS value of ' P'.
3) If you're not interested in any output writer or purge activity
that may occur after the SPIN count has been exhausted, you can
eliminate records with INBITS values of ' P', ' WP', or
' W '.
This is a "use at your own risk" approach. Exit EXPDBJOB is not
documented as a place to delete observations, but it will work
because it is the last thing executed in the DATA step that creates
the JOBS data set.
Finally, the best way to reduce space for the XJOBS table is not to use
it at all. If the following are all true, then use XTY30_5 instead of
XJOBS. It's a lot faster and a lot smaller.
1) You don't care about JES2 SPOOL counts.
2) You're not interested in associating output writer activity with
jobs.
3) You don't mind losing data for jobs which cut some 30_4 records
but never cut a 30_5 record because the system crashed.
As explained in File 3.XJOBS, XJOBS can have more than one observation per job. The fastest way to get a count of the number of jobs that ran is to use the Count statistic that is associated with the CPU time.
The following is a "reprint" of the information in the newsletter article I4.A24.M on counting jobs:
How to Tell How Many Jobs Ran
The table with jobs data is XJOBS. As explained in File 3.XJOBS,
XJOBS can have more than one observation per job. The fastest way to
get a count of the number of jobs that ran is to use the Count
statistic associated with the CPU time. Although it is possible for a
job to be counted twice (for instance, if the data center loses power
the first time the job is running), it is unlikely. So the variable
to use is CPUTM__C at the DAY, WEEK, MONTH, or YEAR level, as
required.
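The difference between counting observations and using a count statistic can be shown with a few sample rows. The Python sketch below is a conceptual illustration only: the job names and CPU times are invented, the function count_statistic is hypothetical, and it assumes that a count statistic counts nonmissing values and that a later print/purge observation carries no CPU time. Neither assumption is confirmed by the text above.

```python
# Sample rows standing in for XJOBS observations (invented values).
observations = [
    {"job": "JOB00001", "cputm": 12.5},   # complete observation
    {"job": "JOB00002", "cputm": 3.0},    # partial: forced out of spin
    {"job": "JOB00002", "cputm": None},   # partial: later print/purge only
]

def count_statistic(rows, variable):
    """Count the rows in which the variable is not missing (assumed rule)."""
    return sum(1 for row in rows if row[variable] is not None)

n_observations = len(observations)
n_jobs = count_statistic(observations, "cputm")

print(n_observations)   # number of observations, including the second partial
print(n_jobs)           # number of rows that carry CPU time
```

Under these assumptions the count of CPU time values tracks the number of jobs more closely than the number of observations does.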
The statistic CPUTM average is explicitly selected by
default. Thus, because the CPUTM count is required to
calculate the CPUTM average, the statistic CPUTM__C is
implicitly selected by default. (For an explanation of
explicitly and implicitly selected statistics, see "Default
statistics" on p. 222 of SAS/CPE Software for the MVS
Environment: Usage and Reference, Version 6, First Edition.)
[Or see Help -> Help Index -> Variable Interpretations and
Default Statistics -> ItemActions -> Browse Help... .]
Implicitly selected statistics are calculated and are
available internally (for instance, the count is used in
calculating the average). To make them available
externally, you must explicitly select them.
To select the Count statistic by means of the interactive
interface, check that you have Write access to the PDB and
then follow this path from the main menu:
PDB Operations -> PDB Dictionary -> XJOBS -> ItemActions
-> List Variables -> CPUTM -> ItemActions -> Edit
Statistics -> select Count at each of the levels at which
you want to use it -> OK -> File -> End -> OK
To select the Count statistics in batch, use these statements in the
%CPDDUTL macro, specifying Count at each of the levels at which you
want to use it:
set table name=xjobs;
update variable name=cputm
   day=(count)
   week=(count)
   month=(count)
   year=(count);
Note: It is worth reading File 3.XJOBS to understand the
effect of the delay in the MXG spin data sets on the jobs
that are *available* to count during a particular period.