RMF Appendix 1: Working with Job Data
The primary table definition supplied with IT Service Vision for analyzing jobs data is XJOBS.
Note: Remember to customize IMACSPIN before collecting jobs data.
IMACSPIN is customized when you want to use SMF jobs data for the IT Service Vision XJOBS table and related tables (XNJPURG, XPRINT, XSMFINT, XSPNJOB, and XSTEPS).
IMACACCT defines how many accounting fields are kept, and the length of each, for your site. Copy IMACACCT from MXG.MXG.SOURCLIB to MXG.USERID.SOURCLIB. Edit this copy according to the instructions (comments) in IMACACCT itself and the IMACACCT instructions, if any, in the most recent version of CMAPP2.
If you intend to collect data for the IT Service Vision table XSTEPS, copy EXPDBSPU from MXG.MXG.SOURCLIB to MXG.USERID.SOURCLIB. Add the following lines to the copy of EXPDBSPU:
DATA WORK.KEEPSTPS;
SET WORK.STEPS;
RUN;
The dates on the detail-level observations in the XJOBS table (and the number of observations) are affected by the architecture of MXG spin data sets.
The following is a "reprint" of the information in the newsletter supplementary File 3.XJOBS:
Note: This article assumes that the reader has a thorough understanding of the MXG JOBS and SPIN data set structures. For more information on these, refer to Chapter 34 of Merrill's Expanded Guide to Computer Performance Evaluation Using the SAS System and related chapters, and to Chapter 34 of Merrill's Expanded Guide Supplement and related chapters. Or search the MXG ACHAP* files for matches on 'spin' and 'spun.' (By the way, 'spin' and 'spun' appear to be plays on the word 'spooled,' and thus refer to data held temporarily for later action.)
DETAIL.XJOBS is the detail level of the XJOBS table in the SAS/CPE PDB. The data in the XJOBS table come from the MXG PDB.JOBS data set. The JOBS data set consists of summary records created by a complex process that (described briefly) merges data from SMF type 6, 26, and 30 records. Thus a summary record contains information about a batch job's resource usage and response time, and is sometimes used for billing.
Data from the JOBS data set is brought into the detail level of the SAS/CPE PDB through the SAS/CPE process step. Any data older than the most recent data minus the detail-level agelimit of the XJOBS table is kept in the detail level until the next time process runs. At that point, the out-of-date data is discarded. This one-process-cycle delay in purging allows the out-of-date data to be picked up by the reduce step, assuming that the reduce step runs in the period between the two process steps.
So, for example, if the detail level of the XJOBS table is set to keep 10 days of data and data that is 11 days older than the newest data is processed into the PDB, the older data will be kept for one cycle (because it is incoming data) and then purged the next time process runs (because it is no longer protected as incoming data and it is out of date).
In the case of XJOBS, there are two data sources which can cause old data to be brought into DETAIL.
1) JOBS observations that arrive after data waited in spin data sets for several days until the job completed: Sometimes a job takes several days to complete, especially if it has HELD or unprinted output associated with it. In that case, the lack of completion data causes the already-available records to be held in spin data sets until all of the information about the job is available. Then a number of observations are released from the spin data sets and result in a single observation about the job in JOBS.
2) JOBS observations that arrive after some of the data waited in spin data sets until it was forced out and JOBS observations that arrive even later: Sometimes, a job doesn't complete within the number of days specified for holding records in the spin data sets. When this happens, the already-available records are forced out even though the remaining records have not yet been received. The already-available records result in a single observation (of regular length but with only partial data) in JOBS. When the job's printing or purging of old output data sets does complete, the remaining records are written and result in a second, single observation (of regular length but with only partial data) in JOBS.
Let's take the example of DETAIL data kept for 10 days and a test for SPINCNT set to 5. Assuming that SAS/CPE %CMPROCES is run daily, then a SPINCNT test of 5 means that spun data will be held for, at most, 5 MXG cycles (thus, generally, 5 days). If the presence of all the data hasn't enabled it to leave before then, it is forced out at that point. It then follows a path that ends at the JOBS data set. SAS/CPE then processes it to the DETAIL level of the PDB. Since the DETAIL retention is 10, ordinarily these observations fall late within the retention range and they don't appear as out-of-date data in DETAIL.
However, let's say one of these jobs was forced out of the spin data sets after 5 days but still had HELD or unprinted output associated with it.
If the output is printed a week later, then you will get a second JOBS record at that time containing print and job purge activity. Or, if the output is canceled a week later, then you will get a second JOBS record reflecting the fact that the job was purged. SAS/CPE processes that record into DETAIL.XJOBS. The record arrives out of date for the retention period but, as an incoming record, is kept until the next process step. So, during that time, it appears as out-of-date data in DETAIL. It also occupies extra space in DETAIL for that one cycle.
There are several ways to cut down on the volume of JOBS data.
1) You can change the test for SPINCNT to a higher value. More jobs will complete in time for all of the data to be on a single record in JOBS and XJOBS. Fewer jobs will be represented by pairs of records. For details, see the Merrill books mentioned at the beginning of this file. However, if you test SPINCNT against a higher number (high enough for most or all jobs to complete printing and purging), a significant number of jobs will arrive in DETAIL.XJOBS much later than the day they ran. The main effect of this is that the partial data that is already available is delayed a long time waiting for the remainder of the data. Thus a report based on DETAIL.XJOBS during that period would show only the data for jobs that have been printed and purged, not the data for the jobs still waiting to print and purge.
2) If you're not interested in the number of lines SPOOLed (as opposed to actually printed) then you can use the MXG exit EXPDBJOB to delete records with an INBITS value of ' P'.
3) If you're not interested in any output writer or purge activity that may occur after the SPIN count has been exhausted, you can eliminate records with INBITS values of ' P', ' WP', or ' W '. This is a "use at your own risk" approach.
Exit EXPDBJOB is not documented as a place to delete observations, but it will work because it is the last thing executed in the DATA step that creates the JOBS data set.
Finally, the best way to reduce space for the XJOBS table is not to use it at all. If the following are all true, then use XTY30_5 instead of XJOBS. It's a lot faster and a lot smaller.
1) You don't care about JES2 SPOOL counts.
2) You're not interested in associating output writer activity with jobs.
3) You don't mind losing data for jobs which cut some 30_4 records but never cut a 30_5 record because the system crashed.
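The one-cycle delay in purging out-of-date detail data, described in the reprint above, can be illustrated with a small model. The following Python sketch is not SAS/CPE code and does not reproduce how the process step is actually implemented. The function name run_process, the use of plain dates to stand for observations, the sample dates, and the choice to treat data exactly at the age limit as still in range are all assumptions made for illustration.

```python
from datetime import date, timedelta


def run_process(detail, incoming, age_limit_days):
    """Model one process cycle for a detail-level table.

    detail   -- dates of observations already in the detail level
    incoming -- dates of observations arriving in this cycle
    Hypothetical model: existing observations older than
    (newest date - age limit) are purged; incoming observations
    are always kept for this cycle, even if they are out of date.
    """
    everything = detail + incoming
    if not everything:
        return []
    cutoff = max(everything) - timedelta(days=age_limit_days)
    # Assumption: an observation exactly at the cutoff is still in range.
    survivors = [d for d in detail if d >= cutoff]
    return sorted(survivors + incoming)


# Cycle 1: a late observation arrives 11 days older than the newest data.
cycle1 = run_process([], [date(2024, 1, 20), date(2024, 1, 9)], 10)
# Cycle 2: the late observation is no longer protected as incoming data.
cycle2 = run_process(cycle1, [date(2024, 1, 21)], 10)

print(cycle1)
print(cycle2)
```

In this model the observation dated 11 days before the newest data is present after the first cycle and gone after the second, which corresponds to the 10-day example in the article.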
As explained in File 3.XJOBS, XJOBS can have more than one observation per job. The fastest way to get a count of the number of jobs that ran is to use the Count statistic that is associated with the CPU time.
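To see why a Count on CPU time can count jobs rather than observations, consider the sketch below. It is written in Python rather than SAS, and it rests on an assumption that is not stated in this appendix: that the second, partial observation for a job (the one carrying only print or purge activity) has a missing CPU time, so a count of nonmissing CPU time values skips it. The job names, the values, and the function count_jobs are hypothetical.

```python
def count_jobs(observations):
    """Count observations whose CPU time is not missing.

    Assumption: a partial print/purge-only observation carries no
    CPU time (None here), so it does not add to the count.
    """
    return sum(1 for obs in observations if obs.get("CPUTM") is not None)


# Hypothetical detail rows: JOBA appears twice because its output
# was printed after the first observation had already been written.
observations = [
    {"JOB": "JOBA", "CPUTM": 12.5},   # execution data
    {"JOB": "JOBA", "CPUTM": None},   # later print/purge-only data
    {"JOB": "JOBB", "CPUTM": 3.0},    # complete single observation
]

print(len(observations))         # number of observations
print(count_jobs(observations))  # number of jobs under the assumption
```

If the assumption does not hold for your data, the count of CPU time values and the count of jobs can differ, so it is worth checking a sample of detail-level observations first.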
The following is a "reprint" of the information in the newsletter article I4.A24.M on counting jobs:
How to Tell How Many Jobs Ran
The table with jobs data is XJOBS. As explained in File 3.XJOBS, XJOBS can have more than one observation per job. The fastest way to get a count of the number of jobs that ran is to use the Count statistic associated with the CPU time. Although it is possible for a job to be counted twice (for instance, if the data center loses power the first time the job is running), it is unlikely. So the variable to use is CPUTM__C at the DAY, WEEK, MONTH, or YEAR level, as required.
The statistic CPUTM average is explicitly selected by default. Thus, because the CPUTM count is required to calculate the CPUTM average, the statistic CPUTM__C is implicitly selected by default. (For an explanation of explicitly and implicitly selected statistics, see "Default statistics" on p. 222 of SAS/CPE Software for the MVS Environment: Usage and Reference, Version 6, First Edition.) [Or see Help -> Help Index -> Variable Interpretations and Default Statistics -> ItemActions -> Browse Help... .]
Implicitly selected statistics are calculated and are available internally (for instance, the count is used in calculating the average). To make them available externally, you must explicitly select them.
To select the Count statistic by means of the interactive interface, check that you have Write access to the PDB and then follow this path from the main menu:
PDB Operations -> PDB Dictionary -> XJOBS -> ItemActions -> List Variables -> CPUTM -> ItemActions -> Edit Statistics -> select Count at each of the levels at which you want to use it -> OK -> File -> End -> OK
To select the Count statistics in batch, use these statements in the %CPDDUTL macro, specifying Count at each of the levels at which you want to use it:
set table name=xjobs;
update variable name=cputm
   day=(count)
   week=(count)
   month=(count)
   year=(count);
Note: It is worth reading File 3.XJOBS to understand the effect of the delay in the MXG spin data sets on the jobs that are *available* to count during a particular period.