Fax Appendix 2: Sample Code for Staging the Data
This is the SAS code used to stage the data from the fax log. When you want to run this code, you can submit it from the SAS PROGRAM EDITOR window or in a SAS procedure in batch/background.
Note: Do not submit it yet. Wait until you append the code, from Fax Appendix 3, for generating the definitions.
filename faxlog 'fax.log' ;

data work.faxes(drop=date time label='Fax Activity');
   attrib machine  label='Machine name' length=$8;
   attrib type     label='Fax Type: (S)end or (R)eceive' length=$1;
   attrib country  label='Country Code' length=$3;
   attrib acode    label='Area Code' length=$3;
   attrib phnum    label='Phone Number' length=$8;
   attrib pages    label='Pages Sent or Received';
   attrib timestmp label='Datetime stamp of fax' format=datetime21.2;
   attrib resent   label='Pages Resent';
   attrib connect  label='Total connect time' format=time12.2;
   attrib pctrtry  label='Percent retries' format=percent7.2;
   attrib status   label='Final Status Code' length=$1;

   infile faxlog;
   input @1  machine
         @10 type
         @12 country
         @16 acode
         @20 phnum
         @29 pages
         @32 date : mmddyy8.
         @41 time : time8.
         @50 resent
         @53 connect : time8.
         @62 status;

   status = upcase(status);
   if status not in('A','B','C') then do;
      put 'Invalid status code found in record ' _n_ ;
      put 'This record will not be included in the staged data';
      put machine= type= country= acode= phnum= pages=
          date= time= resent= connect= status=;
      delete;
   end;

   timestmp=dhms(date,hour(time),minute(time),second(time));

   if pages = 0 then pctrtry = .;
   else pctrtry = resent / pages ;
run;
Notes:
- The FILENAME statement assigns a fileref, which is a temporary name within the current SAS session, to a file. In this case, the fileref FAXLOG is assigned to the file FAX.LOG.
- The DATA statement creates a SAS data set and specifies characteristics of the data set. The data set is to be created in the SAS library named WORK, have the name FAXES, and have the descriptive label FAX ACTIVITY. (WORK is a pre-defined SAS library that is automatically created at the beginning of a SAS session and automatically deleted at the end of a SAS session.)
- The DATA statement also begins a DATA step, which in this case ends with a RUN statement. Within the DATA step, there is an implied loop. In this case, one pass through the loop will read a record's worth of data from the input file to a buffer, modify the data as described in the DATA step, and write the data (except for the dropped variables date and time) from the buffer as a row in the data set. (The data set has rows that correspond to records in the input file and columns that correspond to fields on the records.)
- The ATTRIB statements specify some of the attributes of the data. These statements are associated with the INPUT statement later in the step. For more about specifying attributes in a way that simplifies later steps, see Shared Appendix 6: Characteristics of Variables and Generic Collector Appendix 1: Algorithm Used by GENERATE SOURCE.
- The INFILE statement specifies the fileref of the file that supplies the input records. In this case, the file to which FAXLOG points supplies the input records.
- The INPUT statement specifies the location of each field of interest on an input record and assigns a name to the data in that field. In this case, the locations are specified by the beginning column symbol, @. This statement also causes one record's worth of data to be read into the buffer.
- The value of the fax's final transmission status is checked. If the status is not valid (that is, not A, B, or C after conversion to uppercase), the DATA step puts (writes) an error message to the SAS log and deletes the data from the buffer, so no row is written for that record.
- The DATE and TIME variables are combined by the DHMS function to create the TIMESTMP datetime variable. DATE and TIME themselves are dropped from the staged data.
- A variable (PCTRTRY), for percent retries, is created to monitor the efficiency of the transfer. It is set to missing when PAGES is 0, which avoids a division by zero. Both TIMESTMP and PCTRTRY are derived variables. A derived variable is one that does not exist in the raw data, but is derived from it in the staged data. An alternative method would be to create a formula variable in the IT Service Vision table, but formula variables cannot be used as BY or CLASS variables, nor can statistics be calculated automatically for formula variables.
- Immediately before the RUN statement, at the end of each pass through the loop, the contents of the buffer, if any, are written to the staged data (as a row in the staged data).
- The RUN statement specifies that SAS is to run, now, the previous statements. As a consequence, looping begins in the DATA step. The loop automatically terminates when the end of file is encountered in the input file. Thus, when the processing that was triggered by the RUN completes, a FAXES data set exists in the (temporary) WORK library. In this example, the data set would have three observations, corresponding to the three records in the input file.
- In this document, we will use these statements in two ways. First, we will add them to a one-time job (or SAS session) that creates an empty staged data set, reads the staged data set to create PDB table and variable definitions, applies the definitions to the PDB, and deletes the empty staged data set. Later, we will place them in a macro and put that macro in the daily production job. In the daily production job, one SAS session will start IT Service Vision, stage the data (by means of this macro), process the staged data into the PDB, reduce the data, (optionally) report on the data, and delete the staged data set.
- For more about the implied loop, see "DATA Step Processing" in Chapter 2 of the SAS Language: Reference documentation for your current release of SAS. That book also has additional information on each of the statements in the DATA step.
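
The implied loop, the status check, and the derived variables described in the notes above can be tried in isolation with a small, self-contained DATA step. The following is only a sketch: the data set name WORK.DEMO, the five-field record layout, and the three in-stream records are hypothetical and were invented for illustration; they are not the fax log layout and not the contents of FAX.LOG.

```sas
/* Sketch only: WORK.DEMO, the record layout, and the data lines below  */
/* are hypothetical. They are not the fax log layout.                   */
data work.demo(drop=date time);
   attrib status   label='Status code' length=$1;
   attrib timestmp label='Datetime stamp' format=datetime21.2;
   attrib pctrtry  label='Percent retries' format=percent7.2;

   infile datalines;
   input @1  date : mmddyy8.
         @10 time : time8.
         @19 status
         @21 pages
         @24 resent;

   status = upcase(status);
   if status not in ('A','B','C') then do;
      /* _N_ counts passes through the implied loop */
      put 'Skipping record ' _n_ status=;
      delete;   /* back to the top of the loop; no row is written */
   end;

   /* Derived variables, built the same way as in the staging code */
   timestmp = dhms(date, hour(time), minute(time), second(time));
   if pages = 0 then pctrtry = .;
   else pctrtry = resent / pages;

   /* The implied output of the row happens here, at the end of the pass */
datalines;
01/15/97 08:30:00 a 10 2
01/15/97 09:45:10 x  3 0
01/16/97 14:05:30 C  0 0
;
run;

proc print data=work.demo label;
run;
```

With these three hypothetical records, the second record has a status that is not A, B, or C, so it should be reported on the SAS log and left out, leaving two rows in WORK.DEMO. The third record has PAGES equal to 0, so its PCTRTRY should be missing.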