Understanding an Audit Trail

Definition of an Audit Trail

The audit trail is an optional SAS file that you can create in order to log modifications to a SAS data file. Each time an observation is added, deleted, or updated, information is written to the audit trail about who made the modification, what was modified, and when.
Many businesses and organizations require an audit trail for security reasons. The audit trail maintains historical information about the data, which gives you the opportunity to develop usage statistics and patterns. The historical information enables you to track individual pieces of data from the moment that they enter the data file to the time they leave.
The audit trail is also the only facility in SAS that stores observations from failed Append operations and observations that were rejected by integrity constraints. (The integrity constraints feature is described in Understanding Integrity Constraints.) The audit trail enables you to write a DATA step to extract the failed or rejected observations, use the information that describes why the observations failed to correct them, and then reapply the observations to the data file.

Audit Trail Description

The audit trail is created by the default Base SAS engine and has the same libref and member name as the data file, but has a type of AUDIT. It replicates the variables in the data file and also stores two types of audit variables:
  • _AT*_ variables, which automatically store modification data
  • user variables, which are optional variables that you can define to collect modification data
The _AT*_ variables are described in the following table.
_AT*_ Variables
_AT*_ Variable    Description
_ATDATETIME_      Stores the date and time of a modification
_ATUSERID_        Stores the logon user ID that is associated with a modification
_ATOBSNO_         Stores the observation number that is affected by the modification, except when REUSE=YES (because the observation number is always 0)
_ATRETURNCODE_    Stores the event return code
_ATMESSAGE_       Stores the SAS log message at the time of the modification
_ATOPCODE_        Stores a code that describes the type of modification
The _ATOPCODE_ values are listed in the following table.
_ATOPCODE_ Values
Code    Modification
AL      Auditing is resumed
AS      Auditing is suspended
DA      Added data record image
DD      Deleted data record image
DR      Before-update record image
DW      After-update record image
EA      Observation add failed
ED      Observation delete failed
EU      Observation update failed
The types of entries stored in the audit trail, along with their corresponding _ATOPCODE_ values, are determined by the options specified in the LOG statement when the audit trail is initiated. If the LOG statement is omitted when the audit trail is initiated, the default behavior is to log all images.
  • The A operation codes are controlled by the ADMIN_IMAGE option.
  • The DR operation code is controlled by the BEFORE_IMAGE option.
  • All other D operation codes are controlled with the DATA_IMAGE option.
  • The E operation codes are controlled by the ERROR_IMAGE option.
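As an illustration of the options listed above, the following sketch initiates an audit trail that logs everything except before-update images, so no DR records are written. It assumes that MYLIB.SALES does not yet have an audit trail; the particular combination of YES and NO values is only an example.

```sas
proc datasets lib=mylib nolist;
   audit sales;
   initiate;
   /* Example choice: skip DR (before-update) images, log everything else. */
   log admin_image=yes
       before_image=no
       data_image=yes
       error_image=yes;
quit;
```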
The user variable is a variable that associates data values with the data file without making them part of the data file. That is, the data values are stored in the audit file, but you update them in the data file like any other variable. You might want to define a user variable to enable end users to enter a reason for each update.
User variables are defined at audit trail initiation with the USER_VAR statement. For example, the following code initiates an audit trail and creates a user variable REASON_CODE for data file MYLIB.SALES:
proc datasets lib=mylib;
   audit sales;
   initiate;
   user_var reason_code $ 20;
run;
quit;
After the audit trail is initiated, SAS retrieves the user variables from the audit trail and displays them when the data file is opened for update. You can enter data values for the user variables as you would for any data variable. The data values are saved to the audit trail as each observation is saved. (In applications that save observations as you scroll through them, it might appear that the data values have disappeared.) The user variables are not available when the data file is opened for browsing or printing. Although the values of a user variable are stored in the audit file, to rename a user variable or modify its attributes, you modify the data file, not the audit file. The following example uses PROC DATASETS to rename the user variable:
proc datasets lib=mylib;
   modify sales;
   rename reason_code = Reason;
run;
quit;
You must also define attributes such as format and informat in the data file with PROC DATASETS. If you define user variables, you must store values in them in order for the variables to be meaningful.
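For instance, a format and an informat could be assigned to the user variable through the data file, as in this sketch. It assumes that the user variable is still named REASON_CODE; the $CHAR20. format and informat are illustrative choices, not requirements.

```sas
proc datasets lib=mylib nolist;
   /* Modify the data file, not the audit file. */
   modify sales;
   format   reason_code $char20.;
   informat reason_code $char20.;
quit;
```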
A data file can have one audit file, and the audit file must reside in the same SAS library as the data file.

Operation in a Shared Environment

The audit trail operates similarly in local and remote environments. The only difference for applications and users networking with SAS/CONNECT and SAS/SHARE is that the audit trail logs events when the observation is written to permanent storage, that is, when the data is written to the remote SAS session or server. Therefore, the time at which the transaction is logged might differ from the time in the user's SAS session.

Performance Implications

Because each update to the data file is also written to the audit file, the audit trail can negatively impact system performance. You might want to consider suspending the audit trail for large, regularly scheduled batch updates. Note that the audit variables are unavailable when the audit trail is suspended.
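A batch job might bracket its update with SUSPEND and RESUME, roughly as sketched here. The batch update itself is left as a placeholder comment. If administrative images are being logged, the audit trail records an AS entry when logging is suspended and an AL entry when it resumes.

```sas
/* Suspend logging before the large batch update. */
proc datasets lib=mylib nolist;
   audit sales;
   suspend;
quit;

/* ... run the batch update against MYLIB.SALES here ... */

/* Resume logging afterward. */
proc datasets lib=mylib nolist;
   audit sales;
   resume;
quit;
```

Modifications made while the audit trail is suspended are not logged, so this trade-off suits only updates that do not need to be audited.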

Preservation by Other Operations

The audit trail is not recommended for data files that are copied, moved, sorted in place, replaced, or transferred to another operating environment. Those operations do not preserve the audit trail. In a copy operation on the same host, you can preserve the data file and audit trail by renaming them using the generation data sets feature. However, logging stops because neither the auditing process nor the generation data sets feature saves the source program that caused the replacement. For more information about generation data sets, see Understanding Generation Data Sets.

Programming Considerations

For data files whose audit file contains user variables, the variable list is different when browsing and updating the data file. The user variables are selected for update but not for browsing. You should be aware of this difference when you are developing your own full-screen applications.

Other Considerations

Data values that are entered for user variables are not stored in the audit trail for Delete operations.
If the audit file becomes damaged, you cannot process the data file until you terminate the audit trail. Then you can initiate a new audit trail or process the data file without one. To terminate the audit trail for a generation data set, use the GENNUM= data set option in the AUDIT statement. You cannot initiate an audit trail for a generation data set.
In indexed data sets, the fast-append feature can cause some observations to be written to the audit trail twice, first with a DA operation code and then with an EA operation code. The observations with EA represent the observations rejected by index restrictions. For more information, see Appending to an Indexed Data Set — Fast-Append Method in Base SAS Procedures Guide.

Initiating an Audit Trail

You initiate an audit trail in the DATASETS procedure with the AUDIT statement. For syntax information, see DATASETS Procedure in Base SAS Procedures Guide.
The audit file uses the SAS password assigned to its associated data file. Therefore, it is recommended that the data file have an ALTER password. An ALTER-level password restricts Read and Edit access to SAS files. If a password other than ALTER is used, or no password is used, the software generates a warning message that the files are not protected from accidental update or deletion.
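The following sketch assigns an ALTER password to the data file and then initiates the audit trail with that password. The password MYPW is a placeholder, and the example assumes that MYLIB.SALES has no password and no audit trail yet.

```sas
proc datasets lib=mylib nolist;
   /* Assign an ALTER password to the data file (MYPW is a placeholder). */
   modify sales (alter=mypw);
run;
   /* Initiate the audit trail, supplying the data file's password. */
   audit sales (alter=mypw);
   initiate;
run;
quit;
```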

Controlling the Audit Trail

Once the audit trail is active, you can suspend and resume logging, and you can terminate (delete) the audit trail. The syntax for controlling the audit trail is described in the PROC DATASETS AUDIT statement documentation. Note that replacing the associated data file also deletes the audit trail.
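As a minimal sketch, terminating the audit trail for MYLIB.SALES might look like the following. SUSPEND and RESUME take the same position after the AUDIT statement. Because TERMINATE deletes the audit file, copy any audit records that you need before you run it.

```sas
proc datasets lib=mylib nolist;
   audit sales;
   /* Deletes the audit file MYLIB.SALES (type AUDIT). */
   terminate;
quit;
```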

Reading and Determining the Status of the Audit Trail

The audit trail is read-only. You can read the audit trail with any component of SAS that reads a data set. To refer to the audit trail, use the TYPE= data set option. For example, submit the following statements to view the contents of the audit trail. Note that the parentheses around the TYPE= option are required.
proc contents data=mylib.sales (type=audit);
run;
The CONTENTS procedure output is shown below. Notice that the output contains all of the variables from the corresponding data file, the _AT*_ variables, and the user variable.
[Figure: PROC CONTENTS Output for Data File MYLIB.SALES]
You can also use your favorite reporting tool, such as PROC REPORT or PROC TABULATE, on the audit trail.
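A simple usage summary could be produced with PROC FREQ, for example. This is only one possible report. It counts audit records by user ID and operation code, using the _AT*_ variables described earlier.

```sas
proc freq data=mylib.sales(type=audit);
   /* Count audit records for each user and type of modification. */
   tables _atuserid_ * _atopcode_ / nopercent norow nocol;
   title 'Audit Records by User and Operation Code';
run;
```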

Audit Trails and CEDA Processing

When a SAS data file requires processing with CEDA, audit trails are not supported. For example, if you transfer a SAS data file with an initiated audit trail from one operating environment such as Windows to a different operating environment such as UNIX, CEDA translates the file for you, but the audit trail is not available. For information about CEDA processing, see Processing Data Using Cross-Environment Data Access (CEDA).
The MIGRATE procedure retains all deleted observations in migrated data sets. Therefore, PROC MIGRATE preserves and migrates audit trails. For more information, see MIGRATE Procedure in Base SAS Procedures Guide.
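A migration might be sketched as follows. The librefs OLDLIB and NEWLIB and both paths are hypothetical placeholders, and the MIGRATE procedure documentation describes restrictions that are not shown here.

```sas
/* Hypothetical source and target libraries. */
libname oldlib 'source-library-path';
libname newlib 'target-library-path';

/* Migrate all members of OLDLIB, including audit trails, to NEWLIB. */
proc migrate in=oldlib out=newlib;
run;
```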
In contrast, conversion procedures such as PROC CPORT and PROC CIMPORT clean up data sets and restructure the data sets. For example, these procedures remove deleted observations to recover disk space. The restructuring is advantageous but results in a data set that is not historically accurate when trying to track changes through an audit trail. Because these conversion procedures do not keep deleted observations, the audit trails cannot be copied using these procedures. For more information, see CPORT Procedure in Base SAS Procedures Guide and CIMPORT Procedure in Base SAS Procedures Guide.
CAUTION:
If your data files contain audit trails, do not use your operating environment commands to copy, move, or delete your data files.

Examples of Using Audit Trails

Example of Initiating an Audit Trail

The following example shows the data and code that are used to create and initiate an audit trail for the data file MYLIB.SALES that is used in earlier examples in this section. MYLIB.SALES stores fictional invoice and renewal figures for SAS products. The audit trail records all events and stores one user variable, REASON_CODE, for users to enter a reason for the update.
Subsequent examples illustrate the effect of a data file update on the audit trail and how to use audit variables to capture observations that are rejected by integrity constraints.
libname mylib 'C:\My Documents';
   /*------------------------------------*/
   /* Create SALES data set.             */
   /*------------------------------------*/

data mylib.sales;
  length product  $9;
  input product invoice renewal;
datalines;
FSP        1270.00        570
SAS        1650.00        850
STAT       570.00         0
STAT       970.82         600
OR         239.36         0
SAS        7478.71        1100
SAS        800.00         800
;


   /*----------------------------------*/
   /* Create an audit trail with a     */
   /* user variable.                   */
   /*----------------------------------*/

proc datasets lib=mylib nolist;
  audit sales;
    initiate;
    user_var reason_code $ 20;
quit;

Example of a Data File Update

The following example inserts an observation into MYLIB.SALES.DATA and prints the update data in MYLIB.SALES.AUDIT.
   /*----------------------------------*/
   /* Do an update.                    */
   /*----------------------------------*/
proc sql;
   insert into mylib.sales
      set product = 'AUDIT',
          invoice = 2000,
          renewal = 970,
          reason_code = "Add new product";
quit;

   /*----------------------------------------*/
   /* Print the audit trail. */
   /*----------------------------------------*/
proc sql;
  select product,
         reason_code,
         _atopcode_,
         _atdatetime_
         from mylib.sales(type=audit);
quit;
[Figure: Updated Data in MYLIB.SALES.AUDIT]

Example of Using the Audit Trail to Capture Rejected Observations

The following example adds integrity constraints to MYLIB.SALES.DATA and records observations that are rejected as a result of the integrity constraints in MYLIB.SALES.AUDIT. For more information about integrity constraints, see Understanding Integrity Constraints.
   
   /*----------------------------------*/
   /* Create integrity constraints.    */
   /*----------------------------------*/
proc datasets lib=mylib;
   modify sales;
   ic create null_renewal = not null (invoice)
             message = "Invoice must have a value.";
   ic create invoice_amt = check (where=((invoice > 0) and
               (renewal <= invoice)))
             message = "Invoice and/or renewal are invalid.";
run;

   /*----------------------------------*/
   /* Do some updates.                 */
   /*----------------------------------*/
proc sql;  /* this update works */
   update mylib.sales
      set invoice = invoice * .9,
          reason_code = "10% price cut"
      where renewal > 800;

proc sql;  /* this insert fails: invoice is missing */
   insert into mylib.sales
      set product = 'AUDIT',
          renewal = 970,
          reason_code = "Add new product";

proc sql;  /* this insert works */
   insert into mylib.sales
      set product = 'AUDIT',
          invoice = 10000,
          renewal = 970,
          reason_code = "Add new product";

proc sql;  /* this insert fails: renewal exceeds invoice */
   insert into mylib.sales
      set product = 'AUDIT',
          invoice = 100,
          renewal = 970,
          reason_code = "Add new product";
quit;

   /*----------------------------------------*/
   /* Print the audit trail. */
   /*----------------------------------------*/
proc print data=mylib.sales(type=audit);
  format _atuserid_ $6.;
  var product reason_code _atopcode_ _atdatetime_;
title  'Contents of the Audit Trail';
run;

   /*----------------------------------------*/
   /* Print the rejected records.            */
   /*----------------------------------------*/
proc print data=mylib.sales(type=audit);
  where _atopcode_ eq "EA";
  format _atmessage_ $250.;
  var product invoice renewal _atmessage_ ;
title  'Rejected Records';
run;
The output "Contents of MYLIB.SALES.AUDIT after an Update with Integrity Constraints" shows the contents of MYLIB.SALES.AUDIT after integrity constraints were added to MYLIB.SALES.DATA and several updates were attempted. The output "Rejected Records on the Audit Trail" shows the rejected observations that were recorded in the audit trail.
[Figure: Contents of MYLIB.SALES.AUDIT after an Update with Integrity Constraints]
[Figure: Rejected Records on the Audit Trail]
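To round out the example, the rejected observations could be extracted, corrected, and reapplied along the following lines. This is a sketch under stated assumptions, not part of the original example. The data set name WORK.FIXED and the correction rule (setting INVOICE equal to RENEWAL when INVOICE is missing or too small) are hypothetical, and the user variable REASON_CODE is deliberately left out of the reapplied data.

```sas
   /*----------------------------------------*/
   /* Extract and correct rejected records.  */
   /*----------------------------------------*/
data work.fixed;
   set mylib.sales(type=audit);
   where _atopcode_ eq "EA";
   /* Hypothetical correction rule, for illustration only. */
   if missing(invoice) or renewal > invoice then invoice = renewal;
   /* Keep only the variables that belong to the data file. */
   keep product invoice renewal;
run;

   /*----------------------------------------*/
   /* Reapply the corrected records.         */
   /*----------------------------------------*/
proc append base=mylib.sales data=work.fixed;
run;
```

The original EA records remain in the audit trail after the corrected observations are appended, so rerunning the extraction step selects them again. Any observation that still violates an integrity constraint is rejected and logged with a new EA record.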