Understanding the Observation Count in a SAS Data File

Definition of the Observation Count

The observation count in a SAS data file is the total number of observations (rows) that are currently in the file combined with the number of deleted observations. The observation count is a file attribute that you can list for a specific SAS data file by executing the CONTENTS procedure or the CONTENTS statement in the DATASETS procedure. In the procedure output, the observation count is the sum of the values in the Observations and Deleted Observations fields. Knowing the observation count is beneficial for managing file size and estimating disk space requirements. In addition, there is a maximum number of observations that can be counted for a SAS data file, which is determined by the long integer size for the operating environment.
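As a minimal sketch of how the attribute can be listed, the following statements run the CONTENTS procedure and then the CONTENTS statement in the DATASETS procedure. The libref MYFILES and the member name BIGFILE are reused from the error message example later in this section and are assumed to be already assigned; substitute your own names.

```sas
/* Assumption: the libref MYFILES is already assigned and contains BIGFILE. */

/* List file attributes, including the Observations and      */
/* Deleted Observations fields, with the CONTENTS procedure. */
proc contents data=myfiles.bigfile;
run;

/* Produce the same listing with the CONTENTS statement */
/* in the DATASETS procedure.                           */
proc datasets library=myfiles nolist;
   contents data=bigfile;
quit;
```

Adding the values in the Observations and Deleted Observations fields of either listing gives the observation count.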

Maximum Observation Count

The maximum number of observations that can be counted for a SAS data file is determined by the long integer data type size for the operating environment.
  • In operating environments with a 32-bit long integer, the maximum number is 2^31-1, or approximately two billion observations (2,147,483,647).
  • In operating environments with a 64-bit long integer, the maximum number is 2^63-1, or approximately 9.2 quintillion observations.
It is unlikely that a SAS data file in an operating environment with a 64-bit long integer will reach the maximum observation count. However, for operating environments with a 32-bit long integer, reaching the maximum observation count of approximately two billion observations is not unusual.
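The 32-bit limit can be checked with a short DATA step. This is only an illustrative sketch; the variable name max32 is hypothetical. The 64-bit limit is not computed here because SAS numeric variables are 8-byte floating-point values and cannot represent 2^63-1 exactly.

```sas
/* Compute the largest value of a signed 32-bit long integer. */
data _null_;
   max32 = 2**31 - 1;       /* 2,147,483,647 */
   put max32= comma20.;     /* write the value to the SAS log */
run;
```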
The SAS 9.3 operating environments whose internal data representation stores the observation count as a 32-bit long integer include the following platforms:
  • Linux for 32-bit Intel architecture
  • Microsoft Windows on 32-bit platform
  • Microsoft Windows 64-bit Edition. In this 64-bit operating environment, the long integer data type uses the 32-bit model to maintain compatibility with 32-bit applications.
  • z/OS on 32-bit platform

SAS Processing When the Maximum Observation Count Is Reached

When a SAS data file reaches the maximum observation count, continued SAS processing depends on whether the file has an index or an integrity constraint that uses an index.
  • If the SAS data file has an index or an integrity constraint that uses an index (unique key, primary key, and foreign key), when an operation reaches the maximum observation count, an error message is issued. For example:
    ERROR: File MYFILES.BIGFILE contains 2G -1 observations and cannot 
    hold more because it contains an index or an Integrity Constraint 
    that uses an index.
    For SAS 9, a SAS data file is never damaged when an operation attempts to exceed the maximum observation count. However, you must take explicit action to continue processing the file.
  • If the SAS data file does not have an index or an integrity constraint that uses an index, sequential processing continues and additional observations are accepted. However, the file cannot store the observation count and does not maintain the observation numbers. Any operation that requires an observation number is not available. There are no messages to indicate that the file has reached or exceeded the maximum observation count.
    The following list describes some of the operations and features that are limited for a SAS data file that exceeds the maximum observation count and does not have an index or an integrity constraint that uses an index. For a complete list, contact SAS Technical Support.
    • SAS procedures that return an observation count (such as the PRINT procedure or the CONTENTS procedure) return a missing value, which is represented by a period (.), for the number of observations.
    • SAS procedures that depend on the observation count (for example, the SORT procedure or the COMPARE procedure) can return unpredictable results.
    • Operations that update the observation count cannot be submitted. You cannot reset the observation count by deleting observations.
    • When you request to compress a file for which the observation count is no longer maintained, the compression percentage cannot be calculated.
    • You cannot create an index or an integrity constraint.
    • For CEDA processing between operating environments, the following behavior occurs. Note that SAS 9.3 provides improved CEDA processing between operating environments:
      | Operating Environment | Operation | Behavior Before SAS 9.3 | SAS 9.3 Behavior |
      |---|---|---|---|
      | 32-bit long integer operating environment | Open a 64-bit long integer file that exceeds the 32-bit maximum. | Open fails. | Opens the file due to improved 32-bit counters. |
      | 32-bit long integer operating environment | Create a 64-bit long integer file that exceeds the 32-bit maximum. | Output processing stops. | Creates the file due to improved 32-bit counters. |
      | 64-bit long integer operating environment | Open a 32-bit long integer file that exceeds the 32-bit maximum. | Open fails. | Opens the file with limited functionality, because the observation number is not available. |
      | 64-bit long integer operating environment | Create a 32-bit long integer file that exceeds the 32-bit maximum. | Output processing stops. File is not created. | Creates the file up to the 32-bit maximum. |

Recovering from an Exceeded Maximum Observation Count

If a SAS data file has reached or exceeded the maximum number of observations that can be counted, and the file has an index or an integrity constraint that uses an index, then you must take explicit action to continue processing.
  • You can delete the index or the integrity constraint and continue processing. However, because the file exceeds the maximum observation count, you have limited functionality. You can use the DATASETS procedure or the SQL procedure to delete indexes and integrity constraints. See the Base SAS Procedures Guide.
  • If you want to retain your index or integrity constraint, you must recreate the SAS data file and specify the EXTENDOBSCOUNTER= option. See Extending the Observation Count in a SAS Data File.
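The first option above, deleting the index or integrity constraint, might look like the following sketch with the DATASETS procedure. The libref MYFILES and member name BIGFILE are assumptions carried over from the error message example. Using _ALL_ removes every integrity constraint and index on the file; name specific constraints or indexes instead if you want to remove only some of them.

```sas
/* Assumption: MYFILES.BIGFILE is the file that reached the maximum count. */
proc datasets library=myfiles nolist;
   modify bigfile;
      ic delete _all_;      /* remove all integrity constraints */
      index delete _all_;   /* remove all remaining indexes     */
quit;
```

Integrity constraints are deleted first in this sketch because the indexes that they own are removed along with them. If other files hold foreign keys that reference a primary key in this file, those foreign keys would likely need to be deleted before the primary key can be removed.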
If a SAS data file has reached or exceeded the maximum number of observations that can be counted, and the file does not have an index or an integrity constraint that uses an index, there are no messages to indicate that the file has reached or exceeded the maximum observation count. However, the file has limited functionality. To regain functionality, you can recreate the SAS data file and specify the EXTENDOBSCOUNTER= option. See Extending the Observation Count in a SAS Data File.
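Recreating the file with the EXTENDOBSCOUNTER= option could be sketched with a DATA step such as the one below. The output name BIGFILE_EXT is hypothetical, and the libref MYFILES is assumed to be assigned. See Extending the Observation Count in a SAS Data File for the authoritative syntax and restrictions.

```sas
/* Copy the observations into a new file that is created */
/* with the extended observation counter.                */
data myfiles.bigfile_ext (extendobscounter=yes);
   set myfiles.bigfile;
run;
```

A DATA step copy such as this one does not carry indexes or integrity constraints over to the new file, so they would need to be re-created on the new file afterward.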