Problem Note 37898: Incorrect results might be produced when the TRANSPOSE procedure uses 2 GB or more of memory
PROC TRANSPOSE can produce incorrect output, including corrupted data or missing values, when you run 64-bit editions of SAS® in UNIX operating environments and the procedure uses 2 GB or more of memory. This problem can occur when both of the following are true: the width (in bytes) of the output observation multiplied by the number of variables being transposed from the input data set exceeds (2**31)-1 (2,147,483,647, or roughly 2 billion), and the procedure allocates 2 GB (2,147,483,648 bytes) or more of memory to perform the transposition.
For example, this problem can occur when transposing a data set that contains 300 numeric variables and 1 million observations into a data set that contains 1 million numeric variables and 300 observations if PROC TRANSPOSE allocates enough memory (about 2.5 GB) to complete the transposition without using a utility file. Note that this 2.5 GB figure reflects only the most significant portion of memory required and not the total memory that is required by the procedure, which will be greater. This transposition will result in an observation length, in the output data set, of 8 million bytes (8 bytes per numeric variable * 1,000,000 numeric variables). Multiplying the output observation length (8,000,000) by the number of variables from the input data set being transposed (300) results in 2.4 billion, which exceeds (2**31)-1.
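The arithmetic in this example can be checked with a short script. This is only an illustrative sketch: the function name `transpose_product` is hypothetical, and the sketch assumes that every transposed variable is numeric and occupies 8 bytes. It checks only the first condition (the product exceeding (2**31)-1); it cannot tell you how much memory PROC TRANSPOSE actually allocates.

```python
# Illustrative check of the figures quoted in this note.
# Assumption: all transposed variables are numeric (8 bytes each).
BYTES_PER_NUMERIC = 8
INT32_MAX = 2**31 - 1  # 2,147,483,647


def transpose_product(input_vars, output_vars, bytes_per_var=BYTES_PER_NUMERIC):
    """Return (output observation length in bytes) * (number of input variables)."""
    output_obs_length = output_vars * bytes_per_var
    return output_obs_length * input_vars


# 300 input variables, 1,000,000 observations -> 1,000,000 output variables.
product = transpose_product(input_vars=300, output_vars=1_000_000)
exceeds_limit = product > INT32_MAX

print(f"product = {product:,}")
print(f"exceeds (2**31)-1: {exceeds_limit}")
```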
The problem is not limited to transpositions that occur entirely in memory; it can also occur when a utility file is used in conjunction with 2 GB or more of memory. Nor is it limited to wide or narrow input or output data sets. As stated above, it can occur whenever the observation length of the output data set, multiplied by the number of variables being transposed, exceeds (2**31)-1 and the procedure allocates 2 GB or more of memory to perform the transposition.
Note that this problem is less likely to occur when a BY statement is specified, because the width of the output data set then depends on the largest number of observations in any single BY group of the input data set. If there are many BY groups and few observations per group, the output data set is narrow. Because the number of variables taking part in the transposition is constant, the product of this smaller output observation length and the number of transposed variables is less likely to exceed (2**31)-1 than it would be if no BY statement were specified. Specifying no BY statement is equivalent to having a single BY group that contains all observations; in that case, the width of the output data set depends on the total number of observations in the input data set.
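The effect of BY groups on the product can be sketched as follows. The function name `product_for_groups` and the group sizes are hypothetical, chosen only to illustrate the reasoning above. The sketch assumes 8-byte numeric variables and ignores any additional variables (such as BY variables) that contribute to the output observation length, so real figures will be somewhat larger.

```python
# Illustrative only: how BY-group size drives the output observation width.
# Assumption: all transposed variables are numeric (8 bytes each), and
# other variables in the output observation are ignored.
BYTES_PER_NUMERIC = 8
INT32_MAX = 2**31 - 1  # 2,147,483,647


def product_for_groups(group_sizes, input_vars, bytes_per_var=BYTES_PER_NUMERIC):
    """Output width follows the largest BY group; return width * input_vars."""
    widest_group = max(group_sizes)
    output_obs_length = widest_group * bytes_per_var
    return output_obs_length * input_vars


# No BY statement: one group that contains all 1,000,000 observations.
no_by = product_for_groups([1_000_000], input_vars=300)

# Hypothetical BY statement: 1,000 groups of 1,000 observations each.
with_by = product_for_groups([1_000] * 1_000, input_vars=300)

print(f"no BY:   {no_by:,} (exceeds limit: {no_by > INT32_MAX})")
print(f"with BY: {with_by:,} (exceeds limit: {with_by > INT32_MAX})")
```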
To circumvent the problem, do one of the following:
- Set the REALMEMSIZE option to 2,147,483,647 at SAS startup. Note that this memory limitation will apply to all steps in the code.
- Split the transposition into more than one PROC TRANSPOSE step, so that each step handles a smaller part of the work.
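This note does not say how to divide the work across multiple PROC TRANSPOSE steps. One possible reading, shown here purely as an assumption, is to transpose a subset of the input variables in each step so that the product for every step stays at or below (2**31)-1. The helper names `max_vars_per_step` and `plan_steps` are hypothetical, and the sketch again assumes 8-byte numeric variables. It is a planning aid only and does not guarantee that a given step allocates less than 2 GB.

```python
# Hypothetical planning aid: how many input variables can each
# PROC TRANSPOSE step handle before the product exceeds (2**31)-1?
# Assumption: the work is split by input variable, and all transposed
# variables are numeric (8 bytes each).
BYTES_PER_NUMERIC = 8
INT32_MAX = 2**31 - 1  # 2,147,483,647


def max_vars_per_step(output_vars, bytes_per_var=BYTES_PER_NUMERIC):
    """Largest variable count whose product stays at or below INT32_MAX."""
    output_obs_length = output_vars * bytes_per_var
    return INT32_MAX // output_obs_length


def plan_steps(total_vars, vars_per_step):
    """Split total_vars into per-step counts of at most vars_per_step each."""
    if vars_per_step < 1:
        raise ValueError("output observation is too wide for even one variable")
    full_steps, remainder = divmod(total_vars, vars_per_step)
    return [vars_per_step] * full_steps + ([remainder] if remainder else [])


# Figures from the example in this note: 300 variables, 1,000,000 observations.
limit = max_vars_per_step(output_vars=1_000_000)
steps = plan_steps(total_vars=300, vars_per_step=limit)

print(f"variables per step: {limit}")
print(f"planned steps: {steps}")
```

Under this assumption, the outputs of the separate steps would then need to be combined into one data set.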
Click the Hot Fix tab in this note to access the hot fix for this issue.
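Operating System and Release Information
Product Family | Product | System | Reported Release | Fixed Release* |
SAS System | Base SAS | 64-bit Enabled AIX | 9.1 TS1M0 | 9.2 TS2M3 |
SAS System | Base SAS | 64-bit Enabled HP-UX | 9.1 TS1M0 | 9.2 TS2M3 |
SAS System | Base SAS | 64-bit Enabled Solaris | 9.1 TS1M0 | 9.2 TS2M3 |
SAS System | Base SAS | Tru64 UNIX | 9.1 TS1M0 | 9.2 TS2M3 |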
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: | Problem Note |
Priority: | high |
Date Modified: | 2009-11-19 10:34:57 |
Date Created: | 2009-11-19 08:16:56 |