Many factors affect numeric precision, which is a common issue for many people, including SAS users. Although computers and software can help, they still limit how precisely you can calculate, compare, and represent data. Therefore, only the people who generate and use the data can determine the degree of precision that suits their enterprise needs.
As you decide what degree of precision you want, you need to consider the following system factors, which can cause calculation differences:
The following factors can also cause differences:
You also need to consider how conversions are performed on, between, or across any of these system or calculation items.
Depending on the degree of precision that you want, calculating the value of r can leave a tiny residual in the floating-point unit. When you compare the value of r to 0.0, you might find that r≠0.0: the numbers are very close but not equal. This kind of discrepancy can stem from problems in representing, rounding, displaying, and selectively extracting data.
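Some numbers can be represented exactly in binary floating point. The number 10.25 is one of them, as shown in this example:
data x;
   x=10.25;
   put x hex16.;
run;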
The output from this DATA step is an exact number: 4024800000000000.
However, the number 10.1 cannot be represented exactly, as shown in this example:
data x;
   x=10.1;
   put x hex16.;
run;
The output from this DATA step is an inexact number: 4024333333333333.
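The two bit patterns above are not specific to SAS. As a minimal sketch, assuming a host that stores numbers as 8-byte IEEE 754 doubles, the following Python fragment reproduces them. Python is used here only as a neutral way to inspect the bits; the helper name hex16 is hypothetical and merely mimics what the HEX16. format displays.

```python
import struct
from decimal import Decimal


def hex16(value: float) -> str:
    """Return the 16 hex digits of the big-endian IEEE 754 double for value."""
    return struct.pack(">d", value).hex().upper()


print(hex16(10.25))  # 4024800000000000
print(hex16(10.1))   # 4024333333333333

# Decimal(float) converts the stored binary value without further rounding,
# so it shows whether the literal survived the conversion unchanged.
print(Decimal(10.25) == Decimal("10.25"))  # True: 10.25 is a finite binary fraction
print(Decimal(10.1) == Decimal("10.1"))    # False: 10.1 repeats forever in binary
```

The repeating 3s in the second pattern are the visible sign of a binary expansion that had to be cut off at 52 fraction bits.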
Rounding errors, as illustrated in the following example, can result from platform-specific differences for which there is no solution.
data x;
   x=10.1;
   put x hex16.;
   y=100000;
   newx=(x+y)-y;
   put newx hex16.;
run;
In the Windows and Linux environments, the output from this DATA step is 4024333333333333 (8/10-byte hardware double). In the Solaris x64 environment, the output is 4024333333334000 (8/8-byte hardware double).
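The 8/8-byte figure can be checked outside SAS. The sketch below assumes a typical 64-bit Python build, in which every intermediate result is rounded to an 8-byte double. It does not reproduce the Windows and Linux figure, which the text attributes to 10-byte intermediate values. The helper hex16 is a hypothetical stand-in for the HEX16. format.

```python
import struct


def hex16(value: float) -> str:
    """Return the 16 hex digits of the big-endian IEEE 754 double for value."""
    return struct.pack(">d", value).hex().upper()


x = 10.1
y = 100000.0

# x + y needs a larger exponent than x, so the low-order bits of x are
# rounded away before y is subtracted again.
newx = (x + y) - y

print(hex16(x))     # 4024333333333333
print(hex16(newx))  # 4024333333334000
print(newx == x)    # False
```

The trailing 4000 shows where the low-order bits of 10.1 were rounded off while the sum held the much larger value.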
For certain numbers (such as x.5), the displayed value depends on whether the formatting routine rounds up or down. Low-precision formatting (rounding down) can produce different results on different platforms. In the following example, the 8.3, 8.5, and HEX16. formats display the same (high-precision, rounding up) result for X on both platforms. However, the 8.1 format displays a different result because one decimal place does not yield the same level of precision.
data;
   x=input('C047DFFFFFFFFFFF', hex16.);
   put x= 8.1 x= 8.3 x= 8.5 x= hex16.;
run;
The output under Windows or Linux (high-precision formatting) is as follows:
x=-47.8 x=-47.750 x=-47.7500 x=C047DFFFFFFFFFFF
The output under Solaris x64 (low-precision formatting) is as follows:
x=-47.7 x=-47.750 x=-47.7500 x=C047DFFFFFFFFFFF
To fix the problem illustrated by this example, you must select a format that yields the next precision level; in this case, 8.2.
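One way to see why the one-decimal display is fragile is to look at the stored value itself. The following Python sketch rests on one assumption: Python's format() rounds the exact stored value, so it corresponds to the rounding-down result shown for Solaris x64, not to the Windows and Linux result. The variable names are illustrative only.

```python
import struct

# Decode the same bit pattern that the DATA step reads with the HEX16. informat.
x = struct.unpack(">d", bytes.fromhex("C047DFFFFFFFFFFF"))[0]

print(x == -47.75)  # False
print(x > -47.75)   # True: x sits one unit in the last place closer to zero

one_decimal = format(x, ".1f")
two_decimals = format(x, ".2f")
three_decimals = format(x, ".3f")

print(one_decimal)     # -47.7
print(two_decimals)    # -47.75
print(three_decimals)  # -47.750
```

At one decimal place the stored value lies a hair away from the tie at -47.75, so the result depends on whether the formatter first treats the value as exactly -47.75. At two or more decimal places that tie is no longer involved, which is consistent with the advice to use 8.2.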
Results can also vary when you access data that is stored on one system by using a client on a different system. The following example illustrates running a DATA step from a Windows client to access SAS data in the MVS environment.
data z(keep=x);
   x=5.2;
   output;
   y=1000;
   x=(x+y)-y;   /* almost 5.2 */
   output;
run;

proc print data=z;
run;
This DATA step produces the following output:
Obs     x
  1    5.2
  2    5.2
The next example illustrates the output you get when you execute the DATA step interactively under Windows or under MVS:
data z1;
   set z(where=(x=5.2));
run;
The output under Windows is as follows:
NOTE: There were 2 observations read from the data set WORK.Z1.
The output under MVS is as follows:
NOTE: There were 1 observations read from the data set WORK.Z.
      WHERE x=5.2;
NOTE: The data set WORK.Z1 has 1 observations and 1 variables.
The DATA statement used 0.00 CPU seconds and 14476K.
In the previous example, MVS did not return the expected count because the comparison does not take into account the imperfection of the data and finite precision. A test for strict equality cannot produce the correct count because it excludes the "almost 5.2" case. To obtain the correct results under MVS, you need to run the following DATA step:
data z1;
   set z(where=(compfuzz(x,5.2,1e-10)=0));
run;
Under MVS, the output from this DATA step will be as follows:
NOTE: There were 2 observations read from the data set WORK.Z.
      WHERE COMPFUZZ(x, 5.2, 1E-10)=0;
NOTE: The data set WORK.Z1 has 2 observations and 1 variables.
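The difference between an exact filter and a tolerant one can be sketched in a few lines of Python. This is an analogy under stated assumptions, not a model of any SAS host: the arithmetic is 8-byte IEEE 754, and math.isclose stands in for COMPFUZZ (the two are similar in spirit but are not the same function). How many rows an exact filter finds on a given SAS host depends on how that host carried out the arithmetic.

```python
import math

# Two rows, mirroring the data set Z: the second value is "almost 5.2".
rows = [5.2, (5.2 + 1000.0) - 1000.0]

# Strict equality keeps only the row that is bit-for-bit equal to 5.2.
exact_matches = [x for x in rows if x == 5.2]

# A relative tolerance of 1e-10 also accepts the row with the tiny residual.
fuzzy_matches = [x for x in rows if math.isclose(x, 5.2, rel_tol=1e-10)]

print(len(exact_matches))  # 1
print(len(fuzzy_matches))  # 2
```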
Once you determine the degree of precision that your enterprise needs, you can refine your software. You can use macros, sensitivity analyses, or fuzzy comparisons (for example, in extractions or filters) when you extract data from databases or from different versions of SAS.
If you are running SAS® 9.2 or later, use the COMPFUZZ (fuzzy comparison) function. Otherwise, use the following macro:
/******************************************************************************/
/* This macro defines an EQFUZZ operator. The subsequent DATA step shows      */
/* how to use this operator to test for equality within a certain tolerance.  */
/* The difference is scaled by the first argument, so that argument must be   */
/* nonzero.                                                                   */
/******************************************************************************/
%macro eqfuzz(var1, var2, fuzz=1e-12);
   abs(((&var1) - (&var2)) / (&var1)) < &fuzz
%mend;

data _null_;
   x=0;
   y=1;
   do i=1 to 10;
      x+0.1;   /* sum statement: adds 0.1 to x on each iteration */
   end;
   if x=y then put 'x exactly equal to y';
   else if %eqfuzz(x,y) then put 'x close to y';
   else put 'x nowhere close to y';
run;
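The macro scales the difference by its first argument, so a zero first argument leads to a division by zero. The Python sketch below illustrates a variant of the same idea that scales by the larger magnitude instead. This is a substituted technique, not the macro's formula, and the name eq_fuzz is hypothetical.

```python
def eq_fuzz(a: float, b: float, fuzz: float = 1e-12) -> bool:
    """Relative comparison scaled by the larger of the two magnitudes."""
    return abs(a - b) <= fuzz * max(abs(a), abs(b))


# Same experiment as the DATA step: add 0.1 ten times and compare with 1.
x = 0.0
for _ in range(10):
    x += 0.1
y = 1.0

if x == y:
    verdict = "x exactly equal to y"
elif eq_fuzz(x, y):
    verdict = "x close to y"
else:
    verdict = "x nowhere close to y"

print(verdict)            # x close to y
print(eq_fuzz(0.0, 0.0))  # True: no division, so zero operands are safe
print(eq_fuzz(0.0, 5.0))  # False
```

Because the tolerance is relative, any nonzero value compared with exactly zero is reported as different; an absolute tolerance would be needed if values near zero should count as equal.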
When you read numbers in from an external DBMS that supports precision beyond 15 digits, you can lose that precision. You cannot do anything about this for existing databases. However, when you design new databases, you can set constraints to limit precision to about 15 digits or you can select a numeric DBMS data type to match the numeric SAS data type. For example, select type BINARY_DOUBLE in Oracle (precise up to 15 digits) instead of type NUMBER (precise up to 38 digits).
When you read numbers in from an external DBMS for noncomputational purposes, use the DBSASTYPE= data set option, as shown in this example:
libname ora oracle user=scott password=tiger path=path;

data sasdata;
   set ora.catalina2(dbsastype=(c1='char(20)'));
run;
This option retrieves numbers as character strings and preserves precision beyond 15 digits. For details about the DBSASTYPE= option, see "Data Set Options for Relational Databases" in SAS/ACCESS 9.1.3 for Relational Databases: Reference.
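The effect of keeping such a value as text can be illustrated without a database. In the Python sketch below, the 22-digit value is made up for illustration, and the float conversion stands in for reading the column into an 8-byte numeric variable.

```python
from decimal import Decimal

# Hypothetical value from a DBMS column that holds more than 15 digits.
raw = "12345678901234567890.12"

as_float = float(raw)      # analogous to reading into an 8-byte numeric
as_text = raw              # analogous to retrieving the column as character
reference = Decimal(raw)   # exact decimal value, for comparison only

print(Decimal(as_float) == reference)  # False: the low-order digits were lost
print(Decimal(as_text) == reference)   # True: the character form keeps every digit
```

The character form is suitable for display, joins, or pass-through, which matches the noncomputational use described above; arithmetic on it would still require a numeric type with enough precision.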
Refer to the following resources for more detail about numeric precision, including variables that can affect precision.
|Product Family||Product||System||SAS Release|
|SAS System||Base SAS||z/OS||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft® Windows® for 64-Bit Itanium-based Systems||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows Server 2003 Datacenter 64-bit Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows Server 2003 Enterprise 64-bit Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows XP 64-bit Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft® Windows® for x64||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows 2000 Advanced Server||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows 2000 Datacenter Server||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows 2000 Server||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows 2000 Professional||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows NT Workstation||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows Server 2003 Datacenter Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows Server 2003 Enterprise Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows Server 2003 Standard Edition||9.1 TS1M3 SP4|
|SAS System||Base SAS||Microsoft Windows XP Professional||9.1 TS1M3 SP4|
|SAS System||Base SAS||Windows Vista||9.1 TS1M3 SP4|
|SAS System||Base SAS||64-bit Enabled AIX||9.1 TS1M3 SP4|
|SAS System||Base SAS||64-bit Enabled HP-UX||9.1 TS1M3 SP4|
|SAS System||Base SAS||64-bit Enabled Solaris||9.1 TS1M3 SP4|
|SAS System||Base SAS||HP-UX IPF||9.1 TS1M3 SP4|
|SAS System||Base SAS||Linux||9.1 TS1M3 SP4|
|SAS System||Base SAS||Linux on Itanium||9.1 TS1M3 SP4|
|SAS System||Base SAS||OpenVMS Alpha||9.1 TS1M3 SP4|
|SAS System||Base SAS||Tru64 UNIX||9.1 TS1M3 SP4|
|Date Modified:||2008-04-24 15:43:47|
|Date Created:||2008-03-21 14:07:49|