Problem Note 16089: Transform Variables node may incorrectly assign missing values when
taking the log
Taking the logarithm is one of the transformations that the Transform
Variables node provides. If the minimum value of the variable being
transformed (X) is less than or equal to 0 in the metadata sample, the
node adds a constant (C) to every value so that the minimum becomes 1
before the log is taken:
Y = log(X+C)
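
As an illustration only (Python, not the node's actual SAS code), the shift can be sketched as follows. The formula C = 1 - min(X) is inferred from the statement that the shifted minimum is 1. The function names are hypothetical, and returning 0 when the sample minimum is positive is an assumption, because the note does not describe that case.

```python
import math

def shift_constant(sample):
    """Return C so that min(sample) + C == 1 when min(sample) <= 0.

    Assumption: no shift (C = 0) is applied when the sample minimum is positive.
    """
    lowest = min(sample)
    return 1 - lowest if lowest <= 0 else 0

def log_transform(values, c):
    """Apply Y = log(X + C) to every value (natural log)."""
    return [math.log(x + c) for x in values]

# Hypothetical metadata sample whose minimum is below 0.
sample = [-2.0, 0.0, 3.0, 10.0]
c = shift_constant(sample)
shifted_min = min(x + c for x in sample)
y = log_transform(sample, c)
```

In this sketch the smallest sampled value maps to log(1) = 0, so every value in the sample has a defined logarithm.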
Since the minimum of the variable in the metadata sample might not be as
small as the minimum of the variable in the full data set, the node
assigns a non-missing transformed value only if X is greater than 0.
if X > 0 then Y = log(X+C);
else Y = .;
This test is not correct. An observation with X less than or equal to 0
but (X+C) greater than 0 has a valid logarithm, yet it is assigned a
missing value by mistake.
Instead, the node should assign a non-missing transformed value only if
(X+C) is greater than 0.
if (X+C) > 0 then Y = log(X+C);
else Y = .;
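
To make the difference between the two tests concrete, here is a small Python sketch (not the node's SAS code) that applies both rules to the same values. The names, the value of C, and the data are hypothetical. `None` stands in for the SAS missing value.

```python
import math

MISSING = None  # stands in for the SAS missing value "."

def log_as_shipped(x, c):
    """Rule described as incorrect: tests X > 0."""
    return math.log(x + c) if x > 0 else MISSING

def log_corrected(x, c):
    """Rule described as correct: tests (X + C) > 0."""
    return math.log(x + c) if (x + c) > 0 else MISSING

c = 3.0  # hypothetical constant, as if the sample minimum were -2
full_data = [-5.0, -2.0, -1.0, 0.0, 4.0]

# Values that the shipped rule sets to missing although log(X + C) is defined.
wrongly_missing = [
    x for x in full_data
    if log_as_shipped(x, c) is MISSING and log_corrected(x, c) is not MISSING
]
```

With these inputs, -2, -1, and 0 are lost only under the X > 0 test, while -5 is missing under both rules because -5 + C is not positive.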
There are no error or warning messages to indicate there is a problem.
The only way to know that the values are incorrect is to compare the
minimum value in the full data set to the minimum value in the metadata
sample. If they are not the same, then the problem will occur.
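
The comparison described above could be scripted along these lines. This is a hedged Python sketch with hypothetical function names and data, not a SAS Enterprise Miner facility. The count of affected values is an added convenience that assumes C = 1 - min(sample).

```python
def minimums_differ(sample, full_data):
    """True when the full data set and the metadata sample have different minimums."""
    return min(full_data) != min(sample)

def count_lost_values(full_data, c):
    """Count values that fail the X > 0 test although X + C > 0."""
    return sum(1 for x in full_data if x <= 0 and x + c > 0)

# Hypothetical data: the sample misses the smallest value in the full data set.
sample = [-2.0, 0.0, 3.0]
full_data = [-5.0, -2.0, -1.0, 0.0, 3.0, 8.0]

c = 1 - min(sample)  # assumed shift constant, so the sample minimum maps to 1
differ = minimums_differ(sample, full_data)
lost = count_lost_values(full_data, c)
```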
A fix for SAS Enterprise Miner Server 4.3 for this issue is
available at:
http://ftp.sas.com/techsup/download/hotfix/dmine43.html
A fix for SAS Enterprise Miner Server 4.3 for National Language
Support (Japanese Translations) for this issue is available at:
http://ftp.sas.com/techsup/download/hotfix/nls_d9_ja.html
Operating System and Release Information
SAS System | SAS Enterprise Miner |
Microsoft Windows NT Workstation | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows Server 2003 Standard Edition | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft® Windows® for 64-Bit Itanium-based Systems | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows Server 2003 Enterprise Edition | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows Server 2003 Datacenter Edition | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows 2000 Professional | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows 2000 Server | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Solaris | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows 2000 Advanced Server | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows 2000 Datacenter Server | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
z/OS | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
64-bit Enabled Solaris | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Microsoft Windows XP Professional | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Linux | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
HP-UX | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
HP-UX IPF | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
64-bit Enabled HP-UX | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
64-bit Enabled AIX | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
OpenVMS Alpha | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
Tru64 UNIX | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
AIX | 4.3 | 5.1 | 9.1 TS1M3 | 9.1 TS1M0 |
* For software releases that are not yet generally available, the Fixed
Release is the software release in which the problem is planned to be
fixed.
Type: | Problem Note |
Priority: | alert |
Topic: | Analytics ==> Data Mining |
Date Modified: | 2006-05-02 08:51:31 |
Date Created: | 2005-09-02 13:00:41 |