Problem Note 30645: TMFILTER macro may set TRUNCATED=1 incorrectly
TRUNCATED is a Boolean flag indicating whether the text content of a document file was truncated when creating the SAS data set. The value is set to 0 when the document text is not truncated, and 1 when the document text is truncated.
Once TMFILTER correctly identifies one file as truncated, it incorrectly sets TRUNCATED=1 for every file that it processes afterward, even when the complete text of those files was read.
Use the value of the SIZE variable in the SAS data set to determine whether the text data was truncated. SIZE correctly identifies the number of bytes in the file. If SIZE is greater than the value you set for the NUMCHARS or NUMBYTES parameter in TMFILTER, then your document was truncated.
Even when TRUNCATED is incorrectly set to 1, the document text itself is processed correctly. Only the value of the indicator variable is wrong.
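The SIZE comparison described above can be sketched in a few lines. This is a minimal illustration, not part of TMFILTER: it assumes the TMFILTER output data set has been exported to plain records carrying the SIZE and TRUNCATED values, and the helper name `recompute_truncated`, the field `truncated_check`, the file names, and the limit of 1000 are all hypothetical.

```python
# Hypothetical helper: recompute the truncation flag from SIZE rather than
# trusting the TRUNCATED value written by TMFILTER.
def recompute_truncated(rows, limit):
    """Return copies of rows with 'truncated_check' = 1 when SIZE exceeds limit."""
    checked = []
    for row in rows:
        fixed = dict(row)  # copy so the original records are left untouched
        fixed["truncated_check"] = 1 if row["size"] > limit else 0
        checked.append(fixed)
    return checked

# Illustrative records only; 'limit' stands in for the NUMCHARS or NUMBYTES
# value that was passed to TMFILTER.
rows = [
    {"name": "a.txt", "size": 500, "truncated": 0},
    {"name": "b.txt", "size": 5000, "truncated": 1},  # genuinely truncated
    {"name": "c.txt", "size": 800, "truncated": 1},   # flagged by the bug
]
checked = recompute_truncated(rows, limit=1000)

# Files whose stored flag disagrees with the SIZE-based check.
suspect = [r["name"] for r in checked if r["truncated"] != r["truncated_check"]]
print(suspect)
```

Records listed in `suspect` are those whose TRUNCATED value disagrees with the SIZE check. The same comparison can be written directly against the SAS data set.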
Select the Hot Fix tab in this note to access the hot fix for this issue.
Operating System and Release Information
Product Family | Product | System | Product Release Reported | Product Release Fixed* | SAS Release Reported | SAS Release Fixed* |
SAS System | SAS Text Miner | Microsoft Windows 2000 Professional | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows NT Workstation | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows Server 2003 Datacenter Edition | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows Server 2003 Enterprise Edition | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows Server 2003 Standard Edition | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows XP Professional | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | 64-bit Enabled AIX | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | 64-bit Enabled Solaris | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows 2000 Server | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows 2000 Datacenter Server | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
SAS System | SAS Text Miner | Microsoft Windows 2000 Advanced Server | 3.1 | 3.2 | 9.1 TS1M3 SP4 | 9.1 TS1M3 SP4 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
TMFILTER assigns TRUNCATED=1 for all files following the first file properly identified as truncated.
Type: | Problem Note |
Priority: | medium |
Date Modified: | 2008-04-30 15:26:35 |
Date Created: | 2007-11-28 20:38:50 |