Problem Note 54481: PROC HPTMINE might retain some Punctuation terms that have an attribute of Mixed
In SAS® Text Miner, the HPTMINE procedure (and therefore the HP Text Miner node) does not correctly drop all terms that have a role of Punctuation. If the punctuation terms have an attribute of Mixed, PROC HPTMINE fails to drop them. This problem occurs because PROC HPTMINE handles the characters using a LATIN1 encoding instead of a WLATIN1 encoding. The two encodings differ in the byte range 0x80-0x9F: WLATIN1 (Windows-1252) assigns printable characters there, including curly quotation marks and dashes, whereas LATIN1 (ISO 8859-1) reserves that range for control codes.
There are no errors or warnings to indicate a problem.
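The encoding difference described above can be illustrated outside of SAS. The following Python sketch is not how PROC HPTMINE is implemented; it only shows, under the assumption that Python's `cp1252` and `latin-1` codecs stand in for WLATIN1 and LATIN1, that the same bytes are classified as punctuation under one encoding and as control characters under the other. The names `SAMPLE_BYTES`, `classify`, and `is_punctuation` are hypothetical helpers introduced for this illustration.

```python
import unicodedata

# Bytes in the 0x80-0x9F range that Windows-1252 maps to punctuation
# (ellipsis, curly single and double quotes, en dash, em dash).
SAMPLE_BYTES = [0x85, 0x91, 0x92, 0x93, 0x94, 0x96, 0x97]


def classify(byte_value, encoding):
    """Decode one byte and return (character, Unicode general category)."""
    char = bytes([byte_value]).decode(encoding)
    return char, unicodedata.category(char)


def is_punctuation(byte_value, encoding):
    """True if the decoded character falls in a Unicode punctuation category (P*)."""
    return classify(byte_value, encoding)[1].startswith("P")


if __name__ == "__main__":
    for b in SAMPLE_BYTES:
        w_char, w_cat = classify(b, "cp1252")   # stand-in for WLATIN1
        l_char, l_cat = classify(b, "latin-1")  # stand-in for LATIN1
        # ascii() keeps the output printable even for control characters.
        print(f"0x{b:02X}  cp1252: {ascii(w_char)} ({w_cat})  "
              f"latin-1: {ascii(l_char)} ({l_cat})")
```

A term that contains one of these bytes would look like punctuation to a tokenizer working in Windows-1252 but not to one working in ISO 8859-1, which is consistent with the symptom described in this note.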
To work around the problem, add the following line to your sasv9.cfg file:
-ENCODING WLATIN1
Operating System and Release Information
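Product Family: SAS System | Product: SAS Text Miner |
System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
Microsoft® Windows® for x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |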
Microsoft Windows 8 Enterprise x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows 8 Pro x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows 8.1 Enterprise 32-bit | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows 8.1 Enterprise x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows 8.1 Pro | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows 8.1 Pro 32-bit | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2008 R2 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2008 for x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2012 Datacenter | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2012 R2 Datacenter | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2012 R2 Std | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Microsoft Windows Server 2012 Std | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Windows 7 Enterprise x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Windows 7 Professional x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
64-bit Enabled AIX | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
64-bit Enabled Solaris | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
HP-UX IPF | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Linux for x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
Solaris for x64 | 12.1_M1 | 12.3 | 9.3 TS1M2 | 9.4 TS1M0 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: | Problem Note |
Priority: | high |
Topic: | Analytics ==> Data Mining Analytics ==> Text Mining |
Date Modified: | 2014-11-12 10:43:37 |
Date Created: | 2014-10-29 11:04:14 |