Problem Note 33261: %TMFILTER macro does not find links when URL= uses single quotes

The %TMFILTER macro, included with SAS® Text Miner, supports the URL= parameter for extracting text content from web pages. The filtering engine should begin at the page referenced by the URL= parameter and then "crawl" to each page that it links to.

However, if an HTML anchor tag encloses the URL in its href attribute in single-quote characters, the link is ignored and the pages it points to are omitted from the %TMFILTER output.

There are no errors or warnings indicating that links were overlooked. To determine whether the problem occurred, examine the %TMFILTER output to see if it contains fewer observations than you expected.
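Because the macro gives no indication that a link was skipped, it can help to check the source of the starting page for single-quoted links before comparing observation counts. The following is a minimal Python sketch of such a check, not part of SAS Text Miner; the helper name `find_single_quoted_links` and the sample page are hypothetical, and the regular expression is a simple approximation rather than a full HTML parser.

```python
import re

# Hypothetical helper, not a SAS-supplied tool: matches anchor tags whose
# href value is delimited by single quotes and captures the URL.
SINGLE_QUOTED_HREF = re.compile(r"<a\b[^>]*?\shref\s*=\s*'([^']*)'", re.IGNORECASE)

def find_single_quoted_links(html):
    """Return the href values of anchor tags that quote the URL with single quotes."""
    return SINGLE_QUOTED_HREF.findall(html)

# Sample page source (illustrative only).
page = """<a href='/resources'>Company Resources</a>
<a href="/about">About</a>"""

print(find_single_quoted_links(page))
```

Any URL that this check reports is a candidate for a page that may be missing from the %TMFILTER output.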

The problem does not occur for links whose href value is enclosed in double quotes.

For example, %TMFILTER will not follow the link to the resources page if the anchor tag is written like this:

<a href='/resources'>Company Resources</a>

The same link written with double quotes is followed:

<a href="/resources">Company Resources</a>
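If you control the pages being crawled, or are filtering local copies of them, one possible way to sidestep the problem is to rewrite single-quoted href values with double quotes before running %TMFILTER. The Python sketch below illustrates the idea under that assumption; the function name `double_quote_hrefs` is hypothetical, this is not a SAS-provided fix, and the pattern deliberately skips URLs that themselves contain a double quote.

```python
import re

# Hypothetical helper: captures the tag text up to the href value (group 1)
# and the single-quoted URL (group 2). URLs containing a double quote are
# left alone so the rewritten tag stays well formed.
HREF_SINGLE = re.compile(r"""(<a\b[^>]*?\shref\s*=\s*)'([^'"]*)'""", re.IGNORECASE)

def double_quote_hrefs(html):
    """Return html with single-quoted anchor href values re-quoted with double quotes."""
    return HREF_SINGLE.sub(r'\1"\2"', html)

before = "<a href='/resources'>Company Resources</a>"
print(double_quote_hrefs(before))
```

Links that already use double quotes pass through unchanged, so the rewrite can be applied to a whole page at once.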


Operating System and Release Information

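| Product Family | Product | System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
|---|---|---|---|---|---|---|
| SAS System | SAS Text Miner | 64-bit Enabled Solaris | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | 64-bit Enabled AIX | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | Microsoft Windows XP Professional | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | Microsoft Windows Server 2003 Standard Edition | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | Microsoft Windows Server 2003 Enterprise Edition | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | Microsoft Windows Server 2003 Datacenter Edition | 3.2 | 4.1 | 9.1 TS1M3 SP4 | 9.2 TS2M0 |
| SAS System | SAS Text Miner | Microsoft Windows NT Workstation | 3.2 | | 9.1 TS1M3 SP4 | |
| SAS System | SAS Text Miner | Microsoft Windows 2000 Professional | 3.2 | | 9.1 TS1M3 SP4 | |
| SAS System | SAS Text Miner | Microsoft Windows 2000 Server | 3.2 | | 9.1 TS1M3 SP4 | |
| SAS System | SAS Text Miner | Microsoft Windows 2000 Datacenter Server | 3.2 | | 9.1 TS1M3 SP4 | |
| SAS System | SAS Text Miner | Microsoft Windows 2000 Advanced Server | 3.2 | | 9.1 TS1M3 SP4 | |
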
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.