Problem Note 63952: Queries of Hadoop tables return "WARNING: The following column could have a length in SAS of 32767..." and cause performance problems
When you query Hadoop tables, you might encounter performance problems and see a warning similar to the following issued in the SAS® log:
WARNING: The following column could have a length in SAS of 32767. If so, SAS performance is
impacted. See SAS/ACCESS documentation for details. The column read from Hive
followed by the maximum length observed was: col_v1:2
This problem occurs because SAS maps STRING and other complex data types to the maximum possible character length, CHAR(32767), regardless of the actual length of the data. As a result, performance suffers and the SAS tables that are created are unnecessarily large.
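To get a feel for the size difference, the following rough sketch compares the storage for a single character column at the default length with the storage at the length observed in the warning. The row count is a hypothetical value, and the estimate assumes an uncompressed SAS table in which every row stores the full declared column width.

```sas
/* Rough, uncompressed storage estimate for one character column.      */
/* The row count is hypothetical; the lengths come from the warning.   */
%let rows        = 1000000;  /* assumed number of rows in the table    */
%let default_len = 32767;    /* length SAS assigns to a STRING column  */
%let actual_len  = 2;        /* maximum length observed (col_v1:2)     */

/* Bytes per column = rows * declared length; convert to megabytes.    */
%put Default mapping: %sysevalf(&rows * &default_len / 1024**2) MB;
%put Observed length: %sysevalf(&rows * &actual_len  / 1024**2) MB;
```

Under these assumptions, the default mapping needs tens of gigabytes for that one column, whereas the observed length needs about 2 MB.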
To work around this issue, use the DBMAX_TEXT= LIBNAME option or data set option, or the DBSASTYPE= data set option, to control the column lengths. When you specify the DBMAX_TEXT= option in the LIBNAME statement that connects to Hadoop, the value is applied to all character columns. In contrast, you must specify the DBSASTYPE= data set option for each STRING column individually.
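The two workarounds might look like the following minimal sketch. The server name, port, schema, credentials, libref, table name (mytable), and the chosen lengths are all hypothetical placeholders. Only col_v1 is taken from the warning text above. Adjust the lengths to fit your actual data.

```sas
/* Workaround 1: cap the length of all character columns with the      */
/* DBMAX_TEXT= LIBNAME option. Connection values are placeholders.     */
libname hdp hadoop server="hive.example.com" port=10000
        schema=default user=myuser password="XXXXXXXX"
        dbmax_text=255;

/* Workaround 2: override the SAS type of one STRING column with the   */
/* DBSASTYPE= data set option. Repeat the assignment inside the        */
/* parentheses for each STRING column that you want to resize.         */
data work.sample;
   set hdp.mytable(dbsastype=(col_v1='CHAR(20)'));
run;
```

The LIBNAME option is convenient when one limit suits every character column in the library. The data set option is more precise, but you must list each column.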
Click the Hot Fix tab in this note to access the hot fix for this issue.
After you install the hot fix, you can use a new environment variable, SAS_HADOOP_FAIL_32767, which causes queries of Hadoop tables that contain columns with STRING or other complex data types to fail. You must set this environment variable before you submit the LIBNAME statement that connects to Hadoop. If you set the environment variable after the LIBNAME connection is established, the behavior remains unchanged.
You can set the environment variable using the traditional methods. To set the environment variable in a SAS session, use this syntax:
options set=SAS_HADOOP_FAIL_32767="1";
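Because the order of the statements matters, a session might be arranged as in the following sketch. The connection values and the libref are hypothetical placeholders, and the %PUT line is only an optional check that the variable is visible to the session.

```sas
/* Step 1: set the environment variable BEFORE connecting to Hadoop.   */
options set=SAS_HADOOP_FAIL_32767="1";

/* Optional: write the current value to the SAS log.                   */
%put NOTE: SAS_HADOOP_FAIL_32767=%sysget(SAS_HADOOP_FAIL_32767);

/* Step 2: assign the libref only after the variable is set.           */
/* Server, port, schema, and credentials are placeholders.             */
libname hdp hadoop server="hive.example.com" port=10000
        schema=default user=myuser password="XXXXXXXX";
```

If the libref is already assigned when you set the variable, clear the libref and assign it again so that the new connection picks up the setting.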
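Operating System and Release Information
Product Family: SAS System | Product: SAS/ACCESS Interface to Hadoop |
System | Product Release Reported | Product Release Fixed* | SAS Release Reported | SAS Release Fixed* |
Microsoft® Windows® for x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |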
Microsoft Windows 8 Enterprise 32-bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8 Enterprise x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8 Pro 32-bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8 Pro x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8.1 Enterprise 32-bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8.1 Enterprise x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8.1 Pro 32-bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 8.1 Pro x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows 10 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2008 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2008 R2 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2008 for x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2012 Datacenter | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2012 R2 Datacenter | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2012 R2 Std | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Microsoft Windows Server 2012 Std | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Enterprise 32 bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Enterprise x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Home Premium 32 bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Home Premium x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Professional 32 bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Professional x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Ultimate 32 bit | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Windows 7 Ultimate x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
64-bit Enabled AIX | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
64-bit Enabled Solaris | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
HP-UX IPF | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Linux for x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
Solaris for x64 | 9.43 | 9.45 | 9.4 TS1M3 | 9.4 TS1M5 |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
When you perform queries of Hadoop tables that contain complex or STRING data types, you encounter performance problems. You also see a message similar to the following: "WARNING: The following column could have a length in SAS of 32767. If so, SAS performance is impacted..."
Type: | Problem Note |
Priority: | high |
Date Modified: | 2019-04-10 14:37:23 |
Date Created: | 2019-04-02 09:12:55 |