Problem Note 70029: Loading a Teradata table into SAS® Cloud Analytic Services (CAS) with SCANSTRINGCOLUMNS=true and NUMREADNODES=0 might take a long time to execute
Loading a DBMS table into SAS Cloud Analytic Services (CAS) might take longer than expected or fail to complete when you use the following data connector options in your CASLIB:
- SCANSTRINGCOLUMNS=TRUE
- NUMREADNODES=0 (or greater than 1)
The problem occurs when loading Teradata tables, but the issue might also occur with other data connectors.
The implementation of SCANSTRINGCOLUMNS=TRUE for SAS® Data Connector to Teradata was not intended for use when NUMREADNODES is set to a value other than 1. In addition, no performance benefit is expected when you use both data connector options together.
Your database administrator might report the unexpected behavior, or you might observe it in the driver trace outputs, which are available by adding the following options to your CASLIB:
DRIVER_TRACE="SQL",
DRIVER_TRACEFILE="path-available-to-all-cas-nodes/my-trace$HOST.log",
DRIVER_TRACEOPTIONS="TIMESTAMP|APPEND"
A driver SQL SELECT statement is displayed in each CAS node log that is referenced by DRIVER_TRACEFILE. This statement contains a MAX(LENGTH("column-name")) expression for each VARCHAR column. The statement should appear only in the log for the CAS head node. Its presence in every CAS node log indicates that the query ran on every CAS node.
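The following Python sketch is one way to check the trace files for the length-scan query. It is not a SAS-supplied tool. It assumes that the trace files were collected into one directory and that their names match the hypothetical pattern my-trace*.log, which follows the DRIVER_TRACEFILE example above. The function name and the regular expression are illustrative.

```python
import re
from pathlib import Path

# Matches the length-scan expression described in this note,
# for example MAX(LENGTH("column-name")). Case and spacing may vary.
SCAN_PATTERN = re.compile(r'MAX\s*\(\s*LENGTH\s*\(', re.IGNORECASE)


def find_scan_queries(trace_dir, glob_pattern="my-trace*.log"):
    """Return {file name: count of lines that contain the scan expression}.

    trace_dir and glob_pattern are assumptions about where the trace
    files are stored and how they are named.
    """
    counts = {}
    for path in sorted(Path(trace_dir).glob(glob_pattern)):
        with open(path, encoding="utf-8", errors="replace") as handle:
            hits = sum(1 for line in handle if SCAN_PATTERN.search(line))
        if hits:
            counts[path.name] = hits
    return counts
```

If the returned dictionary lists more than one trace file, the length-scan query was issued from more than one CAS node.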
Workarounds
To circumvent this issue, complete any of the following workarounds:
- Use SCANSTRINGCOLUMNS=TRUE with NUMREADNODES=1.
- Use SCANSTRINGCOLUMNS=FALSE with NUMREADNODES=0 (or greater than 1).
- Implement parallel data transfer. This approach requires that you license SAS® In-Database Technologies and that your SAS administrator deploy SAS® Embedded Process.
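The first two workarounds amount to a simple rule about the option combination. As an illustration only, the rule can be sketched as a small Python check that you might run against the option values before you define a CASLIB. The function name and argument names are hypothetical and are not part of any SAS interface.

```python
def is_affected(scan_string_columns, num_read_nodes):
    """Return True for the option combination that this note describes.

    The combination is SCANSTRINGCOLUMNS=TRUE together with a
    NUMREADNODES value other than 1 (that is, 0 or greater than 1).
    """
    return bool(scan_string_columns) and num_read_nodes != 1
```

For example, is_affected(True, 0) returns True, and is_affected(True, 1) and is_affected(False, 0) both return False.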
Click the Hot Fix tab in this note for a link to instructions about accessing and applying the software update.
Operating System and Release Information
SAS System | SAS Data Connector to Teradata | Linux for x64 | V.03.05 | | Viya | |
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
The SCANSTRINGCOLUMNS=true data connector option causes additional queries to run in Teradata for every CAS Node, which results in poor performance.
Type: | Problem Note |
Priority: | high |
Topic: | Data Management ==> Data Sources ==> External Databases ==> Teradata |
Date Modified: | 2023-04-26 09:17:16 |
Date Created: | 2023-04-18 03:31:18 |