Problem Note 70899: The joining of two or more BigQuery tables might result in either a performance issue or incorrect results
Trying to join two or more BigQuery tables in a SQL procedure query might result in either slow performance or incorrect results when at least one of the tables is located in a different project than the billing project. In order to access a BigQuery table that is located in a different project than the billing project, use the QUALIFIER= option to point to where the table resides generating a 3-level name: project.dataset.table_name.
When the query includes a join, the qualifier is incorrectly left off the table reference. As a result, the table is looked at in the wrong project. If there isn't a table in the project specified by the project= option, then a note similar to the following might occur, the tables are retrieved into SAS, and the join is processed in SAS:
ACCESS ENGINE: ERROR: CLI prepare error: Table "austin_waste.waste_and_diversion" does not exist or cannot be accessed Error:
googleapi: Error 404: Not found: Dataset sas-sbo:austin_waste, notFound
ACCESS ENGINE: SQL statement was not passed to the DBMS, SAS will do the processing.
However, if there was a table with the same name found in the project specified with the project= option, the join is passed to the database and the tables in that project are joined, resulting in what is likely incorrect results.
The only workaround for this issue is to use explicit pass-through.
Click the Hot Fix tab in this note to access the hot fix for this issue.
Operating System and Release Information
SAS System | SAS/ACCESS Interface to Google BigQuery | Linux for x64 | 9.4 TS1M6 | |
*
For software releases that are not yet generally available, the Fixed
Release is the software release in which the problem is planned to be
fixed.
Type: | Problem Note |
Priority: | high |
Date Modified: | 2024-09-24 13:47:42 |
Date Created: | 2024-07-23 11:40:11 |