Problem Note 67580: Model Studio pipelines create incorrect results when Data Mining Preprocessing nodes are used with variable names that are long
In SAS® Visual Data Mining and Machine Learning, Model Studio pipelines might generate incorrect results. The problem occurs when both of these conditions are true:
- Some of the variables in the pipeline have long names, and those names are not unique within the first few characters.
- The pipeline contains more than one of the following Data Mining Preprocessing nodes in the same branch of the flow: Imputation, Transformations, Replacement.
Under those conditions, when the nodes create variables that are assigned the prefix _DUP, successor nodes might overwrite and drop these variables.
There are no errors or warnings to indicate a problem.
You might be able to avoid the problem by using short names for variables or by making sure that variable names are unique within the first few characters.
Examples
Non-unique variable names:
- my_really_long_variable_name_1
- my_really_long_variable_name_2
Unique variable names:
- my_1_really_long_variable_name
- my_2_really_long_variable_name
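Before building a pipeline, it can help to check variable names for this pattern. The following is a minimal Python sketch, not part of SAS software, that groups names by their leading characters and reports any that collide. The function name `find_prefix_collisions`, the default `prefix_len` of 8, and the case-insensitive comparison are assumptions made for illustration; this note does not state how many leading characters must differ.

```python
from collections import defaultdict


def find_prefix_collisions(names, prefix_len=8):
    """Group variable names that share the same first `prefix_len` characters.

    `prefix_len` is an assumed value: the note only says "the first few
    characters", so adjust it to be as strict as you need.
    """
    groups = defaultdict(list)
    for name in names:
        # Compare case-insensitively (an assumption of this sketch).
        groups[name[:prefix_len].lower()].append(name)
    # Keep only the prefixes that are shared by more than one name.
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}


non_unique = ["my_really_long_variable_name_1", "my_really_long_variable_name_2"]
unique = ["my_1_really_long_variable_name", "my_2_really_long_variable_name"]

print(find_prefix_collisions(non_unique))  # both names share the prefix 'my_reall'
print(find_prefix_collisions(unique))      # empty dict: no shared prefixes
```

An empty result means that no two names share the chosen prefix length. Names that are reported are candidates for renaming, for example by moving the distinguishing part toward the front of the name, as in the unique examples above.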
Click the Hot Fix tab in this note for a link to instructions about accessing and applying the software update.
Operating System and Release Information
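Product Family | Product | System | Product Release (Reported) | Product Release (Fixed*) | SAS Release (Reported) | SAS Release (Fixed*) |
SAS System | SAS Visual Data Mining and Machine Learning | Linux for x64 | 8.1 | 2020.1.5 | Viya | Viya |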
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.
Type: Problem Note
Priority: alert
Date Modified: 2021-05-20 09:19:23
Date Created: 2021-03-10 08:28:02