Problem Note 67580: Model Studio pipelines create incorrect results when Data Mining Preprocessing nodes are used with variable names that are long

In SAS® Visual Data Mining and Machine Learning, Model Studio pipelines might generate incorrect results. The problem occurs when these conditions are true:  

  • Some of the variables in the pipeline have long names that are not unique in the early part of the name.
  • The pipeline contains more than one of the following Data Mining Preprocessing nodes in the same branch of the flow: Imputation, Transformations, Replacement.

Under those conditions, when the nodes create variables that are assigned the prefix _DUP, these variables might be overwritten and dropped in successor nodes.

There are no errors or warnings to indicate a problem.

You might be able to avoid the problem by using short names for variables or by making sure that variable names are unique within the first few characters.

Examples

Non-unique variable names:

  • my_really_long_variable_name_1
  • my_really_long_variable_name_2

Unique variable names:

  • my_1_really_long_variable_name
  • my_2_really_long_variable_name
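
Because the problem produces no errors or warnings, it can help to screen variable names before building a pipeline. The following Python sketch groups names that share the same leading characters. It is not part of SAS software: the function name find_prefix_collisions and the default prefix length of 8 are hypothetical, since this note does not state how many leading characters must be unique. The case-insensitive comparison is also an assumption.

```python
from collections import defaultdict


def find_prefix_collisions(names, prefix_len=8):
    """Return groups of names that share the same first `prefix_len` characters.

    prefix_len=8 is an illustrative assumption; the note only says
    "the first few characters". Names are compared case-insensitively,
    which is also an assumption.
    """
    groups = defaultdict(list)
    for name in names:
        groups[name[:prefix_len].lower()].append(name)
    # Keep only prefixes that are shared by more than one variable name.
    return {prefix: members for prefix, members in groups.items() if len(members) > 1}


non_unique = ["my_really_long_variable_name_1", "my_really_long_variable_name_2"]
unique = ["my_1_really_long_variable_name", "my_2_really_long_variable_name"]

print(find_prefix_collisions(non_unique))  # both names share the prefix "my_reall"
print(find_prefix_collisions(unique))      # {} (no shared prefixes)
```

An empty result means that no two names collide at the chosen prefix length. A non-empty result lists the names to consider renaming before they are used with the Imputation, Transformations, or Replacement nodes.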

Click the Hot Fix tab in this note for a link to instructions about accessing and applying the software update.



Operating System and Release Information

Product Family: SAS System
Product: SAS Visual Data Mining and Machine Learning
System: Linux for x64
Product Release (Reported): 8.1
Product Release (Fixed*): 2020.1.5
SAS Release (Reported): Viya
SAS Release (Fixed*): Viya

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.