Scoring Code

Scoring code is SAS code that creates new variables or transforms existing variables. The scoring code is usually, but not necessarily, in the form of a single DATA step. SAS Enterprise Miner recognizes two types of SAS scoring code:
  • Flow Scoring Code — This scoring code is used to score data tables within a SAS Enterprise Miner process flow.
  • Publish Scoring Code — This scoring code is used to publish a SAS Enterprise Miner model to a scoring system outside of a process flow.
When a node generates scoring code dynamically, it must write the code to specific files that SAS Enterprise Miner recognizes. These files are specified by the macro variables &EM_FILE_EMFLOWSCORECODE and &EM_FILE_EMPUBLISHSCORECODE. Code that is used only within the process flow is written to the file specified by &EM_FILE_EMFLOWSCORECODE. Code that is used to score external tables is written to the file specified by &EM_FILE_EMPUBLISHSCORECODE. If the scoring code is not pure DATA step code, assign the macro variable &EM_SCORECODEFORMAT a value of OTHER; its default value is DATASTEP. If the Flow scoring code and the Publish scoring code are identical, you can generate only the Flow code, using the file designated by &EM_FILE_EMFLOWSCORECODE, and then assign the macro variable &EM_PUBLISHCODE a value of FLOW.
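The following is a minimal sketch of how a node's score action might write Flow scoring code and reuse it as Publish code. It assumes that &EM_FILE_EMFLOWSCORECODE resolves to a file path that can be used in a FILENAME statement and that the file should hold DATA step statements. The fileref name flowref and the variables income and log_income are hypothetical and serve only as illustration.

```sas
/* Sketch only: flowref, income, and log_income are hypothetical names. */
/* Assumes &EM_FILE_EMFLOWSCORECODE resolves to a writable file path.   */
filename flowref "&EM_FILE_EMFLOWSCORECODE";

data _null_;
   file flowref;
   /* Each PUT writes one line of scoring code to the Flow file. */
   put '/* Derive a log-transformed input */';
   put 'log_income = log(income + 1);';
run;

/* Release the fileref once the code has been written. */
filename flowref;

/* The code is pure DATA step code, which is also the default format. */
%let EM_SCORECODEFORMAT = DATASTEP;

/* Flow and Publish code are identical, so reuse the Flow code. */
%let EM_PUBLISHCODE = FLOW;
```

If the Publish code needed to differ from the Flow code, the same pattern would presumably be repeated with a second fileref that points to &EM_FILE_EMPUBLISHSCORECODE, and &EM_PUBLISHCODE would not be set to FLOW.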
Some SAS modeling procedures have OUTPUT statements that produce output data sets containing newly created variables, and therefore perform scoring themselves. When this method is used for scoring, the newly generated variables can be exported by the node and imported by successor nodes. However, because this method does not generate scoring code, the scoring formula cannot be exported outside of the flow. Also, some SAS Enterprise Miner nodes (for example, the Score node) collect and aggregate all of the scoring code that is generated by predecessor nodes in a process flow diagram. Such nodes cannot recognize this form of scoring because no scoring code is generated. Hence, the aggregated scoring code contains no references to the variables that are generated by an OUTPUT statement.
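As an illustration of scoring through an OUTPUT statement, the sketch below fits a logistic regression and writes the predicted probabilities directly to the node's exported training table. The target bad, the inputs income and debt, and the output variable P_bad1 are hypothetical, and the use of &EM_IMPORT_DATA and &EM_EXPORT_TRAIN as the input and output tables is an assumption about how the node is wired.

```sas
/* Sketch only: bad, income, debt, and P_bad1 are hypothetical names.  */
/* Assumes &EM_IMPORT_DATA and &EM_EXPORT_TRAIN name the node's        */
/* imported and exported training tables.                              */
proc logistic data=&EM_IMPORT_DATA;
   model bad(event='1') = income debt;
   /* The OUTPUT statement scores the data in place: P_bad1 is added   */
   /* to the exported table, but no scoring code file is written.      */
   output out=&EM_EXPORT_TRAIN p=P_bad1;
run;
```

A successor node that imports this table can use P_bad1, but because nothing is written to the Flow or Publish scoring code files, a node that aggregates scoring code has no formula for P_bad1 to collect.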