Glossary

activity
See task
activity status
See task status
analytical model
a statistical model that is designed to perform a specific task or to predict the probability of a specific event.
attribute
See variable attribute
baseline
the initial performance prediction against which the output data from later tasks is compared.
bin
a grouping of predictor variable values that is used for frequency analysis.
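As a minimal sketch of binning for frequency analysis, the following Python example groups the values of a predictor variable into equal-width bins and counts the values in each bin. The function name `bin_frequencies`, the equal-width scheme, and the sample values are assumptions made for illustration; they are not part of SAS Model Manager.

```python
from collections import Counter

def bin_frequencies(values, lower, upper, n_bins):
    """Count the values that fall into each of n_bins equal-width bins on [lower, upper]."""
    width = (upper - lower) / n_bins
    counts = Counter()
    for v in values:
        if v < lower or v > upper:
            continue  # values outside the binning range are ignored
        # A value equal to the upper edge is placed in the last bin.
        index = min(int((v - lower) / width), n_bins - 1)
        counts[index] += 1
    return [counts[i] for i in range(n_bins)]

# Hypothetical values of a predictor variable (for example, customer age).
ages = [18, 22, 25, 31, 38, 40, 47, 52, 60]
print(bin_frequencies(ages, 0, 60, 3))  # frequencies for [0, 20), [20, 40), [40, 60]
```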
candidate model
a predictive model whose predictive power is evaluated against the champion model's predictive power.
challenger model
a model that is compared and assessed against a champion model for the purpose of replacing the champion model in a production scoring environment.
champion model
the best predictive model that is chosen from a pool of candidate models in a data mining environment.
characteristic report
a report that detects and quantifies shifts in the distribution of input variables over time in data that is used to create predictive models.
classification model
a predictive model that has a categorical, ordinal, or binary target.
clustering model
a model in which data sets are divided into mutually exclusive groups in such a way that the observations for each group are as close as possible to one another, and different groups are as far as possible from one another.
component file
a file that defines a predictive model. Component files can be SAS programs or data sets, XML files, log files, SPK files, or CSV files.
data model training
the process of building a predictive model from data.
data object
an object that holds the business data that is required to execute workflow tasks.
data set
See SAS data set
data source
a table, view, or file from which you will extract information. Sources can be in any format that SAS can access, on any supported hardware platform. The metadata for a source is typically an input to a job.
DATA step
in a SAS program, a group of statements that begins with a DATA statement and that ends with either a RUN statement, another DATA statement, a PROC statement, or the end of the job. The DATA step enables you to read raw data or other SAS data sets and to create SAS data sets.
DATA step fragment
a block of SAS code that does not begin with a DATA statement. In SAS Model Manager, all SAS Enterprise Miner models use DATA step fragments in their score code.
delta report
a report that compares the input and output variable attributes for each of the variables that are used to score two candidate models.
dynamic lift report
a graphical report that plots the sequential lift performance of one or more models over time, against test data.
file reference
See fileref
fileref
a name that is temporarily assigned to an external file or to an aggregate storage location such as a directory or a folder. The fileref identifies the file or the storage location to SAS.
format
See SAS format
Gini coefficient
a benchmark statistic that is a measure of the inequality of distribution, and that can be used to summarize the predictive accuracy of a model.
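As an illustration, one common convention for binary models (an assumption here, because the definition above does not give a formula) derives the Gini coefficient from the area under the ROC curve as Gini = 2 × AUC − 1. The following Python sketch computes it by comparing every event score with every non-event score. The function names and the sample data are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the share of (event, non-event)
    pairs in which the event has the higher score; ties count as one half."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("both events and non-events are required")
    wins = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                wins += 1.0
            elif e == n:
                wins += 0.5
    return wins / (len(events) * len(non_events))

def gini(scores, labels):
    """Gini coefficient under the assumed convention Gini = 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0

# Hypothetical model scores and observed outcomes (1 = event, 0 = non-event).
print(gini([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```

Under this convention, a value of 1 indicates that the model ranks every event above every non-event, and a value near 0 indicates no better ranking than chance.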
holdout data
a portion of the historical data that is set aside during model development. Holdout data can be used as test data to benchmark the fit and accuracy of the emerging predictive model.
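The following Python sketch shows one simple way to set aside holdout data: shuffle the historical rows and reserve a fixed fraction of them. The function name `split_holdout`, the 30% fraction, and the seed are assumptions made for illustration, not a description of how SAS Model Manager partitions data.

```python
import random

def split_holdout(rows, holdout_fraction=0.3, seed=0):
    """Return (training_rows, holdout_rows) after a reproducible shuffle."""
    rng = random.Random(seed)   # a fixed seed makes the split repeatable
    shuffled = list(rows)       # copy so that the caller's data is unchanged
    rng.shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

# Hypothetical historical data: ten row identifiers.
training, holdout = split_holdout(range(10), holdout_fraction=0.3, seed=1)
print(len(training), len(holdout))
```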
informat
See SAS informat
input variable
a variable that is used in a data mining process to predict the value of one or more target variables.
instance
See workflow instance
Kolmogorov-Smirnov chart
a chart that shows the measurement of the maximum vertical separation, or deviation, between the cumulative distributions of events and non-events.
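The quantity that such a chart displays can be sketched in Python as follows. The example builds the empirical cumulative distributions of model scores for events and non-events and returns the largest vertical gap between them. The function name `ks_statistic`, the 0/1 labels, and the sample data are assumptions made for illustration.

```python
def ks_statistic(scores, labels):
    """Maximum vertical separation between the cumulative score
    distributions of events (label 1) and non-events (label 0)."""
    events = [s for s, y in zip(scores, labels) if y == 1]
    non_events = [s for s, y in zip(scores, labels) if y == 0]
    if not events or not non_events:
        raise ValueError("both events and non-events are required")
    largest_gap = 0.0
    for threshold in sorted(set(scores)):
        # Fraction of each group with a score at or below the threshold.
        cdf_events = sum(1 for s in events if s <= threshold) / len(events)
        cdf_non_events = sum(1 for s in non_events if s <= threshold) / len(non_events)
        largest_gap = max(largest_gap, abs(cdf_events - cdf_non_events))
    return largest_gap

# Hypothetical model scores and observed outcomes (1 = event, 0 = non-event).
print(ks_statistic([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]))
```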
library reference
See libref
libref
a SAS name that is associated with the location of a SAS library. For example, in the name MYLIB.MYFILE, MYLIB is the libref, and MYFILE is a file in the SAS library.
life cycle phase
a collection of milestones that complete a major step in the process of selecting and monitoring a champion model. Typical life cycle phases include development, test, production, and retire.
logistic regression
a form of regression analysis in which the target variable (response variable) represents a binary-level, categorical, or ordinal-level response.
macro variable
a variable that is part of the SAS macro programming language. The value of a macro variable is a string that remains constant until you change it. Macro variables are sometimes referred to as symbolic variables.
metadata
descriptive data about data that is stored and managed in a database, in order to facilitate access to captured and archived data for further use.
milestone
a collection of tasks that complete a significant event. The significant event can occur either in the process of selecting a champion model, or in the process of monitoring a champion model that is in a production environment.
model assessment
the process of determining how well a model predicts an outcome.
model function
the type of statistical model, such as classification, prediction, or segmentation.
model input variable report
a report that shows how frequently input variables are used in the models for an organizational folder, a project, or a version.
model profile report
a report that shows the profile data that is associated with the model input variables, output variables, and target variables.
model scoring
the process of applying a model to new data in order to compute outputs.
model target variable report
a report that indicates the frequency with which target variables are used in the models that exist in the selected folder.
neural network
any of a class of models that usually consist of a large number of neurons, interconnected in complex ways and organized into layers. Examples are flexible nonlinear regression models, discriminant models, data reduction models, and nonlinear dynamic systems.
observation
a row in a SAS data set. All of the data values in an observation are associated with a single entity such as a customer or a state. Each observation contains either one data value or a missing-value indicator for each variable.
organizational folder
a folder in the SAS Model Manager Project Tree that is used to organize project and document resources. An organizational folder can contain zero or more organizational folders in addition to other objects.
output variable
in a data mining process, a variable that is computed from the input variables as a prediction of the value of a target variable.
package
See SAS package file
package file
See SAS package file
participant
a user, group, or role that is assigned to a task. These users, groups, and roles are defined in SAS metadata and are mapped to standard roles for the workflow.
performance table
a table that contains response data that is collected over a period of time. Performance tables are used to monitor the performance of a champion model that is in production.
PFD
See process flow diagram
PMML
See Predictive Modeling Markup Language
prediction model
a model that predicts the outcome of an interval target.
Predictive Modeling Markup Language
an XML-based standard for representing data mining results for scoring purposes. It enables the sharing and deployment of data mining results between applications and across data management systems. Short form: PMML.
process flow diagram
a graphical sequence of interconnected symbols that represent an ordered set of steps or tasks that, when combined, form a workflow designed to yield an analytical result.
production models aging report
a report that shows the number and the aging distribution of champion models.
profile data
information that consists of the model name, type, length, label, format, level, and role.
project
a collection of models, SAS programs, data tables, scoring tasks, life cycle data, and reporting documents.
project tree
a hierarchical structure made up of folders and nodes that are related to a single folder or node one level above it and to zero, one, or more folders or nodes one level below it.
property
any of the characteristics of a component that collectively determine the component's appearance and behavior. Examples of types of properties are attributes and methods.
publication channel
an information repository that has been established using the SAS Publishing Framework and that can be used to publish information to users and applications.
Receiver Operating Characteristic chart
a chart used in signal detection theory to plot the sensitivity, or true positive rate, against the false positive rate (1 − specificity, or 1 − true negative rate) of binary data values. An ROC chart is used to assess a model's predictive performance. Short form: ROC.
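As a minimal sketch of how the points on such a chart can be obtained, the following Python example sweeps a classification threshold over the model scores and records the false positive rate and the true positive rate at each step. The function name `roc_points`, the 0/1 labels, and the sample data are assumptions made for illustration.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs, one per
    distinct score used as a threshold, starting from the point (0, 0)."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    if positives == 0 or negatives == 0:
        raise ValueError("both positive and negative cases are required")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        # A case is classified as positive when its score reaches the threshold.
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((false_pos / negatives, true_pos / positives))
    return points

# Hypothetical model scores and observed outcomes (1 = positive, 0 = negative).
for fpr, tpr in roc_points([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0]):
    print(fpr, tpr)
```

Plotting the true positive rate on the vertical axis against the false positive rate on the horizontal axis yields the chart. A curve that rises quickly toward the upper left corner indicates stronger predictive performance.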
ROC
See Receiver Operating Characteristic chart
SAS code model
a SAS program or a DATA step fragment that computes output values from input values. An example of a SAS code model is the LOGISTIC procedure.
SAS data set
a file whose contents are in one of the native SAS file formats. There are two types of SAS data sets: SAS data files and SAS data views.
SAS format
a type of SAS language element that is used to write or display data values according to the data type: numeric, character, date, time, or timestamp. Short form: format.
SAS informat
a type of SAS language element that is used to read data values according to the data's type: numeric, character, date, time, or timestamp. Short form: informat.
SAS Metadata Repository
a container for metadata that is managed by the SAS Metadata Server.
SAS Model Manager repository
a location in the SAS Content Server where SAS Model Manager data is stored, organized, and maintained.
SAS package file
a container for data that has been generated or collected for delivery to consumers by the SAS Publishing Framework. Packages can contain SAS files, binary files, HTML files, URLs, text files, viewer files, and metadata.
SAS publication channel
See publication channel
SAS variable
a column in a SAS data set or in a SAS data view. The data values for each variable describe a single characteristic for all observations (rows).
scoring
See model scoring
scoring function
a user-defined function that is created by the SAS Scoring Accelerator from a scoring model and that is deployed inside the database.
scoring task
a workflow that executes a model's score code.
scoring task input table
a table that contains the variables and data that are used as input in a SAS Model Manager scoring task.
scoring task output table
a table that contains the output variables and data that result from performing a SAS Model Manager scoring task. Before a scoring task is executed, the scoring task output table defines the variables to keep as the scoring results.
segmentation model
a model that identifies and forms segments, or clusters, of individual observations that are associated with an attribute of interest.
source
See data source
SPK
See SAS package file
stability report
a graphical report that detects and quantifies shifts in the distribution of output variables over time in data that is produced by a model.
swimlane
a workflow diagram element that enables you to group tasks that are assigned to the same participant.
target event value
for binary models, the value of a target variable that a model attempts to predict. In SAS Model Manager, the target event value is a property of a model.
target variable
a variable whose values are known in one or more data sets that are available (in training data, for example) but whose values are unknown in one or more future data sets (in a score data set, for example). Data mining models use data from known variables to predict the values of target variables.
task
a workflow element that associates executable logic with an event such as a status change or timer event.
task status
the outcome of a task in a workflow. The status of a task (for example, Started, Canceled, Accepted) is typically used to trigger the next task.
test table
a SAS data set that is used as input to a model in order to test the accuracy of the model's output.
training data
data that contains input values and target values that are used to train and build predictive models.
universal unique identifier
a number that is used to uniquely identify information in distributed systems without significant central coordination. There are 32 hexadecimal digits in a UUID, and these are divided into five groups with hyphens between them as follows: 8-4-4-4-12. Altogether the 16-byte (128-bit) canonical UUID has 32 digits and 4 hyphens, or 36 characters.
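The canonical layout can be checked with a short Python sketch that uses the standard `uuid` module. The helper name `describe_uuid` and the choice of a random (version 4) UUID are assumptions made for illustration; the definition above does not say how SAS Model Manager generates its identifiers.

```python
import re
import uuid

# Canonical text form: five hyphen-separated groups of 8-4-4-4-12 hexadecimal digits.
CANONICAL = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$"
)

def describe_uuid(value):
    """Return basic size facts about a uuid.UUID object."""
    text = str(value)
    return {
        "text": text,
        "characters": len(text),                          # digits plus hyphens
        "hyphens": text.count("-"),
        "group_lengths": [len(g) for g in text.split("-")],
        "bytes": len(value.bytes),                        # raw 128-bit value
        "is_canonical": bool(CANONICAL.match(text)),
    }

info = describe_uuid(uuid.uuid4())  # a randomly generated UUID
print(info)
```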
user-defined report
a customized report. The customized report is a SAS program and its auxiliary files and is stored on the workspace server that is used by SAS Model Manager. You access a user-defined report by using the New Reports wizard.
UUID
See universal unique identifier
variable
See SAS variable
variable attribute
any of the following characteristics that are associated with a particular variable: name, label, format, informat, data type, and length.
version folder
a folder in the Project Tree that typically represents a time phase and that contains models, scoring tasks, life cycle data, reports, documents, resources, and model performance output.
view
a particular representation of a model’s data.
workflow
a series of tasks, together with the participants and the logic that is required to execute the tasks. A workflow includes policies, status values, and data objects.
workflow definition
a workflow template that has been uploaded to the server and activated. Workflow definitions are used by the SAS Workflow Engine to create new workflow instances.
workflow instance
a workflow that is running in the SAS Workflow Engine. After a workflow template is uploaded to the server and activated, client applications can use the template to create and run a new copy of the workflow definition. Each new copy is a workflow instance.
workflow template
a model of a workflow that has been saved to an XML file.