Glossary
            
         
         
         
            
            - analytical model
               
            
- 
               
               a statistical model that is designed to perform
                  a specific task or to predict the probability of a specific event.
                  
                
            
            - attribute
               
            
- 
               
               See variable attribute.
               
            
            
            - backtesting
               
            
- 
               
               a procedure for monitoring the quality of behavioral
                  and application scoring models. Backtesting validates the accuracy
                  of the model's predictions.
                  
                
            
            - baseline
               
            
- 
               
               the initial performance prediction against which
                  the output data from later tasks is compared.
                  
                
            
            - bin
               
            
- 
               
               a grouping of predictor variable values that is
                  used for frequency analysis.
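
As a minimal sketch of binning for frequency analysis, the Python snippet below groups values of a predictor variable into bins and counts how many values fall into each one. The variable `ages`, the boundaries in `edges`, and the helper `bin_label` are hypothetical and chosen only for illustration; they do not come from SAS Model Manager.

```python
from collections import Counter

# Hypothetical predictor values and bin boundaries (illustration only).
ages = [23, 37, 41, 19, 58, 64, 35, 29, 47, 52]
edges = [30, 45, 60]  # each edge is the exclusive upper bound of a bin


def bin_label(value, edges):
    """Return a readable label for the bin that contains an integer value."""
    lower = None
    for upper in edges:
        if value < upper:
            return f"<{upper}" if lower is None else f"{lower}-{upper - 1}"
        lower = upper
    return f">={edges[-1]}"


# Frequency analysis: count the observations that fall into each bin.
frequencies = Counter(bin_label(age, edges) for age in ages)
```

Here `frequencies` maps each bin label to the number of values in that bin, which is the kind of summary that a frequency analysis works from.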
                  
                
            
            - candidate model
               
            
- 
               
               a predictive model whose predictive power is evaluated
                  against the predictive power of the champion model.
                  
                
            
            - challenger model
               
            
- 
               
               a model that is compared and assessed against
                  a champion model for the purpose of replacing the champion model in
                  a production scoring environment.
                  
                
            
            - champion model
               
            
- 
               
               the best predictive model that is chosen from
                  a pool of candidate models in a data mining environment.
                  
                
            
            - characteristic report
               
            
- 
               
               a report that detects and quantifies shifts in
                  the distribution of input variables over time in data that is used
                  to create predictive models.
                  
                
            
            - classification model
               
            
- 
               
               a predictive model that has a categorical, ordinal,
                  or binary target.
                  
                
            
            - clustering model
               
            
- 
               
               a model in which data sets are divided into mutually
                  exclusive groups in such a way that the observations for each group
                  are as close as possible to one another, and different groups are
                  as far as possible from one another.
                  
                
            
            - component file
               
            
- 
               
               a file that defines a predictive model. Component
                  files can be SAS programs or data sets, XML files, log files, SPK
                  files, or CSV files.
                  
                
            
            - data model training
               
            
- 
               
               the process of building a predictive model from
                  data.
                  
                
            
            - data object
               
            
- 
               
               an object that holds the business data that is
                  required to execute workflow tasks.
                  
                
            
            - data set
               
            
- 
               
               See SAS data set.
               
            
            
            - data source (source)
               
            
- 
               
               a table, view, or file from which you will extract
                  information. Sources can be in any format that SAS can access, on
                  any supported hardware platform. The metadata for a source is typically
                  an input to a job.
                  
                
            
            - DATA step
               
            
- 
               
               in a SAS program, a group of statements that begins
                  with a DATA statement and that ends with either a RUN statement, another
                  DATA statement, a PROC statement, or the end of the job. The DATA
                  step enables you to read raw data or other SAS data sets and to create
                  SAS data sets.
                  
                
            
            - DATA step fragment
               
            
- 
               
               a block of SAS code that does not begin with a
                  DATA statement. In SAS Model Manager, all SAS Enterprise Miner models
                  use DATA step fragments in their score code.
                  
                
            
            - delta report
               
            
- 
               
               a report that compares the input and output variable
                  attributes for each of the variables that are used to score two candidate
                  models.
                  
                
            
            - dynamic lift report
               
            
- 
               
               a graphical report that plots the sequential lift
                  performance of one or more models over time, against test data.
                  
                
            
            - file reference
               
            
- 
               
               See fileref.
               
            
            
            - fileref (file reference)
               
            
- 
               
               a name that is temporarily assigned to an external
                  file or to an aggregate storage location such as a directory or a
                  folder. The fileref identifies the file or the storage location to
                  SAS.
                  
                
            
            - format
               
            
- 
               
               See SAS format.
               
            
            
            - Gini coefficient
               
            
- 
               
               a benchmark statistic that is a measure of the
                  inequality of distribution, and that can be used to summarize the
                  predictive accuracy of a model.
                  
                
            
            - holdout data
               
            
- 
               
               a portion of the historical data that is set aside
                  during model development. Holdout data can be used as test data to
                  benchmark the fit and accuracy of the emerging predictive model.
                  
                
            
            - identity
               
            
- 
               
               See metadata identity.
               
            
            
            - index
               
            
- 
               
               a component of a SAS data set that enables SAS
                  to access observations in the SAS data set quickly and efficiently.
                  The purpose of SAS indexes is to optimize WHERE-clause processing
                  and to facilitate BY-group processing.
                  
                
            
            - informat
               
            
- 
               
               See SAS informat.
               
            
            
            - inner join
               
            
- 
               
               a join between two tables that returns all of
                  the rows in one table that have one or more matching rows in the other
                  table.
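
The behavior described above can be sketched in plain Python. The tables `customers` and `scores`, the key `cust_id`, and the helper `inner_join` are hypothetical names used only for illustration; this is not how SAS performs joins internally.

```python
# Hypothetical tables, each represented as a list of rows (dicts).
customers = [
    {"cust_id": 1, "name": "Ada"},
    {"cust_id": 2, "name": "Ben"},
    {"cust_id": 3, "name": "Cy"},
]
scores = [
    {"cust_id": 1, "score": 0.82},
    {"cust_id": 1, "score": 0.64},
    {"cust_id": 3, "score": 0.31},
]


def inner_join(left, right, key):
    """Return one merged row for every (left, right) pair that matches on key."""
    by_key = {}
    for row in right:
        by_key.setdefault(row[key], []).append(row)
    return [
        {**left_row, **right_row}
        for left_row in left
        for right_row in by_key.get(left_row[key], [])
    ]


joined = inner_join(customers, scores, "cust_id")
```

Customer 2 has no matching row in `scores`, so it does not appear in `joined`; customer 1 matches two rows and therefore appears twice.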
                  
                
            
            - input variable
               
            
- 
               
               a variable that is used in a data mining process
                  to predict the value of one or more target variables.
                  
                
            
            - Kolmogorov-Smirnov chart
               
            
- 
               
               a chart that shows the measurement of the maximum
                  vertical separation, or deviation, between the cumulative distributions
                  of events and non-events.
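
The quantity that this chart displays can be illustrated with a short Python sketch. It assumes that each observation has a model score and a binary label (1 for an event, 0 for a non-event); the function name `ks_statistic` and the sample data are hypothetical, and the sketch is not the SAS Model Manager implementation.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of
    events (label 1) and non-events (label 0)."""
    n_event = sum(labels)
    n_non_event = len(labels) - n_event
    pairs = sorted(zip(scores, labels))
    cum_event = cum_non_event = 0
    largest_gap = 0.0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume all observations that share this score before comparing.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                cum_event += 1
            else:
                cum_non_event += 1
            i += 1
        gap = abs(cum_event / n_event - cum_non_event / n_non_event)
        largest_gap = max(largest_gap, gap)
    return largest_gap


# Hypothetical scores and outcomes.
sample_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
sample_labels = [0, 0, 0, 1, 0, 1, 1, 1]
ks = ks_statistic(sample_scores, sample_labels)
```

A value near 1 means that the model's scores separate events from non-events almost completely, and a value near 0 means that the two distributions are nearly identical.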
                  
                
            
            - library reference
               
            
- 
               
               See libref.
               
            
            
            - libref (library reference)
               
            
- 
               
               a SAS name that is associated with the location
                  of a SAS library. For example, in the name MYLIB.MYFILE, MYLIB is
                  the libref, and MYFILE is a file in the SAS library.
                  
                
            
            - life cycle phase
               
            
- 
               
               a collection of milestones that complete a major
                  step in the process of selecting and monitoring a champion model.
                  Typical life cycle phases include development, test, production, and
                  retire.
                  
                
            
            - logistic regression
               
            
- 
               
               a form of regression analysis in which the target
                  variable (response variable) represents a binary-level, categorical,
                  or ordinal-level response.
                  
                
            
            - macro variable (symbolic variable)
               
            
- 
               
               a variable that is part of the SAS macro programming
                  language. The value of a macro variable is a string that remains constant
                  until you change it.
                  
                
            
            - metadata
               
            
- 
               
               descriptive data about data that is stored and
                  managed in a database, in order to facilitate access to captured and
                  archived data for further use.
                  
                
            
            - metadata identity (identity)
               
            
- 
               
               a metadata object that represents an individual
                  user or a group of users in a SAS metadata environment. Each individual
                  and group that accesses secured resources on a SAS Metadata Server
                  should have a unique metadata identity within that server.
                  
                
            
            - milestone
               
            
- 
               
               a collection of tasks that complete a significant
                  event. The significant event can occur either in the process of selecting
                  a champion model, or in the process of monitoring a champion model
                  that is in a production environment.
                  
                
            
            - model assessment
               
            
- 
               
               the process of determining how well a model predicts
                  an outcome.
                  
                
            
            - model function
               
            
- 
               
               the type of statistical model, such as classification,
                  prediction, or segmentation.
                  
                
            
            - model input variable report
               
            
- 
               
               a report that indicates the frequency with which input
                  variables are used in the models for an organizational folder, a project,
                  or a version.
                  
                
            
            - model profile report
               
            
- 
               
               a report that shows the profile data that is associated
                  with the model input variables, output variables, and target variables.
                  
                
            
            - model scoring (scoring)
               
            
- 
               
               the process of applying a model to new data in
                  order to compute outputs.
                  
                
            
            - model target variable report
               
            
- 
               
               a report that indicates the frequency with which
                  target variables are used in the models that exist in the selected
                  folder.
                  
                
            
            - monitoring report
               
            
- 
               
               a report that consists of assessment charts, a
                  ROC chart, a Gini Trend chart, a KS (Kolmogorov-Smirnov) chart, and
                  a KS trend chart that can be used to compare the model performance
                  curves of several candidate models.
                  
                
            
            - neural network
               
            
- 
               
               any of a class of models that usually consist
                  of a large number of neurons, interconnected in complex ways and organized
                  into layers. Examples are flexible nonlinear regression models, discriminant
                  models, data reduction models, and nonlinear dynamic systems.
                  
                
            
            - observation
               
            
- 
               
               a row in a SAS data set. All of the data values
                  in an observation are associated with a single entity such as a customer
                  or a state. Each observation contains either one data value or a missing-value
                  indicator for each variable.
                  
                
            
            - package
               
            
- 
               
               See package file.
               
            
            
            - package file (package)
               
            
- 
               
               a container for data that has been generated or
                  collected for delivery to consumers by the SAS Publishing Framework.
                  Packages can contain SAS files, binary files, HTML files, URLs, text
                  files, viewer files, and metadata.
                  
                
            
            - participant
               
            
- 
               
               a user, group, or role that is assigned to a task.
                  These users, groups, and roles are defined in SAS metadata and are
                  mapped to standard roles for the workflow.
                  
                
            
            - performance table
               
            
- 
               
               a table that contains response data that is collected
                  over a period of time. Performance tables are used to monitor the
                  performance of a champion model that is in production.
                  
                
            
            - PFD
               
            
- 
               
               See process flow diagram.
               
            
            
            - PMML
               
            
- 
               
               See Predictive Modeling Markup Language.
               
            
            
            - policy
               
            
- 
               
               a workflow element that associates event-driven
                  logic with a task or subflow. Policies are usually triggered automatically
                  by an event such as a status change or a timer event.
                  
                
            
            - prediction model
               
            
- 
               
               a model that predicts the outcome of an interval
                  target.
                  
                
            
            - Predictive Modeling Markup Language (PMML)
               
            
- 
               
               an XML-based standard for representing data mining
                  results for scoring purposes. It enables the sharing and deployment
                  of data mining results between applications and across data management
                  systems.
                  
                
            
            - process flow diagram (PFD)
               
            
- 
               
               a graphical sequence of interconnected symbols
                  that represent an ordered set of steps or tasks that, when combined,
                  form a workflow designed to yield an analytical result.
                  
                
            
            - production models aging report
               
            
- 
               
               a report that shows the number and the aging distribution
                  of champion models.
                  
                
            
            - profile data
               
            
- 
               
               information that consists of the model name, type,
                  length, label, format, level, and role.
                  
                
            
            - project
               
            
- 
               
               a collection of models, SAS programs, data tables,
                  scoring tests, performance data, and reporting documents.
                  
                
            
            - project tree
               
            
- 
               
               a hierarchical structure made up of folders and
                  nodes that are related to a single folder or node one level above
                  it and to zero, one, or more folders or nodes one level below it.
                  
                
            
            - property
               
            
- 
               
               any of the characteristics of a component that
                  collectively determine the component's appearance and behavior.
                  Examples of types of properties are attributes and methods.
                  
                
            
            - publication channel (SAS publication channel)
               
            
- 
               
               an information repository that has been established
                  using the SAS Publishing Framework and that can be used to publish
                  information to users and applications.
                  
                
            
            - publish
               
            
- 
               
               to deliver electronic information to one or more
                  destinations. These destinations can include message queues, publication
                  channels, and so on.
                  
                
            
            - Publishing Framework
               
            
- 
               
               a component of SAS Integration Technologies that
                  enables both users and applications to publish SAS files (including
                  data sets, catalogs, and database views), and other digital content
                  to a variety of destinations. The Publishing Framework also provides
                  tools that enable both users and applications to receive and process
                  published information.
                  
                
            
            - Receiver Operating Characteristic chart (ROC)
               
            
- 
               
               a chart used in signal detection theory to plot
                  the sensitivity, or true positive rate, against the false positive
                  rate (1 − specificity, or 1 − true negative rate) of
                  binary data values. An ROC chart is used to assess a model's
                  predictive performance.
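
To make the two rates concrete, the following Python sketch computes the points of an ROC curve and the area under it (AUC) with the trapezoidal rule. The function names and the sample data are hypothetical. The last line applies the relation Gini = 2 × AUC − 1, which is a common convention for binary models and is shown here as an assumption, not as the formula that SAS Model Manager necessarily uses.

```python
def roc_points(scores, labels):
    """Return (false positive rate, true positive rate) pairs,
    one per distinct score threshold, starting at (0, 0)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pairs = sorted(zip(scores, labels), reverse=True)  # highest score first
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Treat tied scores as one threshold.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores and binary outcomes (1 = event, 0 = non-event).
sample_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
sample_labels = [0, 0, 0, 1, 0, 1, 1, 1]

points = roc_points(sample_scores, sample_labels)
auc = area_under_curve(points)
gini = 2 * auc - 1  # assumed convention, see the note above
```

An AUC of 0.5 corresponds to a model that ranks events no better than chance, and an AUC of 1 corresponds to perfect ranking.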
                  
                
            
            - ROC
               
            
- 
               
               See Receiver Operating Characteristic chart.
               
            
            
            - SAS code model
               
            
- 
               
               a SAS program or a DATA step fragment that computes
                  output values from input values. An example of a SAS code model is
                  the LOGISTIC procedure.
                  
                
            
            - SAS Content Server
               
            
- 
               
               a server that stores digital content (such as
                  documents, reports, and images) that is created and used by SAS client
                  applications. To interact with the server, clients use WebDAV-based
                  protocols for access, versioning, collaboration, security, and searching.
                  
                
            
            - SAS data set (data set)
               
            
- 
               
               a file whose contents are in one of the native
                  SAS file formats. There are two types of SAS data sets: SAS data files
                  and SAS data views.
                  
                
            
            - SAS format (format)
               
            
- 
               
               a type of SAS language element that is used to
                  write or display data values according to the data type: numeric,
                  character, date, time, or timestamp.
                  
                
            
            - SAS informat (informat)
               
            
- 
               
               a type of SAS language element that is used to
                  read data values according to the data's type: numeric, character,
                  date, time, or timestamp.
                  
                
            
            - SAS Metadata Repository
               
            
- 
               
               a container for metadata that is managed by the
                  SAS Metadata Server.
                  
                
            
            - SAS Metadata Server
               
            
- 
               
               a multi-user server that enables users to read
                  metadata from or write metadata to one or more SAS Metadata Repositories.
                  
                
            
            - SAS Model Manager repository
               
            
- 
               
               a location in the SAS Content Server where SAS
                  Model Manager data is stored, organized, and maintained.
                  
                
            
            - SAS publication channel
               
            
- 
               
               See publication channel.
               
            
            
            - SAS variable (variable)
               
            
- 
               
               a column in a SAS data set or in a SAS data view.
                  The data values for each variable describe a single characteristic
                  for all observations (rows).
                  
                
            
            - scoring
               
            
- 
               
               See model scoring.
               
            
            
            - scoring function
               
            
- 
               
               a user-defined function that is created by the
                  SAS Scoring Accelerator from a scoring model and that is deployed
                  inside the database.
                  
                
            
            - scoring input table
               
            
- 
               
               a table that contains the variables and data that
                  are used as input in a scoring test.
                  
                
            
            - scoring output table
               
            
- 
               
               a table that contains the output variables and
                  data that result from performing a scoring test. Before executing
                  a scoring test, the scoring output table defines the variables to
                  keep as the scoring results.
                  
                
            
            - scoring test
               
            
- 
               
               a workflow that executes a model's score
                  code.
                  
                
            
            - segmentation model
               
            
- 
               
               a model that identifies and forms segments, or
                  clusters, of individual observations that are associated with an attribute
                  of interest.
                  
                
            
            - source
               
            
- 
               
               See data source.
               
            
            
            - stability report
               
            
- 
               
               a graphical report that detects and quantifies
                  shifts in the distribution of output variables over time in data that
                  is produced by a model.
                  
                
            
            - subscriber
               
            
- 
               
               a recipient of information that is published to
                  a SAS publication channel.
                  
                
            
            - swimlane
               
            
- 
               
               a workflow diagram element that enables you to
                  group tasks that are assigned to the same participant.
                  
                
            
            - symbolic variable
               
            
- 
               
               See macro variable.
               
            
            
            - target event value
               
            
- 
               
               for binary models, the value of a target variable
                  that a model attempts to predict. In SAS Model Manager, the target
                  event value is a property of a model.
                  
                
            
            - target variable
               
            
- 
               
               a variable whose values are known in one or more
                  data sets that are available (in training data, for example) but whose
                  values are unknown in one or more future data sets (in a score data
                  set, for example). Data mining models use data from known variables
                  to predict the values of target variables.
                  
                
            
            - task
               
            
- 
               
               See workflow task.
               
            
            
            - task status
               
            
- 
               
               the outcome of a task in a workflow. The status
                  of a task (for example, Started, Canceled, Approved) is typically
                  used to trigger the next task.
                  
                
            
            - test table
               
            
- 
               
               a SAS data set that is used as input to a model
                  in order to test the accuracy of the model's output.
                  
                
            
            - training data
               
            
- 
               
               data that contains input values and target values
                  that are used to train and build predictive models.
                  
                
            
            - universally unique identifier (UUID)
               
            
- 
               
               a number that is used to uniquely identify information
                  in distributed systems without significant central coordination. There
                  are 32 hexadecimal characters in a UUID, and these are divided into
                  five groups with hyphens between them as follows: 8-4-4-4-12. Altogether
                  the 16-byte (128-bit) canonical UUID has 36 characters (32 alphanumeric
                  characters and 4 hyphens). For example: 123e4567-e89b-12d3-a456-426655440000
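
The layout described above can be checked with Python's standard `uuid` module. This is only an illustrative sketch; the variable names are arbitrary, and the sample value is the one given in this entry.

```python
import uuid

# Sample value taken from the glossary entry above.
text = "123e4567-e89b-12d3-a456-426655440000"

parsed = uuid.UUID(text)                           # raises ValueError if malformed
group_lengths = [len(g) for g in text.split("-")]  # the 8-4-4-4-12 layout
hex_digits = parsed.hex                            # 32 hexadecimal characters, no hyphens
byte_count = len(parsed.bytes)                     # 16 bytes, that is, 128 bits

# A new random (version 4) UUID in the same canonical form.
fresh = str(uuid.uuid4())
```

Converting `parsed` back with `str()` returns the canonical 36-character form, so the round trip is a simple way to validate or normalize a UUID string.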
                  
                
            
            - user-defined report
               
            
- 
               
               a customized report. The customized report is
                  a SAS program and its auxiliary files and is stored on the workspace
                  server that is used by SAS Model Manager. User-defined reports are
                  accessible from the New Reports wizard.
                  
                
            
            - UUID
               
            
- 
               
               See universally unique identifier.
               
            
            
            - variable
               
            
- 
               
               See SAS variable.
               
            
            
            - variable attribute (attribute)
               
            
- 
               
               any of the following characteristics that are
                  associated with a particular variable: name, label, format, informat,
                  data type, and length.
                  
                
            
            - WebDAV server
               
            
- 
               
               an HTTP server that supports the collaborative
                  authoring of documents that are located on the server. The server
                  supports the locking of documents, so that multiple authors cannot
                  make changes to a document at the same time. It also associates metadata
                  with documents in order to facilitate searching. The SAS business
                  intelligence applications use this type of server primarily as a report
                  repository. Common WebDAV servers include the Apache HTTP Server (with
                  its WebDAV modules enabled), Xythos Software's WebFile Server,
                  and Microsoft Corporation's Internet Information Server (IIS).
                  
                
            
            - workflow
               
            
- 
               
               a series of tasks, together with the participants
                  and the logic that is required to execute the tasks. A workflow includes
                  policies, status values, and data objects.
                  
                
            
            - workflow definition
               
            
- 
               
               a workflow template that has been uploaded to
                  the server and activated. Workflow definitions are used by the SAS
                  Workflow Engine to create new workflow instances.
                  
                
            
            - workflow instance
               
            
- 
               
               a workflow that is running in the SAS Workflow
                  Engine. After a workflow template is uploaded to the server and activated,
                  client applications can use the template to create and run a new copy
                  of the workflow definition. Each new copy is a workflow instance.
                  
                
            
            - workflow task (task)
               
            
- 
               
               a workflow element that associates executable
                  logic with an event such as a status change or timer event.
                  
                
            
            - workflow template
               
            
- 
               
               a model of a workflow that has been saved to an
                  XML file.
                  
                
 
      
      
      
      
         
         Copyright © SAS Institute Inc. All rights reserved.