Glossary
- administrator
-
the person who is responsible for maintaining
the technical attributes of an object such as a table or a library.
For example, an administrator might specify where a table is stored
and who can access the table. See also owner.
- alternate key
-
another term for unique key. See also unique key.
- analysis data set
-
in SAS data quality, a SAS output data set that
provides information on the degree of divergence in specified character
values.
- business key
-
one or more columns in a dimension table that
comprise the primary key in a source table in an operational system.
- CDC
-
See change data capture.
- change analysis
-
the process of comparing one set of metadata to
another set of metadata and identifying the differences between the
two sets of metadata. For example, in SAS Data Integration Studio,
you have the option of performing change analysis on imported metadata.
Imported metadata is compared to existing metadata. You can view any
changes in the Differences window and choose which changes to apply.
To help you understand the impact of a given change, you can run impact
analysis or reverse impact analysis on tables and columns in the Differences
window.
- change data capture
-
the process of capturing changes that are made
to data, and making these changes available in a machine-readable
format. By capturing only the changes in the data, CDC reduces the
volume of information that is required for data integration. Short
form: CDC.
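
As a rough illustration of capturing only the changes, the sketch below compares two snapshots of a table and emits insert, update, and delete records. The function name `capture_changes`, the record layout, and the sample rows are assumptions made for this example. Snapshot comparison is used here only because it is easy to show; it is not presented as how any particular CDC product works.

```python
def capture_changes(before, after):
    """Return change records between two snapshots.

    Each snapshot is a dict that maps a primary key to a row (a dict).
    Comparing snapshots is an assumption of this sketch.
    """
    changes = []
    for key, row in after.items():
        if key not in before:
            changes.append({"op": "insert", "key": key, "row": row})
        elif before[key] != row:
            changes.append({"op": "update", "key": key, "row": row})
    for key in before:
        if key not in after:
            changes.append({"op": "delete", "key": key})
    return changes


# Hypothetical sample data.
before = {1: {"name": "Ann", "city": "Cary"}, 2: {"name": "Bob", "city": "Austin"}}
after = {1: {"name": "Ann", "city": "Raleigh"}, 3: {"name": "Cy", "city": "Boston"}}
changes = capture_changes(before, after)
```

Only the three change records, rather than both full snapshots, would need to be passed downstream.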
- change management
-
in the SAS Open Metadata Architecture, a facility
for metadata source control, metadata promotion, and metadata replication.
- channel
-
a virtual communication path for distributing
information. In SAS, a channel is identified with a particular topic
(just as a television channel is identified with a particular radio
frequency). Using the features of the Publishing Framework, authorized
users or applications can publish digital content to the channel,
and authorized users and applications can subscribe to the channel
in order to receive the content. See also publish and subscribe.
- cluster
-
in SAS data quality, a set of character values
that have the same match code.
- comparison result
-
the output of change analysis. For example, in
SAS Data Integration Studio, the metadata for a comparison result
can be selected, and the results of that comparison can be viewed
in a Differences window and applied to a metadata repository. See
also change analysis.
- cross-reference table
-
a table that contains only the current rows of
a larger dimension table. Columns generally include all business key
columns and a digest column. The business key columns are used to determine
whether source rows are new dimensions or updates to existing dimensions.
The digest column is used to detect changes in source rows that might
update an existing dimension. During updates of the fact table that
is associated with the dimension table, the cross-reference table
can provide generated keys that replace the business key in new fact
table rows.
- custom repository
-
an optional metadata store for a SAS Metadata
Server that can be configured in addition to the foundation repository.
Custom repositories are useful for physically segregating metadata
for storage or security purposes.
- data analysis
-
in SAS data quality, the process of evaluating
input data sets in order to determine whether data cleansing is needed.
- data cleansing
-
the process of eliminating inaccuracies, irregularities,
and discrepancies from data.
- data integration
-
the process of consolidating data from a variety
of sources in order to produce a unified view of the data.
- data lineage
-
a search that seeks to identify the tables, columns,
and transformations that have an impact on a selected table or column.
See also impact analysis, reverse impact analysis, and transformation.
- data store
-
a table, view, or file that is registered in a
data warehouse environment. Data stores can contain either individual
data items or summary data that is derived from the data in a database.
- data transformation
-
in SAS data quality, a cleansing process that
applies a scheme to a specified character variable. The scheme creates
match codes internally to create clusters. All values in each cluster
are then transformed to the standardization value that is specified
in the scheme for each cluster.
- database library
-
a collection of one or more database management
system files that are recognized by SAS and that are referenced and
stored as a unit. Each file is a member of the library.
- database server
-
a server that provides relational database services
to a client. Oracle, DB2, and Teradata are examples of relational
databases.
- delimiter
-
a character that separates words or phrases in
a text string.
- delivery transport
-
in the Publishing Framework, the method of delivering
a package to the consumer. Supported transports include e-mail, message
queue, and WebDAV. Although not a true transport, a channel also functions
as a delivery mechanism. See also e-mail, message queue, WebDAV (Web
Distributed Authoring and Versioning), and channel.
- derived mapping
-
a mapping between a source column and a target
column in which the value of the target column is a function of the
value of the source column. For example, if two tables contain a Price
column, the value of the target table's Price column might be equal
to the value of the source table's Price column multiplied by 0.8.
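
The Price example can be sketched in a few lines. The function name `map_price` and the sample rows are hypothetical; only the 0.8 multiplier comes from the definition above.

```python
def map_price(source_row, factor=0.8):
    """Derive the target Price from the source Price."""
    # Rounding to two decimals is an assumption of this sketch.
    return {"Price": round(source_row["Price"] * factor, 2)}


# Hypothetical source rows.
source_rows = [{"Price": 10.0}, {"Price": 25.5}]
target_rows = [map_price(row) for row in source_rows]
```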
- digest column
-
a column in a cross-reference table that contains
a concatenation of encrypted values for specified columns in a target
table. If a source row has a digest value that differs from the digest
value for that dimension, then changes are detected and the source
row becomes the new current row in the target. The old target row
is closed out and receives a new value in the end date/time column.
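
A minimal sketch of digest-based change detection follows. It assumes that the digest is a SHA-256 hash of the concatenated column values, which is a stand-in chosen for this example and not necessarily the encoding that SAS uses. The column names, the `digest` helper, and the sample rows are hypothetical.

```python
import hashlib


def digest(row, columns):
    """Concatenate the tracked column values and hash the result."""
    joined = "|".join(str(row[c]) for c in columns)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Columns whose changes should be detected (hypothetical).
TRACKED = ["address", "income"]

current = {"customer_id": 42, "address": "1 Main St", "income": 50000}
incoming_same = {"customer_id": 42, "address": "1 Main St", "income": 50000}
incoming_changed = {"customer_id": 42, "address": "9 Oak Ave", "income": 50000}

stored_digest = digest(current, TRACKED)
unchanged = digest(incoming_same, TRACKED) == stored_digest
changed = digest(incoming_changed, TRACKED) != stored_digest
```

Comparing one digest value per row avoids comparing every tracked column individually.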
- dimension
-
a category of contextual data or detail data that
is implemented in a data model such as a star schema. For example,
in a star schema, a dimension named Customers might associate customer
data with transaction identifiers and transaction amounts in a fact
table.
- dimension table
-
in a star schema or snowflake schema, a table
that contains data about a particular dimension. A primary key connects
a dimension table to a related fact table. For example, if a dimension
table named Customers has a primary key column named Customer ID,
then a fact table named Customer Sales might specify the Customer
ID column as a foreign key.
- dynamic cluster table
-
two or more SAS SPD Server tables that are virtually
concatenated into a single entity, using metadata that is managed
by the SAS SPD Server.
- e-mail
-
a system for transmitting messages electronically,
usually between two computers. See also delivery transport.
- fact table
-
the central table in a star schema or snowflake
schema. A fact table typically contains numerical measurements or
amounts and is supplemented by contextual information in dimension
tables. For example, a fact table might include transaction identifiers
and transaction amounts. Dimension tables could add contextual information
about customers, products, and salespersons. Fact tables are associated
with dimension tables via key columns. Foreign key columns in the
fact table contain the same values as the primary key columns in the
dimension tables.
- foreign key
-
a column or combination of columns in one table
that references the corresponding primary key in another table. A
foreign key must have the same attributes as the primary key that
it references.
- foundation repository
-
the required metadata store for a SAS Metadata
Server. Each SAS Metadata Server has one foundation repository that
is created by default when the metadata server is configured.
- generated key
-
a column in a dimension table that contains values
that are sequentially generated using a specified expression. Generated
keys are used to implement surrogate keys and retained keys.
- generated transformation
-
in SAS Data Integration Studio, a transformation
that is created with the Transformation Generator wizard, which helps
you specify SAS code for the transformation. See also transformation.
- global resource
-
an object, such as a server or a library, that
is shared on a network.
- impact analysis
-
a search that seeks to identify the tables, columns,
and transformations that would be affected by a change in a selected
table or column. See also transformation and data lineage.
- Integrated Object Model server
-
a SAS object server that is launched in order
to fulfill client requests for IOM services. Short form: IOM server.
- intersection table
-
a table that describes the relationships between
two or more tables. For example, an intersection table could describe
the many-to-many relationships between a table of users and a table
of groups.
- IOM server
-
See Integrated Object Model server.
- iterative job
-
a job with a control loop in which one or more
processes are executed multiple times. Iterative jobs can be executed
in parallel. See also job.
- iterative processing
-
a method of processing in which a control loop
executes one or more processes multiple times.
- job
-
a collection of SAS tasks that create output.
- locale
-
a value that reflects the language, local conventions,
and culture for a geographic region. Local conventions can include
specific formatting rules for dates, times, and numbers, and a currency
symbol for the country or region. Collating sequences, paper sizes,
and conventions for postal addresses and telephone numbers are also
typically specified for each locale. Some examples of locale values
are French_Canada, Portuguese_Brazil, and English_USA.
- lookup standardization
-
a process that applies a scheme to a data set
for the purpose of data analysis or data cleansing.
- match code
-
an encoded version of a character value that is
created as a basis for data analysis and data cleansing. Match codes
are used to cluster and compare character values. See also sensitivity.
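
To make the relationship between match codes, clusters, and sensitivity concrete, here is a toy sketch. The encoding in `toy_match_code` is invented for this example and is not the algorithm that SAS data quality software uses. It keeps the first character, drops later vowels and Y, and truncates the result to `sensitivity` characters.

```python
from collections import defaultdict


def toy_match_code(value, sensitivity=4):
    """Return a simplified, hypothetical match code for a character value."""
    chars = [ch for ch in value.upper() if ch.isalnum()]
    if not chars:
        return ""
    tail = [ch for ch in chars[1:] if ch not in "AEIOUY"]
    return (chars[0] + "".join(tail))[:sensitivity]


def cluster(values, sensitivity=4):
    """Group values that share the same match code."""
    groups = defaultdict(list)
    for value in values:
        groups[toy_match_code(value, sensitivity)].append(value)
    return dict(groups)


names = ["Smith", "Smyth", "SMITH ", "Smithson", "Jones"]
loose = cluster(names, sensitivity=4)   # shorter codes, broader clusters
strict = cluster(names, sensitivity=8)  # longer codes, narrower clusters
```

With the lower sensitivity, "Smithson" falls into the same cluster as "Smith". With the higher sensitivity, its longer match code places it in a cluster of its own.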
- message queue
-
in application messaging, a place where one program
can send messages that will be retrieved by another program. The two
programs communicate asynchronously. Neither program needs to know
the location of the other program nor whether the other program is
running. See also delivery transport.
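
The asynchronous behavior can be sketched with an in-process queue from the Python standard library. This is a stand-in for a real messaging product, and the message contents and the `None` end marker are assumptions of the example. The sender places all of its messages on the queue before the receiver has started.

```python
import queue
import threading

message_queue = queue.Queue()
received = []


def receiver():
    """Retrieve messages until the end marker (None) arrives."""
    while True:
        message = message_queue.get()
        if message is None:
            break
        received.append(message)


# The sender runs first; no receiver is running yet.
for message in ["package-1", "package-2"]:
    message_queue.put(message)
message_queue.put(None)  # end marker, an assumption of this sketch

# The receiver starts later and still gets every message, in order.
worker = threading.Thread(target=receiver)
worker.start()
worker.join()
```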
- metadata administrator
-
a person who defines the metadata for servers,
metadata repositories, users, and other global resources.
- metadata model
-
a definition of the metadata for a set of objects.
The model describes the attributes for each object, as well as the
relationships between objects within the model.
- metadata object
-
a set of attributes that describe a table, a server,
a user, or another resource on a network. The specific attributes
that a metadata object includes vary depending on which metadata model
is being used.
- metadata repository
-
a collection of related metadata objects, such
as the metadata for a set of tables and columns that are maintained
by an application. A SAS Metadata Repository is an example.
- metadata server
-
a server that provides metadata management services
to one or more client applications. A SAS Metadata Server is an example.
- operational data
-
data that is captured by one or more applications
in an operational system. For example, an application might capture
and manage information about customers, products, or sales. See also
operational system.
- operational system
-
one or more applications that capture and manage
data for an organization. For example, a business might have a set
of applications that manage information about customers, products,
and sales.
- owner
-
the person who is responsible for the contents
of an object such as a table or a library. See also administrator.
- parameterized job
-
a job that specifies its inputs and outputs as
parameters. See also job.
- parameterized table
-
a table whose metadata specifies some attributes
as variables rather than as literal values. For example, the input
to an iterative job could be a parameterized table whose metadata
specifies its physical pathname as a variable. See also iterative
job.
- PFD
-
See process flow diagram.
- primary key
-
a column or combination of columns that uniquely
identifies a row in a table.
- process flow diagram
-
a diagram that specifies the sequence of each
source, target, and process in a job. In the diagram, each source,
target, and process has its own metadata object. Each process in the
diagram is specified by a metadata object called a transformation.
Short form: PFD.
- project repository
-
a metadata repository that serves as an individual
work area or playpen. Project repositories are available for SAS Data
Integration Studio only. In general, each user who participates in
change management has his or her own project repository.
- publish
-
to deliver electronic information, such as SAS
files (including SAS data sets, SAS catalogs, and SAS data views),
other digital content, and system-generated events to one or more
destinations. These destinations can include e-mail addresses, message
queues, publication channels and subscribers, WebDAV-compliant servers,
and archive locations.
- Quality Knowledge Base
-
a collection of locales and other information
that is referenced during data analysis and data cleansing. For example,
to create match codes for a data set that contains street addresses
in Great Britain, you would reference the ADDRESS match definition
in the ENGBR locale in the Quality Knowledge Base.
- register
-
to save metadata about an object to a metadata
repository. For example, if you register a table, you save metadata
about that table to a metadata repository.
- retained key
-
a numeric column in a dimension table that is
combined with a begin-date column to make up the primary key. During
the update of a dimensional target table, source rows that contain
a new business key are added to the target. A key value is generated
and added to the retained key column and a date is added to the begin-date
column. When a source row has the same business key as a row in the
target, the source row is added to the target, including a new begin-date
value. The retained key of the new row is copied from the target
row.
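
The load logic described above might be sketched as follows. The `load_row` function, the column names, and the sample dates are hypothetical, and the sketch generates key values with a simple counter.

```python
from datetime import date
from itertools import count


def load_row(target, source_row, next_key, begin_date):
    """Append source_row to target, reusing or generating a retained key."""
    matches = [r for r in target if r["business_key"] == source_row["business_key"]]
    if matches:
        # Same business key: copy the retained key from the existing target row.
        retained_key = matches[0]["retained_key"]
    else:
        # New business key: generate the next key value.
        retained_key = next(next_key)
    target.append({"retained_key": retained_key, "begin_date": begin_date, **source_row})


target = []
keys = count(1)
load_row(target, {"business_key": "C-100", "city": "Cary"}, keys, date(2024, 1, 1))
load_row(target, {"business_key": "C-200", "city": "Austin"}, keys, date(2024, 1, 1))
load_row(target, {"business_key": "C-100", "city": "Raleigh"}, keys, date(2024, 6, 1))

primary_keys = [(r["retained_key"], r["begin_date"]) for r in target]
```

Both rows for C-100 share retained key 1, and the begin date keeps the combined primary key unique.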
- reverse impact analysis
-
See data lineage.
- SAS Application Server
-
in the SAS Intelligence Platform, a logical entity
that represents the SAS server tier. This logical entity contains
specific servers (for example, a SAS Workspace Server and a SAS Stored
Process Server) that execute SAS code. A SAS Application Server has
relationships with other metadata objects. For example, a SAS library
can be assigned to a SAS Application Server. When a client application
needs to access that library, the client submits code to the SAS Application
Server to which the library is assigned.
- SAS Management Console
-
a Java application that provides a single user
interface for performing SAS administrative tasks.
- SAS metadata
-
metadata that is created by SAS software. Metadata
that is in SAS Open Metadata Architecture format is one example.
- SAS OLAP Server
-
a SAS server that provides access to multidimensional
data. The data is queried using the multidimensional expressions (MDX)
language.
- SAS Open Metadata Architecture
-
a general-purpose metadata management facility
that provides metadata services to SAS applications. The SAS Open
Metadata Architecture enables applications to exchange metadata, which
makes it easier for these applications to work together.
- SAS Stored Process Server
-
a SAS IOM server that is launched in order to
fulfill client requests for SAS Stored Processes. See also IOM server.
- SAS task
-
a logical process that is executed by a SAS session.
A task can be a procedure, a DATA step, a window, or a supervisor
process.
- SAS XML library
-
a library that uses the SAS XML LIBNAME engine
to access an XML file.
- SAS/CONNECT server
-
a server that provides SAS/CONNECT services to
a client. When SAS Data Integration Studio generates code for a job,
it uses SAS/CONNECT software to submit code to remote computers. SAS
Data Integration Studio can also use SAS/CONNECT software for interactive
access to remote libraries.
- SAS/SHARE library
-
a SAS library for which input and output requests
are controlled and executed by a SAS/SHARE server.
- SAS/SHARE server
-
the result of an execution of the SERVER procedure,
which is part of SAS/SHARE software. A server runs in a separate SAS
session that services users' SAS sessions by controlling and executing
input and output requests to one or more SAS libraries.
- scheme
-
a lookup table or data set of character variables
that contains variations of data items and specifies the preferred
variation form or standard. When these schemes are applied to the
data, the data is transformed or analyzed according to the predefined
rules to produce standardized values.
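
In its simplest form, a scheme can be pictured as a lookup table from variations to a preferred value. The sketch below assumes an exact match on a trimmed, uppercased value, which is a simplification for illustration. The scheme contents and the `apply_scheme` name are hypothetical.

```python
# Hypothetical scheme: each variation maps to its preferred standard form.
scheme = {
    "ACME CORP": "Acme Corporation",
    "ACME CORPORATION": "Acme Corporation",
    "ACME INC": "Acme Corporation",
}


def apply_scheme(values, scheme):
    """Replace each value with its standard form, if the scheme lists it."""
    return [scheme.get(value.strip().upper(), value) for value in values]


raw = ["Acme Corp", " acme inc", "Globex"]
standardized = apply_scheme(raw, scheme)
```

Values that the scheme does not list, such as "Globex" here, pass through unchanged.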
- sensitivity
-
in SAS data quality, a value that specifies the
amount of information in match codes. Greater sensitivity values result
in match codes that contain greater amounts of information. As sensitivity
values increase, character values must be increasingly similar to
generate the same match codes.
- server administrator
-
a person who installs and maintains server hardware
or software. See also metadata administrator.
- server component
-
in SAS Management Console, a metadata object that
specifies information about how to connect to a particular kind of
SAS server on a particular computer.
- slowly changing dimensions
-
a technique for tracking changes to dimension
table values in order to analyze trends. For example, a dimension
table named Customers might have columns for Customer ID, Home Address,
Age, and Income. Each time the address or income changes for a customer,
a new row could be created for that customer in the dimension table,
and the old row could be retained. This historical record of changes
could be combined with purchasing information to forecast buying trends
and to direct customer marketing campaigns.
- snowflake schema
-
tables in a database in which a single fact table
is connected to multiple dimension tables. The dimension tables are
structured to minimize update anomalies and to address single themes.
This structure is visually represented in a snowflake pattern. See
also star schema.
- source
-
an input to an operation.
- star schema
-
tables in a database in which a single fact table
is connected to multiple dimension tables. This is visually represented
in a star pattern. SAS OLAP cubes can be created from a star schema.
- subscribe
-
to sign up to receive electronic content that
is published to a SAS publication channel.
- surrogate key
-
a numeric column in a dimension table that is
the primary key of that table. The surrogate key column contains unique
integer values that are generated sequentially when rows are added
and updated. In the associated fact table, the surrogate key is included
as a foreign key in order to connect to specific dimensions.
- target
-
an output of an operation.
- transformation
-
a SAS task that extracts data, transforms data,
or loads data into data stores.
- unique key
-
one or more columns that can be used to uniquely
identify a row in a table. A table can have one or more unique keys.
- Web Distributed Authoring and Versioning
-
an emerging industry standard, based on extensions
to HTTP 1.1, that enables users to collaborate in the development
of files and collections of files on remote Web servers. Short form:
WebDAV. See also delivery transport.
- Web service
-
a programming interface that enables distributed
applications to communicate even if the applications are written in
different programming languages or are running on different operating
systems.
- WebDAV
-
See Web Distributed Authoring and Versioning.
Copyright © SAS Institute Inc. All rights reserved.