Previous Page | Next Page

Glossary

Glossary

add-in code generator

is a SAS/Warehouse Administrator Metadata API application that dynamically generates the code for a process in SAS/Warehouse Administrator. It enables you to access and use relevant metadata to drive your table processes. For example, the Teradata Loader add-in reads the metadata for an input table, then generates the code which loads the table into a Teradata DBMS, using Teradata load utilities. If add-in code generators have been installed at your site, they are typically selected from the Process Library.

add-in metadata exporter

is a SAS/Warehouse Administrator Metadata API application that exports SAS/Warehouse Administrator metadata in a format that can be used by another application. For example, the SAS/EIS Metabase Exporter exports metadata in SAS/EIS format. If custom metadata exporters have been installed at your site, they can be selected from the Export Format Selection window.

add-in tool

is a SAS SCL application that performs a task in SAS/Warehouse Administrator. For example, the Publish HTML Documentation tool reads the metadata for an input table and publishes the metadata in HTML format (column descriptions, input sources, source code, and other metdatata). Add-in tools are available from the Tools-> Add-Ins pull-down menu.

analysis column

is a numeric column that stores raw data to be used to calculate statistics. The values for an analysis column are the measurements you want to analyze, for example, using OLAP techniques. For example, Age and Income could be analysis columns.

category column

see class column.

cardinality

is the number of discrete values for any OLAP class column within a population.

class column

is a required OLAP summary role column, which is a numeric or character column that is used to group data into classifications. The values for each class column define groups for analysis. That is, the rows in the summary data store are grouped according to the values of the column, and a separate analysis is run for each group. Class columns can have continuous values, but they typically have a few discrete values that define the classifications of the column. For example, if columns STATE and COUNTY are class columns, you can sort the data so that the states are first, and SAS/Warehouse Administrator will summarize data for each county within each state.

Contact record

is a metadata record which specifies the owner or administrator who is responsible for a group, data store, process, or Job. An owner is the person who formulates policy and makes decisions about an object. An administrator is the person who implements decisions formulated by the owner in accordance with established policy. Contact records are stored along with the other metadata for the current Environment. Once created, contact records may be included in the metadata records for groups, data stores, processes and Jobs in the current Environment.

crossing

is a unique ordered list of one or more class columns that defines a summarization level (subtable) to be stored in one or more OLAP summary data stores. That is, a crossing represents a grouping on which summary statistics are calculated. You must have at least one crossing for an OLAP Table or an OLAP MDDB, and both summary data stores can have multiple crossings. All class columns must be in at least one crossing. A crossing represents the physically stored data, which provides the quick response when displaying a report in an OLAP application. For each crossing, there is a single record for each unique combination of all named class columns in the original raw input data.

Data File

a metadata record which specifies a SAS file that is an input to an Operational Data Definition (ODD). If you are defining an ODD whose Load Step is a DATA step view or an SQL view (but not a Pass-Through view), you must define its inputs in the Process Editor. Even if your ODD does not meet the conditions above, you may still want to specify a Process Flow for this Job for documentation purposes.

Data Group

is a simple grouping element for Data Tables, Information Marts, and other Data Groups.

data integration

is a kind of data transformation in which diverse data attributes are reconciled and data values are put into a standard, consistent form. Integration can be performed as a separate task, but is often performed simultaneously with other transformation tasks or during the extraction phase. See also denormalization.

data mart

is a limited data warehouse that is often designed to meet the needs of a particular department or individual. A data mart is more limited in scope than a data warehouse, which typically contains information used by more than one department. To implement a data mart in SAS/Warehouse Administrator, use an appropriate SAS/Warehouse Administrator object, such as a Data Warehouse or a Data Group.

data mining

is the process of searching for patterns, trends, or correlations in large data stores. Data mining uses statistical analysis techniques, pattern recognition technology, or both. SAS Enterprise Miner software can be used for this purpose.

data preparation process

is a general term for routines which extract, validate, or transform data for loading into a table or file. SAS/Warehouse Administrator data preparation processes include Mappings, Record Selectors, Data Transfers, and User Exits.

data source

is any information that you will extract, transform, and summarize in your warehouse data stores. Data sources can be in any format that SAS can read, on any supported hardware platform.

data store

is a table, a view, or a file that is registered in a Warehouse Environment. Except for ODDs, whether a data store is in SAS format or in another format, you can have SAS/Warehouse Administrator generate the SAS code that is required to read, write, or update the data store. If a file is not a SAS file, you can specify a command that SAS/Warehouse Administrator will use to open the file. Data stores can be in any format that SAS can read and write, on any supported hardware platform.

data structuring

is a kind of data transformation in which new columns are created and/or existing columns are modified to provide users with needed subject information.

data summarization

is a kind of data transformation in which data is aggregated by computing secondary statistics on numeric data or by creating counts on noncontinuous columns.

Data Table

is a metadata record which specifies a SAS table or view or a DBMS table or view that can serve multiple purposes. Data Tables are frequently used to define intermediate data stores. They can be used to define detail data stores, summary data stores (if you write your own summary code), a look-up table included as part of a join, or a table that holds information that does not fit anywhere else.

Data Transfer process

is a metadata record used to generate or retrieve a routine which moves data from one host to another. Data Transfers are required when an input source and the target data reside on different hosts. SAS/Warehouse Administrator generates code that uses PROC UPLOAD or PROC DOWNLOAD to move the data.

data transformation

is a general term for all of the tasks that prepare data for loading into a data warehouse. See also data integration, data scrubbing, data structuring, data validation, denormalization, and summarization.

data validation

is a kind of data transformation in which potentially invalid, out-of-range, missing, or duplicate values are detected.

data warehouse

in general, a data warehouse is a collection of data that is extracted from one or more sources for the purpose of query and analysis. Data warehouses are subject-oriented, integrated, time-variant, and nonvolatile. In contrast to a data mart, a data warehouse is better suited for larger, more comprehensive information storage.

Data Warehouse

in SAS/Warehouse Administrator, a Data Warehouse object is a metadata record which specifies the SAS library _DWMD. The _DWMD library contains the metadata for most groups and data stores in a data warehouse or a data mart at your site.

Data Warehouse Environment

is a metadata record which specifies the SAS library _MASTER. The _MASTER library contains host definitions and other metadata that is shared among one or more Data Warehouses and ODD Groups. On the SAS/Warehouse Administrator desktop, Environments are displayed as icons. In the SAS/Warehouse Administrator Explorer, the Environment you selected from the desktop is the top-most object.

DBMS

data base management system; in the context of SAS/Warehouse Administrator, this term usually refers to a DBMS other than SAS, such as Oracle, Teradata, or Microsoft Access.

DBMS connection definition

is a metadata record which specifies a user name, a password, DBMS options and other information that SAS requires to access tables in a data base management system (DBMS). Once created, DBMS connection definitions can be included in the metadata records for DBMS data stores in the current Environment. SAS/Warehouse Administrator will then be able to generate the appropriate SAS/ACCESS statements to access these data stores.

DBMS LIBNAME definition

see SAS/ACCESS LIBNAME definition.

denormalization

is a kind of data transformation in which DBMS data is freed from its normalized structure by combining information from separate tables. Denormalization can increase the performance of a data warehouse, though it frequently results in some data redundancy.

detail data

is information at or near the fact level in a data base. It is data that has not been summarized or has only been lightly summarized after extraction from a source. In SAS/Warehouse Administrator, detail data is stored in Data Tables, Detail Logical Tables, and Detail Tables.

Detail Logical Table

is a metadata record which specifies a SAS table or view that can serve multiple purposes. Detail Logical Tables are frequently used to define views on multiple, related Detail Tables. They can be also be used to define a detail data store or a simple group for Detail Tables. A Subject can have only one Detail Logical Table. A Detail Logical Table can contain any number of Detail Tables or views. Detail Logical Tables in different Subjects can share (link to) the same Detail Table.

Detail Table

is a metadata record which specifies a SAS table or view or a DBMS table or view that can serve as a detail data store in SAS/Warehouse Administrator. Detail Tables are children of Detail Logical Tables.

dimension

acts as an index for identifying values. A dimension groups related data, which is defined in hierarchies. For example, you could organize sales data into three dimensions: Geography, Time, and Product. The Time dimension could include the following hierarchies, which provide different paths in order to drill down to increasing levels of detail: Time-by-Week and Time-by-Month.

drill-down analysis

is the process of displaying increasing levels of detail within a specified dimension, such as time (from Year to Month to Day) or geographic location (from Country to State/Province to City).

Event

is a metadata record which specifies a condition that controls a Job, such as checking for certain return codes or verifying the existence of a file. Currently, to use Events, you must write a Metadata API program which reads Job Flows with Events and generates code for them.

Environment

See Data Warehouse Environment.

External File

is an input to an ODD which extracts information from one or more sources that are not in SAS format. That is, an External File is an input to an ODD whose Load Step is a DATA step view.

fact

is an instance of some particular occurrence and the properties of the occurrence all stored in a data base.

global metadata

is metadata that, for the most part, can be shared among one or more Data Warehouses and ODD Groups within an Environment. After you define a Warehouse Environment, you can define metadata records that are shared at the Environment level. These records can be included in the metadata for data stores and other resources in the current Environment. The global metadata types are: SAS library definitions (including SAS/ACCESS LIBNAME definitions and Jobs Information libraries), Host definitions, DBMS connection definitions, Contact records, and Scheduling server definitions.

group

is an element in the SAS/Warehouse Administrator Explorer or Process Editor that is used to organize other elements. Groups in the Explorer include Environments, ODD Groups, Warehouses, and Subjects.

hierarchy

is a unique list of class columns that specifies related data and is a member of a dimension. Each hierarchy provides a navigational path in order to drill down to increasing levels of detail. For example, for a dimension named Time, you could define a hierarchy named Time-by-Month that consists of the class columns YEAR, MONTH, and DATE. Note that the term hierarchy is not the same as a stored summary level, which is a crossing.

HOLAP

is hybrid online analytical processing (OLAP). HOLAP combines the functionality of both ROLAP and MOLAP, producing a hybrid OLAP solution that combines the best features of both. HOLAP provides access to diverse data sources on local and/or remote servers. SAS/Warehouse Administrator supports HOLAP with an OLAP Group of type HOLAP. Such a group can contain both OLAP Tables and OLAP MDDBs.

When an OLAP Group of type HOLAP is specified as an output data store in a Job, SAS/Warehouse Administrator generates a proxy MDDB, which is a physical file that represents all of the data in an OLAP Group of type HOLAP. The proxy MDDB can be used by SAS/EIS software to provide more efficient access to multiple OLAP Tables and/or OLAP MDDBs.

Host definition

is a metadata record which specifies a computer where data stores reside, where processes and Jobs execute, or where process output is sent. Once created, host definitions may be included in the metadata records for data stores, processes and Scheduling Server definitions in the current Environment.

ID column

is an optional OLAP summary role column that can be used to include additional values in the summary data store. You can have multiple ID columns. Specifying an ID column applies to an OLAP Table only; it is not supported for an OLAP MDDB.

Information Mart (Info Mart)

is a grouping element for Information Mart Items and Information Mart Files.

Information Mart File (Info Mart File)

is a metadata record which specifies a file other than a SAS file that you want to register in a Warehouse. The file can be a spreadsheet, an HTML report, or any file that can be opened by an external application.

Information Mart Item (Info Mart Item)

is a metadata record which contains or displays information generated from detail data or summary data in the Warehouse. These items are usually SAS charts, reports, graphs, or queries.

Job

is a metadata record that specifies the processes that create one or more data stores (output tables). A Job enables you to connect a series of process steps into a single unit. The processes may be specified with a Process Flow diagram in the Process Editor. If a process flow diagram is specified, SAS/Warehouse Administrator can generate code for the job. Alternatively, a job may simply reference a user-supplied program which contains the processes that create the data store(s). A job may include scheduling metadata which enables the process flow or user-supplied program to be executed in batch mode at a specified date and time.

Job Flow

is a user-defined diagram in the Job View of the Process Editor. It is composed of symbols, with connecting arrows and descriptive text, that illustrate the sequence in which Jobs and Events are executed. Job Flows are not required.

SAS/Warehouse Administrator allows you to create Job Flows, which define metadata for Job dependencies. However, the current release does not generate code for the Job Flows. To use them, you must write a Metadata API program that reads Job Flows and generates code for them. For details about writing metadata API programs, see SAS/Warehouse Administrator Metadata API Reference.

Job Group

is a simple grouping element for Process Editor Jobs, Events, and other Job Groups.

Load process (Load Step)

is a metadata record used to generate or retrieve a routine which puts data into a specified target object. After you define the metadata for a given data store, you must define a Load process which actually creates and loads the data store.

load time

is the date and time when a data value is loaded into a table.

logical model

is a model of a specific business process or concept from an end-user's perspective. A logical model identifies the subjects and relationships among data elements but does not describe the functional or physical characteristics of the data elements.

Mapping

is a metadata record used to generate or retrieve a routine which maps columns from one or more data sources into one or more Data Tables, Detail Tables, OLAP Tables, or OLAP MDDBs. Common mappings include one-to-one (one data source to a target table), joins (one or more data sources merged by one or more common columns), and unions (two or more data sources appended to a target table).

metadata

is a definition or description of data. Using the windows in SAS/Warehouse Administrator, you specify metadata which defines data sources, data stores, code libraries, and other warehouse resources. SAS/Warehouse Administrator then uses this metadata to generate or retrieve the code which extracts, transforms, and loads the data into your warehouse.

There are two main kinds of metadata: physical metadata and business metadata. The physical metadata for a SAS table might specify a certain number of rows and columns, with certain transformations applied to some of the columns. The business metadata for a SAS table might describe the purpose of the table and contact information for the person responsible for the accuracy of the information in the table.

metadata application program interface (API)

is a set of software tools that enable you to read or write SAS/Warehouse Administrator metadata without going through its user interface.

metadata repository

is a data store that contains an application's metadata. SAS/Warehouse Administrator stores its metadata in two SAS libraries: libref _MASTER (metadata for an Environment) and libref _DWMD (metadata for a Warehouse).

MOLAP

is multidimensional OLAP: online analytical processing performed on a multidimensional database, such as a SAS MDDB. SAS/Warehouse Administrator supports MOLAP with an OLAP Group of type MOLAP. Such a group is a grouping mechanism intended to contain only OLAP MDDBs. Multiple MDDBs can be contained in the group, but each MDDB generally represents an entire OLAP application.

MDDB

is a multidimensional data base. See OLAP MDDB.

NWAY crossing

is the most detailed type of crossing for an OLAP object. An NWAY crossing consists of all the assigned class columns.

ODD

is an operational data definition, a metadata record which provides access to data sources. The ODDs, in turn, are used as inputs to data stores in your warehouse.

At a minimum, in order for a data source to be visible in a Warehouse Environment, you must specify the location of that data source in an ODD. You can define an ODD that simply registers the location of a SAS table or view, or that registers the location of a DBMS table with the help of a SAS/ACCESS LIBNAME definition. You can also define an ODD that extracts information from a data source, saves the results to a SAS table or view, then specifies the location of the extraction table or view. See also Data File and External File.

ODD Group

is a metadata record which specifies a group that is used to organize ODDs. It may also contain Information Marts.

OLAP

is online analytical processing, a kind of data analysis that is designed to answer "what happened and why" questions. OLAP enables you to easily and selectively extract and view data from different points of view. For example, a user can request that data be analyzed to display a report showing all of a company's products sold in Virginia in the month of September, 2000, then compare revenue figures with those for the same products in September, 1999, and then see a comparison of other product sales in Virginia in the same time period. SAS/Warehouse Administrator supports OLAP with its OLAP Groups, OLAP Tables, and OLAP MDDBs.

OLAP cube

is a metadata record that represents the logical relationships (dimensions and hierarchies) of the OLAP data so that you can run an OLAP report. For HOLAP, the cube is associated with the OLAP Group and is registered in SAS/EIS software as associated with the proxy MDDB. The result is one OLAP Cube describing all the separate summary data stores. For MOLAP and ROLAP, there is normally one cube associated with each OLAP MDDB and OLAP Table.

OLAP Group

is a metadata record that specifies a group used to organize related summary data, which is stored in OLAP Tables and/or OLAP MDDBs. OLAP Group properties specify the logical structure of the summarized data. OLAP Groups replace the Summary Groups used in earlier releases.

OLAP MDDB

is a metadata record that specifies a SAS MDDB (multidimensional database). A SAS MDDB is not a SAS table. It is a specialized storage format that stores derived summary data in a multidimensional form, which is a highly indexed and compressed format. To load an OLAP MDDB, SAS/Warehouse Administrator generates code for the MDDB procedure, which summarizes data similar to the SUMMARY procedure. OLAP MDDBs replace the Summary MDDBs used in earlier releases.

OLAP Table

is a metadata record that specifies a file to store derived summary data. This file can be a SAS table or view or a DBMS table or view. To load an OLAP Table, SAS/Warehouse Administrator generates code for the SUMMARY procedure, which summarizes data by computing descriptive statistics for columns across all rows or within groups of rows. OLAP Tables replace the Summary Tables used in earlier releases.

OLTP systems

are online transaction processing systems, the systems that an enterprise uses for processing business transactions on a day-to-day basis.

operational data

is data from an OLTP system. Operational data is a common source for the information that is extracted and loaded into a warehouse data store.

physical metadata

is a set of software instructions that define how an application element stores, moves, or transforms data. For example, the physical metadata for a SAS table might specify a certain number of rows and columns, with certain transformations applied to some of the columns.

physical model

is the database design that identifies the structure and function of the database. A physical model is based on a logical model and is a technical specification that identifies schemas, tables, columns, indexes, and so on.

post-load process

is a routine that executes after a table is loaded. Post-load processes are specified on the Source Code tab of a load process window, which you display from the Process Editor.

process

is a routine that creates a warehouse data store, or that extracts data, transforms data, or loads data into a data store. In SAS/Warehouse Administrator, you define metadata records that are used to generate or retrieve the source code for processes. Mappings, User Exits, Data Transfers, Record Selectors, and Load Steps are all metadata records that generate or retrieve processes.

Each process that you define in the Process View of the Process Editor generates or retrieves code. SAS/Warehouse Administrator can generate source code for any process except a User Exit or an ODD Load Step. However, you can specify a user-written routine for any process.

Process Flow

is a user-defined diagram in the Process View of the Process Editor. It is composed of symbols, with connecting arrows and descriptive text, that illustrate the sequence of each process associated with the Job that is selected in the Job Hierarchy of the Process Editor. The Process Flow illustrates how the data moves from input source(s) to output table(s) and what extractions and transformations occur in between.

A Job must include a Process Flow if SAS/Warehouse Administrator will generate the source code for the Job. If you will supply the source code for a Job, no Process Flow is required, although you may want to create one for documentation purposes.

Process Library

is a collection of registered routines that extract data, transform it, and/or load it into warehouse tables. For example, a Process Library might include routines that do a range-check on certain data values, or that eliminate duplicate values. As you use the Process Editor to define a step in the Process Flow for a particular table, you have the option of selecting a predefined routine from the Process Library, instead of defining your own process for that step.

proxy MDDB

See the discussion under HOLAP.

Record Selector process

is a metadata record used to generate or retrieve a routine which subsets data prior to loading it to a specified table. In the current release, a Record Selector can be used only to subset the source data specified in an ODD or in a Data File (which is an input to an ODD).

ROLAP

is relational OLAP: online analytical processing performed on a relational database, such as a SAS table or an ORACLE table. SAS/Warehouse Administrator supports ROLAP with an OLAP Group of type ROLAP. Such a group is a grouping mechanism intended to contain only OLAP Tables. Multiple OLAP Tables can be contained in the group, but each table generally represents an entire OLAP application.

SAS Component Language (SCL)

is a programming language used with SAS/AF and SAS/FSP software to develop interactive SAS applications. It was formerly known as SAS Screen Control Language.

SAS library definition

is a metadata record for a SAS library that contains data, views, source code, or other information that is used in the current Warehouse Environment. SAS library definitions are included in the metadata records for data stores, processes, and Jobs in the current Environment.

SAS/ACCESS LIBNAME definition

is a special SAS library definition that can be used to extract source data in DBMS format or to create warehouse data stores in a DBMS. SAS/Warehouse Administrator uses a SAS/ACCESS LIBNAME definition to generate a SAS/ACCESS LIBNAME statement. Some of the metadata that you specify in the definition corresponds to the options in the LIBNAME statement. For example, a SAS/ACCESS LIBNAME definition specifies a SAS/ACCESS engine - such as oracle or sybase - that enables you to access the corresponding DBMS as if it were a SAS library.

SAS/Warehouse Administrator

is an application which provides a visual environment for managing data warehouses. Using the windows in this application, you specify the metadata which defines data sources, data stores, code libraries, and other warehouse resources. SAS/Warehouse Administrator then uses this metadata to generate or retrieve the code which extracts, transforms, and loads the data into your warehouse.

Scheduling Server definition

is a metadata record which specifies a scheduling server application (such as CRON under UNIX System V), a definition for the host where the scheduling server runs, directories where the scheduling server can send temporary files, the commands used to start SAS on the scheduling server host, and the default job-tracking option for jobs using this scheduling server definition.

star schema

is an arrangement of database tables in which a large fact table that has a composite, foreign key is joined to several dimension tables. Each dimension table has a single primary key.

statistic column

a required OLAP summary role, which is a numeric column for storing computed summary statistics...these are the measures you want to analyze. You must have at least one statistic column, with an unlimited maximum. The values for the input column (analysis column) are used to compute the output summary statistics, which then become the values for the statistic column in the summary data store.

For example, you could add a column named MINSALES, assign it as a statistic column using the MIN statistic, then define a Mapping process to compute the derived statistic from an analysis column like SALES to the statistic column MINSALES.

Each statistic column has a specific keyword associated with it that specifies which statistic to compute.

Subject

is a grouping element for data related to one topic within a Data Warehouse. For example, a Data Warehouse might have a Subject called Products (information related to products) or Sales (information related to sales). Each Subject may be composed of a number of different data collections: detail data, summary data, charts, reports, and graphs.

summary data

is information that is derived from the facts in a data base. It is data that has been summarized after extraction from a source.

summary data store

in SAS/Warehouse Administrator, OLAP Tables or OLAP MDDBs that store crossings and statistic columns. As each crossing is accumulated, all records of the input data are sorted into groups by the values of the class columns included. Each group represents a specific subpopulation. Within each group, the summary statistics are calculated on all input records and the statistics are stored in statistic columns with the same name as the analysis columns they were derived from. A single record is written to the crossing for each subpopulation.

Summary Group

has been replaced by OLAP Group. These are provided for compatibility with earlier releases of SAS/Warehouse Administrator.

Summary MDDB (Multidimensional Database)

has been replaced by OLAP MDDB. These are provided for compatibility with earlier releases of SAS/Warehouse Administrator.

Summary Table

has been replaced by OLAP Table. These are provided for compatibility with earlier releases of SAS/Warehouse Administrator.

User Exit process

is a metadata record used to retrieve a user-written routine. The routine must be stored in a SAS catalog with an entry type of SOURCE or SCL. A User Exit routine often extracts or transforms information for a warehouse data store, but it could do many other tasks.

Warehouse Environment

See Data Warehouse Environment.

Previous Page | Next Page | Top of Page