Planning for SAS OLAP Cubes |
A cube is always loaded from data in relational tables. This data can be stored either in SAS tables or in external RDBMSs and can be accessed through a wide selection of SAS data engines, including the Base SAS engine, SPD Server, SPD Engine, and the SAS/ACCESS engines to external RDBMSs. SAS software enables you to be independent of the physical storage format of the data. When building a SAS OLAP cube, you must consider the format that your source data is stored in and how that source data is to be used in the SAS OLAP cube. Cubes can be loaded from data that is contained in any of the following types of tables:
detail tables
fact tables and dimension tables (star schema)
summary tables
aggregation tables
drill-through tables
Detail Tables |
A detail, or base, table is any table defined in the SAS metadata that contains the measures and levels for a cube. A detail table consists of unsummarized data that must include one column for each level and one numeric analysis column for each set of measures that will be generated. A detail table includes all the columns that are needed to load the hierarchy levels, the level properties, and measures. Specifically, a detail table must contain the following elements:
one column per dimension level (char or num)
one column for each property (char or num)
numeric analysis variables
Fact Tables and Dimension Tables (Star Schema) |
A star schema consists of a single fact table and one or more dimension tables. The fact table must contain one numeric analysis column for each set of measures that will be generated. You cannot have more than one property value for each distinct value of the corresponding level. To successfully load a cube, all foreign keys in the fact table need to have a corresponding primary key in a dimension table. A star schema configuration is organized with the following requirements and considerations:
The input columns for data measures must be stored in the fact table.
The input columns for dimension levels and properties must be stored in one table for each dimension.
The data records of the fact table and the dimension tables must be linked via primary and foreign keys.
If the dimension levels are defined in a dimension table, all the level columns for that dimension must be contained in the same dimension table.
Both the dimension keys and fact keys are single columns, not combinations of columns.
A dimension can be in the fact table. In this case, all the level columns are in the fact table and no fact or dimension key is required.
The dimension key can also be a level in the dimension.
The fact table requires the following conditions:
one key column with a foreign key for each dimension (character or numeric)
numeric analysis variables
A dimension table requires the following conditions:
one key column with a primary key (character or numeric)
one column for each dimension level (character or numeric)
one column for each property (character or numeric)
Summary Tables |
The summary table is also known as the N-way data set. It is a table that is already summarized and requires the following elements:
one column per dimension level (character or numeric)
one column for each property (character or numeric)
one column per stored measure, summarized by the appropriate statistic for the measure
Aggregation Tables |
In addition to a summary table for the N-way aggregation, summary tables can also be provided for other aggregations. Aggregation tables contain presummarized data for any combination of dimension levels. This feature enables cubes to access summarized numbers quickly. Aggregated data can be created and automatically stored in MOLAP (Multidimensional OLAP) tables as the cube is built or manually stored in ROLAP (Relational OLAP) tables and linked into the cube.
All aggregation tables must contain a column for each measure in the cube where the statistic for the measure is one of the following: N, NMISS, SUM, MAX, MIN, or USS. An aggregation table can be used in two ways:
as an NWAY data source for the cube. In this case, the table must contain a column for every level in the cube and a column for every stored measure.
as a subaggregation for the cube. In this case, the table must include a column for each level of the aggregation and a column for every stored measure.
When planning your aggregation tables, note that the column names and attributes (type, format) must be consistent across all input tables for a cube. Aggregation tables require the following structure and organization.
one column for each dimension level in the aggregation
one column per measure
Drill-Through Tables |
Many SAS OLAP applications give users the ability to select a cell or a range of cells and then view the input data that the cell data was summarized from. Drill-through tables can be used by client applications to provide a view from processed data into the underlying data source. Drill-through tables are views that represent all of the input data used to define a cube. Assigning a drill-through table to a cube is optional.
When you are evaluating the data source for a drill-through table you should consider the following requirements:
The drill-through table must have the same columns with the same attributes (name, type, format) as the tables that the cube was loaded from.
In many cases, the detail table is also the drill-through table.
For a star schema, a drill-through table is created with a view that fully joins the star schema. For a summary table, that summary table can be used as a drill-through table.
Note: When you select a data table for drill-through, you may need to consider user access and security restrictions for that table. For further information, see Security for Drill-through Tables.
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.