The OLAP Procedure

AGGREGATION Statement

Defines an aggregation of the cube based on level information that you provide. You can specify level names that are associated with an unsummarized data source, or you can specify level names that match columns in a table that contains existing aggregated data. The levels can exist in more than one dimension. You do not need to include dimension names, because level names must be unique across dimensions.

Syntax

AGGREGATION level-name <level-name2 level-name3 ...level-nameN> / <option(s)>;

Details

Required Arguments

level-name: is the level that is to be used to create the aggregation. Additional level names are optional. Names are separated by spaces. Names are separated from option specifications with a required slash character ( / ). You do not have to include all levels that are specified in all HIERARCHY statements, but the names that you do specify must match the names that are used in the HIERARCHY statements. You can include a TABLE= option to identify a table that contains existing aggregated information for your specified levels. The levels that you specify must match columns in the input table.

Levels must be listed in drill-path order. You cannot specify an aggregation that contains a summary level that could never be requested. For example, if your TIME hierarchy contains the levels Year, Month, and Day, you could specify Year and Month as an aggregation, but not Month by itself.

Options

Note: For a list of the options that can be used to optimize cube creation and query performance, see Specialized Options for PROC OLAP.

COMPRESS | NOCOMPRESS: specifies whether to store the aggregation table in a compressed format on disk. The default value is NOCOMPRESS.

DATAPATH=('pathname' ...'pathnameN'): specifies the location of one or more partitions in which to place aggregation table data. The data is distributed by cycling through each partition location according to the partition size. This is set by using the PARTSIZE= option. For example, if you specify DATAPATH=('C:\data1' 'D:\data2'), then PROC OLAP places the first partition of the aggregation table into directory C:\data1, the second partition of the table into directory D:\data2, the third partition of the table into C:\data1, and so on. It is also possible to have aggregation tables that use fewer than the specified number of partitions. For example, your aggregation table might fit entirely into C:\data1.

The default value is the cube subdirectory of the location that is specified by the PATH= option in the PROC OLAP statement.

INDEX | NOINDEX: specifies whether to create the specified aggregation with indexes. For faster cube creation and updates, you can set this option to NOINDEX. However, the lack of indexes might adversely affect query performance. The default value is INDEX.

Note: Indexes are not created for aggregations that have fewer than 1,024 records.

INDEXPATH=('pathname' ...'pathnameN'): specifies the locations of the index component files that correspond to each aggregation table partition as specified by the DATAPATH= option.

NAME='aggregation-name': specifies a maximum of 256 characters as the name of the aggregation. If the name includes blank spaces or any characters that are not permitted in a valid SAS name, then the name must be enclosed within quotation marks. The name is stored with the cube's metadata. The default value is a name assigned by SAS, such as AGGR1.

PARTSIZE=partition-size: specifies the partition size in megabytes of the aggregation table partitions and their corresponding index components. The default value is a

SEGSIZE=rows-per-segment: specifies the number of observations (table rows) in the file segment of the index component. The value is expressed in multiples of 1,024. The minimum value is 1 (1,024 rows). The segmented indexes are used to optimize the processing of WHERE expressions. Each parallel thread is given a segment of the table to evaluate that is equal to the value of the SEGSIZE= option multiplied by 1,024. The default value is 8 (8 x 1,024 = 8,192 rows).

The value of this option overrides for the current aggregation any such value that was specified for all aggregations in the PROC OLAP statement.

TABLE=libname.dataset: specifies the name of a SAS data set or data view that contains the data for one aggregation. Every level that is listed in the AGGREGATION statement must match a column that contains aggregation information in the specified table. Place this option after the list of level names.

Analysis columns in the table are mapped to the numeric columns that are specified with the AGGR_COLUMN= option in MEASURE statements.

You can also set data set options with the TABLE= option. Options are stored within the cube and reapplied when the data is accessed at run time. For more information, see “Data Set Options” in SAS Language Reference: Concepts. You cannot use the TABLE= option in an AGGREGATION statement that is used to add an aggregation to an existing cube.

Example

Here is an example of an AGGREGATION statement that specifies three levels and uses the NAME= option. The slash character ( / ) is required to separate level names from option specifications.

aggregation country prodtype year /
   name='Product Types by Country';