space
Previous Page | Next Page

Preparing Data for Your Ready-Made Applications

Presummarizing Your Data

When you build a SAS/EIS application that uses detail data, some objects recalculate the summaries dynamically each time the object drills or expands. Recalculating the data can be essential for an EIS that uses data that changes minute by minute, so that users always get current results. However, if the data is more static (for example, if it is refreshed on a nightly, weekly, or monthly basis), the application might not need to calculate the summaries each time the user drills through the data.

By creating and using summarized tables, you can increase the performance of your application substantially. The summarized tables eliminate the need to run summaries at execution time. The presummarized data needs to be refreshed only when the underlying data changes. The process of building a summarized table or an MDDB summarizes the raw data, so SAS/EIS uses the term presummarized to refer to the data that is stored in these MDDBs or summarized tables.

Note:   In Version 9 and beyond, SAS/EIS supports only MDDBs that you create with the SAS/EIS MDDB object or the MDDB procedure. SAS/EIS does not support the new Version 9 cube structure that is created by the OLAP procedure.  [cautionend]

You can use two forms of presummarized data with SAS/EIS objects:

The following table shows the objects that support the two types of presummarized data:

MDDBs Summarized Tables
3D Business Graphs Comparison Reports
Bubble Charts Expanding Reports
Graphical Variance Reports Grouped Bar Charts
Maps Multicolumn Reports
Multidimensional Business Trends
Multidimensional Charts
Multidimensional Pie Charts
Multidimensional Reports
Organizational Charts

The following sections describe how you can create an MDDB or a summarized table to use with your EIS.


What Is an MDDB?

A multidimensional database (MDDB) is a specialized storage facility that enables data to be pulled from a data warehouse or other data sources for storage in a matrix-like format. The MDDB enables users to quickly retrieve multiple levels of presummarized data through a multidimensional view. An MDDB is not a SAS data file; instead of the traditional relational structure that is used for SAS tables, an MDDB stores its data as an NWAY cube and zero or more subcubes.

MDDB Structure

[MDDB Structure]

An NWAY cube is a multidimensional data model that specifies all the classification variables that can be used to define crossings. A crossing is a definition of one or more classification variables that exist in the data from which the MDDB is created. Each crossing represents a grouping on which summary statistics could be calculated.

Although an MDDB can consist of an NWAY cube only, it is usually created with one or more subcubes. Subcubes are created using simple statements of hierarchy. These statements define one or more additional specific crossings that are derived from the NWAY cube. Subcubes are built to enhance reporting speed and are projections of the most likely aggregates that business managers and other users will expect to see. If a subcube does not exist for a particular aggregate query -- that is, if no subcube defines the exact crossings that are required in order to answer the query -- the aggregate data will be derived from the smallest subcube that can provide the data. If no subcube can provide the data, it is derived from the NWAY cube.

By default, a minimum number of statistics are stored in the MDDB. When you create the MDDB, you can specify up to 8 statistics for each analysis variable. Depending on which of the 8 statistics are stored, up to 13 additional statistics can be calculated by SAS/MDDB Server software at run time, allowing for up to 21 available statistics.

Other characteristics common to all SAS MDDBs include the following:

For more information about updating MDDBs and about using MDDBs with SAS/EIS and SAS/AF, refer to the SAS/MDDB Server online Help or to the SAS OLAP Server Administrator's Guide, Release 8.1.

There are four ways that you can build an MDDB. You can use

The following sections provide instructions on how to build an MDDB using SAS/EIS and the MDDB procedure. You can choose one method over another based on the SAS software products that you use and with which you are most familiar.

Note:    In order to create an MDDB in Version 8, you must have SAS/MDDB Server software licensed and installed.  [cautionend]

Note:   For Version 9 and beyond, you must have SAS OLAP Server software licensed and installed in order to create an MDDB. This is because SAS/MDDB Server is now included as part of SAS OLAP Server software. SAS/EIS supports only MDDBs that you create with the SAS/EIS MDDB object or the MDDB procedure. SAS/EIS does not support the new Version 9 cube structure that is created by the OLAP procedure.  [cautionend]

For more information on using the SAS/MDDB Server classes and using SAS/Warehouse Administrator software to create MDDBs, refer to the SAS/Warehouse Administrator 2.0 User's Guide.

Building an MDDB with the SAS/EIS MDDB Object

This section provides instructions on how to use SAS/EIS software to build an MDDB. You supply information about the MDDB in a series of SAS/EIS windows. When you build an MDDB using SAS/EIS software, the MDDB is registered automatically in the metabase facility.

Note:    SAS/EIS uses the term column instead of variable to describe what you register in a repository.  [cautionend]

Before you create an MDDB in SAS/EIS, you must have a table registered with the ANALYSIS attribute and either CATEGORY and/or HIERARCH attributes.

To create an MDDB using the MDDB object, follow these steps:

  1. Double-click Build EIS in the EIS Main Menu to display the Build EIS window.

  2. In the Build EIS window, specify a path and an application database, if you have not previously done so. Click Add.

  3. In the Add window, select Data Access from the Object Databases list box and select Multidimensional database from the Objects list box. Click Build. The Multidimensional Database window appears, where you enter all the information needed to create an MDDB.

  4. In the Multidimensional Database window, type a name and, optionally, a description of the MDDB that you are creating. Select the right arrow beside the MDDB field to open the MDDB window, where you specify the repository in which the MDDB will be saved and the path information in which the MDDB will be registered.

  5. In the MDDB window, specify information on where to save and register the MDDB that you are creating. You must register the MDDB in a repository. Use the down arrow beside the Repository field to specify a repository. You can also use the Password protected check box to add password protection to the MDDB. Click OK to return to the Multidimensional Database window.

  6. Select the right arrow beside the Table field to open the Select Table window, where you specify the registered table to be used as input for the MDDB. Double-click a table in the Available list box to move it to the Selected list box. Click OK to return to the Multidimensional Database window.

  7. Select the right arrow beside the Dimensions field to open the Column Selection window, where you can select the dimension and analysis columns. Select a column in the Type list box to display the list of available columns. Double-click a column in the Available list box to move it to the Selected list box. Click OK to return to the Multidimensional Database window.

  8. Specify the statistics that will be stored in the MDDB for each analysis column. Click Customize, then click Statistic in the Attributes window. Select an analysis column, then select the statistics you want to store for that column. See the SAS/EIS online Help for a complete list of the available statistics.

    The statistics that you choose to store affect the statistics that you will be able to calculate when using the MDDB. For example, if you want to calculate the range of an analysis column, you must have previously stored the MIN and MAX statistics for that column in the MDDB.

  9. Click Advanced and use the Advanced window to specify the summary tables that are created for the MDDB. Creating a summary table for a particular combination of columns and analysis columns increases the speed with which SAS/EIS can retrieve the data for that combination. However, as the number of summary tables increases, the amount of storage space required also increases. By default, SAS/EIS creates a summary table for every possible combination of dimension columns.

  10. Click Subset and use the WHERE window to specify a subset for the data in the MDDB. If you subset the MDDB data, SAS/EIS creates a permanent view of the data with the subset applied.

    Note:   If you move the MDDB to another library or operating environment, you must also move the view to the same location. If you do not move the view, you can still use the MDDB, but you cannot use the "Show Detail" option on the viewers.  [cautionend]

  11. Click Create to create the MDDB and to register it in the repository. If you do not click Create, the MDDB is not created or registered until the application runs. When you are finished, click OK in the MDDB window to store the MDDB application.

CAUTION:
Re-executing the MDDB application will cause the MDDB to be re-created, overwriting any previous changes.

This includes re-executing the MDDB by editing the MDDB and clicking Create or Test in the Build EIS window, or by using the RUNEIS APPL=eis.app-name command.  [cautionend]


Building an MDDB with the MDDB Procedure

This section provides the syntax for the MDDB procedure and explains how to use the procedure to create an MDDB in the Program Editor window.

For more information on using the MDDB procedure to build an MDDB, refer to the SAS/MDDB Server online Help or to the SAS OLAP Server Administrator's Guide, Release 8.1.

Note:   After you create an MDDB, you need to register it in the SAS/EIS metabase facility in order to use it with SAS/EIS reports. For more information, refer to Registering an MDDB.  [cautionend]

Here is the syntax for the MDDB procedure:

PROC MDDB DATA=libref.dsname OUT=libref.outmddb <IN=libref.inmddb> <LABEL='description'> <PW=password> <VMEMSIZE=msize> <PKTSIZE=psize> <TOTALS=YES|NO>;
CLASS var1 var2 ... / <order-options>;
HIERARCHY class-var1 class-var2 ... / <NAME=name | "name"> <DISPLAY=YES|NO|NODATA> <TOTALS=YES|NO>;
ADDHIER class-var1 class-var2 ... / <NAME=name | "name"> <DISPLAY=YES|NO|NODATA> <TOTALS=YES|NO>;
REMOVEHIER class-var1 class-var2 ... / <NAME=name | "name">;
VAR var1 var2 ... / <stat-options>;
SUMVAR to-name anlvar / stat-name;
RUN;

PROC MDDB Statement

PROC MDDB DATA=libref.dsname OUT=libref.outmddb <IN=libref.inmddb> <LABEL='description'> <PW=password> <VMEMSIZE=msize> <PKTSIZE=psize> <TOTALS=YES|NO>;

You can use the following options in the PROC MDDB statement:

DATA= libref.dsname

specifies the SAS table that contains the CATEGORY and ANALYSIS columns to be used as the source for the MDDB. If you do not specify a table name, the most recently created table is used. This option is required.

OUT= libref.outmddb

specifies the name of the output MDDB that you are creating and registering in a repository. This option is required.

IN= libref.inmddb

specifies the existing MDDB. Use the IN= option for an incremental update of an MDDB. Use the DATA= option to specify the name of the table that contains the incremental data that will be added to the data in the input MDDB. Finally, use the OUT= option to specify the name of the output MDDB to which the data in IN= and DATA= will be written. This parameter is optional.

LABEL= 'description'

specifies a description to be stored with the MDDB. The character description string can be up to 256 characters long and must be enclosed in quotation marks. This parameter is optional.

PW= password

specifies a password that is to be associated with the MDDB. The password must be no more than eight characters and is not case sensitive. Any passwords that are specified in the MDDB name will override the password that is specified as an option in the PROC MDDB statement. This parameter is optional.

You can specify READ, WRITE, and ALTER passwords using the same syntax as for data sets. For examples of specifying passwords, refer to the SAS/MDDB Server online Help or to the SAS OLAP Server Administrator's Guide, Release 8.1.

VMEMSIZE= msize

specifies the maximum amount of memory (in megabytes) that the MDDB procedure will use to keep the analysis and the statistics data in memory. Setting VMEMSIZE to a small value (for example, VMEMSIZE=2 ) allows more memory for your indexes (the class variable values and the keys). However, the smaller the value of VMEMSIZE, the more time is spent swapping data. Use the VMEMSIZE option if you need to build a cube that is larger than the available memory.

Note that you must have enough real memory available to hold each subtable, including the NWAY, while each subtable is being summarized. The default value of 0 indicates no restriction. This parameter is optional.

PKTSIZE= psize

specifies the maximum amount of memory (in kilobytes) that is to be swapped whenever a value has been specified for the VMEMSIZE= option. This is especially important during remote file transfers when a smaller block size will decrease the overall net traffic. The block size should never be below eight kilobytes. The default value is 1024 kilobytes. This parameter is optional.

TOTALS=YES|NO

specifies that totals for the NWAY cube and all subcubes are stored with the MDDB. This reduces reporting time but increases the size of the MDDB and the time that is required to create it. The default value is NO. To store totals only for specific subcubes, use the TOTALS= option on the HIERARCHY statement and omit the TOTALS= option on the PROC MDDB statement. This parameter is optional.

For examples of using the TOTALS= option, refer to the SAS/MDDB Server online Help.


CLASS Statement

CLASS var1 var2 ... / <order-options>;

Use the CLASS statement to specify the columns from the base table that are to be used as the classification variables in the MDDB. You can specify one or more CLASS statements. However, a given variable can appear only once in all CLASS statements. The class variable can be either numeric or character. If you do not specify a sort order, ASCENDING is used. You can specify any of the following order options:

You can also specify a different sort order for each CLASS variable. To do this, use a separate CLASS statement for each variable to be sorted.

HIERARCHY Statement

HIERARCHY class-var1 class-var2 ... / <NAME=name | "name"> <DISPLAY=YES|NO|NODATA> <TOTALS=YES|NO>;

Use the HIERARCHY statement to define one or more subcubes to be stored in your MDDB by using a HIERARCHY statement. If you do not specify a hierarchy, only the NWAY cube hierarchy is stored in the MDDB. You can specify multiple CLASS variables; however, you can specify only one CLASS variable in each HIERARCHY statement.

Note:   In previous releases of SAS/MDDB Server, you could request the same hierarchy multiple items in a PROC MDDB step, and the data was stored for each request. In Version 8 and later releases, if you request the same hierarchy more than once in a PROC MDDB step, you will receive a message indicating that the request is a duplicate and that data for the hierarchy will not be stored.  [cautionend]

You can use the following options in the HIERARCHY statement:

NAME=name | "name"

specifies a name for the hierarchy. If the name contains a space or blank, it must be enclosed in quotes. If you do not specify a name for your hierarchy, the default name HIERn is used, where n is a number (beginning with 1).

DISPLAY=YES|NO|NODATA

only has an effect when someone chooses to register this MDDB in a SAS/EIS repository. At that time, a value of YES is interpreted to mean that the specific hierarchy should be registered as a drill hierarchy. The default value of NO indicates that this hierarchy should not be specifically registered. The NODATA value means that only the SAS/EIS metadata is stored; no cell data is stored.

Display hierarchies are automatically assigned the HIERARCH attribute when the MDDB is registered in a repository. If you want to create DISPLAY hierarchies that store only the metadata used by SAS/EIS (and not the cell data from the hierarchy), specify DISPLAY=NODATA.

TOTALS=YES|NO

specifies that totals be stored for a specific hierarchy and its related hierarchies. A value of YES specifies that totals are stored with the MDDB for this hierarchy and for any subsequent hierarchies that are generated from it. This reduces reporting time, but increases the size of the MDDB. The default value of NO specifies that totals are not stored for this hierarchy.


ADDHIER Statement

ADDHIER class-var1 class-var2 ... / <NAME=name | "name"> <DISPLAY=YES|NO|NODATA> <TOTALS=YES|NO>;

The ADDHIER statement enables you to update an existing MDDB by adding one or more hierarchies. The syntax and requirements for the ADDHIER statement are exactly the same as those for the HIERARCHY statement. The ADDHIER statement is valid only when you are updating an MDDB. The ADDHIER statement is optional. If you choose to use it, you can specify one or several ADDHIER statements.


REMOVEHIER Statement

REMOVEHIER class-var1 class-var2 ... / <NAME=name | "name">;

The REMOVEHIER statement enables you to update an existing MDDB by removing one or more hierarchies. The syntax and requirements for the REMOVEHIER statement are similar to those for the HIERARCHY statement, except that the DISPLAY= or TOTALS= options are not supported for the REMOVEHIER statement. The REMOVEHIER statement is valid only when you are updating an MDDB. You can specify one or several REMOVEHIER statements. The REMOVEHIER statement is optional.

If you specify the NAME= option on the REMOVEHIER statement, only the hierarchy whose name matches the name value will be removed. Otherwise, all hierarchies that contain the specified classifiers will be removed.

For detailed information about updating an MDDB using the MDDB Procedure, refer to the SAS OLAP Server Administrator's Guide, Release 8.1.


VAR Statement

VAR var1 var2 ... / <stat-options>;

The VAR statement enables you to specify columns from the base table to be used as the analysis columns in the MDDB. You can specify one or more VAR statements. However, a given column can appear only once in all VAR statements. The variables must be numeric. If you do not specify a statistic, SUM is used.

Use the stat-options in the VAR statement to specify the statistics to be stored for each analysis variable. Separate each statistic with a space. You can specify any of the following statistics options:


SUMVAR Statement

SUMVAR to-name anlvar / stat-name;

The SUMVAR statement enables you to use a summary data set that was produced by PROC SUMMARY or PROC MEANS as an input data set. You use a SUMVAR statement instead of a VAR statement. A CLASS statement is required when you use a SUMVAR statement. A SUMVAR statement is required for each analysis variable that is read from the summary data set. You can use multiple SUMVAR statements.

You can use the following options in the SUMVAR statement:

to-name

is the name of the analysis variable in the MDDB that is being created.

anlvar

is the name of a numeric variable from the summary data set.

stat-name

is a statistic name from one of the following: MIN, MAX, N, SUM, NMISS, USS, SUMWGT, or UWSUM.

When an analysis variable in the new MDDB has more than one statistic computed for it, specify one SUMVAR statement for each statistic. Specify the same to-name parameter for each of these SUMVAR statements.

Use the DATA= option on the PROC MDDB statement to specify the SUMMARY data set. The other PROC MDDB options and statements (for example, OUT=, PW=, LABEL=, and CLASS) work as before. Any HIERARCHY or VAR statements are ignored.

For more information about filling an MDDB from a SUMMARY data set, see the SAS/MDDB Server online Help.


Creating a Summarized Table with the SUMMARY Procedure

Summarization aggregates the data for a certain drill hierarchy and creates a new table that has precalculated values for all levels of a given drill hierarchy. You can use either PROC SUMMARY or PROC MEANS to produce a summarized table.

The SUMMARY procedure creates a SAS table that contains summary statistics. It aggregates the data separately by particular groups of data. Aggregation can be done to calculate totals (SUM), minimum values (MIN), maximum values (MAX), or averages (MEAN). For example, if you create a summarized table using the SUM statistic, the top level of the data in the hierarchy holds the sum of all values for lower level hierarchy data. If you create a summarized table using the MIN statistic, the top level of the data in the hierarchy holds the minimum of all lower level data.

A typical PROC SUMMARY statement is shown here:

PROC SUMMARY
DATA mylib.mydata
MISSING
ORDER=data;
CLASS leveln leveln-1 ... level1;
VAR var1 var2 ...;
OUTPUT out=mylib.sumdata
SUM=<stat-options>;
where
DATA=mylib.mydata

names the table that contains a number of drill and analysis variables.

MISSING

includes missing values in the summarization process. This is optional.

ORDER=data

specifies the sort order for the data. By default, ORDER=INTERNAL is assumed. You can specify any of the following order options:

  • EXTERNAL

  • INTERNAL

  • FORMATTED.

CLASS leveln leveln-1 ... level1

specifies the variables for the corresponding hierarchy levels.

Note:   If you want the drill hierarchy to be

  1. level1

  2. level2

  3. leveln

then specify the corresponding variables in the CLASS statement as follows:

CLASS leveln leveln-1 ... level1.  [cautionend]

VAR var1 var2 ...

specifies all variables that should be aggregated in the VAR statement. Typically, these are the variables that you register in the repository with the ANALYSIS attribute.

OUTPUT out=mylib.sumdata

names the output table that you register in the repository.

SUM=stat-options

names the statistic to be used for aggregation of the variables in the VAR statement. Possible values are

  • MEAN=

  • MIN=

  • MAX=

  • RANGE=

  • STD=

  • STDERR=

  • SUM=

space
Previous Page | Next Page | Top of Page