SUPPORT / SAMPLES & SAS NOTES
 

Support

Usage Note 19765: Understanding the Result Set Size from a query on a SAS OLAP cube

DetailsAboutRate It
The following information describes how the structure of a SAS OLAP
cube and the report design can impact the result set size and therefore
the memory usage of a query.

Suppose a cube has the following structure. Each dimension contains a
single level. The cardinality of the level for this query is included
next to the dimension name.

   Dimension A - 45   unique members
   Dimension B - 60      "      "
   Dimension C - 20      "      "
   Dimension D - 10      "      "
   Dimension E - 15      "      "

   Measures - 10 measures

First design a report with the five dimensions placed on rows and the
measures placed on columns. The set size can be calculated with this
formula:

  Axis Set Size =
     Number of Dimensions *
        (Cardinality of Dimension 1 *
         Cardinality of Dimension 2 * ... *
         Cardinality of Dimension N)

  Result Set Size = Axis 1 Set Size + Axis 2 Set Size + ... + Axis N Set
  Size

The set for this report would then be:

  Row Set Size = 5 * (45 * 60 * 20 * 10 * 15) = 40,500,000 tuples
  Column Set Size = 1 * (10) = 10 tuples

  Result Set Size = 40,500,000 + 10 = 40,500,010

However, by moving the two smallest dimensions to the columns axis, the
set size and memory usage can be significantly decreased:

  Row Set Size = 3 * (45 * 60 * 20) = 162,000 tuples
  Column Set Size = 3 * (10 * 10 * 15) = 4,500 tuples

  Result Set Size = 162,000 + 4,500 = 166,500 tuples

As you can see, proper distribution of the dimensions is essential for
an efficient query. Alternatively, if the levels do have high
cardinality, you can impose subsets by creating logical buckets. These
would allow you to navigate down into smaller portions of the data, and
limit the result set size being requested. For example, if you have user
IDs from 10000 to 99999, you might consider using logical buckets based
on several levels of detail. For example, groups of 10000, then groups
of 1000, then groups of 100, and so on.

If a measure is defined to use the NUNIQUE statistic, you must also 
consider the size of the set that is requested by this measure. The
number of unique values for the level might be the maximum set size for 
your particular query.


Operating System and Release Information

Product FamilyProductSystemSAS Release
ReportedFixed*
SAS SystemSAS OLAP ServerTru64 UNIX9.1 TS1M0
64-bit Enabled AIX9.1 TS1M0
OpenVMS Alpha9.1 TS1M0
64-bit Enabled HP-UX9.1 TS1M0
HP-UX IPF9.1 TS1M0
Linux9.1 TS1M0
z/OS9.1 TS1M0
64-bit Enabled Solaris9.1 TS1M0
Microsoft Windows 2000 Advanced Server9.1 TS1M0
Microsoft Windows 2000 Datacenter Server9.1 TS1M0
Microsoft Windows 2000 Professional9.1 TS1M0
Microsoft Windows 2000 Server9.1 TS1M0
Microsoft Windows Server 2003 Datacenter Edition9.1 TS1M0
Microsoft Windows Server 2003 Enterprise Edition9.1 TS1M0
Microsoft Windows Server 2003 Standard Edition9.1 TS1M0
Microsoft® Windows® for 64-Bit Itanium-based Systems9.1 TS1M0
Microsoft Windows NT Workstation9.1 TS1M0
Microsoft Windows XP Professional9.1 TS1M0
* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.