The OLAP Procedure |
The PROC OLAP statement specifies the input data source, cube name, and path. This statement can also be used to do the following:
specify options that might improve query performance
delete cubes
specify global settings for handling missing hierarchy members in ragged and unbalanced hierarchies
perform actions on existing cubes
change nonstructural cube elements
update cube data
coalese partition aggregations
PROC OLAP <option(s)>; |
Options |
Note: For a list of the options that can be used to optimize cube creation and query performance, see Options Used for Performance.
specifies an incremental update to a cube. The ADD_DATA option enables you to add new members and data to a cube. You can update a cube in-place or create generations of the cube. The new data that the cube is updated with is specified with the DATA= option. Here is an example of the ADD_DATA option used with PROC OLAP:
proc olap data=mylib.newdata cube=cubeA add_data outcube=cubeB outschema=testSchema; metasvr host="myhost" port=8561 repository=myrepository olap_schema=prodSchema; run;
The ADD_DATA option requires that either UPDATE_IN_PLACE, OUTCUBE=, or OUTSCHEMA= also be specified. The DATA= or FACT= option is required to specify a table with the new input records.
Note: You cannot use ADD_DATA with cubes that have been created with the NONUPDATEABLE option.
See Updating SAS OLAP Cubes for further information.
specifies a limit on the number of indices that will be created in parallel during the cube build process. By default, all indices for an aggregation in the cube are created asynchronously. The number of indices built is based on the number of hierarchies present in the aggregation. Therefore, if your cube contains a large number of hierarchies and your machine has limited resources, you may find that limiting the indices created asynchronously can result in better cube build performance. Use a value greater than zero to set this limit. The default value is 0. There is no limit, as all indices for an aggregation are created asynchronously.
Note: The NWAY is the largest aggregation in the cube and contains data for ALL hierarchies.
specifies that all aggregations in a cube will be coalesced. When a cube is updated, multiple aggregation partitions (separate SPDE tables)) are produced by one or more cube update (ADD_DATA) operations. The more partitions an aggregation has, the slower the access time will be. To prevent reduced query performance on a cube, these aggregation partitions can be coalesced (combined) back into a single aggregation table. Coalescing updated aggregations periodically keeps the number of aggregation partitions at a minimum. When an aggregation is coalesced, all existing partitions are grouped together to create a single SPDE table containing data from all partitions. This becomes the new aggregation table and the old partitions are deleted. Only aggregations with more than one partition are coalesced. All others are ignored (and a warning message is printed to the log).
Only MOLAP aggregations (those aggregations created by PROC OLAP) can be coalesced. No ROLAP aggregations are coalesced. If a ROLAP aggregation is specified in a COALESCE_AGGREGATION statement, that statement is ignored and a warning message is printed to the log.
specifies that the cube build will include an additional summarization step that is designed to decrease the size of the NWAY aggregation and improve viewing performance. The amount of improvement depends on the nature of the data. The cubes that improve the most are those that have the largest number of rows that can be included in the additional summarization step.
Candidates for compaction are cubes that are built from star schema, where the cube does not define levels for all columns in all dimension tables. This can result in a fact table that contains many rows that belong to the same leaf member of a given hierarchy. These are the rows that are summarized to decrease the size of the NWAY aggregation.
For example, assume that a cube is built from a star schema that contains a Time dimension table. The Time table contains columns for year, quarter, month, and day, along with a primary key column. If the cube is defined so that the day column is not specified as a level of a time hierarchy, then there are up to 31 key values that refer to each unique combination of year, quarter, and month. Together, these key values define a unique leaf member of that hierarchy. These are the values that are summarized at build time.
The amount of compaction in the NWAY aggregation is determined by the number of source rows that can be summarized. The number of summarized rows depends on the number of unique key values in the fact table that refer to the same leaf member of a hierarchy. Another compaction factor is the number of rows in the fact table that contain unique combinations of keys; these rows are not compacted.
specifies whether or not to store the aggregation tables in a compressed format on disk.
Note: This option applies to the automatically created NWAY and all aggregations that do not explicitly specify a COMPRESS option in the AGGREGATION statement.
Default: | NOCOMPRESS |
specifies the maximum number of aggregations to create in parallel. This option does not apply to the NWAY aggregation, which is always built first (unless the NO_NWAY option is set).
Default: | 2, which is based on the results of a special algorithm that takes into consideration the number of aggregations that are being created and the number of processors that are available. The algorithm assumes that CPU resources should be reserved for creating aggregation indexes. |
Tip: | So that each built index has a fair share of the assigned INDEXSORTSIZE memory, INDEXSORTSIZE is divided by the CONCURRENT value. The value of INDEXSORTSIZE should give each concurrent index build enough memory to at least hold a table PARTSIZE. For best performance, INDEXSORTSIZE divided by CONCURRENT should be greater than PARTSIZE. |
specifies a valid SAS name for the cube to be created, renamed, or updated. The folder path that is specified is a metadata folder path. It exists only in the metadata. In SAS 9.2, cubes are identified both by their location in an OLAP schema and by their association with a metadata folder. By default, if no folder-path is given, the cube will be associated with the folder that contains the OLAP schema. When you specify a metadata folder, you can optionally add the metadata type (CUBE) after the cube name. For naming guidelines, see Naming Guidelines and Rules for the SAS OLAP Server.
Note: The cube name must be unique within the OLAP schema and the metadata folder. The specified metadata folder path must exist.
Here are some examples of the CUBE= option.
Example 1
proc olap cube='MYCUBE' /* other options */; metasvr ... olap_schema='SASApp - OLAP Schema'; /* further statements */ run;This example creates a cube named MYCUBE in the OLAP schema SASApp - OLAP Schema and also surfaces the cube in that schema's metadata folder. It is located by default in /Shared Data/SASApp - OLAP Schema:
Example2
proc olap cube='/Shared Data/Cubes/MYCUBE(cube)' /* other options */; metasvr ... olap_schema='SASApp - OLAP Schema'; /* further statements */ run;This example creates a cube named MYCUBE in the OLAP schema SASApp - OLAP Schema and also surfaces the cube in the metadata folder /Shared Data/Cubes.
specifies the data source for the cube. The unsummarized data source can be any SAS data file, including files that are supported by SAS/ACCESS software engines. If you load the cube from a star schema, then the dsname is the name of the fact table that contains the analysis variables from which to derive the measures for the cube. The fact table must also contain fact keys that correspond to dimension tables in the star schema.
You can also provide data set options along with DATA | FACT=. Options are stored within the cube and reapplied when the data is accessed at run time. For more information, see "Data Set Options" in SAS Language Reference: Concepts.
Note: This option is not required if you want to define the cube by using input data from a fully summarized external data source (a crossing of all dimensions known as an NWAY). In that case, you specify the data source for the cube by using the TABLE= option in the AGGREGATION statement.
Interaction: |
If you load the cube from a star
schema, then you must use the DIMENSION statement to do the following:
|
specifies the location of one or more partitions in which to place aggregation table data. The data is distributed by cycling through each partition location according to the partition size (set using the PARTSIZE= option). For example, if you specify DATAPATH=('C:\data1' 'D:\data2'), then PROC OLAP places the first partition of each aggregation table into directory C:\data1, the second partition of each table into directory D:\data2, the third partition of each table into C:\data1, and so on. It is also possible to have aggregation tables that use fewer than the specified number of partitions. For example, your cube might contain an aggregation table that fits entirely into C:\data1.
Note: This option applies to the automatically created NWAY and all aggregations that do not explicitly specify a DATAPATH= option in the AGGREGATION statement.
Default: | The cube subdirectory of the location that is specified by the PATH= option in the PROC OLAP statement |
deletes the physical cube that is specified with the CUBE= option. It also deletes the cube's definition, which is stored in the metadata repository.
If either the physical cube, its registration, or both are not present, then the DELETE option behaves as explained in the following table:
Note: The use of the DELETE option will remove all information about a cube, including security information and information maps.
Physical Cube Exists | Registration eExists | Behavior of DELETE Option |
---|---|---|
No | Yes | The physical cube is not deleted. The registration is deleted. If there is a registration, and you use the DELETE option, the registration is always deleted and you cannot recreate the cube from the registration. You can only recreate the cube from the registration when you use the DELETE_PHYSICAL option. |
No | No | Fails because there is nothing to delete. |
Yes | No | Fails because the cube cannot be located without its registration information. You must manually delete the cube files. |
deletes the physical cube that is specified with the CUBE= option but leaves the cube definition intact. This enables you to build a new cube based on the saved cube definition. With the DELETE_PHYSICAL option, the cube registration is maintained in the metadata. As a result, you can change the definition of the cube without changing the cube name or file path. The physical cube must be removed before that cube can be rebuilt.
Note: The use of the DELETE_PHYSICAL option is preferable when rebuilding a cube, because it preserves information about a cube, including security information and information maps.
If either the physical cube, its registration, or both are not present, then the DELETE_PHYSICAL option behaves as explained in the following table:
Physical Cube Exists | Registration Exists | Behavior of DELETE_PHYSICAL Option |
---|---|---|
No | Yes | A warning message is given that there is no physical cube to delete. |
No | No | Fails because there is nothing to delete. |
Yes | No | Fails because the cube cannot be located without its registration information. You must manually delete the physical cube files. |
specifies up to 200 characters to be stored as descriptive text. If the text includes blank spaces or any characters that are not permitted in a valid SAS name, then enclose the text within quotation marks.
Note: Cubes that are built with a cube description that is longer than 200 characters cannot be fully registered in the SAS Metadata Repository. The cube description will be truncated to 200 characters.
specifies an optional drill-through table. Drill-through tables can be used by client applications to provide a view from processed data into the underlying data source. You can specify the DATA | FACT= table or a different table that includes the necessary data and columns.
You can also specify data set options with this option. Options are stored within the cube and reapplied when the data is accessed at run time. For more information, see "Data Set Options" in SAS Language Reference: Concepts.
specifies the quoted text string that identifies members of character levels that are to be skipped or disregarded. Members are skipped in order to create ragged or unbalanced hierarchies, as described in Defining Ragged and Unbalanced Hierarchies for a Dimension. The maximum length of the quoted string is 256 characters.
To be skipped, a member in a character level must have a caption whose value matches the value of the EMPTY_CHAR= option. For example, if a member in a character level is skipped, and if the caption of that member is Empty, then the EMPTY_CHAR= option is specified as follows:
empty_char='Empty'
Interaction: | When specified in the PROC OLAP statement, the EMPTY_CHAR= option can be overridden by the EMPTY_CHAR= or IGNORE_EMPTY options in a HIERARCHY statement or by the EMPTY= or IGNORE_EMPTY options in a LEVEL statement. |
To skip members in numeric levels, use the EMPTY_NUM= option.
specifies the quoted text string that identifies members of numeric levels that are to be skipped or disregarded. Members are skipped in order to create ragged or unbalanced hierarchies, as described in Defining Ragged and Unbalanced Hierarchies for a Dimension. The maximum length of the quoted string is 256 characters.
To be skipped, a member in a numeric level must have a caption whose value matches the value of the EMPTY_NUM= option. For example, if a member in a numeric level is skipped, and if the caption of that member is Empty, then the EMPTY_NUM= option is specified as follows:
empty_num='.'
Interaction: | When specified in the PROC OLAP statement, the EMPTY_NUM= option can be overridden by the EMPTY_NUM= or IGNORE_EMPTY options in a HIERARCHY statement or by the EMPTY= or IGNORE_EMPTY options in a LEVEL statement. |
Note: If there is no format that is associated with the member value, then BEST12 is used as the format.
To skip members in character levels, use the EMPTY_CHAR= option.
specifies the ESRI map server to which this cube should be linked. The map server must already be defined on the metadata server. The ESRI map server must be an ArcGIS server defined on the metadata server. The name will be unique across all ArcGIS servers defined on the metadata server. There can only be one map server per cube.
indicates that any existing associations to ESRI metadata should be replaced with those specified in this PROC OLAP session. If no ESRI options are provided, all existing linkages are removed. If ESRI_REPLACE is not specified and there is a conflict between existing associations and those specified in this session, an error results and no associations are added or changed.
when specified while building a cube from a star schema, causes SAS to ignore an error condition, log the error, and continue building the cube. The error condition is detected when the fact table contains foreign key values that are not present in one of the contributing dimension tables. By default, and when this option is not specified, any missing dimension keys stop the build of the cube. When IGNORE_MISSING_DIMKEYS=TERSE is specified, the cube build continues and the fact table row with the missing key is ignored (it is not built into the cube). The SAS log receives an entry that lists the total number of key values that are missing from each dimension table. Specifying a value of VERBOSE produces the same behavior, except that the log receives additional details; the missing keys are listed for each dimension table.
specifies whether or not to create the aggregations with indexes. For faster cube creation and adding and deleting aggregations, you can set this option to NOINDEX. However, the lack of indexes will adversely affect query performance.
Note: This option applies to the automatically created NWAY and all aggregations that do not explicitly specify an INDEX option in the AGGREGATION statement.
Default: | INDEX |
specifies the locations of the index component files that correspond to each aggregation table partition as specified by the DATAPATH= option.
Note: This option applies to the automatically created NWAY and all aggregations that do not explicitly specify an INDEXPATH= option in the AGGREGATION statement.
Note: Indexes are not created for aggregations that have fewer than 1,024 records.
Default: | The cube subdirectory of the location that is specified by the PATH= option in the PROC OLAP statement |
specifies the amount of memory in megabytes that is available when aggregations are being created.
Default: | The system's available memory |
Tip: | So that each built index has a fair share of the assigned INDEXSORTSIZE memory, INDEXSORTSIZE is divided by the CONCURRENT value. The value of INDEXSORTSIZE should give each concurrent index build enough memory to at least hold a table PARTSIZE. For best performance, INDEXSORTSIZE divided by CONCURRENT should be greater than PARTSIZE. |
specifies the number of times proc OLAP will attempt to reconnect to the metadata server after a connection is lost. The value must be between 0 and 25. The default is 1, which indicates a single attempt to reconnect. A value of 0 means that PROC OLAP will not attempted to reconnect.
specifies the maximum number of seconds to wait before PROC OLAP attempts to reconnect to the metadata server. The time to wait (initially MIN_RETRY_WAIT) will be doubled each time a reconnect fails until this limit is reached. The value must be between 1 and 3600. The default value is the greater of MIN_RETRY_WAIT and 30.
specifies the maximum number of threads that are used to asynchronously create the aggregation indexes. The processing engine calculates how many threads are needed based on the number of indexes that are being created and the INDEXSORTSIZE= value. This option sets a limit on the number of threads regardless of the number that is calculated by the processing engine. However, if the processing engine determines that fewer than the maximum number of threads is needed, then only the calculated number of threads are used.
Default: | The value of the SAS system option SPDEMAXTHREADS or 0. If the value is 0, then the processing engine determines the number of threads based on the number of indexes that are created plus the available memory. The maximum value is 65,536 threads. |
specifies the initial number of seconds to wait before PROC OLAP attempts to reconnect to the metadata server. This delay applies to a second attempt to reconnect to the metadata server. There is no delay for the first attempt to reconnect. The value must be between 1 and 3600 (one hour). The default value is the lesser of MAX_RETRY_WAIT and 30.
prevents PROC OLAP from automatically creating an NWAY aggregation (the crossing of all dimension levels) for the new cube. The automatically created NWAY is usually the largest in the cube and most resembles the content of the unsummarized data source.
Interaction: | If you use this option, then the input data source that is specified with the DATA= or FACT= option must be available at run time; otherwise, queries that are not covered by other aggregations will fail. |
specifies that the dimension (or dimensions) for a cube should be built with the minimum amount of disk space to represent the members available when the cube is created. By default, new dimensions are built to allow for new members to be added in future updates. NONUPDATEABLE is valid only when the cube is first created.
If this option is set, no new members can be added to the dimension in future updates of the cube. This option can be specified either on the PROC OLAP statement or on the individual DIMENSION statements. To make individual dimensions non-updateable, use this option on the DIMENSION statement instead.
Specifies a valid SAS name for the new cube registration. For naming guidelines, see Naming Guidelines and Rules for the SAS OLAP Server. The folder path that is specified is a metadata folder path. It exists only in the metadata. The OUTCUBE= option can be used in combination with either the RENAME= options or the ADD_DATA option.
If you are updating a cube, you can use OUTCUBE= option with the ADD_DATA option to specify the location of the new cube registration. If you are renaming an existing cube, you can use the OUTCUBE= with the RENAME option.
OUTCUBE= does not create a copy of existing cubes. In SAS 9.2, cubes are identified both by their location in an OLAP schema and by their association with a metadata folder. By default, if no folder-path is given, the cube will be associated with the folder of the original cube's OLAP schema, or, if specified, with the folder of the OLAP schema specified in the OUTSCHEMA= option. When you specify a metadata folder, you can optionally add the metadata type (CUBE) after the cube name.
Note: The cube name must be unique within the OLAP schema and the metadata folder. The specified metadata folder path must exist.
Here are some examples of the OUTCUBE= option.
Example 1
proc olap rename cube=MYCUBE_A outcube='/Shared Data/Cubes/MYCUBE_B'; metasvr ... olap_schema='SASApp - OLAP Schema'; run;This example creates a cube registration named MYCUBE_B in the schema SASApp - OLAP Schema and the folder Shared Data/Cubes.
Example 2
proc olap add_data cube=MYCUBE_A outcube='MYCUBE_B' /* more options */; metasvr ... olap_schema='SASApp - OLAP Schema'; /* more statements */ run;This example creates a registration named MYCUBE_B for the updated version of the cube MYCUBE_A. The new registration is located in the same OLAP schema and folder as MYCUBE_A.
Note: See Updating SAS OLAP Cubes for further information.
specifies the name of the OLAP schema into which a new cube should be placed. The OLAP schema must already exist within the SAS Metadata Server and the metadata (or a repository on which it depends) specified in the METASVR statement. Changing the OLAP schema for the cube is strictly a change to the metadata and does not result in a physical move.
You can use OUTSCHEMA= in combination with the RENAME option to move an existing cube registration into a different OLAP schema. You can simultaneously use the OUTCUBE= option to also change the cube's name. In addition, OUTSCHEMA= can be used in combination with the ADD_DATA option to specify the target OLAP schema for the registration of the updated cube.
Note: See Updating SAS OLAP Cubes for further information.
specifies the partition size in megabytes of the aggregation table partitions and their corresponding index components.
Note: This option applies to the automatically created NWAY and all aggregations that do not explicitly specify a PARTSIZE= option in the AGGREGATION statement.
Default: | 128 megabytes. The minimum value is 16 megabytes. |
specifies the physical path to the location of a new cube. Within the specified path, the cube is stored in a directory that uses the name of the cube. However, if the cube folder already existis, a unique subdirectory is generated. When a cube is renamed with the OUTCUBE= option, either with the RENAME or ADD_DATA option, the physical path name does not change. Here is an example where this might cause a unique subdirectory to be generated:
Create a cube named MRKTDATA.
Rename the cube MRKTDATA as NEW_MARKET_DATA (or update MRKTDATA and specify OUTCUBE=NEW_MARKET_DATA).
Create another cube named MRKTDATA.
The new cube MRKTDATA would have a cube folder of C:\v9cubes\MRKTDATA1 because the cube folder C:\v9cubes\MRKTDATA still exists for cube NEW_MARKET_DATA.
specifies that metadata for a cube is to be registered, but the cube is not to be physically built. All of the metadata for the cube is added to the SAS Metadata Repository. The physical cube can be built later using the existing metadata definition, with either the short form of PROC OLAP or in SAS OLAP Cube Studio. Note that all data sets must physically exist at registration time. The data sets can be empty--they do not need to contain data. Complete data sets are required when the cube is physically built.
indicates that the cube should be renamed. Renaming a cube updates the metadata for the cube but does not change the file structure or physical location of the cube. This process requires an exclusive lock on the cube. If the cube is being queried by SAS OLAP Server sessions, the cube will need to be disabled (see the OLAPOPERATE procedure) before it is renamed.
Here is the syntax usage for the RENAME option:
PROC OLAP RENAME CUBE=cube-name1 <OUTCUBE=cube-name2> <OUTSCHEMA=olap-schema-name>; METASVR options; RUN; |
Note: The cube that is being renamed must be defined in the metadata but does not need to physically exist.
enables you to reorganize all eligible levels for a cube. It is used when you have performed multiple updates on a cube such that the level values must be reorganized. If this option is used, only those levels that have previously been updated with new captions during prior instances of cube update are eligible to be reorganized.
Here is an example of the REORGANIZE_LEVELS option:
proc olap cube=test REORG_LEVELS; metasvr ... etc;; run
controls how permission conditions are interpreted at query time. With SECURITY_SUBSET= YES, cell values are recalculated at query time based on the security subset defined by the active permission conditions for the given user. SECURITY_SUBSET=NO does not recalculate the cell values. The default value (NO) includes all members within a total.
Note: This option can also be found in SAS OLAP Cube Studio. See the option Include secured member values in presummarized computations on the Cube Designer - General page of the Cube Designer wizard.
specifies the number of observations (table rows) in the file segment of the index component. The value is expressed in multiples of 1,024. The minimum value is 1 (1,024 rows). The segmented indexes are used to optimize the processing of WHERE expressions. Each parallel thread is given a segment of the table to evaluate that is equal to the value of the SEGSIZE= option multiplied by 1,024.
Note: This option applies to the NWAY aggregation and all aggregations that do not explicitly specify a SEGSIZE= option in the AGGREGATION statement.
Default: | 8 (8 x 1,024 = 8,192 rows) |
synchronizes the column names stored in the cube's internal metadata with the input column names for a cube. This is necessary when dataset column names have changed since the cube was last built. This applies to cube levels, measures, and properties.
Note: If column names for a cube have changed and the cube was not synchronized, you will not be able to update the cube. If the cube has presummarized aggregations, you will not be able to query the cube.
enables you to update the captions for dimensions, hierarchies, levels, member properties, and measures. This option specifies that captions or descriptions on the cube are going to be modified. The UPDATE_DISPLAY_NAMES option is allowed with the ADD_DATA option or by itself on the PROC OLAP statement. It can also be used to change the description or caption on the DIMENSION, HIERARCHY, LEVEL, MEASURE and PROPERTY statements. Here is the syntax usage for the UPDATE_DISPLAY_NAMES option:
PROC OLAP CUBE=cubename DESCRIPTION="new description" UPDATE_DISPLAY_NAMES DT=newdtname; METASVR options; DIMENSION TIME DESCRIPTION="New TimeDescription" CAPTION="New TimeCaptions";
Specific considerations for the UPDATE_DISPLAY_NAMES option include the following:
Drill-through Table |
can be updated with the UPDATE_DISPLAY_NAMES option. When you rename a cube, only the cube description or drill-through table name can be updated via the UPDATE_DISPLAY_NAMES option. |
NUNIQUE Measures |
When the UPDATE_DISPLAY_NAMES option is specified, the NUNIQUE measure's INCLUDE_CALCULATED_MEMBERS option can be toggled ON or OFF (with the addition of the NOINCLUDE_CALCULATED_MEMBERS option). |
RENAME |
The only description that can change during a RENAME of a cube is the cube description (the drill-through table name can also be changed during a RENAME). Other descriptions and captions must be altered in a separate step after the RENAME is executed. |
enables you to control how the updated cube is made available to users. An update of the existing cube will occur. UPDATE_IN_PLACE can only be be used when the ADD_DATA option is specified. It cannot be used in combination with the OUTCUBE= or OUTSCHEMA= options. See Updating SAS OLAP Cubes for further information.
specifies one or more locations for temporary work files.
Note: The SPD Engine options are honored by PROC OLAP only if the REGISTER_ONLY option is set on the PROC OLAP statement, and only if the long form of the procedure is used to build a cube. The long form of the procedure is used when you run a SAS program that contains PROC OLAP code. The short form of the procedure is used by SAS OLAP Cube Studio.
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.