![]() | ![]() | ![]() |
An index consists of one or more columns that are used to uniquely identify each row within a table. Functioning as a SAS object that contain the values in one or more columns in a table, an index may be defined as numeric, character, or a combination of both. Although there is no rule that says a table has to have an index, when present, an index is most frequently used to make information retrieval more efficient.
Determining When an Index Is Necessary
To help determine when an index is necessary, it is important to understand the existing data as well as the way the base table(s) will be used. It is always important to know what queries will be used and how users will access columns of data. If an index is used to specify some order within a table, such as movie title or movie category in the MOVIES table, care should be exercised to fully assess what the impact of that index will be.
There are times when the column(s) making up an index are obvious and other times when they are not. When determining whether an index provides any value, some very important rules should be kept in mind. An index should permit the greatest flexibility so every column in a table can be accessed and displayed. Indexes should also be assigned to discriminating column(s) only, since query results will benefit greatest when this is the case.
Rules Associated with Index Creation
In deciding when to create an index, there are a few rules that SAS users should be aware of. First, an index should be created only when it is absolutely necessary. The creation of too many, or unnecessary, indexes requires the expenditure of an unwarranted amount of computer resources for their maintenance. Since an index has to be updated any time a DELETE, INSERT, or UPDATE is performed on rows in a table, care should be used when deciding what indexes should be created.
Second, if the table is small, sequential processing may be just as fast or faster than processing with an index.
Next, when the number of matches or a subset is 15% or less of the rows in the population, then an index should be considered because the PROC SQL processor may take advantage of any defined indexes. This is due to the way WHERE clause processing is performed in SAS. As the number of rows in a subset grows in size, information retrieval may become slower than for smaller subsets consisting of fewer rows.
Finally, if the page count as displayed in the CONTENTS procedure is less than three pages, avoid creating or using an index.
Types of Indexes
Two types of indexes can be defined in PROC SQL: simple and composite. When a simple index is created, it references a single column only. In contrast, a composite index references two or more columns in a table.
Creating a Simple Index
A simple index is specifically defined for one column in a table and must have the same name as the column. Suppose you wanted to create an index consisting of movie category (CATEGORY) in the MOVIES table. Once created, the index becomes a separate object located in the specified SAS library.
SAS Code
PROC SQL;
CREATE INDEX CATEGORY
ON libref.MOVIES;
QUIT;
Creating a Composite Index
A composite index is defined for two or more columns in a table and must have a different name from the column. Suppose you wanted to create an index consisting of the movie category (CATEGORY) and the movie rating (RATING) located in the MOVIES table. Users should be aware that only one composite index is allowed per set of columns. The index, as with the simple index, becomes a separate object located in the designated SAS library.
SAS Code
PROC SQL;
CREATE INDEX CATRATE
ON libref.MOVIES (CATEGORY, RATING);
QUIT;
Modifying Columns Containing Indexes
Altering the attributes of a column that contains an associated index (simple or composite) does NOT prohibit the values in the altered column from using the index. But, if a column that contains an index is dropped, then the index is also dropped. Accordingly, when a column is dropped, any data in that index is also lost.
Removing an Index
When one or more indexes are no longer needed, the DROP INDEX statement can be used to remove them. Suppose you determined that due to poor performance the composite index CATRATE (created earlier) is no longer needed.
SAS Code
PROC SQL;
DROP INDEX CATRATE
FROM libref.MOVIES;
QUIT;
About the Author
Kirk Paul Lafler is the author of PROC SQL: Beyond the Basics Using SAS, published by SAS Press. He also writes the
popular SAS tips column "Kirk's Korner," which appears regularly in several SAS users group newsletters, and is a frequent speaker at
SAS users group meetings.
His book is available from the online bookstore.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
| Type: | Sample |
| Topic: | SAS Reference ==> Procedures ==> SQL Data Management ==> Access ==> Indexes Non SAS Authors ==> Kirk Paul Lafler |
| Date Modified: | 2005-12-13 03:03:19 |
| Date Created: | 2005-11-22 16:41:19 |
| Product Family | Product | Host | SAS Release | |
| Starting | Ending | |||
| SAS System | Base SAS | All | n/a | n/a |



