Processing Simple Arrays

Grouping Variables in a Simple Array

The following ARRAY statement creates an array named BOOKS that contains the three variables Reference, Usage, and Introduction:
array books{3} Reference Usage Introduction;
When you define an array, SAS assigns each array element an array reference with the form array-name{subscript}, where subscript is the position of the variable in the list. The following table lists the array reference assignments for the previous ARRAY statement:
Array Reference Assignments for Array Books
Variable        Array Reference
Reference       books{1}
Usage           books{2}
Introduction    books{3}
Later in the DATA step, when you want to process the variables in the array, you can refer to a variable by either its name or its array reference. For example, the names Reference and books{1} are equivalent.
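The equivalence can be checked with a minimal sketch like the one below. The starting value 45 and the use of DATA _NULL_ are assumptions made only for illustration; the step assigns through the array reference and reads the result back through the variable name.

```sas
data _null_;
   array books{3} Reference Usage Introduction;
   Reference=45;              /* illustrative value, not from a real data set */
   books{1}=books{1}+5;       /* same effect as Reference=Reference+5 */
   put Reference=;            /* writes Reference=50 to the log */
run;
```

Because books{1} and Reference name the same variable, the change made through the array reference is visible under the variable name.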

Using a DO Loop to Repeat an Action

To perform the same action several times, use an iterative DO loop. A simple iterative DO loop that processes an array has the following form:
DO index-variable=1 TO number-of-elements-in-array;
… more SAS statements …
END;
The loop is processed repeatedly (iterates) according to the instructions in the iterative DO statement. The iterative DO statement contains an index-variable whose name you specify and whose value changes at each iteration of the loop.
To execute the loop as many times as there are variables in the array, specify that the values of index-variable are 1 TO number-of-elements-in-array. SAS increases the value of index-variable by 1 before each new iteration of the loop. When the value exceeds the number-of-elements-in-array, SAS stops processing the loop. By default, SAS automatically includes index-variable in the output data set. Use a DROP statement or the DROP= data set option to prevent the index variable from being written to your output data set.
An iterative DO loop that executes three times and has an index variable named count has the following form:
do count=1 to 3;
    … more SAS statements …
end;
The first time that the loop processes, the value of count is 1; the second time, 2; and the third time, 3. After the third iteration, SAS increases the value of count to 4, which exceeds the specified range and causes SAS to stop processing the loop.
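The following sketch makes the index values visible by writing them to the log. It assumes a DATA _NULL_ step so that no output data set is created; the text in the PUT statements is illustrative.

```sas
data _null_;
   do count=1 to 3;
      put 'Inside the loop: ' count=;   /* count=1, then 2, then 3 */
   end;
   put 'After the loop: ' count=;       /* count=4, the value that ended the loop */
run;
```

The last PUT statement shows that the index variable keeps the value that exceeded the range, which is one reason to drop it from an output data set.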

Using a DO Loop to Process Selected Elements in an Array

To process particular elements of an array, specify those elements as the range of the iterative DO statement. For example, the following statement creates an array DAYS that contains seven elements:
array days{7} D1-D7;
The following DO statements process selected elements of the array DAYS:
DO Statement Processing
DO Statement         Description
do i=2 to 4;         processes elements 2 through 4
do i=1 to 7 by 2;    processes elements 1, 3, 5, and 7
do i=3,5;            processes elements 3 and 5
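As a rough sketch, the DO statements above can be tried in a step like the following. The initial values in parentheses are made up so that each element has something to print, and the PUT statements are only there to show which elements each loop visits.

```sas
data _null_;
   /* Initial values are illustrative only */
   array days{7} D1-D7 (10 20 30 40 50 60 70);

   do i=1 to 7 by 2;          /* visits elements 1, 3, 5, and 7 */
      put i= days{i}=;
   end;

   do i=3,5;                  /* visits elements 3 and 5 only */
      put i= days{i}=;
   end;
run;
```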

Selecting the Current Variable

You must tell SAS which variable in the array to use in each iteration of the loop. Recall that you identify variables in an array by their array references and that you use a variable name, a number, or an expression as the subscript of the reference. Therefore, you can write programming statements so that the index variable of the DO loop is the subscript of the array reference (for example, array-name{index-variable}). When the value of the index variable changes, the subscript of the array reference (and therefore the variable that is referenced) also changes.
The following example uses the index variable count as the subscript of array references inside a DO loop:
array books{3} Reference Usage Introduction; 
do count=1 to 3;
   if books{count}=. then books{count}=0;
end;
When the value of count is 1, SAS reads the array reference as books{1} and processes the IF-THEN statement on books{1}, which is the variable Reference. When count is 2, SAS processes the statement on books{2}, which is the variable Usage. When count is 3, SAS processes the statement on books{3}, which is the variable Introduction.
The statements in the example tell SAS to
  • perform the actions in the loop three times
  • replace the array subscript count with the current value of count for each iteration of the IF-THEN statement
  • locate the variable with that array reference and process the IF-THEN statement on it
  • replace missing values with zero if the condition is true.
The following DATA step defines the array BOOK and processes it with a DO loop.
options nodate pageno=1 linesize=80 pagesize=60; 

data changed(drop=count);
   input Reference Usage Introduction;
   array book{3} Reference Usage Introduction;
   do count=1 to 3;
      if book{count}=. then book{count}=0;
   end;
   datalines;
45 63 113
.  75 150
62 .   98
;

proc print data=changed;
   title 'Number of Books Sold';
run;
The following output shows the CHANGED data set.
Using an Array Statement to Process Missing Data Values
                              Number of Books Sold                             1

                   Obs    Reference    Usage    Introduction

                    1         45         63          113    
                    2          0         75          150    
                    3         62          0           98    

Defining the Number of Elements in an Array

When you define an array, you can either specify the number of elements explicitly or use an asterisk enclosed in braces ({*}), brackets ([*]), or parentheses ((*)) to have SAS count the elements. If you use the asterisk, you must list each array element. In the following example, the array C1TEMP references five variables that contain temperature measures.
array c1temp{*} c1t1 c1t2 c1t3 c1t4 c1t5;
If you specify the number of elements explicitly, you can omit the names of the variables or array elements in the ARRAY statement. SAS then creates variable names by concatenating the array name with the numbers 1, 2, 3, and so on. If a variable name in the series already exists, SAS uses that variable instead of creating a new one. In the following example, the array c1t references five variables: c1t1, c1t2, c1t3, c1t4, and c1t5.
array c1t{5};
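A short sketch of the generated names follows. It uses the DIM function, which returns the number of elements in an array and is not discussed above, so that the loop bound does not have to be hard-coded. The assigned values are arbitrary.

```sas
data _null_;
   array c1t{5};              /* SAS creates c1t1 through c1t5 */
   do i=1 to dim(c1t);        /* dim(c1t) is 5 for this array */
      c1t{i}=i*10;            /* arbitrary illustrative values */
   end;
   put c1t1= c1t5=;           /* writes c1t1=10 c1t5=50 to the log */
run;
```

Using DIM for the upper bound is also convenient with the {*} form, where the element count is not written in the ARRAY statement.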

Rules for Referencing Arrays

Before you make any reference to an array, an ARRAY statement that defines the array must appear in the same DATA step. Once you have created the array, you can perform the following tasks:
  • Use an array reference anywhere that you can write a SAS expression.
  • Use an array reference as an argument to some SAS functions.
  • Use a subscript enclosed in braces, brackets, or parentheses to reference an array.
  • Use the special array subscript asterisk (*) to refer to all variables in an array in an INPUT or PUT statement or in the argument of a function.
    Note: You cannot use the asterisk with _TEMPORARY_ arrays.
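The asterisk subscript might be used as in the sketch below, which reuses the BOOKS array with the values from the first observation of the earlier example. The variable name total is hypothetical.

```sas
data _null_;
   array books{3} Reference Usage Introduction;
   Reference=45; Usage=63; Introduction=113;
   total=sum(of books{*});    /* all elements as function arguments */
   put total=;                /* writes total=221 to the log */
   put books{*};              /* writes the value of every element */
run;
```

Note that a function argument requires the OF keyword before the array reference, as in sum(of books{*}).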
An array definition is in effect only for the duration of the DATA step. If you want to use the same array in several DATA steps, you must redefine the array in each step. To avoid retyping the variable list, you can store the variable names in a macro variable and use it in each ARRAY statement, as shown in this example:
%let list=NC SC GA VA;

data one;
   array state{*} &list;
   … more SAS statements …
run;

data two;
   array state{*} &list;
   … more SAS statements …
run;