ARRAY Statement

Defines the elements of an array.
Valid in: DATA step
Category: Information
Type: Declarative

Syntax

ARRAY array-name {subscript} <$> <length> <array-elements> <(initial-value-list)>;

Arguments

array-name
specifies the name of the array.
Restriction: Array-name must be a SAS name that is not the name of a SAS variable in the same DATA step.
CAUTION:
Using the name of a SAS function as an array name can cause unpredictable results.
If you inadvertently use a function name as the name of the array, SAS treats parenthetical references that involve the name as array references, not function references, for the duration of the DATA step. A warning message is written to the SAS log.
{subscript}
describes the number and arrangement of elements in the array by using an asterisk, a number, or a range of numbers. Subscript has one of these forms:
{dimension-size(s)}
specifies the number of elements in each dimension of the array. Dimension-size is a numeric representation of either the number of elements in a one-dimensional array or the number of elements in each dimension of a multidimensional array.
Tip: You can enclose the subscript in braces ({ }), brackets ([ ]), or parentheses (( )).
Examples: This ARRAY statement defines a one-dimensional array that is named SIMPLE. The SIMPLE array groups together three variables that are named RED, GREEN, and YELLOW:
array simple{3} red green yellow;

An array with more than one dimension is known as a multidimensional array. You can have any number of dimensions in a multidimensional array. For example, a two-dimensional array provides row and column arrangement of array elements. SAS places variables into a two-dimensional array by filling all rows in order, beginning at the upper left corner of the array (known as row-major order). This statement defines a two-dimensional array with five rows and three columns:

array x{5,3} score1-score15;
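
The row-major mapping described above can be sketched as follows. This is a minimal illustration, not part of the original examples: the second array FLAT and the data set name ROWMAJOR are names introduced here as assumptions. Under row-major order, element {row, col} of X corresponds to variable number (row-1)*3 + col.

```sas
data rowmajor;
   array x{5,3} score1-score15;
   array flat{15} score1-score15;   /* hypothetical one-dimensional view of the same variables */

   /* Store each variable's position (1 to 15) in the variable itself */
   do k = 1 to 15;
      flat{k} = k;
   end;

   /* Read back through the two-dimensional array */
   first_of_row2 = x{2,1};   /* refers to score4  */
   last_element  = x{5,3};   /* refers to score15 */
   drop k;
run;
```

Defining two arrays over the same variables is allowed because an array only groups existing variables; it does not copy them.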

{<lower :>upper<, ...<lower :> upper>}
are the bounds of each dimension of an array, where lower is the lower bound of that dimension and upper is the upper bound.
Range: In most explicit arrays, the subscript in each dimension of the array ranges from 1 to n, where n is the number of elements in that dimension.
Tips: For most arrays, 1 is a convenient lower bound. Thus, you do not need to specify the lower and upper bounds. However, specifying both bounds is useful when the array dimensions have a convenient beginning point other than 1.

To reduce the computational time that is needed for subscript evaluation, specify a lower bound of 0.

Examples: In the following example, the value specified for each dimension is, by default, the upper bound of that dimension, and the lower bound defaults to 1.
array x{5,3} score1-score15;

As an alternative, the following ARRAY statement is a longhand version of the previous example:

array x{1:5,1:3} score1-score15;
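
As a sketch of a lower bound other than 1, the following step indexes an array by calendar year. The variable names S2001-S2005, the array name SALES, and the placeholder values are assumptions made for illustration. The LBOUND, HBOUND, and DIM functions return the bounds and the element count, so the loop does not hard-code them.

```sas
data yearly;
   /* Hypothetical variables s2001-s2005, indexed by year instead of 1 to 5 */
   array sales{2001:2005} s2001-s2005;

   do yr = lbound(sales) to hbound(sales);
      sales{yr} = yr - 2000;        /* placeholder values 1 to 5 */
   end;

   n_years = dim(sales);            /* number of elements in the dimension */
   drop yr;
run;
```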

{*}
specifies that SAS is to determine the subscript by counting the variables in the array. When you specify the asterisk, also include array-elements.
Restriction: You cannot use the asterisk with _TEMPORARY_ arrays or when you define a multidimensional array.
$
specifies that the elements in the array are character elements.
Tip: The dollar sign is not necessary if the elements have been previously defined as character elements.
length
specifies the length of elements in the array that have not been previously assigned a length.
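
A minimal sketch of the $ and length arguments used together, assuming an array named CODE and a length of 8 (both chosen here for illustration). Because no array-elements are listed, SAS creates the variables CODE1-CODE3 from the array name.

```sas
data labels;
   /* Creates character variables code1-code3, each with a length of 8 */
   array code{3} $ 8;

   code{1} = 'alpha';
   code{2} = 'beta';
   code{3} = 'gamma';

   len = vlength(code1);   /* length assigned by the ARRAY statement */
run;
```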
array-elements
specifies the names of the elements that make up the array. Array-elements must be either all numeric or all character, and they can be listed in any order. The elements can be one of the following:
variables
lists variable names.
Range: The names must be either variables that you define in the ARRAY statement or variables that SAS creates by concatenating the array name and a number. For example, when the subscript is a number (not the asterisk), you do not need to name each variable in the array. Instead, SAS creates variable names by concatenating the array name and the numbers 1, 2, 3, …n.
Restriction: If you use _ALL_, all the previously defined variables must be of the same type.
Tips: These SAS variable lists enable you to reference variables that have been previously defined in the same DATA step:

_NUMERIC_ specifies all numeric variables.

_CHARACTER_ specifies all character variables.

_ALL_ specifies all variables.
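
One common use of these variable lists can be sketched as follows: replacing missing values in every numeric variable that has been defined so far. The data set names SCORES and SCORES_FILLED, the variables, and the sample data are assumptions for illustration. The loop index I is first used after the ARRAY statement, so it is not part of the array.

```sas
data scores;
   input id $ q1 q2 q3;
   datalines;
A 1 . 3
B . 5 .
;
run;

data scores_filled;
   set scores;
   array nums{*} _numeric_;   /* q1-q3; id is character, so it is excluded */
   do i = 1 to dim(nums);
      if missing(nums{i}) then nums{i} = 0;
   end;
   drop i;
run;
```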

_TEMPORARY_
creates a list of temporary data elements.
Range: Temporary data elements can be numeric or character.
Tips: Temporary data elements behave like DATA step variables with these exceptions:

They do not have names. Refer to temporary data elements by the array name and dimension.

They do not appear in the output data set.

You cannot use the special subscript asterisk (*) to refer to all the elements.

Temporary data element values are always automatically retained, rather than being reset to missing at the beginning of the next iteration of the DATA step.

Arrays of temporary elements are useful when the only purpose for creating an array is to perform a calculation. To preserve the result of the calculation, assign it to a variable. You can improve performance time by using temporary data elements.
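
The following sketch uses a temporary array as a lookup table of weights for a calculation. The weights, variable names, and sample data are hypothetical. The result is assigned to the variable SCORE so that it is preserved; the temporary elements themselves do not appear in the output data set.

```sas
data weighted;
   input t1 t2 t3;
   array t{3} t1-t3;
   array w{3} _temporary_ (0.2 0.3 0.5);   /* hypothetical weights */

   score = 0;
   do i = 1 to 3;
      score = score + w{i} * t{i};
   end;
   drop i;
   datalines;
90 80 70
60 70 80
;
run;
```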

(initial-value-list)
gives initial values for the corresponding elements in the array. The values for elements can be numbers or character strings. You must enclose all character strings in quotation marks. To specify one or more initial values directly, use the following format:
(initial-value(s))
To specify an iteration factor and nested sublists for the initial values, use the following format:
<constant-iter-value*> <(>constant value | constant-sublist<)>
Restriction: If you specify both an initial-value-list and array-elements, then array-elements must be listed before initial-value-list in the ARRAY statement.
Tips: You can assign initial values to both variables and temporary data elements.

Elements and values are matched by position. If there are more array elements than initial values, the remaining array elements receive missing values and SAS issues a warning.

You can separate the values in the initial value list with either a comma or a blank space.

You can also use a shorthand notation for specifying a range of sequential integers. The increment is always +1.

If you have not previously specified the attributes of the array elements (such as length or type), the attributes of any initial values that you specify are automatically assigned to the corresponding array element. Initial values are retained until a new value is assigned to the array element.

When any (or all) elements are assigned initial values, all elements behave as if they were named on a RETAIN statement.
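
The retention behavior described in the tips above can be sketched with a running maximum. The array name BEST, the variable MAX_SO_FAR, and the sample data are assumptions for illustration. Because the element has an initial value, its value carries over from one iteration of the DATA step to the next without a RETAIN statement.

```sas
data running;
   input x;
   array best{1} max_so_far (0);   /* initial value, so the element is retained */
   if x > best{1} then best{1} = x;
   datalines;
3
1
7
;
run;
```

With this input, MAX_SO_FAR is 3 on the first two observations and 7 on the third.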

Example: The following examples show how to use the iteration factor and nested sublists. All of these ARRAY statements contain the same initial value list:
ARRAY x{10} x1-x10 (10*5); 
ARRAY x{10} x1-x10 (5*(5 5));
ARRAY x{10} x1-x10 (5 5 3*(5 5) 5 5);
ARRAY x{10} x1-x10 (2*(5 5) 5 5 2*(5 5));
ARRAY x{10} x1-x10 (2*(5 2*(5 5)));
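
Because every value in the statements above is 5, the expansion order is hard to see. The following sketch uses distinct values; the array name P and the data set name PATTERN are assumptions for illustration.

```sas
data pattern;
   /* 2*(1 2) expands to 1 2 1 2, and 3*0 expands to 0 0 0 */
   array p{7} p1-p7 (2*(1 2) 3*0);
run;
```

The resulting observation holds the values 1, 2, 1, 2, 0, 0, 0 in P1 through P7.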

Details

The ARRAY statement defines a set of elements that you plan to process as a group. You refer to elements of the array by the array name and subscript. Because you usually want to process more than one element in an array, arrays are often referenced within DO groups.
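
A minimal sketch of processing array elements as a group in a DO loop, assuming three hypothetical Fahrenheit readings that are converted to Celsius in place. The array name READING, the variable names, and the data are illustrative only.

```sas
data temps_c;
   input mon tue wed;
   array reading{3} mon tue wed;

   /* Apply the same conversion to every element */
   do i = 1 to dim(reading);
      reading{i} = (reading{i} - 32) * 5 / 9;
   end;
   drop i;
   datalines;
32 212 50
;
run;
```

Using DIM for the upper limit of the loop keeps the loop correct if elements are later added to the array.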

Comparisons

  • Arrays in the SAS language are different from arrays in many other languages. A SAS array is simply a convenient way of temporarily identifying a group of variables. It is not a data structure, and array-name is not a variable.
  • An ARRAY statement defines an array. An array reference uses an array element in a program statement.

Examples

Example 1: Defining Arrays

  • array rain {5} janr febr marr aprr mayr;
  • array days{7} d1-d7;
  • array month{*} jan feb jul oct nov;
  • array x{*} _NUMERIC_;
  • array qbx{10};
  • array meal{3};

Example 2: Assigning Initial Numeric Values

  • array test{4} t1 t2 t3 t4 (90 80 70 70);
  • array test{4} t1-t4 (90 80 2*70);
  • array test{4} _TEMPORARY_ (90 80 70 70);

Example 3: Defining Initial Character Values

  • array test2{*} $ a1 a2 a3 ('a','b','c');

Example 4: Defining More Advanced Arrays

  • array new{2:5} green jacobs denato fetzer;
  • array x{5,3} score1-score15;
  • array test{3:4,3:7} test1-test10;
  • array temp{0:999} _TEMPORARY_;
  • array x{10} (2*1:5);

Example 5: Creating a Range of Variable Names That Have Leading Zeros

The following example shows that you can create a range of variable names that have leading zeros. Each variable name has a length of three characters, and the names sort correctly (A01, A02, … A10). Without leading zeros, the variable names would sort in the following order: A1, A10, A2, … A9.
data test (drop=i);
   array a{10} A01-A10;
   do i=1 to 10;
      a{i}=i;
   end;
run;
proc print noobs data=test;
run;
Array Names That Have Leading Zeros

See Also

Array Processing in SAS Language Reference: Concepts