PROC OPTMODEL: CREATE DATA Statement :: SAS/OR(R) 9.2 User's Guide: Mathematical Programming

The OPTMODEL Procedure

CREATE DATA Statement

CREATE DATA SAS-data-set FROM [ [ key-column(s) ] [ =key-set ] ] column(s);

The CREATE DATA statement creates a new SAS data set and copies data into it from PROC OPTMODEL parameters and variables. The CREATE DATA statement can create a data set with a single observation or a data set with observations for every location in one or more arrays. The data set is closed after the execution of the CREATE DATA statement.

The arguments to the CREATE DATA statement are as follows:

SAS-data-set: specifies the output data set name and options.
key-column(s): declares index values and their corresponding data set variables. The values are used to index array locations in column(s).
key-set: specifies a set of index values for the key-column(s).
column(s): specifies data set variables as well as the PROC OPTMODEL source data for the variables.

Each column or key-column defines output data set variables and a data source for a column. For example, the following code generates the output SAS data set resdata from the PROC OPTMODEL array opt, which is indexed by the set indset:

  
   create data resdata from [solns]=indset opt;

The output data set variable solns contains the index elements in indset.
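The statement above is only a fragment, because it assumes that indset and opt already exist. A minimal self-contained sketch follows; the declarations of indset and opt (and their values) are hypothetical and chosen only for illustration:

```sas
proc optmodel;
   /* hypothetical declarations, not part of the original fragment */
   set indset = 1..3;
   number opt{i in indset} = 10*i;

   /* one observation per member of indset;
      solns receives the index value, opt receives opt[solns] */
   create data resdata from [solns]=indset opt;
quit;

proc print data=resdata;
run;
```

Under these assumptions, resdata should contain three observations, with solns holding 1, 2, and 3 and opt holding the corresponding array values.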

Columns

Column(s) can have the following forms:

identifier-expression

transfers data from the PROC OPTMODEL parameter or variable specified by the identifier-expression. The output data set variable has the same name as the name part of the identifier-expression (see the section "Identifier Expressions"). If the identifier-expression refers to an array, then the index can be omitted when it matches the key-column(s). The following example creates a data set with the variables m and n:

  
   proc optmodel; 
      number m = 7, n = 5; 
      create data example from m n;

name = expression

transfers the value of a PROC OPTMODEL expression to the output data set variable name. The expression is reevaluated for each observation. If the expression contains any operators or function calls, then it must be enclosed in parentheses. If the expression is an identifier-expression that refers to an array, then the index can be omitted if it matches the key-column(s). The following example creates a data set with the variable ratio:

  
   proc optmodel; 
      number m = 7, n = 5; 
      create data example from ratio=(m/n);

COL(name-expression) = expression

transfers the value of a PROC OPTMODEL expression to the output data set variable named by the string expression name-expression. The PROC OPTMODEL expression is reevaluated for each observation. If this expression contains any operators or function calls, then it must be enclosed in parentheses. If the PROC OPTMODEL expression is an identifier-expression that refers to an array, then the index can be omitted if it matches the key-column(s). The following example uses the COL expression to form the variable s5:

  
   proc optmodel; 
      number m = 7, n = 5; 
      create data example from col("s"||n)=(m+n);

{ index-set } < column(s) >

performs the transfers by iterating each column specified by <column(s)> for each member of the index set. If there are $n$ columns and $m$ index set members, then $n \times m$ columns are generated. The dummy parameters from the index set can be used in the columns to generate distinct output data set variable names in the iterated columns, using COL expressions. The columns are expanded when the CREATE DATA statement is executed, before any output is performed. This form of column(s) cannot be nested. In other words, the following form of column(s) is NOT allowed:

{ index-set } < { index-set } < column(s) > >

The following example demonstrates the use of the iterated column(s) form:

    proc optmodel;
       set<string> alph = {'a', 'b', 'c'};
       var x{1..3, alph} init 2;
       create data example from [i]=(1..3)
          {j in alph}<col("x"||j)=x[i,j]>;

The data set created by this code is shown in Figure 6.9.

Obs	i	xa	xb	xc
1	1	2	2	2
2	2	2	2	2
3	3	2	2	2

Figure 6.9: CREATE DATA with COL Expression
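As a sketch of the column-count rule for the iterated form, the following variation places two columns inside the angle brackets and iterates them over the three members of alph, so the expansion should generate 2 × 3 = 6 columns in addition to the key column i. The data set name example2 and the d-prefixed variable names are illustrative assumptions:

```sas
proc optmodel;
   set<string> alph = {'a', 'b', 'c'};
   var x{1..3, alph} init 2;

   /* two iterated columns (n = 2) over three index set members (m = 3):
      col("x"||j) copies x[i,j]; col("d"||j) holds twice that value */
   create data example2 from [i]=(1..3)
      {j in alph}<col("x"||j)=x[i,j] col("d"||j)=(2*x[i,j])>;
quit;

proc print data=example2;
run;
```

The second iterated column is enclosed in parentheses because its expression contains an operator.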

Note: When no key-column(s) are specified, the output data set has a single observation.

The following code incorporates several of the preceding examples to create and print a data set by using PROC OPTMODEL parameters:

    proc optmodel;
       number m = 7, n = 5;
       create data example from m n ratio=(m/n) col("s"||n)=(m+n);
    proc print;
       run;

The output from the PRINT procedure is shown in Figure 6.10.

Obs	m	n	ratio	s5
1	7	5	1.4	12

Figure 6.10: CREATE DATA for Single Observation

Key columns

Key-column(s) declare index values that enable multiple observations to be written from array column(s). An observation is created for each unique index value combination. The index values supply the index for array column(s) that do not have an explicit index.

Key-column(s) define the data set variables where the index value elements are written. They can also declare local dummy parameters for use in expressions in the column(s). Key-column(s) are syntactically similar to column(s), but are more restricted in form. The following forms of key-column(s) are allowed:

name: transfers an index element value to the data set variable name. A local dummy parameter, name, is declared to hold the index element value.
COL(name-expression) [ = index-name ]: transfers an index element value to the data set variable named by the string-valued name-expression. Index-name optionally declares a local dummy parameter to hold the index element value.
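The COL form of a key-column can be sketched as follows. The names yr, cost, keyed, and doubled are hypothetical, introduced only for illustration; the sketch assumes that the name-expression "idx"||yr concatenates in the same way as the COL column examples earlier in this section:

```sas
proc optmodel;
   /* hypothetical data for illustration */
   number yr = 2024;
   number cost{1..3} = [10 20 30];

   /* the key variable is named by the string expression "idx"||yr;
      k is the optional dummy parameter that holds the index value */
   create data keyed from [col("idx"||yr)=k] cost doubled=(2*cost[k]);
quit;

proc print data=keyed;
run;
```

Here cost is implicitly indexed by the key-column, while doubled uses the dummy parameter k to index the array explicitly inside an expression.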

A key-set in the CREATE DATA statement explicitly specifies the set of index values. Key-set can be specified as a set expression, although it must be enclosed in parentheses if it contains any function calls or operators. Key-set can also be specified as an index set expression, in which case the index-set dummy parameters override any dummy parameters that are declared in the key-column(s) items. The following code creates a data set from the PROC OPTMODEL parameter m, a matrix whose only nonzero entries are located at (1, 1) and (4, 1):

    proc optmodel;
       number m{1..5, 1..3} = [[1 1] 1 [4 1] 1];
       create data example
          from [i j] = {setof{i in 1..2}<i**2>, {1, 2}} m;
    proc print data=example noobs;
       run;

The dummy parameter i in the SETOF expression takes precedence over the dummy parameter i declared in the key-column(s) item. The output from this code is shown in Figure 6.11.

i	j	m
1	1	1
1	2	0
4	1	1
4	2	0

Figure 6.11: CREATE: Key-set with SETOF Aggregation Expression

If no key-set is specified, then the set of index values is formed from the union of the index sets of the implicitly indexed column(s). The number of index elements for each implicitly indexed array must match the number of key-column(s). The type of each index element (string versus numeric) must match the element of the same position in other implicit indices.

The arrays for implicitly indexed columns in a CREATE DATA statement do not need to have identical index sets. A missing value is supplied for the value of an implicitly indexed array location when the implied index value is not in the array's index set.

In the following code, the key-set is unspecified. The set of index values is $\{1, 2, 3\}$, which is the union of the index sets of x and y. These index sets are not identical, so missing values are supplied when necessary. The results of this code are shown in Figure 6.12.

    proc optmodel;
       number x{1..2} init 2;
       var y{2..3} init 3;
       create data exdata from [keycol] x y;
    proc print;
       run;

Obs	keycol	x	y
1	1	2	.
2	2	2	3
3	3	.	3

Figure 6.12: CREATE: Unspecified Key-set
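For contrast, an explicit key-set can restrict the output to index values that both arrays share, which avoids the missing values. The following is a minimal sketch; the helper set both and the data set name exboth are assumptions introduced for illustration:

```sas
proc optmodel;
   number x{1..2} init 2;
   var y{2..3} init 3;

   /* hypothetical helper set: indices that appear in both x and y */
   set both = (1..2) inter (2..3);

   /* the explicit key-set replaces the default union of index sets */
   create data exboth from [keycol]=both x y;
quit;

proc print data=exboth;
run;
```

Under these assumptions, exboth should contain a single observation for keycol = 2, with values taken from x[2] and y[2].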

The types of the output data set variables match the types of the source values. The output variable type for each key-column matches the corresponding element type in the index value tuple. A numeric element matches a NUMERIC data set variable, while a string element matches a CHAR variable. For regular column(s), the source expression type determines the output data set variable type. A numeric expression produces a NUMERIC variable, while a string expression produces a CHAR variable.

Lengths of character variables in the output data set are determined automatically. The length is set to accommodate the longest string value output in that column.
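The type rules above can be checked with a small sketch that mixes a string key, a string expression, and a numeric expression. The names city, town, tag, ncity, and types are hypothetical, chosen only for illustration:

```sas
proc optmodel;
   set<string> city = {'Rome', 'Amsterdam'};

   /* town:  string index element, so a CHAR variable is expected
      tag:   string expression, so a CHAR variable is expected
      ncity: numeric expression, so a NUMERIC variable is expected */
   create data types from [town]=city
      tag=("city_"||town) ncity=(card(city));
quit;

/* list the variable types and lengths of the output data set */
proc contents data=types;
run;
```

According to the length rule, town should be long enough to hold the longest key value written, which is 'Amsterdam' in this sketch.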

You can use the iterated column(s) form to output selected rows of multiple arrays, assigning a different data set variable to each column. For example, the following code outputs the last two rows of the two-dimensional array, a, along with corresponding elements of the one-dimensional array, b:

    proc optmodel;
       num m = 3;  /* number of rows/observations */
       num n = 4;  /* number of columns in a */
       num a{i in 1..m, j in 1..n} = i*j;  /* compute a */
       num b{i in 1..m} = i**2;  /* compute b */
       set<num> subset = 2..m;  /* used to omit first row */
       create data out
              from [i]=subset {j in 1..n}<col("a"||j)=a[i,j]> b;

To specify the data set to be created, the CREATE DATA statement uses the form key-column(s) {index set}<column(s)> column(s). The preceding code creates a data set out, which has $m - 1$ observations and $n + 2$ variables. The variables are named i, a1 through a$n$, and b, as shown in Figure 6.13.

Obs	i	a1	a2	a3	a4	b
1	2	2	4	6	8	4
2	3	3	6	9	12	9

Figure 6.13: CREATE DATA Set: The Iterated Column Form

See the section "Data Set Input/Output" for more examples of using the CREATE DATA statement.
