CREATE DATA SAS-data-set FROM [ [ key-column(s) ] [ =key-set ] ] column(s);
The CREATE DATA statement creates a new SAS data set and copies data into
it from PROC OPTMODEL parameters and variables. The CREATE DATA statement can create a data
set with a single observation or a data set with observations for
every location in one or more arrays. The data set is closed after
the execution of the CREATE DATA statement.
The arguments to the CREATE DATA statement
are as follows:
- SAS-data-set
-
specifies the output data set name and options.
- key-column(s)
-
declares index values and their corresponding data set variables. The
values are used to index array locations in column(s).
- key-set
-
specifies a set of index values for the key-column(s).
- column(s)
-
specifies data set variables as well as the PROC OPTMODEL source data
for the variables.
Each column or key-column defines output data set variables and a data
source for a column. For example, the following code generates the
output SAS data set resdata from the PROC OPTMODEL array opt, which is
indexed by the set indset:
create data resdata from [solns]=indset opt;
The output data set variable solns contains the index elements in indset.
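The statement above is shown without its declarations. A minimal self-contained sketch follows; the contents of indset and opt are assumptions chosen for illustration and do not come from the original example.

```sas
proc optmodel;
   /* Assumed declarations: the source does not show how indset and opt are defined */
   set indset = 1..3;
   number opt{i in indset} = 10*i;

   /* One observation per member of indset; solns receives the index value */
   create data resdata from [solns]=indset opt;
quit;

proc print data=resdata noobs;
run;
```

Under these assumptions, resdata should contain one observation for each member of indset, with solns holding the index value and opt holding the matching array element.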
Columns
Column(s) can have the following forms:
- identifier-expression
-
transfers data from the PROC OPTMODEL parameter or variable specified by
the identifier-expression. The output data
set variable has the same name as the name part of the
identifier-expression (see the section "Identifier Expressions"). If the identifier-expression refers
to an array, then the index can be omitted when it matches the
key-column(s). The following example creates a data set with the variables
m and n:
proc optmodel;
number m = 7, n = 5;
create data example from m n;
- name = expression
-
transfers the value of a PROC OPTMODEL expression to the output data set
variable name. The expression is reevaluated for
each observation. If the expression contains any operators or
function calls, then it must be enclosed in parentheses. If the
expression is an identifier-expression
that refers to an array, then the index can be omitted if it matches
the key-column(s). The following example creates a data set with the
variable ratio:
proc optmodel;
number m = 7, n = 5;
create data example from ratio=(m/n);
- COL(name-expression) = expression
-
transfers the value of a PROC OPTMODEL expression to the output data set
variable named by the string expression name-expression. The PROC
OPTMODEL expression is reevaluated for each observation. If this
expression contains any operators or function calls, then it must be
enclosed in parentheses. If the PROC OPTMODEL expression is an
identifier-expression that refers to an array, then the
index can be omitted if it matches the key-column(s). The following
example uses the COL expression to form the variable s5:
proc optmodel;
number m = 7, n = 5;
create data example from col("s"||n)=(m+n);
- { index-set } < column(s) >
-
performs the transfers by iterating each column specified by
<column(s)> for each member of the index set. If there are n columns
and m index set members, then n × m columns are generated. The dummy
parameters from the index set can be used in the columns to generate
distinct output data set variable names in the iterated columns, using
COL expressions. The columns are expanded when the CREATE DATA statement
is executed, before any output is performed. This form of column(s)
cannot be nested. In other words, the following form of column(s) is
NOT allowed:
{ index-set } < { index-set } < column(s) > >
The following example demonstrates the use of the iterated column(s) form:
proc optmodel;
set<string> alph = {'a', 'b', 'c'};
var x{1..3, alph} init 2;
create data example from [i]=(1..3)
{j in alph}<col("x"||j)=x[i,j]>;
The data set created by this code is shown in Output 6.9.
Figure 6.9: CREATE DATA with COL Expression
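Because the iterated form cannot be nested, one possible way to iterate over two dimensions is a single index set that declares two dummy parameters. The sketch below assumes such an index set is accepted in the iterated form; the array x, its values, and the data set name flat are hypothetical.

```sas
proc optmodel;
   /* Hypothetical 2 x 3 array used only for illustration */
   number x{i in 1..2, j in 1..3} = 10*i + j;

   /* One index set with two dummy parameters instead of nested iteration. */
   /* No key-column(s) are given, so a single observation is written.      */
   create data flat
      from {i in 1..2, j in 1..3}<col("x"||i||"_"||j)=x[i,j]>;
quit;

proc print data=flat noobs;
run;
```

If the statement runs as intended, flat should hold one observation with one variable per (i, j) pair, named in the pattern x1_1 through x2_3.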
Note: When no key-column(s) are specified, the output data set has a
single observation.
The following code incorporates several of the preceding examples to create and print a data
set by using PROC OPTMODEL parameters:
proc optmodel;
number m = 7, n = 5;
create data example from m n ratio=(m/n) col("s"||n)=(m+n);
proc print;
run;
The output from the PRINT procedure is shown in
Output 6.10.
Figure 6.10: CREATE DATA for Single Observation
Key columns
Key-column(s) declare index values that enable multiple observations to
be written from array column(s). An observation is created for each
unique index value combination. The index values supply the index for
array column(s) that do not have an explicit index.
Key-column(s) define the data set variables where the index value
elements are written. They can also declare local dummy parameters for
use in expressions in the column(s). Key-column(s) are syntactically
similar to column(s), but are more restricted in form. The following
forms of key-column(s) are allowed:
- name
-
transfers an index element value to the data set variable name.
A local dummy parameter, name, is declared to hold the index element
value.
- COL(name-expression) [ = index-name ]
-
transfers an index element value to the data set variable named by the
string-valued name-expression. Index-name optionally
declares a local dummy parameter to hold the index element value.
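The two key-column forms can be compared side by side. The following sketch is illustrative only: the array cost, its values, and the data set names byname and bycol are hypothetical.

```sas
proc optmodel;
   /* Hypothetical data for illustration */
   number cost{1..3} = [4 9 2];

   /* name form: the index is written to variable item, and item is */
   /* also the dummy parameter available to the column expressions  */
   create data byname
      from [item]=(1..3) cost twice=(2*cost[item]);

   /* COL form: the variable name comes from a string expression, */
   /* and k is the optional dummy parameter                       */
   create data bycol
      from [col("item"||"_id")=k]=(1..3) cost twice=(2*cost[k]);
quit;
```

Both statements should write one observation per index value. They differ in how the index variable is named: item in the first data set, item_id in the second.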
A key-set in the CREATE DATA statement explicitly specifies the set
of index values. Key-set can be specified as a set expression,
although it must be enclosed in parentheses if it contains any
function calls or operators. Key-set can also be specified as
an index set expression, in which case the
index-set dummy parameters override any dummy parameters that are
declared in the key-column(s) items. The following code creates
a data set from the PROC OPTMODEL parameter m, a matrix whose only
nonzero entries are located at (1, 1) and (4, 1):
proc optmodel;
number m{1..5, 1..3} = [[1 1] 1 [4 1] 1];
create data example
from [i j] = {setof{i in 1..2}<i**2>, {1, 2}} m;
proc print data=example noobs;
run;
The dummy parameter i in the SETOF expression takes precedence over
the dummy parameter i declared in the key-column(s) item. The
output from this code is shown in Output 6.11.
Figure 6.11: CREATE: Key-set with SETOF Aggregation Expression
If no key-set is specified, then the set of index values is
formed from the union of the index sets of the implicitly indexed
column(s). The number of index elements for each implicitly
indexed array must match the number of key-column(s). The type
of each index element (string versus numeric) must match the element of
the same position in other implicit indices.
The arrays for implicitly indexed columns in a CREATE DATA statement do not
need to have identical index sets. A missing value is supplied for
the value of an implicitly indexed array location when the implied
index value is not in the array's index set.
In the following code, the key-set is unspecified. The set of index
values is {1, 2, 3}, which is the union of the index sets of
x and y. These index sets are not identical, so missing values
are supplied when necessary. The results of this code are shown in
Output 6.12.
proc optmodel;
number x{1..2} init 2;
var y{2..3} init 3;
create data exdata from [keycol] x y;
proc print;
run;
Figure 6.12: CREATE: Unspecified Key-set
The types of the output data set variables match the types of the
source values. The output variable type for a key-column
matches the corresponding element type in the index value tuple. A
numeric element matches a NUMERIC data set variable, while a string
element matches a CHAR variable. For regular column(s), the
source expression type determines the output data set variable type.
A numeric expression produces a NUMERIC variable, while a string
expression produces a CHAR variable.
Lengths of character variables in the output data set are determined
automatically. The length is set to accommodate the longest
string value output in that column.
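The type rules above can be checked by inspecting a small output data set. This is a sketch under assumed names: the set tags, the variables tagged and ntags, and the data set typed are hypothetical.

```sas
proc optmodel;
   set<string> tags = {'a', 'bcd'};

   /* tag:    string index element -> CHAR variable    */
   /* tagged: string expression    -> CHAR variable    */
   /* ntags:  numeric expression   -> NUMERIC variable */
   create data typed
      from [tag]=tags tagged=(tag || "_x") ntags=(card(tags));
quit;

/* PROC CONTENTS lists the type and length of each variable */
proc contents data=typed;
run;
```

The PROC CONTENTS listing is one way to confirm that each character variable is long enough for the longest string written to it.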
You can use the iterated column(s) form to output selected rows of multiple
arrays, assigning a different data set variable to each column. For
example, the following code outputs the last two rows of the two-dimensional
array, a, along with corresponding elements of the one-dimensional array, b:
proc optmodel;
num m = 3; /* number of rows/observations */
num n = 4; /* number of columns in a */
num a{i in 1..m, j in 1..n} = i*j; /* compute a */
num b{i in 1..m} = i**2; /* compute b */
set<num> subset = 2..m; /* used to omit first row */
create data out
from [i]=subset {j in 1..n}<col("a"||j)=a[i,j]> b;
To specify the data set to be created, the CREATE DATA statement uses the form key-column(s) {index-set}<column(s)> column(s). The preceding code creates a data set out, which has 2 observations and
6 variables. The variables are named i, a1
through a4, and b, as shown in Output 6.13.
Figure 6.13: CREATE DATA Set: The Iterated Column Form
See the section "Data Set Input/Output" for more examples of using the CREATE DATA statement.