Data Set Input/Output

You can use the CREATE DATA and READ DATA statements to exchange PROC OPTMODEL data with SAS data sets. The statements can move data into and out of PROC OPTMODEL parameters and variables. For example, the following statements use a CREATE DATA statement to save the results from an optimization into a data set:

   proc optmodel;
      var x;
      min z = (x-5)**2;
      solve;
      create data optdata from xopt=x z;

These statements write a single observation into the data set OPTDATA. The data set contains two variables, xopt and z, and the values contain the optimized values of the PROC OPTMODEL variable x and objective z, respectively. The statement xopt=x renames the variable x to xopt.

The group of values held by a data set variable in different observations of a data set is referred to as a column. The READ DATA and CREATE DATA statements specify a set of columns for a data set and define how data are to be transferred between the columns and PROC OPTMODEL parameters.

Columns in square brackets ([ ]) are handled specially. Such columns are called key columns. Key columns specify element values that provide an implicit index for subsequent array columns. The following example uses key columns with the CREATE DATA statement to write out variable values from an array:

proc optmodel;
   set LOCS = {'New York', 'Washington', 'Boston'};  /* locations */
   set DOW = 1..7;  /* day of week */
   var s{LOCS, DOW} init 1;
   create data soldata from [location day_of_week]={LOCS, DOW} sale=s;

In this case the optimization variable s is initialized to a value of 1 and is indexed by values from the set parameters LOCS and DOW. The output data set contains an observation for each combination of values in these sets. The output data set contains three variables, location, day_of_week, and sale. The data set variables location and day_of_week save the index element values for the optimization variable s that is written in each observation. The data set created is shown in Figure 5.38.

Figure 5.38: Data Sets Created

Data Set: SOLDATA

Obs location day_of_week sale
1 New York 1 1
2 New York 2 1
3 New York 3 1
4 New York 4 1
5 New York 5 1
6 New York 6 1
7 New York 7 1
8 Washington 1 1
9 Washington 2 1
10 Washington 3 1
11 Washington 4 1
12 Washington 5 1
13 Washington 6 1
14 Washington 7 1
15 Boston 1 1
16 Boston 2 1
17 Boston 3 1
18 Boston 4 1
19 Boston 5 1
20 Boston 6 1
21 Boston 7 1


Note that the key columns in the preceding example do not name existing PROC OPTMODEL variables. They create new local dummy parameters, location and day_of_week, in the same manner as dummy parameters in index sets. These local parameters can be used in subsequent columns. For example, the following statements demonstrate how to use a key column value in an expression for a later column value:

   proc optmodel;
      create data tab
             from [i]=(1..10) 
             Square=(i*i) Cube=(i*i*i);

These statements create a data set that has 10 observations that hold squares and cubes of the numbers from 1 to 10. The key column variable here is named i and is explicitly assigned the values from 1 to 10, while the data set variables Square and Cube hold the square and cube, respectively, of the corresponding value of i.

In the preceding example the key column values are simply the numbers from 1 to 10. The value is the same as the observation number, so the variable i is redundant. You can remove the data set variable for a key column via the DROP data set option, as follows:

   proc optmodel;
      create data tab2 (drop=i)
             from [i] =(1..10)
             Square=(i*i) Cube=(i*i*i);

The local parameters declared by key columns receive their values in various ways. For a READ DATA statement, the key column values come from the data set variables for the column. In a CREATE DATA statement, the values can be defined explicitly, as shown in the previous example. Otherwise, the CREATE DATA statement generates a set of values that combines the index sets of array columns that need implicit indexing. The statements that produce the output in Figure 5.38 demonstrate implicit indexing.

Use a suffix (Suffixes) to read or write auxiliary values, such as variable bounds or constraint duals. For example, consider the following statements:

data pdat;
   input p $ maxprod cost;
   datalines;
ABQ    12  0.7
MIA     9  0.6
CHI    14  0.5
run;

proc optmodel;
   set<string> plants;
   var prod{plants} >= 0;
   number cost{plants};
   read data pdat into plants=[p] prod.ub=maxprod cost;

The statement plants=[p] in the READ DATA statement declares p as a key column and instructs PROC OPTMODEL to store the set of plant names from the data set variable p into the set parameter plants. The statement assigns the upper bound for the variable prod indexed by p to be the value of the data set variable maxprod. The cost parameter location indexed by p is also assigned to be the value of the data set variable cost.

The target variables prod and cost in the preceding example use implicit indexing. Indexing can also be performed explicitly. The following version of the READ DATA statement makes the indices explicit:

   read data pdat into plants=[p] prod[p].ub=maxprod cost[p];

Explicit indexing is useful when array indices need to be transformed from the key column values in the data set. For example, the following statements reverse the order in which elements from the data set are stored in an array:

data abcd;
   input letter $ @@;
   datalines;
a b c d
;
proc optmodel;
   set<num> subscripts=1..4;
   string  letter{subscripts};
   read data abcd into [_N_] letter[5-_N_];
   print letter;

The output from this example appears in Figure 5.39.

Figure 5.39: READ DATA Statement: Explicit Indexing

[1] letter
1 d
2 c
3 b
4 a


The following example demonstrates the use of explicit indexing to save sequential subsets of an array in individual data sets:

data revdata;
   input month rev @@;
   datalines;
1 200 2 345 3 362 4 958
5 659 6 804 7 487 8 146
9 683 10 732 11 652 12 469
;
proc optmodel;
   set m = 1..3;
   var revenue{1..12};
   read data revdata into [_N_] revenue=rev; 
   create data qtr1 from [month]=m revenue[month];
   create data qtr2 from [month]=m revenue[month+3];
   create data qtr3 from [month]=m revenue[month+6];
   create data qtr4 from [month]=m revenue[month+9];

Each CREATE DATA statement generates a data set that represents one quarter of the year. Each data set contains the variables month and revenue. The data set qtr2 is shown in Figure 5.40.

Figure 5.40: CREATE DATA Statement: Explicit Indexing

Obs month revenue
1 1 958
2 2 659
3 3 804