This example demonstrates how to rewrite certain models for more efficient processing. Sometimes optimization models that run out of memory during problem generation can be rewritten to take advantage of sparsity to use memory more efficiently. This often occurs when a large array is modeled in a dense format but most of its entries are zeros. Usually, the array provides problem coefficients or it contains optimization variables.
The model for this example solves the facility location problem that is described in Example 8.3. This example is concerned with the resources that are required for PROC OPTMODEL problem generation and solver initialization. So the size of the problem has been increased, but the model has also been modified to make it easier to solve. In order to handle the larger problem size, the model eliminates a large number of the potential assignments of customers to facilities based on distance, making the problem sparse.
The following code generates a random instance of the facility location problem:
%let NumCustomers = 1500; %let NumSites = 250; %let SiteCapacity = 50; %let MaxDemand = 10; %let xmax = 200; %let ymax = 100; %let seed = 938; /* generate random customer locations */ data cdata(drop=i); length name $8; call streaminit(&seed); do i = 1 to &NumCustomers; name = compress('C'||put(i,best.)); x = rand('UNIFORM') * &xmax; y = rand('UNIFORM') * &ymax; demand = 1; output; end; run; /* generate random site locations and fixed charge */ data sdata(drop=i); length name $8; call streaminit(&seed); do i = 1 to &NumSites; name = compress('SITE'||put(i,best.)); x = rand('UNIFORM') * &xmax; y = rand('UNIFORM') * &ymax; fixed_charge = 300 * (abs(&xmax/2-x)/&xmax + abs(&ymax/2-y)/&ymax); output; end; run;
The following code uses a dense version of the facility location model. This model is equivalent to the model from Example 8.3 except for the added constraint distance_at_most_30
. This constraint eliminates from consideration the assignment of customers to facilities over long distances by forcing the
corresponding Assign
variables to 0.
proc optmodel; performance details; profile on percent=0.1; set <str> CUSTOMERS; set <str> SITES init {}; /* x and y coordinates of CUSTOMERS and SITES */ num x {CUSTOMERS union SITES}; num y {CUSTOMERS union SITES}; num demand {CUSTOMERS}; num fixed_charge {SITES}; /* distance from customer i to site j */ num dist {i in CUSTOMERS, j in SITES} = sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2); read data cdata into CUSTOMERS=[name] x y demand; read data sdata into SITES=[name] x y fixed_charge; var Assign {CUSTOMERS, SITES} binary; var Build {SITES} binary; min CostNoFixedCharge = sum {i in CUSTOMERS, j in SITES} dist[i,j] * Assign[i,j]; min CostFixedCharge = CostNoFixedCharge + sum {j in SITES} fixed_charge[j] * Build[j]; /* each customer assigned to exactly one site */ con assign_def {i in CUSTOMERS}: sum {j in SITES} Assign[i,j] = 1; /* if customer i assigned to site j, then facility must be built at j */ con link {i in CUSTOMERS, j in SITES}: Assign[i,j] <= Build[j]; /* each site can handle at most &SiteCapacity demand */ con capacity {j in SITES}: sum {i in CUSTOMERS} demand[i] * Assign[i,j] <= &SiteCapacity * Build[j]; /* do not assign customer to site more than 30 units away */ con distance_at_most_30 {i in CUSTOMERS, j in SITES: dist[i,j] > 30}: Assign[i,j] = 0; /* solve the MILP */ solve with milp/timetype=real; quit;
If you inspect the log after running the preceding code, then you will see that the MILP presolver has pruned down the problem size considerably. If you also run the code with the SAS option FULLSTIMER enabled on a 64-bit system, then you will notice that about 1.3GB of memory is required for the PROC OPTMODEL step when you are running on a single CPU.
The solution and timing results for the dense model are shown in Output 5.7.1. The PERFORMANCE DETAILS statement from the model requests display of the task timing table for the SOLVE statement. The PROFILE ON statement requests further timing details, including evaluation time for declarations used by problem generation.
Output 5.7.1: Dense Model Results
Solution Summary | |
---|---|
Solver | MILP |
Algorithm | Branch and Cut |
Objective Function | CostFixedCharge |
Solution Status | Optimal within Relative Gap |
Objective Value | 18001.14035 |
Relative Gap | 0.000098081 |
Absolute Gap | 1.7653963848 |
Primal Infeasibility | 1.14871E-12 |
Bound Infeasibility | 9.9999999E-7 |
Integer Infeasibility | 1E-6 |
Best Bound | 17999.374954 |
Nodes | 5 |
Iterations | 40131 |
Presolve Time | 1.20 |
Solution Time | 62.73 |
Procedure Task Timing | ||
---|---|---|
Task | Time (sec.) |
% Time |
Problem Generation | 4.57 | 6.64% |
Solver Initialization | 0.90 | 1.30% |
Code Generation | 0.06 | 0.08% |
Presolve | 1.20 | 1.75% |
Root Node Processing | 50.70 | 73.62% |
Branch And Cut | 9.32 | 13.54% |
Synchronization | 0.20 | 0.30% |
Idle | 1.13 | 1.64% |
Other Tasks | 0.17 | 0.25% |
Solver Postprocessing | 0.62 | 0.89% |
Profile Information | ||||||
---|---|---|---|---|---|---|
Item | Line | Col. | Execution Count |
Net Time (sec.) |
Wait Time (sec.) |
% Total Time |
SOLVE | 3546 | 4 | 1 | 64.85 | 2.17 | 85.9% |
Constraint distance_at_most_30 | 3542 | 8 | 5.59 | 0.00 | 7.4% | |
Constraint link | 3533 | 8 | 3.22 | 0.00 | 4.3% | |
Number dist | 3514 | 8 | 0.52 | 0.00 | 0.7% | |
Constraint capacity | 3537 | 8 | 0.38 | 0.00 | 0.5% | |
Min CostNoFixedCharge | 3523 | 8 | 0.37 | 0.00 | 0.5% | |
Var Assign | 3520 | 8 | 0.29 | 0.00 | 0.4% | |
Constraint assign_def | 3529 | 8 | 0.28 | 0.00 | 0.4% | |
Other profiled items | 0.01 | 0.00 | 0.0% |
Note: | Total profiled time is 75.51 seconds. |
The best approach for reducing the memory requirements is to eliminate the Assign
variables that are always going to be 0. This is accomplished in the following sparse version of the code. Instead of indexing
Assign
over the crossproduct of CUSTOMERS
and SITES
, now the code defines a new set of pairs that satisfy the distance requirement, CUSTOMERS_SITES
. This set replaces the constraint distance_at_most_30
in the dense model. The objective and constraints have been modified to use the new indexing scheme, with implicit set slicing
(as described in the section More on Index Sets) for constraints assign_def
and capacity
.
proc optmodel; performance details; profile on percent=0.1; set <str> CUSTOMERS; set <str> SITES init {}; /* x and y coordinates of CUSTOMERS and SITES */ num x {CUSTOMERS union SITES}; num y {CUSTOMERS union SITES}; num demand {CUSTOMERS}; num fixed_charge {SITES}; /* distance from customer i to site j */ num dist {i in CUSTOMERS, j in SITES} = sqrt((x[i] - x[j])^2 + (y[i] - y[j])^2); read data cdata into CUSTOMERS=[name] x y demand; read data sdata into SITES=[name] x y fixed_charge; set CUSTOMERS_SITES = {i in CUSTOMERS, j in SITES: dist[i,j] <= 30}; var Assign {CUSTOMERS_SITES} binary; var Build {SITES} binary; min CostNoFixedCharge = sum {<i,j> in CUSTOMERS_SITES} dist[i,j] * Assign[i,j]; min CostFixedCharge = CostNoFixedCharge + sum {j in SITES} fixed_charge[j] * Build[j]; /* each customer assigned to exactly one site */ con assign_def {i in CUSTOMERS}: sum {<(i),j> in CUSTOMERS_SITES} Assign[i,j] = 1; /* if customer i assigned to site j, then facility must be built at j */ con link {<i,j> in CUSTOMERS_SITES}: Assign[i,j] <= Build[j]; /* each site can handle at most &SiteCapacity demand */ con capacity {j in SITES}: sum {<i,(j)> in CUSTOMERS_SITES} demand[i] * Assign[i,j] <= &SiteCapacity * Build[j]; /* solve the MILP */ solve with milp/timetype=real; quit;
The log from running the preceding code shows that the MILP presolver does not find anything to improve with this version of the model. On a 64-bit system, the FULLSTIMER option shows that memory requirements have been reduced to about 580MB when you are running on a single CPU, less than half the requirements of the previous model.
The solution and timing results for the dense model are shown in Output 5.7.2. Note that the dense model (Output 5.7.1) and the sparse model (Output 5.7.2) are equivalent after presolver processing and generate the same result using similar amounts of solver time. On the other hand, problem generation time is significantly reduced as are other times including presolve time. Both models used the solver option TIMETYPE=REAL so that all times are reported in seconds of real time. You can see from the "Profile Information" table that the overhead associated with problem declarations has been significantly reduced.
Output 5.7.2: Sparse Model Results
Solution Summary | |
---|---|
Solver | MILP |
Algorithm | Branch and Cut |
Objective Function | CostFixedCharge |
Solution Status | Optimal within Relative Gap |
Objective Value | 18001.14035 |
Relative Gap | 0.000098081 |
Absolute Gap | 1.7653963848 |
Primal Infeasibility | 1.14871E-12 |
Bound Infeasibility | 9.9999999E-7 |
Integer Infeasibility | 1E-6 |
Best Bound | 17999.374954 |
Nodes | 5 |
Iterations | 40131 |
Presolve Time | 0.29 |
Solution Time | 60.60 |
Procedure Task Timing | ||
---|---|---|
Task | Time (sec.) |
% Time |
Problem Generation | 1.19 | 1.92% |
Solver Initialization | 0.10 | 0.17% |
Code Generation | 0.00 | 0.01% |
Presolve | 0.29 | 0.47% |
Root Node Processing | 49.82 | 80.44% |
Branch And Cut | 9.02 | 14.57% |
Synchronization | 0.20 | 0.32% |
Idle | 1.10 | 1.77% |
Other Tasks | 0.17 | 0.28% |
Solver Postprocessing | 0.04 | 0.06% |
Profile Information | ||||||
---|---|---|---|---|---|---|
Item | Line | Col. | Execution Count |
Net Time (sec.) |
Wait Time (sec.) |
% Total Time |
SOLVE | 3596 | 4 | 1 | 60.79 | 0.12 | 97.5% |
Number dist | 3567 | 8 | 0.49 | 0.00 | 0.8% | |
Set CUSTOMERS_SITES | 3573 | 8 | 0.46 | 0.00 | 0.7% | |
Constraint link | 3587 | 8 | 0.35 | 0.00 | 0.6% | |
Constraint assign_def | 3583 | 8 | 0.08 | 0.00 | 0.1% | |
Constraint capacity | 3591 | 8 | 0.07 | 0.00 | 0.1% | |
Other profiled items | 0.08 | 0.00 | 0.1% |
Note: | Total profiled time is 62.32 seconds. |