The Decomposition Algorithm

Example 14.9 ATM Cash Management in Distributed Mode

This section illustrates how you can use PROC OPTMODEL and the decomposition algorithm in distributed mode. The problem is the same as the one described in Example 14.8 for managing the cash flow of an ATM network. The only difference between single-machine and distributed mode is that the PERFORMANCE statement specifies the number of threads to be used in single-machine mode or the number of threads and nodes to be used in distributed mode.

The following statement changes the operating mode to distributed mode:

   /* set the number of nodes and threads and get performance details */
   performance details nodes=5 nthreads=4;

The performance information is displayed in Output 14.9.1. When you specify NODES=5 and NTHREADS=4 in the PERFORMANCE statement in distributed mode, each grid node processes up to four threads simultaneously.

Output 14.9.1: Performance Information

Performance Information
   
Host Node << your grid host >>
Execution Mode Distributed
Grid Mode Symmetric
Number of Compute Nodes 5
Number of Threads per Node 4


The solution summary and procedure task timing tables are displayed in Output 14.9.2.

Output 14.9.2: Performance Information, Solution Summary, and Task Timing Tables

Solution Summary
Solver MILP
Algorithm Decomposition
Objective Function CashFlowDiff
Solution Status Optimal within Relative Gap
Objective Value 2463699.6896
   
Relative Gap 0.0000626319
Absolute Gap 154.29644769
Primal Infeasibility 1.20508E-11
Bound Infeasibility 4.440892E-16
Integer Infeasibility 7.882583E-15
   
Best Bound 2463545.3932
Nodes 14
Iterations 22
Presolve Time 0.21
Solution Time 98.49

Procedure Task Timing
Task Time
(sec.)
% Time
Problem Generation 0.04 0.04%
Solver Initialization 0.09 0.09%
Code Generation 0.00 0.00%
Solver 98.50 99.61%
Solver Postprocessing 0.26 0.26%

Performance Information
   
Host Node << your grid host >>
Execution Mode Distributed
Grid Mode Symmetric
Number of Compute Nodes 5
Number of Threads per Node 4


The iteration log, which contains the problem statistics, the progress of the solution, and the optimal objective value, is shown in Output 14.9.3.

Output 14.9.3: Log

NOTE: There were 100 observations read from the data set WORK.BUDGET_DATA.            
NOTE: There were 20 observations read from the data set WORK.CASHOUT_DATA.            
NOTE: There were 2000 observations read from the data set WORK.POLYFIT_DATA.          
NOTE: Problem generation will use 4 threads.                                          
NOTE: The problem has 6480 variables (0 free, 0 fixed).                               
NOTE: The problem has 2220 binary and 0 integer variables.                            
NOTE: The problem has 4380 linear constraints (2340 LE, 2040 EQ, 0 GE, 0 range).      
NOTE: The problem has 58878 linear constraint coefficients.                           
NOTE: The problem has 0 nonlinear constraints (0 LE, 0 EQ, 0 GE, 0 range).            
NOTE: The MILP presolver value AUTOMATIC is applied.                                  
NOTE: The MILP presolver removed 502 variables and 345 constraints.                   
NOTE: The MILP presolver removed 1172 constraint coefficients.                        
NOTE: The MILP presolver modified 0 constraint coefficients.                          
NOTE: The presolved problem has 5978 variables, 4035 constraints, and 57706           
      constraint coefficients.                                                        
NOTE: The MILP solver is called.                                                      
NOTE: The Decomposition algorithm is used.                                            
NOTE: The Decomposition algorithm is executing in the distributed computing           
      environment with 5 worker nodes.                                                
NOTE: The DECOMP method value USER is applied.                                        
NOTE: The problem has a decomposable structure with 20 blocks. The largest block      
      covers 5.18% of the constraints in the problem.                                 
NOTE: The decomposition subproblems cover 5978 (100.00%) variables and 3935 (97.52%)  
      constraints.                                                                    
NOTE: The deterministic parallel mode is enabled.                                     
NOTE: The Decomposition algorithm is using up to 4 threads.                           
      Iter         Best       Master         Best       LP       IP  Real             
                  Bound    Objective      Integer      Gap      Gap  Time             
NOTE: Starting phase 1.                                                               
         1       0.0000       1.1767            . 1.18e+00        .    16             
         2       0.0000       0.0000            .    0.00%        .    16             
NOTE: Starting phase 2.                                                               
         .   2.4432e+06   2.7074e+06   2.7754e+06   10.82%   13.60%    16             
         4   2.4526e+06   2.4878e+06   2.7754e+06    1.43%   13.16%    26             
         5   2.4630e+06   2.4642e+06   2.7754e+06    0.05%   12.68%    30             
NOTE: The Decomposition algorithm stopped on the continuous RELOBJGAP= option.        
         .   2.4630e+06   2.4632e+06   2.4660e+06    0.01%    0.12%    30             
         6   2.4630e+06   2.4632e+06   2.4660e+06    0.01%    0.12%    30             
NOTE: Starting branch and bound.                                                      
         Node  Active   Sols         Best         Best      Gap    Real               
                                  Integer        Bound             Time               
            0       1      2   2.4660e+06   2.4630e+06    0.12%      30               
            9       9      3   2.4654e+06   2.4635e+06    0.07%      69               
           10      10      3   2.4654e+06   2.4635e+06    0.07%      73               
           13       5      4   2.4637e+06   2.4635e+06    0.01%      87               
NOTE: The Decomposition algorithm used 4 threads.                                     
NOTE: The Decomposition algorithm time is 87.17 seconds.                              
NOTE: Optimal within relative gap.                                                    
NOTE: Objective = 2463699.6896.                                                       
NOTE: The data set WORK.PERFINFO has 5 observations and 3 variables.                  


Notice how this iteration log differs from the iteration log from single-machine mode in Example 14.8. In distributed mode, the processing is done on multiple grid machines, as opposed to being done on one client machine in single-machine mode. In this example, the grid machines and the client machine have different operating systems, and some numerical rounding off leads to different paths in the search space. When you compare two runs on different operating systems (or that use different compilers), this behavior is expected.