The OPTGRAPH Procedure

SUMMARY Statement

  • SUMMARY < options >;

The SUMMARY statement invokes an algorithm that calculates various summary metrics on an input graph.

The summary metrics are described in the section Summary.

You can specify the following options in the SUMMARY statement:

BICONCOMP

specifies whether to calculate information about biconnected components. The graph must be undirected.

BY_CLUSTER

specifies whether to decompose the calculations by cluster (or subgraph). If this option is specified, PROC OPTGRAPH looks for a definition of the clusters in the input data set specified in the DATA_NODES= option.

CONCOMP

specifies whether to calculate information about connected components.
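As a rough illustration of what a connected component is, the following Python sketch groups the nodes of a small undirected graph by breadth-first search. It is not the procedure's implementation, and the function name `connected_components` and the sample data are hypothetical. This section does not list which component metrics PROC OPTGRAPH reports, so the sketch returns only the components themselves.

```python
from collections import deque

def connected_components(nodes, links):
    """Return the components of an undirected graph as sorted node lists."""
    adjacency = {node: set() for node in nodes}
    for u, v in links:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    components = []
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from start.
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components

# Hypothetical sample graph: two separate groups of nodes.
nodes = ["A", "B", "C", "D", "E"]
links = [("A", "B"), ("B", "C"), ("D", "E")]
components = connected_components(nodes, links)
```

Here `components` holds two lists, one for the nodes A, B, C and one for D, E.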

DIAMETER_APPROX=WEIGHT | UNWEIGHT | BOTH

specifies whether to calculate information about the approximate diameter and what type of calculations to perform. Use this option when calculating the exact diameter (by calculating all shortest paths) is too expensive.

Table 1.40: Values for the DIAMETER_APPROX= Option

Option Value    Description
WEIGHT          Calculates approximate diameter based on the weighted graph.
UNWEIGHT        Calculates approximate diameter based on the unweighted graph.
BOTH            Calculates approximate diameter based on both weighted and unweighted graphs.

If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). This option works only for undirected graphs.
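To show why an approximation can be cheaper than the exact diameter, the following Python sketch compares an all-sources computation with a two-pass ("double sweep") estimate on an undirected graph. This section does not say which approximation PROC OPTGRAPH uses, so the double sweep is an assumption chosen for illustration only. All function names and the sample links are hypothetical, and the graph is assumed to be connected.

```python
import heapq

def build_adjacency(links, use_weights):
    """Build an undirected adjacency list; use 1.0 per link when unweighted."""
    adjacency = {}
    for u, v, weight in links:
        cost = weight if use_weights else 1.0
        adjacency.setdefault(u, []).append((v, cost))
        adjacency.setdefault(v, []).append((u, cost))
    return adjacency

def distances_from(adjacency, source):
    """Dijkstra's algorithm: shortest distance from source to each node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for neighbor, cost in adjacency[node]:
            candidate = d + cost
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist

def exact_diameter(adjacency):
    """Longest shortest path, found by searching from every node."""
    return max(max(distances_from(adjacency, s).values()) for s in adjacency)

def double_sweep_diameter(adjacency, start):
    """Lower bound on the diameter that needs only two searches."""
    first = distances_from(adjacency, start)
    farthest = max(first, key=first.get)
    return max(distances_from(adjacency, farthest).values())

# Hypothetical sample links: (from, to, weight).
links = [("A", "B", 1.0), ("B", "C", 5.0), ("C", "D", 1.0), ("B", "E", 1.0)]

weighted = build_adjacency(links, use_weights=True)
unweighted = build_adjacency(links, use_weights=False)

approx_weighted = double_sweep_diameter(weighted, "B")
approx_unweighted = double_sweep_diameter(unweighted, "B")
```

On this small tree the estimate matches the exact value in both cases. In general a double sweep gives only a lower bound, which is the trade-off for avoiding a search from every node.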

LOGFREQNODE=number

controls how often iteration logs are displayed for some of the summary metrics. For computationally intensive summary metrics such as shortest path, this option displays progress each time number nodes have been processed. If you also specify the BY_CLUSTER option in this statement or a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement, this option is ignored and the display frequency is determined by the LOGFREQTIME= option instead. The value of number can be any integer greater than or equal to 1. The default is determined automatically based on the size of the graph. Setting this value too low can hurt performance on large-scale graphs.

LOGFREQTIME=number

controls the frequency for displaying iteration logs for some of the summary metrics. For computationally intensive summary metrics such as shortest path, this option displays progress every number seconds. When you specify a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement, PROC OPTGRAPH displays the number of nodes that have completed. When you specify the BY_CLUSTER option, PROC OPTGRAPH displays the number of subgraphs that have completed. The value of number can be any integer greater than or equal to 1; the default is 5. Setting this value too low can hurt performance on large-scale graphs.
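The interaction between LOGFREQNODE=, LOGFREQTIME=, BY_CLUSTER, and NTHREADS= can be summarized as a small decision rule. The Python sketch below is a reading aid only. The function `progress_log_rule` and its return format are hypothetical, `None` stands in for the automatically determined LOGFREQNODE= default, and the precedence of BY_CLUSTER when both conditions hold is an assumption that the text does not state.

```python
def progress_log_rule(by_cluster, nthreads, logfreqnode=None, logfreqtime=5):
    """Return (unit, interval, what is counted) for progress messages."""
    if by_cluster:
        # Assumption: BY_CLUSTER wins when NTHREADS= is also greater than 1.
        return ("seconds", logfreqtime, "subgraphs completed")
    if nthreads > 1:
        # LOGFREQNODE= is ignored; time-based logging reports nodes.
        return ("seconds", logfreqtime, "nodes completed")
    # Single-threaded, no clusters: LOGFREQNODE= applies.
    return ("nodes", logfreqnode, "nodes completed")

single = progress_log_rule(by_cluster=False, nthreads=1, logfreqnode=1000)
threaded = progress_log_rule(by_cluster=False, nthreads=4, logfreqnode=1000)
clustered = progress_log_rule(by_cluster=True, nthreads=1, logfreqtime=10)
```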

LOGLEVEL=number | string

controls the amount of information that is displayed in the SAS log. Table 1.41 describes the valid values for this option.

Table 1.41: Values for LOGLEVEL= Option

number    string        Description
0         NONE          Turns off all algorithm-related messages in the SAS log
1         BASIC         Displays a basic summary of the algorithmic processing
2         MODERATE      Displays a summary of the algorithmic processing
3         AGGRESSIVE    Displays a detailed summary of the algorithmic processing

The default is the value that is specified in the LOGLEVEL= option in the PROC OPTGRAPH statement (or BASIC if that option is not specified).
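The fallback order for the log level can be written out as a short Python sketch: the SUMMARY statement value, then the PROC OPTGRAPH statement value, then BASIC. The names `LOG_LEVELS`, `normalize_log_level`, and `resolve_log_level` are hypothetical and exist only for this illustration.

```python
# Number-to-string mapping taken from Table 1.41.
LOG_LEVELS = {0: "NONE", 1: "BASIC", 2: "MODERATE", 3: "AGGRESSIVE"}

def normalize_log_level(value):
    """Accept either the number or the string form and return the string."""
    if isinstance(value, int):
        if value not in LOG_LEVELS:
            raise ValueError(f"invalid log level number: {value}")
        return LOG_LEVELS[value]
    name = value.upper()
    if name not in LOG_LEVELS.values():
        raise ValueError(f"invalid log level string: {value}")
    return name

def resolve_log_level(summary_level=None, proc_level=None):
    """Statement-level value first, then procedure-level, then BASIC."""
    for value in (summary_level, proc_level):
        if value is not None:
            return normalize_log_level(value)
    return "BASIC"

from_summary = resolve_log_level(summary_level=3, proc_level="none")
from_proc = resolve_log_level(proc_level="moderate")
fallback = resolve_log_level()
```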

OUT=SAS-data-set

specifies the output data set to contain the summary results.

SHORTPATH=WEIGHT | UNWEIGHT | BOTH

specifies whether to calculate information about shortest paths and what type of calculations to perform.

Table 1.42: Values for the SHORTPATH= Option

Option Value    Description
WEIGHT          Calculates shortest paths based on the weighted graph.
UNWEIGHT        Calculates shortest paths based on the unweighted graph.
BOTH            Calculates shortest paths based on both weighted and unweighted graphs.

If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight).
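The difference between the WEIGHT and UNWEIGHT calculations can be seen on a three-node example. The Python sketch below computes all-pairs shortest paths with the Floyd-Warshall algorithm, which is chosen here for brevity and is not claimed to be what PROC OPTGRAPH uses. The function name and sample links are hypothetical, and links are treated as undirected for simplicity.

```python
def all_pairs_shortest_paths(nodes, links, use_weights):
    """Floyd-Warshall on an undirected graph; returns dist[u][v]."""
    inf = float("inf")
    dist = {u: {v: (0.0 if u == v else inf) for v in nodes} for u in nodes}
    for link in links:
        u, v = link[0], link[1]
        # Use 1.0 when ignoring weights or when the link carries no weight.
        cost = link[2] if use_weights and len(link) > 2 else 1.0
        dist[u][v] = min(dist[u][v], cost)
        dist[v][u] = min(dist[v][u], cost)
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

nodes = ["A", "B", "C"]

# Hypothetical weighted links: the direct A-C link is expensive.
weighted_links = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "C", 5.0)]
weighted = all_pairs_shortest_paths(nodes, weighted_links, use_weights=True)
unweighted = all_pairs_shortest_paths(nodes, weighted_links, use_weights=False)

# The same graph without a weight on any link.
plain_links = [("A", "B"), ("B", "C"), ("A", "C")]
plain_weighted = all_pairs_shortest_paths(nodes, plain_links, use_weights=True)
plain_unweighted = all_pairs_shortest_paths(nodes, plain_links, use_weights=False)
```

With weights, the shortest path from A to C goes through B (length 2.0). Without weights, the direct link is shortest (length 1.0). When no link carries a weight, the two calculations agree.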

SUBSIZESWITCH=number

specifies the size of the subgraphs (number of nodes) to run separately when you also specify the BY_CLUSTER option in this statement and a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement. When PROC OPTGRAPH calculates the summary by subgraph, it uses threads to process n subgraphs simultaneously, where n is the number of threads specified in the NTHREADS= option in the PERFORMANCE statement. Subgraphs that have more nodes than number are instead processed sequentially, which enables the threading to be done at the summary metric level. The default is 10,000.
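The size rule can be pictured as a simple split of the subgraphs into two groups. The Python sketch below is illustrative only and assumes that BY_CLUSTER is specified and NTHREADS= is greater than 1. The function `plan_subgraph_processing` and the cluster names are hypothetical, and the sketch does no actual threading.

```python
def plan_subgraph_processing(subgraph_sizes, subsizeswitch=10_000):
    """Split subgraphs by node count relative to the SUBSIZESWITCH= value."""
    concurrent, sequential = [], []
    for name, node_count in subgraph_sizes.items():
        if node_count > subsizeswitch:
            # More nodes than the threshold: processed one at a time.
            sequential.append(name)
        else:
            # At or below the threshold: eligible to run side by side.
            concurrent.append(name)
    return concurrent, sequential

# Hypothetical clusters and their node counts.
sizes = {"cluster1": 120, "cluster2": 10_000, "cluster3": 25_000}
concurrent, sequential = plan_subgraph_processing(sizes)
```

Only `cluster3` exceeds the default threshold of 10,000 nodes, so it alone lands in the sequential group. A subgraph with exactly 10,000 nodes does not have "more nodes than number" and stays in the concurrent group.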