A spanning tree of a connected undirected graph is a subgraph that is a tree that connects all the nodes together. When weights have been assigned to the links, a minimum spanning tree (MST) is a spanning tree whose sum of link weights is less than or equal to the sum of link weights of every other spanning tree. More generally, any undirected graph (not necessarily connected) has a minimum spanning forest, which is a union of minimum spanning trees of its connected components.
In PROC OPTNET, you can invoke the minimum spanning tree algorithm by using the MINSPANTREE statement. The options for this statement are described in the section MINSPANTREE Statement. This algorithm can be used only on undirected graphs.
The resulting minimum spanning tree is contained in the output data set that is specified in the OUT= option in the MINSPANTREE statement.
The minimum spanning tree algorithm reports status information in a macro variable called _OROPTNET_MST_. See the section Macro Variable _OROPTNET_MST_ for more information about this macro variable.
PROC OPTNET uses Kruskal’s algorithm (Kruskal 1956) to compute the minimum spanning tree. This algorithm runs in time and therefore should scale to very large graphs.
As a simple example, consider the weighted undirected graph in Figure 2.46.
The links data set can be represented as follows:
data LinkSetIn; input from $ to $ weight @@; datalines; A B 7 A D 5 B C 8 B D 9 B E 7 C E 5 D E 15 D F 6 E F 8 E G 9 F G 11 H I 1 I J 3 H J 2 ;
The following statements calculate a minimum spanning forest and output the results in the data set MinSpanForest
:
proc optnet data_links = LinkSetIn; minspantree out = MinSpanForest; run;
The data set MinSpanForest
now contains the links that belong to a minimum spanning forest, which is shown in Figure 2.47.
The minimal cost links are shown in green in Figure 2.48.
For a more detailed example, see Minimum Spanning Tree for Computer Network Topology.