CENTRALITY < options >;
The CENTRALITY statement enables you to select which centrality metrics to calculate for the given input graph. It also enables you to specify options for particular metrics. The resulting metrics are included in the node output data set (specified in the OUT_NODES= option) or the link output data set (specified in the OUT_LINKS= option).
The centrality metrics are described in the section Centrality.
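As a minimal sketch of how the statement fits into a PROC OPTGRAPH call, the following example requests degree, closeness, and betweenness centrality for a small undirected graph. The data set names (LinkSetIn, NodeSetOut, LinkSetOut) and the link data are hypothetical, and the sketch assumes that the link data set uses the default variable names from, to, and weight.

```sas
/* Hypothetical link data: one row per link, with a link weight */
data LinkSetIn;
   input from $ to $ weight @@;
   datalines;
A B 1  A C 2  B C 1  C D 3  D E 1
;

proc optgraph
   data_links = LinkSetIn
   out_nodes  = NodeSetOut    /* node-level metrics are written here */
   out_links  = LinkSetOut;   /* link-level metrics are written here */
   centrality
      degree  = out
      close   = weight
      between = weight;
run;

/* Inspect the node-level results */
proc print data=NodeSetOut;
run;
```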
You can specify the following options in the CENTRALITY statement.
AUTH=WEIGHT | UNWEIGHT | BOTH
specifies which type of authority centrality to calculate.
Table 1.10: Values for the AUTH= Option
Option Value | Description |
---|---|
WEIGHT | Calculates authority centrality based on the weighted graph. |
UNWEIGHT | Calculates authority centrality based on the unweighted graph. |
BOTH | Calculates authority centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). This centrality metric can be used only for directed graphs. The authority centrality metric is described in the section Hub and Authority Scoring.
BETWEEN=WEIGHT | UNWEIGHT | BOTH
specifies which type of betweenness centrality to calculate.
Table 1.11: Values for the BETWEEN= Option
Option Value | Description |
---|---|
WEIGHT | Calculates betweenness centrality based on the weighted graph. |
UNWEIGHT | Calculates betweenness centrality based on the unweighted graph. |
BOTH | Calculates betweenness centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). If the OUT_NODES= option is specified in the PROC OPTGRAPH statement, the node betweenness metric is produced. If the OUT_LINKS= option is specified, the link betweenness metric is produced. The betweenness centrality metric is described in the section Betweenness Centrality.
BETWEEN_NORM=YES | NO
specifies whether to normalize the betweenness centrality metrics.
Table 1.12: Values for the BETWEEN_NORM= Option
Option Value | Description |
---|---|
YES | Normalizes the betweenness metrics. This is the default. |
NO | Does not normalize the betweenness metrics. |
The normalization factor for betweenness centrality is described in the section Betweenness Centrality.
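The following sketch shows one way the two betweenness options might be combined. It requests both weighted and unweighted betweenness without normalization and names both output data sets so that node and link betweenness are produced. The data set names and link data are hypothetical, and the default variable names from, to, and weight are assumed.

```sas
/* Hypothetical weighted links */
data BetweenLinks;
   input from $ to $ weight @@;
   datalines;
A B 2  B C 1  C D 4  B D 1  D E 2
;

proc optgraph
   data_links = BetweenLinks
   out_nodes  = NodeBetween    /* requests node betweenness */
   out_links  = LinkBetween;   /* requests link betweenness */
   centrality
      between      = both      /* weighted and unweighted variants */
      between_norm = no;       /* report raw, unnormalized values */
run;
```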
BY_CLUSTER
decomposes the calculations by cluster (or subgraph). If this option is specified, PROC OPTGRAPH looks for a definition of the clusters in the input data set specified by the DATA_NODES= option in the PROC OPTGRAPH statement. The use of the BY_CLUSTER option is described in the section Processing by Cluster.
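A hedged sketch of cluster-wise processing follows. It assumes that the node data set identifies clusters in a variable named cluster and nodes in a variable named node; these variable names, the data set names, and the data are assumptions for illustration, so check the section Processing by Cluster for the exact requirements.

```sas
/* Hypothetical links forming two groups joined by one link */
data ClusterLinks;
   input from $ to $ @@;
   datalines;
A B  B C  A C  D E  E F  D F  C D
;

/* Hypothetical cluster definition: one row per node (variable names assumed) */
data ClusterNodes;
   input node $ cluster @@;
   datalines;
A 1  B 1  C 1  D 2  E 2  F 2
;

proc optgraph
   data_links = ClusterLinks
   data_nodes = ClusterNodes   /* supplies the cluster definition */
   out_nodes  = NodeByCluster;
   centrality
      by_cluster               /* compute metrics within each cluster */
      close = unweight;
run;
```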
CLOSE=WEIGHT | UNWEIGHT | BOTH
specifies which type of closeness centrality to calculate.
Table 1.13: Values for the CLOSE= Option
Option Value | Description |
---|---|
WEIGHT | Calculates closeness centrality based on the weighted graph. |
UNWEIGHT | Calculates closeness centrality based on the unweighted graph. |
BOTH | Calculates closeness centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). The closeness centrality metric is described in the section Closeness Centrality.
CLOSE_NOPATH=NNODES | DIAMETER | ZERO | HARMONIC
specifies how to account for the shortest path distance between two nodes when no path exists between them (disconnected nodes).
Table 1.14: Values for the CLOSE_NOPATH= Option
Option Value | Description |
---|---|
NNODES | Uses the number of nodes as the shortest path distance between disconnected nodes. This option cannot be used in calculating weighted closeness centrality. |
DIAMETER | Uses the graph diameter (plus one) as the shortest path distance between disconnected nodes. This is the default. |
ZERO | Uses zero as the shortest path distance between disconnected nodes. |
HARMONIC | Uses the harmonic formula for closeness centrality. |
For each option, there is a slight variation in the formula for the closeness centrality metric. These differences are described in the section Closeness Centrality.
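To illustrate where the CLOSE_NOPATH= option matters, the following sketch uses a hypothetical graph with two disconnected components and requests unweighted closeness with the harmonic formula. The data set names and data are illustrative assumptions. Unweighted closeness is used so that the same call would also accept CLOSE_NOPATH=NNODES, which is not allowed with weighted closeness.

```sas
/* Hypothetical graph with two components: {A,B,C} and {D,E} */
data SplitLinks;
   input from $ to $ @@;
   datalines;
A B  B C  D E
;

proc optgraph
   data_links = SplitLinks
   out_nodes  = NodeClose;
   centrality
      close        = unweight
      close_nopath = harmonic;  /* handling of node pairs with no path */
run;
```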
calculates the node clustering coefficient. The clustering coefficient is described in the section Clustering Coefficient.
DEGREE=IN | OUT | BOTH
specifies which type of degree centrality to calculate for the input graph.
Table 1.15: Values for the DEGREE= Option
Option Value | Description |
---|---|
IN | Calculates degree based on in-links. |
OUT | Calculates degree based on out-links. |
BOTH | Calculates degree based on in-links and out-links. |
For an undirected graph, the option values IN and BOTH are ignored, because there is only one notion of degree, which corresponds to the degree of out-links. The degree centrality metric is described in the section Degree Centrality.
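Because IN and BOTH are meaningful only when links have a direction, a sketch of the directed case might look like the following. The data set names and data are hypothetical, and the graph is declared as directed by using the GRAPH_DIRECTION= option in the PROC OPTGRAPH statement.

```sas
/* Hypothetical directed links: each row is a link from -> to */
data DirectedLinks;
   input from $ to $ @@;
   datalines;
A B  A C  B C  C A  D C
;

proc optgraph
   graph_direction = directed   /* treat each row as a one-way link */
   data_links      = DirectedLinks
   out_nodes       = NodeDegree;
   centrality
      degree = both;            /* in-degree and out-degree */
run;
```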
EIGEN=WEIGHT | UNWEIGHT | BOTH
specifies which type of eigenvector centrality to calculate.
Table 1.16: Values for the EIGEN= Option
Option Value | Description |
---|---|
WEIGHT | Calculates eigenvector centrality based on the weighted graph. |
UNWEIGHT | Calculates eigenvector centrality based on the unweighted graph. |
BOTH | Calculates eigenvector centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). This centrality metric can be used only for undirected graphs. The eigenvector centrality metric is described in the section Eigenvector Centrality.
EIGEN_ALGORITHM=AUTOMATIC | JACOBI_DAVIDSON | POWER
specifies the algorithm to use in calculating centrality metrics that require solving eigensystems (EIGEN=, HUB=, and AUTH=).
Table 1.17: Values for the EIGEN_ALGORITHM= Option
Option Value | Description |
---|---|
AUTOMATIC | Requests that PROC OPTGRAPH automatically determine the eigensolver to use. This is the default. |
JACOBI_DAVIDSON (JD) | Uses a variant of the Jacobi-Davidson algorithm for solving eigensystems (Sleijpen and van der Vorst 2000). This is used as the default for the eigenvector metric on undirected graphs and the hub and authority metrics. |
POWER | Uses the power method to calculate eigenvectors. This is used as the default for the eigenvector metric on directed graphs. |
EIGEN_MAXITER=number
specifies the maximum number of iterations for eigenvector calculations, which limits the computation time when convergence is slow. By default, EIGEN_MAXITER=10,000.
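The eigensolver options can be combined with the EIGEN= option, as in the following sketch. It overrides the automatic solver choice with the power method and lowers the iteration cap. The data set names, the data, and the cap of 500 are illustrative assumptions, not recommended settings.

```sas
/* Hypothetical weighted, undirected links */
data EigenLinks;
   input from $ to $ weight @@;
   datalines;
A B 1  A C 3  B C 2  C D 1  D E 2  C E 1
;

proc optgraph
   data_links = EigenLinks
   out_nodes  = NodeEigen;
   centrality
      eigen           = weight
      eigen_algorithm = power   /* override the AUTOMATIC default */
      eigen_maxiter   = 500;    /* stop after 500 iterations */
run;
```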
HUB=WEIGHT | UNWEIGHT | BOTH
specifies which type of hub centrality to calculate.
Table 1.18: Values for the HUB= Option
Option Value | Description |
---|---|
WEIGHT | Calculates hub centrality based on the weighted graph. |
UNWEIGHT | Calculates hub centrality based on the unweighted graph. |
BOTH | Calculates hub centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). This centrality metric can be used only for directed graphs. The hub centrality metric is described in the section Hub and Authority Scoring.
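Hub and authority scores are both restricted to directed graphs, so they are often requested together. The following sketch does this for a small hypothetical directed graph; the data set names and data are assumptions for illustration.

```sas
/* Hypothetical directed links, for example pages linking to pages */
data WebLinks;
   input from $ to $ @@;
   datalines;
A C  B C  D C  C E  A E  B D
;

proc optgraph
   graph_direction = directed   /* HUB= and AUTH= require a directed graph */
   data_links      = WebLinks
   out_nodes       = NodeHubAuth;
   centrality
      hub  = unweight
      auth = unweight;
run;
```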
INFLUENCE=WEIGHT | UNWEIGHT | BOTH
specifies which type of influence centrality to calculate.
Table 1.19: Values for the INFLUENCE= Option
Option Value | Description |
---|---|
WEIGHT | Calculates influence centrality based on the weighted graph. |
UNWEIGHT | Calculates influence centrality based on the unweighted graph. |
BOTH | Calculates influence centrality based on both weighted and unweighted graphs. |
If the input graph does not contain weights, then WEIGHT and UNWEIGHT both give the same results (using 1.0 for each link weight). The influence centrality metric is described in the section Influence Centrality.
LOGFREQNODE=number
controls the frequency for displaying iteration logs for some of the centrality metrics. For computationally intensive algorithms such as betweenness and closeness centrality, this option displays progress every number nodes. If you also specify the BY_CLUSTER option in this statement or a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement, this option is ignored and the display frequency is determined by the LOGFREQTIME= option instead. The value of number can be any integer greater than or equal to 1; the default is determined automatically based on the size of the graph. Setting this value too low can hurt performance on large-scale graphs.
LOGFREQTIME=number
controls the frequency for displaying iteration logs for some of the centrality metrics. For computationally intensive algorithms such as betweenness and closeness centrality, this option displays progress every number seconds. If you specify a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement, PROC OPTGRAPH displays the number of nodes that have completed. If you specify the BY_CLUSTER option, PROC OPTGRAPH displays the number of subgraphs that have completed. The value of number can be any integer greater than or equal to 1; the default is 5. Setting this value too low can hurt performance on large-scale graphs.
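The interaction between the two logging options and threading can be sketched as follows. Because NTHREADS= is greater than 1 here, progress reporting is governed by LOGFREQTIME= rather than LOGFREQNODE=. The data set names, the data, the thread count, and the 10-second interval are illustrative assumptions; on a graph this small, no progress lines are likely to appear.

```sas
/* Hypothetical links; a real use case would be a much larger graph */
data BigLinks;
   input from $ to $ @@;
   datalines;
A B  B C  C D  D E  E A  A C  B D
;

proc optgraph
   data_links = BigLinks
   out_nodes  = NodeOut;
   performance nthreads = 4;   /* more than one thread: LOGFREQNODE= is ignored */
   centrality
      between     = unweight
      close       = unweight
      logfreqtime = 10         /* report progress every 10 seconds */
      loglevel    = moderate;  /* include the progress log in the SAS log */
run;
```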
LOGLEVEL=number | string
controls the amount of information that is displayed in the SAS log. Table 1.20 describes the valid values for this option.
Table 1.20: Values for LOGLEVEL= Option
number | string | Description |
---|---|---|
0 | NONE | Turns off all algorithm-related messages in the SAS log |
1 | BASIC | Displays a basic summary of the algorithmic processing |
2 | MODERATE | Displays a summary of the algorithmic processing including a progress log using the interval that is specified in the LOGFREQNODE= or LOGFREQTIME= option |
3 | AGGRESSIVE | Displays a detailed summary of the algorithmic processing including a progress log using the interval that is specified in the LOGFREQNODE= or LOGFREQTIME= option |
The default is the value that is specified in the LOGLEVEL= option in the PROC OPTGRAPH statement (or BASIC if that option is not specified).
specifies the size of the subgraphs (number of nodes) to run separately when you also specify the BY_CLUSTER option in this statement and a value greater than 1 for the NTHREADS= option in the PERFORMANCE statement. When PROC OPTGRAPH processes centrality metrics by subgraph, it uses thread logic to simultaneously process n subgraphs, where n is the number of threads specified in the NTHREADS= option in the PERFORMANCE statement. Subgraphs that have more nodes than number are processed sequentially, enabling the threading to be done at the centrality metric level. The default is 10,000.
specifies the data set variable name for a second link weight. The value of column must be numeric. The use of this option is described in more detail in the section Weight Interpretation.