The MODECLUS Procedure
This example uses distance data and illustrates the use of the TRANSPOSE procedure and the DATA step to fill in the upper triangle of the distance matrix. The results are displayed in Output 57.2.1 through Output 57.2.3.
The following statements produce Output 57.2.1:
    title 'Modeclus Analysis of 10 American Cities';
    title2 'Based on Flying Mileages';

    data mileages(type=distance);
       input (Atlanta Chicago Denver Houston LosAngeles
              Miami NewYork SanFrancisco Seattle DC) (5.)
              @53 City $15.;
       datalines;
        0                                               Atlanta
      587    0                                          Chicago
     1212  920    0                                     Denver
      701  940  879    0                                Houston
     1936 1745  831 1374    0                           Los Angeles
      604 1188 1726  968 2339    0                      Miami
      748  713 1631 1420 2451 1092    0                 New York
     2139 1858  949 1645  347 2594 2571    0            San Francisco
     2182 1737 1021 1891  959 2734 2408  678    0       Seattle
      543  597 1494 1220 2300  923  205 2442 2329    0  Washington D.C.
    ;
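
    *-----Fill in Upper Triangle of Distance Matrix---------------;
    proc transpose out=tran;
       copy city;
    run;

    data mileages(type=distance);
       /* One-to-one merge: row i of TRAN holds column i of the lower triangle */
       merge mileages tran;
       array var[*] atlanta--dc;
       array col[*] col1-col10;
       do i = 1 to 10;
          /* SUM ignores missing values, so each cell takes whichever triangle is filled */
          var[i] = sum(var[i], col[i]);
       end;
       drop col1-col10 _name_ i;
    run;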

    *-----Clustering with K-Nearest-Neighbor Density Estimates-----;
    proc modeclus data=mileages all m=1 k=3;
       id CITY;
    run;
Modeclus Analysis of 10 American Cities
Based on Flying Mileages

Nearest Neighbor List

| City | Neighbor | Distance |
|---|---|---|
| Atlanta | Washington D.C. | 543.0000000 |
| | Chicago | 587.0000000 |
| Chicago | Atlanta | 587.0000000 |
| | Washington D.C. | 597.0000000 |
| Denver | Los Angeles | 831.0000000 |
| | Houston | 879.0000000 |
| Houston | Atlanta | 701.0000000 |
| | Denver | 879.0000000 |
| Los Angeles | San Francisco | 347.0000000 |
| | Denver | 831.0000000 |
| Miami | Atlanta | 604.0000000 |
| | Washington D.C. | 923.0000000 |
| New York | Washington D.C. | 205.0000000 |
| | Chicago | 713.0000000 |
| San Francisco | Los Angeles | 347.0000000 |
| | Seattle | 678.0000000 |
| Seattle | San Francisco | 678.0000000 |
| | Los Angeles | 959.0000000 |
| Washington D.C. | New York | 205.0000000 |
| | Atlanta | 543.0000000 |

Sums of Density Estimates Within Neighborhood

| Cluster | City | Estimated Density | Same Cluster | Other Clusters | Total | Cluster Proportion Same/Total |
|---|---|---|---|---|---|---|
| 1 | Atlanta | 0.00025554 | 0.0005275 | 0 | 0.0005275 | 1.000 |
| | Chicago | 0.00025126 | 0.00053178 | 0 | 0.00053178 | 1.000 |
| | Houston | 0.00017065 | 0.00025554 | 0.00017065 | 0.00042619 | 0.600 |
| | Miami | 0.00016251 | 0.00053178 | 0 | 0.00053178 | 1.000 |
| | New York | 0.00021038 | 0.0005275 | 0 | 0.0005275 | 1.000 |
| | Washington D.C. | 0.00027624 | 0.00046592 | 0 | 0.00046592 | 1.000 |
| 2 | Denver | 0.00017065 | 0.00018051 | 0.00017065 | 0.00035115 | 0.514 |
| | Los Angeles | 0.00018051 | 0.00039189 | 0 | 0.00039189 | 1.000 |
| | San Francisco | 0.00022124 | 0.00033692 | 0 | 0.00033692 | 1.000 |
| | Seattle | 0.00015641 | 0.00040174 | 0 | 0.00040174 | 1.000 |

Boundary Objects

| City | Density | Cluster | Cluster Proportion 1 | Cluster Proportion 2 |
|---|---|---|---|---|
| Denver | 0.0001706485 | 2 | 0.486 | 0.514 |
| Houston | 0.0001706485 | 1 | 0.600 | 0.400 |
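
The estimated densities in Output 57.2.1 can be checked by hand. The following is a sketch, assuming that distance data are treated as one-dimensional (so a neighborhood of radius $r$ has volume $2r$) and that the K=3 neighborhood counts the city itself plus its two listed neighbors. The symbols $n$ (number of cities), $k$, and $r_i$ (distance from city $i$ to its farthest listed neighbor) are notation introduced here, not part of the output.

$$
\hat f_i = \frac{k}{n \cdot 2 r_i}, \qquad
\hat f_{\text{Atlanta}} = \frac{3}{10 \cdot 2 \cdot 587} \approx 0.00025554, \qquad
\hat f_{\text{Denver}} = \frac{3}{10 \cdot 2 \cdot 879} \approx 0.00017065
$$

The cluster proportion is the same-cluster sum divided by the total. For Denver this gives $0.00018051 / 0.00035115 \approx 0.514$. Denver and Houston are the only cities whose proportion falls below 1, and they are the two listed as boundary objects.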
The following statements produce Output 57.2.2:

    *------Clustering with Uniform-Kernel Density Estimates--------;
    proc modeclus data=mileages all m=1 r=600 800;
       id CITY;
    run;
Modeclus Analysis of 10 American Cities
Based on Flying Mileages

Nearest Neighbor List

| City | Neighbor | Distance |
|---|---|---|
| Atlanta | Washington D.C. | 543.0000000 |
| | Chicago | 587.0000000 |
| | Miami | 604.0000000 |
| | Houston | 701.0000000 |
| | New York | 748.0000000 |
| Chicago | Atlanta | 587.0000000 |
| | Washington D.C. | 597.0000000 |
| | New York | 713.0000000 |
| Houston | Atlanta | 701.0000000 |
| Los Angeles | San Francisco | 347.0000000 |
| Miami | Atlanta | 604.0000000 |
| New York | Washington D.C. | 205.0000000 |
| | Chicago | 713.0000000 |
| | Atlanta | 748.0000000 |
| San Francisco | Los Angeles | 347.0000000 |
| | Seattle | 678.0000000 |
| Seattle | San Francisco | 678.0000000 |
| Washington D.C. | New York | 205.0000000 |
| | Atlanta | 543.0000000 |
| | Chicago | 597.0000000 |

Sums of Density Estimates Within Neighborhood (R=600)

| Cluster | City | Estimated Density | Same Cluster | Other Clusters | Total | Cluster Proportion Same/Total |
|---|---|---|---|---|---|---|
| 1 | Atlanta | 0.00025 | 0.00058333 | 0 | 0.00058333 | 1.000 |
| | Chicago | 0.00025 | 0.00058333 | 0 | 0.00058333 | 1.000 |
| | New York | 0.00016667 | 0.00033333 | 0 | 0.00033333 | 1.000 |
| | Washington D.C. | 0.00033333 | 0.00066667 | 0 | 0.00066667 | 1.000 |
| 2 | Los Angeles | 0.00016667 | 0.00016667 | 0 | 0.00016667 | 1.000 |
| | San Francisco | 0.00016667 | 0.00016667 | 0 | 0.00016667 | 1.000 |
| 3 | Denver | 0.00008333 | 0 | 0 | 0 | . |
| 4 | Houston | 0.00008333 | 0 | 0 | 0 | . |
| 5 | Miami | 0.00008333 | 0 | 0 | 0 | . |
| 6 | Seattle | 0.00008333 | 0 | 0 | 0 | . |

Cluster Statistics

| Cluster | Frequency | Maximum Estimated Density | Boundary Frequency | Estimated Saddle Density |
|---|---|---|---|---|
| 1 | 4 | 0.00033333 | 0 | . |
| 2 | 2 | 0.00016667 | 0 | . |
| 3 | 1 | 0.00008333 | 0 | . |
| 4 | 1 | 0.00008333 | 0 | . |
| 5 | 1 | 0.00008333 | 0 | . |
| 6 | 1 | 0.00008333 | 0 | . |

Sums of Density Estimates Within Neighborhood (R=800)

| Cluster | City | Estimated Density | Same Cluster | Other Clusters | Total | Cluster Proportion Same/Total |
|---|---|---|---|---|---|---|
| 1 | Atlanta | 0.000375 | 0.001 | 0 | 0.001 | 1.000 |
| | Chicago | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| | Houston | 0.000125 | 0.000375 | 0 | 0.000375 | 1.000 |
| | Miami | 0.000125 | 0.000375 | 0 | 0.000375 | 1.000 |
| | New York | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| | Washington D.C. | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| 2 | Los Angeles | 0.000125 | 0.0001875 | 0 | 0.0001875 | 1.000 |
| | San Francisco | 0.0001875 | 0.00025 | 0 | 0.00025 | 1.000 |
| | Seattle | 0.000125 | 0.0001875 | 0 | 0.0001875 | 1.000 |
| 3 | Denver | 0.0000625 | 0 | 0 | 0 | . |
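
The uniform-kernel densities in Output 57.2.2 can be checked the same way. This is a sketch under the assumption that distance data are treated as one-dimensional, so a neighborhood of radius $R$ has volume $2R$. The symbols $n$ (number of cities) and $n_i$ (number of cities within distance $R$ of city $i$, counting the city itself) are notation introduced here.

$$
\hat f_i = \frac{n_i}{n \cdot 2R}
$$

Atlanta has two neighbors within 600 miles (Washington D.C. at 543 and Chicago at 587) and five within 800 miles, while Denver's nearest neighbor is 831 miles away, so Denver's neighborhood contains only itself at both radii:

$$
\hat f_{\text{Atlanta}} = \frac{3}{10 \cdot 2 \cdot 600} = 0.00025, \qquad
\hat f_{\text{Atlanta}} = \frac{6}{10 \cdot 2 \cdot 800} = 0.000375
$$

$$
\hat f_{\text{Denver}} = \frac{1}{10 \cdot 2 \cdot 600} \approx 0.00008333, \qquad
\hat f_{\text{Denver}} = \frac{1}{10 \cdot 2 \cdot 800} = 0.0000625
$$

A city with no neighbor inside the radius has a same-cluster sum of 0 and a missing cluster proportion, and it forms a cluster of its own.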
The following statements produce Output 57.2.3:

    *------Clustering Neighborhoods Extended to Nearest Neighbor--------;
    proc modeclus data=mileages list m=1 ck=2 r=600 800;
       id CITY;
    run;
Modeclus Analysis of 10 American Cities
Based on Flying Mileages

Sums of Density Estimates Within Neighborhood (R=600)

| Cluster | City | Estimated Density | Same Cluster | Other Clusters | Total | Cluster Proportion Same/Total |
|---|---|---|---|---|---|---|
| 1 | Atlanta | 0.00025 | 0.00058333 | 0 | 0.00058333 | 1.000 |
| | Chicago | 0.00025 | 0.00058333 | 0 | 0.00058333 | 1.000 |
| | Houston | 0.00008333 | 0.00025 | 0 | 0.00025 | 1.000 |
| | Miami | 0.00008333 | 0.00025 | 0 | 0.00025 | 1.000 |
| | New York | 0.00016667 | 0.00033333 | 0 | 0.00033333 | 1.000 |
| | Washington D.C. | 0.00033333 | 0.00066667 | 0 | 0.00066667 | 1.000 |
| 2 | Denver | 0.00008333 | 0.00016667 | 0 | 0.00016667 | 1.000 |
| | Los Angeles | 0.00016667 | 0.00016667 | 0 | 0.00016667 | 1.000 |
| | San Francisco | 0.00016667 | 0.00016667 | 0 | 0.00016667 | 1.000 |
| | Seattle | 0.00008333 | 0.00016667 | 0 | 0.00016667 | 1.000 |

Cluster Statistics

| Cluster | Frequency | Maximum Estimated Density | Boundary Frequency | Estimated Saddle Density |
|---|---|---|---|---|
| 1 | 6 | 0.00033333 | 0 | . |
| 2 | 4 | 0.00016667 | 0 | . |

Sums of Density Estimates Within Neighborhood (R=800)

| Cluster | City | Estimated Density | Same Cluster | Other Clusters | Total | Cluster Proportion Same/Total |
|---|---|---|---|---|---|---|
| 1 | Atlanta | 0.000375 | 0.001 | 0 | 0.001 | 1.000 |
| | Chicago | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| | Houston | 0.000125 | 0.000375 | 0 | 0.000375 | 1.000 |
| | Miami | 0.000125 | 0.000375 | 0 | 0.000375 | 1.000 |
| | New York | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| | Washington D.C. | 0.00025 | 0.000875 | 0 | 0.000875 | 1.000 |
| 2 | Denver | 0.0000625 | 0.000125 | 0 | 0.000125 | 1.000 |
| | Los Angeles | 0.000125 | 0.0001875 | 0 | 0.0001875 | 1.000 |
| | San Francisco | 0.0001875 | 0.00025 | 0 | 0.00025 | 1.000 |
| | Seattle | 0.000125 | 0.0001875 | 0 | 0.0001875 | 1.000 |
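
The effect of CK=2 shows up in Denver's row. The reading below is inferred from the numbers in this example rather than quoted from the procedure's definition: the density estimates still use only the radius, but each clustering neighborhood appears to be extended to include at least the nearest neighbor. Writing $N_i$ for the clustering neighborhood of city $i$ (notation introduced here), the same-cluster column is the sum of the neighbors' estimated densities:

$$
S_i = \sum_{j \in N_i,\; j \neq i} \hat f_j
$$

Denver's nearest neighbor, Los Angeles, is 831 miles away and so lies outside both radii, yet Denver's same-cluster sum equals the estimated density of Los Angeles in each analysis:

$$
S_{\text{Denver}} = \hat f_{\text{Los Angeles}} = 0.00016667 \;\; (R = 600), \qquad
S_{\text{Denver}} = \hat f_{\text{Los Angeles}} = 0.000125 \;\; (R = 800)
$$

The same pattern holds at $R = 600$ for Houston and Miami, whose sums equal Atlanta's density of 0.00025, and for Seattle, whose sum equals San Francisco's density of 0.00016667. The cities that formed single-member clusters in Output 57.2.2 therefore join the cluster of their nearest neighbor.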