Usage Note 46314: The Cluster node creates only one big cluster, or does not find clusters in the data
The Cluster node in SAS® Enterprise Miner™ enables you to cluster large data sets quickly. The clusters are formed using k-means methods. Sometimes the node produces one large cluster and a few very small clusters, a result that some users describe as finding "no clusters". In this scenario, one or more of the following causes might be relevant.
Your data contains only one large cluster, and the node correctly identified it. Having a single cluster is a mathematical possibility. If you want more clusters, consider changing the set of input variables, either by adding variables or by removing some.
The data needs to be standardized. By default, the variable values are divided by the standard deviation. The mean is not subtracted. Variables with large variances tend to have more effect on the resulting clusters than variables with small variances. If all variables are measured in the same units, then standardization might not be necessary. Otherwise, some form of standardization is strongly recommended.
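The effect of scale can be illustrated with a minimal sketch outside of SAS Enterprise Miner. The data values and the helper names (`scale_by_std`, `sq_dist_share`) are hypothetical and are used only for illustration. The sketch divides each variable by its standard deviation without subtracting the mean, as described above, and compares how much of the squared Euclidean distance between two observations comes from each variable.

```python
import statistics

def scale_by_std(values):
    """Divide each value by the sample standard deviation (the mean is not subtracted)."""
    sd = statistics.stdev(values)
    return [v / sd for v in values]

def sq_dist_share(x, y, i, j):
    """Fraction of the squared Euclidean distance between rows i and j that is due to x."""
    dx = (x[i] - x[j]) ** 2
    dy = (y[i] - y[j]) ** 2
    return dx / (dx + dy)

# Hypothetical observations: income in dollars, age in years.
income = [30000.0, 32000.0, 90000.0, 95000.0]
age = [25.0, 60.0, 27.0, 62.0]

# Unscaled: income dominates the distance because its variance is far larger.
raw_share = sq_dist_share(income, age, 0, 3)

# Scaled: both variables contribute on a comparable footing.
scaled_share = sq_dist_share(scale_by_std(income), scale_by_std(age), 0, 3)

print(f"income share of squared distance, raw:    {raw_share:.6f}")
print(f"income share of squared distance, scaled: {scaled_share:.6f}")
```

Because Euclidean distance depends only on differences between observations, subtracting the mean as well would not change these distances. Only the division by the standard deviation affects the clustering.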
The clusters are irregularly shaped. K-means methods are very good at finding clusters that are well-separated, compact, and spherical. If your clusters are substantially different, then consider nonparametric methods such as those in PROC MODECLUS in SAS/STAT® software.
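The preference of k-means methods for compact clusters can be seen in the criterion that they minimize, the within-cluster sum of squared distances to the cluster means. The following sketch uses hypothetical data (two long, parallel strips of points) and a hypothetical helper named `within_ss`. It does not run the Cluster node algorithm. It only compares the criterion for two candidate partitions.

```python
def within_ss(points):
    """Sum of squared Euclidean distances from each point to the mean of the group."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)

# Hypothetical data: two elongated clusters, 20 points each, separated vertically.
lower = [(float(x), 0.0) for x in range(20)]
upper = [(float(x), 3.0) for x in range(20)]
all_points = lower + upper

# Partition 1: the two strips themselves (the intended, elongated clusters).
strip_split = within_ss(lower) + within_ss(upper)

# Partition 2: cut across both strips at the midpoint (two compact blocks).
left = [p for p in all_points if p[0] < 10]
right = [p for p in all_points if p[0] >= 10]
block_split = within_ss(left) + within_ss(right)

print("criterion for the two strips:   ", strip_split)
print("criterion for left/right blocks:", block_split)
```

In this sketch the left/right partition has the smaller criterion value, so a k-means method would favor it even though it cuts both strips in half. Methods that do not rely on this criterion are not subject to the same preference.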
The data contains outliers. The Filter node enables you to apply a filter to the data set in order to exclude outliers or observations that you want excluded from your data mining analysis.
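The following sketch illustrates how a single outlier can produce one large cluster and one very small cluster. Everything in it is an assumption made for illustration: the data values are hypothetical, `two_means` is a simplified one-dimensional two-cluster k-means with farthest-apart seeds (chosen to keep the sketch deterministic), and `mad_filter` is a simple median-based rule. Neither function reproduces the algorithms of the Cluster node or the Filter node.

```python
import statistics

def two_means(values, max_iter=100):
    """Simplified 1-D k-means with k=2, seeded at the minimum and maximum values."""
    centers = [min(values), max(values)]
    groups = [[], []]
    for _ in range(max_iter):
        groups = [[], []]
        for v in values:
            # Assign each value to the nearer of the two current centers.
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Recompute each center as the mean of its group (keep it if the group is empty).
        new_centers = [statistics.mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return groups

def mad_filter(values, cutoff=5.0):
    """Keep values within cutoff * MAD of the median (hypothetical outlier rule)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    return [v for v in values if abs(v - med) <= cutoff * mad]

# Hypothetical data: two groups of five values plus one extreme outlier.
data = [1.0, 1.5, 2.0, 2.5, 3.0, 8.0, 8.5, 9.0, 9.5, 10.0, 1000.0]

sizes_with_outlier = [len(g) for g in two_means(data)]
sizes_filtered = [len(g) for g in two_means(mad_filter(data))]

print("cluster sizes with the outlier:", sizes_with_outlier)
print("cluster sizes after filtering: ", sizes_filtered)
```

With the outlier present, the outlier forms its own cluster and the two real groups are merged. After the outlier is excluded, the two groups are separated.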
The wrong number of clusters is selected. The CCC plot might not give a clear indication of the number of clusters. To interpret the plot, see the chapter "Cubic Clustering Criterion" in SAS Enterprise Miner Help. Look for the following warning in the Cluster node log:
WARNING: The number of clusters selected based on the CCC values may not be valid.
Please refer to the documentation on the Cubic Clustering Criterion.
The causes listed above are common in practice, but the list is not exhaustive. Further analysis of your data might be required. The following references contain additional information about cluster analysis.
"SAS/STAT User's Guide" in SAS Product Documentation: refer to the clustering introductory chapter, as well as chapters for the CLUSTER, FASTCLUS, and MODECLUS procedures.
SAS Enterprise Miner Help: from within SAS Enterprise Miner, select Help ► Contents to access the Help. Customers who license SAS Enterprise Miner can also access the SAS Enterprise Miner Help from our website. To request User ID and Password information, contact SAS Technical Support and provide your SAS Site number.
Operating System and Release Information
SAS System | SAS Enterprise Miner | z/OS
Microsoft® Windows® for 64-Bit Itanium-based Systems
Microsoft Windows Server 2003 Datacenter 64-bit Edition
Microsoft Windows Server 2003 Enterprise 64-bit Edition
Microsoft Windows XP 64-bit Edition
Microsoft® Windows® for x64
Microsoft Windows 8
Microsoft Windows 95/98
Microsoft Windows 2000 Advanced Server
Microsoft Windows 2000 Datacenter Server
Microsoft Windows 2000 Server
Microsoft Windows 2000 Professional
Microsoft Windows 2012
Microsoft Windows NT Workstation
Microsoft Windows Server 2003 Datacenter Edition
Microsoft Windows Server 2003 Enterprise Edition
Microsoft Windows Server 2003 Standard Edition
Microsoft Windows Server 2003 for x64
Microsoft Windows Server 2008
Microsoft Windows Server 2008 for x64
Microsoft Windows XP Professional
Windows 7 Enterprise 32 bit
Windows 7 Enterprise x64
Windows 7 Home Premium 32 bit
Windows 7 Home Premium x64
Windows 7 Professional 32 bit
Windows 7 Professional x64
Windows 7 Ultimate 32 bit
Windows 7 Ultimate x64
Windows Millennium Edition (Me)
Windows Vista
Windows Vista for x64
64-bit Enabled AIX
64-bit Enabled HP-UX
64-bit Enabled Solaris
ABI+ for Intel Architecture
AIX
HP-UX
HP-UX IPF
Linux
Linux for x64
Linux on Itanium
Solaris
Solaris for x64
Tru64 UNIX
Date Modified: | 2012-11-20 13:58:26 |
Date Created: | 2012-04-16 09:45:25 |