This course examines the theoretical and practical implications of a wide array of clustering techniques currently available in SAS. Topics include cluster preprocessing, variable clustering, k-means clustering, and hierarchical clustering.
Learn how to
- prepare and explore data for a cluster analysis
- distinguish among many different clustering techniques, making informed choices about which to use
- evaluate the results of a cluster analysis
- determine the appropriate number of clusters to retain
- profile and describe clustered observations
- score observations into clusters.
Who should attend
Intermediate- or senior-level statisticians, data analysts, and data miners
Duration: 2 days
Before attending this course, you should
- be able to execute SAS programs and create SAS data sets. You can gain this experience by completing the SAS Programming 1: Essentials course.
- have completed a graduate-level course in statistics or the Statistics 1: Introduction to ANOVA, Regression, and Logistic Regression course.
- have an understanding of matrix algebra.
This course addresses SAS/STAT software.
Introduction to Clustering
- identifying types of clustering
- measuring similarity
- assessing multivariate normality
- using classification matrices
Preparation for Clustering
- using variable clustering for variable selection
- using graphical clustering aids
- making elongated clusters more spherical
- viewing the impact of input standardization
Partitive Clustering
- k-means clustering for segmentation
- outlining the advantages of nonparametric clustering
Hierarchical Clustering
- comparing hierarchical clustering methods
Assessing Clustering Results
- determining the number of clusters
- profiling a cluster solution
- scoring new observations
Cluster Analysis Case Study
- variable selection
- graphical exploration of selected variables
- hierarchical clustering and determining the number of clusters
- profiling the seven-cluster solution
- modeling cluster membership
- scoring the customer database
Canonical Discriminant Analysis (CDA) Plots
- using canonical discriminant analysis to summarize multivariate data
- interpreting CANDISC procedure output
Fuzzy Clustering
- performing fuzzy clustering using the FACTOR procedure
- interpreting the PROC FACTOR output in terms of fuzzy clustering membership
Assessing Multivariate Normality
- defining multivariate normality
- exploring the implications of univariate and multivariate normality in the context of clustering
- illustrating the calculation of multivariate normality
This course description was created using SAS software.
CLUS93