Contents: | Purpose / History / Requirements / Usage / Details / Limitations / Missing Values / References |
NOTE: Beginning in SAS 9.4, this macro is no longer needed. Use the OUTPLC= option in Base SAS PROC CORR to save a matrix of polychoric (or tetrachoric) correlations.
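For SAS 9.4 and later, a call along the following lines is a minimal sketch of the PROC CORR alternative mentioned in the note. The input data set ORDINAL, the variables X1-X5, and the output data set name PLCMAT are assumptions chosen for illustration.

```sas
/* Sketch: save polychoric correlations with PROC CORR (SAS 9.4 or later). */
/* ORDINAL, X1-X5, and PLCMAT are illustrative names.                      */
proc corr data=ordinal polychoric outplc=plcmat;
   var x1-x5;
run;

/* Inspect the saved matrix. */
proc print data=plcmat noobs;
run;
```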
Version | Update Notes |
1.7 | Fixed errors referring to variable type or length of the NAME variable when analyzing character variables. Improved distance example. Added ID=, ORDER=, and PRINTLEVELS= options. Made _NUMERIC_, _CHARACTER_, _ALL_ available with VAR=. Fixed looping if variable named I or J used. Fixed problem if NAME is variable in input data set. Use of ODS now makes SAS 8 the minimum release. |
1.6 | Fixed bug that didn't allow a variable named X in the input data. Default for VAR= is now all variables, not just numeric variables. Fixed problem where correlation from previous variable pair was used when the current pair does not form at least a 2x2 table (a WARNING appears in the log when this happens). Use of %sysfunc requires SAS 6.12 or later. Added automatic check for new version. |
1.5 | Fixed display of converge message. Capture and reset NOTES option at end. |
1.4 | Removed NOSTIMER option. Allow for long variable names in SAS 7 or later. Added version indicator to macro notes. |
1.3 | Print message if no convergence when computing polychoric correlations. Added CONVERGE= and MAXITER= options. |
1.2 | Fixed problem with macro notes not printing. |
%inc "<location of your file containing the POLYCHOR macro>";
Following this statement, you may call the %POLYCHOR macro. See the Results tab for an example.
The options and allowable values are:
Convergence problems
The PLCORR option uses an iterative, maximum likelihood method to
estimate the polychoric correlation. Occasionally, this method
will not converge on an estimate, and in that case, the value of
the correlation is set to missing. By adjusting the values of the
CONVERGE= and/or MAXITER= options in the TABLE statement of PROC
FREQ, you may be able to obtain an estimate. For example, the
following statements attempt to estimate the polychoric correlation
between variables X1 and X2 setting the convergence criterion to a
more lenient 0.001 and allowing more iterations than the default.
See the FREQ chapter of the SAS/STAT User's Guide for details on
these options.
proc freq;
   table x1 * x2 / plcorr converge=.001 maxiter=30;
run;
To apply these settings to the estimation of all polychoric correlations, specify the CONVERGE= and MAXITER= options in the %POLYCHOR macro call.
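As a rough illustration, a macro call that loosens the convergence criterion for every variable pair might look like the following. It assumes the macro has already been defined with %inc and that the data set to analyze is the last-created one; the values 0.001 and 50 are illustrative, not recommendations.

```sas
/* Sketch: pass CONVERGE= and MAXITER= through the macro so that  */
/* they apply to every pair of variables. Values are illustrative. */
%polychor(var=_numeric_, converge=0.001, maxiter=50)
```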
Ordering of levels
The variables are assumed to be ordinal variables. PROC FREQ forms a crosstabulation for each pair of variables. Note that the ordering of the rows and columns in a table affects the computation of the polychoric correlation. For instance, two variables with levels LOW, MEDIUM, and HIGH, in this order, will produce a different correlation estimate when ordered MEDIUM, LOW, HIGH. You should verify that all of your variables will be in the desired order for your chosen setting of the ORDER= option. The PRINTLEVELS=YES option displays all character variable levels in the order used when computing the correlations. The ORDER= option affects the ordering of all variables, character and numeric, in the analysis.
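One way to make the intended order explicit is to recode character levels to numbers before calling the macro, so that sorting by unformatted value gives LOW, MEDIUM, HIGH. The sketch below assumes a hypothetical data set SURVEY with character variables RATING1 and RATING2 whose values are exactly LOW, MEDIUM, and HIGH; the informat name LVL and the output names CODED, R1, and R2 are also made up for illustration.

```sas
/* Hypothetical informat mapping the character levels to ordered numbers. */
proc format;
   invalue lvl 'LOW'=1 'MEDIUM'=2 'HIGH'=3;
run;

/* SURVEY, RATING1, and RATING2 are assumed names, not part of the macro. */
data coded;
   set survey;
   r1 = input(rating1, lvl.);
   r2 = input(rating2, lvl.);
   keep r1 r2;
run;

/* Analyze the recoded numeric variables in the last-created data set. */
%polychor(var=_numeric_)
```

If you analyze the character variables directly instead, PRINTLEVELS=YES lets you confirm the order that was actually used.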
Obtaining a correlation matrix
If TYPE=CORR is specified, the individual correlation coefficients are assembled into a TYPE=CORR data set containing a matrix of polychoric correlations. The resulting data set can be used as input to either the FACTOR or the CALIS procedure, but for descriptive analyses only (specify METHOD=ULS in either procedure). If the maximum likelihood method (METHOD=ML) is used instead, none of the hypothesis tests are valid, and the polychoric correlation matrix may be indefinite with small samples.
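A descriptive factor analysis of the polychoric matrix could be sketched as follows. It assumes the TYPE=CORR data set is named _PLCORR, as in the example elsewhere in this document; NFACTORS=2 is an arbitrary illustrative choice.

```sas
/* Sketch: unweighted least squares factor analysis of the polychoric */
/* correlation matrix. NFACTORS=2 is illustrative only.               */
proc factor data=_plcorr(type=corr) method=uls nfactors=2;
run;
```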
Obtaining a distance matrix
If TYPE=DISTANCE is specified, a TYPE=DISTANCE data set is created containing a matrix of
dissimilarity values. The dissimilarity value used is computed as
1 - plcorr², where plcorr is the polychoric correlation.
It is assumed that the columns of the input data set are the items to be clustered and that each row of the data set is a variable (such as HEIGHT) on which the items are measured. Collectively, the variables (rows) locate the items (columns) to be clustered. Note that this data structure is the transpose of the usual data set input to such procedures as PROC CLUSTER in which the items to be clustered are the rows (observations) in the data set and the variables which locate the items are the columns. You can use PROC TRANSPOSE to convert rows to columns. The distance data set created by the %POLYCHOR macro includes a variable containing the item (column) names which can be used in subsequent analyses to identify the items. You can name this variable using the ID= option (the default name is _ID_).
The output data set can be used in the CLUSTER procedure (but the CCC value is not valid) or the MODECLUS procedure. As discussed in the documentation of these procedures and the DISTANCE procedure, variables with higher variability have a greater effect on the distance measure. As a result, you may want to standardize the variables before computing distances. This can be done by using PROC STDIZE.
See the appendix "Special SAS Data Sets" in the SAS/STAT User's Guide for a description of TYPE=CORR and TYPE=DISTANCE data sets.
If a message which begins like this appears in the SAS log:
WARNING: No OUTPUT data set is produced because no statistics can be computed for this table, which has ...
it indicates that the current pair of variables does not form at least a 2x2 table. The polychoric correlation cannot be computed and is set to missing in the output data set.
If some polychoric correlations could not be estimated and are missing in the OUT= data set, then an attempt to use the data set as input to an analytical procedure such as PROC PRINCOMP results in this message in the SAS log:
ERROR: CORR matrix incomplete in data set WORK._PLCORR.
All correlations must be nonmissing in order to do an analysis.
See the DETAILS and LIMITATIONS sections above for information on polychoric correlation estimates that are set to missing.
These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.
data norm;
   length id $ 8;
   array x{5} x1-x5;
   do n=1 to 20;
      do i=1 to 5;
         x{i}=rannor(238423)*3+10;
      end;
      id=cats(n,'');
      keep id x1-x5;
      output;
   end;
run;

proc rank data=norm out=ordinal groups=5;
   var x1-x5;
run;
This statement defines the POLYCHOR macro and makes it available for use:
%inc "<location of your file containing the POLYCHOR macro>";
The following statements call the POLYCHOR macro which creates a TYPE=CORR data set named _PLCORR containing a matrix of polychoric correlations among the numeric variables (X1-X5) in the last-created data set, ORDINAL.
%polychor(var=_numeric_)

proc print noobs;
   var _type_ _name_ x1-x5;
run;
PROC PRINT displays the TYPE=CORR data set, _PLCORR, containing the polychoric correlation matrix:
_TYPE_   _NAME_        x1         x2         x3         x4         x5
N                  20.0000    20.0000    20.0000    20.0000    20.0000
MEAN                2.0000     2.0000     2.0000     2.0000     2.0000
STD                 1.4510     1.4510     1.4510     1.4510     1.4510
CORR     x1         1.0000      .          .          .          .
CORR     x2        -0.2023     1.0000      .          .          .
CORR     x3         0.2163    -0.1856     1.0000      .          .
CORR     x4         0.3830    -0.1577     0.3868     1.0000      .
CORR     x5        -0.0689     0.1593     0.1964     0.0890     1.0000
The following steps create a TYPE=DISTANCE data set named DIST containing a dissimilarity matrix for the first six observations of data set ORDINAL. PROC TRANSPOSE creates the required data structure -- items to be clustered as columns, variables that locate items as rows. The call of the POLYCHOR macro requests computation of the distance matrix using all numeric variables and allowing for extra iteration in the algorithm that computes the correlations. A variable named ID is created containing the names of the items (variables) being clustered. PROC CLUSTER reads the distance data set and does a cluster analysis of the six observations. PROC TREE draws a dendrogram showing the clustering process.
proc transpose data=ordinal out=ordt prefix=ID;
   where id in ('1' '2' '3' '4' '5' '6');
   id id;
   var x1-x5;
run;

%polychor(var=_numeric_, type=distance, id=id, out=dist, maxiter=100)

proc print noobs;
run;

proc cluster data=dist method=average outtree=tree noprint;
   id id;
run;

proc tree data=Tree horizontal;
   id id;
run;
PROC PRINT displays the TYPE=DISTANCE data set, DIST, containing the distances based on the polychoric correlations, followed by the dendrogram from PROC TREE:
id      ID1        ID2        ID3        ID4        ID5          ID6
ID1     0.00000     .          .          .          .            .
ID2     0.99150    0.00000     .          .          .            .
ID3     0.89850    0.33690    0.00000     .          .            .
ID4     0.99906    0.98574    0.20678    0.00000     .            .
ID5     0.99906    0.90162    0.00000    0.00000    0             .
ID6     0.89850    0.33689    0.00000    0.20679    .000002000    0
Clustering of the six observations is shown in the dendrogram displayed by PROC TREE.
Right-click on the link below and select Save to save the %POLYCHOR macro definition to a file. It is recommended that you name the file polychor.sas.
Download and save polychor.sas
Type: | Sample |
Date Modified: | 2015-09-17 16:46:27 |
Date Created: | 2005-01-13 15:03:32 |
Product Family: | SAS System |
Product: | SAS/STAT |
Hosts:
z/OS
64-bit Enabled HP-UX
Microsoft Windows Server 2012 R2 Std
Microsoft Windows Server 2012 Std
Microsoft Windows XP Professional
Windows 7 Enterprise 32 bit
Windows 7 Enterprise x64
Windows 7 Home Premium 32 bit
Windows 7 Home Premium x64
Windows 7 Professional 32 bit
Windows 7 Professional x64
Windows 7 Ultimate 32 bit
Windows 7 Ultimate x64
Windows Millennium Edition (Me)
Windows Vista
Windows Vista for x64
64-bit Enabled AIX
OS/2
Microsoft Windows 8 Enterprise 32-bit
Microsoft® Windows® for x64
Microsoft Windows XP 64-bit Edition
Microsoft Windows Server 2003 Enterprise 64-bit Edition
Microsoft Windows Server 2003 Datacenter 64-bit Edition
OpenVMS VAX
Microsoft® Windows® for 64-Bit Itanium-based Systems
Z64
Microsoft Windows 8.1 Enterprise x64
Microsoft Windows 8.1 Enterprise 32-bit
Microsoft Windows 8 Pro x64
Microsoft Windows 8 Enterprise x64
Microsoft Windows 8 Pro 32-bit
Microsoft Windows Server 2012 R2 Datacenter
Microsoft Windows Server 2012 Datacenter
Microsoft Windows Server 2008 for x64
Microsoft Windows Server 2008 R2
Microsoft Windows Server 2008
Microsoft Windows Server 2003 for x64
Microsoft Windows Server 2003 Standard Edition
Microsoft Windows Server 2003 Enterprise Edition
Microsoft Windows Server 2003 Datacenter Edition
Microsoft Windows NT Workstation
Microsoft Windows 2000 Professional
Microsoft Windows 2000 Server
Microsoft Windows 2000 Datacenter Server
Microsoft Windows 2000 Advanced Server
Microsoft Windows 95/98
Microsoft Windows 8.1 Pro 32-bit
Microsoft Windows 8.1 Pro
64-bit Enabled Solaris
ABI+ for Intel Architecture
AIX
HP-UX
HP-UX IPF
IRIX
Linux
Linux for x64
Linux on Itanium
OpenVMS Alpha
OpenVMS on HP Integrity
Solaris
Solaris for x64
Tru64 UNIX