Sample 25031: Compute six intraclass correlation measures

Contents: Purpose / History / Requirements / Usage / Details / References

NOTE: This macro was not written by and is not supported by SAS Institute.
Author: Robert M. Hamer, Ph.D., Research Professor of Biostatistics, School of Public Health, University of North Carolina - Chapel Hill, 08-01-2009. Copyright (C) 1990 by Robert M. Hamer, all rights reserved. This macro may be distributed freely as long as all comments are included.
PURPOSE:
The %INTRACC macro calculates reliabilities for intraclass correlations. The macro calculates the six intraclass correlations discussed in Shrout and Fleiss (1979). Additionally, it calculates two intraclass correlations using formulae from Winer (1971), which are identical to two of the six from Shrout and Fleiss. It also calculates the reliability of the mean of nrater ratings, where nrater is a parameter of the macro, using the Spearman-Brown prophecy formula, so that one can examine the effect that obtaining more raters would have on the reliability of a mean.
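The Spearman-Brown prophecy step can be sketched outside of SAS as follows. This is a minimal Python illustration, not the macro's code; the function name is invented for this sketch:

```python
# Spearman-Brown prophecy formula: given the reliability r1 of a single
# rating, predict the reliability of the mean of m ratings.
# (Function name is illustrative, not part of the %INTRACC macro.)

def spearman_brown(r1, m):
    """Predicted reliability of the mean of m ratings."""
    return m * r1 / (1 + (m - 1) * r1)

# e.g. a single-rating reliability of 0.50 rises to 0.75 with m = 3 raters
print(spearman_brown(0.5, 3))
```

This is the formula the macro applies when nrater= is supplied, letting you see how adding raters would improve the reliability of a mean rating.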
HISTORY:
Updated 11Jan2000.
REQUIREMENTS:
Base SAS and SAS/STAT software are required.
USAGE:
Follow the instructions in the Downloads tab of this sample to save the %INTRACC macro definition. Replace the text within quotes in the following statement with the location of the %INTRACC macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %INTRACC macro and make it available for use:
   %inc "<location of your file containing the INTRACC macro>";

Following this statement, you may call the %INTRACC macro. See the Results tab for an example.

The following options are available. The TARGET=, RATER=, and DEPVAR= options are required.

data=
SAS data set containing the data to analyze. Default is _LAST_.
target=
Variable indexing the experimental units, often subjects or persons, each of whom is rated several times.
rater=
Variable indexing the judge, or whatever is producing multiple ratings for each target.
depvar=
Dependent variable list, that is, the list of variables for which each target was rated by each rater.
nrater=
Number of ratings to use in the Spearman-Brown prophecy formula to estimate the reliability of the mean of nrater ratings, where nrater differs from the number of raters actually used in the data. Default is 0, which omits this computation.
out=
Name of the output data set to contain the statistics. Default is _DATA_.
print=
0 for no printout, 1 to print the intraclass correlations and related statistics, 2 to also print the summary statistics from PROC GLM, 3 to also print all the PROC GLM results. Default is 1.
DETAILS:
Notation used in calculating the three correlations via the Winer formulae is taken from Winer, while notation used in calculating the six correlations via the Shrout and Fleiss formulae is taken from Shrout and Fleiss. As a result, in some cases two differently named variables hold the same quantity, so that each correlation can be computed with variable names matching the reference it is taken from.

If there are n targets and k ratings for each target, each target-rating combination occupies one observation; in other words, there are n*k observations in the data set.

The macro uses PROC GLM to partition the total variability into between-targets, between-judges, and residual components. For the formulae that assume a one-way design, the SS and DF for between judges and residual are added to give a single within-targets SS. This macro assumes that the judges rating every target are numbered with the same judge numbers, even if they are not the same judges. In other words, each subject is rated by k judges labeled, say, 1,2,...,k, even if they are not the same k judges for each subject. This is required so that PROC GLM can break out a between-judges SS.
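The decomposition described above can be illustrated outside of SAS. This Python sketch (not the macro's code; the tiny data table is invented for this example) checks on a 3-targets-by-2-judges table that the between-judges and residual sums of squares add up to the one-way within-targets SS:

```python
# Illustrative check of the SS decomposition: between-judges SS plus
# residual SS equals the one-way within-targets SS.
ratings = [[4, 2], [5, 3], [6, 5]]          # 3 targets, k = 2 judges each
n, k = len(ratings), len(ratings[0])
grand = sum(map(sum, ratings)) / (n * k)

ss_targets = k * sum((sum(r) / k - grand) ** 2 for r in ratings)
col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
ss_judges = n * sum((m - grand) ** 2 for m in col_means)
ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
ss_error = ss_total - ss_targets - ss_judges

# one-way within-targets SS: deviations of each rating from its target mean
ss_within = sum((x - sum(r) / k) ** 2 for r in ratings for x in r)
assert abs((ss_judges + ss_error) - ss_within) < 1e-9
```

This is why the macro can serve both the one-way formulae (which need only a within-targets SS) and the two-way formulae (which need the judges and residual terms separately) from a single PROC GLM run.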

In Shrout and Fleiss notation, the six correlations and their uses are as follows:

ICC(1,1):
Used when each subject is rated by multiple raters, raters are assumed to be randomly assigned to subjects, and all subjects have the same number of raters.
ICC(2,1):
Used when all subjects are rated by the same raters, who are assumed to be a random subset of all possible raters.
ICC(3,1):
Used when all subjects are rated by the same raters, who are assumed to be the entire population of raters.
ICC(1,k):
Same assumptions as ICC(1,1), but the reliability of the mean of k ratings.
ICC(2,k):
Same assumptions as ICC(2,1), but the reliability of the mean of k ratings.
ICC(3,k):
Same assumptions as ICC(3,1), but the reliability of the mean of k ratings. Additionally assumes no subject-by-judges interaction.
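Under the assumptions above, the six correlations can be computed directly from the ANOVA mean squares. The following Python sketch (an illustration of the Shrout and Fleiss formulae, not the macro itself; function and variable names are invented here) reproduces the published values for the 6-targets-by-4-judges example table in Shrout and Fleiss (1979):

```python
# Shrout & Fleiss (1979) ICC formulas, computed from an
# n-targets x k-raters table of ratings.

def icc_from_ratings(ratings):
    """Compute the six Shrout-Fleiss ICCs from ANOVA mean squares."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)

    # ANOVA sums of squares
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((sum(row) / k - grand) ** 2 for row in ratings)
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_judges = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_judges

    bms = ss_targets / (n - 1)                    # between-targets MS
    jms = ss_judges / (k - 1)                     # between-judges MS
    ems = ss_error / ((n - 1) * (k - 1))          # residual MS
    wms = (ss_judges + ss_error) / (n * (k - 1))  # within-targets MS (one-way)

    return {
        "ICC(1,1)": (bms - wms) / (bms + (k - 1) * wms),
        "ICC(2,1)": (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n),
        "ICC(3,1)": (bms - ems) / (bms + (k - 1) * ems),
        "ICC(1,k)": (bms - wms) / bms,
        "ICC(2,k)": (bms - ems) / (bms + (jms - ems) / n),
        "ICC(3,k)": (bms - ems) / bms,
    }

# Example data from Shrout and Fleiss (1979, Table 2):
# rows are targets, columns are judges
table2 = [[9, 2, 5, 8], [6, 1, 3, 2], [8, 4, 6, 8],
          [7, 1, 2, 6], [10, 5, 6, 9], [6, 2, 4, 7]]
iccs = icc_from_ratings(table2)
# published single-rating values: ICC(1,1)=.17, ICC(2,1)=.29, ICC(3,1)=.71
```

Note that each ICC(form,k) is the Spearman-Brown step-up of the corresponding ICC(form,1), which is how the macro's nrater= option generalizes them to an arbitrary number of raters.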
REFERENCES:
Shrout, P.E., and Fleiss, J.L. (1979), "Intraclass correlations: uses in assessing rater reliability," Psychological Bulletin, 86, 420-428.

Winer, B.J. (1971), Statistical Principles in Experimental Design, New York: McGraw Hill.




These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.