Contents: |
Purpose / History / Requirements / Usage / Details / References |

Author: Robert M. Hamer, Ph.D., Research Professor of Biostatistics, School of Public Health, University of North Carolina - Chapel Hill, 08-01-2009. Copyright (C) 1990 by Robert M. Hamer, all rights reserved. This macro may be distributed freely as long as all comments are included.

*PURPOSE:*- The %INTRACC macro calculates intraclass correlations for assessing inter-rater reliability. The macro calculates the six intraclass correlations discussed in Shrout and Fleiss (1979). Additionally, it calculates two intraclass correlations using formulae from Winer (1971), which are identical to two of the six from Shrout and Fleiss. It also calculates the reliability of the mean of nrater ratings, where nrater is a parameter of the macro, using the Spearman-Brown prophecy formula, so that one can examine the effect that obtaining more raters would have on the reliability of a mean.
*HISTORY:*- Updated 11Jan2000.
*REQUIREMENTS:*- Base SAS and SAS/STAT software are required.
*USAGE:*-
Follow the instructions in the Downloads tab of this sample to save the %INTRACC macro definition. Replace the text within quotes in the following statement with the location of the %INTRACC macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %INTRACC macro and make it available for use:
%inc "<location of your file containing the INTRACC macro>";

Following this statement, you may call the %INTRACC macro. See the Results tab for an example.

The following options are available. The TARGET=, RATER=, and DEPVAR= options are required.

**data=**- SAS dataset containing data. Default is _LAST_.
**target=**- variable indexing the experimental units, often subjects or persons, each of whom is rated several times.
**rater=**- variable indexing judge, or whatever is producing multiple ratings for each subject.
**depvar=**- dependent variable list, or list of variables for which each target was rated by each rater.
**nrater=**- For use in Spearman-Brown Prophecy formula to estimate the reliability of the mean of nrater ratings, where nrater is different than the number of raters actually used in the data. Default is 0, which omits this computation.
**out=**- Name of output data set to contain the statistics. Default is _DATA_.
**print=**- 0 for no printout, 1 to print the intraclass correlations and related statistics, 2 to print the summary statistics from GLM as well, 3 to print all the GLM results as well. Default is 1.
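As a rough sketch of what the NRATER= option estimates: the Spearman-Brown prophecy formula, in its standard textbook form, projects the reliability of a single rating, written here as $\rho_1$, to the reliability $\rho_m$ of the mean of $m$ ratings. The symbols $\rho_1$, $\rho_m$, and $m$ are introduced for illustration and are not the macro's variable names. Which single-rating reliability the macro feeds into the formula is inferred from the Example 1 output, not stated in the text.

```latex
% Standard Spearman-Brown prophecy formula (rho_1, rho_m, m are illustrative symbols)
\[
\rho_m = \frac{m\,\rho_1}{1 + (m-1)\,\rho_1}
\]
% Check against Example 1: nrater=10, single-score reliability 0.38235
\[
\rho_{10} = \frac{10 \times 0.38235}{1 + 9 \times 0.38235}
          = \frac{3.8235}{4.44115} \approx 0.86093
\]
```

This agrees with the "mean of 10 scores" value printed for the first %INTRACC call in Example 1.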

*DETAILS:*-
Notation used in calculating the three correlations via the Winer
formulae is taken from Winer while notation used in calculating the
six correlations using the Shrout and Fleiss formulae is taken from
Shrout and Fleiss. That means that in some cases I used two
differently named variables to hold the same thing so I could use
a variable with the same name as the reference when calculating
a correlation taken from that reference.
If there are n targets and k ratings for each target, each target-rating occupies one observation, or in other words, there are n*k observations in the dataset.

The macro uses PROC GLM to break the total variability into that due to between targets, between judges, and residual. For the formulae which assume a one-way design, the SS and DF for between judges and residual are added to give simply a within-targets SS. This macro assumes that all targets are rated by judges numbered with the same judge numbers, even if they are not the same judges. In other words, each subject is rated by k judges, labeled, say, 1,2,...,k, even if they are not the same judges for each subject. That is so PROC GLM can break out a between judges SS.
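The pooling step for the one-way formulae can be sketched as follows. The symbols $SS$ and $df$ with subscripts are shorthand introduced here; WMS follows the `wms` column name in the macro's printed output.

```latex
% Within-targets mean square: pool the between-judges and residual terms
\[
\mathrm{WMS} = \frac{SS_{\text{judges}} + SS_{\text{residual}}}
                    {df_{\text{judges}} + df_{\text{residual}}}
\]
% Check against the Example 1 GLM output (judge SS = 40, DF = 2; error SS = 6.6667, DF = 8)
\[
\mathrm{WMS} = \frac{40 + 6.6667}{2 + 8} = \frac{46.6667}{10} \approx 4.66667
\]
```

The value matches the `wms` entry reported for Example 1.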

In Shrout and Fleiss notation, the six correlations and their uses are as follows:

**ICC(1,1):**- used when each subject is rated by multiple raters, raters assumed to be randomly assigned to subjects, all subjects have the same number of raters.
**ICC(2,1):**- used when all subjects are rated by the same raters who are assumed to be a random subset of all possible raters.
**ICC(3,1):**- used when all subjects are rated by the same raters who are assumed to be the entire population of raters.
**ICC(1,k):**- Same assumptions as ICC(1,1) but reliability for the mean of k ratings.
**ICC(2,k):**- Same assumptions as ICC(2,1) but reliability for the mean of k ratings.
**ICC(3,k):**- Same assumptions as ICC(3,1) but reliability for the mean of k ratings. Assumes additionally no subject by judges interaction.
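For reference, the six coefficients are usually written in terms of the ANOVA mean squares as below. This is the form commonly quoted from Shrout and Fleiss (1979), using BMS (between targets), WMS (within targets), JMS (between judges), and EMS (residual) to match the column names in the macro's output, with $n$ targets and $k$ raters. The macro's source is not shown here, so treat this as a reading of the printed results rather than of the code.

```latex
\begin{align*}
\mathrm{ICC}(1,1) &= \frac{\mathrm{BMS} - \mathrm{WMS}}{\mathrm{BMS} + (k-1)\,\mathrm{WMS}} &
\mathrm{ICC}(1,k) &= \frac{\mathrm{BMS} - \mathrm{WMS}}{\mathrm{BMS}} \\[4pt]
\mathrm{ICC}(2,1) &= \frac{\mathrm{BMS} - \mathrm{EMS}}
                          {\mathrm{BMS} + (k-1)\,\mathrm{EMS} + k\,(\mathrm{JMS} - \mathrm{EMS})/n} &
\mathrm{ICC}(2,k) &= \frac{\mathrm{BMS} - \mathrm{EMS}}
                          {\mathrm{BMS} + (\mathrm{JMS} - \mathrm{EMS})/n} \\[4pt]
\mathrm{ICC}(3,1) &= \frac{\mathrm{BMS} - \mathrm{EMS}}{\mathrm{BMS} + (k-1)\,\mathrm{EMS}} &
\mathrm{ICC}(3,k) &= \frac{\mathrm{BMS} - \mathrm{EMS}}{\mathrm{BMS}}
\end{align*}
% Check with Example 1: BMS = 13.3333, WMS = 4.66667, JMS = 20, EMS = 0.83333, n = 5, k = 3
\begin{align*}
\mathrm{ICC}(1,1) &= \frac{8.66667}{22.66667} \approx 0.38235 &
\mathrm{ICC}(1,k) &= \frac{8.66667}{13.3333} \approx 0.65 \\
\mathrm{ICC}(2,1) &= \frac{12.5}{15 + 11.5} \approx 0.47170 &
\mathrm{ICC}(2,k) &= \frac{12.5}{13.3333 + 3.83333} \approx 0.72816 \\
\mathrm{ICC}(3,1) &= \frac{12.5}{15} \approx 0.83333 &
\mathrm{ICC}(3,k) &= \frac{12.5}{13.3333} \approx 0.9375
\end{align*}
```

These values reproduce the Shrout-Fleiss reliabilities printed in Example 1.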

*REFERENCES:*-
Shrout, P.E., and Fleiss, J.L. (1979), "Intraclass correlations: uses in assessing rater reliability," *Psychological Bulletin*, 86, 420-428.

Winer, B.J. (1971), *Statistical Principles in Experimental Design*, New York: McGraw-Hill.

These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.

*EXAMPLE 1:*-
data ratings;
  do product=1 to 5;
    do judge=1 to 3;
      input rating @@;
      output;
    end;
  end;
  datalines;
1 1 5
3 2 6
5 3 7
7 4 8
9 5 9
;

/* Define the INTRACC macro */
%inc "<location of your file containing the INTRACC macro>";

%intracc(depvar=rating,target=product,rater=judge,nrater=10);
%intracc(data=ratings,depvar=rating,target=product,rater=judge,
         print=3,out=intclcor);

*RESULTS:*-
Following are the results of the first %INTRACC call:
Intraclass Correlations for Inter-Rater Reliability

Calculate all reliabilities in one fell swoop

| `_NAME_` | msw | msb | wms | ems | edf | bms | bdf | jms | jdf | k | theta |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rating | 4.66667 | 13.3333 | 4.66667 | 0.83333 | 8 | 13.3333 | 4 | 20 | 2 | 3 | 0.61905 |

| `_NAME_` | Winer reliability: single score | Winer reliability: mean of k scores | Winer reliability: mean of 10 scores | Shrout-Fleiss reliability: single score | Shrout-Fleiss reliability: random set |
|---|---|---|---|---|---|
| rating | 0.38235 | 0.65 | 0.86093 | 0.38235 | 0.47170 |

| `_NAME_` | Shrout-Fleiss reliability: fixed set | Shrout-Fleiss reliability: mean k scores | Shrout-Fleiss rel: rand set mean k scrs | Shrout-Fleiss rel: fxd set mean k scrs |
|---|---|---|---|---|
| rating | 0.83333 | 0.65 | 0.72816 | 0.9375 |

These results come from the second call to %INTRACC:

Intraclass Correlations for Inter-Rater Reliability

The GLM Procedure: Class Level Information

| Class | Levels | Values |
|---|---|---|
| product | 5 | 1 2 3 4 5 |
| judge | 3 | 1 2 3 |

Number of Observations Read: 15

Number of Observations Used: 15

The GLM Procedure: Dependent Variable: rating

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 6 | 93.3333333 | 15.5555556 | 18.67 | 0.0003 |
| Error | 8 | 6.6666667 | 0.8333333 | | |
| Corrected Total | 14 | 100.0000000 | | | |

| R-Square | Coeff Var | Root MSE | rating Mean |
|---|---|---|---|
| 0.933333 | 18.25742 | 0.912871 | 5.000000 |

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| product | 4 | 53.33333333 | 13.33333333 | 16.00 | 0.0007 |
| judge | 2 | 40.00000000 | 20.00000000 | 24.00 | 0.0004 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| product | 4 | 53.33333333 | 13.33333333 | 16.00 | 0.0007 |
| judge | 2 | 40.00000000 | 20.00000000 | 24.00 | 0.0004 |

Statistics from 2-way ANOVA w/o Interaction

| Obs | `_NAME_` | `_SOURCE_` | `_TYPE_` | DF | SS | F | PROB |
|---|---|---|---|---|---|---|---|
| 1 | rating | ERROR | ERROR | 8 | 6.6667 | . | . |
| 2 | rating | judge | SS1 | 2 | 40.0000 | 24 | .000416493 |
| 3 | rating | judge | SS3 | 2 | 40.0000 | 24 | .000416493 |
| 4 | rating | product | SS1 | 4 | 53.3333 | 16 | .000694339 |
| 5 | rating | product | SS3 | 4 | 53.3333 | 16 | .000694339 |

Calculate all reliabilities in one fell swoop

| `_NAME_` | msw | msb | wms | ems | edf | bms | bdf | jms | jdf | k |
|---|---|---|---|---|---|---|---|---|---|---|
| rating | 4.66667 | 13.3333 | 4.66667 | 0.83333 | 8 | 13.3333 | 4 | 20 | 2 | 3 |

| `_NAME_` | theta | Winer reliability: single score | Winer reliability: mean of k scores | Shrout-Fleiss reliability: single score | Shrout-Fleiss reliability: random set |
|---|---|---|---|---|---|
| rating | 0.61905 | 0.38235 | 0.65 | 0.38235 | 0.47170 |

| `_NAME_` | Shrout-Fleiss reliability: fixed set | Shrout-Fleiss reliability: mean k scores | Shrout-Fleiss rel: rand set mean k scrs | Shrout-Fleiss rel: fxd set mean k scrs |
|---|---|---|---|---|
| rating | 0.83333 | 0.65 | 0.72816 | 0.9375 |

*EXAMPLE 2:*-
This example illustrates using various methods in SAS Software to compute the
reliability statistics presented in MacLennan (1993).
- MacLennan, R.N. (1993), "Interrater Reliability
With SPSS for Windows 5.0,"
*The American Statistician*, 47(4), 292-296.

In data set TABLE1, each score on page 293 of the article is referenced by a JUDGE / PAIR combination. This is the format that PROC GLM will need. The data set will be transposed later for PROC CORR.

data table1;
  length pair $ 6;
  /* Column input: the judge name must occupy columns 1-7 of each data line */
  input judges $ 1-7 @@;
  do pair='first','second','third','fourth','fifth','sixth';
    input score @;
    output;
  end;
  datalines;
France  5.9 5.8 5.5 5.6 5.4 5.5
Czech   5.9 5.7 5.7 5.8 5.5 5.2
Austl   5.9 5.8 5.7 5.6 5.4 5.2
USA     5.9 5.7 5.8 5.7 5.4 5.3
Germany 5.8 5.7 5.3 5.4 5.4 5.1
Canada  5.8 5.7 5.6 5.6 5.4 5.4
Italy   5.9 5.8 5.5 5.3 5.4 5.2
Unified 5.9 5.8 5.7 5.6 5.4 5.2
UK      5.9 5.8 5.3 5.7 5.5 5.3
;
proc print noobs;
  title 'Interrater Reliability with the %INTRACC macro';
run;

The following statements invoke the %INTRACC macro and compute the following:

- (1b) is given automatically as type III MS for Judges
- (4b) = S-F fixed set
- (3b) = S-F fixed set mean

/* Define the INTRACC macro */
%inc "<location of your file containing the INTRACC macro>";

%intracc(data=table1,target=pair,depvar=score,rater=judges,print=3);

Abbreviated output from the %INTRACC macro call follows:

Interrater Reliability with the %INTRACC macro

Intraclass Correlations for Inter-Rater Reliability

The GLM Procedure: Dependent Variable: score

| Source | DF | Sum of Squares | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| Model | 13 | 2.35796296 | 0.18138177 | 16.14 | <.0001 |
| Error | 40 | 0.44962963 | 0.01124074 | | |
| Corrected Total | 53 | 2.80759259 | | | |

| R-Square | Coeff Var | Root MSE | score Mean |
|---|---|---|---|
| 0.839852 | 1.900168 | 0.106022 | 5.579630 |

| Source | DF | Type I SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| pair | 5 | 2.18537037 | 0.43707407 | 38.88 | <.0001 |
| judges | 8 | 0.17259259 | 0.02157407 | 1.92 | 0.0838 |

| Source | DF | Type III SS | Mean Square | F Value | Pr > F |
|---|---|---|---|---|---|
| pair | 5 | 2.18537037 | 0.43707407 | 38.88 | <.0001 |
| judges | 8 | 0.17259259 | 0.02157407 | 1.92 | 0.0838 |

Calculate all reliabilities in one fell swoop

| `_NAME_` | msw | msb | wms | ems | edf | bms | bdf | jms | jdf |
|---|---|---|---|---|---|---|---|---|---|
| score | 0.012963 | 0.43707 | 0.012963 | 0.011241 | 40 | 0.43707 | 5 | 0.021574 | 8 |

| `_NAME_` | k | theta | Winer reliability: single score | Winer reliability: mean of k scores | Shrout-Fleiss reliability: single score | Shrout-Fleiss reliability: random set |
|---|---|---|---|---|---|---|
| score | 9 | 3.63524 | 0.78426 | 0.97034 | 0.78426 | 0.78495 |

| `_NAME_` | Shrout-Fleiss reliability: fixed set | Shrout-Fleiss reliability: mean k scores | Shrout-Fleiss rel: rand set mean k scrs | Shrout-Fleiss rel: fxd set mean k scrs |
|---|---|---|---|---|
| score | 0.80803 | 0.97034 | 0.97046 | 0.97428 |

Compute (2b) from the output data set. (3b) as above can be computed here too.

data alpha;
  set _stats_;
  if upcase(_source_)='JUDGES' and _type_='SS1' then do;
    alpha=1-(1/f);
    stat='reliability/consistency across skaters';
    output;
  end;
  if upcase(_source_)='PAIR' and _type_='SS1' then do;
    alpha=1-(1/f);
    stat='reliability/consistency across judges';
    output;
  end;
run;
proc print data=alpha noobs;
  var alpha stat;
  title;
run;

The above statements produce the following output:

| alpha | stat |
|---|---|
| 0.47897 | reliability/consistency across skaters |
| 0.97428 | reliability/consistency across judges |
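A short derivation, offered as a sketch, shows why `alpha=1-(1/f)` reproduces a value already printed by the macro. In the two-way model without interaction, each F statistic is the effect mean square divided by the residual mean square. The symbol MS with a subscript is shorthand introduced here; EMS follows the macro's `ems` column.

```latex
% alpha from an F statistic, with F = MS_effect / EMS
\[
\alpha = 1 - \frac{1}{F}
       = 1 - \frac{\mathrm{EMS}}{\mathrm{MS}_{\text{effect}}}
       = \frac{\mathrm{MS}_{\text{effect}} - \mathrm{EMS}}{\mathrm{MS}_{\text{effect}}}
\]
% Effect = pair (the targets), so MS_effect = BMS: same form as ICC(3,k)
\[
\alpha_{\text{pair}} = 1 - \frac{0.01124074}{0.43707407} \approx 0.97428
\]
% Effect = judges
\[
\alpha_{\text{judges}} = 1 - \frac{0.01124074}{0.02157407} \approx 0.47897
\]
```

The first value equals the "rel: fxd set mean k scrs" entry (0.97428) in the %INTRACC output above, which is consistent with the usual identity between Cronbach's alpha and ICC(3,k).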

title 'Another approach pp. 294-5: Transpose and use PROC CORR ALPHA';
proc transpose data=table1 out=out(where=(upcase(_name_) eq 'SCORE'));
  id pair;
  by judges notsorted;
run;
proc print data=out noobs;
  title2 'examine the skaters';
run;
proc corr alpha data=out;
  var first -- sixth;
run;

The above statements produce the following abbreviated output:

Another approach pp. 294-5: Transpose and use PROC CORR ALPHA

examine the skaters

| judges | `_NAME_` | first | second | third | fourth | fifth | sixth |
|---|---|---|---|---|---|---|---|
| France | score | 5.9 | 5.8 | 5.5 | 5.6 | 5.4 | 5.5 |
| Czech | score | 5.9 | 5.7 | 5.7 | 5.8 | 5.5 | 5.2 |
| Austl | score | 5.9 | 5.8 | 5.7 | 5.6 | 5.4 | 5.2 |
| USA | score | 5.9 | 5.7 | 5.8 | 5.7 | 5.4 | 5.3 |
| Germany | score | 5.8 | 5.7 | 5.3 | 5.4 | 5.4 | 5.1 |
| Canada | score | 5.8 | 5.7 | 5.6 | 5.6 | 5.4 | 5.4 |
| Italy | score | 5.9 | 5.8 | 5.5 | 5.3 | 5.4 | 5.2 |
| Unified | score | 5.9 | 5.8 | 5.7 | 5.6 | 5.4 | 5.2 |
| UK | score | 5.9 | 5.8 | 5.3 | 5.7 | 5.5 | 5.3 |

Cronbach Coefficient Alpha

| Variables | Alpha |
|---|---|
| Raw | 0.478970 |
| Standardized | 0.537838 |

proc sort data=table1 out=table1;
  by pair;
run;
proc transpose data=table1 out=out(where=(upcase(_name_) eq 'SCORE'));
  id judges;
  by pair;
run;
proc print data=out noobs;
  title2 'examine the judges';
run;
proc corr alpha data=out;
  var france -- uk;
run;

The above statements produce the following abbreviated output:

Another approach pp. 294-5: Transpose and use PROC CORR ALPHA

examine the judges

| pair | `_NAME_` | France | Czech | Austl | USA | Germany | Canada | Italy | Unified | UK |
|---|---|---|---|---|---|---|---|---|---|---|
| fifth | score | 5.4 | 5.5 | 5.4 | 5.4 | 5.4 | 5.4 | 5.4 | 5.4 | 5.5 |
| first | score | 5.9 | 5.9 | 5.9 | 5.9 | 5.8 | 5.8 | 5.9 | 5.9 | 5.9 |
| fourth | score | 5.6 | 5.8 | 5.6 | 5.7 | 5.4 | 5.6 | 5.3 | 5.6 | 5.7 |
| second | score | 5.8 | 5.7 | 5.8 | 5.7 | 5.7 | 5.7 | 5.8 | 5.8 | 5.8 |
| sixth | score | 5.5 | 5.2 | 5.2 | 5.3 | 5.1 | 5.4 | 5.2 | 5.2 | 5.3 |
| third | score | 5.5 | 5.7 | 5.7 | 5.8 | 5.3 | 5.6 | 5.5 | 5.7 | 5.3 |

Cronbach Coefficient Alpha

| Variables | Alpha |
|---|---|
| Raw | 0.974282 |
| Standardized | 0.977863 |

Right-click on the link below and select **Save** to save the %INTRACC macro definition to a file. It is recommended that you name the file `intracc.sas`.

The %INTRACC macro calculates six intraclass correlations. It also calculates the reliability of the mean of nrater ratings (where nrater is specified) using the Spearman-Brown prophecy formula.

#### Operating System and Release Information

Type: | Sample |

Topic: | Analytics ==> Nonparametric Analysis; SAS Reference ==> Procedures ==> GLM; Analytics ==> Descriptive Statistics; Analytics ==> Regression; Analytics ==> Longitudinal Analysis; SAS Reference ==> Procedures ==> CORR; Analytics ==> Analysis of Variance |

Date Modified: | 2007-08-14 03:03:11 |

Date Created: | 2005-01-17 08:41:30 |

| Product Family | Product | Host | SAS Release (Starting) | SAS Release (Ending) |
|---|---|---|---|---|
| SAS System | SAS/STAT | All | n/a | n/a |