The IRT Procedure (Experimental)

Example 51.1 Unidimensional IRT Models

This example illustrates the features that PROC IRT provides for unidimensional analysis. The data come from the 1978 Quality of American Life Survey, which was administered to a sample of U.S. residents aged 18 years and older. In this survey, subjects were asked to rate their satisfaction with many different aspects of their lives on a seven-point scale. This example uses eight items, which measure people’s satisfaction in the following areas: community, neighborhood, dwelling unit, life in the United States, amount of education received, own health, job, and how spare time is spent. For illustrative purposes, the first five items are dichotomized and the last three items are collapsed into three levels.
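The recoding described above can be sketched in a DATA step. This is only an illustration. The raw variable names raw1-raw8, the two made-up observations, and every cut point are assumptions, because the text does not state how the original seven-point ratings were dichotomized or collapsed.

```sas
/* Hypothetical raw data: raw1-raw8 hold the original 1-7 ratings. */
/* The two observations are invented for illustration only.        */
data RawDemo;
   input raw1-raw8;
   datalines;
7 3 4 2 6 1 4 7
2 5 5 6 3 5 2 3
;

data RecodeDemo(keep=item1-item8);
   set RawDemo;
   array raw{8}  raw1-raw8;
   array item{8} item1-item8;
   do i = 1 to 8;
      /* First five items: dichotomize (cut point of 5 is assumed). */
      if i <= 5 then item{i} = (raw{i} >= 5);
      /* Last three items: collapse to levels 1, 2, 3 (cut points assumed). */
      else if raw{i} <= 2 then item{i} = 1;
      else if raw{i} <= 5 then item{i} = 2;
      else item{i} = 3;
   end;
run;

proc print data=RecodeDemo;
run;
```

Missing ratings are not handled in this sketch and would need a separate rule before any comparison is applied.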

The following DATA step creates the data set IrtUni.

data IrtUni;
   input item1-item8 @@;
   datalines;
1 0 0 0 1 1 2 1 1 1 1 1 1 3 3 3 0 1 0 0 1 1 1 1 1 0 0 1 0 1 2 3 0 0 0
0 0 1 1 1 1 0 0 1 0 1 3 3 0 0 0 0 0 1 1 3 0 0 1 0 0 1 2 2 0 1 0 0 1 1

   ... more lines ...   

3 3 0 1 0 0 1 2 2 1 
; 

Because all the items are designed to measure subjects’ satisfaction in different aspects of their lives, it is reasonable to start with a unidimensional IRT model. The following statements fit such a model by using several user-specified options:

ods graphics on;
proc irt data=IrtUni link=probit pinitial itemfit plots=ICC;
   var item1-item8;
   model item1-item4/resfunc=twop, item5-item8/resfunc=graded;
run;
ods graphics off;

The LINK= option specifies the probit link function. The PINITIAL option displays the initial parameter estimates, and the ITEMFIT option displays item fit statistics. You can use the PLOTS= option in the PROC IRT statement to request different plots; in this example, PLOTS=ICC requests item characteristic curves.

In this example, you use the MODEL statement to specify different response models for different items. The specifications in the MODEL statement indicate that the first four items, item1 to item4, are fit by using the two-parameter model, whereas the last four items, item5 to item8, are fit by using the graded response model.

Output 51.1.1 displays two tables. From the Modeling Information table, you can observe that the link function has changed from the default LOGIT link to the specified PROBIT link. The Item Information table shows that item1 to item5 each have two levels and item6 to item8 each have three levels. The last column shows the raw values of these different levels.

Output 51.1.1: Basic Information

The IRT Procedure

Modeling Information
Data Set WORK.IRTUNI
Link Function Probit
Number of Items 8
Number of Factors 1
Number of Observations Read 500
Number of Observations Used 500
Estimation Method Marginal Maximum Likelihood

Item Information
Response Model   Item    Levels   Values
TwoP             item1   2        0 1
                 item2   2        0 1
                 item3   2        0 1
                 item4   2        0 1
Graded           item5   2        0 1
                 item6   3        1 2 3
                 item7   3        1 2 3
                 item8   3        1 2 3


PROC IRT produces the Eigenvalues of the Polychoric Correlation Matrix table in Output 51.1.2 by default. You can use these eigenvalues to assess the dimension of latent factors. For this example, the fact that only the first eigenvalue is greater than 1 suggests that a one-factor model for the items is reasonable.

Output 51.1.2: Eigenvalues of Polychoric Correlations

The IRT Procedure

Eigenvalues of the Polychoric Correlation Matrix
  Eigenvalue Difference Proportion Cumulative
1 3.11870486 2.12497677 0.3898 0.3898
2 0.99372809 0.10025986 0.1242 0.5141
3 0.89346823 0.03116998 0.1117 0.6257
4 0.86229826 0.10670185 0.1078 0.7335
5 0.75559640 0.17795713 0.0944 0.8280
6 0.57763928 0.10080017 0.0722 0.9002
7 0.47683911 0.15511333 0.0596 0.9598
8 0.32172578   0.0402 1.0000
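As a quick check of the eigenvalue rule, the following sketch reads the eigenvalues from Output 51.1.2 into a data set, recomputes the proportion columns, and flags eigenvalues greater than 1. The data set name EigenCheck and the variable GreaterThanOne are illustrative names, not part of the PROC IRT output.

```sas
/* Eigenvalues copied from Output 51.1.2 */
data EigenCheck;
   input Number Eigenvalue;
   Proportion = Eigenvalue / 8;       /* eight items, so the eigenvalues sum to 8 */
   Cumulative + Proportion;           /* running total (sum statement)            */
   GreaterThanOne = (Eigenvalue > 1); /* 1 if the eigenvalue exceeds 1, else 0    */
   datalines;
1 3.11870486
2 0.99372809
3 0.89346823
4 0.86229826
5 0.75559640
6 0.57763928
7 0.47683911
8 0.32172578
;

proc print data=EigenCheck noobs;
   format Proportion Cumulative 6.4;
run;
```

Only the first row has GreaterThanOne equal to 1. The second eigenvalue, 0.99372809, is close to the cutoff, so the rule is best treated as a guide rather than a strict test.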


The PINITIAL option in the PROC IRT statement displays the Initial Item Parameter Estimates table, shown in Output 51.1.3.

Output 51.1.3: Initial Parameter Estimates

The IRT Procedure

Initial Item Parameter Estimates
Response Model   Item    Parameter     Estimate
TwoP             item1   Threshold      0.27840
                         Slope          1.05346
                 item2   Threshold      0.55106
                         Slope          0.93973
                 item3   Threshold      0.36946
                         Slope          0.82826
                 item4   Threshold      0.25533
                         Slope          0.50906
Graded           item5   Threshold     -0.35914
                         Slope          0.41380
                 item6   Threshold 1   -0.21462
                         Threshold 2    0.72372
                         Slope          0.36063
                 item7   Threshold 1   -0.58249
                         Threshold 2    0.44507
                         Slope          0.64191
                 item8   Threshold 1   -0.79898
                         Threshold 2    0.18222
                         Slope          0.67591


Output 51.1.4 includes tables that are related to the optimization. The Optimization Information table shows that the log likelihood is approximated by using seven adaptive Gauss-Hermite quadrature points and then maximized by using the quasi-Newton algorithm. The number of free parameters in this example is 19 (two for each of the five binary items and three for each of the three-level items). The Iteration History table shows, for each iteration, the number of function evaluations, the value of the objective function (the negative log likelihood divided by the number of subjects), the change in the objective function, and the maximum gradient. This information is useful for monitoring the optimization. The convergence status appears at the bottom of Output 51.1.4: the optimization converges according to the GCONV=0.00000001 criterion.

Output 51.1.4: Optimization Information

The IRT Procedure

Optimization Information
Optimization Technique Quasi-Newton
Likelihood Approximation Adaptive Gauss-Hermite Quadrature
Number of Quadrature Points 7
Number of Free Parameters 19

Iteration History
Iteration   Evaluations   Objective Function   Change        Max Gradient
0           2             6.19423744            6.19423744   0.015499
1           5             6.19269765           -0.00153979   0.005785
2           8             6.19256563           -0.00013202   0.003812
3           10            6.19249848           -0.00006716   0.003284
4           12            6.19245354           -0.00004493   0.004647
5           15            6.19243615           -0.00001739   0.001284
6           18            6.19242917           -0.00000698   0.000491
7           21            6.19242859           -0.00000058   0.000192
8           24            6.19242845           -0.00000013   0.000104
9           27            6.19242842           -0.00000004   0.000051
10          30            6.19242841           -0.00000001   0.000011

Convergence criterion (GCONV=.000000010) satisfied.


Output 51.1.5 displays the model fit and item fit statistics. The item fit statistics apply only to binary items, which is why they are missing for item6 to item8.

Output 51.1.5: Fit Statistics

The IRT Procedure

Model Fit Statistics
Log Likelihood -3096.214205
AIC (Smaller is Better) 6230.4284096
BIC (Smaller is Better) 6310.5059634
Likelihood Ratio 825.73117957

Item Fit Statistics
Response Model   Item    Chi-Square   Likelihood Ratio
TwoP             item1   34.16744     49.40020
                 item2   30.34826     37.53096
                 item3   27.54620     36.34605
                 item4   22.76004     26.13437
Graded           item5   18.32348     19.68465
                 item6   .            .
                 item7   .            .
                 item8   .            .
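The information criteria in the Model Fit Statistics table and the final objective function value in the iteration history can be recomputed from the log likelihood. The following sketch assumes the usual definitions, AIC = -2 log L + 2k and BIC = -2 log L + k log(n), with k = 19 free parameters and n = 500 observations taken from the earlier tables. The variable names are illustrative.

```sas
data _null_;
   LogLik = -3096.214205;             /* from the Model Fit Statistics table       */
   k = 19;                            /* number of free parameters (4*2 + 2 + 3*3) */
   n = 500;                           /* number of observations used               */
   Objective = -LogLik / n;           /* objective function in the iteration history */
   AIC = -2*LogLik + 2*k;
   BIC = -2*LogLik + k*log(n);        /* LOG is the natural logarithm in SAS */
   put Objective= 12.8 AIC= 14.7 BIC= 14.7;
run;
```

Up to rounding, the values written to the log should agree with the final objective function value in Output 51.1.4 and with the AIC and BIC values in Output 51.1.5.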


The last table for this example is the Item Parameter Estimates table in Output 51.1.6. This table contains parameter estimates, standard errors, and p-values. These p-values suggest that all the parameters are significantly different from zero.

Output 51.1.6: Parameter Estimates

The IRT Procedure

Item Parameter Estimates
Response Model   Item    Parameter     Estimate   Standard Error   Pr > |t|
TwoP             item1   Threshold      0.26898   0.08081          0.0004
                         Slope          0.98378   0.14144          <.0001
                 item2   Threshold      0.54245   0.08382          <.0001
                         Slope          0.90006   0.13111          <.0001
                 item3   Threshold      0.36666   0.07487          <.0001
                         Slope          0.79519   0.11392          <.0001
                 item4   Threshold      0.25561   0.06380          <.0001
                         Slope          0.50431   0.08567          <.0001
Graded           item5   Threshold     -0.36195   0.06334          <.0001
                         Slope          0.45385   0.08238          <.0001
                 item6   Threshold 1   -0.21154   0.06001          0.0002
                         Threshold 2    0.72508   0.06552          <.0001
                         Slope          0.35769   0.06777          <.0001
                 item7   Threshold 1   -0.59688   0.07436          <.0001
                         Threshold 2    0.46832   0.07240          <.0001
                         Slope          0.72675   0.09313          <.0001
                 item8   Threshold 1   -0.82590   0.08102          <.0001
                         Threshold 2    0.19222   0.07075          0.0033
                         Slope          0.76385   0.09754          <.0001


Item characteristic curves (ICC) are also produced in this example. By default, these ICC plots are displayed in panels. To display an individual ICC plot for each item, use the UNPACK suboption in the PLOTS= option in the PROC IRT statement.

Output 51.1.7: ICC Plots
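A minimal sketch of requesting one ICC plot per item follows. It assumes that the UNPACK suboption is written in parentheses after the ICC plot request, which is the usual convention for ODS Graphics plot requests. Check the PLOTS= syntax for your release before relying on it.

```sas
ods graphics on;
/* Same model as before; UNPACK is assumed to be a suboption of ICC */
proc irt data=IrtUni link=probit plots=icc(unpack);
   var item1-item8;
   model item1-item4/resfunc=twop, item5-item8/resfunc=graded;
run;
ods graphics off;
```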


Now, suppose your research hypothesis includes some equality constraints on the model parameters; for example, the slopes for the first four items are equal. You can specify such equality constraints by using the EQUALITY statement. The following statements constrain the slope parameters of the first four items to be equal:

proc irt data=IrtUni;
   var item1-item8;
   model item1-item4/resfunc=twop, item5-item8/resfunc=graded;
   equality item1-item4/parm=[slope];
run;

To estimate the factor score for each subject and add these scores to the original data set, you can use the OUT= option in the PROC IRT statement. PROC IRT provides three factor score estimation methods: maximum likelihood (ML), maximum a posteriori (MAP), and expected a posteriori (EAP). You can choose an estimation method by using the SCOREMETHOD= option in the PROC IRT statement. The default method is maximum a posteriori. The following statements save the factor scores, along with the original data, to a SAS data set called IrtUniFscore:

proc irt data=IrtUni out=IrtUniFscore;
   var item1-item8;
   model item1-item4/resfunc=twop,
         item5-item8/resfunc=graded;
   equality item1-item4/parm=[slope];
run;
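To use a scoring method other than the default, you can add the SCOREMETHOD= option to the same statements. The following sketch requests EAP scores and prints the first few observations. The output data set name IrtUniEap is an illustrative choice, and PROC CONTENTS is used to look up the score variable rather than assuming its name.

```sas
proc irt data=IrtUni out=IrtUniEap scoremethod=eap;
   var item1-item8;
   model item1-item4/resfunc=twop,
         item5-item8/resfunc=graded;
   equality item1-item4/parm=[slope];
run;

/* List the variables that PROC IRT added to the output data set */
proc contents data=IrtUniEap;
run;

/* Inspect the first five subjects and their factor scores */
proc print data=IrtUniEap(obs=5);
run;
```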

Sometimes you might find it useful to sort the items based on the estimated threshold or slope parameters. You can do this by outputting the ODS tables for the estimates into data sets and then sorting the items by using PROC SORT. A simulated data set is used to show the steps.

The following DATA step creates the data set IrtSimu:

data IrtSimu;
   input item1-item25 @@;
   datalines;
1 1 1 0 1 1 0 0 1 1 0 0 0 0 1 0 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 1 1 0 0
0 0 0 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 0 0 1 0 0 0 1 0 0 0 0 0 1 0 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 0 0 0 1 1 0 0 0 0 0 0
0 1 1 1 0 0 0 0 0 0 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 1 1 0 1 1 0 0 1 1 0
1 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 1 1 1 1 1 0 1 0 1 1

   ... more lines ...   

1 1 0 1 1 1 1 1 1 1 0 1 1 0 0 0 1 1 0 1 1 1 1 1 1 0 0 1 0 1 0 1 0 0 0
0 0 0 0 0 0 0 0 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 0 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 1 1 1 1 1 
; 

First, you fit the model and save the parameter estimates table to a SAS data set by using the ODS OUTPUT statement:

proc irt data=IrtSimu link=probit;
   var item1-item25;
   ods output ParameterEstimates=ParmEst;
run;

Output 51.1.8 shows the Item Parameter Estimates table. Notice that the threshold and slope parameters are in the same column. This layout avoids an extremely wide table when each item has many parameters.

Output 51.1.8: Item Parameter Estimates

The IRT Procedure

Item Parameter Estimates
Item Parameter Estimate Standard Error Pr > |t|
item1 Threshold -1.91107 0.16631 <.0001
  Slope 1.44110 0.16075 <.0001
item2 Threshold -1.81557 0.17418 <.0001
  Slope 1.82037 0.18988 <.0001
item3 Threshold -1.98287 0.17861 <.0001
  Slope 1.58597 0.17477 <.0001
item4 Threshold -2.04595 0.19855 <.0001
  Slope 1.86637 0.20430 <.0001
item5 Threshold -1.92289 0.18283 <.0001
  Slope 1.78212 0.19061 <.0001
item6 Threshold -0.98960 0.08956 <.0001
  Slope 1.04069 0.10266 <.0001
item7 Threshold -0.94513 0.10609 <.0001
  Slope 1.45215 0.13449 <.0001
item8 Threshold -0.99508 0.10138 <.0001
  Slope 1.30275 0.12209 <.0001
item9 Threshold -1.08825 0.11406 <.0001
  Slope 1.50541 0.14220 <.0001
item10 Threshold -0.92408 0.12210 <.0001
  Slope 1.82138 0.17132 <.0001
item11 Threshold -0.01606 0.08162 0.4220
  Slope 1.26068 0.11259 <.0001
item12 Threshold 0.08284 0.11270 0.2312
  Slope 2.01803 0.19989 <.0001
item13 Threshold 0.18240 0.07711 0.0090
  Slope 1.12994 0.10180 <.0001
item14 Threshold 0.02184 0.10701 0.4191
  Slope 1.88710 0.18045 <.0001
item15 Threshold 0.06979 0.07072 0.1619
  Slope 0.96280 0.09035 <.0001
item16 Threshold -1.01960 0.10006 <.0001
  Slope 1.25213 0.11865 <.0001
item17 Threshold -0.94652 0.08762 <.0001
  Slope 1.02801 0.10107 <.0001
item18 Threshold -0.93988 0.11111 <.0001
  Slope 1.58224 0.14842 <.0001
item19 Threshold -0.95541 0.08643 <.0001
  Slope 0.97859 0.09745 <.0001
item20 Threshold -0.95462 0.12926 <.0001
  Slope 1.95452 0.18808 <.0001
item21 Threshold -0.88018 0.10365 <.0001
  Slope 1.45124 0.13402 <.0001
item22 Threshold -0.89293 0.11727 <.0001
  Slope 1.74235 0.16226 <.0001
item23 Threshold -1.09262 0.10057 <.0001
  Slope 1.20130 0.11604 <.0001
item24 Threshold -0.98436 0.12054 <.0001
  Slope 1.74203 0.16361 <.0001
item25 Threshold -0.87611 0.10503 <.0001
  Slope 1.48751 0.13758 <.0001


Then you split the estimates in the data set ParmEst into two separate data sets, one for the threshold parameters and one for the slope parameters:

data Thresholds(keep=Item Threshold);
   set ParmEst;
   Threshold = Estimate;
   if (Parameter = "Threshold") then output;
run;
proc print data=Thresholds;
run;
data Slopes(keep=Item Slope);
   set ParmEst;
   Slope = Estimate;
   if (Parameter = "Slope") then output;
run;
proc print data=Slopes;
run;
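As an alternative to two DATA steps, the same long-to-wide reshaping can be sketched with PROC TRANSPOSE. This assumes, as in the DATA steps above, that ParmEst contains the variables Item, Parameter, and Estimate, and that each item has exactly one threshold and one slope. The output name ParmWide is illustrative.

```sas
/* One row per item, with Threshold and Slope as separate columns */
proc transpose data=ParmEst out=ParmWide(drop=_name_);
   by Item notsorted;   /* rows for each item are adjacent but not alphabetical */
   id Parameter;        /* values "Threshold" and "Slope" become variable names */
   var Estimate;
run;

proc print data=ParmWide;
run;
```

This approach yields both parameters in one data set, so no later merge is needed. It fails if an item has several rows with the same Parameter value.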

The two SAS data sets are shown in Output 51.1.9 and Output 51.1.10.

Output 51.1.9: The Threshold Parameter SAS Data Set

Obs Item Threshold
1 item1 -1.91107
2 item2 -1.81557
3 item3 -1.98287
4 item4 -2.04595
5 item5 -1.92289
6 item6 -0.98960
7 item7 -0.94513
8 item8 -0.99508
9 item9 -1.08825
10 item10 -0.92408
11 item11 -0.01606
12 item12 0.08284
13 item13 0.18240
14 item14 0.02184
15 item15 0.06979
16 item16 -1.01960
17 item17 -0.94652
18 item18 -0.93988
19 item19 -0.95541
20 item20 -0.95462
21 item21 -0.88018
22 item22 -0.89293
23 item23 -1.09262
24 item24 -0.98436
25 item25 -0.87611


Output 51.1.10: The Slope Parameter SAS Data Set

Obs Item Slope
1 item1 1.44110
2 item2 1.82037
3 item3 1.58597
4 item4 1.86637
5 item5 1.78212
6 item6 1.04069
7 item7 1.45215
8 item8 1.30275
9 item9 1.50541
10 item10 1.82138
11 item11 1.26068
12 item12 2.01803
13 item13 1.12994
14 item14 1.88710
15 item15 0.96280
16 item16 1.25213
17 item17 1.02801
18 item18 1.58224
19 item19 0.97859
20 item20 1.95452
21 item21 1.45124
22 item22 1.74235
23 item23 1.20130
24 item24 1.74203
25 item25 1.48751


Now you can use PROC SORT to sort the items by either threshold or slope as follows:

proc sort data=Thresholds;
   by Threshold;
run;
proc print data=Thresholds;
run;
proc sort data=Slopes;
   by Slope;
run;
proc print data=Slopes;
run;

Output 51.1.11 and Output 51.1.12 show the sorted data sets.

Output 51.1.11: Items Sorted by Threshold

Obs Item Threshold
1 item4 -2.04595
2 item3 -1.98287
3 item5 -1.92289
4 item1 -1.91107
5 item2 -1.81557
6 item23 -1.09262
7 item9 -1.08825
8 item16 -1.01960
9 item8 -0.99508
10 item6 -0.98960
11 item24 -0.98436
12 item19 -0.95541
13 item20 -0.95462
14 item17 -0.94652
15 item7 -0.94513
16 item18 -0.93988
17 item10 -0.92408
18 item22 -0.89293
19 item21 -0.88018
20 item25 -0.87611
21 item11 -0.01606
22 item14 0.02184
23 item15 0.06979
24 item12 0.08284
25 item13 0.18240


Output 51.1.12: Items Sorted by Slope

Obs Item Slope
1 item15 0.96280
2 item19 0.97859
3 item17 1.02801
4 item6 1.04069
5 item13 1.12994
6 item23 1.20130
7 item16 1.25213
8 item11 1.26068
9 item8 1.30275
10 item1 1.44110
11 item21 1.45124
12 item7 1.45215
13 item25 1.48751
14 item9 1.50541
15 item18 1.58224
16 item3 1.58597
17 item24 1.74203
18 item22 1.74235
19 item5 1.78212
20 item2 1.82037
21 item10 1.82138
22 item4 1.86637
23 item14 1.88710
24 item20 1.95452
25 item12 2.01803
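To list the items in descending order, you can add the DESCENDING keyword to the BY statement. The following sketch also uses the OUT= option so that the original Slopes data set is left unchanged. The name SlopesDesc is illustrative.

```sas
proc sort data=Slopes out=SlopesDesc;
   by descending Slope;   /* largest slope first */
run;

proc print data=SlopesDesc;
run;
```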


Note that this sorting does not work correctly if any item has more than one threshold (ordinal response) or more than one slope (multidimensional model).

Now, suppose you want to group the items into subgroups based on their difficulty parameters and then sort the items in each subgroup by their slope parameters. First, you need to merge the two data sets, Thresholds and Slopes, into one data set. Then, you transform the threshold parameter into the difficulty parameter (the threshold divided by the slope) and add another variable, called DiffLevel, to indicate the subgroups. The following statements show these steps:

proc sort data=Slopes;
   by Item;
run;

proc sort data=Thresholds;
   by Item;
run;

data ItemEst;
   merge Thresholds Slopes;
   by Item;
   Dif = Threshold/Slope;
   if Dif < -1.0 then DiffLevel = 1;
   else if Dif < 0 then DiffLevel = 2;
   else if Dif < 1 then DiffLevel = 3;
   else DiffLevel = 4;
run;
proc print data=ItemEst;
run;

Output 51.1.13 shows the merged data set.

Output 51.1.13: The Merged SAS Data Set

Obs Item Threshold Slope Dif DiffLevel
1 item1 -1.91107 1.44110 -1.32612 1
2 item10 -0.92408 1.82138 -0.50735 2
3 item11 -0.01606 1.26068 -0.01274 2
4 item12 0.08284 2.01803 0.04105 3
5 item13 0.18240 1.12994 0.16142 3
6 item14 0.02184 1.88710 0.01157 3
7 item15 0.06979 0.96280 0.07249 3
8 item16 -1.01960 1.25213 -0.81430 2
9 item17 -0.94652 1.02801 -0.92073 2
10 item18 -0.93988 1.58224 -0.59402 2
11 item19 -0.95541 0.97859 -0.97632 2
12 item2 -1.81557 1.82037 -0.99736 2
13 item20 -0.95462 1.95452 -0.48842 2
14 item21 -0.88018 1.45124 -0.60650 2
15 item22 -0.89293 1.74235 -0.51249 2
16 item23 -1.09262 1.20130 -0.90953 2
17 item24 -0.98436 1.74203 -0.56506 2
18 item25 -0.87611 1.48751 -0.58898 2
19 item3 -1.98287 1.58597 -1.25026 1
20 item4 -2.04595 1.86637 -1.09622 1
21 item5 -1.92289 1.78212 -1.07899 1
22 item6 -0.98960 1.04069 -0.95091 2
23 item7 -0.94513 1.45215 -0.65085 2
24 item8 -0.99508 1.30275 -0.76383 2
25 item9 -1.08825 1.50541 -0.72290 2


Then, you can sort the items by slope within each difficulty group as follows:

proc sort data=ItemEst;
   by DiffLevel Slope;
run;
proc print data=ItemEst;
run;

Output 51.1.14 shows the data set after sorting.

Output 51.1.14: Items Sorted by Slope within Each Difficulty Group

Obs Item Threshold Slope Dif DiffLevel
1 item1 -1.91107 1.44110 -1.32612 1
2 item3 -1.98287 1.58597 -1.25026 1
3 item5 -1.92289 1.78212 -1.07899 1
4 item4 -2.04595 1.86637 -1.09622 1
5 item19 -0.95541 0.97859 -0.97632 2
6 item17 -0.94652 1.02801 -0.92073 2
7 item6 -0.98960 1.04069 -0.95091 2
8 item23 -1.09262 1.20130 -0.90953 2
9 item16 -1.01960 1.25213 -0.81430 2
10 item11 -0.01606 1.26068 -0.01274 2
11 item8 -0.99508 1.30275 -0.76383 2
12 item21 -0.88018 1.45124 -0.60650 2
13 item7 -0.94513 1.45215 -0.65085 2
14 item25 -0.87611 1.48751 -0.58898 2
15 item9 -1.08825 1.50541 -0.72290 2
16 item18 -0.93988 1.58224 -0.59402 2
17 item24 -0.98436 1.74203 -0.56506 2
18 item22 -0.89293 1.74235 -0.51249 2
19 item2 -1.81557 1.82037 -0.99736 2
20 item10 -0.92408 1.82138 -0.50735 2
21 item20 -0.95462 1.95452 -0.48842 2
22 item15 0.06979 0.96280 0.07249 3
23 item13 0.18240 1.12994 0.16142 3
24 item14 0.02184 1.88710 0.01157 3
25 item12 0.08284 2.01803 0.04105 3