NOTE: Beginning in SAS 9.1, you can use ODS graphics to create a plot of the ROC curve, but points cannot be labeled. Beginning in SAS 9.2, you can use the PLOTS=ROC(ID=OBS|PROB) option to label points with their observation number or predicted probability. See the LOGISTIC documentation.
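As a rough sketch of the ODS Graphics alternative mentioned in this note, the following assumes SAS 9.2 or later and a data set like the Data1 example used elsewhere in this sample (events/trials response disease/n, predictor age). The PLOTS(ONLY)=ROC(ID=PROB) request asks PROC LOGISTIC to label the ROC points with their predicted probabilities. This is an illustration of the built-in option, not part of the %ROCPLOT macro.

```sas
/* Assumes SAS 9.2 or later and that the Data1 data set already exists. */
ods graphics on;

proc logistic data=Data1 plots(only)=roc(id=prob);
   /* ID=PROB labels points with predicted probabilities; */
   /* ID=OBS would label them with observation numbers.   */
   model disease/n = age;
run;

ods graphics off;
```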
| Version | Update Notes |
| 1.1 | |
| 1.0 | Initial coding. |
Follow the instructions in the Downloads tab of this sample to save the %ROCPLOT macro definition. Replace the text within quotes in the following statement with the location of the %ROCPLOT macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %ROCPLOT macro and make it available for use:
%inc "<location of your file containing the ROCPLOT macro>";
Following this statement, you may call the %ROCPLOT macro. See the Results tab for an example.
The following parameters are required when using the macro:
The following are optional parameters for controlling the graphical appearance:
The version of the %ROCPLOT macro that you are using is displayed when you specify version (or any string) as the first argument. For example:
%ROCPLOT(version, out=logout, outroc=logor, p=p_hat, id=x1 x2)
Use the ROCEPS=0 option in the MODEL statement of PROC LOGISTIC when you fit your model. This option causes all of the unique predicted values to be written to the OUTROC= data set. Otherwise, the values may be rounded, yielding fewer points on the ROC plot.
If several points have the same specificity value, only the top- and bottom-most points are labeled, and only if they are sufficiently separated from other labels (as controlled by the MINDIST= option).
If multiple predictor settings occur at the same point, a high-resolution plot (PLOTTYPE=HIGH) gives no indication of this. However, low-resolution plots (PLOTTYPE=LOW) use characters that indicate the number of settings at a point.
An ROC plot for a large data set may have many points with labels that overlap depending on the values of the SIZE= and MINDIST= options. In the worst cases, a high density of overlapping labels may form a big black blob. You should be able to reduce the number of labeled points and produce a reasonable plot using one or more of these:
The _ROCPLOT data set generated by the macro contains all the information needed to produce the ROC plot. The _ID variable in this data set contains the label for the points on the plot.
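One simple way to inspect that data set is sketched below. It assumes the macro has already been run in the current session, so that _ROCPLOT exists in the WORK library. Only the _ID variable is named in this description, so the sketch lists the variables with PROC CONTENTS rather than assuming any other names.

```sas
/* Assumes %ROCPLOT has already run, so WORK._ROCPLOT exists. */
proc contents data=_rocplot;
run;

/* Print the first few rows, including the _ID label variable. */
proc print data=_rocplot(obs=10);
run;
```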
The %ROCPLOT macro attempts to check for a later version of itself. If it cannot do so (for example, when no internet connection is available), the macro issues the following message:
ROCPLOT: Unable to check for newer version
The computations performed by the macro are not affected by the appearance of this message.
This message:
ROCPLOT: Some predicted values in OUT= did not match predicted values
in OUTROC=. Verify that you used the ROCEPS=0 option in
PROC LOGISTIC.
may appear in the SAS Log if the list of distinct predicted probabilities in the OUTROC= data set is shortened because the ROCEPS=0 option was not used in PROC LOGISTIC, or if, for some other reason, the number of distinct predicted probabilities in the OUTPUT OUT= data set differs from the number of distinct cutoff values in the OUTROC= data set.
This message does not by itself indicate a problem. It is issued to remind you to use the ROCEPS=0 option in case its omission was the cause.
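If you want to see whether the two counts actually differ, a check along the following lines may help. It is a minimal sketch that borrows the data set and variable names from the first example in this sample (OUT=outp with P=phat, and OUTROC=roc1); substitute your own names. _PROB_ is the cutoff variable that PROC LOGISTIC writes to the OUTROC= data set. The comparison relies on exact numeric equality, so treat the counts as a diagnostic hint rather than a definitive test.

```sas
/* Names outp, phat, and roc1 are taken from the first example; adjust as needed. */
proc sql;
   /* Distinct predicted probabilities in the OUTPUT OUT= data set */
   select count(distinct phat) as n_out_probs
      from outp;

   /* Distinct cutoff values in the OUTROC= data set */
   select count(distinct _prob_) as n_roc_cutoffs
      from roc1;
quit;
```

If the second count is smaller than the first, refitting the model with ROCEPS=0 is the first thing to try.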
Note that observations with different settings of the ID= variables may have the same predicted probability, but only one of the settings labels the point on the ROC curve associated with that predicted probability.
Also, two (or more) points on the ROC curve may have the same label because they have the same values of the ID= variables but different predicted probabilities. For example, this may happen if a subset of the predictors is specified in ID= to label the points, and points having the same values of the predictors in ID= differ on one or more of the predictors not in ID=.
An additional example using the %ROCPLOT macro can be seen in %ROC macro examples.
Hanley, J.A. and McNeil, B.J. (1982), "The meaning and use of the area under a receiver operating characteristic (ROC) curve," Radiology, 143, 29-36.
Hanley, J.A. and McNeil, B.J. (1983), "A method of comparing the areas under receiver operating characteristic curves derived from the same cases," Radiology, 148, 839-843.
Metz, C.E. (1978), "Basic principles of ROC analysis," Seminars in Nuclear Medicine, 8(4), 283-298.
An additional example in the Results tab of the ROC macro description shows a comparative plot of several ROC curves and a test of the equality of the areas below the curves.
data Data1;
input disease n age;
datalines;
0 14 25
0 20 35
0 19 45
7 18 55
6 12 65
17 17 75
;
ods select parameterestimates association;
proc logistic data=data1;
   model disease/n = age / outroc=roc1 roceps=0;
   output out=outp p=phat;
   ods output association=assoc;
run;
data _null_;
   set assoc;
   if label2='c' then call symput("area",cvalue2);
run;
The first call of the ROCPLOT macro plots the ROC curve labeling points by the corresponding value of AGE and displays a set of grid lines.
* Define the ROCPLOT macro ;
%inc "<location of your file containing the ROCPLOT macro>";
title "ROC plot for disease = age";
title2 "Approximate area under curve = &area";
%rocplot(outroc = roc1,
         out = outp,
         p = phat,
         id = age,
         grid = yes)
The second ROCPLOT call labels the points on the ROC curve with their associated sensitivity, specificity, and cutoff value.
data outp;
   set outp;
   pfmt=put(phat,6.4);
run;
%rocplot(outroc = roc1,
         out = outp,
         p = phat,
         id = _sens_ _spec_ pfmt,
         grid = yes)
data Remission;
input remiss cell smear infil li blast temp;
datalines;
1 .8 .83 .66 1.9 1.1 .996
1 .9 .36 .32 1.4 .74 .992
0 .8 .88 .7 .8 .176 .982
0 1 .87 .87 .7 1.053 .986
1 .9 .75 .68 1.3 .519 .98
0 1 .65 .65 .6 .519 .982
1 .95 .97 .92 1 1.23 .992
0 .95 .87 .83 1.9 1.354 1.02
0 1 .45 .45 .8 .322 .999
0 .95 .36 .34 .5 0 1.038
0 .85 .39 .33 .7 .279 .988
0 .7 .76 .53 1.2 .146 .982
0 .8 .46 .37 .4 .38 1.006
0 .2 .39 .08 .8 .114 .99
0 1 .9 .9 1.1 1.037 .99
1 1 .84 .84 1.9 2.064 1.02
0 .65 .42 .27 .5 .114 1.014
0 1 .75 .75 1 1.322 1.004
0 .5 .44 .22 .6 .114 .99
1 1 .63 .63 1.1 1.072 .986
0 1 .33 .33 .4 .176 1.01
0 .9 .93 .84 .6 1.591 1.02
1 1 .58 .58 1 .531 1.002
0 .95 .32 .3 1.6 .886 .988
1 1 .6 .6 1.7 .964 .99
1 1 .69 .69 .9 .398 .986
0 1 .73 .73 .7 .398 .986
;
ods select parameterestimates association;
proc logistic data=Remission;
   model remiss = smear blast / outroc=roc1 roceps=0;
   output out=outp p=p;
run;
The LOGISTIC Procedure (output tables not reproduced here)
The ROC plot produced by the following ROCPLOT macro call labels the points with the values of the SMEAR and BLAST predictors corresponding to each cutoff.
* Define the ROCPLOT macro ;
%inc "<location of your file containing the ROCPLOT macro>";
title "ROC plot for remiss = smear blast";
title2 " ";
%rocplot(outroc = roc1,
         out = outp,
         p = p,
         id = smear blast,
         roffset = 10,
         grid = yes)
Note that if the model has multiple predictors, it is possible for multiple settings of the predictors to have the same predicted probability. In such cases, the label used in the plot is just one of these settings.
ods select parameterestimates association;
proc logistic data=ptsd;
   where time=1;
   model ptsd(event="1") = problems / outroc=or roceps=0;
   output out=out p=p;
run;
The LOGISTIC Procedure (output tables not reproduced here)
The following call plots the ROC curve for the model with its points labelled by the number of problems. Notice that labels for several closely spaced points in the upper right of the plot are omitted to avoid overlap.
title "Model PTSD = PROBLEMS";
%rocplot(out=out, outroc=or, p=p, id=problems)
Decreasing the MINDIST= value increases the number of labels in the plot, but this causes some overlapping of labels.
%rocplot(out=out, outroc=or, p=p, id=problems, mindist=.005)
By rounding the cutoff values to the nearest tenth, the number of points on the ROC plot is greatly reduced. The labeling of points may change due to the rounding. For instance, the point labelled problems=6.75 in the plot above is labelled 6.125 in the plot below. This is because rounding effectively forms groups of points and the first point in the group containing the 6.75 point is a point with problems=6.125. It is the label of this first point in the group that appears in the plot below.
%rocplot(out=out, outroc=or, p=p, id=problems, round=.1)
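To get a feel for the grouping that rounding produces, the sketch below tabulates the OUTROC= cutoffs from the PTSD example by their value rounded to the nearest tenth. It is an illustration only and is not the macro's internal code. It assumes the OUTROC= data set is named or, as in the call above, and that the rounding applies to the cutoff variable _PROB_. The names cutoff_groups and cutoff_rounded are made up for this example.

```sas
/* Illustration only: not the macro's own code. */
/* Assumes OUTROC=or from the PTSD example.     */
data cutoff_groups;
   set or;
   /* Round each cutoff to the nearest tenth, as ROUND=.1 is described as doing */
   cutoff_rounded = round(_prob_, 0.1);
run;

/* Each row of this table is one group of original cutoffs */
proc freq data=cutoff_groups;
   tables cutoff_rounded / nocum nopercent;
run;
```

Each distinct rounded value corresponds to one group of original cutoffs, which is why the rounded plot has far fewer points than the unrounded one.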
Right-click on the link below and select Save to save the %ROCPLOT macro definition to a file. It is recommended that you name the file rocplot.sas.
| Type: | Sample |
| Topic: | SAS Reference ==> Procedures ==> PLOT SAS Reference ==> Procedures ==> LOGISTIC SAS Reference ==> Procedures ==> GPLOT Analytics ==> Categorical Data Analysis Analytics ==> Regression |
| Date Modified: | 2008-01-28 10:55:21 |
| Date Created: | 2005-01-13 15:03:46 |
| Product Family | Product | Host | Starting SAS Release | Ending SAS Release |
| SAS System | Base SAS | All | 6.12 | n/a |
| SAS System | SAS/STAT | All | 6.12 | n/a |
| SAS System | SAS/GRAPH | All | 6.12 | n/a |





