Consider multiple studies in which two (or more) diagnostic devices are each compared to the gold standard. The response is binary, resulting in a 2x2 table for each device against the gold standard. For each table, a statistic of interest is the sensitivity (or specificity). A common, overall estimate of sensitivity, with a confidence interval, is desired for each device.
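For reference, the two statistics named above can be written in terms of the cells of one device's 2x2 table. The cell labels TP, FN, TN, and FP (true positives, false negatives, true negatives, false positives) and the symbols D (device result) and G (gold standard result) are notation assumed here for illustration and are not used elsewhere in this note:

```latex
\text{Sensitivity} = \frac{TP}{TP + FN} = \Pr(D = 1 \mid G = 1),
\qquad
\text{Specificity} = \frac{TN}{TN + FP} = \Pr(D = 0 \mid G = 0)
```

Sensitivity therefore depends only on the subjects who are positive on the gold standard, and specificity only on those who are negative.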
The studies are assumed to use independent groups of subjects. However, within a study, subjects might provide responses on multiple diagnostics, resulting in correlation that must be accounted for in the analysis. Alternatively, an independent set of subjects could be used within the studies for each diagnostic, which removes any correlation among the responses. In the example below, two diagnostic devices are evaluated using subjects that respond on both devices.
Given the data from all studies, an analysis can be done that estimates the common sensitivity for each device across the studies, taking into account any correlation caused by subjects providing responses on both devices. The analysis, using a logistic GEE model, can estimate the common sensitivities along with their standard errors, allowing for confidence intervals and a comparison. A similar approach could be used for other 2x2 table statistics such as specificity, positive predictive value (PPV), or negative predictive value (NPV).
The following DATA step generates some example data. This step generates, for each subject (ID), random binary values on the gold standard (GOLD) and each device. While it doesn't try to build in specific sensitivity values or any correlation within subjects, it will serve to illustrate the analysis. It creates data for two devices and three studies, but the proposed analysis could be used for more of both. Note that each subject has two observations created and the variable, DIAG, holds the binary responses from the two devices (DEV). This data structure is needed for GEE analysis:
    data f;
       call streaminit(42633);
       do study=1 to 3;
          do rep=1 to 20;
             drop rep;
             id+1;
             gold=rand('bernoulli',.5);
             dev=1; diag=rand('bernoulli',.5); output;
             dev=2; diag=rand('bernoulli',.5); output;
          end;
       end;
    run;
The following produces the 2x2 table (not shown) for each device against the gold standard within each study:
    proc sort data=f;
       by study dev;
    run;
    proc freq data=f;
       by study dev;
       table diag*gold;
    run;
As shown in SAS Note 24170, the sensitivity in each table is the column percentage in the 1,1 cell. Since it is only necessary to work with the GOLD=1 column, the sensitivity can be seen as simply the event probability (DIAG=1) in that column. The following statements again obtain the sensitivity values and also save them in a data set. The PRINT step displays the observed sensitivities, and the MEANS step computes the simple average of the observed sensitivities for each device. Of course, this does not take into account any correlation among the measures within subjects:
    proc freq data=f;
       where gold=1;
       by study dev;
       table diag;
       ods output onewayfreqs=frqs(where=(diag=1));
    run;
    proc print;
       id study dev;
       var percent;
       title "Sensitivity for each device in each study";
    run;
    proc means mean;
       class dev;
       var percent;
       title "Average sensitivities across studies";
    run;
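(Output not shown: PROC PRINT listing titled "Sensitivity for each device in each study" and PROC MEANS table titled "Average sensitivities across studies".)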
The following analysis fits a saturated logistic GEE model that reproduces the observed sensitivities for each device in each study and estimates the overall device sensitivities across the studies. PROC GEE or PROC GENMOD can be used, but PROC GEE is the recommended procedure for fitting GEE models. As in the second FREQ step above, only the GOLD=1 data are used.
The REPEATED statement accounts for the correlated pairs of responses from each subject within a study. Note that if there is only one diagnostic device evaluated in each study or if there are multiple diagnostics per study that are evaluated by independent groups of subjects, then the REPEATED statement is not needed:
    proc gee data=f;
       where gold=1;
       class id study dev;
       model diag(event='1')=study|dev / dist=binomial;
       repeated subject=id(study);
       lsmeans study*dev / ilink cl;
       lsmeans dev / ilink cl diff;
    run;
The first LSMEANS statement provides the sensitivity for each device in each study. The ILINK and CL options produce columns labeled with "Mean" providing the sensitivities, standard errors, and confidence limits. The values in the Estimate column are estimated logits (log odds). Note that the sensitivity values match the observed values above:
(Output not shown: STUDY*DEV Least Squares Means table.)
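The relationship between the Estimate column (logit scale) and the Mean column (sensitivity scale) is the inverse logit transformation. As a brief sketch, with the symbols \hat{\eta} (estimated logit), \hat{p} (estimated sensitivity), and L and U (confidence limits on the logit scale) introduced here only for illustration:

```latex
\hat{p} = \frac{1}{1 + e^{-\hat{\eta}}},
\qquad
\bigl[\hat{p}_{L},\, \hat{p}_{U}\bigr]
  = \left[\frac{1}{1 + e^{-L}},\; \frac{1}{1 + e^{-U}}\right]
```

For example, an estimated logit of 0 corresponds to a sensitivity of 0.5, and an estimated logit of log(1.5), about 0.4055, corresponds to a sensitivity of 0.6. A standard delta-method approximation for the standard error on the sensitivity scale is \hat{p}(1-\hat{p}) times the standard error of the logit; this is offered as the usual approximation rather than as a statement of the procedure's exact computation.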
The second LSMEANS statement estimates the overall sensitivity of each diagnostic device across the studies. The estimated overall sensitivity for each device is similar to the average of its observed sensitivities found by PROC FREQ above. Note that the Estimate in the Differences table is the estimated difference in logits (log odds ratio) and not the difference in sensitivities. The difference test indicates no significant difference between the overall sensitivities (p=0.5815), as expected with these completely randomly generated data:
(Output not shown: DEV Least Squares Means table and Differences of DEV Least Squares Means table.)
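To see why a difference in logits is a log odds ratio and not a difference in sensitivities, write p_1 and p_2 for the overall sensitivities of devices 1 and 2 (symbols assumed here for illustration):

```latex
\operatorname{logit}(p_1) - \operatorname{logit}(p_2)
  = \log\frac{p_1}{1 - p_1} - \log\frac{p_2}{1 - p_2}
  = \log\!\left(\frac{p_1/(1 - p_1)}{p_2/(1 - p_2)}\right)
```

Exponentiating the Estimate in the Differences table therefore gives an estimated odds ratio comparing the two devices. A difference of 0 corresponds to an odds ratio of 1, which is equivalent to equal sensitivities.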
If a statistic that summarizes the comparison of sensitivities is desired, such as the estimated difference (risk difference) or ratio (relative risk), it can be obtained by providing information from the PROC GEE (or GENMOD) analysis to the NLMeans macro. See the documentation of the NLMeans macro (SAS Note 62362), which provides details and examples.
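Returning to the earlier remark that a similar approach could be used for other 2x2 table statistics, the following is a minimal sketch of how the same model might be adapted to specificity. It assumes the data set F and the variables created earlier in this note. The only changes from the sensitivity analysis are the WHERE condition (GOLD=0 instead of GOLD=1) and the modeled event (DIAG=0 instead of DIAG=1):

```sas
/* Sketch only: common specificity per device, mirroring the sensitivity model. */
/* Assumes data set F with variables ID, STUDY, DEV, GOLD, and DIAG.            */
proc gee data=f;
   where gold=0;                                       /* gold standard negatives only     */
   class id study dev;
   model diag(event='0')=study|dev / dist=binomial;    /* models P(DIAG=0 | GOLD=0)        */
   repeated subject=id(study);                         /* paired device responses          */
   lsmeans study*dev / ilink cl;                       /* specificity by study and device  */
   lsmeans dev / ilink cl diff;                        /* overall specificity, comparison  */
run;
```

As before, the columns labeled with "Mean" would hold the estimates on the probability scale, here interpreted as specificities. Predictive values condition on the device result rather than on the gold standard, so an analogous analysis for PPV would presumably subset on DIAG=1 and model GOLD as the response; that variation is not worked out here.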
Product Family: | SAS System |
Product: | SAS/STAT |
Type: | Usage Note |
Priority: | |
Topic: | Analytics ==> Categorical Data Analysis Analytics ==> Longitudinal Analysis SAS Reference ==> Procedures ==> GEE SAS Reference ==> Procedures ==> GENMOD |
Date Modified: | 2025-01-17 15:40:17 |
Date Created: | 2025-01-16 11:59:58 |