![]() | ![]() | ![]() | ![]() |
For matched pairs data with a binary response (such as yes/no responses from husband and wife pairs), the AGREE option in PROC FREQ provides a test of equal probability of a Yes response. This is McNemar's test of marginal homogeneity. However, with paired data, point and confidence interval estimates of the risk difference are not available in PROC FREQ. These can be obtained by fitting a model that adjusts for the correlation within pairs. One way to do this is by fitting a repeated measures model with PROC GENMOD. The risk difference can then be estimated using the Margins macro, the NLMeans macro, or by modeling the event probability directly.
To estimate the risk difference between independent groups, rather than in matched pairs, see this note. For estimating the odds ratio with matched pairs data, see this note.
In this example, 100 husband and wife pairs were asked a question that could be answered Yes or No. The results are recorded in the following data set. The Yes response is coded 1 and No is coded 0. Since only four distinct response combinations are possible, the data can be summarized as the number of pairs (Npairs) with each possible combination.
data hwpairs; input husband wife Npairs; datalines; 1 1 15 1 0 20 0 1 5 0 0 60 ;
To test that the proportion of husbands answering Yes is equal to the number of wives answering Yes, McNemar's test of marginal homogeneity (equal marginal probabilities) can be obtained using the AGREE option in PROC FREQ.
proc freq data=hwpairs; table husband*wife / agree; weight Npairs; run;
The results show that the husband and wife probabilities differ significantly (p=0.0027). The marginal percentages for the HUSBAND=1 row and the WIFE=1 column show that 35% of husbands and 20% of wives answered Yes — a difference of 15%. But PROC FREQ does not directly estimate this difference, nor provide a confidence interval for it. This can be done by fitting a repeated measures model to the data.
|
In order to fit a repeated measures model, individual level data rather than summarized data must be used. The following DATA step expands the summarized data to a data set containing an observation for each member of each pair. A variable identifying the pair (ID) is created. The DO loop splits each observation of the summary data into two observations for each pair in each response combination resulting in 200 observations. Husband or wife is indicated by the MEMBER variable (1=husband, 2=wife), and the responses from all subjects are stored in the RESPONSE variable.
data indiv; set hwpairs; retain id 0; do id=id+1 to id+Npairs; member=1; response=husband; output; member=2; response=wife; output; end; keep id member response; run;
These statements display the observations for the first four pairs which were all from the yes/yes response combination.
proc print data=indiv (obs=8); id id; run;
|
PROC GENMOD can be used to fit a repeated measures logistic model to the individual level data. The NLMeans macro can then be used to estimate the difference of the Yes probabilities.
In the following statements, the EVENT='1' response variable option tells GENMOD to model the probability of a RESPONSE=1 (Yes) is modeled. Since husband and wife responses are considered correlated, the pair identifier (ID) is used in the SUBJECT= option in the REPEATED statement. The LSMEANS statement with the ILINK and CL options estimates the Yes probabilities for husbands and wives and provides confidence intervals. The E option produces a table of the coefficients defining the LSMEANS. This table is saved by the ODS OUTPUT statement. The STORE statement saves the fitted model. The coefficients table and the stored model are used by the NLMeans macro.
proc genmod data=indiv; class id member; model response(event='1') = member / dist=binomial; repeated subject=id; lsmeans member / ilink cl e; ods output coef=c; store geemod; run;
The husband and wife probability estimates are shown in the Mean column of the "member Least Squares Means" table. 95% confidence limits for each are also provided.
|
Using the Margins macro
See the description of the Margins macro for information on making it available in your SAS® session.
Estimates of the response mean for levels of a predictor are predictive margins. For this example, the predictive margins for husbands and wives are estimates of the probability of a Yes response. The risk difference is considered the marginal effect of the MEMBER predictor.
In the following call of the Margins macro, the same logistic GEE model as above is requested by the class=, response=, roptions=, dist=, model=, and geesubject= specifications. The predictive margins for husbands and wives are requested by margins=member. An estimate and confidence interval of the husband-wife risk difference, the marginal effect, is provided by options=diff cl.
%Margins(data = indiv, class = id member, response = response, roptions = event='1', dist = binomial, model = member, geesubject = id, margins = member, options = diff cl)
The estimated predictive margins for husbands and wives match those from the LSMEANS statement above. The MEMBER marginal effect, or difference in husband and wife means, is estimated to be 0.15 with a confidence interval (0.0565, 0.2435). The test of the difference indicates that husbands have significantly higher probability of responding Yes than wives (p=0.0017). The label in the Difference column indicates the direction of the difference is member1-member2 or husband-wife. If the reverse is desired, then specify reverse in options= in the Margins macro call.
![]() |
Using the NLMeans macro
See the description of the NLMeans macro for information on making it available in your SAS® session.
The following call of the NLMeans macro also estimates the difference in the husband and wife probabilities. In addition to specifying the saved model and the table of LSMEANS coefficients, the link function used in the model is specified. A title for the table is also given.
%NLMeans(instore=geemod, coef=c, link=logit, title=Pr(Yes) Difference)
The results match those from the Margins macro above. The difference in the probabilities, 0.15, is shown in the Estimate column of the table along with the standard error of the difference (0.0477) and a confidence interval (0.0565, 0.2435). The test of the difference indicates that the husband and wife probabilities differ (p=0.0017). The Label indicates the direction of the difference is member1-member2 or husband-wife. If the reverse is desired, then specify options=reverse in the NLMeans macro call.
|
Using a linear probability model and LSMEANS/DIFF
Alternatively, the difference in probabilities can be estimated by modeling the probability of a Yes response directly. The DIST=BINOMIAL and LINK=IDENTITY options fit a linear probability model.Note With the probability modeled directly, the LSMEANS estimates are the estimated probabilities and the DIFF option can be used to estimate the difference in probabilities.
proc genmod data=indiv; class id member; model response(event='1') = member / dist=binomial link=identity; repeated subject=id; lsmeans member / diff cl; run;
The husband and wife probability estimates and difference again match the above results from the Margins and NLMeans macros.
|
_____
Note: Unlike the default logit link function, the identity link does not ensure that the model produces valid probability estimates. Errors may be result when fitting such models depending on the model and the data.
Product Family | Product | System | SAS Release | |
Reported | Fixed* | |||
SAS System | SAS/STAT | z/OS | ||
OpenVMS VAX | ||||
Microsoft® Windows® for 64-Bit Itanium-based Systems | ||||
Microsoft Windows Server 2003 Datacenter 64-bit Edition | ||||
Microsoft Windows Server 2003 Enterprise 64-bit Edition | ||||
Microsoft Windows XP 64-bit Edition | ||||
Microsoft® Windows® for x64 | ||||
OS/2 | ||||
Microsoft Windows 8 | ||||
Microsoft Windows 95/98 | ||||
Microsoft Windows 2000 Advanced Server | ||||
Microsoft Windows 2000 Datacenter Server | ||||
Microsoft Windows 2000 Server | ||||
Microsoft Windows 2000 Professional | ||||
Microsoft Windows 2012 | ||||
Microsoft Windows NT Workstation | ||||
Microsoft Windows Server 2003 Datacenter Edition | ||||
Microsoft Windows Server 2003 Enterprise Edition | ||||
Microsoft Windows Server 2003 Standard Edition | ||||
Microsoft Windows Server 2003 for x64 | ||||
Microsoft Windows Server 2008 | ||||
Microsoft Windows Server 2008 for x64 | ||||
Microsoft Windows XP Professional | ||||
Windows 7 Enterprise 32 bit | ||||
Windows 7 Enterprise x64 | ||||
Windows 7 Home Premium 32 bit | ||||
Windows 7 Home Premium x64 | ||||
Windows 7 Professional 32 bit | ||||
Windows 7 Professional x64 | ||||
Windows 7 Ultimate 32 bit | ||||
Windows 7 Ultimate x64 | ||||
Windows Millennium Edition (Me) | ||||
Windows Vista | ||||
Windows Vista for x64 | ||||
64-bit Enabled AIX | ||||
64-bit Enabled HP-UX | ||||
64-bit Enabled Solaris | ||||
ABI+ for Intel Architecture | ||||
AIX | ||||
HP-UX | ||||
HP-UX IPF | ||||
IRIX | ||||
Linux | ||||
Linux for x64 | ||||
Linux on Itanium | ||||
OpenVMS Alpha | ||||
OpenVMS on HP Integrity | ||||
Solaris | ||||
Solaris for x64 | ||||
Tru64 UNIX |
Type: | Usage Note |
Priority: | |
Topic: | Analytics ==> Categorical Data Analysis Analytics ==> Longitudinal Analysis SAS Reference ==> Procedures ==> FREQ SAS Reference ==> Procedures ==> GENMOD SAS Reference ==> Macro |
Date Modified: | 2019-02-27 15:48:45 |
Date Created: | 2012-07-13 15:06:20 |