Example 9 for PROC LOGISTIC


/****************************************************************/
/*          S A S   S A M P L E   L I B R A R Y                 */
/*                                                              */
/*    NAME: LOGIEX9                                             */
/*   TITLE: Example 9 for PROC LOGISTIC                         */
/* PRODUCT: STAT                                                */
/*  SYSTEM: ALL                                                 */
/*    KEYS: logistic regression analysis,                       */
/*          binomial response data                              */
/*   PROCS: LOGISTIC                                            */
/*    DATA:                                                     */
/*                                                              */
/* SUPPORT: Bob Derr                                            */
/*     REF: SAS/STAT User's Guide, PROC LOGISTIC chapter        */
/*    MISC:                                                     */
/*                                                              */
/****************************************************************/

/*****************************************************************
Example 9. Goodness-of-Fit Tests and Subpopulations
*****************************************************************/

/*
A study is done to investigate the effects of two binary factors, A and B, on
a binary response, Y.  Subjects are randomly selected from subpopulations
defined by the four possible combinations of levels of A and B.  The number
of subjects responding with each level of Y is recorded and entered into data
set One.

First, a full model is fit to examine the main effects of A and B as well as
the interaction effect of A and B. Note that the Pearson and deviance
goodness-of-fit tests cannot be obtained for this model because the full
model contains four parameters, leaving no residual degrees of freedom.
For a binary response model, the goodness-of-fit tests have m-q degrees of
freedom, where m is the number of subpopulations and q is the number of model
parameters. For the full model, m=q=4, resulting in zero degrees of freedom
for the tests.
*/

title 'Example 9: Goodness-of-Fit Tests and Subpopulations';

data One;
   do A=0,1;
      do B=0,1;
         do Y=1,2;
            input F @@;
            output;
         end;
      end;
   end;
   datalines;
23 63 31 70 67 100 70 104
;
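
/*
As an optional check (not part of the original example), the frequencies can
be listed by subpopulation before any model is fit. The following step is a
minimal sketch: it uses the WEIGHT statement in PROC FREQ so that F is
treated as a cell count, and it should list the four A-by-B combinations
that make up the subpopulations, each with its two levels of Y.
*/

proc freq data=One;
   weight F;               /* F holds cell counts, not individual records */
   tables A*B*Y / list;    /* one row per combination of A, B, and Y      */
run;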


proc logistic data=One;
   freq F;
   model Y=A B A*B;
run;


/*
Results of the model fit above show that neither the A*B interaction nor the
B main effect is significant. If a reduced model containing only the A effect
is fit, q=2 and m-q=2 degrees of freedom become available for the tests.
Specifying the SCALE=NONE option requests the Pearson and deviance
statistics.  With single-trial syntax, the AGGREGATE= option is needed to
define the subpopulations in the study.

Specifying AGGREGATE=(A B) creates subpopulations of the four combinations of
levels of A and B. Although the B effect is being dropped from the model, it
is still needed to define the original subpopulations in the study. If
AGGREGATE=(A) were specified, only two subpopulations would be created from
the levels of A, resulting in m=q=2 and zero degrees of freedom for the
tests.
*/

proc logistic data=One;
   freq F;
   model Y=A / scale=none aggregate=(A B);
run;
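
/*
As a point of comparison, the same reduced model can presumably be fit with
events/trials syntax, where each observation is treated as a subpopulation
by default, so the AGGREGATE= option should not be needed. The following is
a sketch and not part of the original example. The data set name Two and the
variable names Events and Trials are hypothetical. Events holds the Y=1
frequencies from data set One, and Trials holds the Y=1 and Y=2 frequencies
summed within each combination of A and B.
*/

data Two;                  /* hypothetical data set, one row per subpopulation */
   input A B Events Trials;
   datalines;
0 0 23  86
0 1 31 101
1 0 67 167
1 1 70 174
;

proc logistic data=Two;
   /* four observations define m=4 subpopulations; the model has q=2 */
   model Events/Trials = A / scale=none;
run;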