23086 - Error suggests using the AGGREGATE= option or that there is more than one profile

Usage Note 23086: Error suggests using the AGGREGATE= option or that there is more than one profile

The AGGREGATE= option is available in the LOGISTIC, GENMOD, and PROBIT procedures and applies to binomial or multinomial response variables. The following information applies to all three procedures. However, the MODEL statements used for illustration here are from PROC GENMOD and therefore include the DIST= option. In PROC LOGISTIC, you request the Pearson and deviance goodness-of-fit statistics by specifying both the SCALE=NONE and AGGREGATE options. In PROC GENMOD, the SCALE= option is not needed to display these statistics but, as described below, the AGGREGATE option is needed. In PROC PROBIT, both the LACKFIT and AGGREGATE options must be specified.
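As a rough sketch of how the same request might look in the other two procedures, the following steps assume a hypothetical data set named MYDATA with a binary response Y and numeric predictors A, B, and C. The data set name and the choice of effects are illustrative assumptions, not part of the original note.

```sas
/* PROC LOGISTIC: SCALE=NONE plus AGGREGATE= requests the       */
/* Pearson and deviance goodness-of-fit statistics.             */
/* MYDATA is a hypothetical data set name.                      */
proc logistic data=mydata;
   model y = a b c a*b a*c / scale=none aggregate=(a b c);
run;

/* PROC PROBIT: LACKFIT plus AGGREGATE. With no variable list,  */
/* the subpopulations are defined by the model's predictors.    */
/* A single response variable is listed in the CLASS statement. */
proc probit data=mydata;
   class y;
   model y = a b c / lackfit aggregate;
run;
```

If the predictors are categorical, they would also need to appear in a CLASS statement.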

The deviance and Pearson chi-square statistics are sums, over subpopulations, of the discrepancies between the observed and predicted response probabilities. The model must produce exactly one predicted probability for each subpopulation; otherwise, the discrepancy for that subpopulation is not defined. To compute these statistics, the procedure has to know how the data was sampled, that is, what the subpopulations are in the data. Typically, the subpopulations are defined by the observed settings of all the predictors in the model, and this is indicated by listing all of the predictor variables in the AGGREGATE= option. In this example, the predictor variables are A, B, and C, and each unique setting of these variables represents a subpopulation:

   model y = a b c a*b a*c / dist=binomial aggregate=(a b c);

However, if you fit the model and decide to remove a variable, this does not alter the fact that the data was sampled from subpopulations defined using that variable. So, the variables that are listed in AGGREGATE= might define subpopulations more finely than the variables in the model. For example, the following MODEL statement removes C from the model but still defines the subpopulations for the deviance statistic using A, B, and C:

   model y = a b a*b / dist=binomial aggregate=(a b c);

For each A-B-C subpopulation there is only one predicted value in the above model. In fact, the predicted value that is produced by the model for a particular setting of A and B would apply to all A-B-C subpopulations that have the same setting of A and B. But you cannot define the subpopulations more coarsely than the variables in the model because then the model would create more than one predicted value within the subpopulations defined by AGGREGATE=. For example, suppose you have the following MODEL statement:

   model y = a b c a*b a*c / dist=binomial aggregate=(a b);

This statement would produce a log message similar to this:

NOTE: The SCALE= option is ignored because there is more than one profile of the
      explanatory variables within the same profile of the aggregate variables.

This message appears because within an A-B subpopulation, there would be different predicted values associated with different levels of C, so a single contribution to the deviance statistic could not be determined.
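One way to see which subpopulations trigger the message is to count the distinct predictor profiles inside each aggregate profile. The following PROC SQL step is a minimal sketch, assuming a hypothetical data set named MYDATA that contains A, B, and C. It lists every A-B combination that spans more than one level of C.

```sas
/* MYDATA is a hypothetical data set name.                    */
/* The in-line view reduces the data to distinct A-B-C        */
/* profiles; the outer query keeps A-B groups that contain    */
/* more than one such profile.                                */
proc sql;
   select a, b, count(*) as n_profiles
      from (select distinct a, b, c from mydata)
      group by a, b
      having count(*) > 1;
quit;
```

If the query returns no rows, each A-B subpopulation contains a single A-B-C profile in the data.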

When you use events/trials syntax to analyze summarized binomial data, PROC GENMOD currently does not allow aggregation beyond the observation level. That is, PROC GENMOD always treats each observation as a subpopulation. In the LOGISTIC or PROBIT procedures, you can use the AGGREGATE= option to further aggregate the binomial observations.
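As an illustration of further aggregation in PROC LOGISTIC, the following sketch assumes a hypothetical summarized data set named SUMMARY with one row per A-B-C cell and hypothetical count variables EVENTS and TRIALS. Because the model contains only A and B, listing A and B in AGGREGATE= pools the rows that share the same A-B setting into one subpopulation.

```sas
/* SUMMARY, EVENTS, and TRIALS are hypothetical names.         */
/* Rows with the same A and B values are combined into a       */
/* single subpopulation for the goodness-of-fit statistics.    */
proc logistic data=summary;
   model events/trials = a b / scale=none aggregate=(a b);
run;
```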

Operating System and Release Information

Product Family	Product	System	SAS Release Reported	SAS Release Fixed*
SAS System	SAS/STAT	All	n/a	

* For software releases that are not yet generally available, the Fixed Release is the software release in which the problem is planned to be fixed.

Type:	Usage Note
Priority:	low
Topic:	SAS Reference ==> Procedures ==> GENMOD; SAS Reference ==> Procedures ==> PROBIT; Analytics ==> Regression; Analytics ==> Categorical Data Analysis; SAS Reference ==> Procedures ==> LOGISTIC

Date Modified:	2005-05-17 09:57:55
Date Created:	2002-12-16 10:56:38
