Sample 25020: One-way ANOVA on summary data

Contents: Purpose / Requirements / Usage / Details / Limitations / References

NOTE: For comparing two group means using summary data, use SAS/STAT PROC TTEST. See the example in the TTEST documentation.
PURPOSE:
Perform a one-way analysis of variance on an existing SAS data set that contains only summary data.
REQUIREMENTS:
Version 6 or later of base SAS Software and SAS/STAT Software is required.
USAGE:
Follow the instructions in the Downloads tab of this sample to save the %SUM_GLM macro definition. Replace the text within quotes in the following statement with the location of the %SUM_GLM macro definition file on your system. In your SAS program or in the SAS editor window, specify this statement to define the %SUM_GLM macro and make it available for use:
   %inc "<location of your file containing the SUM_GLM macro>";

Following this statement, you may call the %SUM_GLM macro. See the Results tab for an example.

The input data set, specified in the data= option, should be structured so that each observation contains the summary statistics for a single level of the group= variable. The data set must have variables containing the group levels, sample sizes, means, and standard deviations. Variables for BY-group processing may also be included; if the by= option is specified, the data set must be sorted by the BY variables before calling the %SUM_GLM macro.
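
As an illustration of this layout, the following sketch builds a summary data set with one observation per group level within each BY group and sorts it by the BY variable. The data set name SUMMARY, the variable names STUDY, TRT, N, MEAN, and SD, and all of the values are hypothetical and are not taken from the sample.

```sas
/* Hypothetical summary data: one observation per level of TRT  */
/* within each STUDY. All names and values are illustrative.    */
data summary;
   input study $ trt $ n mean sd;
   datalines;
S1 A 10 23.4 4.1
S1 B 12 27.9 3.8
S1 C  9 21.6 5.0
S2 A 11 22.8 4.4
S2 B 10 26.5 3.9
S2 C 12 20.9 4.7
;
run;

/* Needed only when BY variables will be used in the analysis. */
proc sort data=summary;
   by study;
run;
```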

The following parameters are required when using the macro:

group=
Name of the classification (grouping) variable.
n=
Name of the variable containing sample sizes.
mean=
Name of the variable containing the means.
stddev=
Name of the variable containing the standard deviations.

The following parameters are optional:

data=
Name of the SAS data set containing the summary data. If not specified, the last-created data set is used.
lsopts=
Any valid option for the LSMEANS statement in the GLM Procedure.
by=
Names of any BY variable(s).
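
A call using these parameters might look like the following sketch. It assumes a summary data set named SUMMARY with variables TRT, N, MEAN, and SD; those names are hypothetical. PDIFF is a standard LSMEANS option in PROC GLM that requests p-values for pairwise differences.

```sas
/* Define the macro; replace the quoted text with the file location. */
%inc "<location of your file containing the SUM_GLM macro>";

/* SUMMARY, TRT, N, MEAN, and SD are hypothetical names. */
%sum_glm(data=summary, group=trt, n=n, mean=mean, stddev=sd, lsopts=pdiff)
```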
DETAILS:
The %SUM_GLM macro is based on the methods presented in Larson (1992). That paper gives a method for generating surrogate data that represent the summary data and then analyzes the surrogate data.
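
The idea of surrogate data can be sketched as follows. This is not the macro's code (see the Downloads tab for that); it is one construction, assumed here to be in the spirit of Larson (1992), that reproduces each group's sample size, mean, and standard deviation exactly by using two distinct values per group. For a group with size n, mean m, and standard deviation s, the value m + s/sqrt(n) is given frequency n-1 and the value m - (n-1)s/sqrt(n) is given frequency 1. The construction requires n of at least 2. All data set names, variable names, and values below are hypothetical.

```sas
/* Hypothetical summary data; names and values are illustrative only. */
data summary;
   input trt $ n mean sd;
   datalines;
A 10 23.4 4.1
B 12 27.9 3.8
C  9 21.6 5.0
;
run;

/* Two surrogate values per group, weighted by FREQ. */
data surrogate;
   set summary;
   /* n-1 copies of a value just above the group mean */
   y = mean + sd / sqrt(n);
   freq = n - 1;
   output;
   /* one value placed so that the group mean and variance are preserved */
   y = mean - (n - 1) * sd / sqrt(n);
   freq = 1;
   output;
   keep trt y freq;
run;

/* Check: with FREQ applied, each group should return its original */
/* sample size, mean, and standard deviation.                      */
proc means data=surrogate n mean std;
   class trt;
   var y;
   freq freq;
run;
```

Because the surrogate values match only the first two moments of each group, they support the ANOVA and mean comparisons but not analyses that depend on the shape of the original distributions.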

The macro creates the data set _WORKING, which can be analyzed directly with PROC GLM by including the following FREQ statement:

freq freq;

The response variable in this data set is named Y. If the GLM analysis done by the macro is not exactly as desired, you can use PROC GLM to reanalyze the _WORKING data set by including the above FREQ statement in your GLM step.
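
A reanalysis step might look like the following sketch. The response Y and the frequency variable FREQ come from the description above. The group variable name TRT is hypothetical, and the sketch assumes that _WORKING keeps the group variable under the name passed in group=; check the contents of _WORKING (for example, with PROC CONTENTS) before relying on this.

```sas
/* Reanalyze the surrogate data created by %SUM_GLM.            */
/* TRT is a hypothetical group variable name (an assumption).   */
proc glm data=_working;
   class trt;
   freq freq;                             /* required: weights the surrogate values */
   model y = trt;
   lsmeans trt / pdiff adjust=tukey cl;   /* Tukey-adjusted pairwise comparisons */
run;
quit;
```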

LIMITATIONS:
Only one-way models are addressed in Larson's paper and in this macro. No error checking is performed in the macro.
REFERENCES:
Larson, David A. (1992), "Analysis of Variance With Just Summary Statistics as Input," American Statistician, 46, 151-152.



These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.