Analysis of Variance

The Analysis of Variance table summarizes the sources of variation in the data. Sum of Squares measures the variation present in the data and is calculated by summing squared deviations. There are three sources of variation: Model, Error, and C Total. The Model row in the table corresponds to the variation among class means. The Error row corresponds to ${\epsilon}$ in the model and represents the variation of observations within classes. C Total is the total sum of squares corrected for the mean, and it is the sum of Model and Error. Degrees of Freedom, DF, are associated with each sum of squares and are related in the same way: the C Total DF is the sum of the Model DF and the Error DF. Mean Square is the Sum of Squares divided by its associated DF (Moore and McCabe 1989, p. 685).
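The decomposition described above can be sketched in a few lines of Python. This is a minimal illustration, not the software's own computation: the data, the class labels, and the function name `anova_table` are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical data: three classes with four observations each.
groups = {
    "A": [4.0, 5.0, 6.0, 5.0],
    "B": [6.0, 7.0, 8.0, 7.0],
    "C": [9.0, 8.0, 10.0, 9.0],
}

def anova_table(groups):
    """Return the Model, Error, and C Total rows of a one-way ANOVA table."""
    values = [y for ys in groups.values() for y in ys]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n

    # Model: squared deviations of class means from the grand mean,
    # weighted by the number of observations in each class.
    ss_model = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2
        for ys in groups.values()
    )
    # Error: squared deviations of observations from their own class mean.
    ss_error = sum(
        (y - sum(ys) / len(ys)) ** 2
        for ys in groups.values()
        for y in ys
    )
    # C Total: squared deviations of observations from the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in values)

    df_model, df_error = k - 1, n - k
    return {
        "Model": {"DF": df_model, "SS": ss_model, "MS": ss_model / df_model},
        "Error": {"DF": df_error, "SS": ss_error, "MS": ss_error / df_error},
        "C Total": {"DF": n - 1, "SS": ss_total},
    }

table = anova_table(groups)
for source, row in table.items():
    print(source, row)
```

For any data set, the Model and Error sums of squares should add up to C Total, and the same holds for the DF column, which gives a quick check on the computation.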

If the data are normally distributed, the ratio of the Mean Square for the Model to the Mean Square for Error is an F statistic. This F statistic tests the null hypothesis that all the class means are the same against the alternative hypothesis that the means are not all equal. Think of the ratio as a comparison of the variation among class means to the variation within classes. The larger the ratio, the more evidence that the means are not the same. The computed F statistic (labeled F Stat) is 6.0276. You can use the p-value (labeled Pr > F) to determine whether to reject the null hypothesis. The p-value, also referred to as the probability value or observed significance level, is the probability of obtaining (by chance alone) an F statistic at least as large as the computed F statistic when the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis.
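As a rough sketch of how the F statistic and its p-value relate, the Python below forms the ratio of mean squares for a small hypothetical data set. It does not reproduce the value 6.0276 reported in the table, because the data behind that table are not shown here. The helper names `f_test` and `p_value_two_model_df` are illustrative assumptions. The p-value helper uses a closed form that holds only when the Model DF equals 2 (three classes); for other cases a statistics library routine, for example `scipy.stats.f.sf`, would be the usual choice.

```python
# Hypothetical data: three classes, so the Model DF is 2.
groups = [
    [4.0, 5.0, 6.0, 5.0],
    [6.0, 7.0, 8.0, 7.0],
    [9.0, 8.0, 10.0, 9.0],
]

def f_test(groups):
    """Return (F statistic, Model DF, Error DF) for a one-way layout."""
    values = [y for ys in groups for y in ys]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(ys) / len(ys) for ys in groups]

    # Variation among class means (Model) and within classes (Error).
    ss_model = sum(len(ys) * (m - grand_mean) ** 2 for ys, m in zip(groups, means))
    ss_error = sum((y - m) ** 2 for ys, m in zip(groups, means) for y in ys)

    df_model, df_error = k - 1, n - k
    f_stat = (ss_model / df_model) / (ss_error / df_error)
    return f_stat, df_model, df_error

def p_value_two_model_df(f_stat, df_error):
    """Upper-tail probability of an F(2, df_error) distribution.

    Assumption: this closed form is valid only when the Model DF is 2.
    """
    return (1.0 + 2.0 * f_stat / df_error) ** (-df_error / 2.0)

f_stat, df_model, df_error = f_test(groups)
assert df_model == 2  # the closed form below requires this
p_value = p_value_two_model_df(f_stat, df_error)
print(f"F Stat = {f_stat:.4f}, Pr > F = {p_value:.6f}")
```

A larger ratio of among-class to within-class variation pushes the F statistic up and the p-value down, which matches the interpretation given above.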

In this example, the p-value is so small that you can clearly reject the null hypothesis and conclude that at least one of the class means differs from the others. At this point, you have demonstrated statistical significance but cannot yet say which class means differ.