Distribution Analyses

Observations with missing values for a **Y** variable are not used in the analysis for that variable. Observations with **Weight** or **Freq** values that are missing or that are less than or equal to zero are not used. Only the integer part of **Freq** values is used.
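The exclusion rules above can be sketched in a few lines of Python. This is only an illustration, not SAS/INSIGHT code: the helper names `is_missing` and `usable_observations` are hypothetical, and representing missing values as `None` or NaN is an assumption of this sketch.

```python
import math

def is_missing(value):
    """Treat None and NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))

def usable_observations(y, weight=None, freq=None):
    """Return (y, weight, freq) triples for the rows that enter the analysis."""
    n_rows = len(y)
    weight = [1.0] * n_rows if weight is None else weight
    freq = [1] * n_rows if freq is None else freq
    kept = []
    for yi, wi, fi in zip(y, weight, freq):
        # Drop rows with a missing Y, Weight, or Freq value.
        if is_missing(yi) or is_missing(wi) or is_missing(fi):
            continue
        # Drop rows whose Weight or Freq is less than or equal to zero.
        if wi <= 0 or fi <= 0:
            continue
        # Only the integer part of Freq is used.
        kept.append((yi, wi, math.trunc(fi)))
    return kept

rows = usable_observations(
    y=[2.0, None, 5.0, 7.0, 1.0],
    weight=[1.0, 1.0, 0.0, 2.0, 1.5],
    freq=[1, 1, 1, 2.9, -1],
)
print(rows)  # [(2.0, 1.0, 1), (7.0, 2.0, 2)]
```

Only the first and fourth rows survive: the second has a missing Y value, the third a zero weight, and the fifth a negative frequency. The frequency 2.9 is truncated to 2.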

The following notation is used in the rest of this chapter:

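- $n$ is the number of nonmissing values.
- $y_i$ is the $i$th observed nonmissing value.
- $y_{(i)}$ is the $i$th ordered nonmissing value, $y_{(1)} \le y_{(2)} \le \dots \le y_{(n)}$.
- $\bar{y}$ is the sample mean, $\sum_i y_i / n$.
- $d$ is the variance divisor.
- $s^2$ is the sample variance, $\sum_i (y_i - \bar{y})^2 / d$.
- $z_i$ is the standardized value, $(y_i - \bar{y}) / s$.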

The summation $\sum_i$ represents a summation of $\sum_{i=1}^{n}$.

Based on the variance definition, vardef, the variance divisor $d$ is computed as

$$
d =
\begin{cases}
n - 1 & \text{for vardef=DF, degrees of freedom} \\
n & \text{for vardef=N, number of observations}
\end{cases}
$$
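As a minimal sketch of how the divisor enters the calculation, the following Python computes the sample mean, the sample variance, and the standardized values. It assumes the usual definitions (mean as the sum over $n$, variance as the sum of squared deviations over $d$, with $d = n - 1$ for vardef=DF and $d = n$ for vardef=N). The function names are hypothetical and are not part of SAS/INSIGHT software.

```python
import math

def variance_divisor(n, vardef="DF"):
    """Return the divisor d: n - 1 for vardef=DF, n for vardef=N."""
    if vardef == "DF":
        return n - 1
    if vardef == "N":
        return n
    raise ValueError("this sketch handles only vardef=DF and vardef=N")

def summary(values, vardef="DF"):
    """Return (mean, variance, standardized values) for nonmissing values."""
    n = len(values)
    mean = sum(values) / n
    d = variance_divisor(n, vardef)
    variance = sum((v - mean) ** 2 for v in values) / d
    s = math.sqrt(variance)
    z = [(v - mean) / s for v in values]
    return mean, variance, z

y = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, var_df, z_df = summary(y, "DF")
_, var_n, _ = summary(y, "N")
print(mean, var_df, var_n)
```

For this data the mean is 5 and the sum of squared deviations is 32, so the variance is 32/7 under vardef=DF and 4 under vardef=N.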

The skewness is a measure of the tendency of the deviations from the mean to be larger in one direction than in the other. The sample skewness $g_1$ is calculated as

$$
g_1 =
\begin{cases}
c_{3n} \sum_i z_i^3 & \text{for vardef=DF} \\
\frac{1}{n} \sum_i z_i^3 & \text{for vardef=N}
\end{cases}
$$

where $c_{3n} = \dfrac{n}{(n-1)(n-2)}$.

The kurtosis is primarily a measure of the heaviness of the tails of a distribution. The sample kurtosis $g_2$ is calculated as

$$
g_2 =
\begin{cases}
c_{4n} \sum_i z_i^4 - 3 c_n & \text{for vardef=DF} \\
\frac{1}{n} \sum_i z_i^4 - 3 & \text{for vardef=N}
\end{cases}
$$

where $c_{4n} = \dfrac{n(n+1)}{(n-1)(n-2)(n-3)}$ and $c_n = \dfrac{(n-1)^2}{(n-2)(n-3)}$.
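The skewness and kurtosis calculations can be illustrated with a short Python sketch. It assumes the forms $c_{3n} \sum_i z_i^3$ and $c_{4n} \sum_i z_i^4 - 3 c_n$ for vardef=DF and the plain averages $\sum_i z_i^3 / n$ and $\sum_i z_i^4 / n - 3$ for vardef=N, with the constants as defined above. The function names are hypothetical.

```python
import math

def standardized(values, vardef="DF"):
    """Standardized values z_i, using d = n - 1 (DF) or d = n (N)."""
    if vardef not in ("DF", "N"):
        raise ValueError("this sketch handles only vardef=DF and vardef=N")
    n = len(values)
    mean = sum(values) / n
    d = n - 1 if vardef == "DF" else n
    s = math.sqrt(sum((v - mean) ** 2 for v in values) / d)
    return [(v - mean) / s for v in values]

def skewness(values, vardef="DF"):
    """Sample skewness; vardef=DF needs at least 3 values."""
    n = len(values)
    z3 = sum(z ** 3 for z in standardized(values, vardef))
    if vardef == "DF":
        return n / ((n - 1) * (n - 2)) * z3  # c_3n * sum of z^3
    return z3 / n

def kurtosis(values, vardef="DF"):
    """Sample kurtosis; vardef=DF needs at least 4 values."""
    n = len(values)
    z4 = sum(z ** 4 for z in standardized(values, vardef))
    if vardef == "DF":
        c4n = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        cn = (n - 1) ** 2 / ((n - 2) * (n - 3))
        return c4n * z4 - 3 * cn
    return z4 / n - 3

y = [1.0, 2.0, 3.0, 4.0, 5.0]
print(skewness(y, "DF"), kurtosis(y, "DF"), kurtosis(y, "N"))
```

For the symmetric data 1 through 5, the skewness is zero under both definitions, and the kurtosis works out to -1.2 for vardef=DF and -1.3 for vardef=N.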

When the observations are independently distributed with a common mean and unequal variances, $\sigma_i^2 = \sigma^2 / w_i$, where $w_i$ are individual weights, weighted analyses may be appropriate. You select a **Weight** variable to specify relative weights for each observation in the analysis.

The following notation is used in weighted analyses:

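- $w_i$ is the weight associated with $y_i$.
- $w_{(i)}$ is the weight associated with $y_{(i)}$.
- $\bar{w}$ is the average observation weight, $\sum_i w_i / n$.
- $\bar{y}_w$ is the weighted sample mean, $\sum_i w_i y_i / \sum_i w_i$.
- $s_w^2$ is the weighted sample variance, $\sum_i w_i (y_i - \bar{y}_w)^2 / d$.
- $z_{wi}$ is the standardized value, $(y_i - \bar{y}_w) / (s_w / \sqrt{w_i})$.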

In addition to vardef=DF and vardef=N, the variance divisor $d$ can also be computed as

$$
d =
\begin{cases}
\sum_i w_i - 1 & \text{for vardef=WDF, sum of weights minus 1} \\
\sum_i w_i & \text{for vardef=WGT, sum of weights}
\end{cases}
$$
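A minimal Python sketch of the weighted mean and variance under the four divisor choices follows. It assumes the weighted mean is $\sum_i w_i y_i / \sum_i w_i$ and the weighted variance is $\sum_i w_i (y_i - \bar{y}_w)^2 / d$. The function name `weighted_summary` is hypothetical.

```python
def weighted_summary(values, weights, vardef="DF"):
    """Return (weighted mean, weighted variance) for the chosen vardef."""
    n = len(values)
    sum_w = sum(weights)
    mean_w = sum(w * v for v, w in zip(values, weights)) / sum_w
    divisors = {
        "DF": n - 1,       # degrees of freedom
        "N": n,            # number of observations
        "WDF": sum_w - 1,  # sum of weights minus 1
        "WGT": sum_w,      # sum of weights
    }
    d = divisors[vardef]
    ss_w = sum(w * (v - mean_w) ** 2 for v, w in zip(values, weights))
    return mean_w, ss_w / d

y = [1.0, 2.0, 4.0]
w = [1.0, 2.0, 1.0]
for vardef in ("DF", "N", "WDF", "WGT"):
    print(vardef, weighted_summary(y, w, vardef))
```

Here the weighted mean is 2.25 and the weighted sum of squares is 4.75, so the variance is 2.375 for DF, 4.75/3 for both N and WDF (the sum of weights is 4 and $n$ is 3), and 1.1875 for WGT.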

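With $V(y_i) = \sigma_i^2 = \sigma^2 / w_i$, the expected value of the weighted sum of squares is $E\left(\sum_i w_i (y_i - \bar{y}_w)^2\right) = (n-1)\sigma^2$.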
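The expected value of the weighted sum of squares can be worked out in a few steps. The derivation below is a sketch, assuming independent observations with common mean $\mu$ (a symbol introduced here for illustration) and variances $V(y_i) = \sigma^2 / w_i$:

$$
\begin{aligned}
\sum_i w_i (y_i - \bar{y}_w)^2
  &= \sum_i w_i (y_i - \mu)^2 - \Bigl(\sum_i w_i\Bigr)(\bar{y}_w - \mu)^2, \\
E\Bigl(\sum_i w_i (y_i - \mu)^2\Bigr)
  &= \sum_i w_i \frac{\sigma^2}{w_i} = n\sigma^2, \\
V(\bar{y}_w)
  &= \frac{\sum_i w_i^2 \,\sigma^2 / w_i}{\bigl(\sum_i w_i\bigr)^2}
   = \frac{\sigma^2}{\sum_i w_i}, \\
E\Bigl(\sum_i w_i (y_i - \bar{y}_w)^2\Bigr)
  &= n\sigma^2 - \sigma^2 = (n-1)\sigma^2 .
\end{aligned}
$$

Under these assumptions, dividing by $d = n - 1$ gives a quantity whose expected value is $\sigma^2$, whereas dividing by a function of the sum of weights does not.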

**Note:** The use of vardef=WDF/WGT may not be appropriate, since the resulting $s_w^2$ is a weighted average of individual variances, $\sigma_i^2$, which have unequal expected values.

For vardef=**DF/N**, $s_w^2$ is an estimate of $\sigma^2$, the variance of an observation with unit weight.

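The weighted skewness $g_{1w}$ is computed as

$$
g_{1w} =
\begin{cases}
c_{3n} \sum_i z_{wi}^3 = c_{3n} \sum_i w_i^{3/2} \left( \dfrac{y_i - \bar{y}_w}{s_w} \right)^3 & \text{for DF} \\
\dfrac{1}{n} \sum_i z_{wi}^3 = \dfrac{1}{n} \sum_i w_i^{3/2} \left( \dfrac{y_i - \bar{y}_w}{s_w} \right)^3 & \text{for N}
\end{cases}
$$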

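The weighted kurtosis $g_{2w}$ is computed as

$$
g_{2w} =
\begin{cases}
c_{4n} \sum_i z_{wi}^4 - 3 c_n = c_{4n} \sum_i w_i^{2} \left( \dfrac{y_i - \bar{y}_w}{s_w} \right)^4 - 3 c_n & \text{for DF} \\
\dfrac{1}{n} \sum_i z_{wi}^4 - 3 = \dfrac{1}{n} \sum_i w_i^{2} \left( \dfrac{y_i - \bar{y}_w}{s_w} \right)^4 - 3 & \text{for N}
\end{cases}
$$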

The formulations are invariant under the transformation $w_i^{*} = a\,w_i$ for any constant $a > 0$.
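The invariance to the scale of the weights can be checked numerically. The Python sketch below assumes the standardized value $z_{wi} = \sqrt{w_i}\,(y_i - \bar{y}_w) / s_w$ and the same constants $c_{3n}$, $c_{4n}$, and $c_n$ as in the unweighted case. The function names are hypothetical, and only vardef=DF and vardef=N are handled.

```python
import math

def weighted_z(values, weights, vardef="DF"):
    """Weighted standardized values z_wi for vardef=DF or vardef=N."""
    if vardef not in ("DF", "N"):
        raise ValueError("this sketch handles only vardef=DF and vardef=N")
    n = len(values)
    sum_w = sum(weights)
    mean_w = sum(w * v for v, w in zip(values, weights)) / sum_w
    d = n - 1 if vardef == "DF" else n
    s_w = math.sqrt(
        sum(w * (v - mean_w) ** 2 for v, w in zip(values, weights)) / d
    )
    return [math.sqrt(w) * (v - mean_w) / s_w for v, w in zip(values, weights)]

def weighted_skewness(values, weights, vardef="DF"):
    n = len(values)
    z3 = sum(z ** 3 for z in weighted_z(values, weights, vardef))
    if vardef == "DF":
        return n / ((n - 1) * (n - 2)) * z3
    return z3 / n

def weighted_kurtosis(values, weights, vardef="DF"):
    n = len(values)
    z4 = sum(z ** 4 for z in weighted_z(values, weights, vardef))
    if vardef == "DF":
        c4n = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
        cn = (n - 1) ** 2 / ((n - 2) * (n - 3))
        return c4n * z4 - 3 * cn
    return z4 / n - 3

y = [1.0, 2.0, 4.0, 8.0]
w = [1.0, 2.0, 1.0, 0.5]
w_scaled = [10.0 * wi for wi in w]  # same relative weights, different scale

print(weighted_skewness(y, w), weighted_skewness(y, w_scaled))
print(weighted_kurtosis(y, w), weighted_kurtosis(y, w_scaled))
```

Each printed pair should agree up to floating-point rounding: multiplying every weight by the same positive constant scales $s_w^2$ by that constant and leaves each $z_{wi}$ unchanged.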

To view or change the divisor *d* used in the calculation of variances, or to view or change the use of observations with missing values, click on the **Method** button from the variables dialog to display the method options dialog.

**Figure 38.3:** Distribution Method Options Dialog

By default, SAS/INSIGHT software uses vardef=**DF, degrees of freedom** to compute the variance divisor.

When multiple **Y** variables are analyzed, and some **Y** variables have missing values, the **Use Obs with Missing Values** option uses all observations with nonmissing values for the **Y** variable being analyzed. If the option is turned off, observations with missing values for *any* **Y** variable are not used for any analysis.
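The effect of the **Use Obs with Missing Values** option can be illustrated with a small Python sketch. The function name `values_for_analysis`, the dictionary-based rows, and the use of `None` for missing values are assumptions made for this example only.

```python
def values_for_analysis(rows, variables, use_obs_with_missing=True):
    """Return, per Y variable, the values that enter its analysis."""
    result = {}
    for var in variables:
        if use_obs_with_missing:
            # Option on: keep every row that is nonmissing for this variable.
            kept = [row[var] for row in rows if row[var] is not None]
        else:
            # Option off: keep only rows that are nonmissing for all Y variables.
            kept = [
                row[var]
                for row in rows
                if all(row[v] is not None for v in variables)
            ]
        result[var] = kept
    return result

rows = [
    {"y1": 1.0, "y2": 10.0},
    {"y1": None, "y2": 20.0},
    {"y1": 3.0, "y2": None},
    {"y1": 4.0, "y2": 40.0},
]
option_on = values_for_analysis(rows, ["y1", "y2"], use_obs_with_missing=True)
option_off = values_for_analysis(rows, ["y1", "y2"], use_obs_with_missing=False)
print(option_on)   # {'y1': [1.0, 3.0, 4.0], 'y2': [10.0, 20.0, 40.0]}
print(option_off)  # {'y1': [1.0, 4.0], 'y2': [10.0, 40.0]}
```

With the option on, each variable is analyzed with three values. With it off, the two rows that contain any missing Y value are dropped from both analyses.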

Copyright © 2007 by SAS Institute Inc., Cary, NC, USA. All rights reserved.