Distribution Analysis: Descriptive Statistics

Example

In this example, you generate descriptive statistics for the pressure_outer_isobar variable of the Hurricanes data set. The Hurricanes data set contains 6188 observations of tropical cyclones in the Atlantic basin. The pressure_outer_isobar variable gives the sea-level atmospheric pressure for the outermost closed isobar of a cyclone. This is a measure of the atmospheric pressure at the outermost edge of the storm.


Open the Hurricanes data set.


Select Analysis ► Distribution Analysis ► Descriptive Statistics from the main menu, as shown in Figure 13.1.




Figure 13.1: Selecting the Descriptive Statistics Analysis

A dialog box appears as in Figure 13.2. You can select a variable for the univariate analysis by using the Variables tab.


Select the variable pressure_outer_isobar, and click Set Y.




Figure 13.2: Selecting a Variable



Click the Tables tab.

The Tables tab (Figure 13.3) becomes active.


Select Extreme Values.

Select Missing Values.

Click OK.




Figure 13.3: Selecting Tables

The analysis calls the UNIVARIATE procedure, which uses the options specified in the dialog box. The procedure displays tables in the output document, as shown in Figure 13.4. In addition to basic statistics such as the mean, median, and standard deviation, the tables display a few extreme values that seem incongruous. The Extreme Values table shows that there is one low value (988) and one high value (1032) that require investigation. The Missing Values table reveals that almost 25% of the values for this variable are missing.
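The kind of summary that these tables report can be approximated in a few lines of code. The following is a minimal Python sketch, not the UNIVARIATE procedure itself. The sample values and the describe function are hypothetical and are not taken from the Hurricanes data set; None stands in for a missing value.

```python
import statistics

# Hypothetical sample standing in for pressure_outer_isobar (hPa).
# None marks a missing value. These are NOT values from the Hurricanes data set.
pressures = [1012, 1010, None, 1008, 1013, 1032, 1011, None, 988, 1012]

def describe(values, n_extremes=2):
    """Return basic statistics, extreme values, and missing-value counts."""
    present = sorted(v for v in values if v is not None)
    n_missing = len(values) - len(present)
    return {
        "n": len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "std_dev": statistics.stdev(present),  # sample standard deviation
        "lowest": present[:n_extremes],        # smallest non-missing values
        "highest": present[-n_extremes:],      # largest non-missing values
        "n_missing": n_missing,
        "pct_missing": 100 * n_missing / len(values),
    }

summary = describe(pressures)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Listing the lowest and highest values next to the mean and median is what makes an isolated low or high value stand out, which is the role the Extreme Values table plays in this example.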

Two plots are created. One plot shows a histogram of the selected variable; the other shows a box plot. One plot might be hidden beneath the other.


Figure 13.4: Output from a Descriptive Statistics Analysis

For the pressure_outer_isobar variable, the box plot and the Extreme Values table reveal many outliers. It is often useful to investigate outliers to determine whether they are spurious or miscoded data, or to better understand the extreme limits of the data.
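A common convention for box plots is to mark as outliers the points that lie more than 1.5 times the interquartile range beyond the quartiles. The sketch below assumes that convention; the exact rule and quartile definition used by the software might differ. The data values and the iqr_outliers function are hypothetical.

```python
import statistics

# Hypothetical values (hPa), not taken from the Hurricanes data set.
pressures = [988, 1008, 1010, 1011, 1012, 1012, 1013, 1032]

def iqr_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

outliers = iqr_outliers(pressures)
print(outliers)
```

A value flagged this way is only a candidate for investigation. Whether it is spurious still has to be judged from context, as the following steps do.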


In the box plot, click on the outlier with the highest value of pressure_outer_isobar.

This selects the observation in all views of the data, including the data table. You can use the F3 key to scroll through the data table to the next selected observation.


Activate the data table by clicking on the title bar. Use the F3 key to scroll the selected observation into view.

The selected observation corresponds to Hurricane Isidore, September 28, 1996. Scrolling through the data table reveals that the observations before and after the selected observation had a value of 1012 for pressure_outer_isobar. This might indicate that the outlier value of 1032 is a misrecorded value.

You can examine other outliers similarly.


In the box plot, click on the outlier with the lowest value of pressure_outer_isobar.

Activate the data table by clicking on its title bar. Use the F3 key to scroll the selected observation into view.

This selected observation corresponds to a pressure of 988 hPa for the outermost closed isobar of Hurricane Hugo, September 23, 1989. The data table shows that the observations before the selected observation had considerably larger values of pressure_outer_isobar. Furthermore, the value of min_pressure for the selected observation is 990 hPa, which is larger than the value being investigated. This contradicts the physical expectation that, for a low-pressure system, the minimum central pressure is less than the pressure of the outermost closed isobar. Therefore, the 988 hPa value is most likely misrecorded.
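The consistency argument used here (the minimum central pressure should be less than the pressure of the outermost closed isobar) can also be applied to every observation at once. The following Python sketch illustrates the idea. The field names mirror the variables discussed above, but the records and the inconsistent function are hypothetical.

```python
# Hypothetical records; the values are illustrative only and are
# NOT observations from the Hurricanes data set.
records = [
    {"name": "A", "min_pressure": 985, "pressure_outer_isobar": 1010},
    {"name": "B", "min_pressure": 990, "pressure_outer_isobar": 988},
    {"name": "C", "min_pressure": 1000, "pressure_outer_isobar": None},
]

def inconsistent(rows):
    """Return rows whose central pressure is not below the outer-isobar pressure."""
    flagged = []
    for row in rows:
        inner = row["min_pressure"]
        outer = row["pressure_outer_isobar"]
        if inner is None or outer is None:
            continue  # cannot compare when either value is missing
        if inner >= outer:
            flagged.append(row)
    return flagged

suspect = inconsistent(records)
print([row["name"] for row in suspect])
```

Rows with a missing value are skipped rather than flagged, because a missing measurement gives no evidence of a recording error.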

You can exclude misrecorded observations by using the Exclude from Plots and Exclude from Analysis features of the data table (see Chapter 4, "The Data Table"). Excluding an observation affects all variables. You can also exclude a single misrecorded value by doing the following: replace the erroneous value with a missing value by typing "." (or " " for a character variable) into the data table cell. Save the data if you want to make the change permanent.
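The effect of replacing a single erroneous value with a missing value can be illustrated with a short Python sketch. Here None plays the role of the missing value; the data and the helper functions set_missing and mean_of_present are hypothetical. The original list is left unchanged, which parallels the fact that the change is not permanent until you save the data.

```python
import statistics

# Hypothetical values (hPa); None plays the role of a missing value.
pressures = [1012, 1010, 988, 1011, 1013]

def set_missing(values, index):
    """Return a copy of values with the entry at index replaced by None."""
    cleaned = list(values)
    cleaned[index] = None
    return cleaned

def mean_of_present(values):
    """Mean of the non-missing entries."""
    return statistics.mean(v for v in values if v is not None)

# Replace the suspect value and compare the mean before and after.
cleaned = set_missing(pressures, pressures.index(988))
print(mean_of_present(pressures), mean_of_present(cleaned))
```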
