Working with SAS Data Sets
Summary statistics on the numeric variables of a SAS data set can be obtained with the SUMMARY statement. These statistics can be based on subgroups of the data by using the CLASS clause in the SUMMARY statement. The SAVE option in the OPT clause enables you to save the computed statistics in matrices for later perusal. For example, consider the following statement.
   > summary var {height weight} class {sex} stat{mean std} opt{save};

   SEX    Nobs  Variable       MEAN        STD
   ------------------------------------------------
   F         9  HEIGHT     60.58889    5.01833
                WEIGHT     90.11111   19.38391

   M         9  HEIGHT     64.45556    4.90742
                WEIGHT    110.00000   23.84717

   All      18  HEIGHT     62.52222    5.20978
                WEIGHT    100.05556   23.43382
   ------------------------------------------------
This SUMMARY statement gives the mean and standard deviation of the variables HEIGHT and WEIGHT for the two subgroups (male and female) of the data set CLASS. Because the SAVE option is set, the statistics are stored in matrices named after the corresponding variables (here, HEIGHT and WEIGHT). In each matrix, each column corresponds to a statistic and each row corresponds to a subgroup. Two other vectors, SEX and _NOBS_, are also created. The vector SEX contains the two distinct values of the CLASS variable SEX that form the two subgroups. The vector _NOBS_ contains the number of observations in each subgroup.
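The saved matrices can be inspected with a PRINT statement. The following is a minimal sketch, assuming a data set named CLASS with the variables SEX, HEIGHT, and WEIGHT is available in the default library; the PROC IML wrapper and the USE and CLOSE statements are added here for completeness and are not part of the original example.

```sas
proc iml;
   /* assumption: CLASS is the data set used in the example above */
   use class;

   /* compute per-subgroup statistics and save them as matrices */
   summary var {height weight} class {sex} stat{mean std} opt{save};

   /* SEX and _NOBS_ have one row per subgroup;                    */
   /* HEIGHT and WEIGHT have one row per subgroup and one column   */
   /* per requested statistic (MEAN, then STD)                     */
   print sex _nobs_ height weight;

   close class;
quit;
```

Because all four matrices have one row per subgroup, they can be printed side by side in a single PRINT statement.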
Note that the combined means and standard deviations of the two subgroups are displayed but not saved.
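If the combined mean is needed later, it can be recovered from the saved matrices as a weighted average of the subgroup means. The following sketch makes the same assumptions as the example statement (a data set named CLASS with the variables SEX and HEIGHT); the matrix name overallMean is hypothetical. The combined standard deviation cannot be recovered this simply and is not attempted here.

```sas
proc iml;
   /* assumption: CLASS is the data set used in the example above */
   use class;
   summary var {height} class {sex} stat{mean std} opt{save};

   /* column 1 of HEIGHT holds the subgroup means;               */
   /* weight each mean by its subgroup size from _NOBS_          */
   overallMean = t(_nobs_) * height[,1] / sum(_nobs_);
   print overallMean;

   close class;
quit;
```

With the subgroup sizes and means shown in the listing (9 observations each, means 60.58889 and 64.45556), this weighted average agrees with the displayed combined mean of HEIGHT up to rounding.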
More than one CLASS variable can be used, in which case a subgroup is defined by the combination of the values of the CLASS variables.
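As a sketch of a two-way grouping, the following statement forms one subgroup for each combination of SEX and a second variable. The variable AGE is an assumption for illustration; substitute any additional variable that exists in your data set.

```sas
proc iml;
   /* assumption: CLASS contains a variable named AGE in addition to SEX */
   use class;

   /* one subgroup for each observed combination of SEX and AGE */
   summary var {height weight} class {sex age} stat{mean std};

   close class;
quit;
```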
Copyright © SAS Institute, Inc. All Rights Reserved.