Working with SAS Data Sets
Summary statistics on the numeric variables of a SAS data set can be obtained with the SUMMARY statement. These statistics can be computed for subgroups of the data by using the CLASS clause in the SUMMARY statement. The SAVE option in the OPT clause enables you to save the computed statistics in matrices for later use. For example, consider the following statement:
> summary var {height weight} class {sex} stat{mean std} opt{save};

   SEX   Nobs   Variable        MEAN         STD
   ------------------------------------------------
   F        9   HEIGHT      60.58889     5.01833
                WEIGHT      90.11111    19.38391
   M        9   HEIGHT      64.45556     4.90742
                WEIGHT     110.00000    23.84717
   All     18   HEIGHT      62.52222     5.20978
                WEIGHT     100.05556    23.43382
   ------------------------------------------------

This SUMMARY statement gives the mean and standard deviation of the variables HEIGHT and WEIGHT for the two subgroups (female and male) of the data set CLASS. Because the SAVE option is set, the statistics for each variable are stored in a matrix that has the same name as the variable (HEIGHT and WEIGHT), with each column corresponding to a requested statistic and each row corresponding to a subgroup. Two other vectors, SEX and _NOBS_, are also created. The vector SEX contains the two distinct values of the CLASS variable SEX that are used to form the two subgroups. The vector _NOBS_ contains the number of observations in each subgroup.
Note that the combined means and standard deviations of the two subgroups (the All rows in the output) are displayed but not saved.
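The saved matrices can be inspected with the PRINT statement. The following is a minimal sketch, not part of the original example: it assumes that the data set CLASS is available in the default library and can be opened with the USE statement, and the matrix name STATS is introduced here only to supply column labels.

```sas
proc iml;
use class;                     /* assumption: CLASS is in the default library */

/* Same statement as above; SAVE creates SEX, _NOBS_, HEIGHT, and WEIGHT */
summary var {height weight} class {sex} stat{mean std} opt{save};

stats = {"MEAN" "STD"};        /* hypothetical helper: labels for the columns */

print sex _nobs_;              /* subgroup values and their observation counts */
print height[rowname=sex colname=stats];   /* one row per subgroup */
print weight[rowname=sex colname=stats];

close class;
quit;
```

If the statistics are saved as described, HEIGHT and WEIGHT should each have one row per value of SEX and one column per requested statistic, in the order given in the STAT clause.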
More than one CLASS variable can be used, in which case a subgroup is defined by the combination of the values of the CLASS variables.
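As an illustration, the following sketch lists two CLASS variables. It assumes that the data set CLASS also contains a variable named AGE, which is not mentioned in this section; substitute any other variable in your data set.

```sas
proc iml;
use class;                     /* assumption: CLASS is in the default library */

/* AGE is an assumed variable name. Each distinct combination of
   SEX and AGE values defines one subgroup in the displayed table. */
summary var {height weight} class {sex age} stat{mean std};

close class;
quit;
```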
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.