Techniques for Exploring Data
This section describes how to exclude selected observations from plots and from statistical analyses. The data table must be the active window in order for you to exclude observations. Select Edit > Observations > Exclude from Plots from the main menu to exclude selected observations from plots. Select Edit > Observations > Exclude from Analyses to exclude selected observations from analyses.
Alternatively, you can right-click on the row heading of any selected observation in the data table and select Exclude from Plots or Exclude from Analyses from the pop-up menu, as shown in Figure 11.4.
Figure 11.4: Data Table Pop-up Menu
The row heading of the data table shows the status of an observation in analyses and plots. A marker symbol indicates that the observation is included in plots; observations excluded from plots do not have a marker symbol shown in the data table. Similarly, a separate symbol in the row heading is present if and only if the observation is included in analyses. For example, the first, fifth, and sixth observations in Figure 11.5 are included in plots and analyses.
Figure 11.5: Excluded Observations
If you exclude observations from plots, all plots linked to the current data table automatically redraw themselves. (For example, excluding an extreme value might result in a new range for an axis.) The row headings for the excluded observations no longer show the observation marker. For example, the third and fourth observations in Figure 11.5 are excluded from plots.
If you exclude observations from analyses, the row headings for the excluded observations no longer show the symbol that indicates inclusion in analyses. For example, the second and fourth observations in Figure 11.5 are excluded from analyses.
Caution: If you change the observations included in analyses, previously run analyses and statistics are not automatically rerun.
If an observation is excluded from analyses but included in plots, then its marker changes to a different symbol. This combination is useful if you want to fit a regression model to data but also want to exclude outliers or high-leverage observations prior to modeling. The regression model does not use the excluded observations, but the observations still appear (drawn with the changed marker) on diagnostic plots for the regression.
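The idea of keeping an observation in plots while dropping it from a model fit can be sketched outside SAS/IML Studio. The following Python fragment is only a conceptual illustration, not IMLPlus code and not the product's implementation: the `Obs` class, the `in_plots` and `in_analyses` flags, the helper functions, and the data values are all hypothetical names and numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Obs:
    """One observation with two independent status flags (hypothetical model)."""
    x: float
    y: float
    in_plots: bool = True
    in_analyses: bool = True


def fit_line(observations):
    """Least-squares line that uses only observations included in analyses."""
    used = [o for o in observations if o.in_analyses]
    n = len(used)
    mean_x = sum(o.x for o in used) / n
    mean_y = sum(o.y for o in used) / n
    sxx = sum((o.x - mean_x) ** 2 for o in used)
    sxy = sum((o.x - mean_x) * (o.y - mean_y) for o in used)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def plotted_points(observations):
    """Points that a plot would draw, tagged with whether the fit used them."""
    return [(o.x, o.y, o.in_analyses) for o in observations if o.in_plots]


# Made-up data: the last point is an outlier.
data = [Obs(1, 2), Obs(2, 4), Obs(3, 6), Obs(4, 8), Obs(5, 40)]
data[4].in_analyses = False  # excluded from the fit, still plotted

intercept, slope = fit_line(data)
points = plotted_points(data)
```

Here the fit ignores the fifth observation, but `points` still contains it, tagged so that a plotting routine could draw it with a distinct marker.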
An example of including some observations in plots but not in analyses is shown in Figure 11.6. The figure shows data from the Mining data set, which contains the results of an experiment to determine whether drilling time was faster for wet drilling or dry drilling. The plot shows the time required to drill the last five feet of a hole plotted against the depth of the hole. A loess fit is plotted only for the wet drilling trials (open circles). This is accomplished by excluding the observations for dry drilling (drawn with a different marker shape) from analyses before running the loess analysis.
Figure 11.6: Loess Fit of a Subset of Data
Although SAS/IML Studio analyses do not support BY-group processing, you can restrict an analysis to a single BY group by excluding all other BY groups. For data with many BY groups, this is tedious to do using the SAS/IML Studio GUI, but you can write an IMLPlus program to automate the processing of BY groups.
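The exclude-all-other-groups approach amounts to a simple loop. The sketch below shows that loop in Python as a conceptual illustration only; it is not IMLPlus and does not use any SAS/IML Studio API. The function name `analyze_by_group`, the variable names `method` and `time`, and the data values are hypothetical and made up for the example, and a mean stands in for whatever analysis you would actually run.

```python
def mean(values):
    """Stand-in analysis: the arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def analyze_by_group(rows, by_var, analysis_var, analysis=mean):
    """Run `analysis` once per BY group by excluding every other group."""
    groups = sorted({row[by_var] for row in rows})
    results = {}
    for group in groups:
        # One inclusion flag per observation: only the current group is kept.
        included = [row[by_var] == group for row in rows]
        values = [row[analysis_var]
                  for row, keep in zip(rows, included) if keep]
        results[group] = analysis(values)
        # The flags are rebuilt on the next pass, which plays the role of
        # restoring all observations before the next group is processed.
    return results


# Made-up rows with a two-level BY variable.
rows = [
    {"method": "wet", "time": 6.0},
    {"method": "dry", "time": 7.0},
    {"method": "wet", "time": 8.0},
    {"method": "dry", "time": 9.0},
    {"method": "dry", "time": 11.0},
]

results = analyze_by_group(rows, by_var="method", analysis_var="time")
```

An IMLPlus program would follow the same pattern: for each BY group, exclude the other groups from analyses, run the analysis, then restore the observations.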
You can easily restore all observations to plots and analyses:
Copyright © 2009 by SAS Institute Inc., Cary, NC, USA. All rights reserved.