You can select observations in the data table by using the Find dialog box. (For a way to select observations graphically and interactively when they must satisfy multiple constraints, see Chapter 11: Techniques for Exploring Data.) You can open the Find dialog box (shown in Figure 4.11) from the main menu.
The Find dialog box contains the following UI controls:
The variable list chooses the variable whose values are examined. The list includes each variable in the data set.
The operation list selects the logical operation used to compare each observation with the contents of the Value field.
The Value field specifies the value used to select observations.
The informat option applies the variable’s informat to the contents of the Value field. If the variable does not have an informat, then this option is inactive.
The format option applies the variable’s format to the variable and then compares the formatted data to the contents of the Value field. If the variable does not have a format, then this option is inactive.
The case option specifies that each observation be compared to the contents of the Value field in a case-sensitive manner. If the variable is numeric, then this option is inactive.
The Use tolerance of option specifies that a tolerance, $\varepsilon$, be used in comparing each observation to the contents of the Value field. Table 4.1 specifies how $\varepsilon$ is used. If the chosen variable is a character variable, then this option is inactive.
One selection option specifies that all observations be searched, but only the observations that match the search criterion be selected.
Another selection option specifies that only the observations that are already selected be searched. You can use this option to perform logical AND operations.
A third selection option specifies that all observations be searched, but observations that were selected prior to the search remain selected. You can use this option to perform logical OR operations.
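The three selection options can be sketched with set operations. This is only an illustrative sketch in Python, not how the dialog box is implemented; the function `find` and the mode names `"replace"`, `"within"`, and `"add"` are hypothetical labels introduced here.

```python
def find(observations, predicate, mode, selected=frozenset()):
    """Return the new set of selected row indices.

    mode is a hypothetical label for the three selection options:
      "replace" - search all rows, select only the matches
      "within"  - search only rows already selected (logical AND)
      "add"     - search all rows, keep the prior selection (logical OR)
    """
    matches = {i for i, obs in enumerate(observations) if predicate(obs)}
    if mode == "replace":
        return matches
    if mode == "within":
        return matches & set(selected)
    if mode == "add":
        return matches | set(selected)
    raise ValueError(f"unknown mode: {mode!r}")


latitudes = [25.0, 29.5, 31.0, 40.2]
first = find(latitudes, lambda x: x > 28, "replace")
both = find(latitudes, lambda x: x < 32, "within", first)   # > 28 AND < 32
either = find(latitudes, lambda x: x < 26, "add", both)     # previous OR < 26
```

Two successive searches in the "within" mode behave like an AND of the two criteria, and a search in the "add" mode behaves like an OR with the earlier selection.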
For numeric variables, let $v$ be the value of the Value field and let $\varepsilon$ be the value of the Use tolerance of field. (If you are not using a tolerance, then $\varepsilon = 0$.) Table 4.1 specifies whether an observation with value $x$ for the chosen variable matches the query.
Table 4.1: Find Operations for Numeric Variables
Operation | Values Found | Missing Selected? |
---|---|---|
Equals | $x \in [v - \varepsilon, v + \varepsilon]$ | No |
Less than | $x < v + \varepsilon$ | Yes |
Greater than | $x > v - \varepsilon$ | No |
Not equals | $x \notin [v - \varepsilon, v + \varepsilon]$ | Yes |
Less than or equals | $x \le v + \varepsilon$ | Yes |
Greater than or equals | $x \ge v - \varepsilon$ | No |
Is missing | $x$ is missing | Yes |
To remember whether missing values match the query, recall that SAS missing values are represented as large negative numbers. Table 4.1 is consistent with the WHERE clause in the SAS DATA step.
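The numeric rules can be sketched in Python as follows. This is an illustration under two stated assumptions rather than the product's implementation: the tolerance `eps` is assumed to widen every comparison around the value `v` (so that Equals matches the closed interval from `v - eps` to `v + eps`), and a missing value is modeled as negative infinity so that it sorts below every nonmissing number. The names `OPERATIONS` and `find_numeric` are hypothetical.

```python
import math

# Assumption: a missing value is modeled as negative infinity, so it
# compares as smaller than every nonmissing number.
MISSING = -math.inf

# Assumption: the tolerance eps widens each comparison around the value v.
OPERATIONS = {
    "Equals":                 lambda x, v, eps: v - eps <= x <= v + eps,
    "Less than":              lambda x, v, eps: x < v + eps,
    "Greater than":           lambda x, v, eps: x > v - eps,
    "Not equals":             lambda x, v, eps: not (v - eps <= x <= v + eps),
    "Less than or equals":    lambda x, v, eps: x <= v + eps,
    "Greater than or equals": lambda x, v, eps: x >= v - eps,
    "Is missing":             lambda x, v, eps: x == MISSING,
}


def find_numeric(values, operation, v=0.0, eps=0.0):
    """Return the indices of the values that match the query."""
    test = OPERATIONS[operation]
    return [i for i, x in enumerate(values) if test(x, v, eps)]


latitudes = [27.9, 28.0, 30.5, 32.0, 32.1, MISSING]
in_band = find_numeric(latitudes, "Equals", v=30.0, eps=2.0)

# Which operations select a missing value when v is an ordinary number?
missing_selected = {op: test(MISSING, 30.0, 2.0) for op, test in OPERATIONS.items()}
```

Under this model, the entries of `missing_selected` reproduce the "Missing Selected?" column of Table 4.1 without any special-case code, which is the point of thinking of a missing value as a very large negative number.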
For character variables, comparisons are performed according to the ASCII order of characters. In particular, all uppercase letters [A–Z] precede all lowercase letters [a–z]. Let $v$ be the value of the Value field and let $a \prec b$ indicate that $a$ precedes $b$ in ASCII order. Table 4.2 specifies whether an observation with value $x$ for the chosen variable matches the query.
Table 4.2: Find Operations for Character Variables
Operation | Values Found | Missing Selected? |
---|---|---|
Equals | $x = v$ | No |
Less than | $x \prec v$ | Yes |
Greater than | $v \prec x$ | No |
Not equals | $x \neq v$ | Yes |
Less than or equals | $x \prec v$ or $x = v$ | Yes |
Greater than or equals | $v \prec x$ or $x = v$ | No |
Is missing | $x$ is missing | Yes |
Contains | $x$ contains $v$ | No |
Does not contain | $x$ does not contain $v$ | Yes |
Begins with | $x$ begins with $v$ | No |
To help remember whether character missing values match the query, think of the character missing value as a zero-length string that contains no characters. Table 4.2 is consistent with the WHERE clause in the SAS DATA step.
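The character rules can be sketched in the same way. This Python illustration relies on the fact that Python compares strings by code point, which agrees with ASCII order for ASCII text. It models a missing value as the empty string, as suggested above, and assumes that the search value is not empty. The names `OPERATIONS` and `find_character` are hypothetical.

```python
# Assumption: a character missing value is modeled as a zero-length string.
MISSING = ""

# Python compares strings by code point, which matches ASCII order for
# ASCII text, so uppercase letters sort before lowercase letters.
OPERATIONS = {
    "Equals":                 lambda x, v: x == v,
    "Less than":              lambda x, v: x < v,
    "Greater than":           lambda x, v: x > v,
    "Not equals":             lambda x, v: x != v,
    "Less than or equals":    lambda x, v: x <= v,
    "Greater than or equals": lambda x, v: x >= v,
    "Is missing":             lambda x, v: x == MISSING,
    "Contains":               lambda x, v: v in x,
    "Does not contain":       lambda x, v: v not in x,
    "Begins with":            lambda x, v: x.startswith(v),
}


def find_character(values, operation, v=""):
    """Return the indices of the values that match the query."""
    test = OPERATIONS[operation]
    return [i for i, x in enumerate(values) if test(x, v)]


names = ["Zebra", "apple", "Apple", MISSING]
before_apple = find_character(names, "Less than", "apple")

# Which operations select a missing value for a nonempty search value?
missing_selected = {op: test(MISSING, "apple") for op, test in OPERATIONS.items()}
```

Here "Zebra" and "Apple" both precede "apple" because every uppercase letter precedes every lowercase letter, and the empty string precedes every nonempty string.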
As a first example, Figure 4.11 shows how to find observations in the Hurricanes data set whose latitude variable is contained in the interval $[28, 32]$. This is a quick way to find observations with latitudes between 28 and 32 in a single search.
A second example is shown in Figure 4.12. This search finds observations for which the date variable strictly precedes 07AUG1988. The date variable has a DATE9. informat, so you can use that informat to make it more convenient to input the contents of the Value field. (Without the informat, you would need to search for the value 10445, the SAS date value that corresponds to 06AUG1988.)
Recall that the date variable is a numeric variable, even though the formatted values appear as text.
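A SAS date value is the number of days since 01JAN1960, so the numeric value behind a DATE9.-style string can be checked with a short calculation. The following Python sketch is only an illustration of that arithmetic; the helper `sas_date_value` is a hypothetical name and is not part of SAS.

```python
from datetime import date

SAS_EPOCH = date(1960, 1, 1)  # SAS date value 0
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def sas_date_value(text):
    """Convert DATE9.-style text such as '06AUG1988' to days since 01JAN1960."""
    day = int(text[:2])
    month = MONTHS.index(text[2:5].upper()) + 1
    year = int(text[5:])
    return (date(year, month, day) - SAS_EPOCH).days


aug06 = sas_date_value("06AUG1988")  # 10445
aug07 = sas_date_value("07AUG1988")  # 10446
```

Because date values are whole numbers of days, a date that strictly precedes 07AUG1988 (10446) is the same as a date on or before 06AUG1988 (10445).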
A related example is shown in Figure 4.13. This search finds all observations for which the formatted value of the date variable contains the text “AUG”. To perform this search you must check the option that applies the variable’s format. This forces the Find dialog box to apply the DATE9. format to the date variable, which means comparing strings (character data) instead of numbers (numeric data). You can then select the Contains operation from the operation list. Each formatted string is searched for the value “AUG”.
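The idea of a formatted comparison can be sketched as: convert each numeric date value to its DATE9.-style text, and then apply a character operation to the text. The Python code below is a minimal illustration under that reading; the helper `date9` is a hypothetical stand-in for the DATE9. format, and the sample values are made up for the example.

```python
from datetime import date, timedelta

SAS_EPOCH = date(1960, 1, 1)  # SAS date value 0
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def date9(value):
    """Render a SAS date value as DATE9.-style text, for example '06AUG1988'."""
    d = SAS_EPOCH + timedelta(days=value)
    return f"{d.day:02d}{MONTHS[d.month - 1]}{d.year:04d}"


# Made-up sample of numeric date values.
dates = [10410, 10445, 10480]
formatted = [date9(v) for v in dates]

# Character "Contains" search applied to the formatted strings.
in_august = [i for i, text in enumerate(formatted) if "AUG" in text]
```

The numeric values themselves never contain the text "AUG"; the search succeeds only because the comparison is made against the formatted strings.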