Previous Page | Next Page

Retrieving Data from a Single Table

Retrieving Rows That Satisfy a Condition

The WHERE clause enables you to retrieve only rows from a table that satisfy a condition. WHERE clauses can contain any of the columns in a table, including columns that are not selected.


Using a Simple WHERE Clause

The following example uses a WHERE clause to find all countries that are in the continent of Europe and their populations:

proc sql outobs=12;
   title 'Countries in Europe';
   select Name, Population format=comma10.
      from sql.countries
      where Continent = 'Europe';

Using a Simple WHERE Clause

                              Countries in Europe

                Name                                 Population
                -----------------------------------------------
                Albania                               3,407,400
                Andorra                                  64,634
                Austria                               8,033,746
                Belarus                              10,508,000
                Belgium                              10,162,614
                Bosnia and Herzegovina                4,697,040
                Bulgaria                              8,887,111
                Channel Islands                         146,436
                Croatia                               4,744,505
                Czech Republic                       10,511,029
                Denmark                               5,239,356
                England                              49,293,170

Retrieving Rows Based on a Comparison

You can use comparison operators in a WHERE clause to select different subsets of data. The following table lists the comparison operators that you can use:

Comparison Operators
Symbol Mnemonic Equivalent Definition Example
= EQ equal to where Name = 'Asia';
^= or ~= or ¬= or <> NE not equal to where Name ne 'Africa';
> GT greater than where Area > 10000;
< LT less than where Depth < 5000;
>= GE greater than or equal to where Statehood >= '01jan1860'd;
<= LE less than or equal to where Population <= 5000000;

The following example subsets the SQL.UNITEDSTATES table by including only states with populations greater than 5,000,000 people:

proc sql;
   title 'States with Populations over 5,000,000';
   select Name, Population format=comma10.
      from sql.unitedstates
      where Population gt 5000000
      order by Population desc;

Retrieving Rows Based on a Comparison

                     States with Populations over 5,000,000

                Name                                 Population
                -----------------------------------------------
                California                           31,518,948
                New York                             18,377,334
                Texas                                18,209,994
                Florida                              13,814,408
                Pennsylvania                         12,167,566
                Illinois                             11,813,091
                Ohio                                 11,200,790
                Michigan                              9,571,318
                New Jersey                            7,957,196
                North Carolina                        7,013,950
                Georgia                               6,985,572
                Virginia                              6,554,851
                Massachusetts                         6,071,816
                Indiana                               5,769,553
                Washington                            5,307,322
                Missouri                              5,285,610
                Tennessee                             5,149,273
                Wisconsin                             5,087,770
                Maryland                              5,014,048

Retrieving Rows That Satisfy Multiple Conditions

You can use logical, or Boolean, operators to construct a WHERE clause that contains two or more expressions. The following table lists the logical operators that you can use:

Logical (Boolean) Operators
Symbol Mnemonic Equivalent Definition Example
& AND specifies that both the previous and following conditions must be true Continent = 'Asia' and Population > 5000000
! or | or ¦ OR specifies that either the previous or the following condition must be true Population < 1000000 or Population > 5000000
^ or ~ or ¬ NOT specifies that the following condition must be false Continent not 'Africa'

The following example uses two expressions to include only countries that are in Africa and that have a population greater than 20,000,000 people:

proc sql;
   title 'Countries in Africa with Populations over 20,000,000';
   select Name, Population format=comma10. 
      from sql.countries
      where Continent = 'Africa' and Population gt 20000000
      order by Population desc;

Retrieving Rows That Satisfy Multiple Conditions

              Countries in Africa with Populations over 20,000,000

                Name                                 Population
                -----------------------------------------------
                Nigeria                              99,062,003
                Egypt                                59,912,259
                Ethiopia                             59,291,170
                South Africa                         44,365,873
                Congo, Democratic Republic of        43,106,529
                Sudan                                29,711,229
                Morocco                              28,841,705
                Kenya                                28,520,558
                Tanzania                             28,263,033
                Algeria                              28,171,132
                Uganda                               20,055,584

Note:   You can use parentheses to improve the readability of WHERE clauses that contain multiple, or compound, expressions, such as the following:

where (Continent = 'Africa' and Population gt 2000000) or 
      (Continent = 'Asia' and Population gt 1000000)

  [cautionend]


Using Other Conditional Operators

You can use many different conditional operators in a WHERE clause. The following table lists other operators that you can use:

Conditional Operators
Operator Definition Example
ANY specifies that at least one of a set of values obtained from a subquery must satisfy a given condition where Population > any (select Population from sql.countries)
ALL specifies that all of the values obtained from a subquery must satisfy a given condition where Population > all (select Population from sql.countries)
BETWEEN-AND tests for values within an inclusive range where Population between 1000000 and 5000000
CONTAINS tests for values that contain a specified string where Continent contains 'America';
EXISTS tests for the existence of a set of values obtained from a subquery where exists (select * from sql.oilprod);
IN tests for values that match one of a list of values where Name in ('Africa', 'Asia');
IS NULL or IS MISSING tests for missing values where Population is missing;
LIKE tests for values that match a specified pattern (table note 1) where Continent like 'A%';
=* tests for values that sound like a specified value where Name =* 'Tiland';

TABLE NOTE 1:  You can use a percent symbol (%) to match any number of characters. You can use an underscore (_) to match one arbitrary character. [arrow]

Note:   All of these operators can be prefixed with the NOT operator to form a negative condition.  [cautionend]


Using the IN Operator

The IN operator enables you to include values within a list that you supply. The following example uses the IN operator to include only the mountains and waterfalls in the SQL.FEATURES table:

proc sql outobs=12;
   title 'World Mountains and Waterfalls';
   select Name, Type, Height format=comma10.
      from sql.features
      where Type in ('Mountain', 'Waterfall')
      order by Height;

Using the IN Operator

                              World Mountains and Waterfalls

                         Name             Type            Height
                         ---------------------------------------
                         Niagara Falls    Waterfall          193
                         Yosemite         Waterfall        2,425
                         Tugela Falls     Waterfall        3,110
                         Angel Falls      Waterfall        3,212
                         Kosciusko        Mountain         7,310
                         Pico Duarte      Mountain        10,417
                         Cook             Mountain        12,349
                         Matterhorn       Mountain        14,690
                         Wilhelm          Mountain        14,793
                         Mont Blanc       Mountain        15,771
                         Ararat           Mountain        16,804
                         Vinson Massif    Mountain        16,864

Using the IS MISSING Operator

The IS MISSING operator enables you to identify rows that contain columns with missing values. The following example selects countries that are not located on a continent; that is, these countries have a missing value in the Continent column:

proc sql;
   title 'Countries with Missing Continents';
   select Name, Continent
      from sql.countries
      where Continent is missing;

Note:   The IS NULL operator is the same as, and interchangeable with, the IS MISSING operator.  [cautionend]

Using the IS MISSING Operator

                            Countries with Missing Continents

         Name                                 Continent
         ------------------------------------------------------------------------
         Bermuda                                                                 
         Iceland                                                                 
         Kalaallit Nunaat                                                        

Using the BETWEEN-AND Operators

To select rows based on a range of values, you can use the BETWEEN-AND operators. This example selects countries that have latitudes within five degrees of the Equator:

proc sql outobs=12;
   title 'Equatorial Cities of the World';
   select City, Country, Latitude
      from sql.worldcitycoords
      where Latitude between -5 and 5;

Note:   In the tables used in these examples, latitude values that are south of the Equator are negative. Longitude values that are west of the Prime Meridian are also negative.  [cautionend]

Note:   Because the BETWEEN-AND operators are inclusive, the values that you specify in the BETWEEN-AND expression are included in the results.  [cautionend]

Using the BETWEEN-AND Operators

                              Equatorial Cities of the World

                City                        Country               Latitude
                ----------------------------------------------------------
                Belem                       Brazil                      -1
                Fortaleza                   Brazil                      -4
                Bogota                      Colombia                     4
                Cali                        Colombia                     3
                Brazzaville                 Congo                       -4
                Quito                       Ecuador                      0
                Cayenne                     French Guiana                5
                Accra                       Ghana                        5
                Medan                       Indonesia                    3
                Palembang                   Indonesia                   -3
                Nairobi                     Kenya                       -1
                Kuala Lumpur                Malaysia                     4

Using the LIKE Operator

The LIKE operator enables you to select rows based on pattern matching. For example, the following query returns all countries in the SQL.COUNTRIES table that begin with the letter Z and are any number of characters long, or end with the letter a and are five characters long:

proc sql;
   title1 'Country Names that Begin with the Letter "Z"';
   title2 'or Are 5 Characters Long and End with the Letter "a"';
   select Name
      from sql.countries
      where Name like 'Z%' or Name like '____a';

Using the LIKE Operator

                  Country Names that Begin with the Letter "Z"
              or Are 5 Characters Long and End with the Letter "a"

                      Name
                      -----------------------------------
                      China                              
                      Ghana                              
                      India                              
                      Kenya                              
                      Libya                              
                      Malta                              
                      Syria                              
                      Tonga                              
                      Zambia                             
                      Zimbabwe                           

The percent sign (%) and underscore (_) are wildcard characters. For more information about pattern matching with the LIKE comparison operator, see the "SQL Procedure" chapter in the Base SAS Procedures Guide.


Using Truncated String Comparison Operators

Truncated string comparison operators are used to compare two strings. They differ from conventional comparison operators in that, before executing the comparison, PROC SQL truncates the longer string to be the same length as the shorter string. The truncation is performed internally; neither operand is permanently changed. The following table lists the truncated comparison operators:

Truncated String Comparison Operators
Symbol Definition Example
EQT equal to truncated strings where Name eqt 'Aust';
GTT greater than truncated strings where Name gtt 'Bah';
LTT less than truncated strings where Name ltt 'An';
GET greater than or equal to truncated strings where Country get 'United A';
LET less than or equal to truncated strings where Lastname let 'Smith';
NET not equal to truncated strings where Style net 'TWO';

The following example returns a list of U.S. states that have 'New ' at the beginning of their names:

proc sql;
   title '"New" U.S. States';
   select Name
      from sql.unitedstates
      where Name eqt 'New ';

Using a Truncated String Comparison Operator

                               "New" U.S. States

                      Name
                      -----------------------------------
                      New Hampshire                      
                      New Jersey                         
                      New Mexico                         
                      New York                           

Using a WHERE Clause with Missing Values

If a column that you specify in a WHERE clause contains missing values, then a query might provide unexpected results. For example, the following query returns all features from the SQL.FEATURES table that have a depth of less than 500 feet:

/* incorrect output */

proc sql outobs=12;
   title 'World Features with a Depth of Less than 500 Feet';
   select Name, Depth
      from sql.features
      where Depth lt 500
      order by Depth;

Using a WHERE Clause with Missing Values (Incorrect Output)

               World Features with a Depth of Less than 500 Feet

                           Name                Depth
                           -------------------------
                           Kalahari                .
                           Nile                    .
                           Citlaltepec             .
                           Lena                    .
                           Mont Blanc              .
                           Borneo                  .
                           Rub al Khali            .
                           Amur                    .
                           Yosemite                .
                           Cook                    .
                           Mackenzie-Peace         .
                           Mekong                  .
However, because PROC SQL treats missing values as smaller than nonmissing values, features that have no depth listed are also included in the results. To avoid this problem, you could adjust the WHERE expression to check for missing values and exclude them from the query results, as follows:

/* corrected output */

proc sql outobs=12;
   title 'World Features with a Depth of Less than 500 Feet';
   select Name, Depth
      from sql.features
      where Depth lt 500 and Depth is not missing
      order by Depth;

Using a WHERE Clause with Missing Values (Corrected Output)

               World Features with a Depth of Less than 500 Feet

                           Name                Depth
                           -------------------------
                           Baltic                180
                           Aral Sea              222
                           Victoria              264
                           Hudson Bay            305
                           North                 308

Previous Page | Next Page | Top of Page