Working with Character Variables

Handling Missing Values


Reading Missing Values

SAS uses a blank to represent a missing value of a character variable. For example, the data line for Brazil lacks the departure city from the United States:

Japan      5 San Francisco          Tokyo, Osaka
Italy      8 New York               Rome, Naples
Australia 12 Honolulu           Sydney, Brisbane
Venezuela  4 Miami            Caracas, Maracaibo
Brazil     4               Rio de Janeiro, Belem

As Data Set MYLIB.DEPARTURES shows, when the INPUT statement reads the data line for Brazil and finds that the value for USGate in columns 14-26 is blank, SAS assigns a missing value to USGate for that observation. SAS prints the missing value as a blank.
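The DATA step that reads these data lines is not shown in this section. As a minimal sketch, assuming column input with the layout implied by the data lines (Country in columns 1-9, CitiesInTour in 11-12, USGate in 14-26, ArrivalDepartureGates in 28-48), a step like the following would read the blank in columns 14-26 as a missing value. The data set name WORK.DEPARTURES_SKETCH and every column range other than 14-26 are assumptions made for illustration:

```sas
/* Sketch only: the data set name and all column ranges except 14-26 */
/* are assumed from the layout of the data lines.                    */
data work.departures_sketch;
   /* Column input: a character field that is entirely blank, such as */
   /* columns 14-26 on the Brazil line, is stored as a missing value. */
   input Country $ 1-9 CitiesInTour 11-12 USGate $ 14-26
         ArrivalDepartureGates $ 28-48;
   datalines;
Japan      5 San Francisco          Tokyo, Osaka
Italy      8 New York               Rome, Naples
Australia 12 Honolulu           Sydney, Brisbane
Venezuela  4 Miami            Caracas, Maracaibo
Brazil     4               Rio de Janeiro, Belem
;

proc print data=work.departures_sketch;
   title 'Reading a Blank Field with Column Input';
run;
```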

One special case occurs when you read character data values with list input. In that case, you must use a period to represent a missing value in data lines. (Blanks in list input separate values; therefore, SAS interprets blanks as a signal to keep searching for the value, not as a missing value.) In the following DATA step, the TourGuide information for Venezuela is missing and is represented with a period:

options pagesize=60 linesize=80 pageno=1 nodate;

data missingval;
   length Country $ 10 TourGuide $ 10;
   input Country TourGuide;
   datalines;
Japan Yamada
Italy Militello
Australia Edney
Venezuela .
Brazil Cardoso
;

proc print data=missingval;
   title 'Missing Values for Character List Input Data';
run;

The following output displays the results:

Using a Period in List Input for Missing Character Data

                  Missing Values for Character List Input Data                 1

                         Obs    Country      TourGuide

                          1     Japan        Yamada   
                          2     Italy        Militello
                          3     Australia    Edney    
                          4     Venezuela             
                          5     Brazil       Cardoso  

SAS recognized the period as a missing value in the fourth data line; therefore, it recorded a missing value for the character variable TourGuide in the resulting data set.
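To illustrate why the period is needed, the following sketch repeats the DATA step with the period removed from the Venezuela line. The data set name NOPERIOD is hypothetical. With the default list input behavior, SAS is expected to keep searching on the next data line, so Brazil is read as the TourGuide value for Venezuela and the Brazil observation is lost:

```sas
/* Sketch only: NOPERIOD is a hypothetical data set name. */
data noperiod;
   length Country $ 10 TourGuide $ 10;
   /* List input: on the Venezuela line there is no second value, */
   /* so SAS moves to the next data line to find TourGuide.       */
   input Country TourGuide;
   datalines;
Japan Yamada
Italy Militello
Australia Edney
Venezuela
Brazil Cardoso
;

proc print data=noperiod;
   title 'List Input Without a Period for the Missing Value';
run;
```

In this situation the SAS log should also contain a note that SAS went to a new line when the INPUT statement reached past the end of a line.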


Checking for Missing Character Values

When you want to check for missing character values, compare the character variable to a blank surrounded by quotation marks:

if USGate = ' ' then GateInformation = 'Missing';

The following DATA step includes this statement to check USGate for missing information. The results are recorded in GateInformation:

options pagesize=60 linesize=80 pageno=1 nodate;

data checkgate;
   length GateInformation $ 15;
   set mylib.departures;
   if USGate = ' ' then GateInformation = 'Missing';
   else GateInformation = 'Available';
run;

proc print data=checkgate;
   var Country CitiesInTour USGate ArrivalDepartureGates GateInformation;
   title 'Checking For Missing Gate Information';
run;

The following output displays the results:

Checking for Missing Character Values

                     Checking For Missing Gate Information                     1

                   Cities                                              Gate
 Obs   Country     InTour   USGate          ArrivalDepartureGates   Information

  1    Japan          5     San Francisco   Tokyo, Osaka             Available 
  2    Italy          8     New York        Rome, Naples             Available 
  3    Australia     12     Honolulu        Sydney, Brisbane         Available 
  4    Venezuela      4     Miami           Caracas, Maracaibo       Available 
  5    Brazil         4                     Rio de Janeiro, Belem    Missing   
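As an alternative that the example above does not use, the MISSING function returns 1 when its argument is missing and 0 otherwise, and it accepts character as well as numeric arguments. The following sketch applies it to a small hypothetical data set, GUIDES, built from the list input data used earlier in this section. The variable GuideInformation is also introduced only for illustration:

```sas
/* Sketch only: GUIDES and GuideInformation are hypothetical names. */
data guides;
   length Country $ 10 TourGuide $ 10 GuideInformation $ 15;
   input Country TourGuide;
   /* MISSING returns 1 for a blank (missing) character value. */
   if missing(TourGuide) then GuideInformation = 'Missing';
   else GuideInformation = 'Available';
   datalines;
Japan Yamada
Italy Militello
Australia Edney
Venezuela .
Brazil Cardoso
;

proc print data=guides;
   title 'Checking for a Missing Tour Guide';
run;
```

For a character variable, this test should give the same result as comparing the variable to a blank surrounded by quotation marks.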

Setting a Character Variable Value to Missing

You can assign missing character values in assignment statements by setting the character variable to a blank surrounded by quotation marks. For example, the following statements set the day of departure based on the value of CitiesInTour. If the value is 7 or less, then the day of departure is Sunday. Otherwise, the day of departure is not known and is set to a missing value.

if CitiesInTour <= 7 then DayOfDeparture = 'Sunday';
else DayOfDeparture = ' ';

The following DATA step includes these statements:

options pagesize=60 linesize=80 pageno=1 nodate;

data departuredays;
   set mylib.departures;
   length DayOfDeparture $ 8;
   if CitiesInTour <=7 then DayOfDeparture = 'Sunday';
   else DayOfDeparture = ' ';
run;

proc print data=departuredays;
   var Country CitiesInTour DayOfDeparture;
   title 'Departure Day is Sunday or Missing';
run;

The following output displays the results:

Assigning Missing Character Values

                       Departure Day is Sunday or Missing                      1

                                        Cities      DayOf
                    Obs    Country      InTour    Departure

                     1     Japan           5       Sunday  
                     2     Italy           8               
                     3     Australia      12               
                     4     Venezuela       4       Sunday  
                     5     Brazil          4       Sunday  
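Another way to set a variable to missing, not shown in the example above, is the CALL MISSING routine, which assigns a blank to character variables and a period to numeric variables. The following is a minimal sketch that uses a hypothetical data set, TOURDAYS, with inline data in place of MYLIB.DEPARTURES:

```sas
/* Sketch only: TOURDAYS is a hypothetical data set with inline data. */
data tourdays;
   length Country $ 10 DayOfDeparture $ 8;
   input Country CitiesInTour;
   if CitiesInTour <= 7 then DayOfDeparture = 'Sunday';
   /* Same effect as DayOfDeparture = ' ' for a character variable. */
   else call missing(DayOfDeparture);
   datalines;
Japan 5
Italy 8
Australia 12
Venezuela 4
Brazil 4
;

proc print data=tourdays;
   title 'Departure Day Set with CALL MISSING';
run;
```

CALL MISSING can be convenient when several variables of mixed types need to be reset at once, because it accepts a list of variables.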
