
Starting with Raw Data: The Basics

Reading Unaligned Data with More Flexibility


Understanding How to Make List Input More Flexible

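While list input is the simplest style to code, it places restrictions on your data. By using format modifiers, you can keep the simplicity of list input without the inconvenience of the usual restrictions. For example, you can use modified list input to create character variables that are longer than the default length of eight bytes, to read numeric data that contains special characters such as commas, and to read character data that contains embedded blanks.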


Creating Longer Variables and Reading Numeric Data That Contains Special Characters

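By modifying list input with the colon format modifier (:), you can read character data that contains more than eight characters and numeric data that contains special characters.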

To use the colon format modifier with list input, place the colon between the variable name and the informat. As in simple list input, at least one blank (or other defined delimiter character) must separate each value from the next, and character values cannot contain embedded blanks (or other defined delimiter characters). Consider this DATA step:

data january_sales;
   input Item : $12. Amount : comma5.;
   datalines;
Trucks 1,382 
Vans 1,235 
Sedans 2,391 
SportUtility 987
;  

proc print data=january_sales;     
   title 'January Sales in Thousands';    
run;

The variable Item has a length of 12, and the variable Amount requires an informat (in this case, COMMA5.) that removes commas from numbers so that they are read as valid numeric values. The data values are not aligned in columns as was required in the last example, which used formatted input to read the data.

The following output shows the resulting data set.

Data Set Created with Modified List Input (: comma5.)

                           January Sales in Thousands                          1

                         Obs    Item            Amount

                          1     Trucks           1382 
                          2     Vans             1235 
                          3     Sedans           2391 
                          4     SportUtility      987 
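
The remark about other defined delimiter characters can be illustrated with a variation of the same step. The following is a minimal sketch and not part of the original example: it assumes the values are separated by a vertical bar, declared with the DLM= option in an INFILE statement, and the data set name january_sales_dlm is hypothetical.

```sas
data january_sales_dlm;               /* hypothetical data set name */
   infile datalines dlm='|';          /* assumption: vertical bar is the delimiter */
   input Item : $12. Amount : comma5.;
   datalines;
Trucks|1,382
Vans|1,235
Sedans|2,391
SportUtility|987
;

proc print data=january_sales_dlm;
   title 'January Sales in Thousands (Bar-Delimited Input)';
run;
```

Under these assumptions, the step should create the same four observations as the blank-delimited version, because the colon format modifier reads each value up to the next delimiter before applying the informat.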

Reading Character Data That Contains Embedded Blanks

Because list input uses a blank space to determine where one value ends and the next one begins, values normally cannot contain blanks. However, with the ampersand format modifier (&) you can use list input to read data that contains single embedded blanks. The only restriction is that at least two blanks must divide each value from the next data value in the record.

To use the ampersand format modifier with list input, place the ampersand between the variable name and the informat. The following DATA step uses the ampersand format modifier with list input to create the data set CLUB2. Note that the data is not in fixed columns; therefore, column input is not appropriate.

data club2;
   input IdNumber Name & $18. Team $ StartWeight EndWeight;
   datalines;    
1023 David Shaw   red 189 165
1049 Amelia Serrano  yellow 145 124    
1219 Alan Nance  red 210 192   
1246 Ravi Sinha  yellow 194 177    
1078 Ashley McKnight  red 127 118    
1221 Jim Brown  yellow 220 .    
;

proc print data=club2;
   title 'Weight Club Members';
run;

The character variable Name, with a length of 18, contains members' first and last names separated by one blank space. The data lines must have two blank spaces between the values for the variable Name and the variable Team for the INPUT statement to correctly read the data.

The following output shows the resulting data set.

Data Set Created with Modified List Input (& $18.)

                              Weight Club Members                              1

                  Id                                    Start      End
         Obs    Number    Name               Team      Weight    Weight

          1      1023     David Shaw         red         189       165 
          2      1049     Amelia Serrano     yellow      145       124 
          3      1219     Alan Nance         red         210       192 
          4      1246     Ravi Sinha         yellow      194       177 
          5      1078     Ashley McKnight    red         127       118 
          6      1221     Jim Brown          yellow      220         . 
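
The two format modifiers can also appear in the same INPUT statement. The following sketch assumes made-up sales records in which the item name can contain a single embedded blank and the amount contains a comma; the data set name sales_by_item and the data values are hypothetical and serve only as an illustration.

```sas
data sales_by_item;                   /* hypothetical data set name */
   /* & lets Item contain single blanks; : applies COMMA5. to Amount */
   input Item & $15. Amount : comma5.;
   datalines;
Pickup Trucks  1,382
Cargo Vans  1,235
Sedans  2,391
Sport Utility  987
;

proc print data=sales_by_item;
   title 'Sales by Item (Illustrative Data)';
run;
```

Each item name is followed by two blanks, which is what allows the ampersand format modifier to tell where the value of Item ends and the value of Amount begins.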
