SAS Institute. The Power to Know

Learning Center

Questions from the field

Question: Is there a quick set of key strokes to commenting and/or un-commenting a block of statements?

Answer: In the enhanced editor under Windows, you can highlight a statement or a block of statements and press the CTRL key along with a forward slash to place comments around that statement or each statement in the block of statements. CTRL-SHIFT-/ will remove line comments from any lines that are highlighted. Note that this method does not work on non-Windows systems or in the (non-enhanced) Program Editor on Windows.
Question: How do you create a format without coding all the values between 15 and 100 that would map a range to a label as follows:
Low < 15 mapped to 14
15 -< 16 mapped to 15
16 -< 17 mapped to 16
17 -< 18 mapped to 17
...
98 -< 99 mapped to 98
99 -< 100 mapped to 99
100 - high mapped to 100
In other words, values up to but not including 15 should print '14.' Values between 15 and 99 should print their integer digits. Values of 100 or greater should print '100.'

Answer: The PICTURE statement in the FORMAT procedure will create this format without a lengthy list of ranges and labels. The open-ended beginning and end of the range can be defined using the NOEDIT option. NOEDIT prints the format label as entered, without using the label digits as placeholders for data value digits.

For example, the following code creates a test file, defines a format, and prints the values using the format:
/* Data step to create test data */
data test;
 do WithPicture = 1 to 110 by .3333;
        NoFormat=WithPicture;
   output;
 end;
run;

/* Create a user-defined format                      */
/* NOEDIT - specifies that label numbers are literal */
/*            characters rather than digit selectors */

proc format;
   picture myfmt  low -< 15 ='14' (noedit) /*prints 14*/
                  15 -< 100 = '99' /*print integer digits only*/
                 100 <- high ='100' (noedit) /*prints 100*/;
run;

/* Test the newly created format */
proc print data=test;
  format WithPicture myfmt.;
run;
Question: When is a correlated subquery desirable or necessary?

Answer: Many queries can be evaluated by executing the subquery once and substituting the resulting value or values into the WHERE clause of the outer query. In queries that include a correlated subquery (also known as a repeating subquery), the subquery depends on the outer query for its values. This means that the subquery is executed repeatedly, once for each row that might be selected by the outer query.

In the beginning, databases did not support joins; thus, the only method of returning the desired results was the subquery. If the query with a correlated subquery can be re-written as a join, it will be more efficient to do so. If it can't, then the correlated sub query is required to get the results.

For efficiency in subqueries, 1.) use the MAX function in a subquery instead of the ALL keyword before the subquery; and 2.) use the IN operator instead of the EXISTS keyword, when possible.

For example, the following queries produce the same result, but the second query is more efficient:
proc sql;
    select * from work.payroll
    where salary> all(select salary
                      from work.payroll
                      where jobcode='ME3');

 proc sql;
    select * from work.payroll
    where salary> (select max(salary)
                   from work.payroll
                 where jobcode='ME3');
Question: On the Windows operating system, can a directory be permanently assigned for your SAS session so that it's the default directory every time you start a SAS session?

Answer: Yes. Under the Windows operating system, a "current folder" is defined by default. The current folder is the operating environment folder to which many SAS commands and actions, such as File/Open or File/Save, apply. The path of the current folder is displayed in the status line at the bottom of the main SAS window. It's set by default to the folder that is designated by the SASUSER system option in the SAS configuration file, but you can assign a different location by specifying the SASINITIALFOLDER system option during SAS invocation.

The syntax is:
     -SASINITIALFOLDER newfolder .
If newfolder contains spaces, it must be enclosed in quotation marks.

Alternatively, you can change the path for the current folder by changing the Start In field that is available in the Properties tab for the SAS program shortcut.

Use the OPTIONS procedure to check the setting of the SASINTIALFOLDER option. Executing this code will write to the log the setting for just this one option:
  proc options option=sasinitialfolder;
  run;
Executing this code will write to the log the settings related to environment files:
  proc options group=envfiles;
  run;
For more information, see SASINITIALFOLDER System Option: Windows.
Question: How do you change the SAS windowing environment defaults so that an Enhanced Editor window opens and NOT the Program Editor window?

Answer: The Enhanced Editor should be the default editor that starts when you start SAS under Windows. If the Program Editor appears at start-up, there are two possibilities:
  1. Your Profile Catalog (SASUSER.PROFILE) contains Edit preferences that disabled the Enhanced Editor.
  2. The NOENHANCEDEDITOR system option was specified at SAS invocation, either in your SAS configuration file or in the command associated with starting SAS.
To change your Edit preferences to enable the Enhanced Editor, start SAS, select Tools arrow Options arrow Preferences... arrow Edit, then select the Use Enhanced Editor check box and click OK. If the check box is deselected, the Program Editor is the default editor. When you exit your SAS session, your choice should be saved in SASUSER.PROFILE for subsequent sessions.

If your SAS administrator established NOENHANCEDEDITOR as the default setting in order to disallow ActiveX OCX files, it is possible that the Enhanced Editor was not installed, making it unavailable even after you try to enable it.

Please note that you cannot disable the Program Editor completely. You can always access the Program Editor from the View menu. It might also appear if an invalid command such as Submit is issued from the Log or Output window.

For more information, consult the related documentation in SAS 9.2 Companion for Windows (.pdf).
Question: Why do I get unexpected results when I use the following program to read embedded data following the CARDS (or DATALINES) statement:
data retirees2;
   infile cards dsd dlm=' ';
   input empid $ contrib @@;
cards;
E00973 1400 E00192  E00543 1500
E00123 4500 
E00444 123
;
proc print data=retirees2;
run;
(note the extra space delimiter in the first record between the value E00192 and E00543)

Answer: Embedded data following a CARDS statement (or DATALINES statement) do not have end-of-line characters. A data step that reads embedded data uses an input buffer with a length of 80 bytes (80 characters). When a data line is loaded into the input buffer it is padded with blanks to the buffer length of 80 characters. The DLM=' ' and DSD options on the INFILE statement specify the blank as the delimiter and that two consecutive delimiters enclose a missing value. This is correct for reading the data lines up to where the padding occurs. Since the padding consists of blanks and the blank is the specified delimiter, this data step will assume that each set of two consecutive blanks enclose a missing value. This produces unwanted observations with missing values in the output. Note that the double trailing @ on the INPUT statement holds a data line in the input buffer for consecutive iterations of the data step until the end of record at 80 bytes is reached. At that point the data step will load the next record and repeat the process.

Possible solutions:
  1. Use a different delimiter, for example the comma.
    data retirees2;
       infile cards dsd dlm=',';
       input empid $ contrib @@;
    cards;
    E00973,1400,E00192,,E00543,1500
    E00123,4500 
    E00444,123
    ;
    proc print data=retirees2;
    run;
    
  2. Read the data from an external file. Text files created on Windows and Unix use a variable record format (recfm=v) by default with an end-of-line character to mark the end of each data line. When a data line is loaded into the input buffer, the end-of-line character is included. When the data step reaches the end-of-line character it loads the next data line.
    data retirees2;
       infile 'retirees.dat' dsd dlm=' ';
       input empid $ contrib @@;
    run;
    proc print data=retirees2;
    run;
    
Question: How do I print the last 10 observations from a SAS data set?

Answer: Determine how many observations there are in the data set and instruct your PROC PRINT to start printing with the final 10 observations, rather than starting with observation number one. Here's how:
  1. The DATA step is used to create a data step variable named TOTOBS. Note that the SET statement will never execute; it will only compile. At compile time, the NOBS option populates TOTOBS with the number of observations residing in the input SAS data set. Subtract 9 from TOTOBS, because the ultimate goal is to print only the final 10 observations. SYMPUTX is used to create a macro variable named START, the value of TOTOBS-9 is placed in this macro variable, and this macro variable will be available to PROC PRINT. The STOP statement is optional. The program would function correctly without it, but there would be a note in the LOG indicating that data step looping had ceased. The STOP statement suppresses this expected note.
  2. The macro variable named START is referenced in the PROC PRINT statement. The FIRSTOBS data set option causes PROC PRINT to start printing with this observation number. Therefore, only the final 10 observations are printed!
data _null_;
  if 0 then set prog2.people nobs=totobs; 
  call symputx('start',put(totobs-9,15.));
  stop;
run;
proc print data=prog2.people(firstobs=&start);
run;
Question: Is there a way to check the syntax of a step prior to executing the step?

Answer: Yes, use the option 'Cancel' on the 'Run' statement. The CANCEL option terminates the step without executing it and prints a warning message that the step was not executed at the user's request.

For example:

SAS PROGRAM:
Data happyDays;
  set sasuser.WhereIhavebeen;
  run cancel;

Proc print data=sasuser.WhereIhavebeen (obs=10);
  run cancel;


SAS LOG:
data happyDays;
7      set sasuser.WhereIhavebeen;
8      run cancel;

WARNING: Data step not executed at user's request.
NOTE: DATA statement used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


9      proc print data=sasuser.WhereIhavebeen (obs=10);
10     run cancel;

WARNING: The procedure was not executed at the user's request.
NOTE: PROCEDURE PRINT used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds
Question: Is it possible to have an ID column in PROC PRINT with a blank column header? If NAME from SASHELP.CLASS is my ID variable and I try: LABEL Name=" "; I still see the variable name at the top of the Name column. I'd really like to have no column header at all.

Answer: In many places in SAS, leading and trailing blanks are ignored. For some reason, the spaces are ignored, even if you have the LABEL option and the variable name is used for the column header. This shows how SAS, ODS and PROC PRINT want EVERY column to have a header. So you have to fool PROC PRINT a bit, by using the SPLIT character. If you have this code:

ods html file='c:\temp\whatever.html' style=egdefault;
PROC PRINT DATA=sashelp.class LABEL split='~' ;  <---use SPLIT character
   VAR age height;
   ID Name;
   LABEL Name="~ ";   <---modified LABEL statement
RUN;
ods html close;

then SAS and ODS see the split character as the first character of the label and use the rest of the string (even if it is a space) as the header. This technique also works in PROC REPORT and PROC SQL. PROC REPORT also has the NOHEADER option, which you can use to blank out ALL the headers in a report.

This code sample provides further explanation of this technique:

ods rtf file='c:\temp\whatever.rtf' style=journal;
ods html file='c:\temp\whatever.html' style=egdefault;
ods pdf file='c:\temp\whatever.pdf' ;

PROC PRINT DATA=sashelp.class LABEL split='~' ;
  title 'PROC PRINT with blank header';
   VAR age height;
   ID Name;
   LABEL Name="~ ";
RUN;


** In the table template for PROC SQL, the def_split option;
** is turned on which means the first character in the string;
** can be used for a split character. (If it is not a regular;
** alphanumeric or an underscore or a hypen or blank).;
** So the label = '# #' is just putting a space/split as the label.;
title 'PROC SQL with blank header';
proc sql;
 select name label='# #', 
        age label='Wombat', 
        height label='^Koala^split line' 
        from sashelp.class;
 quit;

 proc report data=sashelp.class nowd split='^';
 title 'PROC REPORT with blank header';
 column name age height;
 define name / '^ ';
 define age / display;
 define height /display;
 run;

 proc report data=sashelp.class nowd noheader split='*';
 title 'PROC REPORT  with NOHEADER option';
 column name age height;
 define name / '* ';
 define age / display;
 define height /display;
 run;

ods _all_ close;
ods listing;
title; footnote;

This topic is covered in SAS Programming I: Essentials.
Question: Are there any advantages to using a "WHERE ALSO" statement versus logical operators in a WHERE statement to connect the expressions? For example:

proc print data=ia.passngrs;
   where dest='SEA';
   where also eclass ge 120 and bclass lt 15;
run;

OR

proc print data=ia.passngrs;
   where dest='SEA' and eclass ge 120 and bclass lt 15;
run;

Answer: The WHERE ALSO originated as part of SAS Control Language (SCL) available in SAS/AF and SAS/FSP. It was designed to allow the SAS code to be built "piecemeal," as the user made selections on the interface screen. In Base SAS, it makes no difference which code you write. All WHERE and WHERE ALSO statements are joined with an "and."

While your particular application might not use SCL, it definitely could be receiving user choices from an HTML form displayed in a browser (think Stored Processes.) If the user is allowed to make subsetting choices in more than one list box, as well as not making a choice in a particular list box, you might definitely need to build your WHERE clause piecemeal. In that case, WHERE ALSO is just the statement to use.

There is no difference in efficiency whether you code everything in a single WHERE statement with logical operators or use the WHERE ALSO statement. All WHERE and WHERE ALSO statements are joined with an "AND" at compile time, prior to execution time.

Another related question that we are often asked is whether there is a difference between using a WHERE statement and the WHERE= data set option. The answer is that there is no difference in efficiency. Both the statement and the data set option call the same underlying C code when executed. However, keep in mind that the WHERE statement subsets ALL input SAS data sets and the WHERE= data set option on an input SAS data set only subsets that one SAS data set. Their effect on your DATA step or procedure might differ based on which one you choose.

This topic is covered in SAS Programming I: Essentials.
Question: A student wants to create PDF output using the Output Delivery System but wants to have the output from multiple procs (MEANS, FREQ, & REPORT) on the same page in the PDF. She says that everything she's found tells her that each proc automatically creates a new page in the PDF. How can she force output from different procs onto the same page (assuming it can fit)?

Answer: Assuming that the procedures themselves will produce "small" output:

            ods pdf file='c:\temp\try_sp.pdf' startpage=no; 
                   proc means data=sashelp.shoes min mean max; 
            var sales; 
           class region; 
          run; 
  
          proc freq data=sashelp.shoes; 
            tables region; 
          run; 
  
          proc report data=sashelp.shoes nowd; 
            column region sales; 
            define region / group; 
            define sales / sum; 
            rbreak after / summarize; 
          run; 
        ods pdf close; 

Assuming that there's room on a single page for all three outputs, then startpage=no will suppress the page breaks between procedures. The STARTPAGE=NO option for ODS is supported by the ODS RTF destination, as well as the ODS PDF destination.

This topic is covered in Advanced Output Delivery System Topics and Creating Detail and Summary Reports.
Question: I'd like to create an Excel workbook containing four separate worksheets (or data sets), but with a nicer style to the workbook, which would be specified in the program code. Style options in PROC EXPORT are limited. How can I jazz it up a bit and still have individual sheets for each data set?

Answer: This student could use Tagsets.Excelxp (in SAS 9) to get a nicer style because the Spreadsheet Markup Language XML supports SAS style templates. And he can get multiple worksheets, which he cannot get with PROC EXPORT or CSV or ODS HTML.
ods tagsets.ExcelXP file='c:\temp\wb.xls' style=analysis;
  proc print data=sashelp.class;
run;

  proc print data=sashelp.prdsale(obs=10);
run;

  proc print data=sashelp.shoes(obs=10);
run;
ods tagsets.excelxp close;
This sample is covered in Base SAS to Microsoft Excel: Counting the Ways.
Question: When class is over, if I want to go back to my office and try out different system options, is there a way that I can save my current options in case I want to bring them back later?

Answer: PROC OPTSAVE lets you save the system options out to a SAS data set. When you want to go back to 'your' settings, you just run PROC OPTLOAD. For example, if you run PROC OPTSAVE at the beginning of a new session, you can save the default settings out to a permanent data set. Then you can make any changes you want during your session and, if you want to get back to the defaults at any time, just run the PROC OPTLOAD with the data= option to restore.
proc optsave out=kaydata.defaultopts;
run;
options ls=110 ps=52 nocenter nodate;
proc optload data=kaydata.defaultopts;
run;
Here's a more in-depth explanation of how this works:

PROC OPTSAVE and PROC OPTLOAD (Base SAS procedures) save to and retrieve from respectively either the SAS registry or a SAS dataset (the default is SASUSER.MYOPTS) all of the portable SAS system options (those note set in the configuration file). The following SAS code shows the current value of the PAGENO= SAS system option, saves the current portable option settings using the OPTSAVE procedure, and resets the value of the PAGENO= SAS system option. It then runs the OPTLOAD procedure to restore the original value of the portable option settings, including the PAGENO= SAS system option.
/*  Display the current value for the PAGENO= option */
proc options option=pageno;
run;

/*  Save the current portable option settings */
proc optsave out=sasuer.myopts;
run;

/*  Change the value of the PAGENO= option */
options pageno=10;

/*  Display the current, changed value for the PAGENO= option */
proc options option=pageno;
run;

/*  Retrieve the saved portable option settings */
proc optload data=sasuser.myopts;
run;

/*  Display the current value for the PAGENO= option */
proc options option=pageno;
run;
The DMOPTSAVE and DMOPTLOAD commands do exactly the same thing as the OPTSAVE and OPTLOAD procedures from the SAS command line.

Please see the SAS On-Line Documentation for further details on these procedures and commands.
Question: Why does the staffmaster table occur twice in this query?
proc sql;
select distinct e.FirstName, e.LastName, e.empid
   from airline.flightschedule as a,
        airline.staffmaster as b,
        airline.payrollmaster as c,
        airline.supervisors as d,
        airline.staffmaster as e
   where a.Date='04mar2000'd and
         a.Destination='CPH' and
         a.EmpID=b.EmpID and
         a.EmpID=c.EmpID and
         d.JobCategory=substr(c.JobCode,1,2) and
         d.State=b.State and d.empid=e.empid;
quit;


Answer: Two different subsets are needed from airline.staffmaster -- one subset (aliased as b) concerns the crew members on the 04mar2000 flight to CPH, and the other subset (aliased as e) concerns the supervisors of these crew members. The two aliases for airline.staffmaster distinguish between these two subsets and enable the query to treat staffmaster as two tables. Notice in the WHERE clause that the b alias is used to get crew information from combining staffmaster and flightschedule (a.EmpID=b.EmpID). The e alias is used to get supervisor information from combining staffmaster and supervisors (d.empid=e.empid).

This example is covered in SQL Processing in SAS.
Question: How do I permanently remove the format associations from a SAS data set so that the unformatted values are used, not the formated values?

Answer: Use the format _all_; statement. This topic is covered in our SAS Programming I: Essentials course.
         
/* Create a data set with formatted values*/     
data formats;
    input a b c d;
    format a dollar8.2 b comma5.0 
          c date7.0  d mmddyy8.;
cards;
1111 2222 3333 4444  
5555 6666 7777 8888
; 
run;

 /* Remove all formatted values from a data set*/
data noformats;
    set formats;
    format _all_; 
run;
Note: If you are importing data into SAS, this statement can help remove formats associated with values as they come into SAS.
Question: I have a table A, which contains a list of unique and valid zip codes. In a separate table B, I have records associated with valid and invalid zip codes which are not unique. (The table can contain zip code 94025 multiple times.) I would also like to get a list of the invalid zip code records contained in table B. How can I do that using Enterprise Guide?

Answer:
  1. Create a query based on Table A.
  2. Select Add Tables... and choose Table B.
  3. Select Join... if the Tables and Joins window does not open automatically.
  4. Verify that the two tables are joined by ZipCode.
  5. Right-click on the join icon (Venn diagram) between the two tables and select Modify Join.
  6. Select the option that corresponds to all rows from Table B. Close the Modify Joins and Tables and Joins windows.
  7. Select the Filter Data tab and drag ZipCode from Table A into the Filter area.
  8. In the Edit Filter window, change the operator to IS MISSING.
  9. In the Select Data tab, add ZipCode from Table B, and select the "Select distinct rows only" checkbox.
This will return a unique list of zip codes that are in table B but not in A.

These sample files and code examples are provided by SAS Institute Inc. "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Recipients acknowledge and agree that SAS Institute shall not be liable for any damages whatsoever arising out of their use of this material. In addition, SAS Institute will provide no support for the materials contained herein.