Optimization

The optimize Option

The optimize compiler option is used to enable global optimization, which optimizes the flow of control and data through an entire function.


Optimizations

Global optimization includes a wide variety of optimizations. Some of the optimizations that are performed are discussed below.

Register allocation

The compiler analyzes the function to determine which auto variables, formal variables, temporary values, and constant values should be assigned to registers at each point in the function. The compiler uses up to 6 of the 370 general registers and 2 of the floating-point registers for this purpose. The remaining registers become either dedicated registers (for example, R12 is dedicated as the CRAB pointer) or working registers for code sequences of short or medium duration.

Generally speaking, the variables that are most used at a given point are selected. Values occurring in loops are more likely to be chosen.

The compiler attempts to keep a variable assigned to a register for as long as possible. When 370 BXLE/BXH instructions are issued, the compiler allocates registers in pairs, resulting in high-quality code for many loops, particularly in numerical applications.

Using the ampersand (&) operator with a variable prevents the compiler from allocating that variable to a register, because the compiler cannot predict when the resulting pointer will be used to read or modify the variable's value in memory, possibly from another function. External variables also cannot be allocated to a register.
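As a hedged illustration of the paragraph above, the following sketch contrasts a local variable whose address is never taken with one whose address is passed to another function. The names add_to, sum_plain, and sum_with_address are hypothetical, and the comments describe only what the text above implies, not verified compiler output.

```c
/* Hypothetical helper: receives the address of a caller's variable. */
static void add_to(int *dest, int value)
{
    *dest += value;
}

/* 'total' never has its address taken, so it is a candidate for
   register allocation under global optimization. */
int sum_plain(const int *a, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* '&total' is passed to another function.  According to the text
   above, this prevents 'total' from being allocated to a register,
   because the value may be read or modified through the pointer. */
int sum_with_address(const int *a, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        add_to(&total, a[i]);
    return total;
}
```

Both functions return the same result; only the opportunity for register allocation differs.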

The effect of global optimization's register allocation is quite different from the use of the register storage class. In general, a variable declared using register is associated with a machine register throughout the entire block in which it is declared (usually the entire function). In most functions, the variable is heavily used in some places and not used in others. Yet, if a machine register is assigned to the variable, then the same register cannot be reused even in those sections where the variable is not used. Global optimization, by contrast, can change the variable assigned to a register from one expression to the next, so that the most heavily used variables at each point are in machine registers.

The compiler overrides the register storage class keyword in the declarations of integer, double, and pointer variables.

Because of the portability of the C language, it would be difficult for a programmer to know the number of available registers provided by the target machine and the compiler. The concept of a register variable is based on the idea that the variable is kept in a register for the entirety of its scope. Such restrictions no longer apply when a compiler uses the more advanced register allocation algorithms in SAS/C software. Even though the compiler does not have dynamic information about program execution that would indicate which statements are executed more heavily, it can use the loop nesting structure to make a reasonable approximation.

Dead store elimination

If a value is assigned to a variable, but the value is never used, then the assignment can be eliminated, as in this example:

o = 23;

   code that does not refer to 'o'

o = 12;

The first assignment to o can be removed.

Since the compiler inspects all references to the variable throughout the entire function, even quite subtle dead stores are eliminated.

Moving invariant calculations out of loops

A calculation in a loop whose value is the same on each iteration can be moved outside of the loop. For example, the loop

for (i = 0; i < j; i++) {
   a[i] = p->q.r[10] ;
}

can be changed to

temp = p->q.r[10] ;
for (i = 0; i < j; i++) {
   a[i] = temp;
}

Refer to the explanation of the loop option in Global Optimization Compiler Options for more information about this type of optimization.

Constant propagation and folding

References to a variable whose only definition is a constant are replaced by the constant. If the variable is used only in expressions with a different type (for example, if an int variable is only used in a comparison with float variables), global optimization creates a constant of the correct type. If the variable is used only as a constant, global optimization eliminates the variable entirely. The following example demonstrates these optimizations:

void f(double d)
{
   int i = 10;
   for (; d < i; ++d) {
      .
      .
      .
   }
   return;
}

The code above can be changed to this:

void f(double d)
{
   for (; d < 10.0; ++d) {
      .
      .
      .
   }
   return;
}

Constant propagation is often useful in programs that contain inline functions.
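The following sketch suggests why inlining and constant propagation work well together. The functions scale and doubled are hypothetical, and the function is written as a plain static function here; under SAS/C it could be marked with the __inline keyword or selected by the inline option. The comments describe what an optimizer could do, not confirmed output.

```c
/* Hypothetical small function that is a candidate for inlining. */
static int scale(int value, int factor)
{
    if (factor == 1)
        return value;        /* nothing to do */
    return value * factor;
}

int doubled(int value)
{
    /* Once scale() is inlined here, 'factor' has the single constant
       definition 2.  The test 'factor == 1' is then known to be
       false, and the body can reduce to 'value * 2'. */
    return scale(value, 2);
}
```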

Merging common subexpressions

Global optimization eliminates recalculation of values that have been computed previously. For example,

x = i / 3;
y = i / 3 + 4;

can be changed to

temp = i / 3;
x = temp;
y = temp + 4;


Dead code elimination

Code that can never be executed is eliminated.
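A minimal sketch of code that can never be executed is shown below. The macro TRACE and the function clamp_to_byte are hypothetical names used only for this illustration.

```c
#define TRACE 0   /* hypothetical compile-time switch */

int clamp_to_byte(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;

    if (TRACE) {
        /* The condition is the constant 0, so this block can never
           be reached and is a candidate for removal. */
        v = -1;
    }
    return v;
}
```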

Induction variable transformations

Multiplications inside loops (usually those associated with array indexing) are replaced by additions.
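The idea can be sketched by hand as follows. The type struct rec and both function names are hypothetical, and n is assumed to be nonnegative. The second function is a source-level approximation of the transformed loop, not the code the compiler actually generates.

```c
/* Hypothetical record type used only for this illustration. */
struct rec {
    int key;
    int pad[3];
};

/* As written: each a[i] implies computing the address
   a + i * sizeof(struct rec), a multiplication per iteration. */
int sum_keys_indexed(const struct rec *a, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += a[i].key;
    return total;
}

/* Hand-written equivalent of the transformed loop: a pointer is
   advanced by one element (an addition) on each iteration. */
int sum_keys_stepped(const struct rec *a, int n)
{
    int total = 0;
    const struct rec *p = a;
    const struct rec *end = a + n;

    for (; p < end; p++)
        total += p->key;
    return total;
}
```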

Very busy expression hoisting

If the same expression is computed along all paths from a point in the code, the expression is moved to a single, common location. For example,

if (expression)
   x = i + j;
else
   y = (i + j) * 2;

can be changed to

temp = i + j;
if (expression)
   x = temp;
else
   y = temp * 2;


Global Optimization Compiler Options

The compiler accepts the following options to modify optimization:
loop assumes that loops have multiple iterations when the number of iterations is variable. This enables the movement of safe code out of loops. (See Moving invariant calculations out of loops.) loop is the default.

When a loop is not executed at all, the moved code is executed in cases where it previously would not have been. For example,

for (i = 0; i < n; ++i)
   for (j = 0; j < m; ++j)
       p[i * m + j] += 1;

can be changed to

for (i = 0; i < n; ++i) {
   temp = i * m;
   for (j = 0; j < m; ++j)
      p[temp + j]  += 1;
}

In the changed code, i*m is calculated even when m is less than or equal to 0, in which case the inner loop is never executed and the result is not used. When the loop option has been specified, there may therefore be a small cost in time for every loop that is not executed. However, there is a significant time saving for loops that are executed many times, as most are.

Some types of code may cause an exception when moved in this way, for example, a division whose divisor is 0.

For this reason, the SAS/C Compiler restricts moved code to safe operations. These include integral and pointer arithmetic other than division (which can fail when the divisor is 0), but do not include floating-point operations. The compiler avoids introducing spurious exceptions regardless of the setting of the loop option.
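The hazard described above can be illustrated with a sketch. Both function names are hypothetical. In the first function, the division is loop-invariant but must not simply be moved in front of the loop, because a call with n equal to 0 and divisor equal to 0 would then raise an exception that the original code never raises. The second function shows one way a programmer could hoist the division by hand while preserving that behavior.

```c
/* As written: 'scale / divisor' is the same on every iteration, but
   it is evaluated only if the loop body runs at least once. */
int sum_with_offset(const int *a, int n, int scale, int divisor)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += a[i] + scale / divisor;
    return total;
}

/* Manual hoist that keeps the original behavior: the division is
   performed once, and only when the loop would execute. */
int sum_with_offset_hoisted(const int *a, int n, int scale, int divisor)
{
    int total = 0;
    int i;

    if (n > 0) {
        int offset = scale / divisor;

        for (i = 0; i < n; i++)
            total += a[i] + offset;
    }
    return total;
}
```

Calling either function with n equal to 0 is safe even when divisor is 0, because the division is never reached.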

alias disables type-based aliasing assumptions. If alias is used, the compiler assumes worst-case aliasing. Use of this option can significantly reduce the amount of optimization that can be performed. noalias is the default.
greg controls the number of general registers that the compiler will try to allocate.
freg controls the number of floating-point registers that the compiler will try to allocate.

For both greg and freg, the compiler allocates from among the registers available for register variables: R6-R11 and FR4/FR6. Registers R4, R5, R12, and R13 are dedicated to addressing various data objects in the function. R1 is used for numerous specific code sequences.

R0, R2, R3, R14, R15, FR0, and FR2 remain for things like constants, base registers, VCONs, and nonregisterized variables. If values of these types are being reloaded from memory too often and the code could benefit from having more working registers available, you can reduce the number of registers allocated with greg or freg.

inline inlines small functions (as defined by the complexity option) and those with the __inline keyword. inline is the default when the optimize option is used.
inlocal inlines single-call static (local) functions.
complexity defines the complexity of functions considered small by inline. (See Using the complexity option to control inlining.)
depth defines the maximum depth of function calls to be inlined. The range is 0 to 6, and the default value is 3.
rdepth defines the maximum level of recursive function calls to be inlined. The range is 0 to 6, and the default is 0.
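To make the alias option described above more concrete, the following sketch shows the kind of code where type-based aliasing assumptions matter. The function fill is hypothetical, and the comments describe how such assumptions are commonly applied by optimizing compilers in general; they are an assumption, not a statement about the SAS/C implementation.

```c
/* 'out' points to double and 'count' points to int.  Under
   type-based aliasing assumptions (the noalias default), a store
   through 'out' is assumed not to change '*count', so '*count' can
   be loaded once and kept in a register for the whole loop.  With
   worst-case aliasing (the alias option), the compiler would have to
   assume that each store might modify '*count' and reload it. */
void fill(double *out, const int *count, double value)
{
    int i;

    for (i = 0; i < *count; i++)
        out[i] = value;
}
```

In a call such as fill(buf, &n, 1.5), where buf is an array of double and n is a separate int, the two pointers do not overlap and both settings produce the same result.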


Global Optimization and the Debugger

The compiler does not optimize programs when the debug option is used. To utilize all the capabilities of the SAS/C Debugger, there must be an accurate correspondence between object code and source line numbers, and optimizations can alter this correspondence. Also, the debug option causes the compiler to suppress allocation of variables to registers, so the resulting code is not completely optimal.

You can, however, use the dbhook option along with the optimize option to generate optimized object code that can be used with the debugger. The dbhook option generates hooks in the object code that enable the debugger to gain control of an executing program.

When using the debugger with optimized object code that has been compiled with the dbhook option, the source code is not displayed in the debugger's Source window and you cannot access variables. Therefore, the debugger's print command and other commands that are normally used with variables cannot be used when debugging optimized code. Source code line numbers are still displayed in the Source window, providing an indication of your location in the code, and you can view register values in the debugger's Register window. You can use commands such as step, goto, and runto to control the execution of your program. The goto command may cause incorrect results if the register contents expected at the goto target differ from the actual register contents when the command is issued.

The debugging of optimized code is most effective when used in conjunction with the Object Module Disassembler (OMD) or your system's debugger. See Compiling C Programs for information about using the OMD.



Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.