The optimize compiler option enables global optimization, which optimizes
the flow of control and data through an entire function.
Global optimization includes a wide variety of optimizations,
some of which are discussed below.
The compiler analyzes the function to determine which auto variables,
formal variables, temporary values, and constant values should be assigned
to registers at each point in the function. The compiler uses up to six of
the System/370 general registers and two of the floating-point registers for
this purpose. The remaining registers become either dedicated registers (for
example, R12 is dedicated as the CRAB pointer) or working registers for code
sequences of short or medium duration.
Generally speaking, the variables that are most used
at a given point are selected. Values occurring in loops are more likely
to be chosen.
The compiler attempts to keep a variable assigned to
a register for as long as possible. When System/370 BXLE/BXH instructions are
issued, the compiler allocates registers in pairs, resulting in high-quality
code for many loops, particularly in numerical applications.
Applying the address-of operator (&) to a variable prevents the compiler
from allocating that variable to a register. The compiler cannot predict
when the resulting pointer will be used to read or modify the variable's
value in memory, and the pointer may be passed to another function and used
there. External variables also cannot be allocated to a register.
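The following sketch illustrates the point. It is not taken from the SAS/C documentation, and all function names in it (add_to, sum_with_address_taken, sum_with_local, demo_address_of) are hypothetical. In the first summing function the address of total is passed to a helper, so total must live in memory. In the second, the address of total is never taken, which leaves it eligible for a register. Whether a particular compiler actually keeps it in a register (or inlines the helper and recovers the register anyway) depends on the compiler and its options.

```c
#include <assert.h>

/* Hypothetical helper: updates the caller's variable through a pointer. */
static void add_to(int *sum, int value)
{
    *sum += value;
}

/* The address of total is taken, so total must reside in memory. */
static int sum_with_address_taken(const int *a, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        add_to(&total, a[i]);
    return total;
}

/* The address of total is never taken, so it stays a register candidate. */
static int sum_with_local(const int *a, int n)
{
    int total = 0;
    int i;

    for (i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Both forms compute the same result; only the storage of total differs. */
static void demo_address_of(void)
{
    static const int data[4] = { 1, 2, 3, 4 };

    assert(sum_with_address_taken(data, 4) == 10);
    assert(sum_with_local(data, 4) == 10);
}
```

When a pointer is required only at the end of a computation, accumulating in a local variable and storing the result once is one way to keep the hot variable eligible for a register.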
The effect of global optimization's register allocation
is quite different from the use of the register storage class. In general,
a variable declared using register is associated with a machine register
throughout the entire block in which it is declared (usually the entire
function). In most functions, the variable is heavily used in some places and
not used in others. Yet, if a machine register is assigned to the variable,
the same register cannot be reused even in those sections where the variable
is not used. Global optimization, by contrast, can change the variable
assigned to a register from one expression to the next, so that the most
heavily used variables at each point are in machine registers.
The compiler overrides the register storage class keyword
in the declarations of integer, double, and pointer variables.
Because C is intended to be portable, it is difficult for a programmer
to know how many registers the target machine and the compiler make
available. The concept of a register variable
is based on the idea that the variable is kept in a register for the entirety
of its scope. Such restrictions no longer apply when a compiler uses the
more advanced register allocation algorithms in SAS/C software.
Even though the compiler does not have dynamic information about program
execution that would indicate which statements are executed more heavily,
it can use the loop nesting structure to make a reasonable approximation.
If a value is assigned to a variable but that value is never used,
the assignment can be eliminated, as in this example:
o = 23;
/* code that does not refer to o */
o = 12;
The first assignment to o can be removed.
Since the compiler inspects all references to the variable
throughout the entire function, even quite subtle dead stores are eliminated.
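A less obvious dead store can be sketched as follows. The example is an illustration under assumptions, not SAS/C output, and the function names are hypothetical. The initial assignment to status looks like a harmless default, but every path through the if statement overwrites it before it is read, so the store can be dropped.

```c
#include <assert.h>

/* As written: status receives a default that no path ever reads. */
static int classify_as_written(int n)
{
    int status;

    status = -1;            /* dead store: overwritten on every path */
    if (n < 0)
        status = 0;
    else
        status = 1;
    return status;
}

/* What an optimizer may effectively compile once the store is removed. */
static int classify_after_elimination(int n)
{
    int status;

    if (n < 0)
        status = 0;
    else
        status = 1;
    return status;
}

/* The two forms are indistinguishable to callers. */
static void demo_dead_store(void)
{
    int n;

    for (n = -3; n <= 3; n++)
        assert(classify_as_written(n) == classify_after_elimination(n));
}
```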
A calculation in a loop whose value is the same on each iteration
can be moved outside of the loop. For example, the loop
for (i = 0; i < j; i++) {
    a[i] = p->q.r[10];
}
can be changed to
temp = p->q.r[10];
for (i = 0; i < j; i++) {
    a[i] = temp;
}
Refer to the explanation for the loop option in Very busy expression hoisting
for more information about this type of optimization.
References to a variable whose only definition is a constant
are replaced by the constant. If the variable is used only in expressions
with a different type (for example, if an int variable is only used in a
comparison with float variables), global optimization creates a constant of
the correct type. If the variable is used only as a constant, global
optimization eliminates the variable entirely. The following example
demonstrates these optimizations:
void f(double d)
{
    int i = 10;
    for (; d < i; ++d) {
        .
        .
        .
    }
    return;
}
The code above can be changed to this:
void f(double d)
{
    for (; d < 10.0; ++d) {
        .
        .
        .
    }
    return;
}
Constant propagation is often useful in programs that
contain inline functions.
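The interaction with inlining can be sketched as follows. This is an illustration under assumptions rather than actual compiler output, and the function names (scale, quadruple_as_written, quadruple_after_propagation) are hypothetical. Once the small function is inlined at a call site where one argument is a constant, the constant can be propagated into the body, and the test on that argument becomes decidable at compile time.

```c
#include <assert.h>

/* Hypothetical small function, a plausible candidate for inlining. */
static int scale(int value, int factor)
{
    if (factor == 0)
        return 0;
    return value * factor;
}

/* As written: factor is always the constant 4 at this call site. */
static int quadruple_as_written(int value)
{
    return scale(value, 4);
}

/* After inlining and constant propagation: the test factor == 0 is
   known to be false, so only the multiplication by 4 remains. */
static int quadruple_after_propagation(int value)
{
    return value * 4;
}

/* Both forms return the same values. */
static void demo_constant_propagation(void)
{
    int v;

    for (v = -2; v <= 2; v++)
        assert(quadruple_as_written(v) == quadruple_after_propagation(v));
}
```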
Global optimization eliminates recalculation of values that have
been computed previously. For example,
x = i / 3;
y = i / 3 + 4;
can be changed to
temp = i / 3;
x = temp;
y = temp + 4;
Code that can never be executed is eliminated.
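As a small illustration (a sketch, not SAS/C output), a branch guarded by a condition that is always false can never run. TRACE_ENABLED, trace_count, and the function names below are hypothetical.

```c
#include <assert.h>

#define TRACE_ENABLED 0     /* hypothetical compile-time switch */

static int trace_count = 0;

/* As written: the guarded statement can never execute. */
static int square_as_written(int x)
{
    if (TRACE_ENABLED)
        trace_count++;      /* unreachable: the condition is always 0 */
    return x * x;
}

/* After the unreachable statement and its test are removed. */
static int square_after_elimination(int x)
{
    return x * x;
}

/* The results agree, and the counter is never touched. */
static void demo_dead_code(void)
{
    assert(square_as_written(7) == square_after_elimination(7));
    assert(trace_count == 0);
}
```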
In loops containing multiplications (usually those associated with
array indexing), the multiplications are replaced by additions.
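A minimal sketch of this transformation, assuming a simple strided store, is shown below. The function names are hypothetical, and the second function only approximates at source level what the compiler does in generated code. The product i * stride is replaced by a running offset that is advanced by addition on each iteration.

```c
#include <assert.h>

/* As written: each iteration multiplies the index by the stride. */
static void fill_as_written(int *p, int n, int stride, int value)
{
    int i;

    for (i = 0; i < n; i++)
        p[i * stride] = value;
}

/* After the change: the product is maintained by repeated addition. */
static void fill_after_reduction(int *p, int n, int stride, int value)
{
    int i;
    int offset = 0;

    for (i = 0; i < n; i++) {
        p[offset] = value;
        offset += stride;   /* replaces the multiplication i * stride */
    }
}

/* Both forms write exactly the same elements. */
static void demo_strength_reduction(void)
{
    int a[12] = { 0 };
    int b[12] = { 0 };
    int k;

    fill_as_written(a, 4, 3, 7);
    fill_after_reduction(b, 4, 3, 7);
    for (k = 0; k < 12; k++)
        assert(a[k] == b[k]);
}
```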
If the same expression is computed along all paths from a point
in the code, the expression is moved to a single, common location. For example,
if (expression)
    x = i + j;
else
    y = (i + j) * 2;
can be changed to
temp = i + j;
if (expression)
    x = temp;
else
    y = temp * 2;
The compiler accepts the following options to modify optimization:

loop
   assumes that loops have multiple iterations when the number of iterations
   is variable. This enables the movement of safe code out of loops. (See
   Moving invariant calculations out of loops.) loop is the default.

   When a loop is not executed at all, the moved code is executed in cases
   where it previously would not have been. For example,

   for (i = 0; i < n; ++i)
       for (j = 0; j < m; ++j)
           p[i * m + j] += 1;

   can be changed to

   for (i = 0; i < n; ++i) {
       temp = i * m;
       for (j = 0; j < m; ++j)
           p[temp + j] += 1;
   }

   In the changed code, i*m is calculated even when m is less than or equal
   to 0, that is, even when the inner loop is not executed at all. When the
   loop option has been specified, there may be a small cost in time for
   every loop that is not executed. In exchange, there is a significant time
   saving for loops that are executed many times, as most are.

   Some types of code may cause an exception, for example, division by 0.
   For this reason, the SAS/C Compiler restricts moved code to safe
   operations, including integral and pointer arithmetic other than division
   by 0, but not including floating-point operations. The compiler avoids
   incorrect exceptions regardless of the setting of the loop option.

alias
   disables type-based aliasing assumptions. If alias is used, the compiler
   assumes worst-case aliasing. Use of this option can significantly reduce
   the amount of optimization that can be performed. noalias is the default.

greg
   controls the number of general registers that the compiler will try to
   allocate.

freg
   controls the number of floating-point registers that the compiler will
   try to allocate.

   For both greg and freg, the compiler allocates from among the supported
   register variable registers. These are R6-R11 and FR4/FR6. Registers R4,
   R5, R12, and R13 are dedicated to addressing various data objects in the
   function. R1 is used for numerous specific code sequences.

   R0, R2, R3, R14, R15, FR0, and FR2 remain for things like constants, base
   registers, VCONs, and nonregisterized variables. If values of these types
   appear to be reloaded from memory too often and would benefit from having
   more registers available, the number of registers allocated with greg or
   freg can be reduced.

inline
   inlines small functions (as defined by the complexity option) and those
   with the __inline keyword. inline is the default when the optimize option
   is used.

inlocal
   inlines single-call static (local) functions.

complexity
   defines the complexity of functions considered small by inline. (See
   Using the complexity option to control inlining.)

depth
   defines the maximum depth of function calls to be inlined. The range is
   0 to 6, and the default value is 3.

rdepth
   defines the maximum level of recursive function calls to be inlined. The
   range is 0 to 6, and the default is 0.
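To make the effect of the aliasing assumptions more concrete, here is a hedged sketch. The comments describe what a compiler is permitted to assume in general under type-based aliasing rules. They are not verified SAS/C behavior, and the function names are hypothetical. Under type-based assumptions, a store through a pointer to double cannot change an int object, so the loop bound can stay in a register. Under worst-case aliasing, the bound has to be reloaded from memory after every store.

```c
#include <assert.h>

/* With type-based aliasing assumptions, a compiler may load *count once,
   because the stores to out[i] (type double) cannot modify an int.
   With worst-case aliasing, *count must be reloaded on every iteration. */
static void fill_values(double *out, const int *count, double value)
{
    int i;

    for (i = 0; i < *count; i++)
        out[i] = value;
}

/* With distinct objects, the result is the same under either setting. */
static void demo_aliasing(void)
{
    double buf[4] = { 0.0, 0.0, 0.0, 0.0 };
    int n = 3;

    fill_values(buf, &n, 2.5);
    assert(buf[0] == 2.5 && buf[1] == 2.5 && buf[2] == 2.5);
    assert(buf[3] == 0.0);
}
```

Code that deliberately accesses the same storage through pointers of unrelated types is the kind of code for which worst-case aliasing matters.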
The compiler does not optimize programs when the debug option is used. To
use all the capabilities of the SAS/C Debugger,
there must be an accurate correspondence between object code and source line
numbers, and optimizations can alter this correspondence. Also, the debug
option causes the compiler to suppress allocation of variables to registers,
so the resulting code is not completely optimal.
You can, however, use the dbhook option along with the optimize option to
generate optimized object code that can be used with the debugger. The dbhook
option generates hooks in the object code
that enable the debugger to gain control of an executing program.
When using the debugger with optimized object code that
has been compiled with the dbhook option, the source code is not displayed in
the debugger's Source window and you cannot access variables. Therefore, the
debugger's print command and other commands that are normally
used with variables cannot be used when debugging optimized code. You can use
commands such as step, goto, and runto
to control the execution of your program. The goto command may
cause incorrect results if the register contents expected at the goto target
differ from the actual register contents when the command is issued.
Source code line numbers are still displayed in the Source window,
providing an indication of your location in the code. You can also
view register values in the debugger's Register window.
The debugging of optimized code is most effective when
used in conjunction with the Object Module Disassembler (OMD) or your system's
debugger. See Compiling C Programs for information about using the OMD.
Copyright © 2001
by SAS Institute Inc., Cary, NC, USA. All rights reserved.