Source Code Conventions

Language Extensions

This section describes extensions to the ISO/ANSI C language implemented by the compiler. Library extensions are described in SAS/C Library Reference, Volume 1 and SAS/C Library Reference, Volume 2. Note that use of these extensions is likely to render a program nonportable.

Embedded $ in identifiers

The dollar sign ($) can be used as an embedded character in identifiers. If the dollar sign is used in identifiers, the dollars compiler option must be specified. As mentioned, use of the dollar sign is not portable because the dollar sign is not part of the portable C character set specified by the C Standard. Also, the dollar sign cannot be used as the first character in an identifier; such usage is reserved for the library.

Comment nesting

The compiler optionally allows comments to be nested. (The C Standard does not sanction this usage.) The comnest compile-time option must be specified to enable comment nesting. When comment nesting is honored, each /* encountered must be matched by a corresponding */ before the comment terminates. This feature makes it easy to comment out large sections of code that contain comments. Thus, sections of debugging code can be removed easily and preserved. Comment nesting is nonportable.

C++ Style Comments

The SAS/C Compiler now supports C++ style line comments. A line comment starts with two forward slashes and goes to the end of the line. An example of the new comment extension is:

     // This is a comment line

Note:    This support is turned off if the strict compiler option is used.

Specifying floating-point constants in hexadecimal

An extended format for floating-point constants enables them to be specified in hexadecimal to indicate the exact bit pattern to be placed in memory. A hexadecimal double constant consists of the sequence 0.x, followed by 1 to 14 hexadecimal digits. (If there are fewer than 14 digits, the number is extended to 14 digits on the right with 0s.) A hexadecimal double constant defines the exact bit pattern to be used for the constant. As an example, 0.x411 has the same value as 1.0. Use of this feature is nonportable.
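The arithmetic behind the 0.x411 example can be checked with a short decoder. The following is a minimal sketch in portable C, not part of the SAS/C library. It assumes that the digits are left-justified in the 8-byte double, so that 0.x411 denotes the bit pattern 0x4110000000000000, and that the pattern uses the System/370 hexadecimal floating-point layout (1 sign bit, a 7-bit excess-64 exponent that is a power of 16, and a 56-bit fraction). The function name hfp_to_double is hypothetical.

```c
#include <stdint.h>

/* Decode a 64-bit System/370 hexadecimal floating-point bit pattern
   into a native double. Assumed layout: 1 sign bit, 7-bit excess-64
   exponent (a power of 16), 56-bit fraction with the radix point on
   its left. Fractions that need more precision than the native double
   provides are rounded by the conversion. */
double hfp_to_double(uint64_t bits)
{
    int negative = (int)(bits >> 63);
    int exponent = (int)((bits >> 56) & 0x7F) - 64;
    uint64_t fraction = bits & UINT64_C(0x00FFFFFFFFFFFFFF);

    /* fraction / 2^56 lies in [0, 1) */
    double value = (double)fraction / (double)(UINT64_C(1) << 56);

    /* scale by 16 raised to the unbiased exponent */
    for (; exponent > 0; --exponent) value *= 16.0;
    for (; exponent < 0; ++exponent) value /= 16.0;

    return negative ? -value : value;
}
```

For 0x4110000000000000, the exponent byte 0x41 gives 16 to the power 1 and the fraction 0x10000000000000 gives 1/16, so the result is 1.0, which agrees with the example in the text.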

Function pointer formats

The compiler supports two function pointer formats: local and remote. A remote function pointer is indirect; it points to an 8- or 12-byte area containing the address of the function code as well as the address of the extern (and possibly static ) data associated with the load module. A local function pointer is direct; it is simply the address of the function code. No other addresses are needed. In 370 object code terminology, a local function pointer is a V-type address constant.

The remote format supports pointers to functions in other load modules that have their own set of externs. Since most of the run-time library functions are in separate load modules, library functions that accept function pointer arguments, such as signal or atexit , typically require remote function pointers.

The local format is simpler. Many assembler language subroutines that accept subroutine addresses only accept addresses in this format. The disadvantage is that a function pointer in local format cannot call a function in another load module if the called function references extern or static data in that load module, or if that function calls C library routines that might reference such data.

You should use the remote format unless your application has a specific need for function pointers in local format. The remote format is supported by all library functions.

By default, all function pointers are remote. The __local and __remote keywords explicitly declare function pointers in local or remote format. You can use the pflocal option to force the compiler to generate local format function pointers by default.
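The difference between the two formats can be pictured with a small model in portable C. This is a conceptual sketch only: the structure and field names are hypothetical and do not reproduce the actual SAS/C descriptor layout. It shows why an indirect pointer that carries a data address can reach a function that depends on another module's data, while a bare code address cannot.

```c
/* Conceptual model only. Names and layout are hypothetical and are
   not the SAS/C representation of function pointers. */

struct module_data {            /* stands in for a load module's externs */
    int counter;
};

/* "Local" style: nothing but the address of the function code. */
typedef int (*code_addr)(struct module_data *);

/* "Remote" style: the code address plus the data address it needs. */
struct remote_ptr {
    code_addr code;
    struct module_data *data;
};

/* A function that depends on its own module's data. */
static int bump(struct module_data *d)
{
    return ++d->counter;
}

/* Calling through the descriptor supplies the data address. */
int call_remote(struct remote_ptr p)
{
    return p.code(p.data);
}

int demo_remote_call(void)
{
    static struct module_data other_module = { 41 };
    struct remote_ptr p = { bump, &other_module };
    return call_remote(p);
}
```

In this model a caller that holds only the code_addr value has no way to find other_module, which mirrors the restriction on local function pointers described above.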

Far Pointer Support

SAS/C supports the __near and __far keywords in pointer type declarations as part of its access-register mode support. See Access Register Mode Support for more information.

Note:    The C++ translator does not support __near and __far for Release 7.00.

The __local and __remote keywords

The __local and __remote keywords can be used in function pointer declarations to specify whether the function pointer is in local or remote format.

The pflocal compiler option specifies that all function pointers declared in a compilation are local, except those specifically declared with the __remote keyword. Under the default, nopflocal , most function pointers declared in the compilation are remote, except those specifically declared with the __local keyword.

There are three exceptions to this rule: the __asm , __ref , and __ibmos keywords. Use of the __asm or __ref keywords in a function pointer declaration implies that the declared function pointer is local unless the __remote keyword is explicitly specified in the declaration. Function pointers that are declared with the __ibmos keyword are always local.

Remote function pointers can be converted to local function pointers. However, local function pointers cannot be converted to remote function pointers. Local function pointers cannot be passed as arguments to library functions.

Below is a list of the library functions that require remote function pointers.
atcoexit atexit atfork bldexit bsearch btrace buildm cmsrxfn cosignal costart loadd loadm qsort sigdef signal unloadd unloadm

The sa_handler field of the library sigaction structure is a remote function pointer.

The function pointers that specify alternate strcoll and strxfrm functions for user-added locales must be remote, and the function pointers that specify a DCB exit routine for the osdcb and osbdcb functions must be remote.

The address of a function, that is, the value of &fnc_name, is either local or remote, depending on the setting of the pflocal option. If pflocal is used, &fnc_name is considered a local function pointer. If nopflocal is used, &fnc_name is considered a remote function pointer. You can override this behavior by using an explicit cast, as in the following example:

atexit((__remote void (*)(void)) &exit_func);

Note that you must be careful using function addresses within conditional expressions. In an expression like the following example, both function addresses are converted to the default function pointer type.

test ? &fnc_name1 : &fnc_name2

If the pflocal option was specified and the expression above is assigned to a __remote function pointer variable, an error will be indicated by the compiler, since the result of the conditional expression is a local function pointer. You should use casts in expressions of this sort to ensure correct interpretation, as in this example:

test ? (__remote void(*)(void)) &fnc_name1 :
       (__remote void(*)(void)) &fnc_name2

Note:    The SAS/C dynamic-loading functions that are described in SAS/C Library Reference, Volume 2 require the use of remote function pointers. You cannot use the loadm function with a function pointer that has been declared with the __ibmos keyword.

Keywords for assembler language functions

SAS/C supports three keywords, __asm, __ref, and __ibmos, that can be used to declare functions and pointers to functions written in assembler language that expect a parameter list in OS format.

Refer to __asm, __ref, and __ibmos Keywords for a discussion of these keywords.

__weak storage class modifier

The __weak storage class modifier applies only to references to external named objects and functions. For objects, it has meaning only for __norent objects or const objects that can be initialized at compile time. (See Code Generation Conventions for a discussion of __norent objects.) The __weak keyword is placed next to extern on the declaration of those objects and functions. The __weak keyword causes the compiler to generate weak references to the declared object. A weak reference suppresses autocall of the symbol by both COOL and the linkage editor or loader. A symbol or function declared with __weak need not actually be present in the load module unless specifically included, or referenced by some other compilation in which it is declared without the use of __weak . __weak does not apply to definitions; therefore, it does not cause the creation of a new storage class. __weak external objects are still storage class extern .

As a storage class modifier, __weak cannot appear as part of a typedef , in a cast, on a structure member, and so on. Also, you cannot have a "pointer to __weak ," any more than one can have a "pointer to extern ."

When declaring a __weak pointer, you must place the __weak after the asterisk ( * ). The following example demonstrates the use of __weak :

__weak extern double wd;
extern double * __weak wpd;  /* __weak pointer to double */
__weak int wf();

You can use the isunresolved library macro to test whether or not a __weak reference has been resolved by the linkage editor. See SAS/C Library Reference, Volume 1 for more information about isunresolved .

The @ operator

The @ operator is a language extension provided primarily to aid communication between C and non-C programs.

In C, the normal argument-passing convention is to use call-by-value; that is, the value of an argument is passed. The normal IBM 370 (non-C) argument-passing conventions differ from this in two ways. First, arguments are passed by reference; that is, each item in the parameter list is an argument address, not an argument value. Second, the last argument address in the list is usually flagged by setting the high-order bit, which does not change the value of the address since IBM 370 addresses are 31 bits (XA) or 24 bits (non-XA).

A simplistic approach to the problem of call-by-reference is to precede each function argument by the ampersand ( & ) operator, thereby passing the argument address rather than its value. For example, you can write asmcode(&x) rather than asmcode(x) . This approach is not generally applicable because it is frequently necessary to pass constants or computed expressions, which are not valid operands of the address-of operator. The compiler provides an option to solve this problem.

When the compiler option AT is specified, the at sign ( @ ) is treated as a new operator. The @ operator can be used only in an argument to a function call. (The result of using it in any other context is undefined.) The @ operator has the same syntax as & . In situations where & can be used, @ has the same meaning as & .

In addition, @ can be used on non-lvalues such as constants and expressions. In these cases, the value of @expr is the address of a temporary storage area to which the value of expr is copied.

One special case for the @ operator is when its argument is an array name or a string literal. In this case, @array is different from &array. While @array addresses a pointer addressing the array, &array still addresses the array.

The compiler continues to process the @ operator as in earlier releases when the @ is in the context of a function call. Use of @ is nonportable. Its use should be restricted to programs that call non-C routines using call by reference.
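The effect of @ on a non-lvalue can be approximated in standard C99 with a compound literal, which also creates a temporary whose address can be taken. The sketch below is an analogy for readers without the AT option, not SAS/C syntax. The routine add_by_reference is a hypothetical stand-in for a non-C routine, and the sketch does not model the high-order bit flag on the last argument address.

```c
/* Hypothetical stand-in for a non-C routine that expects the address
   of each argument rather than its value. */
int add_by_reference(const int *a, const int *b)
{
    return *a + *b;
}

int demo_at_operator(void)
{
    int x = 40;

    /* With the AT option, SAS/C would accept:
           add_by_reference(@x, @(1 + 1));
       In standard C99, &x serves for the lvalue, and a compound
       literal supplies a temporary that holds the expression value. */
    return add_by_reference(&x, &(int){1 + 1});
}
```

Here the compound literal plays the role of the temporary storage area that @expr addresses.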

Nesting of #define

If the compiler option redef is specified, multiple #define statements for the same symbol can appear in a source file. When a new #define statement is encountered for a symbol, the old definition is stacked but is restored if an #undef statement for the symbol occurs. For example, if the line

#define XYZ 12

is followed later by

#define XYZ 43

the new definition takes effect, but the old one is not forgotten. Then, when the compiler encounters the following, the former definition (12) is restored:

#undef XYZ

To completely undefine XYZ, an additional #undef is required. Each #define must be matched by a corresponding #undef before the symbol is truly forgotten. Identical #define statements for a symbol (those permitted when redef is not specified) do not stack.
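Compilers that do not offer the redef option sometimes provide a related facility through the push_macro and pop_macro pragmas, which GCC, Clang, and Microsoft C accept. The sketch below uses those pragmas as a stand-in to show the same save-and-restore behavior. It is a different mechanism from redef, and the function names are hypothetical.

```c
#define XYZ 12

#pragma push_macro("XYZ")   /* save the current definition (12) */
#undef XYZ
#define XYZ 43

int xyz_inner(void) { return XYZ; }   /* expands with the new definition */

#pragma pop_macro("XYZ")    /* restore the saved definition */

int xyz_outer(void) { return XYZ; }   /* expands with the old definition */
```

With redef, the stacking is implicit in each #define and #undef. With the pragmas, each save and restore must be written out.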

Preprocessor directives for listing control

The following preprocessor commands are available to control the format of the printed listing. They have no effect on any aspect of a program except the program listing and can appear anywhere in program code, except as a continuation line.

The word pragma can be omitted from these directives. However, omitting pragma renders a program nonportable.


Anonymous unions

An anonymous union, that is, a union with no associated identifier, can be declared in a structure. Members of anonymous unions are in the same scope as the containing structure. Here are two examples of anonymous unions:

union ANON {
   int i;
   short o[2];
} ;

static struct {
   int a;
   union ANON;
   double d;
}  ex;

static struct {
   int a;

   union {
      int i;
      short o[2];
   } ;
   double d;
}  ex;

Member o[1] of the union can be accessed by using this expression:

ex.o[1]

The members of an anonymous union are in the same name space as that of other members in the containing structure. Therefore, a member of the union cannot have the same identifier as a member of the containing structure or a member in another anonymous union in the containing structure.

Other than the above considerations, anonymous unions have the same properties and can be used in the same manner as other unions.
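The second form shown above, a union with neither a tag nor a member name, was later standardized in C11, so it can be tried with any C11 compiler. The following sketch assumes such a compiler. The structure tag and function names are hypothetical.

```c
#include <stddef.h>

struct example {
    int a;
    union {             /* anonymous: no tag and no member name */
        int i;
        short o[2];
    };
    double d;
};

/* Members of the anonymous union are named as if they belonged
   directly to the containing structure. */
short second_half(void)
{
    struct example ex = { 0 };
    ex.o[1] = 7;
    return ex.o[1];
}

/* Both union members start at the same offset in the structure. */
int members_overlap(void)
{
    return offsetof(struct example, i) == offsetof(struct example, o);
}
```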

Noninteger bitfields

By default, all bitfields must have type int , signed int , or unsigned int . The bitfield compiler option can be used in order for other types to be used. If the bitfield option is used, the compiler accepts any integral type in the declaration of a bitfield as in this example:

struct {
   char f1 :3;
   signed short f2 :15;
   unsigned long f3 : 28;
} ex;

Types that are not int can be used to specify the allocation unit to be used by the compiler. (See Structure and union type names.) By default, the allocation unit is an int. This means that the compiler allocates 4 bytes of storage for the first bitfield it encounters in a structure definition. Adjacent bitfields are packed into the int until not enough bits remain for the next bitfield, a non-bitfield member is declared, or a zero-length bitfield is encountered.

If the bitfield option is used, the type of the bitfield determines the allocation unit. If the type is a char type, the allocation unit is 1 byte. If the type is a short or long type, the allocation unit is 2 or 4 bytes, respectively.

The first bitfield declared with a particular type is aligned on the appropriate boundary for that type (as modified by the bytealign option, __noalignmem , or both). In the previous example, f1 is allocated in byte 1, f2 is allocated in bytes 3 and 4, and f3 is allocated in bytes 5 through 8.

The bitfield option also specifies the allocation unit to be used for int bitfields. The unit can be char , short , or long . When a bitfield of type int is declared, the compiler uses the allocation unit specified by the option.

For example, in the following structure definition, the compiler, by default, allocates 4 bytes of storage for the 8 bits:

struct {
   unsigned f1 : 3;
   unsigned f2 : 5;
} ex;

However, the bitfield option can be used to specify that the allocation unit should be a char , which specifies that only 1 byte of storage should be allocated, or a short , which specifies that 2 bytes of storage should be allocated.

See Compiler Options for information on how to specify the bitfield option and the allocation unit.
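The effect of the allocation unit on structure size can be observed with any compiler that accepts non-int bitfield types as an extension, such as GCC or Clang. The sketch below is not SAS/C specific: it declares the type directly instead of using the bitfield option, and the sizes noted in the comments are what common ABIs produce, not a guarantee. The structure and function names are hypothetical.

```c
#include <stddef.h>

/* Allocation unit is an int: the 8 bits occupy a whole int. */
struct int_unit {
    unsigned int f1 : 3;
    unsigned int f2 : 5;
};

/* Allocation unit is a char: the same 8 bits fit in 1 byte. */
struct char_unit {
    unsigned char f1 : 3;
    unsigned char f2 : 5;
};

size_t int_unit_size(void)  { return sizeof(struct int_unit); }
size_t char_unit_size(void) { return sizeof(struct char_unit); }
```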

Zero-length arrays

An array of length 0 can be declared as a member of a structure. No space is allocated for the array, but the following member will be aligned on the boundary required for the array type. Zero-length arrays are useful for aligning members to particular boundaries (to match the format of external data for example) and for allocating varying-length arrays following a structure. In the following structure definition, no space is allocated for member d , but the member b will be aligned on a doubleword boundary:

struct ABC {
   int a;
   double d[0];
   int b;
} ;

Zero-length arrays are not permitted in any other context.
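The alignment effect can be checked with offsetof. GCC and Clang also accept zero-length arrays as an extension, so the sketch below can be tried outside SAS/C. It assumes one of those compilers, and the offset it reports depends on the alignment that the platform requires for double. The function names are hypothetical.

```c
#include <stddef.h>

struct ABC {
    int a;
    double d[0];    /* occupies no storage but imposes double alignment */
    int b;
};

/* Offset of the member that follows the zero-length array. */
size_t offset_of_b(void) { return offsetof(struct ABC, b); }

/* Storage occupied by the zero-length array itself. */
size_t size_of_d(void) { return sizeof(((struct ABC *)0)->d); }
```

On a platform where double requires 8-byte alignment, offset_of_b() reports 8 instead of 4, and size_of_d() reports 0.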

The __alignmem and __noalignmem keywords

The __alignmem and __noalignmem keywords can be used in a structure definition to specify whether members of a structure are to be aligned normally ( __alignmem ) or on byte boundaries ( __noalignmem ). The keywords are associated with the structure tag. Note that the keyword must precede the word struct in the structure declaration. For example, in the following structure declaration, member ex.d will not be aligned on a doubleword boundary, but it will be allocated at the next available location, a word boundary:

__noalignmem struct XYZ {
   int a;
   double d;
} ex;

This property can be useful when C structures are used to map existing data areas, such as operating system control blocks, that have fields aligned on boundaries other than those normally associated with the C data types.

The keywords can be used to force alignment even when the bytealign compiler option is used or to force byte alignment when the nobytealign option is used. See the discussion of bytealign in Option Descriptions.

__alignmem and __noalignmem propagate to any structure members. For example, in the following structure declaration, the members of s will be byte-aligned as well:

__noalignmem struct XYZ {
   struct ABC {
      char c;
      float f;
   } s;
   double d;
} ex;

The keywords can be used in the declaration of inner structures to change alignment requirements. In the following example, the members of the outer structure are not aligned. The members of the inner structure are aligned.

__noalignmem struct XYZ {
   int a;
   short b;
   __alignmem struct ABC {
      int c;
      double d;
   } abc;
   double d;
} ;
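The layout change caused by byte alignment, as in the first example of this section, can be observed on other compilers with the pack pragma, which GCC, Clang, and Microsoft C accept. The sketch below uses that pragma as a stand-in for __noalignmem. It is a different mechanism with similar results for this simple case, and the structure and function names are hypothetical.

```c
#include <stddef.h>

/* Normal alignment: d is placed on the boundary required for double. */
struct natural {
    int a;
    double d;
};

/* Byte alignment: d is placed at the next available location. */
#pragma pack(push, 1)
struct packed {
    int a;
    double d;
};
#pragma pack(pop)

size_t natural_offset_of_d(void) { return offsetof(struct natural, d); }
size_t packed_offset_of_d(void)  { return offsetof(struct packed, d); }
```

In the packed structure, d immediately follows a, which corresponds to the word-boundary placement of ex.d described for __noalignmem.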


Special keywords for declarations of non-C functions

The compiler accepts a number of special interlanguage communication keywords in the declarations of functions and function pointers. These keywords indicate that the declared function, or the function pointed to, is written in a language other than the C language. The following keywords are accepted by the compiler:

__asm __pascal __pli __cobol __fortran __foreign

Note:    If you use any of these keywords other than __asm, you must also use the SAS/C Interlanguage Communication Feature. Do not use these keywords if you are using another technique for interlanguage communication.

Here is an example of a function declaration for a function written in FORTRAN. The function returns a value of type double , as follows:

double __fortran xyz();

The keywords can be used in combination either via a typedef or directly. For example, suppose pasfnc is a function written in Pascal that returns a pointer to a function written in assembler language. The assembler language function, in turn, returns a value of type int . To declare pasfnc , you can use a typedef as in the following example:

typedef __asm int (*asmfp)();
__pascal asmfp pasfnc();

Here is an example that does not use a typedef :

char *(__asm *__pli p())();

In this example, a PL/I function returns a pointer to an assembler function (the assembler function returns a pointer to char ).

Do not use the prototype form of function declaration in declarations containing one of these keywords.

See Chapter 3, "Communication with other Languages," in SAS/C Compiler Interlanguage Communication Feature User's Guide for detailed information on how these keywords can be used.

__inline and __actual storage class modifiers

__inline is a storage class modifier. It can go in the same places as a storage class specifier and can be given in addition to a storage class specifier. If a function is declared as __inline and the module contains at least one definition of the function, the compiler sees this as a recommendation that the function be inlined. __actual is also a storage class modifier. It can be specified with or without the __inline qualifier, but it implies __inline . __actual is used to specify that the compiler should produce an actual (callable) copy of the function if the function has external linkage. If the function has internal linkage, the compiler may not output an actual function if it does not need one.

For additional information, see the discussion of __inline in Global Optimization Compiler Options and __actual in The __actual Keyword for Inline Functions.

The #pragma options statement

The #pragma options statement specifies compiler options within program source code. More than one #pragma options statement can be used in a source file. See Compiler Options for more information about the #pragma options statement.

The #pragma linkage statement

The compiler accepts the following statement in which identifier is the name of a function or a typedef of a function. This statement specifies that identifier is called using IBM OS linkage.

#pragma linkage (identifier ,OS)

Sample #pragma linkage Statement illustrates the use of the #pragma linkage statement in a program.


Sample #pragma linkage Statement
extern int asm_func(void);          /* Declare 'asm_func' as called */
#pragma linkage (asm_func,OS)       /* using OS-linkage.            */

typedef int os_linkage_t(void);     /* Declare functions of type    */
#pragma linkage (os_linkage_t,OS)   /* 'os_linkage_t' as called     */
extern os_linkage_t asm_func;       /* using OS-linkage.            */

The compiler accepts the #pragma linkage statement to ensure compatibility with IBM products whose C language interface functions are defined with this statement. Refer to the IBM documentation for a specific product for more information. For more information on IBM OS linkage, see Compiler Options.

The #pragma map statement

The compiler accepts the following statement:

#pragma map (external-identifier,"external-name")

This statement directs the compiler to use external-name as the object code external symbol for external-identifier. The external identifier must be a C identifier of storage class extern . The external symbol can be any sequence of characters, enclosed by double quotes. If the external symbol is longer than eight characters, the compiler truncates it on the right to eight characters.

For example, suppose you had an assembler language module named ABC$DEF and you wanted to reference it in your C programs with a more natural looking name. You could use the following #pragma statement to map abc_def to ABC$DEF:

#pragma map(abc_def, "ABC$DEF")

Adhering to the following guidelines ensures that the symbols you create do not conflict with the compiler-generated symbols or symbols defined in the SAS/C Library:

The compiler issues a warning diagnostic message for any external symbol name that does not follow these guidelines. Depending on the context, the compiler may refer to the external identifier by the external symbol name in a diagnostic message. Extended Names contains information on using #pragma map with the extname compiler option.

Character and String Qualifiers

Release 6.50 introduced A and E qualifiers for character and string constants. The qualifiers cause the constant to be stored as either ASCII or EBCDIC.

A string literal prefixed with A is parsed and stored by the compiler as an ASCII string. An example of its usage is:

     A"this is an ASCII string"

A string literal prefixed with E is parsed and stored by the compiler as an EBCDIC string. An example of its usage is:

     E"this is an EBCDIC string"

The translation between ASCII and EBCDIC is based on IBM Code Page 1047 for EBCDIC and ISO 8859-1 (Latin 1) for ASCII.
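To give a feel for the translation, the sketch below maps a small subset of ASCII characters (letters, digits, and the space) to their code page 1047 code points. It is illustrative only and is not the compiler's translation table. The function name is hypothetical, and the character literals assume that the code is compiled on an ASCII host.

```c
/* Map an ASCII letter, digit, or space to its EBCDIC (code page 1047)
   code point. Returns -1 for characters that this sketch does not
   cover. Assumes the host character set is ASCII. */
int ascii_to_ebcdic(int c)
{
    if (c == ' ')             return 0x40;
    if (c >= '0' && c <= '9') return 0xF0 + (c - '0');
    if (c >= 'A' && c <= 'I') return 0xC1 + (c - 'A');
    if (c >= 'J' && c <= 'R') return 0xD1 + (c - 'J');
    if (c >= 'S' && c <= 'Z') return 0xE2 + (c - 'S');
    if (c >= 'a' && c <= 'i') return 0x81 + (c - 'a');
    if (c >= 'j' && c <= 'r') return 0x91 + (c - 'j');
    if (c >= 's' && c <= 'z') return 0xA2 + (c - 's');
    return -1;
}
```

The letters fall into three separate ranges in EBCDIC, which is why a simple constant offset between the two encodings does not work.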



Copyright © 2001 by SAS Institute Inc., Cary, NC, USA. All rights reserved.