The SASCBTBL Attribute Table

Introduction to the SASCBTBL Attribute Table

Because the MODULE function invokes an external routine that SAS knows nothing about, you must supply information about the routine's arguments so that the MODULE function can validate them and convert them, if necessary. For example, suppose you want to invoke a routine that requires an integer as an argument. Because SAS uses floating-point values for all of its numeric arguments, the floating-point value must be converted to an integer before you invoke the external routine. The MODULE function looks for this attribute information in an attribute table that is referred to by the SASCBTBL fileref.

What Is the SASCBTBL Attribute Table?

The attribute table is a sequential text file that contains descriptions of the routines that you can invoke with the MODULE function. The table defines how the MODULE function should interpret supplied arguments when it builds a parameter list to pass to the called routine.
The MODULE function locates the table by opening the file that is referenced by the SASCBTBL fileref. If you do not define this fileref, the MODULE function simply calls the requested shared library routine without altering the arguments.
CAUTION:
Using the MODULE function without defining an attribute table can cause SAS to crash, produce unexpected results, or result in severe errors.
You need to use an attribute table for all external functions that you want to invoke.
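As a minimal sketch of defining the fileref, the following SAS statements assign SASCBTBL to a temporary file and write a small attribute table into it. The routine name xyz and its argument layout are placeholders for illustration only. A FILENAME statement that names a permanent file containing the same lines would serve equally well.

```sas
/* Assign the SASCBTBL fileref to a temporary file. */
filename sascbtbl temp;

/* Copy the attribute table lines into that file.            */
/* DATALINES4 is used because the table lines contain        */
/* semicolons. The data ends at the line of four semicolons. */
data _null_;
   file sascbtbl;
   input;
   put _infile_;
   datalines4;
* xyz is a hypothetical routine used only for illustration;
routine xyz minarg=2 maxarg=2;
arg 1 num input byvalue format=ib4.;
arg 2 char output format=$char20.;
;;;;
run;
```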

Syntax of the Attribute Table

The Attribute Table

The attribute table should contain the following items:
  • a description in a ROUTINE statement for each shared library routine that you intend to call
  • descriptions in ARG statements for each argument that is associated with the routine you intend to call
At any point in the attribute table file, you can create a comment using an asterisk (*) as the first non-blank character of a line or after the end of a statement (following the semicolon). You must end the comment with a semicolon.
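The two comment placements described above might look like this in an attribute table file. The routine xyz and its arguments are hypothetical and serve only to give the comments something to annotate.

```sas
* Comment on its own line, ended by a semicolon;
routine xyz minarg=2 maxarg=2;   * comment after the end of a statement;
arg 1 num input byvalue format=ib4.;
arg 2 char output format=$char20.;
```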

ROUTINE Statement

Here is the syntax of the ROUTINE statement:
ROUTINE name MINARG=minarg MAXARG=maxarg
   <CALLSEQ=BYVALUE|BYADDR>
   <TRANSPOSE=YES|NO> <MODULE=shared-library-name>
   <RETURNS=DBLPTR | CHAR<n> | DOUBLE | LONG | PTR | SHORT | [U]INT32 |
   [U]INT64 | ULONG | USHORT>;
The following are descriptions of the ROUTINE statement attributes:
ROUTINE name
starts the ROUTINE statement. You need a ROUTINE statement for every shared library function that you intend to call. The value for name must match the routine name or ordinal that you specify in the module argument of the MODULE function. The module argument consists of the name of the shared library (unless the MODULE attribute supplies it) followed by the routine name or ordinal. For example, if you specify libc,getcwd in the MODULE function call, the ROUTINE name must be getcwd.
The name argument is case sensitive, and is required for the ROUTINE statement.
MINARG=minarg
specifies the minimum number of arguments to expect for the shared library routine. In most cases, this value will be the same as MAXARG; but some routines do allow a varying number of arguments. This attribute is required.
MAXARG=maxarg
specifies the maximum number of arguments to expect for the shared library routine. This attribute is required.
CALLSEQ=BYVALUE | BYADDR
indicates the calling sequence method used by the shared library routine. Specify BYVALUE for call-by-value and BYADDR for call-by-address. The default value is BYADDR.
Fortran and COBOL are call-by-address languages. C is usually call-by-value, although a specific routine might be implemented as call-by-address.
The MODULE function does not require that all arguments use the same calling method. You can identify any exceptions by using the BYVALUE and BYADDR options in the ARG statement.
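The following attribute table entry is a sketch of mixing the two calling methods. It assumes a hypothetical C routine with the prototype int scale(int factor, double *value). The routine name, the prototype, and the chosen formats are assumptions made for illustration.

```sas
* Call-by-value routine with one call-by-address exception;
routine scale minarg=2 maxarg=2 callseq=byvalue returns=int32;
arg 1 num input format=ib4.;          * inherits BYVALUE from CALLSEQ=;
arg 2 num update byaddr format=rb8.;  * pointer argument, so BYADDR;
```

Because the second argument is declared UPDATE, the value that the routine stores through the pointer is copied back to the SAS variable that you pass in that position.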
TRANSPOSE=YES | NO
specifies whether SAS transposes matrices that have both more than one row and more than one column before it calls the shared library routine. This attribute applies only to routines called from within PROC IML with MODULEI, MODULEIC, and MODULEIN.
TRANSPOSE=YES is necessary when you are calling a routine that is written in a language that does not use row-major order to store matrices. (For example, Fortran uses column-major order.)
For example, consider this matrix with three columns and two rows:
                  columns
                1   2   3
              ------------
    rows  1 |  10  11  12
          2 |  13  14  15
PROC IML stores this matrix in memory sequentially as 10, 11, 12, 13, 14, 15. However, Fortran routines will expect this matrix as 10, 13, 11, 14, 12, 15.
The default value is NO.
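The two storage orders can be illustrated with an ordinary DATA step. This sketch does not call any external routine. It only writes the elements of the example matrix to the log, first in row-major order and then in column-major order. The array name m and the loop variables are arbitrary.

```sas
data _null_;
   /* Initial values fill the array row by row. */
   array m[2,3] _temporary_ (10 11 12 13 14 15);

   /* Row-major order: the column index varies fastest. */
   do i = 1 to 2;
      do j = 1 to 3;
         v = m[i,j];
         put v @;
      end;
   end;
   put;

   /* Column-major order: the row index varies fastest. */
   do j = 1 to 3;
      do i = 1 to 2;
         v = m[i,j];
         put v @;
      end;
   end;
   put;
run;
```

The first line written to the log lists the values as PROC IML stores them, and the second line lists them in the order that a Fortran routine expects.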
MODULE=shared-library-name
names the executable module (the shared library) in which the routine resides. You do not need to specify this attribute if the shared library has the same name as the routine. If you specify the MODULE attribute in the ROUTINE statement, you do not need to include the module name in the module argument of the MODULE routine, unless the name of the routine that you are calling is not unique in the attribute table. The MODULE routine is described in CALL MODULE Routine: UNIX.
You can have multiple ROUTINE statements that use the same MODULE name. You can also have duplicate routine names that reside in different shared libraries.
The MODULE function searches the directories that are defined in each operating system's library path environment variable when it attempts to load the shared library argument provided in the MODULE attribute. The following table lists this environment variable for each UNIX operating system that SAS supports.
Shared Library Environment Variable Name
Operating Environment    Environment Variable Name
---------------------    -------------------------------
Solaris                  $LD_LIBRARY_PATH
AIX/R                    $LIBPATH
HP-UX                    $LD_LIBRARY_PATH or $SHLIB_PATH
Linux                    $LD_LIBRARY_PATH
Note: For more information about these environment variables, see the man pages for your operating environment.
You can also use the PATH system option to point to the directory that contains the shared library specified in the MODULE= option. Using the PATH system option overrides your system's environment variable when you load the shared library. For more information, see PATH System Option: UNIX.
RETURNS=DBLPTR | CHAR<n> | DOUBLE | LONG | PTR | SHORT | [U]INT32 | [U]INT64 | ULONG | USHORT
specifies the type of value that the shared library routine returns. This value will be converted as appropriate, depending on whether you use MODULEC (which returns a character) or MODULEN (which returns a number). The following are the possible return value types:
DBLPTR
pointer to a double-precision floating-point number (the routine returns the address of the value instead of returning the value in a floating-point register). See the documentation for your shared library routine to determine how it handles double-precision floating-point values.
CHAR<n>
pointer to a character string up to n bytes long. The string is expected to be null-terminated and will be blank-padded or truncated as appropriate. If you do not specify n, the MODULE function uses the maximum length of the receiving SAS character variable.
DOUBLE
double-precision floating-point number.
LONG
long integer.
PTR
pointer to the character string being returned.
SHORT
short integer.
[U]INT32
32-bit integer (INT32) or 32-bit unsigned integer (UINT32).
[U]INT64
64-bit integer (INT64) or 64-bit unsigned integer (UINT64).
ULONG
unsigned long integer.
USHORT
unsigned short integer.
If you do not specify the RETURNS attribute, you should invoke the routine with only the MODULE and MODULEI CALL routines. You will get unpredictable values if you omit the RETURNS attribute and invoke the routine using the MODULEN and MODULEIN functions or the MODULEC and MODULEIC functions.
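As an illustration of RETURNS= together with MODULEN, the sketch below describes the C library routine access, whose standard prototype is int access(const char *path, int amode). It assumes that the C library can be loaded under the name libc on your system. The 200-byte path length and the /tmp path are arbitrary choices.

```sas
/* Write the attribute table to a temporary file. */
filename sascbtbl temp;

data _null_;
   file sascbtbl;
   input;
   put _infile_;
   datalines4;
routine access minarg=2 maxarg=2 returns=int32 module=libc;
arg 1 char input byaddr format=$cstr200.;
arg 2 num input byvalue format=ib4.;
;;;;
run;

data _null_;
   length path $200;
   path = '/tmp';
   /* Mode 0 asks only whether the path exists */
   /* (F_OK on typical UNIX systems).          */
   rc = modulen('access', path, 0);
   put rc=;
run;
```

The access routine returns 0 when the check succeeds, so MODULEN is the appropriate function here: it converts the integer return value to a SAS numeric value.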

ARG Statement

The ROUTINE statement must be followed by as many ARG statements as you specified in the MAXARG= option. The ARG statements must appear in the order in which the arguments will be specified within the MODULE function.
Here is the syntax for each ARG statement:
ARG argnum NUM|CHAR <INPUT|OUTPUT|UPDATE> <NOTREQD|REQUIRED>
<BYADDR|BYVALUE> <FDSTART> <FORMAT=format>;
Here are the descriptions of the ARG statement attributes:
ARG argnum
defines the argument number. This is a required attribute. Define the arguments in ascending order, starting with the first routine argument (ARG 1).
NUM | CHAR
defines the argument as numeric or character. This attribute is required.
If you specify NUM here but pass the routine a character argument, the argument is converted using the standard numeric informat. If you specify CHAR here but pass the routine a numeric argument, the argument is converted using the BEST12. format.
INPUT | OUTPUT | UPDATE
indicates the argument is either input to the routine, an output argument, or both. If you specify INPUT, the argument is converted and passed to the shared library routine. If you specify OUTPUT, the argument is not converted, but is updated with an outgoing value from the shared library routine. If you specify UPDATE, the argument is converted, passed to the shared library routine, and updated with an outgoing value from the routine.
You can specify OUTPUT and UPDATE only with variable arguments (that is, no constants or expressions are allowed).
NOTREQD | REQUIRED
indicates whether the argument is required. If you specify NOTREQD, then you can omit the argument from the MODULE function call. If other arguments follow the omitted argument, mark its position with an extra comma as a placeholder. For example, to omit the second argument to routine XYZ, you would specify:
call module('XYZ',1,,3);
CAUTION:
Be careful when using NOTREQD; the shared library routine must not attempt to access the argument if it is not supplied in the call to MODULE. If the routine does attempt to access it, you might receive unexpected results or severe errors.
The REQUIRED attribute indicates that the argument is required and cannot be omitted. REQUIRED is the default value.
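An attribute table entry that permits the call shown above might look like the following sketch. It assumes that all three arguments of XYZ are integers. The formats and the MINARG value are illustrative assumptions, not a description of a real routine.

```sas
* Hypothetical entry for routine XYZ with an optional second argument;
routine XYZ minarg=2 maxarg=3;
arg 1 num input format=ib4.;
arg 2 num input notreqd format=ib4.;
arg 3 num input format=ib4.;
```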
BYADDR | BYVALUE
indicates whether the argument is passed by reference or by value.
BYADDR is the default value unless CALLSEQ=BYVALUE was specified in the ROUTINE statement. In that case, BYVALUE is the default. Specify BYADDR when you are using a call-by-value routine that also has arguments to be passed by address.
FDSTART
indicates that the argument begins a block of values that are grouped into a structure whose pointer is passed as a single argument. Note that all subsequent arguments are treated as part of that structure until the MODULE function encounters another FDSTART argument.
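As a sketch of FDSTART, suppose a hypothetical C routine setrec takes one argument, a pointer to a structure declared as struct rec { int id; int count; char name[8]; }. The routine, the structure, and its member layout are assumptions. The members were chosen so that no padding bytes fall between them.

```sas
* Three SAS arguments are packed into one structure;
routine setrec minarg=3 maxarg=3;
arg 1 num input fdstart format=ib4.;   * first member, starts the structure;
arg 2 num input format=ib4.;           * second member;
arg 3 char input format=$char8.;       * third member;
```

The MODULE call supplies three SAS arguments, but the routine receives a single pointer to the 16-byte block that they form.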
FORMAT=format
names the format that presents the argument to the shared library routine. Any formats supplied by SAS, PROC FORMAT style formats, or SAS/TOOLKIT formats are valid. Note that this format must have a corresponding valid informat if you specified the UPDATE or OUTPUT attribute for the argument.
The FORMAT= attribute is not required, but is recommended because format specification is the primary purpose of the ARG statements in the attribute table.
CAUTION:
Using an incorrect format can produce invalid results, cause SAS to crash, or result in serious errors.

The Importance of the Attribute Table

The MODULE function relies heavily on the accuracy of the information in the attribute table. If this information is incorrect, unpredictable results can occur (including a system crash).
Consider an example routine xyz that expects two arguments: an integer and a pointer. The integer is a code indicating what action takes place. For example, action 1 means that a 20-byte character string is written into the area that is pointed to by the second argument, the pointer.
Suppose you call xyz using the MODULE function, but you indicate in the attribute table that the receiving character argument is only 10 characters long:
routine xyz minarg=2 maxarg=2;
arg 1 input num byvalue format=ib4.;
arg 2 output char format=$char10.;
Regardless of the value given by the LENGTH statement for the second argument to MODULE, MODULE passes a pointer to a 10-byte area to the xyz routine. If xyz writes 20 bytes at that location, the 10 bytes of memory following the string provided by MODULE are overwritten, causing unpredictable results:
data _null_;
    length x $20;
    call module('xyz',1,x);
run;
The call might work fine, depending on which 10 bytes were overwritten. However, overwriting can cause you to lose data or cause your system to crash.
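One way to avoid the overwrite in this example, assuming that xyz never writes more than 20 bytes, is to declare the full buffer size in the attribute table so that it agrees with both the routine and the LENGTH statement:

```sas
routine xyz minarg=2 maxarg=2;
arg 1 input num byvalue format=ib4.;
arg 2 output char format=$char20.;
```

With this entry, MODULE passes a pointer to a 20-byte area, which is large enough for the string that action 1 writes.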
Also, note that the PEEKLONG and PEEKCLONG functions rely on the validity of the pointers that you supply. If the pointers are invalid, severe errors can result. For example, this code causes an error:
data _null_;
   length c $10;
   /* trying to copy from address 0!!! */
   c = peekclong(0,10);
run;