The PROTO Procedure
Registering Function Prototypes
Function prototypes are registered (declared) in the PROTO procedure. Use the following form:
return-type function-name (arg-type <arg-name> / <iotype> <arg-label>, ...) <options>;
return-type
specifies a C language type for the returned value. See Supported C Return Types for a list of supported C return types.
Tip: Return-type can be preceded by either the unsigned or the Exceldate modifier. Use Exceldate if the return type is a Microsoft Excel date.
function-name
specifies the name of the function to be registered.
Tip: Function names within a given package must be unique in the first 32 characters. Function names do not need to be unique across different packages.
arg-type
specifies the C language type for the function argument. See Supported C Argument Types for a list of supported C argument types.
You must specify arg-type for each argument in the function's argument list. The argument list must be enclosed in parentheses. If the argument is an array, then you must specify the argument name followed by square brackets that contain the array size (for example, double A[10]). If the size is not known or if you want to disable verification of the length, then use type * name instead (for example, double * A).
Tip: Arg-type can be preceded by the unsigned, const, or Exceldate modifier. Use Exceldate if the argument is a Microsoft Excel date.
arg-name
specifies the name of the argument.
iotype
specifies the I/O type of the argument. Use I for input, O for output, or U for update.
Tip: By default, all parameters that are pointers are assumed to be I/O type U. All non-pointer values are assumed to be I/O type I. This behavior parallels the C language parameter passing scheme.
arg-label
specifies a description or label for the argument.
LABEL=
specifies a description or a label for the function. Enclose the text string in quotation marks.
KIND= | GROUP=
specifies the group that the function belongs to. The KIND= or GROUP= option allows for convenient grouping of functions in a package.
You can use any string (up to 40 characters) in quotation marks to group similar functions.
Tip: The following special cases provided for Risk Dimensions do not require quotation marks: INPUT (Instrument Input), TRANS (Risk Factor Transformation), PRICING (Instrument Pricing), and PROJECT. The default is PRICING.
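The default I/O types follow ordinary C parameter passing, which the plain C sketch below tries to illustrate. The function names scale_value and scale_in_place are hypothetical and are not part of any SAS or external library; the sketch shows only the C side, not the PROC PROTO registration.

```c
#include <stddef.h>

/* Hypothetical function with non-pointer arguments.
   The caller's variable cannot change, which corresponds to I (input). */
double scale_value(double x, double factor)
{
    x = x * factor;   /* modifies only the local copy */
    return x;
}

/* Hypothetical function with a pointer argument.
   "values" is read and rewritten, which corresponds to U (update).
   "count" is passed separately because a "double *" parameter
   carries no length information. */
void scale_in_place(double *values, size_t count, double factor)
{
    size_t i;
    for (i = 0; i < count; i++) {
        values[i] = values[i] * factor;
    }
}
```

Under the defaults described in the tip above, the arguments of scale_value would be treated as I, and the pointer argument of scale_in_place would be treated as U.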
The following C return types are supported in the PROTO procedure.
Function prototype | SAS variable type | C variable type | |
---|---|---|---|
short | numeric | short, short *, short ** | |
short * | numeric, array | short, short *, short ** | |
int | numeric | int, int *, int ** | |
int * | numeric, array | int, int *, int ** | |
long | numeric | long, long *, long ** | |
long * | numeric, array | long, long *, long ** | |
double | numeric | double, double *, double ** | |
double * | numeric, array | double, double *, double ** | |
char * | character | char *, char ** | |
struct * | struct | struct *, struct ** | |
void | | void | |
The following C argument types are supported in the PROTO procedure.
Function prototype | SAS variable type | C variable type | |
---|---|---|---|
short | numeric | short, short *, short ** | |
short * | numeric, array | short, short *, short ** | |
short ** | array | short *, short ** | |
int | numeric | int, int *, int ** | |
int * | numeric, array | int, int *, int ** | |
int ** | array | int *, int ** | |
long | numeric | long, long *, long ** | |
long * | numeric, array | long, long *, long ** | |
long ** | array | long *, long ** | |
double | numeric | double, double *, double ** | |
double * | numeric, array | double, double *, double ** | |
double ** | array | double *, double ** | |
char * | character | char *, char ** | |
char ** | character | char *, char ** | |
struct * | structure | struct *, struct ** | |
struct ** | structure | struct *, struct ** |
Basic C Language Types
The SAS language supports two data types: character and numeric. These types correspond to an array of characters and a double (double-precision floating point) data type in the C programming language. When SAS variables are used as arguments to external C functions, they are converted (cast) into the proper types.
Working with Character Variables
You can use character variables only for arguments that require a "char *" value. The character string that is passed is a null-terminated string that is terminated at the current length of the string. The current length of the character string is the minimum of the allocated length of the string and the length of the last value that was stored in the string. The allocated length of the string (by default, 32 bytes) can be specified by using the LENGTH statement. Functions that return "char *" can return a null-terminated (zero-delimited) string that is copied to the SAS variable. If the current length of the character string is less than the allocated length, the character string is padded with blanks.
In the following example, the allocated length of str is 10, but the current length is 5. When the string is null-terminated at the allocated length, "hello     " (hello followed by five blanks) is passed to the function xxx:

    length str $ 10;
    str = "hello";
    call xxx(str);
To avoid the blank padding, use the SAS function TRIM on the parameter within the function call:

    length str $ 10;
    str = "hello";
    call xxx(trim(str));
In this case, the value "hello" is passed to the function xxx.
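The effect of blank padding on the C side can be sketched in plain C. The helper names pad_to_allocated and trim_trailing_blanks are hypothetical; they only imitate what the called function would receive and are not how SAS implements the conversion.

```c
#include <stddef.h>
#include <string.h>

/* Copy "value" into "out", pad with blanks up to "allocated" characters,
   and null-terminate at the allocated length.
   "out" must have room for allocated + 1 bytes. */
void pad_to_allocated(const char *value, size_t allocated, char *out)
{
    size_t len = strlen(value);
    if (len > allocated) {
        len = allocated;              /* truncate values that are too long */
    }
    memcpy(out, value, len);
    memset(out + len, ' ', allocated - len);
    out[allocated] = '\0';
}

/* Remove trailing blanks in place, roughly what TRIM does
   to a nonblank argument. */
void trim_trailing_blanks(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && s[len - 1] == ' ') {
        len--;
    }
    s[len] = '\0';
}
```

With an allocated length of 10, pad_to_allocated("hello", 10, buf) leaves a 10-character string in buf, and trim_trailing_blanks(buf) reduces it to the 5-character value.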
Working with Numeric Variables
You can use numeric variables for an argument that requires a short, int, long, or double data type, as well as for pointers to those types. Numeric variables are converted to the required type automatically. If the conversion fails, then the function is not called and the output of the function is set to missing. If pointers to these types are requested, the address of the converted value is passed. On return from the call, the value is converted back to a double type and stored in the SAS variable. SAS scalar variables cannot be passed as arguments that require two or more levels of indirection. For example, a SAS variable cannot be passed as an argument that requires a cast to a "long **" type.
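The conversion round trip for a pointer argument might look like the following C sketch. The functions add_one and call_add_one are hypothetical. Two assumptions are made that the text does not state: a conversion is treated as failed when the value is NaN or outside the range of short, and NAN stands in for a SAS missing value.

```c
#include <limits.h>
#include <math.h>

/* Hypothetical external function that expects a pointer to short. */
void add_one(short *p)
{
    *p = (short)(*p + 1);
}

/* Convert a SAS-style numeric (double) to short, call add_one with the
   address of the converted value, and convert the result back to double.
   Assumption: the conversion fails for NaN or out-of-range values. */
double call_add_one(double sas_value)
{
    short converted;

    if (isnan(sas_value) || sas_value < SHRT_MIN || sas_value > SHRT_MAX) {
        return NAN;                    /* the function is not called */
    }
    converted = (short)sas_value;      /* the C cast truncates any fraction */
    add_one(&converted);               /* address of the converted value */
    return (double)converted;          /* stored back as a double */
}
```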
Working with Missing Values
SAS variables that contain missing values are converted according to how the function that is being called has mapped missing values when using the PROTO procedure. All variables that are returned from the function are checked for the mapped missing values and converted to SAS missing values.
For example, if an argument to a function is missing, and the argument is to be converted to an integer, and an integer was mapped to -99, then -99 is passed to the function. If the same function returns an integer with the value -99, then the variable that this value is returned to would have a value of missing.
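The mapping in this example can be sketched in C as follows. The names INT_MISSING, echo_int, and call_echo_int are hypothetical, and NAN is used only as a stand-in for a SAS missing value.

```c
#include <math.h>

#define INT_MISSING (-99)   /* mapped value, taken from the example above */

/* Hypothetical external function: returns its argument unchanged. */
int echo_int(int value)
{
    return value;
}

/* Map a missing input to INT_MISSING before the call, and map an
   INT_MISSING result back to missing after the call. */
double call_echo_int(double sas_value)
{
    int arg = isnan(sas_value) ? INT_MISSING : (int)sas_value;
    int result = echo_int(arg);
    return (result == INT_MISSING) ? NAN : (double)result;
}
```

One consequence of such a mapping is that a genuine result of -99 is indistinguishable from missing, so the mapped value should be one that the function never returns as real data.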
Interfacing with External C Functions
To make it easier to interface with external C functions, many PROTO-compatible procedures have been enhanced to support most of these C types.
There is no way to return and save a pointer to any type in a SAS variable (see Automatic Type Casting for the short Data Type in an Assignment Statement). Pointers are always dereferenced, and their contents are converted and copied to SAS variables.
The EXTERNC statement is used to specify C variables in PROTO-compatible procedures. The syntax of the EXTERNC statement has the following form:
EXTERNC DOUBLE | INT | LONG | SHORT | CHAR <[*][*]> var-1 <var-2 ... var-n>;
The following table shows how these variables are treated when they are positioned on the left side of an expression. The table shows the automatic casting that is performed for a short type on the right side of an assignment. (Explicit type conversions can be forced in any expression, with a unary operator called a cast.) The table lists all the allowed combinations of short types that are associated with SAS variables.
Note: A table for int, long, and double types can be created by substituting any of these types for "short" in this table.
If any of the pointers are null and require dereferencing, then the result is set to missing if there is a missing value set for the result variable (see MAPMISS Statement for more information).
Type for left side of assignment | Type for right side of assignment | Cast performed | |
---|---|---|---|
short | SAS numeric | y = (short) x | |
short | short | y = x | |
short | short * | y = * x | |
short | short ** | y = ** x | |
short * | SAS numeric | * y = (short) x | |
short * | short | y = & x | |
short * | short * | y = x | |
short * | short ** | y = * x | |
short ** | SAS numeric | **y = (short) x | |
short ** | short * | y = & x | |
short ** | short ** | y = x | |
SAS numeric | short | y = (double) x | |
SAS numeric | short * | y = (double) * x | |
SAS numeric | short ** | y = (double) ** x |
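Several rows of the table can be written out as ordinary C assignments. The sketch below is illustrative only: the function name cast_table_demo and all variable names are hypothetical, and a double stands in for a SAS numeric variable.

```c
/* Returns 1 if every assignment behaves as the table describes. */
int cast_table_demo(void)
{
    double sas_x = 7.0;        /* SAS numeric */
    short s = 3;               /* short */
    short *ps = &s;            /* short * */
    short **pps = &ps;         /* short ** */
    short target = 0;          /* storage that the pointers below refer to */
    short *ptarget = &target;

    short y;
    short *py;
    short **ppy;
    double sas_y;
    int ok = 1;

    /* left side: short */
    y = (short)sas_x;       ok = ok && (y == 7);
    y = *ps;                ok = ok && (y == 3);
    y = **pps;              ok = ok && (y == 3);

    /* left side: short * */
    py = &target;
    *py = (short)sas_x;     ok = ok && (target == 7);
    py = &s;                ok = ok && (*py == 3);
    py = *pps;              ok = ok && (py == ps);

    /* left side: short ** */
    target = 0;
    ppy = &ptarget;
    **ppy = (short)sas_x;   ok = ok && (target == 7);
    ppy = &ps;              ok = ok && (*ppy == ps);

    /* left side: SAS numeric */
    sas_y = (double)s;      ok = ok && (sas_y == 3.0);
    sas_y = (double)**pps;  ok = ok && (sas_y == 3.0);
    return ok;
}
```

Note that the rows that assign through a pointer on the left side (for example, * y = (short) x) require the pointer to refer to valid storage, which is the role of target in this sketch.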
The following table shows how these variables are treated when they are passed as arguments to an external C function.
Function prototype | SAS variable type | C variable type | |
---|---|---|---|
short | numeric | short, short *, short ** | |
short * | numeric, array | short, short *, short ** | |
short ** | array | short *, short ** | |
int | numeric | int, int *, int ** | |
int * | numeric, array | int, int *, int ** | |
int ** | array | int *, int ** | |
long | numeric | long, long *, long ** | |
long * | numeric, array | long, long *, long ** | |
long ** | array | long *, long ** | |
double | numeric | double, double *, double ** | |
double * | numeric, array | double, double *, double ** | |
double ** | array | double *, double ** | |
char * | character | char *, char ** | |
char ** | character | char *, char ** | |
struct * | structure | struct *, struct ** | |
struct ** | structure | struct *, struct ** |
Note: Automatic conversion between two different C types is never performed.
C Structures in SAS
Many C language libraries contain functions that have structure pointers as arguments. In SAS, structures can be defined only in PROC PROTO. After being defined, they can be declared and instantiated within many PROC PROTO compatible procedures, such as PROC COMPILE.
A C structure is a template that is applied to a contiguous piece of memory. Each entry in the template is given a name and a type. The type of each element determines the number of bytes that are associated with each entry and how each entry is to be used. Because of various alignment rules and base type sizes, SAS relies on the current machine compiler to determine the location of each entry in the memory of the structure.
The syntax of a structure declaration in SAS is the same as for C non-pointer structure declarations. A structure declaration has the following form:
struct structure_name structure_instance;
Each structure is set to zero values at declaration time. The structure retains the value from the previous pass through the data to start the next pass.
Structure elements are referenced by using the static period (.) notation of C. There is no pointer syntax for SAS. If a structure points to another structure, the only way to reference the structure that is pointed to is by assigning the pointer to a declared structure of the same type. You use that declared structure to access the elements.
If a structure entry is a short, int, or long type, and it is referenced in an expression, it is first cast to a double type and then used in the calculations. If a structure entry is a pointer to a base type, then the pointer is dereferenced and the value is returned. If the pointer is NULL, then a missing value is returned. The missing value assignments that are made in the PROC PROTO code are used when conversions fail or when missing values are assigned to non-double structure entities.
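The reading rules for structure entries can be sketched in C. The structure record and the functions read_mid and read_low are hypothetical, and NAN stands in for a SAS missing value.

```c
#include <math.h>
#include <stddef.h>

struct record {
    int   mid;    /* integer entry: cast to double when read */
    long *low;    /* pointer entry: dereferenced when read   */
};

/* Read the integer entry as a double. */
double read_mid(const struct record *r)
{
    return (double)r->mid;
}

/* Read the pointer entry: dereference it, or return NAN
   when the pointer is NULL. */
double read_low(const struct record *r)
{
    if (r->low == NULL) {
        return NAN;
    }
    return (double)*r->low;
}
```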
SAS must know the length of an array in order to use it. An array entry in a structure can therefore be used in the same way as an array in SAS, as long as its dimension is declared in the structure. This requirement includes arrays of short, int, and long types. If the entry is actually a pointer to an array of a double type, then the array elements can be accessed by assigning that pointer to a SAS array. Pointers to arrays of other types cannot be accessed by using the array syntax.

    proc proto package = sasuser.mylib.struct
               label = "package of structures";
       #define MAX_IN 20;

       typedef char * ptr;
       struct foo {
          double hi;
          int mid;
          ptr buf1;
          long * low;
          struct {
             short ans[MAX_IN + 1];
             struct {   /* inner */
                int inner;
             } n2;
             short outer;
          } n;
       };

       typedef struct foo *str;

       struct foo2 {
          str tom;
       };

       str get_record(char *name, int userid);
    run;

    proc fcmp library = sasuser.mylib;
       struct foo result;
       result = get_record("Mary", 32);
       put result=;
    run;
Enumerations are mnemonics for integer numbers. Enumerations enable you to set a literal name as a specific number and aid in the readability and supportability of C programs. Enumerations are used in C language libraries to simplify the return codes. After a C program is compiled, you can no longer access enumeration names.
The following example shows how to set up two enumerated value types in PROC PROTO: YesNoMaybeType and Tens. Both are referenced in the structure EStructure:

    proc proto package = sasuser.mylib.str2
               label = "package of structures";
       #define E_ROW 52;
       #define L_ROW 124;
       #define S_ROW 15;

       typedef double ExerciseArray[S_ROW][2];
       typedef double LadderArray[L_ROW];
       typedef double SamplingArray[S_ROW];

       typedef enum {
          True, False, Maybe
       } YesNoMaybeType;

       typedef enum {
          Ten = 10, Twenty = 20, Thirty = 30, Forty = 40, Fifty = 50
       } Tens;

       typedef struct {
          short rows;
          short cols;
          YesNoMaybeType type;
          Tens dollar;
          ExerciseArray dates;
       } EStructure;
    run;
The following PROC FCMP example shows how to access these enumerated types. In this example, the enumerated values that are set up in PROC PROTO are implemented in SAS as macro variables. Therefore, they must be accessed using the & symbol.

    proc fcmp library = sasuser.mylib;
       EStructure mystruct;
       mystruct.type = &True;
       mystruct.dollar = &Twenty;
    run;
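For reference, the integers behind these enumeration names follow the usual C rules, as the sketch below shows. The helper functions yes_no_maybe_value and tens_value are hypothetical, and the assumption that the macro variables resolve to these same integers is not confirmed by the text.

```c
/* The same enumerations as in the PROC PROTO step, with the name
   Twenty spelled as it is referenced in the PROC FCMP step. */
typedef enum { True, False, Maybe } YesNoMaybeType;
typedef enum { Ten = 10, Twenty = 20, Thirty = 30, Forty = 40, Fifty = 50 } Tens;

/* Return the integer behind each enumeration name. */
int yes_no_maybe_value(YesNoMaybeType t) { return (int)t; }
int tens_value(Tens t)                   { return (int)t; }
```

Because C numbers unassigned enumerators from zero, True is 0 and False is 1 in this particular definition, which is the reverse of the usual Boolean convention.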
You can use PROC PROTO in a limited way to compile external C functions. The C source code can be specified in PROC PROTO in the following way:
EXTERNC function-name; ... C-source-statements ... EXTERNCEND;
The function name tells PROC PROTO which function's source code is specified between the EXTERNC and EXTERNCEND statements. When PROC PROTO compiles source code, it includes any structure definitions and C function prototypes that are currently declared. However, typedef and #define are not included.
This functionality is provided to enable the creation of simple "helper" functions that facilitate the interface to preexisting external C libraries. Any valid C statement is permitted except for the #include statement. Only a limited subset of the C-stdlib functions is available. However, you can call any other C function that is already declared within the current PROC PROTO step.
The following C-stdlib functions are available:
Function | Description | |
---|---|---|
double sin(double x) | returns the sine of x (radians) | |
double cos(double x) | returns the cosine of x (radians) | |
double tan(double x) | returns the tangent of x (radians) | |
double asin(double x) | returns the arcsine of x (-pi/2 to pi/2 radians) | |
double acos(double x) | returns the arccosine of x (0 to pi radians) | |
double atan(double x) | returns the arctangent of x (-pi/2 to pi/2 radians) | |
double atan2(double y, double x) | returns the arctangent of y/x (-pi to pi radians) | |
double sinh(double x) | returns the hyperbolic sine of x (radians) | |
double cosh(double x) | returns the hyperbolic cosine of x (radians) | |
double tanh(double x) | returns the hyperbolic tangent of x (radians) | |
double exp(double x) | returns the exponential value of x | |
double log(double x) | returns the natural logarithm of x | |
double log2(double x) | returns the base-2 logarithm of x | |
double log10(double x) | returns the base-10 logarithm of x | |
double pow(double x, double y) | returns x raised to the power y (x**y) | |
double sqrt(double x) | returns the square root of x | |
double ceil(double x) | returns the smallest integer not less than x | |
double fmod(double x, double y) | returns the remainder of (x/y) | |
double floor(double x) | returns the largest integer not greater than x | |
int abs(int x) | returns the absolute value of x | |
double fabs(double x) | returns the absolute value of x | |
int min(int x, int y) | returns the minimum of x and y | |
double fmin(double x, double y) | returns the minimum of x and y | |
int max(int x, int y) | returns the maximum of x and y | |
double fmax(double x, double y) | returns the maximum of x and y | |
char* malloc(int x) | allocates memory of size x | |
void free(char*) | frees memory allocated with malloc |
The following example shows a simple C function written directly in PROC PROTO:

    proc proto package=sasuser.mylib.foo;
       struct mystruct {
          short a;
          long b;
       };
       int fillMyStruct(short a, short b, struct mystruct * s);

       externc fillMyStruct;
          int fillMyStruct(short a, short b, struct mystruct * s) {
             s->a = a;
             s->b = b;
             return(0);
          }
       externcend;
    run;
The limitations for the C language specifications in the PROTO procedure are as follows:
#define statements must be followed by a semicolon (;) and must be numeric in value.
The #define statement functionality is limited to simple replacement and unnested expressions. The only symbols that are affected are array dimension references.
The C preprocessor statements #include and #if are not supported. The SAS %INCLUDE statement (abbreviated %INC) can be used in place of #include.
A maximum of two levels of indirection are allowed for structure elements. Elements like "double ***" are not allowed. If these element types are needed in the structure, but are not accessed in SAS, you can use placeholders.
The float type is not supported.
A specified bit size or byte size for structure variables is not supported.
Function pointers and definitions of function pointers are not supported.
The union type is not supported. However, if you plan to use only one element of the union, you can declare the variable for the union as the type for that element.
All non-pointer references to other structures must be defined before they are used.
You cannot use the enum keyword in a structure. In order to specify enum in a structure, use the typedef keyword.
Structure elements with the same alphanumeric name but with different cases (for example, ALPHA, Alpha, and alpha) are not supported. SAS is not case-sensitive. Therefore, all structure element names must be unique when compared without regard to case.
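Two of the workarounds above, declaring a union as the one element that is used and replacing an unsupported element with a placeholder, can be sketched in C. Both structures and the function layouts_match are hypothetical. The choice of char * as the placeholder is an assumption that relies on all data pointers having the same size and alignment, which holds on mainstream platforms but is not guaranteed by the C standard.

```c
#include <stddef.h>

/* Layout as a C library might declare it. */
struct native_rec {
    int id;
    union { double amount; int count; } value;
    double ***grid;              /* three levels of indirection */
};

/* Layout declared within the limitations: the union is replaced by the
   one member that is used, and the triple pointer is replaced by a
   pointer-sized placeholder that is never accessed. */
struct proto_rec {
    int id;
    double value;
    char *grid_placeholder;
};

/* Returns 1 when both layouts have the same size and place the
   corresponding entries at the same offsets. */
int layouts_match(void)
{
    return sizeof(struct native_rec) == sizeof(struct proto_rec)
        && offsetof(struct native_rec, value) == offsetof(struct proto_rec, value)
        && offsetof(struct native_rec, grid) == offsetof(struct proto_rec, grid_placeholder);
}
```

A check of this kind is worth running with the same compiler that built the external library, because the replacement declaration is usable only if the entries stay at the same offsets.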
Copyright © 2010 by SAS Institute Inc., Cary, NC, USA. All rights reserved.