Publishing SAS Formats in Netezza

Overview of the Publishing Process

The SAS publishing macros are used to publish formats and the SAS_PUT( ) function in Netezza.
The %INDNZ_PUBLISH_FORMATS macro creates the files that are needed to build the SAS_PUT( ) function and publishes those files to the Netezza data warehouse.
This macro also makes many formats that SAS supplies available inside Netezza. In addition to formats that SAS supplies, you can also publish the PROC FORMAT definitions that are contained in a single SAS format catalog by using the FMTCAT= option. The process of publishing a PROC FORMAT catalog entry converts the range label pairs into embedded data in Netezza.
The %INDNZ_PUBLISH_FORMATS macro performs the following tasks:
  • produces the set of .c, .cpp, and .h files that are necessary to build the SAS_PUT( ) function
  • produces a script of the Netezza commands that are necessary to register the SAS_PUT( ) function on the Netezza data warehouse
  • transfers the .c, .cpp, and .h files to Netezza using the Netezza External Table interface
  • calls the SAS_COMPILEUDF function to compile the source files into object files and to access the SAS Formats Library for Netezza
  • uses SAS/ACCESS Interface to Netezza to run the script to create the SAS_PUT( ) function with the object files

Running the %INDNZ_PUBLISH_FORMATS Macro

To run the %INDNZ_PUBLISH_FORMATS macro, complete the following steps:
  1. Start SAS 9.3 and submit these commands in the Program or Enhanced Editor:
    %indn;
    %let indconn = server=myserver user=myuserid password=XXXX
       database=mydb <serveruserid=myserveruserid>;
    For more information, see %INDNZPF Macro and the INDCONN Macro Variable.
  2. Run the %INDNZ_PUBLISH_FORMATS macro.
    For more information, see %INDNZ_PUBLISH_FORMATS Macro Syntax.
    Messages are written to the SAS log that indicate whether the SAS_PUT( ) function was successfully created.

%INDNZPF Macro

The %INDNZPF macro is an autocall library that initializes the format publishing software.

INDCONN Macro Variable

The INDCONN macro variable is used as credentials to connect to Netezza. You must specify the server, user, password, and database information to access the machine on which you have installed the Netezza data warehouse. You must assign the INDCONN macro variable before the %INDNZ_PUBLISH_FORMATS macro is invoked.
Here is the syntax for the value of the INDCONN macro variable:
SERVER=server USER=userid PASSWORD=password DATABASE=database
Arguments

SERVER=server

specifies the Netezza server name or the IP address of the server host.

USER=user

specifies the Netezza user name (also called the user ID) that is used to connect to the database.

PASSWORD=password

specifies the password that is associated with your Netezza user ID.

Tip Use only PASSWORD=, PASS=, or PW= for the password argument. PWD= is not supported and causes an error to occur.

DATABASE=database

specifies the Netezza database that contains the tables and views that you want to access.

Tip
The INDCONN macro variable is not passed as an argument to the %INDNZ_PUBLISH_FORMATS macro. This information can be concealed in your SAS job. You might want to place it in an autoexec file and set the permissions on the file so that others cannot access the user ID and password.

%INDNZ_PUBLISH_FORMATS Macro Syntax

%INDNZ_PUBLISH_FORMATS
(<DATABASE=database-name>
<. DBCOMPILE=database-name>
<, DBJAZLIB=database-name>
<, FMTCAT=format-catalog-filename>
<, FMTTABLE=format-table-name>
<, ACTION=CREATE | REPLACE | DROP>
<, MODE=FENCED | UNFENCED>
<, OUTDIR=diagnostic-output-directory>
);
Arguments

DATABASE=database-name

specifies the name of a Netezza database to which the SAS_PUT( ) function and the formats are published. This argument lets you publish the SAS_PUT( ) function and the formats to a shared database where other users can access them.

Interaction The database that is specified by the DATABASE= argument takes precedence over the database that you specify in the INDCONN macro variable. For more information, see Running the %INDNZ_PUBLISH_FORMATS Macro.
Tip It is not necessary that the format definitions and the SAS_PUT( ) function reside in the same database as the one that contains the data that you want to format. You can use the SQLMAPPUTO= system option to specify the database where the format definitions and the SAS_PUT( ) function have been published.

DBCOMPILE=database-name

specifies the name of the database where the SAS_COMPILEUDF function was published.

Default SASLIB
See For more information about the publishing the SAS_COMPILEUDF function, see the SAS In-Database Products: Administrator's Guide.

DBJAZLIB=database-name

specifies the name of the database where the SAS Formats Library for Netezza was published.

Default SASLIB
Restriction This argument is supported only on TwinFin systems.
See For more information about publishing the SAS Formats Library for Netezza, see the SAS In-Database Products: Administrator's Guide.

FMTCAT=format-catalog-filename

specifies the name of the format catalog file that contains all user-defined formats that were created with the FORMAT procedure and are made available in Netezza.

Default If you do not specify a value for FMTCAT= and you have created user-defined formats in your SAS session, the default is WORK.FORMATS. If you do not specify a value for FMTCAT= and you have not created any user-defined formats in your SAS session, only the formats that SAS supplies are available in Netezza.
Interaction If the format definitions that you want to publish exist in multiple catalogs, you must copy them into a single catalog for publishing.
See Netezza Considerations and Limitations When Using the FMTCAT= Options

FMTTABLE=format-table-name

specifies the name of the Netezza table that contains all formats that the %INDNZ_PUBLISH_FORMATS macro creates and that the SAS_PUT( ) function supports. The format table contains the columns shown in the following table.

Format Table Columns
Column Name
Description
FMTNAME
specifies the name of the format.
SOURCE
specifies the origin of the format. SOURCE can contain one of these values:
SAS
supplied by SAS
PROCFMT
User-defined with PROC FORMAT
Default If FMTTABLE is not specified, no table is created. You can see only the SAS_PUT( ) function. You cannot see the formats that are published by the macro.
Interaction If ACTION=CREATE or ACTION=DROP is specified, messages are written to the SAS log that indicate the success or failure of the table creation or drop.

ACTION=CREATE | REPLACE | DROP

specifies that the macro performs one of these actions:

CREATE

creates a new SAS_PUT( ) function.

REPLACE

overwrites the current SAS_PUT( ) function, if a SAS_PUT( ) function is already registered or creates a new SAS_PUT( ) function if one is not registered.

DROP

causes the SAS_PUT( ) function to be dropped from the Netezza database.

Interaction If FMTTABLE= is specified, both the SAS_PUT( ) function and the format table are dropped. If the table name cannot be found or is incorrect, only the SAS_PUT( ) function is dropped.

Default CREATE
Tip If the SAS_PUT( ) function was published previously and you specify ACTION=CREATE, you receive warning messages that the function already exists and you are prompted to use REPLACE. If you specify ACTION=DROP and the function does not exist, an error message is issued.

MODE= FENCED | UNFENCED

specifies whether running the code is isolated in a separate process in the Netezza database so that a program fault does not cause the database to stop.

Default FENCED
Restriction The MODE= argument is supported for Netezza 6.0. The MODE argument is ignored for previous versions of Netezza.
Tip There are limited resources available in Netezza when you run in fenced mode. For example, there is a limit to the number of columns available.
See Modes of Operation

OUTDIR=diagnostic-output-directory

specifies a directory that contains diagnostic files.

Files that are produced include an event log that contains detailed information about the success or failure of the publishing process.
See Special Characters in Directory Names

Modes of Operation

The %INDNZ_PUBLISH_FORMATS macro has two modes of operation: fenced and unfenced. You specify the mode by setting the MODE= argument.
The default mode of operation is fenced. Fenced mode means that the format that is published is isolated in a separate process in the Netezza database when it is invoked. An error does not cause the database to stop. It is recommended that you publish the format in fenced mode during acceptance tests.
When the format is ready for production, you can run the macro to publish the format in unfenced mode. You could see a performance advantage if the format is published in unfenced mode.
Note: The MODE= argument is supported for Netezza 6.0. The MODE argument is ignored for previous versions of Netezza.

Format Publishing Macro Example

%indnzpf;
%let indconn = server=netezbase user=user1 password=xxxx
database=mydb;
%indnz_publish_formats(fmtcat= fmtlib.fmtcat);
This sequence of macros generates .c, .cpp, and .h files for each data type. The format data types that are supported are numeric (FLOAT, INT), character, date, time, and timestamp (DATETIME). The %INDNZ_PUBLISH_FORMATS macro also produces a text file of Netezza CREATE FUNCTION commands that are similar to these:
CREATE FUNCTION sas_put(float , varchar(256))
RETURNS VARCHAR(256)
LANGUAGE CPP
PARAMETER STYLE npsgeneric
CALLED ON NULL INPUT
EXTERNAL CLASS NAME 'Csas_putn'
EXTERNAL HOST OBJECT '/tmp/tempdir_20090528T135753_616784/formal5.o_x86'
EXTERNAL NSPU OBJECT '/tmp/tempdir_20090528T135753_616784/formal5.o_diab_ppc'
After it is installed, you can call the SAS_PUT( ) function in Netezza by using SQL. For more information, see Using the SAS_PUT( ) Function in the Netezza Data Warehouse .