Data Types for Greenplum

Overview

Every column in a table has a name and a data type. The data type tells Greenplum how much physical storage to set aside for the column and the form in which the data is stored. This section includes information about Greenplum data types, null and default values, and data conversions.
For more information about Greenplum data types and to determine which data types are available for your version of Greenplum, see the Greenplum Database Administrator Guide.
SAS/ACCESS Interface to Greenplum does not directly support any data types that are not listed below. Any columns using these types are read into SAS as character strings.

String Data

CHAR(n)
specifies a fixed-length column for character string data. The maximum length is 32,767 characters. If the length is greater than 254, the column is a long-string column. SQL imposes some restrictions on referencing long-string columns. For more information about these restrictions, see the Greenplum Database Administrator Guide.
VARCHAR(n)
specifies a varying-length column for character string data. The maximum length is 32,767 characters. If the length is greater than 254, the column is a long-string column. SQL imposes some restrictions on referencing long-string columns. For more information about these restrictions, see the Greenplum Database Administrator Guide.
LONG VARCHAR(n)
specifies a varying-length column for character string data. The maximum size is limited by the maximum size of the database file. To determine the maximum size of your database, see the Greenplum Database Administrator Guide.

Numeric Data

BIGINT
specifies a big integer. Values in a column of this type can range from –9,223,372,036,854,775,808 to +9,223,372,036,854,775,807 .
SMALLINT
specifies a small integer. Values in a column of this type can range from –32768 to +32767.
INTEGER
specifies a large integer. Values in a column of this type can range from –2147483648 to +2147483647.
BIT
specifies a Boolean type. Values in a column of this type can be either 0 or 1. Inserting any nonzero value into a BIT column stores a 1 in the column.
DOUBLE | DOUBLE PRECISION
specifies a floating-point number that is 64 bits long. Values in a column of this type can range from -1.79769E+308 to –2.225E-307 or +2.225E-307 to +1.79769E+308, or they can be 0. This data type is stored the same way that SAS stores its numeric data type. Therefore, numeric columns of this type require the least processing when SAS accesses them.
REAL
specifies a floating-point number that is 32 bits long. Values in a column of this type can range from approximately -3.4E38 to –1.17E-38 and +1.17E-38 to +3.4E38.
FLOAT
specifies a floating-point number. If you do not supply the precision, the FLOAT data type is the same as the REAL data type. If you supply the precision, the FLOAT data type is the same as the REAL or DOUBLE data type, depending on the value of the precision. The cutoff between REAL and DOUBLE is platform-dependent. It is the number of bits that are used in the mantissa of the single-precision floating-point number on the platform.
DECIMAL | DEC | NUMERIC
specifies a fixed-point decimal number. The precision and scale of the number determines the position of the decimal point. The numbers to the right of the decimal point are the scale, and the scale cannot be negative or greater than the precision. The maximum precision is 126 digits.

Date, Time, and Timestamp Data

SQL date and time data types are collectively called datetime values. The SQL data types for dates, times, and timestamps are listed here. Be aware that columns of these data types can contain data values that are out of range for SAS.
CAUTION:
The following data types can contain data values that are out of range for SAS.
DATE
specifies date values. The range is 01-01-0001 to 12-31-9999. The default format is YYYY-MM-DD. An example is 1961-06-13.
TIME
specifies time values in hours, minutes, and seconds to six decimal positions: hh:mm:ss[.nnnnnn]. The range is 00:00:00.000000 to 23:59:59.999999. Due to the ODBC-style interface that SAS/ACCESS Interface to Greenplum uses to communicate with the server, fractional seconds are lost in the data transfer from server to client.
TIMESTAMP
combines a date and time in the default format of yyyy-mm-dd hh:mm:ss[.nnnnnn]. For example, a timestamp for precisely 2:25 p.m. on January 25, 1991, would be 1991-01-25-14.25.00.000000. Values in a column of this type have the same ranges and limitations as described for DATE and TIME.

Greenplum Null Values

Greenplum has a special value called NULL. A Greenplum NULL value means an absence of information and is analogous to a SAS missing value. When SAS/ACCESS reads a Greenplum NULL value, it interprets it as a SAS missing value. When loading SAS tables from Greenplum sources, SAS/ACCESS stores Greenplum NULL values as SAS missing values.
In Greenplum tables, NULL values are valid in all columns by default. There are two methods to define a column in a Greenplum table so that it requires data:
  • Using SQL, you specify a column as NOT NULL. This tells SQL to allow only a row to be added to a table if a value exists for the field. Rows that contain NULL values in that column are not added to the table.
  • Another approach is to assert NOT NULL DEFAULT. For more information, see the Greenplum Database Administrator Guide.
When creating Greenplum tables with SAS/ACCESS, you can use the DBNULL= data set option to specify the treatment of NULL values. For more information about how SAS handles NULL values, see Potential Result Set Differences When Processing Null Data .
Once you know whether a Greenplum column enables NULLs or the host system supplies a default value for a column that is defined as NOT NULL WITH DEFAULT, you can write selection criteria and enter values to update a table. Unless a column is defined as NOT NULL or NOT NULL DEFAULT, it enables NULL values.
To control how the DBMS handles SAS missing character values, use the NULLCHAR= and NULLCHARVAL= data set options.

LIBNAME Statement Data Conversions

The following table shows the default formats that SAS/ACCESS Interface to Greenplum assigns to SAS variables when using the LIBNAME statement.
These default formats are based on Greenplum column attributes.
LIBNAME Statement: Default SAS Formats for Greenplum Data Types
Greenplum Data Type
SAS Data Type
Default SAS Format
CHAR(n)1
character
$n.
VARCHAR(n)1
character
$n.
INTEGER
numeric
11.
SMALLINT
numeric
6.
TINYINT
numeric
4.
BIT
numeric
1.
BIGINT
numeric
20.
DECIMAL(p,s)
numeric
m.n
NUMERIC(p,s)
numeric
m.n
REAL
numeric
none
DOUBLE
numeric
none
TIME
numeric
TIME8.
DATE
numeric
DATE9.
TIMESTAMP
numeric
DATETIME25.6
1n in Greenplum data types is equivalent to w in SAS formats.
The next table shows the default Greenplum data types that SAS/ACCESS assigns to SAS variable formats during output operations when you use the LIBNAME statement.
LIBNAME Statement: Default Greenplum Data Types for SAS Variable Formats
SAS Variable Format
Greenplum Data Type
m.n
DECIMAL(p,s)
other numerics
DOUBLE
$n.
VARCHAR(n)1
datetime formats
TIMESTAMP
date formats
DATE
time formats
TIME
1n in Greenplum data types is equivalent to w in SAS formats.