
Reading Raw Data

Reading Binary Data


Definitions

binary data

is numeric data that is stored in binary form. Binary numbers have a base of two and are represented with the digits 0 and 1.

packed decimal data

are binary decimal numbers that are encoded by using each byte to represent two decimal digits. Packed decimal representation stores decimal data with exact precision; the fractional part of the number must be determined by using an informat or format because there is no separate mantissa and exponent.

zoned decimal data

are binary decimal numbers that are encoded so that each digit requires one byte of storage. The last byte contains the number's sign as well as the last digit. Zoned decimal data produces a printable representation.
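The two decimal encodings defined above can be illustrated outside SAS with a short Python sketch. It assumes the IBM mainframe conventions: for packed decimal, the final half-byte holds the sign; for zoned decimal, the upper half of the last byte holds the sign; hexadecimal D or B means negative. The function names `decode_packed` and `decode_zoned` and the parameter `d` (number of implied decimal places, mirroring the `d` in an informat such as PDw.d) are hypothetical and chosen only for this example.

```python
from decimal import Decimal

# Assumed IBM sign codes: 0xB and 0xD mean negative; anything else is non-negative.
NEGATIVE_SIGNS = {0xB, 0xD}

def decode_packed(raw: bytes, d: int = 0) -> Decimal:
    """Decode packed decimal: two digits per byte, sign in the last half-byte."""
    nibbles = [n for byte in raw for n in (byte >> 4, byte & 0x0F)]
    sign_nibble = nibbles.pop()
    if any(n > 9 for n in nibbles):
        raise ValueError("non-decimal digit in packed field")
    magnitude = int("".join(str(n) for n in nibbles))
    sign = -1 if sign_nibble in NEGATIVE_SIGNS else 1
    # The data carries no decimal point; d supplies the scale.
    return Decimal(sign * magnitude).scaleb(-d)

def decode_zoned(raw: bytes, d: int = 0) -> Decimal:
    """Decode zoned decimal: one digit per byte, sign in the last byte's zone."""
    magnitude = int("".join(str(byte & 0x0F) for byte in raw))
    sign = -1 if (raw[-1] >> 4) in NEGATIVE_SIGNS else 1
    return Decimal(sign * magnitude).scaleb(-d)

packed_value = decode_packed(bytes.fromhex("12345C"), d=2)  # digits 12345, positive
zoned_value = decode_zoned(bytes.fromhex("F1F2D3"))         # digits 123, negative
# Unsigned zoned digits are ordinary EBCDIC digit characters, hence "printable".
printable = bytes.fromhex("F1F2F3").decode("cp037")
```

Three bytes of packed data hold five digits plus a sign, whereas zoned data needs one byte per digit. In both cases the position of the decimal point comes from the caller, which is the role that the informat plays in SAS.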


Using Binary Informats

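SAS can read binary data with the special instructions supplied by SAS informats. You can use formatted input and specify the informat in the INPUT statement. The informat that you choose depends on the type of number being read (binary, packed decimal, zoned decimal, or a variation of one of these), the system on which the data was created, and the system that you use to read the data.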

Different computer platforms store numeric binary data in different forms. In particular, the order in which the bytes of a number are stored differs between platforms, which are classified as either "big endian" or "little endian." For more information, see Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms in SAS Language Reference: Dictionary.
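As an illustration of the byte-ordering difference, the following Python sketch (not SAS code) writes the same integer in both orders and shows what happens when the bytes are read with the wrong assumption. The value 258 is arbitrary and was chosen because its bytes are easy to tell apart.

```python
import sys

value = 258  # hexadecimal 00 00 01 02 as a 4-byte integer

big = value.to_bytes(4, byteorder="big")        # most significant byte first
little = value.to_bytes(4, byteorder="little")  # least significant byte first

# Reading big-endian bytes as if they were little-endian gives a different number.
misread = int.from_bytes(big, byteorder="little")

print("host byte order:", sys.byteorder)
print("big endian:     ", big.hex())
print("little endian:  ", little.hex())
print("misread value:  ", misread)
```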

SAS provides a number of informats for reading binary data and corresponding formats for writing binary data. Some of these informats read data in native mode, that is, by using the byte-ordering system that is standard for the system on which SAS is running. Other informats force the data to be read by the IBM 370 standard, regardless of the native mode of the system on which SAS is running. The informats that read in native or IBM 370 mode are listed in the following table.

Informats for Native or IBM 370 Mode

Description                            Native Mode Informats   IBM 370 Mode Informats
-------------------------------------  ----------------------  ---------------------------
ASCII Character                        $w.                     $ASCIIw.
ASCII Numeric                          w.d                     $ASCIIw.
EBCDIC Character                       $w.                     $EBCDICw.
EBCDIC Numeric (Standard)              w.d                     S370FFw.d
Integer Binary                         IBw.d                   S370FIBw.d
Positive Integer Binary                PIBw.d                  S370FPIBw.d
Real Binary                            RBw.d                   S370FRBw.d
Unsigned Integer Binary                PIBw.d                  S370FIBUw.d, S370FPIBw.d
Packed Decimal                         PDw.d                   S370FPDw.d
Unsigned Packed Decimal                PKw.d                   S370FPDUw.d or PKw.d
Zoned Decimal                          ZDw.d                   S370FZDw.d
Zoned Decimal Leading Sign             S370FZDLw.d             S370FZDLw.d
Zoned Decimal Separate Leading Sign    S370FZDSw.d             S370FZDSw.d
Zoned Decimal Separate Trailing Sign   S370FZDTw.d             S370FZDTw.d
Unsigned Zoned Decimal                 ZDw.d                   S370FZDUw.d

If you write a SAS program that reads binary data and that will be run on only one type of system, you can use the native mode informats and formats. However, if you want to write SAS programs that can be run on multiple systems that use different byte-storage systems, use the IBM 370 informats. The IBM 370 informats enable you to write SAS programs that can read data in this format and that can be run in any SAS environment, regardless of the standard for storing numeric data.(footnote 1) The IBM 370 informats can also be used to read data originally written with the corresponding native mode formats on an IBM mainframe.
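The difference between reading in native mode and forcing a fixed byte order can be sketched in Python with the standard `struct` module. This is an analogy, not a description of how SAS implements its informats. The record layout (a 4-byte signed integer followed by a 2-byte unsigned integer) and the names `RECORD`, `portable`, and `native` are assumptions made for the example.

```python
import struct
import sys

# Hypothetical 6-byte record written on a big-endian system:
# a signed 4-byte integer (-2) followed by an unsigned 2-byte integer (258).
RECORD = bytes.fromhex("FFFFFFFE0102")

# ">" forces big-endian order, so the result is the same on every host.
portable = struct.unpack(">iH", RECORD)

# "=" uses the byte order of the host, so the result depends on where this runs.
native = struct.unpack("=iH", RECORD)

print("host byte order:", sys.byteorder)
print("portable read:  ", portable)
print("native read:    ", native)
```

On a big-endian host the two reads agree. On a little-endian host only the forced read recovers the original values, which is the situation that the IBM 370 informats are meant to handle.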

Note:   Whenever a text file originates outside the local encoding environment, it might be necessary to specify the ENCODING= option, whether SAS is running on an EBCDIC or an ASCII system.

When you read an EBCDIC text file on an ASCII platform, it is recommended that you specify the ENCODING= option in the FILENAME or INFILE statement. If you also use the DSD option and the DLM= or DLMSTR= option in the FILENAME or INFILE statement, the ENCODING= option is required, because these options expect certain characters (such as quotation marks, commas, and blanks) to be in the session encoding.

Reserve encoding-specific informats for true binary files that contain both character and non-character fields.
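The reason that delimiter handling depends on the encoding can be seen in a short Python sketch (again, not SAS code). It assumes that the file uses EBCDIC code page 037, available in Python as the `cp037` codec. The sample record and variable names are hypothetical.

```python
# "HELLO,WORLD" encoded in EBCDIC code page 037 (an assumed code page).
ebcdic_record = bytes.fromhex("C8C5D3D3D66BE6D6D9D3C4")

# Searching the raw bytes for an ASCII comma (0x2C) finds no delimiter,
# because the EBCDIC comma is 0x6B.
naive_fields = ebcdic_record.split(b",")

# Transcoding first puts the delimiter in the encoding the program expects.
text = ebcdic_record.decode("cp037")
fields = text.split(",")

print(len(naive_fields), "field(s) without transcoding")
print(fields)
```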

For complete descriptions of all SAS formats and informats, including how numeric binary data is written, see SAS Language Reference: Dictionary.


FOOTNOTE 1:   For example, using the IBM 370 informats, you could download data that contain binary integers from a mainframe to a PC and then use the S370FIB informats to read the data.
