Byte Ordering for Integer Binary Data on Big Endian and Little Endian Platforms

Definitions

Integer binary data is typically stored in one of three sizes: one byte, two bytes, or four bytes. The order of the bytes within an integer depends on the platform (operating environment) on which the integer was produced.
Byte order differs between “big endian” and “little endian” platforms: a big endian platform stores the most significant byte first, and a little endian platform stores the least significant byte first. These colloquial terms describe byte ordering for IBM mainframes (big endian) and for Intel-based platforms (little endian). In SAS, the following platforms are considered big endian: IBM mainframe, HP-UX, AIX, Solaris on SPARC, and Macintosh. In SAS, the following platforms are considered little endian: Intel ABI, Linux, OpenVMS Alpha, OpenVMS Integrity, Solaris on x64, Tru64 UNIX, and Windows.
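As a quick way to see which category a given machine falls into, the following Python sketch reports the native byte order. Python is used here only for illustration and is not part of SAS; the helper name native_byte_order is hypothetical.

```python
import sys


def native_byte_order() -> str:
    """Return 'big' or 'little' for the machine running this script."""
    return sys.byteorder


# Cross-check: store the value 1 in two bytes using the native order
# and look at which end the 01 byte lands on.
first_byte = (1).to_bytes(2, sys.byteorder)[0]
inferred = "little" if first_byte == 1 else "big"

print(native_byte_order(), inferred)
```

Both printed values should agree, because a little endian platform places the least significant byte (01) first.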

How the Bytes Are Ordered

The examples in this section show stored binary values in hexadecimal notation. On big endian platforms, the value 1 is stored in one byte as 01, in two bytes as 00 01, and in four bytes as 00 00 00 01. On little endian platforms, the value 1 is stored in one byte as 01 (the same as big endian), in two bytes as 01 00, and in four bytes as 01 00 00 00.
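The layouts described above can be reproduced with a short Python sketch. It is illustrative only: Python is not part of SAS, and the helper name hex_bytes is an assumption made for this example.

```python
def hex_bytes(value: int, width: int, byteorder: str) -> str:
    """Return the stored bytes of a non-negative integer as spaced hex."""
    return value.to_bytes(width, byteorder).hex(" ").upper()


# Show the value 1 in one, two, and four bytes for both byte orders.
for width in (1, 2, 4):
    big = hex_bytes(1, width, "big")
    little = hex_bytes(1, width, "little")
    print(f"{width} byte(s): big endian {big} | little endian {little}")

# 1 byte(s): big endian 01 | little endian 01
# 2 byte(s): big endian 00 01 | little endian 01 00
# 4 byte(s): big endian 00 00 00 01 | little endian 01 00 00 00
```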
If an integer is negative, the “two's complement” representation is used, so the high-order bit of the most significant byte of the integer is set. For example, –2 is represented in one, two, and four bytes on big endian platforms as FE, FF FE, and FF FF FF FE, respectively. On little endian platforms, the representation is FE, FE FF, and FE FF FF FF. These are the bytes of the integer binary value –2 shown in hexadecimal notation.
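The two's complement layouts for –2 can be checked with a minimal Python sketch. The function names twos_complement_hex and sign_bit_set are hypothetical and exist only for this illustration.

```python
def twos_complement_hex(value: int, width: int, byteorder: str) -> str:
    """Return the two's complement bytes of a signed integer as spaced hex."""
    return value.to_bytes(width, byteorder, signed=True).hex(" ").upper()


def sign_bit_set(raw: bytes, byteorder: str) -> bool:
    """Check the high-order bit of the most significant byte."""
    most_significant = raw[0] if byteorder == "big" else raw[-1]
    return bool(most_significant & 0x80)


for width in (1, 2, 4):
    print(width,
          twos_complement_hex(-2, width, "big"), "|",
          twos_complement_hex(-2, width, "little"))

# 1 FE | FE
# 2 FF FE | FE FF
# 4 FF FF FF FE | FE FF FF FF
```

Note that the most significant byte is the first byte on a big endian platform and the last byte on a little endian platform, which is why sign_bit_set needs to know the byte order.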

Reading Data Generated on Big Endian or Little Endian Platforms

SAS can read signed and unsigned integers regardless of whether they were generated on a big endian or a little endian system. Likewise, SAS can write signed and unsigned integers in both big endian and little endian format. The length of these integers can be up to eight bytes.
The following table shows which informat to use for various combinations of platforms. In the Signed Integer column, “no” indicates that the number is unsigned and cannot be negative. “Yes” indicates that the number can be either negative or positive.
Platform for Which the Data Was Created  Platform the Data Is Read On  Signed Integer  Informat
---------------------------------------  ----------------------------  --------------  -----------------------
big endian                               big endian                    yes             IB or S370FIB
big endian                               big endian                    no              PIB, S370FPIB, S370FIBU
big endian                               little endian                 yes             S370FIB
big endian                               little endian                 no              S370FPIB
little endian                            big endian                    yes             IBR
little endian                            big endian                    no              PIBR
little endian                            little endian                 yes             IB or IBR
little endian                            little endian                 no              PIB or PIBR
big endian                               either                        yes             S370FIB
big endian                               either                        no              S370FPIB
little endian                            either                        yes             IBR
little endian                            either                        no              PIBR
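The idea behind the informat choices can be sketched outside SAS. The following Python example is an approximation under stated assumptions, not a SAS implementation: it assumes that IB and PIB use the native byte order of the reading platform, that IBR and PIBR always read little endian data, and that S370FIB and S370FPIB always read big endian data. It ignores the optional decimal scaling of the informats. The names INFORMAT_RULES and read_integer are hypothetical.

```python
import sys

# Assumed mapping: informat family -> (byte order, signed?).
# This mirrors the table only approximately and is not SAS code.
INFORMAT_RULES = {
    "IB": (sys.byteorder, True),     # native order, signed
    "PIB": (sys.byteorder, False),   # native order, unsigned
    "IBR": ("little", True),
    "PIBR": ("little", False),
    "S370FIB": ("big", True),
    "S370FPIB": ("big", False),
}


def read_integer(raw: bytes, informat: str) -> int:
    """Decode raw bytes the way the named informat family is assumed to."""
    byteorder, signed = INFORMAT_RULES[informat]
    return int.from_bytes(raw, byteorder, signed=signed)


print(read_integer(bytes.fromhex("FFFE"), "S370FIB"))   # -2
print(read_integer(bytes.fromhex("FEFF"), "IBR"))       # -2
print(read_integer(bytes.fromhex("FFFE"), "S370FPIB"))  # 65534
print(read_integer(bytes.fromhex("FEFF"), "PIBR"))      # 65534
```

The same two bytes decode to –2 or 65534 depending only on whether they are treated as signed, which is the distinction the Signed Integer column captures.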

Integer Binary Notation in Different Programming Languages

The following table compares integer binary notation according to programming language.
Language          2 Bytes or 16-Bit Systems  4 Bytes or 32-Bit Systems  8 Bytes or 64-Bit Systems
----------------  -------------------------  -------------------------  -------------------------
SAS               IB2., IBR2.,               IB4., IBR4.,               IB8., IBR8.,
                  PIB2., PIBR2.,             PIB4., PIBR4.,             PIB8., PIBR8.,
                  S370FIB2., S370FIBU2.,     S370FIB4., S370FIBU4.,     S370FIB8., S370FIBU8.,
                  S370FPIB2.                 S370FPIB4.                 S370FPIB8.
C                 short                      int                        long [1]
Java              short                      int                        long [1]
Visual Basic 6.0  short                      long*                      none
Visual Basic.NET  short                      integer                    long [1]
PL/I              fixed bin(15)              fixed bin(31)              fixed bin(63)
Fortran           integer*2                  integer*4                  integer*8
COBOL             comp pic 9(4)              comp pic 9(8)              comp pic 9(16)
IBM assembler     H                          F                          FD

[1] The size of integers declared as long depends on the operating environment.
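For comparison, Python's standard struct module offers a similar notation, shown here as an illustrative sketch rather than as part of the SAS documentation. With a "<" (little endian) or ">" (big endian) prefix, the codes h, i, and q have fixed sizes of 2, 4, and 8 bytes. In native mode ("@"), the code l follows the C long of the platform, so its size varies. The names FIXED_WIDTH_CODES and pack_signed are hypothetical.

```python
import struct

# Standard-size codes: fixed widths on every platform when prefixed
# with '<' (little endian) or '>' (big endian).
FIXED_WIDTH_CODES = {2: "h", 4: "i", 8: "q"}


def pack_signed(value: int, width: int, big_endian: bool) -> bytes:
    """Pack a signed integer into `width` bytes with an explicit byte order."""
    prefix = ">" if big_endian else "<"
    return struct.pack(prefix + FIXED_WIDTH_CODES[width], value)


for width, code in FIXED_WIDTH_CODES.items():
    print(code, struct.calcsize("<" + code))

# Native mode follows the C compiler of the platform, so the size of
# 'l' (C long) is not the same everywhere.
print("native long:", struct.calcsize("@l"))

print(pack_signed(-2, 2, big_endian=True).hex(" ").upper())   # FF FE
print(pack_signed(-2, 2, big_endian=False).hex(" ").upper())  # FE FF
```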