The TCP/IP Protocol Suite

Overview of TCP/IP

Networking has become a fundamental feature of most computer applications. The Transmission Control Protocol/Internet Protocol (TCP/IP), the protocol suite used by the Defense Advanced Research Projects Agency (DARPA) Internet, is one of the most commonly used network protocols. The DARPA Internet is a collection of networks and gateways that function as a single network. The Internet extends all over the globe and consists of over 100,000 computers.

Finding a way to connect existing computer networks of diverse types was a primary goal in the design of TCP/IP. Achieving this goal has made TCP/IP particularly successful in the networking of computers that run different operating systems and that are manufactured by different vendors. TCP/IP is also designed to accommodate a very large number of host computers and local networks. A site or organization that uses TCP/IP need not be connected to the Internet, but most sites and organizations are connected.

The open, nonproprietary nature of TCP/IP and its global scope have made it popular among users of the UNIX operating system. Standards for writing communications programs in C have also become widespread. The two most common standards are the BSD UNIX Socket Library interface and the UNIX System V Transport Layer Interface (TLI).

The SAS/C Library currently implements the BSD UNIX Socket Library interface because it is somewhat more common than TLI, and it has better support from the underlying communications software on MVS and CMS systems. The socket library is integrated with SAS/C support for UNIX file I/O to provide the same type of integration between file and network I/O that is available on BSD UNIX systems.

TCP/IP is now a base for higher level protocols that support many popular networking applications. Some of these protocols are:

TELNET
a remote terminal connection service that supports remote login.
FTP
a File Transfer Protocol that transfers files from one machine to another.
X11
a graphical user interface that can operate in a network environment. This protocol is not limited to TCP/IP.
NFS
a Network File System that allows cooperating computers to access one another's file systems as though they were local. This protocol is not limited to TCP/IP.
The multivendor capabilities of these protocols and the applications that use them have made them particularly successful.

Internet Protocol (IP)

The Internet Protocol (IP) integrates different physical or proprietary networks into a unified logical network known as the Internet.

An IP address is a 32-bit number that specifies both the number for the individual physical network and the number for a given host computer on that network. The term host computer can apply to any end-user computer system that connects to a network. The size of a host can range from an X-terminal or a PC to a large mainframe. Among all the organizations connected to the Internet, the address of each host computer is unique. The address is often written in dotted decimal notation. Dotted decimal notation is the decimal value of each byte (often referred to as an octet in the literature on TCP/IP) separated by a period. For example,

      192.22.31.05

 is the dotted decimal notation for a machine whose 32-bit address is

      0xC0161F05
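
The conversion from dotted decimal notation to a 32-bit value can be sketched in C as follows. The helper names pack_ipv4 and parse_dotted are hypothetical and are used only for illustration; BSD-style socket libraries normally supply their own conversion routines, such as inet_addr.

```c
#include <stdint.h>
#include <stdio.h>

/* Pack four octets, most significant first, into one 32-bit value. */
uint32_t pack_ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)(a & 0xFFu) << 24) |
           ((uint32_t)(b & 0xFFu) << 16) |
           ((uint32_t)(c & 0xFFu) << 8)  |
            (uint32_t)(d & 0xFFu);
}

/* Parse "a.b.c.d", reading each octet as a decimal number.
 * Returns 0 on success, or -1 if the text is malformed, has trailing
 * characters, or contains an octet larger than 255. */
int parse_dotted(const char *text, uint32_t *out)
{
    unsigned a, b, c, d;
    char extra;

    /* The trailing %c catches garbage after the fourth octet. */
    if (sscanf(text, "%u.%u.%u.%u%c", &a, &b, &c, &d, &extra) != 4)
        return -1;
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return -1;
    *out = pack_ipv4(a, b, c, d);
    return 0;
}
```

Each octet occupies one byte of the result: 192 is 0xC0, 22 is 0x16, 31 is 0x1F, and 5 is 0x05.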
 
At the IP protocol layer, host computers cannot be referenced by name. Refer to Domain Name System (DNS) for an explanation of name referencing for host computers. All network communication uses IP addresses. The IP layer routes packets of data to their destinations, which may be many physical hops away from the source of the message. A physical hop is a gateway through which the data must pass. The IP layer does not guarantee that a packet will reach its destination, nor does it provide error checking for the data. The IP layer does not provide flow control or any lasting association (connection) between sender and receiver. A higher layer of the protocol, such as the User Datagram Protocol (UDP) or TCP, must provide all these services.

User Datagram Protocol (UDP)

The User Datagram Protocol (UDP) provides the lowest level of service that can be used conveniently by network application programs. UDP is most often used when applications implement their own networking protocol and thus require little intervention from the TCP/IP software.

In addition to the communication capabilities of IP, UDP adds checksums for the application data and protocol ports to help distinguish among the different processes that are communicating between sending and receiving machines. A checksum detects errors in the transfer of a packet from one machine to another. A protocol port is an abstraction used to distinguish between multiple destinations within a single host.
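
The arithmetic behind the checksum used by the Internet protocols is a 16-bit ones' complement sum. The sketch below shows only that arithmetic; the function name internet_checksum is hypothetical, and the checksum that UDP actually transmits is computed over additional header fields that are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/* Compute the 16-bit ones' complement checksum of a buffer.
 * The 32-bit accumulator is adequate for datagram-sized buffers
 * (up to 64K bytes); larger inputs would need periodic folding. */
uint16_t internet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    /* Add the data as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is treated as the high byte of a final word. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can run the same routine over the data followed by the transmitted checksum. A result of zero indicates that no error was detected.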

A datagram is a basic unit of information transferred across a network. UDP does not guarantee that datagrams reach their destination, nor does it ensure that the datagrams are received in the order in which they are sent. Because UDP does not use connections or sessions, it is called a connectionless protocol.

UDP ports are two-byte integers that specify a particular service or program within a host computer. For example, port 13 is generally used by programs that query the date and time maintained by a particular host. A client-server relationship is usually defined in a UDP transaction. The server waits for messages (listens) at a predefined port. When it receives a datagram from a new client, the server knows where to respond because the datagram contains both the sender's IP address and its port number.
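
The way a server learns where to respond can be sketched with the BSD socket calls. The example below is an illustration under stated assumptions, not a production server: both ends run in one process on the loopback address, the system chooses the server's port instead of a predefined one, and the function name udp_echo_once is hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Send msg from a client socket to a server socket on the loopback
 * interface, have the server echo it to the sender's address, and
 * store the echo in reply. Returns the reply length, or -1 on failure. */
long udp_echo_once(const char *msg, char *reply, size_t reply_size)
{
    struct sockaddr_in srv, peer;
    socklen_t len;
    char buf[512];
    ssize_t n = -1;
    int server = socket(AF_INET, SOCK_DGRAM, 0);
    int client = socket(AF_INET, SOCK_DGRAM, 0);

    if (server < 0 || client < 0)
        goto done;

    /* Server: bind to the loopback address; port 0 asks for any free port. */
    memset(&srv, 0, sizeof srv);
    srv.sin_family = AF_INET;
    srv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    srv.sin_port = htons(0);
    if (bind(server, (struct sockaddr *)&srv, sizeof srv) < 0)
        goto done;
    len = sizeof srv;
    if (getsockname(server, (struct sockaddr *)&srv, &len) < 0)
        goto done;

    /* Client: no connection is set up; each datagram carries its destination. */
    if (sendto(client, msg, strlen(msg), 0,
               (struct sockaddr *)&srv, sizeof srv) < 0)
        goto done;

    /* Server: recvfrom() reports the sender's IP address and port in peer. */
    len = sizeof peer;
    n = recvfrom(server, buf, sizeof buf, 0, (struct sockaddr *)&peer, &len);
    if (n < 0)
        goto done;
    if (sendto(server, buf, (size_t)n, 0, (struct sockaddr *)&peer, len) < 0) {
        n = -1;
        goto done;
    }

    /* Client: collect the echoed datagram. */
    n = recvfrom(client, reply, reply_size, 0, NULL, NULL);

done:
    if (server >= 0)
        close(server);
    if (client >= 0)
        close(client);
    return (long)n;
}
```

Because UDP does not guarantee delivery, a real client would add a timeout and retry logic instead of blocking indefinitely in the final recvfrom() call.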

Transmission Control Protocol (TCP)

The Transmission Control Protocol (TCP) provides a higher level of service. TCP establishes connections between the sender and receiver by using IP addresses and port numbers. It then simulates a bidirectional, continuous communications service. TCP provides reliability of transmission, division of data into packets (without the knowledge of the user program), ordered transmission of packets, simultaneous transfer in both directions, and buffering of data. The format of TCP data is analogous to that of a file in the UNIX operating system. Data are sent or received in a continuous stream of bytes. The data have no record (message) boundaries or any other structure except that agreed to by the connected applications.
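
Because the stream carries no record boundaries, connected applications that exchange discrete messages must agree on their own framing. One common convention, shown here only as an assumption for illustration, is to prefix each message with a two-byte length. The function names are hypothetical, and the sketch does not retry calls that are interrupted by a signal.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Call send() repeatedly until all len bytes are written. */
int send_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Call recv() repeatedly until exactly len bytes have arrived.
 * A single recv() may return fewer bytes than requested. */
int recv_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = recv(fd, p, len, 0);
        if (n <= 0)              /* error, or the peer closed the stream */
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send one message: a big-endian two-byte length, then the data. */
int send_msg(int fd, const void *msg, uint16_t len)
{
    unsigned char hdr[2];
    hdr[0] = (unsigned char)(len >> 8);
    hdr[1] = (unsigned char)(len & 0xFF);
    if (send_all(fd, hdr, sizeof hdr) < 0)
        return -1;
    return send_all(fd, msg, len);
}

/* Receive one message into buf. Returns its length, or -1 on error
 * or if the message does not fit in size bytes. */
long recv_msg(int fd, void *buf, size_t size)
{
    unsigned char hdr[2];
    size_t len;

    if (recv_all(fd, hdr, sizeof hdr) < 0)
        return -1;
    len = ((size_t)hdr[0] << 8) | hdr[1];
    if (len > size)
        return -1;
    if (recv_all(fd, buf, len) < 0)
        return -1;
    return (long)len;
}
```

With this convention, two messages sent back to back are recovered as two messages by the receiver, even though TCP itself may deliver their bytes in one piece or in several.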

Domain Name System (DNS)

The previously discussed protocols do not use any concept of a host name. These protocols always use 32-bit IP addresses to locate source and destination hosts. Of course, no one wants to specify a remote computer using an address such as 192.22.31.05. The Domain Name System (DNS) maps IP addresses to alphabetic names. One of the most important features of DNS is distributed management.

Each organization has the ability to control names within its own domain. Domains are arranged in a hierarchy. For example, the XYZ Company, Inc., may have names all ending in the following:

    .xyz.com
 
com
the final section of the name, is a higher-level domain used to group commercial organizations.
xyz
the second section of the name, is the designated name of the organization.
The names could be further divided into several groups. For example,
    abcvm.vm.xyz.com
 
might be the primary VM system at the XYZ Company, Inc. DNS enables you to use a File Transfer Program command such as
    ftp abcvm.vm.xyz.com
 
instead of
    ftp 123.45.67.89
 
when transferring a file to this VM system.
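
The hierarchy means that a name belongs to a domain when the domain matches the end of the name on a section boundary. The following sketch tests that relationship. The function name in_domain is hypothetical, and the comparison ignores case on the assumption that domain names are not case sensitive.

```c
#include <string.h>
#include <strings.h>

/* Return 1 if host lies within domain, 0 otherwise.
 * "abcvm.vm.xyz.com" is within "xyz.com" and within "vm.xyz.com",
 * but "abcxyz.com" is not within "xyz.com". */
int in_domain(const char *host, const char *domain)
{
    size_t hl = strlen(host);
    size_t dl = strlen(domain);

    if (dl == 0 || hl < dl)
        return 0;
    /* The domain must match the tail of the host name... */
    if (strcasecmp(host + (hl - dl), domain) != 0)
        return 0;
    /* ...and the match must begin at the start or just after a period. */
    return hl == dl || host[hl - dl - 1] == '.';
}
```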

Although it is possible to keep the mapping of host addresses to host names in a file (for example, /etc/hosts on UNIX), DNS is more versatile than a file-based system. A file that contains the mapped names and addresses must be replicated on every host, cannot hold the mappings for all computers on a system as large as the Internet, and cannot be updated on a real-time basis.

DNS uses server processes called name servers to stay current with the names assigned within a particular domain. The network administrator provides the name servers with configuration files. Each configuration file contains the mapping for the domain that it controls. Name servers in a particular domain can refer to the addresses of name servers for higher- and lower-level domains if the configuration files that they control do not contain a particular name or address.

Name servers typically run on only a few machines in an organization. Programs can use a set of routines, known as the resolver, to query their organization's name server. The resolver routines are associated with the application and provide all the message formatting and TCP or UDP communications logic necessary to talk to their organization's name server.
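
From the application's point of view, a resolver call takes a host name and returns one or more addresses. The sketch below uses the POSIX getaddrinfo routine as an assumption; BSD-style libraries of the period covered by this document more typically offered gethostbyname, and the wrapper name resolve_ipv4 is hypothetical.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Look up name and write its first IPv4 address, in dotted decimal
 * notation, into out. Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *name, char *out, size_t out_size)
{
    struct addrinfo hints, *res;
    const struct sockaddr_in *sin;
    int ok;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_INET;        /* 32-bit IP addresses only */
    hints.ai_socktype = SOCK_STREAM;  /* avoid one entry per socket type */

    /* The resolver formats the query and handles the exchange with
     * the name server (or consults local sources such as a hosts file). */
    if (getaddrinfo(name, NULL, &hints, &res) != 0)
        return -1;

    sin = (const struct sockaddr_in *)res->ai_addr;
    ok = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)out_size) != NULL;
    freeaddrinfo(res);
    return ok ? 0 : -1;
}
```

A program would call resolve_ipv4 with a host name and a buffer of at least INET_ADDRSTRLEN bytes, then use the resulting address when opening a connection.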

DNS is general enough to allow distributed management of other types of information, such as mailbox locations, and it does not require any correspondence between domains and IP addresses or physical network connections.

MVS and CMS TCP/IP Implementation

The SAS/C Socket Library has an open architecture that permits using TCP/IP products from different vendors. If a vendor provides the appropriate SAS/C transient library module, existing socket programs can communicate using the TCP/IP implementation specified during configuration of the system. A program compiled and linked with the SAS/C Socket Library at one site can be distributed to sites that are running different TCP/IP implementations. In addition, any site can change TCP/IP vendors without recompiling or relinking its existing SAS/C applications.

With Release 6.00, the SAS/C Library supports both integrated and non-integrated sockets. With integrated sockets, TCP/IP sockets are integrated with OpenEdition support rather than provided only by the run-time library as a direct interface to the TCP/IP software.

With non-integrated sockets, the SAS/C Socket Library relies on an underlying layer of TCP/IP communications software, such as IBM TCP/IP Version 2, or higher, for VM and MVS. TCP/IP communications software handles the actual communications. The SAS/C Library adds a higher level of UNIX compatibility, as well as integration with the SAS/C run-time environment.


Copyright (c) 1998 SAS Institute Inc. Cary, NC, USA. All rights reserved.