An Introduction to TCP/IP

If you are involved with the networking industry, work on a networked computer, or even occasionally surf the Internet, you have probably heard of TCP/IP, but what exactly is it? Because it is used on the Internet and is bundled with the majority of UNIX operating systems, TCP/IP has become an extremely popular networking protocol. It connects millions of computers together almost seamlessly. The name TCP/IP designates a protocol suite that includes a number of protocols in addition to TCP and IP.

In this paper I will look at the history of TCP/IP and its roots at the Department of Defense. I will then look at TCP/IP addressing and at how packets are sent across a network. I will also explain the layering model of TCP/IP and compare it to the 'Open Systems Interconnect' standard set by the ISO. Finally, I will look into the possible limitations of TCP/IP and how next generation designs of TCP/IP are dealing with these issues.

History of TCP/IP

TCP and IP were developed by the Department of Defense (DOD) in an attempt to connect different networks designed by different vendors. As the story goes, the Army put out a bid on a computer and DEC won the bid. The Air Force put out a bid and IBM won. The Navy's bid was won by Unisys. Then the President decided to invade Grenada, and the Armed Forces discovered that their computers could not talk to each other. This placed the DOD under heavy fire from Washington, so the DOD began researching a network protocol that would build a network out of different systems, each of which, by law, was delivered by the lowest bidder on a single contract. Often referred to as the 'Network of the Lowest Bidder', TCP/IP finally allowed the separate networks of the Armed Forces to talk to one another over a network of networks.

TCP/IP was initially successful because it delivered a few basic services that everyone needed. Services like file transfer, electronic mail, and remote logon could now be operated across a very large number of client and server systems. Computers could now use TCP/IP, and other networking protocols, on a single LAN. The IP component provided routing from the single LAN, to the enterprise network, then to regional networks, and finally to the global Internet. Communication networks could sustain damage, so the DOD designed TCP/IP to be robust and automatically recover from any node or phone line failure. This design allows the construction of very large networks with less central management.

Addressing and Protocols

TCP/IP is a standardized communications protocol suite composed of several components:

IP - Responsible for moving packets of data from node to node. IP forwards each packet based on a four-byte destination address (the IP number). InterNIC assigns ranges of numbers to different organizations. The organizations assign groups of their numbers to departments. IP operates on gateway machines that move data from department to organization to region and then around the world.

TCP - Responsible for verifying the correct delivery of data from client to server. Data can be lost in the intermediate network. TCP adds support to detect errors or lost data and to trigger retransmission until the data is correctly and completely received.

Sockets - A package of subroutines that provides access to TCP/IP on most systems.
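To make the sockets entry concrete, here is a minimal sketch in Python, whose standard socket module wraps the same family of subroutines. It assumes a loopback interface is available; the helper names echo_once and round_trip are invented for this illustration and are not part of any standard. One thread accepts a TCP connection and echoes the bytes back, while the caller plays the client.

```python
import socket
import threading


def echo_once(server_sock):
    """Accept one connection and send back whatever bytes arrive."""
    conn, _addr = server_sock.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data)


def round_trip(message: bytes) -> bytes:
    """Send a short message to a local echo server over TCP and return the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        # Port 0 asks the operating system to pick any free port.
        server.bind(("127.0.0.1", 0))
        server.listen(1)
        port = server.getsockname()[1]

        worker = threading.Thread(target=echo_once, args=(server,))
        worker.start()

        # The client side: connect, send, and read the echoed bytes.
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            reply = client.recv(1024)

        worker.join()
    return reply


if __name__ == "__main__":
    print(round_trip(b"hello"))
```

The program never deals with sequence numbers or retransmission itself; TCP handles those beneath the socket calls.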

TCP/IP assigns a unique number to every workstation in the world, on top of any vendor specific networking protocols. This "IP number" is a four byte value that, by convention, is expressed by converting each byte into a decimal number (0 to 255) and separating the bytes with a period.
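The dotted notation described above can be sketched in a few lines of Python. The function names to_dotted and from_dotted are chosen for this illustration only.

```python
def to_dotted(raw: bytes) -> str:
    """Render a four-byte IP number in dotted decimal notation."""
    if len(raw) != 4:
        raise ValueError("an IP number is exactly four bytes")
    # Each byte becomes a decimal number from 0 to 255.
    return ".".join(str(b) for b in raw)


def from_dotted(text: str) -> bytes:
    """Parse dotted decimal notation back into four bytes."""
    parts = [int(p) for p in text.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("expected four numbers between 0 and 255")
    return bytes(parts)


if __name__ == "__main__":
    print(to_dotted(bytes([192, 35, 91, 7])))
    print(list(from_dotted("128.138.0.1")))
```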

An organization begins by sending electronic mail to Hostmaster@INTERNIC.NET requesting assignment of a network number. It is still possible for almost anyone to get assignment of a number for a small "Class C" network in which the first three bytes identify the network and the last byte identifies the individual computer. The author followed this procedure and was assigned the numbers 192.35.91.* for a network of computers at his house. Larger organizations, Internet Service Providers, and universities can get a "Class B" network where the first two bytes identify the network and the last two bytes identify each of up to 64 thousand individual workstations or servers.
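The split between network bytes and host bytes can be illustrated with a short Python sketch. It relies on the classful convention that the first byte selects the class (below 128 for Class A, 128 to 191 for Class B, 192 to 223 for Class C); the function name classful_split is invented for this example.

```python
def classful_split(address: str):
    """Return (class letter, network bytes, host bytes) for a classful address."""
    octets = [int(p) for p in address.split(".")]
    if len(octets) != 4:
        raise ValueError("expected four dot-separated numbers")
    first = octets[0]
    if first < 128:
        letter, network_len = "A", 1
    elif first < 192:
        letter, network_len = "B", 2
    elif first < 224:
        letter, network_len = "C", 3
    else:
        # Higher values are not split into network and host parts this way.
        raise ValueError("not a Class A, B, or C address")
    return letter, octets[:network_len], octets[network_len:]


if __name__ == "__main__":
    print(classful_split("192.35.91.10"))
    print(classful_split("128.138.1.2"))
```

With two host bytes, a Class B network has 2**16 = 65,536 possible host values, which is the "64 thousand" figure quoted above.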

After obtaining an address, you can connect to the Internet through regional or specialized network suppliers. When you subscribe to a network vendor, your network number is added to the routing configuration of the vendor's machines and those of the other major network suppliers. These routing configurations are commonly referred to as routing tables. The routing tables are dynamic and differ depending on the type of traffic the supplier hosts.

There is no mathematical formula that translates the numbers 128.138 into "The University of Colorado" or "Boulder, CO". The machines that manage large regional networks, or the central Internet routers managed by the National Science Foundation, can only locate these networks by looking each network number up in a table. There are potentially thousands of Class B networks and millions of Class C networks, but computer memory costs are low, so the tables are reasonable.
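A table lookup of this kind might look like the following Python sketch. The table contents and the next-hop names are hypothetical, and the sketch treats every first byte of 192 or above as Class C for simplicity.

```python
# Hypothetical routing table: network number -> name of the outgoing link.
ROUTES = {
    (128, 138): "link-to-colorado",        # a Class B network number
    (192, 35, 91): "link-to-home-network",  # a Class C network number
}


def network_number(address: str):
    """Extract the classful network number from a dotted decimal address."""
    octets = tuple(int(p) for p in address.split("."))
    first = octets[0]
    # Class A uses one byte, Class B two, Class C three (simplified).
    length = 1 if first < 128 else 2 if first < 192 else 3
    return octets[:length]


def next_hop(address: str, routes=ROUTES, default="default-route"):
    """Look the network number up in the table; there is no formula to compute it."""
    return routes.get(network_number(address), default)


if __name__ == "__main__":
    print(next_hop("128.138.5.9"))
    print(next_hop("10.1.1.1"))
```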

Although individual subscribers do not need to tabulate network numbers or provide explicit routing, it is convenient for most Class B networks to be internally managed as a much smaller and simpler version of the larger network organizations. Each such internal division is known as a 'subnet'. It is common to subdivide the two bytes available for internal assignment into a one-byte department number and a one-byte workstation ID.
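As a sketch of that convention in Python, the third byte of a Class B address can be read as the department and the fourth as the workstation. The function names and dictionary keys are invented for this illustration.

```python
def split_class_b(address: str):
    """Split a Class B address into network, department, and workstation parts."""
    a, b, department, workstation = (int(p) for p in address.split("."))
    if not 128 <= a <= 191:
        raise ValueError("not a Class B address")
    return {
        "network": f"{a}.{b}",
        "department": department,
        "workstation": workstation,
    }


def same_subnet(x: str, y: str) -> bool:
    """Two hosts share a subnet when network and department both match."""
    px, py = split_class_b(x), split_class_b(y)
    return (px["network"], px["department"]) == (py["network"], py["department"])


if __name__ == "__main__":
    print(split_class_b("128.138.12.34"))
    print(same_subnet("128.138.12.34", "128.138.13.34"))
```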

Routing

Whenever a packet sent across a TCP/IP network reaches an IP router, the router makes a decision about where to send the packet next. This decision is based on taking the path of least resistance. Consider a company with facilities in New York, Los Angeles, Chicago and Atlanta. It could build a network from four phone lines forming a loop (NY to Chicago to LA to Atlanta to NY). A message arriving at the NY router could go to LA via either Chicago or Atlanta. The reply could come back the other way.

If one phone line in this network breaks down, traffic can still reach its destination through a roundabout path. After losing the NY to Chicago line, data can be sent NY to Atlanta to LA to Chicago. This provides continued service, though with degraded performance. This kind of recovery is the primary design feature of IP. The loss of the line is immediately detected by the routers in NY and Chicago, but somehow this information must be sent to the other nodes. Otherwise, LA could continue to send NY messages through Chicago, where they arrive at a "dead end." Each network adopts some routing protocol which periodically updates the routing tables throughout the network with information about changes in route status.
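The four-city loop can be modeled in a few lines of Python to show the rerouting. This is only a sketch: it assumes "least resistance" means the fewest hops and finds the path with a breadth-first search, whereas real routing protocols exchange table updates and may weigh other costs. The names LINES and shortest_path are invented for the example.

```python
from collections import deque

# The four phone lines forming the loop described above.
LINES = {("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")}


def shortest_path(lines, start, goal):
    """Breadth-first search for the path with the fewest hops, or None."""
    neighbors = {}
    for a, b in lines:
        # Every line carries traffic in both directions.
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for city in sorted(neighbors.get(path[-1], [])):
            if city not in seen:
                seen.add(city)
                queue.append(path + [city])
    return None


if __name__ == "__main__":
    print(shortest_path(LINES, "NY", "Chicago"))
    # Simulate the loss of the NY to Chicago line.
    degraded = LINES - {("NY", "Chicago")}
    print(shortest_path(degraded, "NY", "Chicago"))
```

With all four lines up, NY reaches Chicago directly. With the NY to Chicago line removed, the search returns the roundabout path through Atlanta and LA.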

Error Correction and Recovery

When the DOD designed TCP/IP, they created a protocol that was robust compared to centrally managed protocols like SNA, IPX, or X.25. In battlefield conditions, the loss of a node or line is a normal circumstance. Casualties can be sorted out later on, but the network must stay up. So IP networks automatically, and silently, reconfigure themselves when something goes wrong. If there is enough redundancy built into the system, then communication is maintained.

When TCP/IP was first produced, building the redundancy to maintain such a network was prohibitively expensive for everyone except the Department of Defense. Today, however, simple routers cost no more than a PC. Still, the TCP/IP design assumption that errors are normal and can be largely ignored produces problems of its own.

TCP was designed to recover from node or line failures where the network propagates routing table changes to all router nodes. Since the update takes some time, TCP is slow to initiate recovery. The TCP algorithms are not tuned to optimally handle packet loss due to traffic congestion. Instead, the traditional Internet response to traffic problems has been to increase the speed of lines and equipment in order to stay ahead of growth in demand.

TCP treats the data as a stream of bytes. It logically assigns a sequence number to each byte. The TCP packet has a header that says, in effect, "This packet starts with byte 379642 and contains 200 bytes of data." The receiver can detect missing or incorrectly sequenced packets. TCP acknowledges data that has been received and retransmits data that has been lost. The TCP design means that error recovery is done end-to-end between the client and server machines. There is no formal standard for tracking problems in the middle of the network, though each network has adopted some ad hoc tools.
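The receiver's bookkeeping can be sketched in Python as follows. This is a simplified model rather than real TCP: it assumes segments never partially overlap, and the function name reassemble is invented for the illustration. Each segment is a pair of starting byte number and data, matching the header described above.

```python
def reassemble(segments, first_byte):
    """Rebuild the byte stream from (sequence_number, data) pairs.

    Returns the contiguous data received so far and the next byte number
    the receiver is still waiting for (the value it would acknowledge).
    """
    pending = dict(segments)  # a duplicate segment replaces the earlier copy
    stream = bytearray()
    expected = first_byte
    # Consume segments in sequence order, however they arrived.
    while expected in pending:
        data = pending.pop(expected)
        stream += data
        expected += len(data)
    # Anything still in 'pending' lies beyond a gap and must wait
    # until the missing bytes are retransmitted.
    return bytes(stream), expected


if __name__ == "__main__":
    arrived = [(379842, b"c" * 100), (379642, b"a" * 200)]  # out of order
    data, next_needed = reassemble(arrived, 379642)
    print(len(data), next_needed)
```

If a segment in the middle is lost, the returned byte number stops at the gap, which is how the sender learns what to retransmit.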

OSI Network Model vs. TCP/IP

Now that we have taken a close look at the TCP/IP protocol, let's compare it to the standard model for networking protocols and distributed applications defined by the International Organization for Standardization (ISO): the Open Systems Interconnection (OSI) model. The OSI model defines seven network layers.

Layer 1 - Physical

The physical layer defines the cable or physical medium itself. All media are functionally equivalent. The main difference is in convenience and cost of installation and maintenance. Converters from one medium to another operate at this level.

Layer 2 - Data Link

The data link layer defines the format of data on the network. A network data frame (or packet) includes a checksum, source and destination addresses, and data. The largest packet that can be sent through a data link layer defines the Maximum Transmission Unit (MTU). The data link layer handles the physical and logical connections to the packet's destination, using a network interface. A host connected to an Ethernet would have an Ethernet interface to handle connections to the outside world, and a loopback interface to send packets to itself.

Layer 3 - Network

The network layer is responsible for routing and directing packets from one network to another. The network layer may have to break large packets into smaller packets, and the host receiving the packets will have to reassemble them. For example, the Internet Protocol identifies each host with a 32-bit IP address. IP addresses are written as four dot-separated decimal numbers between 0 and 255, e.g., 251.39.56.120. The leading one to three bytes of the IP address identify the network, and the remaining bytes identify the host on that network. The network portion of the IP address is assigned by InterNIC Registration Services, under contract to the National Science Foundation, and the host portion is assigned by the local network administrators, as described in the 'Addressing and Protocols' section of this paper. Even though IP packets are addressed using IP addresses, hardware addresses must be used to actually transport data from one host to another. The Address Resolution Protocol (ARP) is used to map an IP address to its hardware address.
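The breaking up and reassembly of large packets mentioned above can be sketched in Python. This is a deliberately simplified model: it ignores packet headers and treats the MTU as a plain payload limit, and real IP records fragment offsets in a different unit. The function names are invented for the illustration.

```python
def fragment(payload: bytes, mtu: int):
    """Split a payload into (offset, data) pieces of at most mtu bytes each."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    return [
        (offset, payload[offset:offset + mtu])
        for offset in range(0, len(payload), mtu)
    ]


def reassemble_fragments(fragments):
    """Put the pieces back in offset order, as the receiving host must."""
    return b"".join(data for _offset, data in sorted(fragments))


if __name__ == "__main__":
    packet = bytes(range(256)) * 10           # 2560 bytes of sample data
    pieces = fragment(packet, 1000)           # three pieces
    shuffled = list(reversed(pieces))         # fragments may arrive out of order
    print(len(pieces), reassemble_fragments(shuffled) == packet)
```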

Layer 4 - Transport

The transport layer subdivides the user buffer into network-buffer-sized packets and enforces the desired transmission control.

Layer 5 - Session

The session protocol defines the format of the data sent over the connections.

Layer 6 - Presentation

The presentation layer converts local representation of data to its canonical form and vice versa.
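One small illustration of a canonical form is byte order. The Python sketch below uses the standard struct module to convert an integer to a fixed four-byte, big-endian ("network order") encoding and back. Treating this as the presentation layer's whole job is an assumption made for the example, and the function names are invented.

```python
import struct


def to_canonical(value: int) -> bytes:
    """Encode an unsigned 32-bit integer in network (big-endian) byte order."""
    return struct.pack("!I", value)


def from_canonical(raw: bytes) -> int:
    """Decode four network-order bytes back into a local integer."""
    return struct.unpack("!I", raw)[0]


if __name__ == "__main__":
    encoded = to_canonical(379642)
    print(encoded.hex(), from_canonical(encoded))
```

Whatever byte order the local machine uses internally, both ends agree on the same four bytes on the wire.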

Layer 7 - Application

The application layer provides network services to end users. Mail, ftp, telnet, DNS, NIS, and NFS are examples of network applications.

TCP/IP Network Model

Although the OSI model is widely used and often cited as the standard, TCP/IP deviates from it slightly. TCP/IP is designed around a simple four-layer scheme. It omits some features found in the OSI model, combines the features of some adjacent OSI layers, and splits other layers apart. The four network layers defined by the TCP/IP model are as follows.

Layer 1 - Link

Defines the network hardware and device drivers.

Layer 2 - Network

Basic communication, addressing and routing.

Layer 3 - Transport

Handles communication among programs on a network. TCP and UDP are protocols which fall into this layer.

Layer 4 - Application

End-user applications reside at this layer. Commonly used applications include NFS, DNS, arp, rlogin, talk, ftp, ntp and traceroute.

Limitations and IPng

The Internet's explosive growth has made the TCP/IP protocol suite's technology limitations increasingly apparent to frustrated network and system managers. Particularly troublesome are the weaknesses of the IP protocol itself. A new IP, dubbed IP Next Generation (IPng or IPv6), is under development to resolve some of IP's limitations. IP is the single essential component of any TCP/IP intranet and of the Internet, yet it lacks two critical attributes: scalability and security.

IP's scalability problems center around three key issues: address allocation, backbone routing table growth, and host configuration. Perhaps IP's most widely reported deficiency is its limited addressing architecture. IP's "Class A, B, C" address allocation is considered an 'Inefficient Dinosaur' by many in the industry. Given historical Internet growth trends, unassigned addresses are expected to be exhausted by 2005.

However, until recently, the most critical problem facing the Internet community was the growing size of the backbone routing tables. Prior to the adoption of Classless Interdomain Routing (CIDR), routing tables on the Internet's backbone routers were growing 1.5 times faster than RAM capacity. With most Class A and B addresses already assigned, the prospect of even a small portion of the potential 2^21 (2,097,152) routing entries seemed unmanageable.

The technique that addresses this problem is known as 'address aggregation.' When various branch subnets all attach to the first network in their numeric sequence, the backbone routing table only needs the network address of the first network, because all the rest can be reached from it.
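Aggregation can be demonstrated with Python's standard ipaddress module, whose collapse_addresses function merges adjacent networks into the fewest entries that cover them. The helper name aggregate is invented here, and the address range is an assumption chosen because it aligns on a single prefix: four consecutive Class C sized networks starting at 199.200.200.0.

```python
import ipaddress


def aggregate(networks):
    """Collapse a list of network strings into the fewest covering entries."""
    parsed = [ipaddress.ip_network(n) for n in networks]
    return [str(n) for n in ipaddress.collapse_addresses(parsed)]


if __name__ == "__main__":
    # Four adjacent /24 networks: 199.200.200.0 through 199.200.203.0.
    branch_networks = [f"199.200.{i}.0/24" for i in range(200, 204)]
    print(aggregate(branch_networks))
```

Four routing table entries become one, because the four networks share their first 22 bits. Networks that are not adjacent and aligned in this way stay as separate entries.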

CIDR helps alleviate the routing table problem through address aggregation, for example by linking all networks from 199.200.202.0 through 199.200.254.0 to network 199.200.201.0. It also helped to slow down the rate at which addresses were being exhausted by encouraging the practice of ISP-based address allocation. ISP-based addressing allows for the more complete distribution of under-utilized Class B and Class A subnets, thereby releasing more networks for new sites to connect to the 'Net. But CIDR only magnifies the third scalability weakness of IP: host address configuration. Every time a site changes ISPs, all its hosts must be readdressed, a decidedly non-trivial job for most sites and one that could have a significant impact on a company's personnel resources.

Finally, IP makes no attempt to provide security: it cannot guarantee the source of any given packet, or that the information it contains will remain private. IP lacks authentication and confidentiality controls. IP's ease of compromise is well documented and is regularly exploited. Though TCP/IP has been an incredibly successful protocol suite in the past, security and scalability remain a problem, and a redesign of the protocol may be in order.

Conclusion

TCP/IP is an extremely powerful networking protocol suite that has changed the way people view computing. People all across the world on hundreds of different platforms can be networked together because of it. Is TCP/IP an incredible tool? In this author's opinion, TCP/IP is one of the premier networking standards derived from the computing revolution, and probably the best thing to come out of the DOD in decades. This does not mean that the standard cannot be improved, and a serious look has to be taken at IPng (IPv6). Hopefully, next time you will take a closer look at the IP addresses when you are surfing the web or playing with your computer.