An Introduction to TCP/IP
If you are involved with the networking industry, work on a
networked computer, or even occasionally surf the Internet, you
have probably heard of TCP/IP, but what exactly is it? Because it
is used on the Internet and is bundled with the majority of UNIX
operating systems, TCP/IP has become an extremely popular
networking protocol. It connects millions of computers together
almost seamlessly. The name TCP/IP designates a protocol suite
that includes a number of protocols in addition to TCP and IP.
In this paper I will look at the history of TCP/IP and its roots
at the Department of Defense. I will then look at how TCP/IP
addressing works and how it sends packets across a network. I
will also explain the layering model of TCP/IP and compare it to
the 'Open Systems Interconnect' standard set by the ISO. Finally,
I will look into the possible limitations of TCP/IP and how next
generation designs of TCP/IP are dealing with these issues.
History of TCP/IP
TCP and IP were developed by the Department of Defense (DOD)
in an attempt to connect different networks designed by different
vendors. As the story goes, the Army put out a bid on a computer
and DEC won the bid. The Air Force put out a bid and IBM won.
The Navy's bid was won by Unisys. Then the President decided to
invade Grenada, and the Armed Forces discovered that their
computers could not talk to each other. This placed the DOD under
heavy fire from Washington, so the DOD began researching a
network protocol that would build a network out of different
systems, each of which, by law, was delivered by the lowest
bidder on a single contract. Often referred to as the 'Network of
the Lowest Bidder', TCP/IP finally allowed the separate networks
of the Armed Forces to talk to one another over a computer
network of networks.
TCP/IP was initially successful because it delivered a few
basic services that everyone needed. Services like file transfer,
electronic mail, and remote logon could now be operated across a
very large number of client and server systems. Computers could
now use TCP/IP, and other networking protocols, on a single LAN.
The IP component provided routing from the single LAN, to the
enterprise network, then to regional networks, and finally to the
global Internet. Communication networks could sustain damage, so
the DOD designed TCP/IP to be robust and automatically recover
from any node or phone line failure. This design allows the
construction of very large networks with less central management.
Addressing and Protocols
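Since TCP/IP is a standardized communications protocol, it is
composed of layers, which are described in the 'TCP/IP Network
Model' section of this paper. This section covers how machines
on the network are addressed.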
TCP/IP assigns a unique number to every workstation in the
world, on top of any vendor specific networking protocols. This
"IP number" is a four byte value that, by convention,
is expressed by converting each byte into a decimal number (0 to
255) and separating the bytes with a period.
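
The dotted notation described above can be sketched in a few
lines of Python. This is only an illustration of the convention;
the function names are made up for this example and the sample
address simply falls inside the 192.35.91.* range mentioned in
this paper.

```python
def to_dotted_decimal(raw: bytes) -> str:
    """Write a four byte IP number as decimal values separated by periods."""
    if len(raw) != 4:
        raise ValueError("an IP number is exactly four bytes")
    return ".".join(str(b) for b in raw)


def from_dotted_decimal(text: str) -> bytes:
    """Convert dotted notation back into the four raw bytes."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("expected four numbers separated by periods")
    values = [int(p) for p in parts]
    if any(v < 0 or v > 255 for v in values):
        raise ValueError("each number must be between 0 and 255")
    return bytes(values)


if __name__ == "__main__":
    print(to_dotted_decimal(bytes([192, 35, 91, 7])))
    print(list(from_dotted_decimal("192.35.91.7")))
```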
An organization begins by sending electronic mail to
Hostmaster@INTERNIC.NET requesting assignment of a network number.
It is still possible for almost anyone to get assignment of a
number for a small "Class C" network in which the first
three bytes identify the network and the last byte identifies the
individual computer. The author followed this procedure and was
assigned the numbers 192.35.91.* for a network of computers at
his house. Larger organizations, Internet Service Providers, and
universities can get a "Class B" network where the
first two bytes identify the network and the last two bytes
identify each of up to 64 thousand individual workstations or
servers.
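
As a rough sketch of the class scheme, the Python function below
splits an address into its network and host bytes. The first-byte
ranges used to pick the class (0-127 for A, 128-191 for B,
192-223 for C) come from the classful addressing convention and
are not spelled out in this paper; the function name is
hypothetical.

```python
def classful_split(address: str):
    """Return (class letter, network bytes, host bytes) for an address."""
    octets = [int(p) for p in address.split(".")]
    if len(octets) != 4:
        raise ValueError("expected four numbers separated by periods")
    first = octets[0]
    if first < 128:
        letter, network_bytes = "A", 1
    elif first < 192:
        letter, network_bytes = "B", 2   # two bytes of network, two of host
    elif first < 224:
        letter, network_bytes = "C", 3   # three bytes of network, one of host
    else:
        raise ValueError("not a Class A, B, or C address")
    return letter, octets[:network_bytes], octets[network_bytes:]


if __name__ == "__main__":
    print(classful_split("192.35.91.7"))
    print(classful_split("128.138.1.2"))
    # Two host bytes give 2**16 combinations, the "64 thousand" of a Class B.
    print(2 ** 16)
```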
After obtaining an address you can connect to the Internet
through regional or specialized network suppliers. When you
subscribe to a network vendor, your network number is added to
the routing configuration of the vendor's machines and those of
the other major network suppliers. These routing configurations
are commonly referred to as routing tables. Routing tables are
dynamic and differ depending on the type of traffic the supplier
hosts.
There is no mathematical formula that translates the numbers
128.138 into "The University of Colorado" or "Boulder, CO".
The machines that manage large regional networks or the
central Internet routers managed by the National Science
Foundation can only locate these networks by looking each network
number up in a table. There are potentially thousands of Class B
networks, and millions of Class C networks, but computer memory
costs are low, so the tables are reasonable.
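
The table lookup can be pictured as a plain dictionary keyed by
network number. The sketch below is an assumption-laden
illustration: the two entries reuse the network numbers mentioned
in this paper, and trying the three byte, two byte, and one byte
prefixes in turn is a simplification (a classful router would
pick the prefix length from the address class).

```python
# Entries reuse the examples from this paper; a real table is far larger.
NETWORK_TABLE = {
    "128.138": "The University of Colorado, Boulder, CO",
    "192.35.91": "the author's home network",
}


def look_up_network(address: str):
    """Find the table entry for an address, or None if it is unknown."""
    octets = address.split(".")
    # Try the Class C sized prefix first, then Class B, then Class A.
    for size in (3, 2, 1):
        key = ".".join(octets[:size])
        if key in NETWORK_TABLE:
            return NETWORK_TABLE[key]
    return None


if __name__ == "__main__":
    print(look_up_network("128.138.5.9"))
    print(look_up_network("10.0.0.1"))
```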
Although individual subscribers do not need to tabulate
network numbers or provide explicit routing, it is convenient for
most Class B networks to be internally managed as a much smaller
and simpler version of the larger network organizations. Each
internal division is known as a 'subnet'. It is common to
subdivide the two bytes available for internal assignment into a
one byte department number and a one byte workstation ID.
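
A minimal sketch of this department/workstation split, assuming a
Class B address and a hypothetical function name. The sample
address only borrows the 128.138 network number used earlier; the
last two bytes are invented for the example.

```python
def split_class_b(address: str) -> dict:
    """Split a Class B address into network, department, and workstation."""
    a, b, department, workstation = (int(p) for p in address.split("."))
    return {
        "network": f"{a}.{b}",        # assigned from outside the organization
        "department": department,     # third byte: internal subnet number
        "workstation": workstation,   # fourth byte: machine within the subnet
    }


if __name__ == "__main__":
    print(split_class_b("128.138.12.34"))
```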
Routing
Whenever a packet sent across a TCP/IP network reaches an IP
router, the router makes a decision about where to send the
packet next. This decision is based on the concept of taking the
path of least resistance. Consider a
company with facilities in New York, Los Angeles, Chicago and
Atlanta. It could build a network from four phone lines forming a
loop (NY to Chicago to LA to Atlanta to NY). A message arriving
at the NY router could go to LA via either Chicago or Atlanta.
The reply could come back the other way.
If one phone line in this network breaks down, traffic can
still reach its destination through a roundabout path. After
losing the NY to Chicago line, data can be sent NY to Atlanta to
LA to Chicago. This provides continued service though with
degraded performance. This kind of recovery is the primary design
feature of IP. The loss of the line is immediately detected by
the routers in NY and Chicago, but somehow this information must
be sent to the other nodes. Otherwise, LA could continue to send
messages for NY through Chicago, where they arrive at a "dead
end." Each network adopts some routing protocol which
periodically updates the routing tables throughout the network
with information about changes in route status.
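
The four-city loop makes a convenient toy model. The sketch below
finds a path by breadth-first search over whichever lines are
still up. Counting hops this way is only a stand-in for a real
routing protocol, and the names LINKS and shortest_path are
invented for this example.

```python
from collections import deque

# The four phone lines forming the loop described in this section.
LINKS = {("NY", "Chicago"), ("Chicago", "LA"), ("LA", "Atlanta"), ("Atlanta", "NY")}


def shortest_path(links, src, dst):
    """Return the path with the fewest hops over the working lines, or None."""
    neighbors = {}
    for a, b in links:                      # every line carries traffic both ways
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in sorted(neighbors.get(node, [])):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


if __name__ == "__main__":
    print(shortest_path(LINKS, "NY", "Chicago"))
    # Simulate losing the NY to Chicago line.
    degraded = LINKS - {("NY", "Chicago")}
    print(shortest_path(degraded, "NY", "Chicago"))
```

With the NY to Chicago line removed, the only remaining route is
the roundabout one through Atlanta and LA, which matches the
recovery described above.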
Error Correction and Recovery
When the DOD designed TCP/IP they created a protocol that was
robust, compared to the centrally managed protocols like SNA,
IPX, or X.25. In battlefield conditions, the loss of a node or
line is a normal circumstance. Casualties can be sorted out later
on, but the network must stay up. So IP networks automatically,
and silently, reconfigure themselves when something goes wrong.
If there is enough redundancy built into the system, then
communication is maintained.
When TCP/IP was first produced, building the redundancy to
maintain such a network was prohibitively expensive for everyone
except the Department of Defense. Today, however, simple routers
cost no more than a PC. Still, the TCP/IP design assumption that
errors are normal and can be largely ignored produces problems of
its own.
TCP was designed to recover from node or line failures where
the network propagates routing table changes to all router nodes.
Since the update takes some time, TCP is slow to initiate
recovery. The TCP algorithms are not tuned to optimally handle
packet loss due to traffic congestion. Instead, the traditional
Internet response to traffic problems has been to increase the
speed of lines and equipment in order to stay ahead of growth in
demand.
TCP treats the data as a stream of bytes. It logically assigns
a sequence number to each byte. The TCP packet has a header that
says, in effect, "This packet starts with byte 379642 and
contains 200 bytes of data." The receiver can detect missing
or incorrectly sequenced packets. TCP acknowledges data that has
been received and retransmits data that has been lost. The TCP
design means that error recovery is done end-to-end between the
client and server machines. There is no formal standard for
tracking problems in the middle of the network, though each
network has adopted some ad hoc tools.
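
To illustrate how byte sequence numbers let a receiver spot
missing data, here is a simplified sketch. It models each packet
as a (first byte, length) pair, which is an assumption made for
the example and not the real TCP header layout; find_gaps is a
hypothetical name.

```python
def find_gaps(segments, start):
    """Return (first missing byte, count) pairs for data not yet received.

    segments is a list of (first_byte, length) pairs in arrival order;
    start is the sequence number of the first byte expected.
    """
    expected = start
    gaps = []
    for first, length in sorted(segments):   # sorting handles out-of-order arrival
        if first > expected:
            gaps.append((expected, first - expected))
        expected = max(expected, first + length)
    return gaps


if __name__ == "__main__":
    # The packet starting at byte 379842 never arrived.
    print(find_gaps([(379642, 200), (380042, 200)], 379642))
    # All three packets arrived, but out of order.
    print(find_gaps([(380042, 200), (379642, 200), (379842, 200)], 379642))
```

The sender would retransmit the bytes reported in each gap; the
second call reports no gaps even though the packets arrived out
of sequence.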
OSI Network Model vs. TCP/IP
Now that we have taken a close look at the TCP/IP protocol,
let's compare it to the standard model for networking protocols
and distributed applications: the Open Systems Interconnect (OSI)
model created by the International Organization for
Standardization (ISO). The OSI model defines seven network
layers.
Layer 1 - Physical
The physical layer defines the cable or physical medium itself.
All media are functionally equivalent. The main difference is in
convenience and cost of installation and maintenance. Converters
from one medium to another operate at this level.
Layer 2 - Data Link
The data link layer defines the format of data on the network. A
network data frame (or packet) includes a checksum, source and
destination addresses, and data. The largest packet that can be
sent through a data link layer defines the Maximum Transmission
Unit (MTU). The data link layer handles the physical and logical
connections to the packet's destination, using a network
interface. A host connected to an Ethernet would have an Ethernet
interface to handle connections to the outside world, and a
loopback interface to send packets to itself.
Layer 3 - Network
The network layer is responsible for routing and directing
packets from one network to another. The network layer may have
to break large packets into smaller packets, and the host
receiving the packets will then have to reassemble them. For
example, the Internet Protocol identifies each host with a 32-bit
IP address. IP addresses are written as four dot-separated
decimal numbers between 0 and 255, e.g., 251.39.56.120. The
leading 1-3 bytes of the IP address identify the network and the
remaining bytes identify the host on that network. The network
portion of the IP address is assigned by InterNIC Registration
Services, under contract to the National Science Foundation, and
the host portion is assigned by the local network administrators,
as described in the 'Addressing and Protocols' section of this
paper. Even though IP packets are addressed using IP addresses,
hardware addresses must be used to actually transport data from
one host to another. The Address Resolution Protocol (ARP) is
used to map an IP address to its hardware address.
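
The break-up and reassembly of large packets can be sketched as
follows. This is a deliberately simplified model: each fragment
is just an (offset, data) pair, headers are ignored, and the
function names are invented for the example, so it should not be
read as the actual IP fragmentation format.

```python
def fragment(payload: bytes, mtu: int):
    """Cut a payload into (offset, chunk) pieces no larger than mtu bytes."""
    if mtu <= 0:
        raise ValueError("mtu must be positive")
    return [(off, payload[off:off + mtu]) for off in range(0, len(payload), mtu)]


def reassemble(fragments) -> bytes:
    """Rebuild the payload; the offsets restore order if pieces arrive shuffled."""
    return b"".join(chunk for _, chunk in sorted(fragments))


if __name__ == "__main__":
    data = bytes(range(10))
    pieces = fragment(data, 4)
    print(pieces)
    print(reassemble(reversed(pieces)) == data)
```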
Layer 4 - Transport
The transport layer subdivides the user buffer into packets
sized to fit the network buffer and enforces the desired
transmission control.
Layer 5 - Session
The session protocol defines the format of the data sent over
the connections.
Layer 6 - Presentation
The presentation layer converts local representation of data
to its canonical form and vice versa.
Layer 7 - Application
Provides network services to the end-users. Mail, ftp, telnet,
DNS, NIS, NFS are examples of network applications.
TCP/IP Network Model
Although the OSI model is widely used and often cited as the
standard, TCP/IP deviates from this standard slightly. TCP/IP is
designed around a simple four-layer scheme. It omits some
features found in the OSI model, combines the features of some
adjacent OSI layers, and splits other layers apart. The four
network layers defined by the TCP/IP model are as follows.
Layer 1 - Link
Defines the network hardware and device drivers.
Layer 2 - Network
Basic communication, addressing and routing.
Layer 3 - Transport
Handles communication among programs on a network. TCP and UDP
are protocols which fall into this layer.
Layer 4 - Application
End-user applications reside at this layer. Commonly used
applications include NFS, DNS, arp, rlogin, talk, ftp, ntp and
traceroute.
Limitations and IPng
The Internet's explosive growth has made the TCP/IP protocol
suite's technology limitations increasingly apparent to
frustrated network and system managers. Particularly troublesome
are the weaknesses of the IP protocol itself. A new IP, dubbed IP
Next Generation (IPng or IPv6), is under development to resolve
some of IP's limitations. Although it is the single, essential
component of any TCP/IP intranet and of the Internet, IP lacks
two critical attributes: scalability and security.
IP's scalability problems center around three key issues:
address allocation, backbone routing table growth, and host
configuration. Perhaps IP's most widely reported deficiency is
its limited addressing architecture. IP's "Class A, B, C"
address allocation is considered an 'Inefficient Dinosaur' by
many in the industry. Given historical Internet growth trends,
unassigned addresses are expected to be exhausted by 2005.
However, until recently, the most critical problem facing the
Internet community was the growing size of the backbone routing
tables. Prior to the adoption of Classless Inter-Domain Routing
(CIDR), routing tables on the Internet's backbone routers were
growing 1.5 times faster than RAM capacity. With most Class A and
B addresses already assigned, the prospect of even a small
portion of the potential 2^21 (2,097,152) Class C routing entries
seemed unmanageable. The remedy is known as 'address
aggregation.' When various branch subnets all attach to the first
network in their numeric sequence, the backbone routing table
only needs the network address of the first network -- all the
rest can be reached from it.
CIDR helped alleviate the routing table problem through address
aggregation, for example by linking all networks from
199.200.202.0 through 199.200.254.0 to network 199.200.201.0. It
also helped to slow down the rate at
which addresses were being exhausted by encouraging the practice
of ISP-based address allocation. ISP-based addressing allows for
the more complete disbursement of under-utilized Class B and
Class A subnets, therefore releasing more networks for new sites
to connect to the 'Net. But CIDR only magnifies the third
scalability weakness of IP: host address configuration. Every
time a site changes ISPs, all its hosts must be readdressed, a
decidedly non-trivial job for most sites and one that could have
a significant impact on a company's personnel resources.
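
Address aggregation can be tried out with Python's standard
ipaddress module. The sketch below is an illustration only: it
uses eight consecutive networks, 199.200.200.0 through
199.200.207.0, chosen because they fall on a clean bit boundary.
That range is an assumption for the example and differs from the
range quoted in the text; the function name is hypothetical.

```python
import ipaddress


def aggregate(third_bytes):
    """Collapse 199.200.N.0/24 networks into the fewest routing entries."""
    networks = [ipaddress.ip_network(f"199.200.{n}.0/24") for n in third_bytes]
    return list(ipaddress.collapse_addresses(networks))


if __name__ == "__main__":
    # Eight Class C sized networks become a single backbone entry.
    print(aggregate(range(200, 208)))
    # A range that does not sit on a bit boundary needs more than one entry.
    print(aggregate(range(201, 205)))
```

The second call shows why ISPs hand out address blocks aligned on
bit boundaries: a misaligned range cannot be summarized by one
routing entry.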
Finally, IP makes no attempt to provide security: it cannot
guarantee the source of any given packet, or that the information
it contains will remain private. IP lacks authentication and
confidentiality controls. IP's ease of compromise is well
documented and is regularly exploited. Though TCP/IP has been an
incredibly successful network protocol in the past, security and
scalability remain a problem, and maybe a redesign of this
protocol is in order.
Conclusion
TCP/IP is an extremely powerful networking protocol which has
changed the way people view computing. People all across the
world on hundreds of different platforms can be networked
together because of this. Is TCP/IP an incredible tool? Well, in
this author's opinion TCP/IP is one of the premier networking
standards derived from the computing revolution, and probably the
best thing to come out of the DOD in decades. This does not mean
that the standard cannot be improved, and a serious look has to
be taken at IPng (IPv6). Hopefully next time you will take a
closer look at the IP addresses when you are surfing the web or
playing with your computer.