Every day, the citizens of the
Internet send each other billions of e-mail messages. If you
are online a lot, you yourself may send a dozen or more e-mails
each day without even thinking about it. Obviously, e-mail
has become an extremely popular communication tool.
Have you ever wondered
how e-mail gets from your desktop to a friend halfway around
the world? What is a POP3 server, and how does it hold your
mail? The answers may surprise you, because it turns out that
e-mail is an incredibly simple system at its core! In this
article, we’ll take an in-depth look at e-mail and how
it works!
An E-mail Message
The first e-mail message between two machines was sent in 1971
by an engineer named Ray Tomlinson. Prior to this, you could only
send messages to users on a single machine. Tomlinson’s breakthrough
was the ability to send messages to other machines on the ARPANET
(the forerunner of the Internet), using the @ sign to designate
the receiving machine.
An e-mail message
has always been nothing more than a simple text message
-- a piece of text sent to a recipient. In the beginning and
even today, e-mail messages tend to be short pieces of text,
although the ability to add attachments now makes many e-mail
messages quite long. Even with attachments, however, e-mail
messages continue to be text messages -- we’ll see why when
we get to the section on attachments.
E-mail Clients
You have probably already received
several e-mail messages today. To look at them, you use some
sort of e-mail client. Many people use well-known stand-alone
clients like Microsoft Outlook, Outlook Express, Eudora or
Pegasus. People who subscribe to free e-mail services like
Hotmail or Yahoo use an e-mail client that appears in a Web
page. If you are an AOL customer, you use AOL’s e-mail reader.
No matter which type of client you are using, it generally
does four things:
- It shows you a list of all of the messages in your mailbox by
  displaying the message headers. The header shows you who sent
  the mail and the subject of the mail, and may also show the
  time and date of the message and the message size.
- It lets you select a message header and read the body of the
  e-mail message.
- It lets you create new messages and send them. You type in the
  e-mail address of the recipient and the subject for the message,
  and then type the body of the message.
- Most e-mail clients also let you add attachments to messages you
  send and save the attachments from messages you receive.
Sophisticated e-mail
clients may have all sorts of bells and whistles, but at the
core, this is all that an e-mail client does.
A Simple E-mail Server
Given that you have an e-mail client
on your machine, you are ready to send and receive e-mail.
All that you need is an e-mail server for the client
to connect to. Let’s imagine what the simplest possible e-mail
server would look like in order to get a basic understanding
of the process. Then we will look at the real thing.
If you have read
How Web Servers and the Internet Work, then you know that
machines on the Internet can run software applications that
act as servers. There are Web servers, FTP servers,
telnet servers and e-mail servers running on millions of machines
on the Internet right now. These applications run all the
time on the server machine and they listen to specific ports,
waiting for people or programs to attach to the port (see
How Web Servers and the Internet Work for details). The simplest
possible e-mail server would work something like this:
- It would have a list of e-mail accounts, with one account for
  each person who can receive e-mail on the server. My account name
  might be mbrain, John Smith’s might be jsmith, and so on.
- It would have a text file for each account in the list. So the
  server would have a text file in its directory named MBRAIN.TXT,
  another named JSMITH.TXT, and so on.
- If someone wanted to send me a message, the person would compose
  a text message ("Marshall, Can we have lunch Monday? John") in
  an e-mail client, and indicate that the message should go to
  mbrain. When the person presses the Send button, the e-mail
  client would connect to the e-mail server and pass to the server
  the name of the recipient (mbrain), the name of the sender
  (jsmith) and the body of the message.
- The server would format those pieces of information and append
  them to the bottom of the MBRAIN.TXT file. The entry in the file
  might look like this:
From: jsmith
To: mbrain
Marshall,
Can we have lunch Monday?
John
There are several
other pieces of information that the server might save into
the file, like the time and date of receipt and a subject
line; but overall, you can see that this is an extremely simple
process.
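To make the idea concrete, here is a minimal sketch of how this imaginary server might append a message to a mailbox file. The function name append_message, the rule that account "mbrain" maps to MBRAIN.TXT, and the temporary directory are all assumptions made for illustration, not part of any real mail server.

```python
import tempfile
from pathlib import Path


def append_message(mail_dir, sender, recipient, body):
    """Append one formatted message to the recipient's mailbox text file."""
    # Assumed naming rule: account "mbrain" is stored in MBRAIN.TXT.
    mailbox = Path(mail_dir) / f"{recipient.upper()}.TXT"
    entry = f"From: {sender}\nTo: {recipient}\n{body}\n"
    # Mode "a" appends to the bottom of the file, creating it if needed.
    with mailbox.open("a", encoding="utf-8") as f:
        f.write(entry)
    return mailbox


# Demo in a throwaway directory so nothing real is touched.
mail_dir = tempfile.mkdtemp()
path = append_message(
    mail_dir, "jsmith", "mbrain", "Marshall,\nCan we have lunch Monday?\nJohn"
)
contents = path.read_text(encoding="utf-8")
print(contents)
```

Each later call for the same recipient would simply add another entry below the first one.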
As other people sent
mail to mbrain, the server would simply append those messages
to the bottom of the file in the order that they arrived.
The text file would accumulate a series of five or 10 messages,
and eventually I would log in to read them. When I wanted
to look at my e-mail, my e-mail client would connect to the
server machine. In the simplest possible system, it would:
- Ask the server to send a copy of the MBRAIN.TXT file
- Ask the server to erase and reset the MBRAIN.TXT file
- Save the MBRAIN.TXT file on my local machine
- Parse the file into the separate messages (using the word
  "From:" as the separator)
- Show me all of the message headers in a list
When I double-clicked
on a message header, it would find that message in the text
file and show me its body.
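The parsing step could be sketched roughly as follows. This assumes the simple file layout described above (a "From:" line, a "To:" line, then the body); the function name parse_mailbox and the second sender, asmith, are made up for the example. Note that a body line that happens to start with "From:" would confuse this naive splitter, which is one reason real mailbox formats are a little more careful.

```python
def parse_mailbox(text):
    """Split mailbox text into messages, starting a new one at each 'From:' line."""
    messages = []
    for line in text.splitlines():
        if line.startswith("From:"):
            # "From:" is the separator, so it opens a new message.
            messages.append({"from": line[5:].strip(), "to": "", "body": []})
        elif not messages:
            continue  # ignore anything before the first "From:" line
        elif (line.startswith("To:")
              and not messages[-1]["to"]
              and not messages[-1]["body"]):
            messages[-1]["to"] = line[3:].strip()
        else:
            messages[-1]["body"].append(line)
    for message in messages:
        message["body"] = "\n".join(message["body"])
    return messages


mailbox_text = (
    "From: jsmith\nTo: mbrain\nMarshall,\nCan we have lunch Monday?\nJohn\n"
    "From: asmith\nTo: mbrain\nHello\n"
)
messages = parse_mailbox(mailbox_text)
for number, message in enumerate(messages, start=1):
    # This is the "list of message headers" the client would show.
    print(number, message["from"], "->", message["to"])
```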
You have to admit
that this is a VERY simple system. Surprisingly, the real
e-mail system that you use every day is not much more complicated
than this!
The Real E-mail System
For the vast majority of people right
now, the real e-mail system consists of two different servers
running on a server machine. One is called the SMTP Server,
where SMTP stands for Simple Mail Transfer Protocol. The SMTP
server handles outgoing mail. The other is a POP3 Server,
where POP stands for Post Office Protocol. The POP3 server
handles incoming mail. On a typical e-mail server, the SMTP server listens
on well-known port number 25, while POP3 listens on port 110
(see How Web Servers and the Internet Work for details on
ports).
The SMTP Server
Whenever you send a piece of e-mail,
your e-mail client interacts with the SMTP server to handle
the sending. The SMTP server on your host may have conversations
with other SMTP servers to actually deliver the e-mail.
Let’s assume that
I want to send a piece of e-mail. My e-mail ID is brain,
and I have my account at How Stuff Works. I want to send
e-mail to jsmith@mindspring.com. I am using a stand-alone
e-mail client like Outlook Express.
When I set up my
account at How Stuff Works, I told Outlook Express the name of
the How Stuff Works mail server. When I compose a message
and press the Send button, here is what happens:
- Outlook Express connects to the SMTP server at How Stuff Works
  using port 25.
- Outlook Express has a conversation with the SMTP server, telling
  the SMTP server the address of the sender and the address of the
  recipient, as well as the body of the message.
- The SMTP server takes the "to" address (jsmith@mindspring.com)
  and breaks it into two parts: the recipient name (jsmith) and
  the domain name (mindspring.com). If the "to" address had been
  another user at How Stuff Works, the SMTP server would simply
  hand the message to the POP3 server for How Stuff Works (using
  a little program called the delivery agent). Since the recipient
  is at another domain, SMTP needs to communicate with that domain.
- The SMTP server has a conversation with a Domain Name Server, or
  DNS (see How Web Servers and the Internet Work for details). It
  says, "Can you give me the IP address of the SMTP server for
  mindspring.com?" The DNS replies with one or more IP addresses
  for the SMTP server(s) that Mindspring operates.
- The SMTP server at How Stuff Works connects with the SMTP server
  at Mindspring using port 25. It has the same simple text
  conversation that my e-mail client had with the SMTP server for
  How Stuff Works, and gives the message to the Mindspring server.
  The Mindspring server recognizes that the domain name for jsmith
  is at Mindspring, so it hands the message to Mindspring’s POP3
  server, which puts the message in jsmith’s mailbox.
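The decision the SMTP server makes with the "to" address can be sketched in a few lines. This is only an illustration: the function names are invented, and "example.com" is a placeholder for the server's own domain, not a real configuration. The DNS lookup itself is left out.

```python
def split_address(address):
    """Break 'user@domain' into its two parts."""
    user, at_sign, domain = address.rpartition("@")
    if not at_sign or not user or not domain:
        raise ValueError(f"not a valid e-mail address: {address!r}")
    return user, domain


def route(address, local_domain):
    """Decide whether a message stays on this server or goes elsewhere."""
    user, domain = split_address(address)
    if domain.lower() == local_domain.lower():
        # Same domain: hand the message to the local delivery agent.
        return ("local", user)
    # Different domain: look up that domain's SMTP server and connect to it.
    return ("remote", domain)


print(route("jsmith@mindspring.com", "example.com"))
print(route("mbrain@example.com", "example.com"))
```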
If, for some reason,
the SMTP server at How Stuff Works cannot connect with the
SMTP server at Mindspring, then the message goes into a queue.
The SMTP server on most machines uses a program called sendmail
to do the actual sending, so this queue is called the sendmail
queue. Sendmail will periodically try to resend the messages
in its queue. For example, it might retry every 15 minutes.
After four hours, it will usually send you a piece of mail
that tells you there is some sort of problem. After five days,
most sendmail configurations give up and return the mail to
you undelivered.
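The retry schedule described above boils down to a few time thresholds. The sketch below uses the example figures from the text (15 minutes, four hours, five days); the function names are invented, and real sendmail configurations can set these values differently.

```python
RETRY_INTERVAL = 15 * 60          # try again every 15 minutes
WARN_AFTER = 4 * 60 * 60          # warn the sender after four hours
GIVE_UP_AFTER = 5 * 24 * 60 * 60  # return the mail after five days


def queue_action(age_seconds, already_warned):
    """Decide what to do with a queued message of the given age."""
    if age_seconds >= GIVE_UP_AFTER:
        return "bounce"   # return the mail to the sender undelivered
    if age_seconds >= WARN_AFTER and not already_warned:
        return "warn"     # tell the sender there is a problem, keep trying
    return "retry"


def next_attempt(last_attempt_seconds):
    """Time of the next delivery attempt, in seconds since queueing."""
    return last_attempt_seconds + RETRY_INTERVAL


print(queue_action(60 * 60, already_warned=False))
print(queue_action(4 * 60 * 60, already_warned=False))
print(queue_action(5 * 24 * 60 * 60, already_warned=True))
```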
The actual conversation
that an e-mail client has with an SMTP server is incredibly
simple and human readable. It is specified in public documents
called Requests For Comments (RFC), and a typical conversation
looks something like this:
helo test
250 mx1.mindspring.com Hello abc.sample.com [220.57.69.37], pleased to meet you
mail from: test@sample.com
250 2.1.0 test@sample.com... Sender ok
rcpt to: jsmith@mindspring.com
250 2.1.5 jsmith... Recipient ok
data
354 Enter mail, end with "." on a line by itself
from: test@sample.com
to:jsmith@mindspring.com
subject: testing
John, I am testing...
.
250 2.0.0 e1NMajH24604 Message accepted for delivery
quit
221 2.0.0 mx1.mindspring.com closing connection
Connection closed by foreign host.
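In this dialog, the lines that begin with a three-digit number
are the SMTP server’s replies, and the other lines are what the
e-mail client says (the final line is printed by telnet itself
when the connection closes).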
The e-mail client introduces itself, indicates the "from"
and "to" addresses, delivers the body of the message and then
quits. You can, in fact, telnet to a mail server machine
at port 25 and have one of these dialogs yourself -- this
is how people "spoof" e-mail.
You can see that
the SMTP server understands very simple text commands like
HELO, MAIL, RCPT and DATA. The most common commands are:
- HELO - introduce yourself
- EHLO - introduce yourself and request extended mode
- MAIL FROM: - specify the sender
- RCPT TO: - specify the recipient
- DATA - specify the body of the message (To:, From: and Subject:
  should be the first three lines.)
- RSET - reset
- QUIT - quit the session
- HELP - get help on commands
- VRFY - verify an address
- EXPN - expand an address
- VERB - verbose
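To see how little a client has to say, here is a rough sketch that builds the client's half of such a conversation. The function name is invented. Two details go slightly beyond the dialog shown above and follow the SMTP and message-format RFCs: addresses are wrapped in angle brackets, and a blank line separates the headers from the body. A body line that starts with a period gets an extra period in front so the server does not mistake it for the end of the message. The sketch only produces the lines; it does not talk to a server or check the replies.

```python
def smtp_client_lines(client_name, sender, recipient, subject, body):
    """Return the lines a client would send for one simple message."""
    lines = [
        f"HELO {client_name}",
        f"MAIL FROM:<{sender}>",
        f"RCPT TO:<{recipient}>",
        "DATA",
        f"From: {sender}",
        f"To: {recipient}",
        f"Subject: {subject}",
        "",  # blank line between headers and body
    ]
    for line in body.splitlines():
        # "Dot-stuffing": protect body lines that begin with a period.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")     # a lone period ends the message
    lines.append("QUIT")
    return lines


dialog = smtp_client_lines(
    "test", "test@sample.com", "jsmith@mindspring.com",
    "testing", "John, I am testing...",
)
print("\n".join(dialog))
```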
The POP3 Server
In the simplest implementations of
POP3, the server really does maintain a collection of text
files -- one for each e-mail account. When a message arrives,
the POP3 server simply appends it to the bottom of the recipient’s
file!
When you check your
e-mail, your e-mail client connects to the POP3 server using
port 110. The POP3 server requires an account name
and a password. Once you have logged in, the POP3 server
opens your text file and allows you to access it. Like the
SMTP server, the POP3 server understands a very simple set
of text commands. Here are the most common commands:
- USER - enter your user ID
- PASS - enter your password
- QUIT - quit the POP3 server
- LIST - list the messages and their size
- RETR - retrieve a message, pass it a message number
- DELE - delete a message, pass it a message number
- TOP - show the top x lines of a message, pass it a message
  number and the number of lines
Your e-mail client
connects to the POP3 server and issues a series of commands
to bring copies of your e-mail messages to your local machine.
Generally, it will then delete the messages from the server
(unless you’ve told the e-mail client not to).
You can see that
the POP3 server simply acts as an interface between the e-mail
client and the text file containing your messages. And again,
you can see that the POP3 server is extremely simple! You
can connect to it through telnet at port 110 and issue the
commands yourself if you would like to (see How Web Servers
and the Internet Work for details on telnetting to servers).
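As a way to picture the server's side of that conversation, here is a toy, in-memory stand-in for a POP3 mailbox. It is a sketch, not a real server: the class name is invented, there is no network code, USER and PASS are skipped, and TOP simply returns the first lines of the stored text. The "+OK" and "-ERR" prefixes and the lone period that ends a multi-line reply do follow the POP3 convention. The sender asmith is made up for the example.

```python
class ToyPop3Mailbox:
    """In-memory stand-in for one account's mailbox on a POP3 server."""

    def __init__(self, messages):
        self.messages = list(messages)  # message 1 is messages[0]
        self.deleted = set()            # message numbers marked by DELE

    def handle(self, command):
        """Return the reply text for one client command line."""
        parts = command.split()
        if not parts:
            return "-ERR empty command"
        verb, args = parts[0].upper(), parts[1:]

        if verb == "LIST":
            rows = [f"{n} {len(text.encode('utf-8'))}"
                    for n, text in enumerate(self.messages, start=1)
                    if n not in self.deleted]
            return "\n".join([f"+OK {len(rows)} messages"] + rows + ["."])

        if verb in ("RETR", "DELE", "TOP"):
            if not args or not args[0].isdigit():
                return "-ERR message number required"
            n = int(args[0])
            if not 1 <= n <= len(self.messages) or n in self.deleted:
                return "-ERR no such message"
            text = self.messages[n - 1]
            if verb == "RETR":
                return "\n".join(["+OK", text, "."])
            if verb == "DELE":
                self.deleted.add(n)
                return f"+OK message {n} deleted"
            # TOP needs a second argument: how many lines to show.
            if len(args) < 2 or not args[1].isdigit():
                return "-ERR line count required"
            top = text.splitlines()[:int(args[1])]
            return "\n".join(["+OK"] + top + ["."])

        return "-ERR unknown command"


box = ToyPop3Mailbox([
    "From: jsmith\nTo: mbrain\nLunch Monday?",
    "From: asmith\nTo: mbrain\nHello",
])
print(box.handle("LIST"))
print(box.handle("RETR 1"))
```

A client that fetches mail and then cleans up would issue LIST, then RETR for each message number, then DELE for each one it has saved.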
Attachments
Your e-mail client allows you to add attachments
to e-mail messages you send, and also lets you save attachments
from messages that you receive. Attachments might include
word processing documents, spreadsheets, sound files, snapshots
and pieces of software. Usually, an attachment is not text
(if it were, you would simply include it in the body of the
message). Since e-mail messages can contain only text information,
and attachments are not text, there is a problem that needs
to be solved.
In the early days
of e-mail, you solved this problem by hand, using a program
called uuencode. The uuencode program assumes that
the file contains binary information. It extracts 3 bytes
from the binary file and converts them to four text characters
(that is, it takes 6 bits at a time, adds 32 to the value
of the 6 bits and creates a text character -- see How Bits
and Bytes Work to learn more about ASCII characters). What
uuencode produces, therefore, is an encoded version
of the original binary file that contains only text characters.
In the early days of e-mail, you would run uuencode yourself
and paste the uuencoded file into your e-mail message.
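The 3-bytes-to-4-characters step is easy to try for yourself. The sketch below encodes a single group of three bytes the way the paragraph describes; the function name is invented, and it leaves out the rest of the format (the length character at the start of each line and the begin/end lines). One wrinkle: a 6-bit value of zero becomes a space here, while many uuencode programs write a backquote instead.

```python
def uuencode_group(three_bytes):
    """Turn exactly 3 bytes into 4 printable characters."""
    if len(three_bytes) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Treat the 3 bytes as one 24-bit number.
    value = int.from_bytes(three_bytes, "big")
    chars = []
    for shift in (18, 12, 6, 0):
        six_bits = (value >> shift) & 0x3F   # take 6 bits at a time
        chars.append(chr(six_bits + 32))     # add 32 to get a text character
    return "".join(chars)


print(uuencode_group(b"Cat"))  # prints 0V%T
```

For "Cat", the 24 bits split into the values 16, 54, 5 and 52; adding 32 to each gives the characters "0", "V", "%" and "T".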
Here is typical output
from the uuencode program:
begin 644 reports
M9W)E<" B<&P_(B O=F%R+VQO9R]H='1P9"]W96(V-C1F-BYA8V-E<W,N;&]G
M('P@8W5T("UF(#(@+60@(C\B('P@8W5T("UF(#$@+60@(B8B(#X@<V5A<F-H
M+61A=&$M)#$*?B]C;W5N="UP86=E<R!\('-O<G0@/B!S=&%T<RTD,0IC<" @
M?B]W96)S:71E+V-G:2UB:6XO<W5G9V5S="UD871A+V1A=&$@<W5G9V5S="TD
M,0IC<"!^+W=E8G-I=&4O8V=I+6)I;B]W:&5R92UD871A+V1A=&$@=VAE<F4M
M)#$*8W @?B]W96)S:71E+V-G:2UB:6XO96UA:6QE<BUD871A+V1A=&$@96UA
L:6PM)#$*?B]G971L;V<@/B!L;V=S+20Q"GXO=&]T86P@/B!T;W1A;"TD,0IA
end
The recipient would
then save the uuencoded portion of the message to a file and
run uudecode on it to translate it back to binary.
The word "reports" in the first line tells uudecode what to
name the output file.
Modern e-mail clients
do essentially the same thing, but they encode and decode
attachments for you automatically. Most now use the MIME standard
with base64 encoding rather than uuencode, but the idea is the
same: if you look at a raw e-mail file that contains attachments,
you’ll find each attachment represented as a block of encoded
text much like the one shown above!
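You can watch an attachment being turned into text with Python's standard email package, which is used here purely as an illustration of what a client might do behind the scenes. It encodes binary attachments with base64, a close relative of uuencode that also turns every 3 bytes into 4 text characters. The addresses come from the SMTP example earlier; the four attachment bytes are arbitrary sample data.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "test@sample.com"
msg["To"] = "jsmith@mindspring.com"
msg["Subject"] = "testing"
msg.set_content("John, I am testing...")

# Attach four bytes of binary data under the file name "reports".
msg.add_attachment(
    bytes([0, 1, 2, 255]),
    maintype="application",
    subtype="octet-stream",
    filename="reports",
)

# The raw message is plain text from top to bottom, attachment included.
raw = msg.as_string()
print(raw)
```

In the printed message, the attachment appears as its own section with a header naming the encoding, followed by the encoded text.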
Considering its tremendous
impact on society, having forever changed the way we communicate,
today’s e-mail system is one of the simplest things ever devised!
There are parts of the system, like the routing rules in sendmail,
that get complicated, but the basic system is incredibly straightforward.