A Guide to Understanding URLs
  - Working with Uniform Resource Locators on the Internet
                        by Michael R. Irwin, copyright 1994-1997 

Contents --			
  1. Introduction
  2. What Is a URL?	
  3. Computers on the Internet
  4. Parts of a URL
  5. Types of URLs
  6. Bookmarking URLs
  7. Errors and URLs	
  8. In Summary	

                note :    NO PICTURES ARE INCLUDED WITH THIS VERSION


===============
1. Introduction
===============

This document is written as an aid to assist people in understanding and 
working with URL addresses.  It is not a paper on the mechanics of using 
URLs; rather it focuses on explaining URLs and how people use them when 
working on the Internet.  Specifically:

How to go from one location to another on the Internet. 

The URL concept is really pretty simple, as you will learn.  This guide 
is just a quick tour through some of the more common URL types.  It will 
have you understanding and working with URLs in a variety of contexts 
very quickly.

 

================= 
2. What is a URL?
================= 

URL is the acronym for Uniform Resource Locator.  A URL is an address 
used on the Internet.  Every time you want to view or get a file on the 
World Wide Web, you need to access the file via its URL.

A URL is like your complete mailing address: it specifies all the 
information necessary for someone to address an envelope to you. 

However, URLs are much more than that, since they can refer to a variety 
of very different types of resources. A more fitting analogy would be a 
system for specifying your mailing address, your phone number, or the 
location of the book you just read from the public library, all in the 
same format.

In short, a URL is a very convenient and succinct way to direct people 
and applications to a file or other electronic resources. Learning how 
to interpret, use, and construct URLs will greatly assist your 
exploration of the Internet.

The idea behind URLs is actually a good one -- create a universal system 
for accessing information on the Internet, no matter whether it is a 
single document (HTML page) on a server, a file on an anonymous FTP 
site, a query from a database, an entire gopher server, or even a Web 
image.  In other words ... “if it’s out there, you can point to it!”

Unfortunately, that means that to access files on the World Wide Web, 
you have to get used to seeing, and typing, things like:

    http://www.germany.eu.net/books/eegtti/eegti.html

This is the actual Web address for a great electronic book named 
“Everybody’s Guide to the Internet”.

  ----------------------
  Where do you use URLs?
  ----------------------

Whenever you work with your browser, you will use URLs.  When you want 
to go to a specific resource on the Internet, you type in its URL.  For 
instance, in Netscape you may want to go directly to Yahoo, a search 
engine on the Internet.  To do this, you would type the URL in the 
Location field of Netscape (below the menu and main program buttons).  
It would look something like this:

  *** PICTURE WOULD BE INSERTED HERE

Most browsers have a similar location for typing in the URL you want to 
go to.  Once you type in the URL and press Enter, the browser will 
connect to the Internet and attempt to locate that URL.  If it can find 
it, it will display the resource or, if the browser does not recognize 
the format, offer to download it.


  ----------------------------
  Comparing URLs to file names
  ----------------------------

The URL above is the address (location and filename) for a specific 
document.  When the URL concept was introduced, users of the Internet 
agreed that a single methodology needed to be created that would allow 
anyone on the Web to find and access anything on the Internet.

To understand what a URL is and how you use one, let us compare a URL 
to a filename on a computer.

For instance, you may want to copy a text file named myFile.txt that is 
on your D: hard drive to a floppy disk.  You know that the file is 
sitting on drive D: in a sub-directory named 

                 D:\DOCUMENT\WINWORD6\TEXTFILE\.  

To copy this file you can move to the drive using the change drive 
command (D:) and then move to the directory using the change directory 
(CHDIR or CD) command and finally use the copy command to copy the file 
to drive A:.  This would take a minimum of three commands:

    D:
    cd \document\winword6\textfile
    copy myFile.txt a:\myFile.txt

An alternative is to copy the file using one command that references the 
drive, directory and filename:

    copy d:\document\winword6\textfile\myFile.txt a:\myFile.txt

Although the second method uses a longer command, it is prone to the 
same kind of error as the first method: typos.
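
The way a full file name combines drive, directory and file name can 
also be shown in a few lines of code.  The following is only a sketch in 
Python (a language this guide does not otherwise use), relying on the 
standard pathlib module to pull the example path apart:

```python
from pathlib import PureWindowsPath

# The example path from the text.  PureWindowsPath only works on the
# text of the path, so this runs on any operating system.
full_name = PureWindowsPath(r"d:\document\winword6\textfile\myFile.txt")

drive = full_name.drive            # the drive letter
directory = str(full_name.parent)  # the drive plus the directory path
file_name = full_name.name         # the file name alone

print("Drive:    ", drive)
print("Directory:", directory)
print("File name:", file_name)
```

A URL is split into parts in much the same spirit, as the following 
sections explain.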

In this case, the file was sitting on your own computer.  If the file 
was on a network that you were a part of, you could have just as easily 
copied the file from the network.  To copy the file from the network, 
you replace the local drive and directory name with the network drive 
and directory name where the file actually resides.



============================ 
3. Computers on the Internet
============================ 

In the IBM PC world commands like copy are not case sensitive.  This 
means that you can type the command using any combination of upper and 
lower case letters.

When working on the Internet, you need a way to access a document that 
is sitting on some computer in some directory.  This document can be on 
a mainframe, a minicomputer, or a PC.  It can be found on a computer 
that has an operating system different from yours -- UNIX, Windows NT, 
Mac OS, or one of many other operating systems.  Each system stores and 
names files differently.  To overcome this problem, the URL concept was 
instituted.  Every document, query, graphic, FTP file, Gopher site, etc. 
is assigned a unique Uniform Resource Locator, or address.

A URL can point to a file in a directory.  That file and directory can 
exist on any machine on the Internet, and the file can be served via any 
of several different methods.  As pointed out, a URL can point to more 
than a file: it can be a query, the contents of a gopher server, and so 
on.


  ---------------------------
  The servers of the Internet
  ---------------------------

There are several types of computers on the Internet, all joined 
together as one single network.  Each computer that offers information 
to others on the Internet is known as a server.  In fact, the Internet 
is the largest client/server system in the world.  There are thousands 
of servers available on the Internet.

There are several different types of servers on the Internet, each 
running their own server software.  The four basic types of servers are:

  HTTP, or HyperText Transfer Protocol, server

      Used to store and send standard World Wide Web 
      hypertext documents.  HTTP is a simple protocol 
      which is the basis of the Web.

  FTP, or File Transfer Protocol, server

      Used to store and transfer files across the Internet.
      These servers are the file libraries or archives, which 
      can be used by the public.  The files can be programs 
      or documents.

  Gopher server 

      Used to present information as a series of menus.  A 
      Gopher server lets the user work through menus, instead 
      of typing in long sequences of characters.  Menu choices 
      can lead to other menus, to documents, or to files on 
      FTP sites, letting a user select a file from a menu.

  WAIS, or Wide Area Information Server, database server

      WAIS is similar to a Gopher server; however, it lets 
      users access and find files using a single interface.
      The WAIS server program worries about how to access 
      information on hundreds of different databases.

Each of these servers can store different types of documents and files.  
Once a server is on the Internet, it can be accessed by the millions of 
users of the Internet.


================= 
4. Parts of a URL
================= 

A URL, like the full name of a file on your local computer, has several 
parts.  

Since the document, or file, can be on any type of Internet server, and 
on any type of computer, accessing the file via a URL requires 
specifying several pieces of information:


  ---------------
  Parts of a URL:
  ---------------

Specify the type of server, or method needed, to retrieve the document.  
Telling the browser or program the type of server or method it will 
connect to lets it know what to do with the information once it gets 
it.  This is the only part of a URL that has no direct counterpart when 
locating a file on your local machine or on a network that you are 
attached to.

Specify the machine name where the document is located.  This portion 
identifies the computer on the Internet that holds the document.  This 
is equivalent to specifying a drive on your local computer or on a 
network.

Specify the path and document name that you want.  This is exactly the 
same as specifying a path and file name in commands on your local 
network or drive.

These three actions correspond to the three parts of a URL.  Each URL 
is composed of three parts:

  First, the type of resource to access

  Second, the name of the site where the resource is located

  Third, the directory path and resource name, or directory path alone

      --------------------------------
  >>  WARNING: URLs are Case Sensitive  <<
      --------------------------------  

Since many Internet servers are running in a UNIX environment, you must 
pay attention to upper and lower case in the URL.  Most programs running 
in the UNIX environment are case sensitive.  Because of this you should 
be very careful when typing URLs.  Always assume that the URL is case 
sensitive.


  -----------------
  Working with URLs
  -----------------

Look at the following URL --

    http://www.europa.com/~ria/links.html

This URL points to an HTML document titled “A Collection of Philippine 
Pages Picturesque Philippines World Wide Web Links”.  It is a great link 
page for finding resources and information about the Philippines -- 
government, education, Internet, business, even newspapers and ezines.

The URL is made up of three parts:

  http://

    The “http” means that you are dealing with a World 
    Wide Web resource.  It stands for “HyperText Transfer 
    Protocol”.  This is the way that the Web moves information 
    around the world.  This information is critical to your 
    browser.  It tells the browser how to connect to the system. 

  www.europa.com

    This is the next part of the URL. It is the name of the 
    site where the resource is located.  It is the name given 
    to the actual server that sits somewhere in the world on 
    the Internet.

  /~ria/links.html

    The final part of the URL is the directory path and resource 
    name.  Notice that the path is separated with forward slashes.
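
The same three-way split can be reproduced in code.  The following is a 
minimal sketch in Python (not a language used by this guide), based on 
the standard urllib.parse module; the function name three_parts and the 
labels it returns are made up for this illustration:

```python
from urllib.parse import urlsplit

def three_parts(url):
    """Split a URL into the three parts described in this guide."""
    pieces = urlsplit(url)
    return {
        "type of resource": pieces.scheme,
        "site name": pieces.netloc,
        "path and resource": pieces.path,
    }

# Print each part of the example URL on its own line.
for label, value in three_parts("http://www.europa.com/~ria/links.html").items():
    print(f"{label:18} {value}")
```

The "://" separator itself is not part of any of the three values; it 
only marks where the type of resource ends and the site name begins.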

In the example URL, notice how the last item ends in “.html”.  That 
stands for HyperText Markup Language, which is the markup used to 
create hypertext documents.  Many Web addresses end in it.

If you connect to this URL, you will see a page that begins similar to 
the following:

     *** PICTURE WOULD BE INSERTED HERE

Some other URLs may use numbers in them, as in the following  -- 

    http://204.146.46.134:80/prev/explore/wtools/

This URL points to an HTML document titled “World Wide Web Search 
Tools”, a part of the IBM Global Network pages.  It is a great link page 
for finding the different search tools available on the Internet for 
locating URL resources.

Like the previous example, this URL consists of three parts:

  http://

    The “http” means that you are dealing with a World 
    Wide Web resource.

  204.146.46.134:80

    This is the name of the site where the resource is located.
    Notice that it is a numeric address instead of an English 
    name.  The “:80” that follows the address is a port number.

  /prev/explore/wtools/

    This is the directory path and resource name.  Notice that 
    it does not end with a document name.
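
In the site part of this URL, the number after the colon is a port 
number.  As a small sketch (in Python, using the standard urllib.parse 
module, which is an assumption of this example and not something the 
guide itself relies on), the numeric address and the port can be 
separated like this:

```python
from urllib.parse import urlsplit

pieces = urlsplit("http://204.146.46.134:80/prev/explore/wtools/")

host = pieces.hostname   # the numeric address, without the port
port = pieces.port       # the number after the colon, as an integer
path = pieces.path       # the directory path, ending in a slash

print(host, port, path)
```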

     -----------------------------
  >> NOTE: Ending a URL in a slash  <<
     -----------------------------

When using FTP, HTTP, and Gopher URLs, the "directory path and resource 
name" will sometimes end in a slash. This simply means that the URL is 
not pointing to a specific file, but a directory. In this case, the 
server generally returns the "default index" of that directory. This 
might be just a listing of the files available within that directory, or 
a default file that the server automatically looks for in the directory. 
With HTTP servers, this default index file is generally called 
"index.html", but is frequently seen as "default.html", "home.html", or 
"welcome.html".

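The note on URLs that end in a slash can be sketched in code.  The 
Python below is an illustration only: the function names are made up, 
and the list of file names simply repeats the ones mentioned in the 
note.  Which file a real server returns, if any, depends on how that 
server is set up.

```python
from urllib.parse import urljoin, urlsplit

# File names taken from the note above; a real server may use others.
COMMON_INDEX_NAMES = ["index.html", "default.html", "home.html", "welcome.html"]

def points_to_directory(url):
    """Return True when the path part of the URL ends in a slash."""
    return urlsplit(url).path.endswith("/")

def candidate_index_urls(url):
    """List the default files a server might return for a directory URL."""
    if not points_to_directory(url):
        return []
    return [urljoin(url, name) for name in COMMON_INDEX_NAMES]

for candidate in candidate_index_urls("http://204.146.46.134:80/prev/explore/wtools/"):
    print(candidate)
```
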
 
================  
5. Types of URLs
================  

There are several different types of URLs.  The one we have seen so far 
in this paper is the HTTP URL.  When the World Wide Web came into wide 
use on the Internet, in late 1993, it offered an easy, single, 
consistent user interface that could be used to browse, or view, text 
and graphics at the same time.  With the Web came a new type of server 
known as the HTTP server.

Prior to the introduction of the HTTP server, there were several other 
servers already in use on the Internet.  These servers allowed users to 
(1) transfer files via archaic UNIX commands, like ls or get, (2) read 
news via programs like rn and nn that use commands like j or sz, and (3) 
use menus for finding things on gopher and WAIS servers.

To work with the different types of resources found on the Internet, you 
need a way to tell your browser how to find the resource and the type of 
resource you want to work with.  It can be a file that you want to 
download, a news article you want to read, or a gopher site that has a 
menu that you want to view.  Each type of resource will reside on its 
own type of server.

     --------------------------------
  >> NOTE: Different types of servers <<
     --------------------------------
  
To review the types of Internet servers, see the section “The servers of 
the Internet” found earlier in this paper.

There are many different types of URLs; however, the most common schemes 
are:

    HTTP URLs
    FTP URLs
    Gopher URLs
    News URLs


  ---------
  HTTP URLs
  ---------

HTTP is the Internet protocol specifically created for the World Wide 
Web, thus it will be the most common scheme you are likely to use.  
HTTP URLs point to the hypertext documents of the World Wide Web.  HTTP, 
as pointed out previously, stands for HyperText Transfer Protocol.  HTTP 
servers are commonly used for storing and serving hypertext documents.  
These types of documents tend to be extremely efficient, containing 
navigational information within themselves.  Moving from one document to 
another is handled via an embedded reference.  This means that the 
server protocol does not have to contain support for navigational 
features the way the Gopher or FTP protocols do.

For instance, you may want to go to a page that gives you information on 
creating a home page.  You can enter an address like:

    http://www.goliath.org/makepage/index.html

This URL is the Welcome to "Make Your Own Home Page" page.  Notice that 
it is an HTTP type URL.
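
The embedded references mentioned above are often written relative to 
the document that contains them, and the browser works out the full URL.  
The Python sketch below illustrates this with the standard urljoin 
function; the link targets "tips.html" and "/other/" are hypothetical 
and are not known to exist on this site.

```python
from urllib.parse import urljoin

current_page = "http://www.goliath.org/makepage/index.html"

# "tips.html" and "/other/" are hypothetical link targets, for illustration only.
same_directory = urljoin(current_page, "tips.html")    # a file next to this page
from_site_root = urljoin(current_page, "/other/")      # a path from the top of the site
other_site = urljoin(current_page, "ftp://ftp.eff.org/")  # a complete URL is used as is

print(same_directory)
print(from_site_root)
print(other_site)
```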

HTTP URLs have become the most common type of URLs on the Net today.


  ---------------------------------
  File Transfer Protocol (FTP) URLs
  ---------------------------------

The FTP URL scheme is used to access files and directories on Internet 
hosts using the FTP protocol.  The FTP protocol is one of the oldest 
ways of transmitting files over the Internet.  While there are many 
advantages to using HTTP instead, many servers don't offer full support 
for HTTP.  In addition, many client programs are developed for FTP.  
This is especially true if you are accessing the Internet via terminal 
emulation, as many UNIX clients still do.  In addition, many files are 
distributed only via FTP on the Internet. 

Connecting to an FTP site works basically the same way as connecting to 
an HTTP site.  For example, to connect to the Electronic Frontier 
Foundation’s FTP server, you would use the URL:

    ftp://ftp.eff.org/

Notice that the URL is very similar to an HTTP URL.  Instead of 
specifying the type of server as http://, you specified ftp://.  The 
name of the site where the resource is located is ftp.eff.org.  Notice 
in this case that you ended the URL with a forward slash.  This FTP URL 
does not specify a path and document name.  Therefore, the browser 
displays the contents of the top-level directory on the FTP server.

Another example specifies a particular file that you want to locate.  
Once it is located, your browser will either display it, if it 
recognizes the format, or notify you that it doesn’t recognize the 
format and offer to save it to disk for you.  If you want to find and 
display the file named cda_approved.gif on the same FTP server we just 
connected to, you would enter the following URL:

    ftp://ftp.eff.org/pub/EFF/Graphics/cda_approved.gif

Your browser will display a graphic similar to the following:

   *** PICTURE WOULD BE INSERTED HERE

Notice that the above URL looks very similar to the URLs that you have 
specified when working with HTTP.  In this case it has a directory and 
file name as part of the URL.

    ----------------------
 >> Note: Case sensitivity <<
    ----------------------

Notice that the above URL has both lower and upper case letters in it.  
URLs are case sensitive and must be entered with exactly the same upper 
and lower case letters as the directory and file names on the server.
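
Strictly speaking, not every part of a URL is case sensitive: the type 
of resource and the site name are not, while the directory and file 
names usually are on UNIX servers.  The Python sketch below illustrates 
the difference; the function name same_address is made up for this 
example, and whether a given server really treats the path as case 
sensitive depends on that server.

```python
from urllib.parse import urlsplit

def same_address(url_a, url_b):
    """Compare two URLs, ignoring case only in the scheme and site name."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (
        a.scheme == b.scheme            # urlsplit lower-cases the scheme
        and a.hostname == b.hostname    # .hostname is lower-cased as well
        and a.port == b.port
        and a.path == b.path            # the path is compared exactly
    )

original = "ftp://ftp.eff.org/pub/EFF/Graphics/cda_approved.gif"
print(same_address(original, "FTP://FTP.EFF.ORG/pub/EFF/Graphics/cda_approved.gif"))
print(same_address(original, "ftp://ftp.eff.org/pub/eff/graphics/cda_approved.gif"))
```

The first comparison differs only in the case of the scheme and site 
name, so the two URLs are treated as the same address.  The second 
changes the case of the directory names, so they are not.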


  -----------
  Gopher URLs
  -----------

As you work with FTP URLs you begin to realize that FTP sites can be 
very frustrating to work with.  You have to remember all of those FTP 
site names and, oh, many of the FTP sites have weird directory and file 
names.  This is where a gopher URL can help.  Gophers (and WAIS servers) 
are essentially menu systems.  They take a request for information and 
then find it for you.  This eliminates the need for you to search for 
it yourself.  Once a menu is displayed, you can select files and 
programs from FTP sites for downloading or displaying.

The Gopher protocol syntax is very similar to FTP and HTTP.  Instead of 
using http:// or ftp:// you specify gopher://.  For example, to connect 
to the National Cancer Center gopher site in Tokyo, Japan, the URL is:

    gopher://gopher.ncc.go.jp/

Another site you may be interested in is the United Nations Criminal 
Justice Country Profiles gopher site.  This site is maintained at the 
university in Albany, NY.  The gopher URL is:

    gopher://UACSC2.ALBANY.EDU:70/11/newman

Once on the server, you will see a menu similar to the following:

   *** PICTURE WOULD BE INSERTED HERE

This menu comes from the Gopher server.  Notice that it is a series of 
menu choices.  Since you are using a browser, like Netscape, it shows 
all the choices as underlined text.

To select any of the choices, all you have to do is double-click with 
the mouse on the menu choice you want.

Using this Gopher menu, you can click on the UN Criminal Justice Country 
Profiles and then select any country whose information you want to 
review or copy.

     -----------------------------------------------------------
  >> Warning: Port numbers when connecting to FTP/Gopher servers <<
     -----------------------------------------------------------

Sometimes, you may have to specify a port number for the FTP or Gopher 
site you are trying to connect to.  Usually the default port number 
will work.  If you are connecting to an FTP or Gopher site via a 
browser and a menu choice on another Web document, the port number will 
be passed at the same time, automatically.
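
The idea of a default port can be sketched as follows.  This Python 
example is an illustration only: the function name effective_port is 
made up, and the table lists the well-known default ports for the 
schemes discussed in this guide.

```python
from urllib.parse import urlsplit

# Well-known default ports for schemes discussed in this guide.
DEFAULT_PORTS = {"http": 80, "ftp": 21, "gopher": 70, "telnet": 23, "nntp": 119}

def effective_port(url):
    """Return the port written in the URL, or the scheme's default if none is given."""
    pieces = urlsplit(url)
    if pieces.port is not None:
        return pieces.port
    # Returns None for schemes that are not in the table.
    return DEFAULT_PORTS.get(pieces.scheme)

print(effective_port("gopher://UACSC2.ALBANY.EDU:70/11/newman"))  # port given in the URL
print(effective_port("ftp://ftp.eff.org/"))                       # no port, default used
```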

  ---------
  News URLs
  ---------

The last of the common URL types is used to connect to a Usenet 
newsgroup.  These URLs are known as News URLs.

Before demonstrating how to connect to a News URL, we need to quickly 
discuss Usenet:


  What is Usenet?
  ---------------

Usenet is a large collection of computers that share data with each 
other.  It is the people that use these computers that make Usenet worth 
the effort.  Imagine a conversation that is being carried on over days, 
where anyone can put their two cents in.  Usenet is like email, except 
that it is many-to-many instead of one-to-one.  It is the international 
meeting place where people gather to meet their friends, discuss 
events, or talk about anything they want.  Many people believe that 
Usenet is the Internet.  However, it is a totally separate system.  
All Internet sites CAN carry Usenet.  Usenet has millions of messages 
posted each day -- it is HUGE.  

The basic building block of Usenet is the newsgroup, which is a 
collection of messages related to a theme.  There are almost 10,000 of 
these newsgroups, in a wide range of languages, covering any subject you 
can imagine.  Which Usenet groups you have access to depends upon your 
Internet service provider.  Each newsgroup usually has a fee attached to 
it and requires that the provider pay this fee.  Therefore the 
newsgroups available are those that your provider subscribes to.

To connect to a newsgroup you use its URL.  A News URL is not written 
the same way as the previous URLs.  Specifically, you do not type the 
double forward slashes or a site name.  

Before you use any news services, you will have to specify the news 
server used by your Internet provider.  In Netscape this is done by 
specifying your NNTP (Network News Transfer Protocol) server in the 
Preferences dialog box, under the Options menu.  In Mosaic you set an 
environment variable NNTPSERVER to the name of the news server.  Most 
browsers will let you set the news server via a menu choice like 
Options.

Once your news server has been specified, you can point to a Usenet 
newsgroup by referencing the URL.  For instance, to connect to the US 
jobs offered newsgroup you would type:

    news:us.jobs.offered
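
The difference between a News URL and the earlier URLs shows up when the 
URL is split into parts.  In this short Python sketch (using the 
standard urllib.parse module, an assumption of this example), the site 
part comes back empty because the URL has no double slashes:

```python
from urllib.parse import urlsplit

pieces = urlsplit("news:us.jobs.offered")

scheme = pieces.scheme      # the type of resource
site = pieces.netloc        # empty: without "//" there is no site name in the URL
newsgroup = pieces.path     # everything after the colon

print(repr(scheme), repr(site), repr(newsgroup))
```

The news server itself is not named in the URL; it comes from the 
browser settings described above.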


  ----------
  Other URLs
  ----------

There are several other URLs that you can reference from your browser.  
Each can be referenced in a way similar to the way you have worked with 
HTTP, FTP and Gopher URLs.  

Some of the other URLs you may come across are:

    File URLs      file://ftp.unt.edu/README
    WAIS URLs      wais://wais.free.net/
    NNTP URLs      nntp:////<# doc>
    Telnet URLs    telnet://none@edlis.ied.edu.hk:23/
    Mailto URLs    mailto:mrirwin@ibm.net

Although you may not come in contact with these URLs often, you may find 
them as links in other WWW documents.  For instance, if you see a link 
like “send message to page owner” it will probably use a Mailto URL.  
Some URLs, like Telnet, will require that you have a Telnet application 
linked to your browser.  Since Telnet allows you to log in to a server 
as a terminal, you will need some sort of program that lets you act as a 
terminal.  This application will be run by your browser when you log 
into the server.
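
To see how these other URL types break down, the example URLs listed 
above can be split into parts in code.  The Python below is a sketch 
only; the function name describe is made up for this illustration.

```python
from urllib.parse import urlsplit

# Example URLs taken from the list above (the NNTP entry is left out
# because it is incomplete in this version of the guide).
examples = [
    "file://ftp.unt.edu/README",
    "wais://wais.free.net/",
    "telnet://none@edlis.ied.edu.hk:23/",
    "mailto:mrirwin@ibm.net",
]

def describe(url):
    """Return (type, user name, site, port, rest of the URL)."""
    p = urlsplit(url)
    return (p.scheme, p.username, p.hostname, p.port, p.path)

for url in examples:
    print(describe(url))
```

Notice that the Telnet URL carries a user name and a port number in its 
site part, while the Mailto URL, like a News URL, has no site part at 
all.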
 
 
===================  
6. Bookmarking URLs
===================  

Although URLs are frustrating to work with, there is an easy way to 
return to a URL resource.  Nearly all of the Internet Web browsers today 
have a feature which is like an automated address book.  Some browsers 
call it “Book Marking”, others call it “Hot Listing”.  In both cases the 
effect is the same.

Bookmarking allows you to grab a copy of a URL and store it so that you 
can easily go back to the site at a future time.
 
Once you understand the action of bookmarking, the definition of a 
bookmark becomes obvious.  A bookmark is a Web page tag or reference 
that you place in a list that can be accessed later to return to the 
URL.
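
As a rough sketch of this definition, a bookmark list can be modeled as 
a table that maps a title to a URL.  The Python below is an illustration 
only; the function names and titles are made up, and real browsers store 
their bookmarks in their own formats.

```python
# A bookmark list as a simple mapping from a title to a URL.
bookmarks = {}

def add_bookmark(title, url):
    """Store the URL under a title so it can be found again later."""
    bookmarks[title] = url

def go_to(title):
    """Look up the stored URL for a title (None if it was never bookmarked)."""
    return bookmarks.get(title)

add_bookmark("Philippine links", "http://www.europa.com/~ria/links.html")
add_bookmark("EFF FTP site", "ftp://ftp.eff.org/")

print(go_to("EFF FTP site"))
```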

Following are instructions for bookmarking (or hot listing) using 
several popular internet web browsers:

  Netscape’s Navigator (Version 2.0x and 3.0x)

      Go to the first page of the site you want to reference.
      Select Bookmarks >> Add Bookmark from the main menu.

  Microsoft’s Internet Explorer (Version 2.0x and 3.0x)

      Go to the first page of the site you want to reference.
      Select Favorites >> Add to Favorites from the main menu.

  Spry’s Mosaic (Version 4.00.xx)

      Go to the first page of the site you want to reference.
      Select Navigate >> Add Web page to Hot list, or click on 
      the ADD button on the button bar.

As you can see, adding URLs to a browser’s list is relatively easy.  All 
require the same basic action -- go to the URL resource that you want to 
add to the list.  Once you are at the resource, add it to the Bookmark 
or Hot list.
 
 
==================  
7. Errors and URLs
==================  

If you receive an error when attempting to connect to a URL, first check 
to see if you entered the correct URL -- in other words, check the 
typing.  

If it is OK, then perhaps the Web server is busy.  Simply try again 
later.

Finally, if you connect and it tells you that you must specify a port 
number and/or a user name and password, you will need to obtain the 
appropriate information and add it to your URL before accessing the URL 
resource.
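
A user name, password and port number all go in the site part of the 
URL, in the form user:password@site:port.  The Python sketch below shows 
one way to add them; the function name with_login_and_port and the user 
name "guest" and password "secret" are hypothetical, chosen only for 
this illustration.

```python
from urllib.parse import urlsplit, urlunsplit

def with_login_and_port(url, user=None, password=None, port=None):
    """Return the URL with a user name, password and/or port added to the site part."""
    pieces = urlsplit(url)
    site = pieces.hostname or ""
    # Keep a port already written in the URL unless a new one is given.
    if port is None:
        port = pieces.port
    if port is not None:
        site = f"{site}:{port}"
    if user is not None:
        login = user if password is None else f"{user}:{password}"
        site = f"{login}@{site}"
    return urlunsplit((pieces.scheme, site, pieces.path, pieces.query, pieces.fragment))

# "guest" and "secret" are made-up values, not a real account.
print(with_login_and_port("ftp://ftp.eff.org/pub/", user="guest", password="secret", port=21))
```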
 
 
=============  
8. In Summary
=============  

Using URLs is relatively easy as long as you remember one simple rule:

        RULE: URLs are case sensitive.

There are several different types of URLs; however, they all tend to 
work the same way:

      First  - you put in the type of URL you want to connect
               to (e.g. : http, ftp).

      Second - after this, you place the host server name.

      Third  - the path and resource you want to access.
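
The three steps above can be sketched in a few lines of Python.  This is 
an illustration only; the function name build_url is made up, and the 
sketch covers the URL types that include a site name (such as http and 
ftp).

```python
from urllib.parse import urlunsplit

def build_url(resource_type, site, path):
    """Assemble a URL from its type, host server name, and path."""
    # The two empty strings stand for URL parts this guide does not cover.
    return urlunsplit((resource_type, site, path, "", ""))

print(build_url("http", "www.europa.com", "/~ria/links.html"))
print(build_url("ftp", "ftp.eff.org", "/"))
```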

That is all there is to URLs.  Using URLs lets you move from one Web 
resource to another quickly and easily.
 
Source: geocities.com/tokyo/towers/4385
