Towards a Unification of Internet Applications
==============================================
Author: Uzi Paz
E-mail: for e-mail contact: user is uzi4wg and domain is uzipaz.com
First version: 18/08/1997
Recent version: 03/01/1998
Main Source:
http://www.spoofers.net/uzi/eng/unified.txt 
or http://www.geocities.com/uzipaz/eng/unified.txt
Copyright Notice: you may not copy this document or any part of it, to a
public location, nor to publicize it in any other manner without
prior permission from the author.

The aim of this document is NOT to serve as a learning resource, but
rather as material for further work. This document should be treated
only as a work in progress.

AN IMPORTANT COMMENT: the author does not claim to be knowledgeable in
the field. If you have any comments on matters that I seem to be unaware
of, please contact me.

Index
=====
1. Abstract
2. Motivation
3. Unified Framework for Internet Extensions
3.1 Definition of the Criteria
3.1.1 Objects
3.1.2 Distance
3.1.3 Expiration
3.1.4 Automatic and Manual Change of Distance
3.1.5 Identity of the Object
3.1.6 Output Presentation on Local Site
3.1.7 Foreground vs. Background
3.1.8 Topic Classification
3.1.9 Location Identifiers
3.2 Examination According to Criteria
3.2.1 Personal E-Mail
3.2.2 Unmoderated Mailing Lists
3.2.3 Unmoderated Newsgroups
3.2.4 FTP Sites
3.2.5 WWW-Sites
3.2.6 Intermediate Stages
3.2.6.1 Proxy Servers
3.2.6.2 News Servers
4. Putting Everything Inside
4.1 Objects
4.2 Distance
4.3 Expiration
4.4 Automatic and Manual Foreground/Background Transfer of Messages
4.5 Identity of the Object
4.6 Output Presentation on Local Site
4.7 Classification of Objects According to Topic
4.8 Location Identifiers
4.9 Intermediate stages
5. Transition Problems
6. Last Remarks
7. References

1 Abstract
----------
We discuss the different Internet applications from a unified
standpoint.
We believe that such a standpoint is the correct framework for any
further extensions. We treat the subject not from the technical
point of view, but rather from the user's point of view.

2 Motivation
------------
A few years ago, before the WWW (World Wide Web) was invented, there was
a clear distinction between the various Internet protocols.
FTP was for transferring files, Telnet was for connecting to a remote
account, and SMTP (E-Mail) was for transferring E-Mail messages between
e-mail users. Usenet was invented independently of the Internet, and was
later adopted by it. Mailing lists were well developed on Bitnet before
they appeared on the Internet, but as there were no problems in
transferring messages between the two networks, many Internet accounts
were subscribed to those Bitnet mailing lists.

At present, we still consider Usenet as being transmitted within the
Internet via NNTP, E-MAIL via SMTP, Web-pages via HTTP, files via FTP,
and mailing list messages via SMTP (or the secured extensions for these
protocols), but this is no longer accurate.
There are gateways from E-Mail to Usenet and vice-versa; from Usenet to
WWW and vice versa. More and more mailing lists have WWW archives with
an interface to post messages via WWW. Even private e-mail can be
accessed via WWW interface (e.g. Hotmail). Many Newsgroups have mirror
mailing lists, so that every message posted to the mailing list, will be
gatewayed to the mirror newsgroup and vice versa.
The invention of the URL was an important step towards a unification of
the different applications. A single browser allows you to access
various applications via various protocols. Many browsers do not really
care whether a message came via ftp, http, or nntp: if it is an html
message, they will offer you an (optional) html interpretation of it.
So it does not matter much whether you use http or ftp to get the file.

Each of the services and programs tries to provide better options, so
that you can use WWW to access Usenet and have all the advantages of a
regular news-reader (e.g. Billy News [1] uses cookies to allow the
reader to subscribe to newsgroups and mark messages as read while
accessing Usenet by HTTP, and more such services are appearing).
Mail-to-news gateways cause header fields that belong to one system to
be borrowed by the other.
Some mail clients support an option to put different incoming mail in
different "incoming" folders, so that the user can decide that incoming
mail from one mailing list will be kept in one folder, mail from another
mailing list will be kept in another, private e-mail in another
"incoming" folder etc. The automatic separation of messages into
different logical "folders" according to the name of the discussion
group was once a feature that belonged solely to Usenet; it is
increasingly becoming a feature of mailing lists as well. There is much
discussion on borrowing header fields from Usenet for E-mail (e.g. the
"Expires:" field [2], so that the poster of an e-mail would be able to
set an expiration date). This will make e-mail even more similar to
Usenet.
A suggestion was made to add to the headers of mailing list messages
new fields with URL-like structures for controlling subscription
options. This will give the e-mail client the option to add simple
"buttons" for controlling subscription to its interface. Such a
suggestion will make mailing lists even closer to newsgroups than they
are now. HTML is now fully implemented by many MUAs, and links from
e-mail messages to other resources are also supported.

To make things even more complicated, new creatures were invented:
newsgroup-like Web-based discussion groups (Hypernews [3]), and
mailing-list-like Web-based discussion groups (Web4Groups [4]) which
also have a standard e-mail interface. Each of them tries to combine the
advantages of the WWW with those of discussion groups.
Take one of these applications, use it via a gateway from e-mail or from
Usenet, and enjoy a mailing-list-like, WWW-based application via Usenet,
or perhaps a Usenet-like WWW-based application via Usenet, or perhaps
you wish to use the Usenet-like WWW-based application service via
e-mail?

Would you like all the smart advantages of Usenet to be provided to you
via e-mail when accessing Usenet through a gateway? With a wise enough
gateway, wise enough extension of the protocols, and a wise enough
mail-client, nothing is impossible.

This overlapping of areas is not simple: gateways may produce
compatibility problems. For example, suppose you post a message and put
two different mailing lists in the "To:" field, each of them mirrored
to a different newsgroup. Instead of being posted to Usenet once, with
a "Newsgroups:" field containing both newsgroups, the message is posted
twice with the same message-id: once with only the first newsgroup in
the "Newsgroups:" field, and once with only the second newsgroup.
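
As an illustration only, the duplication described in this example
could be undone on the receiving side by merging copies that share a
message-id. The Python sketch below assumes a hypothetical function
merge_gatewayed_posts and a simple dictionary layout for posts; neither
is part of any existing gateway or news software.

```python
def merge_gatewayed_posts(posts):
    """Merge copies that share a message-id into one post.

    Each post is a dict with "message_id" and "newsgroups" (a list).
    The merged post lists every newsgroup seen, in first-seen order.
    """
    merged = {}
    for post in posts:
        entry = merged.setdefault(
            post["message_id"],
            {"message_id": post["message_id"], "newsgroups": []},
        )
        for group in post["newsgroups"]:
            if group not in entry["newsgroups"]:
                entry["newsgroups"].append(group)
    return list(merged.values())


# Two copies of one message, each gatewayed to a different newsgroup.
copies = [
    {"message_id": "<1@example.org>", "newsgroups": ["example.list-a"]},
    {"message_id": "<1@example.org>", "newsgroups": ["example.list-b"]},
]
result = merge_gatewayed_posts(copies)
```

Here result holds a single post whose "newsgroups" list contains both
groups, which is what a single cross-posted article would have carried.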

The more each application tries to be compatible with gatewaying
services from other applications, and the more each application tries
to adopt advantages and features from the others, the less clear the
borders between the different applications will be.

At the moment, there are still clear differences between different
applications, but the direction is clear.

There is already such a trend towards unification, but it is driven
from the inside out, i.e. from application-specific points of view, and
from IETF groups each working on extensions of specific protocols.
The suggestion made here is to develop a unified framework for
discussing future extensions to existing protocols and applications,
and to discuss their unification.
More specifically, we wish to discuss the possibility of pushing this
unification to the extreme.

At first sight, it looks really crazy: how can an e-mail message sent
privately from one person to another, and a web-page, be treated as two
points on a continuous map?
We are not in a position to answer this question, but we wish to
initiate the discussion, not knowing where it will lead us in the
future.

Any attempt to invent a new protocol which unifies all features of all
existing protocols is immediately faced with the questions:
a) What guarantee do we have that it won't become just another
protocol, with gateways to and from other systems?
b) What compatibility problems will arise with other, non-Internet
networks when using gateways?

Before we try to unify the different systems and protocols, we should
discuss the existing ones from a unified standpoint.
Even without trying to unify the protocols, such a unified approach
should be a good framework for discussing further extensions of the
existing software and standards.

3 Unified Framework for Internet Extensions
--------------------------------------------
We shall try to define the different applications according to various
criteria. We shall not go into the technical aspects of the different
applications and protocols. We shall also ignore many aspects and
details which, in our view, should be discussed only at a later stage.
In order to make the overall picture more transparent, we do not go
into the details of the examples, nor do we intend to discuss all known
examples and special cases. The info below is probably not new to most
of you. Its main importance is in the way the info is presented.

3.1 Definition of the Criteria
------------------------------
3.1.1 Objects
-------------
An object is any file or stream delivered using Internet protocols.
It may be:
1) A text file
2) A MIME enhanced file
3) A binary file (not defined as a MIME enhancement)
4) An HTML file (not defined as MIME enhancement)
5) A stream of text such as the output for a LIST request on NNTP.

3.1.2 Distance
--------------
We shall roughly define three different distances for any object.
We shall define the distance with respect to each of the recipients.
The definition is connected neither to physical distance nor to
logical Internet distance.

REMOTE: ftp site, www-site, nntp-server of the poster of a news
message.

INTERMEDIATE: Proxy-servers, news-servers. Relay systems will not
be considered.

LOCAL: the local computer of the recipient, or a relay system of the
recipient's ISP.

Different systems and protocols give the sender, the recipient, and the
intermediate maintainer (i-maint) different degrees of control over the
transfer of the object from one distance to another.

3.1.3 Expiration
----------------
For some objects and for different distances, different people have
different control over the expiration of the object. In some cases the
expiration is set, in some there is no expiration, and in some cases
expiration is set at one stage but ignored by the later stages.

3.1.4 Automatic and Manual Change of Distance
----------------------------------------------
Sometimes the recipient requests a specific object or set of objects
to be sent to the local distance. In some cases, the recipient requests
that whenever some objects meet some definitions, they will
automatically be moved to the local distance. This might be generalized
to allow the recipient to request only an automatic transfer to an
intermediate distance, and then, while online, to request that the
objects be transferred to the local distance.

3.1.5 Identity of the Object
----------------------------
Some identities are supposed to be unique (message-id). Some may refer
to different objects at different times (the URL of a file which is
updated from time to time). A combination of a date/time and a URL is a
good URI. On the other hand, if there are a few copies of the same
object, then there might be a few different URIs for the same object.
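
The idea that a URL together with a date/time picks out one particular
object can be sketched as a lookup over the history of a URL. The
history list and the function name resolve below are assumptions made
for this illustration; no existing protocol stores such a history in
this form.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical history of one URL: (time stored, content), sorted by time.
history = [
    (datetime(1997, 8, 18), "first version"),
    (datetime(1998, 1, 3), "recent version"),
]


def resolve(history, when):
    """Return the content that the URL referred to at time `when`."""
    times = [stored for stored, _ in history]
    index = bisect_right(times, when)
    if index == 0:
        return None  # nothing was stored under the URL yet
    return history[index - 1][1]
```

The URL alone is ambiguous across time, while the pair (URL, time)
selects exactly one entry of the history, or none if the time precedes
the first stored version.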

3.1.6 Output Presentation on Local Site
---------------------------------------
This has more to do with the software of the recipient, but has to be
discussed as it is inseparable from the whole discussion.
Output will usually be directed to a file on the local site. It might be
a temporary file plus the screen (www-browsers), an incoming buffer
(e-mail), or a regular file (ftp).
A related question is who is in charge of the local presentation.
On WWW a site developer may choose the fonts and the presentation of the
service.
Software on the local machine may allow the user to choose the fonts and
presentation. Using a small number of standard font-codes, or
presentation-preferences, may allow the local user to choose his/her own
user-preferences for each of them (standard codes might be:
underlined/regular/bold/blinking/etc.). With some convention on the
usage of each standard preference, this gives the user easy control
over presentation. At the other extreme, a huge number of fonts and
total flexibility with no standards may provide a lot of flexibility to
the designer of the object, but no easy way for the user to choose
his/her own preferences.

3.1.7 Foreground vs. Background
-------------------------------
You may open your Web-browser, request an object and wait for it to
arrive at your local site. This is usually the case with www and ftp.
E-mail is different, as you may use it to get objects in the
background. You may use www-mail servers (e.g. agora servers) to get
the object in the background, or you may run ftp or http in the
background. At the moment, most people who wish to transfer objects to
their local distance in the background will find the by-e-mail
services the most useful tools. When discussing a unified approach, we
should look at this aspect as well.

3.1.8 Topic Classification
--------------------------
This criterion concerns the classification of objects, which allows
automatic receipt of objects and the locating of links and threads.
There are many different topic classifications, and html allows
everyone to construct a different classification. Do we wish for a
single standard classification? Can we unify a Usenet-like
classification (i.e. by threads, newsgroups, subhierarchies and
hierarchies) with the many different clever types of classification
found on WWW pages and FTP sites?

3.1.9 Location Identifiers
--------------------------
Location identifiers should provide info on where to find a specific
object. Do you wish to receive a document that matches a specific
specification? Where does it exist? This is probably the hardest part
of unification. It is not enough that the object holds the identifiers;
rather, those identifiers must help users who do not have the object to
identify its location.

3.2 Examination According to Criteria
-------------------------------------
We shall now examine how E-Mail, mailing lists, news-messages,
ftp-sites, and www-sites fall within the above criteria.
Many of the definitions here are not exact and there are plenty of
exceptions. Commenting on all the exceptions would make the discussion
much less transparent, hence many cases and exceptions are ignored. For
the same reason, we ignore other applications such as IRC and the
Internet phone.

3.2.1 Personal E-Mail
---------------------
A personal e-mail is any e-mail which the sender addresses to a list
of addresses, each of which is the incoming box of a single human.
In general:
(1) The poster has manual control over transferring the object to
the local distance.
(we ignore setting filters and killfiles)
(2) The recipient has no control (in general; kill files and filters
are considered automatic controls).
(3) No intermediate distance.
(4) Usually: no expiration (the recipient has to delete the object
manually).
(5) The object might be of type 1 or 2 (see 3.1.1).
It might of course be 4, but many mail-clients will not interpret it
as such.
(6) Identity is supposed to be unique.
(7) An "In-Reply-To:" header might in theory be used for generating
threads, but in practice this is not used.
On the local machine, one may either manually or automatically
organize the messages in different folders.
(8) No location identifiers needed. The "Received:" field serves as a
path-recorder, and may be extended to be used as a location
identifier. Location is at the local site.

3.2.2 Unmoderated Mailing Lists
-------------------------------
In general:
(1) The poster has manual control over transferring the object to the
local distance.
(2) The recipient has automatic controls, by choosing which mailing
lists to subscribe to.
(3) No intermediate distance.
(4) Usually no expiration.
(5) Types of objects are the same as in 3.2.1.
(6) Identity is supposed to be unique.
(7) Mailing lists are one kind of topic classification. Each mailing
list is supposed to send messages related to a specific topic.
An "In-Reply-To:" header might in theory be used for generating
threads, but in practice this is not used.
On the local machine, one may either manually or automatically
organize the messages in different folders.
(8) Same as in 3.2.1.

3.2.3 Unmoderated Newsgroups
----------------------------
In general:
(1) is the same as for mailing lists.
(2) is the same as for mailing lists, up to changes implied by (3).
(3) Intermediate news-servers: the maintainer of a news-server will
usually limit access to it.
(4) Expiration is set either by the intermediate maintainer or by the
poster.
(5) Types of objects are as in 3.2.1.
(6) Identity is supposed to be unique.
(7) Topic classification: hierarchies - sub-hierarchies - newsgroups -
threads.
(8) Location identifiers - not much needed. The "Path:" header field
may play a role equivalent to the "Received:" field in e-mail.

3.2.4 FTP Sites
---------------
In general (ignoring proxy servers for ftp sites):
(1) The poster has control over transferring the object to the remote
distance.
(2) The recipient has manual control over transferring the object from
the remote site to the local site.
(3) FTP proxy servers.
(4) No expiration. But at the distant site the poster has control.
(5) Originally, types of objects were 1 and 3.
(6) Identity may refer to different objects at different times.
(7) Classification according to location.
(8) Location identifiers - none. The URL includes an identifier of the
remote location. The stream provides partial location info.

3.2.5 WWW-Sites
---------------
In general:
(1) The poster has control over transferring the object to the remote
site.
(2) The recipient has manual control over transferring the object to
the local site.
(3) Some clever proxy servers allow the recipient to save time, by
automatically keeping objects at an intermediate site.
(4) No expiration. At the distant and intermediate sites, the poster
has control over deleting and replacing the object.
(5) All objects 1-4.
(6) Identity may refer to different objects at different times.
URL + time gives us a unique identity. There might be a few copies of
the same object with different URIs.
(7) First classification is by location, usually not used. A free
classification may be used by linking. There is no standard for such
a classification.
(8) Location identifiers appear in the URL, and include the remote
location.

3.2.6 Intermediate Stages
-------------------------
We shall discuss the different intermediate stages from the unified
point of view. We shall discuss two such systems:
news-servers and proxy-servers.
On any intermediate site which serves many users, the maintainer must
have some control, so that the resources are used most efficiently.
For news-servers, this is done by subscribing to only part of the
newsgroups, and by setting different expiration times for different
hierarchies.

3.2.6.1 Proxy Servers
---------------------
For this discussion we treat a proxy-server as an intermediate site
which saves the recipient time by keeping popular objects closer to the
site of the recipient. I do not wish to go into other possible uses of
proxy servers (such as firewalls).

A proxy server makes a non-automatic transfer of objects from a distant
site to an intermediate site. It is usually done in the foreground, but
may be done in the background. After the object has been transferred
from the remote site to the intermediate site, users will be able to get
the object from the intermediate site instead of the remote site, and
hence save time. The object has an expiration on the intermediate site.
Expiration depends on the resources of the intermediate site, and on
the amount of time passed since a recipient last requested the object.
After expiration, the recipient will get the object from the remote
distance, through the proxy-server.
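
The behaviour described here can be sketched as a toy cache in Python.
The class name ProxyCache, the max_idle parameter and the fetch_remote
callable are all assumptions made for this illustration. The sketch
models only the "time since the last request" part of expiration and
ignores the resource limits of the intermediate site.

```python
class ProxyCache:
    """Toy cache: an object expires when idle for longer than max_idle."""

    def __init__(self, fetch_remote, max_idle):
        self.fetch_remote = fetch_remote  # callable: url -> object
        self.max_idle = max_idle          # seconds an object may stay unused
        self.store = {}                   # url -> (object, last request time)

    def get(self, url, now):
        cached = self.store.get(url)
        if cached is not None and now - cached[1] <= self.max_idle:
            obj = cached[0]               # served from the intermediate site
        else:
            obj = self.fetch_remote(url)  # expired or absent: go to remote
        self.store[url] = (obj, now)      # record the time of this request
        return obj


remote_requests = []


def fetch_remote(url):
    """Stand-in for the remote site; records every request it receives."""
    remote_requests.append(url)
    return "content of " + url


cache = ProxyCache(fetch_remote, max_idle=60)
cache.get("http://example.org/a", now=0)    # fetched from the remote site
cache.get("http://example.org/a", now=30)   # served from the cache
cache.get("http://example.org/a", now=200)  # idle too long: fetched again
```

Of the three requests, only the first and the third reach the remote
site; the second is answered from the intermediate distance.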

3.2.6.2 News Servers
--------------------
A news-server provides a feature similar to a proxy server, in the
sense that it brings objects closer to the recipient.
Of course, a news-server is a basic element of the Usenet logical
structure, as Usenet messages do not have a fixed location.

A news-server makes an automatic transfer of objects from a distant
site to the intermediate site, according to the decision of its
maintainer. It is done automatically, and hence in the background.
Objects have an expiration date. The maintainer sets the expiration time
for the objects. The poster may request a specific expiration by
setting a header field. The maintainer may decide to respect this
request, or not.


4 Putting Everything Inside
---------------------------
We wish to discuss a new creature which combines the most flexible
features according to the criteria mentioned in 3.1.
The purpose is not to offer a specific suggestion, although at a future
stage we may be able to make one. At the moment, we only wish to
concentrate on the discussion itself. We shall treat the creature as if
it were a new logical structure (like ftp, www, Usenet, e-mail etc.).

4.1 Objects
-----------
Our structure has to support all objects mentioned in 3.1.1.
One may say that the structure can handle binary files through MIME,
and hence there is no need for other types of binary files.
It is important to state that we wish this structure to be compatible
with all existing objects. We do not go into the technical manner in
which binary files are transferred, but it should be clear that the
structure should be able to handle even binary objects which do not
have MIME header-fields.
As for the `stream' object mentioned in 3.1.1, a local or intermediate
agent may request such an object from the intermediate or remote site.
Since listings are delivered as files by some systems and as streams by
others, we wish our structure to treat the stream as any other object.

4.2 Distance
------------
The structure has to support an intermediate stage as a basic element.
The nature of the intermediate servers should be, in general, very
flexible. For any object, the poster, the (human) recipient, and the
maintainer all have control over the passing and expiration of the
object in the intermediate server.
There might be more than one intermediate stage.

4.3 Expiration
--------------
Expiration is always to be set for any stage by the maintainer of the
stage: for the local stage by the recipient, for the intermediate stage
by its maintainer, and for the remote stage by the poster. A poster may
request a different expiration, and the recipient and the
intermediate-server maintainer may (automatically) accept the request
or not.
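
A minimal Python sketch of this rule follows, under stated assumptions:
the class StagePolicy and its fields are hypothetical names, and the
optional cap max_days on accepted requests is an addition made for the
illustration, not something proposed in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StagePolicy:
    """Expiration policy of one stage (local, intermediate or remote)."""
    default_days: int                # set by whoever maintains the stage
    honor_requests: bool = False     # whether poster requests are accepted
    max_days: Optional[int] = None   # assumed cap on accepted requests

    def expiration_days(self, requested_days=None):
        """Return the expiration applied to an object at this stage."""
        if requested_days is None or not self.honor_requests:
            return self.default_days
        if self.max_days is not None:
            return min(requested_days, self.max_days)
        return requested_days


strict = StagePolicy(default_days=7)
lenient = StagePolicy(default_days=7, honor_requests=True, max_days=30)
```

With these two policies, a poster's request for 90 days is ignored by
the strict stage and shortened to 30 days by the lenient one, while a
request for 14 days is accepted as it stands by the lenient stage.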

4.4 Automatic and Manual Foreground/Background Transfer of Messages
-------------------------------------------------------------------
The recipient may request the transfer of a specific object to the
local site, or may specify that whenever a new object which matches
some criteria is found at a certain remote/intermediate site, it should
automatically be transferred to the local site. The recipient may also
request certain objects to be automatically or manually transferred
from a remote site to an intermediate site in the background, so that
it would be possible to get the objects faster in the foreground. Any
such usage of the intermediate server should be governed by the
maintainer of the intermediate site, who has to define good rules for a
fair sharing of the server's resources among the local users.
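
The automatic case can be sketched as matching object headers against
rules set by the recipient. In the Python sketch below, the function
name select_for_transfer, the header fields "id", "topic" and "type",
and the rule format are all assumptions made for the illustration.

```python
def select_for_transfer(headers, rules):
    """Return the ids of objects whose header matches at least one rule.

    A rule is a dict of header field -> required value; every field in
    the rule must match for the rule to apply.
    """
    selected = []
    for header in headers:
        for rule in rules:
            if all(header.get(field) == value for field, value in rule.items()):
                selected.append(header["id"])
                break  # one matching rule is enough
    return selected


# Headers of objects available at a remote or intermediate site.
available = [
    {"id": "a1", "topic": "comp.mail", "type": "text"},
    {"id": "b2", "topic": "comp.mail", "type": "binary"},
    {"id": "c3", "topic": "rec.food", "type": "text"},
]
# The recipient wants text objects on one topic, and anything on another.
rules = [{"topic": "comp.mail", "type": "text"}, {"topic": "rec.food"}]
chosen = select_for_transfer(available, rules)
```

Only the headers need to be visible for this selection to work, so the
objects themselves can stay at the remote or intermediate distance
until they are chosen.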

4.5 Identity of the Object
--------------------------
On Usenet and e-mail any object should have its own identity, and there
are no specific rules for the naming. On WWW and FTP the identity is
defined by the location of the object, and hence replacing the object
with an updated one will not change its identity.
Identity by location has a clear disadvantage if the object has no
fixed location. If we leave aside for a moment the question of
transition (we address it later in 5), a good suggestion for an
identity would be the address (email?) of the creator of the object,
together with the exact time of creation. This leaves open the question
of how the location of an object is defined. We address that question
in 4.8.
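
As a sketch of this suggestion, an identity could be built from the
creator's address and the creation time as follows. The function name
make_identity and the layout of the resulting string are assumptions
made for the illustration only; the text does not fix any format.

```python
from datetime import datetime, timezone


def make_identity(creator, created):
    """Build an identity string from creator address and creation time.

    The time is converted to UTC so that the same instant always gives
    the same identity, whatever the time zone of the creator.
    """
    stamp = created.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S.%fZ")
    return "<" + stamp + "." + creator + ">"


created = datetime(1998, 1, 3, 12, 0, 0, tzinfo=timezone.utc)
identity = make_identity("author@example.org", created)
later = make_identity("author@example.org", created.replace(second=1))
```

Such an identity does not mention any location, so an updated object
created at a later time receives a new identity even if it is stored in
the same place as the old one.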

4.6 Output Presentation on Local Site
-------------------------------------
This part enters into specific client software considerations.
Presentation of each object has to be according to the type of object,
whether the object is an html file or a text file etc.
Each object has a header, which is either a stream of info about it
(such as the stream result of the ftp dir command for a file) or the
header of an e-mail/news message. These all should be treated as the
header of the object. The interface should provide both the header and
the object as distinct but connected parts.

The header frame might be used for generating a continuation of the
thread (a reply, a forward etc.).

There is room for further discussion of the header: how does it tell us
the locations of the object?

4.7 Classification of Objects According to Topic
-------------------------------------------------
In section 3.2 we saw a few examples of classifications, and we may put
them into three main types:
(i) Classification by hierarchies/Sub-Hierarchies/Names/Threads (Usenet)
(ii) Classification by location (FTP)
(iii) Free Classification (WWW)

In practice, I see no advantage in the second type (which should not be
confused with the third one, which might be related to the second).
I believe that the first and third ones must coexist: a standard
type (i) classification, and a free (i.e. non-standard) classification
for html objects.
A type (i) classification should be standard for all types of objects,
and may include the info that until now could be found in the protocol
envelope, in the object header fields. Its output presentation might be
an HTML-interpreted standard structure of the header frame
(see 4.6).

4.8 Location Identifiers
------------------------
Location identifiers have to identify locations in the most efficient
manner. If possible, they should hold info on as many locations as
possible for a specific object, with an expiration date for each of
them. If the object has no expiration date on the remote site, there is
no problem and no need for more identifiers, as the intermediate sites
will not need to use them if they hold the object.
4.9 Intermediate stages
-----------------------
In order for the recipient to be able to locate the objects relevant
to him/her, it is important that, if not the objects themselves, then
at least their headers (or indices of some kind) be pushed to the
intermediate sites.
It is possible that there will be a standard remote location for each
hierarchy or subhierarchy, and whenever a user wishes to receive a
list of objects according to a specific classification, the list will
be pulled from the remote site.
There is much more room to discuss this issue. I'm totally unsatisfied
with what I have at the moment.


5 Transition Problems
---------------------
The last thing we wish is to add just another protocol, so that many
sites which exist at the moment will continue to use the current
protocols.

In order to allow it to replace existing protocols, it is better that
the protocol be compatible with most of the currently existing
protocols, so that all existing sites could be accessed using this
protocol.
Extending the existing protocols so that they support the new protocol
and application is another direction.

Another option is to decide that a specific protocol (e.g. HTTP) will be
extended to support the whole spectrum of options discussed.

6 Last Remarks
--------------
The draft above is just a starting point. There are quite a few
problems to be solved before a standards-draft can be written.
I do not consider it a good starting point, but this is what I was
able to do at this stage, and I do believe that such a direction of
research is important. Any comments are welcome.

7 References
------------
1. http://www.billyboard.com/
2. ftp://ftp.dsv.su.se/users/jpalme/draft-ietf-mailext-new-fields-09.txt
3. http://union.ncsa.uiuc.edu/HyperNews/get/hypernews.html
4. http://www.dsv.su.se/~jpalme/w4g/web4groups-summary.html
