WebRFM 0.4 (beta) - A Remote CGI File Manager.
Copyright (C) 1999 Yoram Last (ylast@mindless.com)
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
ABOUT THIS FILE:
==================
This is the readme file for WebRFM 0.4, released December 18, 1999. This file
contains general information and installation instructions for WebRFM. More
information can also be obtained by looking at the main WebRFM script in
WebRFM's 'scripts' directory. Further documentation is included in WebRFM's
online help pages. WebRFM must be installed before these pages can
be properly accessed. Also, for the latest information on WebRFM, you should
check its homepage at
http://webrfm.netpedia.net/index.html
CONTENTS:
=============
A. WHAT IS THIS?
B. WHAT DOES IT DO?
C. EXTERNAL PROGRAMS
D. COMPATIBILITY
E. SYSTEM REQUIREMENTS
F. INSTALLATION
G. STATUS OF THIS PROGRAM
H. TODOS
I. TECHNICAL SUPPORT
A. WHAT IS THIS?
=================
WebRFM (Web-based Remote File Manager) is a CGI-Perl program aimed at providing
a single solution for remote Web-based file management, and at replacing traditional
FTP-based access for that purpose. It is suitable for managing websites, as well
as for more general purpose file management tasks. WebRFM combines a "visible"
HTML 3.2 compliant form-based layer (which is in the spirit of the tools currently
provided by many large hosting services) along with a "hidden" direct HTTP layer
that implements a class 1 WebDAV server. Support for some legacy HTTP methods
(which are essentially borrowed from AOLserver and Netscape's Enterprise server) is
also provided. While WebRFM can be installed and used by individual users, it is
specifically designed to provide a secure system-wide solution that is suitable for
use by ISPs, web-space providers, etc. WebRFM currently runs on UNIX/Linux
systems.
B. WHAT DOES IT DO?
===================
1. Provides a simple-minded (but effective) HTML form-based file manager.
2. Provides a built-in HTML form-based text editor.
3. Supports file retrieval (downloading) as well as form-based (RFC 1867)
file uploading. An FTP-like 'text mode' option is available for file transfers
in both directions.
4. Provides support for the HTTP 1.1 'PUT' and 'DELETE' methods. Files can
be transparently edited and then published using applications that support
'PUT' (such as the HTML editor in Netscape's Communicator).
5. Provides extensive support for additional HTTP methods that are used for
content management. This currently includes a rough implementation of
a class 1 WebDAV server (MKCOL, MOVE, COPY, and PROPFIND methods), along
with support for some legacy (non-standardized) HTTP methods (MKDIR, BROWSE,
INDEX, SAVE, EDIT, and RMDIR methods). This results in WebRFM being able
to work properly with many clients that are designed to use those HTTP
extension methods. In particular, AOLpress, SiteCopy, Cadaver, Microsoft's
Web Folders (which comes as part of Internet Explorer 5) and Office 2000
applications, and Netscape's Communicator Roaming Profiles, are fully
functional with WebRFM.
6. Designed to operate in a secure way on multi-user systems. In particular,
the following security-related features are provided:
a) Runs in the security context (UID/GID) of the authenticated user that
is using it, so OS-based restrictions (such as quota limits) are
enforced. A special setuid wrapper is provided as a simple way of
running WebRFM like this. It is also possible to use other
wrappers, such as the Apache 'suEXEC' wrapper.
b) Optionally implements a user-dependent 'virtual root directory'. By
default, each user's home directory appears to him as if it were the
file system root directory and he can't access anything outside of it.
The 'virtual root directory' can be changed to be any other directory,
including a subdirectory of the user's home directory.
c) Contains built-in access control mechanisms (both location-based and
user-based) that can be used to enhance and double-check server-imposed
access control. Implements various checks (such as imposing minimal
UID and GID to run as) to ensure secure operation.
d) Has a built-in permissions engine that can be easily customized to
impose various restrictions beyond those that are natively provided
by the OS.
7. Modular, highly configurable design. WebRFM's behavior is controlled
by a fairly large number of variables that can be changed to customize it
in various ways. In particular, WebRFM's HTML interface is very
configurable. Arbitrary parameters for the '<BODY>' tag, two sets of table
parameters, and several size parameters can be set, and they control
the appearance of the interface in a consistent way. Properties that should
be user-controlled (like those that affect the appearance of WebRFM's
HTML output), are stored in a per-user configuration file and can be
modified from a web-based interface. Administrators can disable user-
control of some (or even all) of these properties. Properties that
should only be changed by administrators are set in the main script
and are protected from user intervention.
Remark: Essentially everything that can be done by using WebRFM can also
be done through FTP. Some of the main advantages of WebRFM over FTP are:
1) It does not require any client other than a web browser, and it is
thus likely to be simpler to use for non-technical users.
2) It can be much more secure, because:
a) It can be used over completely encrypted connections (by using an SSL
capable server).
b) It can be used in conjunction with secure authentication schemes (such
as digest authentication) that avoid sending plain text passwords over
non-encrypted connections.
c) It can be used behind a firewall through a standard HTTP proxy server.
d) The 'virtual root directory' and other built-in mechanisms can be used
to limit access to the system itself, as well as to impose many
restrictions on what can be done through WebRFM.
C. EXTERNAL PROGRAMS:
======================
The WebRFM distribution archive includes the following two external scripts.
They are used by WebRFM, but are not part of it.
1. cgi-lib.pl by Steven E. Brenner: "The de facto standard library for creating
Common Gateway Interface (CGI) scripts in the Perl language." It is located in
WebRFM's 'lib' directory. Details concerning usage and distribution of this
library can be found in the body of the cgi-lib.pl file itself. More
information concerning it can be found at the cgi-lib.pl homepage at
http://www.bio.cam.ac.uk/cgi-lib.
2. getcwd.pl by Brandon S. Allbery: Gets the current working directory, and is used
by WebRFM precisely for this purpose. This is simply the getcwd script from the
standard Perl library, which is included with the standard Perl distribution.
We include it here (in WebRFM's own 'lib' directory) to release WebRFM from
needing anything other than a Perl binary in order to work.
D. COMPATIBILITY:
==================
1. Server-side:
WebRFM makes heavy use of both Perl and the (standard) CGI interface. As a
result, it requires a good Perl interpreter and a server that has a robust CGI
implementation. The current version was tested mainly on Red Hat Linux 5.x
systems with Apache 1.2.6/1.3.3 servers and Perl 5.004. However, it should work
on any UNIX system with Perl 4.036 or later installed, using any web server
that has a robust CGI implementation (in particular, the server should be willing
to transfer arbitrary HTTP request methods to CGI programs). Non-UNIX operating
systems are not currently supported.
2. Client-side:
WebRFM's HTML form-based layer should work with any HTML 3.2 compliant browser.
JavaScript support would make some things work a little bit faster, and there
are some very minor features that are only available with Netscape browsers
(WebRFM is at its best when using a Navigator 4.xx browser with JavaScript
enabled). However, those things are not really necessary, and WebRFM remains
fully functional without them. Some current browsers do not properly support
form-based file uploading, and thus this particular functionality is not
available with such browsers. Overall, WebRFM's form-based interface was
designed to ensure compatibility with a wide range of browsers and screen
resolutions (including browsers that run on non-PC devices). It has been
specifically tested for compatibility with pure text browsers (W3m, Lynx)
and with WebTV. There is a minor problem when using Microsoft's Internet
Explorer (version 3.0 or higher) in that the 'GET as TEXT' and 'GET as BIN'
file downloading methods are not effective (they behave the same as 'GET').
This is because MSIE ignores the server-reported MIME type of files and decides
by itself what the type of the file is and what it should do with it (this is
in violation of the HTTP protocol).
WebRFM's direct HTTP layer should work with any client that is designed
to do one of the following:
a) To publish documents using the HTTP 1.1 PUT method, and/or to remove
documents (or directories) using the HTTP 1.1 DELETE method.
b) To use the extension methods of an AOLserver (aka NaviServer).
c) To use the extension methods of a Netscape Enterprise server
(except for locking and versioning related functionality).
d) To work with a class 1 WebDAV server.
Some specific clients that were found to work well with WebRFM's
direct HTTP layer are mentioned in section B above.
E. SYSTEM REQUIREMENTS:
========================
WebRFM should work on any system that runs an appropriate operating
system, Perl interpreter, and web server. For reasonable performance
(namely, convenient response time), the following minimal system
configurations (or equivalents) are recommended (faster is always better):
Linux: 486DX2 (66 MHz) with 16 MB RAM.
If the system is simultaneously running other programs (e.g., a number
of servers), more memory might be needed to obtain reasonable performance.
F. INSTALLATION:
=================
WebRFM is quite flexible in how it can be installed and used. We provide below
explicit instructions for two types of installations: a single-user installation
and a system-wide installation. Before we describe those specific
installations, it would be useful to note a few things about WebRFM's design,
which can be thought of as having the following three parts:
a) A 'Main Program Directory', where most of the program actually resides. In
general, it can be located anywhere. When WebRFM is run, it must have read
permission to most of the files in this directory.
b) A gif image file called 'highdir.gif' that should be made retrievable directly
through the web server. It can be added to an existing 'icons' or 'images'
directory, or reside in its own (web accessible) directory.
c) The 'main WebRFM script' which is the one and only file that is being run as
a CGI script. It can be moved anywhere and renamed as desired, as long as it
is being enabled as a CGI script. In order for WebRFM to work, there are two
pieces of information that must be entered in the body of this file as part
of the installation: The location of the 'Main Program Directory' (so that
WebRFM can find the rest of itself) and the URI which corresponds to the
directory where the 'highdir.gif' image file is found (so that WebRFM can
create appropriate references to it). The 'main WebRFM script' also doubles
as being the main configuration file for WebRFM. The first part of this file
contains many variables that can be set to control various aspects of WebRFM's
operation.
Other than these three parts, we should also note that each user running WebRFM
should have a configuration directory where WebRFM keeps some per-user
configuration files (These files should never be edited manually. WebRFM provides
a form-based interface to manage them.) The location of this configuration
directory can be set in the 'main WebRFM script' (the default is ~/.WebRFM).
If it does not exist, it is created automatically when WebRFM is used for
the first time by a user.
Other than the main configuration information in the 'main WebRFM script',
there are two additional files that contain configuration variables. Both
reside in WebRFM's 'lib' directory (the 'lib' subdirectory of the 'Main Program
Directory'). The first is the file 'initlib.pl', which contains default values
for user controlled variables. The second is the file 'extlib.pl' which
contains most of the implementation of WebRFM's direct HTTP layer. The first
part of this file defines some variables that control some aspects of this
layer. Normally, none of these files needs to be modified.
Another file that may need to be edited is WebRFM's default MIME table. This
is the 'mimetable' file in WebRFM's 'lib' directory. If WebRFM is used for
managing web content, then it is recommended that the MIME type mappings
defined in this file correspond as closely as possible to those that
are used by the web server. (Note that all of the file extensions in this file
must be capitalized. The matching that WebRFM eventually does is case insensitive.)
We can now proceed to describe some specific installation setups of WebRFM:
Private single-user installation:
----------------------------------
An installation of this type can be done by any user that:
a) Has a valid user account on a UNIX system.
b) Has the privilege of running CGI programs (through an appropriate web
server on that system) in his own user context.
Shell access may be helpful for the installation, but is not essential. In
most cases, FTP access would suffice (but some of the text editing described
below would need to be done on a remote machine). Prior experience in running
CGI programs is recommended. Do the following:
a) Extract the distribution archive to its final destination. You must extract
it in a way that preserves directory structure, such that you get a top-level
'WebRFM' directory (this is your 'Main Program Directory') with a number of
subdirectories (we refer to those as WebRFM's directories). Your home
directory should be a good place to extract, such that you will get a
'WebRFM' subdirectory in your home directory.
b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some place
within your web space, such that it can be retrieved through the web server.
c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main WebRFM
script'. Copy it to where you want to run it from, and enable it as a CGI
script. Restrict access to it such that it is only accessible to whoever is
supposed to access it (presumably just you). It is strongly recommended that
you use a username + password authentication scheme.
d) Open your CGI-enabled 'main WebRFM script' with a text editor. Look for
the line starting with '$ProgDir = ', and set the value of $ProgDir to be
the full path to your 'Main Program Directory'. Then look for the line
starting with '$SendFilesUrl = ', and set the value of $SendFilesUrl to
be the URI which corresponds to the directory into which you previously
copied the 'highdir.gif' file. Also, make sure that the first line of the
script points to Perl on the system. Save your changes.
WebRFM should now be properly installed.
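The editing in step (d) can also be scripted. The following is a minimal
sketch and is not part of the WebRFM distribution. The helper name
'set_script_var' is hypothetical, and the sketch assumes that each variable
is assigned on a single line of the form $Name = "value"; and that the new
value contains no '|', '&', '\' or '"' characters.

```shell
#!/bin/sh
# set_script_var FILE NAME VALUE   (hypothetical helper)
# Rewrites the line of FILE that starts with '$NAME = ' so that it reads
#   $NAME = "VALUE";
# All other lines are left unchanged.
set_script_var() {
    file=$1 name=$2 value=$3
    tmp="$file.tmp.$$"
    sed "s|^\\\$$name = .*|\$$name = \"$value\";|" "$file" > "$tmp" &&
        cat "$tmp" > "$file" &&   # overwrite in place, keeping mode and owner
        rm -f "$tmp"
}

# Example usage (the paths and the URI are placeholders for your own setup):
#   set_script_var ~/public_html/cgi-bin/webrfm.cgi ProgDir "$HOME/WebRFM"
#   set_script_var ~/public_html/cgi-bin/webrfm.cgi SendFilesUrl "/~me/images"
```

It is a good idea to look at the edited lines afterwards, since the actual
layout of the assignments in your copy of the script may differ from what
is assumed here.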
System-wide installation:
--------------------------
In order to perform this type of installation, you should become the root
user.
a) Extract the distribution archive to its final destination. You must extract
it in a way that preserves directory structure, such that you get a top-level
'WebRFM' directory (this is your 'Main Program Directory') with a number of
subdirectories (we refer to those as WebRFM's directories). The
recommended location to extract the archive is /usr/local/lib, such that
your 'Main Program Directory' will be /usr/local/lib/WebRFM.
b) Copy the file 'highdir.gif' from WebRFM's 'sfdir' directory to some place
within your web space, such that it can be retrieved through the web server.
If you have a global 'icons' or 'images' directory, it should be a good
location for it, as long as you don't already have some other file with
the same name in there.
c) The file 'webrfm.cgi' in WebRFM's 'scripts' directory is your 'main WebRFM
script'. Open this file with a text editor. Look for the line starting with
'$ProgDir = ', and set the value of $ProgDir to be the full path to your
'Main Program Directory' (if you followed the recommendation in (a), this
should already be set for you). Then look for the line starting with
'$SendFilesUrl = ', and set the value of $SendFilesUrl to be the URI which
corresponds to the directory into which you previously copied the
'highdir.gif' file. Also, make sure that the first line of the script points
to Perl on your system. Save your changes.
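For either type of installation, the check that the first line of the script
points to Perl can be done with a small shell helper. This is only a sketch:
the function name 'check_shebang' is hypothetical, and it merely verifies
that the interpreter named on the '#!' line exists and is executable.

```shell
#!/bin/sh
# check_shebang FILE   (hypothetical helper)
# Prints the interpreter path named on the first line of FILE and succeeds
# if that path is an executable file. Fails otherwise.
check_shebang() {
    script=$1
    first=$(head -n 1 "$script") || return 1
    case $first in
        '#!'*) ;;
        *) echo "no #! line in $script" >&2; return 1 ;;
    esac
    interp=${first#'#!'}                   # drop the '#!' prefix
    interp=${interp#"${interp%%[! ]*}"}    # drop leading spaces
    interp=${interp%% *}                   # drop interpreter arguments
    if [ -x "$interp" ]; then
        echo "$interp"
    else
        echo "interpreter not found: $interp" >&2
        return 1
    fi
}

# Example usage:
#   check_shebang /usr/local/lib/WebRFM/scripts/webrfm.cgi
```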
Your basic installation of WebRFM is now complete. However, in order for your
users to be able to use it, they would need some (properly authenticated) way
to get the 'webrfm.cgi' file in WebRFM's 'scripts' directory to run as a CGI
program in their appropriate user context (namely, it needs to be run with
their UID/GID). There are several ways of doing that. If you already have
a mechanism (such as the Apache suEXEC wrapper) that enables users to run CGI
programs in their own user context, it can also be used to run WebRFM. Your
users can simply set up (or you can set up for them) a simple two-line wrapper
script of the form
#!/usr/bin/perl
require "/usr/local/lib/WebRFM/scripts/webrfm.cgi";
Of course, a script of this type should be owned by the appropriate
user/group, and access to it must be restricted appropriately.
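Creating such a wrapper script can be sketched in shell as follows. The
function name 'make_wrapper' and the file mode 755 are assumptions made for
this example (the mode keeps the file non-writable by group and others, but
your wrapper mechanism may require something else). Ownership and web
server access control still have to be set up separately.

```shell
#!/bin/sh
# make_wrapper DEST [MAIN]   (hypothetical helper)
# Writes the two-line Perl wrapper to DEST. MAIN is the path of the main
# WebRFM script and defaults to the recommended system-wide location.
make_wrapper() {
    dest=$1
    main=${2:-/usr/local/lib/WebRFM/scripts/webrfm.cgi}
    printf '#!/usr/bin/perl\nrequire "%s";\n' "$main" > "$dest" || return 1
    chmod 755 "$dest"    # assumed mode: writable only by the owner
}

# Example usage, run as the user who will own the wrapper
# (the destination path is a placeholder):
#   make_wrapper "$HOME/public_html/cgi-bin/webrfm.cgi"
```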
Another (generally much simpler) way is to use WebRFM's own setuid wrapper
in order to provide all of your users access from a single point. The code
for this wrapper is the file 'wrfmwrap.c' in WebRFM's scripts directory.
You should open this file with a text editor, and then follow the
instructions given there (in particular, note the warnings given there).
Once you have this wrapper properly installed as a setuid CGI program
(if you have a cgi-bin directory, putting the wrapper there should
normally be OK), you would need to set access control for this file, such
that your users would get authenticated with their appropriate user names.
Please note that the wrapper works in the following way: It trusts the
server to supply the appropriate user name in the REMOTE_USER environment
variable, and then if it finds a valid system user with that user name,
it spawns WebRFM with the corresponding UID/GID. Technically speaking, you
can achieve the appropriate kind of authentication (we assume here that you
are using an Apache server, although most other servers should be similar)
by using your /etc/passwd file (if you are using it as a user database) as
an 'AuthUserFile'. However, it is strongly recommended NOT TO DO THAT, since
it exposes your root account (and other privileged accounts) to password
guessing attacks. A much better approach would be to use a separate
'AuthUserFile' file. You should just make sure that this file includes
proper user names (that is, names that correspond to all of your system's
users that need to run WebRFM, but not names of any privileged users that
should not send passwords over non-secured connections). Of course, there
are many Apache modules to authenticate against various types of user
databases, and many of them can also be used here. The main principle to
remember here is that users must be valid system users, and they should also
have valid home directories on the system.
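As an illustration of how the list of names for a separate 'AuthUserFile'
might be prepared, the following sketch prints the non-privileged users
found in a passwd-format file. The function name 'eligible_users' and the
UID threshold in the example are assumptions, not WebRFM settings. Choose a
threshold that matches how your system numbers ordinary accounts.

```shell
#!/bin/sh
# eligible_users PASSWD_FILE MIN_UID   (hypothetical helper)
# Prints the names of users whose UID is at least MIN_UID and whose home
# directory exists, one name per line.
eligible_users() {
    passwd_file=$1 min_uid=$2
    awk -F: -v min="$min_uid" '$3 >= min { print $1 ":" $6 }' "$passwd_file" |
    while IFS=: read -r name home; do
        if [ -d "$home" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example usage (500 is only an example threshold):
#   eligible_users /etc/passwd 500
# Each name can then be added to the separate user file with Apache's
# htpasswd program, which prompts for a password, e.g.
#   htpasswd /path/to/webrfm.users alice
```

Passwords entered this way are independent of the system passwords, which
is the point of keeping a separate file.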
An important note concerning Apache and proper WebDAV operation:
-----------------------------------------------------------------
If you are using an Apache server and you would like WebDAV clients such
as Microsoft's Web Folders to work properly with WebRFM, there is still
one further thing that you would need to do. Note that this applies to
both private and system-wide installations. The source of the problem is
that WebDAV clients expect the server to provide a DAV header in responses
to OPTIONS requests for DAV enabled resources. While WebRFM has an
appropriate implementation of the OPTIONS method, Apache handles such
requests by itself and does not transfer them to WebRFM at all. My
(temporary?) workaround is to use the Apache 'Header' directive (this
requires mod_headers to be available) in order to force a 'DAV: 1'
header to be attached to every response from WebRFM. This ensures the
inclusion of this header in OPTIONS responses, and it shouldn't hurt
anything else. For example, if I have a single point system-wide
installation, and WebRFM is available as '/cgi-bin/webrfm', then the
following lines in my httpd.conf file do the trick:
<Location /cgi-bin/webrfm>
Header set Dav 1
</Location>
In case of a private installation, a similar setting in an appropriate
.htaccess file should work.
If you further want WebRFM's WebDAV layer to work smoothly with Microsoft
clients (or even to work at all, in case that you have the FrontPage extensions
installed on the same server), then there is yet one more header that needs
to be added in a similar way. This is the 'MS-Author-Via' header (a proprietary
Microsoft header that is instructing Microsoft clients how they should try to
accomplish content management) which should have the value 'DAV'. That is,
the complete WebDAV-related header setup in your httpd.conf (or equivalent)
should be something like:
<Location /cgi-bin/webrfm>
Header set Dav 1
Header set MS-Author-Via DAV
</Location>
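One way to check that these headers are actually being sent is to issue an
OPTIONS request and inspect the response headers. The helper below is a
sketch with a hypothetical name ('header_value'). It reads an HTTP response
on standard input and prints the value of the named header, matching the
name without regard to case.

```shell
#!/bin/sh
# header_value NAME   (hypothetical helper)
# Reads an HTTP response (status line, headers, optional body) on stdin and
# prints the value of the first header called NAME. Prints nothing if the
# header is absent.
header_value() {
    tr -d '\r' | awk -v want="$1" '
        BEGIN { want = tolower(want) }
        $0 == "" { exit }                  # blank line: end of the headers
        {
            i = index($0, ":")
            if (i > 0 && tolower(substr($0, 1, i - 1)) == want) {
                v = substr($0, i + 1)
                sub(/^[ \t]+/, "", v)      # strip blanks after the colon
                print v
                exit
            }
        }'
}

# Example usage with curl (host and user name are placeholders):
#   curl -s -i -X OPTIONS -u alice http://www.example.com/cgi-bin/webrfm |
#       header_value DAV
```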
A note concerning temporary files:
-----------------------------------
When files are uploaded using WebRFM's form-based interface, they are initially
stored as temporary files. Then, if some problem arises in moving them to their
final destination (for example, if there is already a file by that name and the
user didn't check the 'overwrite existing files' box on the upload form), the
user is prompted for further action. If the user doesn't respond to that prompt,
then the temporary file would remain, and with time this can lead to the
accumulation of many such "garbage" files. A similar thing also happens if
the user saves a file using the 'Save As' option of the Text Editor (a
temporary file is created, and it might remain if there is a problem
and the user doesn't respond to WebRFM's prompts). These temporary files are
created in WebRFM's temporary directory, which can be set in the
'main WebRFM script' (the default is ~/.WebRFM/temp). Users having a private
installation should occasionally scan their temporary directory and clean out
whatever has accumulated there. In system-wide installations, it is recommended to
run a daily cron job that would scan those temporary directories and delete old
files that are found there.
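A cleanup job of this kind could look roughly like the following sketch. The
function name 'clean_temp_dir', the 7-day age limit, and the assumption that
home directories live under /home are all choices made for this example.
Adjust them to your system and to the temporary directory configured in the
'main WebRFM script'.

```shell
#!/bin/sh
# clean_temp_dir DIR [DAYS]   (hypothetical helper)
# Deletes regular files under DIR that were last modified more than DAYS
# days ago (default 7). Does nothing if DIR does not exist.
clean_temp_dir() {
    dir=$1 days=${2:-7}
    [ -d "$dir" ] || return 0
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}

# Example for a private installation:
#   clean_temp_dir "$HOME/.WebRFM/temp" 7
#
# Example for a system-wide installation, saved as a script and run daily
# from root's crontab (assumes homes under /home and the default location):
#   for d in /home/*/.WebRFM/temp; do clean_temp_dir "$d" 7; done
```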
A note concerning files in WebRFM's 'htm' directory:
-----------------------------------------------------
Files that are located in WebRFM's 'htm' directory (by default, these are
just WebRFM help files), can be retrieved by calling WebRFM with a query
string of the form ?com=rshgethtm+<filename>, where <filename> should
be replaced by the name of the file. Such files are not simply retrieved,
but are considered by WebRFM to be HTML files that are intended to be
dynamically parsed. WebRFM scans those files and replaces certain special
strings with values of corresponding WebRFM parameters (see the 'ParseHTM'
subroutine in the main script for what is being replaced) and it also
attaches its footer (along with the ending