Navigation: Table of Contents, Index, next: Emacs, prev: Links
The maintenance of this web-site is greatly facilitated by a number of Perl, Python and Shell (bash) scripts. These scripts modify existing HTML files. If you don't want to edit HTML directly, these scripts are not for you.
Recently, I started to use the WikiWiki model for my website. See the wiki SiteMap for more information. Generally speaking, these scripts try to maintain a book metaphor for the web site. There are pages, they are linked, you can go forward and backward, there is an index, there is a table of contents.
This Python script (html-index) reads HTML files and produces a keyword index and a link index of your web-site. This web-site also has such indices, see site-index.html and link-index.html. In order to use this, your pages must contain anchor tags with a name and a title attribute. If the anchor tag has no href attribute, it is added to the keyword index. If the anchor tag has an href attribute, it is added to the link index. This script is of no use if your use of the title attribute conflicts with this.
Example:
<a name="sample" title="Sample Index Entry for AWWW"></a>
This will result in the entry "Sample Index Entry for AWWW" in site-index.html under the letter S, with a link to this code (check it out!).
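The rule for sorting anchors into the two indices can be sketched in Python as follows. This is a minimal illustration, not the actual script: the names `AnchorIndexer` and `by_first_letter` are hypothetical, and storing `file#name` for keyword entries and the raw href for link entries is an assumption.

```python
from html.parser import HTMLParser


class AnchorIndexer(HTMLParser):
    """Collect index entries from <a name=... title=...> tags (hypothetical helper)."""

    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.keywords = []  # (title, target) pairs for the keyword index
        self.links = []     # (title, href) pairs for the link index

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        name, title = attributes.get("name"), attributes.get("title")
        if not name or not title:
            return  # only anchors with both a name and a title are indexed
        if attributes.get("href") is None:
            # No href: keyword index entry pointing back at the anchor (assumed target form).
            self.keywords.append((title, f"{self.filename}#{name}"))
        else:
            # With href: link index entry (assumed to point at the href).
            self.links.append((title, attributes["href"]))


def by_first_letter(entries):
    """Group (title, target) pairs under the upper-cased first letter of the title."""
    groups = {}
    for title, target in sorted(entries):
        groups.setdefault(title[0].upper(), []).append((title, target))
    return groups


indexer = AnchorIndexer("awww.html")
indexer.feed('<a name="sample" title="Sample Index Entry for AWWW"></a>')
indexer.feed('<a name="emacs" href="emacs.html" title="Emacs page">Emacs</a>')
print(by_first_letter(indexer.keywords))
```

The second anchor fed to the parser is made up for the demonstration; it only shows that an anchor with an href ends up in the link list instead of the keyword list.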
This Python script (html-toc) replaces LINK tags in HTML files with an expanded list of links, and it replaces the table of contents (TOC) in an HTML file based on this list of links. Sounds confusing? It is.
All you have to do is this:
Here's an example TOC file:
contents index.html
index site-index.html
chapter awww.html
chapter emacs.html
chapter atlantis/index.html
section atlantis/doku.html
subsection atlantis/utilities.html
subsection atlantis/english.html
section juenger.html
chapter comix.html
Here's an example LINK tag:
<link rel="prev" href="doku.html" title="Doku Formate">
This might be replaced by the following links:
<link rel="contents" href="../index.html" title="Table of Contents">
<link rel="index" href="../site-index.html" title="Index">
<link rel="next" href="english.html" title="A German PBEM Game">
<link rel="prev" href="doku.html" title="Doku Formate">
<link rel="up" href="doku.html" title="Doku Formate">
This can be used by html-navigation-bar to generate navigation bars for all HTML pages.
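One way to derive the expanded links from the TOC file is sketched below in Python. This is not the actual script but an illustration under stated assumptions: the TOC file is taken to be whitespace-separated "keyword filename" pairs, "up" is taken to be the nearest earlier entry at a shallower level, and the function name `navigation_links` is hypothetical. Title attributes are left out.

```python
import posixpath

# Sample TOC from the text; the "keyword filename" pair layout is an assumption.
TOC = """\
contents index.html
index site-index.html
chapter awww.html
chapter emacs.html
chapter atlantis/index.html
section atlantis/doku.html
subsection atlantis/utilities.html
subsection atlantis/english.html
section juenger.html
chapter comix.html
"""

DEPTH = {"chapter": 1, "section": 2, "subsection": 3}


def navigation_links(toc_text, page):
    """Return {rel: href} for `page`, with hrefs relative to the page's directory."""
    special, entries = {}, []
    tokens = toc_text.split()
    for kind, path in zip(tokens[0::2], tokens[1::2]):
        if kind in DEPTH:
            entries.append((DEPTH[kind], path))
        else:
            special[kind] = path  # "contents" and "index"

    paths = [path for _, path in entries]
    i = paths.index(page)
    targets = dict(special)
    if i + 1 < len(entries):
        targets["next"] = paths[i + 1]
    if i > 0:
        targets["prev"] = paths[i - 1]
    # "up" is the nearest earlier entry with a shallower depth.
    for depth, path in reversed(entries[:i]):
        if depth < entries[i][0]:
            targets["up"] = path
            break

    base = posixpath.dirname(page) or "."
    return {rel: posixpath.relpath(path, base) for rel, path in targets.items()}


links = navigation_links(TOC, "atlantis/utilities.html")
for rel, href in links.items():
    print(f'<link rel="{rel}" href="{href}">')
```

For atlantis/utilities.html this yields the same rel and href pairs as the expanded links shown above.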
If a page contains a meta description tag, this description is added to the TOC entry in the contents page. A meta description tag looks like this:
<META name="description" content="Idyllic European vacations">
The contents page must contain a TOC tag like the following (could be any heading level from 1 to 6):
<h2 title="TOC">...</h2> ...
The index entry in the sample TOC above can be produced by the html-index script.
This Perl script (html-navigation-bar) replaces navigation bars in HTML files with newly generated navigation bars based on link tags. You can find such navigation bars at the top and at the bottom of this page.
A link tag looks like this in the HTML source:
<link rel="index" href="site-index.html" title="Index">
It will be converted into a link like this for the navigation bars:
<a href="site-index.html">Index</a>
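The conversion from link tag to anchor can be sketched like this. The real script is written in Perl; this Python version is only an illustration, and the class name `LinkCollector` is hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Turn <link href=... title=...> tags into <a> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and "href" in attributes and "title" in attributes:
            # Re-escape the values, since the parser hands them over unescaped.
            href = escape(attributes["href"], quote=True)
            title = escape(attributes["title"])
            self.anchors.append(f'<a href="{href}">{title}</a>')


collector = LinkCollector()
collector.feed('<link rel="index" href="site-index.html" title="Index">')
print(collector.anchors[0])
```

Using a parser instead of a regular expression keeps the sketch tolerant of attribute order and upper-case tag names.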
The top navigation bar looks like this (ended by an empty line):
<a name="TOP"></a> Navigation: ...
The bottom navigation bar looks like this in the HTML source (ended by an empty line):
Navigation: <a href="#TOP">Top</a>, ...
Note how the bottom navigation bar requires a link to the top navigation bar (and therefore back to the top of the page).
This Perl script (html-overview) replaces overviews in HTML files with newly generated overviews based on heading tags <H2> to <H6>. You can see such an overview at the top of this page. An overview looks like this in the HTML source (ended by an empty line):
<p> Overview: ...
or
<p> Übersicht: ...
or
<p> Sommaire: ...
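Collecting the headings that feed such an overview might look like the following Python sketch. The real script is written in Perl, and how it renders the collected headings is not shown here, so the sketch stops at a list of (level, text) pairs. The class name `HeadingCollector` is hypothetical.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect the text of <h2>..<h6> headings in document order (hypothetical helper)."""

    LEVELS = {f"h{n}": n for n in range(2, 7)}

    def __init__(self):
        super().__init__()
        self.headings = []   # (level, text) pairs
        self._level = None   # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag in self.LEVELS:
            self._level = self.LEVELS[tag]
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag in self.LEVELS:
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


collector = HeadingCollector()
collector.feed("<h1>Title</h1><H2>html-index</H2><p>text<h3>Example</h3>")
print(collector.headings)
```

The H1 heading is skipped on purpose, since only H2 to H6 contribute to the overview.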
This Perl script (html-address) replaces the address tag in HTML files with a newly generated address tag. It contains a link to the page itself (so people can download the page from the WWW and still know where to find the original), a link to the author's homepage, the author's email address and the modification date of the file. All of these are set with command line parameters.
You can find such an address tag at the bottom of this page.
An address tag looks like this in the HTML source:
<address> ... </address>
And this is the new address tag code produced:
<address> <a href="http://www.oocities.org/TimesSquare/6120/awww.html">http://www.oocities.org/TimesSquare/6120/awww.html</a> / <a href="http://www.oocities.org/TimesSquare/6120/">Alex Schroeder</a> &lt;<a href="mailto:kensanata@yahoo.com">kensanata@yahoo.com</a>&gt; / updated: 1999-05-30 / significant changes: 2000-02-08</address>
Utilities required beyond Perl: pwd(1).
This Bash script (html-changes) uses the cvs(1) repository where all revisions of the web pages are stored. One of these revisions has the LAST tag. If the number of modified lines reaches 5% of the lines in the current workfile, then the current revision is assigned the LAST tag. The LAST tag therefore indicates the most recent revision at which at least 5% of all HTML lines had changed, i.e. the time of the last significant change. This information is used to put the date of the last significant changes into the ADDRESS tag.
The number of lines changed is computed as follows: A diff is produced between the LAST revision and the workfile. Let A be the number of diff lines starting with '<' or '>'. This means that a new line adds 1, a removed line adds 1, and a changed line adds 2. Lines starting with 'up:', 'prev:', 'next:', 'updated:', or '<link' are not counted. A is compared to B, the number of lines in the workfile, again not counting lines starting with 'up:', 'prev:', 'next:', 'updated:', or '<link'. If A / B >= 0.05, we assume significant changes between the LAST revision and the workfile.
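The A / B computation can be sketched in Python as follows. The real script is a Bash pipeline built from cvs(1), grep(1), and wc(1); this is only an illustration of the arithmetic. The function names are hypothetical, and it assumes normal diff output in which each changed line carries a two-character '< ' or '> ' marker.

```python
IGNORED_PREFIXES = ("up:", "prev:", "next:", "updated:", "<link")


def counted(line):
    """True unless the line starts with one of the ignored prefixes."""
    return not line.startswith(IGNORED_PREFIXES)


def changed_lines(diff_text):
    """A: diff lines starting with '<' or '>', minus ignored navigation lines."""
    total = 0
    for line in diff_text.splitlines():
        # line[2:] drops the assumed two-character diff marker.
        if line.startswith(("<", ">")) and counted(line[2:]):
            total += 1
    return total


def workfile_lines(workfile_text):
    """B: lines in the workfile, minus ignored navigation lines."""
    return sum(1 for line in workfile_text.splitlines() if counted(line))


def is_significant(diff_text, workfile_text, threshold=0.05):
    """True if A / B reaches the threshold."""
    b = workfile_lines(workfile_text)
    return b > 0 and changed_lines(diff_text) / b >= threshold


# Made-up sample data: one changed <link> line (ignored), one changed
# paragraph (counts 2), and one added paragraph (counts 1).
DIFF = """\
3c3
< <link rel="prev" href="old.html">
---
> <link rel="prev" href="new.html">
10c10
< <p>Old paragraph.
---
> <p>New paragraph.
12a13
> <p>Added paragraph.
"""
WORKFILE = "\n".join(["<p>line"] * 40 + ['<link rel="prev" href="new.html">'])

print(changed_lines(DIFF), workfile_lines(WORKFILE), is_significant(DIFF, WORKFILE))
```

With this sample, A is 3 and B is 40, so the ratio of 0.075 is above the 5% threshold.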
Utilities required beyond Bash: cat(1), cut(1), cvs(1), date(1), grep(1), and wc(1).
This is a fragment of the Makefile I use to produce this web-site:
SCRIPT_DIR := scripts
HTML := $(shell echo *.html atlantis/*.html)
DOCROOT := /home/alex/WWW/home
WWWROOT := http://hammer.prohosting.com/~gumbart

all: $(ELISP) $(SCRIPTS)
	$(SCRIPT_DIR)/html-index.py $(HTML)
	$(SCRIPT_DIR)/html-toc TOC
	$(SCRIPT_DIR)/html-navigation-bar $(HTML)
	$(SCRIPT_DIR)/html-overview $(HTML)
	$(SCRIPT_DIR)/html-address -r $(DOCROOT) -w $(WWWROOT) -u $(WWWROOT)/ $(HTML)
	$(SCRIPT_DIR)/html-changes $(HTML)
Note that I would have to provide author name and author email to the call to html-address if my name and email weren't the default <grin>.
Navigation: Top, Table of Contents, Index, next: Emacs, prev: Links