catdoc - A Review
By Timothy Swenson
catdoc is a Unix program created by Victor Wagner and ported
to the QL by Jonathan Hudson. catdoc takes a Microsoft Word
file and converts it to plain ASCII text. And that's it. It
is a simple program to run, simple to operate, and it does
what it says it does.
So, why would you need catdoc? For Wintel (Win 3.1, Win95,
Win98, & NT) systems, Microsoft Word is THE word processor
used. A number of documents are created and distributed in
Word format, assuming that most people have access to
Microsoft Word or a Word viewer. For those QLers that don't
have access to Word, but do run across Word files, catdoc is
the utility to convert the Word files into something more
useable for the QL.
The catdoc zip file is available from Jonathan's web page
'The Dead Letter Drop' or through the normal freeware
distribution channels. The distribution will fit (with a
little room left) on a 720K floppy. Before unzipping the
distribution, be sure you know how to prevent unzip from
converting the dot (.) extensions to underscores. Since
catdoc is originally a Unix application, it will be expecting
files with dot extensions. I unzipped the distribution
before I knew of this and had to change a few files by hand.
Once unzipped, you will have a number of files and three
subdirectories (src, charsets, & docs). The catdoc
executable is found under the src subdirectory. I moved it
to the main directory to make it easier to use.
Before running catdoc, an environment variable must be set to let
catdoc know where the character set files are located. Since
I had unzipped catdoc on a floppy, I set it as:
setenv "CATDOCLIB=flp1_charsets_"
Now to run catdoc all you need is a Word file. Since I have
Word 7.0 on my PC, I copied over my ToDo list (todo.doc) and
let catdoc chew on it.
The simplest way to execute catdoc is this:
exec catdoc;"todo.doc"
This will take the file "todo.doc", convert it to ASCII, and
display it on the screen. If you want to save the output to
a file then execute catdoc like this:
exec catdoc;"todo.doc > todo_txt"
catdoc does some fairly simplistic reading of the Word file.
I noticed in converting my ToDo file a bunch of extra
information and text that I had deleted out of the file. It
seems that Word keeps some of this version information in the
file and when catdoc processes the file it appears. When I
converted a simple test file with no revisions, the output
from catdoc looked better. I even added a table to the
second Word document and catdoc was able to handle it.
Any output from catdoc will probably have to be cleaned up
before it is presentable. The text file generated by catdoc
can easily be imported into Quill, cleaned up, and formatted
to create a final document.
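For readers who would rather do a first pass of the cleanup with a
script, here is a minimal sketch in Python. It is not part of catdoc
and is not something the review tested. The function name tidy_text is
hypothetical, and the sketch assumes the typical problems are stray
control characters, trailing spaces, and runs of blank lines, which
may not match what your own Word files produce.

```python
import re

# Control characters other than tab and newline. Assumption: converted
# text may carry a few of these; the review does not confirm which ones.
_CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def tidy_text(raw: str) -> str:
    """Return raw with control characters removed, trailing spaces
    stripped, and runs of blank lines collapsed to a single one."""
    # Normalise PC and old Mac line endings to a plain newline first.
    raw = raw.replace("\r\n", "\n").replace("\r", "\n")
    cleaned = _CONTROL.sub("", raw)
    out = []
    for line in (part.rstrip() for part in cleaned.split("\n")):
        # Skip a blank line if the previous kept line is also blank.
        if line == "" and out and out[-1] == "":
            continue
        out.append(line)
    # Drop leading and trailing blank lines, end with one newline.
    return "\n".join(out).strip("\n") + "\n"
```

The result would still need a read-through by eye, since a script like
this cannot tell leftover revision text from text you meant to keep.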
So, if you don't have access to Microsoft Word and need to
read a Word file on the QL, catdoc is the tool for you. It
may not generate a "pretty" document, but it will extract the
text information from the Word document.