catdoc - A Review
   By Timothy Swenson

catdoc  is a Unix program created by Victor Wagner and ported
to  the QL by Jonathan Hudson.  catdoc takes a Microsoft Word
file and converts it to plain ASCII text.  And that's it.  It
is simple to run and it does what it says it does.

So, why would you need catdoc?  On Wintel systems (Win 3.1,
Win95, Win98, and NT), Microsoft Word is THE word processor.
Many documents are created and distributed in Word format on
the assumption that most people have access to Microsoft Word
or a Word viewer.  For those QLers who don't have access to
Word but do run across Word files, catdoc is the utility to
convert them into something more usable on the QL.

The catdoc  zip file  is available  from Jonathan's  web page
'The  Dead  Letter  Drop'  or  through  the  normal  freeware
distribution channels.  The distribution  will  fit  (with  a
little room  left) on  a 720K  floppy.  Before  unzipping the
distribution, be sure you know how to prevent unzip from
converting the dot (.) extensions to underscores.  Since
catdoc was originally a Unix application, it expects files
with dot extensions.  I unzipped the distribution before I
knew this and had to fix a few file names by hand.

Once unzipped, you will have a number of files and three
subdirectories (src, charsets, and docs).  The catdoc
executable is  found under the src  subdirectory.  I moved it
to the main directory to make it easier to use.

Before running catdoc, you need to set an environment
variable that tells catdoc where the character set files are
located.  Since I had unzipped catdoc on a floppy, I set it
as:

   setenv "CATDOCLIB=flp1_charsets_"

Now  to run catdoc all you need is a Word file.  Since I have
Word  7.0 on my PC, I copied over my ToDo list (todo.doc) and
let catdoc chew on it.

The simplest way to execute catdoc is this:

   exec catdoc;"todo.doc"

This  will take the file "todo.doc", convert it to ASCII, and
display  it on the screen.  If you want to save the output to
a file then execute catdoc like this:

   exec catdoc;"todo.doc > todo_txt"

catdoc does a fairly simplistic reading of the Word file.
When converting my ToDo file, I noticed a lot of extra
information in the output, including text that I had deleted
from the document.  It seems that Word keeps some of this
revision information in the file, and it shows up when catdoc
processes it.  When I converted a simple test file with no
revisions, the output from catdoc looked better.  I even
added a table to this second Word document and catdoc was
able to handle it.

Any output  from catdoc will  probably have to  be cleaned up
before  it is presentable.  The text file generated by catdoc
can easily be imported into Quill, cleaned up, and formatted
to create a final document.
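
Some of that cleanup is purely mechanical and could be done
before the text ever reaches Quill.  The following is only a
rough sketch in Python, run on whatever machine has Python
available; it is not part of catdoc.  The function name tidy
is made up for illustration, and the kinds of debris it
removes (stray control characters, trailing spaces, runs of
blank lines) are an assumption about what converted output
might contain, not something catdoc is documented to produce.

```python
import re


def tidy(text: str) -> str:
    """Mechanically tidy converted text (hypothetical helper).

    The debris handled here is assumed, not taken from catdoc's
    documentation.
    """
    # Normalise CR/LF and bare CR line endings to LF.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Drop control characters other than newline and tab.
    text = re.sub(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]", "", text)
    # Strip trailing whitespace from every line.
    lines = [line.rstrip() for line in text.split("\n")]
    # Collapse each run of blank lines into a single blank line.
    out = []
    for line in lines:
        if line == "" and out and out[-1] == "":
            continue
        out.append(line)
    # Trim blank lines at both ends and finish with one newline.
    return "\n".join(out).strip("\n") + "\n"
```

Text that Word has left behind from earlier revisions still
has to be found and removed by hand; a script like this only
deals with the layout noise.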

So, if you  don't have access  to Microsoft Word  and need to
read a  Word file on the QL, catdoc  is the tool for you.  It
may not generate a "pretty" document, but it will extract the
text information from the Word document.

