Super Zip Utility - ZipUtilityLZH 1.0.0
Lempel-Ziv/Huffman data compression archival CTOS shareware utility.
(c) 1994 S. Kurowski (SJCSJK) 94/07/28
This utility performs three basic functions:
(1) Archives a list of files to an archive file, using LZH compression.
(2) Restores a list of files from an archived file, decompressing them.
(3) Same as (1), but the output archive is a self-extracting run file.
Environment Requirements:
  OS levels:
    BTOS II 3.0 or later, CTOS I 3.3 or later,
    CTOS II 3.3 or later, CTOS III 1.0 or later.
  Standard SW:
    BTOS II 3.0 or later, CTOS 12.0 or later.
  Disk storage:
    170 sectors for all installed utility software.
    27 sectors (13k) overhead is used by run file
    archive extraction code in self-extracting
    archives.
  Memory:
    368k for Zip Archive and Unzip Archive.
    132k for self-extracting archive files.
Archive Contents:
  ZipUtilityLZH.run             Version x1.0.0 utility
  ZipUtilityMsg.bin             Version x1.0.0 binary messages
  ZipUtility.data               Self-extracting executable code
  <$000>SuperZipCmds.sub        Version x1.0.0 Executive commands
  <$000>ZipUtilityMsg.txt       PLK text version of binary messages
  <$000>SuperZip.ReadMeFirst    (this file)
Revision History:
1.0.0 94/07/28 First release.
The three commands added by Submit of <$000>SuperZipCmds.sub are:
Zip Archive (Case 00)
  [File list]
  [File prefix(es) from]
  [Archive data set (.zpt)]
  [Delete existing archive data set?]
  [Confirm each?]
  [Print file]
  [Zip to run file?]
Unzip Archive (Case 01)
  [Archive data set (.zpt)]
  [File list from (<*>*)]
  [File list to (<*>*)]
  [Overwrite okay?]
  [Confirm each?]
  [Print file]
  [List files only?]
Unzip Run Archive - or - Run (both are case 00)
  Run file archive              Run file  ('Run' is NOT added!)
  [CASE]
  [Command]
  [File list from (<*>*)]       [Parameter 1]
  [File list to (<*>*)]         [Parameter 2]
  [Overwrite okay?]             [Parameter 3]
  [Confirm each?]               [Parameter 4]
Using the SuperZip utility:
SuperZip Features -
Most of the command parameters work as with the regular 12.x CTOS Standard
Software except where noted.
The <$000>ZipUtilityMsg.txt file may be nationalized to replace the provided
ZipUtilityMsg.bin file. The yes/no/blank Executive parameters are also
nationalized when a local Nls.sys (or RMOS NLSService) is available.
When archiving or unarchiving files, SuperZip will show you how large the
compressed file data is in comparison to the input file, as a percentage.
For example, if the input file was 2000 bytes long, and the compressed output
file was 1000 bytes long, the ratio would be displayed as "... (50.0%) done.".
If you are restoring this same file to its original form, the ratio would
display as "... (200.0%) done.". Please note that this hopefully more
intuitive representation of compression ratios is not the same as the formally
defined 'compression ratio' quantity, which is ( 100% - displayed ratio ).
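The displayed percentage described above can be reproduced with a few lines
of Python. This is only an illustrative sketch: the function name and the
exact string formatting are assumptions, not taken from the utility itself.

```python
def displayed_ratio(bytes_in: int, bytes_out: int) -> str:
    """Format the output size as a percentage of the input size."""
    if bytes_in <= 0:
        raise ValueError("input size must be positive")
    return f"({100.0 * bytes_out / bytes_in:.1f}%)"


print(displayed_ratio(2000, 1000))  # archiving the example file: (50.0%)
print(displayed_ratio(1000, 2000))  # restoring the same file:    (200.0%)
```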
SuperZip archive files always have the suffix ".zpt". For each file archived,
the archive contains a variable-length header containing a compression method
identifier, a checksum value used to verify file data integrity at the time it
is decompressed, the uncompressed file size, the compressed file size and the
name of the file, including the source directory. The number of files
processed is displayed at the conclusion of the archival or restore operation.
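As an illustration of a variable-length per-file header carrying the fields
listed above, here is a minimal Python sketch. The field order, field widths,
byte order and length-prefixed name are all assumptions made for the example;
the actual .zpt layout is not documented in this file.

```python
import struct

# HYPOTHETICAL layout, for illustration only:
#   method (1 byte), checksum (2 bytes), uncompressed size (4 bytes),
#   compressed size (4 bytes), name length (1 byte), then the name itself.
_FIXED = struct.Struct("<BHIIB")


def pack_header(method, checksum, raw_size, packed_size, name):
    """Build one header; the trailing name makes it variable-length."""
    encoded = name.encode("ascii")
    fixed = _FIXED.pack(method, checksum, raw_size, packed_size, len(encoded))
    return fixed + encoded


def unpack_header(buf, offset=0):
    """Return (fields, offset of the first byte after the header)."""
    method, checksum, raw_size, packed_size, name_len = _FIXED.unpack_from(
        buf, offset)
    start = offset + _FIXED.size
    name = buf[start:start + name_len].decode("ascii")
    return (method, checksum, raw_size, packed_size, name), start + name_len
```

With a layout like this, a reader can walk an archive by unpacking a header,
skipping the stated compressed size, and repeating until the end of the file.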
During compression or decompression of files, the "spinning" activity
indicator is a mechanism provided to show the relative compression rates and
let the user know the software is actually working upon a file. The indicator
"ticks" every 4 sectors that are bit-written during compression or bit-read
during decompression. A complete turn of the indicator means 16 sectors have
been processed at the bit level. A pause of the spinning indicator is normal
for this algorithm as compression stages take turns - only the secondary
Huffman stage outputs or inputs bit data. A long pause indicates the file is
very highly compressible and is using a single 8k dictionary for large data
runs. There is no activity indicator for self-extracting archives.
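The tick arithmetic of the activity indicator (one tick per 4 sectors, one
full turn per 16 sectors) can be sketched as follows. The four glyphs used
here are an assumption; the README does not say which characters are drawn.

```python
SECTORS_PER_TICK = 4   # from the text: the indicator ticks every 4 sectors
FRAMES = "|/-\\"       # assumed glyphs; 4 frames x 4 sectors = 16 per turn


def spinner_frame(sectors_done: int) -> str:
    """Return the indicator glyph for a count of bit-level sectors processed."""
    return FRAMES[(sectors_done // SECTORS_PER_TICK) % len(FRAMES)]
```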
Super Zip Archive -
The Zip Archive command "[File list]" parameter accepts any list of files.
Wildcard filespecs are expanded by the Executive. The [File prefix(es) from]
parameter works in association with the [File List] parameter in a fashion
identical to that of the LCopy utility to build file names to archive from
different sources. All successfully archived input files are stored in a
single output archive or self-extracting run file archive.
Self-Extracting Run File Archives -
The Zip Archive parameter "[Zip to run file?]" optionally archives the dataset
to a self-extracting archive run file (always suffixed with ".run"). The
resultant run file archive dataset may then be run to extract its contents.
To keep the self-extracting code as small as possible (the primary criterion in
the first place) no message files are used by the self-extractor. This
parameter has SuperZip append an otherwise regular .zpt archive file to a copy
of the [Sys]ZipUtility.data file, which is the self-extracting executable.
Please note that quick arithmetic, assuming average file compression at
archival of around 40% (with 28 sectors of extractor overhead), shows that
zip-to-run archiving of less than around 45 sectors of file data (depending
largely upon data types) will result in an executable archive larger than
the unarchived file data. Normally, larger data sets are archived.
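The break-even estimate above can be checked with a short calculation. This
sketch assumes the "40%" figure is the displayed ratio (compressed data is
about 40% of the input size) and ignores the per-file archive headers; the
function name is made up for the example.

```python
def break_even_sectors(overhead_sectors: float, displayed_ratio: float) -> float:
    """Input size (in sectors) at which a run file archive stops being
    larger than the data it holds.

    run file size = overhead + ratio * n, which equals n when
    n = overhead / (1 - ratio).
    """
    if not 0.0 <= displayed_ratio < 1.0:
        raise ValueError("displayed ratio must be in [0, 1)")
    return overhead_sectors / (1.0 - displayed_ratio)


print(round(break_even_sectors(28, 0.40), 1))  # 46.7, i.e. "around 45" sectors
```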
When you run a self-extracting archive run file it will accept four optional
parameters. These parameters may be typed directly into the Executive Run
command by an archive recipient, or you may instead use the (rather cheesy-
looking) Unzip Run Archive command also included in SuperZipCmds.sub.
Note that if you use the Executive Run command you will need to enclose the
file list subparameters in literals ('') to prevent the Executive from
expanding your wildcards when restoring.
When running a self-extracting SuperZip archive, user interaction is
facilitated by the following conventions:
(a) The [File list from (<*>*)] and [File list to (<*>*)] parameters
work as usual. (Use literals around wildcards when using Run.)
Additionally, the use of null literals in [File list to (<*>*)]
will act as a filespec placeholder and will restore the file
using its original name.
(b) Files being unarchived are displayed in the format of:
filespecfrom ==>> filespecto
(c) If an error occurs, the CTOS error code is displayed at the end of
the line in parentheses - for example, if the file already exists:
filespecfrom ==>> filespecto (224)
(d) If the output file is being overwritten, an asterisk is displayed
at the end of the line:
filespecfrom ==>> filespecto *
By default, files are not overwritten.
(e) If the [Confirm each?] parameter is set to 'yes' OR the output file
already exists and the [Overwrite okay?] parameter is default, the
self-extractor will display the same information as in (b) but in
parentheses followed by a question mark prompting a user decision:
( filespecfrom ==>> filespecto ) ?
At this point the extractor will pause for a GO, CANCEL, or FINISH
keystroke from the user. Each of these keystrokes operates just
as it would in SuperZip's Unzip Archive utility.
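The display conventions (b) through (e) can be summarized in a small Python
sketch. The function and its arguments are hypothetical and exist only to
show the four line formats side by side; the precedence between the cases is
an assumption.

```python
def status_line(src, dst, error=None, overwriting=False, prompt=False):
    """Format one extractor display line per conventions (b) to (e)."""
    core = f"{src} ==>> {dst}"
    if prompt:                 # (e) waiting for GO, CANCEL or FINISH
        return f"( {core} ) ?"
    if error is not None:      # (c) CTOS error code in parentheses
        return f"{core} ({error})"
    if overwriting:            # (d) existing output file is overwritten
        return f"{core} *"
    return core                # (b) plain restore


print(status_line("filespecfrom", "filespecto", error=224))
```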
SuperZip Unzip Archive -
The Unzip Archive restore parameter "[List files only?]" optionally shows the
contents of an archive without restoring it (NOT a run file archive, however),
listing the names of the files, their uncompressed and compressed sizes in
bytes and sectors, and computes the total uncompressed and compressed file
sectors for the archive. The total compressed sectors value is not strictly
the sum of the individual compressed sector counts of the archive's content
files, but rather the actual stored sector sum minus the archive file headers.
If you attempt to restore an archive not produced by this utility it will
report CTOS error code 7 (Not Implemented) and abort processing of the file.
Also, if a decompressed file's computed checksum does not match
its header checksum, CTOS error code 245 (Run File Invalid Checksum) is
reported. In the latter case, the corrupt decompressed file is not deleted
so that any of its intact contents can be recovered. The utility will not
terminate so the remaining files in the archive file will still be processed.
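The "report the error but keep going" behavior can be sketched as below. The
error codes 7 and 245 come from the text above; the checksum function is a
stand-in (a simple 16-bit byte sum), since the actual checksum algorithm used
by SuperZip is not described here.

```python
ERC_OK = 0
ERC_NOT_IMPLEMENTED = 7   # archive not produced by this utility
ERC_BAD_CHECKSUM = 245    # Run File Invalid Checksum


def checksum16(data: bytes) -> int:
    """ASSUMED checksum for illustration: 16-bit sum of all bytes."""
    return sum(data) & 0xFFFF


def verify_members(members):
    """members: iterable of (name, header checksum, decompressed bytes).

    A mismatch is recorded for that file only; later files are still
    processed, and the suspect output is not discarded.
    """
    results = []
    for name, expected, data in members:
        erc = ERC_OK if checksum16(data) == expected else ERC_BAD_CHECKSUM
        results.append((name, erc))
    return results
```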
SuperZip Design Notes -
The LZH algorithm implemented here is based in part upon earlier work by
Okumura (~1990) (used also in LHarc). The first stage uses an LZ78 dictionary
variation, while the secondary Huffman statistical stage is engaged to
compress the contents of the buffered compressed dictionary data. It
separately counts the dictionary unencoded data and pointer/length values and
builds a Huffman tree encoding these as variable-length integral-bit codes.
When decompressing a file, these two stages are reversed, Huffman first then
LZ78. Substantially more processing is required for compression than
decompression to generate the statistics in the second stage; consequently,
decompression is much faster than compression.
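To illustrate the second stage, the sketch below counts literal bytes and
match lengths from a first-stage token stream in two separate tables and
derives Huffman code lengths for each. The token format, the use of two
tables rather than one combined tree, and all names are assumptions made for
the example; this is not the utility's actual code.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a {symbol: count} table."""
    if not freqs:
        return {}
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, unique tiebreak, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for sym in s1 + s2:          # each merge adds one bit to these codes
            lengths[sym] += 1
        heapq.heappush(heap, (w1 + w2, tiebreak, s1 + s2))
        tiebreak += 1
    return lengths


# Hypothetical first-stage output: literals and (offset, length) pointers.
tokens = ([("lit", 97)] * 5 + [("lit", 98)] * 2 + [("lit", 99)]
          + [("ptr", (4, 3))] * 3 + [("ptr", (8, 5))])
literal_counts = Counter(v for kind, v in tokens if kind == "lit")
length_counts = Counter(v[1] for kind, v in tokens if kind == "ptr")

print(huffman_code_lengths(literal_counts))
print(huffman_code_lengths(length_counts))
```

Frequent symbols receive short codes, so the counting pass has to finish
before any bits can be written, which is consistent with compression costing
more than decompression.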
The disk bytestream I/O used for the self-extracting archive code is a 2k
'micro' bytestream module replacing all of the CTOS.lib synchronous disk
bytestream calls with equivalent forward-only acting operations (in C).
An advanced version of this algorithm (LZA1) is under development that
replaces the secondary Huffman stage with table-driven arithmetic coding per
Fenwick and Gutmann (U. of Auckland, NZ, 1993) and further work of Fenwick
(1993) using a new statistical binary indexed tree structure, both of which
significantly improve arithmetic coding speed. An order-0 prototype
(LZWA0) of this has already been completed at DSD San Jose.
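For readers unfamiliar with the binary indexed tree mentioned above, here is
a generic textbook sketch in Python. It shows how the structure keeps the
cumulative symbol counts an arithmetic coder needs, with both update and
query taking O(log n) steps. It is not the LZA1 or LZWA0 code, and the class
and method names are made up for the example.

```python
class FenwickTree:
    """Cumulative frequency table over symbols 0 .. size-1."""

    def __init__(self, size: int):
        self.size = size
        self.tree = [0] * (size + 1)   # 1-based internal indexing

    def add(self, symbol: int, delta: int = 1) -> None:
        """Increase the count of one symbol."""
        i = symbol + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i                # move to the next covering node

    def cumulative(self, symbol: int) -> int:
        """Total count of all symbols strictly below `symbol`."""
        total = 0
        i = symbol
        while i > 0:
            total += self.tree[i]
            i -= i & -i                # strip the lowest set bit
        return total


table = FenwickTree(4)
table.add(0, 3)
table.add(2, 5)
table.add(3, 1)
# A coder would use [cumulative(s), cumulative(s + 1)) as the range of s.
print(table.cumulative(2), table.cumulative(3))  # 3 8
```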
Check the MailOrder/TS2 shareware system for this new version later this year.
**** Ideas & suggestions regarding this utility are welcome.
****
**** You may distribute this utility freely so long as it is understood to be
**** shareware for no cost.
****
**** San Jose, California - sjk (SJCSJK)