Q L H A C K E R ' S J O U R N A L
===========================================
Supporting All QL Programmers
===========================================
#27 January 1998
The QL Hacker's Journal (QHJ) is published by Tim
Swenson as a service to the QL Community. The QHJ is
freely distributable. Past issues are available on disk,
via e-mail, or via the Anon-FTP server, garbo.uwasa.fi.
The QHJ is always on the lookout for article submissions.
QL Hacker's Journal
c/o Tim Swenson
38725 Lexington St. #230
Fremont, CA 94536
swensontc@geocities.com
http://www.geocities.com/SilconValley/Pines/5865/
EDITOR'S FORUM
The QHJ is back. After a year of taking a break, I'm back
in the programming spirit again. Of course, I have not been
inactive during that time, as any reader of QL Today can
attest. I just have not felt like writing any programs for
a while. I guess I did get burnt out a bit. Now we'll see
how long before I get burnt out again.
Having recently purchased Qliberator, I have found its
manual similar to the original QL manual: full of
information, but information that is hard to find without
reading the whole manual. I'm all for reading the whole
manual, but
sometimes it takes a while to figure out exactly how to
apply what you are reading. I sometimes like manuals that
are more "If you want to do this, this is how to do it."
From this thought, came the idea for the "Qlib Source Book",
which will be something similar to the "Z88 Source Book".
The Z88 Source Book was a collection of existing knowledge
about the Z88. Most of the Z88 Source Book came from older
published sources. With Qlib, there does not seem to be a
wealth of published material helping the beginning Qlib
user. So, time to send out a query and ask for material.
If you are an experienced Qlib user and have a few tricks
that you would like to pass on, please send them to me
(either hard copy, disk or e-mail). If you are a beginning
Qlib user and you have questions that you would like to see
answered, send them too. Since I do not have the knowledge
to really do the subject well, I will play the role of
editor. I'll collect the different submissions and put them
in an organized document. The Qlib Source Book will be
Freeware in its electronic form. Like the Z88 Source Book,
a hard copy version will probably be available at minimal
cost. With the Z88 Source Book the price of the book
covered the cost of production and a small profit to FWD
Computing. It was my way of supporting the primary US QL
dealer.
Through the QHJ and QL Today I'll keep QLers informed of my
progress. I've already volunteered Dilwyn Jones to help in
writing some parts. Dilwyn has a number of years of
experience with Qliberator and producing commercial
software.
So, here is the long awaited next issue of the QHJ. Feel
free to send any comments, complaints, articles, large
denomination bills, etc. Enjoy.
REGULAR EXPRESSIONS
In all the years that I've been dealing with Unix, one of
the things that I have not taken the time to really learn is
Regular Expressions. Regular expressions are based on a
mini-language used for pattern matching in a number of Unix
search utilities. The most well known of these programs is
grep and its variations fgrep and egrep. The name 'grep'
comes from the ed editor command g/re/p, which globally
searches for a regular expression and prints the matching
lines.
No matter what operating system you have used, you have
probably run across a regular expression. Most operating
systems have a way of understanding something like this:
"dir *.txt". In MS-DOS this means to list all files that
end with a .txt extension. In QDOS, the equivalent phrase
would be "wdir flp1__txt". The asterisk or star, "*", is a
wild card and means to match any string. The asterisk is
really a metacharacter. Metacharacters are special
characters that mean different things in the regular
expression language. More experienced users of MS-DOS may
have used something like this: "dir *.e??". This means to
match all files whose three-letter extension starts with an
e. It will match .exe, .efs, .exx, and others. The question
mark is a metacharacter that means to match any single
character.
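The two wild cards can be tried out away from the command
line. The small sketch below uses Python's standard fnmatch
module, which follows Unix shell rules rather than the exact
MS-DOS ones, so treat it as an approximation. The file
names and the helper name wild are made up for the
illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical directory listing, invented for this example.
files = ["readme.txt", "notes.txt", "prog.exe", "data.efs", "prog.c", "run.exec"]

def wild(pattern, names):
    """Return the names that match a shell-style wild card pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]

print(wild("*.txt", files))   # the star matches any run of characters
print(wild("*.e??", files))   # each question mark matches exactly one character
```

Note that run.exec is not picked up by "*.e??", because the
two question marks allow exactly two characters after the e.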
So what does all this mean to QDOS users? Well, a version
of grep has been ported to the QL and comes with the C68
distribution. Grep is a very powerful and popular utility
that can fill a number of needs. It is used to extract
lines of text from files, but with its handling of regular
expressions, it can be very smart on what it extracts. Once
you know how grep works and how to use it, you will probably
remember a time when it would have been useful to you.
Grep's output can go to one of two places: to standard
output, or, by redirection, to a file. Since the QL does
not have standard output, the QL version of grep opens a
window to display its results. It also supports file
redirection. This means that you can send the output of
grep to a file to be dealt with later.
To demonstrate the file redirection, let's take a look at a
short grep example. In this example we have a text file and
we want to find all lines that have the word QL in them:
exec flp1_grep;"QL flp1_file_in > flp1_file_out"
Since we are using arguments, we have to put them in quotes
after the grep command. The results of the grep will now be
in the file flp1_file_out.
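For readers who want to see what such a search does step by
step, here is a minimal sketch in Python. It is not the
source of the C68 grep; the function name grep_lines and the
sample text are invented for the example. It keeps every
line in which the pattern is found, which is the list that
the redirection then sends to a file.

```python
import re

def grep_lines(pattern, lines):
    """Return the lines in which the regular expression is found anywhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Sample text, invented for this example.
text = [
    "The QL was launched in 1984.",
    "My other machine is a Z88.",
    "QLiberator compiles SuperBasic.",
]

for line in grep_lines("QL", text):
    print(line)
```

The search is case sensitive, so a lower case pattern would
find nothing in this sample.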
Before we go too far, let's talk about three major concepts
in regular expressions: characters, metacharacters, and
character classes. A character is basically a byte, be it a
text byte or binary byte. Metacharacters are a set of
characters that are part of the regular expression language.
In the examples above, the asterisk is a metacharacter. A
character class is a way of matching a group of characters.
Let's take a look at the metacharacters:
A character matches itself. Any character or string of
characters is taken as a literal. If you want to find the
string "ing" in a file you would use the regular expression
"ing". Most of the time when I am using grep, I use only
literal characters.
A dot (.) matches any character, but only 1 character,
similar to the question mark in MS-DOS. If you want to find
a word in a text file that has three letters, starts with a
B and ends with D, then you would use the regular expression
B.D (grep is case sensitive. Upper case lettering has only
been used to highlight the example.).
The caret (^) means the beginning of a line. If you want to
find all lines that start with the word "The", you would use
the regular expression "^The".
The dollar sign ($) means the end of a line. If you want to
find all lines that end with the word "end", you would use
the regular expression "end$".
The question mark (?) is used to match an optional
character. If you wanted to find the word "color" but don't
know if the British spelling "colour" is used, the regular
expression "colou?r" would work. The ? makes the character
just before it, here the u, optional.
The plus (+) is used to match one or more of the item just
before it. The regular expression "hel+p" matches help and
hellp, but not hep. The plus must match at least one
character or it will fail. If you want to find the words
helper or helps, but not just help, pair the plus with the
dot: "help.+" needs at least one more character after help.
The asterisk (*) is used like +, but it allows a null match.
The regular expression "hel*p" matches hep as well as help
and hellp. To find the words helper, helps and help, the
regular expression "help.*" would work. The asterisk allows
for no character, as in the case of just help.
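These metacharacters are easy to try out. The sketch below
uses Python's re module, which, as far as I know, treats
. ^ $ ? + and * the same way egrep does in these simple
cases. The helper name found and the test strings are made
up. Keep in mind that ? + and * always apply to the single
item just before them.

```python
import re

def found(pattern, line):
    """True if the regular expression matches anywhere in the line."""
    return re.search(pattern, line) is not None

print(found("B.D", "A BAD day"))     # dot stands for any one character
print(found("^The", "The end"))      # caret anchors to the start of the line
print(found("end$", "The end"))      # dollar anchors to the end of the line
print(found("colou?r", "color"))     # the u is optional
print(found("^hel+p$", "hep"))       # plus needs at least one l
print(found("^hel*p$", "hep"))       # asterisk allows no l at all
```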
To get a little more power out of regular expressions, there
is a metacharacter for the logical OR, the pipe symbol (|).
Say you have a text file with a bunch of e-mail messages and
you want to find all of the From and Subject lines, you
would use the regular expression "From|Subject".
Now that you know how to use the OR metacharacter, you will
find that you need to limit the OR. That's where the
parentheses () come in. Using the last example of finding
the From and Subject lines from e-mail messages, using the
regular expression "From|Subject" will also find lines with
either word in them. With e-mails, the From in the From
line is always followed by a colon; "From:". The same goes
for Subject. Now how do we write a regular expression for
this? One way is this: "From:|Subject:". This will work,
but a "cleaner" approach is this: "(From|Subject):". Since
AND's are assumed in regular expressions, what you get is
this "( From OR Subject ) AND :". Just like in math, the
parentheses control the bounds of the OR condition.
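The difference between the loose and the grouped form shows
up quickly on a small sample. This is a sketch in Python's
re module, assuming it handles | and () as egrep does; the
mail lines, the address, and the name grep_lines are all
invented for the example.

```python
import re

# Sample mail text, invented for this example.
mail = [
    "From: tim@example.com",
    "Subject: QHJ #27",
    "I heard it From a friend.",
    "Date: 1 Jan 1998",
]

def grep_lines(pattern, lines):
    """Return the lines in which the regular expression is found anywhere."""
    return [line for line in lines if re.search(pattern, line)]

loose = grep_lines("From|Subject", mail)      # also catches the body line
tight = grep_lines("(From|Subject):", mail)   # the colon applies to both words
print(loose)
print(tight)
```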
The backslash (\) is used to make a metacharacter a literal.
If you want to look for all lines that end with a full
sentence, meaning they end with a period, you could use the
following regular expression: ".$". But, since the period
is a metacharacter you will find all lines that end with
any character. To get grep to use the period as a period,
you need to use the backslash like this: "\.$". The backslash
tells grep to take the next character as a literal and not
to interpret it.
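Here is the same pair of expressions tried in Python's re
module, as a sketch only. The three sample lines are made
up. The r in front of the second pattern is a Python detail
that keeps the backslash intact inside the string.

```python
import re

# Sample lines, invented for this example.
lines = ["This is a sentence.", "This line trails off", "Is this a question?"]

# Unescaped dot: any last character will do, so every line is found.
any_last = [line for line in lines if re.search(".$", line)]

# Escaped dot: only a real period at the end of the line counts.
period_last = [line for line in lines if re.search(r"\.$", line)]

print(any_last)
print(period_last)
```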
Character classes are used as a way to search for groups of
characters. Say you wanted to match the digits 1, 2 and 3.
You could do this with "(1|2|3)". Using the brackets,
you could also create a character class "[123]". The true
power of the character class comes when using the dash.
The dash means to create a range of characters
(Metacharacters mean something else when in a character
class). In the last example, the character class could also
be written as "[1-3]", meaning all characters from 1 to 3.
To define the letters of the alphabet the character class
would be "[a-z]". Since grep is case sensitive, a better
character class would be "[a-zA-Z]".
You can mix up characters in a character class any way you
like. Say you have to find all occurrences of numerical
dates in a file. Dates could be written as 7-23-97, or
7/23/97, or even 7.23.97. You want to find any dates with a
dash, slash, or period. You would create the character
class "[-/.]". The dash comes first in the class so that it
is taken as a dash and not as a range. Remember that the
period means only itself when inside a character class and
does not mean to match a single character. So to find our
dates, we would use the regular expression
"7[-/.]23[-/.]97".
The caret (^) means something else when used in a character
class; it means to negate the class. If you want to match
anything but digits, you would create the character class
"[^0-9]". The caret negates only when it comes immediately
after the opening bracket. Anywhere else in the class it
means only itself. The character class "[-.^]" matches only
a dash, period, or caret.
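Character classes can be checked the same way. The sketch
below is in Python's re module, which writes a range inside
a class with a dash, as in "[0-9]"; the helper name found
and the sample strings are invented for the example.

```python
import re

def found(pattern, text):
    """True if the regular expression matches anywhere in the text."""
    return re.search(pattern, text) is not None

date = "7[-/.]23[-/.]97"
samples = ["7-23-97", "7/23/97", "7.23.97", "7 23 97"]
print([s for s in samples if found(date, s)])   # dash, slash or period only

print(found("[1-3]", "4"))        # 4 is outside the range 1 to 3
print(found("[^0-9]", "1998"))    # negated class: is there any non-digit?
print(found("[-.^]", "a^b"))      # caret is literal when not placed first
```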
If you are interested in learning more, check out the book
"Mastering Regular Expressions" by Jeffrey E. F. Friedl.
END-OF-FILE FINDING
A lot of the programs that I like to write are filters.
They take a text file as input, do something to the file,
and output the results to a second file. Doing this
involves reading a file one line at a time. A way of doing
this would be something like this:
REPeat loop
INPUT #4,in$
IF EOF(#4) THEN EXIT loop
PRINT in$
END REPeat loop
This algorithm will work, except that it will not output the
last line. When I first tried this, I could not figure out
why the last line was not being output. It was all based on
how I saw the program being executed. I thought that the
INPUT statement would read in the end-of-file (EOF) marker
and then do a compare. What is really happening is that the
last line is read in, then the EOF check is made. Since the
file pointer advanced after reading in the last string, it
is now pointing at the EOF marker. When the EOF check is
done, it returns TRUE and the EXIT loop is done. A better
example would be this:
REPeat loop
IF EOF(#4) THEN EXIT loop
INPUT #4,in$
PRINT in$
END REPeat loop
This will print out the last line of the file. But, this
algorithm also has its faults. It assumes that there is an
end-of-line (EOL) marker at the end of the last line. If
there were no EOL and only the EOF, an error would occur
reading in the last line.
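A better routine would read in each character and put the
line together while constantly checking for an EOF. It
stops at the end-of-line character, CHR$(10), or at the EOF,
whichever comes first. Since it returns a value, it has to
be a function rather than a procedure. Here is an example:
DEF FuNction read_line$
LOCal in$, byte$
in$=""
REPeat loop
IF EOF(#4) THEN EXIT loop
byte$ = INKEY$(#4,-1)
IF byte$ = CHR$(10) THEN EXIT loop
in$ = in$ & byte$
END REPeat loop
RETurn in$
END DEF read_line$
It would be used like this, after first checking EOF(#4):
next_line$ = read_line$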
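The same idea, reading a character at a time and stopping at
either the EOL or the EOF, can be sketched in Python for
those who want to test the logic away from the QL. This is
only an illustration; the function name read_line and the
sample data are made up, and an in-memory string stands in
for the file on channel #4.

```python
import io

def read_line(stream):
    """Read one line a character at a time; return None once the file is used up."""
    chars = []
    while True:
        ch = stream.read(1)          # an empty string signals end of file
        if ch == "":
            return "".join(chars) if chars else None
        if ch == "\n":               # the end-of-line marker closes the line
            return "".join(chars)
        chars.append(ch)

# The last line deliberately has no end-of-line marker.
source = io.StringIO("first\nsecond\nlast")
lines = []
while True:
    line = read_line(source)
    if line is None:
        break
    lines.append(line)
print(lines)
```

Returning None at the end of the file keeps an empty line,
which comes back as an empty string, apart from running out
of data.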
If using Qliberator, you can use the Q_ERR function to
locate EOF. Q_ERR can only trap for EOF after the fact.
You keep reading through the file until you get an EOF
error, which is trapped by Q_ERR. This means that you would
check for Q_ERR/EOF after an INPUT statement. An example
is:
Q_ERR_ON "INPUT"
REPEAT loop
INPUT #4,in$
IF Q_ERR = -10 THEN EXIT loop
PRINT in$
END REPEAT loop
Q_ERR_OFF
BACKGROUND PROGRAMS
Back in the heyday of MS-DOS, before MS-Windows, there was
a neat type of program called "Terminate and Stay Resident"
(TSR). The program could be loaded up at boot time, remain
in memory while other programs were running, and could be
called up at any time. The program would stay in the
background until a funny key sequence was typed in, then it
would pop-up in front of the current program and be ready to
do something. Sidekick was the first popular program to do
this.
Since MS-DOS could not multitask, how this was done is still
a mystery to me. In the QDOS world, where multitasking is a
reality, a program like this is fairly easy to do. Since
SuperBasic will not multitask, the end program has to be
compiled in some way. For this article, I'll use Qliberator
to compile SuperBasic.
A background job is designed to be hidden and not appear
until it needs to. This means that the program will not
immediately open any windows and only open them when
necessary.
When compiling this with Qliberator, be sure to turn the
WINDS option off. The program will open its own windows.
If you have WINDS turned on, the program will execute, but
you will need to do a CTRL-C to get back to QDOS. If
anybody knows exactly what I'm doing wrong, please let me
know.
100 job = Q_MYJOB
110 QP job,128
120 x = KEYROW(7)
130 IF x = 20 THEN
140 BEEP 1000,10
150 OPEN #3,con_50x50a100x100_32
160 PAPER #3,0: INK #3,2: BORDER #3,4,2: CLS #3
170 PRINT #3,"Hello"
180 x$ = INKEY$(#3,-1)
190 CLOSE #3
200 END IF
210 GO TO 120
MICROEMACS LINE NUMBERING
I've been meaning to tinker around with MicroEmacs macros
for some time, but never got around to it. Recently I
decided to take the time to really give it a try. Of all of
the text editors available for the QL, I think MicroEmacs is
the most powerful. Its macro language is the most robust
of the editors. Both QED and ED have macros that can
automate keystroke commands, but they don't have any logic
(IF..THEN) or structure ( WHILE ) features. MicroEmacs has
looping and logic controls.
As an example, I thought that a line number macro would be
nice. The following macro goes to the beginning of the file
and starts putting line numbers on each line. Before it
does this it queries you for a starting line number, which
is then incremented in steps of 10. To determine when to stop
processing, I had to know when the macro had reached the end
of the file. Since there is no end-of-file checking
mechanism, I had to move to the end of the file and get the
line number of the last line. This was then used in the
while loop. If there are lots of empty lines at the bottom
of the file, the macro will number them also. A check
could be put in to see if the current line is empty, but
this would not work if a line had only white space in it (
tabs and/or spaces).
I noticed two differences between the execution of
MicroEmacs and ED/QED macros. One, ED/QED macros are kind
of slow and take a while to run. MicroEmacs macros are very
fast. Total run time for this macro on a 20 line routine
was about 1-2 seconds. Two, when executing ED/QED macros
you can see what is going on as it happens. The screen
updates with each command. With MicroEmacs, the screen
seems to update only at the end of the macro. When the
macro went to the bottom of the file and then returned to
the top, I thought it would display the movement, but it did
not. If you do want to update the display while a macro is
executing, there is a redraw screen command that you can
use.
The documentation for the MicroEmacs macros is good in
documenting the different commands, but it falls short of
providing many examples. I used other macros that came with
MicroEmacs to learn from. This can slow down the learning
process, but there is no other alternative. In some ways I
use this same technique in other languages. I keep bits of
code around so I don't have to memorize how to do a routine
in a particular language, I just go through my old code.
; Line Numbering Macro
set %line_num @"Starting Line Number? "
end-of-file
set %tot_lines $curline ;LET tot_lines=line number @ EOF
beginning-of-file
!while &less $curline %tot_lines
beginning-of-line
insert-string %line_num
insert-string " "
set %line_num &add %line_num 10;LET line_num=line_num+10
next-line
!endwhile
beginning-of-file
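For comparison, here is the same numbering job as a short
Python sketch. It is not a translation of the macro command
by command; the function name number_lines and the step
parameter are my own choices for the illustration. Like
the macro, it numbers empty lines too.

```python
def number_lines(text, start, step=10):
    """Prefix each line of text with a number, counting up from start by step."""
    numbered = []
    number = start
    for line in text.split("\n"):
        numbered.append(f"{number} {line}")
        number += step
    return "\n".join(numbered)

print(number_lines("PRINT 1\nPRINT 2\nPRINT 3", 100))
```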
ADDING CONFIG BLOCKS TO QLIB PROGRAMS
BasConfig is a utility, written by Oliver Fink, that creates
config blocks for Qliberator compiled programs. For those
that don't know, config blocks are extra chunks of data
added to programs that are changeable by the user, using the
program "config". In other words, if you have a program and
you want the user to be able to change the size of the
program's window, you can put the variables for the window
size in a config block and let the user configure it anytime
they want. Config blocks are part of the executable and do
not interfere with the running of the program. The 'config'
program knows where in the executable the config block is
and knows how to change it.
Another way of looking at the config block is as an object
that has some data that is used by your program and is
separate from your program. In fact, until your program is
compiled, the config block is a separate file from your
SuperBasic program. This block is accessible from both your
program and the "config" program.
BasConfig creates a file that has the config block and some
SuperBasic extensions that allow the program access to the
block. These extensions need only be LRESPRed when you are
developing your program. They can be compiled into your
program and become part of the executable.
Before you use BasConfig, you need to define what type of
data you want the user to be able to change. There are 7
different data types that are allowed in config blocks:
String
Long Word
Word
Byte
Select
Code
Char
BasConfig does not support the Long Word or Select data
types. I don't have any documentation on config, so I can't
say exactly what the difference is between the types other
than what is obvious.
To access the data in the config block, there is a function
for each data type supported by BasConfig:
C_STR$(n) - String
C_WORD(n) - Word
C_BYTE(n) - Byte
C_CODE(n) - Code
C_CHAR(n) - Char
The functions return the Nth data type in the config block.
If you have two CHAR's and one STRING data types in the
config block and you wanted to get the second CHAR, you
would do something like this:
var$ = C_CHAR(2)
If your config block does not have a CHAR data type, you
should get back some sort of error (I have not tested this).
To learn how all of this works, I created a SuperBasic
program that opens a window and displays the contents of the
two BYTE data items in a config block. The example code is:
100 REMark $$asmb=ram1_test_cfg,0,10
110 EXT_FN "C_BYTE"
120 OPEN #3,scr_100x100a50x50
130 PAPER #3,0: INK #3,4: CLS #3
140 item1 = C_BYTE(1)
150 item2 = C_BYTE(2)
160 PRINT #3,"Item #1 = ";item1
170 PRINT #3,"Item #2 = ";item2
180 PAUSE 500
190 CLOSE #3
Note the $$asmb directive that links the config block into
the program. It is BasConfig that creates this block, which
includes the 5 functions to access the config block. The
EXT_FN command tells Qliberator that the references to
C_BYTE will be resolved at link time.
To create the config block, exec basconfig_obj. The program
will ask you for how many different config items you want.
For this example, I entered 2. Next you are asked to enter
the name of your final program and its version number. This
is used by the "config" program to let the user know exactly
what program they are configuring. These two items can not
be changed by the user.
Now the program will query you for the data types for the
first data item. You can scroll through the data types by
hitting the left arrow key. I scrolled over to "Byte" and
hit return. Since each data type is different the next few
questions will be different for each data type. In the case
of the "Byte" data type the items were: Initial value,
Minimum value, & Maximum value. The Min and Max values give
you control of the changes the user can make, so that a
"bad" configuration can't be made. For this example, I gave
the first item an initial value of 10 and the second a value
of 20.
Once I answered all of the questions for the second config
item, the program asked for a file to store the config
block. It looks like the convention for config block file
name extensions is _cfg.
Now, the documentation for BasConfig is very sparse. I had
to figure out how to get the data out of the config block by
reading the source code for BasConfig. So, I have only done
just enough to get a fair idea of what is going on and how
to get it to work.