*****************************************************************************
ANSI-ISO PASCAL FAQ
Welcome to the FAQ! Here I will answer various questions about ANSI-ISO
Pascal, and compilers of that language. This FAQ is not limited to any
one machine, operating system or language level. Any language that is
based on the original Wirth language or the standards that came from it
may be covered here.
Requests to add information are welcome, and submissions are encouraged.
*****************************************************************************
========= "CLASSIC" PASCAL ===========
This section concentrates on the original Pascal as defined by N. Wirth, and
utilized in the early standards.
*****************************************************************************
Q. WHY IS IT NAMED PASCAL ?
A. This should be a trivia question. Pascal was named after the French
mathematician Blaise Pascal, who created a calculating machine (not a true
computer).
*****************************************************************************
Q. WHAT IS ANSI-ISO PASCAL ?
A. Pascal is one of a series of languages put forth by one of the most
prolific computer language creators, Niklaus Wirth, a professor at the
Institut für Informatik, ETH, Zurich, Switzerland. Professor Wirth
participated in various versions of Algol, a language put forth by
international cooperation that introduced the basic concepts of structured
programming to the world. Wirth describes Pascal as a descendant of Algol 60
(for Algol, 1960 standard). The "official" descendant of Algol 60 was
Algol 68, famous for having assignment as an expression operator (a basic
feature of the later language C). Wirth felt that the design committee
for Algol, after Algol 60, was losing focus and creating an unnecessarily
complex language.
While Algol 68 has had its fans, the language Pascal was considered
to be a new high of consistent language design.
The first draft of Pascal was created in 1968. The first compiler was
operational in 1970, and the language was generally published in 1971.
In 1973, after two years of testing and use, the language was revised
into its final form.
The first compiler for Pascal was implemented on a CDC 6000 computer
at ETH, for "unrevised" Pascal. After the language was revised, a new,
high optimization compiler for the new language was created using the
old compiler, then the source for that compiler itself changed to
"revised" format, so that it could compile itself (known as "bootstrapping"
a compiler).
In 1974 there were 10 compilers running on various systems. By 1979
there were at least 80.
In 1977, various committees began the work to standardize the language.
In 1983, the ISO (International Organization for Standardization) issued
ISO 7185, the official Pascal standard. In the same year, the US ANSI
committee issued ANSI/IEEE770X3.97-1983, the US standard for Pascal. In
addition, several countries around the world issued their own national
standards for Pascal.
*****************************************************************************
Q. What are the different Pascal standards?
A. There are currently 3 different documents that can be classified as
Pascal standards: Unextended Pascal, Extended Pascal, and the
Object-Oriented Extensions to Pascal.
*****************************************************************************
Q. WHAT IS THE CURRENT STATUS OF PASCAL STANDARDS ?
A. Originally, unextended Pascal was actually 2 standards: an ANSI/IEEE
standard (ANSI/IEEE770X3.97-1983) and an ISO standard (ISO 7185 : 1983).
There were 2 standards for mostly political reasons that I won't
get into here. For the most part, the ISO standard was a superset of
the ANSI/IEEE standard and included the conformant array feature. See
the foreword of Extended Pascal for some additional history of the
development of unextended Pascal.
In 1989, ISO 7185 was revised to correct various errors and ambiguities
found in the original document. Also, the ANSI/IEEE770X3.97 standard was
replaced with a "pointer" to the ISO 7185 standard. So finally in 1989,
there was only 1 unextended Pascal standard in the world.
The unextended Pascal standard (ISO 7185 : 1990) is still in force as
a valid ISO language standard.
*****************************************************************************
Q. WHAT ARE THE BASIC FEATURES OF PASCAL ?
A. Pascal is a structured language, using if-then-else, while, repeat-until,
and for-to/downto control structures. It differs primarily from preceding
languages in that data structures were also included, with records (a feature
borrowed from COBOL), arrays, files, sets and pointers.
Pascal is also unusual for forging an effective compromise between language
simplicity, power, and matching of language structures to underlying machine
implementation.
Pascal also has many features for compiler writers. The language is
constructed to have a minimum of ambiguity. Pascal, with few exceptions,
can be processed "forward" with all of the smaller elements (like constants,
types, etc) being defined before they are used. Pascal requires the types
and exact sizes of operands to be known before they are operated on, again
leading to simplified language processing and efficient output code
(although this feature has often been called a problem).
For this reason, Pascal still remains a popular language to implement
compilers for as part of a computer science class.
*****************************************************************************
Q. WHAT IS J&W (OR THE "REPORT") ?
A. This refers to the "Pascal user manual and report", by Kathleen Jensen
and Niklaus Wirth. This is the original bible of Pascal. The second
edition contained the finalized language under Wirth. It is no longer
available. The current edition is the third, containing almost twice
as many pages, and contains the second edition extensively revised to
meet the ISO Pascal standard.
*****************************************************************************
Q. WHAT ARE THE DIFFERENCES BETWEEN STANDARD PASCAL AND THE ORIGINAL
PASCAL ? WAS IT CHANGED EXTENSIVELY ?
A. [This is one of the common myths. In fact, Microsoft customer service
explained to me by phone conversation that they believed that the reason they
were not compatible with ANSI or ISO Pascal was that their compilers were
based on Wirth's original Pascal, and the ANSI and ISO language had in fact
been changed extensively. This continues to be repeated as fact around
the internet]
The stated goal of the standards committees was to keep Pascal unchanged, but
simply address the insecurities and ambiguities that had been discovered by
users of the language.
The MAJOR changes are:
1. Procedure and function parameters (where the procedure or function
itself was passed as a reference) appeared without a parameter list
in the declared procedure or function. The standard requires that
the parameter list appears as well, so that it can be checked against
any call of that procedure or function. For example, the original
form:
procedure junk(function y: real);
begin
r := y(z);
...
end;
...
junk(sin);
becomes, in the standard:
procedure junk(function y(x: real): real);
etc.
2. The original language only allowed procedure and function parameters
to have value parameters. The standard allows value or VAR parameters.
3. In conjunction with (2), standard procedures and functions (those
defined by the compiler itself) are no longer acceptable as
procedure or function parameters in the standard. The REPORT shows
several examples of passing such functions.
4. In the original language, the exact rules of whether type x was
compatible with type y were left implementation defined.
In fact, the first implementations at ETH (which were not documented
in the REPORT) were based on "best effort", such that:
var rx: record x, y: integer; c: char end;
and
var ry: record x: integer; y: 0..10; c: char end;
were considered compatible because they had the same basic structure.
The standard tightened these rules up considerably. In the standard,
types are compatible, with a few exceptions, only if they are the same
type or "aliases" of the same type, as in:
type a = b;
The standard also exactly defines the rules for assignment and other
compatibility modes.
5. The REPORT defined symbol lengths to be implementation defined. The
standard defines them as "unlimited", which for practical purposes
means that if the program lines will fit through the compiler, and
a symbol fits on one of those lines, it should work.
6. The REPORT leaves the rules for interprocedural gotos (those that
jump out of a procedure or function) as implementation defined. The
standard says they must only target the OUTER level of the block.
7. The control variable in a "for" statement must be a variable local
to the procedure, function or program block in which it appears.
This change was to allow a more efficient implementation with
better checking.
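The first change in the list above can be sketched as a complete program.
This is only an illustration, assuming a compiler that follows ISO 7185;
the names "funcparam", "square" and "apply" are hypothetical and do not
come from the REPORT or the standard. Note that a user-defined function is
passed, since the standard no longer accepts predefined functions such as
sin as actual parameters.

```pascal
program funcparam(output);

{ "square" and "apply" are illustrative names only. }

function square(x: real): real;
begin
   square := x * x
end;

{ The formal parameter f carries its own parameter list, so the
  compiler can check every call of f made inside apply. }
function apply(function f(x: real): real; v: real): real;
begin
   apply := f(v)
end;

begin
   { passes the user-defined function square; the result is 9.0 }
   writeln(apply(square, 3.0):8:2)
end.
```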
In fact, most of what the standard did was simply acknowledge what were
already good coding practices. The original REPORT method of assuring
portability could be stated as:
APPLICATION: Stay within the guidelines as far as possible. Don't rely on
implementation dependent features, such as the compiler's ability
to recognize the similarity between types, etc.
COMPILER: Implement the language as fully as possible, and always
try to do the most reasonable thing for implementation dependent
features, such as attempting to determine whether types are compatible
as best as possible.
The idea being that a program will not fail unless it is a poorly written
program run on a poorly written compiler.
The standard changed that to a much more exact set of rules that all
compilers and programs must follow.
As an example of the compatibility between the REPORT language and the
standard, I moved several thousand lines of my own Pascal source from the
"old" to a standard compiler without A SINGLE CHANGE because of the standard.
The only error I found was that the compiler would not accept:
var s: array [1..10] of char;
...
writeln(s);
Because such Pascal strings must be "packed". This was actually also required
in the REPORT, I just had not read it correctly (or well enough).
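A minimal sketch of the corrected form, assuming an ISO 7185 compiler (the
program name "strdemo" is made up for illustration). The array is declared
"packed", and the string constant assigned to it has exactly as many
characters as the array has elements:

```pascal
program strdemo(output);

var s: packed array [1..10] of char;

begin
   s := 'hi there  ';   { the literal must have exactly 10 characters }
   writeln(s)           { accepted because s is a packed array of char }
end.
```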
*****************************************************************************
Q. WHAT ARE THE DIFFERENCES BETWEEN BORLAND PASCAL AND THE STANDARD ?
A. Because Borland Pascal is arguably the most prevalent version of Pascal
in existence, it is useful to compare the two languages. Note that I compare
here only the differences between Borland and the basic standard. Undiscussed
are any extensions provided by Borland. In other words, this section answers
the question "why doesn't my standard Pascal program run under Borland ?",
and perhaps "what can I write in Borland that will also be compatible with
the standard ?".
Borland originally claimed to be compatible with the ANSI version of the
standard (the first CP/M Turbo Pascal). As to whether or not the
omissions in Borland cause portability problems or are easily surmounted,
this is for you to decide. As to why these differences exist, this is
certainly a story on its own. Borland's C compiler did not originally match
the C language either (as detailed by Kernighan and Ritchie's "white book"),
but Borland corrected their compiler to meet the ANSI standard after it
(the standard) was issued. Call it a political issue.
1. Lack of file buffer variable handling. Standard Pascal has file "buffer
variables", and "get" and "put" procedures to operate on them. This
functionality was entirely omitted in Borland Pascal.
2. Lack of interprocedural gotos (gotos that jump out of a procedure or
function). UCSD introduced this convention, which was designed to both
discourage use of "goto"s and keep language implementation simple.
Unfortunately, interprocedural gotos are the most useful type of gotos:
program test;
label 99;
procedure alpha;
begin
if error then goto 99 { exit }
end;
begin
...
99: { clean up files and exit }
...
end.
Interprocedural gotos are used to implement error "bailout" in Pascal,
similar to "exception handling" in other languages. Borland later added
a goto method to do this that was completely incompatible with the
standard (patterned after the C language).
3. Lack of procedure and function parameters. Borland Pascal provides a much
more general and powerful mechanism of procedure and function "types",
which are then used to create procedure and function parameters:
type CompareFunction = function(Key1, Key2 : string) : integer;
procedure Sort(Compare : CompareFunction);
begin
...
end;
Great. But Borland does not also implement the method from original Pascal,
requiring source changes.
4. Lack of "sized" dynamic variable allocation. Standard Pascal allows
the tag fields of a variant record to be specified as a parameter to the
"new" and "dispose" procedures:
type r = record
case b: boolean of
false: (i: integer);
true: (c: char);
end;
var p: ^r;
...
new(p, true);
This allows variant records to take up less space in memory.
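As an illustration of item 1 above, here is a rough sketch of how file
buffer variables are used with "get" and "put" in standard Pascal. It
assumes an ISO 7185 compiler in which a file variable not named in the
program heading is a temporary file; the names "bufdemo", "f" and "i" are
hypothetical.

```pascal
program bufdemo(output);

var f: file of integer;
    i: integer;

begin
   rewrite(f);              { open f for writing }
   for i := 1 to 3 do begin
      f^ := i * 10;         { store a value in the buffer variable }
      put(f)                { append the buffer variable to the file }
   end;
   reset(f);                { reopen for reading; f^ holds the first element }
   while not eof(f) do begin
      writeln(f^:1);        { inspect the buffer variable }
      get(f)                { advance to the next element }
   end
end.
```

Code like this has to be rewritten in terms of "read" and "write" before
Borland Pascal will accept it.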
It should also be noted that Borland Pascal allows a considerable number
of operations that would be errors in the standard language. This leaves
the opportunity for a programmer to inadvertently create programs that
break the standard in many ways. Also, Borland requires the use of a
non-standard integer type ("longint") to get the maximum precision of
an integer.
*****************************************************************************
Q. WHAT IS THE DIFFERENCE BETWEEN PASCAL AND C OR SIMILAR LANGUAGES ?
A. C is called a "low level" language because it operates on the kinds of
units that the CPU itself deals with, such as integers, and pointers. In
particular, the most powerful (and dangerous) feature of C is the ability to
treat any array reference as a pointer reference, and vice versa. C can also
translate one type to another at will. This creates an "insecure" language,
which means that it is not possible for the compiler to check if the program
will go wild and start writing all over variables, programs, or the operating
system, hard disk, and perhaps that unbacked up copy of your big project.
Before I used high level languages, I worked in assembly exclusively (there
were few or no compilers available in the early days of microprocessors).
I was debating a friend about which was better, assembly language or HLLs,
and told him that if I were to use a HLL, I would still potentially have a
program that could write on itself, but would be in a worse position to do
something about it, since I would essentially be reduced to disassembling
generated code to find the problem. He pointed out something that I did not
believe at the time, that it was indeed possible to design an HLL so that
the program could never write or read anything but data, and could do no more
damage than going into an infinite loop (while staying in the program).
In fact, that level of security is possible, and not even very expensive. And
when a program is well tested and mature, even that level of security can be
dropped out by compiler option, meaning that a secure language can be just
as efficient as a non-secure one. Working with a secure language is a true
pleasure. If the program "halts" (goes into an infinite loop), I can just
hit a key and find out where it is stuck. Rebooting the machine is not
required, and because the program cannot destroy data arbitrarily, debugging
the problem is much less difficult. In fact, I don't think it is an
overstatement to say that at least 50% of debugging time is saved using this
arrangement.
In the meantime, C has taken a dramatic 180 degree turn back to type security.
Most compilers now do extensive type checking, and complain (sometimes to
excess) about any bad use of types in C. The C++ language attempts to bring
C all the way back to type security, including the ability to check for out
of bounds array (pointer) references.
When translating from one to the other, pointers and type conversion are
typically the central issue. Pascal cannot arbitrarily point to anything,
nor translate any type to another.
In my experience, program translations can be done only one way. Pascal
programs can be translated to C, but most C programs cannot be translated to
Pascal, because that program will just contain too many broken rules to be
corrected without a complete rewrite.
*****************************************************************************
Q. WHY IS C MORE EFFICIENT THAN PASCAL ?
A. It isn't.
This is one of the more common myths. The old proof goes something like this:
a++
Means "add one" in C, and:
a := a+1
Means "add one" in Pascal. Anyone who knows assembly language knows
that there is usually a special machine instruction for "add one" or
"increment", and that adding a constant one may take two instructions.
This gets more complex. In C, pointers are synonyms for arrays, so
you can pick up an array and walk through it:
int strlen(char s[])
{
char *l;
l = s;
while (*l) l++; /* walk l up to the terminating zero */
return l-s; /* pointer difference is the length */
}
Gives the length of a zero terminated string. The program relies on being
able to directly index into the string with a machine address (pointer),
and perform math with two pointers.
So C is more efficient than Pascal right ? Well, assuming you have two
compilers as dumb as posts, yes, that would be true. And in the early days
of desktop computing, compilers that dumb were common (in fact, a HARDWARE
feature of the 286 and later CPUs, and of the operating systems, was built
around stupid compilers[&]). But any modern compiler knows that "a+1" can be
changed to an increment. And compilers know how to translate array references
to pointer references. In fact, to a good optimizing compiler, the source
language is irrelevant. By the time the program is encoded, it has been
remade extensively, to the point it may no longer even resemble the original
source. At this level, the only effect the source language has on the output
is that one language may give the compiler more information about the program
than another. And Pascal is better at this than any other language in common
usage. So the fact that a C programmer has taken great pains to use pointers
instead of arrays may just be wasted effort. A higher level description of
the problem will probably end up producing the exact same code with less
programmer effort.
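For comparison, the same computation can be written in Pascal with plain
array indexing. This is only a sketch under stated assumptions: standard
Pascal has no zero terminated strings, so the chr(0) terminator is borrowed
from the C example, and the names "lendemo", "buffer", "zlen" and "max" are
invented for illustration.

```pascal
program lendemo(output);

const max = 80;

type buffer = packed array [1..max] of char;

var b: buffer;
    i: integer;

{ Count the characters before the first chr(0). The index n never
  leaves 1..max, so no out of bounds reference can occur. }
function zlen(var s: buffer): integer;
var n: integer;
begin
   n := 1;
   while (n < max) and (s[n] <> chr(0)) do n := n + 1;
   if s[n] = chr(0) then zlen := n - 1
   else zlen := max
end;

begin
   for i := 1 to max do b[i] := chr(0);
   b[1] := 'a'; b[2] := 'b'; b[3] := 'c';
   writeln(zlen(b):1)      { the length found is 3 }
end.
```

Whether this compiles to the same machine code as the pointer version
depends entirely on the compiler, which is the point of the argument above.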
The C language was a very clever design in that it could be implemented with
little or no optimization and still yield acceptable code, by placing the
burden of optimization upon the programmer. This was a very important
characteristic of C when microprocessors were still limited to 64kb of
address space.
[&] I refer to the fact that the 286 and later processors, and OS/2 and
Windows (even Win32) are built to use stack parameter passing, which most
PC compilers no longer use.
*****************************************************************************
Q. WHAT GOOD BOOKS ARE AVAILABLE ON THE STANDARD LANGUAGE ?
A. Many books have been published on Pascal. I will be
happy to collect reviews here.
TITLE:
Pascal user manual and report, third edition. Kathleen Jensen and
Niklaus Wirth, Revised by Andrew Mickel and James Miner.
Published by Springer-Verlag.
COMMENTS:
A definitive reference on standard Pascal and a must have book.
TITLE:
Standard Pascal: User reference manual. Doug Cooper.
Published by W. W. Norton and Company.
COMMENTS:
Doug Cooper is a Professor at UC Berkeley. If you buy just ONE book
on standard Pascal, this would be it. Contains ALL of the points in the
standard, in the most readable format anywhere.
TITLE:
Oh! Pascal. Doug Cooper.
Published by W. W. Norton and Company.
COMMENTS:
Another Doug Cooper blockbuster, this is probably the most used classroom
book on Pascal. Recommended if you are learning standard Pascal.
*****************************************************************************
Q. Where can I find books ?
A. On the Web, I found a server with an amazing number of books available for
order at:
http://www.amazon.com
The drawback is that no real book descriptions are included, and shipping is
expensive. But this would seem the way to go to get hard to find books.
Submissions of other interesting online catalogs are encouraged.
*****************************************************************************
Q. WHAT ARE THE RULES OF ANSI/ISO PASCAL ?
A. It is unusual to describe a language completely in a FAQ, but books
on standard Pascal are sufficiently rare that I feel it is warranted.
Also, many books introduce themselves as "books on Pascal", without
specifying what language they use (in an obvious manner). I have seen
several such books that are really based on non-standard Pascals.
You can match the features in the book to the actual standard language here.
Note that the following description could be wrong or incomplete.
LEXICAL STRUCTURE
Pascal source consists of identifiers, keywords, numbers and special
character sequences. A Pascal identifier must begin with 'a' to 'z', but may
continue with 'a' to 'z' and '0' to '9' (upper and lower case letters are
equivalent). There is no length limit on identifiers, but there may be a
practical limit. If the compiler cannot process a source line longer than N,
you cannot have an identifier longer than N, since identifiers may not cross
lines.
Keywords (or reserved words) appear just as identifiers, but have special
meaning wherever they appear, and may never be used as identifiers:
and array begin case const div do
downto else end file for function goto
if in label mod nil not of
or packed procedure program record repeat set
then to type until var while with
A number can appear in both integer and real form. Integers will appear
as a sequence of digits:
83
00004
Are valid integer numbers. For a number to be taken as "real" (or "floating
point") format, it must either have a decimal point, or use scientific
notation:
1.0
1e-12
0.000000001
Are all valid reals. At least one digit must exist on either side of a
decimal point.
Strings are made up of a sequence of characters between single quotes:
'string'
The single quote itself can appear as two single quotes back to back in a
string:
'isn''t'
Finally, special character sequences are one of the following:
+ - * / = < >
[ ] . , : ; ^
( ) <> <= >= := .. @
{ } (* *) (. .)
Note that these are just aliases for the same character sequence:
@ and ^ (or the "up arrow" if allowed in the typeface)
(. and [
.) and ]
(* and {
*) and }
Spaces and line endings in the source are ignored except that they may act
as "separators". No identifier, keyword, special character sequence or
number may be broken by a separator or other object. No two identifiers,
keywords or numbers may appear in sequence without an intervening separator:
MyLabel - Valid
My Label - Invalid
begin farg := 1 - Valid
beginfarg := 1 - Invalid
1.0e-12 - Valid
1.e-122e-3 - Invalid
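Several of these lexical rules can be seen together in a small program.
This is a sketch only, assuming an ISO 7185 compiler that accepts the
alternative symbols; "lexdemo", "Count" and "v" are made-up names.

```pascal
program lexdemo(output);

(* A comment may use the alternative comment delimiters. *)
{ Upper and lower case letters are equivalent in identifiers. }

var Count: integer;
    v: array (. 1..3 .) of integer;   { (. and .) stand for [ and ] }

begin
   COUNT := 00004;                    { same variable as Count; value is 4 }
   v[1] := count;
   { two quotes inside a string stand for one quote character }
   writeln('isn''t', ' ', v(. 1 .):1)
end.
```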
PROGRAM STRUCTURE
A Pascal program appears as a nested set of "blocks", each of which has the
following form:
block_type name(parameter [, parameter]...);
label x[, y]...;
const x = y;
[q = r;]...
type x = y;
[q = r;]...
var x[,y]...: z;
[x[,y]...: z;]...
[block]...
begin
statement[; statement]...
end[. | ;]
Note that:
[option] means optional.
[repeat]... means can appear 0 or more times.
[x | y] means one or the other.
There are three types of blocks, program, procedure and function. Every
program must contain a program block, and exactly one program block exists
in the source file.
Each block has two distinct sections, the declaration and statements
sections. The declarations immediately before a statement section are
considered "local" to that section.
The declaration section builds a description of the data used by the coming
statement section in a logical order. For example, constants are usually
used to build type declarations, and type declarations are used to build
variables, and all of these may be used by nested blocks.
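As a concrete instance of the block layout, here is a small complete
program with every declaration section in the order described. It is a
sketch assuming an ISO 7185 compiler; all identifiers ("structure",
"limit", "table", "fill", "total" and so on) are hypothetical.

```pascal
program structure(output);

label 99;

const limit = 5;

type index = 1..limit;
     table = array [index] of integer;

var t: table;
    i: index;

procedure fill(var a: table);
var k: index;
begin
   for k := 1 to limit do a[k] := k * k
end;

function total(var a: table): integer;
var k: index;
    sum: integer;
begin
   sum := 0;
   for k := 1 to limit do sum := sum + a[k];
   total := sum
end;

begin
   fill(t);
   if total(t) = 0 then goto 99;      { skip the listing if nothing to show }
   for i := 1 to limit do write(t[i]:3);
   writeln;
99:
   writeln('total: ', total(t):1)     { 1 + 4 + 9 + 16 + 25 = 55 }
end.
```

The constant builds the subrange type, the subrange builds the array type,
and the nested procedure and function use all of them.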
LABEL DECLARATION
The first declaration, labels, are numeric sequences that denote the target
of any goto's appearing in the block:
label 99,
1234;
Are valid labels. Labels "appear" to be numbers, and must be in the range
0 to 9999. The "appearance" of a number means that:
label 1,
01,
Are the same label.
CONSTANT DECLARATION
Constant declarations introduce fixed valued data as a specified identifier:
const x = 10;
q = -1;
y = 'hi there';
r = 1.0e-12;
z = x;
Are all valid constant declarations. Only integer, real, character and
string constants may be so defined (no sets may appear).
TYPES
The type declaration allows types to be given names, which are used to
create variables later:
type x = array [1..10] of integer;
i = integer;
z = x;
Types can be new types, aliases of old types, etc.
VARIABLE DECLARATION
Variables set aside computer storage for an element of the given type:
var x, y: integer;
z: array [1..10] of char;
BLOCK DECLARATION
A block can be declared within a block, and that block can declare blocks
within it, etc. There is no defined limit as to the nesting level.
Because only one program block may exist, by definition all "sub blocks"
must be either procedure or function blocks. Once defined, a block may
be accessed by the block it was declared in. But the "surrounding" block
cannot access blocks that are declared within such blocks:
program test;
procedure junk;
procedure trash;
begin { trash }
...
end; { trash }
begin { junk }
trash;
...
end; { junk }
begin { test }
junk;
...
end. { test }
Here test can call junk, but only junk can call trash. Trash is "hidden"
from the view of test.
Similarly, a subblock can access any of the variables or other blocks
that are defined in surrounding blocks:
program test;
var x: integer;
procedure q;
begin
end;
procedure y;
begin
q;
x := 1
end;
begin
y;
writeln(x)
end.
The variable "x" can be accessed from all blocks declared within the same
block.
It is also possible for a block to call itself, or another block that
calls it. This means that recursion is allowed in Pascal.
DECLARATION ORDER
Every identifier must be declared before it is used, with only one exception,
pointers, which are discussed later. But there is a way to declare procedures
and functions before they are fully defined to get around problems this
may cause.
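The usual way around this is the "forward" directive of standard Pascal.
The sketch below shows two mutually recursive functions; the names
"fwddemo", "iseven" and "isodd" are illustrative, and the functions assume
a non-negative argument.

```pascal
program fwddemo(output);

{ Declare the heading now; the body follows later. }
function isodd(n: integer): boolean; forward;

function iseven(n: integer): boolean;
begin
   if n = 0 then iseven := true
   else iseven := isodd(n - 1)
end;

{ The parameter list is not repeated when the body is supplied. }
function isodd;
begin
   if n = 0 then isodd := false
   else isodd := iseven(n - 1)
end;

begin
   writeln(iseven(10))
end.
```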
PREDEFINED TYPES
Several types are predeclared in Pascal. These include integer, boolean, char,
real and text. Predeclared types, just as predeclared functions and procedures,
exist in a conceptual "outer block" around the program, and can be replaced
by other objects in the program.
BASIC TYPES
Types in Pascal can be classed as ordinal, real and structured. The ordinal
and real types are referred to as the "basic" types, because they have no
complex internal structure.
Ordinal types are types whose elements can be numbered, and there are a
finite number of such elements.
INTEGER TYPES
The basic ordinal type is "integer", and typically it represents the accuracy
of a single word on the target machine:
var i: integer;
A predefined constant exists, "maxint", which tells you what the maximum
integral value of an integer is. So:
type integer = -maxint..maxint;
Would be identical to the predefined type "integer". Specifically, the
results of any operation involving ordinals will only be error free if
they lie within -maxint to +maxint.
Although other ordinal types exist in Pascal, all such types have a mapping
into the type "integer", and are bounded by the same rules. The "ord"
function can be used on any ordinal to find the corresponding integer.
ENUMERATED TYPES
Enumerated types allow you to specify an identifier for each and every value
of an ordinal:
type x = (one, two, three, four);
Introduces four new identifiers, each one having a constant value in sequence
from the number 0. So for the above:
one = 0
two = 1
three = 2
four = 3
Enumerated types may have no relationship to numbers whatever:
type y = (red, green, blue);
Or some relationship:
type day = (mon, tue, wed, thur, fri, sat, sun);
Here the fact that "day"s are numbers (say, isn't that a lyric ?) is useful
because the ordering has real world applications:
if mon < fri then writeln('yes');
And of course, subranges of enumerated types are quite possible:
type workday = mon..fri;
Enumerated types are fundamentally different from integer and subrange types
in the fact that they cannot be freely converted to and from each other.
There is only one conversion direction defined, to integer, and that must
be done by special predefined function:
var i: integer;
d: day;
...
i := ord(d); { find integral value of d }
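A short sketch of these rules in use, assuming an ISO 7185 compiler (the
program name "enumdemo" is made up). Values of type "day" are compared and
stepped directly, and only "ord" produces an integer:

```pascal
program enumdemo(output);

type day = (mon, tue, wed, thur, fri, sat, sun);

var d: day;

begin
   for d := mon to sun do
      if d <= fri then writeln(ord(d):1, ' workday')
      else writeln(ord(d):1, ' weekend');
   d := succ(mon);                { the value after mon, that is tue }
   writeln(ord(d):1)              { ord(tue) is 1 }
end.
```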
BOOLEAN TYPES
The only predefined enumerated type is "boolean", which could be declared:
type boolean = (false, true);
However, because enumerated types cannot be cross converted, this
user created type could not in fact be used in place of the predeclared one.
Booleans are special in that several predefined functions, and all of the
comparison operators ("=", ">", etc.) give boolean results. In addition,
several special operators are defined just for booleans, such as "and",
"or" etc.
CHARACTER TYPES
Character types in Pascal hold the values of the underlying character set,
usually ISO single byte encoded (including ASCII). The Pascal standard
makes few requirements as to what characters will be present or what order
they will appear in: the digits '0'-'9' must be in order and contiguous, and
the letters, where present, must be in alphabetical order but need not be
contiguous. As a practical matter, most Pascal programs rely on the
characters of the alphabet being present, and on these being numbered
sequentially (which leaves out EBCDIC, for example).
A character declaration appears as:
var c: char;
Character values can also be converted to and from integers at will, but only
by using the special functions to do so:
ord(c); { find integer value of character }
chr(i); { find character value of integer }
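The two functions are typically combined to convert between digit
characters and their numeric values. A minimal sketch, assuming an ISO 7185
compiler and relying only on the digits being contiguous (the name
"chrdemo" is hypothetical):

```pascal
program chrdemo(output);

var c: char;
    i: integer;

begin
   c := '7';
   i := ord(c) - ord('0');      { digit value of the character: 7 }
   writeln(i:1);
   c := chr(ord('0') + 3);      { the character '3' }
   writeln(c)
end.
```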
SUBRANGE TYPES
Subrange types are simply a voluntary programmer restriction of the values
an ordinal type may hold:
type constrained = -10..50;
(the notation x..y means all values from x to y inclusive.)
It is an error to assign a value outside of the corresponding range to a
variable of that type:
var x: constrained;
...
x := 100; { invalid! }
But note that there are no restrictions on the USE of such a type:
writeln('The sum is: ', x+100);
Here, even though the result of x+100 lies outside the range of the type of
x, it is not an error. When used in an expression, a subrange is directly
equivalent to the type "integer".
Subranges can be declared of any ordinal type:
type enum = (one, two, three, four, five, six, seven, eight, nine, ten);
var e: three..seven;
var c: 'a'..'z';
Etc.
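The difference between using a subrange value and assigning to a subrange
variable might be sketched as follows (the program name "subdemo" is an
assumption; "constrained" is the type declared above):

```pascal
program subdemo(output);

type constrained = -10..50;

var x: constrained;
    n: integer;

begin
   x := 50;
   n := x + 100;        { fine: the expression is of type integer }
   writeln(n:1);        { n is 150 }
   if (n >= -10) and (n <= 50) then
      x := n            { assign only when n fits the subrange }
   else
      writeln('out of range')
end.
```

Testing the value before the assignment keeps the program free of the
error described above, whether or not the compiler checks ranges.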
REAL TYPES
Real types, or "floating point", allow approximations of a large range of
numbers to be stored. The tradeoff is that reals have no direct ordinality
(cannot be counted), and so have no direct relationship with integers. Real
types are the only basic type which is not ordinal.
var r: real;
Integers are considered "promotable" to reals. That is, it is assumed that
an integer can always be represented as a real. However, there may be
a loss of precision when this is done (because the mantissa of a real
may not be as large as an integer).
Reals are never automatically promoted to integer, however, and the
programmer must choose between finding the nearest whole number to the real,
or simply discarding the fraction. This choice must be made explicitly by
predefined function.
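The predefined functions in question are "trunc" and "round". A small
sketch of both directions of conversion (the name "realdemo" is made up):

```pascal
program realdemo(output);

var r: real;
    i: integer;

begin
   i := 7;
   r := i;              { integer is promoted to real automatically }
   r := r / 2;          { 3.5 }
   i := trunc(r);       { discard the fraction: 3 }
   writeln(i:1);
   i := round(r);       { nearest whole number: 4 }
   writeln(i:1)
end.
```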
STRUCTURED TYPES
A structured type is a type with a complex internal structure. In fact, the
structured types all have one thing in common: they can hold more than
one basic type object at one time. They are structured because they are
"built up" from basic types, and from other structured types.
PACKING
Structured types can also be "packed", which is indicated by the keyword
"packed" before the type declaration. Packing isn't supposed to change the
function of the program at all. Stripping the "packed" keywords out of a
program will not change the way it works (with the exception of "strings",
below).
Packing means that (if implemented: it's optional) the program should conserve
space by placing the values in as few bits as possible, even if this takes more
code (and time) to perform.
Packing is better understood if you understand the state of computers before
microprocessors (the Jurassic age of computers ?). Most mainframe computers
access memory as a single word size only, and not even a neat multiple of
8 bits either (for example, 36 bit computers; the CDC 6000 has 60 bit words).
The machine reads or writes in words only. There is no byte access, no even/odd
addressing, etc. Because storage of small items (especially characters) on
such a machine could be wasteful, programs often pack many single data items
into a single word.
The advent of the Minicomputer changed that. DEC started with an 8 bit machine
(just as microprocessors did), and when they changed to 16, then 32 bits the
ability to address single bytes was maintained.
For this reason, many people refer to such a machine as "automatically packed",
or say that Pascal's packing feature is unnecessary on such machines. However,
quantizing data by 8 bit bytes is not necessarily the most extreme packing method
available. For example, a structure of boolean values, which take up only 1 bit
per element, left to byte packing would waste 7/8 of the storage allocated.
SET TYPES
Set types are perhaps the most radical feature of Pascal. A set type can be
thought of as an array of bits indicating the presence or absence of each
value in the base type:
var s: set of char;
Would declare a set containing a yes/present or no/not present indicator for
each character in the computer's character set. The base type of a set must
be ordinal.
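As a rough sketch of how such a set is used (the names are illustrative, and
the loop assumes a character set such as ASCII in which 'a' to 'f' are
consecutive, which the standard does not guarantee):

```pascal
program setdemo(output);
var vowels, seen: set of char;
    c: char;
begin
  vowels := ['a', 'e', 'i', 'o', 'u'];
  seen := [];                              { the empty set }
  for c := 'a' to 'f' do
    if c in vowels then seen := seen + [c];   { add c to the set }
  if seen = ['a', 'e'] then writeln('seen = [a, e]');
  if seen <= vowels then writeln('seen is a subset of vowels')
end.
```

Some implementations limit the size of sets, so "set of char" may not be
accepted everywhere.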
ARRAY TYPES
The most basic structured type is the array. Pascal is unusual in that both
the upper and lower bounds of arrays are declared (instead of just the upper
bound or length), and that the index type can be any ordinal type:
var a: array [1..10] of integer;
Would declare an array of 10 integers with indexes from 1 to 10.
You may recognize the index declaration as a subrange, and indeed any
subrange type can be used as an index type:
type sub = 0..99;
var a: array [sub] of integer;
Arrays can also be declared as multidimensional:
var a: array [1..10] of array [1..10] of char;
There is also a shorthand form for array declarations:
var a: array [1..10, 1..10] of char;
Is equivalent to the last declaration.
A special type of array definition is a "string". Strings are packed arrays of
characters, with integer indexes, whose lower bound is 1:
var s: packed array [1..10] of char;
String types are special in that any two strings with the same number of
components are compatible with each other, including constant strings.
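A small sketch of what this compatibility allows, using a hypothetical 5
component string type called "name":

```pascal
program strdemo(output);
type name = packed array [1..5] of char;
var a, b: name;
begin
  a := 'alice';            { the constant has exactly 5 components }
  b := 'bob  ';            { shorter text must be padded with blanks }
  if a < b then            { strings of equal length can be compared }
    writeln(a, ' sorts before ', b)
end.
```

A constant with any other number of characters, such as 'bob', would not be
compatible with the type "name".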
RECORD TYPES
Records give the ability to store completely different component types together
as a unit. They can then be manipulated, copied and passed as a unit. It is
also possible to create differently typed objects that occupy the same storage
space.
var r: record
a: integer;
b: char
end;
Gives a single variable with two completely different components, which can
be accessed independently, or used as a unit.
var vr: record
a: integer;
case b: boolean of { variant }
true: (c: integer; d: char);
false: (e: real)
{ end }
end;
Variant records allow the same "collection of types", but introduce the idea
that not all of the components are in use at the same time, and thus can occupy
the same storage area. In the above definition, a, b, c, d, and e are all
elements of the record, and can be addressed individually. However, there are
three basic "types" of record elements in play:
1. "base" or normal fixed record elements, such as a.
2. The "tagfield" element. Such as b.
3. The "variants", such as c, d, and e.
All the elements before the case variant are normal record elements and are
always present in the record. The tagfield is also always present, but has a
special function with regard to the variant. It must be an ordinal type, and
ALL of its possible values must be accounted for by a corresponding variant.
The tagfield gives both the program and the compiler the chance to tell what
the rest of the record holds (i.e., which case variant is "active"). The tagfield
can also be omitted:
var vr: record
a: integer;
case boolean of { variant }
true: (c: integer; d: char);
false: (e: real)
{ end }
end;
In this case, the variant can be anything the program says it is, without
checking.
The variants introduce what essentially is a "sub record" definition that
gives the record elements that are only present if the selecting variant is
"active". A variant can hold any number of such elements.
If the compiler chooses to implement variants, the total size of the resulting
record will be no larger than the fixed record parts plus the size of the
largest variant.
It is possible for the compiler to treat the variant as a normal record,
allocating each record element normally, in which case the variant record
would be no different from a normal record.
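The following sketch shows a tagged variant record in use. The type "shape"
and its field names are invented for the illustration:

```pascal
program vardemo(output);
type shape = record
       id: integer;                    { fixed part, always present }
       case iscircle: boolean of       { tagfield }
         true:  (radius: real);
         false: (width, height: real)
     end;
var s: shape;
begin
  s.id := 1;
  s.iscircle := false;                 { select the "false" variant }
  s.width := 2.0;
  s.height := 3.0;
  if s.iscircle then                   { test the tag before touching a variant }
    writeln('circle, radius ', s.radius:3:1)
  else
    writeln('rectangle, area ', (s.width * s.height):3:1)
end.
```

Accessing s.radius while the tagfield is false would be an error, although
many compilers will not detect it.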
FILE TYPES
Files are similar to arrays in that they store a number of identical
components. Files are different from arrays in that the number of components
they may store is not limited or fixed beforehand. The number of components
in a file can change during the run of a program.
A file can have any type as a component type, with the exception of other
file types. This rule is strict: you may not even have structures which
contain files as components.
A typical file declaration is:
var f: file of integer;
Would declare a file with standard integer components. A special predefined
file type exists:
var f: text;
Text files are supposedly equivalent to:
type text = file of char;
But there are special procedures and functions that apply to text files only.
POINTER TYPES
Pointers are indirect references to variables that are created at runtime:
var ip: ^integer;
Pointers are neither basic nor structured types (they are not structured
because they do not have multiple components). Any type can be pointed to.
In practice, pointers allow you to create a series of unnamed components
which can be arranged in various ways.
The type declaration for pointers is special in that the type specified to
the right of "^" must be a type name, not a full type specification.
Pointer declarations are also special in that a pointer type can be declared
using base types that have not been declared yet:
type rp = ^rec;
rec = record
next: rp;
val: integer
end;
The declaration for rp contains a reference to an undeclared type, rec. This
"forward referencing" of pointers allows recursive definition of pointer
types, essential in list processing.
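As an illustration, the rp/rec declaration can be used to build and walk a
simple linked list. This is only a sketch; the variable names head, p and i
are not part of the original example:

```pascal
program listdemo(output);
type rp = ^rec;
     rec = record
       next: rp;
       val: integer
     end;
var head, p: rp;
    i: integer;
begin
  head := nil;
  for i := 1 to 3 do begin      { push 1, 2, 3 onto the front of the list }
    new(p);
    p^.val := i;
    p^.next := head;
    head := p
  end;
  while head <> nil do begin    { walk the list, releasing each entry }
    p := head;
    head := head^.next;
    writeln(p^.val:1);
    dispose(p)
  end
end.
```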
TYPE COMPATIBILITY
Type compatibility (ability to use two different objects in relation to each
other), occurs on three different levels:
1. Two types are identical.
2. Two types are compatible.
3. Two types are assignment compatible.
Two types are identical if the exact same type definition was used to create
the objects in question. This can happen in several different ways. Two
objects can be declared in the same way:
var a, b: array [1..10] of record a, b: integer end;
Here a and b are the same (unnamed) type. They can also be declared using the
same type name:
type mytype = record a, b: integer end;
var a: mytype;
b: mytype;
Finally, an "alias" can be used to create types:
type mytype = array [1..10] of integer;
myother = mytype;
var a: mytype;
b: myother;
Even though an alias is used, these objects still have the same type.
Two types are considered compatible if:
1. They are identical types (as described above).
2. Both are ordinal types, and one or both are subranges of an identical
type.
3. Both are sets with compatible base types and "packed" status.
4. Both are string types with the same number of components.
Finally, two types are assignment compatible if:
1. The types are compatible, as described above.
2. Neither is a file, or has components of file type.
3. The destination is real, and the source is integer (because integers can
always be promoted to real, as above).
4. The source "fits" within the destination. If the types are subranges of
the same base type, the source must fall within the destination's range:
var x: 1..10;
...
x := 1; { legal }
x := 20; { not legal }
5. Both are sets, and the source "fits" within the destination. If the base
types of the sets are subranges, all the source elements must also exist in
the destination:
var s1: set of 1..10;
...
s1 := [1, 2, 3]; { legal }
s1 := [1, 15]; { not legal }
EXPRESSIONS
The basic operands in Pascal are:
xxx - Integer constant. A string of digits, without sign, whose
value is bounded by -maxint..maxint.
x.xex - Real constant.
'string' - String constant.
[set] - Set constant. A set constant consists of zero or more elements
separated by ",":
[1, 2, 3]
A range of elements can also appear:
[1, 2..5, 10]
The elements of a set must be of the same type, and the
"apparent" base type of the set is the type of the elements.
The packed or unpacked status of the set is whatever is
required for the context where it appears.
ident - Identifier. Can be a variable or constant from a const
declaration.
func(x, y) - A function call. Each parameter is evaluated, and the
function called. The result of the function is then used
in the encompassing expression.
The basic construct built on these operands is a "variable access", where
"a" is any variable access.
ident - A variable identifier.
a[index] - Array access. It is also possible to access any number of
dimensions by listing multiple indexes separated by ",":
[x, y, z, ...]
a.off - Record access. The "off" will be the element identifier as
used in the record declaration.
a^ - Pointer reference. The resulting reference will be of the
variable that the pointer indexes. If the variable reference
is a file, the result is a reference to the "buffer variable"
for the file.
Note that a VAR parameter only allows a variable reference, not a full
expression.
For the rest of the expression operators, here they are in precedence, with
the operators appearing in groups according to priority (highest first).
"a" and "b" are operands.
(a) - A subexpression.
not - The boolean "not" of the operand, which must be boolean.
a*b - Multiplication/set intersection. If the operands are real or
integer, the product is found. If either operand is
real, the result is real. If the operands are sets, the
intersection is found, that is, a new set with the elements that
exist in both sets.
a/b - Divide. The operands are real or integer. The result is a real
representing a divided by b.
a div b - Integer divide. The operands must be integer. The result is an
integer giving a divided by b with no fractional part.
a mod b - Integer modulo. The operands must be integer. The result is an
integer giving the modulo of a divided by b.
a and b - Boolean "and". Both operands must be boolean. The result is a
boolean, giving the "and" of the operands.
+a - Identity. The operand is real or integer. The result is the
same type as the operand, and gives the same sign result as the
operand (essentially a no-op).
-a - Negation. The operand is real or integer. The result is the
same type as the operand, and gives the negation of the
operand.
a+b - Add/set union. If the operands are real or integer, finds the
sum of the operands. If either operand is real, the result is
real. If both operands are sets, finds a new set which contains
the elements of both.
a-b - Subtract/set difference. If the operands are real or integer,
finds a minus b. If either operand is real, the result is
real. If both operands are sets, finds a new set which contains
the elements of a that are not also elements of b.
a or b - Boolean "or". Both operands must be boolean. The result is
boolean, giving the boolean "or" of the operands.
a < b - Finds if a is less than b, and returns a boolean result.
The operands can be basic or string types.
a > b - Finds if a is greater than b, and returns a boolean result.
The operands can be basic or string types.
a <= b - Finds if a is less than or equal to b, and returns a boolean
result. The operands can be basic, string or set types. For
sets, the result is true if a is a subset of b.
a >= b - Finds if a is greater than or equal to b, and returns a boolean
result. The operands can be basic, string or set types. For
sets, the result is true if b is a subset of a.
a = b - Finds if a is equal to b, and returns a boolean result.
The operands can be basic, string, set or pointer types.
a <> b - Finds if a is not equal to b, and returns a boolean result.
The operands can be basic, string, set or pointer types.
a in b - Set membership. A is an ordinal, b is a set with the same base
type as a. Returns true if there is an element matching a in
the set.
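A short sketch exercising several of these operators follows. The values are
arbitrary, and the set "s" exists only for the example:

```pascal
program opdemo(output);
var s: set of 1..10;
begin
  writeln(7 div 2:1);          { integer divide: 3 }
  writeln(7 mod 2:1);          { remainder: 1 }
  writeln(7 / 2:3:1);          { real divide: 3.5 }
  writeln(2 + 3 * 4:1);        { "*" binds tighter than "+": 14 }
  s := [1..5] * [4..8];        { intersection: [4, 5] }
  { relational operators have the lowest priority, so each test
    combined with "and" or "not" needs its own parentheses }
  if (4 in s) and not (6 in s) then writeln('ok')
end.
```

Because "and" has the priority of "*", writing "a = 1 and b = 2" without
parentheses is an error rather than a compound test.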
PREDEFINED FUNCTIONS
The following predefined functions exist:
sqr(x) - Finds the square of x, which can be real or integer. The
result is the same type as x.
sqrt(x) - Finds the square root of x, which can be real or integer. The
result is always real.
abs(x) - Finds the absolute value of x, which can be real or integer.
The result is the same type as x.
sin(x) - Finds the sine of x, which can be real or integer. x is
expressed in radians. The result is always real.
cos(x) - Finds the cosine of x, which can be real or integer. x is
expressed in radians. The result is always real.
arctan(x) - Finds the arctangent of x, which can be real or integer. The
result is always real, and is expressed in radians.
exp(x) - Finds the exponential of x, which can be real or integer. The
result is always real.
ln(x) - Finds the natural logarithm of x, which can be real or
integer. The result is always real.
ord(x) - Finds the integer equivalent of any ordinal type x.
succ(x) - Finds the next value of any ordinal type x.
pred(x) - Finds the previous value of any ordinal type x.
chr(x) - Finds the char type equivalent of any integer x.
trunc(x) - Discards the fractional part of the given real x, giving the
whole number nearest to x in the direction of zero (converts a
real to an integer).
round(x) - Finds the nearest integer to the given real x.
STATEMENTS
Pascal uses "structured statements". This means you are given a few standard
control flow methods to build a program with.
ASSIGNMENT
The fundamental statement is the assignment statement:
v := x;
There is a special operator for assignment, ":=" (or "becomes"). Only a
single variable reference may appear to the left, and any expression may
appear to the right.
The operands must be assignment compatible, as defined above.
IF STATEMENT
The if statement is the fundamental flow of control structure:
if cond then statement [else statement]
In Pascal, only boolean type expressions may appear for the condition (not
integers). The if statement specifies a single statement to be executed
if the condition is true, and an optional statement if the condition is
false. You must beware of the "bonding problem" (often called the "dangling
else") if you create multiple nested if statements:
if a = 1 then if b = 2 then writeln('a = 1, b = 2')
else writeln('a <> 1');
Here the else clause is attached to the nearest preceding if (the test of b),
not to the test of a, which may not be the one we want.
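One common way around the problem is to enclose the inner if in a begin..end
pair. The sketch below, with arbitrarily chosen values for a and b, shows both
forms side by side:

```pascal
program elsedemo(output);
var a, b: integer;
begin
  a := 2; b := 2;
  { the else pairs with "if b = 2", so nothing is printed when a <> 1 }
  if a = 1 then if b = 2 then writeln('a = 1, b = 2')
  else writeln('a = 1, b <> 2');
  { begin..end closes the inner if, so the else pairs with "if a = 1" }
  if a = 1 then begin
    if b = 2 then writeln('a = 1, b = 2')
  end
  else writeln('a <> 1')
end.
```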
WHILE STATEMENT
Just as if is the fundamental flow of control statement, while is the
fundamental loop statement:
while cond do statement
The while statement continually executes its single statement as long as
the condition is true. It may not execute the statement at all if the
condition is false when the loop is first entered.
REPEAT STATEMENT
A repeat statement executes a block of statements one or more times:
repeat statement [; statement]... until cond
It will execute the block of statements as long as the condition is false.
The statement block will always be executed at least once.
FOR STATEMENT
The for statement executes a statement a fixed number of times:
for i := lower to upper do statement
for i := upper downto lower do statement
The for statement executes the target statement as long as the "control
variable" lies within the set range of lower..upper. It may not execute
at all if lower > upper.
The control variable in a for is special, and it must obey several rules:
1. It must be ordinal.
2. It must be local to the present block (declared in the present block).
3. It must not be "threatened" in the executed statement. To threaten means
to modify, or give the potential to modify, as in passing as a VAR parameter
to a procedure or function (see below).
CASE STATEMENT
The case statement defines an action to be executed on each of the values of
an ordinal:
case x of
c1: statement;
c2: statement;
...
end;
The "selector" is an expression that must result in an ordinal type. Each of
the "case labels" must be type compatible with the selector. The case
statement will execute one, and only one, statement that matches the current
selector value. If the selector matches none of the cases, then an error
results. It is NOT possible to assume that execution simply continues if none
of the cases are matched. A case label MUST match the value of the selector.
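Since standard Pascal has no default case, a common idiom is to guard the case
statement with a set membership test. A minimal sketch, with arbitrary labels:

```pascal
program casedemo(output);
var x: integer;
begin
  for x := 0 to 3 do
    if x in [1, 2] then           { only enter the case for known labels }
      case x of
        1: writeln('one');
        2: writeln('two')
      end
    else
      writeln('other')            { values the case would not accept }
end.
```

The set in the guard must list exactly the case labels, so the two have to be
kept in step by hand.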
GOTO STATEMENT
The goto statement directly branches to a given labeled statement:
goto 123
...
123:
Several requirements exist for gotos:
1. The goto label must have been declared in a label declaration.
2. A goto cannot jump into any one of the structured statements above
(if, while, repeat, for or case statements).
3. If the target of the goto is in another procedure or function,
that target label must be in the "outer level" of the procedure or function.
That means that it may not appear inside any structured statement at all.
COMPOUND STATEMENT
A statement block gives the ability to make any number of statements appear
as one:
begin statement [; statement]... end
All of the above statements control only one statement at a time, with the
exception of repeat. The compound statement allows the inclusion of a whole
substructure to be controlled by those statements.
PROCEDURES AND FUNCTIONS
When you need to use a block of the same statements several times, a
compound block can be turned into a procedure or function and given a name:
procedure x;
begin
...
end;
function y: integer;
begin
...
end;
Then, the block of statements can be called from anywhere:
var i: integer;
x; { calls the procedure }
i := y; { calls the function }
The difference between a procedure and a function is that a function returns
a result, which can only be a basic or pointer type (not structured). This
makes it possible to use a function in an expression. In a function, the
result is returned by a special form of the assignment statement:
function y: integer;
begin
...
y := 1 { set function return }
end;
The assignment is special because only the name of the function appears on
the left hand side of ":=". It does not matter where the function return
assignment appears in the function, and it is even possible to have multiple
assignments to the function, but AT LEAST one such assignment must be executed
before the function ends.
If the procedure or function uses parameters, they are declared as:
procedure x(one: integer; two, three: char);
begin
...
end;
The declaration of a parameter is special in that only a type name may be
specified, not a full type specification.
Once appearing in the procedure or function header, parameters can be
treated as variables that just happen to have been initialized to the value
passed to the procedure or function. The modification of parameters has no
effect on the original parameters themselves. Any expression that is
assignment compatible with the parameter declaration can be used in place
of the parameter during its call (here i is an integer variable):
x(i*3, 'a', succ('a'));
If it is desired that the original parameter be modified, then a special form
of parameter declaration is used:
procedure x(var y: integer);
begin
y := 1
end;
Declaring y as a VAR parameter means that y will stand for the original
parameter, including taking on any values given it:
var q: integer;
...
x(q);
Would change q to have the value 1. In order to be compatible with a VAR
parameter, the passed parameter must be of identical type to the parameter
declaration, and be a variable reference.
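The difference between the two parameter modes can be seen in this sketch. The
procedures bump and swap are invented for the purpose:

```pascal
program pardemo(output);
var a, b: integer;

procedure bump(n: integer);           { value parameter: works on a copy }
begin
  n := n + 1
end;

procedure swap(var x, y: integer);    { VAR parameters: stand for the originals }
var t: integer;
begin
  t := x; x := y; y := t
end;

begin
  a := 1; b := 2;
  bump(a);
  writeln(a:1);                       { a is unchanged }
  swap(a, b);
  writeln(a:1, ' ', b:1)              { a and b have traded values }
end.
```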
Finally, Pascal provides a special mode of parameter known as a procedure or
function parameter which passes a reference to a given procedure or function:
procedure x(procedure y(x, q: integer));
...
procedure z(function y: integer);
...
To declare a procedure or function parameter, you must give its full
parameter list, including a function result if it is a function.
A procedure or function is passed to a procedure or function by just its
name:
procedure r(a, b: integer);
begin
...
end;
begin
x(r); { pass procedure r to procedure x }
...
The parameter list for the procedure or function passed must be "congruent"
with the declared procedure or function parameter declaration. This means
that all its parameters, and all of the parameters of its procedure or
function parameters, etc., must match the declared parameter. Once the
procedure or function has been passed, it is then ok for the procedure or
function that accepts it to use it:
procedure x(procedure y(x, q: integer));
begin
y(1, 2);
...
Would call r with parameters 1 and 2.
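A complete sketch of a function parameter in use is given below. All the
names are hypothetical. It assumes a compiler that accepts the ISO 7185 syntax
for procedural parameters; some popular compilers only offer a non-standard
"procedural type" syntax for the same idea:

```pascal
program procpar(output);

function square(n: integer): integer;
begin
  square := n * n
end;

function twice(n: integer): integer;
begin
  twice := 2 * n
end;

{ table accepts any function congruent with f and prints f(1)..f(3) }
procedure table(function f(n: integer): integer);
var i: integer;
begin
  for i := 1 to 3 do write(f(i):3);
  writeln
end;

begin
  table(square);
  table(twice)
end.
```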
Procedures and functions can be declared in advance of the actual appearance
of the procedure or function block using the forward keyword:
procedure x(a, b: integer); forward;
procedure y;
begin
x(1, 2)
...
end;
procedure x;
begin
...
The forward keyword replaces the appearance of the block in the first
appearance of the declaration. In the second appearance, only the name of
the procedure appears, not its header parameters. Then the block appears
as normal.
The advance declaration allows mutually recursive structuring of procedure
and function calls that would otherwise not be possible.
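A classic (if inefficient) illustration is a pair of functions that call each
other. The names iseven and isodd are invented for this sketch:

```pascal
program fwddemo(output);

function isodd(n: integer): boolean; forward;

function iseven(n: integer): boolean;
begin
  if n = 0 then iseven := true
  else iseven := isodd(n - 1)      { isodd is already known here }
end;

function isodd;                    { header parameters are not repeated }
begin
  if n = 0 then isodd := false
  else isodd := iseven(n - 1)
end;

begin
  if iseven(10) then writeln('10 is even');
  if isodd(7) then writeln('7 is odd')
end.
```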
PREDEFINED PROCEDURES AND FILE OPERATIONS
A file is not accessed directly (as an array is). Instead, Pascal
automatically declares one component of the file's component type which is
accessed by special syntax:
f^
So that:
f^ := 1;
Assigns to the file "buffer" component, and:
v := f^;
Reads the file buffer. Unless the file is empty or you are at the end of the
file, the file buffer component will contain the contents of the component
at the file location you are currently reading or writing. Other than that,
the file buffer behaves as an ordinary variable, and can even be passed as
a parameter to routines.
The way to actually read or write through a file is by using the predeclared
procedures:
get(f);
Loads the buffer variable with the next element in the file, and advances the
file position by one element, and:
put(f);
Outputs the contents of the buffer variable to the file and advances the file
position by one. These two procedures are really all you need to implement
full reading and writing on a file. It also has the advantage of keeping the
next component in the file as a "lookahead" mechanism.
However, it is much more common to access files via the predefined procedures
read and write:
read(f, x);
Is equivalent to:
x := f^; get(f);
And:
write(f, x);
Is equivalent to:
f^ := x; put(f);
Read and write are special in that any number of parameters can appear:
read(f, x, y, z, ...);
write(f, x, y, z, ...);
The parameters to read must be variable references. The parameters to write
can be expressions of matching type, except for the file parameter (files
must always be VAR references).
Writing to a file is special in that you cannot write to a file unless you
are at the end of the file. That is, you may only append new elements to the
end of the file, not modify existing components of the file.
Files are said to exist in three "states":
1. Inactive.
2. Read.
3. Write.
All files begin life in the inactive state. For a file to be read from, it
must be placed into the read state. For a file to be written, it must be
placed in the write state. The reset and rewrite procedures do this:
reset(f);
Places the buffer variable at the 1st element of the file (if it exists), and
sets the file mode to "read".
rewrite(f);
Clears any previous contents of the file, and places the buffer variable at
the start of the file. The file mode is set to "write".
A file can be tested for only one kind of position, that is if it has reached
the end:
eof(f);
Is a function that returns true if the end of the file has been reached. eof
must be true before the file can be written.
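Putting these pieces together, the sketch below writes a few integers to a
file and reads them back using only the buffer variable, get and put. It
assumes an ISO 7185 compiler, where a file that is not named in the program
header is a temporary file; other compilers may first require a file name to
be attached by some non-standard means:

```pascal
program filedemo(output);
var f: file of integer;
    i, sum: integer;
begin
  rewrite(f);                  { write state, file emptied }
  for i := 1 to 5 do begin
    f^ := i * i;               { fill the buffer variable }
    put(f)                     { append it to the file }
  end;
  reset(f);                    { read state, buffer holds the 1st component }
  sum := 0;
  while not eof(f) do begin
    sum := sum + f^;           { the buffer is the current component }
    get(f)                     { advance to the next one }
  end;
  writeln(sum:1)               { 1 + 4 + 9 + 16 + 25 }
end.
```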
PREDEFINED PROCEDURES AND TEXT FILES
As alluded to before, text files are treated specially under Pascal. First,
the ends of lines are treated specially. If the end of a line is reached,
a read call will just return a space. A special function is required to
determine if the end of the line has been reached:
eoln(f);
Returns true if the current file position is at the end of a line. Pascal
strictly enforces the following structure to text files:
line 1
line 2
...
line N
There will always be an eoln terminating each line. If the file being read
does not have an eoln on the last line, it will be added automatically.
Besides the standard read and write calls, two procedures are special to text
files:
readln(f...);
writeln(f...);
Readln behaves as a normal read, but after all the items in the list are
read, the rest of the line is skipped until eoln is encountered.
Writeln behaves as a normal write, but after all the items in the list are
written, an eoln is appended to the output.
Text files can be treated as simple files of characters, but it is also
possible to read and write other types to a text file. Integers and reals can
be read from a text file, and integers, reals, booleans and strings can be
written to text files. These types are written or read from the file by
converting them to or from a character based format. The format for integers
on read must be:
[+/-]digit[digit]...
Examples:
9
+56
-19384
The format for reals on read is:
[+/-]digit[digit]...[.digit[digit]...][e[+/-]digit[digit]...]
Examples:
-1
-356.44
7e9
+22.343e-22
All blanks are skipped before reading the number. Since eolns are defined as
blanks, this means that even eoln is skipped to find the number. This can
lead to an interesting situation when a number is read from the console. If
the user presses return without entering a number (on most systems), nothing
will happen until a number is entered, no matter how many times return is
hit !
Write parameters to textfiles are of the format:
write(x[:field[:fraction]]);
The field states the number of character positions that you expect the object
to occupy. The fraction is special to reals. The output formats that occur
in each case are:
integer: The default field for integers is implementation defined, but
is usually the number of digits in maxint, plus a position for the sign.
If a field is specified, and is larger than the number of positions
required to output the number and sign, then blanks are added to the left
side of the output until the total size equals the field width. If the
field width is less than the required positions, the field width is
ignored.
real: The default field for reals is implementation defined. There are
two different format modes depending on whether the fraction parameter
appears.
If there is no fraction, the format is:
-0.0000000e+000
Starting from the left, the sign is either a "-" sign if the number is
negative, or blank if the number is positive or zero. Then the first digit
of the number, then the decimal point, then the fraction of the number,
then either 'e' or 'E' (the case is implementation defined), then the sign
of the exponent, then the digits of the exponent. The number of digits in
the exponent are implementation defined, as are the number of digits in
a fraction if no field width is defined. If the field width appears, and
it is larger than the total number of required positions in the number
(all the characters in the above format without the fraction digits),
then the fraction is expanded until the entire number fills the specified
field, using right hand zeros if required. Otherwise, the minimum required
positions are always printed.
If a fraction appears (which means the field must also appear), the format
used is:
[-]00...00.000..00
The number is converted to its whole number equivalent, and all of the
whole number portion of the number is printed, regardless of the field size,
preceded by "-" if the number is negative. Then, a decimal point appears,
followed by the number of fractional digits specified in the fraction
parameter. If the field is greater than the number of required positions
and specified fraction digits, then leading spaces are added until the
total size equals the field width. The minimum positions and the specified
fractional digits are always printed.
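The rules for integers and for reals with a fraction can be tried out with a
few lines. The brackets are only there to make the padding visible, and the
values are arbitrary:

```pascal
program fmtdemo(output);
begin
  writeln('[', 42:5, ']');          { padded on the left to 5 positions }
  writeln('[', 12345:2, ']');       { field too small, so it is ignored }
  writeln('[', -7:4, ']');          { the sign counts as a position }
  writeln('[', 3.14159:8:2, ']');   { 2 fraction digits in 8 positions }
  writeln('[', 2.5:1:1, ']')        { minimum positions are always printed }
end.
```

The format without a fraction is left out of the sketch because the number of
exponent digits it prints is implementation defined.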
HEADER FILES
The header files feature was originally designed to be the interface of
Pascal to the external file system, and as such is inherently implementation
defined. It is also (unfortunately) ignored in most implementations.
The header files appear as a simple list of identifiers in the program
header:
program test(input, output, source, object);
Each header file automatically assumes the type text. If the file needs
to be another type, it should be declared again in the variables section
of the program block:
program test(intlist);
var intlist: file of integer;
Two files are special, and should not be redeclared. These are input and
output. These files are understood to represent the main input and
main output of the program, and are present in all Pascal programs.
In addition, they are the default files in special forms of these
procedures and functions:
This form            is equivalent to        This form
--------------------------------------------------------------
write(...)                                   write(output, ...)
writeln(...)                                 writeln(output, ...)
writeln                                      writeln(output)
read(...)                                    read(input, ...)
readln(...)                                  readln(input, ...)
readln                                       readln(input)
eof                                          eof(input)
eoln                                         eoln(input)
PACKING PROCEDURES
Because arrays are incompatible with each other even when they are of the
same type if their packing status differs, two procedures allow a packed
array to be copied to a non-packed array and vice versa:
unpack(PackedArray, UnpackedArray, Index);
Unpacks the packed array and places the contents into the unpacked array.
The index gives the starting index of the unpacked array where the data
is to be placed. Interestingly, the two arrays need not have the same index
type or even be the same size ! The unpacked array must simply have enough
elements after the specified starting index to hold the number of elements
in the packed array.
pack(UnpackedArray, Index, PackedArray);
Packs part of the unpacked array into the packed array. The index again gives
the starting position to copy data from in the unpacked array. Again, the
arrays need not be of the same index type or size. The unpacked array simply
needs enough elements after the index to provide all the values in the packed
array.
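The sketch below moves data both ways between two arrays with different index
types and sizes. The array names and bounds are made up, and it assumes a
compiler that implements the standard pack and unpack procedures (not all do):

```pascal
program packdemo(output);
var u: array [0..9] of char;           { unpacked, indexes 0..9 }
    p: packed array [1..4] of char;    { packed, indexes 1..4 }
    i: integer;
begin
  for i := 0 to 9 do u[i] := '.';
  p := 'abcd';
  unpack(p, u, 3);                     { u[3]..u[6] receive p[1]..p[4] }
  for i := 0 to 9 do write(u[i]);
  writeln;
  pack(u, 5, p);                       { p[1]..p[4] receive u[5]..u[8] }
  writeln(p)
end.
```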
DYNAMIC ALLOCATION
In Pascal, pointer variables are limited in the kind of variable they can
reference: only variables created dynamically. The objects referenced by
pointer types are anonymous, and created or destroyed by the programmer at
will. A pointer variable is undefined when it is first used, and it is an
error to access the variable it points to unless that variable has been
created:
var p: ^integer;
...
new(p); { create a new integer variable }
p^ := 1; { place value }
Would create a new variable. Variables can also be destroyed:
dispose(p);
Would release the storage allocated to the variable. It is an error (a very
serious one) to access the contents of a variable that has been disposed.
A special syntax exists for the allocation of variant records:
type rec = record
a: integer;
case b: boolean of
true: (c: integer);
false: (d: char)
{ end }
end;
var p: ^rec;
...
new(p, true);
...
dispose(p, true);
For each of new and dispose, the tagfield values we want to discriminate
appear as parameters to the procedure. The appearance of the tagfield values
allows the compiler to allocate a variable with only the amount of space
required for the record with that variant. This can allow considerable storage
savings if used correctly.
The appearance of a discriminant in a new procedure does not also
automatically SET the value of the tagfield. You must do that yourself. For
the entire life of the variable, you must not set the tagfield to any other
value than the value used in the new procedure, nor access any of the
variants in the record that are not active.
The dispose statement should be called with the exact same tagfield values
and number.
Note that ALL the tagfields in a variable need not appear, just all the ones,
in order, that we wish to allocate as fixed.
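The rules above can be sketched as a complete program. The type name and
the values stored are made up for the example, and it assumes a compiler that
accepts the tagfield form of new and dispose:

```pascal
program vardemo(output);
type rec = record
              a: integer;
              case b: boolean of
                 true:  (c: integer);
                 false: (d: char)
           end;
var p: ^rec;
begin
   new(p, true);      { allocate space for the "true" variant only }
   p^.b := true;      { new does not set the tagfield, so we do it }
   p^.a := 1;
   p^.c := 42;        { only the active variant may be accessed }
   writeln(p^.a:1, ' ', p^.c:1);
   dispose(p, true)   { same tagfield value as given to new }
end.
```

Note that the tagfield is set to the same value that was given to new, and
is never changed afterwards.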
*****************************************************************************
Q. WHAT ARE SOME STANDARD METHODS AND HINTS FOR PASCAL ?
A. There are several techniques in Standard Pascal that are not obvious.
STRINGS
Pascal and C have one feature in common: they include no basic support for
handling of strings of characters. Strings, as implemented in Basic and
other languages, are a "high cost" data element, mainly because a lot
of character copying must occur within the string functions. Most
"professional" (i.e., used for paid programming) languages choose to leave
creation of string handling up to the programmer. The reason is that in many
or even most cases, applying primitives to simple arrays of characters can
achieve the same result at less expense.
However, one big difference between C and Pascal is that C allows variable
length string passage, which greatly facilitates creation of general purpose
string handling routines, and manipulation of string constants.
In Pascal, by contrast, you must declare a string as a fixed length array:
var string: packed array [1..50] of char;
Which means that all of your strings must have the same length as the handler
routines expect. Further, for any reasonable size of string, assigning
string constants to strings is prohibitive:
string := 'hello, world                                      ';
And this is a short example ! More commonly, strings must be 100-200
characters in length, so the assignment of string constants is just
impractical.
USING "SPACE PADDED" STRINGS
No matter what the length of string, the first and best trick is to make
extensive use of space padded strings:
var s: packed array [1..10] of char;
...
s := 'hello     ';
For example, to read a word from the input:
var i: integer;
s := '          ';
i := 1; { set 1st character in string }
while input^ <> ' ' do begin { read characters }
if i <= 10 then begin { not overflow }
s[i] := input^; { place next character }
i := i+1 { next character }
end;
get(input) { skip to next character, even on overflow }
end;
If the user inputs "one more", s would be "one       ", or the first word
with blank padding. Note the trick used to find the end of the word. Eoln
is returned as a space, which also happens to be the word delimiter, and
in standard Pascal every line must be terminated by an eoln.
Once all your strings are in space padded form, operations on them are
easy:
var s1, s2: packed array [1..10] of char;
...
s1 = s2; { find if strings are equal }
s1 < s2; { find if s1 is lexicographically smaller than s2 }
s1 := s2; { assign strings }
Note that comparing strings for order depends on the value of the space
character. If space has a smaller ordinal value than all other characters,
then "ab " is going to be less than "abc ". If space has a
larger value than the others, the opposite would be true. In ASCII, the space
is the lowest value printable character, and this indeed gives the lexicographical
sort that is most popular (i.e., that shorter character sequences go first).
Space padded strings even work for strings with embedded spaces ! A string
like:
s := 'this is a very long string ';
Would be equal to "this is a very long string" without the trailing blanks.
This works because, reading left to right as we do, any spaces after the
text are unimportant (and typically will not make any difference to
printout). Putting text through processing using blank padded strings
will have the effect of trimming all the trailing blanks off the text,
often a desirable side effect.
To find the length of a blank padded string:
var e: integer;
s: packed array [1..100] of char;
...
e := 100; { set maximum }
{ find end of string }
while (e > 1) and (s[e] = ' ') do e := e-1;
Will set e to the index of the last character of the string, or to 1 if the
string is empty. A check for a blank string need not be:
s = ' ';
or similar, but simply:
s[1] = ' ';
Because if the first character is a space, the entire string is usually empty
as well.
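Here is one way the "find the end" loop might be wrapped up as a function.
The names pstring and padlen are my own inventions for the example, and
unlike the bare loop, this version returns 0 for an all blank string so that
the empty case can be told apart from a one character string:

```pascal
program lendemo(output);
const strlen = 10;
type pstring = packed array [1..strlen] of char;
var s: pstring;
    i: integer;

{ return the number of characters before the trailing blanks }
function padlen(var s: pstring): integer;
var e: integer;
begin
   e := strlen;                                  { set maximum }
   while (e > 1) and (s[e] = ' ') do e := e-1;   { back up over blanks }
   if s[e] = ' ' then e := 0;                    { string was all blank }
   padlen := e
end;

begin
   s := 'hello     ';
   writeln(padlen(s):1);
   for i := 1 to strlen do s[i] := ' ';          { clear the string }
   writeln(padlen(s):1)
end.
```

The string is passed as a var parameter only to avoid copying it on each
call.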
USE "CUSTOM" STRING SIZES
The best way to achieve "tight" code in standard Pascal is to use custom
strings for given tasks. For instance, if you are going to input a command
from the user, and use that to look up a string in a table, you can create
a string type based on the maximum length of command string that you are
going to need. Then you can create a few custom handling procedures for
that string type. This works best if you use constant declarations to
declare the length of the string:
const strlen = 100;
type mystring = packed array [1..strlen] of char;
This way, if you must change the string size later, there is less difficulty.
Although these kinds of solutions may create a larger program listing, the
result is usually faster than using a general purpose string library, and
the cost saved by not including the general string library may pay for
any duplication of effort.
CLEARING STRINGS
Clearing long strings is best done as:
const strlen = 100;
var i: integer;
s: packed array [1..strlen] of char;
...
for i := 1 to strlen do s[i] := ' ';
This code will usually take less space than spelling out a long string of
blanks, take the same amount of CPU time (after all, a string assign uses
a copy loop as well), and most importantly, won't have to be changed if
we change the string lengths in the program.
INITIALIZING STRINGS
If you have to have both long strings and also occasionally set these
strings to a known constant, it helps to create a procedure to initialize
them. You simply find what the longest constant string you are going
to assign to the variable strings, then create a special string type
for that. Finally, you create a custom procedure to do the copy:
const strlen = 250; { our "big" string }
cstlen = 12; { our constant strings }
type string = packed array [1..strlen] of char;
cstring = packed array [1..cstlen] of char;
var s: string;
procedure inistr(var s: string; c: cstring);
var i: integer;
begin
for i := 1 to strlen do s[i] := ' '; { clear result }
for i := 1 to cstlen do s[i] := c[i] { place string }
end;
...
inistr(s, 'hello, world');
"REAL" STRINGS
The fact that Pascal does not have built-in strings does not prevent you from
implementing them yourself:
type string = record
len: integer;
str: packed array [1..100] of char;
end;
Then you can go ahead and define a full set of procedures to concatenate,
find substrings, etc., within a string.
The principal drawback here is again, initializing strings. But using the
"initialize procedure" method, in combination with the "find end of blank
string" method, it is easy to create a procedure that will do the job:
var s: string;
...
inistr(s, 'mystring    ');
Here inistr would find the exact length of the string by blank termination,
then assign that to the string length, and place the string body.
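A minimal sketch of such a procedure follows. The type names rstring and
cstring, and the 12 character size for constants, are assumptions made for
the example rather than anything fixed by the language:

```pascal
program rstrdemo(output);
const maxlen = 100;   { capacity of a string }
      cstlen = 12;    { length of the constants we pass in }
type cstring = packed array [1..cstlen] of char;
     rstring = record
                  len: integer;   { number of characters in use }
                  str: packed array [1..maxlen] of char
               end;
var s: rstring;
    i: integer;

{ set s from a blank padded constant, dropping the trailing blanks }
procedure inistr(var s: rstring; c: cstring);
var i, e: integer;
begin
   e := cstlen;
   while (e > 1) and (c[e] = ' ') do e := e-1;   { find end of text }
   if c[e] = ' ' then e := 0;                    { constant was all blank }
   s.len := e;                                   { set string length }
   for i := 1 to e do s.str[i] := c[i]           { place string body }
end;

begin
   inistr(s, 'mystring    ');
   writeln(s.len:1);
   for i := 1 to s.len do write(s.str[i]);
   writeln
end.
```

Characters of str beyond len are left undefined, so the other string
procedures must never look past len.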
FINDING AN ENUMERATED VALUE FROM AN INTEGER
One of the more irritating things in Pascal is converting an integer to an
enumerated type. Even though there is a one to one correspondence between
integers and enumerated values, and you can find the integer value of any
enumerated value with ord, you cannot go the other way easily. You might
use a case statement or similar kludge.
But there is a trick you can use that takes a bit of space, but incurs very
little speed penalty and is as easy to use sourcewise as the hypothetical
"unord" function would be:
type enum = (one, two, three, four, five, six, seven, eight, nine, ten);
var ei: enum;
etran: array [0..9] of enum;
begin
{ initialize translation array }
for ei := one to ten do etran[ord(ei)] := ei;
...
ei := etran[5]; { ord(six) = 5, so this yields six }
Now translating an integer back to the enumerated type is simply a fast array
lookup. The price for this is the size of the translation array, and the fact
that you must declare the translation array with a number that depends on the
size of enum.
CREATING CONSTANT TABLES
Standard Pascal has no capability to create tables of fixed data at compile
time. But your compiler may be able to turn assigns into preinitialized data
if it can determine that the assign will ABSOLUTELY happen before anything
else. The best way to do this is to perform all such assigns first:
var table: record
a: integer;
b: integer;
c: char
end;
begin
table.a := 1;
table.b := 45;
table.c := 'x';
...
Typically this should only be done at the program block level. Doing this
gives the compiler the maximum chance to perform this optimization.
CREATING YOUR OWN DYNAMIC VARIABLE RECYCLING
You may be able to do a better job of dynamic variable recycling than the
standard "dispose" routine. Getting and putting a lot of variable length
blocks tends to create "fragmentation" difficulties. You can illustrate
this problem fairly simply. If you are using two different data types
in dynamic storage, one of 100 bytes length, and the other 200 bytes in
length, and these are created and disposed at random, the standard new
and dispose procedures may be taking back the 200 byte blocks and breaking
them down into 100 byte blocks to satisfy the calls for that size of
variable. The result is that eventually, the call to new for a 200 byte
block may fail, even though plenty of storage still exists, because it is
all broken into 100 byte blocks.
I have found that in many cases, it is better to hold on to unused blocks,
and recycle them yourself:
program mine(input, output);
type byte = 0..255;
blkptr = ^block;
block = record
next: blkptr;
data: array [1..100] of byte
end;
var freblk: blkptr; { the free block list }
{ get a new block }
procedure getblk(var p: blkptr); { returns the block }
begin
if freblk <> nil then begin { recycle existing block }
p := freblk; { index the top block }
freblk := freblk^.next { remove from list }
end else new(p) { otherwise create a new one }
end;
{ put an unused block }
procedure putblk(p: blkptr); { block to dispose of }
begin
p^.next := freblk; { insert to free list }
freblk := p
end;
end;
begin
freblk := nil; { clear free block list }
...
This system reduces fragmentation, because blocks are reserved for a
particular use, and not broken down into smaller parts. It tends to be
storage efficient, because programs typically do the same sort of work
over and over again. That is, if you needed to get N blocks of X type, this
means that X type blocks will be used a lot in the run of the program
(although obviously there are programs that break that rule).
I add "meters" to count the number of blocks going in and out of the free
list to tell me how the system is performing in real life.
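The meters can be as simple as a pair of counters. In this sketch the
counter names newcnt and reucnt are invented for the example, and the block
contents are arbitrary:

```pascal
program meters(output);
type blkptr = ^block;
     block = record
                next: blkptr;
                data: array [1..100] of char
             end;
var freblk: blkptr;     { the free block list }
    newcnt: integer;    { meter: blocks obtained from new }
    reucnt: integer;    { meter: blocks recycled from the free list }
    p, q: blkptr;

procedure getblk(var p: blkptr);
begin
   if freblk <> nil then begin   { recycle existing block }
      p := freblk;
      freblk := freblk^.next;
      reucnt := reucnt+1
   end else begin                { otherwise create a new one }
      new(p);
      newcnt := newcnt+1
   end
end;

procedure putblk(p: blkptr);
begin
   p^.next := freblk;            { insert to free list }
   freblk := p
end;

begin
   freblk := nil; newcnt := 0; reucnt := 0;
   getblk(p); getblk(q);         { two fresh blocks }
   putblk(p); putblk(q);
   getblk(p);                    { served from the free list }
   writeln('new: ', newcnt:1, ' recycled: ', reucnt:1)
end.
```

A high recycled count relative to the new count suggests the free list is
earning its keep.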
CREATING VARIABLE LENGTH ARRAYS
If you require variable length arrays in standard Pascal, what do you do ?
For example, if you are going to create a text editor, and want to store
variable length lines, but you don't want to place a limit on the length of
a line. You need a variable length string type to contain each line. The
answer is that you can create variable length arrays yourself ! The secret
is to chain dynamically allocated arrays together to make a larger array:
type
blkptr = ^block;
block = record
next: blkptr;
data: packed array [1..10] of char
end;
var p: blkptr;
{ allocate string in terms of blocks }
procedure alcblk(var p: blkptr; { returns string allocated }
l: integer); { length of string }
var t: blkptr;
begin
p := nil; { clear result list }
while l > 0 do begin { allocate blocks }
new(t); { get new block }
t^.next := p; { link into list }
p := t;
l := l-10 { count that block }
end
end;
{ get character from string }
function getblk(p: blkptr; { string to fetch character from }
i: integer) { index to get character from }
: char; { returns character from index }
begin
while i > 10 do
begin p := p^.next; i := i-10 end; { index proper block }
getblk := p^.data[i] { return resulting character }
end;
{ place string character }
procedure putblk(p: blkptr; { string to put character to }
i: integer; { index to put character to }
c: char); { character to place }
begin
while i > 10 do
begin p := p^.next; i := i-10 end; { index proper block }
p^.data[i] := c { place character }
end;
This technique could be used on any array type. Although it looks pretty
horrible, the system can keep any number of any length of string, such as
an advanced editor would do, and the fact that all strings are broken down
into fixed length "quanta" keeps storage fragmentation to a minimum, or even
eliminates it entirely (an important attribute for an editor). The
processing cost of the system can be lessened by pulling the variable length
data to/from a large buffer, and working on it there. But of course, this
would limit the length of data you can work on.
PERFORMING BITWISE BOOLEAN FUNCTIONS
If you have to perform bitwise boolean operations on integers with a standard
compiler that does not have bitwise operators, you can create them.
First, if you know that the same bits are not set in two words, then you can
just add them to get "or" functionality:
i := i+64; { set bit 7 }
You can also mask off bits in an integer using "div" and "mod":
i := i mod 256; { is equivalent to i and $ff }
i := i div 256*256; { is equivalent to i and not $ff }
For generalized boolean operations you can use:
function bor(a, b: integer): integer;
var i, r, p: integer;
begin
r := 0; { clear result }
p := 1; { set 1st power }
for i := 1 to maxbit do begin
if odd(a) or odd(b) then r := r+p; { add in power }
a := a div 2; b := b div 2; { move bits down }
if i < maxbit then p := p*2 { find next power }
end;
bor := r { return result }
end;
function band(a, b: integer): integer;
var i, r, p: integer;
begin
r := 0; { clear result }
p := 1; { set 1st power }
for i := 1 to maxbit do begin
if odd(a) and odd(b) then r := r+p; { add in power }
a := a div 2; b := b div 2; { move bits down }
if i < maxbit then p := p*2 { find next power }
end;
band := r { return result }
end;
function bxor(a, b: integer): integer;
var i, r, p: integer;
begin
r := 0; { clear result }
p := 1; { set 1st power }
for i := 1 to maxbit do begin
if odd(a) <> odd(b) then r := r+p; { add in power }
a := a div 2; b := b div 2; { move bits down }
if i < maxbit then p := p*2 { find next power }
end;
bxor := r { return result }
end;
These assume a 32 bit integer, but can be set to any integer length. Note
that the sign bit is specifically left out of the operation.
You can find the value of maxbit (the number of bits in an integer) as:
var i, x: integer;
...
x := maxint;
i := 0;
while x <> 0 do begin x := x div 2; i := i+1 end;
This won't count the sign bit, which is correct for the above routines. This
should be done only once, when the program starts up.
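To tie the pieces together, here is a small self-contained sketch that
finds maxbit at startup and then tries an exclusive or and the two masking
tricks. The procedure name fndbit and the test values are my own, and the
operands are assumed to be non-negative:

```pascal
program bitdemo(output);
var maxbit: integer;   { number of bits in an integer, less the sign }

{ find the number of non-sign bits in an integer }
procedure fndbit;
var x: integer;
begin
   x := maxint; maxbit := 0;
   while x <> 0 do begin x := x div 2; maxbit := maxbit+1 end
end;

{ bitwise exclusive or of two non-negative integers }
function bxor(a, b: integer): integer;
var i, r, p: integer;
begin
   r := 0; p := 1;
   for i := 1 to maxbit do begin
      if odd(a) <> odd(b) then r := r+p;   { add in power }
      a := a div 2; b := b div 2;          { move bits down }
      if i < maxbit then p := p*2          { next power, without overflow }
   end;
   bxor := r
end;

begin
   fndbit;                        { done once, at startup }
   writeln(bxor(12, 10):1);       { binary 1100 xor 1010 }
   writeln(300 mod 256:1);        { low 8 bits of 300 }
   writeln(300 div 256*256:1)     { 300 with the low 8 bits cleared }
end.
```

The test on i before doubling p keeps the last pass through the loop from
computing a power of two larger than maxint.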
READING AND WRITING INTEGERS TO A BYTE FILE
Often you must deal with files that randomly mix different types of data that
are not amenable to declaration as a file of records.
If you read and write large integers using only standard constructs (and not
"type changing" using variant records), you may find it easier to use a
format known as "signed magnitude" than to try to write the number using
the 2's complement format used by the CPU. In signed magnitude, the sign
bit is determined and written out separately from the value of the integer:
{ write integer to byte file }
procedure wrtint(var f: bytfil; { file to write to }
i: integer); { integer to write }
var s: byte; { sign holder }
begin
{ remove sign and save }
if i < 0 then s := 128 else s := 0;
i := abs(i); { remove sign }
write(f, i div 16777216+s); { output high byte with sign }
write(f, i div 65536 mod 256); { output high middle }
write(f, i div 256 mod 256); { output low middle }
write(f, i mod 256) { output low byte }
end;
{ read integer from byte file }
procedure rdint(var f: bytfil; { file to read from }
var i: integer); { integer to read }
var s: boolean; { sign holder }
b: byte;
begin
s := false; { set no sign }
read(f, b); { get high byte }
if b >= 128 then begin s := true; b := b-128 end;
i := b*16777216;
read(f, b); { get high middle }
i := i+b*65536;
read(f, b); { get low middle }
i := i+b*256;
read(f, b); { get low byte }
i := i+b;
if s then i := -i { add back sign }
end;
These routines handle 32 bit integers written in "big endian" format (high
order bytes first). Note that the sign is written and read as bit 32, the
same place as the normal sign. Note that the only value you lose this way is
$80000000, which is an invalid value under standard Pascal anyway.
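The arithmetic can be checked without involving a file at all. This sketch
splits an integer into a four byte array and joins it back again, using the
same signed magnitude layout. The names quad, split and join are invented
for the example, and 32 bit integers are assumed:

```pascal
program smdemo(output);
type byte = 0..255;
     quad = array [1..4] of byte;   { four bytes, high order first }
var q: quad;
    n: integer;

{ split integer into signed magnitude bytes }
procedure split(i: integer; var q: quad);
var s: integer;                     { sign holder }
begin
   if i < 0 then s := 128 else s := 0;
   i := abs(i);                     { remove sign }
   q[1] := i div 16777216+s;        { high byte with sign }
   q[2] := i div 65536 mod 256;     { high middle }
   q[3] := i div 256 mod 256;       { low middle }
   q[4] := i mod 256                { low byte }
end;

{ join signed magnitude bytes into integer }
procedure join(var q: quad; var i: integer);
var s: boolean;                     { sign holder }
    b: integer;
begin
   b := q[1];
   s := b >= 128;                   { pick off the sign }
   if s then b := b-128;
   i := ((b*256+q[2])*256+q[3])*256+q[4];
   if s then i := -i                { add back sign }
end;

begin
   split(-258, q);
   writeln(q[1]:1, ' ', q[2]:1, ' ', q[3]:1, ' ', q[4]:1);
   join(q, n);
   writeln(n:1)
end.
```

The high byte comes out as 128 because the magnitude 258 fits in the low two
bytes, leaving only the sign in the top byte.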
*****************************************************************************
Q. WHAT ARE SOME GOOD CHARACTERISTICS OF A STANDARD PASCAL COMPILER ?
A. Just following the standard is not enough to create a useable compiler.
Obviously freedom from limits (except the limit of available memory) is
a good attribute of a compiler, as well as producing the best code
possible. Other points to look for:
1. Should be able to represent "set of char" without problem. This is really
a must. For a while, it was under consideration to add this requirement to
the standard. It wasn't, but most character handling programs I have seen
rely on being able to use character sets, so you might as well consider it
part of the standard.
2. Represents 8 bit characters. I have found it best if the compiler leaves
it entirely up to you what is done with the 8th bit (of ASCII or ISO
character sets). This allows you to either deal with parity, or manipulate
256 valued extended character sets (like IBM ASCII).
3. "file of char" is not the same as "text". This is a somewhat obscure
point. Normal "text" files are "filtered" by Pascal I/O. Line endings are
made regular, and eolns are converted to spaces. But there may be times
that you want to talk in terms of the exact characters themselves, read
and write carriage returns and line feeds directly, etc. For example, when
reading direct from the computer console, you may want to see directly if
the user hits the "return key".
There is no requirement in the standard that "file of char" do the same
filtering that "text" does, and better compilers consider "file of char"
a clue to completely get out of the way and pass raw characters to and from
the file. Also, it is common for Pascal systems to buffer up whole lines
when reading from the console (see below). File of char will allow this
buffering to be bypassed.
4. Input from the computer console is buffered. If you have a program like:
program readit(input, output);
var i: integer;
begin
writeln('Input the number: ');
readln(i);
writeln('The number was: ', i:1)
end.
If the console is read directly, and the user makes a mistake on entry,
it will not be possible to back up and correct typing. If the line is
buffered, the user can back up, correct the error, and continue without the
simple program above needing to do anything about it. Without this
capability, you would have to write such "line editing" features in
yourself.
A REALLY GOOD Pascal system would implement a complete set of line editing
features, such as back up, insert characters, etc.
5. Ability to represent a file of bytes. One thing that really amazes me
about some compilers I have seen is an inability to read or write files of
byte value:
type byte = 0..255; { a predefined type on many compilers }
bytfil = packed file of byte;
If your compiler decides that bytes are integers, at whatever the size of
integers are on your machine (16, 32 or 64 bits), it may just read or
write integer size values to the file ! This forces you to break apart the
bytes from an integer yourself, which is not only tedious, but creates a
nonportable program.
6. Does something with the program parameters. Somewhere, at some time, it
became standard to just ignore the program parameters. This is a real shame,
because when implemented, they are really useful for creating quick, short
programs:
program copy(source, destination);
var source, destination: text;
c: char;
begin
reset(source);
rewrite(destination);
while not eof(source) do begin
if eoln(source) then begin
readln(source);
writeln(destination)
end else begin
read(source, c);
write(destination, c)
end
end
end.
Now if the compiler ignores the program header parameters, this program
does nothing, and will probably terminate with an error. But the compiler
can also automatically attach program parameters other than the standard
"input" or "output" to command line parameters, like:
> copy myfile.txt thatfile.txt
So the compiler opens source to "myfile.txt", and destination to
"thatfile.txt".
The result is a program that is completely standard, and yet performs named
file manipulation. Further, this method is much simpler than using the "open
by name" extensions present in most Pascals, and allows you to create simple
example programs faster.
7. Ability to tolerate control characters in source. If the compiler ignores
control characters in the source, such as tabs and form feeds (treats them
as spaces), you will have the freedom to format your source so that it can
be directly printed.
8. Does not have "extended" modes that break the standard rules.
The Pascal standards basically require that the compiler have a "switch"
(command line option) that causes the compiler to accept only standard
Pascal, and rejects any nonstandard constructs. Because some compilers were
originally nonstandard, and were brought into compliance with the standard
after the fact, it is quite possible that they may not accept standard
programs with this switch off ! This is truly unfortunate, since it forces
you to write nonstandard Pascal just to take advantage of extensions to the
language provided by your compiler.
9. Accept Pascal strings for extended file naming procedures. A common
extension to Pascal is a Basic string construct (as opposed to a Pascal
string, which is just an array of characters). Unfortunately, it has become
common to REQUIRE that such strings be used with extended file "open by name"
procedures. A good compiler should accept standard Pascal strings as
arguments to such procedures, which allows you to use the string types
you choose. These procedures should also ignore trailing spaces in such
strings, which allows use of the typical Pascal "space padded string".
10. Implements "lazy I/O". Pascal as originally specified assumes that all
files are batch files (i.e., not "interactive", or connected to a console).
This creates problems, say, reading from the console keyboard, because Pascal
assumes that the first character in a text file is always available.
Lazy I/O will only request the first character from a file when it is
actually used, and is completely compatible with the standard.
11. Can provide "strict packing". Strict packing means that if a record is
marked as "packed":
var r: packed record
a: boolean;
b: integer;
c: char
end;
Then, for example, "a" will only occupy a single bit, and all fields will be
packed into as few bits as possible. This has implications beyond simple
storage savings, because you can use strict packing to emulate ANY data
structure from any language. For example, if you receive the date as packed
into a single word:
-------------------------------------------
| year (0-99) | month (1-12) | day (1-31) |
-------------------------------------------
    7 bits         4 bits        5 bits
This exactly fits in a single 16 bit word ! Then, this can be declared as
a packed record:
var date: packed record
year: 0..99;
month: 1..12;
day: 1..31
end;
And no "format conversion" is required. Note that if the alignment of packed
records to the data structure desired is not perfect, you can insert "shims"
in the form of single bits (boolean values).
*****************************************************************************
Q. WHAT ARE COMMON EXTENSIONS AND METHODS TO PASCAL ?
A. Most compilers include extensions to the basic language, the most popular
(and necessary) extensions being manipulation of external files by name, and
ability to separate program parts into modular form. In fact, a standard of
sorts was created by the popular UCSD Pascal system for these two items.
This is a list of popular extensions, and the form they usually take.
Note that extensions that are part of the extended Pascal standard are
discussed in the extended standard section.
1. Specification of hex constants. Allows you to directly specify hex numbers
in the source:
var a: integer;
...
a := $56;
"$" as the leading character seems to be the most common. Some compilers also
allow binary radix (base 2) and octal radix (base 8).
2. Integer bit boolean operations. Many or even most compilers allow you to
use the "and" and "or" operators on integers:
var a, b: integer;
...
a := a and b;
The result is a bitwise "and" of the two integers. Note that the results when
one or more of the operands is signed varies from compiler to compiler, and
use of these operators with signed integers should be avoided.
Many compilers also include an "xor" operator.
3. File open and close by name. This allows you to open an external file
by name. Many compilers use the UCSD plan:
var f: text;
...
assign(f, 'myfile.txt');
reset(f);
...
In this method, an ASCII name can be "bonded" to the file by the procedure
"assign". If no assign is performed on a file, it is opened as a "temporary"
file (ie., the system just coins a name for the file, and deletes it when
complete) upon "reset" or "rewrite". This allows the system to be completely
backward compatible with standard Pascal. If your system also has a "close"
function:
close(f)
You should use it. Closing open files releases system space (used to keep
track of files), and allows a series of files to be opened under one file
variable.
4. Modular compilation. Again the most common, UCSD adds an extra declaration
section before "label":
program junk(input, output);
uses trash, mylib;
label 1, 2;
...
This tells the compiler to include all the (public) constants, types and
blocks from the files named in the uses list in the present program.
The exact details of the format needed to compile the module (called a
"unit" in UCSD) vary quite a bit from compiler to compiler.
5. Include statements. Most compilers allow the appearance of an "include"
statement or marker in the source, that specifies that another file is to
be included in line. Most common is UCSD's "control comment":
{$I file.pas}
Which would include the file "file.pas" inline, replacing the entire comment.
6. Flexible declaration order. Many or most compilers allow the declarations
to occur in any order, and for declaration sections to occur multiple times:
program test;
var this, that: integer;
const x = 10;
var mystring: packed array [1..x] of char;
...
The reason for this is that when using modular compilation or include
statements, each program section must have its own set of declarations.
Note that the rule that objects must be declared before use is still in
effect even though the declarations can appear in any order.
7. Strings. Many compilers include a full Basic-style string type, which
allows constructs such as:
var s: string;
s := 'hi there'; { any length can be assigned }
s := s+' george'; { concatenation }
writeln('Length is: ', length(s)); { find string length }
...
This also was in UCSD Pascal.
8. "_" allowed in identifiers. This convention comes (mainly) from C, where
identifiers like:
my_label
Are common. Since Pascal must live in systems where these names are common,
this is allowed in most implementations.
9. "goto" labels as normal identifiers. The appearance of goto labels as
integers might be considered a sort of arcane plot to punish people for using
"goto"s. Many or most implementations allow goto labels to appear in the
same form as identifiers:
label destination;
...
destination:
...
goto destination;
*****************************************************************************
Q. WHAT COMPILERS EXIST FOR STANDARD PASCAL ?
A. This is a VERY incomplete list of known standard Pascals, with my notes.
There are no machine or operating system limits on which compilers I intend
to list.
FREEWARE
The following compilers are free, and available over the net.
1. GNU Pascal. GPC is a front end to the very extensive GCC system, which
is a modular compiler that generates code for a truly amazing number of
different operating systems and machines. Because GPC fits into that system,
it should be able to go anywhere that GCC goes. However, as of this writing,
only a Linux compiler has officially been released. Compilers for other
operating systems and machines are under way. GPC is stated to be compatible
with the unextended standard, and work is under way to add the extended
standard to it. However, exceptions to the standard reportedly exist in the
language. The exact details are unknown. I will hopefully be checking this
out personally when I get a dedicated Linux box going.
Location: ftp://kampi.hut.fi/jtv/gnu-pascal
Start by reading the GPC.GUIDE file, it will tell you everything.
GPC states that it should work on any computer or operating system where
GCC is implemented (which would make it the most widely available Pascal
version anywhere). However, I have had people attempting the port tell
me that this involves a lot of work, and as a result I have yet to see
a port outside of Linux. As more ports appear, I will add them to this list.
2. Willhelm J. Withagen's project compiler for OS/2. I have tested this
compiler, and noted one deviance from the standard, and the author informs
me that interprocedural gotos are not completely implemented.
Location: ftp://ftp.cdrom.com/pub/os2/dev32/pasos2b.zip
PAIDWARE
These compilers are available for cost. Unfortunately, I will only be listing
compilers that are currently shipping, which leaves several compilers out
whose makers have dropped the product or had a business failure.
1. Prospero Pro Pascal. In existence for quite some time, the DOS compiler
supports small, large and huge memory models. Prospero compilers show an
extra effort to conform to the existing standards, and have excellent
documentation. There is a Windows version of this compiler, as well as
one for the old (1.x) OS/2. The only drawback to the compiler is the less
than stellar code speed.
Price: ???
Email: prospero@prospero.demon.co.uk
2. Prospero ep32 Pascal. This is Prospero's new 32 bit compiler for OS/2.
From what I saw, Prospero did an excellent job of integrating the compiler
to OS/2, including providing routine headers for the OS/2 system calls,
a translator from C to Pascal for any other functions, and a full IDE.
This compiler supports the full original and extended Pascal standards.
The only drawbacks I found were the (again) slow code speed, and the fact
that interprocedural gotos don't work across module lines (which makes
error recovery very difficult).
Price: about $250 (US).
Email: prospero@prospero.demon.co.uk
*****************************************************************************
======== EXTENDED PASCAL ==========
Now we concentrate on the official "extended" version of the Pascal standard.
This information was provided by John Reagan of DEC, who serves on the Pascal
standards committees, and also programs Pascal compilers for DEC.
****************************************************************************
Q. WHAT IS EXTENDED STANDARD PASCAL ?
A. The Extended Pascal standard was completed in 1989 and is a superset
of ISO 7185. The Extended Pascal standard is both an ANSI/IEEE and
ISO/IEC standard. Both standards are identical in content with minor
editorial differences between the IEEE and ISO/IEC style guides.
The ANSI/IEEE number is ANSI/IEEE 770X3.160-1989. The ISO/IEC number
is ISO/IEC 10206 : 1991.
Here is part of the foreword from the Extended Pascal standard to provide
a short summary of the new features in Extended Pascal.
- Modularity and Separate Compilation. Modularity provides for
separately-compilable program components, while maintaining type
security.
o Each module exports one or more interfaces containing entities
(values, types, schemata, variables, procedures, and functions)
from that module, thereby controlling visibility into the module.
o A variable may be protected on export, so that an importer may use
it but not alter its value. A type may be restricted, so that its
structure is not visible.
o The form of a module clearly separates its interfaces from its
internal details.
o Any block may import one or more interfaces. Each interface may
be used in whole or in part.
o Entities may be accessed with or without interface-name qualification.
o Entities may be renamed on export or import.
o Initialization and finalization actions may be specified for each
module.
o Modules provide a framework for implementation of libraries and
non-Pascal program components.
Example:
module employee_sort interface;
export employee_sort = (sort_by_name,sort_by_clock_number,
employee_list);
import generic_sort;
type
employee = record
last_name,first_name : string(30);
clock_number : 1..maxint;
end;
employee_list(num_employees : max_sort_index) =
array [1..num_employees] of employee;
procedure sort_by_name(employees : employee_list;
var something_done : Boolean);
procedure sort_by_clock_number(employees : employee_list;
var something_done : Boolean);
end.
- Schemata. A schema determines a collection of similar types. Types
may be selected statically or dynamically from schemata.
o Statically selected types are used as any other types are used.
o Dynamically selected types subsume all the functionality of,
and provide functional capability beyond, conformant arrays.
o The allocation procedure NEW may dynamically select the type
(and thus the size) of the allocated variable.
o A schematic formal-parameter adjusts to the bounds of its
actual-parameters.
o The declaration of a local variable may dynamically select the
type (and thus the size) of the variable.
o The with-statement is extended to work with schemata.
o Formal schema discriminants can be used as variant selectors.
Example:
type
SWidth = 0..1023;
SHeight = 0..2047;
Screen(width: SWidth; height: SHeight) =
array [0..height, 0..width] of boolean;
Matrix(M,N: integer) = array [1..M,1..N] of real;
Vector(M: integer) = array [1..M] of real;
Color = (red,yellow);
Color_Map(formal_discriminant: color) =
record
case formal_discriminant of
red: (red_field : integer);
yellow : (yellow_field : integer);
end;
function bound : integer;
var s : integer;
begin
write('How big?');
readln(s);
bound := s;
end;
var
My_Matrix : Matrix(10,10);
My_Vector : Vector(bound); { Notice the run-time expression! }
Matrix_Ptr : ^Matrix;
X,Y : integer;
begin
readln(x,y);
new(Matrix_Ptr,X,Y);
end
- String Capabilities. The comprehensive string facilities unify
fixed-length strings and character values with variable-length strings.
o All string and character values are compatible.
o The concatenation operator (+) combines all string and character
values.
o Strings may be compared using blank padding via the relational operators,
or using no padding via the functions EQ, LT, GT, NE, LE, and GE.
o The functions LENGTH, INDEX, SUBSTR, and TRIM provide information
about, or manipulate, strings.
o The substring-variable notation makes accessible, as a variable, a
fixed-length portion of a string variable.
o The transfer procedures READSTR and WRITESTR process strings in the
same manner that READ and WRITE process textfiles.
o The procedure READ has been extended to read strings from textfiles.
- Binding of Variables.
o A variable may optionally be declared to be bindable. Bindable
variables may be bound to external entities (file storage,
real-time clock, command lines, etc.). Only bindable variables
may be so bound.
o The procedures BIND and UNBIND, together with the related type
BINDINGTYPE, provide capabilities for connection and disconnection
of bindable internal (file and non-file) variables to external
entities.
o The function BINDING returns current or default binding information.
- Direct Access File Handling.
o The declaration of a direct-access file indicates an index by which
individual file elements may be accessed.
o The procedures SEEKREAD, SEEKWRITE, and SEEKUPDATE position the file.
o The functions POSITION, LASTPOSITION, and EMPTY report the current
position and size of the file.
o The update file mode and its associated procedure UPDATE provide
in-place modification.
- File Extend Procedure. The procedure EXTEND prepares an existing
file for writing at its end.
- Constant Expressions. A constant expression may occur in any context
needing a constant value.
- Structured Value Constructors. An expression may represent the value
of an array, record, or set in terms of its components. This is
particularly valuable for defining structured constants.
- Generalized Function Results. The result of a function may have any
assignable type. A function result variable may be specified, which
is especially useful for functions returning structures. [A function
call may be directly array-indexed, field-selected, or pointer-dereferenced
without having to use an intermediate variable.]
- Initial Variable State. The initial state specifier of a type [or record
field] can specify the value that variables [, or fields, or variant
selectors] are to be created with.
- Relaxation of Ordering of Declarations. There may be any number of
declaration parts (labels, constants, types, variables, procedures
and functions) and in any order. The prohibition of forward references
in declarations is retained.
- Type Inquiry. A variable or parameter may be declared to have the
type of another parameter or another variable.
- Implementation Characteristics. The constant MAXCHAR is the largest
value of type CHAR. The constants MINREAL, MAXREAL, and EPSREAL describe
the range of magnitude and the precision of real arithmetic.
- Case-Statement and Variant Record Enhancements. Each case-constant-list
may contain ranges of values. An OTHERWISE clause represents all
values not listed in the case-constant-lists.
- Set Extensions.
o An operator (><) computes the set symmetric difference.
o The function CARD yields the number of members in a set.
o A form of the for-statement iterates through the members of a set.
- Date and Time. The procedure GETTIMESTAMP and the functions DATE and
TIME, together with the related type TIMESTAMP, provide numeric
representations of the current date and time and convert numeric
representations to strings.
- Inverse Ord. A generalization of SUCC and PRED provides an inverse
ORD capability.
- Standard Numeric Input. The definition of acceptable character
sequences read from a textfile includes all standard numeric
representations defined by ANSI X3.42-1975.
- Non-Decimal Representation of Numbers. Integer numeric constants may
be expressed using bases two through thirty-six.
- Underscores in Identifiers. The underscore character (_) may occur
within identifiers and is significant to their spelling.
- Zero Field Widths. The total field width and fraction digits expressions
in write parameters may be zero.
- Halt. The procedure HALT causes termination of the program.
- Complex Numbers.
o The simple-type COMPLEX allows complex numbers to be expressed in
either Cartesian or polar notation.
o The monadic operators + and - and dyadic operators +, -, *, /,
=, [and] <> operate on complex values.
o The functions CMPLX, POLAR, RE, IM, and ARG construct or provide
information about complex values.
o The functions ABS, SQR, SQRT, EXP, LN, SIN, COS, [and] ARCTAN
operate on complex values.
- Short Circuit Boolean Evaluation. The operators AND_THEN and OR_ELSE
are logically equivalent to AND and OR, except that evaluation order
is defined as left-to-right, and the right operand is not evaluated
if the value of the expression can be determined solely from the value
of the left operand.
- Protected Parameters. A parameter of a procedure or a function can
be protected from modification within the procedure or function.
- Exponentiation. The operators ** and POW provide exponentiation of
integer, real, and complex numbers to real and integer powers.
- Subrange Bounds. A general expression can be used to specify the
value of either bound in a subrange.
- Tag Fields of Dynamic Variables. Any tag field specified by a
parameter to the procedure NEW is given the specified value.
- Conformant Arrays. Conformant arrays provide upward compatibility
with level 1 of ISO 7185, Programming languages - PASCAL.
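A few of the smaller features in the list above can be seen together in
one short program. This is only a sketch written for this FAQ, not text
from the standard: the program name, the identifiers (limit, mask,
squares) and the values are made up for illustration, and it assumes a
compiler that accepts these Extended Pascal features.

```pascal
program feature_sketch(output);

const
  limit = 5;
  mask  = 16#FF;                 { base-16 constant, decimal 255 }

var
  squares : array [1..limit] of integer;
  i : integer;

begin
  for i := 1 to limit do
    squares[i] := i pow 2;       { integer exponentiation }

  { and_then never evaluates squares[i] once i > limit, so the }
  { array is not indexed out of range when the loop runs off   }
  { the end                                                     }
  i := 1;
  while (i <= limit) and_then (squares[i] < 10) do
    i := i + 1;
  writeln('first square >= 10 is at index ', i:1);

  { case-constant ranges plus an otherwise clause }
  case squares[i] of
    0..9   : writeln('one digit');
    10..99 : writeln('two digits');
    otherwise writeln('three or more digits')
  end;

  writeln('mask = ', mask:1)
end.
```

With plain AND instead of AND_THEN, the order of evaluation is not
defined, so the loop test could index the array with limit + 1.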
*****************************************************************************
Q. HOW EASY IS IT TO CONVERT BORLAND PROGRAMS TO THE EXTENDED PASCAL
STANDARD ?
A. As mentioned earlier, Turbo Pascal does not conform to any of the Pascal
standards. If you carefully choose a subset of unextended Pascal, you
may be able to port code if you're lucky/careful.
To be fair, Turbo Pascal has some wonderful features that make it very
powerful in the environments in which it runs. However, those same
features are of little use on non-Windows/DOS platforms and probably are
not good candidates for standardization.
There are several Turbo Pascal features which are semantically similar
to features in unextended Pascal or Extended Pascal. Here is a list of
mappings between Turbo Pascal features and Extended Pascal features:
- Case constructs
a. Extended Pascal uses otherwise instead of else.
Borland Pascal:
case c of
'A' : ;
'B' : ;
else ...;
end;
Extended Pascal
case c of
'A' : ;
'B' : ;
otherwise ...;
end;
b. A case index that matches none of the case constants is an error
in Extended Pascal. In the case statement above, if there were no
otherwise clause and char c had the value 'C', the program would
stop with an error (note, this would go unnoticed in Borland Pascal,
which simply skips the case statement).
- Procedure and function types and variables
Here is an area of subtle differences. Turbo Pascal has true
procedure/function types but doesn't have standard Pascal's
procedural/functional parameters.
Borland Pascal
type
CompareFunction = function(Key1, Key2 : string) : integer;
procedure Sort(Compare : CompareFunction);
begin
...
end;
Extended Pascal
procedure Sort(function Compare(Key1, Key2 : string) : integer);
begin
...
end;
Moving from Turbo Pascal to Extended Pascal might be difficult
if the Turbo Pascal program saves, compares, trades, etc. procedure
values. For example, an array of procedure values isn't possible
in Extended Pascal. Moving the other way is a little easier as
shown by the above examples.
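To show a procedural/functional parameter in actual use, here is a small
complete program. It is a sketch made up for this FAQ (the names twice
and add3 are hypothetical), and it uses only unextended ISO 7185 syntax,
so it should also be accepted by an Extended Pascal compiler.

```pascal
program proc_param(output);

{ f is a functional parameter: it is declared in the formal  }
{ parameter list itself, no separate function type is needed }
function twice(function f(x : integer) : integer;
               v : integer) : integer;
begin
  twice := f(f(v))
end;

function add3(x : integer) : integer;
begin
  add3 := x + 3
end;

begin
  { add3 is passed by name; twice applies it two times }
  writeln(twice(add3, 1):1)      { writes 7 }
end.
```

Note that add3 is only passed along. It is never stored in a variable,
which is exactly the part that Turbo Pascal's procedure types allow and
the standards do not.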
- Strings
a. Borland Pascal's string type has a special case, namely string
without a length meaning the same as string[255]. There is no
default in Extended Pascal so you have to change all string types
to string(255). Example:
var
s : string;
becomes:
var
s : string(255);
Note also that you have to use parentheses instead of brackets.
b. A nice pitfall is the pointer to string as in:
type
PString = ^String;
In Extended Pascal this is a pointer to a schema type!! Don't
forget to translate this to:
type
string255 = string(255);
PString = ^string255;
If you indeed want to use String as a schema pointer you can
define things like:
var
MyStr : ^String;
begin
New(MyStr, 1024);
end;
to allocate a string with a capacity of 1024 characters.
c. As you could see above, Extended Pascal has no 255 byte limit
for strings. It is however safe to assume a limit of about
32000 bytes. At least Prospero's Extended Pascal limits
strings to 32760 bytes. GNU Pascal seems to allow larger
strings. DEC Pascal limits strings to 65535 bytes.
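The string handling that replaces Borland's string routines can be
sketched as follows. This is an illustrative program written for this
FAQ, not taken from any compiler manual; the variable names and the
literal values are assumptions chosen for the example.

```pascal
program string_sketch(output);

var
  s : string(40);                { capacity 40, current length varies }
  t : string(10);

begin
  s := 'Extended';
  s := s + ' ' + 'Pascal';       { + concatenates strings and chars }
  writeln(length(s):1);          { writes 15 }
  writeln(s.capacity:1);         { schema discriminant, writes 40 }
  writeln(index(s, 'Pascal'):1); { writes 10 }

  t := substr(s, 1, 8);          { 'Extended' }
  writeln(t);

  writeln(length(trim('abc   ')):1);   { trailing blanks removed: 3 }

  { = compares with blank padding, EQ compares without padding }
  if 'abc' = 'abc ' then
    writeln('= pads with blanks');
  if not eq('abc', 'abc ') then
    writeln('eq does not pad')
end.
```

The capacity of a string variable is fixed when the variable is created,
but its length changes with each assignment, much like a Borland string.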
- Constant variables
a. Extended Pascal translates Borland's gruesome:
const
i:integer = 0;
to:
var
i : integer value 0;
Much nicer ain't it?
b. Even nicer is that you can assign initialization values to
types. Like:
type
MyInteger = integer value 0;
var
i : MyInteger;
All variables of type MyInteger are automatically initialized to
0 when created.
c. Constant arrays of type string are translated from:
const
MyStringsCount = 5;
type
Ident = string[20];
const
MyStrings : array [1..MyStringsCount] of Ident = (
'EXPORT', 'IMPLEMENTATION', 'IMPORT', 'INTERFACE',
'MODULE');
to:
const
MyStringsCount = 5;
type
Ident = string(20);
var
MyStrings : array [1..MyStringsCount] of Ident value [
1:'EXPORT'; 2:'IMPLEMENTATION'; 3:'IMPORT';
4:'INTERFACE'; 5:'MODULE'];
There seem to be pros and cons to each style.
Some folks don't like having to specify an index since it requires
renumbering if you want to add a new item to the middle. However,
if you index by an enumerated type, you might be able to avoid
major renumbering by hand.
- Variant records
The following construction is not allowed in Extended Pascal:
type
PersonRec = record
Age : integer;
case EyeColor : (Red, Green, Blue, Brown) of
Red, Green : (Wears_Glasses : Boolean);
Blue, Brown : (Length_of_lashes : integer);
end;
The variant field needs an explicit type. Code this as:
type
EyeColorType = (Red, Green, Blue, Brown);
PersonRec = record
Age : integer;
case EyeColor : EyeColorType of
Red, Green : (Wears_Glasses : Boolean);
Blue, Brown : (Length_of_lashes : integer);
end;
- Units
a. You can translate units almost automatically to Extended Pascal
Modules, taking into account some differences of course.
Extended Pascal does not automatically export everything named
in a module, but you have to create separate export clauses.
For example translate the following unit:
unit A;
interface
uses
B, C;
procedure D;
implementation
procedure D;
begin
end;
end.
to this module:
module A interface;
export
A = (D);
import
B;
C;
procedure D;
end.
module A implementation;
procedure D;
begin
end;
end.
You can have one or more export clauses and the name of an
export clause doesn't have to be equal to the name of the
module.
You also see in this example how to translate the Borland
Pascal "uses" clause to the Extended Pascal "import" clause.
b. Borland Pascal allows you to have code in a unit that is
executed once, at startup, to initialize things. You can
translate this to Extended Pascal's "to begin do ..end"
structure.
Borland Pascal:
unit A;
interface
implementation
begin
{ do something }
end.
Extended Pascal:
module A interface;
end.
module A implementation;
to begin do begin
{ do something }
end;
end.
Extended Pascal also has a "to end do .... end" so you can
translate Exit handlers also.
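A program that uses module A from the example above could look like the
following sketch. It assumes that both parts of module A (and the
modules B and C it imports) have already been compiled and are known to
your compiler; how that is arranged is implementation-dependent. The
program name use_a and the new name run_d in the comment are made up.

```pascal
program use_a(output);

import
  A qualified;                   { D must now be written as A.D }

begin
  A.D;
  writeln('done')
end.

{ Other forms of the import clause:                            }
{   import A;                    everything, unqualified       }
{   import A only (D);           just D                        }
{   import A only (D => run_d);  D, renamed to run_d on import }
```

The interface name after "import" is the name used in the export clause,
which in this example happens to be the same as the module name.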
- Files
Extended Pascal treats files quite differently than Borland Pascal.
I'm not going to treat file pointers, Get and Put here, but
instead I focus on special Extended Pascal features.
In Borland Pascal you can read any text file as follows:
var
t : text;
Line : string;
begin
Assign(t, 'MYTEXT.TXT');
Reset(t);
while not eof(t) do begin
readln(t, Line);
writeln(Line);
end;
end;
The Assign procedure associates the textfile T with the file
MYTEXT.TXT.
In Extended Pascal, files are considered entities external to your
program. External entities, which don't need to be files, need to
be bound to a variable in your program. Any variable to which
external entities can be bound needs to be declared bindable. So
the variable declaration of t becomes:
var
t : bindable text;
Extended Pascal has the bind procedure that binds a variable to
an external entity. Here is an Extended Pascal procedure that
emulates the Assign procedure in Turbo Pascal.
procedure Assign(var t : text; protected Name : string);
var
b : BindingType;
begin
unbind (t);
b := binding (t);
b.Name := Name;
bind (t, b);
b := binding (t);
end;
Comments: the unbind procedure unbinds a bindable variable from
its external entity. If it is not bound, nothing happens. The
binding function initializes b. We call binding to set some fields
of the BindingType record. Next we set the name field to the name
of the file. Calling bind will bind t to the external entity. If
we now call binding again, we get the current state of t's binding
type. We can now check for example if the bind has succeeded by:
if not b.bound then
{ do error processing }
Note that Prospero's Pascal defaults to creating the file if it
does not exist! You need to use Prospero's local addition of
setting b.existing to true to work around this.
I've not worked with binary files enough, so no advice yet on how
to access them, but you access them much the same.
Finally, an example of getting the size of a file.
function FileSize(filename : String) : integer;
var
f : bindable file [0..MaxInt] of char;
b : BindingType;
begin
unbind(f);
b := binding (f);
b.Name := filename;
bind(f, b);
b := binding(f);
SeekRead(f, 0);
if empty(f)
then filesize := 0
else filesize := LastPosition(f) + 1;
unbind(f);
end(*file_size*);
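Putting the pieces together, the Borland Pascal text file loop shown
earlier might be written in Extended Pascal as below. This is a sketch
only: it binds the file directly instead of going through the Assign
emulation, it relies on just the name and bound fields of BindingType,
and the file name MYTEXT.TXT is carried over from the Borland example.

```pascal
program show_file(output);

var
  t    : bindable text;
  b    : BindingType;
  line : string(255);

begin
  b := binding(t);               { start from the default binding }
  b.name := 'MYTEXT.TXT';
  bind(t, b);
  b := binding(t);               { fetch the state after bind }
  if not b.bound then
    writeln('could not bind MYTEXT.TXT')
  else
    begin
      reset(t);
      while not eof(t) do
        begin
          readln(t, line);       { read is extended to strings }
          writeln(line)
        end;
      unbind(t)
    end
end.
```

Whether a missing file makes the bind fail or creates an empty file
depends on the compiler, as noted above for Prospero.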
*****************************************************************************
Q. WHAT IS THE OBJECT ORIENTED PASCAL STANDARD ?
A. After the Extended Pascal standard was completed, the committee took up
the task of adding object-oriented features to Extended Pascal. Actually,
the work applies to both Extended Pascal and unextended Pascal, but to
get the full benefit from the object-oriented features, certain features
from Extended Pascal must also be utilized.
This work was done as an ANSI Technical Report. Unlike the 2 previous
standards, the technical report is not full of "standardese", but rather
is more informal and readable than a full-bodied standard. This report
was completed in 1993.
Features from the technical report include:
- A CLASS definition with support for ABSTRACT and PROPERTY classes.
Abstract classes are place holders in the class hierarchy. An
object cannot be created for an abstract class. Property classes
provide characteristics, attributes, or properties of another class.
An object can be tested to see whether or not it has a property.
- Multiple inheritance from at most one concrete or abstract class and
zero or more property classes.
- Method definitions can be marked with ABSTRACT or OVERRIDE.
- Constructors and destructors with zero or more parameters.
- A predefined ROOT class containing a predefined constructor (CREATE),
a predefined destructor (DESTROY), and two predefined methods
(CLONE and EQUAL).
- Predefined property class TEXTWRITABLE with methods READOBJ and
WRITEOBJ.
- CLASS VIEWs provide a new class type that is a partially opaque
view of an existing class type. Views are used to provide
visibility, security, and protocol between public and private
uses of the class.
- The object model is a reference model. This means that you create
and destroy objects explicitly. Objects can be thought of as if
they were accessed indirectly through references.
- A membership operator, IS.
*****************************************************************************
Q. WHO CREATES THE PASCAL STANDARDS ?
A. Would you believe magic elves? No, I guess not...
The Pascal standards are developed and maintained by the American
committee, X3J9, and the International working group, WG2. Most of
the standardizers actually belong to both committees.
During its peak, the committee met at least 4 times a year for a week
at a time. The location of the meetings floated around from member to
member trying to alternate east-coast vs west-coast. The WG2 meetings
tried to alternate between North America and UK/Europe.
Recently, the committee's work has slowed down and the meetings average
about 1 or 2 per year with very rare meetings in the UK.
The committee is open to the public. To become a voting member on X3J9
you just have to satisfy some attendance/participation requirements
(very easy) and be a member of X3 by paying dues (X3 dues can be waived
for those unable to pay). Basically, anybody with a brain may attend
(and if you've been to some previous meetings, you'll note that we
sometimes waive that requirement too :-) )
If you are interested in attending a Pascal meeting, drop me a note at
reagan@hiyall.enet.dec.com and I can give you information on the next
meeting's location.
Over the past years, many different vendors, user-groups, academics, etc.
have participated in X3J9. Here is a quick list (not intended to be
a total list) of participants: Digital Equipment Corporation, Apple
Computer, Microsoft, Tandem Computers, Sun Microsystems, Intel, Siemens
Nixdorf, IBM, Hewlett Packard, Edinburgh Portable Compilers, Prospero
Software Ltd., University of Bochum, Borland International, US Air Force,
University of Teesside, Visible Software, US Census Bureau, Symantec,
Unisys, GTE, Control Data Corporation, Cray Research Inc., E-Systems,
ACE - Associated Computer Experts bv, Stanford Linear Accelerator
Center, Central Michigan University, Pace University, St. Peter's College,
Prime Computer, Queens University, Research Libraries Group, Florida
International University, Apollo Computer, NCR Corporation, Data General,
and various individuals representing themselves. (Well, technically
almost all members of the committee represent themselves and not their
employers, but I thought people would recognize the names of the companies
and not the names of the individuals.)
*****************************************************************************
Q. ARE THERE ANYMORE STANDARDS IN PROGRESS ?
A. Recently, the committee has been working on an exception handling model
that is similar to what you might see in C++ or Modula-3. We hope to
produce another ANSI Technical Report on this in the future. However,
we're in need of more people to participate in our work. In the past,
most of the work was physically done at the meetings, but I think that
in the future, we'll have to do work by e-mail or newsgroups in a more
informal fashion and only have physical meetings to consolidate the final
work.
*****************************************************************************
Q. HOW DO I GET COPIES OF THE STANDARDS ?
A. The unextended Pascal and Extended Pascal standards are both copyrighted
by the IEEE in the US and ISO/IEC in other countries.
You can obtain copies from the IEEE, your country's standards body,
university libraries, corporate libraries, or the ISO/IEC in Geneva,
Switzerland. Also, there is a Standards FAQ posted monthly in
News.Announce which helps finding who to contact. It can also be found
at ftp.uni-erlangen.de:/pub/doc/ISO/. You can also Telnet to
info.itu.ch with name "gopher".
Here is some contact information:
Philips Business Information 1-800-777-5006
Document Center 1-415-591-7600
ANSI 1-212-642-4900
Attention: Customer Service
11 West 42nd Street
New York, NY 10036
ISO Sales
Case Postale 56
CH-1211 Geneve 20
Switzerland
email: sales@isocs.iso.ch
http://www.iso.ch/welcome.html ! in English
http://www.iso.ch/welcomef.html ! en Français
I do not think that the IEEE has republished the unextended Pascal
standard since it was essentially replaced with a pointer to the ISO 7185
standard. The IEEE may still have old copies lying around. The IEEE
order number is SH08912. The ISBN number is 0-471-88944-X. I know that
the revised ISO 7185 is available (I have a copy from BSI in the UK).
For Extended Pascal, the IEEE order number is SH13243 and the ISBN
number is 1-55937-031-9. Last time I checked the IEEE's price was
around US$55. I'm not sure what the ISO/IEC charges. In addition,
I've been told that the GNU Pascal kit at kampi.hut.fi:/jtv/gnu-pascal
contains a Lex script for Extended Pascal. I have no idea about its
accuracy.
The Object-Oriented Extensions to Pascal technical report is ANSI
technical report number ANSI/X3-TR-13-1994 (is 13 an omen?). I'm not
sure of its copyright status, but it also isn't online in its final form
(the editor was using some variant of Word Perfect and said he was
unable to provide a readable text form). You can get the technical
report from CBEMA in Washington DC. I have no idea at present how to
get it outside the United States. I personally have a few hardcopies
lying around my office and if you drop me a line at
reagan@hiyall.enet.dec.com, I'll see what I can do.
I should point out that the unextended Pascal and Extended Pascal
standards are written in a very legalistic form and are not light
reading. They really aren't suitable as an implementation guide
or for learning how to use features from unextended or Extended Pascal.
On the other hand, the Object-Oriented Extensions to Pascal technical
report is written in a less formal style. The committee is planning
on producing "standardese" for the technical report to include in
a future revision of the unextended and Extended Pascal standards.
*****************************************************************************
Q. WHAT EXTENDED COMPILERS EXIST ?
A. For Extended Pascal, only Prospero Pascal claims complete acceptance
of Extended Pascal source code. Other vendors' compilers, like Digital's,
accept portions of the Extended Pascal standard. You'll have to ask your
favorite vendor about support.
The publicly available GPC (GNU Pascal Compiler) project includes many
of the features of the extended standard, with the goal of implementing
the entire standard.
At present, there is no official Extended Pascal Validation Suite. Now
that Prospero has obtained the rights to the Pascal Validation Suite,
there is a chance for a future EPVS. You'll have to ask Prospero about
their plans.
For the Object-Oriented Extensions to Pascal, I know of no compiler
that yet claims support of the document. Again, ask your vendor to make
sure.
*****************************************************************************
ADDITIONS TO THE FAQ
Submissions to this FAQ are encouraged. Just send me a Q & A formatted
letter with a problem that is of common interest to Pascal programmers, and
I will include it.
sam@value.net
*****************************************************************************