Dormitat Homerus
"I say what I mean - at least, I mean what I say - its the same thing you know"
Lewis Carroll, 'Alice in Wonderland'
1.1 Although a universal computer is capable in principle of doing anything that is logically possible, it does not always do so, and this most often stems from a failure in communication between the user or programmer, whose thoughts are framed in natural language, and the machine, which is constrained to work in terms of an artificial language with tight syntax that allows no ambiguities. A computer cannot work on a basis of 'I hear what you say, but I understand what you mean'. At best it can only report 'I do not understand'; at worst, it misinterprets what the user intends and does something different. In fact interpreting natural language reliably is a task beyond current machine capabilities, and is likely to remain so because of inherent ambiguities and drift in meaning over time.
1.2 The commonest cause of failure is probably user error in data input or manual selection of imperfectly understood operations. In the days before computers, professional number crunchers customarily spent more time cross-checking the validity of the data being used than in performing the calculations. In the era of punched-card mainframes, input was usually checked at least four times, using three independent operators, before being submitted to the machine. While it is feasible to arrange for the machine to check automatically that inputs are legal, it is inherently impossible for it to check input which is legal but not what the user intended. The issue of verifying input often warrants more attention than it receives.
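As a minimal sketch of that distinction (the date format and example values are invented purely for illustration), a program can reject input that is outright illegal, but it has no way of telling whether legal input is what the user actually meant:

    # Sketch: a program can detect *illegal* input (here, an impossible date in a
    # fixed format) but cannot know whether legal input was what was intended.
    from datetime import datetime

    def read_date(text):
        try:
            return datetime.strptime(text, "%Y-%m-%d").date()
        except ValueError:
            return None                    # illegal input detected and reported

    print(read_date("2024-02-30"))         # None: no such date exists
    print(read_date("2024-02-03"))         # accepted - but did the user mean 3 Feb or 2 Mar?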
2.1 In some cases a question may be meaningless because it is incompletely defined (1) but in others it may be intelligible but uncomputable. The formal definition of operations which are computable is straightforward - they can be expressed as a finite sequence of logical operations (ie are algorithmic): anything else is not computable. In practice however the distinction is not so clear-cut, since there is no definitive way of determining whether a suitable algorithm is merely undiscovered or cannot exist. In fact the definition stems from an underlying philosophical view that a problem cannot be regarded as solved until positive proof that the solution is correct has been demonstrated - guesswork won't do.
2.2 Many operations involving real numbers are non-computable in the strict sense since they require an infinite sequence of logical operations to give an exact result, eg calculating pi or successive approximation algorithms. These are typical of many problems where approximations with arbitrarily small error can be achieved by truncating the sequence after an appropriate number of terms (making it algorithmic). This presents no substantial difficulty as far as programming is concerned but raises the issue of deciding how many terms to include in a program intended for other users. This becomes especially difficult with proprietary programs, where user needs may be very varied and basically unknown to the programmer, while the user is denied information on program structure and details.
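A minimal sketch of such truncation, assuming the slowly converging Leibniz series for pi is acceptable purely as an illustration - the tolerance plays the role of the 'how many terms' decision:

    # Sketch: approximate pi from the Leibniz series 4*(1 - 1/3 + 1/5 - ...).
    # The exact value needs infinitely many terms; truncating once the latest term
    # falls below a chosen tolerance turns it into a finite (algorithmic) procedure.
    def approximate_pi(tolerance=1e-6):
        total, k, term = 0.0, 0, 1.0
        while abs(term) >= tolerance:
            term = (-1) ** k * 4.0 / (2 * k + 1)
            total += term
            k += 1
        return total

    print(approximate_pi())    # about 3.14159..., in error by less than the tolerance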
2.3 The constraints on computability apply as much to human reasoning as to computers but in everyday life uncomputable problems abound and guesswork is often regarded as acceptable, presumably with the expectation of being wrong on occasion (2). In these cases several different situations arise if computer assistance is sought :-
(a) The problem is meaningless because it does not give sufficient information to solve it e.g. 'Why did the chicken cross the road?' - a computer can only reply 'I do not know'
(b) The problem specification is well-defined but inconsistent - a good program will detect this and report accordingly
(c) An appropriate logical sequence cannot be constructed because the mental processes involved are not understood e.g. recognizing a picture of a face - computer recognition is notoriously unreliable, yet babies acquire this ability within a few weeks of birth.
2.4 In the last case a specific photograph of a face can be recognized with high reliability by both human and computer: it is recognizing different photographs of the same face that presents difficulties. The reliability can become low if they are presented in different formats or contexts. Other good examples are the Optical Character Recognition programs used with document scanners. These typically claim a reliability of around 99%, but since a page of typescript contains roughly 2000 - 3000 characters this still means 20 - 30 errors on each page, so proof-reading and correction require an appreciable fraction of the effort required to type the document in from the keyboard. The performance is best if the document is restricted to a specific font, when allowance for different sizes and orientations of characters is feasible, but when unspecified fonts are used the computer performance degrades greatly, sometimes recognizing only a few percent of the characters despite the fact that the human eye can interpret the text without difficulty.
3.1 Another grey area is where the problem is well-defined and computable but the user does not understand the constraints implied. Simple examples are the many programs which offer to order or classify data. In selecting or designing an ordering program it is necessary to consider :-
(a) A suitable format for input data - a text-based file with constant-width characters is vastly easier to handle than a graphic-based format with variable-width characters.
(b) Whether the required ordering criterion allows complete or only partial ordering, and in the latter case what is to be done with duplicates
(c) Whether a composite criterion is needed and the order in which the components are applied
(d) What signal is used to indicate the end of the data.
None of these presents any major programming problem, but a user may find considerable difficulty in discovering these details in an existing program written by someone else.
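As a rough sketch of points (b) and (c) above (the records are invented for illustration), a composite criterion can be written as a key function; a stable sort is one common answer to the duplicate question, since records equal under the criterion keep their input order:

    # Hypothetical records (surname, forename, year), purely illustrative data.
    records = [
        ("Smith", "John", 1990),
        ("Brown", "John", 1987),
        ("Smith", "Anne", 1990),
    ]

    # Composite ordering criterion: by year first, then by surname.
    # Python's sorted() is stable, so records that compare equal under the
    # criterion (duplicates in the partial ordering) retain their input order.
    ordered = sorted(records, key=lambda r: (r[2], r[0]))
    for rec in ordered:
        print(rec)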
For classification programs the category selection conditions need careful definition to ensure that they produce classes which are collectively comprehensive but mutually exclusive. If this condition is not satisfied, provision is needed to define what is to be done with data which fits no category, or more than one.
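A minimal sketch (with category predicates invented for illustration) of catching data that falls into no category, or into more than one:

    # Sketch: classify a value and flag anything the categories do not cover,
    # or cover more than once.  The boundaries here deliberately overlap at 10.
    categories = {
        "negative": lambda x: x < 0,
        "small":    lambda x: 0 <= x <= 10,
        "large":    lambda x: x >= 10,       # overlaps "small" at exactly 10
    }

    def classify(value):
        matches = [name for name, test in categories.items() if test(value)]
        if len(matches) == 1:
            return matches[0]
        return "unclassified" if not matches else "ambiguous"

    for v in (-3, 5, 10):
        print(v, classify(v))                # negative, small, ambiguous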
3.2 The traditional approach to mathematical problems requires determination of whether the problem has
(a) no solutions
(b) one unique solution
(c) more than one solution.
In some cases this creates a slightly more complex situation. An example is matrix inversion, which is widely used in many problems, including solving sets of simultaneous linear equations and statistical analysis. Here all three cases can arise depending on the specific data values, and a large number of methods have been developed to deal with these various situations. A user needs to understand enough about the subject (and the program) to ensure that a suitable method has been employed for the data he wishes to use if errors, which may be gross, are to be avoided.
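As a minimal sketch (using NumPy, with data invented for illustration), the same routine behaves quite differently on non-singular and singular data, and a well-written one at least reports the failure:

    import numpy as np

    a_unique   = np.array([[1.0, 2.0], [3.0, 4.0]])    # non-singular: one solution
    a_singular = np.array([[1.0, 2.0], [2.0, 4.0]])    # singular: none or infinitely many
    b = np.array([1.0, 2.0])

    print(np.linalg.solve(a_unique, b))                 # the unique solution
    try:
        print(np.linalg.solve(a_singular, b))
    except np.linalg.LinAlgError as err:                # a good routine reports the problem
        print("singular matrix:", err)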
4.1 It has been said that commercial programs are released when the cost of disaffected customers is judged to equal the cost of continuing testing and delaying release. After reading the fine print in the typical disclaimer attached to them, it would be over-optimistic to dismiss this view. It is easy, however, to be critical but less easy to suggest a definitive way of testing whether programs are totally correct - debugging can only find bugs, not guarantee their absence.
4.2 In theory it is possible to prove that a program for a finite-state machine which has a finite number of legal inputs is totally correct, by applying the inputs in turn and checking that the machine stops with the correct answer for each. In practice, however, this is only possible for selected simple examples, since
(a) For a realistic application the combinations of legal inputs are too numerous to be practicable
(b) A real machine will require an operating system which interrupts the action at unpredictable times, so the overall machine state at any point of the program is equally unpredictable and, despite attempts to isolate the effects, proof of non-interference is required.
Ironically, for cases where this is feasible (ie a table of legal input vs correct output can be constructed) the program is actually redundant, since a simple table look-up program serves for all.
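A minimal sketch of the point (the function and its table are invented for illustration): if every legal input can be enumerated and checked, the table driving the check is itself a complete program:

    # Sketch: exhaustive proof by testing, feasible only when the legal inputs
    # can all be enumerated.
    def parity(n):                                        # program under test
        return n % 2 == 0

    expected = {0: True, 1: False, 2: True, 3: False}     # complete input/output table

    for n, out in expected.items():
        assert parity(n) == out                           # check every legal input

    # ...at which point the table itself already answers every legal question:
    print(expected[2])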
4.3 The basic definition of a totally correct program is one that, given a legal input, will always halt with the correct answer. To be meticulous one might add that given an illegal input the program should also halt with an indication that the input is illegal (this is logically equivalent but much more laborious, since there are usually many more illegal inputs than legal ones). It has been shown that a general algorithm for proving that programs will halt cannot exist, so there is no test program which can establish total correctness. In practice validation has to revert to empirical testing, examining the program operation in detail with selected inputs and comparing the calculated result with independently known solutions. Attention has to be paid to selecting inputs which cover the whole legal range and any special cases that might occur, as well as the effects of rounding errors when real numbers are involved (which is nearly always).
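A minimal sketch of the rounding-error point: exact comparison of a computed result with the known answer can fail even when the program is behaving as intended, so empirical tests usually compare within a tolerance:

    import math

    # Sketch: 0.1 and 0.2 have no exact binary representation, so the computed
    # sum differs from 0.3 in the last bit even though the arithmetic is "right".
    computed = 0.1 + 0.2
    print(computed == 0.3)                            # False: exact comparison fails
    print(math.isclose(computed, 0.3, rel_tol=1e-9))  # True: comparison within tolerance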
4.4 In cases where operations are Safety Critical or formal Quality Assurance is sought, attempts have been made to systematize the process by inserting intermediate assertions at selected points in the program and verifying that these are satisfied by all paths through the program. It turns out that, while the selection of checkpoints and verification of local conditions can be automated, no algorithmic method exists for selecting the assertions needed to provide a proof, so this does not differ in any fundamental way from the 'empirical debugging' approach, despite being more complex and demanding.
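As a rough sketch of the idea (the function is invented for illustration), intermediate assertions mark what must hold at each checkpoint on every path; choosing assertions strong enough to amount to a proof is the step that cannot be automated:

    def integer_sqrt(n):
        assert n >= 0                              # precondition
        r = 0
        while (r + 1) * (r + 1) <= n:
            r += 1
            assert r * r <= n                      # intermediate assertion (loop invariant)
        assert r * r <= n < (r + 1) * (r + 1)      # postcondition: correct rounded-down root
        return r

    print(integer_sqrt(10))                        # 3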
5.1 The examples quoted are merely a random selection of situations that can (and do) occur, but they suffice to show that a computer cannot be regarded as an infallible oracle. Taking a pessimistic view this might be taken to imply that computers are an impediment rather than a utility, but this is well wide of the mark. There are many areas where progress would be impossibly laborious without the aid of a computer, and in most cases problems arise only rarely. The main point is that computers save perspiration, not inspiration. In order to do something effectively and reliably the user needs to understand some of the implications of what is being attempted - ideally enough to be able to do the job with pencil and paper in principle, even when life is too short to actually do it - and to make some assessment of whether results are valid. The most vulnerable situations are with large applications where the volume of data involved is too great to examine in detail and the logical design is too complex to follow even when the program structure and source code are available, eg CAD programs for engineering and architectural design.
5.2 Some of the most popular applications are fairly straightforward - in entertainment applications for example (playing games, music, or videos) (3) the outcome is immediately apparent - they make relatively little use of the computing capabilities and either work or don't work. Word-processing is possibly the most widely used application and this again uses little of the computing capability while the results are verifiable by direct inspection. Spread-sheet applications are similar when used for balance sheet preparation, although results are rather more laborious to verify, but need greater caution when more elaborate arithmetical operations are involved since they frequently use arithmetic subroutines which are not as robust as they should be and the order in which formulae are evaluated is not always apparent.
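A minimal sketch of the robustness point (data invented for illustration): two algebraically equivalent variance formulas, where the one-pass 'sum of squares' form of the kind often buried in canned subroutines loses essentially all its precision on large, closely spaced values:

    # Illustrative data: large values that differ only in the last few digits.
    data = [1e8 + x for x in (1.0, 2.0, 3.0)]
    n = len(data)
    mean = sum(data) / n

    two_pass = sum((x - mean) ** 2 for x in data) / n         # robust two-pass form
    one_pass = sum(x * x for x in data) / n - mean * mean     # naive one-pass form

    print(two_pass, one_pass)   # the naive result is wildly wrong (it can even go negative)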
5.3 Database applications usually involve sorting and searching algorithms which may be susceptible to the problems noted above, but the main problem is not computational: it is the reliability and current status of the data they contain. The most suspect applications are those which involve statistical inference and prediction (4), such as scheduling, business planning, stock market predictions, and linear programming optimization. A vast range of methods of dealing with such problems exists from the pre-computer age, and judging whether those incorporated in programs are appropriate and reliable for the data in hand is problematic even when they are known.
1. A schoolboy parody popular 60 years ago was - 'A train leaves Glasgow for London at 12 noon and travels at 70 mph. Another train leaves London for Glasgow at 12.30 pm and travels at 60 mph. What is the colour of the engine-driver's socks?'. As I recall the traditional answer was 'Brown, because his name was Green'.
2. There are many more ways of being wrong than of being right - breaking even is doing pretty well, being right more often than wrong verges on genius.
3. The situation may change due to the activities of a commercial lobby in both the US and EU attempting to obtain statutory force for their efforts to take control of users' machines via the Internet to protect their copyright and 'intellectual' property rights (which in many cases actually involve public domain material).
4. Many of these involve a group of algorithms which, for reasons too tedious to enumerate, are called NPC (Non-deterministic Polynomial Complete) problems. These are formally computable and essentially all the same problem, since they can all be reduced to one another, but are unsolvable in practice because the time required increases so rapidly with size as to be unrealistic. Simple algorithms exist which can solve them for a handful of elements, but the time required rises to the age of the universe for 50 or so elements. 'Heuristic' programs which guess a solution are frequently used and often give reasonable solutions which nevertheless are not the optimum. The concern with an optimum solution often stems more from a desire to estimate what advantage a new heuristic might give than from actually achieving the optimum.
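A minimal sketch of the growth in question (purely illustrative): the brute-force search space for a travelling-salesman-style problem grows factorially, which is why exact solution becomes hopeless at around 50 elements:

    import math

    # Sketch: number of orderings a brute-force search would have to examine.
    for n in (10, 20, 50):
        print(n, math.factorial(n))    # ~3.6e6, ~2.4e18, ~3e64 respectively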