The normal stages of a program's life cycle are:

- Initial conception, formulation of requirements.
- Overall design of a program that will meet these requirements.
- Detailed design and coding.
- Program review, with rework and extension as needed to clarify, simplify or improve efficiency.
- Development of a test plan, testing and debugging, removal of errors and retest.
- Operational use of program.
- Enhancement and repair during continuing operational use.
- Retirement.

This chapter discusses various key aspects of this program life cycle, providing hints that aim to help the inexperienced programmer cope effectively with the pragmatic problems normally associated with program design, debugging, and maintenance.

- Know that they will occur. Since any small error, e.g., forgetting a line, typing "-" where "+" is meant, misspelling an identifier or keyword, or incorrectly inserting parentheses into an expression, will cause a bug, you must train and discipline yourself to higher levels of logical and
typographical accuracy in programming than are required in any other
human activity. Be suspicious. Program defensively. Check your programs scrupulously for syntactic and logical correctness, several times if
necessary, before you try to run them. If in doubt as to the meaning of
any operation or programming language construct, look it up.
- When bugs occur, your problem is to locate, recognize, and remove
them. Bugs cannot be located unless you know the programming language with which you are working well enough to recognize problems
when you are looking at them. Bugs cannot be eliminated until you have
understood them well enough to know why and how they cause the
faults that betray their presence. Finding bugs, like finding needles in
a haystack, calls for systematic sifting, for careful detective work. A
program is a delicate piece of machinery, and it is simple folly to
think that you can make it work by kicking it hard in some random
way to make its pieces fall into place. Because they involve many
submechanisms, all of which must interface correctly if they are to work
together properly, programs, like elaborate combination locks, require
careful analysis and attentive sensing of their hidden internals when they
need repair. The novice who tries to fix a malfunctioning program
without fully understanding the way in which it is working is attempting
a task that is far less hopeful than that faced by someone who tries
to open an unfamiliar safe without understanding its workings or
combination. "The sequence 33-8-19-27 doesn't work? Then I'll try
23-92-69-46. This doesn't work either? Then maybe 17-51-85-34 will be
luckier." A student who allows himself to be drawn into this sort of
thoughtless, random attempt to repair a program will inevitably find
that his efforts drag on unsuccessfully, not only till the end of the
term or year, but until the end of the solar system, without revealing
anything. What is needed instead is a systematic, analytic approach.
- Though programs are almost never entirely bug-free, observance of the
rules of good programming style can reduce the density of bugs in your
initial program drafts and allow bugs to be found more quickly once
testing of your program begins. Finding the right approach to the
programming task that confronts you, the right style in which to start
writing the code that you need, is of prime importance. To find this "right
approach" requires careful consideration of the logical structure of your
programming task, with the aim of defining a collection of intuitively
transparent operations that work well together and can be used to
accomplish this task in as straightforward a way as possible. Code
should impress by its clarity, naturalness, and inevitability, all of which
make avoidance and exposure of bugs easier, rather than by obscure
trickery and impenetrable cleverness. Programs that achieve brevity
without sacrificing clarity are most desirable, since lines of code that you
never need to write will never contain bugs. Effective brevity is attained
by a correct choice of intermediate operations and by systematic use
of these operations to produce the program you require. SETL is in
itself a powerful programming language, but especially for larger, more
complex applications it may be well to program by first inventing a still
more powerful language specially adapted to your intended application.
Then your initial program draft can be written in this (possibly unimplemented) language, after which it can be transcribed into SETL to
make it executable. In this sort of approach, the primitives of your
invented language will become the procedures of your SETL
code. By using an auxiliary language in this way and by handling its
transcription into SETL in as mechanical a style as possible, valuable
protection against error is gained. See Sections 5.2 and 8.5 for a discussion
of related issues.
- Careful program documentation also serves to expose and eliminate
bugs. Good documentation will add an important degree of redundancy
to your program. Your code expresses your intent in one way and your
comments express the same intent in another. Discrepancies between
the two indicate the presence of bugs. Carefully thought-out comments
should be added to a program as soon as the code is written. Some
comments will in fact be written before the code to which they refer, in
order to guide composition of the code. Any additional comments
needed to make documentation complete should be added to the code
while it is still fresh; this creates an opportunity to review the code,
checking it for logical faults. After the whole text, code plus comments,
has been constructed and put into proper format, it should be left
to "cool" for a few hours or days, after which it should be reviewed
attentively and suspiciously. Such a "cooling-off period" will dispel some
of the initial misapprehensions which may have crept into a code and
thus will allow various systematic errors to be corrected.
- As has been said, brevity in coding is desirable, but this should be the
kind of brevity that flows naturally from an effective overall approach to
the programming task at hand, not the undesirable brevity which comes
from stinting redundancy (e.g., by using short, unmnemonic variable
names). Use the features of the SETL language vigorously and eliminate
clumsy circumlocutions where direct modes of expression exist, but
avoid obscure tricks even where these gain brevity.
- Certain constructions, for example those that perform elementary arithmetic computations to determine positions in strings and tuples (for which "off-by-one" errors can easily occur), are bug-prone and need to be approached with caution. For example, what is the length of a string *s*(*i*..*j*): is it *j* - *i*, *j* - *i* + 1, or *j* - *i* - 1? To ensure that *s*(*i*..*j*) is exactly *k* characters long, what value do we give *j*: *i* + *k* - 1, or *i* + *k* + 1? Learn to recognize these trouble spots, double-check them when preparing your code, and surround them with **assert** checks when you do use them. For example, if you write

**assert** #*s*(*i*..*j*) = *j* - *i*;

immediately before proceeding on this assumption, your error will be pinpointed immediately; if you omit this check, you may have to find your way back to this error from some obscure symptom. A related idea is to introduce, and use, a collection of small standard procedures to handle these touchy situations in ways that are more instructive. For example, by introducing the macros (see Chapter 8)

**procedure** len_from(i, j); **return** j - i + 1; **end**;
**procedure** take_len(s, i, k); **return** s(i..i + k - 1); **end**;

we can accurately extract a string of length *k* from *s* starting at character position *i* by writing

take_len(s, i, k)

and can evaluate the length of s(i..j) by writing

len_from(i, j)
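The same idea can be sketched in Python for concreteness. Note that this is an illustrative translation, not SETL: SETL slices *s*(*i*..*j*) are 1-based and inclusive at both ends, so the hypothetical helpers below adopt that convention and hide the "+1" arithmetic in one audited place.

```python
# Python analogues of the SETL helpers len_from and take_len.
# Convention (as in SETL): positions are 1-based, slices inclusive.

def len_from(i, j):
    # Length of the 1-based inclusive slice s(i..j).
    return j - i + 1

def take_len(s, i, k):
    # Extract exactly k characters from s, starting at 1-based position i.
    assert 1 <= i and i + k - 1 <= len(s), "slice bounds out of range"
    return s[i - 1 : i - 1 + k]

s = "abcdefgh"
assert len_from(3, 5) == 3            # s(3..5) is "cde": three characters
assert take_len(s, 3, 5) == "cdefg"   # five characters starting at position 3
```

Once the off-by-one arithmetic is confined to these two small routines, only they need to be checked character-by-character; every caller is written in terms of lengths and start positions, where mistakes are much harder to make.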

- As Donald Knuth has remarked, premature optimization is the root
of much evil in programming. Compulsive (and ultimately ineffective)
attempts to gain minor efficiency advantages often complicate programs
and introduce bugs into them. As you compose a program, remember
that substantial efficiency advantages will be gained globally by choice of
effective algorithms, not locally by compounding of minor advantages.
- Your program test plan should begin to be developed as your program is being written, and a substantial portion of the collection of test- and debug-oriented **print** and **assert** statements that you will use to test your program should be composed and entered as soon as the first draft of the program begins to approach completion. Early attention to your test plan will serve to pinpoint complex program sections that require careful testing. These are also the sections whose logic needs to be inspected most closely before testing begins. See Sections 6.4 and 6.6 for additional discussion of this point.

A good first step, but one that should not be allowed to hold you up too
long, is to look closely at whatever fragments of correct output have been
produced. If little or no output is correct, then your program may have failed
before even the first **print** statement was executed. This hint may help you narrow the bug hunt. On the other hand, if some output is correct, then the
program was probably functioning correctly till some point past the statement
which produced the last correct output. Find the point in your program at
which this output was produced, and see what comes before and after it. Again,
this may narrow the hunt. Examine the erroneous output carefully and try to
see whether its logical pattern reminds you of any particular section of your
program. This also can sometimes yield useful hints concerning the likely
location of the bug, especially if different parts of your output data are
produced by recognizably different sections of code. If certain items of output
that you expect are missing, try to see what evidence there is that all the code that you expected to execute did actually execute: remember that unanticipated data may have caused your program to follow an unexpected path, so that it may have bypassed, or may never have reached, the code sections which were supposed to have produced the output which you are surprised not to see. Analysis of evidence of this general kind will in favorable cases point the finger of suspicion at certain narrow program sections. However, in less favorable cases, the available evidence will be ambiguous. In this case, you will need to generate more extensive traces and dumps. This can be done in one of two ways:

- (a) By inserting additional **print** statements into your program, to make it print out something of a "motion picture" of what has happened.
- (b) By inserting various other checks, especially **assert** statements, which check assumptions on which your program depends, but which you suspect might be failing.

Section 6.6 will have more to say about technique (b), which is related
to the general issue of formal program verification. The following more
pragmatic hints will help you to apply this technique effectively. It is particularly important to place **assert** statements in sections of code known to involve delicate constructions, especially if (as in the case of the "off by one" bugs considered in the last section) the necessary checks are simple. Since the correct functioning of a program often hinges upon the assumption that key variables will change in a consistent way as iterative execution proceeds (for example, always increasing or always decreasing) it can be useful to save the last previous value of each significant variable var and to write checks which compare the last previous value of var with its current value. This can be done by introducing an auxiliary variable *last_var* for each *var* and writing an assignment

last_var := var;

whenever it is desirable to save the last value of *var*. Then checks like

**assert** var > last_var;
**assert** var = last_var + 1;

etc., will all prove useful.
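A minimal Python sketch of this saved-previous-value technique (the routine and its monotonicity property are hypothetical examples, not from the text):

```python
# Before each update of a key variable, remember its old value, then
# assert the expected relationship between old and new. Here the
# invariant is that a running minimum can never increase.

def running_minimum(values):
    best = float("inf")
    for v in values:
        last_best = best          # save the previous value of best
        best = min(best, v)
        assert best <= last_best  # best must never increase
    return best

assert running_minimum([5, 3, 8, 1, 4]) == 1
```

If a later edit accidentally broke the update (say, by assigning `max` instead of `min`), the assertion would fail on the first iteration that violated the invariant, pinpointing the fault instead of letting a wrong value propagate.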

Ultimately, however, the problem with a purely assertion-based debugging technique is that it is not easy to formulate the necessary checks comprehensively enough to make it unlikely that a bug (which probably relates to something that has been overlooked) can slip through.

Hence one must often fall back on method (a), which generates additional raw evidence for inspection. The problem in using this method is to avoid burying yourself in too voluminous a trace of the thousands or millions of events that take place as a program executes. To avoid this danger a carefully planned sequence of probes is necessary. A good idea is to resurvey your program, mentally list its main phases, and determine all the data objects which each phase passes to the next phase. If your program has been well designed, i.e., has been built up from modules interacting with each other in well-structured fashion, there should not be too many of these objects, and then it is reasonable to print them out for inspection. Before inspecting this information, review the logic of your program, and make sure you know just what features you expect to find in values of the variables that you have printed. Try to be aware of every feature on which any part of your program depends. Then check the actual data. If the data printed at the end of a phase looks correct in every detail, then this phase is probably correct. If something strange-looking appears in the data produced by a given phase, although the data supplied to this phase look correct, then there is probably something wrong with the code of this phase.
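The phase-by-phase inspection described above can be sketched as follows; the three-phase pipeline and the `trace` helper are hypothetical, invented only to show the shape of the technique:

```python
# Phase-boundary tracing: each phase of a small pipeline prints the
# objects it passes to the next phase, so the first phase whose output
# looks wrong can be identified by inspection.

def trace(phase_name, data):
    print(f"--- after {phase_name}: {data!r}")
    return data

def parse(text):
    return [int(tok) for tok in text.split()]

def transform(numbers):
    return [n * n for n in numbers]

def summarize(numbers):
    return sum(numbers)

raw = "1 2 3"
numbers = trace("parse", parse(raw))
squares = trace("transform", transform(numbers))
total = trace("summarize", summarize(squares))
assert total == 14
```

If the printed output after `parse` is correct but the output after `transform` is not, the bug almost certainly lies in `transform`; the trace narrows the hunt to one phase without flooding you with a statement-level log.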

When this stage of debugging is reached, you will at least have determined which of the several phases of your code contains the error for which you are hunting. At this point, it is a good idea to think over all the evidence that you have examined, and see if any compelling picture of the problem seems to suggest itself. Sometimes the fact that the offending phase has now been located removes enough confusion for the difficulty to be guessed quickly. If not, you will have to carry your tracing to a more detailed level. This is a matter of inserting print and assert statements more densely into the offending phase, in order to locate the particular subphase that contains the error. As before, this is the subphase to which good data are being supplied, but which is seen to pass bad data along to its successor.

- Once the bug location has been pinned down to a program section roughly
a dozen lines long, review the logic of these lines. Read them very closely,
looking for some misunderstanding which could have produced the
anomalous data which you know that this section has generated. Try
again to correlate data features with the operations responsible for producing these features. If this doesn't work, take the data supplied to the
erroneous subphase, and try to trace the way that the subphase will act
on this data, by hand, step by step, until you spot some error.
- In most cases, these steps will find the bug without too irritating an
expenditure of effort. However, in the stubbornest, fortunately rare, cases,
the problem for which you are hunting may still elude clear identification.
In these particularly resistant cases one of three causes may be at fault.
- If the algorithm which you are using is complex, you may have
misunderstood its logic. It may be that no single line of your code
is wrong: rather, its overall pattern may be subtly wrong, causing it
to produce the output you see, rather than the results you wrongly
expected it to generate. Global logic errors of this sort are often
quite confusing. If you come to suspect that a problem of this sort
has occurred, you should reason once more through the structure
of your program, trying to convince yourself by careful analysis that
it is logically sound. Section 6.6 describes the formal rules that
underlie reasoning of this sort.
- There may be nothing wrong: you may simply have misunderstood
what output your program was supposed to produce. Or you may
have been looking at the wrong phase of a program which really
does contain a bug, because you thought that the output of this
phase showed some error, while in reality the bug was elsewhere.
Or you may not have been running the program you thought you
were running, or the version of the program you thought you
were running, or your program may have been reading input data
different from what you assumed. In such cases, take a few minutes
to cool off, review the whole situation, including the logic of your
program, once more, and start over.
- Your problem may be caused by a true "system bug," that is, an
error, not in your program, but in one of the layers of the software
including the SETL compiler, run-time library, or operating system
under which you are running. Concerning bugs of this kind we can
say the following:
(iii.a) Don't be too quick to suspect them. Though such problems do crop up from time to time, they are much rarer than errors in your newly written programs. Remember that dozens of people are using the same software systems that you are, and that if the problem afflicting you were a system-level problem, it would affect all of these people. Before you become willing to blame your problem on anything other than a fault in your own program, you should always have examined your program with great care, located a section just a few lines long which you can be sure is receiving correct input (because you have printed and inspected its input) and producing bad output (again, you must have printed and inspected this output). Finally, meticulous examination of these few lines, with review of the definition of all the operations these lines involve, of the parenthesization of those lines, and of any applicable rules of operator precedence, must give you "courtroom" evidence that the system is not performing according to its specifications.

At this point you are almost (but still not quite) in a position to report a system problem to the expert in charge of maintaining your copy of the SETL system (or of the operating system within which the SETL system runs). Before doing so, however, you should try to simplify the evidence still further, isolating the malfunctioning lines into a malfunctioning program just a few lines long, and then paring this program down still further if possible, ideally to the point at which it contains just three lines: an assignment initialising a very few variables, a single line which obviously does not function as it should, and a print statement which confirms the fact that this line has failed to act in the manner demanded by the rules of SETL.
If the system problem which you think lies at the root of your troubles disappears somewhere during this sequence of steps, the cause of your difficulties may not be a system problem at all, but an error or misunderstanding on your part, which your attempts to locate the suspected systems problem may have clarified. In this case, chastened, you should return to your original program, fix the error in it, and continue your debugging. If, however, you do succeed in creating a very short program which gives unmistakable evidence of system malfunction, you should transmit a complete, clean copy of this program to a systems expert. This should be accompanied by a clear explanation of the problem you have pinned down. He will then take steps to fix the SETL system, or to have it fixed.

Note that problems in the SETL system, like problems in your own programs, are most likely to concern marginal, rarely exercised cases. Though the system has been in use for a few years and has been tested fairly extensively, exhaustive testing of so complex a system is simply not possible. (See Section 6.4 for a discussion of some of the issues involved in attempts to test programs comprehensively.)

There are a few cases in which it is reasonable to jump a little more rapidly to the conclusion that a system bug is affecting you. One is the case in which two runs of absolutely identical programs and data yield different results. Another is the case in which insertion into your program of a statement which is harmless by definition changes the behavior of the program significantly. For example, if insertion of a

**print** changes your program's flow of control, something is obviously amiss at the system level. This may be evidence that can be reported to an expert immediately (but see the caution extended in (f)).

- It should be clear from what has been said that one of the very first things that you will want to trace when you start to analyze a malfunctioning
program is the input data it is reading. Always echo this data by printing
it out immediately after it is read. Your input data may not be what you
think it is, or you may be reading it incorrectly.
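The echo rule can be sketched in a few lines of Python (the record-reading routine is a hypothetical example):

```python
# Echo each input record immediately after it is read, before any other
# processing touches it, so that what the program actually received is
# always part of the debugging evidence.
import io

def read_records(stream):
    records = []
    for line in stream:
        line = line.rstrip("\n")
        print(f"INPUT ECHO: {line!r}")   # echo before processing
        records.append(line)
    return records

data = io.StringIO("alpha\nbeta\n")
assert read_records(data) == ["alpha", "beta"]
```

When the echoed lines differ from what you believed the input file contained, the "bug" is located before a single line of your program logic has been examined.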
- Especially if a difficult bug is being pursued, debugging as an activity tends to create an atmosphere of confusion, which grows like a thundercloud as
the mind struggles to free itself from the misapprehension which first
allowed the bug to slip in. Particularly difficult bugs sometimes make one
feel that one is going insane, since the laws of logic seem to be breaking
down. To combat this perilous confusion, you must maintain a very
deliberate, step-at-a-time, and above all skeptical, attitude while you are
debugging. Verify the situation at every turn; look at what really is in
your source text rather than trying to remember what was there; print out
a record of what your program really is doing rather than guess what is
going on. Inexperienced student programmers often come to advisors with
old versions of programs they are trying to debug, claiming that "I ran
this program on Tuesday, and I made two or three changes that I am sure
are harmless, and now it doesn't work." A more experienced programmer,
who knows that the only valid evidence to work from is a current, single,
untorn listing showing program and output unmistakably together, will
only laugh at this.
To reduce the level of your own confusion, it is sometimes helpful to work over your problem with a friend, trying to explain what is going on, and reviewing salient parts of the logic of your program with him, till he begins to understand it. A more expert consultant will often be able to spot the trouble that you have missed, but even if your consultant is less expert than you yourself, you will often find that the very act of explaining the problem lets you spot what is wrong.

- Even when a program has once begun to function (and often even when
it has been used successfully and intensively over a considerable period),
it may still contain bugs, which can lurk within sections of code which are
rarely, perhaps almost never, exercised. For this reason, code inserted for
debugging should generally not be removed once the bug is found. Don't
throw away your crutches: it may become necessary to debug the same
program again! Instead of removing debug code, you can
*comment it out* by inserting a '--' sign at the start of each line inserted for debugging. (Only inserted lines that never generated any evidence useful for debugging should be wholly removed.) Another technique, particularly useful during extended development and debugging of large programs, is to make the most valuable debug prints and checks *conditional*, by including them in **if** statements containing conditions which are normally **false** but can be turned on by supplying program initialization parameters. (See Chapter 9 for a discussion of program parameters.) If this is done, it becomes possible to examine the inner working of a malfunctioning program quickly, without having to recompile it all.
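In Python the conditional-debug-code technique might look like this; the `--debug` flag and the `average` routine are hypothetical illustrations:

```python
# Debug prints stay in the program permanently but are gated by a flag
# that can be switched on from the command line, so no editing or
# recompilation is needed to re-enable them later.
import sys

DEBUG = "--debug" in sys.argv

def debug_print(*args):
    if DEBUG:
        print("DEBUG:", *args)

def average(values):
    debug_print("averaging", values)
    total = sum(values)
    debug_print("total =", total)
    return total / len(values)

assert average([2, 4, 6]) == 4.0
```

Run normally, the program is silent; run with `--debug`, it produces its "motion picture" again, which is exactly the property the text recommends for long-lived programs.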


| Bug | Likely SETL symptom |
| --- | --- |
| Variable not given any initial value | "Illegal data-type" error |
| Incorrect termination of a loop (e.g., count off by 1) | Missing items in data collections, sums or sets too small if loop terminates too soon; "illegal data-type" error if loop terminates too late |
| Incorrect limits in string and tuple slices (e.g., count off by 1) | (Similar to incorrect loop termination) |
| Incorrectly structured while loop conditions or bodies, or incorrect initial conditions in while loops | Program does not terminate |
| Incorrect treatment of initial cases in recursions, or bad procedure calls | Program does not terminate, possible memory overflow |
| Omission of exit or continue statement | Program "runs on" into code not intended for execution |
| Omitted or improperly inserted data-value update | Failure of relationships expected subsequently; unexpected values or jumps. Effects can be subtle |
| Shared global variable unexpectedly modified by invoked procedure or function | Effects can be very subtle, and particularly hard to find. Beware of global variables! |
| Misspelled variables, e.g., AO for A0, Bl for B1, c1 for ci | "Illegal data-type" error, possibly no output |
| Reading unexpected data | Can cause a wide variety of effects |
| Unanticipated characters encountered by the string-scan operations | Program does not terminate |
| Failure to set a program switch | Effects can be subtle |
| Parameters out of order in procedure call | "Illegal data-type" error (generally easy to find) |
| Variable inadvertently modified by assignment to a variable intended to be different but having the same name | If no data-type error is caused, effects can be subtle |
| Variables out of order in read statement | "Illegal data-type" error (generally easy to find) |
| Read operations of program inconsistent with data actually present in input file | "Illegal data-type" error (generally easy to find) |
| Target of an assignment statement misspelled | Effects can be very subtle |
| Not resetting a counter | Effects can be very subtle (see "incorrect loop termination") |
| Complex, incorrect combination of Boolean conditions | Effects can be very subtle |
| Too few or too many parentheses in expressions; misunderstanding of precedence rules | Effects can be very subtle |

*Debugging* is the process of searching for the exact location of a program error when you know that some error is definitely there. *Testing* is the systematic exercise of a program which you believe might be correct, in an effort to see whether bugs are really absent. If testing shows a bug, debugging starts again. If your tests are not systematically designed, then bugs may go undetected and remain in the program. All one knows about a poorly tested program is that it works in the few cases for which it has been tried; it may fail in many others.

Test design is as important a part of program development as the choice of algorithms and data structures. Development of a test plan should begin while a program is being written. A procedure which is hard to test is apt to be bug-prone, and should be simplified if possible. By keeping testability in mind, you will avoid unnecessarily complex constructions, and produce cleaner, sounder code.

Testing falls into three distinguishable phases, which we will call *first-stage testing*, *second-stage* or *quality assurance testing*, and *maintenance* or *regression testing*. First-stage testing begins as soon as a program is complete enough for execution to be possible. Its guiding hypothesis is that bugs are present in sufficient numbers to prevent much of the program from working at all. During first-stage testing, one aims to make the main facilities of the program being debugged operable by finding and removing bugs quickly. Quality assurance testing begins when first stage testing ends. It assumes that a few obscure bugs remain in the program to be tested and aims to test systematically enough to smoke them out. Maintenance testing aims to ensure that new bugs are not introduced into old programs during their extension and repair.

Perhaps because realism might lead to suicide, programmers are generally overoptimistic concerning the likelihood that a program that they have just written will work right away. Careful preparation of a first-stage test plan serves to counteract this common illusion; the more realistic attitude thereby engendered encourages more careful initial program inspection, and this often reduces the number of bugs present when first-stage testing begins. This is why programs developed cautiously often become operational quickly, whereas programs developed in too optimistic a frame of mind often begin to work only after frustrating and totally unexpected delays.

An effective way of organising tests is to group them into a single procedure
called *test_prog*, whose one parameter *s* is a string consisting of test names separated by asterisks. The procedure *test_prog* can then have the following structure:

    procedure test_prog(s);  -- skeleton of test procedure
        while s /= "" loop
            span(s, "*");
            tn := break(s, "*");
            if tn = "" then exit; end if;
            case tn of
                (put sequence of named tests here)
            else
            end case;
        end loop;
    end test_prog;

To trigger a sequence of tests named test_1, test_2, etc., one has only to
write something like test_prog("test_1*test_2*..."). Later, when first-stage testing is complete, this call, and *test_prog*, can be left in the program *P* that has been tested, but the argument of the test_prog call can be changed so that it reads test_prog(getspp("TESTS=/")) (see Chapter 9 for an account of the *getspp* library procedure). If this is done, no tests will be executed unless *P* is invoked with a command-line parameter of the form TESTS=test1*test2*...*testn, in which case the named tests will be performed. This approach makes it easy to retest a program in which unexpected trouble has developed. Of course, the test facilities available should be carefully documented at the start of the *test_prog* procedure.

Especially when a long program *P* is being developed, it may be desirable
to begin testing before all parts of P have been coded (or even designed) in
detail. Of course, such an approach will be reasonable only if *P* has a sound, highly modular overall design, and only if the missing sections of P have been designed in enough detail so that you can be sure that no inconsistency will develop when they are designed and coded in detail. This mode of organizing development and testing is sometimes called *top-down* testing. It has the advantage of allowing testing and development to proceed in parallel. A related advantage is that it can provide particularly early confirmation of overall design soundness, or, if a design proves to be unsound (say, in terms of human factors, i.e., usability), it can give early warning of trouble.

If a top-down approach to development and testing is taken, it will be found useful to provide a standard, multiparameter library routine having the name MISSING_SECTION. Then parts of your program that have not yet been coded can be replaced by invocations

    MISSING_SECTION(name_of_missing_section);

where the string-valued parameter *name_of_missing_section* should assign the missing section a name that can be printed. The MISSING_SECTION procedure should also allow optional additional parameters, so that it can be invoked by

    MISSING_SECTION(name_of_missing_section, p1, p2, ...);

where *p*1, *p*2,... name various parameters with which the missing section would have to deal or which might explain why it was (perhaps unexpectedly) invoked.
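A MISSING_SECTION-style stub is easy to sketch in Python; the `process_order` routine around it is a hypothetical example of an incompletely coded program:

```python
# A placeholder routine that announces which unwritten section was
# reached, and with what parameters, instead of silently doing nothing.

def missing_section(name, *params):
    print(f"MISSING SECTION reached: {name}")
    for p in params:
        print(f"  parameter: {p!r}")

def process_order(order):
    # The discount computation has not been written yet; record the
    # fact (and the order that triggered it) whenever it is reached.
    missing_section("compute_discount", order)
    return order

assert process_order({"id": 17}) == {"id": 17}
```

Because every unfinished branch funnels through one routine, a single run tells you which missing sections your tests actually reached, and with what data, which is precisely the early feedback that top-down testing aims for.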

*Second-stage* (or *quality assurance*) testing should aim to exercise a program comprehensively, in at least the following senses:

- It is obvious that parts of your program that have never been executed during debugging may well contain unrecognized errors. The battery of tests you develop should therefore force every line of your program to be executed at least once.

- If your program branches on a Boolean condition, then you will want to supply at least one test case in which the condition evaluates to **true**, and another in which it evaluates to **false**.

- Improper treatment of extreme values is a common cause of program failure. A program may work for nonnull sets, tuples, or strings, but not for the corresponding null cases; for positive integers n but not if n = 0; for integers less than the length of some string, but not for integers equal to this length; etc. It may work when a **while** or a **for** loop which it contains is entered, but fail if the loop is bypassed entirely. In preparing a comprehensive collection of tests, you will therefore need to survey your program systematically, listing marginal situations of this kind as exhaustively as you can; then you should prepare at least one test that will force each logically possible situation to occur.

- Once a list of all procedures, loops, branches, code sections, and marginal cases to be tested has been collected and a comprehensive set of tests has been developed, it may be worth preparing a formal test coverage matrix which shows which tests exercise each program feature. A chart of this kind makes it easier to spot cases that have never been tested. It can also help to select tests to be run during regression testing (see the following discussion) and can help to pinpoint program sections to be examined when a test fails. Such a chart will also make it easier to avoid running too many tests, all of which exercise the same limited group of program features but never use others. Note that, if regarded as a kind of test, "production" use of a program is subject to this objection; i.e., daily use of a program often exercises only a limited subset of its features. This is why programs that have been in heavy use for extended periods sometimes fail when their usage pattern changes significantly.
- Compilers sometimes include features which make it easier to determine the coverage provided by a family of tests. For example, it may be possible to generate a listing of all program statements executed during a sequence of runs, of all branches taken, of all procedures invoked, etc. The SETL profiling facility is useful for this. You will want to familiarize yourself with this facility, which can help assure that the test sets you develop for your programs are adequate.
- Once developed, test sets become an important adjunct to the programs that they test. Such test sets should therefore be organized in a manner which facilitates their long-term maintenance and reuse. The tests that are available, and the coverage they provide, should be adequately documented.
- Programmers often find it hard to bring sufficient enthusiasm to the task of systematically rooting obscure bugs out of code that they themselves have written. In part, this is a matter of optimism; in part, a result of the mental fatigue which tends to set in at the end of a lengthy code development; in part, a consequence of the difficulty of overcoming the very mind-set which introduced an error in the first place. For all these reasons, it is good practice to make testing of large programs the responsibility of a quality assurance group independent of the development group that produced these programs. If this is done, then, knowing that an independent group of programmers will probe their work systematically to find shortcomings, the original development group will be encouraged to simplify their product so as to improve its reliability.

Even where resources do not permit fully independent organization of the activity of program testing, it is well to ensure that every line of a complex program is read and understood by at least two programmers, each of whom will be able to spot problems and suggest tests that the other might have overlooked.

If a chart has been prepared showing the program features exercised by each test (see the discussion of test coverage matrices above), it can be used when a test fails to suggest what part of the program should be examined first to find the cause of failure. If one of these program sections has just been changed, it will of course come under immediate suspicion.
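A test coverage matrix of this kind is easy to sketch in Python as a mapping from test names to the program features each exercises (all names below are invented for illustration):

```python
# test name -> set of program features it exercises
coverage = {
    "test_null_input": {"parse", "empty_case"},
    "test_typical":    {"parse", "eval", "output"},
    "test_long_input": {"parse", "eval", "output"},
}

all_features = {"parse", "eval", "output", "empty_case", "error_exit"}

# Features no test exercises: candidates for new tests.
untested = all_features - set().union(*coverage.values())

# Tests worth rerunning when a given feature (say "eval") changes, and,
# conversely, the tests whose failure points suspicion at that feature.
suspects = [t for t, feats in coverage.items() if "eval" in feats]
```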

- The estimation of program efficiency.
- The study of means to improve program efficiency by refinement and transformation of a correct but inefficient working program.

Two kinds of machine resource are relevant to the study of program efficiency: execution time and data storage required. In what follows, we will concentrate our attention on the first of these, because in many cases it is time which is critical. Data storage requirements were a pressing concern in the early days of computing, because memory was an extremely expensive hardware component and computers were equipped with only a few thousand words of it. As a result, early programming languages such as FORTRAN included various complex mechanisms for managing the limited memory resources available, which forced the programmer to concern himself with low-level details which were extraneous to the actual application to be programmed. Memory has in the meantime become a much cheaper commodity, and concern with its optimal utilization can in many cases (but by no means all) be left to the machine itself (in the form of some facility for dynamic storage allocation). This is what we will do in most of what follows.

for i in [1..1000] loop
    for j in [1..2000] loop
        for k in [1..3000] loop
            for l in [1..4000] loop
                sum +:= (2 * i**3 + j**3 + k**2);        (1)
            end loop;
        end loop;
    end loop;
end loop;

In this code, for each successive value of i, the variable j iterates over 2000 different values; for each value of i and j, the variable k iterates over 3000 values; and for each value of i, j, and k, the variable l iterates over 4000 values. Thus, all in all, the innermost statement of the code fragment (1) will be executed 1000 x 2000 x 3000 x 4000 times, i.e., 24 trillion times. This statement involves 6 multiplications and 4 additions, so that roughly 240 trillion elementary arithmetic operations are required to execute the code (1). Even on a computer capable of executing 400 million arithmetic operations per second (a fairly typical performance figure nowadays), and even if the code (1) were written in a programming language capable of exploiting this raw arithmetic capability to the utmost, 600,000 seconds (about a week) would be needed to execute the code (1). Moreover, since SETL (which pays a price in efficiency for its very high level) is about 30 times less efficient than this, execution of the SETL code (1) would require more than half a year of computer time. This makes it quite clear that in writing SETL programs one needs some way of estimating the computational resource which will be consumed to execute the code that one sets down.
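The arithmetic behind these figures can be checked directly (assuming roughly 10 elementary operations per execution of the inner statement and a machine performing 400 million operations per second):

```python
iterations = 1000 * 2000 * 3000 * 4000   # executions of the innermost statement
ops = 10 * iterations                    # ~10 elementary operations each
seconds = ops // (400 * 10**6)           # at 400 million operations per second
days = seconds / 86400                   # roughly a week
```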

Instructions whose operands are simple data types (integers, floating-point numbers, Booleans) all take roughly one cycle to execute. This includes:

- Arithmetic operations on integers and floating-point numbers
- Boolean operations
- Comparison operations (=, /=, >, <, etc.)

At the level of the actual machine, there may be secondary differences among some of these; for example, a floating-point division is typically slower than an integer addition. In this discussion we can safely ignore these differences, and we shall do so in what follows.

The simple assignment statement

v := expr;

where v is a simple variable name, also takes roughly one cycle to execute (of course, only after the value of expr has been obtained).

Cardinality: #S takes one cycle to evaluate, regardless of the actual number of elements of S.

Membership: (x in S), where x is a simple item (numeric or Boolean), typically takes three to six cycles to evaluate, and this time increases only slowly with the size of the set S.

There is nothing obvious about these remarks! You might wonder how the membership test can actually be performed without examining all the elements of S. The answer lies in clever choice of the internal data structures used by the SETL run-time system itself. These structures are discussed in Chapter 10, and we will not say more about them now, except to note that the possibility of using sets as freely as we do in SETL depends in part on the constant execution time required by the membership operation. Note that this constant-time behavior applies when x is a simple (nonstructured) value. In general, the membership test x **in** S takes as long as (or a little longer than) an equality test x = y. The relevance of this remark will become apparent in the following section.

Insertion and deletion: For simple objects the operations S **with** x and S **less** x take a nearly constant number of cycles to execute. Exceptions relate to questions of storage allocation and will be discussed later. The actual number of cycles needed for insertion of a simple value into a set of moderate size is about 10-15.

Map retrieval: To evaluate or to modify f(x), where f is a map and x is a simple value, takes nearly constant time, comparable to the time required for a set membership operation. Once again, efficient implementation of this fundamental operation is secured by the use of a sophisticated data structure about which we will say more in Chapter 10.

- Initialize by making a copy of S1. This can be expected to take a number of steps proportional to #S1.
- Iterate over S2, and check each element of S2 for membership in S1. Whenever an element is found which is not in S1, insert it into the union set.

This process is described by the following equivalent SETL fragment:

S3 := S1;                      -- Initialize result.
for x in S2 loop               -- Iterate over second set.
    if x notin S1 then
        S3 with:= x;           (2)
    end if;
end loop;

After code like this has been executed internally, S3 will clearly be equal to (S1 + S2). To estimate the execution cost of the loop in (2), we can reason as follows: the loop body is executed (#S2) times, i.e., once for each element of S2. For each such element we perform a membership test and (possibly) a set insertion, both of which take constant time. Therefore the loop takes an amount of time proportional to #S2. The initialization of S3 amounts to making a copy of S1, which takes a time proportional to the size of S1. Putting all of this together, and neglecting small differences between the execution time of membership and insertion operations, we can conclude that the cost of a union operation is proportional to the sum of the cardinalities of the sets involved. This is not an exact statement, but it is sufficiently accurate to convey an idea of the order of magnitude of the execution time of this set operation.
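The same union strategy can be transcribed into Python, whose hashed sets supply the near-constant-time membership test and insertion the argument relies on (a sketch, not the SETL run-time's actual code):

```python
def union(s1, s2):
    s3 = set(s1)          # initialize: a copy of s1, ~#s1 steps
    for x in s2:          # one iteration per element of s2
        if x not in s3:   # near-constant-time membership test
            s3.add(x)     # near-constant-time insertion
    return s3             # total work ~ #s1 + #s2
```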

We can apply a similar analysis to the set intersection operation (S1*S2). This is evaluated by the SETL run-time system in the following fashion:

- Initialize the result to the empty set.
- Determine which one of the two sets has the smaller cardinality (for definiteness, let us say it is S1). Then iterate over S1, testing each of its elements for membership in S2. Insert each element of S1 which is also a member of S2 into the intersection set being built up. This process corresponds to the following SETL fragment:

S3 := { };
for x in S1 loop
    if x in S2 then
        S3 with:= x;           (3)
    end if;
end loop;

Using much the same reasoning, we can easily see that the execution time for this fragment is proportional to the cardinality of S1, i.e., to the size of the smaller of the two sets entering into the intersection operation.
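Transcribed the same way, the intersection strategy makes the cardinality argument concrete (again a sketch of the algorithm described, not the actual run-time code):

```python
def intersection(s1, s2):
    small, large = (s1, s2) if len(s1) <= len(s2) else (s2, s1)
    s3 = set()            # initialize the result to the empty set
    for x in small:       # iterate over the smaller set only
        if x in large:    # near-constant-time membership test
            s3.add(x)
    return s3             # total work ~ min(#s1, #s2)
```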

Set difference, set inclusion, and subset testing operations are performed in similar fashion and can therefore be expected to take an amount of time proportional to the cardinality of one of their arguments.

Other SETL primitives involve similar (implicit) iterations over sets. The
most obvious of these is the **print** statement: **print**(S) requires the examination and output of each element of S. Set equality and inequality are more complex examples. To test whether S1 and S2 are equal, the SETL run-time system performs the following sequence of actions:

- Compare the cardinalities of S1 and S2. If these cardinalities differ, the two sets are distinct.
- If the two sets have the same cardinality, then we must examine each element of one set, and check it for membership in the other set. If an element is found which does not belong in both, we know that the two sets are not equal. If the two sets are equal, each element will have to be examined, and the process of establishing equality will take a time proportional to the common size of S1 and S2. The internal equality testing process can be described by the following SETL procedure:

procedure equal(S1, S2);
    if #S2 /= #S1 then
        return false;
    else
        for x in S1 loop
            if x notin S2 then
                return false;
            end if;
        end loop;
        return true;           -- All elements are common.
    end if;
end equal;

We mentioned in the previous section that the membership operation must often perform an equality test involving the object being tested for membership. This means that if S1 is a set of sets, and su is a set, then the operation

su **in** S1        (4)

will take a time proportional to the size of su, because it will involve testing su and one or more element(s) of S1 for equality. It should be clear that when sets of sets of sets of ... are involved on both sides, the membership test operation will become more complex, and the size of the subobjects of S1 will have to be taken carefully into consideration.

On the other hand, membership testing, i.e., the operation (x **in** T) where T is a tuple or string, proceeds by examining all elements of T in succession, in the manner suggested by the following equivalent SETL procedure:

procedure intuple(x, T);
    for y = T(i) loop
        if x = y then
            return true;       (5)
        end if;
    end loop;
    return false;              -- x was not found in T.
end intuple;

Given the way that tuples and strings are represented internally, there is no more efficient way to tell a priori where x is found in T, if indeed it is found in T at all. If x is not in T, this will only be ascertained after all elements of T have been examined, and if x is in T, it may be the first, second, ..., n-th component. We can say that on the average, half of the elements of T will have to be examined before x is found. This means that a membership test applied to a tuple T can be expected to take a number of cycles proportional to #T. (Compare this with the more advantageous constant-time behavior of membership tests on sets.)
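A Python version of the same linear scan, returning the number of components examined so that the #T-proportional cost is visible (illustrative, not the run-time's code):

```python
def intuple(x, t):
    """Left-to-right membership scan; return (found, components examined)."""
    for i, y in enumerate(t, start=1):
        if x == y:
            return True, i       # found after examining i components
    return False, len(t)         # a miss examines every component
```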

If x is itself a structured object, and T is a tuple of such structures, then the equality test in the loop (5) can be expected to be proportional to the size of x. Therefore in this case the membership test (x in T) will take time proportional to (#x * #T).

Tuple and string concatenation are analogous to set union: they take a time proportional to the sum of the sizes of the operands, i.e., to the size of the result. Testing tuples and strings for equality is also linear in the size of the operands, as it requires that each component of each object (in the worst case) be examined.

The execution times of the basic SETL operations are summarized in Table 6.1. We can estimate the cost of any straight-line program simply by adding the costs of each instruction in it. However, other issues, discussed in the next section, arise if we go on to examine the more interesting problem of estimating the cost of programs with various loop structures.

Table 6.1

| Operation | Example | Time units |
| --- | --- | --- |
| Arithmetic | i + j, i * j | 1 |
| Assignment | x := y | 1 |
| Indexing | T(i) | 1 |
| Map retrieval | M(x) | 5 |
| Set membership | x in S | 5 |
| Tuple membership | x in T | #T |
| Set union | S1 + S2 | #S1 + #S2 |
| Concatenation | T1 + T2 | #T1 + #T2 |
| Quantifier | exists x in S \| C(x) | #S * (1 + cost(C)) |

A **for** loop that iterates over a set, as in

for x in S loop ... end loop;

or over a map, tuple, or string, as in

for y = f(x) loop ... end loop;

produces set elements (or map values, tuple components, or string characters) at a rate of one per cycle. Essentially the same remark applies to numerical iterators, like those in (1). Hence to estimate the time required to execute a loop, we have to multiply the number of times the loop will be executed by the (average) time that it will take to execute the body of the loop. An obvious generalization of this rule applies to embedded loops: if one for loop is embedded within another, then the time required to execute the outer loop is the number of times it will be executed, multiplied by the (average) number of times the embedded loop will be executed, multiplied by the time required to execute the body of the embedded loop. For example, the double loop

for i in [1..1000] loop
    for j in [1..i] loop
        ...
    end loop;
end loop;

will execute in a time roughly equal to 1000 * 500 multiplied by the amount of time required to execute the loop body, since the embedded loop over j will execute an average of 500 times for each successive value of i. (This number 500 is halfway between the number (1) of times that j changes when i = 1 and the number (1000) of times that j changes when i = 1000.)
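The 1000 * 500 figure can be confirmed by counting the inner-body executions directly, assuming the inner loop runs j over [1..i]:

```python
# One inner-body execution for each pair (i, j) with 1 <= j <= i <= 1000:
total = sum(range(1, 1001))
# sum(1..1000) = 500500, i.e., very nearly 1000 * 500
```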

Since quantifiers and set formers are in effect prepackaged iterations, very similar rules apply to them. An existential quantifier like

... **exists** x **in** s | C(x) ...        (6)

is evaluated as if it were the following SETL fragment:

maybe := false;
for x in s loop
    if C(x) then
        maybe := true;
        exit;
    end if;
end loop;

where *maybe* is an internal variable that holds the Boolean result of the test. This SETL fragment will execute in time equal to the number of items examined, multiplied by the average time required to evaluate the Boolean condition C(x). If the quantifier (6) evaluates to **false**, every element of s will have been examined.

Similar remarks apply to set formers and tuple formers, except that:

- Each insertion into a set takes somewhat longer than a simple iterative step, because we must check for and eliminate duplicate elements; and

- The implicit iteration appearing in a set or tuple former like

  {x in s | C(x)}

  must always proceed until all the elements of s have been examined.

As an application of these rules, note that execution of the harmless-looking code fragment

f := { };
for x in s loop
    f(x) := {y in s | (exists z in s | C(x, y, z))};         (7)
end loop;

involves three nested loops: first the **for** loop which appears explicitly, next the implicit iteration over s in the set former {y **in** s...}, and then finally the implicit iteration over s in the quantifier **exists** z **in** s... Therefore the number of cycles required to execute (7) is roughly as high as the cube of the number of elements of the set s.
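Counting the evaluations of the innermost condition C confirms the cubic growth; a Python sketch (the **exists** quantifier can stop early, so this counts the worst case):

```python
def count_condition_evaluations(n):
    """Worst-case evaluations of C(x, y, z) in fragment (7) when #s = n."""
    s = range(n)
    count = 0
    for _x in s:              # the explicit for loop
        for _y in s:          # the implicit loop of the set former
            for _z in s:      # the implicit loop of the exists quantifier
                count += 1    # one evaluation of C(x, y, z)
    return count
```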

k := 0;
while (k +:= 1) < n and t(k) /= OM loop
    statements ...
end loop;

where k is incremented each time through the loop, behaves in very much the same way as a **for** loop and therefore will terminate after no more than n - 1 iterations. On the other hand, consider the loop

t := n * [0];
while exists x = t(i) | x = 0 loop
    print(t);
    t(i) := 1;                                               (8)
    t(1..i - 1) := (i - 1) * [0];
end loop;

This begins by generating a tuple t := [0,0,0,...,0] of n zeroes and then repeatedly sets the first zero coordinate of t to 1 and all the coordinates preceding this coordinate to zero, thus carrying out a (left-to-right) form of binary counting. The sequence of tuples printed is

[0,0,0,...0] [1,0,0,...0] [0,1,0,...0] [1,1,0,...0] [0,0,1,...0] [1,0,1,...0], etc.

and is plainly of length 2^{n}, in that it generates all binary numbers from 0 to 2^{n} - 1. Hence the number of cycles required to execute the **while** loop (8) is at least 2^{n}, which means that if n = 50 the loop will execute for roughly 320 years, even if 100,000 iterations are performed per second!
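The 2^{n} behavior is easy to confirm with a small Python model of the same left-to-right counting (the loop body executes exactly 2^n - 1 times before no zero coordinate remains):

```python
def count_passes(n):
    """Model the counting loop: passes until t contains no zero."""
    t = [0] * n
    passes = 0
    while 0 in t:
        i = t.index(0)      # first zero coordinate
        t[i] = 1            # set it to 1 ...
        t[:i] = [0] * i     # ... and zero every coordinate before it
        passes += 1
    return passes           # = 2**n - 1
```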

For a more realistic example of the way in which **while** loops are typically used, consider a particularly simple sorting method, dubbed flip-sort.

while exists i in [1..#t - 1] | t(i) > t(i + 1) loop
    [t(i), t(i + 1)] := [t(i + 1), t(i)];                    (9)
end loop;

This searches a tuple t for out-of-order components and interchanges a pair of such components whenever one is found. Plainly the number of cycles
required to execute (9) is the average time required to search the tuple t for an out-of-order pair of adjacent components, multiplied by the number of interchanges required to put t in sorted order. Even though precise analysis of these times requires subtle analysis going far beyond the scope of this book, it is not hard to estimate these time requirements crudely. We can guess that, as long as an out-of-order pair exists, one such pair will be found after searching through some fraction of the length of the tuple t being sorted; thus evaluation of the existential quantifier appearing in the first line of (9) is estimated to require cn cycles, where c is some constant which we will not attempt to evaluate here and n is #t. Moreover, since each execution of the body of the **while** loop (9) corrects exactly one case in which a pair of elements appears in inverted order, the expected number of times that (9) must iterate to put t into its final sorted order should be roughly equal to the number of pairs of components of t which occur in inverted order. In a random arrangement of the components of t, roughly half the components to the left of a given t(i) should be larger than t(i), and roughly half the components to the right of t(i) should be smaller than t(i). Thus each component t(i) of t should appear in roughly #t/2 inverted pairs, and it follows, since this way of looking at things counts inverted pairs twice (once for each of the components in such a pair), that the expected number of inverted pairs in a randomly chosen arrangement of the components of t should be roughly (1/4)n^{2}. Multiplying this expression by cn, representing the estimated time required to evaluate the existential quantifier in the first line of (9), we arrive at

(1/4)cn^{3}        (10)

for the time required to execute this 'flip-sort' code on a tuple of length n.

The approximations which we have made in arriving at the estimate (10) are too crude for the constant (c/4) appearing in (10) to be particularly
meaningful. (Ex. 9 following Section 6.5 outlines an experimental procedure
for estimating this coefficient more accurately.) The significant feature of the estimate (10) is that it tells us that the time required to sort a tuple by the flip-sorting method is proportional to the cube of the length of t, i.e., that sorting a tuple of length 10 by this method should take roughly 120 cycles, sorting a tuple of length 100 roughly 120,000 cycles, and sorting a tuple of length 1000 roughly 120,000,000 cycles. These figures, which are not very favorable, reflect the rapidity with which the cube of n grows as n increases. In the jargon of algorithm analysis, one says that flip-sort is an "n cubed," or O(n^{3}), algorithm. Clearly, any sorting algorithm whose time requirement grows less rapidly than the cube of the length of t will be very much superior to flip-sort as a technique for sorting large tuples t.
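A Python model of flip-sort makes the cost visible by counting comparisons; note how each swap sends the search back to the start of the tuple, just as the existential quantifier in (9) does:

```python
def flip_sort(t):
    """Sort a list by repeated search-and-swap; return (sorted, comparisons)."""
    t = list(t)
    comparisons = 0
    while True:
        for i in range(len(t) - 1):
            comparisons += 1
            if t[i] > t[i + 1]:
                t[i], t[i + 1] = t[i + 1], t[i]
                break               # restart the scan from the beginning
        else:
            return t, comparisons   # a full scan found nothing out of order
```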

Let us therefore modify our naive flip-sort algorithm in order to obtain a more efficient sorting method. We begin by noticing that the existential quantifier in (9) forces us to scan the tuple from the beginning, each time we find an out-of-order pair. This is wasteful because an out-of-order element may be out of place with respect to several of its successors, not just one. A better approach is to try to push the winner of each such exchange as far as it will go, and go back to the beginning of the tuple only after we have made a full pass over the tuple. This leads to the following:

for j in [1..#t - 1] loop
    for i in [1..#t - 1] loop
        if t(i) > t(i + 1) then
            [t(i), t(i + 1)] := [t(i + 1), t(i)];            (11)
        end if;
    end loop;
end loop;

Note that this code will push the largest element in the tuple to the last position the first time the inner loop is executed; the second largest element will be pushed to the second position from the end on the next pass through the inner loop, and so on. Each execution of the inner loop adds one element to the fully sorted portion of t which is built up at its end. The outer loop is executed (#t - 1) times to ensure that every element of t finds its proper place.

A further improvement is possible. Elements that have reached their correct position need not be examined in successive passes because we know that they are in their proper places. We can therefore modify (11) to read as follows:

for j in [1..#t - 1] loop
    for i in [1..#t - j] loop
        if t(i) > t(i + 1) then
            [t(i), t(i + 1)] := [t(i + 1), t(i)];            (12)
        end if;
    end loop;
end loop;

The fragment (12) is the bubble-sort method that we examined in Section 5.1.1.

Next let us examine the execution time of (12). Again, let n be #t. The outer loop is executed (n - 1) times. The inner loop is executed (n - 1), (n - 2), ..., 3, 2, 1 times, that is to say, (n/2) times on the average. The body of the inner loop consists of various comparison, indexing, and assignment operations, all of which take some constant time c1. Therefore the total execution time of (12) is roughly c1 * (n - 1) * (n/2) cycles. We therefore can say that the code (12) executes in time proportional to the square of the size of t. This is a considerable improvement over our initial flip-sort. Several further refinements can be made to (12), but these improvements only affect the constant c1 and do not modify the n^{2} behavior that we have just obtained.
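A Python transcription of (12) verifies the count: the comparisons total exactly (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2:

```python
def bubble_sort(t):
    """Bubble sort with a shrinking inner loop; return (sorted, comparisons)."""
    t = list(t)
    n = len(t)
    comparisons = 0
    for j in range(1, n):            # n - 1 passes of the outer loop
        for i in range(n - j):       # pass j makes n - j comparisons
            comparisons += 1
            if t[i] > t[i + 1]:
                t[i], t[i + 1] = t[i + 1], t[i]
    return t, comparisons
```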

The order-of-magnitude analysis that we have just performed is the most frequent and useful kind of algorithm performance analysis. Often it is enough to know what the order-of-magnitude behavior of a program is to estimate whether it is acceptable for the small, medium, large, or very large amount of data that is to be processed.

The transformation that took us from (9) to (12) deserves further discussion. What we did was to remove an existential expression

exists x in s | C(x)

which describes some exhaustive search over s, and replace it with a more intelligent search over that set. This shows a fairly typical use of existential quantifiers: they allow us to write a crude specification for some kind of search process, in order to obtain a working algorithm quickly. When the time comes to improve the efficiency of that algorithm, we can often replace the existential quantifier with a more precise (and therefore more efficient) search, better adapted to the problem at hand. This stepwise approach to program efficiency is one of the most important principles of good programming practice. Proper use of SETL consists precisely in starting with as condensed and abstract a formulation as can be achieved by use of set formers, quantified expressions, power sets, etc. Once such an abstract program is running, one can then replace these abstract constructs, in those places where they are unacceptably inefficient, by more efficient lower-level constructs which make use of sophisticated data arrangements.

However, in most cases the important expense is the time the called procedure actually takes to execute. If the called procedure consists only of loops and simple statements, we can estimate its execution time using the techniques sketched in the previous section. On the other hand if the called procedure calls other procedures in turn, or more interestingly calls itself recursively, then the analysis of its efficiency is considerably more subtle. To indicate the kind of reasoning required, and also introduce some useful notation, let us first examine the "quintessential recursive procedure," namely the factorial:

procedure fact(N);
    return if N = 0 then 1 else N * fact(N - 1) end if;      (1)
end fact;

Since the factorial of a positive number N is the product of successive integers up to N, an iterative calculation of fact(N) will require N multiplications, i.e., take time proportional to N. It is conventional to indicate this by saying that iterative calculation of fact (N) uses time O(N).

This is also the case for our recursive definition. To establish this, we reason as follows: if N is not zero, execution of fact(N) will require one comparison operation (the test N = 0), one multiplication, and one recursive call to fact involving a smaller argument, namely N - 1. If we let T(N) designate the time it takes to execute fact when its argument is N, this leads to the following equation:

T(N) = 2 + T(N - 1) (2)

where the constant 2 corresponds to the two elementary operations noted.

This equation is called a *recurrence relation* for T and defines the value of T for each possible argument N as a function of the value that T takes at a smaller argument. Note that this recurrence parallels the recursive definition of fact, in which the value calculated is also defined in terms of another invocation of the same procedure. To solve the recurrence relation means to express T(N) as some explicit function of N. For the recurrence (2), this can be done easily, as follows. The relation (2) holds for arbitrary values of N (except 0), so that substituting N - 1 for N in (2) we find that

T(N - 1) = 2 + T(N - 2)

We can use this to replace T(N - 1) on the right-hand side of (2), which gives

T(N) = 2 + 2 + T(N - 2) = 4 + T(N - 2)

Repeating the same reasoning, we can express T(N - 2) in terms of T(N - 3), and so on. This process ends when we express T(1) as 2 + T(0). T(0), that is to say, the time required to evaluate the factorial when its argument vanishes, is just the execution time of the test (N = 0), because in that case the procedure does not recurse. Thus T(0) = 1. Putting all of this together, we obtain

T(N) = 2 * N + 1 (3)

which indicates that our recursive factorial program also executes in a time proportional to its argument N. The calculation just described neglected the actual cost of the procedure call itself, and that of the return instruction. Taking these into account would modify (3) only by constant factors, i.e., would give us

T(N) = a * N + b        (4)

where a and b are small constants. Hence our conclusion that calculation of **fact**(N) requires time proportional to N remains correct.
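The recurrence can be checked mechanically by instrumenting a Python factorial to count its comparisons and multiplications (a sketch mirroring the analysis, with the call overhead again ignored):

```python
def fact_with_cost(n):
    """Return (n!, elementary operations), counting the n == 0 test and
    the multiplication, so that the cost is T(n) = 2 * n + 1."""
    if n == 0:
        return 1, 1                  # just the comparison: T(0) = 1
    sub, cost = fact_with_cost(n - 1)
    return n * sub, cost + 2         # one comparison + one multiplication
```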

Next let us examine a more complex example of a recursive procedure: the Towers of Hanoi procedure of Section 5.4.4. Recall that the Towers of Hanoi problem is that of moving N rings, stacked in decreasing order of size on a peg A, to a peg B, using some intermediate peg C as a ring holder, and respecting the ordering of the rings, so that a ring never lies over a smaller ring. As seen in Section 5.4.4, the solution is given by means of the following recursive procedure:

procedure Hanoi(N, fr, to, via);
    if N = 0 then
        return;
    else
        Hanoi(N - 1, fr, via, to);
        print('Move ring ', N, ' from ', fr, ' to ', to);
        Hanoi(N - 1, via, to, fr);
    end if;
end Hanoi;

Using the same reasoning as before, we can readily establish the following facts:

- When the argument N is zero, Hanoi does not recurse but executes a single test and returns.
- When N is greater than zero, Hanoi executes a test, a **print** statement, and two recursive calls with argument (N - 1). In other words, the time consumed by the procedure is described by the recurrence relation

T(N) = a + 2 * T(N - 1)        (5)

and by the "boundary condition"

T(0) = b        (6)

where a and b are small constants. Once again, we can use this recurrence relation for T(N) to express T(N - 1) in terms of T(N - 2), T(N - 2) in terms of T(N - 3), and so on, until T(N) is expressed only in terms of a and b. These successive replacements look as follows:

T(N) = a + 2 * T(N - 1) = a + 2 * (a + 2 * T(N - 2)) = a * (1 + 2) + 4 * T(N - 2)
     = a * (1 + 2) + 4 * (a + 2 * T(N - 3)) = a * (1 + 2 + 4) + 8 * T(N - 3)

The pattern should be clear: each substitution adds another power of 2 to the coefficient of a, and another factor of 2 to the coefficient of T(N - i), so that after i substitutions this coefficient is 2^{i}. Our calculation stops when i = N, at which point we have

T(N) = a * (1 + 2 + 4 + ... + 2^{N-1}) + 2^{N} * T(0) = (a + b) * 2^{N} - a
We conclude that the procedure Hanoi takes an amount of time which is an exponential function of its input parameter N (the number of rings to be moved). This means that adding one more ring doubles its execution time. Clearly, a procedure with this behavior can only be useful for relatively small values of its argument. (The Buddhist legend from which this problem originates states that dedicated monks have been moving 64 such rings since the beginning of time, and that the completion of their task will mark the end of the world. The preceding analysis indicates that we need not be too concerned yet about this prospect.)
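Counting moves instead of printing them shows the exponential growth directly, since moves(N) = 2 * moves(N - 1) + 1, which solves to 2^{N} - 1 (a Python sketch of the same recursion):

```python
def hanoi_moves(n, fr="A", to="B", via="C"):
    """Number of ring moves the Hanoi procedure prints for n rings."""
    if n == 0:
        return 0
    return (hanoi_moves(n - 1, fr, via, to)     # clear n - 1 rings out of the way
            + 1                                 # move ring n from fr to to
            + hanoi_moves(n - 1, via, to, fr))  # stack them back on top
```

Each extra ring doubles the count, as the analysis predicts.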

Procedures are only useful in practice if they do not have the explosive time
requirement that the Hanoi procedure illustrates: factorial is linear in its
argument, as we saw previously, and the sorting procedures discussed so far are cubic or quadratic in the number of elements to be sorted. Most of the
procedures that we use in practice consume time O(n), O(n^{2}) or O(n^{3}) at most.

Finally let us analyze the performance of one more simple recursive procedure, namely quicksort, which was presented in Section 5.4.1. This procedure, which can sort any homogeneous set s of integers, floating-point numbers, or strings, is simply

procedure quick_sort(s);
  if #s = 0 then
    return [ ];
  end if;
  x := arb s;                                                        (7)
  return quick_sort({y in s | y < x}) + [x] + quick_sort({y in s | y > x});
end quick_sort;
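For comparison, (7) translates almost line for line into Python (a sketch: `next(iter(s))` models SETL's **arb**, and we assume the elements of s are distinct and mutually comparable):

```python
def quick_sort(s):
    """Python rendering of (7): sort the set s into a list."""
    if len(s) == 0:
        return []
    x = next(iter(s))          # models SETL's "arb s"
    return (quick_sort({y for y in s if y < x})
            + [x]
            + quick_sort({y for y in s if y > x}))

assert quick_sort({5, 3, 9, 1, 7}) == [1, 3, 5, 7, 9]
```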

Let T(n) be the number of cycles that this procedure will typically require to sort a set of n elements. Building up the two sets which appear in the final **return** statement of (7) will require a number of steps proportional to the size of s, in that a full pass over s is required in each case. Thus the time required to execute (7) is equal to some small constant multiple c*n of the length n of s (c = 3 is a fair guess), plus the time required to execute the two recursive invocations of quicksort which appear in the second **return** statement of (7). Since typically the element x chosen from s by the **arb** function of (7) will lie roughly halfway between the largest and the smallest elements of s, each of the two sets {y **in** s | y < x} and {y **in** s | y > x} should contain approximately half the elements of s. Thus, given that T(n) is the time required to sort a collection of n elements by the quicksort method, sorting each of these sets by use of quicksort should require roughly T(n/2) cycles. It follows that T(n) satisfies the recursive relationship

T(n) = 2 * T(n/2) + c * n (8)

The first of the terms on the right of (8) represents the time typically required to execute the two recursive invocations of quicksort appearing in (7), and the term c * n represents all the work needed to prepare for the two invocations.

We solve (8) in the same fashion as in our two previous examples, by substituting the expression (8) for the occurrence of T on the right of (8), which gives

T(n) = c * n + 2 * c * (n/2) + 4 * T(n/4) = 2 * c * n + 4 * T(n/4), (8a)

and then substituting (8) for T(n) on the right of (8a) to get

T(n) = 2 * c * n + 4 * c * (n/4) + 8 * T(n/8) = 3 * c * n + 8 * T(n/8). (8b)

Continuing inductively in this way we will clearly get

T(n) = 4 * c * n + 16 * T(n/16), T(n) = 5 * c * n + 32 * T(n/32),

and so forth, until eventually, when the power of 2 in the denominator on the right becomes roughly equal to n (which will happen after log n steps, where log n designates the logarithm of n to the base 2), we will find that T(n) is roughly

T(n) = c * n * log n

i.e., that T(n) can be estimated as the product of a small constant c (still roughly 3), times n, times the logarithm of n. One therefore says, in the jargon of algorithm analysis, that quicksort is an "n log n" or "O(n log n)" algorithm.

For n at all large and c roughly equal to 3, c * n * log n will be vastly smaller than the cube or even the square of n. For example, for n = 1000, n ** 3 is 1,000,000,000, n ** 2 is 1,000,000, whereas c * n * log n is only 30,000. Therefore quicksort can be used to sort large amounts of data, which could not be sorted in any reasonable amount of time using flip sort or even bubble sort. For example, if #t = 10,000, then on a computer capable of executing 100,000 of our nominal instruction cycles per second, sorting t using the flip-sort method will require approximately 16 hours, bubble sort will take about 20 minutes, whereas quicksort will accomplish the same operation in roughly 4 seconds.

This simple example shows why it is so important to find algorithms whose time requirements do not rise rapidly as the arguments passed to them grow larger. Very considerable efforts have been devoted to the search for such high-efficiency algorithms during the past decade, and a great many ingenious and important algorithms having efficient behaviour have been devised. Most of these algorithms lie beyond the scope of the present introductory work. For basic accounts of this important material, see the References at the end of this chapter.

To understand this important point, note first of all that copying is sometimes logically necessary. Consider, for example, the code

s := {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20};                        (1)
s1 := s;
s with:= 25;
s1 less:= 2;

The output that this will produce is

s = {1 2 3 4 5 6 7 8 9 10 15 20 25}
s1 = {1 3 4 5 6 7 8 9 10 15 20}

Since two different values will have been created by the time we reach the final step of (1), it is clear that somewhere along the way to this final step the single set constant assigned to the variable s will have to be copied. This copying can be done when the value of s is assigned to the second variable s1 in the second line of (1) (copying on assignment), or can be done in the third line of (1), when the value of s (but not that of s1) is modified by addition of the extra element 25 (copying on modification). Exactly where copying actually takes place will depend on the particular version of the SETL system that you are using, and especially on whether or not this compiler includes an "optimization" phase. But in any case, whenever two variables appear to be sharing the value of a composite object, and one of these variables is modified, the other one must be provided with a fresh copy of that composite object, because in SETL assigning to one variable never affects another. Copying a set or tuple with n components can require O(n) cycles. Hence execution of an apparently cheap operation like t(i) := x can require a number of cycles proportional to the length of t, if the value of t has been assigned to some other variable.
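The logical need for a copy can be seen by replaying (1) in Python, where plain assignment shares references rather than copying values; the explicit `set(s)` below does by hand what the SETL system does automatically, and its cost is proportional to #s:

```python
s = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20}
s1 = set(s)      # explicit O(n) copy; in SETL the plain "s1 := s" suffices
s.add(25)        # models "s with:= 25"
s1.discard(2)    # models "s1 less:= 2"

# Two distinct values now exist, as in the output shown above.
assert s == {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25}
assert s1 == {1, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20}
```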

On the other hand, copying is frequently unnecessary, and both the
optimizing version of the SETL compiler and the SETL run-time library
include mechanisms for avoiding copying when it is not logically necessary.
(Since these are implementation-level mechanisms, and fairly complex ones at
that, we shall say nothing about how this is done.) When no copying is
involved, the operation s **with** x is only two or so times slower than the membership test x **in** s, and similar remarks apply to the other operations in the group we have been considering. For example, the assignment t(i) := x can be done in just one of our nominal "cycles," and in the same circumstances map assignment is roughly five times as slow.

This is but one symptom of the fact that the storage management which the SETL system performs contributes to the total execution time of a program. This contribution will be small if the objects manipulated by the program do not change size very frequently, but can be very substantial if large objects are constantly created, shared, updated, and then discarded. Programming techniques that avoid storage management costs are known and studied in data-structure courses, for which a number of excellent texts exist. Such data structures can be constructed in SETL, and the next section gives one example, namely the linked list. In this text we have largely ignored data structures of this type, in accordance with the tenet of high-level programming which states that the burden of storage management is best left to the system. The gains in ease of programming thereby attained are well worth the execution costs incurred. Nevertheless, the fact that storage management can become expensive should be kept in mind. Note also that in some circumstances, such as in the programming and control of real-time devices (i.e., physical equipment such as a motor, a cardiac monitor, a radar, an air-conditioning system, etc.) where a computer system must respond with minimal delay, no hidden execution costs can be allowed, and the programmer must know, within a few milli- or microseconds, the time that a given code sequence will take to execute. Such systems cannot be programmed in SETL or in any other high-level language such as APL or SNOBOL. For such systems, lower-level languages, in which all storage allocation is described in full detail by the programmer, must be used.

In order to focus our discussion, we have to examine more details of the implementation of the basic SETL types. In this section we will describe the way tuples are actually represented in the SETL run-time system and examine the advantages and disadvantages of this representation. This will allow us to understand the potential advantages of an alternative representation which is efficient in cases where the standard representation for tuples is comparatively expensive. Note that the representations of sets and maps used by the run-time system are considerably more complex than the representation of tuples.

Internally, that is to say in the memory of the computer on which your program is running, a tuple can be represented by a contiguous sequence of cells, or words, that contain the components of the tuple. These components appear in order, so that the first component of the tuple appears in the first of these cells, the second component in the second, etc. For example, if t is the tuple [2,3,5,7,11], it could be stored in memory as follows:

2 | 3 | 5 | 7 | 11 |

When working with such a representation, the SETL run-time system always knows where the first component of t is found. From this it can easily calculate the location of the second, third, etc., components of t: any component t(i) is located (i - 1) cells after the location of t(1). This is why a tuple is the most efficient way of storing data that is referred to by number, i.e., according to its ordering: when such a collection of data is represented as a tuple, all items in the collection are equally accessible. *Tuples are built for efficient indexing*.

Other operations on tuples are not as easy to perform. Consider the operation

t(2..3) := [ ];

whose purpose is to shorten *t* by removing two of its components. After this operation is performed, *t* will be stored as follows:

2 | 7 | 11 |

Thus we now have *t*(2) = 7, *t*(3) = 11, etc. This requires that the components of *t* that appear after the deleted section be "moved down" two positions. For a long tuple, this can represent a substantial expense. Similarly, operations that enlarge a tuple, by insertion or concatenation, such as

t with:= 13;
t(1..3) := [1, 2, 3, 4, 5];
t +:= [17, 19, 23, 29];

may require that a certain number of elements of *t* be moved or copied. If there is no room in the portion of memory currently allocated for *t* for the expansion required by one of the preceding operations, then a larger area will have to be found by the storage manager, into which the whole of *t* will have to be copied and then modified. We conclude that operations that modify the size of a tuple repeatedly carry a potentially substantial penalty in storage allocation activity. In cases where data are ordered but are not accessed *by number*, i.e., when no indexing operations are performed, and where substantial changes in the size of a tuple will occur, other representations exist that lead to smaller storage allocation and data motion costs. The most familiar among such representations is the chain, or *linked list*.
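Python lists are stored contiguously in just this way, so the data motion described above can be mimicked directly (a sketch; the Python slice operations are our stand-ins for the SETL ones):

```python
t = [2, 3, 5, 7, 11]
del t[1:3]                   # remove the second and third components:
assert t == [2, 7, 11]       # everything after the gap moves down two cells

t.append(13)                 # models "t with:= 13"
t[0:3] = [1, 2, 3, 4, 5]     # models "t(1..3) := [1,2,3,4,5]"
t += [17, 19, 23, 29]        # models "t +:= [17,19,23,29]"
assert t == [1, 2, 3, 4, 5, 13, 17, 19, 23, 29]
```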

In the abstract, a *linked list* is a structure in which data are organized according to some linear order, i.e., with a first element, a second one, and so on, in such a way that it is easy to go from one element to its immediate successor. A tuple is of course an example of a list, but one in which elements are stored contiguously. In general a linked list is not stored contiguously, and the mechanism that takes us from one element to the next must be described explicitly. In SETL we may choose to use a map (call it *next_of*) which takes us from any element to its successor in the ordering. Then, rather than having to use contiguous locations, successive elements of the list can be anywhere. In this case we can think of the location that holds each element of the list as being marked by an arbitrary tag, for which SETL atoms will be used in what follows. The contents of each such location are also described by a mapping (call it *val_of*), so that if *C* is a cell in this list, val_of(*C*) is the element stored at *C*, and next_of(*C*) is the cell that contains the next element. In addition we need an explicit way of referring to the first element of the list: a variable *comp1* must be used, whose value is the first cell. Then, all in all, val_of(*comp1*) is the first element of the list, val_of(next_of(*comp1*)) is the second element of the list, etc. The last cell *cn* in the list is distinguished by having no successor, which we express by setting next_of(*cn*) = **OM**.

We are now ready to build and manipulate lists. In order to create a list with a single element x, we execute the following fragment:

comp1 := newat();        -- Define cell.
val_of(comp1) := x;      -- Place element in cell.
next_of(comp1) := OM;    -- First element is currently the last also.

In order to add a new element y after x on this list, we execute the following fragment:

new := newat();          -- Define new cell.
val_of(new) := y;        -- Place element in it.
next_of(comp1) := new;   -- First element now has a successor.
next_of(new) := OM;      -- Second one does not.

If instead we want to add a new element in front of the list, we execute

new := newat();          -- New cell.
val_of(new) := z;        -- And its value.
next_of(new) := comp1;   -- First element now becomes second.
comp1 := new;            -- And most recent element is first.

Note that each invocation of **newat**() just represents a way of obtaining a unique marker for each element in the list. Instead of atoms we could have used
integers, strings, or indeed any kind of SETL value. We choose to use atoms
to emphasize that each cell in the list requires a "name" of some sort, but that this name serves as a place-marker for the value that goes with it and has no other significant relationship to that value.

In order to transform any tuple t into this list representation, we can make use of the following procedure:

procedure into_list(t);
  if t = [ ] then
    return OM;
  end if;
  first := newat();            -- First allocated cell.
  cell := first;
  for i in [1..#t] loop
    val_of(cell) := t(i);      -- Place tuple component in cell.
    if i < #t then
      succ := newat();         -- Cell in which next component will be stored.
      next_of(cell) := succ;
      cell := succ;
    end if;
  end loop;
  next_of(cell) := OM;         -- Last element has no successor.
  return first;
end into_list;

This procedure (which assumes that *next_of* and *val_of* are globally available, map-valued quantities, defined outside the procedure) returns the first cell of the list, so that by applying the mappings *next_of* and *val_of* repeatedly we can reach all the elements of the list. This is indeed the way in which a program using lists is usually structured: a single *next_of* mapping is used to describe the chaining of all lists present in the program. The chained representation of a tuple that is obtained from procedure *into_list* can be pictured as follows (Figure 6.1):
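The same chained structure is easy to sketch in Python, with ordinary dictionaries playing the roles of the next_of and val_of maps, None standing in for **OM**, and itertools.count supplying fresh cell names in place of **newat** (all of these are our own modeling choices):

```python
from itertools import count

newat = count()                 # fresh cell names, modeling SETL's newat()
next_of = {}                    # cell -> successor cell (None models OM)
val_of = {}                     # cell -> component value

def into_list(t):
    """Chain the components of t into a linked list; return the first cell."""
    if not t:
        return None
    first = cell = next(newat)
    for i, x in enumerate(t):
        val_of[cell] = x
        if i < len(t) - 1:
            succ = next(newat)      # cell for the next component
            next_of[cell] = succ
            cell = succ
    next_of[cell] = None            # last cell has no successor
    return first

def from_list(first):
    """Walk the chain, collecting component values back into a list."""
    out, cell = [], first
    while cell is not None:
        out.append(val_of[cell])
        cell = next_of[cell]
    return out

assert from_list(into_list([2, 3, 5, 7, 11])) == [2, 3, 5, 7, 11]
```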

At first sight, representing a tuple in this way may not appear to be such a good idea. Of course, it is not hard to iterate over tuples having this representation: we simply start at comp1 and apply *next_of* repeatedly to step along, always applying *val_of* to get the component value corresponding to whatever index we have reached. For example, instead of writing

if exists x = t(i) | C(x) then ... else ...                          (1a)

as we would if *t* were a standard SETL tuple, we would write

i := comp1;                     -- Initialize search index.
while i /= OM loop
  x := val_of(i);               -- Value of next element.
  if C(x) then exit; end if;    -- If found, search is successful.    (1b)
  i := next_of(i);              -- Else advance search index.
end loop;
if i /= OM then ... else ...

Although no less efficient than (1a), the code (1b) is certainly more complex and harder to read. Moreover, accessing a given component *t*(*k*) of *t* is much less efficient in the list representation, since for standard tuples the operation

x := t(k);

is performed in one or two cycles, whereas if *t* has the list representation we will instead have to count up to *k* as we step through the list, as the following fragment shows:

i := comp1;
for j in [1..k - 1] loop
  i := next_of(i);
end loop;
x := val_of(i);

whose execution requires at least k cycles; that is to say, indexing operations
are not efficient when applied to lists. On the other hand, certain operations
that modify the size of tuples can be performed much more rapidly in the list
representation than for standard SETL tuples. For example, for a standard
tuple the operation which inserts *x* immediately after the *i*-th component of *t* requires time proportional to the length of *t*, since to create the expanded tuple all of the elements of *t* will have to be copied into new positions. On the other hand, if *t* has the list representation, and we know the name *ni* of the cell after which the insertion is to be performed, this operation can be done in just a few cycles, since all we have to do is

- create a new atom *ix* to name the cell for the new component value *x*;
- link *ix* at the appropriate position into the list representing *t*.

Similar remarks apply to the operation t(i..i) := [ ] which deletes a given component from a tuple in list representation. The following two procedures represent these operations. In writing these procedures, we suppose that a single pair of maps *next_of* and *val_of* will be used to represent all lists, and that the variables *next_of* and *val_of* have been declared global. We also suppose that only one logical reference to any of these maps is ever extant, so that no copying (see the preceding section) ever needs to be performed.

var next_of, val_of;             -- Global variables.

next_of := { };                  -- Initialize to null map.
val_of := { };                   -- Initialize to null map.

procedure insert(x, ni);
  -- Insert x immediately after the list component whose cell is ni.
  ix := newat();                 -- Create a new cell ix,
  next_of(ix) := next_of(ni);    -- give it ni's successor,
  next_of(ni) := ix;             -- make it the successor of ni,
  val_of(ix) := x;               -- and place the value in the cell.
end insert;

procedure delete(ni);
  -- Delete the component immediately following that whose cell is ni.
  -- Unless ni is the last cell in its list, this makes the successor of
  -- ni's original successor the new successor of ni.
  next_of(ni) := next_of(next_of(ni)?ni);
end delete;
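A Python sketch of these two procedures, with dictionaries again standing in for the global next_of and val_of maps (the integer cell names and the one-element starting list are our own setup):

```python
next_of = {}                  # cell -> successor (None models OM)
val_of = {}                   # cell -> value
fresh = iter(range(10**6))    # models newat(): a supply of unused cell names

def insert(x, ni):
    """Insert x immediately after the component whose cell is ni."""
    ix = next(fresh)
    next_of[ix] = next_of[ni]     # ix inherits ni's successor
    next_of[ni] = ix              # and becomes ni's successor
    val_of[ix] = x

def delete(ni):
    """Delete the component following cell ni (no-op if ni is last)."""
    if next_of[ni] is not None:
        next_of[ni] = next_of[next_of[ni]]

comp1 = next(fresh)           # one-element list holding 'x'
val_of[comp1] = 'x'
next_of[comp1] = None

insert('y', comp1)            # list is now x, y
insert('w', comp1)            # list is now x, w, y
delete(comp1)                 # remove w: back to x, y
assert val_of[next_of[comp1]] == 'y'
assert next_of[next_of[comp1]] is None
```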

Provided that tuples *t*_{1} and *t*_{2} are represented as lists, that *t*_{1} will not be required after *t*_{1} and *t*_{2} are concatenated, and that the name of the last cell of *t*_{1} is easily available, the concatenation of *t*_{1} and *t*_{2} can also be formed in a number of cycles independent of the lengths of *t*_{1} and *t*_{2}. The following procedure, in which we assume that each tuple *t* in list form is represented by a pair [first, last] consisting of the first and the last cell of *t*, shows this:

procedure concat(t1, t2);
  [t1_first, t1_last] := t1;       -- Unpack the first and last cells of t1
  [t2_first, t2_last] := t2;       -- and those of t2.
  if t1_first = OM then            -- t1 is an empty list.
    return t2;
  elseif t2_first = OM then        -- t2 is an empty list.
    return t1;
  else
    next_of(t1_last) := t2_first;  -- Link the two lists.
    return [t1_first, t2_last];
  end if;
end concat;
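The constant-time behavior of concat is easy to reproduce in Python under the same [first, last] convention (make_list and the integer cell names are our own scaffolding):

```python
next_of = {}
val_of = {}

def make_list(values, base):
    """Chain values into cells named base, base+1, ...; return [first, last]."""
    if not values:
        return [None, None]
    cells = list(range(base, base + len(values)))
    for c, v in zip(cells, values):
        val_of[c] = v
    for c, d in zip(cells, cells[1:]):
        next_of[c] = d
    next_of[cells[-1]] = None
    return [cells[0], cells[-1]]

def concat(t1, t2):
    """Concatenate two lists given as [first, last] pairs: O(1), no copying."""
    t1_first, t1_last = t1
    t2_first, t2_last = t2
    if t1_first is None:
        return t2
    if t2_first is None:
        return t1
    next_of[t1_last] = t2_first     # a single link joins the two chains
    return [t1_first, t2_last]

a = make_list([1, 2, 3], 0)
b = make_list([4, 5], 100)
c = concat(a, b)
vals, cell = [], c[0]
while cell is not None:
    vals.append(val_of[cell])
    cell = next_of[cell]
assert vals == [1, 2, 3, 4, 5]
```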

The mapping next_of, and the variables used to keep track of first and last
elements of lists, together define what are called *pointers* in other programming languages.

Quite a few other very useful trick representations of tuples, sets, maps, etc., are known. These representations make use of more complex arrangements of pointers, i.e., internal mappings between elements. If the family of operations applied to a SETL object is appropriately limited, use of an appropriate one of these special representations can be very advantageous. Since further exploration of this very important issue would take us beyond the scope of the present work, we refer the reader to the References at the end of this chapter for additional material concerning data structuring.

**EXERCISES**

1. Write a **while** loop that takes 2**(2**n) cycles to execute, for some n. Write a **while** loop that takes 2**(2**(2**n)) cycles to execute. For how many centuries will these loops run, if *n* = 100?

2. If the monks in charge of the Towers of Hanoi began moving the 64 rings in the year 1 million B.C., and if they move one ring per second, when will the world end?

3. By adding a mapping *prev* that takes a given list element to its predecessor in the
list, it is possible to iterate over a list forward and backward. Write a procedure to
transform a standard tuple into this "doubly-linked" list representation.

4. Write procedures to insert and delete elements from a doubly-linked list.

5. Estimate the cost of the list concatenation procedure of Section 6.5.9. Assuming that a map operation such as next_of(last) := first; takes five cycles, and that copying a tuple of size n takes 2*n cycles, determine the sizes of t1 and t2 for which the linked representation is preferable to the standard contiguous representation for performing tuple concatenation.

The examination of various sorting examples in the previous sections indicates the very general fact that several algorithms, of quite different efficiencies, can usually be found to perform a given task. Several questions then arise:

- (a) How can we find the most efficient method or algorithm to perform a computation?
- (b) How can we know that it is the most efficient one available?
- (c) Given one algorithm which is not the most efficient for the problem at hand, can we modify it in some systematic fashion in order to improve its efficiency, perhaps by transforming it into one of the more efficient algorithms of its family?

The answer to (a) is found in the efficiency analysis which we sketched in
Section 6.5. Given two algorithms, we can compare the number of operations
they are expected to take in various cases and thus determine which one is
preferable. Question (b), namely how do we know that a given algorithm is
indeed the best possible for a given task, is much harder, and is the subject of so-called lower-bound analysis. For example, in the case of sorting, lower-bound analysis gives the following result: "When we sort a set of size *n* by means of comparisons between its elements, O(*n* log *n*) operations are required". This rigorous mathematical result (which we shall not prove here) shows that quicksort is probably close to the best way to sort, although bubble sort is definitely not. Unfortunately such lower bounds are known only for a few interesting problems, and in general we have no systematic way of knowing how close to optimal a newly invented algorithm is.

Question (c), namely how we can systematically transform a given algorithm in order to improve it, is the source of much current research and also of large amounts of programming lore. Some basic techniques for transforming a program into a more efficient one take the form of very general prescriptions, such as

- (1) Do not evaluate the same expression repeatedly.
- (2) In particular, if a calculation is performed repeatedly in a loop and yields the same result every time, then perform the calculation once outside the loop, save the result, and use it within the loop.
- (3) Try to reuse storage occupied by a variable if the value of that variable is not needed any further.
- (4) When possible, replace recursive calls by iterative loops.

These prescriptions are simple and intuitive and can be expected to be generally useful. They can be applied almost mechanically and in fact can be automated to some extent, so that a clever compiler can apply such prescriptions by itself. In fact, most good compilers perform transformations corresponding to (1) and (2). Reuse of storage, as suggested in (3), is harder to automate, and the application of this rule requires greater ingenuity on the part of the programmer. The removal of recursive calls in favor of iterations has been studied in great detail in the context of languages such as LISP, which use recursion very heavily. We will examine a nontrivial example of recursion removal in a few pages.

In addition to the general rules stated in (1)-(4), there are certain optimizing program transformations that apply to very specific program constructs. Catalogs of such transformations can be found in some of the works listed in the References. Here we will only give some simple examples of applications of rules (1), (2), and (3).

The simplest kind of efficiency-improving transformation which we can perform is to identify expressions that are common to several calculations and that can be evaluated once instead of repeatedly. For example, when calculating the roots of a quadratic equation, we may write the following familiar formulae:

x1 := (-b + sqrt(b**2 - 4.0*a*c))/(2.0*a);                           (1)
x2 := (-b - sqrt(b**2 - 4.0*a*c))/(2.0*a);

In this case it is plain that the expression involving the square root is the same for both roots, which makes it possible to rewrite (1) as follows:

discr := sqrt(b**2 - 4.0*a*c);                                       (2)
x1 := (-b + discr)/(2.0*a);
x2 := (-b - discr)/(2.0*a);

Transforming (1) into (2) saves four arithmetic operations and one square root evaluation and makes the expressions simpler to type. We remark that the expression 2.0*a, which is a subexpression common to *x*1 and *x*2, is still evaluated repeatedly in (2). Replacing the two evaluations by a single one, by saving the result in some other variable, would eliminate one arithmetic instruction but add one assignment statement, and therefore would hardly be a saving at all.

As a less trivial example of removal of redundant subexpressions, consider the evaluation of polynomials at a point. That is to say, given some polynomial

P(x) = a_n * x**n + a_(n-1) * x**(n-1) + ... + a_1 * x + a_0         (*)

we want to determine its numerical value at some point, e.g., for *x* = 10. If we represent the polynomial by a tuple *P* containing its coefficients in decreasing order, then we can evaluate the polynomial for *x* = *c* by means of the following code:

degree := #P - 1;                                                    (3)
value := +/ [a * c**(degree - i + 1): a = P(i)];

Note that in the loop we evaluate c**n, c**(n - 1), c**(n - 2), etc. These exponentiations are evaluated as repeated multiplications, so that the number of operations performed will be

n + (n - 1) + (n - 2) + ... + 1 + O(n) = n * (n + 1)/2 + O(n)

where the O(*n*) term takes care of the iteration over *P* and of a final summation of individual terms. In other words, as written the fragment (3) takes an amount of time proportional to the square of the degree of the polynomial. Substantial improvement can be obtained if we observe that the powers of *c* can be calculated in increasing order rather than in decreasing order, so that c**n is obtained from c**(n - 1) by means of a single multiplication. An improved version is the following:

power := 1;
value := 0;
for i in [#P, #P - 1..1] loop
  term := power * P(i);                                              (4)
  value +:= term;
  power *:= c;
end loop;

Here the loop is executed *n* times (where *n* is the degree of *P*), and each execution of the loop takes three arithmetic instructions and a few assignments, so that the algorithm is now of order *n*, rather than *n***2. This is a substantial improvement for large enough *n*.
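Fragment (4) carries over directly to Python (a sketch; as in the text, P holds the coefficients in decreasing order):

```python
def poly_eval(P, c):
    """Evaluate the polynomial with coefficient tuple P (decreasing order)
    at c, computing the powers of c in increasing order: O(n) multiplications."""
    power = 1
    value = 0
    for a in reversed(P):    # from the constant term upward, as in (4)
        value += power * a
        power *= c
    return value

# x**2 + 2*x + 3 at x = 10:
assert poly_eval([1, 2, 3], 10) == 123
```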

Let us note that a similar improvement can be obtained by a different approach, which is several centuries old and is known as *Horner's rule*. We can rewrite the polynomial (*) by means of successive factorizations:

P(x) = (...((a_n * x + a_(n-1)) * x + a_(n-2)) * x + ...) * x + a_0

This suggests the following method for evaluating P:

value := 0;
for a = P(i) loop                                                    (5)
  value := value * c + a;
end loop;

Now the loop contains only two arithmetic operations and one assignment. This is certainly superior to (4), and is therefore the method of choice for evaluating polynomials. Note however that (5) was obtained by means of a clever rewriting of the original problem; the solution was not the result of some systematic transformation rule, but was "pulled out of a hat," as it were. It is appropriate to remember at this point that the development and optimization of algorithms require a healthy dose of inventiveness (guided, to be sure, by a knowledge of existing algorithms and optimization techniques).
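Horner's rule (5) is equally direct in Python; note that no powers of c are formed at all:

```python
def horner(P, c):
    """Evaluate the polynomial with coefficients P (decreasing order) at c
    by Horner's rule: two arithmetic operations per coefficient."""
    value = 0
    for a in P:
        value = value * c + a
    return value

assert horner([1, 2, 3], 10) == 123    # x**2 + 2*x + 3 at x = 10
```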

In order to illustrate the kind of program improvement that can be obtained by recursion removal, we will examine in some detail the (by now familiar) quicksort method. In its simplest version, we have described quicksort as follows:

procedure quicksort(S);
  if S = {} then
    return [ ];
  else
    x := arb S;
    lo := {y in S | y < x};                                          (1)
    hi := {y in S | y > x};
    return quicksort(lo) + [x] + quicksort(hi);
  end if;
end quicksort;

The skeleton of this algorithm consists of two steps: *partition* and *recurse*. That is to say: pick some element of the set to be sorted, and partition the rest of the set into two disjoint sets, corresponding to elements that are smaller (resp. larger) than the partitioning element. Then apply this process recursively to each partition. The two very substantial improvements that can be brought to bear on the efficiency of quicksort are performing the partitioning in place and replacing the recursion with a loop.

*Partitioning in Place*. To see how to apply this improvement we first observe that after *S* has been partitioned into lo and hi, the value of S is not used again. This suggests that, instead of building lo and hi at each step as new sets, we may want to reuse the space occupied by S, to place lo and hi in it. How to do this is not immediately apparent: as long as S, lo, and hi are sets, we do not have a simple way of describing how they are actually stored. If however we make them into tuples forming part of a larger tuple, then it is possible to describe them by their size and their starting locations. Suppose *S* is a tuple of size *n*, and that the partitioning element which we choose turns out to be the *k*-th one (in increasing order). Then the locations 1, 2..*k*-1 can be used
for lo, and the locations k + 1, k + 2..*n* can be used for hi. Then, rather than passing lo and hi as tuples in the recursive call, we can instead pass the end points of each and then proceed to partition each subtuple in place as well. (How the partitioning in place is actually performed remains to be discussed.)
This leads to the following (tentative) improved version of quicksort:

T := [x in S];                  -- Initial tuple to be sorted.
quicksort2(1, #T);              -- First call to routine.

procedure quicksort2(m, n);
  -- Apply quicksort to the portion of T (which is global) between
  -- T(m) and T(n) inclusive.
  if m >= n then
    return;                     -- Trivial case.
  else
    k := partition(m, n);       -- k is the position of the partitioning element.
    quicksort2(m, k - 1);       -- Elements smaller than it.
    quicksort2(k + 1, n);       -- Elements greater than it.
  end if;
end quicksort2;

In this version of the algorithm, no composite objects are constructed (except
for the global tuple *T*). Each call to quicksort is given an explicit region of *T* on which to operate. This forces us to mention the indices *m*, *n*, *k*, etc., explicitly, which makes for a longer and less readable piece of code. We must now show that the partitioning itself can be done in place. This is an interesting algorithm in its own right.

*Partitioning an Array in Place*. The problem is the following: given a tuple *t*, and an arbitrary element *e* of *t*, reorder the elements of *t* so that all elements *l* < *e* appear before *e*, and all elements *h* > *e* appear after *e*. For simplicity let us assume that *e* is chosen to be the first element of *t*. A first thought might be the following: scan *t* from left to right, and whenever an element *e*_{1} > *e* is found, move it up. But exactly where to? If we are to reorder *t* in place, then in order not to lose any information, we can only exchange elements. As a moment's reflection will show, the proper way to proceed is the following:

- Going from left to right, find an element *e*_{1} > *e*, which must therefore appear after *e* in the desired reordering.
- Going from right to left, find an element *e*_{2} < *e*, which must therefore appear before *e*.
- Interchange *e*_{1} and *e*_{2}.
- Continue scanning from left to right from the position of *e*_{2}, and from right to left from the position of *e*_{1}, interchanging whenever the next pair of out-of-place elements is found.
- Stop when these two scans meet (at the correct location for *e*).

The following code (which deserves very careful scrutiny) performs this partitioning process:

```
procedure partition(i, j);
  -- Partition the portion of array T from T(i) to T(j),
  -- using T(i) as the partitioning element e. The procedure places e at
  -- position k, and places all elements smaller than e at positions < k.
  e := T(i);                          -- Partitioning element.
  l := i + 1;                         -- Index for left-to-right scan.
  r := j;                             -- Index for right-to-left scan.
  while l < r loop                    -- Scans have not met in middle.
    while l < r and T(l) < e loop     -- Bypass smaller elements.
      l +:= 1;
    end loop;
    while l < r and T(r) > e loop     -- Bypass larger elements.
      r -:= 1;
    end loop;
    -- Now T(l) > e, and T(r) < e. Exchange them if the two scans
    -- have not met yet.
    if l < r then
      [T(l), T(r)] := [T(r), T(l)];
      -- Skip over these two elements, which are now in proper place.
      l +:= 1;
      if r > l then r -:= 1; end if;
    end if;
  end loop;
  -- Now l = r. l or l - 1 (if it is positive) is the correct position
  -- for the partitioning element. Place it there, and place T(r), which is no
  -- larger than it, at the original position of the partitioning element.
  if l > #T or T(l) > T(i) then l := (l - 1) min j; end if;
  if l > 0 then T(i) := T(l); T(l) := e; end if;
  return l;
end partition;
```
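For readers more at home in a conventional language, the same two-scan, exchange-based partitioning can be sketched in Python. This is an illustrative transcription, not the book's SETL; it assumes the elements of the range being partitioned are distinct, and it uses the right end-point `j` rather than the whole-array bound when adjusting the final scan position.

```python
def partition_in_place(t, i, j):
    """Partition t[i..j] (inclusive) around the pivot e = t[i], in place.

    Returns the final index of the pivot. Elements smaller than e end up
    to its left, larger ones to its right. Assumes distinct elements.
    """
    e = t[i]
    l, r = i + 1, j
    while l < r:
        while l < r and t[l] < e:    # left-to-right scan: bypass smaller elements
            l += 1
        while l < r and t[r] > e:    # right-to-left scan: bypass larger elements
            r -= 1
        if l < r:                    # out-of-place pair found: interchange it
            t[l], t[r] = t[r], t[l]
            l += 1
            if r > l:
                r -= 1
    # The scans have met; move the pivot to its correct position.
    if l > j or t[l] > e:
        l -= 1
    if l >= i:
        t[i], t[l] = t[l], t[i]
    return l

v = [5, 2, 8, 1, 9, 3]
k = partition_in_place(v, 0, len(v) - 1)
# v[k] is now 5, with all smaller elements before it and larger ones after it.
```

Note how the interchange step advances both scan positions, exactly as in the bulleted description: after a swap, both exchanged elements are known to be in the proper half.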

The top-level driver code for this routine can simply be

```
procedure quicksort(i, j);
  if i >= j then return; end if;
  mid := partition(i, j);
  quicksort(i, mid - 1);
  quicksort(mid + 1, j);
end quicksort;
```

Both these procedures assume that the tuple T is globally available.

At this point, our first optimization goal has been achieved: we have a quicksort algorithm that works in place, that is to say, without constructing any new composite objects in the course of its execution. This has been achieved by explicitly describing the elementary operations that place each element of the object being sorted at some specified position in a single tuple.

In other words, we have deliberately avoided the use of composite objects in the explicit fashion characterizing version (1) of quicksort. Of course, when we avoid manipulation of composite objects, much of the power of high-level languages such as SETL is lost. In fact, our version of quicksort now looks remarkably similar to a PASCAL or an Ada version of the same algorithm. This indicates the relative advantages of these various languages: SETL and other languages that manipulate arbitrary composite objects are the most convenient programming tools and lead most rapidly and with smallest programming effort to correct and lucid versions of algorithms. However, when efficiency considerations become critical, we must often abandon the richer semantic primitives of these languages (or drop to a lower-level language altogether) in order to control the efficiency of every statement we write.

**Removing the Recursion from Quicksort**. The next step in our improvement
of quicksort is to replace the recursive calls within quicksort by some iterative construct which can be expected to be cheaper (recall that a procedure call is as expensive as 10 to 20 arithmetic operations). The skeleton of the recursive version of quicksort is the following:

```
procedure quicksort(i, j);
  ...
  quicksort(i, k - 1);
  quicksort(k + 1, j);
```

Each invocation of quicksort receives some values for i and j and assigns some value to k. Therefore, to eliminate recursive invocations of the procedure, we must save the current values of i, j, and k, because they might be used after we return from the recursive call. This can be achieved by means of a stack on which we "pile up" the parameters which we need to preserve. A stack is a tuple which is manipulated as follows: elements are added to it at its end and deleted from it from the same end. In other words, the only operations that modify a stack are the following:

```
stack with:= x;    -- "push" operator
z frome stack;     -- "pop" operator
```
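In Python, an ordinary list serves as such a stack: `append` and `pop` act on the same end, mirroring SETL's `with:=` and `frome` (a sketch for readers unfamiliar with the SETL operators):

```python
stack = []           # a Python list used as a stack
stack.append(10)     # "push": add at the end        (SETL: stack with:= 10)
stack.append(20)
z = stack.pop()      # "pop": remove from that same end (SETL: z frome stack)
# z is now 20, and stack is [10]
```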

Whenever we perform a recursive call, we use the stack to save whatever needs to be saved, i.e., whatever current values of variables and parameters will be needed after the recursive call. Whenever we return from the recursive call, we obtain, from the stack, all values that we need to resume the computation at the point where we left off. This is in fact what the SETL run-time system does: whenever a recursive call is executed, current values are saved before the (new instance of the) procedure starts executing. However, the run-time system does this in a stereotyped fashion which is often inefficient. In the case at hand, the two recursive calls are sandwiched between stacking and unstacking operations, whose action corresponds to the following code:

```
procedure quicksort(i, j);
  ...
  stack with:= [i, j, k];   -- Save current values.
  quicksort(i, k - 1);      -- Perform recursive call.
  [i, j, k] frome stack;    -- Restore current values.
  stack with:= [i, j, k];   -- Save them again.
  quicksort(k + 1, j);
  [i, j, k] frome stack;    -- Restore current values.
end quicksort;
```

(Recall that the stack manipulations are the ones performed by the run-time
system.) Several remaining inefficiencies should now be apparent: the most
glaring one relates to the fact that the second recursive call is the last statement in the body of quicksort. In other words, there are no computations to be done after the second call, and therefore there is no need to save and restore current values of variables and parameters at this point. A smaller inefficiency is the fact that the second call only requires the values of *k* + 1 and *j*, so there is no need to save *i* either. In fact we can make do with the following:

```
procedure quicksort(i, j);
  ...
  stack with:= [k, j];
  quicksort(i, k - 1);
  [k, j] frome stack;
  quicksort(k + 1, j);
end quicksort;
```

At this point, the meaning of the stack should be clear: it allows us to keep
track of portions of the tuple *T* which remain to be sorted. With this understanding, we realize that quicksort will continue calling itself recursively as long as there is some portion of *T* still unsorted, i.e., as long as there is something left on the stack that indicates some pending invocation of quicksort. Having this insight, we can transform our algorithm into an iterative one that loops as long as the stack is not empty:

```
while stack /= [ ] loop
  [i, j] frome stack;       -- Values to work on.
  ...
  stack with:= [i, k - 1];  -- To be worked on.
  stack with:= [k + 1, j];  -- Ditto.
end loop;
```

This is still incomplete: as written, each step deletes one pair of indices from the stack and adds two such pairs to it: clearly the stack will grow without bound, and the algorithm never terminates. What prevents this disaster from happening is that successive pairs describe shorter and shorter portions of the tuple being sorted. Whenever *j* - *i* < 2, there is nothing to sort. Therefore we should only save a pair on the stack if it describes a portion of the tuple of size greater than 1. A final improvement is to save only one pair on the stack and go to work immediately on the other, rather than stacking it and unstacking it at once. In order to keep the stack small, it turns out best to stack the indices of the larger portion to be sorted and go to work on the smaller one. Putting all this together, we obtain the following:

```
procedure quicksort(rw T);
  -- Optimized version of quicksort procedure. It operates in place, with
  -- explicit stack manipulation, and the partitioning process is performed
  -- in line.
  i := 1; j := #T;              -- Initial portion to be sorted is the
                                -- whole of T.
  stack := [ ];
  loop
    e := T(i);                  -- Partitioning element.
    l := (orig_l := i) + 1;     -- Index for left-to-right scan.
    orig_r := r := j;           -- Index for right-to-left scan.
    while l < r loop                   -- Scans have not met in middle.
      while l < r and T(l) < e loop    -- Bypass smaller elements.
        l +:= 1;
      end loop;
      while l < r and T(r) > e loop    -- Bypass larger elements.
        r -:= 1;
      end loop;
      -- Now T(l) > e, and T(r) < e. Exchange them if the two scans
      -- have not met yet.
      if l < r then
        [T(l), T(r)] := [T(r), T(l)];
        -- Skip over these two elements, which are now in proper place.
        l +:= 1;
        if r > l then r -:= 1; end if;
      end if;
    end loop;
    if l > #T or T(l) > T(i) then l := (l - 1) min j; end if;
    if l > 0 then T(i) := T(l); T(l) := e; end if;
    mid := l;
    -- Save larger partition on stack.
    if orig_r - mid > mid - orig_l then
      [savei, savej] := [mid + 1, orig_r];
      j := mid - 1; i := orig_l;        -- Work on [orig_l..mid - 1]
    else
      [savei, savej] := [orig_l, mid - 1];
      i := mid + 1; j := orig_r;        -- Work on [mid + 1..orig_r]
    end if;
    -- Save larger partition only if it is longer than 1.
    if savej - savei >= 1 then stack with:= [savei, savej]; end if;
    -- Work on smaller partition only if it is longer than 1.
    -- Otherwise take the top pending partition to work on.
    if j - i = 1 and T(i) > T(j) then [T(i), T(j)] := [T(j), T(i)]; end if;
    if j - i < 2 then
      if stack = [ ] then
        exit;                           -- Nothing left to do.
      else
        [i, j] frome stack;
      end if;
    end if;
  end loop;
end quicksort;
```

This is very far from the short and intuitive algorithm with which we
started. However, this final version is considerably more efficient than the
original. It is in this final form that the quicksort procedure is usually implemented. Note, however, that this more efficient version is still incorrect in some details and would have to be touched up to function properly! For example, it will not work if *T* contains duplicate elements; moreover, during the partitioning process, it is possible for *l* or *r* to go out of bounds (i.e., *l* may become > #*T*, and *r* may become 0). These problems can be remedied without difficulty (see Exercises). However, their presence is a reminder that optimization is almost invariably accompanied by increased complexity.
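As a cross-check on the stack discipline just described, here is a Python rendering of the iterative scheme: work on the smaller partition immediately, stack the larger one, and stack only partitions longer than 1. To keep the sketch short (and incidentally tolerate duplicates), it substitutes a simple Lomuto-style partition for the book's two-scan version; it is an illustration, not the text's exact algorithm.

```python
def partition(t, i, j):
    # Lomuto-style stand-in: place pivot t[i] at its final index k, with
    # smaller elements before it, and return k.
    e = t[i]
    k = i
    for m in range(i + 1, j + 1):
        if t[m] < e:
            k += 1
            t[k], t[m] = t[m], t[k]
    t[i], t[k] = t[k], t[i]
    return k

def quicksort(t):
    """Iterative in-place quicksort driven by an explicit stack of (i, j) pairs."""
    if len(t) < 2:
        return
    stack = []
    i, j = 0, len(t) - 1
    while True:
        mid = partition(t, i, j)
        # Choose the smaller side to work on next; save the larger side.
        if (j - mid) > (mid - i):
            save, (i, j) = (mid + 1, j), (i, mid - 1)
        else:
            save, (i, j) = (i, mid - 1), (mid + 1, j)
        if save[1] - save[0] >= 1:     # stack only pieces of length > 1
            stack.append(save)
        if j - i < 1:                  # current piece is trivially sorted
            if not stack:
                break
            i, j = stack.pop()

v = [3, 1, 4, 1, 5, 9, 2, 6, 5]
quicksort(v)
# v is now sorted in place
```

Because the larger side is always the one deferred, the stack depth stays O(log n), which is the point of the "stack the larger partition" rule in the SETL version.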

Acceptance of the programming burdens that optimization demands is only justified for algorithms that are used often, consume large amounts of resources, or are critical to the overall performance of a system. Clearly, sorting is a frequently used process, and efforts spent in obtaining better sorting algorithms (the subject of much theoretical and pragmatic research in itself) and encoding them well are quite worthwhile. But it is unwise to devote strenuous optimization efforts to procedures that are not frequently used or are not critical to system performance. In many cases clarity and maintainability are far more important considerations in software design and construction than mere speed. Note however that recent research holds out some promise that it may eventually be possible for sophisticated computer programs to offer substantial assistance in applying optimizations as deep as those described in the present section.

**EXERCISES**

1. How could you use the techniques described in Section 6.5 to make nonterminating recursions less likely to occur?

2. The following SETL code is syntactically correct, but very poorly laid out. Put it in a better format, and add appropriate comments.

```
program sort; s := {3, 2, 1}; t := [ ]; while s /= { } loop s less:= (x := max/ s); t := [x] + t; end loop; end sort;
```

Improve the following program, which is also correct but poorly laid out.

```
program find_palindromes; -- "Madam Im Adam" x := "madam im adam"; y := +/[c : c in x | c /= " "]; if y = +/[y(j) : j in [#y, #y - 1..1]] then print(x, " is a palindrome"); else print(x, " is not a palindrome"); end if; end find_palindromes;
```

3. What cases should be run to test the following recursive sort procedure comprehensively?

```
procedure sort(s);  -- recursive sort of a set of integers
  return if (x := arb s) = OM then [ ]
         else sort({y in s | y < x}) + [x] + sort({y in s | y > x})
         end if;
end sort;
```

Run your tests, and try to estimate how thoroughly they test this code.

4. Suppose that G is a graph represented as a set of ordered pairs. If G contains a cycle C, then C can be regarded as a subset of G such that for each x in C there exists a y in C (namely the edge y following the edge x in the cycle C) such that x(2) = y(1). Conversely, if this condition is satisfied, then there must exist a cycle, since starting with any x in C we can find an x' such that x(2) = x'(1), then an x" in C such that x'(2) = x"(1), and so on, until eventually we must return to an edge that occurs earlier in the chain x, x', x", ..., at which point a cycle will have been formed. This leads to the following procedure for testing a graph to see whether it has a cycle:

```
procedure test_for_cycle(G);  -- tests any graph for the existence of a cycle
  print(if exists C in pow(G) | C /= { } and
             (forall x in C | (exists y in C | x(2) = y(1)))
        then "There exists a cycle"
        else "There exists no cycle"
        end if);
end test_for_cycle;
```

Work out a good battery of tests for this procedure, test it, and try to estimate how comprehensive your tests really are.

5. In the original quick_sort program, change the expression

to

thereby introducing a bug. Then run the erroneous program. What happens? Could you guess the problem from the symptom? What would be a good way of debugging this program?

6. Suppose that the subexpression [x] in Ex. 5 is not omitted but is accidentally mistyped as [z]. What will happen? Why? What will happen if it is accidentally mistyped as [0]? As {x}?

7. For debugging purposes, it is useful to have a monadic operator @OUT such that @OUT x always has the value x, but such that "evaluation" of @OUT x prints the value of x. A binary operator s @OUT2 x which returns x but prints both s and x can also be useful. Write definitions for these operators. How might you use them to debug the faulty recursive procedure described in Ex. 5?
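A Python analogue of these debugging operators (ordinary functions rather than SETL user-defined operators, offered only as a sketch of the idea):

```python
def out(x):
    # monadic @OUT: print the value of x, then return x unchanged,
    # so the operator can be wrapped around any subexpression
    print(x)
    return x

def out2(s, x):
    # binary @OUT2: print the label s together with x, then return x
    print(s, x)
    return x

total = out(2) + out2("rhs =", 3)
# prints 2, then "rhs = 3"; total is 5
```

Because both functions return their argument, they can be inserted around any subexpression of a faulty program without changing its result, which is exactly what makes them useful for tracing intermediate values.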

8. Each string in the following set consists of characters which are easily mistyped for each other:

Write an expression that converts this set of strings into a set s consisting of all
pairs of letters that are easily confused for one another. Use this set to create a
"bugging program" *B*, which can read the text of any SETL program *P*, introduce
one randomly selected character substitution drawn from *s* into it, and write the
erroneous version of *P* thereby produced into a file.

Collect various short sample
programs from your friends, "bug" them using *B*, and ask your friends to see
whether they can spot the error. Then debug these programs, to see how long it
takes you to track down the errors which *B* has introduced. If *B* is modified so
that it never changes characters in SETL keywords, but only in identifiers, how
much more elusive do the bugs that it introduces become?

9. Take the bubble-sort procedure described in Section 5.1.1 and the mergesort procedure described in Section 5.4.2, and modify them by inserting code which will count the number of comparisons which they make when used to sort a given vector t. Use these modified routines to sort tuples of length 50, 100, and 200, counting the number of comparisons performed and measuring their relative efficiencies. Try tuples with random components, and also try tuples with only a few components out of sorted order.

10. Suppose that the statement

in the bubble-sort procedure of Section 5.1.1 is replaced by

What would happen? If we checked the resulting version of bubble_sort by adding

would the problem introduced by the change (*) be found? What checking assertion could we write to catch the sort of error that (*) introduces?

11. Suppose that in the bubble-sort procedure of Section 5.1.1 we inadvertently wrote [1..#t] instead of [1..#t - 1]. What would happen? If we wrote [#t..1] instead? If we wrote [1..#t - 2]? If we wrote [2..#t]?

12. Take the bubble-sort procedure shown in Section 5.1.1 and find at least three errors that might plausibly be made in writing or typing it, any of which would cause the code to loop endlessly. None of these errors should involve changing more than six characters. Take these erroneous versions of bubble sort to friends, and see how long it takes them to spot the errors.

13. Take the merge_sort program shown in Section 5.4.2. Then modify it, to produce four different erroneous versions of merge_sort, each of which contains one of the following list of common bugs. (Try to make your modifications as plausible, and as hard to spot as possible.)

- Boolean condition stated in reversed form.
- One branch of an
**if**statement omitted. - Premature exit from a loop.
- Input data not checked; data of the wrong type read.

14. Repeat Ex. 13, but for the buckets and well program shown in Section 5.3.1.
Produce five erroneous versions of this program, each with one or two plausible
errors in every procedure. Devise a debugging plan which could discover most of
these errors quickly. In what order does it seem best to test the procedures of this
program? Where would it be most useful to place **assert** statements? Try to devise
assertions that can be checked quickly, but whose verification will be strong
evidence that the program is working as it should.

15. The following version of quicksort contains just one error. What is it?

```
procedure quicksort(t);  -- t is assumed to be a tuple of integers
  if t = [ ] then return; end if;
  x := t(1);
  t1 := [y : y = t(i) | y < x];
  t2 := [y : y = t(i) | y = x];
  t3 := [y : y = t(i) | y > x];
  quicksort(t1);
  quicksort(t3);
  t := t1 + t2 + t3;
end quicksort;
```

16. How many of the errors introduced by the "bugging program" described in Ex. 8 could be found more easily using a program which reads SETL programs and prints out a list of all identifiers appearing only once in them?

17. Write a maintenance test which could be used to check a sort program by comparing its results with those of quicksort. Use this test to verify that the mergesort procedure shown in Section 5.4.2 is correct.

18. Write a SETL system maintenance test which computes 50 set- or tuple-related values in two radically different ways and compares the results obtained. Your test should exploit various set-theoretic identities. For example,

should be true for every map *f*, and

19. To see what parts of a program have been executed in a series of tests, we can
introduce a global variable called *points*, and a **procedure**

Then if we initialize *points* by writing points := {1..n}, insert a sequence of
statements *point*(j), j = 1,...,*n* into the code being tested, and print *points* at the
end of execution, each remaining member of points will represent a section of code
that has never been executed.
Apply this technique to develop a comprehensive set of tests for the text-editing
program described in Section 5.10.2. Add tests to your set until the condition
*points* = {} is achieved, to make sure that your collection of tests does not leave
any section of code unexecuted.
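As an illustration of the technique (in Python rather than SETL, with a hypothetical two-branch function standing in for the program under test):

```python
points = set()

def point(j):
    # record that the code section tagged j has been executed
    points.discard(j)

def classify(n):
    point(1)
    if n < 0:
        point(2)
        return "negative"
    point(3)
    return "non-negative"

points = set(range(1, 4))   # initialize: points := {1..n}
classify(5)                 # one test run
print(points)               # {2}: the n < 0 branch was never exercised
```

Adding a second test, say `classify(-1)`, would empty `points`, signaling that every tagged section has now been executed at least once.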

20. A *boundary test* for a program *P* is a systematic collection of tests which exercises
*P* in all the legal but extreme cases which *P* is supposed to handle. Work up several
such boundary tests for the text-editing program described in Section 5.10.2. Your
tests should include items like checks for $0.00, empty transaction files, etc.

21. The text-editing program described in Section 5.10.2 is totally unprotected against bad input. Modify it so that all input is systematically examined for acceptability; your input-examination procedures should check for all remotely plausible input errors. Write an English-language explanation of the input errors for which you check.

22. Take one of your programs, approximately 10 lines long. Strip all comments from it, and then introduce one misprint, to cause a bug (not one that syntax analysis would find). Give the result to a friend (a good friend!) with a 3-line explanation of what the program is supposed to do. See whether your friend can find and fix the error without expending more than an hour's effort.

23. Develop test data for the GCD program outlined in Ex. 12 (following Section 5.5). Your tests should include cases in which the data are zero, negative, etc., and should test all relevant combinations of "extreme" data of this kind.

24. Write the "MISSING_SECTIONS" procedure described in Section 6.4.

The great importance of programs to banks, airlines, engineering firms, insurance companies, universities, indeed to all the major institutions of our society, lends an inescapable importance to the question of program correctness. Once a program has been written, how can we be sure that it is correct, i.e., that when given legal input it will always produce the output that its author desires? This is a deep question, whose systematic exploration would take us far beyond the boundaries of the present text. Nevertheless, in order to shed some light on the issues involved, this section will say something about it.

To begin with, we emphasise that mere program testing, even systematic
testing like that described in Section 6.4, can never prove a program's correctness in any rigorous sense. *Testing*, to repeat an important maxim of the
Dutch computer scientist Edsger Dijkstra, *can show the presence, but not
the absence, of bugs.* Though systematic testing is an essential tool of program development, in asserting the rigorous correctness of a program we are
asserting that it will run correctly in each of a potentially infinite family of cases. Clearly, no finite sequence of test cases can cover all of them, and so any rigorous assertion that a program functions correctly in all possible cases must rely on some kind of mathematical proof.

The basic raw material out of which such proofs can be built is not too hard to find. When a programmer has written a program and checked it carefully for the first time, why does he believe that it will run correctly? If legitimate, this feeling of correctness must always rest on a comprehensive logical analysis of the conditions that arise as control moves from point to point during the execution of a program.

To show what such analysis involves, we will take a very simple program,
namely one which calculates the product of two integers *n* and *m* by adding *n* to itself *m* times. (The basic technique that we will use to prove the correctness of this trivial program is entirely general; however, the mass of technical detail needed to handle more complex examples grows rapidly, and to avoid this it is best to stick to a rudimentary example.) Since it is a bit easier to handle **while** loops than **for** loops, we write our multiplication code as follows:

```
prod := 0;
iterations := 0;
while iterations /= m loop
  prod := prod + n;                     (1)
  iterations := iterations + 1;
end loop;
```

To begin to prove this program correct, we must first supplement it by adding a formal statement of what it is that the program is supposed to achieve. This can be done by adding an assert statement at the very end of the program, giving us

```
Line1:  prod := 0;
Line2:  iterations := 0;
Line3:  while iterations /= m loop
Line4:    prod := prod + n;             (2)
Line5:    iterations := iterations + 1;
Line6:  end loop;
Line7:  assert(prod = m * n);
```

In (2), all lines of the program have been labeled to facilitate later reference. Note that addition of the final **assert** statement gives a formal statement of what the program is supposed to achieve.

This fundamental principle being understood, we go on to remark that in
proving a program correct what one basically needs to do is just to write down
the logical relationships between data items upon which the programmer's
understanding of his program's behavior rests. However, these relationships
must be written down in a sufficiently complete manner and must be expressed
formally, using additional **assert** statements.

To see what is involved, let us first analyze program (2) informally. If the author of (2) wished to convince a skeptical colleague that it really does compute the product m*n, what facts about (2) would he point out, what more detailed analysis would he offer? The crucial fact upon which program (2) depends is that each time the loop starting at Line 3 begins to repeat, the variable prod will be equal to the product of the variable 'iterations' by the variable 'n'. This is certainly true on the first iteration, since then both prod and iterations are zero, so we certainly have

prod = iterations * n (3)

(i.e., 0 = 0 * n) on first entry to the loop. But if (3) is true at the start of the k-th iteration, it must also be true at the end of the k-th iteration, since the body of the loop increments 'prod' by n and increments 'iterations' by 1. Hence (3) remains true during every iteration. Thus, since the loop only terminates when the variables 'iterations' and 'm' are equal, (3) implies that prod = m*n at the end of the loop, which is what we wanted to prove.
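The informal argument can also be checked dynamically by executing (2) with the invariant (3) installed as a runtime assertion. A Python sketch (a transcription for illustration, not a proof):

```python
def multiply(n, m):
    """Program (2), with the loop invariant (3) checked on every iteration."""
    prod = 0
    iterations = 0
    while iterations != m:
        assert prod == iterations * n   # invariant (3): holds at each loop entry
        prod = prod + n
        iterations = iterations + 1
    assert prod == m * n                # the final assertion of (2)
    return prod

# multiply(7, 4) returns 28 with every assertion passing
```

Of course, a passing run only confirms the invariant for one input pair; the point of the Floyd/Hoare method discussed below is to establish it for all inputs at once.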

The argument we have just presented is a satisfactory informal proof of the correctness of the program (2). Nevertheless, it is not quite what we require. In proving that a program is correct, we aim to exclude rigorously the possibility of any small, easily overlooked program bug. For this, a merely informal English-language proof is insufficient, since such proofs are no less likely than programs to contain small errors. Moreover, some of the likeliest errors in programs (for example, counting in a manner that is off by 1) correspond closely to errors that occur frequently in mathematical proofs (for example, starting a mathematical induction at the wrong place or missing one among multiple cases that a proof needs to examine; see Ex. 12 of Section 6.7). Therefore, when we set out to prove a program rigorously correct, we must aim at something more formal and machine-checkable than an ordinary English-language proof of the kind ordinarily found in textbooks.

This observation drives us to a more formal approach, like that devised
by Robert Floyd and Tony Hoare, for proving programs like (2) correct. Floyd's formalism
requires us to add **assert** statements to a program P that we are trying to prove correct. These auxiliary **assert** statements, sometimes called the *Floyd/Hoare assertions* for P, must satisfy two principal conditions:

(a) Enough **assert** statements must be added so that there can exist no indefinitely long path through the program P which does not pass through at least one **assert** statement. Another way of putting this is to say that at least one auxiliary **assert** statement must be inserted into every loop in the program P.

(b) Consider any one of these auxiliary **assert** statements. It will have the form

```
assert(C)                               (4)
```

where C can be any Boolean-valued expression, called the condition of the **assert** statement. The auxiliary assertion (4) will occur at some specific place in the program P, say, to be specific, immediately after Line j of P. Then we require C to assert every fact about the state of the program's variables that is relevant at Line j, i.e., every fact upon which the correct functioning of P from Line j onward will depend. This important rule ensures that all the essential facts needed for proving the correctness of P are explicitly and formally written down in the auxiliary assertions added to P, and this is what makes a rigorous proof of correctness possible in principle.

Once the required assertions (4) have been added to P, we proceed as follows. Starting either at the first statement of P or at some one of the auxiliary **assert** statements in it, we move forward line by line through the program along every possible path (i.e., every path of control flow, which is to say every path that the program could follow during its execution). All possible paths through P which start at an **assert** statement but do not pass through any other **assert** statement must be considered one after another. Because (by condition (a)) there are no infinite loops not passing through an **assert** statement, there can exist only finitely many such paths, and each such path will be bounded in length.

Tracing out all such paths q, we will use each of them in the following way to generate a set V of *verification clauses*. (With one exception, noted in (f), the verification clauses associated with a particular path q collect logical relationships between variable values which are certain to hold along q.)

(a) Suppose that the path q starts at an **assert** statement **assert**(C), where C is a Boolean formula. Then we begin by putting C into V as its first clause. This simply reflects the fact that C is assumed to be true at the start of q.

(b) If the path q passes through an assignment statement of the form

```
(A)  x := expn;
```

(where *expn* can be any expression) we introduce a new variable identifier x' (this identifier simply designates the value which x has after execution of the assignment (A)) and add the clause

```
(B)  x' = expn
```

to V. Occurrences of x encountered later along the path q (but prior to any subsequent assignment to the same variable x) are then replaced by occurrences of x'. (But at and after any later assignment to x we replace x by yet another new identifier x".) For example, the sequence of assignments

```
x := x + 1; y := y + 1; z := x + y; x := x + z;
```

would generate the clauses

```
x' = x + 1,  y' = y + 1,  z' = x' + y',  x" = x' + z'.
```

These rules simply reflect the fact that the new value x' which the variable x takes on immediately after the assignment (A) satisfies the equation (B), and that x retains this value until it becomes the target of a subsequent assignment.
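This renaming discipline can be mechanized. The following Python sketch (a toy, assuming single-letter variable names and using `'`-marks where the text writes primes) generates the clauses for a straight-line sequence of assignments:

```python
def verification_clauses(assignments):
    """Generate verification clauses for a straight-line assignment sequence.

    Each assignment is a (target, expression) pair of strings. Every variable
    occurrence in an expression is replaced by its current primed name, and
    each assignment mints a new primed name for its target, as in (b).
    """
    version = {}            # current primed name of each variable
    clauses = []
    for target, expn in assignments:
        # rename each (single-letter) variable in expn to its current version
        renamed = "".join(version.get(ch, ch) for ch in expn)
        # mint the next version of the target AFTER renaming the right side
        version[target] = version.get(target, target) + "'"
        clauses.append(f"{version[target]} = {renamed}")
    return clauses

print(verification_clauses(
    [("x", "x + 1"), ("y", "y + 1"), ("z", "x + y"), ("x", "x + z")]))
# ["x' = x + 1", "y' = y + 1", "z' = x' + y'", "x'' = x' + z'"]
```

(Here `x''` corresponds to the text's x". The ordering matters: the right-hand side is renamed with the old versions before the target gets its new one, which is exactly why the final clause reads x" = x' + z' rather than x" = x" + z'.)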

(c) If the path q passes through an assignment of the special form

```
(C)  x := arb(s);
```

where s is some set-valued expression, then just as in paragraph (b) we introduce a new name for x, but in this case we add the clause

```
(D)  x' in s
```

to V. (This reflects the fact that the **arb** operator can select an arbitrary element of s, so that (D) asserts everything we can know about the new value x' given to the variable x by the assignment (C).)

(d) Conditional and unconditional **goto**'s: The SETL language does not provide any **goto** statement, but to state the next verification rule that concerns us, it is convenient to pretend that it does, and to think of SETL's **while** and **until** loops as abbreviations for lower-level constructions involving **goto**s and labels. The expansion of a **while** loop

```
while C loop statements... end loop;
```

is

```
Start_Label: if not C then goto Exit_label; end if;
             statements...
             goto Start_Label;
Exit_label:  ...
```

The corresponding expansion of an **until** loop

```
until C loop statements... end loop;
```

is

```
Start_Label: statements...
             if not C then goto Start_Label; end if;
Exit_label:  ...
```

An **exit** statement in a **while** loop translates directly into **goto** Exit_label; **while** loops containing **continue** statements, for example

```
while C loop statements... continue; more_statements... end loop;
```

expand into

```
Start_Label: if not C then goto Exit_label; end if;
             statements...
             goto Start_Label;
             more_statements...
             goto Start_Label;
Exit_label:  ...
```

and similarly for **exit** and **continue** statements in **until** loops.

Thinking of the SETL **loop** and **if** constructions in this way lets us state the following rules for setting up verification clauses. If the path q passes through a control statement of the form

```
(E)  goto Label;
```

then the path q must continue with the statement following the Label that appears in (E), but we add no clause to V at this point, since a simple **goto** does not test any condition or change the value of any variable.

On the other hand, if the path q passes through a control statement of the form

```
(F)  if C then goto Label; end if;
```

then the path q can go on either to the statement immediately following (F) or to the statement following the Label that appears in (F). In the first case, we add the clause **not** C to V; in the second case we add the clause C to V. These rules simply reflect the fact that **not** C must hold if and when q passes through a control statement (F) without the instruction **goto** Label applying, but that C must hold if and when q reaches (F) and the instruction **goto** Label is executed.

(e) As we have noted, the rules for more complex control structures, for example, general **if** constructs, **while** loops, and **until** loops, can be deduced by rewriting them in terms of the more primitive constructs (E) and (F) and then applying the rules stated previously. For example, if q encounters a multibranch **if** statement of the form

```
(G)  if C1 then block1 elseif C2 then block2 ... end if;
```

and then enters block2, it is obvious that we must add the two clauses **not** C1, C2 to V. Later, if and when q passes from the last statement of block2 to the first statement following the multibranch **if**, no clause needs to be added to V, since this transition, like (E), counts as an unconditional transfer.

The rules applying to a **while** loop

```
(H)  while C loop body end loop;
```

are similar. If and when q passes through the **while** header and enters the loop body, either by entering the loop from the statement immediately prior to (H) or by looping back from the final statement of the body of (H), we must add the Boolean clause C to V. On the other hand, if the path q encounters the **while** header but then leaves the loop (H) immediately, we must add the negated clause **not** C to V.

When q encounters the **end loop** line in (H) it will go immediately to the loop header standing at the start of (H). Since this is an unconditional transfer, we add no clause to V in this case.

When q enters an **until** loop we need not add any clause to V, since entry to such a loop is unconditional. However, if and when q encounters the **end loop** terminating such a loop, the action that we must take is a bit more complex. Suppose, to be specific, that the loop in question has the form

```
(I)  until C loop body end loop;
```

If, after encountering the **end loop** clause, q exits the loop, then plainly we must add the clause C to V. On the other hand, if q encounters the **end loop** clause and then loops back and continues with the first statement of the body of the loop, we must add the negated clause **not** C to V.

(f) Eventually, the path q that we are following will end at an **assert** statement

```
assert(C');
```

Our aim is then to show that C' is necessarily true at the end of q, provided that the assertion C with which q starts (see (a)) is true at the beginning of q, and provided also that the program's execution does indeed follow the sequence of steps corresponding to q. It is most convenient for this purpose to add the negated condition

**not**C'to V. After doing this our aim must be to show that the set V of clauses is

*inconsistent*, i.e. that not all the clauses of V can be true simultaneously. This is equivalent to requiring that, taken all together, the clauses of V, other than its final clause**not**C', imply the condition C'. - To complete the set of rules stated in the preceding paragraphs, we would need rules that tell us how to handle
**procedure**definitions and invocations. However, since these rules are somewhat more complex than those already stated, we omit them.This means that the rules that we have stated suffice for the formal verification of programs containing no**procedure**invocations, but not for programs which make use of**procedure**s. This deficiency is not fatal - it would not be terribly hard to remedy - but to do so would take us beyond the limits proper to the present work.

Once we have taken a program P containing **assert** statements and generated the set V of verification clauses corresponding to each path q starting and ending at an **assert** statement (but not passing through any other **assert** statement), we are in a position to prove the correctness of P mathematically. To do this, we must prove mathematically that each of the clause sets V which we have generated (i.e. each of the clause sets corresponding to a path q) is inconsistent. Suppose that we succeed in doing this. We can then note that the clause CI initially placed in V is true by assumption, and that, with the exception of the final clause CF of V (see (f)), all the other clauses inserted into V are true by virtue of the very meaning of the statements which the path traverses. Hence, by showing that V is inconsistent, we will have shown that if CI is true at the start of q, then CF is true at the end of q. Once this has been demonstrated for every path q through the program P (or, more precisely, every path which connects two **assert** statements but does not pass through any other **assert** statement), it will follow by mathematical induction that every **assert** statement written into P must evaluate to **true** whenever it is reached, provided only that the **assert** statement standing at the very head of P is true at the moment that execution of P begins. (This initial **assert** statement, often called the *input assertion* of P, will normally summarize all the assumptions concerning input data on which the program P relies.) All in all, we will have shown that the truth of every assertion written into P follows from the assumption that its input assertion is true.

It is important to realize that this final step of a formal program verification, i.e. the step of proving that every set of verification clauses corresponding to a path q between **assert** statements is inconsistent, is a purely mathematical task. That is, when we begin this task we will already have decoupled the work which remains from any entanglement with the control structures and other programming dictions present in the original program P. It is precisely in order to achieve this, i.e. precisely in order to transform our original program-related verification task into a purely mathematical question, that we go to the trouble of reducing the program P to the collection of clause sets V that it generates. Note again that, once all the necessary Floyd assertions have been written into the text of P, generation of the clause sets V using the rules stated above is a simple mechanical matter, essentially a matter of systematic variable renaming and extraction of suitable portions of the statements encountered along each of the paths q.
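To illustrate just how mechanical this is, here is a small Python sketch (our own illustration; the path representation and the function name are hypothetical, not part of any SETL system) which generates the verification clauses for a path consisting of assignments and tested conditions, using systematic variable renaming:

```python
# Hypothetical miniature clause generator. A path q is given as a list of
# events: ("assign", var, expr) for an assignment var := expr encountered
# along q, or ("cond", text) for a condition that must hold along q.
# Renaming is naive textual substitution, so for this sketch no variable
# name may be a substring of another.
def clauses_for_path(initial_assertion, steps, final_assertion):
    version = {}                     # number of assignments seen per variable
    def rename(text):
        for var, count in version.items():
            text = text.replace(var, var + "'" * count)
        return text
    clauses = [initial_assertion]
    for kind, *rest in steps:
        if kind == "cond":           # rule (F): record the tested condition
            clauses.append(rename(rest[0]))
        else:                        # assignment: prime the target variable
            var, expr = rest
            rhs = rename(expr)       # the right side uses the old values
            version[var] = version.get(var, 0) + 1
            clauses.append(f"{rename(var)} = {rhs}")
    clauses.append("not(" + rename(final_assertion) + ")")  # negate the goal
    return clauses

# The loop-body path through program (5) reproduces clause set (8):
for clause in clauses_for_path(
        "prod = iterations * n",
        [("assign", "prod", "prod + n"),
         ("assign", "iterations", "iterations + 1"),
         ("cond", "iterations /= m")],
        "prod = iterations * n"):
    print(clause)
```

Run on the loop-body path of program (5), this prints exactly the five clauses of set (8) below.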

Line1: prod := 0;
Line2: iterations := 0;
Line3: **while** iterations /= m **loop**
**assert**(prod = iterations * n); (5)
Line4: prod := prod + n;
Line5: iterations := iterations + 1;
Line6: **end loop**;
Line7: **assert**(prod = m * n);

Writing (5) puts us in position to generate the clause sets needed to verify the correctness of the program we are considering. There are just four paths through this program that need to be taken into account. The first of these is the path running from the start of (5) to the first **assert** statement. It generates the clause set

prod' = 0, iterations' = 0, iterations' /= m, not(prod' = iterations' * n) (6)

The second path that we need to consider runs from the start of (5) to the **while**-loop header but not into the **while** loop and then to the final **assert** statement. This path generates the clause set.

prod' = 0, iterations' = 0, not(iterations' /= m), not(prod' = m * n). (7)

A third path between **asserts** runs from the **assert** statement following Line3 through the body of the **while** loop and then back to this same **assert** statement. This generates the clause set.

prod = iterations * n, prod' = prod + n, iterations' = iterations + 1, iterations' /= m, not(prod' = iterations' * n). (8)

The fourth and final path that we need to consider is the one which runs from the **assert** statement following Line3 through the body of the **while** loop but then exits the loop passing through Line3 and then going immediately to Line7. The rules stated previously tell us that this path generates the clause set.

prod = iterations * n, prod' = prod + n, iterations' = iterations + 1, not(iterations' /= m), not(prod' = m * n). (9)

These are all the possible paths not running through any **assert** statement, and hence these are all the clause sets that we need to consider. Once these clause sets have been generated, it is easy to prove that each of them is inconsistent. In view of the simplicity of our original example, nothing more than elementary algebra is needed for any of these proofs. In (6) the first two clauses plainly contradict the fourth; in (7) the first three clauses contradict the fourth. In (8) the first three clauses contradict the fifth; in (9) the first four clauses contradict the fifth. This completes our formal verification of the program (2).
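Because these clauses involve only elementary arithmetic, their inconsistency can also be checked mechanically. The following Python sketch (our own illustration, not part of the SETL apparatus described in the text) encodes clause sets (6), (7), (8), and (9) as predicates and confirms by exhaustive search that no small-integer assignment satisfies all the clauses of any set; the algebraic arguments above show the same for all integers:

```python
from itertools import product

# Each clause set maps an assignment of values to a list of truth values.
# prod_1 and iters_1 play the role of the primed variables prod', iterations'.
clause_sets = {
    "(6)": lambda m, n, prod, iters, prod_1, iters_1: [
        prod_1 == 0, iters_1 == 0, iters_1 != m,
        not (prod_1 == iters_1 * n)],
    "(7)": lambda m, n, prod, iters, prod_1, iters_1: [
        prod_1 == 0, iters_1 == 0, not (iters_1 != m),
        not (prod_1 == m * n)],
    "(8)": lambda m, n, prod, iters, prod_1, iters_1: [
        prod == iters * n, prod_1 == prod + n, iters_1 == iters + 1,
        iters_1 != m, not (prod_1 == iters_1 * n)],
    "(9)": lambda m, n, prod, iters, prod_1, iters_1: [
        prod == iters * n, prod_1 == prod + n, iters_1 == iters + 1,
        not (iters_1 != m), not (prod_1 == m * n)],
}

R = range(-3, 4)  # a small search range; the algebraic proofs in the text
                  # establish inconsistency over all integers
for name, cs in clause_sets.items():
    assert not any(all(cs(*vals)) for vals in product(R, repeat=6)), name
print("all four clause sets are inconsistent over the search range")
```

A satisfiability checker of this exhaustive kind is of course only an illustration; a genuine verifier would use symbolic reasoning rather than enumeration.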

It is important to note that this formal verification is very close in spirit to the informal English-language proof of the correctness of (2) that we gave earlier: the formal proof only regularizes and systematizes the informal proof. However, this formalization has the vital effect of making it possible to proceed *mechanically*, thereby ruling out the possibility of small errors. Strictly speaking, to be quite sure that error is impossible the clause sets would have to be generated mechanically by an extension of the SETL compiler, and the informal proof of inconsistency which we have supplied for each clause set would have to be checked mechanically. This can be done, but its final step is not easy.

As already observed the clause-set generation process that we have applied to the example program (2) is quite general, and will apply with much the same ease to any other long or short program which has been decorated with a sufficiently full set of assertions. However for more complex programs the sets generated will not be as simple as (6), (7), (8) and (9). Program (2) involves algebraic operations only, and this is why the clause sets generated from it consist entirely of elementary algebraic formulae. Less elementary programs generally involve both algebraic and set-theoretic operations and this will cause set-theoretic expressions to appear in the Floyd assertions and hence in the clause sets associated with these programs. (Several programs illustrating this remark appear in the verification-oriented exercises of Section 6.7.) To show that such clause sets are inconsistent is considerably less trivial than to deal with the clause sets arising in the highly simplified example that we have considered. Nevertheless, with care and sufficient effort, the proofs required to show clause set inconsistency can always be checked formally after they have been constructed by using only the tools which formal mathematics and symbolic logic make available. In this sense, the formal **assert**-statement based verification approach that we have described reduces the problem of rigorous program verification to a purely mathematical question, namely that of proving the inconsistency of certain clause sets written in a formal mathematical notation. This is as far as we will carry our discussion of formal verification, since to discuss the mathematical problems that must then be faced would take us outside the scope proper to this text.

To understand what underlying forces shape the development of programs, it is well to observe that ingredients of two fundamental sorts enter into the composition of a program. Material of the first kind serves to define user desires and expectations concerning an intended application: for example, the nature of expected input and of output, including output text formats, graphic layouts, prompts and warnings issued by interactive systems, error diagnostics generated by compilers, etc. This material, which often constitutes the overwhelming bulk of a particular application-oriented program, is motivated by user-oriented considerations having an intrinsically nonmathematical character. Material of a second, much more highly algorithmic kind also appears in programs. This *internal* program material creates the toolbox of operations which is used to achieve whatever *external* behavior is desired. Depending on the relative weight of program material belonging to these two categories (external and internal), a program can be called an *externally motivated* or *internally motivated* program, an *application* or an *algorithm*; one might even say a *superficial* or *subtle* program.

Looking back over some of the programs presented in earlier chapters it is easy to apply this distinction. For example, the shortest path code presented in Section 4.3.8.2 is an internally motivated algorithm (though not a very deep one). In contrast, the cross reference program presented in Section 5.8.12 of Chapter 5 has very little algorithmic content. Most of its details relate to such external matters as the rules which distinguish words from punctuation marks in English text, and one whole procedure of this program, namely 'arrange', is needed only because we want to print lists of line numbers in a neat, easy-to-read tabular arrangement. Other examples are the quicksort procedure of Section 5.4.1 and the mergesort procedure of Section 5.4.2, which are both algorithms whose recursive structure gives them a certain depth in spite of their brevity, and the polynomial manipulation procedures of Section 5.1, which are also algorithms, albeit rather easy ones, since they are little more than transcriptions into SETL of the ordinary algebraic definitions of polynomial sum, difference, product, etc. On the other hand, the Turtle language interpreter presented in Section 4.7.1 is externally rather than internally determined: this code uses no nontrivial algorithm, but merely reflects the rules of the Turtle language in an almost one-to-one manner. The 'buckets and well' program of Section 5.7 makes the distinction between internally and externally motivated code particularly clear, since one of its procedures, namely the crucial find_path routine, is an internally motivated algorithm. All its other procedures are externally motivated, some of these relating to such issues as the acquisition and checking of initial data, although others merely serve to represent the rules of the problem itself.

The basic concepts and notations of mathematics, which SETL makes available as tools of programming, serve very adequately to define the internally motivated, algorithmic parts of programs. We have seen that SETL's set-theoretic features allow mathematical functions to be described either in a deliberately succinct "high" style which defines them very directly, or more procedurally by algorithms which compute these same functions, sometimes in surprisingly clever, much more efficient ways. We have also noted that useful mathematical operations which are not directly provided by SETL can be built up by writing suitable families of procedures, and have emphasized (see our discussion of the family of polynomial-manipulating procedures developed in Section 5.1.3) that such families should be written to hide the internal representational details of the mathematical objects they manipulate, allowing a user to think in terms of these objects (e.g., polynomials) rather than in more primitive set-theoretic terms. By using such approaches, by studying important algorithms carefully, and by consulting the rapidly growing technical literature of algorithms, which by now describes many useful, highly sophisticated ones, you will find that the purely algorithmic side of programming can be brought under a reasonable degree of control.
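The representation-hiding idea can be sketched as follows in Python (a hypothetical illustration; the SETL procedures of Section 5.1 differ in detail): polynomials are stored as exponent-to-coefficient maps, but a user touches them only through a small family of procedures and never manipulates the underlying maps directly.

```python
# Hypothetical sketch of a procedure family hiding a representation:
# polynomials are stored as {exponent: coefficient} dicts, but users
# call only make_poly, poly_add, poly_mul, and poly_eval.
def make_poly(coeffs):
    """coeffs: a sequence of (exponent, coefficient) pairs."""
    return {e: c for e, c in coeffs if c != 0}

def poly_add(p, q):
    r = dict(p)
    for e, c in q.items():
        r[e] = r.get(e, 0) + c
    return {e: c for e, c in r.items() if c != 0}

def poly_mul(p, q):
    r = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            r[e1 + e2] = r.get(e1 + e2, 0) + c1 * c2
    return {e: c for e, c in r.items() if c != 0}

def poly_eval(p, x):
    return sum(c * x**e for e, c in p.items())

p = make_poly([(1, 1), (0, 1)])       # the polynomial x + 1
print(poly_eval(poly_mul(p, p), 2))   # (x + 1)**2 evaluated at x = 2 → 9
```

If the dictionary representation is later replaced, say by coefficient tuples, only these four procedures need to change; code written in terms of them is unaffected.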

The externally motivated aspects of programs reflect a considerably more miscellaneous congeries of influences, for example the physical or administrative structure of real-world systems; the form and sequencing of expected input and desired output; the reactions, including prompts and warnings, expected from interactive systems; and the heuristic approaches used to manipulate physical or symbolic objects effectively. How can we come to terms with such varied material?

There has developed a large, though largely administrative, literature concerning the important problem of how to come to terms with external aspects of application design before the start of detailed programming. This is the so-called problem of *requirements specification*. Concerning the literature devoted to this problem the astute observer G. J. Myers comments:

Although no methodology exists for external design, a valuable principle to follow is the idea of conceptual integrity [i.e.] the harmony (or lack of harmony) among the external interfaces of the system. The easiest way not to achieve conceptual harmony is to attempt to produce an external design with too many people. The magic number seems to be about two. Depending on the size of the project, one or two people should have the responsibility for the external design. Who, then, should these select responsible people be? The process of external design has little or nothing to do with programming; it is more directly concerned with understanding the user's environment, problem, and needs, and the psychology of man-machine communications. Because of its increasing importance in software development, external design requires some type of specialist. The specialist must understand all the fields mentioned above, and should also have a familiarity with all phases of software design and testing, to understand the effects of external design on these phases. Candidates that come to mind are systems analysts, behavioral psychologists, operations-research specialists, industrial engineers and possibly computer scientists (providing their education includes these areas, which is rarely the case).

Though Myers's general remarks are helpful, it is still important to try to say something more about the organization of externally motivated, applications-oriented programs.

One important possibility in this area is to develop special application-oriented programming languages whose objects and operations define useful, standard approaches to important application areas. The Tk widget library and the 'Panel' system discussed later in this book are languages of this kind, as is the 'Berkeley socket' library for Internet programming that SETL is able to access via its Tk or Python interfaces (these are discussed later).

Even if such languages remain unimplemented and are not available to be run on any computer, their notations and conceptual structure can provide important tools of thought. In developing an application it may be well to write out a first version of the application in a helpful, even if unimplemented, *auxiliary language*. This first version can then be translated into SETL, by choosing SETL representations for all the kinds of objects appearing in the auxiliary language, and writing SETL routines which implement its primitive operations. Used in this way, the auxiliary language can identify families of operations that work harmoniously together, and procedures into which a SETL application code can usefully be organized. For this reason, comparative study of disparate application-oriented languages, e.g., SNOBOL, GPSS, APL, COBOL, LISP, PROLOG, JAVA, PERL, SQL, VRML, and of gestural programming environments like Macromedia Director, is an essential exercise for the would-be programmer.

A useful suggestion is to isolate the internal algorithmic content of applications from their external requirements, and to prototype their internal operations using general mathematical operations rather than tailored special cases of them. Contrasting with this recommended practice, ordinary application-oriented code tends to mix internally and externally motivated program material inextricably, i.e., output details are allowed to control the choice of algorithms, and opportunities to generate output which an algorithm seems to afford are allowed to determine much of what the end user sees. Inartistic packages, which meet user requirements only minimally, and which are full of redundant, hard to maintain, and inefficient algorithmic fragments, often result.

A related suggestion is to use well-designed general-purpose application modules as building blocks for the construction of more complex applications. The Tk-derived widget sets of the SETL interactive interface facilitate this approach.

As we have said, most of the text of an applications-oriented program is often nothing more than a restatement, in programming language terms, of external facts and rules pertaining to the intended application. Once a program's designer has found a way of representing these facts and rules as succinctly and clearly as a well-conceived English language description of these same details would, he/she has programmed these external aspects about as effectively as can be expected. The remaining algorithmic content of a highly external program will normally be small. However:

- A few genuine but generally rather elementary algorithms may be used. For example, one may want to sort, perform a binary search, or put the data to be processed into some arrangement which makes it easy to locate important groups of interrelated data items.
- To improve efficiency one will often apply *formal differentiation* to application-oriented code. This is the technique of speeding up the calculation of a quantity E that will be required repeatedly by storing its value in a variable value_of_E, which must then be updated whenever any parameter on which E depends is changed. (Whenever this common technique is applied, it tends to complicate the application code, since it replaces a single integral, often self-explanatory computation of E by multiple scattered, harder-to-fathom updates of value_of_E.) A related technique is to replace direct use of set formers and tuple formers by loops which build these same values. Sometimes this is done in order to combine several such loops, all of which iterate over the same set, into a single loop. For example, in application-oriented code (and even in hand-optimized algorithms) one is less apt to see

num_rich_families := #{x **in** families | family_income(x) >= 100000};
num_middle_families := #{x **in** families | family_income(x) < 100000 **and** family_income(x) > 15000}; (1a)
num_poor_families := #{x **in** families | family_income(x) <= 15000};

than to see something like

num_rich_families := num_middle_families := num_poor_families := 0;
**for** x **in** families **loop**
**if** (income := family_income(x)) >= 100000 **then**
num_rich_families +:= 1;
**elseif** income > 15000 **then** (1b)
num_middle_families +:= 1;
**else**
num_poor_families +:= 1;
**end if**;
**end loop**;

The code (1b) arises from (1a) by expansion into loops of the three set formers appearing in (1a), followed by combination of the three resulting loops, and then by the application of a few other rather obvious optimizing transformations. Note that (1b), although much more efficient and not much longer than (1a), is not quite as obvious a piece of code; certainly (1b) is less brutally direct than (1a).
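The effect of the transformation is easy to check experimentally. Here are (1a) and (1b) transcribed into Python (the sample data and variable names are our own), with an assertion that the two versions produce identical counts:

```python
# The set-former version (1a) and the fused-loop version (1b), transcribed
# into Python with made-up sample incomes, to check that they agree.
import random

random.seed(1)
families = range(50)
income = {x: random.randrange(0, 200000) for x in families}

# (1a): three separate set formers, each scanning all families
num_rich = len({x for x in families if income[x] >= 100000})
num_middle = len({x for x in families if 15000 < income[x] < 100000})
num_poor = len({x for x in families if income[x] <= 15000})

# (1b): one combined loop, with a single income lookup per family
rich = middle = poor = 0
for x in families:
    inc = income[x]
    if inc >= 100000:
        rich += 1
    elif inc > 15000:
        middle += 1
    else:
        poor += 1

assert (rich, middle, poor) == (num_rich, num_middle, num_poor)
print(rich, middle, poor)
```

The fused loop scans the data once and evaluates family_income only once per family, at the cost of the directness of (1a).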

Internally motivated code passages, i.e. significant algorithms, use a much wider range of tricks than appear in more superficial, application-oriented programs. (It is partly for this reason that it is well to separate internally determined from externally determined code sections. Externally oriented code can often be ground out routinely once a good approach has been defined, but deeper, internally oriented code needs to be approached much more cautiously, more "by the book".)

Formal differentiation, described previously, often shapes the design of internally oriented code passages. Another important technique is exploitation of recursive mathematical relationships which express some desired function f of a composite object x in terms of values f(x1),...,f(xn) calculated for one or more smaller subparts xj of x. As noted in Section 5.4, relationships of this recursive kind underlie such high-efficiency algorithms as mergesort and quicksort. Beyond these two most common techniques, algorithm designers have uncovered many sophisticated techniques which handle important tasks with remarkable efficiency. Some of these algorithms rest on subtle mathematical relationships whose discussion goes beyond the scope of this book. Since your ability to devise truly effective approaches to programming problems will be strongly conditioned by your familiarity with the rich and growing literature of algorithms, you are strongly advised to study this material as soon as you have mastered the more basic material contained in this book. A short list of useful collections of advanced algorithms is found at the end of this chapter.
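As a reminder of the recursive scheme just mentioned, here is mergesort in Python (a sketch only; Section 5.4.2 gives the SETL version): the sorted form of a tuple is computed from the sorted forms of its two halves.

```python
# Mergesort as an instance of the recursive scheme f(x) computed from
# f applied to smaller subparts of x: sort each half, then merge.
def mergesort(t):
    if len(t) <= 1:
        return t
    mid = len(t) // 2
    left, right = mergesort(t[:mid]), mergesort(t[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # merge the two sorted halves
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]      # append whichever half remains

print(mergesort([5, 1, 4, 2, 3]))  # → [1, 2, 3, 4, 5]
```

Because each of the O(log n) levels of recursion does O(n) work in the merges, the whole sort runs in O(n log n) time.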

EXERCISES

1. Into the bubble-sort code shown as (12) of Section 6.5.6, insert code which will count the number of iterations performed.

Then measure this number i of iterations for randomly chosen tuples of varying lengths L, and calculate the ratio of i to L**3, to estimate the constant c that should appear in the formula i = c * L**3 projected in Section 6.5.6. Do the same for quicksort.

2. How much more efficient than the bubble-sort method (12) of Section 6.5.6 would you expect quicksort to be, for sorting a tuple of 10 elements? For sorting a tuple of 100 or 1000 elements?

3. Use the technique described in Section 6.5 to estimate the time required to sort a vector of length using mergesort and also the time required to search for a specified component in a sorted vector using the binary search algorithm given in Section 5.4.3.

4. What set will be printed by the following code?

n := 10; s := {};
**for** i **in** [1..n] **loop** s **with**:= s; **end loop**;
**print**(s);

If we changed the first statement to n := 1000, for roughly how long would you expect the resulting code to execute?

5. Compare the time required to execute the following codes

n := 500; s := {};
**for** i **in** [1..n] **loop** s **with**:= 2 * i; **end loop**;

and

n := 500; s := {};
**for** i **in** [1..n] **loop** s **with**:= 2 * i; t := s; **end loop**;

What accounts for the difference?

6. Write a program which will execute the 10 elementary SETL operations which you consider most important 1000 times each, and which will estimate the time required to execute each such instruction. To estimate the time required just to execute looping operations, your tests should compare loops like

**for** i **in** [1..n] **loop** x := y + z; **end loop**;

and

**for** i **in** [1..n] **loop** x := y + z; x := y + z; **end loop**;

The time difference per iteration is then clearly the cost of executing the additional operation.
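In Python, the comparison suggested in Exercise 6 might be sketched as follows (the function name and parameters are our own; absolute timings will of course vary from machine to machine):

```python
# The timing technique of Exercise 6, transcribed into Python: time a loop
# containing one addition per iteration against a loop containing two, and
# take the difference per iteration as an estimate of the cost of one
# addition-and-assignment.
import time

def cost_per_iteration(n=1_000_000):
    y, z = 1, 2
    t0 = time.perf_counter()
    for _ in range(n):
        x = y + z
    t1 = time.perf_counter()
    for _ in range(n):
        x = y + z
        x = y + z
    t2 = time.perf_counter()
    # (time of two-statement loop) - (time of one-statement loop), per pass
    return ((t2 - t1) - (t1 - t0)) / n

c = cost_per_iteration()
print(f"roughly {c:.2e} s per addition-and-assignment")
```

On a lightly loaded machine the estimate stabilizes as n grows; on a busy one, repeated runs and averaging are needed, since a single measurement can even come out negative.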

7. Take the 'buckets and well' program described in Section 5.3.1 and modify it by inserting code which will count the number of times that every one of its procedures is called and the number of times that every loop in it is executed. This information should be written to a tuple, and a general-purpose routine which prints this information in an attractive format should be designed and implemented.

8. A tuple t all of whose components are different from **OM** can be represented
in the "list" form described in Section 6.5.9 i.e., by a pair [x1,f] such that
x1 is the first component of t and f is a map which sends each component of t
into the next component of t. Use an iteration procedure to write short codes which
convert a tuple t from its standard form to this list form and vice versa.

9. Rewrite program (2) of Section 6.6 by introducing labels and **gotos** in place
of the **while** loop appearing in this program. More precisely the
**while**-loop header should be replaced by the following labeled statement.

Label1: **if not**(iterations /= m) **then goto** Label2; **end**;

and the loop trailer by

**goto** Label1; Label2: ...

The final **assert** statement of (2) should follow this label. If we transform (2) in this way, we can insert the auxiliary assertion **assert**(prod = iterations * n) immediately after Label1. Make this assertion; then generate clause sets as in Section 6.6.1, and prove that the resulting variant of program (2) is correct. How does this proof compare in difficulty to the proof of correctness of program (2) given in Section 6.6?

10. A set-theoretic iteration

**for** x **in** s **loop** ... **end loop**; (1)

can be rewritten as a **while** loop in the following way: We introduce a new variable sc (representing the collection of elements of s that have not yet been iterated over). Then the loop (1) can be rewritten as the following **while** loop:

sc := s;
**while** sc /= {} **loop** (1')
x := **arb**(sc); sc := sc - {x};
...
**end loop**;

Using this rewriting, prove that the program

count1 := 0; count2 := 0;
**for** x **in** s **loop**
**if** x > 0 **then** count1 := count1 + 1; **end if**; (2)
**if** x <= 0 **then** count2 := count2 + 1; **end if**;
**end loop**;

gives the variables count1 and count2 final values satisfying the equation count1 + count2 = #s. Work out a full set of Floyd-Hoare assertions for the program, and write out the clause sets generated by these Floyd assertions. A rigorous English language proof that each of these clause sets is inconsistent should then be given.
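The set-iteration rewriting of Exercise 10 can be transcribed into Python (with sample data of our own) to watch it in action; Python's set.pop plays the role of x := arb(sc); sc := sc - {x}:

```python
# The while-loop rewriting of a set iteration, in Python: iterate over a
# set by repeatedly removing an arbitrary element from a working copy.
s = {3, -1, 4, 0, -5}
count1 = count2 = 0
sc = set(s)                 # sc: the elements not yet iterated over
while sc:                   # while sc /= {}
    x = sc.pop()            # x := arb(sc); sc := sc - {x};
    if x > 0:
        count1 += 1
    if x <= 0:
        count2 += 1
assert count1 + count2 == len(s)
print(count1, count2)  # → 2 3
```

Note that the counts are independent of the (arbitrary) order in which pop removes elements, which is exactly why the loop invariant of the exercise can be stated without reference to that order.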

11. Assume that s1 and s2 are two sets. Proceeding as in Ex. 10, prove that the program

count := 0;
**for** x **in** s1 **loop** **for** y **in** s2 **loop** count := count + 1; **end loop**; **end loop**;

gives the variable count a final value equal to #s1 * #s2.

12. The following incorrect mathematical proof contains a bug. What typical program bug does it resemble? How likely is it that a proof-bug of this sort could slip through in an (erroneous) proof that some program is correct?

**Theorem**. Given any group of n balls, all the balls in the group have the same color.

**Proof**. (By mathematical induction) This is clearly true if n = 1. Suppose now that it is true for m = n - 1. Given a group of n balls, remove one, call it x. Then by inductive hypothesis the remaining balls all have the same color. Choose one ball in this group, call it y, and replace y by x in the group. This is still a group of m balls, so by inductive hypothesis all the balls in the group have the same color. This shows that x has the same color as all the other balls, so all n balls have the same color. Q.E.D.

Reingold, E., Nievergelt, J., and Deo, N., *Combinatorial Algorithms: Theory and Practice* (Prentice-Hall Publishers 1977) is an intermediate-level work which presents many useful techniques for generating combinatorial objects, fast searching and sorting, and graph processing. It also discusses the mathematical techniques used to estimate algorithm efficiency and can serve well as a guide to further reading in this important area.

*The Design and Analysis of Computer Algorithms* by A. Aho, J. Hopcroft, and J. Ullman (Addison-Wesley Publishers, 1975), which is more advanced, contains an excellent survey of many important algorithms, data-structuring techniques, and methods for determining the efficiency of algorithms. This useful book also describes various important techniques for proving upper bounds on the speed with which various quantities can be calculated.

The first three volumes of Donald Knuth's famous *Art of Computer Programming* (Addison-Wesley Publishers, 1973) cover several important classes of algorithms (including basic combinatorial algorithms, polynomial manipulation, multiprecision arithmetic, calculation of random numbers, sorting, and searching) very comprehensively. Knuth gives many detailed analyses of algorithm efficiency and is a basic reference for this topic.

Borodin and Munro, *Computational Complexity of Algebraic and Numeric Problems* (American Elsevier Publishers, 1975) is a specialized work which presents many algorithms for high-efficiency processing of polynomials and for related algebraic and arithmetic processes.

Numerical algorithms, i.e., algorithms for carrying out numerical computations, including solution of linear and nonlinear equations, calculation of integrals, solution of differential equations, and minimization of functions of several variables, have a very extensive history which reaches back to the nineteenth century and beyond. A first-class modern introduction to this classical area of computational technique is found in Dahlquist, Bjorck, and Anderson, *Numerical Methods* (Prentice-Hall Publishers, 1974).

Methods for treating systems of linear equations and inequalities form the content of the area of algorithmics known as linear programming. For an account of this interesting and important subject see D. Luenberger, *Introduction to Linear and Nonlinear Programming* (Addison-Wesley Publishers, 1973).

Many areas of algorithm design have developed very actively during the last few years. One of the most fascinating of these is computational geometry, the body of techniques used for the rapid calculation of solutions to geometric problems. For an introduction to work in this area see F. Preparata and M. Shamos, *Computational Geometry* (Springer-Verlag, 1985).