CHAPTER 6

Program Development, Testing, and Debugging

The normal stages of a program's life cycle are:

Initial conception, formulation of requirements.
Overall design of a program that will meet these requirements
Detailed design and coding.
Program review, with rework and extension as needed to clarify, simplify or improve efficiency.
Development of a test plan, testing and debugging, removal of errors and retest.
Operational use of program.
Enhancement and repair during continuing operational use.
Retirement.

This chapter discusses various key aspects of this program life cycle providing hints that aim to help the inexperienced programmer to cope effectively with the pragmatic problems normally associated with program design, debugging, and maintenance.

6.1 Bugs: How To Minimize Them

Any error affecting the behavior of a program is called a bug. Bugs are inevitable, but a few cardinal rules can help minimize the degree to which they infest your programs.

Know that they will occur. Since any small error, e.g., forgetting a line, typing "-" where " + " is meant, misspelling an identifier or keyword, incorrectly inserting parentheses into an expression, will cause a bug, you must train and discipline yourself to higher levels of logical and typographical accuracy in programming than are required in any other human activity. Be suspicious. Program defensively. Check your programs scrupulously for syntactic and logical correctness, several times if necessary, before you try to run them. If in doubt as to the meaning of any operation or programming language construct, look it up.
When bugs occur, your problem is to locate, recognize, and remove them. Bugs cannot be located unless you know the programming language with which you are working well enough to recognize problems when you are looking at them. Bugs cannot be eliminated until you have understood them well enough to know why and how they cause the faults that betray their presence. Finding bugs, like finding needles in a haystack, calls for systematic sifting, for careful detective work. A program is a delicate piece of machinery, and it is simple folly to think that you can make it work by kicking it hard in some random way to make its pieces fall into place. Because they involve many submechanisms, all of which must interface correctly if they are to work together properly, programs, like elaborate combination locks, require careful analysis and attentive sensing of their hidden internals when they need repair. The novice who tries to fix a malfunctioning program without fully understanding the way in which it is working is attempting a task that is far less hopeful than that faced by someone who tries to open an unfamiliar safe without understanding its workings or combination. "The sequence 33-8-19-27 doesn't work? Then I'll try 23-92-69-46. This doesn't work either? Then maybe 17-51-85-34 will be luckier." A student who allows himself to be drawn into of this sort of thoughtless, random attempt to repair a program will inevitably find that his efforts drag on unsuccessfully, not only till the end of the term or year, but until the end of the solar system, without revealing anything. What is needed instead is a systematic, analytic approach.
Though programs are almost never entirely bug-free, observance of the rules of good programming style can reduce the density of bugs in your initial program drafts and allow bugs to be found more quickly once testing of your program begins. Finding the right approach to the programming task that confronts you, the right style in which to start writing the code that you need, is of prime importance. To find this "right approach" requires careful consideration of the logical structure of your programming task, with the aim of defining a collection of intuitively transparent operations that work well together and can be used to accomplish this task in as straightforward a way as possible. Code should impress by its clarity, naturalness, and inevitability, all of which make avoidance and exposure of bugs easier, rather than by obscure trickery and impenetrable cleverness. Programs that achieve brevity without sacrificing clarity are most desirable, since lines of code that you never need to write will never contain bugs. Effective brevity is attained by a correct choice of intermediate operations and by systematic use of these operations to produce the program you require. SETL is in itself a powerful programming language, but especially for larger, more complex applications it may be well to program by first inventing a still more powerful language specially adapted to your intended application. Then your initial program draft can be written in this (possibly unimplemented) language, after which it can be transcribed into SETL to make it executable. In this sort of approach, the primitives of your invented language will become the procedures of your SETL code. By using an auxiliary language in this way and by handling its transcription into SETL in as mechanical a style as possible, valuable protection against error is gained. See Sections 5.2 and 8.5 for a discussion of related issues.
Careful program documentation also serves to expose and eliminate bugs. Good documentation will add an important degree of redundancy to your program. Your code expresses your intent in one way and your comments express the same intent in another. Discrepancies between the two indicate the presence of bugs. Carefully thought-out comments should be added to a program as soon as the code is written. Some comments will in fact be written before the code to which they refer, in order to guide composition of the code. Any additional comments needed to make documentation complete should be added to the code while it is still fresh; this creates an opportunity to review the code, checking it for logical faults. After the whole text, code plus comments, has been constructed and put into proper format, it should be left to "cool" for a few hours or days, after which it should be reviewed attentively and suspiciously. Such a "cooling-off period" will dispel some of the initial misapprehensions which may have crept into a code and thus will allow various systematic errors to be corrected.
As has been said, brevity in coding is desirable, but this should be the kind of brevity that flows naturally from an effective overall approach to the programming task at hand, not the undesirable brevity which comes from stinting redundancy (e.g., by using short, unmnemonic, variable names). Use the features of the SETL language vigorously and eliminate clumsy circumlocutions where direct modes of expression exist, but avoid obscure tricks even where these gain brevity.
Certain constructions, for example those that perform elementary arithmetic computations to determine positions in strings and tuples (for which "off-by-one" errors can easily occur) are bug-prone and need to be approached with caution. For example, what is the length of a string s(i..j), is it j - i, j - i + 1, or j - i - 1? To ensure that s(i..j) is exactly k characters long, what value do we give j: i + k - 1,or i + k + 1? Learn to recognize these trouble spots, double-check them when preparing your code, and surround them with assert checks when you do use them.
For example, if you write

assert #s(i..j) = j - i;
immediately before proceeding on this assumption, your error will be pinpointed immediately; if you omit this check, you may have to find your way back to this error from some obscure symptom. A related idea is to introduce, and use, a collection of small standard procedures to handle these touchy situations in ways that are more instructive. For example, by introducing the macros (see Chapter 8).
```
			procedure len_from(i,j); return j - i + 1; end;
			
			procedure take_len(s,i,k); return s(i..i + k-1); end;
```
we can accurately extract a string of length k from s starting at character position i by writing
```
			take_len(s,i,k);
```
and can evaluate the length of s(i..j) by writing
```
			len_from(i,j)
```
As Donald Knuth has remarked, premature optimization is the root of much evil in programming. Compulsive (and ultimately ineffective) attempts to gain minor efficiency advantages often complicate programs and introduce bugs into them. As you compose a program, remember that substantial efficiency advantages will be gained globally by choice of effective algorithms, not locally by compounding of minor advantages.
Your program test plan should begin to be developed as your program is being written, and a substantial portion of the collection of test- and debug-oriented print and assert statements that you will use to test your program should be composed and entered as soon as the first draft of the program beings to approach completion. Early attention to your test plan will serve to pinpoint complex program sections that require careful testing. These are also the sections whose logic needs to be inspected most closely before testing begins. See Sections 6.4 and 6.6 for additional discussion of this point.

6.2 Finding Bugs

Even, alas, if you are very systematic and professional, some bugs will creep into your program, and the problem will then be to find and fix them. The following remarks should help you learn how to do this effectively. Debugging always starts with evidence that a program error has occurred somewhere during a program run. The problem in debugging is to work one's way back, from the visible symptom first noticed, to the underlying error. The errors one is looking for can be called the error sources or primal anomalies. These are the first (incorrectly written) operations or statements which get correct data from what has gone before them but pass data that are no longer correct to what comes after them. They are the instructions at which your program first "runs off the rails." The initial evidence of error that you see may relate only indirectly to these primary error sources. The difficulty of finding the erroneous statements is complicated by the fact that the full history of an extensive computation comprises a vast mass of data, impossible to survey comprehensively. In debugging you must therefore aim to explore as narrow a path as possible, while still finding your way back to one or more primal anomalies.

A good first step, but one that should not be allowed to hold you up too long, is to look closely at whatever fragments of correct output have been produced. If little or no output is correct, then your program may have failed before even the first print statement was executed. This hint may help you narrow the bug hunt. On the other hand, if some output is correct, then the program was probably functioning correctly till some point past the statement which produced the last correct output. Find the point in your program at which this output was produced, and see what comes before and after it. Again, this may narrow the hunt. Examine the erroneous output carefully and try to see whether its logical pattern reminds you of any particular section of your program. This also can sometimes yield useful hints concerning the likely location of the bug, especially if different parts of your output data are produced by recognizably different sections of code. If certain items of output that you expect are missing, try to see what evidence there is that all the code that you expected to execute did actually execute: remember that unanticipated data may have caused your program to follow an unexpected path, so that it may have bypassed, or may never have reached, the code sections which were supposed to have produced the output which you are surprised not to see. Analysis of evidence of this general kind will in favorable cases point the finger of suspicion at certain narrow program sections. However, in less favorable cases, the available evidence will be ambiguous. In this case, you will need to generate more extensive traces and dumps. This can be done in one of two ways:

By inserting additional print statements into your program, to make it print out something of a "motion picture" of what has happened.
By inserting various other checks, especially assert statements, which check assumptions on which your program depends, but which you suspect might be failing.

Section 6.6 will have more to say about technique (b), which is related to the general issue of formal program verification. The following more pragmatic hints will help you to apply this technique effectively. It is particularly important to place assert statements in sections of code known to involve delicate constructions, especially if (as in the case of the "off by one" bugs considered in the last section) the necessary checks are simple. Since the correct functioning of a program often hinges upon the assumption that key variables will change in a consistent way as iterative execution proceeds (for example, always increasing or always decreasing) it can be useful to save the last previous value of each significant variable var and to write checks which compare the last previous value of var with its current value. This can be done by introducing an auxiliary variable last_var for each var and writing an assignment

last_var := var;

whenever it is desirable to save the last value of var. Then checks like

assert var = last_var; or assert var /= last_var; or assert last_var = OM or var * last_var = {} and var /= last_var;

etc., will all prove useful.

Ultimately, however, the problem with a purely assertion-based debugging technique is that it is not easy to formulate the necessary checks comprehensively enough to make it unlikely that a bug (which probably relates to something that has been overlooked) can slip through.

Hence one must often fall back on method (a), which generates additional raw evidence for inspection. The problem in using this method is to avoid burying yourself in too voluminous a trace of the thousands or millions of events that take place as a program executes. To avoid this danger a carefully planned sequence of probes is necessary. A good idea is to resurvey your program, mentally list its main phases, and determine all the data objects which each phase passes to the next phase. If your program has been well designed, i.e., has been built up from modules interacting with each other in well-structured fashion, there should not be too many of these objects, and then it is reasonable to print them out for inspection. Before inspecting this information, review the logic of your program, and make sure you know just what features you expect to find in values of the variables that you have printed. Try to be aware of every feature on which any part of your program depends. Then check the actual data. If the data printed at the end of a phase looks correct in every detail, then this phase is probably correct. If something strange-looking appears in the data produced by a given phase, although the data supplied to this phase look correct, then there is probably something wrong with the code of this phase.

When this stage of debugging is reached, you will at least have determined which of the several phases of your code contains the error for which you are hunting. At this point, it is a good idea to think over all the evidence that you have examined, and see if any compelling picture of the problem seems to suggest itself. Sometimes the fact that the offending phase has now been located removes enough confusion for the difficulty to be guessed quickly. If not, you will have to carry your tracing to a more detailed level. This is a matter of inserting print and assert statements more densely into the offending phase, in order to locate the particular subphase that contains the error. As before, this is the subphase to which good data are being supplied, but which is seen to pass bad data along to its successor.

Once the bug location has been pinned down to a program section roughly a dozen lines long, review the logic of these lines. Read them very closely, looking for some misunderstanding which could have produced the anomalous data which you know that this section has generated. Try again to correlate data features with the operations responsible for producing these features. If this doesn't work, take the data supplied to the erroneous subphase, and try to trace the way that the subphase will act on this data, by hand, step by step, until you spot some error.
In most cases, these steps will find the bug without too irritating an expenditure of effort. However, in the stubbornest, fortunately rare, cases, the problem for which you are hunting may still elude clear identification. In these particularly resistant cases one of three causes may be at fault.
1. If the algorithm which you are using is complex, you may have misunderstood its logic. It may be that no single line of your code is wrong: rather, its overall pattern may be subtly wrong, causing it to produce the output you see, rather than the results you wrongly expected it to generate. Global logic errors of this sort are often quite confusing. If you come to suspect that a problem of this sort has occurred, you should reason once more through the structure of your program, trying to convince yourself by careful analysis that it is logically sound. Section 6.6 describes the formal rules that underlie reasoning of this sort.
2. There may be nothing wrong: you may simply have misunderstood what output your program was supposed to produce. Or you may have been looking at the wrong phase of a program which really does contain a bug, because you thought that the output of this phase showed some error, while in reality the bug was elsewhere. Or you may not have been running the program you thought you were running, or the version of the program you thought you were running, or your program may have been reading input data different from what you assumed. In such case, take a few minutes to cool off, review the whole situation, including the logic of your program, once more, and start over.
3. Your problem may be caused by a true "system bug," that is, an error, not in your program, but in one of the layers of the software including the SETL compiler, run-time library, or operating system under which you are running. Concerning bugs of this kind we can say the following:
  (iii.a) Don't be too quick to suspect them. Though such problems do crop up from time to time, they are much rarer than errors in your newly written programs. Remember that dozens of people are using the same software systems that you are, and that if the problem afflicting you is a system-level problem, it would affect all of these people. Before you become willing to blame your problem on anything other than a fault in your program, you should always have examined your program with great care, located a section just a few lines long which you can be sure is receiving correct input (because you have printed and inspected its input) and producing bad output (again, you must have printed and inspected this output). Finally, meticulous examination of these few lines, with review of the definition of all the operations these lines involve, of the parenthesization of those lines, and of any applicable rules of operator precedence must give you "courtroom" evidence that the system is not performing according to its specifications. At this point you are almost (but still not quite) in position to report a system problem to the expert in charge of maintaining your copy of SETL system (or of the operating system within which the SETL system runs). Before doing so, however, you should try to simplify the evidence still further, isolating the malfunctioning lines into a malfunctioning program just a few lines long, and then paring this program down still further if possible, ideally to the point at which it contains just three lines: an assignment initialising a very few variables, a single line which obviously does not function as it should, and a print statement which confirms the fact that this line has failed to act in the manner demanded by the rules of SETL. If the system problem which you think lies at the root of your troubles disappears somewhere during this sequence of steps, the cause of your difficulties may not be a system problem at all, but an error or misunderstanding on your part, which your attempts to locate the suspected systems problem may have clarified. In this case, chastened, you should return to your original program, fix the error in it, and continue your debugging. If, however, you do succeed in creating a very short program which gives unmistakable evidence of system malfunction, you should transmit a complete, clean copy of this program to a systems expert. This should be accompanied by a clear explanation of the problem you have pinned down. He will then take steps to fix the SETL system, or to have it fixed.
  Note that problems in the SETL system, like problems in your own programs, are most likely to concern marginal, rarely exercised cases. Though the system has been in use for a few years and has been tested fairly extensively, exhaustive testing of so complex a system is simply not possible. (See Section 6.4 for a discussion of some of the issues involved in attempts to test programs comprehensively.)
  There are a few cases in which it is reasonable to jump a little more rapidly to the conclusion that a system bug is affecting you. One is the case in which two runs of absolutely identical programs and data yield different results. Another is the case in which insertion into your program of a statement which is harmless by definition changes the behavior of the program significantly. For example, if insertion of a print changes your program's flow of control, something is obviously amiss at the system level. This may be evidence that can be reported to an expert immediately (but see the caution extended in (f)).
4. It should be clear from what has been said that one of the very first things that you will want to trace when you start to analyze a malfunctioning program is the input data it is reading. Always echo this data by printing it out immediately after it is read. Your input data may not be what you think it is, or you may be reading it incorrectly.
5. Especially if a difficult bug is being pursued, debugging as an activity tends to create an atmosphere of confusion, which grows like a thundercloud as the mind struggles to free itself from the misapprehension which first allowed the bug to slip in. Particularly difficult bugs sometimes make one feel that one is going insane, since the laws of logic seem to be breaking down. To combat this perilous confusion, you must maintain a very deliberate, step-at-a-time, and above all skeptical, attitude while you are debugging. Verify the situation at every turn; look at what really is in your source text rather than trying to remember what was there; print out a record of what your program really is doing rather than guess what is going on. Inexperienced student programmers often come to advisors with old versions of programs they are trying to debug, claiming that "I ran this program on Tuesday, and I made two or three changes that I am sure are harmless, and now it doesn't work." A more experienced programmer, who knows that the only valid evidence to work from is a current, single, untorn listing showing program and output unmistakably together, will only laugh at this.
  To reduce the level of your own confusion, it is sometimes helpful to work over your problem with a friend, trying to explain what is going on, and reviewing salient parts of the logic of your program with him, till he begins to understand it. A more expert consultant will often be able to spot the trouble that you have missed, but even if your consultant is less expert than you yourself, you will often find that the very act of explaining the problem lets you spot what is wrong.
6. Even when a program has once begun to function (and often even when it has been used successfully and intensively over a considerable period), it may still contain bugs, which can lurk within sections of code which are rarely, perhaps almost never, exercised. For this reason, code inserted for debugging should generally not be removed once the bug is found. Don't throw away your crutches: it may become necessary to debug the same program again! Instead of removing debug code, you can comment it out by inserting a '--' sign at the start of each line inserted for debugging. (Only inserted lines that never generated any evidence useful for debugging should be wholly removed.) Another technique, particularly useful during extended development and debugging of large programs, is to make the most valuable debug prints and checks conditional, by including them in if statements containing conditions which are normally false but can be turned on by supplying program initialization parameters. (See Chapter 9 for a discussion of program parameters.) If this is done, it becomes possible to examine the inner working of a malfunctioning program quickly, without having to recompile it all.

6.3 A Checklist of Common Bugs

Certain bugs occur frequently, and the experienced programmer learns to recognize their characteristic symptoms. Here is a checklist of commonly occurring bugs, with some indication of the symptoms they are likely to produce. We only list bugs that would pass through compilation undetected.

Bug Likely SETL symptom

Variable not given any initial value "Illegal data-type" error

Incorrect termination of a loop (e.g., count off by 1) Missing items in data collections, sums or sets too small if loop terminates too soon. "Illegal data-type" error if loop terminates too late

Incorrect limits in string and tuple slices (e.g., count off by 1) (Similar to incorrect loop termination)

Incorrectly structured while loop conditions or bodies, or incorrect initial conditions in while loops Program does not terminate

Incorrect treatment of initial cases in recursions, or bad procedure calls Program does not terminate, possible memory overflow

Omission of exit or continue statement Program "runs on" into code not intended for execution.

Omitted or improperly inserted data-value update Failure of relationships expected subsequently; unexpected values or jumps. Effects can be subtle.

Shared global variable unexpectedly modified by invoked procedure or function Efforts can be very subtle, and particularly hard to find. Beware of global variables!

Misspelled variables, e.g., AO for A0, Bl for B1, c1 for ci "Illegal data-type" error, possibly no output

Reading unexpected data Can cause wide variety of effects

Some unanticipated characters encountered by the string-scan operations Program does not terminate

Failure to set a program switch Effects can be subtle

Parameters out of order in procedure call "Illegal data-type" error (generally easy to find)

Variable inadvertently modified by assignment to a variable intended to be different but having the same name If no data-type error is caused, effects can be subtle

Variables out of order in read statement "Illegal data-type" error (generally easy to find)

Read operations of program inconsistent with data actually present in input file "Illegal data-type" error (generally easy to find)

Target of an assignment statement misspelled Effects can be very subtle

Not resetting a counter Effects can be very subtle (see "Incorrect loop termination")

Complex, incorrect combination of Boolean conditions Effects can be very subtle

Too few or too many parentheses in expressions; misunderstanding of precedence rules Effects can be very subtle

6.4 Program Testing

Debugging is the process of searching for the exact location of a program error when you know that some error is definitely there. Testing is the systematic exercise of a program which you believe might be correct, in an effort to see whether bugs are really absent. If testing shows a bug, debugging starts again. If your tests are not systematically designed, then bugs may go undetected and remain in the program. All one knows about a poorly tested program is that it works in the few cases for which it has been tried; it may fail in many others.

Test design is as important a part of program development as the choice of algorithms and data structures. Development of a test plan should begin while a program is being written. A procedure which is hard to test is apt to be bug-prone, and should be simplified if possible. By keeping testability in mind, you will avoid unnecessarily complex constructions, and produce cleaner, sounder code.

Testing falls into three distinguishable phases, which we will call first-stage testing, second-stage or quality assurance testing, and maintenance or regression testing. First-stage testing begins as soon as a program is complete enough for execution to be possible. Its guiding hypothesis is that bugs are present in sufficient numbers to prevent much of the program from working at all. During first-stage testing, one aims to make the main facilities of the program being debugged operable by finding and removing bugs quickly. Quality assurance testing begins when first stage testing ends. It assumes that a few obscure bugs remain in the program to be tested and aims to test systematically enough to smoke them out. Maintenance testing aims to ensure that new bugs are not introduced into old programs during their extension and repair.

6.4.1 First-stage testing

First-stage testing should work through a program "bottom up," first testing the bottom-level procedures which implement the basic operations used by the rest of the program. Once the code realizing these operations has been checked and found to be operable, the testing process will focus on intermediate-level procedures, and once these have been checked one will begin testing the program's main capabilities. Attempts to short-circuit this systematic, level-by-level test procedure by jumping directly to tests of higher program levels are more apt to waste time than to speed things up, since the lower-level causes of higher-level failures will then be hard to understand. For systematic testing, test input will need to be prepared for each procedure to be tested; this should be designed to make the output produced easy to inspect. If any of the procedures being tested make use of difficult or obscure data structures, it may be necessary to develop auxiliary output procedures which print these data structures in formats that clarify their logical meaning. When such procedures become necessary, they should be written and tested immediately.

Perhaps because realism might lead to suicide, programmers are generally overoptimistic concerning the likelihood that a program that they have just written will work right away. Careful preparation of a first-stage test plan serves to counteract this common illusion; the more realistic attitude thereby engendered encourages more careful initial program inspection, and this often reduces the number of bugs present when first-stage testing begins. This is why programs developed cautiously often become operational quickly, whereas programs developed in too optimistic a frame of mind often begin to work only after frustrating and totally unexpected delays.

An effective way of organising tests is to group them into a single procedure called test_prog, whose one parameter s is a string consisting of test names separated by asterisks. The test_prog can then have the following structure:

procedure test_prog(s) -- skeleton of test procedure

while s /= "" loop
	span(s,"*"); tn := break(s,"*"); 
	if tn = "" then exit; end if;
	print("Beginning Test", tn);
	case tn of
		(put sequence of named tests here)
		else
			print("Unknown test name");
		end case;
	end loop;
end test prog;

To trigger a sequence of tests named test_4, test_2, etc., one has only to write something like test_prog("test_1*test_2*..."). Later, when first-stage testing is complete, this call, and test_prog, can be left in the program P that has been tested, but the argument of the test_prog call can be changed so that it reads test_prog(getspp("TEST=/")) (see Chapter 9 for an account of the getspp library procedure.) If this is done, no tests will be executed unless P is invoked with a command line parameter of the form TESTS = testl*test2...*testn, in which case the named tests will be performed. This approach makes it easy to retest a program in which unexpected trouble has developed. Of course, the test facilities available should be carefully documented at the start of the test_prog procedure.

Especially when a long program P is being developed, it may be desirable to begin testing before all parts of P have been coded (or even designed) in detail. Of course, such an approach will be reasonable only if P has a sound, highly modular overall design, and only if the missing sections of P have been designed in enough detail so that you can be sure that no inconsistency will develop when they are designed and coded in detail. This mode of organizing development and testing is sometimes called top-down testing. It has the advantage of allowing testing and development to proceed in parallel. A related advantage is that it can provide particularly early confirmation of overall design soundness, or, if a design proves to be unsound (say, in terms of human factors, i.e., usability), it can give early warning of trouble.

If a top-down approach to development and testing is taken it will be found useful to provide a standard, multiparameter library routine having the name MISSING_SECTION. Then parts of your program that have not yet been coded can be replaced by invocations

MISSING_SECTION(name_of missing_section);

where the string-valued parameter -name_of_missing_section- should assign the missing section a name that can be printed. The MISSING-SECTION procedure should also allow optional additional parameters, so that it can be invoked by

MISSING_SECTION("name_of missing", "p1 p2 ... pk", [p1, ..., pk]);

where p1, p2,..name various parameters with which the missing section would have to deal or which might explain why it was (perhaps unexpectedly) invoked.

6.4.2 Quality assurance testing

Second-stage (or quality assurance) testing should aim to exercise a program comprehensively, in at least the following senses:

It is obvious that parts of your program that have never been executed during debugging may well contain unrecognized errors. The battery of tests you develop should therefore force every line of your program to be executed at least once.

If your program branches on a Boolean condition, then you will want to supply at least one test case in which the condition evaluates to true, and another in which the condition evaluates to FALSE.
Improper treatment of extreme values is a common cause of program failure. A program may work for nonnull sets, tuples, or strings, but not for the corresponding null cases; for positive integers n but not if n = 0; for integers less than the length of some string, but not for integers equal to this length, etc. It may work when a while or a for loop which it contains is entered, but fail if the loop is bypassed entirely.
In preparing a comprehensive collection of tests, you will therefore need to survey your program systematically, listing marginal situations of this kind as exhaustively as you can; then you should prepare at least one test that will force each logically possible situation to occur.
Once a list of all procedures, loops, branches, code sections, and marginal cases to be tested has been collected and a comprehensive set of tests has been developed, it may be worth preparing a formal test coverage matrix which shows which tests exercise each program feature. A chart of this kind makes it easier to spot cases that have never been tested. It can also help to select tests to be run during regression testing (see the following discussion) and can help to pinpoint program sections to be examined when a test fails. Such a chart will also make it easier to avoid running too many tests, all of which exercise the same limited group of program features but never use others. Note that, if regarded as a kind of test, "production" use of a program is subject to this objection; i.e., daily use of a program often exercises only a limited subset of its features. This is why programs that have been in heavy use for extended periods sometimes fail when their usage pattern changes significantly.
Compilers sometimes include features which make it easier to determine the coverage provided by a family of tests. For example, it may be possible to generate a listing of all program statements executed during a sequence of runs, of all branches taken, of all procedures invoked, etc. The SETL profiling facilities is useful for this. You will want to familiarize yourself with this facility, which can help assure that the test sets you develop for your programs are adequate.
Once developed, test sets become an important adjunct to the programs that they test. Such test sets should therefore be organized in a manner which facilitates their long-term maintenance and reuse. The tests that are available, and the coverage they provide, should be adequately documented.
Programmers often find it hard to bring sufficient enthusiasm to the task of systematically rooting obscure bugs out of code that they themselves have written. In part, this is a matter of optimism; in part, a result of the mental fatigue which tends to set in at the end of a lengthy code development; in part, a consequence of the difficulty of overcoming the very mind-set which introduced an error in the first place. For all these reasons, it is good practice to make testing of large programs the responsibility of a quality assurance group independent of the development group that produced these programs. If this is done, then, knowing that an independent group of programmers will probe their work systematically to find shortcomings, the original development group will be encouraged to simplify their product so as to improve its reliability.

Even where resources do not permit fully independent organization of the activity of program testing, it is well to ensure that every line of a complex program is read and understood by at least two programmers, each of whom will be able to spot problems and suggest tests that the other might have overlooked.

6.4.3 Regression testing

Regression testing is testing routinely applied whenever a previously working program is amended, to ensure that newly introduced code has not caused new errors. Tests which will be used in this way should be written so as to be self-checking, i.e., to produce little or no output if they have run correctly, but to produce copious output pinpointing a problem as closely as possible when an error is detected. This can be done by organizing the tests so that they perform various calculations, always in at least two different but logically equivalent ways. If these paired computations produce the same result, then either no output, or a simple message "TEST xxx PASSED", should be printed, but if a discrepancy is detected output which shows the discrepancy and displays the values of all variables related to the discrepancy should be printed.

If a chart has been prepared showing the program features exercised by each test (see iv above) it can be used when a test fails to suggest what part of the program should be examined first to find the cause of failure. If some one of these program sections has just been changed, it will of course come under immediate suspicion.

6.5 Analysis of Program Efficiency

Up to this point, we have generally ignored one of fundamental issues of programming practice: how much machine resource, i.e., execution time and memory, are actually required to execute a given program. This omission was deliberate, and reflects a basic maxim of programming: Make it right before you make it fast. It is now time to repair this omission, and examine two related topics:

The estimation of program efficiency.
The study of means to improve program efficiency by refinement and transformation of a correct but inefficient working program.

Two kinds of machine resource are relevant to the study of program efficiency: execution time and data storage required. In what follows, we will concentrate our attention on the first of these, because in many cases it is time which is critical. Data storage requirements were a pressing concern in the early days of computing, because memory was an extremely expensive hardware component and computers were equipped with only a few thousand words of it. As a result, early programming languages such as FORTRAN included various complex mechanisms for managing the limited memory resources available, which forced the programmer to concern himself with low-level details which were extraneous to the actual application to be programmed. Memory has in the meantime become a much cheaper commodity, and concern with its optimal utilization can in many cases (but by no means all) be left to the machine itself (in the form of some facility for dynamic storage allocation). This is what we will do in most of what follows.

6.5.1 Estimation of program efficiency

It is very easy to use SETL (or any other programming language) to write programs which would take years, or even hundreds or thousands of years, to finish executing. Consider, for example, the code fragment

	for i in [1..1000] loop
		for j in [1..2000] loop
			for k in [1..3000] loop
				for l in [1..4000] loop 				 (1)
					sum +:= (2 * i**3 + j**3 + k**2);
				end loop;
			end loop;
		end loop;
	end loop;

In this code, for each successive value i, the variable j iterates over 2000 different values; for each value of i and j, the variable k iterates over 3000 values; and for each value of i, j, and k, the variable l iterates over 4000 values. Thus, all in all, the innermost statement of the code fragment (1) will be executed 1000 x 2000 x 3000 x 4000 times, i.e., 240 billion times. This statement involves 6 multiplications and 4 additions, so that at least 2.4 trillion elementary arithmetic operations are required to execute the code (1). Even on a computer capable of executing 400 million arithmetic operations per second (a fairly typical performance figure nowadays) and even if the code (1) were written in a programming language capable of exploiting this raw arithmetic capability to the utmost, 6,000 seconds would be needed to execute the code (1). Since an hour is about 4000 seconds, this is about 1.5 hours. However, since SETL (which pays a price in efficiency for its very high level) is about 30 times less efficient than this, execution of the SETL code (1) would require about 2 days of computer time. This makes it quite clear that in writing SETL programs one needs some way of estimating the computational resource which will be consumed to execute the code that one sets down.

6.5.2 The execution time of elementary instructions

Let us begin by describing a basic unit of expense. There is some elementary time interval, which we will call D, which is the time it takes to execute the simplest SETL instructions. All instructions consume an amount of time which is some multiple of D. The actual value of D depends on the computer and on the specific SETL implementation being used and can be anywhere between a millionth to a ten-thousandth of a second. In what follows, estimates of execution times will always be given in units of D. We will refer to D as the basic instruction cycle, or cycle for short.

Instructions whose operands are simple data types: integers, floating-point numbers, Booleans, all take roughly one cycle to execute. This includes

Arithmetic operations on integers and floating-point numbers
Boolean operations.
Comparison operations ( =, /=, >, <, etc.)

At the level of the actual machine, there may be secondary differences among some of these; for example, a floating-point division is typically slower than an integer addition. In this discussion we can safely ignore these differences, and we shall do so in what follows.

The simple assignment statement

v := expr;

where v is a simple variable name, also takes roughly one cycle to execute (of course, only after the value of expr has been obtained).

6.5.3 Operations on sets

Operations on composite objects (sets, tuples, and maps) fall into two categories: instructions that take a fixed number of cycles, regardless of the size of the object to which they apply, and instructions whose execution time is a function of the size of their arguments. Let us examine each of these categories of set operations.

6.5.3.1 Elementary operations on sets

The following operations on sets can be performed in a constant number of cycles:

Cardinality: #S takes one cycle to evaluate, regardless of the actual number of elements of S

Membership: (x in S) where x is a simple item (numeric or Boolean) takes typically three to six cycles to evaluate, and increases only slowly with the size of the set S

There is nothing obvious about these remarks! You might wonder how the membership test can actually be performed without examining all the elements of S. The answer lies in clever choice of the internal data structures used by the SETL run-time system itself These structures are discussed in Chapter 10 and we will not say more about them now, except to note that the possibility of using sets as freely as we do in SETL depends in part in the constant execution time required by the membership operation. Note that this constant time behavior applies when x is a simple (nonstructured) value. In general, the membership test x in S takes as long as (or a little longer than) an equality test x = y. The relevance of this remark will become apparent in the following section.

Insertion and deletion: For simple objects the operations x with S and S less x, take a nearly constant number of cycles to execute. Exceptions relate to questions of storage allocation, and will be discussed later. The actual number of cycles needed for insertion of a simple value into a set of moderate size is about 10-15 cycles.

Map retrieval: To evaluate or to modify f(x), where f is a map and x is a simple value, takes nearly constant time, comparable to the time required for set membership operation. Once again, efficient implementation of this fundamental operation is secured by the use of a sophisticated data structure about which we will say more in Chapter 10.

6.5.3.2 Global operations on sets

Operations that construct composite objects can be expected to take an amount of time proportional to the size of the object they build. This means that set union, intersection, and difference take longer when their arguments are larger. To illustrate this point let us examine the set union operation S1 + S2. This is evaluated in several steps:

Initialize by making a copy of S1. This can be expected to take a number a steps proportional to #S1.
Iterate over S2, and check each element of S2 for membership in S1. Whenever an element is found which is not in S1, insert it into the union set.

This process is described by the following equivalent SETL fragment:

	S3 := S1;           	 -- Initialize result
	for x in S2 loop        -- Iterate over second set.
		if x notin S1 then
			S3 with:= x;                                 (2)
		end if;
	end loop;

After code like this has been executed internally, S3 will clearly be equal to (S1 + S2). To estimate the execution cost of the loop in (2), we can reason as follows: the loop body is executed (#S2) times, i.e., once for each element of S2. For each such element we perform a membership test and (possibly) a set insertion, both of which take constant time. Therefore the loop takes an amount of time proportional to #S2. The initialization of S3 amounts to making a copy of S1, which takes a time proportional to the size of S1. Putting all of this together, and neglecting small differences between the execution time of membership and insertion operations, we can conclude that the cost of a union operation is proportional to the sum of the cardinalities of the sets involved. This is not an exact statement, but it is sufficiently accurate to convey an idea of the order of magnitude of the execution time of this set operation.

We can apply a similar analysis to the set intersection operation (S1*S2). This is evaluated by the SETL run-time system in the following fashion:

Initialize the result to the empty set.
Determine which one of the two sets has the smaller cardinality (for definiteness, let us say it is S1). Then iterate over S1, testing each of its elements for membership in S2. Insert each element of S1 which is also a member of S2 into the intersection set being built up. This process corresponds to the following SETL fragment:

		S3 := {};
		for x in S1 loop
			if x in S2 then
				S3 with:= x; 				  (3)
			end if;
		end loop;

Using much the same reasoning, we can easily see that the execution time for this fragment is proportional to the cardinality of S1, i.e., to the size of the smaller of the two sets entering into the intersection operation.

Set difference, set inclusion, and subset testing operations are performed in similar fashion and can therefore be expected to take an amount of time proportional to the cardinality of one of their arguments.

Other SETL primitives involve similar (implicit) iterations over sets. The most obvious of these is the print statement: print(S) requires the examination and output of each element of S. Set equality and inequality are more complex examples. To test whether S1 and S2 are equal, the SETL run-time system performs the following sequence of actions:

Compare the cardinalities of S1 and S2. If these cardinalities differ, the two sets are distinct.
If the two sets have the same cardinality, then we must examine each element of one set, and check it for membership in the other set. If an element is found which does not belong in both, we know that the two sets are not equal. If the two sets are equal, each element will have to be examined, and the process of establishing equality will take a time proportional to the common size of S1 and S2. The internal equality testing process can be described by the following SETL procedure:

	procedure equal(S1,S2);
		
		if # S2 /= # S1 then return false;
		else
		
			for x in S1 loop
				if x notin S2 then return false; end 		if;
			end loop;
		
			return true; -- All elements are common.
		
		end if;
		
	end equal;

We mentioned in the previous section that the membership operation must often perform an equality test involving the object being tested for membership. This means that if S1 is a set of sets, and su is a set, then the operation

su in S1

will take a time proportional to the size of su, because it will involve testing su and one or more element(s) of S1 for equality. It should be clear that when sets of sets of sets of..are involved on both sides, the membership test operation will become more complex, and the size of the subobjects of S1 will have to be taken carefully into consideration.

6.5.4 Operations on tuples and strings

Indexing, i.e., the retrieval or modification of the i-th component of a tuple or string, is the basic operation on indexable objects like strings and tuples. This operation takes a small constant amount of time, typically two to three basic cycles.

On the other hand, membership testing, i.e., the operation (x in T) where T is a tuple or string, proceeds by examining all elements of T in succession, in the manner suggested by the following equivalent SETL procedure:

	procedure intuple(x,T);

		for y = T(i) loop
			if x = y then return true; end if;
		end loop;
	
		return false; 		-- x was not found in T.

	end intuple;

Given the way that tuples and strings are represented internally, there is no more efficient way to tell a priori where x is found in T, if indeed it is found in T at all. If x is not in T, this will only be ascertained after all elements of T have been examined, and if x is in T, it may be the first, second, ... n-th component. We can say that on the average, half of the elements of T will have to be examined before x is found. This means that a membership test applied to a tuple T can be expected to take a number of cycles proportional to #T. (Compare this with the more advantageous constant time behaviour of membership tests on sets).

If x is itself a structured object, and T is a tuple of such structures, then the equality test in the loop (5) can be expected to be proportional to the size of x. Therefore in this case the membership test (x in T) will take time proportional to (#x * #T).

Tuple and string concatenation are analogous to set union: they take a time proportional to the sum of the sizes of the operands, i.e., to the size of the result. Testing tuples and strings for equality is also linear in the size of the operands, as it requires that each component of each object (in the worst case) be examined.

The execution times of the basic SETL operations are summarized in Table 6.1. We can estimate the cost of any straight-line program simply by adding

Table 6.1. Execution Time of Various SETL Primitives

Operation Time Units

Arithmetic i + j, i * j 1

Assignment x := y 1

Indexing T(i) 1

Map retrieval M(x) 5

Set membership x in S 5

Tuple membership x in T #T

Set union S1 + S2 #S1 + #S2

Concatenation T1 + T2 #T1 + #T2

Quantifier exists x in S | C(x) #S * (1 + cost(C))

i, j designate integers, T designates a tuple, S a set, and M a map.

the costs of each instruction in it. However, other issues, discussed in the next section, arise if we go on to examine the more interesting problem of estimating the cost of programs with various loop structures.

6.5.5 The execution time of programs containing loops

Iteration over a set, as for example in

for x in s loop ... end loop;

or over a map, tuple, or string, as in

for x = t(i) loop ... end loop;

produces set elements (or map values, tuple components, or string characters) at a rate of one per cycle. Essentially the same remark applies to numerical iterators, like those in (1). Hence to estimate the time required to execute a loop, we have to multiply the number of times the loop will be executed by the (average) time that it will take to execute the body of the loop. An obvious generalization of this rule applies to imbedded loops: if one for loop is imbedded within another, then the time required to execute the outer loop is the number of times it will be executed, multiplied by the (average) number of times the imbedded loop will be executed, multiplied by the time required to execute the body of the embedded loop. For example, the double loop

for i in [1..1000], j in [1..i] loop ... end loop;

will execute in a time roughly equal to 1000 * 500 multiplied by the amount of time required to execute the loop body, since the embedded loop over j will execute an average of 500 times for each successive value of i. (This number 500 is halfway between the number one of times that j changes when i = 1 and the number (1000) of times that j changes when i = 1000.)

Since quantifiers and set formers are in effect prepackaged iterations, very similar rules apply to them. An existential quantifier like

		... exists x in s | C(x) ...    		(6)

is evaluated as if it were the following SETL fragment:

		maybe := false;
		
		for x in s loop
			if C(x) then maybe := true; exit; end if;
		end loop;

an internal variable that holds the Boolean result of the test. This SETL fragment will execute in time equal to the number of items examined, multiplied by the average time required to evaluate the Boolean condition C(x). If the quantifier (6) evaluates to false, then all the members of s will need to be examined, so the time required will be #s multiplied by the average time to evaluate C(x). If (6) evaluates to true, then iteration over s will terminate as soon as an x that actually satisfies C(x) is found; since iteration over a set is performed in a somewhat unpredictable order, the number of iterations needed to find such an x should be roughly # s/( # sat + 1) where sat is the set of all x satisfying the condition C(x).

Similar remarks apply to set formers and the tuple formers, except that:

Each insertion into a set takes somewhat longer than a simple iterative step, because we must check for and eliminate duplicate elements; and
The implicit iteration appearing in a set or tuple former like

{x in s | C(x)}

must always proceed until all the elements of s have been examined. As an application of these rules, note that execution of the harmless-looking code fragment

	f := {};

	for x in s loop
		f(x) := {y in s | (exists z in s | C(x, y, z))};         (7)
	end loop;

involves three nested loops: first the for loop which appears explicitly, next the implicit iteration over s in the setformer {y in s..}, and then finally the implicit iteration over s in the quantifier exists z in s...Therefore the number of cycles required to execute (7) is roughly as high as the cube of the number of elements of the set s.

6.5.6 The execution time of while loops

The possibility that a program can loop forever in an ill-constructed while loop should serve to alert us to the fact that analysis of the time required to execute a while loop can be much subtler than for loop analysis. Of course some while loops are easily analyzed. For example, if the variable k is not modified in its body, the loop

	k := 0;
	while (k +:= 1) < n and t(k) /= OM loop 
		statements..
	end loop;

where k is incremented each time through the loop, behaves in very much the same way as a for loop and therefore will terminate after no more than n - 1 iterations. On the other hand, consider the loop

	t := n * [0];

	while exists x = t(i) | x = 0 loop
		print(t);
		t(i) := 1;
		t(1..i - 1) := (i - 1) * [0];
	end loop;

This begins by generating a tuple t := [0,0,0,...0] of n zeroes and then repeatedly sets the first nonzero coordinate of t to 1 and all the coordinates preceding this coordinate to zero, thus carrying out a (left-to-right) form of binary counting. The sequence of tuples printed is

			[0,0,0,...0]
			[1,0,0,...0]
			[0,1,0,...0]
			[1,1,0,...0]
			[0,0,1,...0]
			[1,0,1,...0], etc.

and is plainly of length 2ⁿ, in that it generates all binary numbers from 0 to 2ⁿ - 1. Hence the number of cycles required to execute the while loop (8) is at least 2ⁿ, which means that if n = 50 the loop will execute for roughly 320 years, even if 100,000 iterations are performed per second!

For a more realistic example of the way in which while loops are typically used, consider a particularly simple sorting method, dubbed flip-sort.

	while exists i in [1..#t1] | t(i) > t(i + 1) loop
		[t(i),t(i + 1)] := [t(i + 1),t(i)];              (9)
	end loop;

This searches a tuple t for out-of-order components and interchanges a pair of such components whenever one is found. Plainly the number of cycles required to execute (9) is the average time required to search the tuple t for an out-of-order pair of adjacent components, multiplied by the number of interchanges required to put t in sorted order. Even though precise analysis of these times requires subtle analysis going far beyond the scope of this book, it is not hard to estimate these time requirements crudely. We can guess that, as long as an out-of-order pair exists, one such pair will be found after searching through some fraction of the length of tuple t being sorted; thus evaluation of the existential quantifier appearing in the first line of (9) is estimated to require cn cycles, where c is some constant which we will not attempt to evaluate here and n is #t. Moreover, since each execution of the body of the while loop (9) corrects exactly one case in which a pair of elements appears in inverted order, the expected number of times that (9) must iterate to put t into its final sorted order should be roughly equal to the number of pairs of components of t which occur in inverted order. In a random arrangement of the components of t, roughly half the components to the left of a given t(i) should be larger than t(i), and roughly half the components to the right of t(i) should be smaller than t(i). Thus each component t(i) of t should appear in roughly #t/2 inverted pairs, and it follows, since this way of looking at things counts inverted pairs twice, once for each of the components in such a pair that the expected number of inverted pairs in a randomly chosen arrangement of the components of t should be roughly (1/4)n²). Multiplying this expression by cn, representing the estimated time required to evaluate the existential quantifier in the first line of (9), we arrive at

		(1/4) cn³			 (10)

for the time required to execute this 'flip-sort' code on a tuple of length n.

The approximations which we have made in arriving at the estimate (10) are too crude for the constant (c/4) appearing in (10) to be particularly meaningful. (Ex. 9 following Section 6.5 outlines an experimental procedure for estimating this coefficient more accurately.) The significant feature of the estimate (10) is that it tells us that the time required to sort a tuple by the flip-sorting method is proportional to the cube of the length of t, i.e., that - sorting a tuple of length 10 by this method should take roughly 120 cycles sorting a tuple of length 100 roughly 120,000 cycles, and sorting a tuple of length 1000 roughly 120,000,000 cycles. These figures, which are not very favorable, reflect the rapidity with which the cube of n grows as n increases. In the jargon of algorithm analysis, one says that flip-sort is an 'n cube' algorithm - or an 0(n³) algorithm. Clearly, any sorting algorithm whose time requirement grows less rapidly than the cube of the length of t will be very much superior to flip sort as a technique for sorting large tuples t.

Let us therefore modify our naive flip-sort algorithm in order to obtain a more efficient sorting method. We begin by noticing that the existential quantifier in (9) forces us to scan the tuple from the beginning, each time we find an out-of-order pair. This is wasteful because an out-of-order element may be out of place with respect to several of its successors, not just one. A better approach is to try to push the winner of each such exchange as far as it will go, and go back to the beginning of the tuple only after we have made a full pass over the tuple. This leads to the following:

for j in [1.. #t - 1] loop
	for i in [1 .. #t - 1] loop

		if t(i) > t(i + 1) then
			[t(i),t(i + 1)] := [t(i + 1),t(i)];		  (11)
		end if;  				

	end loop;
end loop;

Note that this code will push the largest element in the tuple to the last position the first time the inner loop is executed, the second largest element will be pushed to the second position from the end on the next pass through the inner loop, and so on. Each execution of the inner loop adds one element to the fully sorted portion of t which is built up at its end. The outer loop is executed (#t1) times to ensure that every element of t finds its proper place.

A further improvement is possible. Elements that have reached their correct position need not be examined in successive passes because we know that they are in their proper places. We can therefore modify (11) to read as follows:

     
for j in [1..#t - 1] loop
	for i in [1..#t - j] loop

		if t(i) > t(i + 1) then
			[t(i),t(i + 1)] := [t(i + 1),t(i)]; 		 (12)
    	end if;

	end loop;
end loop;

The fragment (12) is the bubble-sort method that we examined in Section 5.1.1.

Next let us examine the execution time of (12). Again, let n be #t. The outer loop is executed (n - 1) times. The inner loop is executed (n - 1), (n - 2), ... 3,2,1 times, that is to say, (n/2) times on the average. The body of the inner loop consists of various comparison, indexing and assignment operations, all of which take some constant time c1. Therefore the total execution time of (12) is roughly c1 * (n - 1) * (n/2) cycles. We therefore can say that the code (12) executes in time proportional to the square of the size of t. This is a considerable improvement over our initial flip sort. Several further refinements can be made to (12), but these improvements only affect the constant c1 and do not modify the n² behavior that we have just obtained.

The order-of-magnitude analysis that we have just performed is the most frequent and useful kind of algorithm performance analysis. Often it is enough to know what the order-of-magnitude behavior of a program is to estimate whether it is acceptable for the small, medium, large, or very large amount of data that is to be processed.

The transformation that took us from (9) to (12) deserves further discussion. What we did was to remove an existential expression

... exists x in s | C(x)

which describes some exhaustive search over s, and replace it with a more intelligent search over that set. This shows a fairly typical use of existential quantifiers: they allow us to write a crude specification for some kind of search process, in order to obtain a working algorithm quickly. When the time comes to improve the efficiency of that algorithm, we can often replace the existential quantifier with a more precise (and therefore more efficient) search, better adapted to the problem at hand. This stepwise approach to program efficiency is one of the most important principles of good programming practice. Proper use of SETL consists precisely in starting with as condensed and abstract a formulation as can be achieved by use of set formers, quantified expressions, power sets, etc. Once such an abstract program is running, one can then replace these abstract constructs, in those places where they are unacceptably inefficient, by more efficient lower-level constructs which make use of sophisticated data arrangements.

6.5.7 Efficiency analysis of recursive routines

The presence of procedure calls adds another dimension to efficiency analysis. To begin with, execution of the procedure call instruction itself has a certain cost. This cost can be divided into a fixed portion and a variable portion. The first part of the cost (on the order of five cycles) does what is needed to remember the point of call (so we can return to it) and start the called procedure. The variable portion of the cost is proportional to the number of actual parameters being passed to the called procedure and can be expected to add two to five cycles per parameter. This indicates that a procedure call involving just a few parameters is as time-consuming as 10 to 15 arithmetic operations.

However, in most cases the important expense is the time the called procedure actually takes to execute. If the called procedure consists only of loops and simple statements, we can estimate its execution time using the techniques sketched in the previous section. On the other hand if the called procedure calls other procedures in turn, or more interestingly calls itself recursively, then the analysis of its efficiency is considerably more subtle. To indicate the kind of reasoning required, and also introduce some useful notation, let us first examine the "quintessential recursive procedure," namely the factorial:

	procedure fact(N);
		return if N = 0 then 1 else N * fact(N - 1) end if;       (1)
	end fact;

Since the factorial of a positive number N is the product of successive integers up to N, an iterative calculation of fact(N) will require N multiplications, i.e., take time proportional to N. It is conventional to indicate this by saying that iterative calculation of fact (N) uses time O(N).

This is also the case for our recursive definition. To establish this, we reason as follows: if N is not zero, execution of fact (N) will require one comparison operation (the test N = O) one multiplication, and one recursive call to fact involving a smaller argument, namely N1. If we let T(N) designate the time it takes to execute fact when its argument is N, this leads to the following equation:

                  T(N) = 2 + T(N - 1)              (2)

where the constant 2 corresponds to the two elementary operations noted.

This equation is called a recurrence relation for T and defines the value T for each possible argument N, as a function of the value that T(N) takes as an smaller argument. Note that this recurrence parallels the recursive definition of fact, in which the value calculated is also defined in terms of another invocation of the same procedure. To solve the recurrence relation means to express T(N) as some explicit function of N. For the recurrence (2), this can be done easily, as follows. The relation (2) holds for arbitrary values of N (except 0), so that substituting N for N1 in (2) we find that

T(N - 1) = 2 + T(N - 2).

We can use this to replace T(N1) on the right-hand side of (2), which gives

T(N) = 4 + T(N - 2)

Repeating the same reasoning, we can express T(N - 2) in terms of T(N - 3), and so on. This process ends when we express T(1) as 2 + T(0). T(0), that is to say the time required to evaluate the factorial when its argument vanishes, is just the execution time of the test (N = O) because in that case the procedure does not recurse. Thus T(0) = 1. Putting all of this together, we obtain

				T(N) = 2 * N + 1					(3)

which indicates that our recursive factorial program also executes in a time proportional to its argument N. The calculation just described neglected the actual cost of the procedure call itself, and that of the return instruction. Taking these into account would modify (3) only by constant factors, i.e., would gives us

T(N) = aN + b

where the constants a and b are 10. Hence our conclusion that calculation of fact(N) requires time proportional to N remains correct.

Next let us examine a more complex example of a recursive procedure: the Towers of Hanoi procedure of Section 5.4.4. Recall that the Towers of Hanoi problem is that of moving N rings, stacked in decreasing order of size on a peg A, to a peg B, using some intermediate peg C as a ring holder, and respecting the ordering of the rings, so that a ring never lies over a smaller ring. As seen in Section 5.4.4, the solution is given by means of the following recursive procedure:

 
procedure Hanoi(N, fr, to, via);

	if N = 0 then return;
	else
		Hanoi(N - 1, fr, via, to);
		print("from peg ", fr, " to peg ", to);
		Hanoi(N - 1, via, to, fr);
	end if;

end Hanoi;

Using the same reasoning as before, we can readily establish the following facts:

When the argument N is zero, Hanoi does not recurse but executes a single test and returns.
When N is greater than zero, Hanoi executes a test, a print statement, and two recursive calls with argument (N - 1). In other words, the time consumed by the procedure is described by the recurrence relation
```
                    T(N) = a + 2 * T(N - 1)			(5)
```

and by the "boundary condition"

               	    	 T(O) = b				(6)

where a and b are small constants. Once again, we can use this recurrence relation for T(N) to express T(N1) in terms of T(N2), T(N2) in terms of T(N3), and so on, until T(N) is expressed only in terms of a and b. These successive replacements look as follows:

             T(N) = a + 2 * (a + 2 * T(N - 2))
                  = a * (1 + 2) + 4 * T(N - 3)
                  = a * (1 + 2) + 4 (a + 2 * T(N - 4))
                  = a * (1 + 2 + 4) + 8 * T(N - 4)

The pattern should be clear: each substitution adds another power of 2 to the coefficient of a, and another factor of 2 to the coefficient of T(Ni), so that this coefficient is always 2^i-1. Our calculation stops when i = N, at which point we have

T(N) = a(1 + 2 + ... + 2^N-2) + 2^N-1 * T(0)

a(2^N-1) + 2^N-1 * b = 2^N-1 * (a + b) - a

We conclude that procedure Hanoi takes an amount of time which is an exponential function of its input parameter N (the number of rings to be moved) This means that adding one more ring doubles its execution time. Clearly, a procedure with this behavior can only be useful for relatively small values of its argument. (The Buddhist legend from which this problem originates states that 64 such rings are being moved by dedicated monks since the beginning of time, and that the completion of their task will mark the end of the world. The preceding analysis indicates that we need not be too concerned yet about this prospect.)

Procedures are only useful in practice if they do not have the explosive time requirement that the Hanoi procedure illustrates: factorial is linear in its argument, as we saw previously, and the sorting procedures discussed so far are cubic or quadratic in the number of elements to be sorted. Most of the procedures that we use in practice consume time O(n), O(n²) or O(n³) at most.

Finally let us analyze the performance of one more simple recursive procedure, namely quicksort, which was presented in Section 5.4.1. This procedure, which can sort any homogeneous set s of integers, floating-point numbers, or strings, is simply

procedure quick_sort(s);

	if #s = 0 then return [ ]; end if;
	x := arb s;						(7)
	
	return quick_sort({y in s | y < x})
		+ [x] + quick_sort({y in s | y > x});

end quick_sort;

Let T(n) be the number of cycles that this procedure will typically require to sort a set of n elements. Building up the two sets which appear in the final return statement of (7) will require a number of steps proportional to the size of s, in that a full pass over s is required in each case. Thus the time required to execute (7) is equal to some small constant multiple c*n of the length n of s (c = 3 is a fair guess), plus the time required to execute the two recursive invocations of quicksort which appear in the second return statement of (7). Since typically the element x chosen from s by the arb function of (7) will lie roughly halfway between the largest and the smallest elements of s, each of the two sets {y in s | y < x} and {y in s | y > x} should contain approximately half the elements of s. Thus, given that T(n) is the time required to sort a collection of n elements by the quicksort method, sorting each of these sets by use of quicksort should require roughly T(n/2) cycles. It follows that T(n) satisfies the recursive relationship

				T(n) = 2 * T(n/2) + c * n              (8)

The first of the terms on the right of (8) represents the time typically required to execute the two recursive invocations of quicksort appearing in (7), and the term c * n represents all the work needed to prepare for the two invocations.

We solve (8) in the same fashion as in our two previous examples, by substituting the expression (8) for the occurrence of T on the right of (8), which gives

	T(n) = c * n + 2 * c * (n/2) + 4 * T(n/4) = 2 * c * n + 4 * T(n/4),		(8a)

and then substituting (8) for T(n) on the right of (8a) to get

	T(n) = 2 * c * n + 4 * c * (n/4) + 8 * T(n/8) = 3 * c * n + 8 * T(n/8).		(8b)

Continuing inductively in this way we will clearly get

			T(n) = 4 * c * n + 16 * T(n/16),
			T(n) = 5 * c * n + 32 * T(n/32),

and so forth, until eventually, when the power of 2 in the denominator on the right becomes roughly equal to n (which will happen after log n steps, where log n designates the logarithm of n to the base 2), we will find that T(n) is roughly

c * n * log n + n * T(1),

i.e., that T(n) can be estimated as the product of a small constant c (still roughly 3), times n, times the logarithm of n. One therefore says, in the jargon of algorithm analysis, that quicksort is an "n log n" or "O(n log n)" algorithm.

For n at all large and c roughly equal to 3, c * n * log n will be vastly smaller than the cube or even the square of n. For example, for n = 1000, n ** 3 is 1,000,000,000, n ** 2 is 1,000,000, whereas c*n* log n is only 30,000. Therefore quicksort can be used to sort large amounts of data, which could not be sorted in any reasonable amount of time using flip or even bubble sort. For example, if #t= 10,000, and on a computer capable of executing 100,000 of our nominal instruction cycles per second, sorting t using the flip-sort method will require approximately 16 hours, bubble sort will take about 20 minutes, whereas quicksort will accomplish the same operation in roughly 4 seconds.

This simple example shows why it is so important to find algorithms whose time requirements do not rise rapidly as the arguments passed to them grow larger. Very considerable efforts have been devoted to the search for such high-efficiency algorithms during the past decade, and a great many ingenious and important algorithms having efficient behaviour have been devised. Most of these algorithms lie beyond the scope of the present introductory work. For basic accounts of this important material, see the References at the end of this chapter.

6.5.8 The cost of copying, or Space is Time after all

Some SETL operations, like (s with x) where s is a set, and (t with x) where t is a tuple, also s less x, x from s, x fromb t, and x frome t, modify a composite object (i.e., a set s or tuple t) which may be large. The same remark applies to the tuple assignment t(i) := x, and to map assignments like f(i) := x and f{i} := x. The time required to execute these operations will vary dramatically depending on whether or not the large composite argument of the operation needs to be copied.

To understand this important point, note first of all that copying is sometimes logically necessary. Consider, for example, the code

	s := {1,2,3,4,5,6,7,8,9,10,15,20};
	s1 := s;
	s with := 25;
	s1 less := 2;
	print("s = ",s);
	print("s1 = ",s1);

The output that this will produce is

		 s = {1 2 3 4 5 6 7 8 9 10 15 20 25}
		s1 = {1 3 4 5 6 7 8 9 10 15 20}.

Since two different values will have been created by the time we reach the final step of (1), it is clear that somewhere along the way to this final step the single set constant assigned to the variable s will have to be copied. This copying can be done when the value of s is assigned to the second variable s1 in the second line of (1) (copying on assignment), or can be done in the third line of (1), when the value of s (but not that of s1) is modified by addition of the extra element 25 (copying on modification). Exactly where copying actually takes place will depend on the particular version of the SETL system that you are using, and especially on whether or not this compiler includes an "optimization" phase. But in any case, whenever two variables appear to be sharing the value of a composite object, and one of these variables is modified, the other one must be provided with a fresh copy of that composite object, because in SETL assigning to one variable never affects another. Copying a set or tuple with n components can require O(n) cycles. Hence execution of an apparently cheap operation like t(i) := x can require a number of cycles proportional to the length of t, if the value of t has been assigned to some other variable.

On the other hand, copying is frequently unnecessary, and both the optimizing version of the SETL compiler and the SETL run-time library include mechanisms for avoiding copying when it is not logically necessary. (Since these are implementation-level mechanisms, and fairly complex ones at that, we shall say nothing about how this is done.) When no copying is involved, the operation s with x is only two or so times slower than the membership test x in s, and similar remarks apply to the other operations in the group we have been considering. For example, the assignment t(i) := x can be done in just one of our nominal "cycles," and in the same circumstances map assignment is roughly five times as slow.

This is but one symptom of the fact that the storage management which the SETL system performs contributes to the total execution time of a program. This contribution will be small if the objects manipulated by the program do not change size very frequently, but can be very substantial if large objects are constantly created, shared, updated, and then discarded. Programming techniques that avoid storage management costs are known and studied in data-structure courses, for which a number of excellent texts exist. Such data structures can be constructed in SETL, and the next section gives one example of such, namely the linked list. In this text we have largely ignored the subject of data structures of this type, in accordance with the tenet of high-level programming which states that the burden of storage management is best left to the system. The gains in ease of programming thereby attained are well worth the execution costs incurred. Nevertheless, the fact that storage management can become expensive should be kept in mind. Note also that in some circumstances, such as in the programming and control of real-time devices (i.e., physical equipment such as a motor, a cardiac monitor, a radar, an air-conditioning system, etc.) where a computer system must respond with minimal delay, no hidden execution costs can be allowed, and the programmer must know, within a few milli- or microseconds, the time that a given code sequence will take to execute. Such systems cannot be programmed in SETL or in any other high-level language such as APL or SNOBOL For such systems, lower-level languages, in which all storage allocation is described in full detail by the programmer, must be used.

6.5.9 Data structures used for the efficient realization of some SETL primitives: the linked list

As seen in the preceding pages, execution of some of the most important SETL operations, for example, set union and tuple concatenation, may require time proportional to the size of the arguments to which these operations are applied. However, if the programs performing these unions and concatenations use the relevant sets and tuples only in restricted, particularly favorable ways, we can sometimes improve their efficiency very greatly by using alternate more complex representations for the objects appearing in these programs. Although doing this is something of a violation of the SETL "spirit," which emphasises ease of programming over efficiency of program, SETL can at least be used to explain these efficiency-oriented techniques.

In order to focus our discussion, we have to examine more details of the implementation of the basic SETL types. In this section we will describe the way tuples are actually represented in the SETL run-time system and examine the advantages and disadvantages of this representation. This will allow us to understand the potential advantages of an alternative representation which is efficient in cases where the standard representation for tuples is comparatively expensive. Note that the representations of sets and maps used by the run-time system are considerably more complex than the representation of tuples.

Internally, that is to say in the memory of the computer on which your program is running, a tuple can be represented by a contiguous sequence of cells, or words, that contain the components of the tuple. These components appear in order, so that the first component of the tuple appears in the first of these cells, the second component in the second, etc. For example, if f is the tuple [2,3,5,7,11], it could be stored in memory as follows:

When working with such a representation the SETL run-time system would always know where the first component of t is found. From this, it can easily calculate the location of the second, third, components of t, and so on: the location of any component t(i) is simply (i - 1) plus the location of the first component. cells after the location of t(1). This is why a tuple is the most efficient way storing data that is referred to by number, i.e., according to its ordering: when such a collection of data is represented as a tuple, all items in this collections are equally accessible. Tuples are built for efficient indexing.

Other operations on tuples are not as easy to perform. Consider

t(2..3) := [ ];

whose purpose is to shorten t by removing two of its components. After this operation is performed, t will be stored as follows:

Thus we now have t(2) = 7, t(3) = 11, etc. This requires that the components of t that appear after the deleted section be "moved down" two positions. For a long tuple, this can represent a substantial expense. Similarly, operations that enlarge a tuple, by insertion or concatenation, such as

			t with:= 13;
			t(1..3) := [1,2,3,4,5];
			t +:= [17,19,23,29];

may require that a certain number of elements of t be moved or copied.: there is no room in the portion of memory currently allocated for t for the expansion required by one of the preceding operations, then a larger area will have to be found by the storage manager, into which the whole of t will have to be copied and then modified. We conclude that operations that modify the size of a tuple repeatedly carry a potentially substantial penalty in storage allocation activity. In cases where data are ordered but are not accessed by number, i.e. when no indexing operations are performed, and where substantial changes in the size of a tuple will occur, other representations exist that lead to smaller storage allocation and data motion costs. The most familiar among such representations is the chain, or linked list.

In the abstract, a linked list is a structure in which data are organized according to some linear order, i.e., with a first element, a second one, and so on, and so that it is easy to go from one element to its immediate successor. A tuple is of course an example of a list, but one in which elements are stored contiguously. In general a linked list is not stored contiguously, and the mechanism that takes us from one element to the next must be described explicitly. In SETL we may choose to use a map (call it next) which takes us from any element to its successor in the ordering. Then rather than having to use contiguous locations, successive elements of the list can be anywhere. In this case we can think of the location that holds each element of the list a being marked by an arbitrary tag, for which SETL atoms will be used in what follows. The contents of each such location are also described by a mapping (call it val_of), so that if C is a cell in this list, val_of(C) is the element stored at C, and next_of(C) is the cell that contains the next element. In addition we need to have an explicit way of referring to the first element of the list: a variable comp1 must be used, whose value is the first cell. Then, all in all, val_of(comp1) is the first element of the list, val_of(next_of(comp1)) is the value of the second element of the list, etc. The last cell cn in the list is distinguished by having no successor, which we express by setting next_of(cn) = OM.

We are now ready to build and manipulate lists. In order to create a list with a single element x, we execute the following fragment:

	comp1 := newat();    -- Define cell.
	val_of(comp1) := x; -- place element in cell.
	next_of(comp1) := OM; -- First element is currently the last also.

In order to add a new element y after x on this list, we execute the following fragment:

	new := newat();        -- Define new cell.
	val_of(new) := y;    -- Place element in it.
	next_of(comp1) := new;  -- First element now has successor.
	next_of(new) := OM;     -- Second one does not.

If instead we want to add a new element in front of the list, we execute

	new := newat();		-- New cell.
	val_of(new) := z;	-- And its value.
	next_of(new) := comp1;	-- First element now becomes second.
	comp1 := new;		-- And most recent element is first.

Note that each invocation of newat() just represents a way of obtaining a unique marker for each element in the list. Instead of atoms we could have used integers, strings, or indeed any kind of SETL value. We choose to use atoms to emphasize that each cell in the list requires a "name" of some sort, but that this name serves as a place-marker for the value that goes with it and has no other significant relationship to that value.

In order to transform any tuple t into this list representation, we can make use of the following procedure:

    procedure into_list(t);
        if t = [ ] then return OM; end if;

        first := newat();
        cell := first;
        succ := newat();

        for i in [1..#t] loop   -- Place tuple component
                val_of(cell) := t(i);   -- Cell for next.
                next_of(cell) := succ;  -- In which next component
                cell := succ;   -- will be stored.
                succ := newat();
        end loop;
        
        next_of(cell) := OM;       -- Last element.
        return first;   -- First allocated cell.
    end  into_list;

This procedure (which assumes that next_of and val_of are globally available, map-valued quantities) returns the first cell of the list, so that by applying the mappings next_of and val_of repeatedly we can reach all the elements of the list. We have assumed that next and val_of are global variables, defined outside this procedure. This is indeed the way in which a program using lists is usually structured: a single next mapping is used to describe the chaining of all lists present in the program. The chained representation of a tuple that is obtained from procedure into_list can be pictured as follows (Figure 6.1):

Figure 6.1 A Tuple Represented by a Chained List of Elements

At first sight, representing a tuple in this way may not appear to be such a good idea. Of course, it is not hard to iterate over tuples having this representation: we simply start at comp1 and apply next repeatedly to step along, always applying val_of to get the component value corresponding to whatever index we have reached. For example, instead of writing

	if exists x = t(i) | C(x) then ... else...  	(1a)

as we would if t were a standard SETL tuple, we would write


	i := comp1;			-- Initialize search index
	while i /= OM loop 
		x := val_of(i);        -- Value of next element.
		if c(x) then exit; end if; -- If found, search is successful 		(1b)
		i := next_of(i);          -- else advance search index
	end loop;
	if i /= OM then ... else...

Although no less efficient than (la), the code (lb) is certainly more comp1ex and hard to read. Moreover, accessing a given component t(k) of t is much less efficient in the list representation, since for standard tuples the operation

x := t(k)

is performed in one or two cycles, whereas if t has the list representation we will instead have to count up to k as we step through the list, as the following fragment shows:

	i := comp1;
	for j in [1..k-1] loop i := next_of(i); end;
	x := val_of(i);

whose execution requires at least k cycles; that is to say, indexing operations are not efficient when applied to lists. On the other hand, certain operations that modify the size of tuples can be performed much more rapidly in the list representation than for standard SETL tuples. For example, for a standard tuple the operation which inserts x immediately after the i-th component of t requires time proportional to the length of t, since to create the expanded tuple all of the elements of t will have to be copied into new positions. On the other hand, if t has the list representation, and we know the name ni of the cell after which the insertion is to be performed, this operation can be done in just a few cycles, since all we have to do is

create a new atom ix to name the cell for the new component value x;
link ix at the appropriate position into the list representing t.

Similar remarks apply to the operation t(i..i) := [ ] which deletes a given component from a tuple in list representation. The following two procedures represent these operations. In writing these procedures, we suppose that a single pair of maps I>next and val_of will be used to represent all lists, and that the variables next and val_of have been declared global. We also suppose that only one logical reference to any of these maps is ever extant, so that no copying (see the preceding section) ever needs to be performed.

	var next,val_of;		-- Global Variables
	
	next := {};			-- Initialize to null set
	val_of := {};			-- Initialize to null set
	
    procedure insert(x, ni);                -- Inserts x immediately after the list
                                        -- component
        ix := newat();                  -- whose cell is ni. Create a new cell ix,
        next_of(ix) := next_of(ni);           -- and make it the predecessor of next_of(ni)
        next_of(ni) := ix;                 -- and the successor of ni
        val_of(ix) := x;                -- Place value in cell.
    end  insert;
        
        
    procedure delete(ni);
        
                -- delete the component immediately following that whose cell is ni
                next_of(ni) := next_of(next_of(ni)?ni); -- unless ni is the last index in its    list make
                                -- ni's successor the successor of ni's
                                -- original successor
    end  delete;

Provided tuples t₁ and t₂ are represented as lists, that t₁ will not be required after t₁ and t₂ are concatenated, and that the name i of the last cell of t₁ is easily available, the concatenation of t₁ and t₂ can also be formed in a number of cycles independent of the length of t₁ and t₂. The following procedure, in which we assume that each tuple t₁ n list form is represented by a pair [first, last] consisting of the first and the last cell of t, shows this:

	procedure concat(t1,t2);
	
		[t1_first,t1_last] := t1;		-- unpack the first and
							-- last cells of t1
		
		[t2_first,t2_last] := t2;		-- and those of t2
		
		if t1_first = OM then			-- t1 is an empty list
			return t2;
		elseif t2_first = OM then		-- t2 is an empty list
			return t1;
		else
			next_of(t1_last) := t2_first;	-- link the two lists
			return[t1_first,t2_last];
		end if;
	
	end  concat;

The mapping next_of, and the variables used to keep track of first and last elements of lists, together define what are called pointers in other programming languages.

Quite a few other very useful trick representations of tuples, sets, maps, etc., are known. These representations make use of more comp1ex arrangements of pointers, i.e., internal mappings between elements. If the family of operations applied to a SETL object is appropriately limited, use of an appropriate one of these special representations can be very advantageous. Since further exploration of this very important issue would take us beyond the scope of the present work, we refer the reader to the References at the end of this chapter for additional material concerning the issue of data structuring.

EXERCISES

1. Write a while loop that takes 2**(2**n) cycles to execute, for some n. Write a while loop that take 2**(2**(2**n)) cycles to execute. For how many centuries will these loops run, if n = 100?

2. If the monks in charge of the Towers of Hanoi began moving the 64 rings in the year 1 million B.C. and if they more one ring per second, when will the world end?

3. By adding a mapping prev that takes a given list element to its predecessor in the list, it is possible to iterate over a list forward and backward. Write a procedure to transform a standard tuple into this "doubly-linked" list representation.

4. Write procedures to insert and delete elements from a doubly-linked list.

5. Estimate the cost of the list concatenation procedure of Section 6.5.9. Assuming that a map operation such as next_of(last) := first; takes five cycles, and that copying a tuple of size n takes 2*n cycles, determine the size of t1, t2 for which the linked representation is preferable to the standard contiguous representation for performing tuple concatenation.

6.5.10 Some techniques for program transformation

The examination of various sorting examples in the previous sections indicates the very general fact that several algorithms, of quite different efficiencies, can usually be found to perform a given task. Several questions then arise:

How can we find the most efficient method or algorithm to perform a computation?
How can we know that it is the most efficient one available?
Given one algorithm which is not the most efficient for the problem at hand, can we modify it in some systematic fashion in order to improve its efficiency, perhaps by transforming it into one of the more efficient algorithms of its family?

The answer to (a) is found in the efficiency analysis which we sketched in Section 6.5. Given two algorithms, we can compare the number of operations they are expected to take in various cases and thus determine which one is preferable. Question (b), namely how do we know that a given algorithm is indeed the best possible for a given task, is much harder, and is the subject of so-called lower-bound analysis. For example, in the case of sorting lower-bound analysis gives the following result: "When we sort a set of size n by means of comparisons between its elements, 0(n log n) operations are required". This rigorous mathematical result (which we shall not prove here) shows that quicksort is probably close to the best way to sort, although bubble-sort is definitely not. Unfortunately such lower bounds are known only for a few interesting problems, and in general we have no systematic way of knowing how close to optimal a newly invented algorithm is.

Question (c), namely how can we improve on a given algorithm in order to improve it, is the source of much current research and also of large amounts of programming lore. Some basic techniques for transforming a program into a more efficient one take the form of very general prescriptions, such as

Do not evaluate the same expression repeatedly.
In particular, if a calculation is performed repeatedly in a loop and yields the same result every time, then perform the calculation once outside the loop, save the result, and use it within the loop.
Try to reuse storage occupied by a variable if the value of that variable is not needed any further.
When possible, replace recursive calls by iterative loops.

These prescriptions are simple and intuitive and can be expected to be generally useful. They can be applied almost mechanically and in fact can be automated to some extent, so that a clever compiler can apply such prescriptions by itself. In fact, most good compilers perform transformations corresponding to (1) and (2). Reuse of storage, as suggested in (3), is harder to automate, and the application of this rule requires greater ingenuity on the part of the programmer. The removal of recursive calls in favor of iterations has been studied in great detail in the context of languages such as LISP, which use recursion very heavily. We will examine a nontrivial example of recursion removal in a few pages.

In addition to the general rules stated in (1.)-(4.), there are certain optimizing program transformations that apply to very specific program constructs. Catalogs of such transformations can be found in some of the works listed in the References. Here we will only give some simple examples of applications of the rules (1), (2), and (3).

6.5.10.1 The elimination of repeated calculations

The simplest kind of efficiency-improving transformation which we can perform is to identify expressions that are common to several calculations and that can be evaluated once instead of repeatedly. For example, when calculating the roots of a quadratic equation, we may write the following familiar formulae:

		x1 := (-b + sqrt(b**2-4.0*a*c))/(2.0*a);        (1)
		x2 := (-b - sqrt(b**2-4.0*a*c))/(2.0*a);

In this case it is plain that the expression involving the square root is the same for both roots, which makes it possible to rewrite (1) as follows:

		discr := sqrt(b**2-4.0*a*c);
		x1 := (-b + discr)/(2.0*a);
		x2 := (-b - discr)/(2.0*a);

Transforming from (1) into (2) saves four arithmetic operations and one square root evaluation and makes the expressions simpler to type. We remark that the expression (2.0*a), which is a common subexpression to x1 and x2, is still evaluated repeatedly in (2). Replacing the two evaluations by a single one by saving the result in some other variable would eliminate one arithmetic instruction but add one assignment statement and therefore would hardly be a saving at all.

As a less trivial example of removal of redundant subexpressions, consider the evaluation of polynomials at a point. That is to say, given some polynomial

a_nxⁿ + a_{n - 1} x^{n - 1} + ... + a₀ (*)

we want to determine its numerical value at some point, e.g., for x = 10. If we represent the polynomial by a tuple P containing its coefficients in decreasing order, then we can evaluate the polynomial for x = c by means of the following code:

			degree := #P - 1;
			value := +/[a*c**(degree-i + 1): a = P(i)];

Note that in the loop we evaluate c**n, c**(n-1), c**(n-2), etc. These exponentiations are evaluated as repeated multiplications, so that the number of operations performed will be

n + (n-1) + (n-2) + + 0(n) = n² + 0(n) = 0(n²)

where the O(n) term takes care of the iteration over P and of a final summation of individual terms. In other words, as written the fragment (3) takes an amount of time proportional to the square of the degree of the polynomial. Substantial improvement can be obtained if we observe that the powers of c can be calculated in increasing order rather than in decreasing order, so that c**n is obtained from c**(n-1) by means of a single multiplication. An improved version is the following:

                power := 1;
                value := 0;
                for i in [#P,#P - 1..1] loop
                        term := power * P(i);        (4)
                        value +:= term;
                        power *:= c;
                end loop;

Here the loop is executed n times (where n is the degree of P) and each execution of the loop takes three arithmetic instructions and a few assignments, so that the algorithm is now of order n, rather than n2. This is a substantial improvement for large enough n.

Let us note that a similar improvement can be obtained by a different approach, which is several centuries old, and is known as Horner's rule. We can rewrite the polynomial (*) by means of successive factorizations:

((..(a_nx + a_{n - 1})x + a_{n - 2})..)) + aO

This suggests the following method for evaluating P:

                value := 0;
                for a = P(i) loop
                        value := value * c + a;
                end loop;

Now the loop contains only two arithmetic operations and one assignment. This is certainly superior to (4), and is therefore the method of choice for evaluating polynomials. Note however that (5) was obtained by means of a clever rewriting of the original problem, and the solution was not the result of some systematic transformation rule, but was "pulled out of a hat," as it were. It is appropriate to remember at this point that the development and optimization of algorithms require a healthy dose of inventiveness (guided, to be sure, by a knowledge of existing algorithms and optimization techniques)

6.5.10.2 Reuse of storage and recursion elimination: the case of quicksort

In order to illustrate the kind of program improvement that can be obtained by recursion removal, we will examine in some detail the (by now familiar) quicksort method. In its simplest version, we have described quicksort as follows:

	procedure quicksort(S);

		if S = {} then return [ ];
		else
			x := arb S;
			lo := {y in S | y < x};
			hi := {y in S | y > x};               		  (1)
			return quicksort(lo) + [x] + quicksort(hi);
		end if;

	end quicksort;

The skeleton of this algorithm consists of two steps: partition and recurse. That is to say: pick some element of the set to be sorted, and partition the rest of the set into two disjoint sets, corresponding to elements that are smaller (resp. larger) than the partitioning element. Then apply this process recursively to each partition. The two very substantial improvement that can be brought to bear on the efficiency of quicksort are performing the partitioning in place and replacing the recursion with a loop.

Partitioning in Place. To see how to apply this improvement we first observe that after S has been partitioned into lo and hi, the value of S is not used again. This suggests that, instead of building lo and hi at each step as new sets, we may want to reuse the space occupied by S, to place lo and hi in it. How to do this is not immediately apparent: as long as S, lo, and hi are sets, we do not have a simple way of describing how they are actually stored. If however we make them into tuples forming part of a larger tuple, then it is possible to describe them by their size and their starting locations. Suppose S is a tuple of size n, and that the partitioning element which we choose turns out to be the k-th one (in increasing order). Then the locations 1, 2..k-1 can be used for lo, and the locations k + 1, k + 2..n can be used for hi. Then, rather than passing lo and hi as tuples in the recursive call, we can instead pass the end points of each and then proceed to partition each subtuple in place as well. (How the partitioning in place is actually performed remains to be discussed.) This leads to the following (tentative) improved version of quicksort:

	T:= [x in S];		-- Initial tuple to be sorted.
	quicksort2(1,#T);		-- First call to routine.

procedure quicksort2(m, n);

-- Apply quicksort to the portion of T (which is global) between T(m) and
-- T(n) inclusive.

	if m >= n then return; -- Trivial case.
	else
		k = partition(m,n);		-- k is the position of the partitioning element
	
		quicksort(m,k - 1);		-- Elements smaller than it.
		quicksort(k + 1,n);		-- Elements greater than it.
	end if;	

end quicksort2;

In this version of the algorithm, no composite objects are constructed (except for the global tuple T). Each call to quicksort is given an explicit region of T on which to operate. This forces us to mention the indices m, n, k, etc., explicitly, which makes for a longer and less readable piece of code. We must now show that the partitioning itself can be done in place. This is an interesting algorithm in its own right.

Partitioning an Array in Place. The problem is the following: given a tuple t, and an arbitrary element e of t, reorder the elements of t so that all elements l < e appear before e, and all elements h > e appear after e. For simplicity let us assume that e is chosen to be the first element of t. A first thought might be the following: scan t from left to right, and whenever an element e₁ > e is found, move it up. But exactly where to? If we are to reorder t in place, then in order not to loose any information, we can only exchange elements. As a moment's reflection will show, the proper way to proceed is the following:

Going from left to right, find an element e₁ > e, which must therefore appear after e in the desired reordering.
Going from right to left, find an element e₂ < e, which must therefore appear before e .
Interchange e₁ and e₂ .
Continue scanning from left to right from the position of e₂ , and from right to left from the position of e₁ , interchanging whenever the next pair of out-of-place elements is found.
Stop when those two scans meet (at the correct location for e).

The following code (which deserves very careful scrutiny) performs this partitioning process:

procedure partition(i,j);
        
        -- Partition the portion of array T from T(i) to T(j),
        -- using T(i) as the partitioning element e. The procedure places e at
        -- position k, and places all elements smaller than e at positions < k.
        
        
        e := T(i);      -- Partitioning element.
        l := i + 1;     -- Index for left-to-right scan.
        r := j;         -- Index for right-to-left scan.
        
	while l < r loop 	-- Scans have not met in middle.
 
                while l < r and T(l) < e loop     -- bypass smaller elements.
                        l +:= 1;
                end loop;
        
        
                while l < r and T(r) > e loop     -- bypass larger elements.
                        r -:= 1;
                end loop;
         
        -- Now T(l) > e, and T(r) < e. Exchange them if the two scans have not met yet.
        
                if l < r then
                        [T(l), T(r)] := [T(r), T(l)];
                                -- Skip over these two elements which are now in proper place.
                        l +:= 1; if r > l then r -:= 1; end if;
                end if;
        
        end loop;

        -- Now l = r. l = r or l - 1 (if it is positive) is the correct position
        -- for the partitioning element. Place it there, and place T(r), which is  no
        -- larger than it, at the original position of the partitioning element.

        if l > #T or T(l) > T(i) then l := (l - 1) min j; end if;
        if l > 0 then T(i) := T(l);  T(l) := e; end if;
        
        return l;

end partition;

The top-level driver code for this routine can simply be

	procedure quicksort(i,j);
	    if  i >= j then  return;  end if;
	    mid := partition(i,j);
	    quicksort(i,mid - 1); quicksort(mid + 1,j);
	end quicksort;

Both these procedures assume that the tuple T is globally available.

At this point, our first optimization goal has been achieved: we have a quicksort algorithm that works in place, that is to say, without constructing any new composite objects in the course of its execution. This has been achieved by explicitly describing the elementary operations that place each element of the object being sorted at some specified position in a single tuple.

In other words, we have deliberately avoided the use of composite objects in the explicit fashion characterizing version (1) of quicksort. Of course, when we avoid manipulation of composite objects, much of the power of high-level languages such as SETL is lost. In fact, our version of quicksort now looks remarkably similar to a PASCAL or an Ada version of the same algorithm. This indicates the relative advantages of these various languages: SETL and other languages that manipulate arbitrary composite objects are the most convenient programming tools and lead most rapidly and with smallest programming effort to correct and lucid versions of algorithms. However, when efficiency considerations become critical, we must often abandon the richer semantic primitives of these languages (or drop to a lower-level language altogether) in order to control the efficiency of every statement we write.

Removing the Recursion from Quicksort. The next step in our improvement of quicksort is to replace the recursive calls within quicksort by some iterative construct which can be expected to be cheaper (recall that a procedure call is as expensive as 10 to 20 arithmetic operations). The skeleton of the recursive version of quicksort is the following:

		procedure quicksort(i,j);
			...
			quicksort(i,k - 1);
			quicksort(k + 1,j);

Each invocation of quicksort receives some values for i and j and assigns some value to k. Therefore, to eliminate recursive invocations of the procedure, we must save the current values of i, j, and k, because they might be used after we return from the recursive call. This can be achieved by means of a stack on which we "pile up" the parameters which we need to preserve. A stack is a tuple which is manipulated as follows: elements are added to it at its end and deleted from it from the same end. In other words, the operations that modify a tuple are the following:

			stack with:= x; -- "push" operator
			z frome stack; -- "pop" operator

Whenever we perform a recursive call, we use the stack to save whatever needs to be saved, i.e., whatever current values of variables and parameters will be needed after the recursive call. Whenever we return from the recursive call, we obtain, from the stack, all values that we need to resume the computation at the point where we left off. This is in fact what the SETL run-time system does: whenever a recursive call is executed, current values are saved before the (new instance of the) procedure starts executing. However, the run-time system does this in a stereotyped fashion which is often inefficient. In the case at hand, the two recursive calls are sandwiched between stacking and unstacking operations, whose action corresponds to the following code:

	procedure quicksort(i,j);
			...
		stack with:= [i,j,k]; 	-- Save current values.
		quicksort(i, k-l); 	-- Perform recursive call.
		[i,j,k] frome stack; 	-- Restore current values.
		stack with:= [i,j,k]; 	-- Save them again.
		quicksort(k + l,j);
		[i,j,k] frome stack;	 -- restore current values.

	end quicksort;

(Recall that the stack manipulations are the ones performed by the run-time system.) Several remaining inefficiencies should now be apparent: the most glaring one relates to the fact that the second recursive call is the last statement in the body of quicksort. In other words, there are no computations to be done after the second call, and therefore there is no need to save and restore current values of variables and parameters at this point. A smaller inefficiency is the fact that the second call only requires the values of k + 1 and j, so that there is therefore no need to save i either. So in fact we can do with the following:

	procedure quicksort(i,j);
		...
		stack with:= [k,j];
		quicksort(i,k-1);
		[k,j] frome stack;
		quicksort(k + 1,j);

	end quicksort;

At this point, the meaning of the stack should be clear: it allows us to keep track of portions of the tuple T which remain to be sorted. With this understanding, we realize that quicksort will continue calling itself recursively as long as there is some portion of T still unsorted, i.e., as long as there is something left on the stack that indicates some pending invocation of quicksort. Having this insight, we can transform our algorithm into an iterative one that loops as long as the stack is not empty:

	while stack /= [ ] loop
		[i,j] frome stack;	-- Values to work on.
		...
		stack with:= [i, k - 1]; -- to be worked on.
		stack with:= [k + 1,j]; -- ditto.
	end loop;

This is still incomplete: as written, each step deletes one pair of indices from the stack and adds two such pairs to it: clearly the stack will grow without bound, and the algorithm never terminates. What prevents this disaster from happening is that successive pairs describe shorter and shorter portions of the tuple being sorted. Whenever j - i < 2, there is nothing to sort. Therefore we should only save a pair on the stack if it describes a portion of the tuple of size greater than 1. A final improvement is to save only one pair on the stack and go to work immediately on the other, rather than stacking it and unstacking it at once. In order to keep the stack small, it turns out best to stack the indices of the larger portion to be sorted and go to work on the smaller one. Putting all this together, we obtain the following:

procedure quicksort(rw T);

    -- Optimized version of quicksort procedure. It operates in place, with
    -- explicit stack manipulation, and the partitioning process is performed
    -- on line.
    
    i:= 1;          
    j:= #T;                 -- Initial portion to be sorted is the
                            -- whole of T.
    
    stack := [ ];        

    loop

        e := T(i);      -- Partitioning element.
        l := (orig_l := i) + 1;     -- Index for left-to-right scan.
        orig_r := r := j;         -- Index for right-to-left scan.
        
        while l < r loop                        -- Scans have not met in middle.
 
                while l < r and T(l) < e loop     -- bypass smaller elements.
                        l +:= 1;
                end loop;
        
        
                while l < r and T(r) > e loop     -- bypass larger elements.
                        r -:= 1;
                end loop;
         
        -- Now T(l) > e, and T(r) < e. Exchange them if the two scans have not met yet.
        
                if l < r then
                        [T(l), T(r)] := [T(r), T(l)];
                                -- Skip over these two elements, which are now in proper place.
                        l +:= 1; if r > l then r -:= 1; end if;
                end if;
        
        end loop;
 
        if l > #T or T(l) > T(i) then l := (l - 1) min j; end if;
        if l > 0 then T(i) := T(l);  T(l) := e; end if;
        
        mid := l;

                    -- Save larger partition on stack.
        if orig_r - mid > mid - orig_l then
                [savei, savej] := [mid + 1,orig_r];
                j := mid - 1; i := orig_l;             -- Work on [orig_l..mid - 1]
        else
                [savei, savej] := [orig_l,mid - 1];
                i := mid + 1; j := orig_r;                       -- Work on [mid + 1..orig_r]
        end if;
        
                    -- Save larger partition only if it is longer than 1.
        if savej - savei >= 1 then stack with:= [savei, savej]; end if;
    
            -- Work on smaller partition only if it is longer than 1.
            -- Otherwise take the top pending partition to work on.

        if j - i = 1 and T(i) > T(j) then [T(i),T(j)] := [T(j),T(i)]; end if;
        
        if j  - i < 2 then 
            if stack = [ ] then   -- Nothing left to do.
                 exit;
            else
                 [i,j] frome stack;
            end if;
        end if;

end loop;
    
end quicksort;

This is very far from the short and intuitive algorithm with which we started. However, this final version is considerably more efficient than the original. It is in this final form that the quicksort procedure is usually implemented. Moreover, this more efficient version is still incorrect in some details and would have to be touched up to function properly! For example it will not work if T contains duplicate elements; moreover, during the partitioning process, it is possible for l or r to go out of bounds (i.e., l may become > #T, and r may become 0). These problems can be remedied without difficulty (see Exercises). However, their presence is a reminder that optimization is almost invariably accompanied by increased complexity.

Acceptance of the programming burdens that optimization demands is only justified for algorithms that are used often, consume large amounts of resources, or are critical to the overall performance of a system. Clearly, sorting is a frequently used process, and efforts spent in obtaining better sorting algorithms (the subject of much theoretical and pragmatic research in itself) and encoding them well are quite worthwhile. But it is unwise to devote strenuous optimization efforts to procedures that are not frequently used or are not critical to system performance. In many cases clarity and maintainability are far more important considerations in software design and construction than mere speed. Note however that recent research holds out some promise that it may eventually be possible for sophisticated computer programs to offer substantial assistance in applying optimizations as deep as those described in the present section.

EXERCISES

1. How could you use the techniques described in Section 6.5 to make nonterminating recursions less likely to occur?

2. The following SETL code is syntactically correct, but very poorly laid out. Put it in a better format, and add appropriate comments.

program sort; s := {3,2,1}; t := [ ]; while s/= {}loop s less:= (x := max/s);
t := [x] + t; end loop; print(t); end sort;

Improve the following program, which is also correct but poorly laid out.

 program find_palindromes; -- "Madam Im Adam"
x := "madam im adam"; y := +/[c: c in x | c /= " "]; print(y);
if y = +/[y(j): j in [#y, #y-1..1]] then print(x," is a palindorome"); else
print(x," is not  a palindorome"); end if; end find_palindromes;

3. What cases should be run to test the following recursive sort procedure comprehensively?

procedure sort(s); -- recursive sort of a set of integers

return if (x := arb s) = OM then [ ]
        else sort({y in s | y < x}) + [x] +  sort({y in s | y > x})
        end if;
end sort;

Run your tests, and try to estimate how thoroughly they test this code.

4. Suppose that G is a graph represented as a set of ordered pairs. If G contains a cycle C then C can be regarded as a subset of G such that for each x in C there exists a y in C (namely the edge y following the edge x in the cycle C) such that x(2) = y(1). Conversely, if this condition is satisfied, then there must exist a cycle, since starting with any x in C we can find an x' such that x(2) = x'(1), then an x" in in C such that x'(2) = x"(1), and so on, until eventually we must return to an edge that occurs earlier in the chain x, x', x", ..., at which point a cycle will have been formed. This leads to the following procedure for testing a graph to see whether it has a cycle:

	
	procedure test_for_cycle(G); -- tests any graph for the existence of a cycle

		print(if exists C in pow(G) | C /= {} and 
			(forall x in c | (exists y in c | x(2) = y(1)))
                then "There exists a cycle"
                else "There exists no cycle" end if);
        end test_for_cycle;

Work out a good battery of tests for this procedure, test it, and try to estimate how comprehensive your tests really are.

5. In the original quick_sort program, change the expression

sort({y in s | y < x}) + [x] + sort({y in s | y > x})

sort({y in x | y < x}) + sort({y in s | y > x}),

thereby introducing a bug. Then run the erroneous program. What happens? Could you guess the problem from the symptom? What would be a good way of debugging this program?

6. Suppose that the subexpression [x] in Ex. S is not omitted but is accidentally mistyped as [z]. What will happen? Why? What will happen if it is accidentally mistyped as [0]? As {x}?

7. For debugging purposes, it is useful to have a monadic operator @OUT such that @OUT x always has the value x, but such that "evaluation" of @OUT x prints the value of x. A binary operator s @OUT2 x which returns x but prints both s and x can also be useful. Write definitions for these operators. How might you use them to debug the faulty recursive procedure described in Ex. 5?

8. Each string in the following set consists of characters which are easily mistyped for each other:

{"1I|1/", "7>", "L<", "D0Oo", "S5s", "Z7", "UVuv", "6b", "_-", "GC6"}.

Write an expression that converts this set of strings into a set s consisting of all pairs of letters that are easily confused for one another. Use this set to create a "bugging program" B, which can read the text of any SETL program P, introduce one randomly selected character substitution chosen from P into it, and write the erroneous version of P thereby produced into a file.

Collect various short sample programs from your friends, "bug" them using B, and ask your friends to see whether they can spot the error. Then debug these programs, to see how long it takes you to track down the errors which B has introduced. If B is modified so that it never changes characters in SETL keywords, but only in identifiers, how much more elusive do the bugs that it introduces become?

9. Take the bubble-sort procedure described in Section 5.1.1 and the mergesort procedure described in Section 5.4.2, and modify them by inserting code which will count the number of comparisons which they make when used to sort a given vector t. Use these modified routines to sort tuples of length 50, 100, and 200, counting the number of comparisons performed and measuring their relative efficiencies. Try tuples with random components, and also try tuples with only a few components out of sorted order.

10. Suppose that the statement

[t(i), t(i + 1)] := [t(i + 1), t(i)];

in the bubble-sort procedure of Section 5.1.1 is replaced by

t(i) := t(i + l); t(i + 1) := t(i); (*)

What would happen? If we checked the resulting version of bubble_sort by adding

assert(forall i in [1..# t-1] | t(i) <= t(i + 1));

would the problem introduced by the change (*) be found? What checking assertion could we write to catch the sort of error that (*) introduces?

11. Suppose that in the bubble-sort procedure of Section 5.1.1 we inadvertently wrote [1..#t] instead of [1..#t - 1]. What would happen? If we wrote [#t..1] instead? If we wrote [1..#t - 2]? If we wrote [2..#t]?

12. Take the bubble-sort procedure shown in Section 5.1.1 and find at least three errors that might plausibly be made in writing or typing it, any of which would cause the code to loop endlessly. None of these errors should involve changing more than six characters. Take these erroneous versions of bubble sort to friends, and see how long it takes them to spot the errors.

13. Take the merge_sort program shown in Section 5.4.2. Then modify it, to produce four different erroneous versions of merge_sort, each of which contains one of the following list of common bugs. (Try to make your modifications as plausible, and as hard to spot as possible.)

Boolean condition stated in reversed form.
One branch of an if statement omitted.
Premature exit from a loop.
Input data not checked; data of the wrong type read.

For each of these erroneous versions estimate the time that would be required to find the error if you did not know where it was. Write a battery of tests sufficient to show that there is something wrong with each of these erroneous programs.

14. Repeat Ex. 13, but for the buckets and well program shown in Section 5.3.1. Produce five erroneous versions of this program, each with one or two plausible errors in every procedure. Devise a debugging plan which could discover most of these errors quickly. In what order does it seem best to test the procedures of this program? Where would it be most useful to place assert statements? Try to devise assertions that can be checked quickly, but whose verification will be strong evidence that the program is working as it should.

15. The following version of quicksort contains just one error. What is it?

       procedure quicksort(t); -- t is assumed to be a tuple of integers
                if t = [ ] then return; end if;
        
                x := t(1);
                t1 := [y: y = t(i) | y < x];
                t2 := [y: y = t(i) | y = x];
                t3 := [y: y = t(i) | y > x];
                quicksort(t1); quicksort(t3);
                t := t1 + t2 + t3;

        end  quicksort;

16. How many of the errors introduced by the "bugging program" described in Ex. 8 could be found more easily using a program which reads SETL programs and prints out a list of all identifiers appearing only once in them?

17. Write a maintenance test which could be used to check a sort program by comparing its results with those of quicksort. Use this test to verify that the mergesort procedure shown in Section 5.4.2 is correct.

18. Write a SETL system maintenance test which computes 50 set- or tuple-related values in two radically different ways and compares the results obtained. Your test should exploit various set-theoretic identities. For example,

{e(x): x in domain f} = {e(x): [x,y] in f } = {e(x): y = f(x)}

should be true for every map f, and

s1 * s2 = s1 - (s1 - s2) should be true for every pair of sets.

19. To see what parts of a program have been executed in a series of tests, we can introduce a global variable called points, and a procedure

procedure traversed(k); points less:= k; end traversed;

Then if we initialize points by writing points := {1..n}, insert a sequence of statements point(j), j = 1,...,n into the code being tested, and print points at the end of execution, each remaining member of points will represent a section of code that has never been executed. Apply this technique to develop a comprehensive set of tests for the text-editing program described in Section 5.10.2. Add tests to your set until the condition points = {} is achieved, to make sure that your collection of tests does not leave any section of code unexecuted.

20. A boundary test for a program P is a systematic collection of tests which exercises P in all the legal but extreme cases which P is supposed to handle. Work up several such boundary tests for the text-editing program described in Section 5.10.2. Your tests should include items like checks for $0.00, empty transaction files, etc.

21. The text-editing program described in Section 5.10.2 is totally unprotected against bad input. Modify it so that all input is systematically examined for acceptability; your input-examination procedures should check for all remotely plausible input errors. Write an English-language explanation of the input errors for which you check.

22. Take one of your programs, approximately 10 lines long. Strip all comments from it, and then introduce one misprint, to cause a bug (not one that syntax analysis would find). Give the result to a friend (a good friend!) with a 3-line explanation of what the program is supposed to do. See whether your friend can find and fix the error without expending more than an hour's effort.

23. Develop test data for the GCD program out1ined in Ex. 12 (following Section 5.5). Your tests should include cases in which the data are zero, negative, etc., and should test all relevant combinations of "extreme" data of this kind.

24. Write the "MISSING_SECTIONS" procedure described in Section 6.4.

6.6 Formal Verification of Programs

The great importance of programs to banks, airlines, engineering firms, insurance companies, universities, indeed to all the major institutions of our society, lends an inescapable importance to the question of program correctness. Once a program has been written, how can we be sure that it is correct, i.e., that when given legal input it will always produce the output that its author desires? This is a deep question, whose systematic exploration would take us far beyond the boundaries of the present text. Nevertheless, in order to shed some light on the issues involved, we this section will say something about it.

To begin with, we emphasise that mere program testing, even systematic testing like that described in Section 6.4, can never prove a program's correctness in any rigorous sense. Testing, to repeat an important maxim of the Dutch computer scientist Edsger Dijkstra, can show the presence, but not the absence, of bugs. Though systematic testing is an essential tool of program development, in asserting the rigorous correctness of a program we are asserting that it will run correctly in each of a potentially infinite family of cases. Clearly, no finite sequence of test cases can cover all of them, and so any rigorous assertion that a program functions correctly in all possible cases must rely on some kind of mathematical proof.

The basic raw material out of which such proofs can be built is not too hard to find. When a programmer has written a program and checked it carefully for the first time, why does he believe that it will run correctly? If legitimate, this feeling of correctness must always rest on a comprehensive logical analysis of the conditions that arise as control moves from point to point during the execution of a program.

To show what such analysis involves, we will take a very simple program, namely one which calculates the product of two integers n and m by adding n to itself m times. (The basic technique that we will use to prove the correctness of this trivial program is entirely general; however, the mass of technical detail needed to handle more comp1ex examples grows rapidly, and to avoid this it is best to stick to a rudimentary example.) Since it is a bit easier to handle while loops than for loops, we write our multiplication code as follows:

		prod := 0;
		iterations := 0;

		while iterations /= m loop
			prod := prod + n;				(1)
			iterations := iterations + 1;
		end loop;

To begin to prove this program correct, we must first supplement it by adding a formal statement of what it is that the program is supposed to achieve. This can be done by adding an assert statement at the very end of the program, giving us


	Line1:	prod := 0;
	Line2:	iterations := 0;
	Line3:	while iterations /= m loop
	Line4:	prod := prod + n;   (2)
	Line5:	iterations := iterations + 1;
	Line6:	end loop;
	Line7:	assert(prod = m * n);

In (2), all lines of the program have been labeled to facilitate later reference. Note that addition of the final assert statement is an absolutely necessary preliminary to any attempt to prove anything at all about the program: Until we have stated what a program is supposed to do, we cannot even begin to prove that it does what it is supposed to! This is to say that all rigorous proofs of program correctness are really proofs that two different descriptions of a computation, one a deliberately very high level mathematical statement (like the final line in (2)) of what an algorithm is supposed to accomplish, the other a more detailed procedure (like the rest of the code (2)), really say the same thing.

This fundamental principle being understood, we go on to remark that in proving a program correct what one basically needs to do is just to write down the logical relationships between data items upon which the programmer's understanding of his program's behavior rests. However, these relationships must be written down in a sufficiently complete manner and must be expressed formally, using additional assert statements.

To see what is involved, let us first analyze program (2) informally. If the author of (2) wished to convince a skeptical colleague that it really does compute the product m*n, what facts about (2) would he point out, what more detailed analysis would he offer? The crucial fact upon which program (2) depends is that each time the loop starting at Line 3 begins to repeat, the variable prod will be equal to the product of the variable 'iterations' by the variable 'm'. This is certainly true on the first iteration, since then both prod and iterations are zero, so we certainly have

				prod = iterations * n      		         (3)

(i.e., 0 = 0 * n) on first entry to the loop. But if (3) is true at the start of k-th iteration, it must also be true at the end of the k-th iteration, since the body of the loop increments 'prod' by n and increments 'iterations' by 1. Hence (3) remains true during every iteration. Thus, since the loop only terminates when the variables 'iterations' and 'm' are equal, (3) implies that prod = m*n at the end of the loop, which is what we wanted to prove.

The argument we have just presented is a satisfactory informal proof of the correctness of the program (2). Nevertheless, it is not quite what we require. In proving that a program is correct, we aim to exclude rigorously the possibility of any small, easily overlooked program bug. For this, merely informal, English-language proof is insufficient, since such proofs are no less likely than programs to contain small errors. Moreover, some of the likeliest errors in programs (for example, counting in a manner that is off by 1) correspond closely to errors that occur frequently in mathematical proofs (for example, starting a mathematical induction at the wrong place or missing one among multiple cases that a proof needs to examine; see Ex. 12 of Section 6.7). Therefore, when we set out to prove a program rigorously correct, we must aim at something more formal and machine-checkable than an ordinary English-language proof of the kind ordinarily found in textbooks.

6.6.1 Formal verification using Floyd/Hoare assertions: general approach

This observation drives us to a more formal approach, like that devised by Robert Floyd and Tony Hoare, for proving programs like (2) correct. Floyd's formalism requires us to add assert statements to a program P that we are trying to prove correct. These auxiliary assert statements, sometimes called the Floyd/Hoare assertions for P, must satisfy two principal conditions:

Enough assert statements must be added so that there can exist no indefinitely long path through the program P which does not pass through at least one assert statement. Another way of putting this is to say that at least one auxiliary assert statement must be inserted into every loop in the program P.
Consider any one of these auxiliary assert statements. It will have the form
```
			assert(C)               	   (4)
```

where C which can be any Boolean-valued expression, called the condition of the assert statement. The auxiliary assertion (4) will occur at some specific place in the program P, say, to be specific, immediately after Linej of P. Then we require C to assert every fact about the state of the program's variables that is relevant at Linej, i.e., every fact upon which the correct functioning of P from Linej onward will depend. This important rule ensures that all the essential facts needed for proving the correctness of P are explicitly and formally written down in the auxiliary assertions added to P, and this is what makes a rigorous proof of correctness possible in principle.

Once the required assertions (4) have been added to P, we proceed as follows. Starting either at the first statement of P or at some one of the auxiliary assert statements in it, we move forward line by line through the program along every possible path (i.e., every path of control flow, which is to say every path that the program could follow during its execution. All possible paths through P which start at an assert statement but do not pass through any other assert statement must be considered one after another. Because (by condition (a)) there are no infinite loops not passing through an assert statement, there can exist only finitely many such paths, and each such path will be bounded in length.

Tracing out all such paths q, we will use each of them in the following way to generate a set V of verification clauses. (With one exception, noted in (f) the verification clauses associated with a particular path q collect logical relationships between variable values which are certain to hold along q.)

Suppose that the path q starts at an assert statement assert(C), where C is a Boolean formula. Then we begin by putting C into V as its first clause. This simply reflects the fact that C is assumed to be true at the start of q.
If the path q passes through an assignment statement of the form
```
(A)				x := expn;
```
(where expn can be any expression) we introduce a new variable identifier x' (this identifier simply designates the value which x has after execution the assignment (A)) and add the clause
```
(B)				x' = expn
```
to V. Occurrences of x encountered later along the path q (but prior to any subsequent assignment to the same variable x) are then replaced by occurrences of x'. (But at and after any later assignment to x we replace x by yet another new identifier x".) For example, the sequence of assignments
x := x + 1; y := y + 1; z := x + y; x = x + z; would generate the clauses
x' = x + 1, y' = y + 1, z' = x' + y', x" = x' + z'.
These rules simply reflect the fact that the new value x' which the variable x takes on immediately after the assignment (A) satisfies the equation (B), and that x retains this value until it becomes the target of a subsequent assignment.
If the path q passes through an assignment of the special form
```
(C)				x := arb(s);
```
where s is some set-valued expression, then just as in paragraph (b) we introduce a new name for x, but in this case we add the clause
```
(D)				x' in s
```
to V. (This reflects the fact that the arb operator can select an arbitrary element of s, so that (D) asserts everything we can know about the new value x' given to the variable x by the assignment (C).)
Conditional and unconditional goto's: The SETL language does not provide any goto statement, but to state the next verification rule that concerns us, it is convenient to pretend that it does, and to think of SETL's while and until loops as abbreviations for lower-level constructions involving gotos and labels. The expansion of a while loop
```
				while C loop
					statements...
				end loop;
```
into its underlying meaning would then be
```
			Start_Label: if not C then goto Exit_label; end if;
					statements...
				goto Start_Label;
			Exit_label: ... 
```
The corresponding expansion of an until loop
```
				until C loop
					statements...
				end loop;
```
is
```
				Start_Label: 
					statements...
 				if  C then goto Start_Label; end if;
			Exit_label: ... 
```
An exit statement in a while loop translates directly into
```
			goto Exit_label;
```
Also, while loops containing continue statements, for example
```
			while C loop

				statements...

				continue;

				more_statements...

			end loop;
```
can be translated as
```
			Start_Label: if not C then goto Exit_label; end if;

				statements...

				goto Start_Label;

				more_statements...

			goto Start_Label;;

		Exit_label: ... 
```
and similarly for exit and continue statements in until loops.
Thinking of the SETL loop and if constructions in this way lets us state the following rules for setting up verification clauses. If the path q passes through a control statement of the form
```
(E)				goto Label;
```
then the path q must continue with the statement following the Label that appears in (E), but we add no clause to V at this point, since a simple goto does not test any condition or change the value of any variable.
On the other hand, if the path q passes through a control statement of the form
```
(F)				if C then goto Label; end if;
```
then the path q can go on either to the statement immediately following (F) or to the statement following the Label that appears in (F). In the first case, we add the clause not C to V; in the second case we add the clause C to V. These rules simply reflect the fact that not C must hold if and when q passes through a control statement (F) without the instruction goto Label applying, but that C must hold if and when q reaches (F) and the instruction goto Label is executed.
As we have noted, the rules for more complex control structures, for example, general if constructs, while loops, and until loops can be deduced by rewriting them in terms of the more primitive constructs (E) and (F) and then applying the rules stated previously. For example, if q encounters a multibranch if statement of the form
```
   
(G)				if C1 then
					block 1
				elseif C2 then
					block2
					...
				end if;
```
and then enters block2, it is obvious that we must add the two clauses

not C1, C2
to V. Later, if and when q passes from the last statement of block2 to the first statement following the multibranch if, no clause needs to be added to V, since this transition, like (E), counts as an unconditional transfer.
The rules applying to a while loop
```
(H)				while C loop
				     body
				end loop;
```
are similar. If and when q passes through the while header, either by entering the loop from the statement immediately prior to (H) or by looping back from the final statement of the body of (H), we must add the Boolean clause C to V. On the other hand, if the path q encounters the while header but then leaves the loop (H) immediately, we must add the negated clause not C to V.
When q encounters the end loop line in (H) it will go immediately to the loop header standing at the start of (H). Since this is an unconditional transfer we add no clause to V in this case.
When q enters an until loop we need not add any clause to V since entry to such a loop is unconditional. However if and when q encounters the end until terminating such a loop the action that we must take is a bit more complex. Suppose, to be specific, that the loop in question has the form
```
(I)				until C loop
					body
				end loop;
```
If, after encountering the end loop clause, q exits the loop, then plainly we must add the clause C to V. On the other hand,if q encounters the end loop clause and then loops back and continues with the first statement of the body of the loop. We must add the negated clause not C to V.
Eventually, the path q that we are following will end at an assert statement
assert(C').
Our aim is then to show that C' is necessarily true at the end of q, provided that the assertion C with which q starts (see (a)) is true at the beginning of q, and provided also that the program's execution does indeed follow the sequence of steps corresponding to q. It is most convenient for this purpose to add the negated condition
not C'
to V. After doing this our aim must be to show that the set V of clauses is inconsistent, i.e. that not all the clauses of V can be true simultaneously. This is equivalent to requiring that, taken all together, the clauses of V, other than its final clause not C', imply the condition C'.
To complete the set of rules stated in the preceding paragraphs, we would need rules that tell us how to handle procedure definitions and invocations. However, since these rules are somewhat more complex than those already stated, we omit them.This means that the rules that we have stated suffice for the formal verification of programs containing no procedure invocations, but not for programs which make use of procedures. This deficiency is not fatal - it would not be terribly hard to remedy - but to do so would take us beyond the limits proper to the present work.

Once we have taken a program P containing assert statements and generated the set V of verification clauses corresponding to each path q starting and ending at an assert statement (but not passing through any other assert statement), we are in position to prove the correctness of P mathematically. To do this, we must prove mathematically that each of the clause sets V which we have generated (i.e. each of the clause sets corresponding to a path q) is inconsistent. Suppose that we can succeed in doing this. We can then note that the clause CI initially placed in V is true by assumption, and that, with the exception of the final clause CF of V (see (f)) all the other clauses inserted into V are true in virtue of the very meaning of the statements which the path traverses. Hence, by showing that V is inconsistent, we will have shown that if CI is true at the start of q, then CF is true at the end of q. Once this has been demonstrated for every path q through the program P (or, more precisely, every path which connects two assert statements but does not pass through any other assert statement), it will follow by mathematical induction that every assert statement written into P must evaluate to true whenever it is reached, provided only that the assert statement standing at the very head of P is true at the moment that execution of P begins. (This initial assert statement, often called the input assertion of P,will normally summarize all the assumptions concerning input data on which the program P relies.) All in all, we will have shown that the truth of every assertion written into P follows from the assumption that its input assertion is true.

It is important to realize that this final step of a formal program verification, i.e. the step of proving that every set of verification clauses corresponding to a path q between assert statements is inconsistent, is a purely mathematical task. That is, when we begin this task we will already have decoupled the work which remains from any entanglement with the control structures and other programming dictions present in the original program P. It is precisely in order to achieve this, i.e. precisely in order to transform our our original program-related verification task into a purely mathematical question, that we go to the trouble of reducing the program P to the collection of clause sets V that it generates. Note again that, once all the necessary Floyd assertions have been written into the text of P, generation of the clause sets V using the rules stated above is a simple mechanical matter, essentially a matter of systematic variable renaming and extraction of suitable portions of the statements encountered along each of the paths q.

6.6.2. Formal verification using Floyd assertions: an example.

To apply the formal verification technique just outlined to the example we consider,we must insert an auxiliary assert statement into the while loop appearing in the example. We choose to insert this assert statement immediately after Line3 of (2). Call this place p. As explained, this added assertion must put on record every condition C which always holds at p and which would appear, implicitly or explicitly, in an informal proof of the correctness of the program (2). Since we have already given an informal proof that this simple program is correct, we already know what the inserted statement should say (namely it should say that (3) is always true at the beginning of an iteration). Such an assertion is easily written and inserted; doing so, we obtain

	Line1:	prod := 0;
	Line2:	iterations := 0;
	
	Line3:	while iterations /= m loop
		assert(prod = iterations * n)
	Line4:	prod := prod + n;	(5)
	Line5:	iterations := iterations + 1;
	Line6:	end loop;
	
	Line7:	assert(prod = m * n);

Writing (5) puts us in position to generate the clause sets needed to verify the correctness of the program we are considering. There are just four paths through this program that needed to be taken into account. The first of these is the path running from the start of (5) to the first assert statement in (5). By the rules stated this path generates the clause set


	prod' = 0, iterations' = 0, iterations /= m, 
	not (prod' = iterations'* n)			(6)

The second path that we need to consider runs from the start of (5) to the while-loop header but not into the while loop and then to the final assert statement. This path generates the clause set.

	prod'= 0, iterations' = 0, not (iterations' /= m),
	not (prod' = m * n).				(7)

A third path between asserts runs from the assert statement following Line3 through the body of the while loop and then back to this same assert statement. This generates the clause set.

	prod = iterations * n, prod' = prod + n, iterations' = iterations + 1, 
	iterations' /= m, not (prod' = iterations' * n).		(8)

The fourth and final path that we need to consider is the one which runs from the assert statement following Line3 through the body of the while loop but then exits the loop passing through Line3 and then going immediately to Line7. The rules stated previously tell us that this path generates the clause set.

	prod = iterations * n, prod' = prod + n, iterations' = iterations + 1, 
	not (iterations' /= m), not (prod' = m * n)		(9)

These are all possible paths not running through any assert statement and hence these are all the clause sets that we need to consider. Once these clause sets have been generated it is easy to prove that each of them is inconsistent. In view of the simplicity of our original example nothing more than elementary algebra is needed for any of these proofs. In (6) the first two clauses plainly contradict the fourth clause in (7), the first three clauses contradict the fourth. In (8) the first three clauses contradict the fifth in (9), the first four clauses contradict the fifth. This completes our formal verification of the program (2).

It is important to note that this formal verification is very close in spirit to the informal English-language proof of the correctness of (2) that we gave earlier: the formal proof only regularizes and systematizes the informal proof. However, this formalization has the vital effect of making it possible to proceed mechanically, thereby ruling out the possibility of small errors. Strictly speaking, to be quite sure that error is impossible the clause sets would have to be generated mechanically by an extension of the SETL compiler, and the informal proof of inconsistency which we have supplied for each clause set would have to be checked mechanically. This can be done, but its final step is not easy.

As already observed the clause-set generation process that we have applied to the example program (2) is quite general, and will apply with much the same ease to any other long or short program which has been decorated with a sufficiently full set of assertions. However for more complex programs the sets generated will not be as simple as (6), (7), (8) and (9). Program (2) involves algebraic operations only, and this is why the clause sets generated from it consist entirely of elementary algebraic formulae. Less elementary programs generally involve both algebraic and set-theoretic operations and this will cause set-theoretic expressions to appear in the Floyd assertions and hence in the clause sets associated with these programs. (Several programs illustrating this remark appear in the verification-oriented exercises of Section 6.7.) To show that such clause sets are inconsistent is considerably less trivial than to deal with the clause sets arising in the highly simplified example that we have considered. Nevertheless, with care and sufficient effort, the proofs required to show clause set inconsistency can always be checked formally after they have been constructed by using only the tools which formal mathematics and symbolic logic make available. In this sense, the formal assert-statement based verification approach that we have described reduces the problem of rigorous program verification to a purely mathematical question, namely that of proving the inconsistency of certain clause sets written in a formal mathematical notation. This is as far as we will carry our discussion of formal verification, since to discuss the mathematical problems that must then be faced would take us outside the scope proper to this text.

6.7 Formative Influences on Program Development

At this point in our text we have presented programs ranging from the simple to the complex, and have discussed both the pragmatic methods used to test programs systematically and the considerably more formal techniques that can be used to prove their correctness rigorously. The present section will discuss a deeper but more amorphous issue. Specifically, we will try to give some account of the formative influences which shape programs and which determine the features that programs typically exhibit. By gaining some understanding of this fundamental question we can hope to put other important issues such as program design and program testing into a helpful broader perspective.

To understand what underlying forces shape the development of programs, it is well to observe that ingredients of two fundamental sorts enter into the composition of a program. Material of the first kind serves to define user desires and expectations concerning an intended application: for example, the nature of expected input and of output, including output text formats, graphic layouts, prompts and warnings issued by interactive systems, compiler error diagnostics generated by compilers, etc. This material, which often constitutes the overwhelming bulk of a particular application-oriented program, is motivated by user-oriented considerations having an intrinsically nonmathematical character. Material of a second, much more highly algorithmic kind also appears in programs. This internal program material creates the toolbox of operations which is used to achieve whatever external behavior is desired. Depending on the relative weight of program material belonging to these two categories (external and internal), a program can be called an externally. motivated or internally motivated program, an application or an algorithm; one might even say a superficial or subtle program.

Looking back over some of the programs presented in earlier chapters it is easy to apply this distinction. For example, the shortest path code presented in Section 4.3.8.2 is an internally motivated algorithm (though not a very deep one). In contrast, the cross reference program presented in Section 5.8.12 of Chapter 5 has very little algorithmic content. Most of its details relate to such external matters as the rules which distinguish words from punctuation marks in English text, and one whole procedure of this program, namely 'arrange', is needed only because we want to print lists of line numbers in a neat, easy to read tabular arrangement. Other examples are the quicksort procedure of Section 5 4.1 and the mergesort procedure of Section 5.4.2, which are both algorithms whose recursive structure gives them a certain depth in spite of their brevity, and the polynomial manipulation procedures of Section 5.1, which are also algorithms, albeit rather easy ones, since they are little more than transcriptions into SETL of the ordinary algebraic definitions of polynomial sum, difference, product etc. On the other hand the Turtle language interpreter presented in Section 4.7.1 is externally rather than internally determined: this code uses no nontrivial algorithm, but merely reflects the rules of the Turtle language in an almost one-to-one manner. The 'buckets and well' program of Section 5.7 makes the distinction between internally and externally motivated code particularly clear, since one of its procedures, namely the crucial find_path routine, is an internally motivated algorithm. All its other procedures are externally motivated, some of these relating to such issues as the acquisition and checking of initial data, although others merely serve to represent the rules of the problem itself.

The basic concepts and notations of mathematics, which SETL makes available as tools of programming, serve very adequately to define the internally motivated, algorithmic parts of programs. We have seen that SETL's set-theoretic features allow mathematical functions to be described either in a deliberately succinct "high" style which defines them very directly, or more procedurally by algorithms which compute these same functions, sometimes in surprising clever much more efficient ways. We have also noted that useful mathematical operations which are not directly provided by SETL can be built up by writing suitable families of procedures, and have emphasized (see our discussion of the family of polynomial manipulating procedures developed in Section 5 1 3) that such families should be written to hide the internal representational details of the mathematical objects they manipulate, allowing a user to think in terms of these objects (e. g. polynomials) rather than in more primitive set-theoretic terms. By using such approaches; by studying important algorithms carefully; and by consulting the rapidly growing technical literature of algorithms, which by now describes many useful highly sophisticated ones, you will find that the purely algorithmic side of programming can be brought under a reasonable degree of control.

The externally motivated aspects of programs reflect a considerably more miscellaneous congeries of influences, for example the physical or administrative structure of real-world systems; the form and sequencing of expected input and desired output; the reactions, including prompts and warnings, expected from interactive systems; and the heuristic approaches used to manipulate physical or symbolic objects effectively. How can we come to terms with such varied material?

There has developed a large, though largely administrative, literature concerning the important problem of how to come to terms with external aspects of application design before the start of detailed programming. This is the so-called problem of requirements specification. Concerning the literature devoted to this problem the astute observer G. J. Myers comments

Although no methodology exists for external design, a valuable principle to follow is the idea of conceptual integrity [i.e.] the harmony (or lack of harmony) among the external interfaces of the system. The easiest way not to achieve conceptual harmony is to attempt to produce an external design with too many people. The magic number seems to be about two. Depending on the size of the project one or two people should have the responsibility for the external design. Who, then should these select responsible people be? The process of external design has little or nothing to do with programming; it is more directly concerned with understanding the user's environment, problem, and needs, and the psychology of man-machine communications. Because of its increasing importance in software development, external design requires some type of specialist. The specialist must understand all the fields mentioned above, and should also have a familiarity with all phases of software design and testing, to understand the effects of external design on these phases. Candidates that come to mind are systems analysts, behavioral psychologists, operations-research specialists, industrial engineers and possibly computer scientists (providing their education includes these areas, which is rarely the case)

Though Myers's general remarks are helpful, it is still important to try to say something more about the organization of externally motivated applications-oriented programs

One important possibility in this area is to develop special applications oriented programming languages whose objects and operations define useful, standard approaches to important application areas. The Tk widget library and the 'Panel' system discussed later in this book are languages of this kind, as are the 'Berkeley socket' library for Internet programming that SETL is able to access via its Tk or Python interfaces (these are discussed later.)

Even if such languages remain unimplemented and are not available to be run on any computer, their notations and conceptual structure can provide important tools of thought. In developing an application it maybe well to write out a first version of the application in a helpful, even if unimplemented, auxiliary language. This first version can then be translated into SETL, by choosing SETL representations for all the kinds of objects appearing in the auxiliary language, and writing SETL routines which implement its primitive operations. Used in this way, the auxiliary language can identify families of operations that work harmoniously together, and procedures into which a SETL application code can usefully be organized. For this reason, comparative study of disparate application-oriented languages,e.g. SNOBOL, GPSS, APL, COBOL, LISP and PROLOG, JAVA, PERL, SQL, VRML, and of gestural programming environments like Macromedia Director, is an essential exercise for the would-be programmer.

A useful suggestion is to isolate the internal algorithmic content of applications from their external requirements, and to prototype their internal operations using general mathematical operations rather than tailored special cases of them. Contrasting with this recommended practice, ordinary application oriented code tends to mix internally and externally motivated program material inextricably, i. e., output details are allowed to control the choice of algorithms, and opportunities to generate output which an algorithm seems to afford are allowed to determine much of what the end user sees. Inartistic packages, which meet user requirements only minimally, and which are full of redundant, hard to maintain, and inefficient algorithmic fragments, often result.

A related suggestion is to use well-designed general-purpose application modules as building blocks for the construction of more complex applications. The Tk-derived widget sets of the SETL interactive interface facilitate this approach.

As we have said, most of the text of an applications-oriented program is often nothing more than a restatement, in programming language terms, of external facts and rules pertaining to the intended application. Once a program's designer has found a way of representing these facts and rules as succinctly and clearly as a well-conceived English language description of these same details would, he/she has programmed these external aspects about as effectively as can be expected. The remaining algorithmic content of a highly external program will normally be small. However:

A few genuine but generally rather elementary algorithms maybe used. For example one may want to sort, perform a binary search, or put the data to be processed into some arrangement which makes it easy to locate important groups of interrelated data items.
To improve efficiency one will often apply formal differentiation to an application-oriented code. This is the technique of speeding up the calculation of a quantity E that will be required repeatedly by storing its value in a variable value of E. which must then be updated whenever any parameter on which E depends is changed. (Whenever this common technique is applied, it tends to complicate the application code since it replaces a single integral, often self-explanatory computation of E by multiple scattered, harder-to-fathom updates of value of_E.) A related technique is to replace direct use of set formers and tuple formers by loops which build these same values. Sometimes this is done in order to combine several such loops, all of which iterate over the same set into a single loop For example in application-oriented code (and even in hand-optimized algorithms) one is less apt to see.

	num_rich_families := #{x in families | family_income(x) >= 100000};

	num_middle_families := #{x in families |
			family_income(x) < 100000 and family_income(x) > 15000}; (1a)


	num_poor_families := # {x in families | family_income(x) <= 15000};

than to see something like

	num_rich_families := num_middle_families := 
	num_poor_families = 0; 
	
	for x in families loop

		if (income := family_income(x)) >= 100000 then 
			num_rich_families +:= 1;
		elseif income > 15000 then  			(1b)
			num_middle_families +:= 1; 
		else 
			num_poor_families +:= 1; 				
		end if;
	
	end loop;

The code (1b) arises from (1a) by expansion into loops of the three set formers appearing in (1a), followed by combination of the three resulting loops, and then by the application of a few other rather obvious optimizing transformations. Note that (1b), although much more efficient and not much longer than (1a), is not quite as obvious a piece of code; certainly (1b) is less brutally direct than (1a).

Internally motivated code passages, i.e. significant algorithms, use a much wider range of tricks than appear in more superficial, application-oriented programs (It is partly for this reason that it is well to separate internally determined from externally determined code sections. Externally oriented code can often be ground out routinely once a good approach has been defined, but deeper, internally oriented code needs to be approached much more cautiously, more "by the book".)

Formal differentiation, described previously, often shapes the design of internally oriented code passages. Another important technique is exploitation of recursive mathematical relationships which express some desired function f of a composite object x in terms of values f(x1),...,f(xn) calculated for one or more smaller subparts xj of x. As noted in Section 5.4, relationships

f(x) = g(f(x1),...,f(xn))

of this recursive kind underlie such high-efficiency algorithms as mergesort and quicksort. Beyond these two most common techniques, algorithm designers have uncovered many sophisticated techniques which handle many important tasks with remarkable efficiency. Some of these algorithms rest on subtle mathematical relationships whose discussion goes beyond the scope of this book. Since your ability to devise truly effective approaches to programming problems be strongly conditioned by your familiarity with the rich and growing literature of algorithms, you are strongly advised to study this material as soon as you have mastered the more basic material contained in this book. A short list of useful collections of advanced algorithms is found at the end of this chapter.

EXERCISES

1. Into the bubble-sort code shown as (12) of Section 6.5.6, insert code which will count the number of iterations performed.

Then:

Measure this number i of iterations for randomly chosen tuples of varying lengths L, and calculate the ratio of i to L**3, to estimate the constant c that should appear in the formula
i = c * L**3 projected in Section 6.5.6. Do the same for quicksort.
How much more efficient than the bubble-sort method (12) of Section 6.5.6 would you expect quicksort to be, for sorting a tuple of 10 elements? For sorting a tuple of 100 or 1000 elements?

2. Take quicksort and the mergesort procedure described in Sections 5.4.1 and 5.4.2, and modify them by inserting code which will count the number of comparisons they make when used to sort a given tuple. Use them to sort tuples of length 50 100 and 200 counting, the number of comparisons performed and measuring their relative efficiencies. Try all three tuples with random components and tuples with only a few components out of sorted order.

3. Use the technique described in Section 6.5 to estimate the time required to sort a vector of length using mergesort and also the time required to search for a specified component in a sorted vector using the binary search algorithm given in Section 5.4.3.

4. What set will be printed by the following code?

		n := 10;
		s := {};
		for i in [1..n] loop s with:= s; end loop;
		print(s);

If we changed the first statement to n := 1000, for roughly how long would you expect the resulting code to execute?

5. Compare the time required to execute the following codes

		n := 500;
		s := {};
		for i in [1..n] loop 
			s with:= 2 * i;
		end loop;

and

		n := 500;
		s := {};
		for i in [1..n] loop 
			s with:= 2 * i; t := s; 
		end loop;

What accounts for the difference?

6. Write a program which will execute the 10 elementary SETL operations which you consider most important 1000 times each, and which will estimate the time required to execute each such instruction. To estimate the time required just to execute looping operations, your tests should compare loops like


	for i in [1..n] loop x := y + z; end loop; 
	for i in [1..n] loop x := y + z; x := y + z; end loop;

The time difference per iteration is then clearly the cost of executing the additional operation.

7. Take the 'buckets and well' program described in Section 5.3.1 and modify it by inserting code which will count the number of times that every one of its procedures is called and the number of times that every loop in it is executed. This information should be written to a tuple, and a general-purpose routine which prints this information in an attractive format should be designed and implemented.

8. A tuple t all of whose components are different from OM can be represented in the "list" form described in Section 6.5.9 i.e., by a pair [x1,f] such that x1 is the first component of t and f is a map which sends each components of t into the next component of t. Use an iteration procedure to write short codes which convert a tuple t from its standard form to this list form and vice versa.

9. Rewrite program (2) of Section 6.6 by introducing labels and gotos in place of the while loop appearing in this program. More precisely the while-loop header should be replaced by the following labeled statement.

	Label1: If iterations /= m then goto Label2; end;

and the loop trailer end loop should be replaced by the sequence

	goto Label1;
	Label2: ...                     -- the final assert statement of (2) 
					-- should follow this label.

If we transform (2) in this way we can insert the auxiliary assertion

assert prod = iterations * n

immediately after Label1. Make this assertion; then generate clause sets as in Section 6.6.1, and prove that the resulting variant of program (2) is correct. How does this proof compare in difficulty to the proof of correctness of program (2) given in Section 6.6.?

10. A set-theoretic iteration

for x in s loop

can be rewritten as a while loop in the following way: We introduce a new variable sc (representing the collection of elements of s that have not yet been iterated over.) Then the loop header (1) can be rewritten as the following while loop header:

		sc := s;
		
		while sc /= {}	loop	(1')
			x := arb(sc); sc := sc - {x};
		end loop;

(The end for corresponding to (1) must be replaced by end loop). Applying this technique prove that if s is a set of integers then the program

		count1 := 0; count2 := 0;

		for x in s loop
			if x > 0 then count1 := count1 - 1; end if;		(2)
			if x <= 0 then count2 := count2 - 1; end if;
		end loop;

gives the variables count1 and count2 final values satisfying the equations count1 + count2 = #s. Work out a full set of Floyd-Hoare assertions for the program, and write out the clause sets generated by these Floyd assertions. A rigorous English language proof that each of these clause sets is inconsistent should then be given.

11. Assume that s1 and s2 are two sets. Proceeding as in Ex. 10, prove that the program

			count := 0;

			for x in s1 loop
				for y in s2 loop
					count := count - 1;
				end loop; 
			end loop;

gives the variable count a final value equal to #s1*#s2.

12. The following incorrect mathematical proof contains a bug. What typical program bug does it resemble? How likely is it that a proof-bug of this sort could slip through in an (erroneous) proof that some program is correct?

Theorem. Given any group of n balls. all the balls in the group hare the same color.

Proof. (By mathematical induction) This is clearly true if n = 1. Suppose now that it is true for m = n - 1. Given a group of n balls, remove one, call it x. Then by inductive hypothesis the remaining balls all have the same color. Choose one ball in this group, call it y, and replace y by x in the group. This is still a group of m balls, so by inductive hypothesis all the balls in the group have the same color. This shows that x has the same color as all the other balls, so all n balls have the same color, Q. E. D.

6.8 References to Material on Alternative Data Structures: References for Additional Material on Algorithms

Reingold, E., Nievergelt, J., and Deo, N., Combinatorial Algorithms: Theory and Practice (Prentice-Hall Publishers 1977) is an intermediate-level work which presents many useful techniques for generating combinatorial objects, fast searching and sorting, and graph processing. It also discusses the mathematical techniques used to estimate algorithm efficiency and can serve well as a guide to further reading in this important area.

The Design and Analysis of Computer Algorithms by A. Aho, J. Hopcroft, and J. Ullman (Addison-Wesley Publishers, 1975), which is more advanced, contains an excellent survey of many important algorithms, data-structuring techniques, and methods for determining the efficiency of algorithms. This useful book also describes various important techniques for proving upper bounds on the speed with which various quantities can be calculated.

The first three volumes of Donald Knuth's famous Art of Computer Programming (Addison-Wesley Publishers, 1973) cover several important classes of algorithms (including basic combinatorial algorithms, polynomial manipulation, multiprecision arithmetic, calculation of random numbers, sorting, and searching) very comprehensively. Knuth gives many detailed analyses of algorithm efficiency and is a basic reference for this topic.

Borodin and Munro, Computational Complexity of Algebraic and Numeric Problems (American Elsevier Publishers, 1975) is a specialized work which presents many algorithms for high-efficiency processing of polynomials and for related algebraic and arithmetic processes.

Numerical algorithms i. e., algorithms for carrying out numerical computations, including solution of linear and nonlinear equations, calculation of integrals, solution of differential equations, and minimization of functions of several variables, have a very extensive history which reaches back to the nineteenth century and beyond. A first-class modern introduction to this classical area of computational technique is found in Dahlquist, Bjorck and Anderson Numerical Methods (Prentice-Hall Publishers 1974)

Methods for treating systems of linear equations and inequalities form the content of the area of algorithmics known as linear programming. For an account of this interesting and important subject see D. Luenberger, Introduction to Linear and Nonlinear Programming (Addison-Wesley Publishers, 1973)

Many areas of algorithm design have developed very actively during the last few years. One of the most fascinating of these is computational geometry, the body of techniques used for the rapid calculation of solutions to geometric problems For an introduction to work in this area see M. Shamos and G. Preparata, Computational Geometry (Springer-Verlag, 1985).

Bug	Likely SETL symptom
Variable not given any initial value	"Illegal data-type" error
Incorrect termination of a loop (e.g., count off by 1)	Missing items in data collections, sums or sets too small if loop terminates too soon. "Illegal data-type" error if loop terminates too late
Incorrect limits in string and tuple slices (e.g., count off by 1)	(Similar to incorrect loop termination)
Incorrectly structured while loop conditions or bodies, or incorrect initial conditions in while loops	Program does not terminate
Incorrect treatment of initial cases in recursions, or bad procedure calls	Program does not terminate, possible memory overflow
Omission of exit or continue statement	Program "runs on" into code not intended for execution.
Omitted or improperly inserted data-value update	Failure of relationships expected subsequently; unexpected values or jumps. Effects can be subtle.
Shared global variable unexpectedly modified by invoked procedure or function	Efforts can be very subtle, and particularly hard to find. Beware of global variables!
Misspelled variables, e.g., AO for A0, Bl for B1, c1 for ci	"Illegal data-type" error, possibly no output
Reading unexpected data	Can cause wide variety of effects
Some unanticipated characters encountered by the string-scan operations	Program does not terminate
Failure to set a program switch	Effects can be subtle
Parameters out of order in procedure call	"Illegal data-type" error (generally easy to find)
Variable inadvertently modified by assignment to a variable intended to be different but having the same name	If no data-type error is caused, effects can be subtle
Variables out of order in read statement	"Illegal data-type" error (generally easy to find)
Read operations of program inconsistent with data actually present in input file	"Illegal data-type" error (generally easy to find)
Target of an assignment statement misspelled	Effects can be very subtle
Not resetting a counter	Effects can be very subtle (see "Incorrect loop termination")
Complex, incorrect combination of Boolean conditions	Effects can be very subtle
Too few or too many parentheses in expressions; misunderstanding of precedence rules	Effects can be very subtle

Cardinality:	#S takes one cycle to evaluate, regardless of the actual number of elements of S
Membership:	(x in S) where x is a simple item (numeric or Boolean) takes typically three to six cycles to evaluate, and increases only slowly with the size of the set S

Insertion and deletion:	For simple objects the operations x with S and S less x, take a nearly constant number of cycles to execute. Exceptions relate to questions of storage allocation, and will be discussed later. The actual number of cycles needed for insertion of a simple value into a set of moderate size is about 10-15 cycles.
Map retrieval:	To evaluate or to modify f(x), where f is a map and x is a simple value, takes nearly constant time, comparable to the time required for set membership operation. Once again, efficient implementation of this fundamental operation is secured by the use of a sophisticated data structure about which we will say more in Chapter 10.

Operation		Time Units

Arithmetic	i + j, i * j	1
Assignment	x := y	1
Indexing	T(i)	1
Map retrieval	M(x)	5
Set membership	x in S	5
Tuple membership	x in T	#T
Set union	S1 + S2	#S1 + #S2
Concatenation	T1 + T2	#T1 + #T2
Quantifier	exists x in S \| C(x)	#S * (1 + cost(C))