CHAPTER 5

Procedures

A procedure in SETL is a sequence of computational steps which have been given a name and which, using one or more data items passed to it for processing, will compute and deliver a value. Most of the built-in SETL operators, for example max, which returns the maximum of two values x and y, and cos, which returns the cosine of a floating-point number x passed to it, are procedures in this sense. However, since no finite collection will ever exhaust the whole catalog of procedures that a programmer may want to use, it is important to have a way of defining, and then using, as many additional operations as are helpful.

5.1 Writing and Using Procedures

To make the preceding point more convincing, we can consider a simple example Suppose that the weights of individual eggs in batches coming from a chicken farm are measured daily, thus producing batches of measurements, each of which can be thought of as a set of numbers, e.g.,

			{2.7,2.85,1.90,...,1.86}					(1)

Suppose that in order to enforce some sort of quality control, various statistical properties are to be reported for each batch, including the weights of the three largest and the three smallest eggs in the batch.

To make this calculation easily, it would be convenient to use a pre-programmed procedure to which a set s like (1) can be passed, and which would then produce a tuple t

			[1 86,1.90,...,2.7,2.85]					(2)

such that all the members of s are arranged in increasing order. Since this procedure would simply sort the members of s, it can appropriately be called 'sort'. We would like to be able to produce the ordered tuple t from the set (1) simply by writing

			t := sort(s).					(3)

Note that if this can be done, then to print the three largest and three smallest measurements we have only to write

	print("three smallest measurements are:",t(1),t(2),t(3));
	print("three largest measurements are:",t(#t),t(#t-1), t(#-2));

Of course, sorting the set s is not hard and can be done by the simple method explained in Section 4.7.2, which is to say, using the code

				t := [ ];

				while s /= { } loop
					t with:= (x := min/s);  (4)
					s less:= x;
				end loop;

However, what we want is to package the code (4), giving it the name sort and invoking it by this name. By doing this we make it possible to get the effect of the code (4), without having to concern ourselves with its inner workings, simply by writing (3). To "package" bits of code in this way becomes absolutely essential when one is constructing large programs (say a few hundred lines or more). Such large programs can only be built successfully if they are organized hierarchically into a modular collection of sub-procedures. Typically such a collection will include both high-level functions which simply make use of facilities provided by lower-level functions, and low-level procedures, like the sort which we have been discussing, which encapsulate generally useful primitive operations. Like most other programming languages, SETL does provide a facility for defining as many new procedures as you need, and we now proceed to explain how this is done.

To package or encapsulate the code (4), all we need to do is to enclose it between procedure header and trailer lines and add a return statement. This gives procedure sort(s);

			procedure sort(s);      

				t := [ ];

				while s /= { } loop
					t with:= (x := min/s);
					s less:= x;
				end loop;  					(5)

				return t;

			end sort;

In (5) the procedure header line is

procedure sort(s);

This line, introduced by the special keyword procedure, opens the procedure (5), gives it a name (in this case, the name sort), and also names its formal parameters (sometimes simply called parameters), i.e., the names of values which will be passed to the procedure whenever it is used (as in (3)), and from which the procedure will calculate the value that it returns. (In (5), the value returned is t, and there is only one formal parameter, namely, s.) The concluding trailer line

					end sort; 			(5b)

marks the end of the procedure.

Finally the command

					return t; 				(5c)

appearing in (5) both indicates the point at which the procedure computation has finished calculating the value which it is to produce and defines the value that the procedure will return.

To call or invoke the procedure sort defined by (5), we have only to write sort(e), where e can be any set-valued expression (provided that the set members are all integers, or all real numbers, etc.). This automatically calculates and makes available the value returned by the procedure (5). For example, if we write

					print(sort({5,1,2,7,0}));             (5d)
the result will be

[0,1,2,5,7]

The expression e occurring in such an invocation sort(e) of the procedure sort is called the actual parameter, or supplied argument, of the invocation. Whenever evaluation of a procedure invocation like (5d) begins, the value of the actual parameter (or parameters) appearing in it is transmitted to the procedure invoked and becomes the initial value of the procedure's formal parameter (or parameters).

To examine the behavior of SETL function call more closely let us consider the following invocation of the procedure 'sort', and trace through the way it works.

					x := sort({5, 1, 2, 7,0});             (6)

As with all assignment statements, execution of (6) begins with evaluation of its right-hand side. Since sort is the name of a procedure, evaluation of the procedure call appearing on the right-hand side of the assignment (6) involves the following steps:

  1. The current of 'actual' parameter value {5,1, 2, 7, 0} that appears in the procedure invocation is assigned as the initial value of the formal parameter variable s appearing in the procedure header line of the procedure (5).

    Figure 5.1 Detour and Return in Function Invocations

  2. Execution of the procedure (5) begins: the statements appearing in the body of this procedure are executed in the ordinary way. However, when any formal parameter appears in the body of the procedure, the corresponding actual parameter value passed to the procedure is used.

  3. As soon as a return statement is encountered, control is passed back from the procedure (5) to the instruction immediately following the call (6). Just before this happens, the expression following the keyword return is evaluated and becomes the value that the procedure (5) yields (e.g., becomes the value of the variable x in (6)).

This "detour and return" action of function invocations is shown schematically in Figure 5.1.

The following analogy should help to clarify the important distinction between the formal parameters and the actual parameters of a procedure. The formal parameters of a procedure can be compared to the ingredient names in a cookbook recipe. For example, a recipe may say "break an egg into half a cup of flour and stir." The names egg and flour appearing in such a recipe are formal names which stand for all the actual eggs and actual half cups of flour that will be used when the recipe is actually followed. As in the case of a function, new actual items, i.e., a different egg and a different half cup of flour, must be supplied each time the recipe is used, even though the formal names egg and flour appearing in the recipe remain the same. Continuing this analogy, the text of the recipe can be compared to the body of a procedure, which will yield something (e.g., a cake) when actual ingredients matching the formal ingredient names to which it refers are supplied.

It is also instructive to consider an example involving two invocations of the sort routine, with two different parameters:

          	      x := sort(s1) + sort(s2);             (6b)

Suppose that when (6b) is executed s1 and s2 happen to have the values {3, 1, 0} and {-3, -1, 0} respectively. Then evaluation of sort(s1) will produce the value [0, 1, 3] and evaluation of sort(s2) will produce the value [-3, -1, 0], so that after (6b) is executed the variable x will have the value [O, 1, 3, -3, -1, 0].

The way this happens is as follows. As with all assignment statements, execution of (6b) begins with evaluation of its right-hand side, i.e., sort(s1) + sort(s2). This is an expression and is evaluated by first evaluating its two subexpressions sort(sl) and sort(s2) and then combining the two resulting values using the " + " operator.

The value of x in statement (6b) will be the same as the value of x resulting from the execution of

			temp1 := sort(s1);
			temp2 := sort(s2);				 (7)
			x := temp1 + temp2;

As you can see, (7) involves two successive invocations of sort, followed by a use of the " +" operator to combine the two results produced.

The following important rules govern the use of procedures.

  1. The formal parameters that appear in the procedure heading must be valid identifiers, that is to say, they must be variable names; furthermore no two formal parameters can have the same name. For example, both

    				procedure p1(s * t);  		(8a)

    				procedure p2(s, t, s);  	(8b)

    are illegal: (8a) because the parameter s*t is not a simple variable, and (8b) because the first and the third formal parameters of p2 are identical. On the other hand, any actual parameter of a function invocation can be an (arbitrarily complicated) expression, and actual parameters can be repeated. For example,

    				x := sort({x in ss | x > 0});           (9a)

    is legal if ss is a set (and if ss were {-10, 20, -20, 15, 10} would give x the value [10, 15, 20]). Similarly, if dot_prod(x,y) is a function which calculates and returns the dot-product of the two tuples x and y, then

    				a := dot_prod(u,u);    (9b)

    is legal (and will put the sum of the squared components of the tuple u into a).

  2. Each invocation of a procedure must have exactly as many actual parameters as the procedure has formal parameters. When a procedure is invoked, the value of its first (resp. second, third, etc.) actual parameter becomes the value of its first (resp. second, third, etc.) formal parameter. For example, if a procedure whose header line is

    procedure intermingle(a, b, c);

    is invoked by

    x := intermingle ({x in s | x > 0}, {y in s2 | y < 0}, {x in s | x > 0});

    then a and c initially get the value {x in s | x > 0}, and the value {y in s2 | y < 0} is transmitted to b.

  3. The body of a procedure can contain any number of return statements and often will contain more than one. The following code, which simply calculates and returns the maximum of two quantities, exemplifies this remark:

    		procedure my_very_own_max_function(x, y);
    
    			if x > y then
    				return x;
    			else
    				return y;
    			end if;
    
    		end my_very_own_max_function;
    

    If no return statement is encountered, execution of the procedure will terminate when and if its trailer line end proc_name is reached, and in this case the undefined value OM will be returned.

    Note: Other programming languages make the distinction between a function which returns a value, and a procedure, which does not. This distinction is not present in SETL: a procedure may or may not return a value.

    Note that the keyword return can be followed by an arbitrary expression. This expression may be complex; in fact, the whole body of the function may simply consist of a single return statement and nothing else, as in

    procedure positive_elements_in(s);
    						-- returns the set of positive elements of s
    	return {x in s | x > 0};
    
    end positive_elements_in;
    

    Figure 5.2 Patterns of Control Transfer in Multiple Function Calls.

  4. Procedures can invoke other procedures (including themselves) without restriction. When control is transferred to a procedure f which in turn invokes a function g, execution will proceed within the body of f until an invocation of g is encountered, at which point execution of f will be suspended and execution of g will begin. Thereafter, g will execute until a return statement is encountered within g, at which point g will terminate, sending control, and possibly a value, back to f. Subsequently, when a return statement is encountered in f, f will itself be terminated, sending control (and a value) back to the procedure from which f was invoked. This will lead to patterns of control transfer like that shown in Figure 5.2.

  5. Procedure invocations are themselves expressions and can be used freely as parts of more complex expressions. For example, if sort is a function which returns the elements of a set s in sorted order as a tuple, and sum_square is a function which returns the sum of the squares of the three first elements of a tuple, then we can write

    print(sum_square(sort(s)));

    to display the sum of the squares of the three smallest elements of s.

5.1.1 Some simple sorting procedures

To illustrate the use of procedures, we will now exhibit a variety of procedures for sorting a set or tuple of elements into order. One simple, well-known way of sorting is the so-called bubble-sort method, which, simply stated, operates as follows: as long as there are two adjacent elements that are out of order in the sequence, interchange them. This is not a very efficient sorting method (and in the form presented here it is even more inefficient than the standard bubble sort), but it is one of the simplest to state and program. The input to the procedure is a tuple, and the output is another tuple, whose elements are in increasing order. Note that the code that follows applies equally well to a tuple of integers, a tuple of floating-point numbers, or a tuple of strings: in all three cases the comparison operator " > " defines the desired ordering.

     
	procedure sort(t); -- sorts a tuple by the bubble-sort method
	
		while exists i in [1..#t - 1] | t(i) > t(i + 1) loop
			[t(i),t(i + 1)] := [t(i + 1),t(i)];
		end loop;

		return t;

	end sort;

(The attentive reader will notice that this procedure modifies its own parameter t and will wonder whether the value of the actual parameter will be modified when sort is invoked. In fact, the value of the actual parameter will not really be affected outside sort; but the rule guaranteeing this will only be stated in Section 5.5. This same remark also applies to several of the procedures presented later in this section.)

As we mentioned, the procedure just shown can be used to sort any tuple of integers, of reals, or of strings. For example, if we write

print(sort(["Joe", "Ralph", "Albert", "Cynthia", "Robert", "Alfredo"]));

the result will be

["Albert", "Alfredo", "Cynthia", "Joe", "Ralph", "Robert"]

More complex sorting routines than that shown are often needed. One reason for this is that sorting is often used to arrange more complex "records" into an order determined by some common "subfield" of the records. In SETL, such records are typically represented as tuples. Suppose, for example, that a group of students have taken a course in which their grades on a series of homework exercises and examinations have been collected, producing a tuple of tuples having the following form:

	records := [["Gonzalez, Aldo", 80, 87, OM, 73, 90,..],
		    ["Woburn, Linda", 82, 89, 85, 91, 90, 65,..],
		    ["Luciano, Luigi", 80, 81, 75, 79, OM, 70,..],...]

Grades are assumed to be represented by integers, and missed exercises or examinations by occurrences of OM. One might then want to arrange these records in various orders, e.g.,

  1. Alphabetic order of student names
  2. Order of grade averages, with largest first
  3. Order of grades on midterm examination, largest first
  4. Order of number of exercises not handed in, largest first, etc.

To make it easy to sort these records according to any of their fields, we modify our original sorting procedure, so that it takes two arguments:

  1. The tuple of records to be sorted.
  2. The record component by which the records must be sorted.

This leads to the following procedure (which, however, does not treat OM components correctly: see the following discussion).

		procedure sort1(t, pos);

		-- t is a tuple of records (tuples) to be sorted.
		-- pos is the index of the component in each record, along which
		-- the records are to be sorted in increasing order.

			while exists i in [1..#t-1] | t(i)(pos) > t(i + 1)(pos) loop
				[t(i),t(i + 1)] := [t(i + 1),t(i)];
			end loop;

			return t;

		end sort1;

Using this function, we can print the class records in alphabetical order simply by writing

			for x in sort1(records,1) loop
				print(x);
			end loop;

Suppose now that we want to list these records in order of decreasing midterm grades, with students who have missed the midterm coming last. If the midterm is the 11th entry in the record, we may be tempted to sort the records (into increasing order) according to that component and then list them in reverse. The attentive reader will notice that sort1 as written will not work in the presence of missing grades: recall the convention that a missed test is marked as OM in the record. The comparison (OM > x) where x is a non-OM value is not meaningful, and in fact the-SETL system will stop any program at the point at which such a comparison is attempted. As a necessary modification to our sorting procedure, we therefore replace the comparison that drives the while loop, so that a value of OM is regarded as smaller than any existing grade. Using the "is undefined" (i.e. questionmark) operator, we simply replace t(i)(pos) by t(i)(pos)?(-1). The improved sorting routine then reads

		procedure sort2(t,pos);

		-- T is a tuple of records, some of whose components may be OM.
		-- pos is the index of the record component along which the records
		-- are to be sorted in increasing order.

			while exists i in [1..#t - 1] | t(i)(pos)?(-1) > t(i + 1)(pos)?(-1) loop
				[t(i), t(i + 1)] := [t(i + 1), t(i)];
			end loop;

			return t;

		end sort2;

With this modification, we can print the desired ordering of records by midterm grades using the following code (recall that a student's name is the first component of his/her record, the midterm grade is the 11th component of the record, and this grade may be undefined):

		ordered := sort2(records, 11);
		for i in [#ordered,#ordered-1..1] loop
			print(ordered(i)(1)," ",ordered(i)(11) ? "**absent**");
		end loop;

5.1.1.1 The main block of a program

A program that makes use of procedures ordinarily includes commands that invoke these procedures; otherwise the procedures might as well not be there. As we have explained, the first function invoked can in turn invoke any or all of the other functions, but at least one instruction not belownging to any procedure is needed to trigger this first invocation. In a program including one or more procedures, the "directly executed" portion of the program, i.e., everything not included in any procedure, is called the main block of the program, or the main program for short. This block of instructions has exactly the form of a program body, as described in Chapters 2 and 3, and it must precede all procedures. The main program and all the procedures which follow it must be prefixed by a program header line of the usual form, and a corresponding trailer line starting with the keyword end must follow the last procedure.

For example, a complete program consisting of the sort function shown previously and the two fragments of code which invoke it would have the following overall structure:

program print_grade_info;	-- program to print student grade records

	input_handle := open("student_record_file","TEXT-IN");
			-- file operations are described  later in this chapter
	reada(input_handle,records);		-- acquire the basic data

	print("Student records in alphabetical order");
	print("--------------------------------------------\n");

	for x in sort (records,1) loop
		print(x);
	end loop;

	print("Students and mid-term grades, in decreasing grade order");
	print("----------------------------------------------------------------\n");

	ordered := sort(records,11);

	for i in [#ordered,#ordered-1..1] loop
		print(ordered(i)(1)," ",ordered(i)(11)?"**absent**");
	end loop;
	
	procedure sort(t, pos);
	-- t is a tuple of records. pos is the position of the record component
	-- according to which the records are to be sorted in increasing order.

		while exists i in [1..#t - 1] | t(i)(pos)?(-1) > t(i + 1)(pos)?(-1) loop
			[t(i),t(i + 1)] := [t(i + 1),t(i)];
		end loop;

		return t;

	end sort;

end print_grade_info;

Execution of such a program begins at the first statement of its main program block and ends as soon as the last statement of its main program block has been executed (or when a stop statement is encountered; see Section 4.5).

5.1.2 A character-conversion procedure

As a next example, we define a procedure that takes a string and returns a similar string in which all lowercase alphabetic characters have been changed into the corresponding uppercase characters. Blanks and punctuation marks are not affected.

procedure capitalize(s); -- capitalizes the string s and returns
			-- the result. Nonalphabetic characters are left alone

	small_letters := "abcdefghijklmnopqrstuvwxyz";
	big_letters := "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
	capital_of := {[let, big_letters(i)]: let = small_letters(i)};

			-- maps each small letter into the corresponding capital.

	return + /[capital_of(let)?let: let = s(i)];

		-- Note that the map capital_of is defined over alphabetic characters
		-- only. Nonalphabetic characters, such as punctuation marks, are not
		-- converted, but left as they are. This is the purpose of the "? let"
		-- expression.

end capitalize;

A procedure can have any number of parameters, even no parameters. For example, suppose that we want to use a procedure which reads an input string, uses the capitalize procedure to capitalize this input, and returns the capitalized result. This function can be written as follows:

	procedure read_a_line; 	-- procedure to read and capitalize a line
		file_handle := open("test_file","TEXT-IN");
		reada(file_handle,x);    		-- read a quoted string
		close(file_handle);

		return if x = OM then OM else capitalize(x) end if;

	end read_a_line;

To invoke a parameterless procedure of this kind, one must write its name, followed by an empty parameter list. For example, to invoke the next_line procedure and print the capitalized string that it returns, we would write

print(read_a_line( ));

We emphasize that the empty parameter list, i.e. the "( )" following the name of the parameterless procedure next_line, is obligatory.

5.1.3 A simple package of procedures for manipulating polynomials

As a further illustration of the use of procedures, we give a set of procedures for adding, subtracting, multiplying, and dividing polynomials in a single variable with real coefficients. Such polynomials are ordinarily printed in a standard algebraic form like

3.1 * x2 + 7.7 * x + 4.5.

In the procedures that follow we will assume that a polynomial is represented internally by a SETL map which sends the exponent of each term of the polynomial into the coefficient of that term. For example, the polynomial shown previously would be represented internally by the map

{[2,3.1],[1,7.7],[0,4.5]}.

As in algebra, we simply omit terms whose coefficients are zero.

Developing a package of procedures for manipulating polynomials represented in this way is easy.

To add (resp. subtract) two polynomials, we simply add (resp. subtract) the coefficients of corresponding terms. So the addition of two polynomials can proceed as follows:

procedure sum(p1,p2);			-- computes the sum of two polynomials

	result := { };

	for c = p1(e) loop			-- iterate over terms of first polynomial
		if p2(e) /= OM then		-- second polynomial has matching term

			cr := c + p2(e);	 -- coefficient of result
			if cr /= 0.0 then	-- term is present
				result(e) := cr;
			end if;

		else

			result(e) := c;

		end if;

	end loop;

	for c = p2(e) | p1(e) = OM loop  -- add terms in second polynomial that are
					   -- not present in first
		result(e) := c;
	end loop;

	return result;

end sum;

Note that the result of the second loop can be replaced by the following more compact expression.

{[e,c]: c = p2(e) | p1(e) = OM}

We can also abbreviate the first loop by using the "?" operator and obtain the following compact procedure:

	procedure sum(p1, p2); -- forms the sum of two polynomials

		return {[e,c]: c1 = p1(e) | (c := c1 + (p2(e)?0.0)) /= 0.0}
					+ {[e,c2]: c2 = p2(e) | p1(e) = OM};
	end sum;

Adapting this we can easily write a procedure for polynomial difference:

	procedure diff(p1 , p2); -- forms the difference of two polynomials
		return {[e,c]: c1 = p1(e) | (c := c1 - (p2(e)?0.0)) /= 0.0}
				+ {[e,-c2]: c2 = p2(e) | p1(e) = OM};
	end diff;

To multiply two polynomials, we multiply and sum all pairs of their individual terms. Finally, we eliminate terms which turn out to have zero coefficients. This is simply

                procedure prod(p1, p2); -- forms the product of two polynomials

                        p := {};
                        for c1 = p1(e1), c2 = p2(e2) loop p(e1 + e2) := p(e1 + e2)?0.0 + c1 * c2; end loop;
                        return {[e,c]: c = p(e) | c /= 0.0};

                end prod;

Next, we show how to divide a polynomial p1 by a polynomial p2. Let c1xj1 be the leading term of p1, i.e., the term having largest exponent, and let c1xj2 be the leading term of p2. Then we subtract (c1/c2)xj1-j2) times p2 from p1, to eliminate the leading term of p1, and so on repeatedly until all terms of p1 with exponents larger than j2 have been eliminated. The collection of all terms by which p2 is multiplied constitutes the terms of the quotient.

procedure div(p1,p2);		-- forms the quotient polynomial p1/p2

	if p2 = { } then return OM; end if; -- this is the case p2 = 0.
	
	e1 := max/[e: c = p1(e)];     -- largest exponent of p1
	e2 := max/[e: c = p2(e)];     -- largest exponent of p2
	qcoeff := { };		      -- start with an empty quotient

	for j in [e1 - e2, e1 - e2 - 1..0] | p1(e2 + j) /= 0.0 	loop
		qcoeff(j) := p1(e2 + j) / p2(e2);
		p1 := diff(p1,{[e + j,qcoeff(j) * c]: c = p2(e)});
	end loop;
	
	return qcoeff;            -- return the map representing the quotient.

end div;

We note that techniques for manipulating polynomials by computer have been studied very intensively, and that muchh more efficient methods than those used in these simple illustrative procedures are known. See Knuth, The Art of Computer Programming, Vol. 2, for an account of these developments, which go beyond the scope of the present book.

5.2 Name Scopes; Local and Global Variable Names: The var Declaration

In writing a long program, which can involve hundreds of procedures, it is irritating, as well as highly error-inducing, to have to remember which variables had been used for which purposes through the whole of a long text. To see this, consider a function invocation imbedded in a while loop like

				i:= O; j:= O;
				while (i + j) < f(j) loop. . .

and suppose that f is an invocation of a function whose body is found somewhere else in a long program text. It is entirely plausible that, unknown to the author of the code (1), the body of the function f should make use of the convenient variable name 'i', e.g., in a loop like

			forall i in [1..#t] |...              (2)

But then, if the i appearing in (1) and the i appearing in (2) were regarded as representing the same variable, the function invocation f(j) which occurs in the while loop could change the value of i in ways not at all hinted at by the outward form of the code (1). Were this the case, a programmer wishing to write a loop like (1) would first have to examine the body of the function f, to avoid variable name conflict. This would introduce many highly undesirable interactions between widely separated parts of a lengthy program and make large programs harder to write.

To avoid these very undesirable effects, most programming languages make use of rules which restrict the scope of names. The SETL scope rule is as follows. In the absence of explicit declarations, variables retain their meaning only within a single procedure (or main program). This implies that ordinarily a variable i appearing in one procedure and a variable i appearing in another procedure are treated as distinct. In effect, the SETL compiler (invisibly) applies the following renaming procedure to the program text which it processes:

  1. The main program which begins the program text is numbered zero, and the procedures which follow this main program are numbered 1, 2, .. in their order of occurrence.

  2. Every variable name xxx used in the n-th procedure, including the names of its formal parameters, is implicitly changed to xxx_n.

As an example, consider the program

   program example;

	x := {3,0,1,2};
	print(squares(sort({i in x | i > 0})));

	procedure sort(s); -- sorts by selection
	t := [ ];
	while s /= { } loop
		t with := (x := min/ s);
		s less := x;
	end loop;

	return t;

	end sort;
 
	procedure squares(x); -- forms and returns the tuple of squares of the
					 -- components of the tuple x
	return [e * e: e = x(i)];

	end squares;

   end example;

Given this program as input, the SETL compiler will implicitly apply the renaming rules (a), (b), and therefore it will really see the following renamed variant:

program example;

	x_O := {3,0,1,2};		-- main program
	print(squares(sort({i_O in x_O | i_O > 0})));

	procedure sort(s_1);			-- procedure number 1
	t_1 := [ ];

	while s_1 /= { } loop
		t_1 with := (x_1 := min/s_1);
		s_1 less := x_1;
	end loop;

	return t_1;

	end sort;

	procedure squares(x_2);                 -- procedure number 2
		return [e_2 * e_2: e_2 = x_2(i_2)];

	end squares;

end example;

As stated previously, rule (b) serves to isolate variables having the same name from each other if they are used in different procedures. Variables used in this way are said to be local to the procedures in which they appear.

In some cases, however, we do want a variable used in several procedures to refer to the same object. For example, one or more "major" data objects may be used by all the functions in a related group of functions. To see this, consider the case of a group of functions written as part of an inquiry system to be used by the executives of a bank. This might involve many functions, for example,

procedure payments(customer name);	-- returns a given customer's payment
					 -- record

procedure tel_no(customer_name);	-- returns a given customer's telephone
					 -- number

procedure overdue(ndays);		-- returns set of a customers whose
					 -- payments are more than ndays
					 -- overdue

...etc.

All these procedures will have to make use of one or more "master files". (When represented in SETL, these "files" are likely to be sets of tuples representing records, maps sending customer names, or perhaps customer identifiers such as social security or account numbers, into associated records, etc.) Instead of insisting that these master files be passed as parameters to all the procedures that need to use them, it is more reasonable to make them available directly to every procedure, giving them easily recognizable variable names such as master_customer_file. To make this possible, SETL provides a special form of statement, called the var declaration. By writing

var master_customer_file;

at the start of the overall program in which the listed functions appear, we make master customer_file a global variable which designates the same object in all the procedures which reference this variable. The required layout of a program using one or more global variables is shown in the following example:

program banking_system;	-- header line for overall program
	var master_customer_file;	-- declaration of global variable
	-- (additional global variable declarations come here)
	-- (body of "main" program of banking_system comes here)

	procedure payments (customer_name); -- first procedure
	...
	end payments;

	procedure tel_no (customer_name); -- second procedure
	...
	end tel_no;
	
	procedure overdue (n_days);  -- third procedure
	...
	end overdue;

	-- (more procedures can come here)

end banking_system;

The statement

var master_customer_file;

appearing first in this example is called a declaration rather than an executable statement because it serves to establish the meaning of certain names rather than to trigger any particular calculation.

The simplest form of a var declaration is

var x1, x2, . . . , xn;

i.e., it consists of the keyword var followed by a comma-separated list of distinct variable identifiers.

Such declaration can appear in one of several positions:

  1. In a program, before the program's first executable statement. Variable identifiers appearing in such a declaration are defined to be global variables directly accessible to each following procedure in the program. A var declaration appearing in this position is called a global var declaration.

  2. In a procedure within a program, before the procedure's first executable statement. A var declaration appearing in this position is called a local var declaration. Variable identifiers appearing in such declarations are defined to be local variables accessible only within the procedure (and in sub-procedures nested within it). Since variable names not appearing in any var declaration are in any case local to the procedures in which they appear, var declarations appearing in this position often serve only to document the way in which a procedure uses its variables. However, if the procedure is recursive (see Section 5.4), var declarations appearing in it have a more significant effect, which will be described more fully later.

  3. In a package or package body. We postpone discussion of var declarations in packages, which are tools for organizing large SETL programs, to Chapter 7.

  4. At the start of a class or class body. We postpone discussion of var declarations in classes and class bodies to Chapter 8, where this important aspect of SETL is discussed.

Any number of var declarations may appear either at the start of a program or within a procedure, but all such declarations must precede the first executable statement of the program or procedure in which they appear. No variable should appear twice in var declarations (either global var declarations or declarations within a single procedure), nor is it legal for any procedure parameter name to appear in a global var declaration.

A global variable retains its value between invocations of the procedures that use it.

To sum up, there are two ways in which values can be communicated between separate procedures:

  1. By being passed as parameters or returned by procedures.

  2. By direct global communication, i.e., by being the values of variables which have been declared to be global and hence are accessible to more than one procedure.

Method (ii) is powerful, but potentially undisciplined, since it allows procedures to influence each other in ways that their invocations hide. It is therefore good programming practice to avoid using more than a very few declared global variables. Generally speaking, variables should be made global only if

  1. They represent major data objects accessed by more than one of a program's procedures, and their usage is subject to clearly understood rules of style which pervade the entire program.

  2. They represent flags or other conditions that many procedures need to test (e.g., to determine whether particular debugging traces should be produced), but that play no role in the normal functioning of these procedures and are rarely modified.

  3. They need to be shared between procedures that do not call each other and must be kept alive between successive invocations of these procedures.

  4. They represent constants, too complex to be set up conveniently using a const declaration (see the following section), which need to be used whenever a procedure is invoked.

  5. They need to be accessible to all logical copies of a recursive procedure (see Section 5.4).

The capitalize function appearing in Section 5.1 can be used to illustrate point (d). As written, this forms the map

capital_of := {["a", "A"], ["b", "B"], ["c", "C"],..., ["z", "Z"]}

each time it is invoked. To do this is of course wasteful of computer time. Using the const declaration described in the following section we would instead declare capital_of to be a constant having this value, but this requires writing out all the elements of capital_of explicitly, a nuisance since this involves typing 104 apostrophes, 51 commas, 52 brackets, etc. It is more convenient to declare

var capital_of;

and then to add the instructions

		small_letters :="abcdefghijklmnopqrstuvwxyz";
		big_letters :="ABCDEFGHIJKLMNOPQRSTUVWXYZ";
		capital_of := {[c,big_letters(i)]: 1 = small_letters(c)};

as part of a main program block before the first use of capitalize. The capitalize function then reduces to the following simple form:

	procedure capitalize(s);
		return +/ [capital_of(let)?let: let = s(i)];
	end capitalize;

5.2.1 var Declarations with Initialization Clauses; the const Declaration

One can attach an initialization clause to any one of the variable names appearing in a variable declaration like

var x1, x2, . . . , xn;
Such clauses have the same syntactic forms as the right-hand sides of assignment statements. They serve to define initial values for the variables listed, which would otherwise be OM. An example is

var x, y := 1, z := "abcd", w := {[x,x]: x in z};

It is often convenient to use a symbolic name for a constant appearing repeatedly in a program. Among other things, naming a constant and using its name rather than its explicit representation make it muchh easier to modify your program if modification subsequently becomes necessary. To define constants, one or more const declarations are used. These have muchh the same form as var declarations with initialization clauses, except that an initialization clause is required for each name listed in a const declaration. An example is

const x := "This must have an initialization", y := 1, z := "abcd", w := {[x,x]: x in z};

const declarations have muchh the same semantics as var declarations with initialization clauses, but the name values that they declare cannot be changed since no subsequent assignment to a name declared constant is allowed.

This example illustrates the following rules:

  1. Each const name must be a valid SETL identifier. By virtue of its appearance in const declaration, this identifier becomes a constant identifier, i.e., a synonym for the constant denotation, const_expn, that follows it in the const declaration. It retains this meaning throughout the scope of the identifier.

  2. Each const_expnj appearing to the right of an equals sign in a declaration like (1) must be a valid constant expression. Such expressions are built out of the following:

    1. Elementary constant denotations, each of which designates an integer, a real number, or a quoted string.

    2. Constant identifiers, i.e., identifiers of constants introduced by earlier const declarations.

      For example, it is possible to write

      		const one := 1
      		two := 2,
      		one_and_two := {one, two};
      

      This is equivalent to

      		const one1 := 1, two := 2;
      		const one_and_two := {1,2};
      

    3. Compound constant denotations can also appear in const declarations. Examples are

      		const complex_thing := [{"A",1},{"B",2},{{ }}];
      		const let_1 := "alpha",
      		let_2 := "beta",
      		let_map := {["A",let_1],["B",let_2]};
      
      
      		const two_pi := 2.0 * 3.14159;
      
      		const sixty_blanks := 60 * " ";
      

      As the preceding examples show, fairly general expressions can appear in initialization clauses, though of course they must all be evaluable in terms of constants that have appeared in earlier var and const declarations. The best way of determining whether a particular initialization clause is legal is simply to try it out. If it does not generate a syntax error it will work correctly.

5.3 Programming Examples

5.3.1 The buckets and well problem: a simple artificial intelligence example

The following kind of problem, often called the "buckets and well" puzzle, commonly appears on IQ tests. Suppose that one is given several buckets of various sizes, and that a well full of water is available. To focus on a simple specific case, suppose that just two buckets, a 3-quart bucket and a 5-quart bucket, are given. We are required to use them to measure out exactly 4 quarts of water. Since exactly this amount of water is to be measured out, no nonprecise operation is allowed. This means that only three kinds of operation can be used in a solution of this problem:

  1. Any bucket can be filled brim-full from the well;

  2. Any bucket can be emptied completely;

  3. Any bucket can be poured into any other, until either the first bucket becomes completely empty or the second bucket becomes brim-full.

As an example, the following is a way of measuring out exactly 4 quarts using only a 3- and a 5-quart bucket.

  1. Fill the 5-quart bucket.

  2. Pour the 5-quart bucket into the 3-quart bucket (leaving 2 quarts in the 5-quart bucket).

  3. Empty the 3-quart bucket. Pour the contents of the 5-quart bucket into the 3-quart bucket. (Now 2 quarts is in the 3-quart bucket, and the 5-quart bucket is empty).

  4. Fill the 5-quart bucket.

  5. Pour the 5-quart bucket into the 3-quart bucket, until the 3-quart bucket becomes full. (This leaves exactly 4 quarts in the 5-quart bucket.)

  6. Empty the 3-quart bucket. (Now exactly 4 quarts has been measured out.)

The fact that it is easy to program a computer to solve problems of this kind might be considered surprising, since such solutions are often considered to require intelligence. Nevertheless a systematic approach is not hard to find. The key idea is that of state. Specifically, as one moves through the steps of any solution to this kind of problem, the objects being manipulated (in this case, the buckets) will at any moment be in some particular condition. In the case we consider, this condition or state is determined by the amount of water in each of the buckets. We can represent this state as a tuple, of as many components as there are buckets. Initially, when both buckets are empty, the state is [0,0]. The target state for the example considered is that in which exactly 4 quarts has been measured into the 5-quart bucket; this is represented by the tuple [0, 4]. The state in which both buckets are completely full is [3, 5], that in which the 3-quart bucket is full and the 5-quart bucket is empty is [3, 0], etc. In this representation, the problem solution given by (i-vii) would be represented as the following sequence of states:

[0, 0], [0, 5], [3, 2], [0, 2], [2, 0], [2, 5], [3, 4], [0, 4]

This way of looking at the problem makes it plain that what we need to consider is the set of all possible states, and the manner in which new states can be reached from old. Suppose that the tuple state represents the amount of water currently in the buckets, so that state(i) is the amount of water in the i-th bucket, and that the tuple 'size' represents the sizes of all the given buckets, so that size (i) is the capacity of the i-th bucket. In the buckets and well problem, only the three manipulations (a), (b), and (c) are allowed. If bucket i is poured into bucket j until either i becomes empty or j becomes full, then the amount poured will be

state(i) min (size(j) - state(j)).

Hence the following procedure returns the collection of all states than can be reached in a single step from an initially given state:

	procedure new_states_from(state);

		return {empty(state,j): j in [1..#state]}
			+ {fill(state,j): j in [1..#state]}
			+ {pour(state,i,j): i in [1..#state], j in [1..#state] | (i /= j)};
	end new_states_from;

	procedure empty(state,j);		-- empties bucket j
		state(j) := 0;
		return state;
	end empty;

	procedure fill(state,j);			-- fills bucket j
		state(j) := size(j);	-- the "size" tuple is assumed to be global
		return state;
	end fill;
	
	procedure pour(state,i,j);               -- pour bucket i into bucket j

		amount := state(i) min (size(j) - state(j)); -- amount that can be poured
		state(i) -:= amount;			-- out of i and into j
		state(j) +:= amount;
		return state;

	end pour;

We can now solve our problem by a systematic process of state exploration. We start in the initial, all buckets empty, state to generate all the states that can be reached in one step from this starting state. Then we generate all states that can be reached in one step from these second-level states, etc. States that have been encountered previously are ignored; the ones that remain are precisely those which can be reached from the start in two steps but no fewer. From these, we generate all states which can be generated in three steps but no fewer, and so forth. As we go along, we check to see whether the target state has yet been reached. Eventually, we either reach the target state, thereby solving our problem, or find that no new states can be generated, even though the target state has not been reached. In this latter case, the problem clearly has no solution.

Figure 5.3 illustrates the notion of state search and shows some of the states that come up during search for a solution of our two-bucket example:

Figure 5.3 States of a Two-Bucket Problem; Bucket Sizes are [3,5]

Note that in this figure we only show transitions which lead to states that have not been seen before. Other transitions are redundant, since the shortest path from start state to the target state will never pass through the same state twice.

To be sure that we can reconstruct the path from start to target once the target has been reached, we proceed as follows. Whenever a new state ns is seen for the first time it will have been generated from some immediately preceding old state os. As states are generated, we keep a map reached_from which maps each new state ns into the old state os from which ns has been reached. Once the target state has been reached, we can use this map to chain back from the target to the start state. Then the desired solution is simply the reverse of the sequence thereby generated.

The following code implements this state-generation and backchaining procedure. It is deliberately written in a manner that hides all information concerning the structure of states, as well as all details concerning the way in which new states arise from old. This makes it possible to use the same routine to solve many different kinds of state-exploration problems.

procedure find_path(start,target);	-- general state-exploration procedure.

	reached_from := {[start,start]};	-- the start state is considered
						-- to have been reached from itself

	just_seen := {start};
		-- initially, only the start state has been seen
	
	got_it := false; 	-- we don't have the solution yet
	
	while just_seen /= { } loop	-- while there exist newly seen states
				
		brand_new := { };	-- look for states that have not been seen before

		for old_state in just_seen, 
			new_state in new_states_from(old_state)
    				| reached_from(new_state) = OM loop

			brand_new with:= new_state;	-- record a brand_new state
			reached_from(new_state) := old_state;	-- and record its origin

			if new_state = target then 
				got_it := true; 
				exit;	-- since problem has been solved
			end if;
		
		end loop;

		if got_it then exit; end if;	-- since problem has been solved
		just_seen := brand_new;		-- now the brand-new states
						-- define those which have just been seen
	end loop;

	if not got_it then return OM; end if;
					-- since all states have been explored, and the target
					-- has not been found, we know that no solution exists.

	-- at this point the target has been found, so we chain back from the target
	-- to reconstruct the path from start to target

	rev_path := [target];            -- initialize the path to be built

	while (last_state := rev_path(#rev_path)) /= start loop
		rev_path with:= reached_from(last_state);	-- chain backwards to the start
	end loop;

	return [rev_path(j): j in [#rev_path, #rev_path - 1..1]];	-- reverse the path

end find_path;

The following main program can be used to acquire a problem specification interactively and to invoke the find_path routine to solve it. Again we hide all problem-specific information in appropriate procedures.

program buckets;     -- Hann Xin divides wine

	var size;        -- global variable for storing problem specification
	
	prob_specs := get_prob_specs( );
	[start, target, size] := prob_specs;

	if (path := find_path(start,target)) = OM then
		print("This problem is definitely unsolvable");
	else
		print("The following sequence of states constitutes a solution:");
		for x in path loop print(x); end loop;
	end if;

	procedure new_states_from(state);

		return {empty(state,j): j in [1.. #state]}
			+ {fill(state,j): j in [1.. #state]}
			+ {pour(state,i,j): i in [1..#state], j in [1..#state] | (i /= j)};
	end new_states_from;

	procedure empty (state,j);		-- empties bucket j
		state(j) := 0;
		return state;
	end empty;

	procedure fill(state,j);			-- fills bucket j
		state(j) := size(j);	-- the "size" tuple is assumed to be global
		return state;
	end fill;
	
	procedure pour(state,i,j);               -- pour bucket i into bucket j

		amount := state(i) min (size(j) - state(j)); -- amount that can be poured
		state(i) - := amount;			-- out of i and into j
		state(j) + := amount;
		return state;

	end pour;

	procedure find_path(start, target);	-- general state-exploration
					-- procedure. text is on previous page
	...
	end find_path;

	procedure get_prob_specs;	-- acquires and returns specifications of problem
		-- this can be replaced  by a procedure that acquires problem specifications interactively,
		-- as explained in Chapter 10
		start := [0,0,0]; target := [1,1,0]; size := [2,4,7]; 
		return [start, target, size];

	end get_prob_specs;

end buckets;

Since the notion of problem state used in the foregoing is general and since we have written the find_path procedure and the main program block shown in a manner which insulates them from the details of the problems that they solve, we can use these procedures to handle any path-finding problem of the same general class as the buckets and well problem. Another amusing problem of this kind is the goat, wolf, and cabbage puzzle. In this puzzle, a man, who brings with him a goat, a wolf, and a cabbage, comes to a river which he must cross in a boat just large enough for himself and one but not two of the objects 'goat', 'wolf', and 'cabbage'. He can never leave the goat and wolf, or the cabbage and goat, alone together, since in the first case the wolf would eat the goat and in the second the goat would eat the cabbage. How is he to cross the river?

To develop a program to solve this puzzle, we have only to rewrite the new_states_from procedure and the parameterless get_prob_specs procedure. First, we need to decide on a representation of the states of the puzzle. We can designate the four objects appearing in the puzzle by their initials as "G", "W", "C", and "M" (man), respectively, and represent each state of the puzzle by a pair [l, r], where l designates the set of all objects remaining to the left of the river, and r designates the set of all objects that have been moved across the river. For example,

[{G, M},{W, C}]

represents the state in which the wolf and the cabbage have been moved across, and the man has returned to the left side of the river to get the goat. The start state is then

[{G, W, C, M},{ }]
and the target state is

[{ },{G, W, C, M}]

The new_states_from procedure appropriate for this problem can be represented as follows:

procedure new_states_from(state);

	[l, r] := state; -- "unpack" state into its "left" and "right" portions

	return if "M" in l then		-- the man is on the left
		{[l - {"M",x}, r + {"M",x}]: x in l | x /= "M" and is_legal(l - {"M",x})}
			+ if is_legal(l - {"M"}) then {[l - {"M"}, r + {"M"}]} else { } end if
						-- and can go right alone, or with one object
	else						-- the man is on the right
		{[l + {"M", x}, r - {"M",x}]: x in r | x /= "M" and is_legal(r - {"M",x})}
		+ if is_legal(r - {"M"}) then {[l + {"M"}, r - {"M"}]} else { } end if	-- and can go left alone, or with one object
	end if;
	
end new_states_from;

procedure is_legal(s);

	-- verify that goat and cabbage or goat and wolf, are not alone on the same side
	return "M" in s or not ({"G","C"} subset s or {"G","W"} subset s);

end is_legal;
Run this program and you will see how the puzzle can be solved.

5.3.2 More about Artificial Intelligence

Path-finding programs like those described in the preceding paragraphs have always been of interest to artificial intelligence researchers. Artificial Intelligence can be defined as the attempt to imbue computers with human-like capabilities. The workings of the human mind, although still profoundly mysterious, can be described as follows. Various extremely sophisticated perceptual systems, which operate far beloww the level accessible to consciousness, capture and decode events in the external world and pass their conclusions to consciousness. These conclusions appear as a never-ending stream of perceptions which tell us what we think is in the world, but reveal little about the way in which they arise. This is true both for visual and for sound perception, including perception of speech: even at a noisy party we are able to pick words and sentences out of the incoming flow of sound, without knowing how we filter out distractions or locate word or syllable boundaries. These perceptual mechanisms, which use muchh of the brain's active surface, constantly maintain a model of our environment, of patterns of motion in this environment, of our position in it, and even of such fine details as the faces of other persons present, with clues to their emotional reactions and likely actions. Failure of any one of the many perceptual mechanisms involved can lead directly to devastating diagnosis: inability to recognize words, or faces, or our own bodies, or objects at all; to know that we see; to know that we are blind.

Another major group of mental mechanisms, equally unconscious, monitor and coordinate the smooth motion of our bodies through space and in gravity. Failures here can have equally devastating effects, e.g. uncontrollable trembling or stuttering, uncorrectable by any conscious attempt to make one's hand or tongue pick up the desired cup or speak the few desired words. So small a cause as bad signals from the inner ear's tiny balance sensor can in a moment leave one barely able to crawl nauseated along the floor, as the world seems to whirl violently around one.

Attempts to give computers perceptual capabilities that can compete with those of humans belowng to specialized branches of artificial intelligence: Computer Vision, Computer Analysis of Speech, Written Language Analysis, Robotics. Alongside of these, another branch of the subject concerns itself with the duplication of more abstract mental capabilities: the ability to plan, reason, solve puzzles, prove mathematical theorems, program. This is also quite difficult, since reason itself is doubtless guided by its own essential stream of unconscious perceptions, which give human reasoning a sense of fitness, analogy, and direction that computers lack. The overall consequence is that humans can learn by assembling related fragments of information into useful wholes. Computers, still lacking this ability, must still be programmed. For a person, the Encyclopedia Britannica, or the Library of Congress, is a treasure-trove of usable information. For a computer, as for a squid, it is simply a mystery, even though the computer is far better than the squid at storing, alphabetizing, and to some extent categorizing this information, all without being able to use it. But the squid's visual abilities are far more advanced.

The central importance of the human ability to integrate interests artificial intelligence researchers in all means of generating structured wholes from initially unordered heaps of information. The path-finding routines described in the preceding section do this, and so have been muchh studied. How far can path-finding approaches be pushed? Though encouragingly general, they are ultimately overwhelmed by the size of the state spaces that they may need to search.

A small buckets-and-well problem like that considered above has a small state space: since the allowed operations always leave an integer number of quarts in each bucket, and since the two buckets have sizes 3 and 5, the set of possible states has at most 4 * 6 = 24 elements. But what if we allowed 40 buckets, with sizes up to 9 quarts? Then the number of states could be as muchh as 10 ** 40, i.e. 10,000,000,000,000,000,000,000,000,000,000,000,000,000 states: surely too many to search blindly. What then to do?

A decomposition strategy adapted to their special structure will often work for buckets-and-well problems. As an example, consider any bucket problem in which buckets of sizes s1 and s2 without any common factor appear. Examples are [3,5], [5,7], [6,7], etc. These two buckets can be used to measure out 1 quart. To do so, we just use the fact that the greatest common factor F of s1 and s2 can always be expressed either as A * s1 - B * s2 or as B * s2 - A * s1. (Euclid's algorithm for calculating F also calculates these A and B.) Since s1 and s2 have no common factor, F must be 1; so we have either A * s1 - B * s2 = 1 or B * s2 - A * s1 = 1. If the first formula holds we can fill the bucket of size s1 A times and pour it into the other bucket, which we just empty whenever it gets full. Clearly 1 quart must be left at the end. The case in which the other formula holds is just the same. Once measured out, this 1 quart can be poured repeatedly into any desired one of the other buckets, so we can clearly bring all the other buckets to any state we like.

Every pouring operation available to us always leaves at least one of the buckets completely full or completely empty. Hence a target state is only reachable if it includes at least one completely full or one completely empty bucket. Since this bucket can be emptied in one step if it is full, and vice-versa, we might as well suppose that it is empty in the target state we seek. Call this the 'spare' bucket. Using the '1 quart' technique just explained, bring all the buckets except the s1 and s2 quart buckets to their desired state, and put the target amount for the s1 quart bucket into the 'spare' bucket (which we assume is no smaller). Express the target amount W of water for the s2-quart bucket as W = A * s1 - B * s2 or W = B * s2 - A * s1. Using the 'repeated pouring' technique explained in the preceding paragraph, bring W quarts into the s2-quart bucket, using only the s1 and s2-quart buckets. Finally, pour the 'spare' bucket into the s1-quart bucket. The problem is now solved.

These arguments tell us that, in the cases considered, a state is reachable from the starting state if and only if it includes at least one completely full or completely empty bucket. So the total number of these accessible states is

2 * s2 * s3 * s4 * ... * sn + (s1 - 2) * 2 * s3 * s4 * ... * sn + (s1 - 2) * (s2 - 2) * 2 * s4 * ... * sn + ***

For example, if the vector of bucket sizes is [3,5,7,11,13,17,19,23], then 103,594,260 out of a total of 111,546,435 possible states are reachable from the start [0,0,0,0,0,0,0,0]. This formula is useful for the discussion of modified path-search procedures given beloww.

Ordinary software practice uses an army of programmers to generate a river of programs tailored to an endless variety of special situations. Research in artificial intelligences seeks to replace all this by finding a single magic key (or, perhaps, small box of magic keys): the one program, or complex of programs, smart enough to write all the programs we want, given only loose indications of what is wanted. Search programs like that considered above have often been regarded as first steps to this magic key. They must clearly be made capable of dealing with potentially enormous search spaces if they are to play this role. Toward this end, a wide variety of strategies have been attempted.

  1. Identification of intermediate points. If we can identify, a priori, some point P that must lie along the start-to-target path to be constructed, then our problem decomposes into two easier ones: find a path from the start S to P, and from P to the target T. If this a priori identification step can be repeated, direct construction of the desired path may become possible. This was the key to our solution of the 'Towers of Hanoi' problem considered in Section XXX. To move all the disks one must move the bottom-most one, and this can only be moved from a to b if all the smaller disks have first been moved to the third peg c. Hence all solution paths must certainly involve the intermediate state: large disk in original position - target peg clear - all other disks on third peg. This reasoning, iterated, leads directly to the problem solution given in Section XXX, and even tells us that this solution is of the optimal length: 2n - 1 for an n-ring Hanoi puzzle. The full state space for this puzzle has 3n points, since any ring could in principle appear on any of the 3 pegs. Our reasoning therefore allows us to ignore all but a fraction (2/3)n of the state space, which for an n = 20 Hanoi puzzle is .03%.

  2. Search pruning. If a priori reasoning can show that, if a path from S to T exists, then the shortest such path, or some other identifiable path, will or can omit certain pre-identified points, then we can check each new point to see if it is one of these points, and if so can drop it immediately.

  3. Search guided by heuristics. If the true distance of each state from the target state were known, search would always be easy: in each state S we could simply search for an adjacent state S' one step closer to the target, and move from S to S'. Even if no exact way of calculating this distance is available, we can hope to use some easy formula approximating this true distance as a guide to searching: this will at tell us when we are 'getting warmer'. To use such a formula, we prefer points which seem to lie close to the target to other points, always searching forward from points seemingly closest to the target, and only considering points at a greater estimated distance when all such points have been exhausted. A variant find_path procedure using this strategy is given beloww. It can be seen to work well for buckets-and-well problems.

  4. Relaxed search. We can begin by trying to form, not a real path between the specified start and target states, but an initial rough plan for a subsequent final final search. This 'plan' can consist of a series of states which we hope will be reachable from each other by paths that are easier to find than the full path from start to finish. To generate the plan we then take steps which are not themselves legal, but which we nevertheless have reason to believe can be 'fleshed out' by fuller sequences of legal steps. To realize the plan once it has generated, we simply search for links between its successive nodes. Nodes which cannot be linked to their successors, either after exhaustive search or after a search effort deemed sufficient, can be dropped from the plan, after which the 'fleshing out' process can resume, now trying to complete the rougher plan which results. When its planning component fails, this approach will eventually reduce to a simple, planless effort to connect the start and target.

  5. Decomposition. It may be possible to map the state-space of the problem we are trying to solve onto some more easily searched space, in such a way that any two nodes in the original space which map into adjacent nodes N_i, N_j in the reduced space can be connected by a path P_i_j in the original space that is relatively easy to find. We can then try to find a path N_1, N_2,...,N_k in the reduced space, from the mapped initial state to the mapped final state, and then try to connect a series of states S_1, S_2,...,S_k mapping to the steps N_1, N_2,...,N_k in this plan by paths in the original state space.

  6. Two-way search. As children interested in paper-and-pencil mazes quickly learn, it is often easiest to construct a path from a specified start to a desired target by working simultaneously forward from the start and backward from the target. This will be the case if the state space 'branches' rapidly. Suppose, for example, that the path to each point reached in state space can be continued in one of 10 ways. Then comprehensive search n levels deep from any given point will generate a set of 10**n points. Plainly blind search by this method for a path between a source and a target 14 steps away will fail in the situation considered, since (1/2) * 10**14 states would probably need to be examined before a path was found. But if we search simultaneously forward from the source and backward from the target until some common point is reached, we may only need to go 7 steps forward from the source and 7 steps backward from the target, and so might find a solution after examining only 10**7 states, a far more feasible collection.

All of these strategies are highly fallible, and need not all work well even for buckets-and-well problems, which at first glance seem so transparent. Nevertheless buckets-and-well problems provide an interesting laboratory for study of the search strategies we have listed. Not all the strategies listed are easy to apply. The combinatorial detail critical to buckets-and-well problems leaves us without any obvious way of identifying either intermediate states which a solution must traverse, or large numbers of states which can be omitted without disrupting all solutions. The easiest strategies to apply are the 'decomposition','relaxed search', and the 'search guided by heuristics' approaches. One way of trying to decompose a buckets-and-well problem is by bring one of the buckets to the desired target condition, and then work on the other buckets without using this 'finished' bucket. A closely related 'search guided by heuristics' approach is to use the number of buckets which have not yet reached their target content as a heuristic measure of distance from the desired target. A surprisingly elementary 'relaxed search' approach, explored beloww and seen to work well,is simply to form'rough' paths by allowing the contents of any bucket to be changed arbitrarily.

The following variant of our first find_path procedure implements a general form of 'search guided by heuristics' strategy. It always searches forward from points seemingly closest to the target, putting all other points in a set called 'set_aside'. When all points along what seems to be the most direct path to the target have been exhausted, search backtracks to the points in 'set_aside' which seem to be closest to the target.

This find_path procedure is instrumented by insertion of statements which print out the distance-to-target estimate being used whenever it changes. The two lines of code inserted for this purpose are shown in italics.

A estimated distance-to-target function which reflects our expectation that we can find a solution which brings more and more buckets into their target condition is given following the 'find_path' procedure. Experiments with this function show that it works rather well. For example, with buckets of sizes [3,5,7,11,13,17,19,23] the target [3,4,5,3,5,2,2,9] is found at the end of a 38-step path after searching just 6,543 of the 103,594,260 reachable states. With the distance-to-target feature turned off, we would expect about half the states to be searched.

With buckets of sizes [3,5,7,11,13,17,19,23,29] the target [3,4,5,3,5,2,2,9,22] is found at the end of a 34-step path after searching just 10,007 of the 3,020,137,890 reachable states. With buckets of size [3,5,7,11,13,17,19,23,29,31] the target [3,4,5,3,5,2,2,9,22,30] is found at the end of a 94-step path after searching just 41,473 of the 94,053,692,040 reachable states.

procedure find_path(start,target);
      -- general state-exploration procedure, incorporating use of estimated distance to solution

        reached_from := {[start,start]};        -- the start state is considered
                                                -- to have been reached from itself
		best_dist := dist_to_target(start,target);
		
        just_seen := {start};
                -- initially, only the start state has been seen
  		set_aside := { };
  		-- collection of states temporarily set aside, because state believed closer to solution is known      
        got_it := false;        -- we don't have the solution yet
        
        while just_seen /= { } loop     -- while there exist newly seen states

                brand_new := { };       -- look for states that have not been seen before

                for old_state in just_seen loop 
                					
			if dist_to_target(old_state,target) > best_dist then
				set_aside with:= old_state;		-- set this node aside, perhaps temporarily
				continue;		-- and do not generate new states from this node
			end if;

                     for new_state in new_states_from(old_state)
                                | reached_from(str(new_state)) = OM loop

                       reached_from(str(new_state)) := old_state;   -- and record its origin

			if (dtt := dist_to_target(new_state,target)) <= best_dist then
                      		brand_new with:= new_state;     -- record a brand_new state
				if dtt < best_dist then print("best_dist: ",dtt); end if;
                     		 best_dist min:= dtt;	-- we may now have a better distance
			else
				set_aside with:= new_state;	-- set this node aside, perhaps temporarily
			end if;
						
                        if new_state = target then 
                                got_it := true; 
                                exit;   -- since problem has been solved
                        end if;
                
                    end loop;		-- end for new_state

                just_seen := brand_new;         -- now the brand-new states
                                                -- define those which have just been seen

		if got_it then exit; end if;   -- since problem has been solved
		if #set_aside = 0 then exit; end if;	-- all states have been tried
		if #just_seen > 0 then continue; end if;
					  -- distance to target may still diminish

		-- otherwise we must backtrack, restarting with all the best of the states that have been set aside
		best_dist := min/[dist_to_target(state,target): state in set_aside];
		just_seen := {state in set_aside | dist_to_target(state,target) = best_dist};
		set_aside -:= just_seen;  -- the states now to be processed are no longer 'set aside'
				
		print("best_dist: ",best_dist);	-- instrumentation

       	     end loop;		-- end for old_state

        end loop;	-- end while

        if not got_it then return OM; end if;
                                        -- since all states have been explored, and the target
                                        -- has not been found, we know that no solution exists.

        -- at this point the target has been found, so we chain back from the target
        -- to reconstruct the path from start to target

        rev_path := [target];            -- initialize the path to be built

        while (last_state := rev_path(#rev_path)) /= start loop
                rev_path with:= reached_from(str(last_state));      -- chain backwards to the start
        end loop;

        return [rev_path(j): j in [#rev_path, #rev_path - 1..1]];       -- reverse the path

end find_path; 
The distance-to-target used for buckets-and-well problems should reflect our expectation that we can find a solution which brings more and more buckets into their target condition. Hence we simply estimate distance-to-target as the number of buckets which have not yet reached this condition. This function is:
	procedure dist_to_target(state,targ);			-- estimated distance to target
	--	return 1;		-- disable the estimate
		return #[t: t = targ(j) | t /= state(j)];
	end dist_to_target;

Plan-guided path construction. For buckets-and-well problems we can combine heuristically guided search with path planning by using this same pathfinding and distance-to-target function under the control of a top-level routine which first uses a relaxed new_states_from function to generate the plan, and then ties to flesh out the plan by filling in real paths between its steps. The relaxed new_states_from function can simply allow the state of any buckets to be changed arbitrarily. This is:

	procedure relaxed_new_states_from(state); 	-- variant new states function, for rough planning
		return {new_state: j in [1..#state], k in [0..size(j)] | (new_state(j) := k) /= OM}; 
	end relaxed_new_states_from;
The top-level routine is as follows:
	procedure find_path_by_planning(start,target); 
	var new_states_from;	-- allowed-step function used by find_path routine
		
			-- first allow 'roughly correct' steps, to generate a plan 
		new_states_from := relaxed_new_states_from;
		if (plan := find_path(start,target)) = OM then return OM; end if;
			  -- no plan can be found

			-- now use 'exact' steps, to fill in the plan 
		new_states_from := exact_new_states_from;
		details := [];		-- will show how j-th step of plan was filled in
		plan_index := 1;	-- next step of plan to be filled in
		
		while plan_index < #plan loop
		-- try to fill in the plan with exact steps, abandoning plan steps which don't work

			if (steps := find_path(plan(plan_index),plan(plan_index + 1))) = OM then
				-- cannot take this step of plan; drop a future plan step if possible
				if plan_index + 1 < #plan then
					plan(plan_index + 1) := [ ];
				elseif plan_index > 1 then
						-- drop the prior step of the plan, and back up
					plan(plan_index) := [ ]; plan_index -:= 1;
				else	-- impossible to connect start with target
					return OM;
				end if;
			
			else	-- record sequence of steps to the next plan point
				details(plan_index) := steps; plan_index +:= 1;
			end if;
			
		end loop;
			-- at this point we are done, and simply need to assemble all subsequences
			--  of steps into an overall solution, dropping repeated nodes
		return +/[steps(if j = 1 then 1 else 2 end if..): steps in details];
		
end find_path_by_planning; 
Experiment shows that the planning strategy shown above improves the efficiency of path-finding somewhat, though at the cost of an increase in the length of the paths found. For buckets of sizes [3,5,7,11,13,17,19,23] the target [3,4,5,3,5,2,2,9] is found after searching 743 instead of 22,628 of the 103,594,260 reachable states, but the length of the path found increases from 24 to 82. For the larger cases reported above results are as follows:

SizesTargetNodes SearchedPath LengthPrior SearchedPrior Length
[3,5,7,11,13,17,19,23,29][3,4,5,3,5,2,2,9,22]1,1269410,00734
[3,5,7,11,13,17,19,23,29,31][3,4,5,3,5,2,2,9,22,30]1,36825041,47394

5.4. Recursive Procedures

The value f(x) that a mathematical function of an integer, tuple, or set variable takes on for a particular x can often be expressed in terms of the value of the same function for "smaller" argument values x. Several examples of this general principle are

  1. The 'factorial' function n!, given by */ [i : i in [1.. n]], satisfies the identity

    n! = if n = 1 then 1 else n * ((n-1)!) end if;

  2. The sum sigma(t) = +/t of all the components of a tuple t satisfies the identity

    sigma(t) = if #t = 0 then OM elseif #t = 1 then t(1)
    else t(1) + sigma(t(2..)) end if;

  3. The tuple sort(s) representing the elements of a set s in sorted order satisfies the identity

    sort(s) = if #s = 0 then [ ] else [min/s] + sort(s less min/ s) end if;

This same function sort(s) also satisfies many other interesting identities. Suppose, for example, that we pick an arbitrary element x from the set s and then divide the remaining elements of s into two parts, the first, L, containing all elements less than x, the second, G, containing all elements greater than x. Then if we sort the elements of L and G and concatenate the resulting sorted tuples, sandwiching x between them, we clearly get a tuple t which contains all the elements of s in sorted order. This shows that the function sort(s) also satisfies the identity

	sort(s) = if (x := arb(s)) = OM then [ ] else
		sort({y in s: y < x}) + [x] + sort({y in s: y > x}) end if;

Identities of the kind appearing in the preceding examples are called recursive definitions, and the functions appearing in them are called recursively defined functions. Such recursive definitions all have the following features:

  1. For certain particular simple or minimal values (like n = 1 in (i) or t = [ ] in (ii)) of the parameter variable x of a recursively defined function f(x), the value of f(x) is defined explicitly.

  2. For all other parameter values x, the value of f(x) is expressed in terms of the value that f produces for one or more smaller argument values x1,x2, . . xn. That is, there exists a relationship of the general form

    f(x) = some_combination(f(x1),f(x2),..,f(xn))

  3. Repeated use of the relationship (b) will eventually express any value f(x) in terms of various values f(y) each of which has a parameter y which is minimal in the sense of (a), so that all values f(y) in terms of which f(x) is ultimately expressed are known explicitly.

Any recursive relationship satisfying (a, b, c) gives a method for calculating f(x) for each allowed argument x. Like many other programming languages, SETL allows one to express such recursive calculations very simply and directly, by writing recursive procedures, i.e., procedures which invoke themselves. This can be done for each of the three examples given, which then take on the following forms:

	procedure factorial(n); -- calculates the factorial n!
		return if n = 1 then 1 else n * factorial(n - 1) end if;
	end factorial;
	
	procedure sigma(t); -- calculates the sum of the components of t.
		return if #t = 0 then 0 elseif # t = 1 then t(1)
	   	 else t(1) + sigma(t(2..)) end if;
	end sigma;
	
	procedure sort(s);    -- recursive sorting procedure
		return if s = { } then [ ]
			else [min/s] + sort(s less min/ s) end if;
	end sort;
	
	procedure sort(s);    -- second variant of recursive sorting procedure
		return if (x := arb s) = OM then [ ] else
			sort({y in s | y < x}) + [x] + sort({y in s |  y > x}) end if;
	end sort;

These examples illustrate the following general remarks concerning recursive procedures:

  1. Syntactically, recursive procedures have the same form as other procedures. Their only distinguishing trait is that recursive procedures invoke themselves.

  2. The same name-scoping rules apply to recursive as to other procedures.

Note that a recursive procedure f(s) uses itself but always applies itself to arguments smaller than s; this is why the calculation of f eventually terminates.

A recursive procedure f need not invoke itself directly: It can invoke another procedure g which invokes f, or g can invoke some h which then invokes f, etc. A group of procedures which invoke each other is sometimes called a mutually recursive family of procedures, and any procedure belownging to such a mutually recursive family is itself called recursive.

For an example of such a mutually recursive family, consider the problem of defining an overall order for SETL objects, which will allow any two SETL objects to be compared to each other. (Such an order could, for example, serve as the basis for an output routine which alway arranged the elements of sets in increasing order, thereby making it easier to locate elements in large sets when they were printed.) To define such an order, we can agree on the following conventions:

  1. OM always comes first, integers before reals, reals before strings, strings before atoms, atoms before tuples, and tuples before sets.

  2. Among themselves, integers and reals are arranged in their standard order, strings in their standard alphabetical order, and atoms in the order of their external printed representations; i.e., if x and y are two atoms then x comes before y if and only if str(x) < str(y). (Recall that str(x) operator produces a string identical with the external printed form of the object x; see Section 2.5.3)

  3. Tuples are arranged in lexicographic order, i.e., t1 comes before t2 if, in the first component in which t1 and t2 differ, t1 has a smaller component than t2.

  4. To compare two sets, first arrange their elements in order. This allows them to be regarded as tuples; then apply rule (c).

The following mutually recursive group of procedures implements the ordering strategy we have just described.

procedure is_bigger(x,y);	-- return true if x >= y in the
			 	-- order just described

	return if x = y or y = OM then true
		elseif x = OM then false
		elseif type(x) /= type(y) then type_number(type(x)) > type_number(type (y))
		elseif is_integer(x) then x >= y
		elseif is_real(x) then x >= y
		elseif is_string(x) then x >= y
		elseif is_atom(x) then str(x) > str(y)
		elseif is_tuple(x) then lex_compare(x,y)
		else lex_compare(sort(x), sort(y)) end if; -- x and y are sets
	
end is_bigger;

procedure biggest(S);	-- find largest element in S,
			-- in the ordering defined by is_bigger.
		
	big := arb(S);

	for x in S loop
		if is_bigger(x, big) then		-- x may be biggest
			big := x;
		end if;
	end loop;
	
	return big;

end biggest;

procedure sort(S); if S = { } then return [ ]; else b := biggest(S); return sort(S less b) with b; end if; end sort; procedure lex_compare(t1,t2); -- compare two different tuples, -- in their lexicographic order, components being compared by is_bigger return exists c1 = t1(i) | is_bigger(c1,t2(i)); end lex_compare; procedure type_number(typ); -- converts typ, which is the -- name of a valid SETL type, into an integer tno := {["INTEGER", 1], ["REAL",2], ["STRING",3], ["ATOM",4], ["TUPLE",5],["SET",6]}; return tno(typ); end type_number;

Until now we have regarded recursive SETL procedures simply as SETL representations of recursive mathematical relationships and have ignored the question of how they are implemented, i.e., how the calculations which they define are actually performed. Our abstract view is really the best way to look at the matter, since the sequence of steps used to evaluate a recursive procedure can be complex and tricky to follow even when the mathematical relationship on which it is based is simple and easy to understand. Nevertheless one needs to understand how recursive calculations are performed. For example, when an incorrectly programmed recursive procedure malfunctions, one needs to know what is happening in order to diagnose the problem and correct it.

Implementation of recursive procedures, like that of mutually recursive groups of functions, is based upon the following rule. Whenever a procedure f invokes itself, a new logical copy of the procedure is created, initial parameter values are passed to this new logical copy, and execution of this new logical copy begins with its first statement. While the new copy of f is executing, the old copy of the function f, from which the new copy was created, remains in existence, but execution of it is suspended. The new copy can in turn invoke f, thereby creating a third copy of f, which can even go on in the same way to create yet a fourth copy, etc. However, if the recursion has been written correctly, the arguments x passed to the successive copies of f will be getting smaller and smaller. Eventually one of them will get small enough for the corresponding value f(x) to be evaluated directly. Once this happens, the currently active copy of the procedure f will execute a statement

return e;

for some directly evaluable expression e. This will pass the value of e back to the place from which the current copy of f (call it CCF) was invoked. CCF will then become superfluous and will disappear. The immediately prior copy of f will then become active, and when it finishes its execution it will in turn pass a value back to the copy of f from which it has been invoked and disappears, etc. Eventually a value, and control, will be returned to the very first copy of f, and the whole recursive evaluation will be completed as soon as this first copy executes a return statement.

As an example of this process of recursive evaluation, suppose that the recursive sort routine shown earlier in this section is invoked, and that initially the argument value {30,0,60,40} is transmitted to it. This will trigger the following steps of recursive evaluation.

  1. Copy 1 of sort begins to evaluate sort({30, 0, 60, 40})

  2. The minimum element 0 is removed from the set s, and sort is invoked recursively to evaluate sort({30,60,40})

  3. Copy 2 of sort begins to evaluate sort({30, 60, 40})

  4. The minimum element 30 is removed from the set s, and sort is invoked recursively to evaluate sort( {60, 40} )

  5. Copy 3 of sort begins to evaluate sort({60,40})

  6. The minimum element 40 is removed from the set s, and sort is invoked recursively to evaluate sort({60})

  7. Copy 4 of sort begins to evaluate sort({60})

  8. The minimum (and only) element 60 is removed from the set s, and sort is invoked recursively to evaluate sort({ }).

  9. Copy 5 of sort immediately returns [ ] as the value of sort({ }) to copy 4 and disappears.

  10. Copy 4 of sort appends the returned value [ ] to [60], returns the result [60] to copy 3, and disappears.

  11. Copy 3 appends the returned value [60] to [40], returns the result [40, 60] to copy 2, and disappears.

  12. Copy 2 appends the returned value [40, 60] to [30], returns the result [30, 40, 60] to copy 1, and disappears.

  13. Copy 1 appends the returned value [30,40,60] to [0], and returns [0, 30, 40, 60], as the final result of the whole recursive evaluation, to the place from which sort was first invoked.

The complexity of this sequence of steps underscores the fact that whenever possible a recursive SETL function like sort should be looked at as the transcription of a recursive mathematical relationship, in this case, the very obvious relationship

sort(s) = if s = { } then [ ] else [min/s] + sort(s less min/ s) end;

rather than in terms of the sequence of steps required for its evaluation. However, the way in which recursive procedures are evaluated becomes relevant if they are miswritten and consequently malfunction. Certain common pathologies are associated with malfunctioning recursive routines, and one needs to be able to recognize them when they appear. A common error is to write a recursion which does not handle its easy, directly evaluable cases correctly, or which for some reason never reaches a directly evaluable case. If this happens, a recursive procedure will create more and more copies of itself without limit, until the entire memory of the computer on which it is running is exhausted, and a final, "MEMORY OVERFLOW" error message is emitted.

In somewhat more complex cases, a malfunctioning recursive procedure will loop indefinitely, first creating additional copies of itself, then returning from and erasing these, then again creating new copies of itself, again returning from and erasing these, etc., without any overall progress to termination. Such a nonterminating recursive loop is likely to produce muchh the same symptoms as a nonterminating while loop; namely, the program will run on, either with no output or with a flood of repetitive output, until somebody notices that it has outrun its time limit and terminates it forcibly. This situation is most easily diagnosed at an interactive terminal, simply by printing out the parameters transmitted to the recursive function each time it is invoked; this pattern of parameters will fail to show the logical pattern upon which your hopes for eventual termination of the recursion rest.

Having said all this, we now go on to describe another interesting recursive procedure, appropriately called quicksort.

5.4.1 The quicksort procedure

This quicksort sorting method works as follows: If the tuple t of elements to be sorted has no elements or just one element, we have nothing to do, since an empty tuple or a tuple with just one element is always sorted. Otherwise,

Figure 5.4 Quicksort Procedure

we remove the first element x from t and divide what remains into two parts, the first ("small_pile") consisting of all those components smaller than x, the second ("large_pile") consisting of all those components at least as large as x. We then sort these two piles separately. This can most readily be done just by using quicksort itself recursively. Finally, we recombine to get all the original components in their sorted order. This is done by putting the sorted small_pile first, followed by the element x, and then followed by the sorted large_pile.

See Figure 5.4 for further explanation of the way in which quicksort works. Code for this procedure can be written as follows:

procedure quick_sort(t);		-- quicksort procedure, first form

if #t < 2 then return t; end if;

	x := t(1);		-- take the first component
	small_pile := [y: y = t(i) | y < x];
	large_pile := [y: y = t(i) | y >= x and i > 1];
	
	return quick_sort(small_pile) + [x] + quick_sort(large_pile);

end quick_sort;

By using SETL expression features more strenuously, we can write this whole procedure in just one statement, namely as

procedure quick_sort(t); -- quicksort procedure, second form

	return if #t < 2 then t
	else
		quick_sort([y: y = t(i) | y < t(1)]) + [t(1)]
		+ quick_sort([y: y = t(i) | y >= t(1) and i > 1])
	end if;

end quick_sort;

5.4.2 Another recursive procedure: mergesort

The quicksort procedure that has just been presented sorts by separating an array to be sorted into two piles which can be sorted separately and then combined. This recursive approach, sometimes called divide and conquer, forms the basis for many efficient data-manipulation algorithms. It is often most effective to divide the problem given originally into two halves of nearly equal size. Quicksort does not always lead to this equal division, since random selection of a component x of a tuple t may cause it to be divided into parts [y: y in t | y < x] and [y: y in t | y > x] which are very different in size. For this reason, we will now describe another recursive sorting technique, called mergesort, which does begin by dividing the tuple t that is to be sorted into two parts of equal size. This procedure works as follows:

  1. Divide t (at its middle) into two equal parts t1 and t2, and sort them separately.

  2. Then merge the two sorted parts t1, t2 of t, by removing either the first component of t1 or the first component of t2, whichever is smaller, and putting it first in the sorted version of the full tuple t, after which we can continue recursively, merging the remaining components of t1 and t2. Code for this procedure is as follows:

procedure sort(t); -- recursive merge_sort procedure
	return if #t < 2 then t
			-- since a tuple of length 0 or 1 is ipso facto sorted
	else merge(sort(t(1..#t/2)), sort(t(#t/2 + 1..))) end if;

end sort;

procedure merge(t1,t2); -- auxiliary recursive procedure for merging

	return if t1 = [ ] then t2
	elseif t2 = [ ] then t1
	elseif t1(1) < t2(1) then [t1(1)] + merge(t1(2..),t2)
	else [t2(1)] + merge(t1, t2(2..)) end if;
	
end merge;

Instead of programming the merge procedure recursively, we can write it iteratively. For this, we have only to work sequentially through the two tuples t1 and t2 to be merged, maintaining pointers i1, i2 to the first component of each which has not yet been moved to the final sorted tuple t being built up. Then we repeatedly compare t1(i1) to t2(i2), move the smaller of the two to t, and increment the index of the component that has just been moved to t. This revised merge procedure is as follows:

procedure merge(t1,t2);		-- iterative variant of merge procedure

	t := [];		-- merged tuple to be built up
	i1 := i2 := 1;		-- indices of first components not yet moved

	while i1 <= #t1 and i2 <= #t2 loop
 	
 		if t1(i1) < t2(i2) then -- move t1(i1) to t
			t with := t1(i1);
			i1 +:=1;
 		else                -- move t2(i2) to t
			t with:= t2(i2);
			i2 +:= 1;
		end if;

	end loop;

	return t + t1(i1..) + t2(i2..);
		-- note that at most one of t1(i1..) and t2(i2..) is non-null

end merge;

5.4.3 Binary searching: a fast recursive searching technique

If the components of a tuple t are arranged in random order, then to find the component or components having a given value we must search serially through every one of the components of t. Clearly no component of t can go unexamined, since this may be precisely the component we are looking for. On the other hand, if the components of t are numbers or character strings, and if they are arranged in sorted order, then, as everyone who has ever looked up a word in a dictionary or a name in a telephone book should realize, a muchh faster searching procedure is available. The most elegant expression of this searching procedure is recursive and is as follows:

  1. Compare the item x being sought to the middle item t(#t / 2) of the sorted tuple t. If x is greater than (resp. not greater than) this middle item, search recursively in the left half of t, otherwise in the right half of t.

  2. The search ends when the vector in which we are searching has length equal to 1.

In coding this procedure, we maintain two quantities lo, hi, which are respectively the low and the high limits of the zone of t in which we must still search. When the search procedure is first called, lo should be 1 and hi should be #t. When lo and hi become equal, we return their common value. If this locates a component of t equal to x, we have found what we want; otherwise we can be sure that x is not present in t, i.e., that no component of t is precisely equal to x.

Recursive code for this searching procedure is as follows:

	procedure search(x, t, lo, hi);
			-- binary search for x in t between lo and hi
	
	return if lo = hi then lo
		elseif x <= t(mid := (lo + hi)/2) then search (x,t,lo,mid)
		else search (x,t,mid + 1,hi) end if;
		
	end search;
It is easy to express this search iteratively rather than recursively: we can simply write
	procedure search(x, t); -- iterative form of binary search procedure
		lo := 1; hi := #t; -- initialize search limits

		while lo < hi loop

			if x <= t(mid := (lo + hi)/2) then
				hi := mid;
			else
				lo:= mid + 1;
			end if;

		end loop;

		return lo;

	end search;

Binary searching can be enormously more efficient than simple serial searching. Suppose, for example, that the sorted tuple t to be searched is of length 1,000,000. Then to search t serially several million elementary operations will be required. On the other hand, since 1,000,000 is roughly 2**20, only 20 probes will be required to locate a component of t by binary searching. So binary searching is roughly 50,000 times as fast as serial searching for sorted tuples of this length. This illustrates the vast efficiency advantage that can be gained by proper choice of the algorithm that you will use.

5.4.4 The Towers of Hanoi problem

Among the many different kinds of puzzles that can be bought in toyshops, the Towers of Hanoi puzzle is a classic. This puzzle involves a board with three identical pegs and a set of rings of decreasing size that fit snugly around any of the pegs. As initially set up, the puzzle is as shown in Figure 5.5.

Figure 5.5 The Towers of Hanoi Problem

To solve the puzzle one must move all the disks from the particular peg (peg 1) on which they are originally placed to one of the other pegs (say, to peg 3). However, only one disk can be moved at a time, and it is forbidden ever to place a larger disk on top of a smaller disk.

Recursion gives us an amazingly effective way of writing a solution to this problem. The key idea is this: since a large disk can never be placed atop a smaller, all the disks except the bottom one must be moved to peg 2 before we can move the bottom disk from peg 1 to peg 3. Hence, to move a pile of n disks from peg 1 to peg 3, we must

  1. move a pile of (n-1) disks from peg 1 to peg 2

  2. move the n-th disk from peg 1 to peg 3

  3. move a pile of (n-1) disks from peg 2 to peg 3

The following elegant recursive procedure generates the sequence of moves required; each move is represented as a pair [f,t] showing the pegs from which and to which a peg is moved.

	procedure moves(ndisks,fr,to,via); -- moves n disks from peg fr to peg to
	
		return if ndisks = 1 then [[fr, to]]
		else moves(ndisks - 1, fr, via, to) + [[fr, to]]
			+ moves(ndisks - 1, via, to, fr) end if;
	
	end moves;

5.5 Procedures that Modify Their Parameters

The procedures we have seen so far are given some collection of parameter values and calculate a single result value, which it returns, from them. Occasionally, however, one wants to use procedures in a somewhat different way; namely, one wants to invoke a procedure expressly in order to modify some object that already exists. In this case, such a procedure is invoked for its effect, rather than for the value it delivers. This use of procedures moves us away from the notions of "value" and "expression" and focuses more on the somewhat different notion of program state, i.e., the collection of all values that local and global variables have at each moment during a computation. What we will be describing in this section is the way in which procedures are used to modify this program state. There are two ways in which procedures can have this effect: one of them is to modify one or more of their calling parameters; the second is to modify one or more global variables.

This use of procedures is perfectly legal in SETL and is accomplished as follows. A procedure's header line lists its parameters, as for example in

procedure my_proc(x,y,z);

Parameters listed in this way can be modified within the body of the procedure (i.e., within my_proc), but parameter values are ordinarily local to the procedure, so that these modifications are not be transmitted back to the point from which the procedure was invoked. For example, if we define the procedure

				procedure change_parameter(x);            (1a)
					x := 0;
					return x;
				end change_parameter;

and invoke it by

				y := 1;
				z := change_parameter(y);		(2)
				print("z is: ", z, " y is:", y);

then the print statement will produce the output

z is: 0 y is: 1

This reflects the fact that the return statement in the procedure returns the final value of the variable x (which is local to the procedure), but that modifications to the procedure parameter x are not transmitted back to the point of invocation and therefore do not affect the value of the actual argument y appearing in (2). Thus the argument y remains unchanged.

This is the rule which ordinarily applies to procedures, and which is most appropriate for procedures used as functions. However, it is possible to bypass this rule, and to create procedures which do modify one or more of the actual arguments with which they are invoked. To do this, one simply prefixes the qualifier rw (meaning read/write parameter) to each parameter corresponding to one of these modifiable arguments. Suppose, for example, that we modify the procedure (1a), making it

			procedure change_parameter(rw x);
				x := 0;
				return x;
			end change_parameter;

Then the output of the print statement in (2) will change to

z is: 0 y is: 0

reflecting the fact that now changes in the value of the parameter x of the procedure (1b) will be transmitted back to the point from which the procedure was invoked.

Procedures whose parameters are qualified in this way will generally not be used as functions that return values (though technically it is legal to use them as functions). Instead, they will ordinarily be invoked simply by writing their names followed by their actual argument lists, as is illustrated by

		y:= 1;
		change_parameter(y); print("y is:", y);		(3)

which produces the output

y is: 0

Any procedure my_proc(x1,..,xn) can be invoked in this way, simply by writing a statement of the form

				my_proc(a1,..., an);              (4a)

where a1,...,an is any list of expressions (called, as usual, the actual arguments of the invocation (4a)). An invocation like (4a) is logically equivalent to an invocation

		junk_variable := my_proc(a1,...,an);		(4b)

where junk_variable can be the name of any variable whose value is never used for anything else.

Of course, if the procedure my_proc invoked by (4a) does not modify any of its arguments, an invocation like (4a) will generally not be very useful, since none of the arguments a1,...,an will change, and since the value returned by my_proc is simply thrown away. On the other hand, if the procedure my_proc does modify its arguments, then the invocation (4a) will trigger corresponding modifications of any arguments which correspond to parameters carrying the qualification rw.

Procedures which modify some of their arguments and which are normally invoked in this way are often called simple-procedures, as distinct from functions, i.e. from procedures which do not modify their arguments and are normally invoked in the manner illustrated by

x := my_proc(a1,...,an);

Since the value returned by a simple-procedure will just be thrown away, the expression e appearing in a statement

return e;

within such a procedure is usually without significance and may as well be OM. SETL allows

return OM;

to be abbreviated simply as

return;

and this is the form of the return statement which is appropriate to use in simple-procedures. Note also that a return statement immediately preceding the trailer line of a simple-procedure can be omitted.

Simple procedures with no parameters and which do not return any value can be invoked just by writing their names followed by a semicolon, as in

	my_simple_proc_without_parameters; -- invokes procedure with this name.

As an example, here is a simple-procedure which "compresses" a tuple by dropping out all of its OM components:

			procedure compress (rw t);
				t := [x in t | x /= OM];              (5a)
			end compress;

(Here we have made use of one of the rules stated previously to save writing a return statement just before the trailer line of this proc.)

Note that if x initially has the value [1,OM,OM,OM,2,OM,3], then the invocation

			compress(x);                (6a)

will give x the value [1,2,3].

As a matter of style, note also that instead of writing (5a) we could have written a closely related function, namely,

			procedure compress (t);
				return [x in t | x /= OM];             (5b)
			end compress;

in which case would have had to write

			x := compress(x);               (6b)

to get the effect of (6a). The form (6a) is sometimes slightly more convenient to write, and it is this convenience that can induce us to write a simple-procedure rather than a function for some purpose we have in mind.

In addition to the parameter qualifier rw, two additional qualifiers rd and wr are provided. These parameter qualifiers have the following significance:

rd read parameter: can be read and written within its procedure, but modifications to it will not be transmitted back to the corresponding actual argument.
rw read/write parameter: can be read and written within its procedure, and modifications to it will be transmitted back to the corresponding actual argument.
wr write-only parameter: can be written and will be transmitted back to the corresponding actual argument, but will not be read.

If none of these qualifiers is attached to a particular procedure parameter, the parameter will be treated as if it were qualified with rd. Thus rd is the default qualifier for otherwise unqualified parameters of procedures.

Next suppose that a procedure called my_proc has one parameter x which is qualified with rw or wr. In this case an invocation

					my_proc(e);                (7a)

of my_proc is translated by introducing an otherwise unused temporary variable (call it var), and treating (7a) exactly as if it were

					var := e;
					my_proc(var);           (7b)
					e := var;

The last line indicates that the only expressions which can appear as actual arguments in place of parameters qualified by rw or wr are those which can legally appear to the left of an assignment operator. (See Section 3.12 for a comprehensive discussion of these assignment targets). This means that the invocations

my_proc(3);

and

my proc(x + y);

are illegal, but

my_proc(tuple(x));   and   my_proc([x,y]);

are legal and translate as

					var := tuple(x);
					my_proc(var);
					tuple(x) := var;
and
					var := [x,y];
					my_proc(var);
					[x,y] := var;

respectively.

One final, rather esoteric, point deserves mention. Actual argument values are transmitted to a procedure and become the values of its formal parameters immediately upon invocation of the procedure. These values are transmitted by copying; i.e., each parameter receives a logically independent copy of the appropriate actual argument value upon procedure invocation. If the procedure modifies its parameters, it is these copied values that are modified while the procedure runs; the original argument values remain unchanged. Moreover, even if the procedure transmits changes in its parameter values back to the point of invocation, these changes are only transmitted when the procedure executes a return, at which time an assignment like that appearing in (7b) takes place. These rules are natural enough and normally require little thought. However, examples which show their effects can be contrived. For example, consider the following code, in which the variable y is global:

program esoteric;

	var x, y;		-- This declaration makes x and y global
	x := "initial_val_of_x"; y := "initial_val_of_y";

	manipulate(x,x,y);		-- invoke procedure shown beloww
	print("y is: ", y);

	procedure manipulate(rw u,rw v,rw w);

	print("u is ", u, " v is ", v);
	-- this will print: u is initial_val_of_x v is initial_val_of_x

	u := "changed,";

	print("u is ", u, " v is ", v);
	-- this will print: u is changed, v is initial_val_of_x

	-- Note that u and v remain different even though the
	-- corresponding actual arguments are the same

	w := "mangled";


	print("w is ", w, " y is ", y); -- note that y is global
	-- this will print: w is mangled, y is initial value_of_y

	-- note that y is still unchanged, even though the change in
	-- w will be transmitted back to y when we return from this procedure
	
	end manipulate;
	
end esoteric;
Note finally that the last line of output produced by this program, which will be produced by the print statement (in line 5 of the program) which immediately follows the invocation of manipulate, will be

y is mangled,

since after return from 'manipulate' y gets the value assigned to w by 'manipulate'.

EXERCISES

1. Write a procedure whose inputs are a tuple t of integers and a tuple s of integers in increasing order, and which returns a tuple t1 of length #s + 1 defined as follows: the first component of t1 is the number of components of t which are not greater than s(1); for j between 2 and #s, the j-th component of t1 is the number of components of t which are greater than s(j-1) but not greater than s(j); and the last component of t1 is the number of components of t which are greater than the last component of s. Try to make your procedure efficient.

2. "Bags," used in some programming languages, are like sets, but each element of a bag can occur several times (i.e., any specified number of times). In SETL, a bag b can be represented in two obvious ways.

(a) by a tuple: i.e., the elements of B can be arranged in some arbitrary order and made the components of a tuple; or

(b) by a map, which sends each element of B into the number of times that it occurs in B.

Write a pair of procedures that convert between these two different representations of a bag B. Also, write a collection of procedures which extend the following set operations to bags in the most useful way:

(i) b1 + b2, b1*b2, and b2-b2 (where b1 and b2 are bags)

(ii) x in b (where b is a bag and x is arbitrary)

3. The following table describes the tax due on D dollars of taxable income. Write a procedure which, given D, will return the amount of tax due.

Income OverBut Not OverTax
2,3003,40014%
3,4004,000154 + 16% of Amount Over --3,400
4,0006,500314 + 18% of Amount Over 4,400
6,5008,500692 + 19% of Amount Over 6,500
8,50010,8001,072 + 21% of Amount Over 8,500
10,80012,9001,555 + 24% of Amount Over 10,800
12,90015,0002,059 + 26% of Amount Over 12,900
15,00018,2002,605 + 30% of Amount Over 15,000
18,20023,5003,565 + 34% of Amount Over 18,200
23,50028,800 5,367 + 39% of Amount Over 23,500
28,80034,1007,434 + 44% of Amount Over 28,800
34,10141,5009,766 + 49% of Amount Over 34,100
41,50055,30013,392 + 55% of Amount Over 41,500
55,30081,80020,982 + 63% of Amount Over 55,300
81,800108,30037,677 + 68% of Amount Over 81,800
108,300----------55,697 + 70% of Amount Over 108,800

4. Write a program which will read in a sequence of lines, each containing someone's name, first name first, and print out an alphabetized list of these names, in alphabetic order of last names. Repeat this exercise, but this time print the alphabetized list with last names first.

Three Exercises on Permutations

A permutation is a one-to-one mapping of a set s of n items into itself. If the set s consists of the integers from 1 to n, then such a permutation can be represented as a vector v of length n such that every integer from 1 to n appears as a component of v. The following exercises concern various properties of permutations.

5. The product prod(v1,v2) of two permutations v1 and v2 is the vector v such that v(i) = v1(v2(i)) for each i in {1.. #v}. The identity permutation e of n integers is the permutation represented by the vector [1,2,..,n]. The inverse inv(v) of a permutation is the permutation such that prod(v,inv(v)) = e. Write two SETL procedures prod and inv which realize these operations. Test them with the help of a procedure rand_perm(n) that generates a different random permutation of the integers from 1 to n each time it is called.

6. Check the following facts concerning permutations by generating a few random permutations and verifying that each fact asserted holds for these permutations. (The routines described in Ex. 5 should be used for this purpose.)

  1. The product of two permutations is a permutation, and the product of permutations is associative.

  2. prod(inv(v), v) = e for each permutation v.

  3. prod(inv(u), inv(v)) = inv(prod(v, u)) for any two permutations u, v of n elements.

  4. Define power(u, k) to be the product of k copies of the permutation v. Check that power(v, j + k) = prod(power(v,j), power(v, k)). Check that for each permutation v there exists a positive integer k such that power (v, k) = e.

  5. Is prod(u, v) = prod(v, u) true for every pair u, v of permutations of n items?

7. A simple recursive procedure to generate all the permutations of the elements of a set s is the following:

	procedure permutations(s);

		if s = { } then
			return {[ ]};
		else
			return {[x] + P: x in s, P in permutations(s less x)};
		end if;

	end permutations;

It is often more convenient to generate permutations one by one, by successive calls to a generating procedure. For example, a program to generate all permutations (rearrangements) of the integers 1 thru n can be built up as follows. Start with the numbers in the sequence s = [1. . n]. Then repeatedly find the last element s(j) in the sequence s such that s(j + 1) > s(j). Let s(i) be the last element following s(j) such that s(i) > s(j). Interchange s(i) with s(j), and then reverse the sequence of elements following the j-th position. This gives the next permutation s.

Write this permutation-generation procedure in SETL, and use it to write out the list of all permutations of the integers 1 thru 5. Use this same procedure to create a program which reads in a string of length 5 and prints it out in all possible permutations, but without any repetitions.

8. If a second-order polynomial P(x) = A*(x**2) + B*x + C with integer coefficients A, B, C has a first-order polynomial M* x + N with integer coefficients as a factor, then M is a factor of A and N is a factor of C. Write a procedure which uses this fact to test polynomials like P(x) to see whether they can be factored and that produces the two factors of P if P can be factored. How efficient can you make this factorization procedure? Can you devise a similar procedure for factoring third-order polynomials with integer coefficients?

9. Many years ago, tokens on the New York City subway system cost 60 cents. Tokens are sold at change booths. Purchasers normally pay for tokens without saying anything, simply by passing a sum of money to the token booth attendant. Certain sums of money (e.g., $1, which will purchase only one token) are unambiguous. Others, like a $5 bill, are ambiguous, since they will purchase anywhere from one to eight tokens. On the other hand, $5.50 is unambiguous, since the likely reason for adding the last 50 cents is to pay for nine rather than just eight tokens. Write a program which will read a tuple designating a collection of bills and coins, decide whether this is ambiguous or unambiguous, and print out an appropriate response (which might be either 'How many tokens do you want?' or 'Here are n tokens').

10. Before Britain began to use decimal coinage, its money consisted of pence, shillings worth 20 pence each, and pounds worth 12 shillings each. Write a procedure to add sums of money represented in this way, reducing the sum to pounds, shillings, and pence. (Sums of money can conveniently be represented as triples.) Write a procedure that will subtract sums of money represented as pounds, shillings, and pence, and which could have been used to make change in predecimal British shops.

11. Write a function whose argument is a tuple t with integer or real coefficients, and which returns the positions of all the local maxima in t, i.e., all the components of t which are larger than either of their neighboring components.

Exercises on Recursion

12. The greatest common divisor gcd(x,y) of two positive integers is the largest positive integer z such that (x mod z) = 0 and (y mod z) = 0. (If x and y are equal, then gcd(x, y) = x). Write procedures each of which calculates gcd(x, y) efficiently by exploiting one of the following mathematical relationships:

(a) gcd(x,y) = gcd(x - y,y) if x > y

(b) gcd(x,0) = x and gcd(x,y) = gcd(x mod y,y) if x > y.

(c) gcd(x,y) = 2 * gcd(x/2, y/2) if x and y are both even.

(d) gcd(x,y) = gcd(x/2, y) if x is even and y is odd

(e) gcd(x,y) = gcd(x - y, y) if x and y are both odd and x > y.

13. Suppose that we make the gcd procedure of Ex. 12 into an infix operator gcd and then evaluate gcd/ s for a set s. What result does this produce? Assuming that s1 and s2 are non-null sets, is the identity

gcd/ (s1 + s2) = (gcd/s1) gcd/ s2

always true? What will happen if on,e of s1 or s2 is null?

14. A rational number m/n (with integer numerator and denominator) can be represented in SETL as an ordered pair [m,n]. Using this representation, write definitions for procedures called rs, rd, rp, and rq, which respectively form the sum, difference, product, and quotient of two fractions. These procedures should reduce fractions to lowest terms, for which purpose one of the gcd procedures developed in Ex. 12 will be found useful.

15. Supposing that fractions have the representation described in Ex. 14, write a procedure which takes a set of fractions and sorts them into increasing numerical order.

16. The following mathematical relationships can be used as the basis for recursive procedures for calculating various mathematical functions. Write out appropriate recursive procedures for each of these functions.

(a) The value x occurs as a component of a tuple t if and only if it occurs either as a component of the left half of t or as a component of the right half of t.

(b) The sum of all the components of a tuple t of integers is the sum of the left half of t plus the sum of the right half of t.

(c) The reverse of a tuple t is the reverse of its right half, followed by the reverse of its left half.

Think of at least four other relationships of this kind, and write out recursive procedures based on these relationships.

17. The Fibonnacci numbers</