An *object* is defined by an internal collection of values representing its *state* and an associated collection of operations which can be used to sense and manipulate that state, and possibly to combine the object's internal state values with similar values drawn from other objects. Objects can be used to represent anything not directly or easily represented by one of SETL's built-in set-theoretic types. In particular, all of SETL's advanced facilities for user interaction, as presented in Chapter 10, are built using the object facilities explained in this section. These object facilities also serve to ease use of other external-language facilities availble to SETL by enclosing them in helpful 'syntactic wrappers'.

The operations defined on an object are called its *methods*, and are conceptually similar to procedures. But an important difference between a method and a procedure is that a method's name specifies an operation to be performed, but leaves the specific procedure to be used in performing the operation to be chosen dynamically, in a manner dependent on the type of the object to which the operation is to be applied. To illustrate the difference between these ideas, suppose we have two objects, one built around a bitmap and the other around an encapsulated Postscript graphic. Each of these objects might naturally have a method called *draw*, which draws it onto the screen (or elsewhere). We might then find ourselve working with a variable x that could be either a bitmap or Postscript graphic. Then the expression

will call either the draw method defined on bitmaps or the method defined on Postscript graphics, depending on the value of x at the time the expression is evaluated. A method is therefore something like an overloaded procedure.

A method is invoked by passing it an object of a kind for which it was defined, along with any other arguments required by the method. This is much like a procedure call, except that the object both determines the procedure called and becomes an implicit argument to the call. The standard syntax for this is

However, SETL also allows method calls to be written in infix notation, and, indeed, allows very aggressive overloading of all its built-in operations and much of its built-in syntax. The importance of this capablity is explained below.

A *class* or *object class* describes the data elements and methods common to a family of similar objects. Each of the (dynamically created, and possibly very numerous) objects described by a class is called an *instance* of the class. All these instances share a common set of methods (coded procedures), but each of these instances ordinarily contains its own private copy of some (typically of almost all) of the variables manipulated by these procedures. These variables, which exist in *separate* copies for each object of a class, are called the *instance variables* of the class; those shared by all the objects of the class are called the *class variables* of the class. Any of the instance variables, class variables, or methods defined for a class CL can be either be made *public* (visible to other programs, packages, or classes which use CL) or kept *private* (invisible to these other programs, packages, or classes.)

Classes can be related to underlying classes which aid in their definition by declaring a relationship called *inheritance* between them. This relationship integrates classes into an inheritance hierarchy. The inheriting class CL in such a relationship is called a *subclass* of any class from which it inherits, which conversely is called a *superclass* of CL. Subclasses are able to use data elements or methods in their superclasses as if they were declared internally. However, it is also possible for a subclass to override method and variable definitions made in one of its superclasses, simply by defining a method or variable with the same name again. Details of the conventions which apply are explained below.

The rules governing the syntax of a class definition closely resemble those which apply to package definitions. A class is treated as a compilation unit, at the same level as a package or program. It is not possible to embed a class within a package, program or other class. As in the case of packages, the definition of a class consists of two parts, a *class specification* and a *class body*. (The specification and body need not occur in the same source file, but the specification must always be compiled before the body.) Each class specification declares the list of superclasses (if any) from which the class inherits, along with names of class-specific methods and data elements visible to units which use the class. The complete syntax of a class specification is:

Each of these components of a class specification is described in a following section.classclass_name;inheritnames_of_superclasses; -- there can be several of these data_declarations; -- const andvardeclarations, as in packages and programs method_declarations; -- like the procedure declarations in a package headerendclass_name; -- here, 'class_name' is optional

A class body defines some of the methods declared in the class specification, along with other methods and data elements visible only within the class. This can include a list of (class-private) instance and class variables. The syntax of a class body is:

classbody class_name;usenames_of_other_classes_and_packages; -- there can be several of these data_declarations; -- const andvardeclarations, as in packages and programs method_definitions; -- like the procedure declarations in a package bodyendclass_name; -- here, 'class_name' is optional

Note that ( for consistency) this syntax similar to that used for package specifications / package bodies; but there are a few differences:

- any procedure in a class specification is treated and must be called as as a method .
**inherit**clauses appear within class specifications, not class bodies. Inheritance differs very significantly from use of another class or package, as explained in more detail in later.

Data elements declared in connection with classes fall into one of several categories, based upon where they are stored and where they are visible. A data element declared in an object class or in one of the object classes from which it inherits can have a separate copy stored in each instance of the class, or it can be global to all instances of the class. This is the distinction between instance variables and class variables.

An instance variable is declared with the same syntax as is used for global variable declarations within packages or programs:

varname_1,name_2 := initializing_expression,...;

Note that all instance variables must be explicitly declared. Instance variables in the current instance (see 8.4) of any class C may be referenced by name only; and instance variables in other instances of the same or ancestor classes (these are the classes from which C inherits directly or indirectly) can also referenced using the more elaborate syntax illustrated by instance.name.

If initialization clauses are attached to the declarations of instance variables they are executed when an instance is created, but before any create method (see 8.5) is called.

Class variables are declared in a manner similar to that used for instance variables, but with the extra keyword **class**, as shown in

class varname_1,name_2 := initializing_expression,...;

If initialization clauses are attached to the declarations of **class** variables they are executed when a **class** is loaded by the SETL interpreter. This can happen at one of two times:

- If a
**class**is explicitly used by a program, the**class**is loaded at the start of execution. - the
**class**can be loaded explicitly by applying the**unbinstr**procedure (see 6.2) or the**getb**procedure to one of its instances.

Methods are similar to procedures, except that many different classes can contain identically named methods. The standard syntax of a method definition is identical to that of a procedure definition, except that read-write and write-only parameters are not allowed.

procedurename(parameters); data_declarations; -- const andvardeclarations, as in packages and programs ... statements ... subprocedure_definitions; -- like other imbedded procedure definitionsendname; -- here, the name is optional

Methods declared using this the standard syntax are invoked using the syntax

x.method_name(parameter_1,..,parameter_k).

Here x must be an instance of a class for which method_name is a method applicable to objects of that class. The method is invoked and the object x passed to it as an implicit parameter, referenced within the method using the reserved keyword **self**. Within any of an object's methods, any 'direct' reference to instance variables (i.e. any reference that is not preceded by an explicit object reference using the syntax instance.instance-variable) refers to the corresponding instance variable in the object x. Thus a direct reference

The more general syntax

can be used to refer to the instance variables of other objects of the same class (or any of its superlasses), provided that these are visible at the point of call.

Similarly, the methods of a class may be called within a class body without the instance prefix., i.e. method_name(parameters) is synonymous with **self**.method_name(parameters) within such a class.

The B>self object passed to a method call as an implicit parameter is analogous to a read-write parameter of a procedure, in that the method can modify the object and pass the modified result back to the point of call; all other parameters of a method are always read-only. Internally, the SETL interpreter stores objects as tuples, where the first element of the tuple is a key indicating the class of the object and the remainder of the tuple stores the values of instance variables. When a method is called, the values of the instance variable are copied into the instance variables. When the method returns, those values are copied back into the tuple. Note that this protocol passes parameters by copying, not by reference.

An object is of a given class is created with a call to a *create method* that must be declared and defined within the class and its class body if any instances of the class are ever to be created. Such a method must appear in both the class specification and class body, since it must be visible outside the class body. (But if the class is used only as a superclass of other classes, no create method need be declared or defined for it.) Since create calls must specify the class to which the newly created object is to belong, they have a special syntax, which uses the name of the class as a surrogate creation routine name for that class. For example, if we have an object class rational_number (introduced to make rational numbers, which are not built into SETL, available), an object of that class might be created by writing

three_fourths := rational_number(3,4); -- create the fraction 3/4

As this example shows, a create method can accept parameters and use them to initialize the created instance. If this is done, the specified number of actual parameters must be attached to the creation call and will then be passed to create.

The number of actual parameters of the creation call must agree with the formal parameters in create. As part of the creation process, all of the initialization clauses attached to its instance variable declarations are executed.

Note that 'create' procedures need not return anything. This is because when create receives control, the object (**self**) to which it implitly refershas been created by the interpreter and the instance variables of this bject will already have been initialized (at least partially) by using to the initialization clauses on the instance variable declarations. The create procedure is invoked after this preliminary initialization, and has the responsibility of completing initialization (in complex cases this may require creation of various ancillary objects). When create terminates, the **self** is automatically returned; any other value returned by create will be ignored.

'Create' is not a reserved word. It is only special in that if a method named create is present in a class specification it will be called implicitly at the time an object is created by using the class name of the desired object.

SETL's built-in operators can be overlaoded, allowing them to invoke methods defined on classes. For example,we can define '+' and '*' operators for our hypothetical rational_number class, allowing us to write code like

three_fourths := rational_number(3,4); two_thirds := rational_number(2,3);Clearly the SETL interpreter will not know how to add or multiply two objects of some newly defined class unless an addition or multiplication method is explicitly defined for the class. To define a multiplication method for the class rational_number, something like the following the following could be placed in the class body:

procedureself* x; -- multiplication operator for rational numbers return rational_number(numerator * x.numerator,denominator * x.denominator);end;

We now detail the conventions that apply to such operator overloading.

To overload one of SETL's standard binary operators, for example '+' or **max**, we use the operator itself in a **procedure** header. Two overload methods can be defined for each of SETL's built-in binary operators. These have headers of the following forms:

The reson for allowing both of these two forms is that an object can appear either as the left or the right hand operand of a binary operator. SETL operations often expect both operands to be of the same type, but there are exceptions. For example, the built-in * operator can operate on integers and strings or integers and tuples, in which case the operands can appear in either order. We allow the two forms of binary operators to enable the same sort of thing for general objects. The first form above is used if the left operand determines the method used, in which case the left operand will become the current instance. Otherwise the second method will be used and the right operand will become the current instance.procedure selfbinary-op second_arg; orproceduresecond_arg binary-opself;

In deciding how to process a binary operation the SETL interpreter gives precedence to the left operand. That is, it goes through the following steps before invoking a method of this kind:

- If the left operand is an object and that object has an appropriate method, then that method is used and the left operand will become the current instance.
- If the left operand is not an object or it doesn't have a left operand method but the right operand is an object with an appropriate method (defined in the second of the two forms shown above), then that method is used and the right operand will become the current instance.
- Otherwise there is no appropriate method, so the program is aborted.

Each of these methods should return a value, although that is not enforced by the system. If no other value is returned, then **OM** will be returned. If an object of some class is to be returned the method will need to create, initialize, and then return a new instance.

Here is a complete list of SETL's standard binary operators.

- * / **modminmaxwithlesslessfnpow

All of SETL's standard unary operators also allow associated methods to be defined. The headers for these methods have the following form

In this case, the method used will always be determined by the class of the value of the operand, which will become the current instance. Each of these methods should return a value, otherwiseprocedureunary-opself;

Here is a complete list of SETL's standard unary operators.

- #arbdomainrangepow

Many, but not all of SETL's relational operators can be overloaded. We do not allow overloading of the equality and inequality operators, since these primitive operations play a fundamental role in the definition of set and map membership. We also restrict the overloading of comparison operators, since the SETL code generator assumes that a < b if b > a. Therefore, we allow a < method but do not allow a > method. For each operation we invoke the < method, but in the expression a < b, a will become the current instance, while in a > b, b will become the current instance.

Because of these restrictions, only < and 'in' can have associated methods. The < method allows two forms (left and right) just like other binary operators and follows the same rules. The 'in' operator also has two forms but we give precedence to the right operator when determining the method to be invoke for an in expression.

Any method associated with < or 'in' in this way must return either true or false.

The < method will be called when any of the expressions: a < b, a <= b,
a > b, or a >= b is encountered. The expression a <= b is interpreted as
a < b **or** a = b.

SETL provides four expressions for refering to portions of maps, tuples or strings, namely f(x), f{x}, f(i..j) and f(i..). Each of these expressions can appear in both left and right hand side contexts. The ability to overload these syntactic constructs is particularly valuable, since it enables us to define new own aggregates organized any way we like, while retaining the ability to access components of those aggregates elegantly.

Each of these expression forms allows two associated methods, one for appearances of the form in left hand contexts and the other for right hand appearances. The syntax of the headers for these methods is as follows:

**procedure self**(id1) := id2;

**procedure self**{id1};

**procedure self**{id1} := id2;

**procedure self**(id1..id2);

**procedure self**(id1..id2 ) := id3;

**procedure self**(id1..);

**procedure self**(id1..) := id2;

Each of the methods used in right hand side contexts (those whose headers do not include an := symbol) should return a value. Otherwise **OM** will be returned. The methods for left hand contexts should not return anything, but should use their rightmost argument to modify the current instance.

SETL provides three extraction/deletion operations:

and object methods can be defined for any of these. The syntax of the corresponding method headers is:from,fromb, andfrome,

**procedure fromb self**;

**procedure frome self**;

Notice that there is no left operand in these headers, even though each is a binary operator. The left operand in a deletion operation is written but not read. Whatever value these methods return will be assigned to the left operand as well as the target operand. Each must return a value, or **OM** will be used.

SETL provides several syntactic constructs which call for iteration over an aggregate. For example, on encountering the expression

the interpreter will iterate over S, screening each element by applying the condition x < 5 and collecting all values which satisfy that condition. Iterators are used in set and tuple forming expressions, 'for' loops, and quantified expressions. Three general forms of iterators are provided:

expression1 = expression2 (expression3)

expression1 = expression2 { expression3}

Note that the expression y = f(x) is equivalent to [x,y] in f, and so does not enter into the following discussion.

SETL allows two pairs of built-in methods, corresponding to the first and third of these syntactic constructs. These are

**procedure** set_iterator_start; **procedure** set_iterator_next;

When the interpreter encounters code requiring an iteration over an object, it calls the iterator_start or set_iterator_start method, depending on whether the iterator was of the first or third form above. Then it repeatedly calls iterator_next or set_iterator_next to get successive elements of the object.

The iterator_start and set_iterator_start methods need not return a value. They are only used to let an object initialize an iteration. The iterator_next and set_iterator_next methods should return the next element x in the object, including it in a tuple [x] of one component if the iteration is to continue, or should return **OM** if iteration is to terminate.

If the iterator form is y = f(x), then the first pair of iterator methods will be used, but each value retured must be a pair, so each return return statement within the method will look something like this:

notice the double brackets. The outer tuple indicates that a value was found, and the inner tuple is the pair of values required in this form of iteration.

If the iterator expression is y = f{x} then the second pair of iterator methods will be used. The return values must obey the same rules as for y = f(x) iterators.

None of the method names appearing in this subsection are reserved words. If not used as iterator methods, they can have any number of parameters and return anything you like. But if they are to be used for iteration, they must conform to the rules above, or the program will abort.

Objects are printed by first calling the built-in procedure **str**, then printing the string that results. A default string is alwaysproduced when **str** is applied to an object of a user-defined class, but this is mainly useful for debugging. (It lists all the instance variables, but in an ugly format.) This default string conversion can be overridden with something more elegant by furnishing a class with a method having the name 'selfstr', declared with the following header

If this method is provided for a class it will be called when **str** is applied to objects of that class. 'selfstr' can return any value, but ideally should return a printable string version of the object.

The type of an object can be determined using SETL's built-in **type** procedure. The value returned will be the name of the object's class as an upper case character string. SETL is not case sensitive, but always keeps names as upper case.

Suppose that we introcuce the following simple object class and then run the test program shown:

classsimple; -- simple demonstration objectclassprocedurecreate(n); -- creates object with given internal valueproceduresetval(n); -- sets internal valueprocedureselfstr(); -- string conversion routineendsimple;class bodysimple; -- simple demonstration objectclassvarval; -- internal valueprocedurecreate(n); val := n;endcreate; -- creates object with given internal valueproceduresetval(n); val := n;endsetval; -- sets internal valueprocedureselfstr();return str(val);endselfstr; -- string conversion routineendsimple;programtest; --usesimple; x := y := simple(1); x.setval(2);endtest;

The value printed by the test code is

This is because SETL objects introduced by class definitions, like built-in SETL objects, have strict value semantics: the operation x.setval(2), which changes the object x, must by definition have no effect on the object y, any more than it would have if we write

For this reason the SETL system creates a fresh copy of x before executing x.setval(2), preventing this operation or others like it from affecting y.

The same effect is seen in the test code

programtest; --usesimple; x := {y := simple(1)}; y.setval(2);endtest;

which for the same reason produces the output

This is not always the effect wanted (and one wants this effect less often in object settings than in dealing with orinary SETL values.) Suppose, for example, that our objects represnet the employees of a multi-department firm, whose departments are then represented by sets of employees, and that an operation like x.setval(n) is used to change an employee's salary when they get a raise. SETL's value-emeantics rule would then imply that this operation would have no effect on the summed value of salaries over the set of all employees in a department, sice the set of such employees would always contain objects with unchanged salary values. It is as if the operation x.setval(n) created an entirely new employee, with the new salary, while leaving the former employee unchanged in all contexts which the name 'x' does not directly identify. This strict value-semantics convention makes'salary' part of an employee's very identity, clearly not what is wanted. These are situations in which pointer semantics, not value semantics, should be employed.

To create SETL objects which display pointer rather than value semantics in regard to certain modifyable attributes, one can set the attribute items stored internally in the objects to unchanging atoms p, created by the creation routines ofthe objects themselves, and then attach the attribute values to these attribute atoms (which accordingly function as 'pointers') using the special global SETL map '^'. To modify our example class in this way we would write

classsimple; -- simple demonstation objectclassprocedurecreate(n); -- creates objectwithgiven internal valueproceduresetval(n); -- sets internal valueprocedureselfstr(); -- string conversion routineendsimple;class bodysimple; -- simple demonstation objectclassvarval; -- internal valueprocedurecreate(n); val :=newat(); ^val := n;endcreate; -- creates objectwithgiven internal valueproceduresetval(n); ^val := n;endsetval; -- sets internal valueprocedureselfstr();return str(^val);endselfstr; -- string conversion routineendsimple;

Now the output produced is

showing that our modified objects have pointer semantics.

A method can be used as a first class object just as a procedure can, but only when an implicit instance variable is first attached to it. The procedure-like value of a method which would be called as

can be captured by writing

Such a value can then be called in the same way as any other procedure, e.g. by witing

This expression will invoke 'method' with the value of x as the current instance.

This semantic rule is consistent with that applied when SETL forms procedure closures. Recall from Section XXX that whenever a procedure value is returned that otherwise would reference some variable defined outside its own body, SETL saves the environment of that procedure, which is to say saves as much as necessary of the current activation of all enclosing procedures. Since, inside a class body. the current instance is part of a method's environment, we bind the current instance to the method and save the combination as a closure-like procedure value. It follows that object methods cannot be used as first-class objects until an associated object instance is bound with them.

Suppose that we need to do an extensive calculation using rational numbers, which SETL does not provide as built-in objects. SETL's object **class** facility allows us not only to introduce objects of this kind, but to specify that the standard infix operators +, -, *, /, and also comparison operators like < and <= will be used in standard fashion to combine and compare rational numbers. Once this is done we can write arithmetic expressions for rational numbers in just the same way as we do for integers.

The following **class** definition accomplishes this. Since fractions are represented mathematically simply as pairs of integers, the way to set up this **class** is rather obvious. Each fraction (ie. object of 'rational' type) is defined by two internal instance variables num,den (its numerator and denominator, from which we always eliminate any common factor, so that they are always in 'lowest terms'.) The codes for the various arithmetic operators and comparisons seen in the **class body** simply reflect the standard mathematical rules for operating with fractions. All these operators are written so that they can combine either fractions with fractions, or fractions with integers, in either order. Whenever a new fraction is formed, we use the internal 'simplify' routine to reduce it to lowest terms. The 'str' procedure is redefined so as to give rational numbers their standard printed form 'num/den'. Since division of rationals by integers is available, the create routine simply takes a single integer numerator as parameter, and we build fractions like 2/3 by writing 'rational(2)/3'.

The following short program tests the 'rational' class defined by the preceding code.classrational; -- rational numberclassprocedurecreate(n); -- converts the integer n to the corresponding fraction n/1procedure ceil(); -- ceiling of rationalprocedure floor(); -- floor of rationalendrational;class bodyrational;varnum,den := 1; -- numerator and denominator, always in lowest termsprocedurecreate(n); -- converts the integer n to the corresponding fraction n/1 num := n; -- set the numeratorendcreate;procedureset_denom(n); den := n;endset_denom;procedureselfstr; -- string conversion for fractionsreturn str(num) + "/" +str(den);endselfstr;procedureself+ r2; -- addition routine for fractions, and for fraction + integerif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num * den2 + num2 * den; n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedurer2 +self; -- addition routine covering the case integer + fractionif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num * den2 + num2 * den; n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedureself- r2; -- subtraction routine for fractions, and for fraction - integerif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num * den2 - num2 * den; n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedurer2 -self; -- subtraction routine covering the case integer - fractionif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num2 * den - num * den2; n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedureself* r2; -- multiplication routine for fractions, and for fraction * integerif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num * num2; -- we simply multiply the numerators and the denominators separately n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedurer2 *self; -- multiplication routine covering the case integer * fractionif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := num * num2; -- we simply multiply the numerators and the denominators separately n2 := den * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedureself/ r2; -- division routineif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := den2 * num; -- we simply invert and then multiply the numerators and the denominators separately n2 := num2 * den; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedurer2 /self; -- division routine covering the case integer / fractionif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if; n1 := den * num2; -- we simply invert and then multiply the numerators and the denominators separately n2 := num * den2; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedureself** n; -- integer power routine n1 := num ** n; -- we simply take corrresponding powers of the numerator and the denominator n2 := den ** n; new_rational := rational(n1); new_rational.set_denom(n2);returnnew_rational.simplify();end;procedureself< r2; -- rational comparisonif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if;return(num * den2 - num2 * den) < 0;end;procedurer2 <self; -- rational comparison covering the case integer <= fractionif is_integer(r2)thennum2 := r2; den2 := 1;elsenum2 := r2.num; den2 := r2.den;end if;return(num2 * den - num * den2) < 0;end;procedure ceil(); -- ceiling of rational quot := num/den;return if(rem := num - quot * den) <= 0thenquotelsequot + 1end if;end ceil;procedure floor(); -- floor of rational quot := num/den;return if(rem := num - quot * den) >= 0thenquotelsequot - 1end if;end floor;proceduresimplify(); -- reduces a fraction to lowest termsifden = 0thenabort("RATIONAL ARITHMETIC ERROR: division by zero " +str(num) + "/" +str(den));end if; comden := gcd(num,den); -- find common denominatorif(den := den/comden) < 0thennum := -num/comden; den := -den;elsenum := num/comden;endif;returnself;endsimplify;proceduregcd(n1,n2); -- creates common denominator by Euclid's algorithm m1 :=abs(n1); m2 :=abs(n2);ifm1 < m2then[m1,m2] := [m2,m1];end if;whilem2 /= 0loopm1 := m1modm2; [m1,m2] := [m2,m1];end loop;returnm1;endgcd;endrational;

programtest; -- test of rational classuserational; two_thirds := rational(2)/3;endtest;

The fraction two_thirds ** 1000 turns out to be

10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468
25187145285692314043598457757469857480393456777482423098542107460506237114187795418215304647498358194126739876755916554394
6077062914571196477686542167660429831652624386837205668069376
/

13220708194808066368904552597521443659654220327521481676649203682268285973467048995407783138506080619639097776968725823559
50954582100618911865342725257953674027620225198320803878014774228964841274390400117588618041128947815623094438061566173054
08667449050617812548034440554705439703889581746536825491613622083026856377858229022841639830788789691855640408489893760937
3242171846359938695516765018940588109060426089671438864102814350385648747165832010614366132173102768902855220001

a fact you may wish to remember.

Object-oriented programming facilities like those of SETL embody a deep complex of ideas, which like all deep ideas present us with many possibilities, all deserving careful consideration. In the first place, these facilities invite us to view the first stages of any programming project, particularly large and challenging projects, as a matter of designing one or more families of objects, and of operations which allow these objects to interact with and act upon each other, often in ways which generate new objects of the same type. This approach is seen at its most successful when we use mathematical objects, like vectors, matrices, polynomials, trees, graphs etc., whose definitions and associated operations have been polished by years of careful study and use.

Multimedia application developments can also make good use of objects which encapsulate fragments of behavior useful for the construction of larger objects and of full interactive projects. (This is the way in which the SETL interactive environment described in Chapter 10 has been organized.) Objects useful in such settings include graphics which change their appearance or execute associated code fragments when approached or clicked, sliders and knobs which serve as sources of numbers, dials and bars which can display these numbers in various useful forms, collections of small dots which can be dragged to adjust the positions of larger and more complex objects, etc. This idea then extends itself naturally to the construction of second-level objects designed to ease management of large collections of first-level objects: folders in which other objects can be stored, windows in which they can be viewed in various ways, and so forth. In an object-oriented one first specifies the nature and behavior of the classses of objects (often called 'controls', 'widgets', or even 'mega-widgets') that will be used to build an interactive appication, then realizes them using the facilities of the language in which one is working, and finally uses the resulting library ofobjects to build the full application desired. This level-by-level approach decomposes complex design problems in useful ways and can often help one to elaborate nicely modular, logically compelling designs. Several examples illustrating this idea appear in Chapter 10.

Large programs often develop by progressive transformation and elaboration of simpler initial versions, and here also SETL's object facilities provide us with very useful tools. Suppose, for example, that we aim to develop some kind of large database, which eventually will have to serve as repository for many millions of records. In a tiny, very first prototype, this may simply consist of a family of SETL mappings which collectively store whatever information the database is to hold, for some test collection of a few hundred records. These initial mappings can be saved to disk using the standard binary read/write pair **putb** and **getb**, giving us an immediately functional mini-database in which all database capabilities not involving size or efficiency considerations can be designed and implemented. Then, at such time in development at which it becomes appropriate, we can use SETL's object facilities in the following way to scale up the database:

- Introduce one or more classes of objects which have the same external interfaces as the SETL mappings originally used to create the database. That is, objects of these classes must support all the operations f(x), f(x) := y; f with x, f + g, etc. applied to the corresponding map objects in the original database prototype, and these operations must behave in the same way. That is, after an operation f(x) := y the expression f(x) must yield the value y, etc. This is to say that our new objects must behave like maps in every way visible to our original database prototype. However, their internal structure and implememtation may be entirely different, and they may support additional operations allowing important extensions of the prototype database functionality.
- The design of the new objects should ensure that the reqired level of efficiency will be attained when the database is scaled up. For example, the map-like objects introduced may be implemented in two parts, a main part held on disk (to allow scaleup to very large size) and a smaller dynamically varying part, managed automatically in response to the operations applied to the map objects comprising the database. The part of the database held on disk can consist partly of various auxiliary indices which serve to locate objects rapidly when these are called for. The structures used may also be designed to maintain key parts of their history of change (for debugging or for post-auditing), to support efficient reversion (making it possible to back out of partially completed transactions), to guarantee reconstructabliity after system crash, etc. Parts of the database can be left resident on some 'home' computer, while other parts are moved to remote servers accessed via Internet sockets. All of this internal reworking can leave the original map-like external syntax and semantics of the evolving database objects unchanged, except that new operations, supporting new capabilities, can be added. This will ensure that
- The original database prototype always remains functional as it is scaled up.
- new versions of the database can be checked by comparing their contents with the contents of prior versions, after equivalent manipulation sequences.

- To attain still higher efficiency, particularly critical operations can be translated into native C code, implemented by one or more
**native package**s of the kind described extensively in Chapter 9, which the object classes used encapsulate in interfaces identical to those of the less efficent SETL operations which they replace.

The way in which objects can use overlaoded syntax serves to reveal and emphasize *analogies* among them which can ease both their design and their use. A section of code will act in the intended way whenever the objects appearing in it support the operations applied in the code, provided that these operations satisfy the logical assumptions concerning them implicit in the code. The object-oriented programming approach exploits this fact by eliminating unnecessary syntactic variation, thereby allowing direct code re-use in situations where extensive rewriting might otherwise be necessary. At an even more primitive level the appearance of an operatior sign like '+' hints that the operator being applied is some sort of constructive 'union' which might be associative (i.e. satisfy (a + b) + c = a + (b +c), and so allow extended 'sums' to be formed recursively), the appearance of the sign '*' hints that some notion of commonality or repetition may be involved, the appearance of y := f(x) hints that an operation f(x):=y may also exist, etc.

Operator overloading also helps us avoid unnecessary syntactic proliferation, which if unchecked can rapidly generate large, hard-to-remember programming vocabularies. Suppose, for example, that we will be working with some class of graphic objects whose appearance is determined by a long list of graphical attributes, e.g. color, size, position, orientation, size and orientation of various subfeatures, etc. It will sometimes be convenient think of these objects as a kind of extended map from attribute names to attribute values, with values which can be set by assignments having forms like

but which change the graphical appearance of the objects to which they are applied. Writing things in this way cuts our programming vocabulary in half and makes it easier to remember, since we can write

instead of something like

which is distinctly less predictable, since it might for example be

or even

A final advantage of the object-oriented prorammming approach is that it safely packages groups of related updates whose seperation would allow inadvertent, bug-generating omisssions or failures of correspondence. Suppose, for example,that we are working with a collection of tuple-like objects but using them as serial files, from which successive objects are read by an operation that might be written as

Instead of packaging these as objects one might choose to work directly with their underlying representation, which might, e.g, consist of a tuple t and a tuple index ix, thus simply converting

which seems safe enough. But if our code is working with several such objects f_1, f_2,... and occasionally needs to exchange them, the very direct

of a piously object-orienbteed version becomes

which is distictly harder to get straight since

- it requires us to remember and correctly type four names instead of two.
- the human psychological tendency to 'take the part for the whole', a common source of bugs, makes it easy to omit the second assignment [ix_1,ix_2] := [ix_2,ix_1];

If this ever happens it may introduce an elusive bug ,whose initial symptom may be the improperly repeated appearance or the disapperance of 'file' elements, but only after the 'files' have been exchanged. If this bug appears as part of a complex code, it can easily take several hours to track it down. In the object-oriented form of the same code this bug could never have occurred at all.

Object classes OC built using the SETL facilities can either **use** other classes or **inherit** from them. **use** is most appropriate when objects of type OC are built from other kinds of objects employed as parts, as, for example, a table might be built from four legs and a top. Inheritance is most appropriate when objects of one type constitute a kind of objects of some other type, possibly with some added or revised features. For example, a folding table is a kind of table, while a house-trailer is a kind of house and a kind of truck (in cases like this, inheritance from multiple ancestral classes may be appropriate.)

The inheritance relationship among classes resembles the **use** relationship among packages, but is stronger. When a package P **use**s a package P', only those procedures and variables of P' which have been made public by being declared in the package specifier of P' (rather than in its body) are visible within P. But when a class C' is **inherit**ed, by a class C, *all* its instance variables and methods become visible in C, and beyond this become a *part of* C, rather as if the C' were textually inserted into C. (The difference between inheritance and textual inclusion is that multiply defined inherited methods are hidden rather than generating errors. Moreover such hidden methods can be accessed by prefixing the name of class from which they originate to the method name.)

The syntax of an **inherit** clause is:

(diagram here)

Inherit clauses referencing a class C' are placed in the class specification of the classes C which are to **inherit** from C', after the header line of C and before any other declarations. Each of the identifiers appearing in an **inherit** clause must identify some class already available in the SETL library. All variables and methods of each superclass C' from which C **inherit**s will be brought into C, unless this is prevented by thee redefinition of someone of these methods within the body of C, or by conflicts among names of variables and/or methods belonging to the classes from which C **inherit**s. The rules for resolving conflicts are as folllows:

- No variable names may be redefined.
- Method names follow similar rules to packages: A local definition overrides an inherited one. If there are conflicts among inherited names, all are hidden. Hidden names are accessible with the syntax: superclass.method-name. Hidden names deriving from names in superclasses are only accessible in this way within the bodies of classes inheriting from them.

Inheritance also makes names available transitively, whic **use** does not. Suppose for example that a, b, and c are classes, and that a **use**s b and b **use**s c. Then nothing from c is visible in a (unless of course a also uses c). Things are different with inheritance, which is similar to textual insertion and in particular is recursive. If a **inherit**s b and b **inherit**s c then everything in c not blocked by a redefinition in a or b is visible in a. Even c's otherwise hidden methods in c are accessible in a, albeit by use of the expression c.method-name.

**An example**. The following small example illustrates the use of inheritance. It reflects that fact that for many kinds of algebraic objects for which addition and subtraction operations are both defined. and for which there is a 'zero' object (fractions, matrices, and polynomials are all examples), subtraction can be expressed in terms of addition and the *monadic* minus operation by use of the familiar identity

Also addition is commutative and thus satisfies the identity

The code shown below introduces the (relatively abstract) class of all objects for which these identities are avalable, under the name 'algebraic'. For objects of this class, it defines a - b as a + (-b) (which assumes that a summation operator and a mondic minus operator will be made available), and also defines addition with an algebraic object as second rather than first argument by using the identity a + b = b + a to revese the argument order. Then by declaring the **class** int_obj to **inherit** algebraic, these same extensions become available for int_objs. Likeise by declaring the **class** matrix to **inherit** algebraic, we make these extensions available for matrices

The first test class of objects to which this trick is applied are simply the integers, wrapped as objects. This is of course a useless family of objects, which we introduce simply to illustrate the techniques involved.

The following mini-program can be used to test the preceding code.classalgebraic; -- 'algebraic object' - a relatively 'abstract'class-- no objects of thisclasswill ever be created directly, so no 'create' routine is neededendalgebraic;class bodyalgebraic;procedureself- b;returnself+ (-b);end; -- this assumes that an addition routine is avalableprocedureb +self;returnself+ b;end; -- this assumes that an addition routine is availableendalgebraic;classint_obj; -- integers, as a sampleclassof algebraic objectinheritalgebraic; -- we inherit from the more general 'algebraic'classprocedurecreate(n);endint_obj;class bodyint_obj; -- integers, as a sampleclassof algebraic objectvarval;procedurecreate(n); val := n;endcreate;procedureselfstr;return str(val);endselfstr;procedureself+ b;returnint_obj(val +if is_integer(b)thenbelseb.valend if);end;procedure-self;returnint_obj(-val);end;endint_obj;classmatrix; -- matrices, as a second sampleclassof algebraic objectinheritalgebraic; -- we inherit from the more general 'algebraic'classprocedurecreate(m);endmatrix;class bodymatrix; -- matrices, as a second sampleclassof algebraic objectvarval;procedurecreate(m); val := m;endcreate;procedureselfstr;return"" +/ ["|" +/ [pad_8(c): cinrow] + "|\n": rowinval];endselfstr;procedurepad_8(c); stg := 8 * " " +str(c);returnstg(#stg - 7..);endpad_8; -- pad value to 8 charactersprocedureself+ b;returnmatrix([[c + b.val(i)(j): c = row(j)]: row = val(i)]);end; -- matrix additionprocedure-self;returnmatrix([[-c: c = row(j)]: row = val(i)]);end; -- negative matrixendmatrix;

It produces the outputprogramtest;useint_obj,matrix;endtest;

363 -303 | 1 2| | 3 4| | -1 -2| | -3 -4| | 2 4| | 6 8| | 1 2| | 3 4|

**'Virtual' Methods**. Another, almost equivalent but often preferred way of writing the 'algebraic' class that appears in the preceding example is

classalgebraic; -- 'algebraic object' - a relatively 'abstract'class-- no objects of thisclasswill ever be created directlyendalgebraic;class bodyalgebraic;procedureself- b;returnself+ (-b);end; -- this assumes that an addition routine is availableprocedureb +self;returnself+ b;end; -- this assumes that an addition routine is availableprocedureself+ b;end; -- this empty method is 'virtual'procedure-self;end; -- this empty method is 'virtual'endalgebraic;

Here we have simply made explicit that fact that classes which **inherit** from 'algebraic' are expcted to provide the methods **self** + b and -**self**. This is done by including the (otherwise useless) empty **procedure**s **self** + b and -**self** in the body of the **class** algebraic. Empty (or simple default) procedures included in this way merely to announce that they will subsequently be over-written are often called *virtual* methods. They give the notion of inheritance what might be called a 'Japanese' as contrasted with the more familiar 'American' flavor: in the U.S., you can inherit your parent's wealth, but are not required to assume your parent's debts; in Japan you can traditionally inherit parental debts which you are required to pay. Virtual methods are like debts which every class inheriting from a given superclass is required to pay.

Next we give a third version of this same example, which is contrived to maximize separation of the several strands of thought and technique which enter into into it:

- The 'algebraic'
**class**is expanded, to provide the variable 'val' storing an object's value, and it also assumes that each of its subobjects will initialize four additional procedure-valued variables, called 'sum', 'invert', 'self_str', and 'maker'. It uses these to define all the operations a + b, -a, and a - b that it provides and to convert objects to strings. This class assumes that 'sum', 'invert', and 'maker' will be made available to all the operations on values that it requires, but makes no assumption converning the actual form of these objects. 'algebraic' is therefore a*pure syntactic wrapper class*. - Versions of the 'sum', 'invert', and 'self_str' routines appropriate for integer and for matrix objects are supplied by two
**package**s called 'intops' and 'matrix_ops'. These routines, which do the innermost algebraic work for integer and matrix objects respectively, use the internal representations of these objects (which they define), and so do not reflect the fact that they will be called in an object-oriented setting. The are*pure semantic packages*, solely responsible for defining the form of the objects being manipulated and the way in which operations on them are implemented. - The two small classes 'int_obj' and 'matrix', each of which is just a few lines long, have sterotyped forms and so could be generated automatically. They serve to link their class names to the packages which do their semantic work and, for the rest,
**inherit**all their operations from the 'algebraic'**class**.

This third version of our code therefore breaks the programming needed to introduce the desired operations into three cleanly separated parts: (a) a semantic wrapper class, (b) packages of procedures to do the real work, and (c) a few auxiliary mini-classes to link (a) to (b).

classalgebraic; -- 'algebraic object' - a relatively 'abstract'class, encapsulating syntax -- no objects of thisclasswill ever be created directlyendalgebraic;class bodyalgebraic;varval,sum,invert,maker,self_str; -- must be avaialableprocedureself- b;returnmaker(sum(val,invert(if is_integer(b)thenbelseb.valend if)));end; -- differenceprocedureb +self;returnmaker(sum(if is_integer(b)thenbelseb.valend if,val));end; -- suminalternate orderprocedureself+ b;returnmaker(sum(val,if is_integer(b)thenbelseb.valend if));end; -- sumprocedure-self;returnmaker(invert(val));end; -- mondic minusprocedureselfstr;returnself_str(val);endselfstr; -- print formendalgebraic;packageintops; -- integer operations semanticpackage; does the innermost algebraic workproceduresum_op(val,b); -- additionprocedureinvert_op(val); -- unary minusprocedureselfstr_op(val); -- string conversionendintops;package bodyintops; -- integer operations semanticpackage; does the innermost algebraic workproceduresum_op(val,b);returnval + b;endsum_op;procedureinvert_op(val);return-val;endinvert_op;procedureselfstr_op(val);returnstr(val);endselfstr_op;endintops;packagematrix_ops; -- matrix operations semanticpackage; does the innermost algebraic workproceduresum_op(val,b); -- additionprocedureinvert_op(val); -- unary minusprocedureselfstr_op(val); -- string conversionendmatrix_ops;package bodymatrix_ops; -- matrix operations semanticpackage; does the innermost algebraic workproceduresum_op(val,b);return[[c + b(i)(j): c = row(j)]: row = val(i)];endsum_op;procedureinvert_op(val);return[[-c: c = row(j)]: row = val(i)];end invert_op;procedureselfstr_op(val);return"" +/ ["|" +/ [pad_8(c): cinrow] + "|\n": rowinval];endselfstr_op;procedurepad_8(c); stg := 8 * " " +str(c);returnstg(#stg - 7..);endpad_8; -- pad value to 8 charactersendmatrix_ops;classint_obj; -- integers, as a sampleclassof algebraic objectinheritalgebraic; -- we inherit from the more general 'algebraic'classprocedurecreate(n);endint_obj;class bodyint_obj; -- wrapperclass, combining syntax with semanticsuseintops; --usepackage of integer operationsprocedurecreate(n); val := n; sum := sum_op; invert := invert_op; maker := make_op; self_str := selfstr_op;endcreate;proceduremake_op(n);returnint_obj(n);endmake_op;endint_obj;classmatrix; -- matricess, as a second sampleclassof algebraic objectinheritalgebraic; -- we inherit from the more general 'algebraic'classprocedurecreate(m);endmatrix;class bodymatrix;usematrix_ops; --usepackage of matrix operationsprocedurecreate(m); val := m; sum := sum_op; invert := invert_op; maker := make_op; self_str := selfstr_op;endcreate;proceduremake_op(m);returnmatrix(m);endmake_op;endmatrix;

Inheritance is of course most interesting when wealth rather than debt can be inherited. In the SETL context, this corresponds to situations in which a superclass can provide methods which some **inherit**ing subclass can use without change. Unfortunately this happy situation is relatively rare. Ordinarily when we change an object's internal representation, the code for the operations applied to it will need to undergo at least some small modifications, and so cannot simply be inerited. Even in the 'house-trailer'-like case in which it is a natural to think of a class as a composite of two pre-existing classes ('house' and 'truck'), the operations applying to either aspect of the composite will generally need to refect the presence of its other portion in some way, making **use**, which allows this, more appropriate than inheritance, which does not. This is all the more true when an object multiple copies of a pre-existing object must enter into a composite object, as in a table with four legs.

The kind of situtation which the preceding example illustrates typifies cases in which SETL finds inheritance most valuable. These are cases in which higher level operations can be defined in some unchanging functional manner from constants and procedures which a subclass must supply, just as each of the subclasses 'matrix' and 'int_obj' supplies the sum, invert, maker, and self_str operations used by 'algebraic' to define its infix and prefix algebraic operations. Where even this level of invariance does not hold, method codes need to be rewitten in any case, so inheritance is less useful. However, it is possible (though a bit unusual) for a class C to both **use** and **inherit** the same class C'; this can be done to give C direct access to the interrnals of C', rather than requiring C to access these by using the methods of C'.

It is worth noting, however, that the mechanism of inheritance is considerably more important in certain languages other than SETL, for example C++ and Java. This is because these are *statically typed*languages, rather than being *untyped* or *dynamically typed*, as SETL is. This allows code written in these languages to be compiled more agressively into machine-like forms than is possible for SETL, which needs to check the type of each object with which it deals to determine what the object is, how to apply a specified operation to the object, and whether the operation is applicable at all. Statically typed languages aim to make it syntactically impossible for operations to be applied to objects which do not support them. Thus, in these languages, if we want to allow some of the operations defined for one kind of object to be applied to objects of some loosely related kind, it may be easiest to have them both **inherit** from some abstract class which supports all the methods of interest for either kind of object, at least nominally. This encourages the introduction of abstract classes C, full of empty virtual methods meant simply to be redefined in every subclass of C which really needs to use these methods. SETL, which does without the syntactic restrictions which this kind of artifice evades, has less use for inheritance. (But such use of inheritance may have documentation value even for SETL.)

The processing pattern seen in the 'merge' subroutine of 'merge_sort', namely coordinated iteration over two or more sequences of objects sorted into corresponding order, is common enough for an object class which encapsulates it to be of some interest. The code below realizes such an object class, which the code names 'coord_iter'. Objects of this class are created from a tuple of other objects, each of which must at least support a notion of iteration over its elements by presenting two methods, essentially the 'iterator_start' and 'iterator_next' methods described in Section 8.6.6, for external use. However, since the names 'iterator_start' and 'iterator_next' are treated in a special way by the SETL system, the following code assumes that these methods are made available underthe names 'istart' and 'inext'. To adapt a pre-existing class to this requirement, one merely needs to add the following two lines to its body, and the two corresponding procedure declarations to its class header.

procedureistart; iterator_start;endistart;procedureinext;returniterator_next();endinext;

Objects of type 'coord_iter' are created by calls

And then used simply by writing iterators over them in the ordinary SETL form, e.g.

However, such an iteration returns, not merely the values x selected (in increasing order) from the various obj_tup(j) involved in the coordinated iteration, but pairs [j,x] containing both the value x and the index j of the source object from which x was obtained. If only the values x are wanted, the first components j of these pairs must be dropped.

The first input 'obj_tup' of a 'coord_iter' creation call is the tuple of objects to be involved in the coordinated iteration, as explained above. The second input 'key_tup' is a tuple of mappings, where for each j key_tup(j) is assumed to send the successive elements of obj_tup(j) into values directly comparable by the SETL comparison operator '<='. Any component of key_tup can be replaced by **OM**, in which case the identiy map will be used in its place. We also allow any of the components obj_tup(j) to be a tuple rather than an object, in which case the iterator for it is assumed to be standard tuple iteration.

The code seen below manages coordinated iteration by maintaining a tuple 'current_vals' whose j-th component current_vals(j) is the most recent value obtained by applying an iteration step to the object obj_tup(j). In each cycle of coordinated iteration, the minimum component of current_vals is selected for return, and replaced by the next item drawn from obj_tup(j). The iteration-step operators neeed for this are held in an auxiliary tuple 'next_fcns'; when iteration over obj_tup(j) reaches its end, next_fcns(j) is set to **OM**, causing current_vals(j) to drop out of contention for the minimum to be returned. If obj_tup(j) is a tuple rather than some other kind of object, next_fcns(j) stores an iteration index, rather than an iteration-step function.

classuseiters; -- iteration testprocedureistart; -- wrapper for iterator_startprocedureinext; -- wrapper for iterator_nextenduseiters;class bodyuseiters; -- iteration testprocedureistart;endistart; -- wrapper for iterator_startprocedureinext;endinext; -- wrapper for iterator_nextenduseiters;classcoord_iter; -- coordinated iteration classprocedurecreate(obj_tup,key_tup); -- creator. parameters are list of objects and list of comparable keysendcoord_iter;class bodycoord_iter; -- coordinated iteration classuseuseiters;varobjs,keymaps,current_vals,next_fcns; -- next_fcns(j) is set to OM when objs(j) is exhauastedprocedurecreate(obj_tup,key_tup); -- creator. parameters are list of objects and list of comparable keys objs := obj_tup; keymaps := key_tup?[];forjin[1..lot := #obj_tup]loopkeymaps(j) ?:= ident;end loop; current_vals := [newat(): jin[1..lot]]; next_fcns := [newat(): jin[1..lot]];endcreate;procedureident(x);returnx;endident; -- identity mappingprocedureiterator_start; -- begin iterationforobj = objs(j)loopif is_tuple(obj)then-- object is a tuple; start with first component. ^next_fcns(j) is iteration index ^(next_fcns(j)) :=if#obj = 0thenOMelse1end if; ^(current_vals(j)) := obj(1);else-- object is not a tuple; start with first obj.istart(); ^(next_fcns(j)) :=if(val := (oin := obj.inext)()) = OMthenOMelseoinend if; ^(current_vals(j)) :=ifval = OMthenOMelseval(1)end if;end if;end loop;enditerator_start;procedureiterator_next; -- iteration cycle. Returns smallest item among non_exhausted, in pair ix_smallest := smallest_val := OM; was :=false;forobj = objs(j) | ^(next_fcns(j)) /= OMloop-- search for element to return was :=true;ifix_smallest = OMthen-- first element smallest_key := keymaps(j)(smallest_val := ^(current_vals(ix_smallest := j)));elseif(possib_newkey := keymaps(j)(maybetter := ^(current_vals(j)))) < smallest_keythensmallest_key := possib_newkey; smallest_val := maybetter; ix_smallest := j;end if;end loop;if notwasthen returnOM;end if; -- iteration is exhaustedif is_integer(nf := ^(next_fcns(ix_smallest)))then-- we are dealing with a tupleif(newv := objs(ix_smallest)(nf +:= 1)) = OMthen-- object is exhausted ^(next_fcns(ix_smallest)) := OM; -- flag object as exhaustedelse^(current_vals(ix_smallest)) := newv; -- post the new current value ^(next_fcns(ix_smallest)) := nf;end if;else-- we are dealing with an object other than a tupleif(newv := nf()) = OMthen-- object is exhausted ^(next_fcns(ix_smallest)) := OM; -- flag object as exhaustedelse^(current_vals(ix_smallest)) := newv(1); -- post the new current valueend if;end if;return[[ix_smallest,smallest_val]];enditerator_next;endcoord_iter;

The following small test program can be used to examine the use of the preceding class.

programtest; -- test of coordinated iterationusecoord_iter; cit := coord_iter([[3..12],[2..5],[1..10]],[]);incit]); -- drop first componentsendtest;

It produces the output

Where applicable and not too baroque, expressions are often more succinct and less bug-prone than the more primitively explicit bits of procedural ccode which could be used to produce their values. For example,

is better than

and

SETL provides severaal constructs which exploit these advanteges, notably the set- and tuple-formers

and compound operators like '+/t' and '**max**/t'. Note that the 'e(x)' clause in a set- or tuple-former gives succinct and general expression to the notion of element-by-element transformation f a composite, while the 'P(x)' clause handles the notion of 'filtering' equally effectively. Nevertheless, there are no end of other loop forms, code fragments, and situations where the same kind of transformation to expression form will occasionally be desirable. The code which follows shows how easily object classes can be used to realize thiskindoftransformation.

We present two object classes, called 'all' and 'each'. 'all' simply duplicates SETL's built-in compound operator by allowing expressions like

Similarly 'each' allows expressions like

For efficiency, both these classes inherit from an abstract class 'oploop' which assumes that a virtual method 'do_the_loop' will be supplied and passes operators like '+', "*', '**' to 'do_the_loop' for use in whatever loop form it realizes. Then 'all' simply supples the loop form

whereas 'each' supples the loop form

classoploop; -- uses operation loop to apply infix operator to composite -- this class has no methodsendoploop;class bodyoploop; -- uses operation loop to apply infix operator to compositevarval_1;vardo_the_loop;procedureself + tup; -- addition op :=lambda(x,y);returnx + y;end lambda;returndo_the_loop(op,tup);end;procedureself * tup; -- product op :=lambda(x,y);returnx * y;end lambda;returndo_the_loop(op,tup);end;procedureself - tup; -- diff - y;end lambda;returndo_the_loop(op,tup);end;procedureself / tup; -- division op :=lambda(x,y);returnx / y;end lambda;returndo_the_loop(op,tup);end;procedureself ** tup; -- exponentiation op :=lambda(x,y);returnx ** y;end lambda;returndo_the_loop(op,tup);end;procedureselfmaxtup; -- max op :=lambda(x,y);returnxmaxy;end lambda;returndo_the_loop(op,tup);end;procedureselfmintup; -- min op :=lambda(x,y);returnxminy;end lambda;returndo_the_loop(op,tup);end;endoploop;classall; -- functional transformation: compound operator inherit oploop;procedurecreate(init_val); -- creator.endall;class bodyall; -- functional transformation: compound operatorprocedurecreate(init_val); -- creator. val_1 := init_val; do_the_loop := doit; -- save initial valueendcreate;proceduredoit(op,tupset); -- operation loop x := val_1;foryintupsetloopx := op(x,y);end loop;returnx;enddoit;endall;classeach; -- functional transformation: compound operator inherit oploop;procedurecreate(init_val); -- creator.endeach;class bodyeach; -- functional transformation: compound operatorprocedurecreate(init_val); -- creator. val_1 :=if not is_tuple(init_val)then[init_val]elseinit_valend if; -- save initial value do_the_loop := doit;endcreate;proceduredoit(op,tup); -- operation loop last := val_1(#val_1);return[op(val_1(j)?last,x): x = tup(j)];enddoit;endeach;

The preceding class is exercised by the test program

programtest; -- test of 'all' and 'each'useall,each;endtest;

which produces the output

6 [4, 5, 6] [2, 4, 5] 24 [3, 6, 9] [1, 4, 6, 8, 10, 12, 14, 16, 18, 20]

Undebugged recursions in a program, like faulty while loops in it, may fail to terminate. An example is the following defective reccursive version of *factorial*, which fails to handle the base case for the recursion and so will recurse until memory is exhausted.

programtest; -- test of rec_limit class factorial_bad(10);procedurefactorial_bad(n); -- erroneous version of factorial procedurereturnn * factorial_bad(n - 1) ; -- no mention of base caseendfactorial_bad;endtest;

In this section we present an object class, analogous tho the 'bugoff' package described 7.1.9, which can be used to detect and control faulty recursions of this type. The class is most easily used by assigning a newly assinged instance ofit to a global variable before the first call to the recursive procedure to be controlled, and by wrapping this procedure (under a modified name) into a small front-end alias rotine which has the same name and the form illustrated by

procedurefactorial_bad(n); -- alias for erroneous version of factorial procedure -- 'rec_limiter' is an instance of the 'rec_limit' class junk := rec_limiter{OM};returnrec_limiter(factorial_bad_original(n)); -- factorial_bad_original is the procedure being debugged; is name must be changed to factorial_bad_original, -- but noting else in it should be changed at allendfactorial_bad;

Accordingly the test program seen above takes on the slightly modified form

programtest; -- test of rec_limit classuserec_limit; var rec_limiter; rec_limiter := rec_limit("factorial",200); -- set up recursion-level control factorial_bad(10);procedurefactorial_bad(n); -- alias for erroneous version of factorial procedure junk := rec_limiter{OM};returnrec_limiter(factorial_bad_original(n)); -- note procedure entryendfactorial_bad;procedurefactorial_bad_orig(n); -- erroneous version of factorial procedurereturnn * factorial_bad(n - 1) ; -- note procedure returnendfactorial_bad_orig;endtest;

The creation routine for a rec_limit object has the form rec_limit(msg,n), where 'msg' designates a string message that should identify the recursion uniquely, and 'n' should be an integer defining the maximum alloweddepth of recursion. If this is exceeded a **stop** will be forced, and 'msg' will be incorporated in the resulting error string.

rec_limit objects support just two methods, obj{x} and obj(x), which should be used as in the above example, obj{x} to signal recursive entry and obj(x) to capture the corresponding returns. The class code is simply as follows.

classrec_limit; -- class for imposing limit on depth of recursionprocedurecreate(msg,n); -- error caption and allowed recursion depthendrec_limit;class bodyrec_limit; -- class for imposing limit on depth of recursion var the_msg,count,count_ptr;procedurecreate(msg,n); -- error caption and allowed recursion depth the_msg := msg; count := n; ^count_ptr := n; -- retain message and counter value (which is forced to pointer semantics)endcreate;procedureself{x}; -- error caption and allowed recursion depthif(^count_ptr := ^count_ptr - 1) < 1thenstr(count) + " exceeded: " +str(the_msg));stop;end if;end;procedureself(x); -- error caption and allowed recursion depth ^count_ptr := ^count_ptr + 1;returnx;end;endrec_limit;

Several important technical issues arise in connection wih codes which need to work intensively with tuples or strings into which insertions and deletions need to be made in random positions. How are these abstract tuples or strings best represented? If reads and simple assignments predominate, the most efficient way of storing the string or tuple will be its 'flat' form, since this allows characters or components to be accessed by high-speed machine level indexing operations. However, sometimes insertions at random positions in such a string or tuple must be handled, as for example in the case of a tuple which represents a constantly changing database kept in alphabetical order of particular key-fields within it. In such cases the 'flat' tuple form suffers from the fact that repeated insertions may force large sections of the tuple or string (half the total, on average) to be moved repeatedly.

In such situations other, more insertion-friendly tuple representations suggest themselves. On can, for example, divide the original abstract tuple or string into roughly equal sections, for example 100 such sections, and then use the list of such sections (each of which is itself a subtuple) to represent the original tuple. This slows component acess only slightly, by turning one indexing operation into two: we must first access the section to which a desired component belongs, and then access the component within this section. However,the 'tuple of tuples' representation we contemplate greatly speeds up insertions, since for example, to insert a new component into the middle of the original tuple we may only need to move the following components of the subsection in which the insertion occurs, and this may be as little as 1% of the length of the original tuple.

Such a tuple-of-tuple representations has other advantages in connection with the copy operations which need to be applied when strings or tuples are being maintained in a SETL like value-semantics environment. Suppose, for example, that we need to manipulate a very long string, in which we want to change just one character while at the same time keeping a copy of the unchanged string. Keeping the string in a perfectly 'flat' representation may force us to copy the entire string before changing it. But if a sectional, tuple of strings representation is used instead, we will only need to copy the string section actually being changed, along with the top-level tuple which holds the succession of sections together. All the unchanged string sections can simply be shared between the original and the changed string. Here again we get a roughly 100-fold speedup, since the data elements which need to be copied should be only 1% of the size of the full string representation.

But if a two-level, tuple-of-subsections representation has these advantages, we can expect a three-level, tuple-of-tuple-of-subsections scheme to be even more effective. Pushing this idea to its limit suggests that the ultimately desired representation is a recursive tree-like structure of tuples-of-tuples-of-..., whose number of levels grows (logarithmically) with the size of the abstract tuple or string which it represents. This idea, which is that of the *B-tree*, one of the most famous and important of all non-elementary data structures, is in fact very sucessful. The code which follows in this section shows how this data structure can be used to represent all tuple operations, including component and slice retrieval, assignment, and insertion, tuple concatenation, and iteration over tuples. We give a further set of codes which uses a similar B-tree structure to represent large strings. Note also that the same kind of data structure is fundamental to the general SETL database design explored in Chapter 12.

B-trees can be thought of either as trees or as recursively tuples of objects which are themselves B-trees. The code which follows emphasises the second point of view. It defines a B-tree in the follwing way:

- Each btup BT stores a tuple t in an internal 'instance variable' of the same name. BT also stores an integer height value h. If h = 1, our btup BT is pretty much a simple tuple, and its components can be arbitrary SETL objects. If h > 1 every component of the tuple t must itself be a btup object, of height h - 1. In what followsthe component btups of the tuple t will be calledthe
*children*of the btup BT. - The length of the tuples found at every level of a B-tree (except possibly at its topmost level) lies between a specified lower limit 'minw' and upper limit 'maxw'. As a technical convenience we take maxw = 2 * minw - 1. (In the code which follows 'minw' is arbitrarily set to 5, so that the branching factor seen at every node of our trees lies beween 5 and 9. This means that a tree representing an abstract tuple of length 1,000,000 will have height about 8. When B-trees are used in a more realistic way to mangage databases, the minimum branching factor would have a value (proportioned to the size of the physical records of the disk system storing the database) more like 100, so that a tree representing an abstract tuple of length 1,000,000 would be of height no more than 3.)
- Each B-tree object also stores an auxiliary 'cumulant' tuple 'cum' of integers. 'cum' has the same length as the tuple t of children stored in the btup BT, and the j-th component cum(j) of 'cum' gives the number of tree 'leaves' that have t(j) and nodes to its left as ancestor. That is, cum(j) is the length of the tuple 'tup' represented by the B-tree section t(1..j). The availabilty of these cumulants speeds up search for a given component tup(k) of 'tup'.

The conventions listed above suggest the main outlines of the B-tree code which follows. The 'btup' objects representing our B-trees store the instance variables h, tup, and cum described above. Two additional instance variables, 'iter_ptr' and 'compno', serve to speed up iterations over B-trees; these variables are given pointer semantics. The parameterless btup 'create' routine forms an empty B-tree, and the auxilary set(t) method changes this to a B-tree representing the parameter tuple t. (The set(t) method is recursive; it constructs the btups corresponding to small tuples diirectly, but builds the btup's representing larger abstarct tuples by recursively concatentating btup's representing half-size tuples.) An auxiliary **range** operator, which converts any btup object into the standard SETL tuple it represents, is provided to aid in debugging the btup code itself. A second routine 'check_consistency', which applies various consistency checks to btup objects, is also provided for debugging.

The component access and modification routines **self**(i) and **self**(i) := x; use the 'cum' data available in each btup object to search recursively for the component addressed. The **self**(i) := x; operator must also update all cumulants along the sequence of btup objects which it traverses. The length operator #**self** is elementary, since it merely needs to return the final cumulant value cum(#cum).

The btup concatenation operation x + y is fundamental to all the remaining btup routines. This begins by converting both its arguments to btups if they are not already btups. It the two arguments x andy are then of equal height h, their internal tuples and cumulant vectors are simply concatenated. This gives the desired result unless the resulting concatentated tuple exceeeds the allowed maximum length 'maxw', in which case the auxiliary routine 'ifsplit' divides the resulting btup into two roughly equal parts (each of which is guranteeed to be of length at least 'minw' since maxw = 2 * minw - 1). These two btups then become the two children of a new btup of height h + 1, which 'ifsplit' forms. But if the two arguments of x + y are of differnet heights (assume for example that y is shorter), then the concatenation procedure recursively concatenates y with the last of x's children. Application of 'ifsplit' to this subconcatenation may then generate a pair of btups rather than a single larger btup, and the presence of this extra btup, which must at the top level become a child of x + y, may force a final, top-level split, in which case the height of x + y grows to one more than the height h of x.

The **with** operation 'x **with** y' for a btup x clearly is expressible in terms of concatenation.

The next operator implemented is the end-slice extraction operator **self**(i..). After checking for trivial cases and out-of-bounds conditions, this locates the rightmost child t(j) of the original btup which contains any of the leaves of the desired slice. If this is the last child of **self**, the end-slice is recursively extracted from this child. Otherwise an end-slice of t(j) is recursively extracted and concatenated with the tuple of children of **self** which follow t(j).

Even though the initial-slice extraction operator **self**(1..i) could be programmed in much the same way, the code for this operator is incorporated into the general slice extraction operator **self**(i..j). Like **self**(i..), this first checks for trivial cases and out-of-bounds conditions. The special **self**(1..i) case is then handled in a manner largely symmetrical to the treatment of **self**(i..). The general case is handled by locating the rightmost child t(k) of the original btup BT which contains any of the leaves of the desired slice, and also the leftmost child t(m) of BT which contains any of the leaves of the desired slice. Appropriate end and start-slices of t(k) and t(m) are then extracted recursively, and concatenated with those children of BT which lie between t(m) and t(k); this constructs the btup slice desired.

The slice assignment operator **self**(i..j) := x; is implemented (after checking for out-of-bounds conditions and easy special cases) by catenating the retained parts of **self** (there can be at most two of these) with the btup x.

The extraction operations **from**, **fromb**, and **frome** are elementary speical cases of the slice extraction operators.

The btup 'iterator_start' routine prepares for simple iteration over a btup object by setting up iterator stack, which becomes the value referenced by the iter_ptr instance variable held in the btup. This is a stack of pairs [tup,posn] containing the sequence of btup nodes and their positions leading down to the tuple leaf current at any moment during simple iteration over a btup. The associated iteration method, 'iterator_next', advances this stack by incrementing the topmost 'posn' pointer which has not reached the end of its associated 'tup', and then re-initializing the part of thestack which lies above this incremented item. The related routines 'set_iterator_start' and 'set_iterator_next' are alomost identical to 'iterator_start' and 'iterator_next', but returns pairs [x,index] of elements rather than simply the range elements x.

classbtup; -- btree class representing tuplesclassvardebug :=false;procedurecreate(); -- create blank btupprocedureset(t); -- set from tupleprocedurecheck_consistency(); -- check consistency of treeendbtup;class bodybtup; -- btree class representing tuplesconstminw := 5, maxw := 2 * minw - 1; -- minimum number of descendants, except at topvarh := 1,cum := [],tup := []; -- height, cumulant tuple, top-level tuple;variter_ptr,compno; -- iteration pointer and countprocedurecreate(); -- create empty B-tree top from tuple iter_ptr :=newat(); compno :=newat(); -- set up iteration stack pointerandcompno pointerreturnself;endcreate;procedureset(t); -- set a B-tree from a tuple;if(nt := #t) <= maxwthenh := 1; cum := [1..nt]; tup := t;returnself;else-- make two trees of half length, and concatenate them bt1 := btup(); bt1 := bt1.set(t(1..nt/2)); -- first half bt2 := btup(); bt2 := bt2.set(t(nt/2 + 1..)); -- second halfreturnbt1 + bt2; -- return concatenated versionend if;endset;procedureself(i); -- component extractionreturn ifh = 1thentup(i)elseifi <= cum(1)thentup(1)(i)elseifexists c = cum(j) | i <= cthentup(j)(i - cum(j - 1))elseOMend if;end;procedureself(i) := x; -- component changeifh = 1thentup(i) := x;elseifi <= cum(1)thentup(1)(i) := x;elseifexists c = cum(j) | i <= cthentup(c)(i - cum(j - 1)) := x;else abort("index " +str(i) + " is out ofrange");end if;end;procedurerange self; -- convert to tuplereturn ifh <= 1thentupelse[] +/ [-t: tintup]end if;end;procedure#self; -- total leaf lengthreturncum(#cum);end;procedureself+ x; -- btup concatenationif is_tuple(xx := x)thenx := btup(); x := x.set(xx);end if; -- force second argument to btupleif#cum = 0then returnx;end if;if#x.cum = 0then returnself;end if; --nullcases new := btup();ifx.h = hthen-- concatenated btup has same height new.tup := tup + x.tup; new.h := h; c1l := cum(#cum); new.cum := cum + [c + c1l: cinx.cum];returnnew.ifsplit();elseifx.h < hthen-- concatenated btup is shorter new.tup := tup; new.cum := cum; new.h := h; -- copy the element end_elt := tup(nt := #tup); -- the final element end_elt := end_elt + x; -- catenate 1 level downifend_elt.h < hthen-- the subconcatenation hasnotsplit (new.tup)(nt) := end_elt; -- just install new.cum(nt) := new.cum(nt) + x.cum(#x.cum); -- adjust the cumulantreturnnew;end if; new.tup(nt..nt) := end_elt.tup; new.h := h; --otherwiseadd tuple at top level c1ml :=ifnt = 1then0elsecum(nt - 1)end if; new.cum(nt..nt) := [c + c1ml: cinend_elt.cum];returnnew.ifsplit();else--otherwiseconcatenated element is taller new.tup := x.tup; new.cum := xc := x.cum; new.h := xh := x.h; -- copy x first_x_elt := x.tup(1); -- the first element of x tot_cum := cum(#cum); -- total cumulant of this tree first_x_elt :=self+ first_x_elt; -- catenate 1 level downiffirst_x_elt.h < xhthen-- the subconcatenation hasnotsplit (new.tup)(1) := first_x_elt;forjin[1..#xc]loopnew.cum(j) +:= tot_cum;end loop; -- adjust the later cumulantsreturnnew; -- just installend if; new.tup(1..1) := first_x_elt.tup; --otherwiseadd tuple at top level new.cum(1..1) := first_x_elt.cum; -- likewise cumulantforjin[3..#xc + 1]loopnew.cum(j) +:= tot_cum;end loop; -- adjust the later cumulantsreturnnew.ifsplit();end if;end;procedureifsplit(); -- split into 2 nodesifoverpopulatedif(nt := #tup) <= maxwthenself.check_consistency();returnself;end if; -- needn't split t1 := tup(1..nto2 := nt/2); t2 := tup(nto2 + 1..); c1 := cum(1..nto2); c2 := cum(nto2 + 1..); cum1 := cum(nto2); cum2 := cum(nt); new1 := btup(); new1.h := h; new1.tup := t1; new1.cum := c1; new2 := btup(); new2.h := h; new2.tup := t2; new2.cum := [c - cum1: cinc2]; newtop := btup(); -- new node of population 2 newtop.tup := [new1,new2]; newtop.cum := [cum1,cum2]; newtop.h := h + 1;returnnewtop;endifsplit;procedurex +self; -- concatenation with btup argument on rightif is_tuple(xx := x)thenx := btup(); x := x.set(xx);else abort("bad concatenation argument: " +str(x));end if; -- force first argument to btuplereturnx +self;end;procedureselfwithx; -- item concatenationreturnself+ [x];end;procedureself(i..); -- end slice extractionifi > (cncp1 := cum(nc := #cum) + 1)ori < 1then abort("endslice index out ofrange: " +str(i));end if;ifi = cncp1then returnbtup();end if; -- empty resultifi = 1then returnself;end if; -- the whole shebang must := exists c = cum(j) | c >= i;ifh = 1then-- minimal heght case; slice the tuple new := btup(); new.h := 1; new.tup := tup(i..); new.cum := [1..nc - i + 1];returnnew;end if; cumbef :=ifj = 1then0elsecum(j - 1)end if; -- prior cumulant tj := tup(j);ifj = ncthenreturntj(i - cumbef..);end if; -- extract from last component tail := btup(); tail.h := h; tail.tup := tup(j + 1..); cj := cum(j); tail.cum := [c - cj: cincum(j + 1..)];returntj(i - cumbef..) + tail; -- catenate tail to front piece, and returnend;procedureself(i..j); -- general slice extractionifi > (cncp1 := (cnc := cum(nc := #cum)) + 1)ori < 1thenabort("first slice index out ofrange: " +str(i));end if;ifj < i - 1orj > cncthen abort("second slice index out ofrange: " +str(i));end if;ifj = i - 1then returnbtup();end if; --returnan empty btupleifh = 1then-- minimal heght case; slice the tuple new := btup(); new.h := 1; new.tup := tup(i..j); new.cum := [1..j - i + 1];returnnew;end if; must := exists c = cum(jloc) | c >= j; -- find the top-level tup location of the index jifi = 1then-- prefix slice extraction tj := tup(jloc);ifjloc = 1then returntj(1..j);end if; -- extract from first component pref := btup(); -- will generate new prefix cumbef := cum(jloc - 1); subslice := tj(1..j - cumbef); -- 'subslice' is part of final node to takeifjloc = 2then-- would produce treewith#tup = 1; so descend 1 level pref.h := h - 1; pref.tup := (t1 := tup(1)).tup; pref.cum := t1.cum; -- prefix is identicalwithfirst nodeelse-- taking prefix part won't produce treewith#tup = 1 pref.h := h; pref.tup := tup(1..jloc - 1); pref.cum := cum(1..jloc - 1);end if;returnpref + subslice; -- catenate tail to front piece, and returnend if; must := exists c = cum(iloc) | c >= i; --otherwise findthe top-level tup location of the index i ind := tup(iloc); -- the subtup contining i prior_cum :=ifiloc = 1then0elsecum(iloc - 1)end if; -- cumulant prior to this nodeifiloc = jlocthen-- if the two tup locations are the same, then do a subextractionreturnind(i - prior_cum..j - prior_cum);end if; jnd := tup(jloc); -- the subtup contining i jprior_cum := cum(jloc - 1); -- cumulant prior to jnd ipart := ind(i - prior_cum..); jpart := jnd(1..j - jprior_cum); -- headandtail extractionifiloc + 1 = jlocthen returnipart + jpart;end if; --ifthe two tup locations differ by 1, just catenate newmid := btup(); --otherwisethe middle tree has just 1 node; generate new middleifiloc + 2 < jlocthen-- middle has at least 2 nodes newmid.h := h; newmid.tup := tup(iloc + 1..jloc - 1); cumbef := cum(iloc); newmid.cum := [c - cumbef: cincum(iloc + 1..jloc - 1)];else-- middle has just 1 node; reduce height newmid := tup(iloc + 1);end if;returnipart + newmid + jpart; --returnconcatenation of the front, middle,and endend;procedureself(i..j) := x; -- slice assignmentif is_tuple(xx := x)then-- convert to btup x := btup(); x := x.set(xx);elseif type(x) /= "BTUP"thenabort("illegal slice-assignment righ-hand side: " +str(x));end if;ifi > (cncp1 := (cnc :=if(nc := #cum) = 0then0elsecum(nc)end if) + 1)ori < 1thenabort("first slice-assignment index out ofrange: " +str(i));end if;ifj < i - 1orj > cncthen abort("second slice-assignment index out ofrange: " +str(i));end if;ifi = 1then-- over-written part is prefixifj = cncthen-- all over-written; copy x to self h := x.h; tup := x.tup; cum := x.cum;returnx; -- modify this btup; return right-hand sideend if; tail :=self(j + 1..); new := btup(); new.h := x.h; new.tup := x.tup; new.cum := x.cum; new := new + tail; -- catenate the assigned part plus the retained part h := new.h; tup := new.tup; cum := new.cum; -- modify this btupreturnx; --returnright-hand sideend if; pref :=self(1..i - 1); -- get the retained prefix new := btup(); new.h := x.h; new.tup := x.tup; new.cum := x.cum; -- the assigned partifj = cncthen-- over-written part is suffix new := pref + new; -- catenate the retained part plus the assigned partelse-- over-written part is middle tail :=self(j + 1..); new2 := btup(); new2.h := x.h; new2.tup := x.tup; new2.cum := x.cum; new := pref + new + tail; -- catenate the retained part plus the two assigned partsend if; h := new.h; tup := new.tup; cum := new.cum; -- modify this btupreturnx; --returnright-hand sideend;procedureself(i..) := x; -- end slice assignmentself(i..cum(#cum)) := x;returnx; --returnright-hand sideend;procedure fromself; -- front extractioniftup = []then returnOM;end if; --null casex :=self(1); y :=self(2..); h := y.h; cum := y.cum; tup := y.tup;returnx;end;procedure fromeself; -- end extractioniftup = []then returnOM;end if; --null casex :=self(ln := cum(#cum)); y :=self(1..ln - 1); h := y.h; cum := y.cum; tup := y.tup;returnx;end;procedure frombself; -- end extractioniftup = []then returnOM;end if; --null casex :=self(1); y :=self(2..); h := y.h; cum := y.cum; tup := y.tup;returnx;end;procedureiterator_start; -- initialize simple iteration over btup -- sets up iterator stack as value referenced by iter_ptr. This is a stack of pairs [tup,posn] stack := []; -- to be built node :=self;forjin[1..h]loopstackwith:= [notup := node.tup,ifj = hthen0else1end if]; node := notup(1);end loop; ^iter_ptr := stack; -- attach stack to iteration pointerenditerator_start;procedureiterator_next(); -- step simple iteration over btup -- returns value as singleton,orOMifterminating -- advances iterator stack referenced by iter_ptr stack := ^iter_ptr; -- retrieve the iterator stack height := 1;forjin[ns := #stack,ns - 1..1]loopif(sj := stack(j))(2) = #sj(1)thenheight +:= 1; removedfromestack; -- removeanyexhausted elementelseexit;end if;end loop;ifheight = 1thenremovedfromestack; -- pop the top element;thenadvance it [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc]; ^iter_ptr := stack;return[result]; --returnsingleton tuple builtfromleaf elementend if;ifstack = []then returnOM;end if; -- iteration is exhausted removedfromestack; -- pop the top element,thenadvance itand useto rebuild rest of stack [tup,loc] := removed; node := tup(loc +:= 1); stackwith:= [tup,loc];forjin[1..hm1 := height - 1]loop-- rebuild the stack, startingwiththe node that was advanced stackwith:= [notup := node.tup,ifj = hm1then0else1end if]; node := notup(1);end loop; removedfromestack; -- pop the top element;thenadvance it [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc]; ^iter_ptr := stack;return[result]; --returnsingleton tuple builtfromleaf elementenditerator_next;procedureset_iterator_start; -- initialize second-form iteration over btup (similar to iterator_start) -- sets up iterator stack as value referenced by iter_ptr stack := []; -- to be built ^compno := 0; node :=self;forjin[1..h]loopstackwith:= [notup := node.tup,ifj = hthen0else1end if]; node := notup(1);end loop; ^iter_ptr := stack; -- attach stack to iteration pointerendset_iterator_start;procedureset_iterator_next(); -- step second-form iteration over btup -- returns value as singleton,orOMifterminating -- advances iterator stack referenced by iter_ptr stack := ^iter_ptr; -- retrieve the iterator stack ^compno := cno := ^compno + 1; -- advance the component number height := 1;forjin[ns := #stack,ns - 1..1]loopif(sj := stack(j))(2) = #sj(1)thenheight +:= 1; removedfromestack; -- removeanyexhausted elementelseexit;end if;end loop;ifheight = 1thenremovedfromestack; -- pop the top element;thenadvance it [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc]; ^iter_ptr := stack;return[[cno,result]]; --returnsingleton tuple builtfromleaf elementend if;ifstack = []then returnOM;end if; -- iteration is exhausted removedfromestack; -- pop the top element,thenadvance itand useto rebuild rest of stack [tup,loc] := removed; node := tup(loc +:= 1); stackwith:= [tup,loc];forjin[1..hm1 := height - 1]loop-- rebuild the stack, startingwiththe node that was advanced stackwith:= [notup := node.tup,ifj = hm1then0else1end if]; node := notup(1);end loop; removedfromestack; -- pop the top element;thenadvance it [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc]; ^iter_ptr := stack;return[[cno,result]]; --returnsingleton tuple builtfromleaf elementendset_iterator_next;procedurecheck_consistency(); -- check consistency of treeifh > 1then-- the number of nodes must beinthe correctrangeifexists nd = tup(j) | (nc := #(nd.tup)) < minwornc > maxwthenabort("node count out ofrange: " +str(nc) + " " +str(nd));end if;ifexists nd = tup(j) | nd.h /= h - 1thenabort("node height inconsistency: " +str(j) + " " +str(nd.h) + " " +str(h));end if;end if; -- the cumulant differences at this level must be the sum of those at the next lower levelifh > 1thenifexists nd = tup(j) | cum(j) /=ifj = 1then0elsecum(j - 1)end if+ nd.cum(#nd.cum)thenabort("bad cumulant: loc = " +str(j) + " cum at loc = " +str(cum(j)) + " priorcum = " +ifj = 1then""else str(cum(j - 1))end if+ " node cum = " +str(nd.cum(#nd.cum))); -- + " " +str(nd)end if;elseifcum /= [1..#tup]thenabort("bad bottom-level cumulant: " +str(cum) + " " +str(nd));end if;ifh > 1and(exists ndintup |notnd.check_consistency())then abort("badabort");end if;return true; -- passed the consistency checkendcheck_consistency;endbtup;

As detailed in the preceding sections of this in Chapter, extended meanings for SETL's infix operators and other built-in syntactic constructs are specified by including function definitions with header lines of forms like

in an object-class definition. This makes it possible give SETL operators convenient new meanings for types of objects which are not part of SETL's standard object repertoire. However, it is sometimes appropriate to extend the meaning of SETL operators to cases involving only standard objects, but in wasy that SEL does not allow (or, better said, does not use.) For example, if f is a function or one argument and s is a set or tuple, it will sometimes be convenient to use f * s as an abbreviation for {f(x): x in s} or [f(x): x in s].procedure self+ y; ...procedure self(x); ...procedure self{x}; ...procedure self(i..j); ...procedure self(x) := y; ...procedure self{x} := y; ...procedure self(i..j) := y; ...procedureiterator_start; ...procedureiterator_next; ...procedureset_iterator_start; ...procedureset_iterator_next; ...

More specifically, SETL allows operation-error cases, that is, object-operator combinations that are not supported in SETL's built-in library, which would ordinarily trigger run-time aborts, to be redefined as function invocations. This mimics the SETL mechanism for extending built-in operations to user-defined object classes.

SETL allows all the redefinitions listed above (excepting the iterator redefinitions) to be applied, in what currently are operation error cases, to built-in SETL objects. This is done by **use**ing special packages whose name starts with 'error_extension_' in the syntactic environment (a package or a class) within which such an extension or group of extensions is to apply. (Several such package can be **use**d in such an environment, but if two of these redefine the same operation, they cancel each other and neither redefinition applies; this is the same rule which applies to conflicting function definitions of other kinds which appear in multiple packages used within a single syntactic context.)

For example, SETL does not define the mutiplication operation f * s if s is a set. But by writing

we remedy this omission and can use this otherwise forbidden diction.packageerror_extension_f_usage;enderror_extension_f_usage;package bodyerror_extension_f_usage;procedure self* s;return if is_set(s)then{self(x): xins}else[self(x): xins]end if; end;enderror_extension_f_usage;programtest;float* [1..10],cos* t);endtest;

Note the simple syntactic rules which apply here:

- 'error_extension_' packages require package headers,but these are always empty.
- The syntactic form of SETL operator redefinitions within 'error_extension_' packages is the same as that within object classes. In particular, the first argument of any such redefinition must always be called
**self**. - In addition to its object redfinitions, an 'error_extension_' package can, like other packages, contain auxiiary function definitions, but no of these can be visible externally.

Here is a second example. SETL does not define the division operation s / t if s and t are both strings, so in the 'default' SETL environment this case of the division operator will be treated as a run-time error. However, if a package_named 'error_extension_xxx' is **use**d in this environment, and if that package contains an operator definition **procedure** of the form

then thisprocedure self/ y;..

The final 'case type(self)when"STRING" => (code for evaluatingself/ y when both are strings)when"TUPLE" => (etc.)otherwise=>return self/ y;endif;

found in 'error_extension_zzz', and so on iteratively until either the original operation s / t finds its interpretation, or an actual error-abort results.procedure self/ y;..

These conventions make otherwise illegal cases like s / t, where s and t are both strings, usable within SETL programs, packages, and classes. The scope within which any such error extension applies is a single program, package, and class PPorC, the applicable error redefinition being specified, as explained above, by the 'error_extension_...' package **use**d in PPorC.

Extensions of this kind are potentially handy, especially for exploring contemplated SETL extensions. Suppose, for example, that we let s / t be the operation which cuts any string s at all the occurrences of t in s, thereby forming a tuple; and (ii) applies recursively to sets and tuples of strings. Then

"a,b;c,d;ee,ff" / ";" / "," is [["a","b"],["c","d"],["ee","ff"]].But "a,b c,d ee,ff"/ ";" / "," is clearly a more convenient form of input than [["a","b"],["c","d"],["ee","ff"]]. Suppose next that we define t * s, where t is a tuple and s is a string, to be the concatenation t(1) + s + t(2) + s + ... + t(#t). Then if s, a, and b are all strings, s / a * b replaces every occurence of a in s by an occurence of b.

Many useful extensions of this kind suggest themselves. If s, t, and u are strings and n is an integer, s(t) can designate the index of the first occurence of t in s (or **OM** if there is none);
s(t) := u can replace the first occurence of t by u; s(n..t) can designate the index of the first occurence after position n of t in s, etc. If t is a set or tuple and f is a one-parameter procedure, f * t can designate the result of applying f to each of the components or elements of t. Many other such cases suggest themselves, as ways of lending futher flexibility to SETL.

The rules governing the syntactic scopes within which error redefinitions apply facilitate their combination. Suppose, for example, that a useful definition for the quotient s / t has already been found and made available within a package 'error_extension_strings', and that we now want to extend this to the case in which s is a string but t is a set (for example, t might be a set of strings, and s / t might break s at each occurence of any string x in t, but leave these x in the resulting sequence.) To do this, one could simply write a package 'error_extension_strings2' in which

had a form something likeprocedure self/ y;..

If the 'error_extension_strings2' package 'procedure self/ y;if(type(y) = "SET")then...elsereturn self/ y;end if;end;

We give a few specific examples illustrating the use of error extension packages. Suppose that the two following error packages have been compiled into the SETL library.

packageerror_extension_floats;enderror_extension_floats;package bodyerror_extension_floats;procedure self* s;if is_integer(self) thenself:= float(self); end if;if is_integer(s) then s := float(s); end if;return self* s;end;procedure self+ s;if is_integer(self) thenself:= float(self); end if;if is_integer(s) then s := float(s); end if;return self+ s;end;procedure self- s;if is_integer(self) thenself:= float(self); end if;if is_integer(s) then s := float(s); end if;return self- s;end;procedure self/ s;if is_integer(self) thenself:= float(self); end if;if is_integer(s) then s := float(s); end if;return self/ s;end;procedure self** s;if is_integer(self) thenself:= float(self); end if;if is_integer(s) then s := float(s); end if;return self** s;end;enderror_extension_floats;packageerror_extension_misc;enderror_extension_misc;package bodyerror_extension_misc;usestring_utility_pak;useerror_extension_floats;procedure self+ s;return self+ s;end;procedure self** s;case type(self)when"STRING" =>return self(#self+ s + 1..);otherwise=>return self** s;end case;end;procedure self/ s; -- breakup function, represented as divisioncase type(s)when"STRING" => return breakup(self,s); -- uses string splitting routine from 'string_utility_pak'otherwise=>return self/ s;end case;end;procedure self- s; -- suppress_chars function, represented as subtractioncase type(s)when"STRING" => return suppress_chars(self,s); -- from 'string_utility_pak'otherwise=>return self- s;end case;end;procedure self* s; -- 'join' function, represented as multiplicationvarmy_self; my_self :=self;case type(self)when"TUPLE" =>case type(s)when"STRING" => stg :=if#self= 0then self else self(1)end if;forx =self(j) | j > 1loopstg +:= (s + x);end loop;returnstg;when"TUPLE" =>return[self(j) * x: x = s(j)]; -- lengths assumed equalend case;when"PROCEDURE" =>case type(s)when"TUPLE" =>return[self(x): xins];when"SET" =>return{self(x): xins};when"PROCEDURE" =>returnlambda(x);returnmy_self(s(x));end lambda;end case;when"REAL" =>case type(s)when"TUPLE" =>return[self * x: x in s];when"SET" =>return{self * x: x in s};end case;otherwise=>return self* s;end case;end;procedureself modx; -- alternative syntax:self(x)case type(x)when"STRING" =>returnword_loc(self,x);when"INTEGER" =>return self(#self+ 1 + x); -- character from end; x must be negativeend case; end;procedureselfmaxx; -- alternative syntax:self(x..)case type(x)when"INTEGER" =>return self(#self+ 1 + x..); -- character from end; x must be negativeend case;end;enderror_extension_misc;

The following test program exercises this miscellany of error extensions:

programerr_extension_test; -- test of error-extension packagesuseerror_extension_misc; -- use error_extension_floats; -- this must be omitted: would cause errorsenderr_extension_test;

The output produced by this program is

[1.0000000000, 2.5000000000, 1.5000000000, 4.0000000000, 0.25000000000] [1.0000000000, 2.0000000000, 3.0000000000] [0.50000000000, 1.0000000000, 1.5000000000] [0.25000000000, 1.0000000000, 2.2500000000] [0.87758256189, 0.54030230587, 0.070737201668] [0.88726005072, 0.66636674539, 0.54240850453] [0.88726005072, 0.66636674539, 0.54240850453] ["a", "in", "b", "an", "c", "of", "d", "in", "aaa", "of", "bbb", "an", "ccc", ""] abc a b c... 6 k ck