CHAPTER 12

The SETL Database Facility

Chapter Contents.

  1. General Introduction to the Functionality and Behavior of SETL Databases

  2. A First Example of SETL Database Use.

  3. A Second Example. Value-semantics of SETL Databases, free form of records, databases as ordinary objects.

  4. More Examples: Databases which store Free-form Text.

  5. A First Sketch of the SETL Database Implementation concept: simplest form of the basic B-tree structure.

  6. B-trees Incorporating More General Cumulants.

  7. Mapping large strings to collections of disk pages.

  8. Examples of the use of Large-string Objects.

  9. Code for large-string management with reference counting.

  10. The full SETL database design - Introduction.

  11. Code Details for SETL Database objects.

  12. 'Hibernating' transactions - Introduction.

  13. 'Hibernating' transactions - An example application.

  14. 'Hibernating' transactions - additional implementation details.

  15. Guaranteed recovery after crash for programs which manipulate databases.

  16. Recovery mechanisms - code details.

  17. Formal verification of crash recovery code correctness; a 'brute force' approach.

  18. 'Versioned' B-trees.

  19. Atomicity and recovery of distributed transactions.

12.1 General Introduction to the Functionality and Behavior of SETL Databases. SETL includes a powerful facility for database design and implementation. Associated with this is an extended facility for prototyping and implementing complex commercial applications, based on the notion of 'hibernating transactions' which can be resident for long periods of time in the database itself. These facilities are fully transactional, in the sense explained below, and also facilitate prototyping of and implementation of large distributed databases.

SETL databases provide an alternative to standard SQL databases. Each SETL database object is a potentially very large (up to hundreds of gigabytes) abstract object stored on disk. Each such database object resembles a map from a set of system generated 4- or 5-byte record identifiers to the records which the database stores, each of which is a fully general object of the SETL language, but which is itself not excessively large (e.g. no more than a few dozen megabytes.) The database records are systematically indexed for efficient access, using the abstract technique described below. Once a record of a SETL database has been accessed, all the operations of the SETL language can be applied to it. Conversely, any standard SETL object can be stored as a database record.

The operations specific to SETL databases (other than standard SETL operations applicable to individual records) are as follows:

 x := db(id);	-- find a database record given its identifier
 db(id) := x;	-- store x as a database record, associating it with the specified identifier
 db with x	-- create a new record and put x into it. 
 		-- a pair of the form [modified_db,id_of_new_record] is returned.
 db.to_string;	-- database-associated mapping of records to 
 		-- reduced string variants ('summary strings')
			-- used for indexing; see comment below

 db.contains(wd) -- creates and returns a database iterator object, which iterates
		-- over all records whose summary string contains the given word

Iterations 'x in iter_obj' over these iterator objects return the ordered sequence of keys of all records containing the specified word (in a standard key-collating order). These iteration objects support the operations:

	#iter_obj (number of keys in an iteration group)
	arb(iter_obj) (first item in the iteration group, or OM if none)
	iter_obj + iter_obj (union iteration)
	iter_obj * iter_obj (iterate over intersection),
	iter_obj - iter_obj (iterate over difference)
Internally, the database 'records' are represented by long string sections, separated by special marks representing each record's id. The detailed layout of the 'id-and-record' file used to store this information is described below.

The SQL view of database semantics has become pervasive. However, the alternative approach described in this chapter may have significant advantages. To begin to see how these to approaches compare, note that the SQL database records ordinarily depicted as

NameAgeDepart.SalaryEmployedPosn.
John Peter Doe43063210006/7/79Clerk
Mary M. Smith25063450009/1/91Mgr.

can be converted to the strings (used only for searching in the SETL version of the same database)

"John Peter Doe 43 D:063 $21000 6/7/79 Clerk" "Mary Smith 25 D:063 $45000 9/1/91 Mgr."

by a function f sophisticated enough to recognize middle names and treat them in a special way. Then the operations listed above would allow the crucial core of database searches, e.g. a search for the manager of the employee John Doe, to be written as follows:

	{name(y): x in contains("Doe"), y in contains(department(x)) | 
				posn(y) = "Mgr." & "John" in words_of(string_of(x))}.
More properly, this returns all the managers in any department containing an employee called John Doe, along with whatever 'accidental' data might occasionally result from accidental combinations of John and Doe, as in a conceivable department that manufactured Doe Skin gloves and had an employee named Wallace John, or John Johnson. But this sort of thing should be quite rare, and can in any case be detected by forming and filtering or displaying the set of pairs
	{[x,y]: x in contains("Doe"), y in contains(department(x)) | 
				posn(y) = "Mgr." & "John" in words_of(string_of(x))}.
in an appropriate format.

This example shows the felt advantage of the approach proposed: (a) it separates the optimization of database queries rather cleanly from their abstract definition; (b) no special statement of the fields to be indexed is required; (c) it allows arbitrary strings to appear in fields, and is prepared to search using concordances of the words in these fields, which is often an effective search technique. (E.g. we can just as well find the manager of "Peter Doe", or "J. Peter Doe", a task which might well vex an SQL database.)

Indexing of SETL databases. Each database has an administrator-defined 'to_string(record)' method which can be applied to each of its records and which yields a 'record summary string' defining those aspects of the record which are to be indexed to improve the efficiency of searches over the database. By cutting these summary strings into words (simply by cutting them at all included whitespaces) we generate an associated internal word_list(record). These word lists are automatically used to build a 'word index' associated with the database. Iterations over the db.contains(wd) iterator objects described above exploit this index.

Transaction handling in SETL databases - Basics. SETL databases, like all other SETL objects, have value semantics. For example, if db is a database object, then the sequence of statements

db2 := db; db := db with new_record;

changes db by adding a new record, but leaves db2 unchanged since no new assignment has been made to it. The SETL database implementation creates the 'logical copies' that this requires in an efficient way, whose cost is only weakly dependent of the size of the database objects involved.

This gives an easy way of obtaining transaction-like effects. To begin a transaction which may have to be aborted, simply write

db_copy := db;

Then edit db as desired. To 'commit' the transaction, simply erase the copy by writing db_copy := OM. To back out of the transaction instead, simply write db := db_copy; this will restore db to its original state.

More Advanced Transaction-related facilities.

Commercial 'transactional' database systems provide several important transaction-related capabilities in addition to the basic transaction-backout ability noted above. These are: (i) guaranteed recovery after system failure; (ii) distributed execution, with guaranteed coordination of multiple distributed files, and guaranteed recovery after failure of any of the components of a distributed system; (iii) parallel execution of multiple independent transactions, with guaranteed isolation of each transaction from the others (serializability.) These fundamental capabilities are also supported by the SETL database system.

Guaranteed recovery after system failure - introduction.

We model the system failures to be guarded against by assuming that:

  1. a processor's CPU can crash, e.g. by losing power or by freezing after a software fault.

  2. Such crashes can occur unpredictably, even in the middle of writing a disk record. However, when such a crash occurs, bad data can appear only in the disk area being written, but not outside it.

  3. Once written to disk, data never 'evaporates', i.e. can always be read.

If any of these assumptions are violated in an actual physical system, a lower supporting layer of physical and software redundancy will be needed. This must implement a 'virtual system' satisfying assumptions (a-c).

Recovery after system failure is guaranteed if we can always restart the system after crash by reading the disk and recovering to a process state equivalent to one reached previously to, and not too distant from, the moment of crash. We call system states to which we could recover 'checkpoints', and can think of them as being marked by a statement

 	checkpoint();

Such checkpoint operations must reliably create something logically equivalent to a complete, self-standing copy of the database, but, since they must be reasonably frequent they must not be too expensive. Recovery is then possible if (i) we can always accept a crash event which destroys RAM and leaves a half-written record on the disk, as explained above; (ii) we can write a top-level 'handler' for this event which reads the state of the disk and brings us back to a system state fully equivalent to that reached immediately prior to execution of the last previous checkpoint() call.

SETL databases support this functionality. The technique used to accomplish this is described in a later section.

A First Example of SETL Database Use. The following example illustrates the fundamental database operations described above, and can be executed to test them. In it we create a very tiny 'database' containing just a few records and apply the main database operations to it. Note that the elements of the 'database' are simply SETL pairs [name, age], but that the database-associated mapping 'pair_to_stg' sends the pairs into summary string of the form 'N:name A:age'.

 program test; -- a first test program for the SETL database class
	use SETL_database; -- use the class of SETL database objects
	
	db := SETL_database(pair_to_stg); -- create an empty database, to store test pairs

	[db,jack_id] := db with ["Jack",71]; -- add a first pair
	print("Jack record: ",db(jack_id)); 
	
	[db,diana_id] := db with ["Diana",43]; -- add a second pair 
	[db,rachel_id] := db with ["Rachel",43]; -- add a third pair 

	print("\n*** Initial Iteration test - records containing 'A:43' ***");
	for idx in db.contains("A:43") loop print(db(idx)); end loop;
	
	print("\n*** Testing empty arb - records containing 'Walter' ***");
	print(arb(db.contains("N:Walter"))); 
	
	print("\n*** Initial Union test - records containing 'Jack' or 'Diana' ***");
	for key in (jd_recs := (db.contains("N:Jack") + db.contains("N:Diana"))) loop 
		print(db(key)); 
	end loop;

	print("\n*** Iterate again ***");
	for key in jd_recs loop print(db(key)); end loop;

	print("testing arb for union: list a record containing 'Jack' or 'Diana': ",
		db(arb jd_recs)); 

	print("\n*** Intersection test - records containing '43' and 'Diana' ***");
	for key in (d43_recs := (db.contains("A:43") * db.contains("N:Diana"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Intersection test - records containing '43' and 'Rachel' ***");
	for key in (r43_recs := (db.contains("A:43") * db.contains("N:Rachel"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Difference test - records containing '43' and not 'Diana' ***");
	for key in (nd43_recs := (db.contains("A:43") - db.contains("N:Diana"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Difference test - records containing '43' and not 'Rachel' ***");
	for key in (nr43_recs := (db.contains("A:43") - db.contains("N:Rachel"))) loop 
		print(db(key)); 
	end loop;
	
	print("testing arb for union: list a record containing 'Jack' or 'Diana': "
		,db(arb jd_recs)); -- ****
	print("testing arb for Intersection: list a record containing '43' and 'Rachel': "
		,db(arb r43_recs));
	print("testing arb for Intersection: list a record containing '43' and 'Diana': "
		,db(arb d43_recs));
	print("testing arb for Intersection: list a record containing 'Rachel' and 'Diana': "
		,arb(db.contains("N:Rachel") * db.contains("N:Diana")));

	-- try various iterator-based retrievals
	print("testing arb: list a record containing 'Jack': "
			,db(jack_key := arb(db.contains("N:Jack"))));
	print("testing arb: list a record containing 'Diana': "
		,db(arb(db.contains("N:Diana"))));
	print("testing arb: list a record containing 'A:71': "
		,db(arb(db.contains("A:71"))));
	print("testing arb: list a record containing 'A:43': "
		,db(arb(db.contains("A:43"))));

	print("testing count: count records containing 'A:43': "
		,#db.contains("A:43")); 
	
	-- now try some elementary database edits
	print("testing change of record, set ['Jack',71] to ['Jack',72]");
	db(jack_key) := ["Jack",72]; -- another year gone by

	print("testing arb after change: list a record containing 'N:Jack': ",
		db(arb(db.contains("N:Jack"))));
	print("testing arb after change: list a record containing 'A:71': "
		,arb(db.contains("A:71")));
	print("testing arb after change: list a record containing 'A:71': "
		,if (a_rec := arb(db.contains("A:71"))) /= OM then db(a_rec) 	
	else "Non-existent ***" end if);
	print("testing arb after change: list a record containing 'A:72': "
		,if (a_rec := arb(db.contains("A:72"))) /= OM then db(a_rec) 
	else "Non-existent ***" end if);

	print("testing arb after change: list a record containing 'N:Diana': "
		,db(arb(db.contains("N:Diana"))));
	print("testing arb after change: list a record containing 'A:43': "
		,if (a_rec := arb(db.contains("A:43"))) /= OM then db(a_rec) 
	else "Non-existent ***" end if);
	
	print("\nadding ['Jack', 6]: "); [db,younger_id] := db with ["Jack", 6]; 
			-- add a younger jack
	
	print("testing the basic iterators again: list records containing '6'");
	for key in (jack_recs := db.contains("A:6")) loop 
		print(db(key)); 
	end loop;
	
	print("*** Print all records containing 'Jack' ***");
	for key in (jack_recs := db.contains("N:Jack")) loop 
		print(db(key)); 
	end loop;

	print("\n*** Deleting a record -- the 'Jack' with A:72 ***");
	db(jack_id) := OM;
	
	print("*** Print all records containing 'Jack' after deletion ***");
	for key in (jack_recs := db.contains("N:Jack")) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Restoration Test - Restoring ['Jack',71]");
	
	[db,jack_id2] := db with ["Jack",71]; -- the other Jack again
	
	print("*** Print all records containing 'Jack' after restoration ***");
	for key in (jack_recs := db.contains("N:Jack")) loop 
		print(db(key)); 
	end loop;
	
	print("*** Add large group of new records ***");
	for j in [1..20] loop -- add sixty new records
		[db,-] := db with ["Jack",j]; 
		[db,-] := db with ["Diana",j]; 
		[db,-] := db with ["Rachel",j];
	end loop;
	
	print("records with 'A:3'");
	print(str([db(x): x in (jnd_recs := db.contains("A:3"))])); 
	print("records with 'N:Diana'");
	print(str([db(x): x in (jnd_recs := db.contains("N:Diana"))])); 

	print("\n*** Union test - records containing 'Jack' or 'Diana' ***");
	print([db(key): key in (jd_recs := (db.contains("N:Jack") + db.contains("N:Diana")))]);
	
	print("\n*** Union test - records containing 'Rachel' or 'Diana' ***");
	print([db(key): key in (jd_recs := (db.contains("N:Rachel") + db.contains("N:Diana")))]);
	
	print("\n*** Intersection test - records containing '43' and 'Diana' ***");
	for key in (jd_recs := (db.contains("A:43") * db.contains("N:Diana"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Intersection test - records containing '43' and 'Rachel' ***");
	for key in (jd_recs := (db.contains("A:43") * db.contains("N:Rachel"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Difference test - records containing '43' and not 'Diana' ***");
	for key in (jd_recs := (db.contains("A:43") - db.contains("N:Diana"))) loop 
		print(db(key)); 
	end loop;
	
	print("\n*** Difference test - records containing '43' and not 'Rachel' ***");
	for key in (jd_recs := (db.contains("A:43") - db.contains("N:Rachel"))) loop 
		print(db(key)); 
	end loop;
	
	print("recs with Jack and '6': "); 
		-- this adds a second ["Jack,6] record, with a different id
	if (jack_rec6 := arb(db.contains("N:Jack") * db.contains("A:6"))) /= OM then 
		print(db(jack_rec6)); 
	end if;
	
	print("\n*** Difference test ***");
	print("recs with Jack but not '6'");
	print([db(key): key in (jn6_recs := db.contains("N:Jack") - db.contains("A:6"))]);
	
	print("\n*** Intersection test ***");
	print("recs with Jack and '6'");
	for x in (jnd_recs := db.contains("N:Jack") * db.contains("A:6")) loop 
		print(db(x)); 
	end loop;
	
	print("recs with Jack and '72': ");
	print(if (jack_rec72 := arb(db.contains("N:Jack") * db.contains("A:72"))) /= OM 
		then db(jack_rec72) 
	else "Nonexistent" end if); 
	
	print("\n*** Deleting a record -- the 'Jack' with A:10 ***");
	jack_rec10 := arb(db.contains("N:Jack") * db.contains("A:10")); 
	db(jack_rec10) := OM;
	
	print("remaining recs with Jack");
	print([db(key): key in (jack_recs := db.contains("N:Jack"))]);
	
	print("\n*** Restoration Test - Restoring ['Jack',10]");
	
	[db,jack10_id] := db with ["Jack",10]; -- the other Jack again

	print("Records with 'Jack' but not 'Diana'");
	print([db(key): key in (jnd_recs := db.contains("N:Jack") - db.contains("N:Diana"))]);
	
	print("\n*** Deleting another record -- one of the two 'Jack's with A:6 ***");
	print("Deleting ",db(lit_jack_rec := arb(db.contains("N:Jack") * db.contains("A:6"))));
	db(lit_jack_rec) := OM; -- remove this younger 'Jack'

	print("All remaining records with 'Jack'"); 
	print([db(key): key in (jack_recs := db.contains("N:Jack"))]);
	
	for j in [1..20] loop -- add twenty new records
		[db,-] := db with ["Jack",100 + j]; 
		[db,-] := db with ["Diana",100 + j]; 
		[db,-] := db with ["Rachel",100 + j];
	end loop;
	
	print("records with 'A:103'");
	print(str([db(x): x in (all3_recs := db.contains("A:103"))])); 
	
	print("records with 'N:Diana'");
	print(str([db(x): x in (jnd_recs := db.contains("N:Diana"))])); 
	print("records with 'A:3' but not 'N:Diana'");
	print(str([db(x): x in (jnd_recs := db.contains("A:3") - db.contains("N:Diana"))])); 
	print("records with 'N:Diana' but not 'A:3'");
	print(str([db(x): x in (jnd_recs := db.contains("N:Diana") - db.contains("A:3"))])); 
	print("records with 'N:Diana' and 'A:3'");
	print(str([db(x): x in (jnd_recs := db.contains("N:Diana") * db.contains("A:3"))])); 
	print("records with 'N:Diana' or 'A:3'");
	print(str([db(x): x in (jnd_recs := db.contains("N:Diana") + db.contains("A:3"))])); 
	
	print("\n ***END OF TESTS ***"); 
	
	end test;
The output produced by this program is as follows:
	Jack record: ["Jack", 71]
	test elimination of duplicates in union iteration
	[["Jack", 71]]

	*** Initial Iteration test - records containing 'A:43' ***
	["Diana", 43]
	["Rachel", 43]

	*** Testing empty arb - records containing 'Walter' ***
	<om>

	*** Initial Union test - records containing 'Jack' or 'Diana' ***
	["Jack", 71]
	["Diana", 43]

	*** Iterate again ***
	["Jack", 71]
	["Diana", 43]
	testing arb for union: list a record containing 'Jack' or 'Diana': ["Jack", 71]

	*** Initial Intersection test - records containing '43' and 'Diana' ***
	["Diana", 43]

	*** Initial Intersection test - records containing '43' and 'Rachel' ***
	["Rachel", 43]

	*** Initial Difference test - records containing '43' and not 'Diana' ***
	["Rachel", 43]

	*** Initial Difference test - records containing '43' and not 'Rachel' ***
	["Diana", 43]
	testing arb for union: list a record containing 'Jack' or 'Diana': ["Jack", 71]
	testing arb for Intersection: list a record containing '43' and 'Rachel': ["Rachel", 43]
	testing arb for Intersection: list a record containing '43' and 'Diana': ["Diana", 43]
	testing arb for Intersection: list a record containing 'Rachel' and 'Diana': <om>
	testing arb: list a record containing 'Jack': ["Jack", 71]
	testing arb: list a record containing 'Diana': ["Diana", 43]
	testing arb: list a record containing 'A:71': ["Jack", 71]
	testing arb: list a record containing 'A:43': ["Diana", 43]
	testing count: count records containing 'A:43': 2
	testing change of record, set ['Jack',71] to ['Jack',72]
	testing arb after change: list a record containing 'N:Jack': ["Jack", 72]
	testing arb after change: list a record containing 'A:71': <om>
	testing arb after change: list a record containing 'A:71': Non-existent ***
	testing arb after change: list a record containing 'A:72': ["Jack", 72]
	testing arb after change: list a record containing 'N:Diana': ["Diana", 43]
	testing arb after change: list a record containing 'A:43': ["Diana", 43]

	adding ['Jack', 6]: 
	testing the basic iterators again: list records containing '6'
	["Jack", 6]
	*** Print all records containing 'Jack' ***
	["Jack", 72]
	["Jack", 6]

	*** Deleting a record -- the 'Jack' with A:72 ***
	*** Print all records containing 'Jack' after deletion ***
	["Jack", 6]

	*** Restoration Test - Restoring ['Jack',71] ***
	*** Print all records containing 'Jack' after restoration ***
	["Jack", 6]
	["Jack", 71]
	*** Add large group of new records ***
	records with 'A:3'
	[["Jack", 3], ["Diana", 3], ["Rachel", 3]]
	records with 'N:Diana'
	[["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 3], 
	["Diana", 4], ["Diana", 5], ["Diana", 6], ["Diana", 7], ["Diana", 8], 
	["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], 
	["Diana", 13], ["Diana", 14], ["Diana", 15], ["Diana", 16], 
	["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20]]

	*** Union test - records containing 'Jack' or 'Diana' ***
	[["Diana", 43], ["Jack", 6], ["Jack", 71], ["Jack", 1], ["Diana", 1], 
	["Jack", 2], ["Diana", 2], ["Jack", 3], ["Diana", 3], 
	["Jack", 4], ["Diana", 4], ["Jack", 5], ["Diana", 5], ["Jack", 6], 
	["Diana", 6], ["Jack", 7], ["Diana", 7], ["Jack", 8], 
	["Diana", 8], ["Jack", 9], ["Diana", 9], ["Jack", 10], ["Diana", 10], 
	["Jack", 11], ["Diana", 11], ["Jack", 12], ["Diana", 12], 
	["Jack", 13], ["Diana", 13], ["Jack", 14], ["Diana", 14], ["Jack", 15], 
	["Diana", 15], ["Jack", 16], ["Diana", 16], ["Jack", 17], 
	["Diana", 17], ["Jack", 18], ["Diana", 18], ["Jack", 19], ["Diana", 19], 
	["Jack", 20], ["Diana", 20]]

	*** Union test - records containing 'Rachel' or 'Diana' ***
	[["Diana", 43], ["Rachel", 43], ["Diana", 1], ["Rachel", 1], ["Diana", 2], 
	["Rachel", 2], ["Diana", 3], ["Rachel", 3], ["Diana", 4], 
	["Rachel", 4], ["Diana", 5], ["Rachel", 5], ["Diana", 6], ["Rachel", 6], 
	["Diana", 7], ["Rachel", 7], ["Diana", 8], ["Rachel", 8], 
	["Diana", 9], ["Rachel", 9], ["Diana", 10], ["Rachel", 10], ["Diana", 11], 
	["Rachel", 11], ["Diana", 12], ["Rachel", 12], ["Diana", 13], 
	["Rachel", 13], ["Diana", 14], ["Rachel", 14], ["Diana", 15], ["Rachel", 15], 
	["Diana", 16], ["Rachel", 16], ["Diana", 17], 
	["Rachel", 17], ["Diana", 18], ["Rachel", 18], ["Diana", 19], ["Rachel", 19], 
	["Diana", 20], ["Rachel", 20]]

	*** Intersection test - records containing '43' and 'Diana' ***
	["Diana", 43]

	*** Intersection test - records containing '43' and 'Rachel' ***
	["Rachel", 43]

	*** Difference test - records containing '43' and not 'Diana' ***
	["Rachel", 43]

	*** Difference test - records containing '43' and not 'Rachel' ***
	["Diana", 43]
	recs with Jack and '6': 
	["Jack", 6]

	*** Difference test ***
	recs with Jack but not '6'
	[["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], 
	["Jack", 5], ["Jack", 7], ["Jack", 8], ["Jack", 9], 
	["Jack", 10], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], 
	["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], ["Jack", 19], 
		["Jack", 20]]

	*** Intersection test ***
	recs with Jack and '6'
	["Jack", 6]
	["Jack", 6]
	recs with Jack and '72': 
	Nonexistent

	*** Deleting a record -- the 'Jack' with A:10 ***
	remaining recs with Jack
	[["Jack", 6], ["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], 
	["Jack", 4], ["Jack", 5], ["Jack", 6], ["Jack", 7], 
	["Jack", 8], ["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], 
	["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], 
	["Jack", 18], ["Jack", 19], ["Jack", 20]]

	*** Restoration Test - Restoring ['Jack',10] ***
	[["Jack", 6], ["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], 
	["Jack", 4], ["Jack", 5], ["Jack", 6], ["Jack", 7], 
	["Jack", 8], ["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], 
	["Jack", 14], ["Jack", 15], ["Jack", 16], ["Jack", 17], 
	["Jack", 18], ["Jack", 19], ["Jack", 20], ["Jack", 10]]

	*** Deleting another record -- one of the two 'Jack's with A:6 ***
	Deleting ["Jack", 6]
	All remaining records with 'Jack'
	[["Jack", 71], ["Jack", 1], ["Jack", 2], ["Jack", 3], ["Jack", 4], 
	["Jack", 5], ["Jack", 6], ["Jack", 7], ["Jack", 8], 
	["Jack", 9], ["Jack", 11], ["Jack", 12], ["Jack", 13], ["Jack", 14], 
	["Jack", 15], ["Jack", 16], ["Jack", 17], ["Jack", 18], 
	["Jack", 19], ["Jack", 20], ["Jack", 10]]
	records with 'A:103'
	[["Jack", 103], ["Diana", 103], ["Rachel", 103]]
	records with 'N:Diana'
	[["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 3], ["Diana", 4], 
	["Diana", 5], ["Diana", 6], ["Diana", 7], 
	["Diana", 8], ["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], 
	["Diana", 13], ["Diana", 14], ["Diana", 15], 
	["Diana", 16], ["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20], 
	["Diana", 101], ["Diana", 102], ["Diana", 103], 
	["Diana", 104], ["Diana", 105], ["Diana", 106], ["Diana", 107], 
	["Diana", 108], ["Diana", 109], ["Diana", 110], 
	["Diana", 111], ["Diana", 112], ["Diana", 113], ["Diana", 114], 
	["Diana", 115], ["Diana", 116], ["Diana", 117], 
	["Diana", 118], ["Diana", 119], ["Diana", 120]]
	records with 'A:3' but not 'N:Diana'
	[["Jack", 3], ["Rachel", 3]]
	records with 'N:Diana' but not 'A:3'
	[["Diana", 43], ["Diana", 1], ["Diana", 2], ["Diana", 4], ["Diana", 5], 
	["Diana", 6], ["Diana", 7], ["Diana", 8], 
	["Diana", 9], ["Diana", 10], ["Diana", 11], ["Diana", 12], ["Diana", 13], 
	["Diana", 14], ["Diana", 15], ["Diana", 16], 
	["Diana", 17], ["Diana", 18], ["Diana", 19], ["Diana", 20], ["Diana", 101], 
	["Diana", 102], ["Diana", 103], 
	["Diana", 104], ["Diana", 105], ["Diana", 106], ["Diana", 107], 
	["Diana", 108], ["Diana", 109], ["Diana", 110], 
	["Diana", 111], ["Diana", 112], ["Diana", 113], ["Diana", 114], 
	["Diana", 115], ["Diana", 116], ["Diana", 117], 
	["Diana", 118], ["Diana", 119], ["Diana", 120]]
	records with 'N:Diana' and 'A:3'
	[["Diana", 3]]
	records with 'N:Diana' or 'A:3'
	[["Diana", 43], ["Diana", 1], ["Diana", 2], ["Jack", 3], ["Diana", 3], 
	["Rachel", 3], ["Diana", 4], ["Diana", 5], 
	["Diana", 6], ["Diana", 7], ["Diana", 8], ["Diana", 9], ["Diana", 10], 
	["Diana", 11], ["Diana", 12], ["Diana", 13], 
	["Diana", 14], ["Diana", 15], ["Diana", 16], ["Diana", 17], ["Diana", 18], 
	["Diana", 19], ["Diana", 20], ["Diana", 101], 
	["Diana", 102], ["Diana", 103], ["Diana", 104], ["Diana", 105], 
	["Diana", 106], ["Diana", 107], ["Diana", 108], 
	["Diana", 109], ["Diana", 110], ["Diana", 111], ["Diana", 112], 
	["Diana", 113], ["Diana", 114], ["Diana", 115], 
	["Diana", 116], ["Diana", 117], ["Diana", 118], ["Diana", 119], 
	["Diana", 120]]

 	***END OF TESTS ***
The following comments will aid in understanding this output. We begin by creating a database, into which a first record '["Jack",71]' is inserted and then retrieved. (The summarization routine for the database simply maps such records into strings like "N:Jack A:71", ignoring all but the first two components of each record.) Next two additional records, '["Diana",43]' and '["Rachel",43]' are inserted, and then retrieved using the iterator 'db.contains("A:43")' (which iterates over both of these records, since both contain '43' as their second component.) The empty iterator 'db.contains("N:Walter")' is then used in combination with the 'arb' operation, which returns OM since no record contains the name 'Walter'.

Next we iterate over all records containing either of the names 'Jack' or 'Diana', The union iterator 'db.contains("N:Jack") + db.contains("N:Diana")' is saved in a variable and then used a second time to show that this is possible, and a third time in combination with the 'arb' operation.

Next the corresponding intersection and difference iterators are invoked, either to support iterations or in connection with 'arb' or the count operator '#' and seen to behave as expected. After this, we use the database record-change operation db(x) := y to change a record, which is then retrieved in changed form, using an iterator, while the previous record is seen to be absent, and other records in the database are seen to be unchanged. Then a second record '["Jack", 6]' containing the name "Jack" is added to the databases, and retrieved, first using 'db.contains("A:6")' (which retrieves only '["Jack", 6]'), and 'db.contains("N:Jack")' (which retrieves both '["Jack", 6]' and '["Jack", 72]'). The record '["Jack", 72]' is then deleted by using the record-change operation in the form db(x) := OM, its absence verified, and the original '["Jack",71]' record added back to the database. Following this, we add 60 records, containing ["Jack",j], ["Diana",j], and ["Rachel",j] for j from 1 to 20 to the database, verify via various retrievals and iterations that this has been done correctly, and delete a record, subsequently verifying the correctness of this deletion, and then re-inserting the deleted record and re-verifying. Finally we insert another 60 records, now containing ["Jack",j], ["Diana",j], and ["Rachel",j] for j from 101 to 120, and re-verify.

A Second Example. Value-semantics of SETL Databases, free form of records, databases as ordinary objects.

The following short program illustrates several other significant properties of SETL databases, specifically that

  1. they obey value-semantics rather than pointer semantics rules;

  2. records in them need not have any fixed structure;

  3. databases behave like ordinary SETL objects, i.e. can be put into tuples and/or sets, can be passed as parameters to subroutines, etc.

The program is:

	program test; -- test of SETL database class
	use SETL_database;
	
	db := SETL_database(pair_to_stg); -- create an empty database, to store test record
	[db,jack_id] := db with ["Jack",71]; -- add a first record
	[db,diana_id] := db with ["Diana",43]; -- add a second record 
	[db,rachel_id] := db with ["Rachel",43]; -- add a third record 
	
	print("Database records");
	for key in (db.contains("A:43") + db.contains("A:71")) loop 
		print(db(key)); 
	end loop;

	db_save := db;  -- save the original database
	
	db(jack_id) := db(jack_id) with "a swell guy";
	db(diana_id) := ["Diana",44]; db(diana_id) := db(diana_id) with "a swell gal";
	db(rachel_id) := db(rachel_id) with "a folksinger";
	db(rachel_id) := db(rachel_id) with "remember upcoming club date";
	
	print("\nEdited database");
	for key in (db.contains("A:43") + db.contains("A:71")) loop 
		print(db(key)); 
	end loop;

	print("\nSaved database");
	for key in (db_save.contains("A:43") + db_save.contains("A:71")) loop 
		print(db_save(key)); 
	end loop;
	
	print("\nComparison of records in the two databases");
	print("db_save(jack_id): ",db_save(jack_id)); 
	print("db(jack_id): ",db(jack_id));
	print("db_save(rachel_id): ",db_save(rachel_id)); 
	print("db(rachel_id): ",db(rachel_id));
	print("db_save(diana_id): ",db_save(diana_id)); 
	print("db(diana_id): ",db(diana_id));

	database_list := [db,db_save];

	print("\nFirst database");
	for key in (database_list(1).contains("A:43") + database_list(1).contains("A:71")) loop 
		print(database_list(1)(key)); 
	end loop;

	print("\nSecond database");
	for key in (database_list(2).contains("A:43") + database_list(2).contains("A:71")) loop 
		print(database_list(2)(key)); 
	end loop;
	
	procedure pair_to_stg(p); -- convert name, age pair to string
	
		[nm,age,-] := p; return "N:" + nm + " A:" + str(age);
	
	end pair_to_stg; 

	end test;
The preceding program produces the following output.
	Database records
	["Jack", 71]
	["Diana", 43]
	["Rachel", 43]

	Edited database
	["Jack", 71, "a swell guy"]
	["Rachel", 43, "a folksinger", "remember upcoming club date"]

	Saved database
	["Jack", 71]
	["Diana", 43]
	["Rachel", 43]

	Comparison of records in the two databases
	db_save(jack_id): ["Jack", 71]
	db(jack_id): ["Jack", 71, "a swell guy"]
	db_save(rachel_id): ["Rachel", 43]
	db(rachel_id): ["Rachel", 43, "a folksinger", "remember upcoming club date"]
	db_save(diana_id): ["Diana", 43]
	db(diana_id): ["Diana", 44, "a swell gal"]

	First database
	["Jack", 71, "a swell guy"]
	["Rachel", 43, "a folksinger", "remember upcoming club date"]

	Second database
	["Jack", 71]
	["Diana", 43]
	["Rachel", 43]

The program seen above creates a database and inserts three records, each a simple pair, into it. The database is then copied to a second variable 'db_save', and the original database is modified by editing each of its three records, to which additional string components are freely added. (The simplicity of this operation, which is a bit clumsy in a standard SQL database since it changes the number of SQL 'columns', illustrates one of the advantages of SETL databases: since records in them are entirely free-form, additional elements of any kind can be added to records in a completely dynamic way.) The edited records are retrieved from the edited database using an iterator, but then also from the saved database, which is seen to contain the original records, unchanged. This illustrates a second advantage of SETL databases, their value semantics, on which a 'transactional' capability can be built in the manner noted above. Finally we include the databases in a tuple, from which the databases are then recovered and used. This illustrates the fact that databases behave like ordinary SETL objects, and so can be put into tuples and/or sets, passed as parameters to subroutines, etc.

More Examples: Databases which store Free-form Text. Next we give a few examples having somewhat more realistic application flavors. The first of these, seen below, reads a file of text divided into paragraphs by blank lines, and inserts these paragraphs into a database. Paragraphs are summarized by collecting their words, reducing these to lowercase, suppressing any words which belong to a file of 'excluded_wds', and depluralizing them by dropping any terminal 's'. After building the database, the code seen below executes a few sample queries, some of which illustrate the use of Boolean combinations of search keys.

	program test; -- test of SETL database class
	use SETL_database;
	use string_utility_pak,get_lines_pak; 

	const punc := "';:<>,.?/\\|][}{=-_+\t ~!@#$%^&*()_-+=\"€˜€‘„`„˜"; 
		-- punctuation and whitespace characters
	var excluded_wds;  	
		-- we set up a global collection of words excluded from summaries
	
	excluded_wds := breakup("" +/ [line + " ": line in get_lines("excluded_wds")],punc);	
	excluded_wds := {case_change(wd,"ul"): wd in excluded_wds | wd /= ""};

	db := SETL_database(record_to_summary_stg); 
		-- create an empty database, to store test record
	lines := get_lines("database_text.txt");  
		-- read the input text for the database

	paragraphs := current_paragraph := [];  
		-- we will collect list of paragraphs, delimited by empty lines
	
	for line in lines loop   
		--loop over all lines,collecting paragraphs

		span(line,"\t "); 
		if line /= "" then 
			current_paragraph with:= line; continue; 
		end if;

	if current_paragraph /= [] then  
		-- collect and rebuild the current_paragraph, adding it to the database
	[db,-] := db with ("" +/ [pline + " ": pline in current_paragraph]); 
	current_paragraph := [];
	end if;

	end loop;

	if current_paragraph /= [] then  
		-- collect the final paragraph
	[db,-] := db with ("" +/ [pline + " ": pline in current_paragraph]); 
	end if;
	        -- now some database queries
	print("\nParagraphs containing 'object' *** ");
	for x in db.contains("object") loop 
		print("\n",db(x)); 
	end loop;
	
	print("\nParagraphs containing 'object' and 'value' *** ");
	for x in db.contains("object") * db.contains("value") loop 
		print("\n",db(x)); 
	end loop;
	
	print("\nParagraphs containing 'insertion' and 'fast' *** ");
	for x in db.contains("insertion") * db.contains("fast") loop 
		print("\n",db(x)); 
	end loop;
	
	procedure record_to_summary_stg(stg); -- convert paragraph to summary string
	
	wd_list := breakup(stg,punc);  -- form list of all words in paragraph
	wd_set := {drop_plural(case_change(wd,"ul")): wd in wd_list | 
		#wd > 1 and wd notin excluded_wds and 
	 (wd(1) notin 		
		"0123456789#:=,./?;'[]=-)(*&^%$#@!~`\n\t\r\"\\|{}+\x3E\x3C")};
	 
	return ("" +/ [" " + wd: wd in wd_set])(2..);  
		-- return collection of words as a blank-delimited string
	
	end record_to_summary_stg; 

	procedure drop_plural(stg); 
		x := rmatch(stg,"s"); 
		return stg; 
	end drop_plural;  -- drop plural ending from word

	end test;

The sample text file which this code (and the variants of it which follow) reads into its database is simply:

The underlying structure used to realize the large objects supported by SETL (and most other databases) is the B-tree. This famous structure, can be thought of as a way of maintaining large tuples in a form which allows fast execution of all the standard tuple operations, i.e., component value access, component change, component insertion, slicing, and slice assignment. B-trees can be thought of as arising by recursive elaboration of the simple idea of dividing very long tuples T (e.g. tuples of length 1 million) into sections of some smaller length, e.g., length approximately 1000.

If this is done, then a tuple of length 1 million will be represented by a tuple R of tuples, each of length roughly 1000, the whole list R also being of length approximately 1000. To make an insertion into the middle of such a structure (regarded as a representation of the large tuple T that would be obtained by concatenating all the components of R) is then much faster than it would be in a flat representation of T. Insertion in the middle of a flat representation might require all following components to be moved, so if T is of length one million, 500,000 of its components might have to be moved. But in the alternate representation suggested above the insertion is made in a sub-tuple of R having length roughly 1000 that is easily located, and so should require no more than 500 components to be moved. This gives a speedup that can be as high as 1000.

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

As already stated, the B-tree structure arises by recursive elaboration of the idea just described. Rather than arbitrarily dividing long tuples like T into some fixed number of pieces in the two-level way sketched above, we divide it into a tuple of tuples of tuples ..., down to as many levels as needed, limiting each of the tuples ('nodes') that appear at each level of this nested tree structure to some convenient maximum length. It turns out to be most convenient to insist that (with the one exception of the tree root) each 'node', (i.e., tuple) occurring in the data structure have a length lying between some integer n and 2n - 1.

This ensures that insertions remain very fast while at the same time limiting the number of tree levels that must be searched to locate a given component of the large tuple T that the tree structure R represents. Specifically, this number of levels can be no more than log n L where L is the length of the tuple represented.

The process of insertion of a new component into the large tuple represented by the recursive tree structure R will force insertion of a component at the bottom tree level, and this may cause the tree node into which the insertion is made to overflow the maximum length 2n - 1 to which it is supposed to be limited. But when this happens we can simply divide that node into two nodes of length n, inserting one of these at the next level up in the tree. Even though the process of recursive insertion thereby triggered may propagate all the way up the tree to its root, it remains orderly at each recursive level, as the more detailed code shown below makes plain.

The values stored in B-trees appear in leaf nodes at their bottom level. These leaf nodes are accessed via pointers stored in nodes one level higher.

If text stored in a database like that appearing in the preceding example is divided into longer paragraphs or chapters, it may be useful to support searches by combinations of keywords constrained to lie near each other in the text being searched. The following example shows one way in which such functionality can be supported. 'Assuming that 'near' can be interpreted as meaning 'separated by no more than 4 intervening words', we divide the text into overlapping zones of length 10 words. A new zone is started every 5 words, so that nearby words are certain to belong to some common zone. For each such zone we insert a 'zone record' containing its words, and for each paragraph collect the ids of the zones in the paragraph. Each time a paragraph completes, we insert a 'paragraph record' for it, indexed by the list of the zones in the paragraph; these are sanitized by representing them as hex strings prefixed by the character 'z'. To search for paragraphs containing two words 'wd1' and 'wd2' near to each other, we can then use the compound iterator

zone in contains(wd1) * contains(wd1), parag in contains(hex_of(zone)).

This is seen in the following code.

program test; -- test of SETL database class
	use SETL_database;
	use string_utility_pak,get_lines_pak; 

	const punc := "';:<>,.?/\\|][}{=-_+\t ~!@#$%^&*()_-+=\"€˜€‘„`„˜"; 
		-- punctuation and whitespace characters
	var excluded_wds;  
		-- we set up a global collection of words excluded from summaries
	var db;   -- make global set of words excluded from summaries
	var zod_list;  -- list of zone ids, used by record summary routine
	
	excluded_wds := breakup("" +/ [line + " ": line in get_lines("excluded_wds")],punc);
	excluded_wds := {case_change(wd,"ul"): wd in excluded_wds | wd /= ""};

	db := SETL_database(record_to_summary_stg); 
		-- create an empty database, to store test record
	lines := get_lines("database_text.txt");  
		-- read the input text for the database

	paragraphs := current_paragraph := []; 
		-- we will collect list of paragraphs, delimited by empty lines
	
	for line in lines loop   
		-- loop over all lines,collecting paragraphs

	 span(line,"\t "); 
	 if line /= "" then current_paragraph with:= line; continue; end if;

	 if current_paragraph /= [] then  
	 	-- collect and restart the current_paragraph, adding it to the database
	 digest_paragraph(current_paragraph);
	 current_paragraph := []; 
	 end if;

	end loop;
	
	if current_paragraph /= [] then  -- collect the final paragraph
	 digest_paragraph(current_paragraph);
	end if;
	         -- now some database queries
	print("\nParagraphs containing 'value' and 'pointer' nearby *** ");
	id_seen := {};
	for zone in db.contains("value") * db.contains("pointer"),
		 parag in db.contains(hex_of(zone)) | parag notin id_seen loop 
	 id_seen with:= parag; print("\n",db(parag)); 
	end loop;

	print("\nParagraphs containing 'value' *** ");
	 id_seen := {};
	for zone in db.contains("value"),
		 parag in db.contains(hex_of(zone)) | parag notin id_seen loop 
		id_seen with:= parag; 
		print("\n",db(parag)); 
	end loop;
	
	print("\nParagraphs containing 'pointer' *** ");
	 id_seen := {};
	 for zone in db.contains("pointer")
	 	, parag in db.contains(hex_of(zone)) | parag notin id_seen loop 
	 	id_seen with:= parag; print("\n",db(parag)); 
	 end loop;

	procedure digest_paragraph(para);

	 zones_of_paragraph := zones_of(para_stg := "" +/ [" " + line: line in para]); 
	 zod_list := OM; zone_ids := []; 
	 for zone in zones_of_paragraph loop 
	 	[db,zone_id] := db with zone; 
	 	zone_ids with:= zone_id; 
	 end loop;
	 zod_list := join([hex_of(zod): zod in zone_ids]," ");
	 [db,-] := db with para_stg;

	end digest_paragraph;
	 
	procedure zones_of(stg); -- find list of zones in paragraph
	
	 wd_list :=[wd in breakup(stg,punc) | wd /= "" 
	 	and case_change(wd,"ul") notin excluded_wds];
	 			-- form list of all words in paragraph
	 zones_in_para := [];   -- collect list of all zones in paragraph
	 
	 for j in [1,6..nwl := #wd_list] loop -- loop over the zones
	  zone_wds := wd_list(j..(j + 9) min nwl);  -- the words in the zone
	  wd_set := {drop_plural(ccw := case_change(wd,"ul")): wd in zone_wds | #wd > 1 and 
	  not (wd(1) in " 0123456789#:=,./?;'[]=-)(*&^%$#@!~`\n\t\r\"\\|{}+\x3E\x3C")};
	  if (zone_stg := ("" +/ [" " + zw: zw in wd_set])) /= "" then 
			zones_in_para with:= zone_stg(2..); 
	  end if;
	 end loop;

	 return zones_in_para; 
	
	 end zones_of; 
	
	procedure record_to_summary_stg(stg); -- convert paragraph to summary string
	 		-- note that the database we are setting up contains records of two 
	 		-- different kinds, namely zone records representing short stretches of words,
	 		-- and paragraph records representing entire paragraphs. 
	 		-- zone records are added by the statement [db,zone_id] := db with zone
	 		-- seen above; paragraph records are added by the statement 
	 		-- [db,-] := db with para_stg;
	 		-- This summary string routine causes the summary of a zone to be the words
	 		-- within it, and the summary of a paragraph to be the zones within it. 
	 		-- This has the advantage that intersection-iterators like 
	 		
	 		-- B>for zone in db.contains("value") * db.contains("pointer")...
	 		
	 		-- can be used to iterate over nearby pairs and triples of words, 
	 		-- but the disadvantage that iterations over the paragraphs containing
	 		-- a single must also be written using a compound iterator
	 		-- that iterates over zones first.
	 		 
	 if zod_list /= OM return zod_list; end if;
	 
	 
	 wd_list := breakup(stg,punc);  -- form list of all words in paragraph
	 wd_set := {drop_plural(case_change(wd,"ul")): wd in wd_list | 
	 	#wd > 1 and wd notin excluded_wds and 
	 	not (wd(1) in 
		" 0123456789#:=,./?;'[]=-)(*&^%$#@!~`\n\t\r\"\\|{}+\x3E\x3C")};
	
	 return ("" +/ [" " + wd: wd in wd_set]);  
	 	-- return collection of words as a blank-delimited string
	 
	end record_to_summary_stg; 

	procedure drop_plural(stg);  -- drop plural ending from word
	 us := rmatch(stg,"us"); if us /= "" then return stg + us; end if;
	 us := rmatch(stg,"ss"); if us /= "" then return stg + us; end if;
	 rmatch(stg,"s"); return stg; 
	end drop_plural; 

	procedure hex_of(id); 
	 return "z" +/ [char((ac := abs(c)) / 16 + 97) 
		+ char((ac mod 16) + 97): c in id]; 
	end hex_of;
	
end test;

Given the sample text listed above as input, the program seen above produces the following output.
Paragraphs containing 'value' and 'pointer' nearby ***

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

Paragraphs containing 'value' ***

The underlying structure used to realize the large objects supported by SETL (and most other databases) is the B-tree. This famous structure, can be thought of as a way of maintaining large tuples in a form which allows fast execution of all the standard tuple operations, i.e., component value access, component change, component insertion, slicing, and slice assignment. B-trees can be thought of as arising by recursive elaboration of the simple idea of dividing very long tuples T (e.g. tuples of length 1 million) into sections of some smaller length, e.g., length approximately 1000.

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

The values stored in B-trees appear in leaf nodes at their bottom level. These leaf nodes are accessed via pointers stored in nodes one level higher.

Paragraphs containing 'pointer' ***

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

The values stored in B-trees appear in leaf nodes at their bottom level. These leaf nodes are accessed via pointers stored in nodes one level higher.

Note that the keywords 'value' and 'pointer' occur both in the paragraph

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.
and in the paragraph
The values stored in B-trees appear in leaf nodes at their bottom level. These leaf nodes are accessed via pointers stored in nodes one level higher.
However, the code seen above retrieves only the first of these paragraphs, since only in this paragraph do the two words 'value' and 'pointer' occur within 5 words of each other.

In the code seen above, the 'digest_paragraph' procedure used to process each of the paragraphs occurring in the source text read first calls the 'zones_of' procedure to return the list of zones in the paragraph. The 'zones_of' procedure builds the list of zones by simply iterating over the words in the paragraph in steps of 5 and returning the group of 10 words beginning at each starting point. These short text segments are then inserted into the database, and the ids of the resulting records collected into a globally accessible variable 'zod_list'. The paragraph itself is then entered into the database, causing the normal call to 'record_to_summary_stg', which however behaves unusually since it detects 'zod_list' and returns this as the summary of the paragraph passed to it. This causes each 'zod_list' entry to reference the paragraph, allowing nested iterators like the

	zone in contains(wd1) * contains(wd1), parag in contains(hex_of(zone)).

seen in the code above to behave in the expected way.

The clauses 'parag notin id_seen' appearing in the queries seen above, and the other manipulations of th variable 'id_seen' simply serve to prevent records from appearing repeatedly in our iterations.

Two shortcomings of the code seen above can be noted:

  1. It only supports searches for pairs of words occurring nearby, but not for pairs of words occurring anywhere in a paragraph.

  2. Words occurring in the source text are uselessly stored three times, once in the paragraph records formed, then again in each of two overlapping zone records.

The variant code seen below cures these problems in the following way. Words (logically) inserted into zone records, but not words inserted into paragraph records, are prefixed with a '.' (i.e. a 'dot'). This allows proximity-based searches like those seen above to be written using iterators like

zone in db.contains(".value") * db.contains(".pointer"), parag in db.contains(hex_of(zone))

while searches ignoring proximity can be written in the normal way, e.g. as

parag in db.contains("value") * db.contains("pointer")

One can even use hybrid combinations like

zone in db.contains(".value") * db.contains(".pointer"), parag in db.contains(hex_of(zone)) * db.contains("nodes")

which finds all paragraphs containing the three words 'value', 'pointer', and 'node', constraining the first two of these words, but not the third, to lie near each other.

The code which follows is changed only slightly from its preceding version. Insertion of each successive zone into the database is preceded by a call 'munge(zone)', which extracts the words appearing in the zone, prefixes each of them by a '.', posts the words in 'zod_list' as a blank-delimited string, and returns a null string (which becomes the 'node record') associated with the zone; however the changes also made to the summarization routine 'record_to_summary_stg' ensure that each zone is properly indexed by the words it contains. The paragraph-level processing collects the ids of all these zones, and posts them to 'zod_list' as a tuple. Then, when 'record_to_summary_stg' is called at the paragraph level, the ids in 'zod_list' are prefixed to the paragraph summary, causing the paragraph to be indexed both by the ids of the zones in it and by the non-excluded words which it contains.

program test; -- test of SETL database class
	use SETL_database;
	use string_utility_pak,get_lines_pak; 

	const punc := "';:<>,.?/\\|][}{=-_+\t ~!@#$%^&*()_-+=\""; 
			-- punctuation and whitespace characters
	var excluded_wds; 
		-- we set up a global collection of words excluded from summaries
	var db;   -- make global set of words excluded from summaries
	var zod_list;  -- list of zone ids, used by record summary routine
	
	excluded_wds := breakup("" +/ [line + " ": line in get_lines("excluded_wds")],punc);
	excluded_wds := {case_change(wd,"ul"): wd in excluded_wds | wd /= ""};

	db := SETL_database(record_to_summary_stg);
		 -- create an empty database, to store test record
	lines := get_lines("database_text.txt");  
		-- read the input text for the database

	paragraphs := current_paragraph := [];  
		-- we will collect list of paragraphs, delimited by empty lines
	
	for line in lines loop   
		--loop over all lines,collecting paragraphs

	 span(line,"\t "); 
	 if line /= "" then 
	 	current_paragraph with:= line; 
	 	continue; 
	 end if;

	 if current_paragraph /= [] then  
	 	-- collect the current_paragraph, adding it to the database
	 digest_paragraph(current_paragraph);
	 current_paragraph := []; 
	 end if;

	end loop;
	
	if current_paragraph /= [] then  -- collect the final paragraph
	 digest_paragraph(current_paragraph);
	end if;
	         -- now some database queries
	print("\nParagraphs containing 'value' and 'pointer' nearby *** ");
	id_seen := {};
	for zone in db.contains(".value") * db.contains(".pointer"), 
		parag in db.contains(hex_of(zone)) | parag notin id_seen loop 
	 id_seen with:= parag; print("\n",db(parag)); 
	end loop;

	print("\nParagraphs containing 'value' and 'pointer' anywhere *** ");
	id_seen := {};
	for parag in db.contains("value") * db.contains("pointer") | parag notin id_seen loop 
	 id_seen with:= parag; print("\n",db(parag)); 
	end loop;

	print("\nParagraphs containing 'value' and 'pointer' nearby, together with 'rule' *** ");
	id_seen := {};
	for zone in db.contains(".value") * db.contains(".pointer"), 
		parag in (db.contains(hex_of(zone)) * db.contains("rule")) 
			| parag notin id_seen loop 
		id_seen with:= parag; print("\n",db(parag)); 
	end loop;

	procedure digest_paragraph(para);  
		-- finds zones in paragraph and inserts them into database

	 zones_of_paragraph := zones_of(para_stg := "" +/ [" " + line: line in para]); 
	 zone_ids := []; 
	 for zone in zones_of_paragraph loop 
	 [db,zone_id] := db with munge(zone); 
	 zone_ids with:= zone_id; end loop;
	 zod_list := [hex_of(zod): zod in zone_ids];  
	 	-- build the paragraph build the paragraph-level zod_list, 
	 	-- leaving it as a tuple
	 [db,-] := db with para_stg;

	end digest_paragraph;
	
	procedure munge(zone_stg);  
		-- finds zones in paragraph and inserts them into database
	 wds_list := breakup(zone_stg," ");  
	 	 -- break into words using blanks as delimiters
	 zod_list := join(["." + wd: wd in wds_list]," "); 
	 	-- prefix words in zone by '.' and rejoin into string, posting this globally
	 return "";      -- return empty string as nominal contents of zone
	end munge;
	 
	procedure zones_of(stg); 
		-- find list of zones in paragraph
	
	 wd_list :=[wd in breakup(stg,punc) | 
	 	wd /= "" and case_change(wd,"ul") notin excluded_wds];  
	 		 -- form list of all words in paragraph
	 zones_in_para := [];   -- collect list of all zones in paragraph
	 
	 for j in [1,6..nwl := #wd_list] loop 
	 	-- loop over the zones
	  zone_wds := wd_list(j..(j + 9) min nwl);  
	  	-- the words in the zone
	  wd_set := {drop_plural(ccw := case_change(wd,"ul")): wd in zone_wds | 
	  	#wd > 1 and (wd(1) notin 
	  	" 0123456789#:=,./?;'[]=-)(*&^%$#@!~`\n\t\r\"\\|{}+\x3E\x3C")};
 
	  if (zone_stg := ("" +/ [" " + zw: zw in wd_set])) /= "" then 
		zones_in_para with:= zone_stg(2..); 
	  end if;
	 end loop;

	 return zones_in_para; 
	
	 end zones_of; 
	
	procedure record_to_summary_stg(stg); -- convert paragraph to summary string
	 
	 if is_string(zod_list) then return zod_list; end if; 
	 	-- this clause processes zones
	 
	 wd_list := breakup(stg,punc);  
	 	-- paragraph processing continues here: form list of all words in paragraph
	 wd_set := {drop_plural(case_change(wd,"ul")): wd in wd_list | #wd > 1 
			and wd notin excluded_wds and 
	 not (wd(1) in 
			" 0123456789#:=,./?;'[]=-)(*&^%$#@!~`\n\t\r\"\\|{}+\x3E\x3C")};

	 return join(zod_list," ") +/ [" " + wd: wd in wd_set]; 
	 	-- return collection of words, prefixed by ids in zod_list, 
	 	-- as a blank-delimited string
	 
	end record_to_summary_stg; 

	procedure drop_plural(stg);  -- drop plural ending from word
	 us := rmatch(stg,"us"); if us /= "" then return stg + us; end if;
	 us := rmatch(stg,"ss"); if us /= "" then return stg + us; end if;
	 rmatch(stg,"s"); return stg; 
	end drop_plural; 

	procedure hex_of(id); 
		return "z" +/ [char((ac := abs(c)) / 16 + 97) 
		+ char((ac mod 16) + 97): c in id]; 
	end hex_of;
	
end test;

The code seen above produces the following output, which shows that each of the three sample iterations appearing in this example behaves properly.

Paragraphs containing 'value' and 'pointer' nearby ***

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

Paragraphs containing 'value' and 'pointer' anywhere ***

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

The values stored in B-trees appear in leaf nodes at their bottom level. These leaf nodes are accessed via pointers stored in nodes one level higher.

Paragraphs containing 'value' and 'pointer' nearby, together with 'rule' ***

Database objects have value, rather than pointer semantics. This rule is implemented in an efficient way.

The examples seen above hint at the way in which SETL databases can be used to organize proximity-based queries in geographic information applications.

12.2. A First Sketch of the SETL Database Implementation concept: simplest form of the basic B-tree structure.

The underlying structure used to realize the large objects supported by SETL (and most other databases) is the B-tree. This famous structure can be thought of as a way of maintaining large tuples in a form which allows fast execution of all the standard tuple operations, i.e., component access, component change, component insertion, slicing, and slice assignment. B-trees can be thought of as arising by recursive elaboration of the simple idea of dividing very long tuples T (e.g. tuples of length 1 million) into sections of some smaller length, e.g., length approximately 1000. If this is done, then a tuple of length 1 million will be represented by a tuple R of tuples, each of length roughly 1000, the whole list R also being of length approximately 1000. To make an insertion into the middle of such a structure (regarded as a representation of the large tuple T that would be obtained by concatenating all the components of R) is then much faster than it would be in a flat representation of T. Insertion in the middle of a flat representation might require all following components to be moved, so if T is of length one million, 500,000 of its components might have to be moved. But in the alternate representation suggested above the insertion is made in a sub-tuple of R having length roughly 1000 that is easily located, and so should require no more than 500 components to be moved. This gives a speedup that can be as high as 1000.

As already stated, the B-tree structure arises by recursive elaboration of the idea just described. Rather than arbitrarily dividing long tuples like T into some fixed number of pieces in the two-level way sketched above, we divide it into a tuple of tuples of tuples ..., down to as many levels as needed, limiting each of the tuples ('nodes') that appear at each level of this nested tree structure to some convenient maximum length. It turns out to be most convenient to insist that (with the one exception of the tree root) each 'node', (i.e., tuple) occurring in the data structure have a length lying between some integer n and 2n - 1. This ensures that insertions remain very fast while at the same time limiting the number of tree levels that must be searched to locate a given component of the large tuple T that the tree structure R represents. Specifically, this number of levels can be no more than log n L where L is the length of the tuple represented.

The process of insertion of a new component into the large tuple represented by the recursive tree structure R will force insertion of a component at the bottom tree level, and this may cause the tree node into which the insertion is made to overflow the maximum length 2n - 1 to which it is supposed to be limited. But when this happens we can simply divide that node into two nodes of length n, inserting one of these at the next level up in the tree. Even though the process of recursive insertion thereby triggered may propagate all the way up the tree to its root, it remains orderly at each recursive level, as the more detailed code shown below makes plain.

Another crucial advantage of the recursive B-tree structure is that it allows value semantics to be implemented efficiently for large tuples. If, for example, we copy a tuple t1 into a tuple t2 by writing t2 := t1, and then change a designated component of t1 by writing t1(i) := x, we only need to copy and modify those tree nodes of the changed tree t1 which lie on the tree path down to the point at which the modification occurs. All other tree nodes can simply be shared between the two trees t1 and t2. Thus the amount of node copying and modification required by the rules of value semantics is the product of log L * 2n, where L is the length of the tuple represented, and 2n - 1 is the maximum allowed node size.

The generic B-tree code which follows reflects these considerations, some explicitly and others implicitly. Tree nodes are represented as objects, of class 'Btup', containing two vectors 'cum' and 'tup'. The 'tup' instance variable held in each tree node stores the list of child nodes of that node, or, in the case of nodes of height 1, stores the components of the represented tuple that come under it. The j'th component of the 'cum' instance variable, whose value is a tuple identical in length to 'tup', stores the total number of leaves that come under any of the nodes tup(1)...tup(j). These 'cum' vectors allow search down the tree for a twig of given index i within the large tuple T that the tree represents by a fast binary-search like process. This is seen in the procedure self(i) that appears in the following code, which is written as a single line of recursive search code. The same is true for the component change operation seen in the procedure self(i) := x that follows. It is worth thinking through the action of this latter routine in detail to understand how much node copying it implies when a component is changed, and where these copies occur in the code.

The 'cum' vector illustrates the important idea of storing 'cumulant' quantities, derived by some addition-like operation from the children of a node, at each of the nodes of a B-tree. This idea generalizes readily. Multiple such 'cumulants' can be stored in each node. Each such cumulant supports a corresponding form of fast binary search in the tree. This idea is explored more fully in the next section.

Since all of the twigs in a B-tree are required to have the same height, that is, to lie at the same distance from the tree root, it follows that every tree node lies at the same distance from all of the twigs that come under it. This height is third item of value data stored in each node.

In respect to all of these node data items, particularly 'cum' and 'tup', node objects have value semantics. That is, change of any component of these two tuple-valued instance variables causes a copy of the changed object to be made in such a way as to guarantee that no other object referenced directly or indirectly by an independent variable is affected. It should be clear that this implies that the large tuples represented by our system of tree nodes also have value semantics.

The two most revealing routines in the code seen below are those for the operations 'self(i)' and 'self(i) := x'. After these two most essential routines, the next most crucial is the concatenation routine that follows them. Concatenation is most easily understood in the special case that the two trees being concatenated are of the same height. In this case, it should be clear that they can be concatenated simply by concatenating their top-most nodes. If this generates a top node whose length is less than the limit 2n on which we insist, we have only to set the cumulant vector correctly; otherwise we split the children of the over-fat top node that results into two parts, making each of them the child of a separate parent, and introducing a new top node having these nodes as its children. But if the two nodes being concatenated are not of the same hight, then either the first or the second will be taller than the other. Suppose, to begin with, that the first tree t1 is taller than the second tree t2. Then we descend to the rightmost child t'1 of the first tree, and recursively concatenate t'1 with the second tree. If the resulting tree c'1 is of the same height as t'1, then it replaces t'1 as the final child of t1, thereby definining a new B-tree r1 whose internally stored cumulant we simply need to adjust. But if c'1 is taller than t'1 it will have two children, each of the same height as t'1. In this case these two children replace the single child t'1 of t1, resulting in a new B-tree r1. If this r1 does not have too many children, its stored cumulant is simply adjusted and it becomes the concatenation result. But if r1 has too many children, we spit these children evenly among two new parent nodes, which then become the two children of a result tree one level higher than the original t1. This conditional splitting operation, like all others in the code which follows, is performed by the utility routine 'ifsplit()' seen below.

The case of concatenations in which t2 is taller than t1 is handled symmetrically to the case just described.

The concatenation procedure we have just described is used as a subroutine by various other of the important procedures in the set seen below, including end-slice extraction, general slice extraction, and slice assignment. End-slice extraction, which is written as t(i..) where t is a B-tree object and i an integer, is performed as follows. We first locate that child t' of the top node of t whose associated suB-tree contains the leaf with index i. Let c be the total number of leaves in children of t which precede t'. Then we form the B-tree t2 whose children are those children of t which follow t', form t'(i - c..) recursively, and concatenate t'(i - c..) and t2 + t to get the desired result. It should be clear that a prefix-slice extraction t(1..i) could be performed in a manner symmetrical to the procedure just described, and that a general slice extraction t(1..i) can be viewed as a combination of a prefix-slice and an end-slice operation. The detailed code for general slice extraction seen in what follows does essentially this, but in a slightly optimized way which we leave it to the reader to examine.

Slice assignment t(i..j) := t* is performed as if the result to be formed had been written as t(1..i - 1) + t* + t(j + 1..). The remaining B-tree operations in the code seen below all have easy expressions in terms of the basic operations we have just described.

The collection of routines seen below also includes the procedures used to manage the iterator objects associated with our btup objects, and to handle iterations over these and their Boolean combinations. This group of routines is discussed following the code itself.

The following code reflects all of the foregoing design considerations, which it expands in detail.

 class btup;    -- B-tree class
 class var debug := false;

 procedure create();  -- create blank btup
 procedure set(t);  -- set btup object from standard SETL tuple

 procedure check_consistency();  -- check consistency of tree

end btup;

class body btup;    -- B-tree class
 const minw := 5, maxw := 2 * minw - 1;   
 	 -- minimum number of descendants, except at top
 
 var h := 1,cum := [],tup := [];   -- height, cumulant tuple, top-level tuple;
  -- Note that B-trees have value semantics, but with sharing of sub-trees.
  -- The cumulant records the number of descendants of each node, 
  -- plus those of all siblings to its left.
  
 var iter_ptr,compno;      -- iteration pointer and count
 
 procedure create();  -- create empty B-tree top from tuple
  iter_ptr := newat(); compno := newat();   
  	-- set up iteration stack pointer and compno pointer
  return self;
 end create;

 procedure self(i);  -- component extraction; 
   -- descend iteratively to desired leaf using cumulant to find child containing it
  return if h = 1 then tup(i) elseif i <= cum(1) then tup(1)(i)
    elseif exists c = cum(j) | i <= c then tup(j)(i - cum(j - 1))
     else OM end if;
 end;

 procedure self(i) := x;  -- component change
   -- descend iteratively to desired leaf using cumulant to find child containing it
   -- then change the value at this node. Note that the cumulant need not be changed 
   -- in this case, since here it represents the number of leaves beneath each node,
   -- which this operation does not change.
  if h = 1 then tup(i) := x; elseif i <= cum(1) then tup(1)(i) := x;
    elseif exists c = cum(j) | i <= c then tup(c)(i - cum(j - 1)) := x; 
     else abort("index " + str(i) + " is out of range"); end if;
 end;
 
 procedure -self;  
 	-- convert to tuple by recursively converting and concatenating descendants
  return if h <= 1 then tup else [] +/ [-t: t in tup] end if;
 end;

 procedure #self;  -- total leaf length, given by cumulant
  return cum(#cum);
 end;

 procedure self + x;  -- concatenation

  if is_tuple(xx := x) then x := btup(); x := x.set(xx); end if;  
	-- force second argument to btuple
  if #cum = 0 then return x; end if; 
  if #x.cum = 0 then return self; end if;  -- null cases

  new := btup();   -- create an empty B-tuple

  if x.h = h then   -- concatenated btup has same height

   new.tup := tup + x.tup; new.h := h;  -- perform a top-level concatenation

   c1l := cum(#cum); new.cum := cum + [c + c1l: c in x.cum];
       -- get cumulant of concatenation, adjusting second part
   return new.ifsplit();   -- split if necessary 

  elseif x.h < h then  -- concatenated btup is shorter

   new.tup := tup; new.cum := cum; new.h := h;   
   	-- copy the top node of this tree

   end_elt := tup(nt := #tup);  -- the final descendant of this tree's root
   end_elt := end_elt + x;  -- recursively catenate x, 1 level down

   if end_elt.h < h then    -- the subconcatenation has not split
    (new.tup)(nt) := end_elt;   -- just install
    new.cum(nt) := new.cum(nt) + x.cum(#x.cum);  
    	-- adjust the cumulant to reflect the added descendants

    return new;  -- return the new tree
   end if;

     -- otherwise the subconcatenation has grown taller, i.e. has split
     -- the descendants of the new top level root 
     -- replace the prior last node at this level
     -- (which is the node to which x was concatenated)
   new.tup(nt..nt) := end_elt.tup; new.h := h;
       -- replace the last child of the original tree 
       -- with the children of the new split sub-tree
 
   c1ml := if nt = 1 then 0 else cum(nt - 1) end if;  
   	-- get cumulant prior to last node of original tree 
   new.cum(nt..nt) := [c + c1ml: c in end_elt.cum];  
   	-- add this to all subsequent cumulants

   return new.ifsplit();   -- split if necessary 
   
  else  -- otherwise concatenated element is taller

   new.tup := x.tup; new.cum := xc := x.cum; new.h := xh := x.h;  
   	-- copy the top node of x
   first_x_elt := x.tup(1);  -- the first element of x
   tot_cum := cum(#cum);   -- total cumulant of original tree
   
   first_x_elt := self + first_x_elt;  
   	-- recursively catenate this tree to first child of x
   if first_x_elt.h < xh then  -- the subconcatenation has not split
    (new.tup)(1) := first_x_elt;   -- it becomes first child of the new tree
    for j in [1..#xc] loop 
    	new.cum(j) +:= tot_cum;   -- adjust the later cumulants
    end loop;
    return new;  -- return the new tree
   end if;

     -- otherwise the subconcatenation has grown taller, i.e. has split
     -- the descendants of the new top level root replace 
     -- the prior first node at this level
     -- (which is the node of x to which our original tree was concatenated)
   new.tup(1..1) := first_x_elt.tup;  
       -- replace the first child of the original tree 
       -- with the children of the new split sub-tree

   new.cum(1..1) := first_x_elt.cum;  
   	-- likewise replace the cumulant of the first child of the original tree
   for j in [3..#xc + 1] loop   -- adjust all later cumulants
   	new.cum(j) +:= tot_cum; 
   end loop;
   
   return new.ifsplit();   -- split if necessary 
 
  end if;

 end;

 procedure ifsplit();  -- split into 2 nodes if overpopulated

  if (nt := #tup) <= maxw then return self; end if;
  	   -- needn't split if not above length limit

  t1 := tup(1..nto2 := nt/2); t2 := tup(nto2 + 1..); -- split the top node in half
  c1 := cum(1..nto2); c2 := cum(nto2 + 1..);   -- split its cumulant in half
  cum1 := cum(nto2); cum2 := cum(nt);    
  	-- get cumulants for the two new nodes that will be formed
  
    -- form a new root with just two descendants
  new1 := btup(); new1.h := h; new1.tup := t1; new1.cum := c1;  
  	-- initialize first child and its cumulant
  new2 := btup(); new2.h := h; new2.tup := t2; 
  new2.cum := [c - cum1: c in c2]; -- initialize second child and its cumulant
 
  newtop := btup();  -- create a new empty node
  newtop.tup := [new1,new2]; 
  newtop.cum := [cum1,cum2]; newtop.h := h + 1; 
  	-- attach children, cumulant, and set height

  return newtop;
  
 end ifsplit;

 procedure x + self;  -- concatenation with btup in second position
  if is_tuple(xx := x) then x := btup(); x := x.set(xx); 
  else abort("bad concatenation argument: " + str(x)); end if;  
       -- force first argument to btuple
  return x + self; -- then just concatenate
 end;

 procedure set(t);  -- set a B-tree from a tuple

  if (nt := #t) <= maxw then 
   h := 1; cum := [1..nt]; tup := t;
   return self;
  else  -- make two trees of half length, and concatenate them
   bt1 := btup(); bt1 := bt1.set(t(1..nt/2));  -- first half
   bt2 := btup(); bt2 := bt2.set(t(nt/2 + 1..));  -- second half
   return bt1 + bt2;  -- return concatenated version
  end if;
 end set;

 procedure self with x;  -- item concatenation
  return self + [x];  -- just concatenate singleton tuple
 end;

 procedure self(i..);  -- end slice extraction

  if i > (cncp1 := cum(nc := #cum) + 1) or i < 1 then 
	abort("end slice index out of range: " + str(i)); end if;
  if i = cncp1 then return btup(); end if;  -- empty result
  if i = 1 then return self; end if;    -- the whole shebang
  
  if h = 1 then    -- minimal height case; slice the tuple
   new := btup(); new.h := 1; new.tup := tup(i..); new.cum := [1..nc - i + 1]; 
   return new;
  end if;

  must := exists c = cum(j) | c >= i; 
  	-- find the child position to which the slice propagates
  cumbef := if j = 1 then 0 else cum(j - 1) end if;  
  	 -- get the cumulant prior to position of the sliced child
  tj := tup(j);    -- get the child to which the slice propagates
  
  if j = nc then return tj(i - cumbef..); end if;  
  	-- just return slice of the child if this is last
  
  tail := btup(); tail.h := h; tail.tup := tup(j + 1..);  
  	-- otherwise get siblings following the sliced child 
  cj := cum(j); tail.cum := [c - cj: c in cum(j + 1..)];  
  	-- adjust cumulant for this isolated 'tail' group of siblings
 
  return tj(i - cumbef..) + tail;  
  	-- catenate the 'tail' to the sliced child, and return

 end;

 procedure self(i..j);  -- general slice extraction

  if i > (cncp1 := (cnc := cum(nc := #cum)) + 1) or i < 1 then  
  	-- check that first index is in range
   abort("first slice index out of range: " + str(i)); 
  end if;
  if j < i - 1 or j > cnc then 
  	abort("second slice index out of range: " + str(i)); 
  end if;
  		-- check that second index is in range
  if j = i - 1 then return btup(); end if;  
  	-- in self(i..i in self(i..i - 1) case, return an empty btuple

  if h = 1 then    -- minimal height case; just slice the tuple
   new := btup(); new.h := 1; new.tup := tup(i..j); new.cum := [1..j - i + 1];   
   	-- return a height 1 btuple
   return new;
  end if;

  must := exists c = cum(jloc) | c >= j; 
  	-- find the top find the top-level child location of the index j

  if i = 1 then   -- we have a prefix slice extraction t(1..j)
 
   tj := tup(jloc); -- get child to which slice propagates
   
   if jloc = 1 then return tj(1..j); end if; 
   	-- if this is first child, just get and return slice of first component 
   
   pref := btup();   
   	-- otherwise will generate new prefix, consisting of all preceding siblings
   cumbef := cum(jloc - 1); subslice := tj(1..j - cumbef);  
   	-- this 'subslice' will be concatenated to the preceding siblings
   
   if jloc = 2 then   
   	-- the preceding sibling(s) would form a tree with #tup = 1; so descend 1 level
    pref.h := h - 1; pref.tup := (t1 := tup(1)).tup; pref.cum := t1.cum;  
    	-- in this case, prefix is identical with first node
   else     
   	-- taking prefix part won't produce tree with #tup = 1, 
   	-- so we form a new prefix tree
    pref.h := h; pref.tup := tup(1..jloc - 1); 
    pref.cum := cum(1..jloc - 1);
   end if;
   
   return pref + subslice;  
   	-- catenate tail to the appropriate slice of the child, and return

  end if;
    
    -- in the remaining case, two of the children will be sliced
  must := exists c = cum(iloc) | c >= i; 
  	-- find the top-level tup location of the first index i
  ind := tup(iloc);   -- this is the first child that will be sliced
  prior_cum := if iloc = 1 then 0 else cum(iloc - 1) end if; 
		-- get the cumulant prior to this child

  if iloc = jloc then  
  	-- if the two children being sliced are the same, 
  	-- then just do a subextraction from this child
    return ind(i - prior_cum..j - prior_cum);    -- return this subextraction
    end if;

    jnd := tup(jloc);      -- this is the second child that will be sliced
    jprior_cum := cum(jloc - 1);  -- get the cumulant prior to this second child

    ipart := ind(i - prior_cum..); jpart := jnd(1..j - jprior_cum);
        -- recursively extract the appropriate pieces of these two children
        -- one of these is a head extraction, the second is a tail extraction

    if iloc + 1 = jloc then return ipart + jpart; end if;
        -- if the two tup locations differ by 1, just catenate these two extracted pieces

    newmid := btup();   -- generate new a new node for the 'middle' nodes
  
    if iloc + 2 < jloc then     -- the 'middle' part has at least 2 nodes
      newmid.h := h; newmid.tup := tup(iloc + 1..jloc - 1);     
      	-- attach these nodes to the new empty tree formed above 
      cumbef := cum(iloc); newmid.cum := [c - cumbef: c in cum(iloc + 1..jloc - 1)];     
		-- attach a properly adjusted cumulant
    else              -- middle has just 1 node; use this node as the middle tree
      newmid := tup(iloc + 1);
    end if;  
    
    return ipart + newmid + jpart;       
    	-- return concatenation of the front, middle, and end
    
  end;

  procedure self(i..j) := x;    -- slice assignment

    if is_tuple(xx := x) then       -- force x to btup if it is a tuple
      x := btup(); x := x.set(xx);
    elseif type(x) /= "BTUP" then    -- otherwise x must already be a btup
      abort("illegal slice-assignment right-hand side: " + str(x));
    end if;
    
    		-- check that first index is in range
    if i > (cncp1 := (cnc := if (nc := #cum) = 0 then 0 else cum(nc) end if) + 1) 
		or i < 1 then
      abort("first slice-assignment index out of range: " + str(i)); 
    end if;
    
         -- check that second index is in range
    if j < i - 1 or j > cnc then 
    	abort("second slice-assignment index out of range: " + str(i)); 
	end if;

    if i = 1 then       -- the over-written part of this tree is a prefix

      if j = cnc then  -- the whole initial tree is over-written; just copy x to self
        h := x.h; tup := x.tup; cum := x.cum; return x;    
        	-- modify this btup; return right-hand side
      end if; 

            -- otherwise only a prefix of the initial tree is over-written
      tail := self(j + 1..);       
      	-- extract the trailing B-tup section that is not over-written 
      new := btup(); new.h := x.h; new.tup := x.tup; new.cum := x.cum;     
      	-- make a copy of x

      new := new + tail;      -- catenate the assigned part plus the retained part
      h := new.h; tup := new.tup; cum := new.cum;    -- modify this btup
      return x;                      -- return right-hand side
      
    end if;
            -- in this final case, a middle portion of the original tree is over-written 
    pref := self(1..i - 1);       -- get the retained prefix
    new := btup(); 
    new.h := x.h; new.tup := x.tup; new.cum := x.cum;     
    	-- the assigned part
    
    if j = cnc then    -- over-written part is suffix
      new := pref + new;      -- catenate the retained part plus the assigned part 
    else         -- over-written part is middle
      tail := self(j + 1..); new2 := btup(); 
      new2.h := x.h; new2.tup := x.tup; new2.cum := x.cum; 
      new := pref + new + tail;      
      	-- catenate the retained part plus the two assigned parts 
    end if; 

    h := new.h; tup := new.tup; cum := new.cum;      -- modify this btup
    return x;                      -- return right-hand side

  end;

  procedure self(i..) := x;    -- end slice assignment
    self(i..#self) := x; return x;
  end;

  procedure from self;    -- front extraction
    if tup = [] then return OM; end if;    -- null case
    x := self(1); y := self(2..);
    h := y.h; cum := y.cum; tup := y.tup; 
    return x;
  end;

  procedure frome self;    -- end extraction
    if tup = [] then return OM; end if;    -- null case

    x := self(ln := cum(#cum)); y := self(1..ln - 1);
    h := y.h; cum := y.cum; tup := y.tup; 
    return x;
  end;

  procedure fromb self;    -- end extraction
    if tup = [] then return OM; end if;    -- null case
    x := self(1); y := self(2..);
    h := y.h; cum := y.cum; tup := y.tup; 
    return x;
  end;
			-- **** DIAGNOSTIC ROUTINE FOR DEBUGGING ****
  procedure check_consistency();    
  	-- check consistency of tree (diagnostic routine for debugging)

    if h > 1 then -- the number of nodes must be in the correct range
 
      if exists nd = tup(j) | (nc := #(nd.tup)) < minw or nc > maxw then 
        abort("node count out of range: " + str(nc) + " " + str(nd)); 
      end if;

      if exists nd = tup(j) | nd.h /= h - 1 then 
        abort("node height inconsistency: " + str(j) + " " + str(nd.h) + " " + str(h)); 
      end if;

    end if;
    
          -- the cumulant differences at this level must be 
          -- the sum of those at the next lower level
    if h > 1 then 

      if exists nd = tup(j) | 
      	cum(j) /= if j = 1 then 0 else cum(j - 1) end if 
      		+ nd.cum(#nd.cum) then
        abort("bad cumulant: loc = " + str(j) + " cum at loc = " 
        	+ str(cum(j)) + " priorcum = " 
           + if j = 1 then "" else str(cum(j - 1)) end if 
           	+ " node cum = " + str(nd.cum(#nd.cum))); -- + " " + str(nd)
      end if; 

    elseif cum /= [1..#tup] then

      abort("bad bottom-level cumulant: " + str(cum) + " " + str(nd)); 

    end if;
    
    if h > 1 and (exists nd in tup | not nd.check_consistency()) then 
		abort("bad abort"); 
	end if;
    
    return true;      -- passed the consistency check
    
  end check_consistency;
  
			-- **** ROUTINES FOR ITERATING OVER B-TREES ****

  procedure iterator_start;     -- initialize simple iteration over btup
      -- sets up iterator stack as value referenced by iter_ptr. 
      -- This is a stack of pairs [tup,posn]
    stack := [];    -- to be built

    node := self;
    for j in [1..h] loop
      stack with:= [notup := node.tup,if j = h then 0 else 1 end if]; node := notup(1); 
    end loop;

    ^iter_ptr := stack;      -- attach stack to iteration pointer

  end iterator_start;
  
  procedure iterator_next();     -- step simple iteration over btup
      -- returns value as singleton tuple  whose value  is a pair, or OM if terminating
      -- advances iterator stack referenced by iter_ptr
    stack := ^iter_ptr;    -- retrieve the iterator stack

    height := 1;
    
    for j in [ns := #stack,ns - 1..1] loop
      if (sj := stack(j))(2) = #sj(1) then 
        height +:= 1; removed frome stack;    -- remove any exhausted element
      else
        exit;
      end if;
    end loop;
    
    if height = 1 then 
      removed frome stack;  -- pop the top element; then advance it
      [tup,loc] := removed; 
      result := tup(loc +:= 1); stack(ns) := [tup,loc];
      ^iter_ptr := stack; return [result];    
      	-- return singleton tuple built from leaf element
    end if;
    
    if stack = [] then return OM; end if;    
    	-- iteration is exhausted

    removed frome stack;  
    	-- pop the top element, then advance it and use to rebuild rest of stack
    [tup,loc] := removed; node := tup(loc +:= 1); stack with:= [tup,loc];

    for j in [1..hm1 := height - 1] loop      
    	-- rebuild the stack, starting with the node that was advanced 
      stack with:= 
      	[notup := node.tup,if j = hm1 then 0 else 1 end if]; 
	  node := notup(1); 
    end loop;

    removed frome stack;  -- pop the top element; then advance it
    [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc];
    ^iter_ptr := stack; return [result];    
    	-- return singleton tuple built from leaf element
    
  end iterator_next;
  
  procedure set_iterator_start;     
  	-- initialize second-form iteration over btup (similar to iterator_start)
  	-- sets up iterator stack as value referenced by iter_ptr
    stack := [];    -- to be built
    ^compno := 0;
    
    node := self;
    for j in [1..h] loop
      stack with:= [notup := node.tup,if j = h then 0 else 1 end if]; node := notup(1); 
    end loop;

    ^iter_ptr := stack;      -- attach stack to iteration pointer

  end set_iterator_start;
  
  procedure set_iterator_next();     -- step second-form iteration over btup
    -- returns value as singleton, or OM if terminating
    -- advances iterator stack referenced by iter_ptr
    stack := ^iter_ptr;    -- retrieve the iterator stack
    ^compno := cno := ^compno + 1;    -- advance the component number
    
    height := 1;
    
    for j in [ns := #stack,ns - 1..1] loop
      if (sj := stack(j))(2) = #sj(1) then 
        height +:= 1; removed frome stack;    -- remove any exhausted element
      else
        exit;
      end if;
    end loop;
    
    if height = 1 then 
      removed frome stack;  -- pop the top element; then advance it
      [tup,loc] := removed; 
      result := tup(loc +:= 1); stack(ns) := [tup,loc];
      ^iter_ptr := stack; return [[cno,result]];    
      	-- return singleton tuple built from leaf element
    end if;
    
    if stack = [] then return OM; end if;    -- iteration is exhausted

    removed frome stack;  
    	-- pop the top element, then advance it and use to rebuild rest of stack
    [tup,loc] := removed; node := tup(loc +:= 1); stack with:= [tup,loc];

    for j in [1..hm1 := height - 1] loop      
    	-- rebuild the stack, starting with the node that was advanced 
      stack with:= [notup := node.tup,if j = hm1 then 0 else 1 end if]; node := notup(1); 
    end loop;

    removed frome stack;  -- pop the top element; then advance it
    [tup,loc] := removed; result := tup(loc +:= 1); stack(ns) := [tup,loc];
    ^iter_ptr := stack; return [[cno,result]];    
    	-- return singleton tuple built from leaf element

  end set_iterator_next;
  
end btup;

The family of routines for iterating over B-trees seen at the end of the preceding code work by maintaining a stack of tree nodes representing the sequence of ancestors leading to the leaf x currently reached by an iteration. Each stack entry stores a tree node t, along with the position i of the next lower node in the chain among the children of t. If the leaf x is not the last child of the bottom-most node t' in this chain, the iteration is advanced simply by moving on from x to the next child of t'. Otherwise we must locate the bottom-most node t* in the sequence whose final child has not yet been reached by the iteration, advance to the next child t2 of t*, and then rebuild the part of the stack following this t2 by appending the chain of children leading to the first node in the suB-tree headed by t2. Details of the code just outlined are found in the 'iterator_next()' routine seen above. The closely related 'set_iterator_next()' routine adapts this same idea to realize iterations of the modified form 'for y = t(i) loop..."

The B-tree object class 'btup' implemented by the SETL class code just described is logically ancestral to most of the other codes need to realize the database facility at which we aim, in the sense that most of the actual codes used are derived rom this initial prototype by working additional functionality and feature into it, which out really changing its basic structure very much. We shall see that such 'conservative reworking' allows us to define both various useful kinds of large string and large tuple objects, and also all of the large maps used for indexing SETL databases in the manner described at the start of this Chapter. Another essential feature which can be worked into the same basic B-tree structure is the reference counting mechanism used to give B-trees the value semantics whose importance we have already explained. The fact that this latter mechanism allows us to implement database modifications using only very limited numbers of tree node changes is essential both to the efficiency of the whole database system, and to the crash-recovery capabilities which are detailed in a later section.

We begin our account of this chain of generalizations by explaining one of the simplest of them, the use of additional cumulants to realize other important data structures by generalized B-trees.

B-trees Incorporating More General Cumulants. The B-tree data structure seen in the preceding code can be adapted to realize other data objects than tuples. For example, we can represent very large strings, e.g., strings several gigabytes in length, by similar B-tree structures, simply by replacing the initial cumulant 'cum' stored in the tree nodes by one which cumulates number of characters rather than number of nodes. The idea here is to store very large strings by breaking them up into pieces, say of a few hundred characters each, which are referenced by an overlying B-tree (of a few million nodes). In this B-tree the cumulant stored represents the number of characters in all the leaves coming below some particular tree node and its left siblings. A variation on this idea would be to maintain strings divided into words by the occurrence of non-alphabetic characters, and into lines by the occurrence of end-of-line characters, in such a way as to allow very rapid access either to a particular character position, or to a particular word by its numerical position in the sequence of all words, or to a particular line by line number. For this, we could simply maintain three auxiliary cumulants, one for total number of characters, the second for total number of word breaks, the third for total number of line-end characters. It should be clear that the code seen above can be adapted to handle all such cases, and that it can retain all the advantages seen above. (Another approach is sketched later.) That is, all supported accesses and edit operations can be executed in time proportional to the logarithm of data object size, and value semantics, with the advantages that we have noted above, can be maintained in the presence of active edit operations.

The cumulants described in the preceding section all are count-like, in that they are calculated for parent nodes by summing the corresponding values for their children. However, the same cumulant technique will work even if operation used to calculate the cumulant of a parent node from those of its children is an associative operation different from simple summation. Only associativity, not even commutativity, is required. This observation opens many possibilities. For example, the maximum and minimum operations in any ordered set are associative. Hence, if the leaves of a B-tree store elements (such as strings or integers) belonging to such a set, we can keep a cumulant derived from the maximum or minimum of children at each of the B-tree's nodes, and keep these cumulants updated as the tree is edited, in much the way seen in the code displayed above, without losing the logarithmic efficiency of the operations which this code implements. Suppose then that we aim to keep the tree leaves in sorted order. The availability of a cumulant representing the maximum of all the leaves coming under or to the left of any of our tree nodes makes it possible to find the largest leaf smaller than a new leaf to be inserted in logarithmic time. Hence we can insert this new leaf at its proper position in log time, thus always keeping the sequence of leaves in their desired order.

Here is another, more sophisticated but less generally useful variant of this same idea. Suppose that we wish to store very long parenthesized strings in a manner allowing the closing parenthesis matching a given open parenthesis to be located rapidly, even though these to parentheses may be separated by millions of characters and many nested pairs of matching parentheses. This can be done as follows. In each node we keep two cumulants, one 'free_opens_in' representing the total number of open-parentheses coming under the node which are not matched by any close-parenthesis coming to their right, the second 'free_closes_in' representing the total number of close-parentheses coming under the node which are not matched by any open-parenthesis lying to their left. To combine these quantities as calculated for two strings when the two strings are concatenated we can use the following formula.

	if (foi1 := free_opens_in(stg1)) >= (fci2 := free_closes_in(stg2)) then 
		free_opens_in(stg1 + stg2) := foi1 - fci2 + free_opens_in(stg2); 
		free_closes_in(stg1 + stg2) := free_closes_in(stg1);
	else
		free_opens_in(stg1 + stg2) := free_opens_in(stg2); 
		free_closes_in(stg1 + stg2) := free_closes_in(stg2) + (fci2 - foi1);
	end if;
This can be thought of as an operation on pairs [num_free_opens_in,num_free_closes_in] which, because of its relationship to the obviously associative string-concatenation operation, is also associative. Hence these quantities can be kept together as cumulants in the nodes of a B-tree representing strings. It is not hard to see that, with these quantities available, the parenthesis which matches a give open- or close-parenthesis can be found by a logarithmically efficient recursive search.

Our first example of the use of additional cumulants to realize modified data structures by B-trees is the 'large string' object represented by the fragments of code seen below, which are excerpted from the full class code for this object type. We give excerpts rather than the full code, since this code derives so closely from that for the B-tuple class seen above. Basically it just: (i) keeps strings in the B-tree leaves of the B-tuple; (ii) adds one more cumulant, the number of characters; (ii) renames all the original Btup routines so as not to conflict with the new string-oriented top-level function syntax; (iv) implements all the string operations using the underlying tuple operations, in a straightforward way. The code excerpts seen below flesh out this sketch with additional details.

Pure cumulant vectors. We can regard a tuple represented by a B-tree containing additional cumulants as a vector supporting logarithmically fast operations giving the partial sums extended over arbitrary tuple slices, in a structure that can also edited rapidly while always retaining fast seachability. For example, storing a cumulant for the square of the components of a tuple of numbers would allow us to find the standard deviation of an arbitrary tuple slice rapidly. This remark extends to associative operations other than sums. For example, storing a cumulant for the maximum of a tuple of numbers or strings, even if this is not sorted, allows us to find the maximum of an arbitrary tuple slice rapidly, and also to locate the maximum by a fast binary style of search.

Various other cumulants allow representation of other kinds of disk-storable objects. One such example is the 'large tuple of long strings'. This can have billions of components, and individual components can be billions of bytes long. The crucial operations linking such objects to the SETL environment can be represented as

	stg := obj(n..[j,k]);     and      obj(n..[j,k]) := stg; 

The first of these operations extracts characters j thru k from the n'th component of the vector of strings and returns it as a standard SETL string (which should be no more than a few megabytes in length). The second assigns a standard SETL string to this same range of characters. We also have

	obj(n..m) := tup;     and     obj(n..m) := OM; 

where tup is a standard SETL tuple of strings. The first of these inserts a vector of strings into the stated range of the 'large tuple of long strings'; the second deletes a range of components.

To represent these objects we need only introduce a family of B-trees with two cumulants: total number of characters in a node's descendant leaves and total number of breaks between vector components in a node's descendant leaves.

Another important case is that of string-to-string mappings, which in abstract terms are sets of pairs of strings [key,val], the key being of no more than standard SETL size, but the string value being possibly very large. The abstract mapping from 'key' to 'val' can be many-to-many, but for simplicity in the present remarks we shall suppose that it is single-valued. Here the main operations can be represented as

	stg := obj(key..[j,k]);         and          obj(key..[j,k]) := stg; 

The first of these operations extracts characters i thru j from the val string located by the key 'key'. The second assigns a standard SETL string to this same range of characters. We also have

	obj(key) := stg;          and         obj(key) := OM; 

where stg is a standard SETL string. The first of these inserts a new [key,val] pair; the second deletes the pair with the stated key.

To represent these objects we can introduce a family of B-trees with two cumulants: total number of characters in a node's descendant leaves and largest key-hash in a node's descendant leaves. Here each record is assumed to consist of four abutting fields

	hash_of_key,length_of_key,key,val

hash_of_key is a 4- (or 5-)byte hash of a record's 'key' value. The key itself can be of an arbitrary length, given by the 4-byte length_of_key field. Records are stored in order of their hash_of_key field. Keys with identical hash_of_key values should occur only very rarely, but when they do will be stored successively, in random order. Search for the record with a given key then proceeds by binary search on the key's hash, followed by key verification at the bottom level. In the rare case in which several keys have the same hash, a slightly more elaborate serial search at the bottom level is required.

These are the objects and operations central to the Berkeley database management library downloadable from http://www.sleepycat.com.

Separated versus integrated cumulants. Cumulants like those discussed above can either be integrated into a single B-tree structure carrying multiple cumulants, or organized into separate B-trees. For example, we can realise a vector of strings as a single tree carrying the two cumulants (i) total number of characters in a node's descendant leaves, and (ii) total number of breaks between vector components in a node's descendant leaves, as indicated above. But we can also represent it as a pair of objects, the first a simple string B-tuple, the second a cumulating vector of integers giving the end position of each of the sections into which this string falls if we regard it as being broken into a vector of strings. Though less efficient than an integrated representation, this second kind of representation is easier to construct since it involves addition of new, generally easy data structures rather than revision of existing structures. This second representation is also useful as a mental tool for thinking thru the design of novel B-tree based data structures.

The reader is invited to work out the code details needed to implement one of the B-tree based data structures sketched in the preceding paragraphs. Code for one of the simplest of these, a family of B-trees which can be used to represent very large strings, is given in a later section. But we postpone examination of this code until something is said about an additional layer of functionality which must be worked into it: the system of reference counts used to realize value semantics for B-trees too large to be held in RAM, whose nodes must therefore be stored as collections of pages held on disk (and perhaps occupying most of a disk or disks many gigabytes in size.)

12.3.Mapping large strings to collections of disk pages.

Reference-count management. For the reasons outlined in the first section on this chapter, we want to manage large objects represented by B-trees and stored on disk, be these simple strings or full databases, and we want them to have value semantics. The initial in_RAM prototypes of these objects, realized by the codes described earlier in this chapter, are implemented entirely by SETL objects, and so inherit their value semantics from that of their underlying SETL objects. In this setting the SETL interpreter supplies the implicit reference count management needed to support value semantics. For objects stored on disk this is no longer the case, and so the reference count management needed to give these objects their desired value semantics must be programmed explicitly. We will now explain what must be done, first in the relatively simple case of large disk-stored strings, and then, in a later section, for full databases.

We separate reference-count management into two parts, a first or 'bottom' part having to do with the counting references to disk records as these are set up and removed by the codes internal to the packages of routines (those described in the preceding section) which implement operations on disk-stored strings (and other like objects) represented by B-trees, and a second or 'top' part relating to the handling of reference counts in procedures, more at the application level, which use these lower-level routines. Note that routines of this second class are unaware of the very existence of internal B-tree nodes, and so never reference them directly. However, they do reference the root nodes of such objects, and so may modify the reference counts of such root nodes.

Reference counting can be implemented by making each disk record used for an object in which we are interested store a field which keeps track of the total current number of references to the record, either from variables in a program manipulating such records, or from other records. First consider reference-count management for (the limited family of) 'bottom' level routines, i.e. those which manipulate internal tree nodes. Whenever such a record, referenced by a variable obj, is about to be modified (e.g. by an operation like obj.set_cum(i,x) or obj.set_tup(i,x) which modifies a field within it), we check the refcount of the record. If this is 1, we can be sure that 'obj' is the only reference to the object, and so we can simply proceed with the change. However, if the refcount is > 1, then there are other references to the same record, and the rules of value semantics forbid the edit about to be performed from affecting the data value seen by these other references. To avoid this we first copy the record which we are about to edit (thereby replacing it with a brand new record whose reference count is certainly 1), and then modify the new record. When this is done the number of references to the old copy of the record diminishes by 1. The check-and-replace operation needed for this can be thought of as an assignment

	obj := obj.maycopy();

inserted at the start of edit operations like obj.set_cum(i,x). 'maycopy' just returns 'obj' if its count is 1, but otherwise returns a fresh copy and decrements the refcount of the old copy by 1.

Creation of the new record copy increments the number of references to all of the records to which the copied record refers. 'maycopy' can handle this supplementary action also.

This is all that we need do in the case of edit operations like obj.set_cum(i,x) which simply modify a numerical value. An additional step is required for operations like 'obj.set_tup(i,x)' which modifies a value which points to another record. This must decrement the number of references to the object 'obj.tup(i)' which the field being modified originally referenced (unless this field was initially unused), and must increment the number of references to the object x (unless x is undefined.) This action can be thought of as an operation

	note_replace_ref(obj.tup(i),x);

inserted at the start of the 'obj.set_tup(i,x)' procedure but not at the start of operations like 'obj.set_cum(i,x)'. Similarly any statement x := obj.tup(i); decrements the number of references to the record value of x (if any) and increments the number of references to obj.tup(i). Hence a call note_replace_ref(x,obj.tup(i)) should be attached to each such operation.

To create new the records required for some of the operations just described, we need a way of allocating the disk pages which will hold them. If disk space is not to be consumed progressively until no more remains, pages allocated to records that are no longer in use must be reclaimed. A record falls out of use whenever its reference count falls to zero, as may, for example, happen to the record referenced by 'obj.tup(i)' when we execute the operation 'obj.set_tup(i,x)'. The simplest (but not the most efficient) way of managing page reclamation is to maintain a free pages list, to the front of which records are linked when they are freed, and from which recycled pages can be recovered by delinking them as needed. This is a simple one-way list. Records put on the free list can retain any references to other records which they initially contain. However, when a record is delinked from the free-list in preparation for re-use,it implicitly loses all these references, so each of the other records that it references must have their refcount decremented by 1, perhaps causing them to be added to the free-list themselves. Alternatively, all the children of a record being freed can be processed immediately to decrement their reference counts and free them recursively if these counts fall to zero.

Whenever a simple assignment x := y; is used to replace the value of one variable by that of another, the record referenced by x (if any) loses one reference, and that referenced by y (if any) gains one. These actions can be implemented by a call adjust_counts(x,y);

To complete our account of the bottom-level reference-count manipulations needed to implement value semantics for disk-stored structures, it only remains to consider the treatment of procedure calls and returns. A call f(expn1,...,expnn) can be viewed as a group of assignments

param1 := expn1; ... paramn := expnn

of the call arguments to the parameter values of f, which are originally OM. Hence calls simply increment the reference-count values of those of their arguments which have references as values. This set of assignments is followed by a jump which has no effect on reference counts. All variables other than the parameter variables of called procedures should be given the value OM; recursive calls should be considered to introduce wholly new sets of parameter and internal variables.

The operation

return expn;

can be viewed as an assignment

retvar := expn;

of 'expn' to the return variable of the function call, followed by a jump which has no effect on reference counts. The return also deallocates all the internal and parameter variables of the procedure being returned from. This implicitly sets all these values to OM, and so should decrement the reference count of every variable whose immediate pre-return value is a record. This can be done by inserting a call

cleanup([v1,...,vm]);

immediately before each return statement, which should however first be rewritten in the form

retval := expn; return retval;

The 'retval := expn;' statement will have its normal effect on reference counts; the 'return retval;' statement which follows it will have none. Function calls which ignore their returned valves should be treated as if they read

retval := f(expn1,...,expnn); retval := OM;

The 'cleanup' routine seen above simply examines each component of its argument tuple and decrements the number of references of each such component which is a record.

Top-level considerations. Next we consider the 'top' part of the reference count problem, i.e the handling of counts in (the open-ended family of) application level procedures, which only modify the reference counts of the root nodes of disk-stored structures represented by B-trees. Here our problem is to integrate the treatment of these objects with the ordinary actions of the SETL space-management and reference counting system. This can be done is various ways, depending on what space-recovery mechanisms are available. In the code which realizes large string objects given later, these objects appear as instances of an object class which keeps track of all the large strings which visible at the application level. This class can then provide a 'hold(object_list)' method which application-level code can use to erase all string objects which do not appear in the 'object_list' parameter passed to 'hold', thereby recovering the storage space that these string objects would otherwise occupy.

Optimization of reference-count management.

Each time a B-tree node N becomes the child of some newly created node (as, e.g., when a node must be copied in consequence of a modification of some leaf of a tree to whose root there exists two independent references the number of references to N must be incremented by 1. Each time N ceases to be the child of such a node (e.g., when a node referencing it is destroyed, or when some external variable referencing a tree root is cleared, the count of references to N must be decremented. These rules imply a large amount of reference-count maintainance activity, and so tell us that refcount-map implementation is the most efficiency critical point in the SETL database design. The following numbers will give a sense of what is involved. Consider an implementation that will manage a 100GB disk, divided into 100M physical records, each 1000 bytes in size. Then child-node identifiers and 'cum' values can both be 4 or 5 bytes in size, allowing each node to have roughly 100 children. This implies a tree-branching factor of approximately 100, so if the whole disk is devoted to a single database, the depth of its representing B-tree will be roughly 4. Each B-tree edit forcing a logical tree copy to be made will therefore require roughly 400 refcount-map changes.

To accommodate this large amount of refcount-update activity it is import to keep the refcount entries in RAM and to accelerate access to them. The following technique is used. The refcount map is represented by two flat arrays, called refcount_disk and refcount_RAM. As their name suggests, the larger of these is held on disk, and the smaller in RAM. The refcount_disk array needs 4 byte fields, but the refcount_RAM array, which is an approximate representation of refcount_disk, can have one byte fields. Thus, even for a database containing up to 100 million physical pages, refcount_RAM can be stored in a 100MB flat array.

The refcount-RAM and refcount_disk arrays are related in the following way. The in-RAM refcount array of single bytes and the refcount_disk array are equally long. The length of these arrays is equal to the total number of physical disk pages to be managed, and together they store a refcount for each page. The refcount for page j is always the sum refcount_RAM(j) + refcount_disk(j). Integer values in 'refcount_RAM' are arranged in two numerical ranges. Values from 0 to 139 are used when refcount_disk(j) is 0, and so represent 'full' refcount values. Values from 140 to 255 are stored as integers offset by 140 (and so represent integers in the range 0 to 115). These are used when refcount_disk(j) is not 0, and so represent 'partial' refcount values. refcount_disk values no larger than 140 (ordinarily, this will be all refcount_disk values) are simply copied to refcount_RAM when this latter array is initialized (and the corresponding refcount_disk entry is zeroed.)

The refcount incrementation and decrementation operations 'incref' and 'decref' normally increment and decrement refcount_RAM(j) only. If refcount_RAM(j) sinks to zero, the corresponding refcount really is 0, and the corresponding record is freed. If refcount_RAM(j) rises to 140, 70 references are unloaded from it to refcount_disk(j), but then refcount_RAM(j) is kept in 'offset' form, i.e as an integer in the range 140 to 255. If a refcount_RAM(j) value held in this offset form sinks to 139, refcount_disk(j) is accessed and up to 60 reference units are subtracted from it, and added to refcount_RAM(j), either in offset or non-offset form, depending on whether anything remains. Similarly, if refcount_RAM(j) rises to 256, 70 references are unloaded from it to refcount_disk(j).

For purposes of crash recovery, as detailed in a later section, refcount_RAM is supplemented with an auxiliary log file refcount_log to which in_RAM refcount values are written (in a fail-safe manner) during backup operations (as described later) to insure that they are recoverable from disk.

Persistence. Large strings of the form we have been discussing must be persistent if they are to be useful, since it is not feasible to reconstruct them whenever they are accessed. (Think, e,g,, of a 50GB string.) For these objects to be persistent, they must have persistent external names (like file names) under which they can be stored, and using which they can be accessed. In the code seen below, the string objects shown are thought of as residing within files (whose names then give part of the persistent string object name needed.) However, it is often appropriate for several such string objects to be resident in a single file (for example, when these are variants of a single such object stg_obj, obtained by editing stg_obj in various ways.) To give each of these string objects its own persistent name, the prototype system implemented by the code seen below provides two operations

open_string_set(file_name);           and           close_string_set(file_name,names_to_strings_map);

These two operations are used to store arbitrary maps from string-names to bigstrings (i.e. 'stgs_btup' objects) in a persistent manner on disk, and more specifically in a file on disk, whose file name is the first parameter of the open_string_set and close_string_set operations. The 'open_string_set(file_name)' call opens the specified file, and returns a previously stored mapping from names to the string objects stored under these names by a preceding 'close_string_set' operation. These string objects can then be accessed and manipulated by operations of the normal SETL string form, e.g. stg_obj(i..j), stg_obj(i..j) := x; (where x is another such string object) and stg_obj_1 + stg_obj_2 (denoting concatenation.) Once any desired objects of this kind have been built, they can be stored persistently, just by wrapping them into a map 'names_to_strings_map' from names to the string objects which are to be saved, and then making a call

close_string_set(file_name,names_to_strings_map);

This writes a logical copy of the string objects included in the names_to_strings_map to the designated file (which can be different from the file named in the original 'open_string_set(file_name)' call). (In the prototype code shown below, an actual physical copy is written, but in a production-strength implementation, designed to manage string objects which might be gigabytes in length, only a short file of header information, containing a reference to the actual location of the very large strings represented, would be written.) Since our string objects have value semantics in any case, the difference between these two implementations is not significant, except for the efficiency issues involved.

To convert a string object which is not too large into the corresponding SETL string, the operation str(stg_obj) is used. To convert a SETL string into a string object of the type we have been discussing, one uses the creator routine 'stgs_btup' supplied by the corresponding object class in a 'creation call' of the form 'stgs_btup(stg)', where 'stg' is the SETL string to be converted. To initiate a new, initially empty collection of string objects (which can then be edited, and subsequently saved using the 'close_string_set' call) one writes

open_string_set(OM);
This returns an empty mapping.

Examples of the use of Large-string Objects. The prototype code seen in a subsequent subsection implements large-string objects of the kind discussed above. That code is organized around one principal object class, 'stgs_btup', and several auxiliary packages, as detailed below. Of these, the most salient is 'string_sets', which provides the 'open_string_set' and 'close_string_set' procedures discussed above. As seen in our first example, 'stgs_btup' provides an alternative creation call of the form 'stgs_btup(0)' which can be used to create a void string object if one is wanted to invoke one of the utility methods of the 'stgs_btup' class, e.g. 'cleanup_space()', which provides a report on then number of items lost from the expected free space pool, if any. (After all big-string objects have been erased, this can be used to check for 'memory leaks' in an application using these objects.) The global variable 'dump_count', made public by the 'string_sets' package, indicates the number of records that have been swapped to disk during a run.

Our first example creates an initially empty set of strings, and an empty string object, and then reads a sample text file, by 10K sections, inserting them into the string object. Once this has been done, a section of the string object is accessed and printed, the string is saved in a string set under the name '"Chapter 10.1"', and the string set is closed, being written into a file 'Chapter_10_file' which keeps it persistent. This file is then re-opened immediately, and the saved string object is accessed. After this, a few additional text sections are extracted and printed, and the string object is again saved, in the same file, under the same name. (For your own tests you can substitute any available text file.)

program test; -- test of stgs_btup operations
 use stgs_btup,string_sets; 
 	-- use the bigstring object class and its supporting package

 setup_chapter_test; -- test of half meg chapter string setup

 stgs_btup(0).cleanup_space();