CHAPTER 2

Simple Data Types, Expressions, and Operations

2.1 The Main Classes of Data Objects

SETL allows one to manipulate several kinds of data items or values, including simple data items, composite data items, user-defined objects, and procedures. We begin by describing simple and composite data items, which are basic to the SETL language. User-defined objects and procedure values, which are very powerful but somewhat more advanced, are described in a later chapter. The intuitive distinction between simple and composite data items is that composite data items have elements, or components, which are themselves other (simple or composite) data items; simple data items are less 'decomposable'.

Four of the simple kinds of data items, namely

integers
floating-point numbers
character strings
Boolean values

are very much like those provided in most other programming languages. A fifth kind of data item, called atoms, will be a bit less familiar. One special SETL quantity, namely the undefined value, called OM (for omega, the last letter of the Greek alphabet), is used frequently in SETL programs, and its somewhat unusual properties, akin to those of atoms, will become fully familiar as we go along. In addition to simple data items, SETL provides exactly two kinds of built-in composite objects, namely

sets
and
tuples

The fact that it allows sets to be used freely gives SETL its name "SET-L."

Sets of one particular kind, namely sets of ordered pairs, play particularly important roles and therefore are often referred to by a special name:

maps

These are the types of data supported by SETL. In the rest of the chapter we will examine how values of the simple types are created and manipulated. We will discuss the operations that are meaningful, and the operators with which we construct expressions of each of these types. In the next chapter will do the same for the composite objects.

2.2 Simple Types and Their Constants

To use objects of any of these kinds in a program we often need to be write them out directly. For example, to give a variable x the value we may want to write

x := 3.14159;

A value written into a program in this way is called a constant, a constant denotation, or (by some authors) a literal. The rules for the various constants allowed in SETL are described in this section.

2.2.1 Integer constants

Integer constants are written in the standard way, as sequences of digits possibly preceded by a + or - sign. Examples are

0
1066
- 50
+ 35
001616232358

The proper way to write an integer constant can be summarized by a diagram, or graph, as in Figure 2.1:

Figure 2.1

This kind of diagram will be used frequently in what follows, to describe precisely the rules of syntax of various constructs of SETL, and we pause briefly to explain the rules of construction of such diagrams, and their meaning.

Such a syntax diagram consists of rounded boxes, square boxes, and paths with arrows connecting these boxes. Each diagram also has an edge that leads into it, and an edge that exits from it. Any path through a diagram that follows the edges in the indicated directions is a valid instance of a language construct. The two kinds of boxes have the following meaning:

A square box denotes a symbol of SETL, which must appear as is when used. For example, the + and - signs, the parenthesis, and keywords such as if, loop, exists, and so on.
Rounded boxes correspond to other language constructs for which as is, separate diagram is provided. For example, the construct 'digit' is described fully by a diagram that lists the 10 digits as valid instances of this construct. A full list of diagrams for SETL is provided in Appendix B. To test your understanding of these, verify that the diagram in Figure 2.1 allows you to write -12345678 as a SETL integer but forbids 1234'5678.

2.2.2 Floating-point constants

Floating-point numbers are written in one of the standard notations, namely either in decimal form or in exponent form. A (base 10) floating-point number in decimal form is a sequence of decimal digits, followed by a decimal point, followed by a second sequence of decimal digits, and possibly preceded by a '+' or '-' sign. The initial but not the final sequence of digits can be omitted. Examples are

	0.0
	.3156 (but note that 3. is illegal)
	1066.6
	-50.50
	+35.50
	3.1415928

A floating-point number in exponent form is a floating-point number in decimal form, immediately followed by the letter E, and then by an integer (the exponent). Examples are

	1.0E100
	31415.9E-4
	6.0E+23

This last form for real constants corresponds to the ordinary "scientific" notation for decimals; e.g., these three examples would be written in ordinary scientific notation as

10*10¹⁰⁰, 31425.9*10^-4, and 6.0*10²³

The previous description of floating-point constants is summarized by diagram in Figure 2.2. This diagram makes it clear that any valid floating point constant must have one digit or more after the decimal point but may have none before it.

Figure 2.2

2.2.3 String constants

A string is an ordered sequence of zero or more characters. To write a string as a constant we enclose it within quotation marks, as in the following examples:

"Brother, can you spare a dime?"
"*!!-;*!!"
""

This last example shows the null string, i.e., the (unique) string consisting of no characters at all. Note that blanks appearing within a string are significant, are treated in the same way as any other character. Thus, although the number of characters in "Hello" is 5, the number of characters in " Hello" or "Hello " is 6, and the number of characters in " Hello " is 7.

If the quotation mark itself is to appear within a string it must be written with a prefixed backslash, to indicate that it is part of S and not the end of S. Thus, to write the string The character -"- is called a 'quote mark'. as a constant, we would write

"The character -\"- is called a 'quote mark'."

The backslash character is sometimes called the 'escape' character. To write it within a character string, it must be doubled. For example, to print the sentence displayed above in a somewhat more symbolic form we might write the string

"The \\ character is sometimes called the 'escape' character."

When printed by SETL, doubled backslashes within quote marks are reduced to single backslash characters. For example, the length of the SETL string "\\\\" is 2, and the length of the SETL string "\\\\\"" is 3 (two backslashes followed by a quote character). A few other characters are written by using special multi-character combinations of this kind beginning with the backslash. (In this regard, SETL's conventions are much like those of C.) The special characters written in this way are as follows:

	"\n" designates the 'linebreak' or "newline' character
	"\t" designates the tab character
	"\r" designates the carriage-return character
	"\0" designates a zero byte
	"\f" designates the form-feed (new page) character
	"\xdd" is used within quote marks to write characters in hexadecimal. 
		Here each d must be a valid hex character, i.e. 0..9 or A..F,
		in either upper or lower case.

Any of the characters available on the machine which you are using can used in a string constant, although SETL programs which are to be run variety of different computers should restrict themselves to the characters available on all computers to avoid character-set translation problems.

Sometimes one will need to write a long string constant, so long that it must cross a line boundary. This can be done by ending the first part of string with a quotation mark, following this with a Plus sing (i.e. '+') and then continuing immediately on the next line, with a second quotation mark character to continue the string. This means, for example, that we can write the string assignment statement

x :=	"This sentence may be too long to fit into the " +
	"rest of the line";

in order to assign to x the full 14-word sentence given.

2.2.4 Boolean constants

There are two Boolean values, truth and falsity, in SETL. These are written as true and false, respectively.

2.2.5 Atoms

Atoms are generated values that can be used to label objects in a SETL program. Atoms are different enough from other data types in their functions and use that we will postpone their discussion until Section 2.5.5. Let us just note that there are no constant denotations for atoms.

2.2.6 OM: the undefined value

OM is a special SETL constant that denotes the undefined value. The notion of an undefined value may appear paradoxical at first but is in fact exceedingly useful. There are many computations in SETL which under certain circumstances cannot yield a value of any type. These correspond typically to some "nonsensical" calculation such as asking for an element of an empty set, or for the first integer between 2 and 5 which is divisible by 7, and so on. The result of these and similar computations (we shall see later on how to express them in SETL) is the value OM. OM can be used in an assignment statement and in tests (to determine whether a given value is actually undefined) and often serves to indicate that a calculation has reached some end. We will see many examples of its use as we go along.

2.3. Variable Identifiers

Most programming languages make it possible to perform calculations and then save their results for reuse later. This is done by assigning the results of calculations to a variable identifier (sometimes abbreviated simply as variable, or as identifier). An example is

x := 1 + 2 + 3 + 4 + 5;

which saves the result of the expression 1 + 2 + 3 + 4 + 5 appearing to the right of the assignment operator :=, making the result the value of the variable identifier x appearing to the left of this assignment operator. Since the value in question is 15, the command

print(x);

would then print the current value of the variable x, namely 15. Identifiers are composed of the letters, digits, and the underscore character "_." The first character of an identifier must be a letter. The following are examples of valid identifiers:

x
x23
big1
End_of_Input_flag
set_garbage
z123456789
eta_

On the other hand, the following are not valid identifiers:

big 1
x.23
23x

because the first two contain characters other than letters, digits, and underscores (blank in the first case, period in the second), and the third begins with a digit rather than a letter.

Identifiers can be of any length, but they cannot be split between two lines.

Except within quoted string constants, capitalization is ignored by SETL, so that

Big_set
big_set
BIG_SET
big_SET
BIG_sEt

are considered to be identical.

The diagram in Figure 2.3 describes the structure of valid identifiers:

Figure 2.3

The proper choice of identifiers can make an important contribution to the clarity and professionalism of your programs. If you choose identifiers thoughtfully, your program will be easier for others to read and understand, and, equally important, will be easier for you to understand. Careless errors are also less likely to occur, since the inner "rhythm" of a well-chosen set of identifiers will make errors easier to detect when your program is written, typed, and proofread. Here are some useful guidelines for the choice of identifiers:

Choose mnemonic identifiers, i.e., identifiers which explain the meaning of the quantities which they represent; e.g., an identifier which represents some sort of upper limit value in a program should be called upper_limit or uplim rather than simply u or L.
Avoid ambiguity in the choice of identifiers and use standard spellings. It is certainly bad practice to have two different identifiers called, e.g., STACK and STAK. It is also bad practice to use variant spellings like STAK, since without noticing it you may slip back to the standard spelling. Use the standard spelling STACK instead.

2.4 Expressions and Statements

The use of expressions like those of algebra is one of the main features of most programming languages, including SETL. Expressions denote values, which can be printed, saved as the values of variables, etc. The following are typical expressions:

3 + 5 * (7 - 11)

17.0 / 31.3131 + 19.9

x + Y

x1 + x2 + x3 + yl * y2 * y3

As these examples show, an expression can involve both constants and variables (also called identifiers). Values are given to variables by assignments. For example, the following assigns the value 3 to the variable zz1:

zz1 := 3;

An assignment is written by using the := (colon-equal) sign, sometimes called the assignment operator. The assignment is the first type of statement that we will use. Statements are the basic building blocks out of which programs are constructed. In this chapter we will only use two types of statements: the assignment statement and the print statement, whose purpose is to display (on the screen or on an output listing) the result of a computation. The print statement has the format

print(expression1, expression2,... );

that is to say, it consists of the keyword print, followed by a list of expressions, enclosed between parentheses and separated by commas. Any number of expressions can appear in a print statement. A print statement that does not include a list of expressions will simply produce a blank line.

Variables can have different values in the course of program execution. To understand the meaning of an expression, the following rule must be kept in mind:

A variable appearing in an expression always stands for its current value. Thus if we write the statements

	zz1 := 3;
	zz2 := 17;
	print(zz1);
	print;
	print(zz1 + zz2);

the current value of the variables zz1 and zz2 at the moment that the print instruction is executed will be 3 and 17 respectively, so that the output of this program fragment will be:

3
20

(Note the blank line separating the two printed values produced by the print statement without argument). Suppose next that we write:

	zz1 := 3;
	zz2 := 17;
	print(zz1);
	zz1 := 4;
	print(zz1 + zz2);
	print(zz1);

This will produce the output

3
21
4

because the value of the variable zz1 has been changed by the assignment statement "zz1 := 4" after the first print statement but before the second print statement, and because (we say it again) a variable appearing in an expression always stands for its current value, i.e., the last previous value given to the variable by any assignment statement. To test yourself, see whether you can tell what output the following sequence of statements will produce:


	x:= 1;
	print(x);

	x := 2;
	print(x);

	y:= 3;
	print(x + y);

	x := 0;
	print(x + y);

	y := 0;
	print(x + y);

	x := 1;

	print(x + x);
	y:= 1;

	print(x + x);
	print(x + y);

Expressions can be compounded; that is, an expression e1 can be substituted for any variable appearing in another expression e2, thereby generating a more complicated but still legal expression. For example, by substituting x + y for z in 2 * z, one generates the expression 2 * (x + y). Then, by substituting 3 * a * b for y in the result, one generates the expression 2 * (x + 3 * a * b).

As in algebra, the order in which a compound expression containing many operators is evaluated is determined by the "precedences" of the operators involved. For example, multiplication and division are given higher precedence than addition and subtraction and are therefore performed before the latter. The operator precedence rules can be bypassed by using parentheses: subexpressions enclosed within parentheses are always evaluated before any operation is applied to them. For example, 1 + 2 * 3 has the value 7 because the multiplication 2 * 3 is performed before the addition, but (1 + 2) * 3 has the value 9 since the parentheses force the addition to be performed first.

Both binary operators like the " +" in x + y, and unary operators like the "-" in x + (-y) can appear in expressions. Some operator signs like "-" can designate both binary and unary operators: unary if they are preceded by a left parenthesis or by another operator, binary otherwise. On the other hand, some operator signs are only used to designate binary operators, and others are used only to designate unary operators. All the (binary and unary) SETL operators will be described in this chapter and in Chapter 3. Section 3.1 contains a table giving the precedences of all operators.

2.5 Operations with Simple Data Types

2.5.1 Integer operators

We begin our systematic description of the SETL operators by discussing those operators that take arguments of integer type. Some of these operators yield a value of the same type, for example, the familiar arithmetic operators of addition, subtraction, multiplication, and division. Another group of integer operators yields a Boolean value: true or false. This is the case for comparison operators (greater than, equal to, etc.) These operators are often called predicates. Finally, a conversion operator, float, allows us to convert an integer into a floating-point quantity.

2.5.1.1 Binary operations on integers

i + j computes the sum of i and j.

i - j computes the difference of i and j

i * j computes the product of i and j.

i ** j computes i to the j-th power. An error results if j is negative or if i and j are both zero.

i / j computes the integer (whole number) part of the quotient of i by j. The fractional part of the quotient is discarded. An error results if j = 0. See the following examples for the way in which i / j works if one of i or j is negative.

i mod j computes the remainder left over when i is divided by j. An error results if j = 0. The result is always positive.

i max j yields the larger of i and j.

i min j yields the smaller of i and j.

2.5.1.2 Predicates on integers

i = j yields true if i and j are the same, false otherwise.

i /= j yields true if i and j are different, false otherwise.

i > j yields true if i is bigger than j, false otherwise.

i < j same as j > i.

i >= j yields true if i is no smaller than j, false other wise.

i <= j same as j >= i.

Examples of use of these operators are as follows:

print(1 + 1); yields 2

print(1 - 1, 1 - 10); yields 0 -9

print(1 * 2, 1 * (-2), (-1) * 2, (-1) * (-2)); yields 2 -2 -2 2

print(2 ** 3, (-2) ** 3, 2 ** 0, (-2) ** 0); yields 8 -8 1 1

print(1 / 3, 2 / 3, 3 / 3, 4 / 3); yields 0 0 1 1

print(1 mod 3, 2 mod 3, 3 mod 3, 4 mod 3); yields 1 2 0 1

print(7 / 3, (-7) / 3, 7 / (-3), (-7) / (-3)); yields 2 -2 -2 2

print(7 mod 3, (-7) mod 3); yields 1 2

print(1 max 2, (-1 ) max (-2)); yields 2 -1

print(1 min 2, (-1)min(-2)); yields 1 -2

print(1 = 1, 1 = 2); yields true false

print(1 /= 1, 1/= 2); yields false true

print(1 > 1, 1 > 2, 2 < 1); yields false false false

print(1 > 1, 1 < 2, 2 < 1); yields false true false

print(1 >= 1, 1 >= 2, 2 >= 1); yields true false true

print(1 <= 1, 1 <= 2, 2 <= 1); yields true true false

Concerning i / j and i mod j, it is useful to note that for i (and j) positive we always have i = (i /j)*j + (i mod j), but for i negative this is false, e.g.,

(-7)/ 3 is-2,

but

(-7) / 3 is 2.

2.5.1.3 Unary integer operators and functions

Unary integer operators compute a result value from a single input i.

-i computes the negative of i.

abs(i) computes the absolute value of i.

even(i) yields true if i is even, false if i is odd.

odd(i) yields false if i is even, true if i is odd.

float(i) converts the integer i to the corresponding floating-point value. If the conversion causes overflow, which is possible if i has a very large value, then an error results.

Examples of these unary operators are as follows:

print(0 + 1,0 +(-100)); yields 1 -100

print(-1, -(-100)); yields -1 100

print(abs(1), abs(-2)); yields 1 2

print(even(1), even(2), even (-1)); yields false true false

print(odd(1), odd(2), odd(-1)); yields true false true

print(float(1), float(-1), float(2)); yields 1.0 - 1.0 2.0

EXERCISES
1. Which of the following are valid identifiers?
	(a) number_1 (b) number 1 (c) number.1


2. What output will be produced by the following code?


				Program one;
					x:= 1; y := 2;
					print(x + y);
					x := 3;
					print(x + y);
					y := x + y;
					print(x + y);
				end;


3. What is the output produced by the following program?


				program multiply_x_by_y;
					x:= 11; y:=11;
					print(x * y);
				end;


4. What output will the following code produce?


				program thr3;
					number := 1;
					Number := 2;
					NUMBER := 3;
					print(number + Number + NUMBER);
					number := number * NUMBER;
					print(number + Number + NUMBER);
				end;


5. What output will the following code produce? 


			program five;
				number1 := 1;
				Number1 := 2;
				Number_1 := 3;
				print(number1 + Number_1 + Number1);
				number1 := Number1 * Number1;
				print(number_1 + Number1 + Number1);
			end;


6. What output will the following code produce? 

			program xs;
				x := 1; y := 2; z := 3; w := 4;
				print((x + y), z * (x + y), z * x + y, w + z * (x + y)); w := 2;
				print(w + z * x + y,z * y / w, y ** (x + y) * z);
			end xs;
					
7. Which of the following are valid expressions?

	(a) x (b) x + y (c) (x + y) ** w
	(d) (x + y) ** w * *w (e) a_1 / (x + y) ** w ** w

8. Evaluate the following constant expressions:

	(a) 2 ** 2 		(b) 2 ** 2 ** 3 		(c) (2 ** 2) ** 3
	(d) 2 ** (2 ** 3) 	(e) 3 / 2 			(f) 1 / 2
	(g) (1 + 2) / 4 	(h) (-11) mod 5 		(i) -1 mod 5
	(j) 2 ** 2 ** 3 /= 64	(k) 3 - 0 / 3 			(l) 3 - O < 3
	(m) (-35) min 1

9. Simplify the following expressions:

	(a) + - +--x
	(b) --x
	(c) x max y min y
	(d) x max (y min y)
	(e) x max x

10. Evaluate the following constant expressions: 

	(a) abs(-1) + abs(-2) (b) abs(-1 + abs(-2)) 
	(c) abs (1 min -1) (d) abs(1 max -1)
	(e) 1 max 2 min 3 (f) 1 max 2 max 3
	(g) 2 + 2 max 3 + 3 (h) -2 - 2 max -3 - 3


11. Re-express the following expressions in as simple a way as you can, using the max, min, and abs operators:

	(a) x max -x
	(b) x min -x
	(c) (x max 0) + (x min 0)
	(d) (x max 0) + (-x max 0)

2.5.2 Floating-point operators

2.5.2.1 Binary floating-point operators

Binary floating point operators compute a result value from two floating point values, x and y. The binary floating-point operators provided by SETL are as follows:

x + y computes the sum of x and y.

x - y computes the difference of x and y.

x * y computes the product of x and y.

x / y computes x divided by y. An error results if y is zero, or if the division causes floating-point overflow.

x ** y this variant of the exponentiation operator yields x raised to the y power. An error results if exponentiation causes floating-point overflow, or if x and y are both zero.

x max y yields the larger of x and y.

x min y yields the smaller of x and y.

atan2(x, y) yields the arc tangent of the quotient x/y. The result is given in radians.

2.5.2.2 Predicates on floating-point values

x = y	yields true if x and y are equal, false otherwise.
x /= y	yields true if x and y are unequal, false otherwise.
x > y	yields true if x is greater than y, false otherwise.
x < y	same as y > x.
x >= y	yields true if x is at least as large as y, false otherwise.
x <= y	same as y >= x.

2.5.2.3 Unary floating-point operators

+x	yields x.
-x	yields the negative of x.
abs(x)	yields the absolute value of x, i.e., yields x if x is positive, -x if x is negative.
fix(x)	yields the integer part of x, dropping its fractional part.
float(i)	yields a floating-point quantity numerically equal to i, where i is an integer.
floor(x)	yields the largest integer which is not larger than x. (See the following examples for the rule which applies if x is negative).
ceil(x)	yields the smallest integer which is at least as large as x. (See the following examples for the rule which applies if x is negative).
exp(x)	yields e ** x, where e is the base of natural logarithms.
log(x)	yields the natural ("base e") logarithm of x. An error results if x is zero or negative.
cos(x)	yields the cosine of x, which is assumed to be given in radians.
sin(x)	yields the sine of x, which is assumed to be given in radians.
tan(x)	yields the tangent of x, which is assumed to be given in radians.
acos(x)	yields the arc cosine of x; the result is given in radians. An error results if x does not lie between-1.0 and + 1.0.
asin(x)	yields the arc sine of x; the result is given in radians. An error results if x does not lie between-1.0 and + 1.0.
atan(x)	yields the arc tangent of x; the result is given in radians.
sqrt(x)	yields the square root of x. An error results if x is negative.
sign(x)	yields one of the integer results -1, 0, or +1 depending on whether x is negative, zero, or positive.

Examples of some of these operators are as follows:

1.1 + -1.1 yields 0.0

1.1 * 1.1 yields 1.21

1.1 ** 2.0 yields 1.21

1.1 = 1.11 yields false

1.1 = 1.10 yields true

1.1 max 1.1001 yields 1.1001

1.1 min 1.101 yields 1.1

+ 1.1 yields 1.1 Note that the two '-' signs must be separated by a space, since '--' starts a comment

- - 1.1 yields 1.1

abs(-1.1) yields 1.1

print(fix(1.1), fix(-1.1)) yields 1, -1

print(floor(1.1), floor(-1.1),
floor(-1.0)) yields 1, -2, -1

print(ceil(1.1), ceil(-1.1),
ceil(1.0)) yields 2, -1, 1

print(float(1), float(-1),
float(0)) yields 1.0, -1.0, 0.0

The forms in which floating-point constants can be written are described in Section 2.1.1.

Note that for floating-point numbers x and y, the use of the predicates x = y and x /= y can be a bit tricky since rounding effects might cause (0.5 + 0.5) = 1.0 to yield false and 1.0 = 1.0000000000000000001 to yield true. Keep in mind the fact that floating-point values can always turn out to have values slightly different from the exact values that you may expect. To learn their proper use in all but obvious cases, you may need a course in numerical analysis.

2.5.3 String operators

Binary string operators compute a result value from two inputs, at least one of which is a string. Some of these operators take two strings as their arguments, others take a string and a positive integer as their arguments. Some of these operators are predicates and perform string comparisons analogous to the integer comparisons discussed previously.

In what follows, s and ss are always strings; i and j are integers.

The string operators are the following:

s(i) computes the i-th character of the string s; the result is a one character string. If i is negative, an error results; if i is greater than the length of s, then the value OM is returned.

s(i..j) this string slice operator computes and returns the substring of s which extends from its i-th through its j-th characters, inclusive. If i = j + 1, a null string is returned. See Table 2.1 for a description of the treatment of other marginal and exceptional cases for this operator. (Note that this operator actually has three, rather than two, arguments.)

s(i..) this computes and returns the substring of s which extends from its i-th character through the end of s, inclusive. See Table 2.1 for a description of the treatment of marginal cases of this operator.

s + ss concatenates the two strings s and ss. i *s concatenates i successive copies of the string s. If i = O, then i*s is the null string. If i < O then an error results.

s = ss yields true if s and ss are identical, false otherwise.

s /= ss yields true if s and ss are distinct, false otherwise.

s > ss yields true if s comes later than ss in standard alphabetical order, false otherwise. (Note that this operation, as well as the other string comparisons s < ss, s >= ss, s <= ss, are implementation-dependent, as they depend on an assumed alphabetical order of characters (collating order). Of course, alphabetic characters will always have their standard order, but the relative order of punctuation marks, and also the way in which alphabetics compare to numerics, may differ from implementation to implementation.)

s < ss yields true if s comes earlier than ss in standard alphabetic order, false otherwise.

s >= ss yields true if s either is identical with ss or comes later in standard alphabetic order, false otherwise.

s <= ss yields true if s either is identical with ss or comes earlier in standard alphabetic order, false otherwise.

s in ss yields true if s occurs as a substring of ss, false if not.

s notin ss yields false if s occurs as a substring of ss, true if not.

To give examples of these operators, we shall suppose that the value of s is the string "ABRA," and that the value of ss is the string "CADABRA". Then

print(ss(1), ss(4)); yields C A

print(s(1..2), s(2..4), s(2..2)); yields AB BRA B

print(s(1..0)); yields the null string

print(s(1..), s(2..), s(3..), s(4..)); yields ABRA BRA RA A

print(s(5..)); yields the null string

print(s(5)); yields OM

print(s + ss); yields ABRACADABRA

print(3*s); yields ABRAABRAABRA

print(s > ss, ss > s); yields false true

print("AA" > "A","A" > ""); yields true true

print("AA" < "A","A" < ""); yields false false

print(s in ss, ss in s); yields true false

The unary string operators compute a value from a single input. These operators are

#s yields the number of characters in the string s.

abs s here s must be a one-character string, or an error results. If s is a single character, then abs s returns the internal integer code for this character. Abs and char are inverse operators.

char i here i must be an integer which can be the internal code of some character c. If this is so, then char i yields the single character c (i.e., a one-character string). Otherwise, an error results. (The range of integer values used as character codes is implementation-dependent.)

str x x is any SETL object. str x is a string that is the printable form of the value of x.

Table 2.1 shows the way that the string extraction operators s(i), s(i..), and s(i..j) behave in various marginal cases.

To each string extraction operator there corresponds a string assignment operator that modifies the string section which the corresponding extraction operator would retrieve. These string assignments are indicated by writing either s(i), s(i..), or s(i..j) to the left of the assignment operator ":=." For example, if s is a string, we can modify the section of it extending from its

Table 2.1 Behavior of String Operators in Marginal Cases

Operator Condition Effect

s(i) i negative or zero causes error

s(i) i > #s yields OM

s(i..) i negative or zero causes error

s(i..) i = #s + 1 returns null string

s(i..) i > #s + 1 causes error

s(i..j) i negative or zero causes error

s(i..j) i >j + 1 causes error

s(i..j) j negative causes error

s(i..j) j > #s causes error

s(i..j) i = j + 1 returns null string

second to its fourth character (inclusive) by writing

		s(2..4) := x;      						           (1)

where x is any string. The expression x need not be a string of length 3, so that the assignment operation (1) can lengthen s (if x has length greater than 3) or shorten it (if x has length less than 3). Similar remarks apply to the string assignment operation

s(i..) := x;

which is treated exactly as if it read

s(i..#s) := x;

However, the right-hand side of the simple string assignment

s(i) := x;

must be a single character, because s(i) denotes the i-th character of string s, not a substring of s.

For examples of all this, suppose that s1, s2,..., s7 are seven variables, all having the string value "ABRACADABRA" initially. To achieve this, you need only execute

s1 := s2 := s3 := s4 := s5 := s6 := s7 := "ABRACADABRA";

Then the following assignments produce the indicated results.

s1(3..5) := "XXX"; -- now s1 = ABXXXADABRA

s2(3..4) := "XXXXXX"; -- now s2 = ABXXXXXXCADABRA

s3(3..4) := "X"; -- now s3 = ABXCADABRA

s4(3..4) := ""; -- now s4 = ABCADABRA

s5(7..) := "XXX"; -- now s5 = ABRACAXXX

s6(7..) := ""; -- now s6 = ABRACA

s7(1) := "Y"; -- now s7 = YBRACADABRA

To summarize, the three string assignment operators are

s(i) := x; x must be a single character, and i must be an integer and lie between 1 and #s; otherwise an error results. This modifies the i-th character of s.

s(i..j) := x; i must be an integer at least equal to 1 and at most equal to j + 1 or an error results. j must also also be an integer and cannot exceed #s. The section of s between i and j is made equal to x, which may expand or contract s. Note that if i = j + 1, x will be inserted into s immediately before its i-th position. The case i = #s + 1, j = #s is legal and adds x to the end of s.

s(i..) := x; this is treated exactly as if it read s(i..#s) := x. Thus i must be an integer which is at least 1 and at most #s + 1.

As an example of the case i = #s + 1, note that if s1 and s2 are both initially equal to "ABC", then both the assignment

s1 (4..3) := "XXX";

and the assignment

s2(4..) := "XXX";

give "ABCXXX" as the value of s1 and s2, respectively.

2.5.4 Boolean operators

Boolean operators compute a Boolean result from one or two input Boolean quantities c, cc. That is, both the inputs of these operations and their results must be one of the two Boolean values true and false. These operations are generally used to combine results produced by prior comparisons or other tests; i.e., they typically appear in contexts such as

if (i > j and j > k) or (k > j and j > i)...

The binary Boolean operators supported by SETL are as follows:

c and cc yields true if both c and cc are true, false otherwise.

c or cc yields true if at least one of c and cc is true, false otherwise.

c impl cc This is the logical implication operator and yields true except when c is true and cc is false. That is, if either c is false, or cc is true, then c impl cc yields true. But if c is true and cc false, then c impl cc yields false.

The only unary Boolean operator provided is

not c yields the logical opposite of c, i.e., false if c is true, true if c is false.

In using these operations one will often make use of various well-known rules of logic like those called De Morgan's rules. For example since

(not c) or (not cc)

is true if either c or cc is false, but is false if both c and cc are true. It is equivalent to

not (c and cc).

Various other equivalences between Boolean expressions are

not (c or cc) is equivalent to (not c) and (not cc)

not (c impl cc) is equivalent to c and (not cc)

c impl cc is equivalent to (not c) or cc

not (not c) is equivalent to c

These and other related logical equivalencies can often be used to simplify Boolean expressions that occur in programs. For example, since

c or ((not c) and cc)

is true if and only if at least one of c and cc is true, it simplifies to

c or cc

Thus, instead of writing

if i >j or ((not i >j) and k >j)...

in a program we can simplify this to

if i >j or k >j...

Other useful relationships of this sort appear in Exercises 1 through 8.

2.5.4.1 Boolean equivalences

A tautology is a Boolean expression E which evaluates to true no matter what Boolean values are given to the variables appearing in E. An equivalence is a statement of the form E1 = E2 which evaluates to true no matter what values are given to the variables appearing in it. For example, the expression

A or (not A)

can be seen to be true regardless of the value of the Boolean value of A, because (A or B) is true if either A or B is true, by the definition of or, and either A or not A must be true, for any value of A. Thus A or (not A) is a tautology, or a universally valid Boolean equivalence. To prove that a given expression is a tautology, we build a truth table that lists, for each possible set of values of the variables appearing in the expression, the value of each of the subexpressions and that of the whole expression. For the expression A or (not A) we have the following table:

A	not A	A or (not A)
true	false	true
false	true	true

EXERCISES

The following exercises list various tautologies and Boolean equivalences, which you are asked to prove either by an exhaustive list of cases or by appropriate mathematical reasoning. In a later section we will see how to write a simple program that builds the truth table of a given expression.

Prove the equivalence (A or B) = (B or A).
Prove the equivalence ((A or B) or C) = (A or (B or C)); also prove ((A and B) and C) = (A and (B and C)).
Prove the equivalence (A and A) = A, also (A or A) = A.
Prove the equivalence (A and (B or C)) = ((A and B) or (A and C)), also (A or (B and C)) = (A or B) and (A or C).
Prove the equivalence (A or ((not A) and B)) = (A or B).
(De Morgan's rules). Prove that (not (A and B)) = ((not A) or (not B)), also (not (A or B)) = ((not A) and (not B)).
Prove that not (not A) = A. Using this fact and the results proved in Ex. 6, show that
(A and B) = (not ((not A) or (not B))),
(A or B) = (not ((not A) and (not B))).
Prove the following equivalences:
(A and true) = A, (A and false) = false,
(A or true) = true, (A or false) = A.

2.5.5 Operations with atoms

Mathematical constructions occasionally make use of abstract points which have no particular properties other than their identity. For example, in dealing with graphs we generally regard them as abstract collections of points (or nodes) connected by edges (see Figure 2.4).

Figure 2.4 A Graph: Six Nodes Connected by Edges.

In this case, to make a new copy of a graph we need a supply of new points. What these points are is of no significance as long as they can be generated in a way which guarantees that all newly generated "points" are definitely distinct from all such "points" previously encountered.

To handle situations of this sort, SETL provides a special kind of object called an atom, or for emphasis a blank atom. These objects can be members of sets or components of tuples, but very few other operations act on these atoms. In particular, there is only one way of producing objects of this kind: namely, by calling a special, built-in, zero-argument (i.e., nullary) function written as

newat()

Each time a program invokes this construct, it yields a new atom, distinct from all previously generated atoms. The only operations involving a pair of atoms, a and aa, are

a = aa	yields true if a and aa are the same, false otherwise.
a /= aa	yields true if a and aa are different, false otherwise.

EXERCISES


1. Write a program that will read a floating-point number x and print the
number of decimal positions of x which will lie to the left of the decimal
point when the number is printed in standard decimal formal. For example, if
1.23E4 is read, 0 should be printed; when 25.6E + 5 is read, 7 should be
printed.


2. Which of the following operations will cause an error?
   
(a)	2.2/(1.1 + 1.10 - 2.200)
(b)	-2.2 * - 2.2 ** 2.2
(c)	(- 2.2) ** 2.2
(d)	float(2) * 2
(e)	(-2.2 max 2.2) ** -2.2
(f)	(-2.2 min 2.2) ** 2.2
(g)	sqrt(2 max 2)



3. Test the following Boolean expressions to see if they yield true or false:

(a)	1.0 = 2.0 - 1.0
(b)	2.0 = sqrt(4.0)
(c)	sin(asin(0.5)) = 0.5
(d)	sin(0.5) * sin(0.5) + cos(0.5) * cos(0.5) = 1.0



Determine the size of the difference between the left- and the right-hand
sides of each of the preceding equations which yields the value false.


4. Which of the following statements are true for all values of the variable x?



(a)	abs(float(x)) = float(abs(x))
(b)	fix(float(x)) = float(fix(x))
(c)	floor(x) <= fix(x)
(d)	ceil(x) >= fix(x)
(e)	exp(log(x)) = x
(f)	log(exp(x)) = x


5. For what positive values of x is cos(x) closest to 0.0? What is the value of asin(1.0)?
Check your answers by computer evaluation.


6. How small is the sum sin(x) + sin(x + 3.1415928)? (Evaluate it at the points x =
-3.1415928, 0.0, 3.1415928, etc.) Can you find a constant c such that sin(x) +
sin(x + c) is smaller than sin(x) + sin(x + 3.1415928) for several values of x?


7. Square the quantity x := 2.0 / sqrt(4.0) repeatedly to see how its higher powers
behave. How many squarings are required to calculate x**1024?


6. Run the following programs and see what results they produce.


(a)       x := 2.0; for n in [1..100] loop x := x * x; print(n," ",x); end loop;


(b)       x := 0.5; for n in [1..100] loop x := x * x; print(n," ",x); end loop;


9. Write a short program which would work perfectly if perfectly accurate 
floating-point arithmetic were performed but which fails catastrophically 
because of small inaccuracies in the computer representation of floating-point 
quantities.

i + j	computes the sum of i and j.
i - j	computes the difference of i and j
i * j	computes the product of i and j.
i ** j	computes i to the j-th power. An error results if j is negative or if i and j are both zero.
i / j	computes the integer (whole number) part of the quotient of i by j. The fractional part of the quotient is discarded. An error results if j = 0. See the following examples for the way in which i / j works if one of i or j is negative.
i mod j	computes the remainder left over when i is divided by j. An error results if j = 0. The result is always positive.
i max j	yields the larger of i and j.
i min j	yields the smaller of i and j.

i = j	yields true if i and j are the same, false otherwise.
i /= j	yields true if i and j are different, false otherwise.
i > j	yields true if i is bigger than j, false otherwise.
i < j	same as j > i.
i >= j	yields true if i is no smaller than j, false other wise.
i <= j	same as j >= i.

print(1 + 1);	yields	2
print(1 - 1, 1 - 10);	yields	0	-9
print(1 * 2, 1 * (-2), (-1) * 2, (-1) * (-2));	yields	2	-2	-2	2
print(2 3, (-2) 3, 2 0, (-2) 0);	yields	8	-8	1	1
print(1 / 3, 2 / 3, 3 / 3, 4 / 3);	yields	0	0	1	1
print(1 mod 3, 2 mod 3, 3 mod 3, 4 mod 3);	yields	1	2	0	1
print(7 / 3, (-7) / 3, 7 / (-3), (-7) / (-3));	yields	2	-2	-2	2
print(7 mod 3, (-7) mod 3);	yields	1	2
print(1 max 2, (-1 ) max (-2));	yields	2	-1
print(1 min 2, (-1)min(-2));	yields	1	-2
print(1 = 1, 1 = 2);	yields	true	false
print(1 /= 1, 1/= 2);	yields	false	true
print(1 > 1, 1 > 2, 2 < 1);	yields	false	false	false
print(1 > 1, 1 < 2, 2 < 1);	yields	false	true	false
print(1 >= 1, 1 >= 2, 2 >= 1);	yields	true	false	true
print(1 <= 1, 1 <= 2, 2 <= 1);	yields	true	true	false

-i	computes the negative of i.
abs(i)	computes the absolute value of i.
even(i)	yields true if i is even, false if i is odd.
odd(i)	yields false if i is even, true if i is odd.
float(i)	converts the integer i to the corresponding floating-point value. If the conversion causes overflow, which is possible if i has a very large value, then an error results.

print(0 + 1,0 +(-100));	yields	1	-100
print(-1, -(-100));	yields	-1	100

print(abs(1), abs(-2));	yields	1	2
print(even(1), even(2), even (-1));	yields	false	true false
print(odd(1), odd(2), odd(-1));	yields	true	false	true
print(float(1), float(-1), float(2));	yields	1.0	- 1.0	2.0

x + y	computes the sum of x and y.
x - y	computes the difference of x and y.
x * y	computes the product of x and y.
x / y	computes x divided by y. An error results if y is zero, or if the division causes floating-point overflow.
x ** y	this variant of the exponentiation operator yields x raised to the y power. An error results if exponentiation causes floating-point overflow, or if x and y are both zero.
x max y	yields the larger of x and y.
x min y	yields the smaller of x and y.
atan2(x, y)	yields the arc tangent of the quotient x/y. The result is given in radians.

1.1 + -1.1	yields	0.0
1.1 * 1.1	yields	1.21
1.1 ** 2.0	yields	1.21
1.1 = 1.11	yields	false
1.1 = 1.10	yields	true
1.1 max 1.1001	yields	1.1001
1.1 min 1.101	yields	1.1
+ 1.1	yields	1.1 Note that the two '-' signs must be separated by a space, since '--' starts a comment
- - 1.1	yields	1.1
abs(-1.1)	yields	1.1
print(fix(1.1), fix(-1.1))	yields	1, -1
print(floor(1.1), floor(-1.1), floor(-1.0))	yields	1, -2, -1
print(ceil(1.1), ceil(-1.1), ceil(1.0))	yields	2, -1, 1
print(float(1), float(-1), float(0))	yields	1.0, -1.0, 0.0

s(i)	computes the i-th character of the string s; the result is a one character string. If i is negative, an error results; if i is greater than the length of s, then the value OM is returned.
s(i..j)	this string slice operator computes and returns the substring of s which extends from its i-th through its j-th characters, inclusive. If i = j + 1, a null string is returned. See Table 2.1 for a description of the treatment of other marginal and exceptional cases for this operator. (Note that this operator actually has three, rather than two, arguments.)
s(i..)	this computes and returns the substring of s which extends from its i-th character through the end of s, inclusive. See Table 2.1 for a description of the treatment of marginal cases of this operator.
s + ss	concatenates the two strings s and ss. i s concatenates i successive copies of the string s. If i = O, then is is the null string. If i < O then an error results.
s = ss	yields true if s and ss are identical, false otherwise.
s /= ss	yields true if s and ss are distinct, false otherwise.
s > ss	yields true if s comes later than ss in standard alphabetical order, false otherwise. (Note that this operation, as well as the other string comparisons s < ss, s >= ss, s <= ss, are implementation-dependent, as they depend on an assumed alphabetical order of characters (collating order). Of course, alphabetic characters will always have their standard order, but the relative order of punctuation marks, and also the way in which alphabetics compare to numerics, may differ from implementation to implementation.)
s < ss	yields true if s comes earlier than ss in standard alphabetic order, false otherwise.
s >= ss	yields true if s either is identical with ss or comes later in standard alphabetic order, false otherwise.
s <= ss	yields true if s either is identical with ss or comes earlier in standard alphabetic order, false otherwise.
s in ss	yields true if s occurs as a substring of ss, false if not.
s notin ss	yields false if s occurs as a substring of ss, true if not.

print(ss(1), ss(4));	yields	C	A
print(s(1..2), s(2..4), s(2..2));	yields	AB	BRA	B
print(s(1..0));	yields	the null string
print(s(1..), s(2..), s(3..), s(4..));	yields	ABRA	BRA	RA	A
print(s(5..));	yields	the null string
print(s(5));	yields	OM
print(s + ss);	yields	ABRACADABRA
print(3*s);	yields	ABRAABRAABRA
print(s > ss, ss > s);	yields	false	true
print("AA" > "A","A" > "");	yields	true	true
print("AA" < "A","A" < "");	yields	false	false
print(s in ss, ss in s);	yields	true	false

#s	yields the number of characters in the string s.
abs s	here s must be a one-character string, or an error results. If s is a single character, then abs s returns the internal integer code for this character. Abs and char are inverse operators.
char i	here i must be an integer which can be the internal code of some character c. If this is so, then char i yields the single character c (i.e., a one-character string). Otherwise, an error results. (The range of integer values used as character codes is implementation-dependent.)
str x	x is any SETL object. str x is a string that is the printable form of the value of x.

Operator	Condition	Effect

s(i)	i negative or zero	causes error
s(i)	i > #s	yields OM
s(i..)	i negative or zero	causes error
s(i..)	i = #s + 1	returns null string
s(i..)	i > #s + 1	causes error
s(i..j)	i negative or zero	causes error
s(i..j)	i >j + 1	causes error
s(i..j)	j negative	causes error
s(i..j)	j > #s	causes error
s(i..j)	i = j + 1	returns null string

s1(3..5) := "XXX";	-- now s1 =	ABXXXADABRA
s2(3..4) := "XXXXXX";	-- now s2 =	ABXXXXXXCADABRA
s3(3..4) := "X";	-- now s3 =	ABXCADABRA
s4(3..4) := "";	-- now s4 =	ABCADABRA
s5(7..) := "XXX";	-- now s5 =	ABRACAXXX
s6(7..) := "";	-- now s6 =	ABRACA
s7(1) := "Y";	-- now s7 =	YBRACADABRA

s(i) := x;	x must be a single character, and i must be an integer and lie between 1 and #s; otherwise an error results. This modifies the i-th character of s.
s(i..j) := x;	i must be an integer at least equal to 1 and at most equal to j + 1 or an error results. j must also also be an integer and cannot exceed #s. The section of s between i and j is made equal to x, which may expand or contract s. Note that if i = j + 1, x will be inserted into s immediately before its i-th position. The case i = #s + 1, j = #s is legal and adds x to the end of s.
s(i..) := x;	this is treated exactly as if it read s(i..#s) := x. Thus i must be an integer which is at least 1 and at most #s + 1.

c and cc	yields true if both c and cc are true, false otherwise.
c or cc	yields true if at least one of c and cc is true, false otherwise.
c impl cc	This is the logical implication operator and yields true except when c is true and cc is false. That is, if either c is false, or cc is true, then c impl cc yields true. But if c is true and cc false, then c impl cc yields false.

not (c or cc)	is equivalent to	(not c) and (not cc)
not (c impl cc)	is equivalent to	c and (not cc)
c impl cc	is equivalent to	(not c) or cc
not (not c)	is equivalent to	c