﻿Introduction to the use of the Ref proof verifier

# A user's manual for the Ref verifier

### Section I. An overall view of the Ref proof verifier.

The Ref verifier is fed script files, called scenarios, consisting of successive definitions, theorems, and auxiliary commands, which Ref either certifies as constituting a valid sequence or rejects as defective. In the case of rejection, the verifier attempts to pinpoint the troublesome locations within a scenario, so that errors can be located and repaired. Step timings are produced for all correct proofs, to help the user in spotting places where appropriate modifications could speed up proof processing.

The bulk of the text normally submitted to the verifier consists of theorems and proofs. Some theorems (and their proofs) are enclosed within so-called theories, whose external conclusions are justified by these internal theorems. This lets scenarios be subdivided into modules, which increases the readability and supports proof reuse.

The verifier allows input and checking of the text to be verified to be divided into multiple sessions.

Each theorem is labeled in the manner illustrated below. As seen in the following example, each theorem label is followed by a syntactically valid logical formula called the assertion of the theorem.

`Theorem 19: ((enum(X,S) = S) & (Y incs X)) •imp (enum(Y,S) = S).`

The statement of each theorem should be terminated by a final period (i.e. '.') and be followed by its proof, which must be introduced by the keyword 'Proof:', and terminated by the reserved symbol 'QED'. A theorem's proof consists of a sequence of statements (also called inference steps), each of which consists of a hint portion separated by the sign ' ==> ' from the assertion of the statement. An example of such a statement is

`ELEM ==> (car([x,y]) in {x}) & (cdr([x,y]) = y) & (car([z,w]) notin {x}) ,`

where the 'hint' is ELEM and the assertion is

`(car([x,y]) in {x}) & (cdr([x,y]) = y) & (car([z,w]) notin {x}) .`

As this example illustrates, the 'hint' portion of a statement names the inference rule used to derive its assertion (from prior statements of the same proof, or from theorems, definitions, or assumptions). This 'assertion' must be a syntactically well-formed statement in Ref's set-theoretic language.

A simple example of a full proof is

```    Theorem 1a: arb({{X},X}) = X. Proof:
Suppose_not(c) ==> arb({{c},c}) /= c
{{c},c} --> T0 ==> ((({{c},c} = 0) & arb({{c},c}) = 0) or
((arb({{c},c}) in {{c},c}) & (arb({{c},c}) * {{c},c} = 0)))
ELEM ==> false; Discharge ==> QED
```

Each 'hint' must name one of the basic inference mechanisms that Ref provides, and may also supply this inference mechanism with auxiliary parameters, including the context of preceding statements in which it should operate. The following table lists the inference mechanisms provided.

1. ELEM ==> ... Proof by extended elementary set-theoretic reasoning.

TELEM ==> ... variant of ELEM (see below).

2. Suppose ==> ... Introduces hypothesis, available in a local range of the proof, to be 'discharged' subsequently.

3. Discharge ==> ... Closes the the proof range opened by the last previous 'Suppose' statement, and makes negation of prior supposition available.

4. Suppose_not ==> ... Specialized form of 'Suppose', used to open proof-by-contradiction arguments.

5. (e1,..,en)-->Stat_label ==> ... Substitutes given expressions or newly generated constants into a prior labeled statement.

6. (e1,..,en)-->Theorem_name ==> ... Substitutes given expressions into a prior universally quantified theorem.

7. Def definition_name: symbol_name(params) := ... Defines a new function symbol, constant, or predicate symbol.

8. Use_def(symbol) ==> ... Expands a defined symbol into its definition.

9. Loc_def symbol_name(params) = ... Defines a new function symbol, constant, or predicate symbol for use within a single proof.

10. EQUAL ==> ... Makes deduction by substitution of equals for equals, possibly in a universally quantified formula.

11. SIMPLF ==> ... Makes deduction by removal of set-former expressions nested within other setformers or quantifiers.

12. ALGEBRA ==> ... Deduces an algebraic consequence using statements proved or assumed previously.

13. Set_monot ==>... Handles setformers and exploits set-theoretic monotonicity relationships.

Pred_monot ==>... Handles quantifiers and the finiteness predicate and exploits set-theoretic monotonicity relationships.

14. THEORY theory_name ... ==> ... Introduces a new named THEORY, listing the function symbols and constants which it assumes, and stating its assumptions.

15. ENTER_THEORY theory_name Enters the special logical context defined by a theory, to begin making definitions and giving proofs within the theory.

16. Assump ==> Cites an available theory assumption during a proof being conducted within a theory.

17. APPLY.. ==> ... Draws conclusions from theorems previously proved in a theory. Note that 'Skolemization' is handled as a special case of APPLY.

18. QED Terminates the proof of a theorem and makes the theorem available for citation.

19. AUTO Introduces an implicit internal assertion resulting directly from the hint.

20. DECLARE ==> ... Declares algebraic or monotonicity properties of individual operators or families of operators. Examples are

```
DECLARE_RING Si,•S_PLUS,+,•S_TIMES,*,•S_MINUS,-,R_Rev,--,S_0,0,S_1,1 ,
DECLARE_MONOTONE Un,+;Pow,++;#,+;•PLUS,+,+.
```

Statements and clause labels. Statement assertions and parts of conjuncts connected by the conjunction sign '&' can be labeled for explicit subsequent reference within a proof by appending a label of the form 'Statnnn:' to them, where 'nnn' designates any integer. (An example of such a label is 'Stat3:'). These labels can then be used in hints of the form

`    (e1,..,en)-->Stat_label ==> ...`

shown in the table above. Labels prefixed to entire assertions can also appear in the specification of restricted contexts that some statements carry as an attachment to their hint portion, as will be explained later. Examples of statement labels and context use appear in the following proof fragment (which you are not expected to understand right now):
```    Discharge ==> Stat0: (m in Cm) & (m /= (m •C_TIMES C_1))
m-->T804(Stat0*) ==> Stat1: (car(m) in Re) & (cdr(m) in Re) & (m = [car(m),cdr(m)])
Use_def(•C_TIMES)(Stat0*) ==> m /=
[(car(m) •R_TIMES car(C_1)) •R_MINUS (cdr(m) •R_TIMES cdr(C_1)),
(car(m) •R_TIMES cdr(C_1)) •R_PLUS (cdr(m) •R_TIMES car(C_1))]
Use_def(C_1)(Stat0) ==> (car(C_1) = R_1) & (cdr(C_1) = R_0)
(car(m))-->T606(Stat1,Stat1*) ==> (car(m) •R_TIMES R_1) = car(m)
```

### Section II. Proof scenario development and debugging

Development of a Ref proof begins with the careful statement, in ordinary mathematical language, of some proposition of interest and of its proof. These must then be translated into Ref's formal language, submitted to the verifier, and debugged. The proof should be laid out in small segments (5 to 10 lines of formal Ref are a reasonable segment size). These segments should correspond roughly to the sentences or phrases of the initial mathematical proof, which should be attached systematically to the Ref text as natural-language comments. This lets the details of the formal proof to be correlated with its guiding mathematical plan. Such correspondence will be found invaluable as Ref steps are repeatedly reviewed and adjusted in the course of debugging them.

Ordinary mathematical proofs which can be arranged to use a limited family of elementary steps repeatedly will translate most easily into Ref. Ref's library of pre-proved theorems and theories, which deserve careful study before you start to develop your own proofs, may already provide many of the basic steps you need. If this is not the case, you may want to begin by designing and developing your own library of small lemmas and theories capturing the proof ideas of which you will make repeated use. This can shorten the final proof of your target proposition considerably.

Syntax errors in theorem statements or proofs must be eliminated immediately, since the Ref verifier begins by checking the syntax of the whole text and halts at the first syntax error. Syntax errors therefore make real proof verification impossible, and you must get rid of them before any real 'proof scrubbing' can start. A typical syntax error is to omit the hint part of a proof step; you can avoid this by supplying steps about whose hint you are uncertain with dummy hints consisting of any word beginning with 'T' other than 'TELEM', e.g. 'Therefore' or 'Tsomehow', or 'TALGEBRA'. Ref will issue diagnostics for such illegal hints, treating them as if they were citations of unavailable theorems, but will accept them as syntactically valid and continue processing your proof script.

A proof which you are beginning to debug may have small or large gaps of which you are aware, but which you are not ready to fill in. It is recommended that the the proof nevertheless be maintained as a 'balanced' structure in the sense that its initial 'Suppose_not' statement and QED terminator are present, and that the 'Discharge' statements matching all 'Suppose' (if any) are also present. Parts of the proof which are still very shapeless can be carried as comments.

### Section III. Ref formula input format and pretty-printed Ref output: symbols and precedence rules.

Ref deals with logical formulae in two related forms, its keyboard-oriented input format and a pretty-printed output format oriented toward readability rather than easy typing. The proof given above illustrates Ref's keyboard-oriented input format. The same theorem looks as follows in the pretty-printed form:

```    Theorem 1a: ∋{{X},X} = X. Proof:
Suppose_not(c) ➨ ∋{{c},c} ≠ c
{{c},c} --> T0 ➨ {{c},c} = 0 & ∋{{c},c} = 0 ∨ ∋{{c},c} ∈ {{c},c} & ∋{{c},c}∩{{c},c} = 0
ELEM ➨ false; Discharge ➨ QED
```
As this example shows, the input format and pretty-printed format differ significantly. The input format uses standard ASCII keyboard characters only, with the exception of the 'Option-8' character (ASCII character 149), which appears in most Macintosh applications as a largish round dot '•', but which many browsers translate differently. This character is most often used by the Ref system as a tag prefixed to infix operators; a typical example is •PLUS, representing a binary addition operator written as infix.

While the Ref input format holds cautiously to its repertoire of typeable characters, the pretty-printed format uses Unicode characters freely, sometimes as subscripts or superscripts, to abbreviate the longish operator names which appear in the input format. For example, the function name SAME_FRAC(x,y) (for equivalence of fractions) appears as x ≈ y in pretty-printed format. The map of rational Cauchy sequences to the real numbers they represent, which is CAUCHY_TO_RE(s) in the input format, is s➮ℝ in pretty-printed form. A full table of special characters used in the pretty-printed form of Ref output is given below.

Another difference between Ref's keyboard input format and its pretty-printed output format lies in the different operator precedence rules that they reflect. The Ref input format inherits its relatively limited set of precedence rules from SETL, in which operators bind more tightly than built-in comparators such as '=', but in which all operators, especially including extended infix constructions like '•imp' or '•MINUS', have much the same priority. So in entering formulae in Ref keyboard input format, it is necessary to parenthesize fully, as shown by the example

```Theorem 121: [Cardinality lemma] Ord(#S) & (EXISTS f in OM | (one_1_map(f) &
(range(f) = S) & (domain(f) = #S))) &
(not((EXISTS o in #S | (EXISTS g in OM | one_1_map(g) &
(range(g) = S) & (domain(g) = o))))).
```
The pretty-printed format uses a much more extensive set of precedence rules, in which

1. monadics (excepting 'not', written as ¬) bind most tightly;

2. set binaries such as ∪ and ∩ bind next most tightly;

3. arithmetic binaries such as +; come next;

4. comparators such as 'in', 'notin', ⊆ and ≥ bind more weakly than arithmetic binaries;

5. quantifiers come next;

6. propositional connectives such as → (implication) and ∨ (disjunction) come next, with conjunction ('&') binding a bit more tightly than the other operatorsin this group.
This more ample family of precedence rules turns the theorem seen above to pretty-print as
```	Theorem 121: [Cardinality lemma] IsO(#S) & (∃F ∈ Ω | 11(F) &
ЯF = S & ДF = #S) & ¬(∃O ∈ #S | (∃G ∈ Ω | 11(G) & ЯG = S & ДG = O)) .
```

### A glossary of symbols used in the pretty-printed Ref output

Note: the following glossary shows the definitions of all non-primitive symbols. By clicking on any symbol, you can see all the theorem statements in Ref's shared common scenario in which the symbol appears. Symbols are given color-modeled background showing to the manner of their definition according to the following scheme: primitive sign -- cyan; ordinary algebraic definition -- white; recursive definition -- pink; definition via an APPLY -- magenta; definition via and external APPLY -- orange; in/out formal THEORY parameter -- green; variable -- yellow; unused -- grey; meta-level sign -- blue.

Clicking on any symbol (other than the most commonly used) in the left column of following table shows the list of theorems which mention the symbol

### Section IV. The Ref proof-verification environment.

IV.1 - The Ref working environment. Ref is free software, so you can download the entire Ref system and its shared common scenario of over 1000 theorems and proofs for local use on your own computer. But to work with the server-based version of the Ref verifier you must first register as a system user by sending an E-mail to the Poohbah. This should give your name, E-mail address, and institution, and say something about your background in computational logic and intended use of the system. A password and user alias will then be returned to you by E-mail. Using this password, you can access a worksheet window having the following appearance: Using this worksheet (and an external text editor of your choice) you should assemble a carefully formatted input scenario and submit it to the verifier as explained below. To verify your scenarios you must enter your user alias into the Worksheet window's 'Me' field and then click on the Worksheet's 'Go' button. The verifier system (running on its New York server) will then examine your proofs, accepting some and possibly rejecting others, either because of syntactic or logical defects, and will respond by opening a result window like that seen here.

IV.2 - The Verification result windows. Before reviewing the features of Ref's worksheet window, let us examine its typical result windows. These consist of three main sections, accessible by clicking on the radio buttons seen at the top of the first. One of these (labeled 'Bad') gives a table of the verification failures encountered during your verifier run. A second (labeled 'Good Pfs.') lists all the theorems whose proofs have been verified successfully. A third (labeled 'Misc. Init. Output') gives overall timing information, plus various items of diagnostic information. Examine these information panels closely to become familiar with their layout. The 'Good' Panel The 'Misc Initial Output' Panel If any of your proofs have failed to verify, the proof steps which have failed will be listed either in the 'Misc. Init. Input' panel or in the 'Bad' panel of the verifier report. At the left of each of the failure items in the 'Bad' panel there is a radio button for the item. By clicking on this, you can bring up additional, more detailed information concerning each of the proof steps that have failed to verify. This information appears in a panel just below the list of verification failures. Below we show a sample result Window illustrating what this information looks like. As seen, the offending statement is identified, and an account is given of some of the key items examined by the verifier during the failed verification attempt. These include the stack of statements used as the supporting context of the attempted verification, the conjunction formed from this stack, the 'blobbed' form of this conjuction (see below), this blobbed form after removal of irrelevant clauses, and the map from unblobbed formula elements to their blobbed forms. Your efforts to correct a failed verification step should generally begin by rethinking the reason why you expected the step to succeed, and by comparing this implicit argument with the verifier's actual actions, as reflected in the diagnostic information just described.

IV.3 - Features of the Verifier's main worksheet window. Now let's examine the features of Ref's main worksheet window, which is used to set up and launch verification runs. This window serves to

1. assemble the proof text which is to be verified;
2. run the verifier, to verify designated theorems, definitions, and theories in this text;
3. facilitate examination of the output of this run.

The proof text which the verifier examines is the concatenation of up to five text sources, any one of which can either be included or omitted. These are

1. A common shared scenario file, containing Ref's principal library of more than a thousand definitions and verified theorems of basic set theory and analysis leading up to the Cauchy integral theorem. This library is included by default at the start of the other text submitted to the verifier.

This shared collection of definitions and proofs can be examined by clicking on 'Full Common Scenario' in the verifier worksheet window. Clicking on 'Scenario Summary' produces a summary of this collection, giving definitions, theories, and theorem statements only. (The .pdf versions of this scenario and of its summary are provided via the 'full common scenario' and the 'scenario summary' anchors appear in 'pretty-printed' rather than 'keyboard' form. To get the full common scenario in keyboard form, click here.)

2. Up to three auxiliary proof scenario files, which are uploaded from files you designate, are appended, one after the other, to the shared common scenario before verification. You can either paste the full path to these files into any of the Worksheet's 'More files' fields, or insert this path using one of the 'Browse' buttons next to them. The scenario files you give in this way must be local files on your computer.
3. A final proof scenario fragment, entered directly into the text area seen at the bottom of the worksheet window is appended to the end of all the above files. Try pasting the following fragment into this area, and then clicking on the 'Go' (after first dropping the 'Share' check-button) to see what is meant.
```--BEGIN HERE
Theorem 3: arb({X}) = X. Proof:
Suppose_not(c) ==> arb({c}) /= c
ELEM ==> Stat1: false
Discharge(Stat1) ==> QED
```

To omit any of these last four text sources, simply leave the corresponding field empty. To suppress use of the common scenario, simply set the Worksheet's 'Share' checkbox to 'off'.

At least one 'BEGIN HERE' must appear in the text you submit for the verification. Since the verifier does not examine any text before the first line having exactly the form

```		--BEGIN HERE
```
you must always include such a line in your scenario or the verifier will ignore it all. This convention lets you include explanatory 'preambles' in your scenarios by placing them before the mandatory 'BEGIN HERE' line.

The verifier also recognizes control directives of the form

```		--PAUSE HERE
```

and

```		--END HERE
```
The first of these special controls stops statement examination temporarily, until the next following 'BEGIN HERE' directive. (This lets you insert extended comments into your scenarios.) 'END HERE' terminates scenario processing, and allows insertion of an explanatory appendix.

Shorter comments can be inserted at the end of any line simply by prefixing them with the mark '--'. The verifier ignores any text on a line following such a mark.

IV.4 - Designating the range of theorems to be verified, and invoking the verifier. Once (proceeding in the manner just explained) you have defined the body of text which the verifier is to examine, you can designate the range of theorems to be verified and invoke the verifier just by setting the 'Verify' field of the worksheet window and clicking on the 'Go' button. The 'Verify' field should contain a comma-separated combination of integers n and integer ranges n..m. An example is

```		7..15,17,21,95..105,111
```
which indicates that the proofs of theorems numbers 7 thru 15, 17, 21, 95 thru 105, and 111 are to be verified. Note that these integers refer to the ordinal positions of the theorems within the sequence of all theorems present in your scenario, not to the name given to the theorem (which can also be an integer, e.g. 'Theorem 999'). To find these theorem serial numbers, you can use the 'Thms.' panel described below, which lists all the theorems in your scenario, with their serial numbers.

IV.5 - Using the verifier for scenario examination short of full verification. The verifier's report can be produced in a variant form, oriented more toward review than to verification of your input scenario, which appears as follows: As you see, this modified results window includes four extra radio buttons. To produce the results window in this variant form, simply set the Worksheet's 'Verify' field to '1'. When this is done the verifier will only process your first theorem fully. But it always parses the whole proof scenario passed to it, for which it generates comprehensive information under the four headings 'Thms.', 'Defs.', 'Theories', and 'Cit_Indx'. Clicking on these respective radio buttons will bring up texts of all the theorems, definitions, and theories in your scenario. The fourth button, marked 'Cit_Indx', brings up a comprehensive 'A is cited by B' cross-reference index of your scenario, which includes a listing of all theorems never cited. Try these buttons to become familiar with the layout of these helpful auxiliary texts. A fragment of each of the four views these buttons disclose on the scenario is shown below:    IV.7 - Syntax errors in proof scenarios. If any of the proofs, theorem statements, or definitions in your scenario contains a syntax error the verifier will issue a diagnostic at the first such error and stop immediately, displaying an error report like the following: which was produced by running the simple but slightly erroneous test scenario

```--BEGIN HERE
Theorem 3: arb({X}) = X. Proof:
Suppose_not(c) => arb({c}) /= c
ELEM ==> Stat1: false
Discharge(Stat1) ==> QED
```

As you can see, the window which appears repeats the line containing the syntax error (in this case, erroneous use of '=>' for the required '==>'), and then shows the remainder of the proof in which this error has appeared. Given this information, you should be able to search rapidly for the error and fix it. Note that because the verifier stops at the first syntax error it encounters, only its 'Miscellaneous Initial Output' panel is accessible in such cases.

### Section V. More details concerning the syntax and semantics of the Ref inference primitives.

#### V.1 - Quantified variables and constants

Within the body of a proof all free variables x of formulae are treated as logical constants, i.e. none are understood to be universally quantified, and so inferences which substitute any expression or variable for such an x are illegal, unless an equality or equivalence allowing such replacement as an instance of equals-for-equals substitution is available in the relevant context.

A somewhat different convention is used to abbreviate the statements of theorems. Within a theorem assertion, every variable that is not otherwise quantified (or previously defined) is understood to be universally quantified. For example, the theorem

`    Theorem 1: arb({X}) = X`

is understood as meaning

`    (FORALL x | arb({x}) = x).`

Variables used in this way should be capitalized for emphasis.

#### V.2 - 'Suppose' and 'Discharge'

Ref's 'Suppose' and 'Discharge' capabilities make a convenient form of 'natural deduction' available. Any syntactically well-formed formula can be the assertion of a 'Suppose' statement, i.e. you can suppose what you like. For example,

`    Suppose ==> 2 *PLUS 2 = 4`

and

`    Suppose ==> 2 *PLUS 2 = 5`

are both perfectly legal. However, all the suppositions made in the course of a theorem's proof must be Discharged before the end of the proof. This is accomplished by matching each 'Suppose' statement with a following 'Discharge' statement. The matching rule used is the same as that for opening and closing parentheses. A Discharge statement of the form

`    Discharge ==> some_conclusion`

is only valid when the context preceding it is contradictory, in which case it allows deduction of the negation of the last preceding 'Suppose' statement, or any ELEMentary consequence of that negated statement, as a conclusion.

#### V.3 - The context of an inference step

The context of an inference is the collection of preceding statements, within the proof in which the inference appears, which the inference mechanism invoked by a hint can use in deducing the assertion to which the hint is attached. Since in some cases the efficiency of an inference mechanism degrades very rapidly (e.g. exponentially or worse) with the size of the context with which it is working, appropriate restriction of context will sometimes be crucial to successful completion of an inference. Inferences which the verifier cannot complete within a reasonable amount of time are abandoned with a diagnostic message 'Abandoned...', or with the more specific message 'Failure...' if the inference method is able to certify that the inference attempted is impossible. Hint keywords like ELEM, EQUAL, SIMPLF, and ALGEBRA can be supplied with indications limiting the context they use by prefixing them (in the cases of ELEM and Discharge) or appending them (in all other cases) with a statement label or comma-separated list of such labels, as in the examples

`  (Stat3)ELEM ==> s notin {x *incin o | Ord(x) & P(x)}`

and

`  (Stat3,Stat4,Stat9)ELEM ==> s notin {x *incin o | Ord(x) & P(x)}.`

The first form of prefix defines the context of an inference to be the collection of all statements in the proof, back to the point of last previous occurrence of the statement label in the proof (but not within ranges of the proof that have already closed because they are included between a preceding 'Discharge' statement and its matching 'Suppose' statement; see above). The second form of prefix defines the context of an inference to be the collection of statements explicitly named in the prefix. If no context is specified for an inference, then its context is understood to be the collection of all preceding statements in the same proof (not including statements enclosed within previously closed 'Suppose/Discharge' ranges). This unrestricted default context is workable for simple enough inferences in short enough proofs.

#### V.4 - The ELEM primitive

Among inference primitives, ELEM is the most central, and it is often used implicitly by other inference primitives, e.g. 'Suppose_not', 'Discharge', 'Use_def', 'Assump', EQUAL, ALGEBRA, and others.

Direct invocations of ELEM can have any of the forms

```    ELEM ==> ...
(σ1,..,σn)ELEM ==> ...
TELEM ==> ... ,
```

of which the second specifies a restricted context (σ1,..,σn) and the third is an invocation of ELEM whose negated assertion is taken as the entire context of the inference. (I.e. the assertion of TELEM must be universally valid in isolation from the rest of the proof).

ELEM implements an extended form of multi-level syllogistic, a decision algorithm which determines whether a given unquantified set-theoretic formula involving individual variables (which designate sets) and a restricted collection of set operators is satisfiable. By using this algorithm, the Ref verifier can identify many cases in which a conjunction constructed by negating one statement S of a proof and conjoining a selection of earlier lines is unsatisfiable, which implies that S follows from the preceding context.

When not all the constructs appearing in a context (e.g. quantifiers and setformers) are accessible to multilevel syllogistic, a preprocessing step must precede its application. This replaces all parts of the current context whose lead operators are not recognized by the decision algorithm by 'blobs', i.e. by new variables designating either sets (when they occur as terms) or propositions (when they occur as subformulae). Blobbing, as we shall call this operation, replaces syntactically identical (or recognizably equal) parts of a formula by the same variable. It is also able to treat as equal well-formed parts which only differ by the renaming of bound variables in quantifiers or setformers. Blobbing also treats existential quantifiers as negated universal quantifiers. (In the diagnostics issued by Ref, set-blobs are written in the form BLAnnn and propositional blobs in the form BLBnnn.)

The primary function of blobbing is to reduce all the constructs that may appear in a proof statement to ones which multilevel syllogistic can handle. However, the blobbing phase is also used to introduce other simplifications which extend the power of ELEM beyond that of simple multilevel syllogistic and improve system performance. For example, when the context available for a deduction is large, it is often the case that only a narrow part of it is needed to derive a contradiction. Discarding irrelevant context clauses greatly improves the efficiency of the decision algorithm.

Blobbing consists of three sub-phases: (1) pre-blobbing, which makes reductions such as the reduction of any part of the form Finite(#X) to Finite(X) [see Table ?? for a full list of the simplifications carried out in this phase]; (2) blobbing proper, during which subterms whose lead constructs are not known to multilevel syllogistic are replaced by set names and quantified subformulae are replaced by propositional variables; (3) post-blobbing, which drops parts of a purported contradiction when it is clear that they can play no role in establishing its contradictory nature.

#### V.5 - The Suppose_not and QED primitives

'Suppose_not' statements occur, exclusively and always, as the first inference step in a Ref proof. They can have either of the forms

```    Suppose_not(c1,..,cn) ==> ...
Suppose_not ==> ... ,
```

where c1,..,cn are distinct constants local to a proof, which correspond in number and in positions to the distinct unquantified variables appearing in the statement T of the corresponding theorem. Such theorem variables (whose first letters are always capitalized for emphasis), are understood to be universally quantified; and in a proof-by-contradiction the constants ci replace them during deduction of a contradiction. Accordingly, the statement which follows ' ==> ' in the 'Suppose_not' step must be logically equivalent to the negation of an instantiated version of T.

At the end of the proof there must appear a 'Discharge' statement which matches the 'Suppose_not'. This must have the form

```    Discharge ==> QED
```

and indicates that a contradiction was derived by assuming the existence a counterexample to T.

#### V.6 - The Suppose primitive

A 'Suppose' statement has the form

```    Suppose ==> ... ,
```

where the formula B that follows ' ==> ' can involve no constants save those already available in the proof preceding it. (These can be globally defined constants, constants of the form c_thryvar local to the current theory, constants generated within the proof as replacements for existentially bound variables, and constants generated by application of a THEORY).

Every inference step S coming after a 'Suppose B' can exploit the temporary assumption B until the following 'Discharge' statement which matches this 'Suppose' statement. The 'Discharge' eliminates the assumption of its matching 'Suppose' along with all the intermediate steps C which were derived from it.

#### V.7 - The Discharge primitive

As already said, 'Discharge' statements always match 'Suppose' and 'Suppose_not' statements within a proof, in the same balanced way in which closed parentheses match open parentheses in arithmetic expressions. A 'Discharge' statement can have either of the forms

```    Discharge ==> ...
(σ1,..,σn)Discharge ==> ... ,
```

where the latter form specifies a restricted context (σ1,..,σn).

To see how this inference primitive works, consider the proof fragment:

```    C
Suppose ==> B
D
Discharge ==> A
```

in which the 'Suppose' and 'Discharge' match each other, while C represents the overall context available before the 'Suppose'. D represents the part of context following the temporary assumption B , and A is the assertion which the 'Discharge' is to yield.

Ref will only regard this derivation as legitimate if it can establish (using ELEM) that 'C & B & D & (not A)' is contradictory and that 'C & (not B) & (not A)' is also contradictory. Most commonly 'C & B & D' is contradictory by itself, and sthe above proof pattern is

```    C
Suppose ==> B
...
false; Discharge ==> not B .
```

Note that all the assertions in D are regarded by Ref as relying on the the temporary assumption B and so are all forgotten after B is discharged.

The treatment of

```    Suppose_not(...) ==> B
D
Discharge ==> QED
```

is similar but simpler. To verify this final proof step, Ref simply tries to derive a contradiction from B&D (by resorting to ELEM as before).

#### V.8 - The citation primitives and their instantiation mechanisms

Statements of the forms

```    (e1,..,en)-->Stat_label ==> ...
(e1,..,em)-->Theorem_name ==> ...
```

(with n≥0 and m>0) are called statement citations and theorem citations, respectively. (The simpler form 'Theorem_name ==> ...' of theorem citation is also available). These inference primitives involve instantiation mechanisms, which replace universally or existentially bound variables by terms within the formulae cited, while (in the case of statement citations) dropping quantifiers from these formulae.

In a statement citation, the formula which 'Stat_label' refers to is either the whole of an assertion found on a line occurring earlier within a proof, or an ending conjunctive part of that assertion (connected by the conjunction sign '&' with its preceding part). In a theorem citation, the formula which 'Theorem_name' refers to is the assertion of an earlier theorem belonging to the same THEORY as the current proof or to the background THEORY Set_theory. This conclusion usually involves variables (written in uppercase) which are considered to be universally bound. In theorem citations, only these tacitly bound variables get instantiated. On the other hand, in a statement citation, instantiation can affect universally as well as existentially bound variables, and the two kinds of instantiation can be intermixed in a single step.

Instantiation of an existentially bound variable and instantiation of a universally bound variable are quite different logical operations. Only previously unused constants can replace existentially bound variable occurrences, whereas any term not containing any free variables can replace a universally bound variable occurrence. While bound variables in quantifiers can have the same name as bound variables used earlier, each constant used to instantiate an existentially bound variable must be "new", i.e. must be distinct from constants introduced elsewhere within the same proof.

#### V.8a - Logical equivalences exploited in instantiating statements

A negated existentially quantified conjunct is a universally quantified formula in disguise. Accordingly the variables bound by its lead quantifier can be instantiated with arbitrary terms. For example, the negated formula '(not(EXISTS x | P(x)))' is treated as if it were '(FORALL x | not P(x))'. Likewise, the negation '(not(FORALL x | P(x)))' of a universally quantified statement is treated as if it were '(EXISTS x | not P(x))'. Negated inclusions and negated equalities are treated like existential statements, taking into account the logical equivalence between

```    not(x *incin y), not(y incs x), (x *nincin y), not(y *nincs x),
```

and

```    (EXISTS z | (z in x) & (z notin y)),
```

and also the logical equivalence between

```    x /= y
```

and

```    (EXISTS z | (z in x) *eq (z notin y)).
```

A membership statement of the form

```    y in {s: x1 in t1,...,xn in tk | P(x1,...,xk)}
```

is treated exactly like the equivalent existential statement

```    (EXISTS x1,...,xk | (y=s) & (x1 in t1) &...& (xk in tk) & P(x1,...,xk)) .
```

Similarly, a formula

```    y notin {s: x1 in t1,...,xn in tk | P(x1,...,xk)}
```

is treated as if it were

```    (FORALL x1,...,xk | (y/=s) or (x1 notin t1) or...or (xn notin tk) or (not P(x1,...,xk))) .
```

#### V.8b - Correspondence between substituends and substitutes used in instantiation

Suppose that we call the terms

```    e1,..,em
```

appearing in the hint of a theorem citation

```    (e1,..,em)-->Theorem_name ==> ... ,
```

substitutes and call the implicitly bound variables X1,..,Xm,...,Xh which occur in the assertion A of the cited theorem 'Theorem_name' substituends. It is obvious that the number m of substitutes should not exceed the number of distinct substituends. Substitution of the e's for the X's is made in the left-to-right order of the X's within the theorem assertion, and if there are more than m variables in the theorem statement the extra X's go unreplaced.

Next consider a statement citation

```    (e1,..,en)-->Stat_label ==> ... ,
```

and again call the terms

```    e1,..,en ,
```

substitutes. In this case identification of the occurrences of the i-th substituend (namely the one whose occurrences will be replaced by ei) is sometimes a bit trickier, since in quantified formulae occurrences of logically distinct bound variables can have the same name. To avoid this issue, imagine first that all bound variables (in quantifiers as well as in setformers) are renamed in a way that eliminates this possibility. Then the same left-to-right rule applies as for theorem citation.

A minor problem relates to the existence of two (syntactic but) synonymous kinds of set-former, namely

```    StatN: {x: x in t | P(x)}
```

and

```    StatN: {x in t | P(x)} .
```

The ways one instantiates affirmed membership for these differ slightly, as the following basic example shows. One does

```    ... StatN: a in {x: x in t | P(x)}
c-->StatN ==> (a = c) & (c in t) & P(c)
```

with the first form of setformer, and

```    ... StatN: a in {x: x in t | P(x)}
()-->StatN ==> (a in t) & P(a)
```

with the second, where '( )' clearly designates the "void" substitution.

#### V.9 - The Def and Use_def primitives

New predicate, function, infix operator or relator, or constant symbols can be introduced using definitions of one of the allowed forms

```    Def definition_name: predicate_symbol(X1,...,Xn) := formula
Def definition_name: function_symbol(X1,...,Xn) := term
Def definition_name: Def(X1 •infix_operator X2) := term
Def definition_name: Def(X1 •infix_relator X2) := formula
Def definition_name: constant_symbol := term ,
```

where the variables X1,..,Xn (which are separated by commas in the first two forms of definition) must all be different. These X's are the formal parameters of the defined symbol; no other free variables can appear on the right-hand side of the definition, which can however use previously defined symbols. The new symbol which a definition introduces can appear in its right side also, provided certain syntactic restrictions (see below) are obeyed, to ensure that such a recursive definition does not result in a vicious circle.

It is illegal to define the same symbol twice.

A definition of the above kind has the same logical consequences as an assertion of the corresponding form

```    predicate_symbol(X1,...,Xn) •eq formula
function_symbol(X1,...,Xn) = term .
```

Once their definitions have been given, every substituted instance of these formulae can be used as a step in a proof. This is done using hints of the forms

```    Use_def(defined_symbol) ==> ...
Use_def(defined_symbol)(σ1,..,σk) ==> ...
```

(As usual, the second kind of hint restricts the context of the 'Use_def' inference).

For non-recursive definitions, the 'Use_def' mechanism searches the context in which it is applied for occurrences O of the 'defined_symbol', identifies the arguments of each such occurrence, substitutes these arguments into the right-hand side of the definition, and replaces O by this susbtituted term/formula. The transformed context which results is then used as the basis for an ELEM deduction.

In the case of a recursive definition, no ELEM step is taken, so the conclusion of the 'Use_def' must have the form

```    Use_def(defined_predicate_symbol) ==> defined_predicate_symbol(e1,...,en) •eq A
```

or the form

```    Use_def(defined_function_symbol) ==> defined_function_symbol(e1,...,en) = t ,
```

where e1,...,en are the actual parameters of the Use_def, the i-th of which replaces the i-th formal parameter in the definition of 'defined_symbol' to give the corresponding formula A or term t.

#### V.9a - "Algebraic" and recursive definitions

In ordinary 'Def' definitions, the symbol being defined will only appear on the left-hand side of the definition, not on its right. Definitions of this form, which can be called "algebraic", typically involve setformers, e.g.

`Def 25: Un(S) := {y: x in S, y in x}`

(which defines Un(S) as 'the set of all elements of elements of S', i.e. 'the union of all elements of S'). A second example is

`Def nnn: Less_usual(S) := {y: x in S, y in x} - s`

(which defines 'the set of all elements of elements of S which are not directly elements of S').

A second allowed form of definition, called recursive definition, generalizes the kind of set-theoretic definition just seen in a less commonly used but very powerful way, by relaxing the rule prohibiting "circular" definitions. Here the symbol being defined, which must designate a function of one or more variables, can also appear on the right of the definition, but only in a way subject to appropriate restrictions. Specifically, we allow function definitions like

Def nnn: f(s,t) := d({g(f(x,h1(t)),s,t): x in s | P(x,f(x,h2(t)),s,t)}) ,

where it is assumed that d, g, h1, h2, and P are previously defined symbols and that f is the symbol being defined by the formula displayed. Here circularity is avoided by the fact that the value of f(s,t) can be calculated from values f(x,t') for which we can be sure that x is a member of s, so that x must come before s in the implicit (possibly infinite) sequence of steps which build sets up from their members, starting with the empty set, which is the only necessary foundation object for the so-called 'pure' theory of sets.

Definitions like that displayed above let Ref scenarios use the sledgehammer technique called transfinite induction, which like other sledgehammers is sometimes needed to break through key obstacles.

An example of a recursive defnition is

```    Def 9: [The enumeration of a set] enum(X,S) :=
if S •incin {enum(y,S): y in X} then S else arb(S - {enum(y,S): y in X}) end if .
```

The most general form of recursion allowed in Ref's definition is

f(s,x2,...,xn) := d({g(f(x,h2(s,x,x2,...,xn),h3(s,x,x2,...,xn),...,hn(s,x,x2,...,xn)),s,x,x2,...,xn): x in s | P(f(x,h2(s,x,x2,...,xn),h3(s,x,x2,...,xn),...,,hn(s,x,x2,...,xn)),s,x,x2,...,xn)},s,x2,...,xn).

Here g, d, and h2,...,hn must be previously defined functions of the indicated number of arguments, and P must be a previously defined predicate of the indicated number of arguments.

In the predicate case, recursive definitions of the form

P(s,x2,...,xn) := d({g(s,x,x2,...,xn): x in s | P(x,x2,...,xn)},s,x2,...,xn) = 0,

are allowed. Here again g and d must be previously defined functions of the indicated number of arguments. In the special case in which the function d has the form

d(t,s,x2,...,xn) = {x: x in t | (not Q(s,x,x2,...,xn))},

where Q is some previously defined predicate, the recursive predicate definition seen above is equivalent to

P(s,x2,...,xn) := (FORALL x in s | P(x,x2,...,xn) •imp Q(s,x,x2,...,xn)).

Accordingly, recursive predicate definitions of the latter form are also allowed.

#### V.10 - The EQUAL and ALGEBRA primitives

Inferences of two implicitly 'equational' kinds, EQUAL and ALGEBRA, are supported by Ref. Inferences invoked by the EQUAL primitive can handle arbitrary constant and function symbols but know only whether these are the same or not. In contrast, the ALGEBRA primitive is aware of constructs which designate operations obeying the associative and commutative laws of algebra.

Inferences by EQUAL or by ALGEBRA have the forms:

```    EQUAL(σ1,..,σn) ==> ...
EQUAL ==> ... ,
ALGEBRA(σ1,..,σn) ==> ...
ALGEBRA ==> ...
```

where (σ1,..,σn), when present, specifies a restricted context as usual.

Inference by EQUAL operates in a very simple way: all of the equalities which have been derived in the context K of a statement of this kind are propagated transitively within all terms and formulae occurring in K. For example, if the equality 's = t' between terms s, t is known (either directly from the context K or because it has been derived at an earlier stage of the propagation), and if a term f(...s...) containing an occurrence of s appears in K, then the equality 'f(...s...) = f(...t...)' (whose right side has t in one or more places where the left side has s) is derived. The same is done for propositional level equivalences.

EQUAL can handle some derivations that lie beyond reach for ELEM, since ELEM cannot penetrate the structure of a term whose lead operator lies outside the realm of multilevel syllogistic. For example, if f is a function symbol not known to ELEM, ELEM will not be able to derive 'f(x) = f(y)' from 'x = y', but EQUAL can. Also, since ELEM is implicitly invoked at the end of all EQUAL steps of a proof, EQUAL will sometimes allow derivations such as that of R_is_nonneg(r) from a context K in which 's •R_GE R_0', 'R_is_nonneg(s) •eq (s •R_GE R_0)', and 'r = s' are known.

#### ALGEBRA.

Once the ALGEBRA primitive has been informed that some family of operations like

```    Si, •S_PLUS, •S_TIMES, •S_MINUS, S_Rev, S_0, S_1
```

or

```    Re, •R_PLUS, •R_TIMES, •R_MINUS, R_Rev,  R_0, R_1 ,
```

are those of a ring, it can deduce many algebraic relationships involving them. Operator families known to ALGEBRA must involve a constant designating the support domain, infix operators designating addition, multiplication, and subtraction, a unary function symbol designating the additive inverse, and constants representing the additive unit and the multiplicative unit. Some of these constructs can be omitted, e.g. ALGEBRA can handle some deductions for "quasi-ring" structures such as the family

```    Za, •PLUS, •TIMES, •MINUS, 0, 1
```

of unsigned integers (which lacks an additive inverse).

ALGEBRA inference works by generating algebraic equalities and then triggering an EQUAL inference. It also handles with statements like 'S_0 in Si', 'a in Re', 'b •R_PLUS c in Re', '#(n\k)in Za', etc., which assert membership of constants, variables and compound quantities in the rings of which it is aware. Assertions of this auxiliary kind are often needed before useful equalities are generated and equality propagation can start.

Caveats concerning the use of ALGEBRA in its present implementation:

1. ALGEBRA can derive '(n •PLUS m) •PLUS m in Za' when 'n in Za' and 'm in Za' are known from the context K; but if only '(n •PLUS m) in Za' and 'm in Za' are known, ALGEBRA fails to derive '(n •PLUS m) •PLUS m in Za' unless 'n •PLUS m' is explicitly named as a variable.

2. Valid equations involving subtraction are not derived in quasi-rings: for instance, although ALGEBRA can derive '((n •MINUS m) •PLUS m in Za) & ((n •PLUS m) •MINUS m in Za)' from 'n in Za' and 'm in Za', it is unable to derive '(n •PLUS m) •MINUS m = n' from the same hypotheses.

Interplay between ALGEBRA and EQUAL: Since ALGEBRA concludes with a call to EQUAL, it can often handle "hybrid" equalities which intermix rings operations with others which are not algebraic. For example, ALGEBRA can derive
`f(a,b •R_TIMES R_0,R_1 •R_TIMES c) = f(a,R_0,c)`

from 'b in Re' and 'c in Re'.

#### V.11 - The Declare primitive

Ref's 'Declare' primitive is used to supply other Ref operations like 'Set_monot' and ALGEBRA with the information that they need to handle collections of function and predicate symbols being progressively enlarged by definitions. 'Declare' esists in several forms, namely

DECLARE_MONOT func_or_pred_name,tagged_argument_list

DECLARE_RING the_ring,opp_and_identity_list

DECLARE_QUASI_RING the_quasi_ring,opp_and_identity_list

Details concerning the DECLARE_MONOT directive are given below. In a DECLARE_RING statement, 'the_ring' must identify a set, for example the set of signed integers, rationals, reals, or real functions, which carries the algebraic structure of a ring, and 'opp_and_identity_list' must be a comma separated list which identifies the algebraic operations and unit elements of this ring. An example is

DECLARE_RING Re,•R_PLUS,+,•R_TIMES,*,•R_MINUS,-,•R_Rev,--,R_0,0,R_1,1.

As seen in this example, the opp_and_identity_list consists of operation, e.g •R_PLUS (or constant, e.g. 'R_0') names, alternating with standard signs (e.g '+', '0') which identify the algebraic roles of these operators and constants. So, for example, the declaration just seen states that Re is a ring with operations and constants, namely •R_PLUS, •R_TIMES, •R_MINUS, •R_Rev, R_0, and R_1 which function as its addition, multiplication, subtraction, monadic minus, additive zero, and multiplicative identity respectively. A DECLARE_RING statement must identify operators and constants playing each of these six roles.

DECLARE_QUASI_RING statements are similar, but do not involve either the subtraction or monadic minus operations.

To accept a DECLARE_RING directive, Ref must be able to locate prior theorems which establish the closure and other properties which the operations on a ring must have, e.g. commutativity and associativity of addition and multiplication, distributivity of multiplication over addition, etc. In the absence of the required prior theorems a DECLARE_RING or DECLARE_QUASI_RING statement will be diagnosed as incorrect.

#### V.12 - The Loc_def primitive

[...TO BE FILLED IN...]

#### V.13 - The SIMPLF primitive

Suppose that a theorem or statement ToS containing a construct like {e(x): x in s | P(x)} is cited, that the variable s is free in ToS, and that the citation substitutes some second set-former expression {a(y): y in b | C(x)} for s. Then a compound expression {e(x): x in {a(y): y in b | C(x)} | P(x)} will appear: Very commonly in such situations, the argument continues by simplifying this last expression in the obvious way, i.e. to {e(a(y)): y in b | C(y)} & P(a(y))}. Simplifications of much the same form apply to quantified predicates, e.g. (FORALL x in {a(y): y in b | C(x)} | P(x)) simplifies to (FORALL y in b | C(y) •imp P(a(y))). Ref's SIMPLF primitive handles simplifications of this kind, along with their multivariable set-former and predicate analogs, e. g. simplification of

{e(x,y): x in {a(u,v): u in b, v in c(u) | d(u,v)}, y in {A(U,V,x): U IN B(x), V IN C(U,x) | D(U,V,x)}}

to

{e(a(u,v),A(U,V,x)): u in b, v in c(u), U IN B(a(u,v)), V IN C(U,a(u,v)) | D(U,V,a(u,v)) & d(u,v)}}.

#### V.14 - The Set_monot and Pred_monot primitives

Set-former expressions involving free variable s,t,... etc. often have evident monotonicity properties in these variables, e.g. the value of the set-former {e(x): x in s | x incs t} always increases when s is made larger and decreases when t is made larger. Monotonicity properties of this kind can be propagated in obvious syntactic fashion for compound operators, e.g. f(s,f(t,u)) obviously increases when either s or u is made larger and decreases if t is made larger (the other sets being held constant in each case). These simple rules, often exploited in proofs, have evident predicate analogs.

Ref's 'Set_monot' inference mechanism works with relationships of this kind, which it propagates systematically through the syntax trees of expressions presented to it, or rather through as much of these trees as involve only functions having monotonicity properties known to it.

Monotonicity properties of previously defined function and predicate symbols f are made known to 'Set_monot' by declarations like

DECLARE_MONOT f(+x1,+x2,-x3,...),

where each argument variable xj in which the function f increases (resp. decreases) should be prefixed by a '+' (rep. '-') sign, while argument variables in which f is not monotone in either direction should carry no prefix.

To accept such a directive, Ref must be able to locate prior theorems which establish the claimed monotonicity properties, e.g. having forms like

Theorem name: x incs y •imp (f(x,s2,...) incs f(y,s2,...)), etc.

In the absence of such prior theorems the DECLARE_MONOT statement will be diagnosed as incorrect.

DECLARE_MONOT is used in much the same way to declare predicates monotone, but in this case the meaning of DECLARE_MONOT P(+x..) is x incs y •imp (P(y,..) •imp P(x,..)) and the meaning of DECLARE_MONOT P(-x..) is x incs y •imp (P(x,..) •imp P(y,..)).

The 'Set_monot' inference mechanism expects to be presented with a conclusion having one of the forms a incs b, a •incin b, a = b. p •imp q, p •eq q, where a and b are set-valued expressions while p and q are predicate expressions. Given such a conclusion, 'Set_monot' walks the syntax trees of a and b (or p and q) in parallel, descending through all nodes at which identical function symbols f are found, and to arguments of these f on which the f has a known monotone dependency. Whenever a discrepancy between the two trees is found, with a subexpression ea in a and eb at the corresponding point in b, 'Set_monot' generates and collects a hypothesis of the form ea incs eb, ea •incin eb, ea = eb, as appropriate to the known monotone dependency (or lack of monotone dependency) of the overall expressions a and b on the subexpression e occurring at the point of discrepancy, and adds this to an conjunction D of clauses. Then the desired conclusion is an evident consequence of the conjunction D, so 'Set_monot' is able to accept this conclusion if it can show, using ELEM, that 'not D' is incompatible with the context C available at the point of inference.

#### V.15 - The THEORY modularization construct. The ENTER_THEORY and Assump primitives

The Ref system provides a theory mechanism designed to ease large-scale proof development. Ref's theories, like the procedures of a programming language, have lists of formal parameters. Each theory requires its parameters to meet a set of assumptions. When "applied" to a list of actual parameters that have been shown to meet these assumptions, a theory can instantiate several additional "output" set, predicate, and function symbols, and supply a list of theorems generated automatically as substituted instances of theorems initially proved by the user inside the theory itself. Accordingly, the output parameters of a theory comprise of a symbol-definition mechanism supplementary to the use of 'Def'.

Theories allow modularization of a scenario, which

• make it unnecessary to repeat steps when the proofs of two theorems are closely analogous;

• conceal the details of a proof once a template for it has been introduced into Ref and successfully certified.

Ref organizes theories into nested trees of named theories whose top level is always 'Set_theory'. Every Ref definition and theorem is tagged as belonging to some theory. New theories are created by declarations of the form

```    THEORY theory_name(list_of_assumed_symbols_and_arguments)
assumptions...
END theory_name
```

which define the input symbols and assumptions of the theory. The list_of_assumed_symbols_and_arguments seen here should be a comma-separated list of all the constants and functions which the theory assumes, each followed by its list of parameters if any. The function symbols appearing in the 'list_of_assumed_symbols_and_arguments' are allowed to be infix operators written in the syntax appropriate for such operators.

Examples of theory declarations are

```    THEORY equivalence_classes(R(x,y),s)
(FORALL x in s,y in s | (R(x,y) •eq R(y,x)) & R(x,x))
(FORALL x in s,y in s,z in s | ((R(x,y) & R(y,z)) •imp R(x,z)))
END equivalence_classes ,```

```    THEORY well_founded_set(s,arg1_bef_arg2(x,y))
(FORALL t | ((t •incin s) & (t /= 0)) •imp (EXISTS w in t |
(FORALL v in t | (not arg1_bef_arg2(v,w)))))
END well_founded_set ,```

and

```    THEORY sigma_theory(s,x •PLUZ y,e)
e in s
(FORALL x in s | x •PLUZ e = x)
(FORALL x in s , y in s | x •PLUZ y = y •PLUZ x))
(FORALL x in s , y in s , z in s |
(x •PLUZ y) •PLUZ z = x •PLUZ (y •PLUZ z))))
END sigma_theory .
```

Once a theory T has been declared in this way it can be entered by executing the verifier directive

`    ENTER_THEORY T`

which should appear outside the body of the theory.

Theory t1 is said to be nested within t2 if the declaration of t1 appears when t2 is active, i.e. t2 is the most recent theory to have been ENTERed. Moreover, theory t0 is said to be an ancestor of t1 if t1 is nested within t0 or one of its descendants, or t0 and t1 are the same theory. If t1 has t0 as an ancestor, it can use the assumptions and definitions, and the theorems of t0, otherwise not. To force every theory to be a descendant of Set_theory, the declaration and directive

```    THEORY Set_theory(arb(x),s_inf)
(FORALL s | (s = 0 & (arb(s) = 0)) or ((arb(s) in s) & (arb(s) * s = 0))
s_inf /= 0 and (FORALL x in s_inf | {x} in s_inf)
END theory_name;

ENTER Set_theory```

are automatically and implicitly prefixed to every scenario.

The 'Assump' primitive appears in statements of the forms

```    Assump ==> ...
Assump(σ1,..,σn) ==> ...
```

(As usual, the second kind of hint restricts the context of the 'Assump' inference). When Ref encounters a statement with an 'Assump' hint, it conjoins together all assumptions appearing in the ancestor theories of the one where the statement belongs, and checks whether the assertion of the statement follows immediately from the conjoined assumptions. The proof in the following toy theory, which is correct, illustrates what just said.

```    THEORY beyond_infinity(u_inf)
u_inf = {s_inf}
END beyond_infinity

ENTER_THEORY beyond_infinity

Theorem beyond_infinity.1: arb(s_inf) in arb(u_inf). Proof:
Suppose_not ==> arb(s_inf) notin arb(u_inf)
Assump ==> (u_inf = {s_inf}) & (s_inf /= 0) & Stat1:
(FORALL s in OM | (((s = 0) & (arb(s) = 0)) or ((arb(s) * s = 0) & (arb(s) in s))))
s_inf --> Stat1 ==> false; Discharge ==> QED```

#### V.16 - The APPLY primitive. Skolemization

???But when t1 is nested within t2 APPLYing t1 requires that functions and predicates to be substituted for all the input functions and predicates of t1 be identified, and that after this substitution all the assumptions of both?? t2 and t1 follow from statements available in the context in which the APPLY statement appears. Note that???

APPLY statements can appear either within proofs, in which case the symbols defined by the APPLY statement are local to that proof, or outside every proof, i.e. at the same syntactic level as proofs themselves, in which case the context of the APPLY is the assertion made by ??the very last?? theorem preceding it (which should therefore collect everything required by the APPLY). In this case the symbols defined by the APPLY statement are global.

When an APPLY is executed, replacements must be given for all the input symbols of the THEORY being applied (and, as explained above, all?? its ancestor THEORYs, namely all THEORYs within which it is nested), and new names can be declared for any of the function and predicate symbols defined globally within the THEORY. Accordingly the header of the APPLY statement must have the form

```    APPLY(sym_1_thryvar:extrn_name_for_sym_1,sym_2_thryvar:extrn_name_for_sym_2,...)
th_name(inp_replacement_list) ,```

where 'inp_replacement_list' is a comma-separated list of elements of the form

`    (*)   input_symb_of_theory(param_1,param_2,..)->replacement_expn .`

An example is

`    APPLY(x_thryvar:x,y_thryvar:y) Svm_test(a(x)->car(x),b(x)->cdr(x),s->s) .`

To understand the syntactic and semantic rules which must apply to such 'THEORY calls', we can interpret them as follows: Aside from functions and constants defined in some one of its ancestor theories, or defined within itself, a theory, once ENTERed, can make use only of its input symbols and of the information concerning such symbols given in the assumptions of the THEORY. (In particular, it uses the fact that there exist functions and predicates having the properties listed in these assumptions). We can view the set of replacement_expn's appearing in the inp_replacement_list of an APPLY statement as integrating the THEORY into the logical environment of the APPLY statement by giving definitions of all of these input symbols. These definitions must convert all the assumptions of the THEORY into theorems available in the logical context of the APPLY. This requirement allows us to view all the proofs given in the THEORY as proofs in the context of the APPLY which make use of these defined symbols. The output symbols of the THEORY, namely (some of) those defined within the theory itself, could have been given other names if we liked, and in particular those other names which appear as 'extrn_name_for_sym's in the APPLY statement. From this point of view, our THEORY/APPLY mechanism simply repackages hypothetical 'inline' proof sequences of the form

1. Define a group of new symbols (corresponding to the input symbols of a THEORY) to be used in an immediately following sequence of proofs.
2. Prove the properties (corresponding to the assumptions of the THEORY) additional to those already available in the THEORY's parent environment, on which these proofs rely.
3. Insert the definitions and proofs of the theory, which now become proofs valid in the context of the APPLY. Make sure that all symbols defined have the external names 'extrn_name_for_sym'.
The repackaged 'offline' form of this line of proof is then
1. Assume certain properties of a group of symbols not otherwise used.
2. Give a sequence of definitions and proofs
3. Supply a set of replacement expressions which, if used to define the assumed symbols of a theory, would allow proof of all the assumptions of the theory.
4. Create versions of the functions and constants defined in the theory, giving them any desired external names.
5. Capture versions of the theorems proved in the theory by substituting the appropriate replacement expression for each of the assumed symbols of the theory.
The relationship between the hypothetical 'inline' version of a theory and its actual invocation after 'offline' construction is comparable to the way that procedures are used in programming languages. Were it actually constructed, steps (ii) and (iii) of the 'inline' version would be handled by supplemental inferences of Use_def type.

These considerations make clear the rules that must be imposed on THEORY declarations and APPLY inferences for them to remain sound:

1. The assumed symbols of a THEORY must all be distinct and distinct from any symbol previously defined in any of the THEORY's ancestors. The parameters declared for any of a THEORY's input symbols must all be distinct. These symbols count within the THEORY and any of its descendants as having been defined.
2. The THEORY can cite any definition or theorem already proved in it or in any of its ancestor theories, and can also cite any assumption of it or any of its ancestor theories.
3. If a THEORY T1 is APPLYed within another THEORY T2, replacement_expns must be given for all the input symbols of T1 and of all its ancestor THEORYs, all the way back to the last common ancestor of T1 and T2, and, after replacement of the assumed symbols of all of these THEORY by the replacement expressions supplied for them, all the assumptions of all?? these THEORYs must have proofs in T2.

The verifier then substitutes the given expns for each appearance of the given input symbols in the assumptions of the theory, [...]

#### V.17 - 'AUTO' formulae: Statements with implicit conclusions.

Some of the Ref's deduction primitives generate an internal statement which they use to derive the assertion actually written by the Ref user. The most obvious of these are:

1. Discharge: generates the negation of its matching Suppose.

2. Theorem citation: generates a substituted form of the theorem cited.

3. Statement citation: generates a substituted form of the statement cited.

4. Suppose_not: generates the negation of its theorem.

In these cases the assertion U written by the user will often be identical with the statement S generated by the system. Ref lets the keyword 'AUTO' be used as an abbreviation for S, making it unnecessary for the Ref user to write U = S. The Ref syntax which lets this be done for the 4 cases listed above is:

1. Discharge ==> AUTO

2. Theorem citation: (v1,...,vk)-->Tnnn ==> AUTO, where Tnnn designates the theorem cited, and v1,...,vk is the list of variables or expressions which are to be substituted for the free variables of that Theorem.

3. Statement citation: (v1,...,vk)-->Statnnn ==> AUTO, where Statnnn designates the conjunct C cited, and v1,...,vk is the list of variables or expressions which are to be substituted for the free variables of C.

4. Suppose_not(v1,...,vk) ==> AUTO, where v1,...,vk is the list of hitherto unused variables to be substituted for the free variables of the Theorem.

For example, use of AUTO reduces Theorem 42 of the standard proof library, which in fully detailed form would be

```    Theorem 37: [Enumeration membership lemma] enum(X,S) = S or enum(X,S) in S. Proof:
Suppose_not(x,s) ==> (enum(x,s) /= s) & (enum(x,s) notin s)
Use_def(enum) ==> enum(x,s) = if s •incin {enum(y,s): y in x} then s else
arb(s - {enum(y,s): y in x}) end if
ELEM ==> s - {enum(y,s): y in x} /= 0
ELEM ==> enum(x,s) = arb(s - {enum(y,s): y in x})
(s - {enum(y,s): y in x}) --> T0 ==> arb(s - {enum(y,s): y in x}) in s - {enum(y,s): y in x}
Discharge ==> QED
```
to
```    Theorem 37: [Enumeration membership lemma] enum(X,S) = S or enum(X,S) in S. Proof:
Suppose_not(x,s) ==> AUTO
Use_def(enum) ==> enum(x,s) = if s •incin {enum(y,s): y in x} then s else
arb(s - {enum(y,s): y in x}) end if
ELEM ==> s - {enum(y,s): y in x} /= 0
ELEM ==> enum(x,s) = arb(s - {enum(y,s): y in x})
(s - {enum(y,s): y in x}) --> T0 ==> AUTO
Discharge ==> QED
```

This has two benefits: (a) AUTO statements need no inference and so are verified immediately; (b) the user has less to write. A drawback is that heavy use of 'AUTO' makes proofs less readable. But this can be cured by re-introducing the suppressed statements as comments.

AUTO is also available in Use_def cases, including (i) those in which the conclusion drawn is a substituted literal copy of a a function's definition, and (ii) those in which the conclusion drawn is a substituted version of some conjunct appearing earlier in a proof. To use AUTO in case (i) one expands the hint Use_def(fcn) into Use_def(fcn(v1,..,vn)) where v1,..,vn are the variables or expressions to be substituted for the free variables of the definition. Using this extension the above proof can be further simplified to

```    Theorem 37: [Enumeration membership lemma] enum(X,S) = S or enum(X,S) in S. Proof:
Suppose_not(x,s) ==> AUTO
Use_def(enum(x,s)) ==> AUTO
ELEM ==> s - {enum(y,s): y in x} /= 0
ELEM ==> enum(x,s) = arb(s - {enum(y,s): y in x})
(s - {enum(y,s): y in x}) --> T0 ==> AUTO
Discharge ==> QED
```
and even to
```    Theorem 37: [Enumeration membership lemma] enum(X,S) = S or enum(X,S) in S. Proof:
Suppose_not(x,s) ==> AUTO
Use_def(enum(x,s)) ==> AUTO
(s - {enum(y,s): y in x}) --> T0 ==> AUTO
Discharge ==> QED
```

The Use_def case (ii) noted above (substitution into a conjunct) can in the same spirit be written as

```   Use_def(fcn(v1,..,vn)-->Statn) ==> AUTO
```

where Statn labels the conjunct in question.

#### V.18 - The behind-the-scenes activity of proof-by-structure

Ref's typelessness contrasts with the more elaborate type systems common in other automated proof systems. Setting efficiency considerations aside (Ref typically verifies roughly 200 proofs per minute, so these are not of central interest), the advantage of a type system from the point of view of a proof system's user is that numerous small statements concerning the membership of variables in sets central to mathematical discourse (such as the reals, the integers, the real sequences, etc.) are handled implicitly. Once an appropriate type system has been set up, this implicit treatment makes such statements available without further user effort. But a familiar disadvantage of static typing in proof verifiers is that, once introduced into a system's foundations, it tends to become over-rigid. This inspires efforts to generalize type systems, and tends to result in baroque intellectual structures unfamiliar to the working mathematician.

For this reason Ref prefers the 'lean and mean' typeless approach of Zermelo-von Neumann set theory. But then, to recapture the main advantages of a typed approach, it provides an internal, dynamic, type-like mechanism called 'Proof by structure' described in the following paragraphs.

Proof by structure uses a simple internal language of structure descriptors ("types", in a weak sense) to keep track of the top structural levels of sets appearing in scenario proofs. Any special set defined in a scenario, for example ℕ, the set of all integers, or ℝ, the set of all reals, can be used as a primary structure symbol in this language. This descriptor attaches to all members of the set, e.g. any integer has the descriptor ℕ. A significant but less basic example is ℤ+, the set of all non-negative signed integers, which presently does not occur in our main scenario but could easily be defined. Structure descriptors need not be confined to sets, but can also designate classes, like the classes IsO and IsC of all ordinals and of all cardinals, the classes IsΦ and Is¬Φ of all finite and of all infinite sets, the class Is¬0 of all non-void sets, and the class Ω of all sets.

Given any such symbols S,S1,S2 representing structures, we can then form new structure descriptors:

1. {S} describes a set all of whose elements have the descriptor S. For example, the set ℕ has the descriptor {ℕ}; the set (ℕ) of all sets of integers has the descriptor {{ℕ}}.

2. [S1,S2] describes a pair whose components have respectively the descriptors S1 and S2.
These constructions can be compounded by letting S,S1,S2 be any basic or non-basic structure descriptors. For example
1. {[ℕ,ℕ]} describes a set of integer pairs (and so applies to ℤ, the set {[i,j]: i ∈ ℕ, j ∈ ℕ | i = 0 ∨ j = 0} of signed integers);

2. {[ℕ,Ω]} describes a map from integers to elements of any kind, e.g. it describes any finite or infinite(ly denumerable) sequence.

A given set can have multiple descriptors. For example, a finite sequence of signed integers has the descriptors {[ℕ,ℤ]} and IsΦ. Since ℤ itself has the descriptor {[ℕ,ℕ]}, a sequence of signed integers also has the descriptor {[ℕ,[ℕ,ℕ]]}, which in any given situation one may wish either to use or to ignore. Infinite sequences of rationals have the descriptors {[ℕ,ℚ]} and Is¬Φ. Real numbers in Cantor's representation are equivalence classes of such sequences, and therefore have the descriptors {{[ℕ,ℚ]}} and {Is¬Φ}.

The verifier's internal proof by structure mechanism tracks the descriptors of variables and expressions appearing in proofs as precisely as it can. For example, a variable x known to satisfy a clause "x ∈ ℕ" is given the descriptor ℕ, while if x is known to satisfy "x ∈ ℤ" it gets the descriptors ℤ and [ℕ,ℕ].

Setformers and other basic constructors operate in a known way on structure descriptors. Suppose, for example, that s is a set known to have some descriptor {D}, and that e(x) is an expression involving the free variable x. Suppose that e(x) can be seen to map elements having the descriptor D into elements having the descriptor D'. Then

{e(x): x ∈ s | P}

has the descriptor {D'}, while

{[x,e(y)]: x ∈ s, y ∈ s | P}

has the descriptor {[D,D']}.

When a set s is known to have a descriptor {D}, any element x for which "x ∈ s" has been proved is known to have the descriptor D. If D is a primitive descriptor representing a special set, this gives us the assertion "x ∈ D", for example "x ∈ ℕ", which may be needed as an auxiliary hypothesis for the application of some theorem. Similarly any set s having the descriptor {[ℕ,ℕ]} is known to satisfy Is_map(s), and also

(∀ x ∈ s | x ∈ ℕ & x ∈ ℕ).

Deriving conclusions of this kind automatically is the principal function of the system of structure descriptors.

Many other basic set-theoretic operations have known effects on descriptors. These often follow from the definitions of the operators in question. For example:

1. if s has the descriptor {D}, then so does every one of its subsets, and (s) has the descriptor {{D}}.

2. if s has the descriptor {{D}}, then ∪s has the descriptor {D}. Note that this follows automatically from the definition

{x: y ∈ s, x ∈ y}

of ∪s, since the bound variable y in the iterator has the descriptor {D}, so each of the x has the descriptor D, and the set as a whole has the descriptor {D}.

3. if s1 and s2 both have a descriptor {D}, then so does s1 ∪ s2.

4. if s1 and s2 both have the descriptor IsΦ, then so does s1 ∪ s2.

5. if s1 and s2 have descriptors {D1} and {D2} respectively, then s1 ∩ s2 has both descriptors {D1} and {D2}. Even if s2 has no descriptor, s1 ∩ s2 and s1 - s2 has the descriptor {D1}, as does any set s for which an assertion s ⊆ s1 has been proved.

6. if s1 has the descriptor IsΦ, so do s1 ∩ s2 and s1 - s2, as does any set s for which an assertion s ⊆ s1 has been proved.

7. if s1 and s2 have the descriptor IsΦ, so does any set-former
{e: x ∈ s1, y ∈ s2 | P},

or any set-former {e: x ∈ s1 | P}.

8. #s always has the descriptor IsC. Since the class of cardinals has the descriptor {IsO}, #s also has the descriptor IsO, as does any x known to be a cardinal. If s has the descriptor IsΦ, then #s has the descriptor . Since itself has the descriptor {IsΦ}, #s also has the descriptor IsΦ.

9. if s has the descriptor {D}, then any set-former like

{x: x ⊆ s | p}

has the descriptor {{D}}; this result obviously generalizes.

10. if sets s and t have the descriptors {D} and {D'} respectively, then their Cartesian product s × t has the descriptor {[D,D']}, If s and t both have the descriptor IsΦ, so does s × t.
11. if s and t have descriptors {[D,D']} and {[D',D'']} respectively, then s◊t has the descriptor {[D,D'']}. If s and t both have the descriptor IsΦ, so does s◊t.
12. if s and t have descriptors D, D' respectively, then [s,t] has the descriptor [D,D']. If u has the descriptor [D,D'], then u has the descriptor D and u has the descriptor D'.
13. if F has the descriptor {[D,D']}, then its inverse F has the descriptor {[D',D]}, and F ɭ s has the descriptor {[D',D]}. If F has the descriptor IsΦ, then F and F ɭ s both have the descriptor IsΦ also.

These ideas also apply to single-valued and one-one maps and can also be applied to topological situations, spaces of continuous functions, etc.

Some important conclusions result immediately by use of structure descriptors. For example, the cardinal sum is defined as

#({[x,0]: x ∈ s1} ∪ {[x,1]: x ∈ s2}),

making it obvious that the sum of two integers is an integer. Similarly, the definition of cardinal product, namely

#{[x,y]: x ∈ s1,y ∈ s2},

makes it obvious that the product of two integers is an integer. Since the difference of integers is defined by #(n - m), it also follows immediately that the difference of integers is an integer.

In many cases a definition or theorem appearing in a scenario characterizes the action on structure descriptors of one or more of the function symbols appearing in it. The examples given just above illustrate this. Such facts, combined with the other rules given above, extend the verifier's ability to track the structures of objects appearing in proofs. For example, if s, t and u are sets known to have the descriptor {ℕ}, then

{(x • y) + z: x ∈ s, y ∈ t, z ∈ u | P}

is also known to have the descriptor {ℕ}.

The theory of summation yields the fact that ∑f has the descriptor D if f has the descriptors {[d,D]} and IsΦ, and if the ⊕ operator appearing in the summation can be shown to map pairs of objects having the descriptor D into objects having this same descriptor. Thus, for example, the sum or product of any set-former like

{[[x,y,z],(x • y) + z]: x ∈ s, y ∈ t, z ∈ u | P}

is also known to be an integer if s, t and u are sets known to have the descriptor {ℕ}.

The structure definition mechanism explained above carries over in a useful way to recursively defined functions (in our set-theoretic context, these can be functions defined by transfinite induction.) To show why such extension is possible, we first note that the system of descriptors extends readily to function symbols, since these are very close semantically to sets of pairs. For example, the descriptor {[D1,D2]} can be ascribed to any one-parameter function symbol which maps each object having the descriptor D1 into an object having the descriptor D2. Similarly, the descriptor {[[D1,D2],D3]} can be ascribed to any two-parameter function symbol which yields an object having the descriptor D3 whenever its two parameters have the respective descriptors D1 and D2. (For example, the integer addition operator + has the descriptor {[[ℕ,ℕ],ℕ]}, but also the descriptors {[Ω,Ω],IsO]} since it always produces an ordinal, and the descriptor {[[IsΦ,IsΦ],ℕ]}, since it produces an integer for any two finite inputs.) In the three-parameter case, {[[[D1,D2],D3],D4]} can be ascribed to any three-parameter function symbol which yields an object having the descriptor D4 whenever its three parameters have the respective descriptors D1, D2, and D3.

Using these descriptors, we can state the rule for function application as follows: If a one-parameter function symbol f has the descriptor {[D1,D2]}, and x has the descriptor D1, then f(x) has the descriptor D2. Similarly, if a two-parameter function symbol f has the descriptor {[[D1,D2],D3]}, and its two arguments x1,x2 has the descriptor D1,D2, then f(x1,x2) has the descriptor D3, and similarly for more than two arguments.

Function compounding acts in an obvious way on descriptors, for example if f has the descriptor {[D1,D2]} and g has the descriptor {[D2,D3]}, then g◊f has the descriptor {[D1,D3]}.

Next consider a transfinite recursive definition of one of the general types Ref allows, namely

f(s,t) := d({g(f(x,h(s,t)),s,t): x ∈ s | P(x,f(x,h(s,t)),s,t)},s,t),

where as before we assume that the functions d, g, and h have been defined prior to the occurrence of the recursive definition shown. In working with this definition one needs to establish that f has some descriptor {[[D1,D2],D3]}, i.e. that it yields an element having descriptor D3 for any input arguments with descriptors D1,D2 respectively. This conclusion is valid under the following circumstances: we need to know that there exists a descriptor D' such that

• h has the descriptor {[[D1,D2],D2]};
• g has the descriptor {[[[D3,D1],D2],D']};
• d has the descriptor {[[[{D'},D1],D2],D3]}.
Then in the ground case of the transfinite recursive definition f(0,t) has the value d(0,s,t), and so must produce an element with the descriptor D3. In the remaining case it follows inductively (given that s and t have the respective descriptors D1,D2) that f(x,h(s,t)) has the descriptor D3 for every x ∈ s, so that g(f(x,h(s,t)),s,t) has the descriptor D', and so

(*)     {g(f(x,h(s,t)),s,t): x ∈ s | P(x,f(x,h(s,t)),s,t)}

has the descriptor {D'}. Therefore the right side of the recursive definition seen above has the descriptor D3, and it follows inductively that f has the descriptor {[[D1,D2],D3]}. If s has the descriptor IsΦ, then the set (*) will have this descriptor also, and so if d has the descriptor {[[[IsΦ,IsΦ],D2],IsΦ]}, f will have the descriptor {[[IsΦ,D2],IsΦ]}. On the other hand, if d is an operator like the selector "", and so has the descriptor {[{D1},D2]} (where the null set must have the descriptor D2), then g must have the descriptor {[[[IsΦ,IsΦ],D2],IsΦ]}, and s the descriptor {IsΦ}, for f(s,t) to have the descriptor IsΦ. In this case f has once again the descriptor {[[IsΦ,D2],IsΦ]}.

#### V.19 - Use of external provers

The Ref proof verifier can accept proofs generated by various external provers (such as Otter). This is done by a syntactic extension of the normal Ref APPLY directive, i.e. external provers are regarded as sources of variant Ref THEORYs. When such provers are being used, the normal keyword APPLY used to invoke a THEORY is changed to 'APPLY_provername', where 'provername' names the external prover in question. In this case, the normal Ref THEORY declaration is expanded to give Ref-syntax translations of all the theorems being drawn from the external prover, and of all the external symbol definitions on which these depend. An external file, also named in the modified Ref APPLY directive, must be provided as certification of each such THEORY. Ref examines this file to establish that it is a valid proof, by the external prover named, of all the theorems which the THEORY claims.

#### V.20 - Large individual and group verification projects: the 'FREEZE' and 'FROZEN' directives.

In large, possibly group, verification projects, which can extend over months, repeated verification of already verified proofs wastes computer time and is unnecessary in principle. To allow omission of proofs once verified, while still guaranteeing that unverified material is not presented and falsely claimed to have been verified, Ref provides two auxiliary directives, named 'FREEZE' and 'FROZEN'. FREEZE extracts a specified list of theorems and theories from text containing verified proofs of them, attaches all the symbol definitions on which they depend, signs this extract digitally, and presents the signed extract to the Ref user. If the Ref user subsequently inserts this signed extract into his/her scenario using the FROZEN directive, the theorems, definitions, and theories in it will be accepted as valid without further proof.

As a Ref user community develops, signed extracts of the kind just described can be used to exchange certified sets of theorems between its members.

The FREEZE directive has the form

FREEZE list_of_theorems_and_theories ,

for example

FREEZE 37,41..45,49,ordinal_induction,finite_induction .

As the example indicates, the list_of_theorems_and_theories in a FREEZE directive must be a comma-separated list of theorem numbers, theorem ranges n..m, and theory names.

On encountering such a directive, the Ref verifier checks that the theorems and theories in question are present in the text containing the FREEZE directive along with their full and correct proofs. Once this has been checked the verifier prepares and signs an extract of the theorems and theories, along with all the definitions on which they depend. If any of these definitions are made by theory application, the theory APPLYed, and any definitions and further theories on which they depend, are added to the extract. This process continues iteratively until the extract is complete.

In such an extract, theorems appear as statements only, without their proofs. THEORYs appear with all their theorems and required input symbols, but without any of their internal proofs. To this material, a signature consisting of 15 printable characters is appended, and then the whole extract is printed to the Ref output window.

Suppose for example that you have requested a simple extract consisting of the two theorems

```	Theorem 16: [Key property of ultimate members set] ((X in S) & (Y in X)) •imp
(Y in Ult_membs(S)).
Theorem 17: [An ordinal is its own set of ultimate members] (Ord(S)) •imp (Ult_membs(S) = S).
```

and the two theories ordinal_induction and finite_induction. The statements of these theories are as follows:

```	THEORY ordinal_induction(o,P(x))
Ord(o) & P(o)
==>(t_thryvar)
Ord(t_thryvar) & P(t_thryvar) & (t_thryvar •incin o) &
(FORALL x in t_thryvar | not P(x))
END ordinal_induction;
```

```	THEORY finite_induction(n,P(x))
Finite(n) & P(n)
==>(m_thryvar)
(m_thryvar •incin n) & P(m_thryvar) &
(FORALL k •incin m_thryvar | ((k /= m_thryvar) •imp (not P(k))))
END finite_induction;
```
Since the requested theorems and theories use the defined symbols Ord, Ult_membs, and Finite, the following definitions must be added to complete the extract.

```	Def 1: [Ordered pair] Def([X,Y]) := {{X},{{X},{{Y},Y}}}
Def 2: [First component of ordered pair] car(X) := arb(arb(X))
Def 3: [Second component of ordered pair] cdr(X) := arb(arb(arb(X - {arb(X)}) - {arb(X)}))
Def 5: [Map domain] domain(F) := {car(x): x in F}
Def 6: [Map range] range(F) := {cdr(x): x in F}
Def 18: [Finiteness] Finite(s) := not(EXISTS f in OM | one_1_map(f) &
(domain(f) = s) & (range(f) •incin s) & (s /= range(f)))
Def 35a: [Transitive membership closure of S] Ult_membs(S) :=
S + {y: u in {Ult_membs(x): x in S}, y in u}
```

Ref's therefore responds to the FREEZE request in the following way:

```	Successful FREEZE, 2 theories, 2 theorems, 7 required definitions added

FROZEN Signature: D#4^*ef0f9c6c4%

Def 1: [Ordered pair] Def([X,Y]) := {{X},{{X},{{Y},Y}}}
Def 2: [First component of ordered pair] car(X) := arb(arb(X))
Def 3: [Second component of ordered pair] cdr(X) :=
arb(arb(arb(X - {arb(X)}) - {arb(X)}))
Def 5: [Map domain] domain(F) := {car(x): x in F}
Def 6: [Map range] range(F) := {cdr(x): x in F}
Def 18: [Finiteness] Finite(s) := not(EXISTS f in OM | one_1_map(f) &
(domain(f) = s) & (range(f) •incin s) & (s /= range(f)))
Def 35a: [Transitive membership closure of S] Ult_membs(S) :=
S + {y: u in {Ult_membs(x): x in S}, y in u}

Theorem 16: [Key property of ultimate members set] ((X in S) & (Y in X)) •imp
(Y in Ult_membs(S)).
Theorem 17: [An ordinal is its own set of ultimate members] (Ord(S)) •imp
(Ult_membs(S) = S).

THEORY ordinal_induction(o,P(x))
Ord(o) & P(o)
==>(t_thryvar)
Ord(t_thryvar) & P(t_thryvar) & (t_thryvar •incin o) &
(FORALL x in t_thryvar | not P(x))
END ordinal_induction;

THEORY finite_induction(n,P(x))
Finite(n) & P(n)
==>(m_thryvar)
(m_thryvar •incin n) & P(m_thryvar) &
(FORALL k •incin m_thryvar | ((k /= m_thryvar) •imp (not P(k))))
END finite_induction;

ENDFROZEN
```

The block E of text running from 'FROZEN' to 'ENDFROZEN' can then be included in a scenario, making the theorems and theories in it available without further verification, provided that the symbols defined in the FROZEN extract are not differently defined in the text into which E is inserted. But if as much as a single character of this block is changed, Ref will see that it conflicts with the attached signature and refuse to accept E.

Any number of disjoint extracts E can be inserted into a scenario in this way. Key elements of scenarios can therefore be certified extracts drawn from growing libraries of such extracts.

Theorems and theories made available by using extracts in this way can serve as the source of further extracts. For example, one may wish to include Theorem 16 but not 17 in some other extract. This can be done by referencing the desired theorems in the normal way inside the list_of_theorems_and_theories of a subsequent FREEZE. So Ref's FROZEN extracts can be constructed, dissassembled, and reassembled at will, in whole or in part. And of course extracts can always be reconstructed simply by reverifying the original proofs of the theorems they contain.

#### V.21 - Verification efficiency

Until a proof is complete and acceptable to the Ref verifier, you ordinarily do not want efficiency concerns to interfere with your focus on the logic of the proof. But once an initial version of the proof has been certified by Ref, and while its details are still fresh in your mind, you may want to speed up processing by modifying your initial proof appropriately. It is good practice to optimize a proof's performance immediately after it has been verified successfully. Doing this will force you to read through the proof once more and spot its redundancies. The first thing to do is to review it for unnecessary steps and eliminate these, and also to look for places in which the proof wanders unnecessarily. Such sections should be made more direct and shortened where possible.

Once your proof has been cleaned up by eliminating unnecessary steps, you can speed up its processing by supplying contexts for the most time-consuming proof steps. (This will often require introduction of extra statement labels). Ref's ' Good Proofs' table (accessible via its standard output window) is a key tool for doing this effectively. This table gives step timings for the ten most time consuming steps of each proof, even for proofs which are still incomplete and so appear in the 'Bad' list also. Each proof is represented by a line in that table, each with up to ten cells which show the longest steps of the proof. These cells carry step numbers representing the numbers of the proof lines on which the step appears ordered in the same way as their occurrences do within the scenario. These entries also carry one-character codes indicating the kind of inference steps involved, as shown in the following table.

#### 1-letter 'successful verification' field codes for inference step types.

=      EQUAL
A      ALGEBRA
D      Discharge
E      ELEM
F      SIMPLF
L      APPLY
M      Set_monot
N      Suppose_not
O      Loc_def
P      Assump
Q      theorem citation
S      statement citation
T      theorem citation
U      Suppose
Z      Use_def

Some efficiency caveats and efficiency-related Ref options. Completely automatic treatment of built-in functions like the cons-car-cdr triple (where we are using 'cons' as an alias for '[_,_]') by Ref often poses efficiency problems, since the method Ref uses for this generates multiple clauses which may force extensive branching in the search required for statement verification. For example, automatic deduction of

([x,[y,z]] = [x2,[y2,z2]]) •imp ((x = x2) & (y = y2) & (z = z2))

takes about 1 second on a 2.8Ghz Intel processor. For this reason Ref provides a few efficiency-oriented variants of the ELEM deduction primitive. These are invoked by prefixing the keyword ELEM with the kind of context indication described above, and by including various special characters in it. Including the character '*' just before the closing ')' of the prefix suppresses expansion of special functions like cons, car, cdr, i.e. causes these to be treated as unknown functions whose occurrences must be 'blobbed'. This treats statements like

([x,[y,z]] = [x2,[y2,z2]] & [x,[y,z]] = [x3,[y3,z3]] & [x,[y,z]] = [x4,[y4,z4]])

(xyz = xyz_2 & xyz = xyz_3 & xyz = xyz_4),

and so makes deduction of

[x2,[y2,z2]] = [x3,[y3,z3]]

from the conjunction shown above easy. Without suppression of detailed expansion of the pairing operations, this same deduction would require many seconds. However, this coarse treatment is plainly incapable of deducing the implication

([x,[y,z]] = [x2,[y2,z2]]) •imp ((x = x2) & (y = y2) & (z = z2))

which it sees as

(xyz = xyz_2) •imp ((x = x2) & (y = y2) & (z = z2)).

To handle such cases we must sometimes allow a longer search than Ref generally uses. Ref normally cuts off ELEM deduction searches after about 10 seconds. Including the character '+' instead of '*' in a prefix attached to ELEM raises this limit to 40 seconds. Note that an empty prefix, i.e. '( )',can be used to indicate that a statement is to be derived without additional context, i.e. that it is universally valid as it stands. Therefore the right way of obtaining the implication just displayed by ELEM deduction is to write it as

(+)ELEM ==> ([x,[y,z]] = [x2,[y2,z2]]) •imp ((x = x2) & (y = y2) & (z = z2)).

#### V.21a - Optimization of proof steps by automated context discovery

[...] To mark all the steps of a proof for the automated analysis just described, change the normal '==>' mark of its initial 'Suppose_not' to '===>'. To mark a single step of a proof for the automated analysis change its '==>' mark to '===>'. The first such mark encountered in a proof (if any) turns off the 'analyze by default' option if this has been set by marking the initial 'Suppose_not'.

### Section VI. Additional notes on the behavior (and reach) of Ref's inference procedures.

#### VI.1 - Extended multilevel syllogistic

As already said, multi-level syllogistic is an algorithm which can check for satisfiability a given formula involving individual variables (which are supposed to designate sets) along with a restricted collection of set operators and propositional connectives (no quantifiers). This algorithm can work in two modes: in one of the two, it can handle a larger collection of constructs but less efficiently. Actually, even in its weaker mode, multi-level syllogistic is confronted with an NP-complete decision problem.
The constructs which multilevel syllogistic can handle when it works in weak mode are: membership, set inclusion and equality, Boolean set-theoretic operators (intersection, union, set difference), and the propositional connectives (negation, conjunction, disjunction, implication, bi-implication). When working in strong mode, it can also handle the choice operator , the pair formation operator [-,-] and its associated projections,  and .

#### VI.1a - Pre-blobbing

Before being submitted to the blobbing phase proper, which will reduce all the constructs that may appear in it to the ones which multi-level syllogistic can handle, a formula undergoes a number of simplifications. The following table collects common simplifications which pre-blobbing recursively applies to the formulae and terms which appear within the given formula:
 Simplifiable formula/term Simplified formula/term Is_map({[anything1,anything2]: ... }) true Is1({[x,anything]: x ∈ s | P}) true 11({[anything,anything]: ... }) true 11({[[anything1,anything2],[anything2,anything1]]: ... }) true {x: x ∈ s} s {expn: iters | true} {expn: iters} {expn: iters | false} 0 IsΦ(#s) IsΦ(s) {[x,e(x)]: x ∈ s | P} ◊ {[y,ee(y)]: y ∈ ss | PP} {[y,e(ee(y))]: y ∈ ss | PP & ee(y) ∈ s & P(ee(y))} ##s #s {e_independent_of_x: x ∈ s | P} if {x: x ∈ s | P} = 0 then 0 else {e_independent_of_x} end if {e(x): iterator1,..., x ∈ 0, iteratorn... | P} 0 {e(x): x ∈ {a} | P} if P(a) then {e(a)} else 0 end if (∀ iterator1,..., x ∈ 0, iteratorn... | P) true (∃ iterator1,..., x ∈ 0, iteratorn... | P) false ∋({s}) s [expn1,expn2] expn1 [expn1,expn2] expn2 Д({[anything1,anything2]: iterator | P}) {anything1: iterator | P} Я({[anything1,anything2]: iterator | P}) {anything2: iterator | P} #({[x,e(x)]: x ∈ s}) #s IsO(s⁺) IsO(s) N×0 0 0×N 0 N•0 0 0•N 0 N+0 #N #(N×{s}) #N #({s}×N) #N 1•N #N N•1 #N N-N 0 N-0 #N

#### VI.2 - Simplification of nested setformers

[..TO BE FILLED..]

#### I.3 - Set monotonicity laws

[..TO BE FILLED..]