Chapter 10: First Order Logic

10.1 Introduction

In Herbrand Logic, we assume that there is a one-to-one relationship between ground terms in our language and objects in the application area we are trying to describe. For example, in talking about people, we assume a unique name for each person. In arithmetic, we assume a unique term for each number. This makes things conceptually simple, and it is a reasonable way to go in many circumstances. But not always.

In natural language, we often find it convenient to use more than one term to refer to the same real-world object. For example, we sometimes have multiple names for the same person - Michael, Mike, and so forth. And, in arithmetic, we frequently use different terms to refer to the same number, e.g. 2+2, 2*2, and 4.

The reverse is also true. We sometimes find that there are objects in an application area for which we have no names. For example, in mathematics, we occasionally want to say things about uncountably many objects, such as the real numbers. Unfortunately, if we have only countably many ground terms in our logical language, we cannot have names for everything.

The one-to-one correspondence between ground terms and objects is also problematic when we want to define functions and relations in a general way that applies to worlds with differing numbers of objects. For example, we might want to state the fact that addition is commutative whether we are considering finite arithmetic, arithmetic over the natural numbers, or arithmetic over the real numbers. Unfortunately, we cannot do this in Herbrand Logic, since it requires that we fix the language and, therefore, the number of objects in advance.

First Order Logic is an extension to Herbrand Logic that deals with these problems. In many ways, it is similar to Herbrand Logic. The syntax of the two languages is almost identical. The proof method is almost the same. The main difference is in semantics.

We start this chapter by introducing the idea of a conceptualization - a mathematical structure consisting of a language-independent universe of objects, together with functions and relations on that universe. Then we define the syntax of First Order Logic, and we define its semantics in terms of conceptualizations. We present a selection of examples. We present a proof system for First Order Logic and show some sample proofs. We talk about soundness and completeness of our proof system. And, finally, we compare Herbrand Logic and First Order Logic.

10.2 Conceptualization

As we shall see, the semantics of First Order Logic is based on the notion of a conceptualization of an application area in terms of objects, functions, and relations.

The notion of an object used here is quite broad. Objects can be concrete (e.g. this book, Confucius, the sun) or abstract (e.g. the number 2, the set of all integers, the concept of justice). Objects can be primitive or composite (e.g. a circuit that consists of many sub-circuits). Objects can even be fictional (e.g. a unicorn, Sherlock Holmes, Miss Right). In short, an object can be anything about which we want to say something.

Not all knowledge-representation tasks require that we consider all possible objects; in some cases, only those objects in a particular set are relevant. For example, number theorists usually are concerned with the properties of numbers and usually are not concerned with physical objects such as resistors and transistors. Electrical engineers usually are concerned with resistors and transistors and usually are not concerned with buildings and bridges.
The set of objects about which knowledge is being expressed is often called a universe of discourse. Although in many real-world applications there are only finitely many elements in our universe of discourse, this need not always be the case. It is common in mathematics, for example, to consider the set of all integers, the set of all real numbers, or the set of all n-tuples of real numbers as universes with infinitely many elements.

An interrelationship among the objects in a universe of discourse is called a relation. Although we can invent many relations for a set of objects, in conceptualizing a portion of the world we usually emphasize some relations and ignore others. The set of relations emphasized in a conceptualization is called the relational basis set.

For example, in the Blocks World described in Chapter 6, there are numerous meaningful relations. As described in Chapter 6, it makes sense to think about the on relation that holds between two blocks if and only if one is immediately above the other. We might also think about the above relation that holds between two blocks if and only if one is anywhere above the other. The stack relation holds of three blocks that are stacked one on top of the other. The clear relation holds of a block if and only if there is no block on top of it. The bottom relation holds of a block if and only if that block is resting on the table.

One way of giving meaning to these intuitive notions in the context of a specific conceptualization of the world is to list the combinations of objects in the universe of discourse that satisfy each relation in the relational basis set. For example, the on relation can be characterized as the set of all pairs of blocks that are on-related to each other. If we use the symbols a, b, c, d, and e to stand for blocks, then the table below is one encoding of the on relation. There is a separate row in the table for each pair of blocks in which the first element is on the second.

a   b
b   c
d   e

The generality of relations can be determined by comparing their elements. Thus, the on relation is less general than the above relation, since, when viewed as a set of tuples, it is a subset of the above relation. Of course, some relations are empty (e.g. the unsupported relation), whereas others consist of all n-tuples over the universe of discourse (e.g. the block relation).

Logic allows us to capture the contents of such tables; and, more importantly, it allows us to assert relationships between these tables. For example, it allows us to say that, if one block is on a second block, then that block is above the second block. Also, the first block is not on the table. Also, the second block is not clear. Moreover, it allows us to say these things in general, i.e. outside the context of any particular arrangement of blocks.

One thing that our tabular representation for relations makes clear is that, for a finite universe of discourse, there is an upper bound on the number of possible n-ary relations. In particular, for a universe of discourse of size b, there are b^n distinct n-tuples. Every n-ary relation is a subset of these b^n tuples. Therefore, an n-ary relation must be one of at most 2^(b^n) possible sets.
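This encoding of relations as sets of tuples, and the counting argument just given, are easy to check mechanically. The sketch below is a small illustration in Python; the variable names and the helper above are our own, not part of the text. It encodes the on relation for the blocks a through e, derives the above relation as the transitive closure of on, and counts the possible binary relations over a five-element universe.

```python
from itertools import product

universe = {"a", "b", "c", "d", "e"}

# The on relation from the table above: one tuple per row.
on = {("a", "b"), ("b", "c"), ("d", "e")}

def above(on_rel):
    """Transitive closure of on: one block anywhere above another."""
    closure = set(on_rel)
    changed = True
    while changed:
        changed = False
        for (x, y), (y2, z) in product(closure, repeat=2):
            if y == y2 and (x, z) not in closure:
                closure.add((x, z))
                changed = True
    return closure

print(above(on))            # adds ('a', 'c') to the three pairs of on
print(on <= above(on))      # True: on is less general than above

# For a universe of size b, there are b**n distinct n-tuples,
# so there are 2**(b**n) possible n-ary relations.
b, n = len(universe), 2
print(b ** n, 2 ** (b ** n))   # 25 possible pairs, 2**25 binary relations
```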
A function is a special kind of relation among the objects in a universe of discourse. For each combination of n objects in a universe of discourse (called the arguments), a function associates a single object (called the value) of the function applied to the arguments. Note that there must be one and only one value for each combination of arguments. Functions that are not defined for all combinations of arguments are often called partial functions; and functions for which there can be more than one value for a given combination of arguments are often called multi-valued functions. By and large, we will ignore partial and multi-valued functions in what follows.

Mathematically, a function can be viewed as a set of tuples of objects, one for each combination of possible arguments paired with the corresponding value. In other words, an n-ary function is a special kind of (n+1)-ary relation. For example, consider the unary function shown on the left below. We can view this function as the binary relation shown on the right.

a → b        a   b
b → a        b   a
c → d        c   d
d → c        d   c
e → e        e   e

In an analogous way, a binary function can be treated as a ternary relation; a ternary function can be treated as a 4-ary relation; and so forth.

Although the tabular representation for functions and relations is intuitively satisfying, it consumes a lot of space. Therefore, in what follows, we use a different way of encoding relational tables, viz. as sets of tuples, each tuple representing a single row of the corresponding table. As an example, consider the on relation shown earlier. In what follows, we treat the relation as the set of three 2-tuples shown below.

{⟨a, b⟩, ⟨b, c⟩, ⟨d, e⟩}

Since functions are relations, we could use the same representation. However, to emphasize the special properties of functions, we use a slight variation, viz. the use of an arrow in each tuple to emphasize the fact that the last element is the value of the function applied to the element or elements before the arrow. Using this notation, we can rewrite the functional table shown earlier as the set shown below.

{a→b, b→a, c→d, d→c, e→e}
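A unary function can equally well be stored as a mapping or as its graph, the corresponding binary relation. The following Python sketch (the names and the helper is_function are illustrative, not from the text) shows the two views of the function {a→b, b→a, c→d, d→c, e→e} and checks the defining property of a function: exactly one value for every argument.

```python
universe = {"a", "b", "c", "d", "e"}

# Function view: a mapping from arguments to values.
f = {"a": "b", "b": "a", "c": "d", "d": "c", "e": "e"}

# Relation view: the same function as a set of 2-tuples (its graph).
f_as_relation = {(x, y) for x, y in f.items()}

def is_function(rel, domain):
    """A binary relation is the graph of a total, single-valued function
    if every domain element appears as first component exactly once."""
    return all(sum(1 for (x, _) in rel if x == d) == 1 for d in domain)

print(f_as_relation)                                   # the five pairs of the graph
print(is_function(f_as_relation, universe))            # True
print(is_function({("a", "b"), ("a", "c")}, {"a"}))    # False: multi-valued
print(is_function({("a", "b")}, {"a", "b"}))           # False: partial (b undefined)
```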
Finally, before we get on with the details of First Order Logic, it is worthwhile to make a few remarks about conceptualization. No matter how we choose to conceptualize the world, it is important to realize that there are other conceptualizations as well. Furthermore, there need not be any correspondence between the objects, functions, and relations in one conceptualization and the objects, functions, and relations in another.

In some cases, changing one's conceptualization of the world can make it impossible to express certain kinds of knowledge. A famous example of this is the controversy in the field of physics between the view of light as a wave phenomenon and the view of light in terms of particles. Each conceptualization allowed physicists to explain different aspects of the behavior of light, but neither alone sufficed. Not until the two views were merged in modern quantum physics were the discrepancies resolved.

In other cases, changing one's conceptualization can make it more difficult to express knowledge, without necessarily making it impossible. A good example of this, once again in the field of physics, is changing one's frame of reference. Given Aristotle's geocentric view of the universe, astronomers had great difficulty explaining the motions of the moon and other planets. The data could be explained in the Aristotelian conceptualization (with epicycles, etc.), although the explanation was extremely cumbersome. The switch to a heliocentric view quickly led to a more perspicuous theory.

This raises the question of what makes one conceptualization more appropriate than another for knowledge formalization. Currently, there is no comprehensive answer to this question. However, there are a few issues that are especially noteworthy.

One such issue is the grain size of the objects associated with a conceptualization. Choosing too small a grain can make knowledge formalization prohibitively tedious; choosing too large a grain can make it impossible. As an example of the former problem, consider a conceptualization of our Blocks World scene in which the objects in the universe of discourse are the atoms composing the blocks in the picture. Each block is composed of enormously many atoms, so the universe of discourse is extremely large. Although it is in principle possible to describe the scene at this level of detail, it is senseless if we are interested in only the vertical relationship of the blocks made up of those atoms. Of course, for a chemist interested in the composition of blocks, the atomic view of the scene might be more appropriate. For this purpose, our conceptualization in terms of blocks has too large a grain.

Finally, there is the issue of reification of functions and relations as objects in the universe of discourse. The advantage of this is that it allows us to consider properties of properties. As an example, consider a Blocks World conceptualization in which there are five blocks, no functions, and three unary relations, each corresponding to a different color. This conceptualization allows us to consider the colors of blocks but not the properties of those colors. We can remedy this deficiency by reifying the various color relations as objects in their own right and by adding a partial function - such as color - to relate blocks to colors. Because the colors are objects in the universe of discourse, we can then add relations that characterize them, e.g. pretty, garish, etc.

Note that, in this discussion, no attention has been paid to the question of whether the objects in one's conceptualization of the world really exist. We have adopted neither the standpoint of realism, which posits that the objects in one's conceptualization really exist, nor that of nominalism, which holds that one's concepts have no necessary external existence. Conceptualizations are our inventions, and their justification is based solely on their utility. This lack of commitment indicates the essential ontological promiscuity of logic: any conceptualization of the world is accommodated, and we seek those that are useful for our purposes.

10.3 Syntax and Semantics

The syntax of First Order Logic is almost identical to that of Herbrand Logic. Every syntactically legal sentence in Herbrand Logic is a syntactically legal sentence in First Order Logic. The only difference is that First Order Logic has one additional type of sentence.

An equation is an expression of the form (σ = τ), where σ and τ are terms in the language. An equation is a way of stating that two (possibly distinct) terms refer to the same object. For example, to express the co-referentiality of plus(2,2) and 4, we write plus(2,2)=4. We can also distinguish terms by writing negated equations. To say that plus(2,3) and 4 refer to different objects, we write ¬(plus(2,3)=4) (which we abbreviate as plus(2,3)≠4).

In what follows, it is useful to treat equations as atomic sentences.
In Herbrand Logic, the only atomic sentences are relational sentences. In First Order Logic, we have two types of atomic sentences - relational sentences and equations; the notion of a relational sentence is identical in the two logics. Other than this one small change, the languages of Herbrand Logic and First Order Logic are syntactically the same.

The real difference between the two logics is a semantic one. In First Order Logic, object constants refer to objects in some application area; function constants refer to functions on these objects; and relation constants refer to relations on these objects. Such interpretations are used in defining the truth or falsity of First Order Logic sentences. This notion of interpretation is the source of the expressive power of First Order Logic. Unfortunately, with this power comes somewhat greater complexity.

An interpretation i is a mapping from the constants of the language into the elements of a conceptualization. We use the expression |i| to refer to the universe of discourse corresponding to the conceptualization. Each object constant in our language is mapped into an element of the universe of discourse. Each function constant with arity n is mapped into an n-ary function on the universe of discourse. Each relation constant with arity n is mapped into an n-ary relation on the universe of discourse.

As an example, consider a First Order Logic language with object constants a, b, and c, unary function constant f, and binary relation constant r. As our universe of discourse, we take the set consisting of the numbers 1, 2, and 3. Admittedly, this is a very abstract universe of discourse. It is also very small. This has the advantage that we can write out examples in their entirety. Of course, in practical situations, universes of discourse are likely to be more concrete and much larger.

The associations shown below define one interpretation of our language in terms of this universe of discourse. As we did with Propositional Logic, we are using metalevel statements here. Remember that these are not themselves sentences in First Order Logic.

|i| = {1, 2, 3}
a^i = 1
b^i = 2
c^i = 2
f^i = {1→2, 2→1, 3→3}
r^i = {⟨1, 1⟩, ⟨1, 2⟩, ⟨2, 2⟩}

Note that more than one object constant can refer to the same object. In this case, both b and c refer to the number 2. Also, not every object in the universe of discourse need have a constant that refers to it. For example, there is no object constant that denotes the number 3. Note that the function denoted by f assigns one and only one value to every number in the universe of discourse. By contrast, in the relation r, some objects are paired with several other objects, whereas some objects do not appear at all.

A variable assignment for a universe of discourse |i| is a mapping from the variables of the language into |i|. One such variable assignment is shown below.

x^v = 1
y^v = 1
z^v = 2

A value assignment based on interpretation i and variable assignment v is a mapping from the terms of the language into |i|. The mapping must agree with i on constants, and it must agree with v on variables. The value of a functional term is the result of applying the function corresponding to the function constant to the objects designated by the argument terms.

a^(i,v) = 1
z^(i,v) = 2
f(a)^(i,v) = 2
f(z)^(i,v) = 1
f(f(z))^(i,v) = 2
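The value assignment just described is easy to mimic with a small recursive evaluator. The sketch below is illustrative only - the nested-tuple encoding of functional terms is our own convention, not the book's. It reproduces the interpretation i and variable assignment v above and computes the values of a, z, f(a), f(z), and f(f(z)).

```python
# Interpretation i over the universe {1, 2, 3}.
const = {"a": 1, "b": 2, "c": 2}          # object constants
funcs = {"f": {1: 2, 2: 1, 3: 3}}         # unary function constant f

# Variable assignment v.
var = {"x": 1, "y": 1, "z": 2}

def value(term):
    """Value assignment: terms are constants, variables, or ('f', arg) tuples."""
    if isinstance(term, tuple):           # functional term, e.g. ('f', 'a')
        fname, arg = term
        return funcs[fname][value(arg)]
    if term in const:
        return const[term]
    return var[term]                      # otherwise a variable

for t in ["a", "z", ("f", "a"), ("f", "z"), ("f", ("f", "z"))]:
    print(t, "->", value(t))              # 1, 2, 2, 1, 2
```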
The truth assignment based on interpretation i and variable assignment v is a mapping from the sentences of the language to true or false, defined as follows.

Given an interpretation i and a variable assignment v, a relational sentence is true if and only if the tuple of objects denoted by the arguments is contained in the relation denoted by the relation constant. For example, given the interpretation and the variable assignment presented above, we have the truth assignments shown below. The relational sentence r(a,b) is true since ⟨1, 2⟩ is a row in r^i. The sentence r(z,a) is false since ⟨2, 1⟩ is not in r^i.

r(a, b)^(i,v) = true
r(z, a)^(i,v) = false

Given an interpretation i and a variable assignment v, an equation (σ = τ) is true if and only if the object denoted by σ is the same as the object denoted by τ, i.e. if and only if σ^(i,v) is identical to τ^(i,v). For example, given the interpretation and the variable assignment presented above, we can see that the object constant a and the term f(b) have the same value assignment, and hence the equation (a = f(b)) is true.

(a = f(b))^(i,v) = true

The conditions for truth of logical sentences in First Order Logic are analogous to those for the truth of logical sentences in Propositional Logic. A negation ¬φ is true if and only if φ is false. A conjunction (φ ∧ ψ) is true if and only if both φ and ψ are true. A disjunction (φ ∨ ψ) is true if and only if φ is true or ψ is true (or both). An implication is true unless the antecedent is true and the consequent is false. A reduction is true unless the antecedent is true and the consequent is false. An equivalence is true if and only if the truth values of the two embedded sentences are the same.

Intuitively, a universally quantified sentence is satisfied if and only if the embedded sentence is satisfied for all values of the quantified variable, and an existentially quantified sentence is satisfied if and only if the embedded sentence is satisfied for some value of the quantified variable. Sadly, expressing this intuition rigorously is a little tricky. First, we need the notion of a version of a variable assignment. Then we can make the intuition precise by talking about some versions or all versions of a given variable assignment.

A version v:ν/x of a variable assignment v is a variable assignment that assigns the object x as the value of the variable ν and agrees with v on all other variables. Suppose, for example, we had the variable assignment v shown below.

x^v = 1
y^v = 1
z^v = 2

The version v:y/3 is shown below.

x^(v:y/3) = 1
y^(v:y/3) = 3
z^(v:y/3) = 2

With this notation, we can formalize the semantics of quantified sentences as follows. An interpretation i and a variable assignment v satisfy a universally quantified sentence ∀ν.φ if and only if i and every version v:ν/x of v satisfy φ, where x ranges over the universe of discourse of i. An interpretation i and a variable assignment v satisfy an existentially quantified sentence ∃ν.φ if and only if i and at least one version v:ν/x of v satisfy φ, for some x in the universe of discourse of i.

An interpretation i and a variable assignment v are said to satisfy a sentence φ (written |=i φ[v]) if and only if the sentence φ is assigned the value true. A sentence is satisfiable if and only if there is an interpretation and a variable assignment that satisfy it. A sentence is valid if and only if it is satisfied by every interpretation and variable assignment. A sentence is unsatisfiable if and only if it is satisfied by no interpretation and variable assignment.

An interpretation i is a model of a sentence φ (written |=i φ) if and only if the interpretation satisfies the sentence for every variable assignment (in other words, if |=i φ[v] for every variable assignment v). Note that, if an interpretation satisfies a closed sentence for one variable assignment, then it satisfies the sentence for every variable assignment (i.e. it is a model). Note also that, if an interpretation i satisfies an open sentence φ with free variable ν for every variable assignment, then it is a model of ∀ν.φ.
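Over a finite universe, the notion of a version of a variable assignment lends itself to a direct, if naive, implementation. The following sketch uses our own encoding of sentences as nested tuples (none of this syntax is prescribed by the text) to evaluate relational sentences, equations, and quantified sentences under the interpretation and variable assignment used above.

```python
universe = {1, 2, 3}
const = {"a": 1, "b": 2, "c": 2}
funcs = {"f": {1: 2, 2: 1, 3: 3}}
rels  = {"r": {(1, 1), (1, 2), (2, 2)}}

def value(term, v):
    if isinstance(term, tuple):                       # functional term ('f', arg)
        return funcs[term[0]][value(term[1], v)]
    return const[term] if term in const else v[term]

def holds(sentence, v):
    """Truth under interpretation i and variable assignment v."""
    op = sentence[0]
    if op == "=":                                     # equation
        return value(sentence[1], v) == value(sentence[2], v)
    if op == "not":
        return not holds(sentence[1], v)
    if op == "forall":                                # all versions v:nu/x
        _, nu, body = sentence
        return all(holds(body, {**v, nu: x}) for x in universe)
    if op == "exists":                                # some version v:nu/x
        _, nu, body = sentence
        return any(holds(body, {**v, nu: x}) for x in universe)
    return tuple(value(t, v) for t in sentence[1:]) in rels[op]   # relational

v = {"x": 1, "y": 1, "z": 2}
print(holds(("r", "a", "b"), v))                 # True:  <1,2> is in r
print(holds(("r", "z", "a"), v))                 # False: <2,1> is not in r
print(holds(("=", "a", ("f", "b")), v))          # True:  f(b) = f(2) = 1 = a
print(holds(("forall", "w", ("exists", "u", ("r", "w", "u"))), v))
# False: no pair in r has 3 as its first element
```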
10.4 Blocks World

Let's revisit the Blocks World introduced in Chapter 6. This time around, let us say we have a vocabulary with six object constants, viz. a, b, c, d, e, and f; and let us assume that we have just one binary relation constant on. Now, let's consider the following ground literals describing the on relation. As before, we list both the positive and the negative literals.

¬on(a,a)  ¬on(b,a)  ¬on(c,a)  ¬on(d,a)  ¬on(e,a)  ¬on(f,a)
 on(a,b)  ¬on(b,b)  ¬on(c,b)  ¬on(d,b)  ¬on(e,b)  ¬on(f,b)
¬on(a,c)   on(b,c)  ¬on(c,c)  ¬on(d,c)  ¬on(e,c)  ¬on(f,c)
¬on(a,d)  ¬on(b,d)  ¬on(c,d)  ¬on(d,d)  ¬on(e,d)  ¬on(f,d)
¬on(a,e)  ¬on(b,e)  ¬on(c,e)   on(d,e)  ¬on(e,e)  ¬on(f,e)
¬on(a,f)  ¬on(b,f)  ¬on(c,f)  ¬on(d,f)   on(e,f)  ¬on(f,f)

In Herbrand Logic, a set of six object constants fixes a universe of six objects. As a result, the set of sentences shown above completely defines the on relation: there is exactly one Herbrand Logic truth assignment for our language that satisfies our sentences. By contrast, in First Order Logic, the six object constants are simply six names that refer to objects in some universe but do not by themselves constrain the universe in any way. As a result, in First Order Logic, the set of sentences is satisfied by more than one interpretation. That is, the set of sentences accurately describes more than one arrangement of blocks and does not specify which one. Here we look at several examples of interpretations for this language.

First, we see an interpretation i in which the universe consists of six blocks, denoted here by the labels A, B, C, D, E, and F. Each of our object constants refers to a distinct block: constant a refers to the block labeled A; constant b refers to the block labeled B; and so forth. The relation constant on refers to a set of 2-tuples of blocks.

|i| = {A, B, C, D, E, F}
a^i = A
b^i = B
c^i = C
d^i = D
e^i = E
f^i = F
on^i = {⟨A, B⟩, ⟨B, C⟩, ⟨D, E⟩, ⟨E, F⟩}

Of course, there is nothing requiring that the objects be named in this fashion. The constants could refer to different blocks, as in the following interpretation. Although the interpretations of the object constants are different, the sentences still hold because the interpretation of the relation constant is also different.

|i| = {A, B, C, D, E, F}
a^i = F
b^i = E
c^i = D
d^i = C
e^i = B
f^i = A
on^i = {⟨F, E⟩, ⟨E, D⟩, ⟨C, B⟩, ⟨B, A⟩}

In First Order Logic, there need not be a 1-1 relationship between object constants and objects in the universe of discourse of an interpretation. Some objects can have more than one name; some objects might have no names at all. The following interpretation illustrates this. There are just four blocks in this interpretation. Object constant a refers to the same object as d; object constant b refers to the same object as e; and object constant c refers to the same object as f. Moreover, there is a block with no name whatsoever.

|i| = {A, B, C, D}
a^i = A
b^i = B
c^i = C
d^i = A
e^i = B
f^i = C
on^i = {⟨A, B⟩, ⟨B, C⟩}
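Whether an interpretation satisfies the ground literals above can be checked mechanically: each positive literal on(x,y) requires the corresponding pair of blocks to be in on^i, and each negative literal requires its absence. The sketch below is a small illustration (the encoding of interpretations is our own); it accepts the first two interpretations shown above and rejects an interpretation of the kind discussed in the non-examples that follow.

```python
from itertools import product

constants = "abcdef"
required = {("a", "b"), ("b", "c"), ("d", "e"), ("e", "f")}   # the positive literals

def satisfies(interp, on):
    """Check the 36 ground literals: on(x,y) must hold exactly for the
    required pairs, and the negated literals for every other pair."""
    for x, y in product(constants, repeat=2):
        holds = (interp[x], interp[y]) in on
        if holds != ((x, y) in required):
            return False
    return True

# The first interpretation above: each constant names a distinct block.
i1 = dict(zip(constants, "ABCDEF"))
on1 = {("A", "B"), ("B", "C"), ("D", "E"), ("E", "F")}
print(satisfies(i1, on1))        # True

# The relabelled interpretation above: same structure, different naming.
i2 = dict(zip(constants, "FEDCBA"))
on2 = {("F", "E"), ("E", "D"), ("C", "B"), ("B", "A")}
print(satisfies(i2, on2))        # True

# An interpretation in which block A is not on block B fails on(a,b).
on3 = {("B", "C"), ("D", "E"), ("E", "F")}
print(satisfies(i1, on3))        # False
```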
The following is an example of an interpretation that does not satisfy our sentences. In this case, the interpretation fails because the block named a is not on the block named b, and so the sentence on(a,b) is false.

|i| = {A, B, C, D, E, F}
a^i = A
b^i = B
c^i = C
d^i = D
e^i = E
f^i = F
on^i = {⟨B, C⟩, ⟨D, E⟩, ⟨E, F⟩}

The following is another example of an interpretation that does not satisfy our sentences. In this case, the interpretation fails because the block named c is on the block named d, and so the sentence ¬on(c,d) is false.

|i| = {A, B, C, D, E, F}
a^i = A
b^i = B
c^i = C
d^i = D
e^i = E
f^i = F
on^i = {⟨A, B⟩, ⟨B, C⟩, ⟨C, D⟩, ⟨D, E⟩, ⟨E, F⟩}

While there are many different interpretations that satisfy these sentences, in a sense they all have the same structure. In every such interpretation, there is at least one stack of three blocks resting on the table. In some interpretations there can be multiple stacks, as illustrated above, and there can be other arrangements. However, to satisfy our sentences, an interpretation must ensure that there is at least one stack of three blocks resting on the table.

Before leaving this example, it is worth noting that First Order Logic is even more abstract than these examples might suggest. In particular, there is nothing requiring that the objects in the universe be blocks. They could equally well be natural numbers or people or buildings. All that is required is that the on relation among these objects contain the equivalent of a stack of three blocks.

10.5 Arithmetic

In the introduction to this chapter, we mentioned as motivation the desire to write down properties satisfied by a class of arithmetic structures without first fixing the space of objects. Here we consider several properties of addition. We then examine some interpretations that satisfy the properties and some interpretations that do not.

Consider the language with the object constant 0, the function constant s, and the relation constant plus. The sentences below describe the properties of addition we saw in Section 6.7 (except that the same relation is replaced with equality).

∀y.plus(0,y,y)
∀x.∀y.∀z.(plus(x,y,z) ⇒ plus(s(x),y,s(z)))
∀x.∀y.∀z.∀w.(plus(x,y,z) ∧ ¬(z=w) ⇒ ¬plus(x,y,w))

Modular arithmetic with modulus 2:

|i| = {0, 1}
0^i = 0
s^i = {0→1, 1→0}
plus^i = {⟨0,0,0⟩, ⟨0,1,1⟩, ⟨1,0,1⟩, ⟨1,1,0⟩}

Modular arithmetic with modulus 3:

|i| = {0, 1, 2}
0^i = 0
s^i = {0→1, 1→2, 2→0}
plus^i = {⟨0,0,0⟩, ⟨0,1,1⟩, ⟨0,2,2⟩, ⟨1,0,1⟩, ⟨1,1,2⟩, ⟨1,2,0⟩, ⟨2,0,2⟩, ⟨2,1,0⟩, ⟨2,2,1⟩}

Addition over the natural numbers N:

|i| = {0, 1, 2, 3, 4, ...}
0^i = 0
s^i = {0→1, 1→2, 2→3, 3→4, ...}
plus^i = {⟨m, n, k⟩ | m, n, k ∈ N and k is the sum of m and n}

Nonstandard arithmetic over two objects, where 1+1=1:

|i| = {0, 1}
0^i = 0
s^i = {0→1, 1→1}
plus^i = {⟨0,0,0⟩, ⟨0,1,1⟩, ⟨1,0,1⟩, ⟨1,1,1⟩}

Here is an example of an interpretation that does not satisfy our sentences. This is the same as the interpretation for arithmetic with modulus 2 shown above except that the tuple ⟨0,0,0⟩ is missing. The interpretation fails to satisfy our sentences because the sentence ∀y.plus(0,y,y) is false.

|i| = {0, 1}
0^i = 0
s^i = {0→1, 1→0}
plus^i = {⟨0,1,1⟩, ⟨1,0,1⟩, ⟨1,1,0⟩}

The following interpretation is another non-example. It is the same as arithmetic with modulus 2 except that it has an additional tuple. It does not satisfy our sentences because (among other things) the sentence ∀x.∀y.∀z.∀w.(plus(x,y,z) ∧ ¬(z=w) ⇒ ¬plus(x,y,w)) is false.

|i| = {0, 1}
0^i = 0
s^i = {0→1, 1→0}
plus^i = {⟨0,0,0⟩, ⟨0,0,1⟩, ⟨0,1,1⟩, ⟨1,0,1⟩, ⟨1,1,0⟩}
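For a finite universe, interpretations like the modular ones above can be generated and checked by brute force. The sketch below is our own code, not part of the text: it builds the modulus-n interpretation of 0, s, and plus and verifies the three sentences by enumerating all variable values, then confirms that removing ⟨0,0,0⟩ falsifies the first sentence, as in the non-example above.

```python
def mod_interpretation(n):
    """The modulus-n interpretation of 0, s, and plus shown above."""
    universe = range(n)
    succ = {x: (x + 1) % n for x in universe}
    plus = {(x, y, (x + y) % n) for x in universe for y in universe}
    return universe, succ, plus

def satisfies_axioms(universe, succ, plus):
    # forall y. plus(0,y,y)
    if not all((0, y, y) in plus for y in universe):
        return False
    # forall x,y,z. plus(x,y,z) => plus(s(x),y,s(z))
    if not all((succ[x], y, succ[z]) in plus for (x, y, z) in plus):
        return False
    # forall x,y,z,w. plus(x,y,z) & z != w => not plus(x,y,w)
    return not any((x, y, w) in plus and z != w
                   for (x, y, z) in plus for w in universe)

for n in (2, 3, 5):
    print(n, satisfies_axioms(*mod_interpretation(n)))     # True for each modulus

universe, succ, plus = mod_interpretation(2)
print(satisfies_axioms(universe, succ, plus - {(0, 0, 0)}))  # False
```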
10.6 Properties of Sentences

Although the semantics of Herbrand Logic and First Order Logic are different, many of the key concepts of the two logics are analogous. Notably, the concepts of validity, contingency, unsatisfiability, and so forth have essentially the same definitions in First Order Logic as in Herbrand Logic.

A sentence is valid if and only if it is satisfied by every interpretation. A sentence is unsatisfiable if and only if it is not satisfied by any interpretation, i.e. no matter what interpretation we take, the sentence is always false. A sentence is contingent if and only if it is neither valid nor unsatisfiable, i.e. there is some interpretation that satisfies the sentence and some interpretation that falsifies the sentence. A sentence is satisfiable if and only if it is either valid or contingent. A sentence is falsifiable if and only if it is either unsatisfiable or contingent.

10.7 Logical Entailment

A set of First Order Logic sentences Δ logically entails a sentence φ (written Δ |= φ) if and only if every interpretation that satisfies Δ also satisfies φ. As with validity, contingency, and unsatisfiability, this definition is essentially the same for First Order Logic as for Propositional Logic and Herbrand Logic.

10.8 Proofs

Formal proofs in First Order Logic are very similar to formal proofs in Herbrand Logic. The only differences are that the induction rules are removed and that equality rules are included. The Fitch system for First Order Logic (First Order Fitch) consists of the usual ten logical rules from Propositional Logic, the four quantifier rules from Herbrand Logic, and two new equality rules, which are introduced below.

The Equality Introduction rule (QI) allows the insertion of an arbitrary instance of reflexivity.

Equality Introduction
  σ = σ
where σ is any term.

For example, without any premises whatsoever, we can write down equations like (a=a) and (f(a)=f(a)) and (f(x)=f(x)).

The Equality Elimination rule (QE) tells us that, when we have an equation and a sentence containing one or more occurrences of one of the terms in the equation, then we can deduce a version of the sentence in which that term is replaced by the other term in the equation.

Equality Elimination
  φ[τ1]
  τ1 = τ2
  φ[τ2]
where τ2 is substitutable for τ1 in φ.

In order to avoid the unintended capture of variables, QE requires that the replacement term must be substitutable for the term being replaced in the sentence. This is the same substitutability condition that adorns the Universal Elimination rule of inference. Note that the equation in the Equality Elimination rule can be used in either direction, i.e. an occurrence of τ1 can be replaced by τ2 or an occurrence of τ2 can be replaced by τ1.

Note that Fitch for First Order Logic does not include the Domain Closure rule (DC) or the induction rules introduced in an earlier chapter.

In Propositional Logic, provability and logical entailment are identical: a set of premises logically entails a conclusion if and only if the conclusion is provable from the premises. The analogous result holds in First Order Logic.

Fitch for First Order Logic is sound for First Order Logic. That is, from any given set of premises, if a conclusion is provable using First Order Fitch, then the conclusion is entailed by the premises under First Order Logic. In other words, the following soundness theorem holds.

Soundness Theorem: Given a set of FOL sentences Σ and an FOL sentence φ, if Σ |-FOL φ, then Σ |=FOL φ.

Conversely, Fitch for First Order Logic is also complete for First Order Logic. That is, if a conclusion is entailed by a set of premises under First Order Logic, then the conclusion is provable from the set of premises using First Order Fitch. In other words, the following completeness theorem holds.

Completeness Theorem: Given a set of FOL sentences Σ and an FOL sentence φ, if Σ |=FOL φ, then Σ |-FOL φ.
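Mechanically, Equality Elimination is just substitution of one term for another inside a sentence. The sketch below is an illustration under our own nested-tuple encoding of sentences, and it ignores the substitutability check for quantified variables; it simply replaces occurrences of one term by an equal term, which is enough to reproduce the ground examples in the next section.

```python
def substitute(expr, old, new):
    """Replace occurrences of the term `old` by `new` in a sentence or term
    encoded as nested tuples, e.g. ('older', ('father', 'quincy'), 'quincy')."""
    if expr == old:
        return new
    if isinstance(expr, tuple):
        return tuple(substitute(part, old, new) for part in expr)
    return expr

# From father(quincy)=pat and older(father(quincy),quincy),
# Equality Elimination yields older(pat,quincy).
premise = ("older", ("father", "quincy"), "quincy")
print(substitute(premise, ("father", "quincy"), "pat"))
# ('older', 'pat', 'quincy')
```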
10.9 Examples - Equality

Suppose we know that b=a and we know that b=c. The following is a proof that a=c.

1. b=a    Premise
2. b=c    Premise
3. a=c    Equality Elimination: 2, 1

For another example, suppose that we believe f(a)=b and f(b)=a. Let's prove that f(f(a))=a. A proof of this conclusion using First Order Fitch is shown below.

1. f(a)=b       Premise
2. f(b)=a       Premise
3. f(f(a))=a    Equality Elimination: 2, 1

Now, let's look at a couple of examples that illustrate how equational reasoning interacts with relational reasoning. We know that Pat is the father of Quincy, and we know that fathers are older than their children. Our job is to prove that Pat is older than Quincy. Our proof is shown below. As usual, we start with our premises: the father of Quincy is Pat, and fathers are older than their children. Next, we use Universal Elimination to instantiate our quantified sentence: the father of Quincy is older than Quincy. Finally, we use Equality Elimination to replace father(quincy) with pat and thereby derive the conclusion that Pat is older than Quincy.

1. father(quincy) = pat           Premise
2. ∀x.older(father(x),x)          Premise
3. older(father(quincy),quincy)   Universal Elimination: 2
4. older(pat,quincy)              Equality Elimination: 3, 1

Finally, let's work through an example involving a disjunction of equations. We know that p(a) is true and p(b) is true, and we know that (a=c ∨ b=c) is true. Our job is to prove p(c). The proof follows. As always, we start with our premises. On line 4, we start a new subproof with the assumption that a=c is true. From this assumption and our first premise, we conclude that p(c) must be true. Of course, we are not done yet, since we have proved the result only under the assumption that a=c. We make this clear with a use of Implication Introduction to derive the sentence (a=c ⇒ p(c)). Now, we start another subproof, this time with the assumption b=c. As before, we derive p(c); and, from this, we derive the implication (b=c ⇒ p(c)). Finally, we use Or Elimination to combine our two partial results with our disjunction of equations.

1.  p(a)            Premise
2.  p(b)            Premise
3.  a=c ∨ b=c       Premise
4.  a=c             Assumption
5.  p(c)            Equality Elimination: 1, 4
6.  a=c ⇒ p(c)      Implication Introduction: 5
7.  b=c             Assumption
8.  p(c)            Equality Elimination: 2, 7
9.  b=c ⇒ p(c)      Implication Introduction: 8
10. p(c)            Or Elimination: 3, 6, 9
10.10 Example - Arithmetic

Earlier, we mentioned the desire to define a class of arithmetic structures without fixing the size of the universe. In the following, we consider a natural problem in this context. Suppose we have a class of arithmetic structures defined only by the two properties that there is a plus operator that is both commutative (i.e. ∀x.∀y.(x + y = y + x)) and associative (i.e. ∀x.∀y.∀z.(x + (y + z) = (x + y) + z)). Modular addition over any finite modulus, addition over the natural numbers, addition over the real numbers, and many other arithmetic structures all satisfy these properties. Our problem is to prove that, in any arithmetic structure satisfying these two properties, the sum of two even numbers is also even, where even is defined as being the sum of two identical numbers.

Our goal is to prove the conclusion ∀u.∀v.(even(u) ∧ even(v) ⇒ even(u + v)). We start with the following three premises. The first two state the two properties that all arithmetic structures in the class satisfy. The third is the definition of even.

1.  ∀x.∀y.(x + y = y + x)                            Premise
2.  ∀x.∀y.∀z.(x + (y + z) = (x + y) + z)             Premise
3.  ∀x.(even(x) ⇔ ∃y.y + y = x)                      Premise
4.  even(u) ∧ even(v)                                Assumption
5.  ∃y.y + y = u                                     UE, BE, AE, IE: 3, 4
6.  ∃y.y + y = v                                     UE, BE, AE, IE: 3, 4
7.  y1 + y1 = u                                      Assumption
8.  y2 + y2 = v                                      Assumption
9.  u + v = u + v                                    QI
10. (y1 + y1) + (y2 + y2) = u + v                    UE, QE: 2, 9
11. y1 + (y1 + (y2 + y2)) = u + v                    UE, QE: 2, 10
12. y1 + ((y1 + y2) + y2) = u + v                    UE, QE: 2, 11
13. y1 + ((y2 + y1) + y2) = u + v                    UE, QE: 1, 12
14. y1 + (y2 + (y1 + y2)) = u + v                    UE, QE: 2, 13
15. (y1 + y2) + (y1 + y2) = u + v                    UE, QE: 2, 14
16. ∃y.y + y = u + v                                 EI: 15
17. even(u + v)                                      UE, BE, IE: 3, 16
18. y2 + y2 = v ⇒ even(u + v)                        II: 17
19. ∀y.(y + y = v ⇒ even(u + v))                     UI, UE, UI: 18
20. y1 + y1 = u ⇒ ∀y.(y + y = v ⇒ even(u + v))       II: 19
21. ∀y.(y + y = u ⇒ ∀y.(y + y = v ⇒ even(u + v)))    UI, UE, UI: 20
22. ∀y.(y + y = v ⇒ even(u + v))                     EE: 5, 21
23. even(u + v)                                      EE: 6, 22
24. even(u) ∧ even(v) ⇒ even(u + v)                  II: 23
25. ∀u.∀v.(even(u) ∧ even(v) ⇒ even(u + v))          UI: 24
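Since the theorem just proved holds in every structure satisfying the two premises, it can be spot-checked against the finite models from Section 10.5. The sketch below is illustrative code, not part of the text: it enumerates modular arithmetic for several moduli and confirms that the sum of two even elements is even, where an element is even if it is of the form x + x.

```python
def check_even_sum(n):
    """In arithmetic mod n (commutative and associative), verify that the sum
    of any two 'even' elements (elements of the form x + x) is itself even."""
    elements = range(n)
    evens = {(x + x) % n for x in elements}
    return all((u + v) % n in evens for u in evens for v in evens)

print([check_even_sum(n) for n in range(2, 10)])   # a list of eight Trues
```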
10.11 Herbrand Logic Versus First Order Logic

Now that we have presented the semantics and proof systems for both First Order Logic and Herbrand Logic, we can talk about deciding which logic to use in a given situation. We start by comparing the two logics with respect to several theoretical properties.

For any given set of equality-free sentences Σ, we know that every satisfying truth assignment in Herbrand Logic corresponds to a satisfying First Order Logic interpretation of Σ. As a result, if an equality-free sentence φ is satisfied in all of the First Order Logic interpretations that satisfy Σ, then φ is also satisfied in all the Herbrand Logic truth assignments of Σ. From the definition of logical entailment, we obtain the following theorem: whenever a sentence is entailed under First Order Logic, it is also entailed under Herbrand Logic. That is, Herbrand Logic is deductively more powerful than First Order Logic.

Theorem 10.3: Given a set of equality-free sentences Σ and an equality-free sentence φ, if Σ |=FOL φ then Σ |=HL φ.

A similar result holds for sentences with equality when we extend Herbrand Logic to incorporate equality. We also know that the same set of sentences conveys more information in Herbrand Logic than in First Order Logic, in the sense that the set of sentences admits fewer satisfying worlds and entails more conclusions. However, in a given situation, which logic should we use?

In general, we should use the logic that allows us to simply and accurately model the information available. In the case that the set of objects is known and fixed in advance, Herbrand Logic is the natural choice because it is simple and accurate in modeling the information. In many applications, the exact set of objects is not fixed, but a "covering" set of identifiers can be used instead. For example, the set of United States presidents is not fixed, but we can fix the Herbrand universe {a, s(a), s(s(a)), ...} to stand for all actual and potential presidents. For another example, new English essays are written every day, but we can fix a Herbrand universe to stand for all existing and potential English essays. Herbrand Logic continues to make sense in these situations because of its simplicity and deductive power.

Finally, there are cases when not even a covering set of identifiers can be fixed in advance. For example, consider the class of all arithmetic structures where the plus operator is commutative. In this class the size of the underlying set of objects can be arbitrarily large, exceeding any fixed set of identifiers (even an infinite set of identifiers). In these cases, Herbrand Logic is inappropriate because it cannot accurately model the information available. First Order Logic, which imposes no restrictions on the set of objects, is the more appropriate choice.

We stated in Section 10.8 that First Order Fitch is sound and complete for First Order Logic. In Herbrand Logic, we have the analogous soundness theorem but not the completeness theorem.

Theorem 10.4 (Soundness of Herbrand Fitch): Given a set of sentences Σ and a sentence φ, if Σ |-HL φ then Σ |=HL φ.

Theorem 10.5 (Incompleteness of Herbrand Fitch): There exists a set of sentences Σ and a sentence φ such that Σ |=HL φ but it is not the case that Σ |-HL φ.

The upshot is that Herbrand Logic itself is expressively powerful, but the proof system for Herbrand Logic is not as powerful as its semantics. Even so, the proof system for Herbrand Logic remains more powerful than the proof system for First Order Logic.

Theorem 10.6: Given a set of equality-free sentences Σ and an equality-free sentence φ, if Σ |-FOL φ then Σ |-HL φ.

A similar result holds for sentences with equality when we include the two equality rules in Fitch for Herbrand Logic.

Recap

First Order Logic is an extension to Herbrand Logic that allows one to write sentences without fixing the size of the universe in advance. The syntaxes of the two logics are the same (except for the inclusion of equality in First Order Logic). The main difference is in the semantics. First Order Fitch is the same as Herbrand Fitch except that the Domain Closure and Induction rules are removed and Equality Introduction and Equality Elimination are added. First Order Fitch is sound and complete for First Order Logic. Although Herbrand Fitch is not complete for Herbrand Logic, it is capable of proving everything that can be proved by First Order Fitch in the absence of equality. Moreover, when extended with equality rules, it is capable of proving everything that can be proved by First Order Fitch.