# Full text of "Set Theory"

## See other formats

UMIVERSITEIT-STELLEIMBOSCH-UIMIVERSITY jou kennisvennoot • your knowledge partner SET THEORY P. Ouwehand Department of Mathematical Sciences Stellenbosch University 2012 i ii ZFC ZFC 0. Set Existence: There is a set. 3x{x = x) ZFC 1. Extensionality: If sets x, y have the same elements, then x = y. Vx Vy [V2;(2; Gx-H-zGy)— >-a; = y] ZFC 2.tp Separation Schema: If P{x) is a property of sets (with parameter p) describable by a first-order sentence ip{x,p), then for any set Z and any p there is a set F = {x G Z : P{x)} which consists precisely of those elements of Z which have the property P. yzyp^y^x^xEy-^-^xEzA if{x,p)) ZFC 3. Pairing: For any x,y there is a set z = {x,y} whose elements are precisely x and y. Vx Vy 3z [Vit; {w E z -(-^ w = x'V w = y)] ZFC 4. Union: For every set x there is a set y = |J x which is the union of all the elements of x. Vx 3y Vz G y ■<->■ 3w G x (z G w)] ZFC 5. Power Set: For any set x there is a set y = V{x) whose elements are precisely all the subsets of X. Vx 3y G y -H- z C x) ZFC 6. Infinity: There is an infinite set. 3x [0 G X A Vy (y G X ^ y U {y} G x)] ZFC 7. Choice: Every family of non-empty sets has a choice function: If x is a family of non-empty sets, then there is a function f on x which chooses from each it; G x an element f{w) G w. Vx \iw G X ^ 0) ^ 3/ ( / is a function A dom(/) = x A ^/w G x {f{w) G w))] ZFC 8.,^ Replacement Schema: A relation R{u, v) between sets which is describable by a first- order sentence (^(n, v,p) (with parameter p) is said to be functionlike if for every u there is at most one v such that R{u, v). If R is functionlike and x is a set, then the image z = R[x] of x under i? is a set. (yu Vv Vu; [ip{u, v,p) A <^(n, — >■ t; = w]^ — >■ Vx 3y V?; (?; G y ■<->■ 3n G x (^(-u, ZFC 9. Foundation: Every non-empty set has a G-minimal element. Vx [x 7^ -)• 3y G X (y n x) = 0] iii These notes are for a short course in set theory at the undergraduate level at Stellenbosch University. No pretense at orignality is claimed. Though amplified by material from a number of additional sources, the debt to the first few chapters of the book Set Theory, by Thomas Jech, Springer 2003, should be easily discernible. iv Contents 1 The Axioms of Set Theory 1.1 Introduction! 1.2 Naive Set Theory 1.3 The Iterative Conception of Set 1.4 The Axioms of ZF(]I 1.5 Sets and ClassesI 12 Relations and FunctionsI 12.1 Ordered Pairs and Finite Cartesian ProductsI 12.2 Relationsl 12.3 FunctionsI 2.4 Equivalence Relations 12.5 Order Relationsl 13 Natural NumbersI 3.1 The Set uj of Natural NumbersI 3.2 ((J, g) is a Transitive Well-Ordered Set . . . . 3.3 RecursionI 13.4 Finite and Infinite SetsI 14 Ordinalsl 4.1 Well-Orderings 14.2 Ordinalsl 14.3 Transfinite Induction and Recursion! 14.4 Ordinal Arithmetic! 14.4.1 Limits! 14.4.2 Addition! 4.4.3 Multiplication 4.4.4 Exponentiation 4.5 Project: Goodstein's Theorem 15 Cardinals I 5.1 Cardinality: Basic Definitions 15.2 Cardinal Arithmetic in ZF! 5.3 Alephs 5.4 The Canonical Well-Ordering of ON x ON 5.5 Arithmetic of Alephs V vi CONTENTS 5.6 Project: Cardinal Arithmetic and The Axiom of Choice 79 The Axiom Of Choicel 83 83 6.2 Well-Ordering] 85 6.1 Choice Functions, Partitions and Cartesian Products 6.3 Zorn's Lemma and Other Maximal Principles] 87 90 95 96 6.4 Cardinal Arithmetic in ZFCl 6.5 Length, Area, Volume and (AC) 6.6 Project: The Banach-Tarski Theorem Chapter 1 The Axioms of Set Theory 1.1 Introduction Every mathematician needs to know some set theory. "Set theory", according to (the set theorist) Kenneth Kuncn, "is a theory of everything". AH of mathematics can be done within the framework of set theory. This means that all mathematical objects can be interpreted as sets, and all mathematical statements about these objects can be phrased in the language of sets. Not only can this be done: Typically, it is done. We assume here that, to some extent, you are already aware of this. For example, a group is an ordered tuple {G, , e), where • G is a set. • ■ : G X G ^ G is a function which plays the role of group multiplication. • T-Gisa function which plays the role of multiplicative inverse • e G G is an element which plays the role of multiplicative identity element. You probably know that a function / : ^4 — t- i? is a set: It a subset of the Cartesian product A X B, where we regard (a, 6) G / if and only if /(a) = b. The Cartesian product A x B of two sets is itself a set, namely the set of ordered pairs (a, b) where a & A and b G B. An ordered pair (a, b) can be interpreted as a set also, namely (a, b) := {{a}, {a, b}}. Ordered tuples of higher order can be defined iteratively, (a, 6, c) := ((a, 6), c) = {{(a, 6)}, {(a, 6), c}} = {{{{a}, {a, 6}}}, {{{a}, {a, 6}}, c}} The group laws can then be written as equality of certain functions: For example, the asso- ciative law {ab)c = a{bc) is the assertion that the functions given by the compositions G X G X G ^ G X G ^ G : (a,b,c) ^ {ab, c) {ab)c and G X G X G ^ G X G ^ G : {a,b,c) ^ {a,bc) ^ a{bc) are equal. Since functions are sets, the associative law for G is equivalent to the assertion that two sets are equal. In principle, everything about groups can be formulated in the language of sets. It will look horrendous, it is true, but it can be done. Thus the whole group (G, , e) 1 Set Theory 2 is a set — not just the base set G on which the group operations are defined — the whole object {G, -'^ , e). And each element of (G, , e) is a set,. . . and each element of an element of (G, -'^ , e) is a set, and. . . , In the same way all the objects you encounter in mathematics — natural numbers, com- plex numbers, topological spaces, differentiable manifolds, tangent bundles. Boolean algebras, measure spaces, stochastic processes, pseudo-differential operators — can be interpreted as sets. We do not say that these objects are sets; indeed, we prefer to completely avoid meta- physical speculation about what mathematical objects might be, and in what sense they might be said to exist. We simply mean that we can construct sets that behave like these mathemat- ical objects. This kind of thing is not new to you: You're probably quite used to regarding real numbers as points on a line. But a point is not the same thing as a real number. "A point," says Euclid, "is that which has no parts". A real number, on the other hand, is an element of a complete ordered field. But, thought of in the right way, the points on a line can be said to behave like the real numbers, and thus can be interpreted as real numbers. So we will try to show that all of mathematics can be done in the set-theoretic universe. Many of the set-theoretic ideas required are rather straightforward, and can be developed without recourse to the axiomatic method (though every mathematician must be aware of the potential pitfalls associated with a purely "naive " view of sets) . But apart from its intrinsic importance as a fundamental vehicle for for doing other math- ematics, set theory is also an independent branch of mathematics in its own right, regarded as one of the four main branches of mathematical logic. In other words, set theory — like group theory or number theory — is worth pursuing for its own sake: "Set theory is the mathemat- ical science of the infinite" , according to the set theorist Thomas Jech. You probably already know that the natural numbers N and the rational numbers Q both form countable sets — i.e. they have the same size (cardinality): |N| = |Q|. If you know that, you probably also know that the set of real numbers M is uncountable, and thus that infinity comes in different sizes, with |N| < |M|. Some historians date the birth of the autonomous subject Set Theory to that day (7 December 1873) that Georg Cantor — who is regarded as the creator of the subject — wrote to Richard Dedekind to inform him of this fact. A short while later, in 1878, Cantor framed the Continuum Problem: Is there a set A whose size is strictly between that of the natural numbers and the real numbers, i.e. such that |N| < |^| < |M|? Cantor was unable to settle this question, though he hypothesized that there is no such set A — the Continuum Hypothesis. The M^er-mathematician David Hilbert was one of the first to appreciate the power of set theory, and its importance to general mathematics. Accordingly the Continuum Problem was first on the famous and influential list of 23 problems that he presented at the Second International Congress of Mathematicians in Paris, 1900. This innocuous-appearing question initially resisted all efforts, and was finally settled in a rather surprising way: It was proved by Kurt Godel in 1938 that the Continuum Hypothesis cannot be disproved, and then by Paul Cohen in 1963 that the Continuum Hypothesis cannot be proved! In order to obtain such results, it was necessary to write down a precise list of axioms for set theory — the Zermelo-Praenkel Axioms — so that there would be agreement on what constitutes a valid set-theoretic proof. What Godel and Cohen proved was that the Continuum Hypothesis is independent of these axioms. (There remains open the possibility that mathematicians of the future will discover a new plausible axiom for set theory, which will then settle the Continuum Problem in one way or the otheirl) Set theory is riddled with problems like these, where it ^Indeed, some mathematicians believe that the Continuum Hypothesis was recently settled by the set 3 Axioms of Set Theory can be proved that the problem is unsolvabl^ Such independence proofs require methods from logic (model theory, proof theory, recursion theory), and thus an axiomatic approach is here indispensable. This chapter is concerned with the foundations of set theory, and a little philosophjj^ will be unavoidable. In fact, the title of the first ever monograph dedicated to the sub- ject — Cantor's Grundlagen einer allgemeinen Mannigfaltigkeitslehre: Ein mathematisch- philosophischer Versuch in der Lehren des Unendlichen, 1883 — makes clear that, in set theory, mathematics and philosophy were inextricably intertwined right from the start. Exercise 1.1.1 All active branches of mathematics have problems that are unsolved. It is usually tacitly agreed that such an unproved statement is either true or false. But set theory has problems that are provably unsolvable. This leads to a philosophical question: Do you believe that the Continuum Hypothesis must be either true or false — we just do not know which? Or do you believe that it simply doesn't make sense to ask whether the Continuum Hypothesis is true or false? Can you back up your belief with an argument? □ 1.2 Naive Set Theory Classifying objects into sets is a basic operation of the human mind. It is more basic even than counting: Before you can count some objects, you first have to decide which objects to count, i.e. you first have to mentally group them together, and separate them from objects you do not want to count. Nothing seems more natural, therefore, than to collect objects which share a common property into a set. Naive Set Theory (NST) is a formal theory which allows us to do just that. It has the following axioms: NST 1. Extensionality: If two sets X,Y have the same elements, then X = Y. NST 2.p Full Separation: If P(x) is a property of sets, then there is a set Y = {x: P{x)} which consists precisely of those x which possess the property P. Observe that the Full Separation Axiom is not a single axiom, but an axiom schema: We have an axiom for every possible property P(-) (though right now we won't be too precise about exactly what constitutes a "property"). Thus there are infinitely many axioms. The first axiom states that a set is determined by its members, and by nothing else. In a sense, it tells us what it means to be a set. The second axiom allows us to show that many sets actually exist. For example: theorist Hugh Woodin, and that it is false. Others dispute this. ^Assuming that the axioms of set theory are consistent, that is. But Godel proved that we cannot prove that the axioms of a set theory are consistent without assuming the consistency of some more powerful theory. In other words, it is impossible to prove from the axioms of set theory that set theory is consistent. ''l highly recommend browsing the online Stanford Encyclopedia of Philosophy, http: //plato . Stanford, ledu for good introductions to the philosphy of mathematics, and much more besides. Set Theory 4 • If P{x) = {x x), then {x : P{x)} is a set with no elements (since x = x for any x). By the Axiom of Extensionahty any two empty sets are equal, because they have exactly the same elements. Hence a unique empty set exists, and we denote it by 0. • If P(x) = (x = x), then {x : P{x)} is the set of all sets: Any set x is equal to itself. • If P{x) = (x = a V X = 6), then {x : P{x)} is the set {a, b}. • If P{x) = {\/z £ X {z £ a)), then {x : P{x)} is the power set of a, i.e. the set of all subsets of a. • If P{x) = (x is a real number > 0), then {x : P{x)} is the set of all real numbers greater or equal to zero. (At this point, we do not yet know that there are any real numbers, so it is at this stage possible that this set is empty.) Towards the end of the 19th Century, mathematicians became increasingly more concerned with the foundations of mathematical knowledge, and the status of mathematical "reality" . Some logicians — amongst whom were Gottlob Frege and Bertrand Russell — tried to reduce mathematics to logic. In June 1902, Russell wrote a letter to Frege, warmly expressing his appreciation for the latter's work, but adding that "There is just one point where I have encountered a difficulty" . Russell asked Frege to consider the property P{x) = {x ^ x) If y = {x : X ^ x} exists (which, according to the Full Separation Axiom, it must), then there are two possibilities: Either y £ y or y ^ y. (i) If y G y, then P{y) must hold (since y consists of exactly those sets which have property P). In that case y ^ y — contradiction. (ii) li y ^ y, then certainly P{y) holds. So y is one of the sets that has property P, and therefore y £ {x : P{x)}, i.e. y £ y — contradiction. Thus the theory NST is inconsistent! Russell's paradox had devastating consequences for Frege's programme to reduce arithmetic to logic. Russell's paradoi^ (also called Russell's antinomy) is just the simplest example of the problems that may arise from a too liberal notion of set. Cantor himself was aware of such problems by 1899 at the latest, and dealt with this by denying the status of sethood to certain collections (multiplicities). For a multiplicity can be such that the assumption that all of its elements 'are together' leads to a contradiction, so that it is impossible to conceive of the multiplicity as 'one finished thing'. Such multiplicities, I call absolutely infinite or inconsistent multiplicities. As we can readily see, the 'totality of everything thinkable ', for example, is such a multiplicity; later still other examples will turn up. If on the other hand, the totality of the elements of a multiplicity can be though of without contradiction as 'being together', so that they can be gathered together into 'one thing', I call it a consistent multiplicity or set. — Cantor, letter to Dedekind, 1899. ^Apparently, Ernst Zermelo was aware of Russell's paradox some years before. But it was Russell who understood its devastating consequences, and who brought it to the attention of the wider world. 5 Axioms of Set Theory Thus, according to Cantor, the 'totahty of everything thinkable' is inconsistent: There is no set of all set^ If we can't just lump anything thinkable together into a set, then what can we do? A number of different approaches have been suggested. Some mathematicians and philosophers thought that Russell's paradox indicated that there were problems not just with an intuitive notion of set, but with logic itself. (It is easy to rephrase Russell's paradox in purely logical term^. The intuitionists, for example, thought that the basic laws of logic do not apply to infinite collections. Amongst others, they denied the validity of the Law of the Excluded Middl^ Other mathematicians thought that classical logic was inherently sound, and that one need merely analyse the notion of set with greater care and precision, if one wants to avoid contradictions and inconsistencies. This is the approach we will take. But avoiding an inconsistent set theory is harder than one might think. Firstly, a number of set theories have been proposed with the aim of avoiding Russell's paradox, but which subsequently turned out to be inconsistent after all — notably the system ML (Mathematical Logic) proposed in 1940 by the great logician Willard van Orman Quine. And many of these set theories are artificial: They are, as Russell commented, not "such as even the cleverest logician would have thought of if he had not known of the contradictions." In the next section, we will look at an intuitively compelling view of the set-theoretic universe, in which sets are built up in stages, with each set made up from simpler sets. This picture of the set-theoretic universe then suggests certain axioms for set formation, the Zermelo-Fraenkel Axioms, or ZF. If ZF set theory is consistent, then we may never know it: Godel's 2""^ Incompleteness Theorem states (roughly) that a theory that is strong enough to contain basic arithmetic cannot prove its own consistency. If ZF is inconsistent, then one day we may find a contradiction. However, ZF has been used for nearly a century already, and none has been found as yelrl ^How did Cantor come to this conclusion? Define the power set V{X) to be the set of all subsets of X. Cantor proved that \X\ < \'P{X)\. Now if X is the set of all sets, then Vi^X) C X, since every member of 'P{X) is a set. But then [^(X)! < \X\, and hence the assumption that all sets can be collected into a single set-of-all-sets leads to contradiction. Cantor knew about this by 1899 at the latest, though it was only published in 1932. ^ Here's how: "It seems to make perfect sense to inquire of a property whether it applies to itself or not. The property of being red, for instance, does not apply to itself, since red is surely not red, whereas (the property of being) abstract, being itself abstract, applies to itself. Calling the property of not applying to itself 'impredicable', we arrive at the paradoxical consequence that impredicable is impredicable if and only if impredicable is not impredicable. The property-theoretical (logical) variant is as paradoxical as the set- theoretical (mathematical) one." — Foundations of Set Theory, by Fraenkel, Bar-Hillel and Levy, 2nd revised edition, Springer 1973. ^This is the law which asserts that for any statement A, either A is true or A is false. The intuitionists therefore denied that the tautology A V ^A is true. *In September 2011, the mathematician Edward Nelson announced that arithmetic (to be precise, Peano Arithmetic (PA)) is inconsistent. His proof turned out to have an error. Nelson is hardly a "lightweight" mathematician, and this episode reveals that serious mathematicians may have legitimate qualms about the consistency of even the most basic and longstanding mathematical theories. By Godel's 2"^* Incompleteness Theorem, PA is not strong enough to prove its own consistency. ZF is strong enough to prove that PA is consistent [ZF h Con(Pyl)), but since we don't know if ZF is consistent, that proof is also suspect. (In 1936 the logician Gerhard Gentzen proved that PA is consistent. His proof goes beyond ordinary arithmetic by requiring a transfinite induction, but only up to a countable ordinal.) Set Theory 6 1.3 The Iterative Conception of Set Definition: By a "set" we mean any collection M into a whole of definite distinct objects m (which are called the "elements" of M) of our perception or of our thought. — Georg Cantor, 1895 (First sentence of Contributions to the Founding of the Theory of Trans- finite Numbers.) A set, according to Cantor, is 'any collection . . . , into a whole of def- inite, well- distinguished objects... of our intuition or thought.' Cantor also defined a set as a 'many, which can be thought of as one, i.e., a totality of definite elements that can be combined into a whole by a law. ' One might object to the first definition on the grounds that it uses the concepts of collection and whole, which are notions no better understood than that of set, that there ought to be sets of objects that are not objects of our thought, that 'intuition' is a term laden with a theory of knowl- edge that no one should believe, that any object is 'definite, ' that there should be sets of ill- distinguished objects, such as waves and trains, etc., etc. And one might object to the second on the grounds that a many' is ungrammatical, that if something is a many" it should hardly be thought of as one, thai totality is as obscure as set, that it is far from clear how laws can combine anything into a whole, that there ought to be other combinations into a whole than those effected by laws, ' etc., etc. — George Boolos, 1971 (Opening lines The Iterative Conception of Set, J. Philosophy 68 no.8, 1971, pp. 215-231.) Russell's paradox involved the multiplicity of all sets which are not elements of themselves. Can a set belong to itself? Looked at one way, the answer appears to be "Yes!" : Let A be the set of all sets definable, in English, using fewer than 100 words. We have just defined A in English using fewer than 100 words. Hence A E A. However, "definable in English" is a rather obscure term. English has just finitely many words, so there are just finitely many sentences of length < 100 words. Thus there are at most finitely many natural numbers definable in < 100 words. Now let n be the smallest natural number not definable in fewer than 100 words Contradiction. . . Another example is the set X:= {{{...0...}}} where there are infinitely many brackets arount the empty set. Then clearly {X} = X, so X ex. But looked at from another angle, the answer to our question appears to be "No!" , however: In order to collect some objects into a set — i.e in order to make a set — one has to have those objects first. Thus the elements which make up a set are, in some sense, more primitive than the set itself. As an analogy, one can think of the natural numbers as being "created" from the number by the operation of repeatedly adding 1. We start with zero, then we make 7 Axioms of Set Theory 1, then we make 2, then we make 3, etc. Now it must not be thought that natural numbers are continuously being created — "make" is just a manner of speaking. In the same way, one can think of sets of being made up as follows: We start off with some primitive objects called atoms (also called individuals, or urelemente), which are not sets, and have no elements themselves. At stage 0, we construct all sets of atoms. Then at stage 1, we construct all sets made up of atoms and sets created at stage 0. At stage 2, we construct all sets consisting of atoms and sets created at stages and 1. Thus if we start with two atoms a, b, then we get the following: Stage 0: New sets 0, {a}, {b}, {a, b}. Stage 1: New sets {0}, {{a}}, {{6}}, {{a, b}}, {0, {a}}, {0, {b}}, {0, {a, b}}, {0, {a}, {6}}, . . . Stage 2: New sets {{0}}, {0, {0}}, {{{a}}}, ■ ■ • Stage 3: . . . We do this for all the stages 0,1,2, But it doesn't stop there! Once we have all these sets, we can construct sets made up of atoms and sets constructed at some finite stage. Thereafter we can construct sets made up of atoms and sets just constructed, at the first infinite stage, for example the set {0, {0} , {{0}},{{{0}}},...} w 1 bracket 2 brackets 3 brackets You may have noticed that we have not said anything about the nature of the atoms. What should or could atoms be? We might take them to be basic objects that we need in mathematics, e.g. natural - or real numbers, or points and lines. However, it turns out that we do not need atoms at all! Instead, we can build up all of mathematics from nothing, i.e. from the empty set. To do this, define a pure set as follows: • Atoms are not pure. • A set is pure if and only if all its elements are pure. Thus at stage 0, we form the set 0, which is pure — vacuously, because none of its members are impure. At stage 1, the set {0} is pure. At stage 2, we have the pure sets {{0}} and {0, {0}}, etc. The infinite set (★) constructed at the first infinite stage is clearly pure as well. We will only be concerned with pure sets. Thus when we say "set" , we mean "pure set" . This means that in our mathematical universe, everything is a set. Now observe that sets are made again and again. For example, is a subset of all sets, and hence is created anew at each stage. But for each set there is a least stage, i.e. a stage at which the set appears for the first time. We thus have the following intuitions about set formation via stages: (i) There is a least stage at which we have the set 0. (ii) Every stage a has a successor a + 1. At stage a + l we can construct sets whose elements are sets constructed at stage a and earlier. This means that if a collection (multiplicity) has elements all of which are created at stage a or earlier, then that collection is a set, and is "created" by stage a + 1. Set Theory 8 (iii) Having made sets at a whole lot of stages, there is least a new stage, at which we can construct sets whose elements are sets constructed at an earlier stage. (iv) A collection (multiplicity) is a set if and only if it is formed at some stage. (v) For every set, there is a least stage at which that set is formed. Thus we have stages 0, 1, 2, 3, . . . , n . . . . After that comes a stage — call it stage oj. After stage uj come stages a; + 1, a; + 2, . . . , 2a;, 2a; + 1, ... , 3a;, ...,a;-a;,...,a;,...,a;,...,a; ,...,a; Thereafter comes another stage — call it uji, etc. These stages conform to Cantor's two "generating principle^': (1) Every stage a has an immediate successor stage a + 1, and (2) given any sequence of stages without a last element, there is a stage which comes immediately afterwards. If a, /3 are stages, we write a < /3 to mean that stage a comes before stage /3. When we form a set at some stage /3, its elements must be sets that have been constructed at some stage a < /3. It is therefore impossible for a set to belong to itself. Furthermore, it is impossible to get a loop such as Xi G X2 G X3 • • • G e Xi Now we consider a property that will turn out to play a very important role in the future development of set theory: Let A be any non-empty collection of stages, and let F be the collection of all stages that come strictly before any of the stages in A, i.e. F := {7 : V(5 G A (7 < (5)}. We can then consider two cases: • Either F is empty. In that case if the zeroth stage belongs to A, so is the least element of A. • Else F is non-empty. In that case intuition (iii) above says that there is a least stage a which lies above all 7 G F. We now claim that a is the least element of A. To see this, note first that each 5 G A lies strictly above all 7 G F. Since a is the least stage which lies above all 7 G F, it follows that V5 G A (a < (5). Next observe that it cannot be the case that V(5 G A (a < (5), for then q G F. Hence there is some /3 G A such that (3 < a. But since /3 G A we also have a < (3, and thus q = /3. In particular, a G A. Since V5 G A (a < 5), we see that a is the least element of A. In both cases, therefore, we have shown that A has a least element: Any non-empty collection of stages has a least element. All of the above is at the intuitive level, and must be made formal. And we will accomplish this: For example, the intuitive idea of a stage will be captured by the formal notion of an ordinal. The fact that each set is "created" at some stage will be be evident when we consider the cumulative hierarchy of sets and the Axiom of Foundation. Cantor, Grundlagen einer allgemeinen Manmchfaltgkeitslehre, 1883. 9 Axioms of Set Theory 1.4 The Axioms of ZFC Set theory is that branch of mathematics whose task it is to investi- gate mathematically the fundamental notions "number", "order", and "function", taking them in their pristine, simple form, and to develop thereby the logical foundations of all arithmetic and analysis; thus it constitutes an indispensable component of the science of mathematics. At present, however, the very existence of the discipline seems to be threatened by certain contradictions, or "antinomies" , that can be de- rived from its principles — principles necessarily governing our think- ing, it seems - and to which no entirely satisfactory solution has yet been found. [. . . jUnder these circumstances there is at this point nothing left for us to do but to proceed in the opposite direction, and starting from set theory as it is historically given, to seek out the principles re- quired for establishing the foundations of this mathematical discipline. In solving the problem we must, on the one hand, restrict these prin- ciples sufficiently to exclude all contradictions and, on the other, take them sufficiently wide to retain all that is valuable in this theory. Now in the present paper I intend to show how the entire discipline created by Cantor and Dedekind can be reduced to a few definitions and seven principles, or axioms, which appear to be mutually independent. — Ernst Zermelo, 1908 (Opening lines of Untersuchungen iiber die Grundlagen der Mengen- lehren I, Mathcmatischcn Annalen 65, pp. 261-281, 1908. Translation in van Heijenoort, From Frege to Godel) You may already have encountered various systems of axioms, such as the axioms for a vector space, or of a field. A model of a system of axioms is a mathematical object that satisfies these axioms. Thus a field is a model of the field axioms. Systems of axioms can be separated into two classes: I. Some systems of axioms arc obtained by studing a number of mathematical objects/theories and abstracting certain common elements. Examples of such arc the axioms of group theory, of field theory, of vector spaces, of topology, of lattice theory, of probability theory. . . . These systems of axioms have many models, i.e. there are many different kinds of group, many different kinds of field, many different kinds of vector space, .... Moreover, these axioms are intended to have many models. II. Other systems of axioms are meant to describe and characterize one kind of mathematical objects, i.e. these axioms have an intended model. The axioms of arithmetic are meant to capture just the set N of natural numbers with the operations +, x and relation < on N. The axioms of a complete ordered field are intended to capture the set of real numbers. Euclid's axioms of geometry are intended to characterize space. In some cases, systems of axioms succeed in characterizing exactly the intended model. Notably the axioms for a complete ordered field exactly capture the reals, in that any two models of these axioms are isomorphic (i.e. essentially "the same"). In many cases, it is impossible to completely capture an object with a set of axioms: If a set of first-order Set Theory 10 axioms has just one an infinite model, it will have many non-isomorphic models " The axioms of set theory are of the second kind: They are intended to describe the universe of all sets. Below, we will write down these axioms, and briefly justify them in terms of the iterative conception of sets. In the chapters that follow, we will develop the consequences of these axioms in some detail, and (hopefully) convince you that all of mathematics can be interpreted in the universe of sets. The first set of axioms for set theory were presented by Ernst Zermelo in 1908, in a paper entitled Investigations in the Foundations of Set Theory I. An additional schema of axioms was suggested by Abraham Fraenkel in 1922 (and independently, by Thoralf Skolem in 1923). We shall phrase these axioms in the language of first order predicate logic. The language of set theory has at its disposal the following symbols, whose meaning is assumed to be already known to you. We have at our disposal a countable collection of • Variables: vo,vi,V2,V3, . . . As non-logical symbols, we have the two binary predicate symbols • Equality: = • Membership: G The atomic formulas of our theory are expressions X = y X £ y and these can be combined into formulas by means of • Logical connectives: A (conjunction, and),V (disjunction, or),-i (negation, not), — )■ (im- plication, implies), <-)• (equivalence, if and only if). • Quantifiers: V (for all), 3 (there exists). • Brackets: (,) It is assumed that you already understand what a (well-formed) formula is, when an instance of a variable in a formula is free or bound, etc. All of this is part of our metatheory, the basic background theory in which we discuss our formal set theory. It's obvious — if you stop to think for a minute — that one cannot develop any theory (mathematical, or otherwise) from absolutely nothing. We have to assume that we have a common background language in which to develop our formal theory, and a common basis of knowledge that we can use as springboard from which to start. For us, this metalanguage and metatheory includes a natural language (English) augmented by a few mathematical notions, such as the intuitive notion of natural number, the notion of a finite set, and very elementary first-order logic. This foundation, involving only intuitively obvious finitary notions and - reasoning, is (hopefully) obvious a priori, and its validity not in doubt ^"This is the Lowenheim- Skolem Theorem, a basic result in the branch of mathematical logic called model theory. It follows from this theorem that if the axioms of set theory are consistent, then there is a countable model of set theory, i.e. a model in which there re just countably many sets — a result that Skolem found highly paradoxical. ^^If you do doubt it, your mathematics will look very different. For example, for the intuitionists every function / : R — ^ R is continuous. This follows logically from certain assumptions that are quite natural. 11 Axioms of Set Theory Apart from the above symbols, we will introduce symbols for defined notions as we proceed. Many of these will undoubtedly be familiar to you, such as U, n, 0, — for union, intersection, empty set and set difference. We will use x,y, z,X,Y, Z, . . . as symbols for variables, and also use brackets [,], j , , to make expressions easier to understand. We will also employ expressions such as x ^ y rather than the formally correct = y), and we will omit brackets wherever it makes an expression easier to read without risk of ambiguity. Thus, for example, we will write Vx 3y (x y) instead of Vfo (3fi {^{vq G vi))) The latter is a well-formed sentence of our formal language, but the former is much easier to read. We also assume that you are aware of various logical equivalences, such as (/? V ■i/' -'(-'(^ A -I'i/') and Mx ip (-k/^) etc. Furhermore, we introduce the following notation: (Vx y) (f means Vx(x G y — >■ (p) and (3x & y) if means 3x {x & y A ip) ZFC 0: The Axiom of Set Existence ZFC 0: There is a set. 3x {x = x) Without this axiom, we have nothing to talk about. (For some systems of axioms, models with no elements are acceptable, and even desirable. Category theorists, for example, like to work with categories that have initial objects, and these can be empty. But our axioms are meant to characterize a single "object" : The universe of all sets — and we believe that that universe is non-empty.) This axiom is not strictly necessary, as it is implied by the Axiom Infinity. ZFC 1: The Axiom of Extensionality ZFC 1: If two sets have the same elements, then they are equal. Vx Vy [Vz (z G X -H- 2; G y) — >■ X = y] The extension of a property is the collection of all objects which have that property, i.e. to which that property extends. It is not a priori clear that {Moons of Venus} = {Moons of Mercury} Set Theory 12 or that {x'^ ■.xeR} = : y G M} The intension of those definitions of sets — i.e. the way in which they are defined — is different. However their extension is the same. Both the sets {Moons of Venus} and {Moons of Mercury} are empty, and are thus equal to 0. Both the sets {x'^ : x € M} and {y^^ : y G M} consist exactly of all non-negative real numbers, and are thus equal to [0,oo). The Axiom of Extensionality states the intension is not important when it comes to set equality, but only the extension. We may now introduce a new symbol C for the relation of inclusion, a binary relation on sets which is defined as follows. Definition 1.4.1 We say that a set j; is a subset of y and write a; C y if and only if every element of x is an element of y: X y ■^=^ \/z {z £ X ^ z £ y) We also write y ^ x instead of x C y. We say that x is a proper subset of y, and write x C y (or y ^ x), if x y and x ^ y. □ The Axiom of Extensionality can now be written thus: 2;^yAyCx— 7>x = y Remarks 1.4.2 • The binary relations E and C are sometimes confused, especially by beginners. They mean quite different things, however. Indeed, observe that it is always the case that x C x, but that it is seldom the case that x G x. (Indeed, never: That X G X contradicts the iterative conception of sets was discussed in §1.3[ and will follow formally from the Axiom of Foundation. ) • Another role of the Axiom of Extensionality is to limit the universe of discourse to sets. In particular, it says that if two objects x, y of our universe have the same elements, then X = y. It therefore follows, for example, that there are no atoms. (For if a is an atom, then a and have the same elements, so a = — contradiction.) □ ZFC 2: The Axiom Schema of Separation The remaining axioms, apart from the Axiom of Foundation, are axioms of set formation, i.e. they assert that certain sets exist, or that certain collections (multiplicities) are sets. The guiding principle, expounded by Cantor, Russell and Zermelo, is that of limitation of size. Sets are collections that are "not too big." This prevents one forming a "set of all sets", for example. ZFC 2^: Let if{x,pi,...,pn) be first-order formula with free variables amongst x,pi, . . . ,pn- For any set z, there is a set y consisting of all those x G z which have the property (p. Wz Vpi ... ypn ByWx {x £ y -ir^ X £ z A ^p{x,pi, . . . ,pn)) 13 Axioms of Set Theory Given a set z and a property we may separate all those elements of z which have the property if, and put them together to form a new set y. By the Axiom of Extensionality, there is just one such set y. We therefore may introduce some new notation: Let {x e z : (p{x,p)} denote the unique set y whose existence is asserted by the Axiom Schema of Separation. Observe that y is a sub-collection of z. The Separation Axiom asserts that the collection (multiplicity) y is a set. It is easy to justify the existence of such a set y from the iterative conception of sets: Given the set z, we know that it is "created" at some stage a. All the elements x of z are therefore created at some stage strictly earlier than a. In particular, the elements of y are created at some stage strictly earlier than a, as y C. z. But at stage a we make sets out of all those collections whose elements are sets created at an earlier stage. In particular, we make a set out of the collection y at (or possibly even before) stage a. Observe that Separation is not a single axiom, but an axiom schema: We have one axiom for every first-order ip{x,p). However, given any formula (or string of symbols) it is a mechanical procedure to check whether or not $ is an instance of the separation axiom. The Axiom Schema of Separation is also called the Axiom Schema of Comprehension or the Aussonderung-axiom in the literature. It will, in fact, follow from the Axiom Schema of Replacement (ZFC 7), so it isn't strictly necessary. But it is easier to use than Replacement, and is on the original list of axioms that Zermelo published in 1908. Exercise 1.4.3 (a) Prove that the existence of a unique empty set follows from the axioms ZFC 0-2. We may therefore define to be the unique set with no elements. (b) Prove that ZFC 0-2 imply that every set x has at least one subset y which is not an element of x, i.e. prove that Vx 3y (y C a; A y x). Deduce that the collection of all sets is not a set. (c) Recall that if a, b are sets, then their intersection a n 6 is the collection of all elements which belong to both a and b. Further recall that a — 6 is the collection of all elements which belong to a but not to b. Show that ZFC 0-2 imply that if a, b are sets, then so are aCib and a — b. □ Remarks 1.4.4 In the exercise above, we formally introduced the symbols 0,n,— . Previ- ously, we introduced the symbol C and the notation {x £ a : f}. We will use these introduced notions as though they are part of our formal language. They can always be eliminated, however. For example, \/x 3y {y C x) is shorthand for ^x 3y \/z {z e y ^ z £ x) and {yea: ip{y,p)}r]b = is shorthand for 3z ^y {y e z y e a A ip{y,p)) A-i3x {x e z A x e b) □ Set Theory 14 ZFC 3: The Axiom of Pairing ZFC 3: Given two sets x, y there is a set z whose members are precisely x,y. Va; Vy Vi/; (w^z-^w = x\/w = y) By the Axiom of Extensionality, any two sets whose members are precisely x, y are equal, and thus we may define the unordered pair {x, y} to be the unique set whose elements are precisely x and y, i.e. z = {x, y} is shorthand for xEz/KyEzA Vu; {w E z — >■ w = x V w = y) We then define the singleton {x} to be the set {x,x}. The Axiom of Pairing is easy to justify via the iterative conception of sets: If a and b are sets, then each is created at some stage, say stages a and p. Then {a, b} will exist at any stage 7 which is strictly greater than both a and /3. Remarks 1.4.5 The Axiom of Pairing is often stated in the following weaker form: Given any two sets x, y, there exists a set z that they both belong to. However, together with the Separation axiom, this weaker form implies our form. For if z is a set which has x, y amongst its elements, then {x, y} := {w & z : w = xV w = y} □ ZFC 4: The Axiom of Union ZFC 4: For any set x, there is a set y whose elements are precisely the elements of elements of x. Vx 3y Vz [z G y -H- 3w G x {z £ w)] By the Axiom of Extensionality, there is a unique such set y, which is called the union of the set X, and denoted by y = (Jx or (more commonly) by y = [^{w : w & x} or y = vj We can justify this axiom as follows: If x is a set, then it is formed at some stage a. Each element of x is therefore created at some stage < a, and thus the elements of elements of x are created before stage a as well. Now the union of a; — i.e. the collection of all elements of elements of a; — is a collection of elements that are all created strictly before stage a, and hence |J x is formed as a set by stage a at the latest. Remarks 1.4.6 The Axiom of Union may be stated in the following weaker form: Given any set x, there exists a set z that has amongst its elements all the elements of elements of x. Together with the Separation axiom this weaker form implies our form. For if z such a set, then [J X := {y G z : 3w G x (y G w)} 15 Axioms of Set Theory □ We may now define (with the aid of the Axiom of Pairing) the sets xUy :=[J{x,y} x U y U z :=\^{x,y, z} xi U • • • U j;„ = |J{xi, . . . , j;„} etc. Exercise 1.4.7 (a) Suppose that C := {y : (p{y,p)} is the cohection of all sets that have some propertjj^ given by the first-order formula C need not be a set itself — it might be "too big" — but each of its elements is a set. Define the intersection P| C of C to be the collection of all sets z with the property that z belongs to each element of C, i.e. f]C = {z:\/y€C{zGy)} Show that ZFC 0-2 imply that if the collection C is non-empty, then P| C is a set. Further, use ZFC 0-3 to define the intersections x Ciy, x (1 y Ci z etc. Observe that the Axiom of Union is not required for the existence of intersections. (b) Show that P| is not a set. □ We may also write y = P| C as f]{y : y £ C} oi y = Clyec U- Further, x Ciy := {^{x, y}, etc. Exercise 1.4.8 If the following identities and inclusions are unfamiliar, now^^ is the time to prove them. (a) x C y n z iff X C y and x'^z; x^yUziSx^y and x ^ z. (b) 2;Cyiff2;ny = xiffxUy = ?/iffx — y = 0. (c) The distributive laws: x Ci {y U z) = {x Ci y) U {x Ci z) and xU {y Ci z) = (x U y) Ci {x U z). (d) De Morgan's Laws: x — {y U z) = {x — y) (1 {x — z) and x — {y (1 z) = {x — y) U {x — z). ZFC 5: The Axiom of Power Set ZFC 5: For any set x, there is a set y whose elements are precisely the subsets x. yx 3y \/z {z £ y -f-)- z C x) By the Axiom of Extensionality, there is a unique such set y, which is called the power set of the set X, and denoted by y = V{x) The Axiom of Power Set may be justified as follows: If x is a set formed at stage a, then all the elements of x are formed before stage a. But then each subset of x is a collection of elements that are created before stage a, so each subset of x is made into a set by stage a at the latest. So the collection of all subsets of x is a collection of sets that all exist by stage a, and thus 'P(x) is made into a set by stage a + 1 at the latest. ^■^The technical term for such a collection is class — cf. [|1.5 "Yes, NOW! Set Theory 16 Exercise 1.4.9 Show that if x,y are sets, then {x,y} € V{V{x) UV{y)). Given the above, do we need the Axiom of Pairing? Or does Pairing foUow from Union and Power Set? ZFC 6: The Axiom of Infinity "But concerning your proof, I protest above all against the use of an infinite quantity as something completed, which is never permissible in mathematics. The infinite is merely a way of speaking, the true meaning being a limit which certain ratios approach indefinitely close, while others are permitted to increase without restriction. " — Gauss, letter to H. Schumacher, 1831 □ " The existence of the infinite will never again be deniable while that of the finites is nevertheless upheld. If one permits either to fall, one must do away as well with the other. " — Cantor, Grundlagen, 1883. The axioms ZFC 0-5 ahow us to prove that there are infinitely many sets, but they do not allow us to prove that there is an infinite set. This is not hard to see: The axioms ZFC 0-2 allows us to deduce the existence of possibly just one set, namely 0. Further, axioms ZFC 2-5 allow us to create new sets from old, by taking subsets, pairs, unions, and power set. But if we have at our disposal only finite sets (i.e. if we are at a stage where only finite sets have been created), then the operations of taking subsets, pairing, union and power set are not capable of forming infinite sets. Indeed, even ZFC 0-5 together with ZFC 7-9 are not sufficient to prove the existence of an infinite set, and the existence of such a set is, in fact, denied by the intuitionists and finitists. Following Aristotle, the concept of infinity was separated into two distinct versions. On the one hand, there is the idea of an actual infinity, i.e. a completed, finished, definite thing consisting of infinitely many elements or parts. On the other, there is potential infinity, where one has a sequence of things which is never completed or finished. Aristotle allowed only potential infinities as non-paradoxical, and many mathematicians and philosophers agreed. Certainly, this point of view, eloquently expressed in the quote of Gauss above, had much to recommend it, and under its guidance mathematicians were able to develop completely rigorous definitions of limits without the use of infinitesimal^^ It was Cantor who made the use of the actual infinity respectable in mathematics. It was Zermelo who first understood that its existence had to be asserted: ZFC 6: There is an infinite set. 3x [0 £ X A yy {y G X - ■> yU{y}e x)] The philosopher George Berkeley ridiculed the use of infinitesimals: "May we not call them the ghosts of departed quantities?" Strangely enough, Cantor too was resolutely opposed to the use of infinitely small numbers, and "refuted" them with arguments that could be applied equally to his infinitely large transfinite numbers. But the logician Abraham Robinson resurrected them and legitimised their use with his invention of the hyperreals and non-standard analysis. 17 Axioms of Set Theory Let us analyze this axiom: Define an operation S{-) on sets by S{x) := xU {x} This means that y = S{x) yz{z£y-^z£xVz = x) By the Axioms of Pairing and Union, S{x) is a set whenever x is. For reasons that wiU become clear later, the function S{-) is called the successor function. The Axiom of Infinity states that there is a set which has the following amongst its elements: 0, 5(0), S{S{0)), S{S{S{0))),... "Clearly", such a set must be infinite. However, we have not yet defined what wc mean by an infinite set — this requires the formal notion of a natural number. Let us say that a set X is inductive if it satisfies the condition in the Axiom of Infinity, i.e. if 0GxAVy(yGx^ S{y) G x) The Axiom of Infinity therefore asserts that there is an inductive set. (The set of natural numbers will then be defined to be the smallest inductive set.) Once wc have defined what we mean by an infinite set, we shall be able to prove that an inductive set is indeed infinite. Moreover, we shall show that if an infinite set exists, then so does an inductive set. Thus the existence of an infinite set will turn out to be equivalent to the existence of an inductive set. Let us quickly see how to justify this axiom via the iterative conception of sets: Observe that is formed at stage 0. Then S{0) = {0} is a set whose elements arc formed at stage 0, so S{0) is formed at stage 1. Then S{S{0)) = {0,{0}} is a set whose elements are formed at stage 1 and earlier so SS{0)) is made into a set at stage 2. Thus: is formed at stage S{0) is formed at stage 1 SS{0) is formed at stage 2 SSS{0) is formed at stage 3 These sets therefore are all formed before the first infinite stage oj, and hence the inductive set {0,S{0),SS{0),SSS{0),...} is formed at stage cj. ZFC 7: The Axiom of Choice ZFC 7: Every family of non-empty sets has a choice function: If x is a family of non-empty sets, then there is a function f on x which chooses from each w E x an element f{w) G w. \/x [Vu) G a; (w 7^ 0) — )• 3/ ( / is a function A dom(/) = a; A \/w G x {f{w) G w))] Set Theory 18 In the next chapter, we will see that functions can be interpreted in the universe of sets, so that to the statement "/ is a function" can be written in the language of set theory. Observe that if / is a choice function on x, then / : x — )• |J x. (Of course, not every / : x — )• |J x is a choice function.) The Axiom of Choice — formulated explicitly by Zermelo in 1904 — is an immensely powerful tool. It seems intuitively obvious, and yet has generated so much controversy that it deserves (and will get) a chapter on its own. The reason that this axiom was initially eyed with such suspicion by a large number of mathematicians is that the axiom is non-constructive. It simply asserts the existence of an object — a choice function — without saying how such an object might be constructed. The axioms ZFC 2-6 have a slightly different character: They assert that certain well-defined collections are sets. The Axiom of Choice (abbreviated AC) , on the other hand, asserts the existence of an undefined set And this is precisely the source of its power: If one could define a choice function for a particular family x, then one would not need the AC to assert its existence. Remarks 1.4.10 It is important to note that one does not need AC to "choose" an element from a non-empty set. This fact has often been misunderstood. The fact is that AC doesn't really allow you to "choose"; that's just a manner of speaking. What AC does do is assert that a certain set exists. Thus if w is non-empty set, then x := {w} is family of non-empty sets, and AC says that there is a function / : x — t- |J x such that f{w) G w. But we don't need AC for that ! If it; is non-empty, then (by definition of "non-empty" and first-order logic) there is an element a G w. We may not know what a looks like — i.e. we may not explicitly be able to choose a particular a — but there is such an a. It will be clear after you've read the next chapter that the function / : x — )• |Jx which has f{w) = a exists, i.e. a choice function exists. In the same way, it can be shown that any finite family of non-empty sets has a choice function (because it follows from the Axioms of Pairing and Union show that every finite collection of sets is a set). But a problem arises when we have an infinite family x of non- empty sets: It is always possible to choose an element from each w G x. What is not clear is that one can do this "in one go", i.e. simultaneously for all w £ x, via some function /. □ I'm not sure if one can justify the Axiom of Choice via the iterative conception of sets. However, in 1938 Kurt Godel proved the following result: If ZF (i.e. set theory without AC) is consistent, then so is ZFC (set theory with AC). Put differently, AC cannot introduce an inconsistency into set theory. If ZFC is inconsistent, then ZF was already inconsistent also. ZFC 8: The Axiom of Replacement The remaining two axioms — Replacement and Foundation — are very important for the subject of set theory itself, and for metamathematics. They play only a tiny role in ordinary mathematics however. I do not see know to derive the Axiom Schema of Replacement from the iterative concep- tion of sets. It is easy to derive it from the idea of limitation of size, however. By that we mean that a set is a collection that is not "too big" . Here's how: We shall soon see — and you are no doubt aware already — that functions are sets. Remember, sets are all there is. 19 Axioms of Set Theory Suppose that ip{x, y,p) is a first-order formula in the language of set theory which defines a "function" . Here x, y play the role of independent and variable, and p the role of a fixed parameter. Informally, such a formula defines a "function" F when for any x there is at most one y such that ip{x,y,p) holds. In that case we have a "function" F given by y = F{x) if and only if ip{x, y,p) holds The Axiom of Relacement asserts that if F is such a "function" and if X is a set, then the direct image F[X] := {F{x) :xeX} = {y:3xeX ^{x, y,p)} is a set also. The use of quotation marks around the word function is essential here. We shall see in the next chapter how to interpret functions as sets. But it will follow from ZFC 0—5 that the direct image of a set under a function is a set — The Axiom of Replacement isn't necessary for that. The function F we use here may be "too big" to be a set, however. Nevertheless, even though F may be too big, its image F[X\ is surely no bigger than X. So if X is a set, then F[X] can't be too big to be a set. All this may be confiising at first. Fortunately wc don't need to talk about the size of a "function" F to formulate the Axiom Schema of Replacement: ZFC 8(^: Suppose that (p{x,y,pi, . . . ,pn) is a first-order formula with free variables amongst x,y and p := {pi, . . . ,pn)- Suppose that for every x there is at most one y so that (p{x,y,p). If X is a set, then the collection {y :3x e X (p{x, y,p)} is a set. \/p VixMyMz {ip{x,y,p)/\Lp{x,z,p) ^ y = z) — >■ VX3y Vy (y G y ■<->■ -^x ^ X ip{x,y,p))\ Observe that Replacement — like Separation — is an Axiom Schema. We have one axiom for every ip. The Axiom Schema of Replacement was suggested independently by Abraham Praenkel and Thoralf Skolem in 1922. It allows the creation of many, many new sets. Whereas the other axioms of ZFC allow one to form new sets that are at most a few levels higher than the sets from which they are created, the Axiom of Replacement allows one to form sets at very high levels. (E.g. it is possible that the sets X is at a "low level" and that the created set F[X\ at a very "high level". ) In particular, it necessary to develop a theory of ordinal numbers and transfinite induction. As we shall see, the ordinals form the backbone that supports the entire set-theoretic universe, so this axiom is indispensable to the set theorist. Set Theory 20 ZFC 9: The Axiom of Foundation A well-known scientist (some say it was Bertrand Russell) once gave a public lecture on astronomy. He described how the earth orbits around the sun and how the sun, in turn, orbits around the center of a vast collection of stars called our galaxy. At the end of the lecture, a little old lady at the back of the room got up and said: "What you have told us is rubbish. The world is really a fiat plate supported on the back of a giant tortoise. " The scientist gave a superior smile before replying, "What is the tortoise standing on?" "You're very clever, young man, very clever," said the old lady. "But it's turtles all the way down!" — Stephen Hawking, A Brief History of Time In the set-theoretic universe, its sets all the way down. Every element of a set is a set. Every element of an element of a set is a set. But how far is all the way down'? Not that far, as it turns out. According to the iterative conception of sets, sets are made in stages, with each set being made up of elements that have been formed at a strictly earlier stage. We saw also that any non-empty collection of stages has a least element. Suppose now that a; is a non-empty set. Let r be the collection of all stages associated with an element of x, i.e. r = {7 : There is x G X such that z is formed at stage 7} The r is a non-empty collection of stages, so has a least clement a. Let y G X be an element formed at stage a. If z G y, then z is formed at a stage /3 < a. Now /3 F, because a is least, and hence z ^ x. It follows that if 2; G y, then z ^ x, i.e. that y Ci x = 0. We have thus shown (informally, from the iterative conception of sets), that for every non-empty set • there is y E x such that y n x = 0. We now take this to be an axiom: ZFC 9: Vx (x 7^ - ■> 3y G x(y n x = 0) Exercise 1.4.11 Show using the Axiom of Foundation that "the turtles don't go very far down" in ZFC: There are is no sequence of sets xi, X2, X3, . . . such that . . . Xn+l G x„ G . . . X3 G X2 G Xi It follows that every G-decreasing chain xi 3 X2 3 xs 3 . . . is of finite length. (Technically, we say that the G-relation is wellfounded.) □ 21 Axioms of Set Theory The Axiom of Foundation is also referred to as the Axiom of Regularity in the hterature. It was introduced by John von Neumann in 1925. While it is of scant use to the ordinary mathematician, it is of great import to the set theorist, as we shall see in a later chapter. In essence, together with Replacement, it allows us to formalize the iterative conception of set within ZFC itself, and build up the universe of sets from nothing, by iterating the operations of power set and union along the ordinals. 1.5 Sets and Classes The version of set theory that we have developed (i.e. ZFC) has just one type of object, namely sets. Some set theories allow for another kind of object, however, namely classes. We will use this terminology only informally. Thus, informally, a class is a collection of the form C := {x : ipix,p)} i.e. C is the collection of all sets that have property (p. Here x,p are sets, with p playing the role of a parameter. For example, the universe of all sets — denoted by V — is the class V := {x : X = x} Some classes are "too big" to be sets: Russell's paradox (together with Separation) shows that V is not a set. But every set is a class, for if y is a set, then the class determined by the formula ip{x, y) = x G y is clearly y itself: {x : X ey} = y A class that is not a set is called a proper class. Thus V is a proper class. Later, we shall see that the class On of all ordinals is a proper class also. The theory NBG (von Neumann-Bernays-Godel) is a commonly used set theory allows both classes and sets as objects. (This is a conservative extension of ZFC: Any theorem about sets is NBG is also a theorem of ZFC, and vice versa. Thus NBG says nothing about sets that ZFC doesn't already know.) Inside ZFC, however, classes are just a way of speaking: If C := {x : ipix,p)} is a class, then when we say x G C, we mean simply that the statement (p{x,p) is true. Thus any mention of C can be avoided, simply by using its defining formula (/?. Exercise 1.5.1 Observe that every element of a class is a set. Show that a class is a set if and only if it is an element of some class. □ In Exercise 1 1.4. 7 it was shown that the intersection of a non-empty class is a set, i.e. that if C 7^ 0, then f] C is a set. The Axiom Schema of Separation states essentially that a subclass of a set is a set: {x £ z : ip{x,p)} = zri{x : ip{x,p)} = z CiC Set Theory 22 The Axiom Schema of Replacement can be formulated as follows: If F is a class functior and X is a set, the direct image F[X] is a set. ^''i.e. a "function" between classes. For example, the successor function S{x) :— xU {x} is a class function S : V V. Chapter 2 Relations and Functions In this chapter, we revise some elementary notions in mathematics, and show how they can be interpreted in the universe of sets. It is assumed that you've encountered the material in §2.1-2.5 before, so we move at speed. For the development below, the axioms ZFC — 5 usually suffice. The Axiom of Choice (AC, ZFC 7) also occurs in a few exercises, but its use is always stated explicitly. 2.1 Ordered Pairs and Finite Cartesian Products We can use the notion of an unordered pair to define the notion of an ordered pair. An ordered pair (x, y) is an object which, intuitively, contains x, y in that order. The essential property that we want ordered pairs to possess is the following: (x, y) = (a, b) if and only if x = a and y = h In particular, (x,y) ^ {y,x) (unless x = y). But by the Axiom of Extensionality {x,y} = {y,x} We can, however, interpret ordered pairs as sets, i.e. find sets which exhibit the correct behaviour. We follow the Polish mathematician Kazimierz Kuratowski, and define {x,y) := {{x},{x,y}} Exercise 2.1.1 Show that Kuratowski's definition of orded pair results in the correct be- haviour, i.e. that (x, y) = (a, h) if and only if x = a and y = b. Also show that the following alternative definitions of ordered pair fail to exhibit the desired behaviour: {a,b):={{a},{b}}, {a,b) := {a,{b}} Finally, give one other definition of ordered pair that does exhibit the correct behaviour. □ 23 Set Theory 24 Observe that the set {x, y) is a concrete reahzation of the abstract idea of an ordered pair {x,y). This is not to say that is {x,y), but merely that the set {x,y) behaves as an ordered pair should. Once we have defined ordered pairs as sets, we are able to find interpretations of ordered triples and quadruples, {x, y, z) ■= {{x, y),z), {x, y, z, w) := {{x, y, z),w), and, in generaQ {Xi, . . .,Xn+l) ■■= {{Xl, . . .,Xn),Xn+l) Given two sets X, Y, we may define the Cartesian product X xY X and Y as follows: X X Y = {z : 3x £ X 3y £ y {z = {x,y))} We then define X xY X Z := {X xY) X Z etc. Observe that in that case X xY X Z = {{x, y,z) : X e X,y eY,z e Z} Similarly, we define X"^ := X x X, X^ := X x X x X , etc. We have not yet shown that Cartesian products are sets. That is the objective of the next exercise: Exercise 2.1.2 Let X,Y be sets. (a) Write down a first-order formula pair{z, x, y) which states that z = {x, y}. (b) Write down a first-order formula ordered_pair{z , x, y) in the langage of set theory which states that z = {x,y). (c) Show that ii x £ X and y £Y, then {x, y) £ V{V{X U Y)). (d) Show that X x y is a set. □ Here are some easy exercises about the arithmetic of Cartesian products: Exercise 2.1.3 (a) {X \JY) x Z = {X x Z)U {Y x Z), {X r\Y) x Z = {X x Z)r\{Y x Z), {X -Y) X Z = {X X Z) - {Y X Z), (b) (A X B) n {X X Y) = {An X) X {B nY). (c) X xY = ii and only if X = or F = 0. □ ^We are not using natural numbers in this definition, but giving a schema for defining ordered tuples of arbitrary length. If you want to define an ordered 100-tuple, this requires a formula containing 100 variables. The concept/number 100 is not required. However, in order to define all finite ordered tuples simultaneously, using induction, we need the concept of natural number. This will have to wait until a later chapter. 25 Relations and Functions 2.2 Relations In mathematics, one studies many relations between various objects x, y. For example: • X < y where x, y are numbers. • X £ y, where x, y are sets. • X = y, where x,y are natural numbers/real numbers/sets/matrices/. . . • x\y {x divides y), where x,y are integers, or polynomials. • X is a root of y, where x is a real number, and y a real function. If R stands for some relation, we often write xRy to mean that x, y satisfy that relation. Relations need not be binary, i.e. they need not be between just two objects. Furthermore, • x,y,z are related if and only if x is congruent to y modulo z, where x, y, z are natural numbers. • x,y,z are related if and only if x(y) = z, where x is a real function, and y,z are real numbers. • x,y,z are related if and only if the dot product of x and y is z, where x, y are two n-dimensional vectors, and z is a real number. • X, y, z, w are related if and only if the point (x, y) lies on a circle of radius z and centre w. Observe that the order of x,y,z,. . . may matter. If the relation is strict inequality <, then X is related to y if and only if x < y. This does not mean y is also related to x. Given some relation, we can interpret it as a set R as follows: Let R be the set of all ordered pairs (x, y) with the property that x is related to y. Thus Definition 2.2.1 A binary relation i? is a set of ordered pairs. When {x,y) G R, we may indicate this by writing xRy. An n-ary relation i? is a set of ordered n-tuples. When (xi, . . . , Xn) G R, we may indicate this by writing R{xi, . . . , x„). For example, the relation of strict inequality on the natural number^ N := {0, 1, 2, 3, ... } is the set < defined by <:= {(0,1), (0,2), (0,3), (0,4),..., (1,2), (1,3), (1,4),..., (2,3), (2, 4),..., (3, 4),...} □ Definition 2.2.2 Suppose that i? is a binary relation. ■^Informally speaking, that is. In the end of course, they will be of the same type, since everything is a set. ''We will define the natural numbers later in this chapter. Set Theory 26 (a) The domain and range of R are defined by dom(i2) := {x : 3y (xRy)} ran(i?) := {y : 3x (xRy)} (b) If A is a set, then the image of A under R is defined by R[A] := {y G ran(i?) -.JxeA {xRy)} (c) If 5 is a set, then the inverse image of B under R is defined by R-^[B] := {x e dom(i?) :3yeB (xRy)} (d) If A is a set, then the restriction of i? to A is the binary relation R \ A defined by R\ A:= {{x,y) e R : x e A} □ If X,Y are sets, then any subset R C X x Y is a relation, with dom(i?) C X, and ran(i?) C y. In that case, we say that i? is a relation from X to Y. If R C X x X, we say that R is a relation on X. For any set X there is a identity relation Idx on X defined by Idx := {{x,x) :xeX} so that X Idx y if and only if x,y e X and x = y. (Thus Idx is just the relation of equality, restricted to the set X.) Exercise 2.2.3 Let i? be a binary relation. (a) Show that if {x, y) € i?, then x, y G |J |J i?. Conclude that that dom(i?) and ran(i?) are sets. (b) Show that R C dom(i?) x ran(i?). (c) Define the inverse R"^ of R by (x, y) e R~^ iff xRy Show that R~^ is a set, and thus a binary relation. (d) If S is another binary relation, define the composition S o R of R with S by {x,z) €SoR iff 3y {xRy A ySz) Show that o is a set, and thus a binary relation. (e) Show that composition of binary relations is associative: To{SoR) = {ToS)oR 27 Relations and Functions (f) Consider the relation < on the set of natural numbers. What is < ^1 Show that < o <=<. (g) Show that if dom(i?) = X, ran(i?) C Y, then Idy o R = R = Roldx- □ Observe that xS o Rz if and only if there is y such that xRySz — the order of i?, S is reversed! (This is why some texts write Ro S where we write S o R. We have reasons for preferring the current version, however, as will become apparent in the next section.) Definition 2.2.4 A binary relation i? on a set X is (a) Reflexive if Vx G X (xRx). (b) Symmetric if Vx, y E X {xRy — > yRx). (c) Asymmetric if Vx, y £ X {xRy — -^{yRx)). (d) Antisymmetric if Vx, y E X {xRy A yRx —^x = y). (e) Transitive if Vx, y,z E X {xRy A yRz — >■ xRz). (f) Connected if Vx, y E X {xRy V x = y V yRx). □ Exercise 2.2.5 Suppose that i? is a binary relation on a set X. (a) Show that -R is reflexive if and only if Idx ^ ^• (b) Show that R is symmetric if and only if R~^ C R. (c) Show that R is asymmetric if and only if i? fl = 0. (d) Show that R is transitive if and only if R o R C R. (e) Show that R is connected if and only if i? U R~^ U Idx = X x X. □ 2.3 Functions Informally, a function / can be thought to induce a binary relation between elements x,y, namely: x and y are related if /(x) = y. But not every binary relation is a function, for functions arc single-valued, i.e. if /(x) = y and /(x) = z, then we must have y = z. This leads to the following definition: Definition 2.3.1 A function (or map) / is a binary relation with the property that if (x, y) G / and (x, z) £ f, then y = z. We may write /(x) = y instead of (x, y) G / or x / y. Set Theory 28 □ When / is a function with dom(/) = X and ran(/) C Y, we indicate this by writing / : X — 7- y. We may also write x ^ y to mean that f{x) = y (when / is clear from context). If X, Y are sets, then the collection of all functions f : X ^ Y is denoted by Y-^ . (Some authors use ^Y.) Exercise 2.3.2 Show that if X, Y are sets, then X^ is a set. Observe that the identity relation Idx on a set X is a function, with Idx : X ^ X : x x. Obviously, it will be referred to as the identity function as well. Since a function is a binary relation, the notions of domain, range, inverse, image, inverse image, restriction and composition make sense. In particular any function / will possess an inverse relation f~^. However, /^^ need not be a function. Definition 2.3.3 Let f : X ^ Y he a function. (a) If the binary relation f~^ is a function f :Y X, i.e. if f~^ is a function and dom(/~^) = Y, then we say that / is invertible. (b) We say f : X ^ Y is injective (or one-to-one) if Vx, x' G X f{x) = f{x') — >■ a; = x'). We may write f : X ^ Y (or f : X ^ Y) to assert that / is an injection. (c) We say that f : X Y is surjective (or onto) if Y = ran(/), i.e. if Vy G 3a; € X {f{x) = y). Wc may write f : X ^Y to assert that / is a surjcction. (d) We say that f : X Y is a bijective if / is both injective and surjective. We may write / : X >— » y to assert that / is a bijection. Exercise 2.3.4 Let / : X — > y be a function. (a) Show that a composition of in/sur/bijcctions is an in/sur/bijection. (b) Show that the relation f~^ is a function if and only if / is injective. (c) Show that if / is invertible, then / o f~^ = Idy, and o / = Idx- (d) Say that a function g : Y ^ X is a left inverse of / if ^ o / = Idx- Show that / has a left inverse if and only if it is an injection. (e) Say that a function h : Y ^ X is a right inverse of / if and only if f o h = Idy. Using AC, show that a function has a right inverse if and only if it is a surjection. (f) Show that if / has both a left inverse g and a right inverse h, then / is invertible, and □ □ 9 = h = f -1 (g) Show that / is invertible if and only if it is a bijection. □ 29 Relations and Functions Suppose that X is a sets, and that F : I ^ X : i Xi is a, surjection. Then X = {F{i) ■.iel} = {xi:iel} The function F is called an indexing of the set X. The set / is called the index set, each i G I is called an index, and F{i) = xi is called the i*^ term. Note that any set can be indexed by some set. For example, a set X can be indexed by itself: Let I = X, and let Xi = i. Later on, however, we shall often use ordinals, a kind of transfinite number, to index sets. If X = {xi : i e I}, we will write [jx=ijx, n^=n^^ etc. Exercise 2.3.5 Let / : X — >■ y be a function. Show that the inverse image f~^[-] is a function /-![.] : p(y) ^ : B ^ f-^[B] := {x e X : f{x) G B} which preserves the basic set-theoretic operations U'D'"' (^-S- show that /~^[Uie/^] ~ Uielf'^i^il for any family Yi G V{Y).) □ Associated with any product X xY are associated projections itx-,t^y defined by -Kx X xY ^ X : {x,y) ^ X Try ■ X x Y ^ Y : {x,y) y Exercise 2.3.6 Suppose that fx'-A^X,fY'-A^Y are functions with common domain A. Show that there is a unique function f : A ^ XxY such that fx = T^x^f and /y = nyof- [Hint: Let f C. A x {X x Y) consist of all ordered pairs (a, {fx{o), fyio))) where a ^ A. □ 2.4 Equivalence Relations Recall that any set X carries an identity relation Idx defined by Idx := {{x,x) : X G X} which is just the relation of equality, restricted to the set X. The relation of equality is a class relation which satsifies following properties: • = is reflexive: x = x. • = is symmetric, li x = y, then y = x. • = is transitive: li x = y and y = z, then x = z. A relation which satisfies these same properties is called an equivalence relation. To be precise: Set Theory 30 Definition 2.4.1 A binary relation R on a set X is an equivalence relation if and only if (i) R is reflexive, i.e. \/x G X [xRx). (ii) R is symmetric, i.e. Vx,y G X — t- yRx). (iii) i? is transitive, i.e. \/x,y,z £ X (xRy A yi?2; — t- xRz). □ It is easy to see that if n is a non-zero integer, then the relation =„ of equivalence modul^ n is an equivalence relation on the set Z of integers. An equivalence relation R can be thought of as a generalization of the identity relation: If xRy, then x, y are in a certain sense identified, i.e. regarded as "equal" for the purpose at hand. When we collect together all the elements which are made "equal" we get equivalence classes: Definition 2.4.2 If R is an equivalence relation on a set X and x £ X, then the equivalence class of X modulo R is the set defined by: [x\r ■.= {y£X: xRy} □ Some authors may write x/R instead of [x]ji. The following device yields a fruitful way of looking at equivalence relations: Definition 2.4.3 A family V of non-empty sets is called a partition of a set X if and only if: (i) [j{P:PeV} = X. (ii) Any two distinct members oiV are disjoint, i.e. P,Q £ V and P ^ Q implies PCiQ = 0. □ The members P €z V are sometimes called the blocks of the partition. Observe that (i) of the above definition states that each x £ X belongs to at least one block, whereas (ii) states that no x £ X belongs to more than one block. Thus each x £ X belongs to exactly one block. Exercise 2.4.4 (a) Every equivalence relation on X induces a partition of X: Let R be an equivalence relation on a set X. Define a family of sets Vr by Vr := {[x]r -.xeX} Show that Vr is a partition of X. ^Recall that a =„ b if and only if 6 — a is divisible by n. 31 Relations and Functions (b) Every partition of a set X induces an equivalence relation on X: Let 7^ be a partition of X. Define a binary relation R^p on X by xR-py if and only if {x e P ^y e P) (i.e. X R-p y if and only if x, y belong to the same block of V.) Show that R-p is an equivalence relation. Further show that the blocks of V are precisely the equivalence classes of R-p, i.e. that [x]ii^ = P if and only if x G P. (c) Show that the constructions in (a), (b) above are inverses of each other, in the following sense: Starting from an equivalence relation P, the construction in (a) yields a partition Vr. If we then apply the construction in (b) to the partition Vr, we get an equivalence relation R-p^. This relation is precisely the original equivalence relation P, i.e. Pp^ = P. Similarly, Vr-p = V. □ If P is an equivalence relation on a set X, then the set of equivalence classes — which we above denoted by Vr — is usually denoted by X/ P. The equivalence class of an element x — above denoted by [x\r — is often denoted by x/R. Here is another reason why equivalence relations are ubiquitous in mathematics: Exercise 2.4.5 (a) Every function induces an equivalence relation on its domain: Suppose that / : X — > y is a function. Define a binary relation P on X by xRy iff fix) = f{y) The relation P is called the kernel of /, and denoted P = ker/. Show that ker/ is an equivalence relation. (b) Assuming the Axiom of Choice, every equivalence relation on a set X is induced by a function with domain X: Show that if P is an equivalence relation, then there is a function / with domain X such that P = ker /. □ The following exercise gives two equivalents of AC: Exercise 2.4.6 Show that the following are equivalent: (a) AC (b) For every partition V there is a set C which has precisely one element in common with each block P gV. (c) Every surjection has a right inverse. □ Set Theory 32 2.5 Order Relations Definition 2.5.1 (a) A binary relation < on a set X is said to be a partial order relation if < is: (i) reflexive, i.e. x < x, (ii) antisymmetric, i.e. x < y and y < x imply x = y, and (iii) transitive, i.e. x < y and y < z imply x < z. (b) A binary relation < on a set X is said to be a strict partial order relation if < is (i) asymmetric, i.e. x < y implies y ^ x. (ii) transitive, i.e. x < y and y < z imply x < z. (c) A total ordering is a connected partial ordering. A strict total ordering is a connected strict partial ordering. A (strictly) totally ordered set is also called a chain. □ A partially ordered set is a pair {X, <) where < is a partial ordering on X. Observe that Idx is a partial ordering on X, and that is a strict partial ordering on X. Given a partial ordering <, we can define a strict partial ordering < by X < y if and only if x < y A x ^ y i.e. < := < - Idx- Conversely, given a strict partial ordering <, we can define a partial ordering < by X < y if and only if x < y V x = y i.e. < := < u Idx- Moreover, < is total if and only if < is total. Definition 2.5.2 Let < be a partial ordering on X. (a) Let Y C X. An element yo e Y is said to be a <-minimal element of Y if there is no y G Y such that y < yo- We will refer to yo simply as a minimal element of y if < is clear from context. (b) Let y C X. An element yo eY is said to be the <-minimum element of Y if and only if Vy G F (yo < v)- We will refer to yo simply as the minimum element of y, or the least element of y, if < is clear from context. (c) The partial ordering < is said to be well-founded if every non-empty subset of X has an <— minimal element. (d) A total ordering < is called a well-ordering if every non-empty subset of X has a <- minimum element. □ 33 Relations and Functions Exercise 2.5.3 Let {X, <) be a partial ordering, and let Y C X). (a) Every minimum element of y is a minimal element. The converse may be false. (b) A subset of a partial ordering can have at most one minimum element. (c) If (X, <) is a chain, then every minimal element of Y is also a minimum element of Y. □ Definition 2.5.4 Let {A, <) and {B, ^) be partially ordered sets. (a) A function f : B is order-preserving if and only if x<y ^ f{x) ^ f{y) (b) {A, <) and (B, ^) are said to be order-isomorphic if there is a bijection f : A ^ B with the property that both /, ^ are order preserving. Such an / is called an order- isomorphism. □ Exercise 2.5.5 (a) Let <) and (B,^) be partially ordered sets. Show that a bijection f : A ^ B is an order-isomorphism if and only x<y ^ f{x) < f{y) (b) Give an example to show that not every order-preserving bijection is an order-isomorphism. (c) Show that {A, <) and {B, :<) are chains, then any order-preserving bijection is an order- isomorphism. (d) Show that if X is a set and X is a family of subsets of X, then the inclusion relation C is a partial ordering on X, i.e. {X, C) is a partial ordering. (e) Let {X, <) be a partial ordered set. For x e X, define I X := {y e X : y < x} and X := {], x : x & X} Show that {X, <) and (X, C) are order-isomorphic. (Thus, in some sense, C is the only partial order relation that exists, up to isomorphism.) □ Set Theory 34 Chapter 3 Natural Numbers God created the natural numbers; everything else is the work of man. — attributed to Leopold Kronecker It seems that Kronecker may have almost literally believed this. When Lindemann proved in 1882 that vr is trancendental (i.e. not the root of a polynomial with rational coefficients), Kronecker opined that the proof was indeed beautiful, but since irrational numbers do not exist, Lindemann's proof had no meaning. Certainly, Kronecker was the first and most vociferous critic of Cantor's transfinite sets[^ In 1889, Giuseppe Peano listed five axioms with the intention of providing a rigorous foundation for the natural numbers. The Peano Postulates are: I. is a natural number. II. Every natural number has a successor s(n) which is also a natural number. (The intention is that s(n) = n + 1.) III. is not the successor of any natural number. IV. If two natural numbers have the same successor, then they are identical, i.e. if s(n) = s{m), then n = m. V. If X is a set of natural numbers such that: (i) G X, (ii) If n £ X, then s{n) £ X, then every natural number belongs to X. (This is the Principle of Mathematical Induction.) The informal idea is that the set of natural numbers is the set {0, s(0), s{s{0)), s{s{s{0))), ...} so that every natural number is obtained from by finitely many applications of the successor function s(-). The point of this chapter is to create the natural numbers: We shall construct an object which satisfies the Peano Postulates inside the set-theoretic universe. ^Doron Zeilberger is a well-known modern mathematician who holds similar opinions and publishes them regularly. They're worth reading. Go have a look on the www. 35 Set Theory 36 In this chapter, we work mainly with the theory ZF~, which is ZFC without the Axioms of Choice and Foundation. AC will be used occasionally, but its use will in all cases be stated expplicitly. 3.1 The Set lu of Natural Numbers Let us briefly recall some definitions made when we introduced the Axiom of Infinity: We defined the successor function S{-) on sets by S{x) := xU {x} (Some authors write x~^ instead of S{x).) We defined a set X to be inductive if (i) e X, and (ii) X is closed under S{-): U x £ X, then S{x) £ X. The Axiom of Infinity is the statement There exists an inductive set Now let I be the class of all inductive sets: I := {x : Ind(j;)} where Ind(x) = £ x A {y £ x ^ S{y) £ x) By the Axiom of Infinity, I is non-empty. It follows from Exercise 1.4.7 that P|I is a set. (Indeed, if A is any inductive set, then f]I = {x £ A : Ind(x)} is a set by Separation.) We will give this set a name: and call u the set of natural numbers. Another commonly used symbol for the set of natural numbers is N. It is easy to see that uj is itself an inductive set. It follows that uj is the smallest inductive set: If X is any inductive set, then u <Z x. Exercise 3.1.1 (a) Show that the intersection of a non-empty class of inductive sets is an inductive set. (b) Conclude that uj is the smallest inductive set: If x is any inductive set, then a; C x. □ Since £ u and u is closed under S{-), we see that u must have the following elements (which we simultaneously provide with names): := 1 := 5(0) = {0} 2:= 5(1) = {0,1} 3:= 5(2) = {0,1,2} 4:= 5(3) = {0,1,2,3} = {0,l,2,...,n-l} 37 Natural Numbers Observe that the set n has (informal) n elements. If n S cj, we write n + 1 := S{n) (i.e. n + 1 is another notation for S{n) := nU {n}). The following result is easy: Theorem 3.1.2 (Principle of Mathematical Induction) Suppose that X <^ u has the proper- ties that (i) e X. (n) IfneX, then n+1 e X. Then X = uj. □ Exercise 3.1.3 Prove the Principle of Mathematical Induction. □ Let us investigate the structure of the set uj in greater detail. We want to verify that uj, S satisfy the Peano Postulates. We won't take the simplest route — things are about to get quite technical — but the route we follow is good preparation for the general study of the ordinals. In particular, we will also study the order relation on the natural numbers. 3.2 (cj, g) is a Transitive Well— Ordered Set Above, we saw that 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2} . . . etc. Thus, intuitively, m < n if and only if m G n. In other words, it seems that the G-relation acts as a strict total ordering < on the set of natural numbers. Of course, the set of natural numbers yields the primary example of a well-ordered seQ It is the fact that the natural numbers are well-ordered that makes possible proofs by induction and definitions by recursion, so the importance well- ordering is hard to overstate. In this section, we make precise and verify the above assertions for our interpretation uj of the set of natural numbers. We begin with some definitions: Definition 3.2.1 (a) A set X is transitive if each element of X is also a subset of X: Trans(X) = Vx € X[x C X) (This means that if y G x and x G X, then also y €z X.) (b) An element y S x is said to be an ^-minimal element of x if there is no z G y is such that z G X. Put differently, y G x is an G-minimal element of x if y H x = 0. (c) A set X is well-founde^\i every non-empty subset z C x has a G-minimal element. ^Recall that a total ordering (X, <) is a well-ordering if every non-empty subset of X has a least element. We will investigate well-ordered sets in more detail in the next chapter. ''By the Axiom of Foundation, every non-empty set has an €-minimal element, i.e. every set is well-founded. But we do not want to use this axiom here if we can avoid it. Set Theory 38 □ Lemma 3.2.2 (a) If x is transitive, then so is S{x). (h) If X is well-founded, then x ^ x. (c) If X is transitive and well-founded, then so is S{x). (d) If x,y are sets such that (i): x is transitive and well-founded, and (ii): S{x) = S{y), then X = y. Proof: (a) If y G S{x) = x U {x}, then either y G x or y = x. Since x is transitive, y C x in either case, and thus y C S{x) also. (b) {x} is a non-empty subset of x, and thus has a S-minimal element, i.e. there is y € {x} so that y n {x} = 0. But then x = y, so x (1 {x} = 0, contradicting the fact that X G X n {x}. (c) We have already seen that S{x) is transitive if x is. Now suppose further that x is well-founded, and that y Q S{x) is non-empty. We distinguish two cases: Case (i): If y fix is non-empty, then since x is well-founded, yDx has an S-minimal element z, i.e. there there is z € y D x such that z Ciy D x = 0. We claim that z is an G-minimal element of y also. For suppose that there is w £ z Dy. Then certainly w G z. But as z G x and X is transitive, we must have w G x as well. But then w G z D y D x — contradiction, because z Ci y Ci x = 0. Case (ii): If y D x = 0, then y = {x}. But since we know that x x, we see that x is a E-minimal element of y. (d) Suppose X ^ y. Then x G y and y G x. Since x is transitive, x G x. But since x is well-founded, x x — contradiction. H Lemma 3.2.3 (a) If X is inductive, then {x G X : x CI X} is inductive also. Hence uj is transitive. (h) If X is inductive, then {x G X : x is transitive and well-founded} is inductive also. Hence every element n G u is transitive and well-founded. (c) UJ is well-founded. (d) If X is inductive, then {x G X : x = V 3y G X (x = S{y))} is inductive also. Hence every non-0 element of oj is the successor of some element ofuj. Proof: (a) Let y := {x € X : x C X}. Then gY. Now suppose that x gY. Then x G X, so S{x) G X, because X is inductive. Further, x C X, and hence S{x) C X. Since S{x) G X and S{x) C X, we see that S{x) G Y. Hence Y is inductive. It follows that {n G a; : n C w} is an inductive subset of co. Since uj is the smallest inductive set, we see that n C a; holds for every n G oj. Thus oj is transitive. (b) Let Y = {x G X : X is transitive and well-founded}. Certainly gY. Now if x G y, then ^(x) G X (because X is inductive). Moreover, ^(x) is transitive and well-founded (by Lemma 3.2.2). Hence S{x) G Y, and thus Y is inductive. 39 Natural Numbers It follows that {n £ uj : n is transitive and well-founded} is an inductive subset of w, and thus that it is equal to w. (c) Let X C a; be a non-empty set, and let n £ X. If nOX = 0, then n is an G-minimal element of X. Else n H X is a non-empty subset of the well-founded set n, and therefore nf] X has an G~minimal element m. We claim that m is an G-minimal element of X also. Indeed, since m G n and n is transitive, we have m <^ n, and hence mriX = nirinr]X = 0. (d) Let Y := {x G X : X = V 3y £ X {x = S{y))}, so that a non-empty element of X belongs to Y if and only if it is the successor of some element of X. Then clearly G Y. Moreover, if x £ Y, then clearly S{x) is the successor of some element of X, and hence S{x) G Y also. As above, we see that {n G a; : n = V 3m E w (n = S{m))} is an inductive subset of u, and hence is equal to u. We can now see that uj, S do indeed satisfy Peano's axioms: I. £ CO, because u is inductive. II. If n G w, then S{n) G to, because u is inductive. III. / S{n) for any n, because is the empty set, whereas n £ S{n). IV. If S{n) = S{m), then n = m, by Lemmas 3.2.2[ d) and 3.2.3| ^b). V. The Principle of Mathematical Induction holds, by Theorem [3.1.2 Now let's investigate the order relation on the set of natural numbers. Define a binary relation < on a; as follows: m < n if and only if m £ n As usual, "n < m" means ^'n = m V n < m". Observe that 0<1<2<3<..., because 0g1g2g3g Theorem 3.2.4 (oj, £) is a transitive strict well-ordered set. □ We already know that cj is a transitive well-founded set, and that the same is true for any n £ uj. The proofs of the remaining assertions of this theorem are given in the following exercise: Exercise 3.2.5 (a) Show that < n for all n £ uj. [Hint: Let X := {n £ uj : < n} and show that X is an inductive subset of uj.] (b) Show that for all m,n £ uj we have m < n + 1 if and only if m < n. (c) Show that for all m,n £ uj, we have m < n if and only if m + 1 < n. [Hint: Let X := {n £ uj : Vm £ uj (m < n — )• m + 1 < n)}.] (d) Show that the relation < is transitive: li k < m and m < n, then k < m. Set Theory 40 (e) Show that the relation < is asymmetric: li n < m, then m ^ n. (f ) Show that the relation < is connected: For any n, m, either n < m oi n = m oi m < n. [Hint: Show, with the aid of (a)-(c) that the set {n e oj : Vm e oj {n < m V n = m V m < n} is inductive.] (g) Show that if X C a; is non-empty, then X has a <-least element. □ We thus see that w is well-ordered by the G-relation. Furthermore, if n G w, then n C w, since cj is transitive. Since every subset of a well-ordered set is also well-ordered, we see that each n G w is also well-ordered by G. Thus if either x e oj ot x = oj, then x has the following properties: (i) X is transitive. (ii) The G-relation well-orders x. This means that every n E u, as well as oj itself, is an ordinal: An ordinal is a transitive set which is well-ordered by the G-relation. So we have ordinals 0, 1,2,3, ...,a; But now it follows easily that ^(t^'), SS{u}), SSS{lu), . . . are transitive and well-ordered by G also. This yields transfinite ordinals u + l,uj + 2,uj + 3, . . . We will examine ordinals more closely in the next chapter, where we shall see that every well-ordering is order-isomorphic to a unique ordinal. We remark that assuming the Axiom of Foundation, the G-relation is a well-ordering as soon as it is a linear ordering. However, we have not used Foundation in this chapter. 3.3 Recursion In practically all branches of mathematics there are instances of definitions by induction. Typically, one is given an initial object xq and then a rule to transform a given object Xn into a new object Xn+i- The following theorem shows that our definition of the natural numbers allows for such definitions: Theorem 3.3.1 (Recursion Theorem — Version I) Let f : X X be a function, and let xq E X. Then there exists a unique function g : uj ^ X with the properties that 3(0) = xo g{n + 1) = f{g{n)) (*) □ The idea is that ^(0) = xo, xi = g{l) = f{g{0)) = f{xo), X2 = g(2) = f{g{l)) = f{xi), etc. The proof of the above theorem is given by the next exercise: 41 Natural Numbers Exercise 3.3.2 (a) First prove uniqueness: If gi : uj ^ X and , g2 '■ uj ^ X are two functions which both satisfy (*) of Theorem 3.3.1, then gi = g2- (b) Let be the set of ah subsets of F C x X with the following properties: (i) (0,xo) G F; (ii) If (n, x) G F, then (n + 1, f{x)) G F. Let (7 := f^J-- Explain why g is a set, and why g £ J^. (c) Since g is a set of ordered pairs it is a binary relation. Show that dom(g) = oo. (d) It remains to show that the relation g is a function, i.e. that if (n,x), (n, y) G g, then X = y. Define S := {n £ oj : Vx, y £ X {{n, x) £ g A {n, y) £ g ^ x = y)} We must show that S = oj. First show that £ S. [Hint: Suppose that (0, y) £ g for some y ^ xq. Show that G — {(0, y)} G J^.] (e) Finally, show that S = u. [Hint: Show that S is inductive. If n G 5, there is a unique x such that (n, x) G G, and hence certainly (n + 1, /(x)) G G. Suppose that {n + l,y) £ G for some y / x. Show that G-{{n + l,y)}£T] □ In some definitions by induction, one needs not just the current value Xn to define Xn+i, but also earlier values Xm for m < n. For example, the Fibonacci sequence is defined by Xq = Xi = 1 Xn+l = Xn-l + Xn for n > 1 In general, we may want to define function of xq, xi, . . . , x^j. The way to do this is as follows: Given a set X, let X^'^ := {h : h is a function with dom(/i) G uj and ran(/i) C X} It is not hard to see that X^^ is a set, as each h £ X^^ is a subset oi u x X. If /i G X^^ has dom(/i) = n, and h{k) = £ X for k < n, then we indicate this state of affairs by writing h = {ho, . . . ,/i„_i] i.e. we may think of an element of X'^^ as being finite sequence of elements of X. This is merely a temporary notation to make the discussion that follows easier to understand. Note that the empty sequence | ] := G X'^^. Now let / : X^^ X he a function. Operating on an intuitive level, we can inductively define a sequence 2^0 = /(ID Xi = f{lxoj) X2 = /([xo,xiI) . . . X„+i = /(|xo,xi . . . ,x„]) If we (still operating intuitively) define a function g : uj ^ X : n ^ Xn, then we have g(0)=xo = /([l) = /(<7fO) Set Theory 42 because g \ = = I}. Next, <7(l) = xi = /(Ixol) = /(gri) because 5 f 1 is the finite sequence 1(7(0)] = |xol- Then, 5(2) = X2 = /([xo,xil) = /(5 r2) because g \ 2 = lg{0),g{l)} = [xcxi], etc. The fohowing theorem states that the above construction can be accomphshed within the confines of ZF~ : Theorem 3.3.3 (Recursion Theorem — Version II) For any set X and any function f : X"^^ ^ X there is a unique function g : uj ^ X such that g{n) = f{g\n) = f{{g{d),g[l),...,g{n -1)1) for all n ^ uj Proof: Uniqueness of g follows easily by induction. To prove that g exists, we use Theorem 3.3. 1| First, define F : X<'^ X<'^ by F{h) := /i U {(dom(/i), /(/i))}. This means that F([xo, . . . , x„_i|) := {xo, . . . , Xn-i, filxo, • • • , a;n-il)l By Theorem 3.3.1 there is a unique G : uj ^ X^^ such that G(0) = I1 G(n + l) = F(G(n)) Thus G(n + 1) = G{n) U {(n, f{G{n)))}. Observe that each G{n) is a function G{n) : n — t- X, and that G{n + 1) f n = G{n). Now let g := Uran(G) = [j^^^ G{n). Then g : co ^ X, and g \ n = G{n) for all n £ uj. Hence g{n) = G{n + l)(n) = f{G{n)) = f{g \ n), as required. Now that we've constructed the natural numbers, it is possible to define the operations of addition and multiplication on the set of natural numbers by recursion. However, we will leave this till a later chapter, where we shall define these operations on the class of all ordinals. We can also: • Construct the integers Z as a set of equivalence classes of ordered pairs of natural numbers. • Construct the rational numbers Q as a set of equivalence classes of integers. • Construct the real numbers M, as the set of Dedekind cuts {L, U), where L, U are non- empty subsets of Q with the property that U has no minimum element, and for all / S L and u G U we have / < u. • Construct the complex numbers C as the set of ordered pairs of real numbers. Of course, we must also define all the usual operations and relations on these sets as well. All this can be done, but we won't pursue it in this course. Try Edmund Landau's Foundations of Analysis for a mercilessly rigorous approach. 43 Natural Numbers 3.4 Finite and Infinite Sets Observe that if n G u, then n = {m : m < n}, because < is G. Further observe that if m,n G u, then m < n if and only if m C n, because G is transitive on the set of natural numbers. Definition 3.4.1 Suppose that n G A set X is said to have n elements if and only if there is a bijection from X to the set n. In that case we say that the cardinality of X is n, and write |X| = n. A set X is said to be finite if there is n G a; such that \X\ = n. A set is said to be infinite if and only if it is not finite. □ Thus a set X has n elements if it can be indexed by the set n, i.e. can be written X := {xi : i G n}. Since the G is <, this can also be written as X := {xi : i < n}. This state of affairs may also be indicated by writing X := {xq, xi, . . . , Xn-i}- Clearly each n G a; is a finite set, and |n| = n. Tlieorem 3.4.2 (a) The notion of "number of elements of a finite set" is well-defined: If \X\ = n and \X\ = m, then n = m. (h) A subset of a finite set is finite: If X is finite, and y C X. Then Y is finite. In addition, \Y\ < \X\. (c) The image of a finite set is finite: If X is finite and f is a function, then f[X] is finite. In addition \f[X]\<\X\. (d) The union of a finite family of finite sets is finite. (e) The power set of a finite set is finite. (f) A set X is infinite if and only if for every n £ oj there is an injection f : n ^ X. (g) (Using AC) A set X is infinite if and only if there is an injection f : oj ^ X . □ The proofs of the above assertions are contained in the exercises that follow. First, how- ever, we state and prove a Lemma: Lemma 3.4.3 Let n £ uj and let X be a proper subset of n. Then there is no bijection f : X n. Proof: Let m G a; be least such that there a bijection / : m X onto a proper subset X ^ m. Then m > (because = has no proper subsets), and hence by Lemma 3.2.3 'd) there \s n £ uj such that m = n + 1. Now consider two cases: Case 1: If n G X, there is k < n such that f{k) = n. Let g : n + 1 n + 1 be the bijection which has g{k) = n, g{n) = k, and which has g{m) = m for all m ^ {k,n}. (Thus g is the permutation which swaps k and n, and leaves all other elements of n + 1 unmoved.) Then f '■= f ° 9 ^ bijection f : n + 1 >^ X with f{n) = n. It is easy to see that / \ n maps n Set Theory 44 onto X — {n}, a proper subset of n. But this contradicts the minimahty of m = n + 1. Case 2: li n ^ X, then X C n. It is now easy to see that / \ n maps n onto X — a proper subset of n — again contradicting the minimahty of m. -\ Exercise 3.4.4 (a) Prove Theorem 3.4.2[ a): Show that if \X\ = n and \X\ = m, then n = m. [Hint: First show that if n 7^ m, then either n ^ m or else m ^ n. Now use Lemma 3.4.3 (b) Show (without using the Axiom of Choice) that the foUowing are equivalent for n,m £ ui: (i) n < m. (ii) There is an injection n ^ m. (iii) There is a surjection m ^ n. [Hint: (ii) implies (i): If / : n ^ m and m < n, then / is a bijection from n onto ran/ ^ n. (iii) implies (ii): If g : m ^ n, one can define a right inverse f : n ^ m without using AC] (c) Suppose that X, Y are finite sets. Show (without using AC) that the following are equiv- alent: (i) 1^1 < \y\- (ii) There is an injection f : X Y. (iii) There is a surjection g lY ^ X. [Hint: Use (b).] (d) Prove Theorem 3.4.2K b): Show that if X is finite and Y C X, then Y is finite, and \Y\ < \X\. (e) Prove Theorem 3.4.2 c): Suppose that X is a finite set, and that / is a function. Show that f[X]\ is a finite set, and that |/[^]| < |^|. □ Exercise 3.4.5 (a) Suppose that X, Y are finite sets. Show that X UY is finite. [Hint: Recall that we haven't defined addition on the natural numbers, only the successor function. Suppose that m is least for which there are finite sets X, Y with |X| = m and such that X UY infinite. Show that m = n + 1 for some n. Let x €z X, and consider X' ■.= X - {x} and Y' ■.= YU {x}.] (b) Prove Theorem [3A2](d) : Show that if X = {Xq, Xi, . . . , Xn-i} is a finite family of finite sets, then |J = |Jfc=o -^k is finite. (c) Prove Theorem |3.4.2[ e): Show that if X is finite, then V{X) is finite. [Hint: Induction on \X\. Observe that if x G X, then V{X) = XVjy, where X := {Y (1 X ■ X ^Y] a.ndy := {Y X : X e Y].] 45 Natural Numbers □ Exercise 3.4.6 Prove Theorem 3.4.2[ f): Show that a set X is infinite if and only if there is an injection f : n ^ X for every n £ u. Conclude that every inductive set is infinite. □ A set X is said to be Dedekind infinite if there is a bijection from X onto a proper subset of X, and Dedekind finite otherwise. Exercise 3.4.7 (a) Show that every finite set is Dedekind finite, and that every Dedekind infinite set is infinite. (b) Prove (without using AC) that a set X is Dedekind infinite if and only if there is an injection uj X. [Hint: Suppose that Y X is a proper subset of X, and that f : X Y is an injection. Let xq £ X — Y, and define g : u) ^ X hy g{0) = xo g{n + l) = figin)) If g : uj ^ X, consider f : X ^ X hy fix) where S is the successor function. X if X ^ ian{g) g{S{g'^{x))) ifxGran(5) (c) Prove (with the aid of AC) that a set X is (in)finite if and only if it is Dedekind (in)finite. Conclude that Theorem 3.4.2[ g) holds. [Hint: Suppose that X is the set of all non-empty subsets of X, and that F : X ^ [JX is a choice function. Consider g : uj ^ X defined by g{n) := F{X — ran(g \ n)) and use Theorem [3X2t f).] □ Set Theory 46 Chapter 4 Ordinals 4.1 Well— Orderings Let (X, <) be a totally ordered set. If x £ X, then we define the initial segment given by x by: X{x) := {y £ X : y < x} Thus X{x) is the set of all elements that are strictly below x. Clearly X{x) is a totally ordered set also (with the relation <f X(x) x X{x) inherited from X). Lemma 4.1.1 Suppose that (X, <) is a total ordering. Define X := {X{x) : x G X}. Then the map f : X ^ X : x ^ X{x) is an order-isomorphism from (X, <) to {X ^ C). Proof: It suffices to show that x < y if and only if X{x) C X{y). It is obvious from the transitivity of < that X[x) C X{y) whenever x < y. Conversely, suppose that X{x) C X{y). If X ^ y, then y < x, so y G X{x), and hence y G X(y) — contradiction. H When two partial orderings {X, <) and (Y, -<) are order-isomorphic, we say that they have the same order-type. Recall that a total ordering (W, <) is called a well-ordering if every non-empty subset of X has a <-least element. Clearly, the set of natural numbers, together with its usual ordering, is a well-ordering. The following lemma gives two methods for constructing well-orderings on products. It will prove useful when we discuss the arithmetic of ordinal and cardinal numbers. Lemma 4.1.2 (a) Suppose that {X, <) and {Y, -<) are strict partially (totally, well-) ordered sets. Define the lexicographic ordering C on X x Y by (xi,yi) C (x2,y2) xi < X2 V (xi = X2 Ayi-<y2) Then \Z is a strict partial(totally, well-) ordering on X x Y . ^Lexicography: The editing or making of a dictionary. In a dictionary, words are ordered lexicographically. To decide which word occurs first in the dictionary, you look at their first letters. If these are the same, you look at their second letters. . . 47 Set Theory 48 (b) Suppose that {X,<) is a strict well-ordered set. Define the canonical ordering C on the product X X X by max{x,y} < max{x',y'} max{x, y} = max{x', y'} and x < x max{x, y} = maxja;', y'}, x = x\ and y < y' Then \Z is a strict well-ordering on X x X . {x,y) C {x',y') or or □ Exercise 4.1.3 Consider (w, <), the set of natural numbers equipped with its usual order relation. (a) Write down the first ten elements in the lexicographic ordering \Zi of oj x uj. Hence show that (w X w, C;) is a well-ordering, but not order-isomorphic to (w, <). [Hint: Observe that the element (1,0) is not the successor of any {n,m) & uj x oj.] (b) Prove Lemma 4.1.2| ^a). (c) Write down the first ten elements in the canonical ordering Cc of u x uj. Hence show that (a; X UJ, Cc) is order-isomorphic to {uj, <), and thus a well-ordered set. (d) Prove Lemma 4.1.2[ b). □ Exercise 4.1.4 (a) Show that a finite total ordering is a well-ordering. [Hint: You must show that if {X, <) is a finite total ordering, and if y is a non-empty subset of X, then Y has a least element. Do this by induction on \Y\.] (b) Here is a useful characterization of well-ordered sets: A total ordering (X, <) a well- ordering if and only if there is no sequence (xn) of elements of X such that xi)x2 > xs > .... Prove this (using AC). [Hint: One direction is easy. For the other, suppose that Y C X has no least element, and let F be a choice function for family the non-empty subsets of Y. Define g : uj ^ Y recursively by g{n) = F{Y - {g{0), g{l), . . .,g{n- 1)}).] □ It is precisely well-ordering that allows arguments by induction: Theorem 4.1.5 (Induction on a Well-Ordering) Let (W, <) be a well-ordered set. Let E C W be a set with the following properties. (i) The least element ofW is a member of E. (a) For all X G X, ifW{x) C E, then x G E. Then E = W. 49 Ordinals Proof: Else let wo be the least element oi W — E. Then W{wo) C E, so wq £ W — contradiction. H In preparation for the definition of the transfinite ordinals, we will now begin our inves- tigation of the structure of well-ordered sets. Most of the results that follow in this section are due to Cantor. Theorem 4.1.6 Let {W,<) be a well-ordered set. (a) Let f : W ^ W be a strict order-preserving map. Then x < f{x) for all x £ W. (b) Idw is the only order-automorphism from W to W. Proof: (a) Let E := {x £ W : x ^ f{^)}- If E ^ 0, then E has a least element xq. Since xq ^ /(xq), we have /(xq) < xq, and hence /(/(xq)) < f{xo). But then /(xq) € E, contradicting the fact that xq is the least element of E. Hence E = 0. (b) If / is an order-preserving automorphism, then both the maps /, f~^ are strict order- preserving. Hence x < /(x) and /(x) < /^^(/(x)). H Corollary 4.1.7 If (W, <) and {V, :<) are order-is amorphic well-ordered sets, then there is a unique order-isomorphism from W onto V. □ Corollary 4.1.8 No well-ordering (W, <) is order-is amorphic to an initial segment W{wo) of itself. □ Theorem 4.1.9 Let (W, <) and {V, :<) be well-ordered sets. Then exactly one of the following is the case: (a) W,V are order-is amorphic. (b) W is order-isomarphic to an initial segment of V , i.e. there is vq G V such that W is order-is amorphic to V{vo). (c) V is order-isomarphic to an initial segment ofW, i.e. there isw^^W such that W{wq) is order-is amorphic to V . Proof: Define a set of ordered pairs / by / := {{w,v) £ W X V : W{w) is order-isomorphic to V{v)} Observe first that / is a function: For suppose that {w, vi) and {w, V2) both belong to /, where vi / f2- Without loss of generality, we may assume that vi -< V2. Then V{vi) and ^(^2) are order-isomorphic (as each is order-isomorphic to W{w)). But then, if h : ^(^2) — ?• V{vi) is an order-isomorphism, we must have h{vi) -< vi, which contradicts Theorem 4.1.6 Set Theory 50 Next observe that / is a strict order-preserving map: For suppose that {wi,vi), {w2, V2) € / are such that wi < W2- We want to show that vi -< V2- Now there is an order-isomorphism h : W{w2) Viv2), and this (when restricted) yields an order-isomorphism / f W{wi) : W{wi) — >■ V{h{wi)). Hence {wi,h{wi)) G /, so that h{wi) = vi. But as h{wi) G V{v2), we see that vi -< V2, as required. In particular, it follows that / is one-to-one. Observe that dom(/) is downwards-closed: If wi < W2 and W2 G dom(/), then wi G dom(/) as well. Indeed, there isv2&V such that W{w2) is order-isomorphic to V{v2)- Since wi G W{w2), there must be vi G ^(^2) such that W{wi) is order-isomorphic to ^(^1), so that wi G dom(/). In the same way, it can be shown that ran(/) is downwards-closed. Now we distinguish three cases: Case 1: dom/ ^ W: In that case, let wq be the <-lcast element of W — dom/. We claim that / is an order-isomorphism from W{wo) onto V. Now because dom(/) is downwards- closed, there is no w G dom(/) such that u'o < w. Hence dom(/) = W{wo). If / is not sur- jective, let vq be the ^-least element ofV — ran(/). Then clearly W{wo) is order-isomorphic to V{vo) — contradicting the fact that wq ^ dom(/). Hence / is a strictly order-preserving map with dom(/) = W{wo) and ran(/) = V. Case 2: ran/ ^ V: In that case, let vq be the -< -least element of F — ran(/). Arguing as in Case 1, it follows that / is an order-isomorphism from W onto V{vO). Case 3: dom/ = W and ran/ = V. In that case, / is an order-isomorphism of W onto V. H 4.2 Ordinals Recall that the set uj of natural numbers was shown to have the following properties: (i) a; is a transitive set: Vn (ra G a; n C w). (ii) UJ is well-ordered by the G-rclation: The relation < on u defined by n < m <^=^ n £ m is a strict total-ordering on lo, and every non-empty subset of u has a <-least element. Furthermore, we saw that every n E lo has precisely the same properties: Each n G a; is transitive, and well-ordered by G. This suggests a definition, due to John von Neumann: Definition 4.2.1 A set is said to be an ordinal if and only if it is transitive and well-ordered by G. □ Thus UJ is an ordinal, as is each n e uj. Exercise 4.2.2 Recall that n = {m : m < n} for all n G uj. Thus uj{n) = n for all n G w, where uj{n) is the initial segment of uj determined by n. In fact, this property characterizes the ordinals: Show that a strict well-ordering {W, <) is an ordinal if and only if W{x) = x for all X gW. [Hint: (<^): v < w iS v £ W{w) iSvGw. li v £w £W, then v G W{w) C W.] 51 Ordinals □ We shall use the first few letters of the Greek alphabet to denote generic ordinals. Since G is a strict order relation, we shall also write a < (3 when a G (3. Observe that if a is a non-zero ordinal and xq is the first (i.e. the least) element of a, then xq = = 0. For if x G xq, then x G a (by transitivity of a), and x < xq (because < is G), which contradicts the fact that xq is the least element of a. Suppose now that xi is the next element in a, assuming it has one, i.e. xi is the least element in a which is > xq. Then X G xi <^=^> X < xi <;=^ X = Xo and hence xi = {xq} = {0} = 1. Now if X2 is the next element of a (again, assuming there is one), then X G X2 <^=^ X < X2 <^=^ X = Xo V X = Xl and so X2 = {xo, xi} = {0, 1} = 2. It thus appears that each ordinal a starts 0<1<2<3<.... Each of these elements is itself an ordinal. In Remarks 4.2.13, we will explain that the existence of an inductive set is equivalent to the existence of an infinite set. Thus keep the Axiom of Infinity in mind while reading the proofs of the next few propositions and theorems, and observe that these proofs do not rely on this axiom. Proposition 4.2.3 // a is an ordinal and 13 £ a, then /3 is an ordinal. Proof: Suppose that a is an ordinal, and that (3 G a. We first show that f3 is transitive, i.e. that if 7 G (3, then 7 C Now if 7 G /3 and 5 G 7, then (by transitivity of a, we have S,^, 13 G a. Since < is G on a, we have 5 < 7 < /?, and since < is transitive, we have 6 < /3, i.e. (5 G /3. Thus 7 C /3. H Now observe that every ordinal is a transitive wellfounded set: Every non-empty subset of an ordinal has a least element, which must be an G-minimal element. If X, Y are transitive wellfounded sets, then a G-isomorphism from X to y is defined to be a bijective map vr : X ^ Y with the property that Xl G X2 <^=^ vr(xi) G 7r(x2) (xi,X2GX) Note that if a, f3 are ordinals, then an G-isomorphism from ct to /3 is precisely an order- isomorphism. Also observe that if vr : X — )■ y is an G-isomorphism between transitive wellfounded sets, and X G X, then 7r(x) = {vr(x') : x' G x} = 7r[x]: Indeed, if y' G vr(x), then, there is some x' G X such that 7r(x') = y' . It follows that 7r(x') G vr(x), and hence that x' G x. The following proposition will be useful. Proposition 4.2.4 Suppose that X,Y are transitive well-founded sets, and that tt : X ^ Y is an G-isomorphism. Then 7r(x) = x for all x G X . Hence X = Y . Set Theory 52 Proof: If {x £ X : 7r{x) 7^ x} is non-empty, it has a 6-minimal element xq. Now if x £ xq, then also x £ X, and hence 7t{x) = x. Thus 7r(xo) = {vr(x) : x G xq} = {x : x G xq} = xq — contradiction. Hence {x E X : 7r(x) 7^ x} = 0. H Theorem 4.2.5 // a, /3 are ordinals, then exactly one of the following is the case: (i) a = f3, or (a) a £ 13, or (Hi) P £ a. Proof: a, /3 are both transitive sets well-ordered by G, and hence wellfounded. Hence at most one of the above alternatives can be the case. By Theorem 4.1.9|either (i) a, (3 are order-isomorphic, or (ii) a is isomorphic to an initial segment of (3, or (iii) /3 is order-isomorphic to an initial segment of a. Suppose that a is isomorphic to an initial segment of (3, i.e. that there is f3o £ (3 and an order-isomorphism vr : a /3(/3o)- Observe that /3(/3o) := {7 G /3 : 7 < /?o} = {7 G /3 : 7 e /3o} = /3n /3o = /3o because /3q C /3. Thus vr is an G-isomorphism from the transitive wellfounded set a onto the transitive wellfounded set /3o- By Proposition 4.2 A we have a = /3o, and thus a £ f3. The remaining two cases can be dealt with in a similar fashion. H The next corollary follows immediately: Corollary 4.2.6 If a, f3 are ordinals, then a £ f3 if and only if a'^ /3. □ We denote the class of all ordinals by ON := {a : Q is an ordinal} We shall soon see that ON is a proper class, i.e. not a set. For the moment, however, note that ON is a class which is well-ordered by £: By Theorem 4.2.5 G is a strict linear ordering on ON. To see that it is a well-ordering, suppose that Y is a non-empty subclass of ON, and let a G Y. If a H Y = 0, then a is the least element of Y. Else, a H Y is a non-empty subset of the ordinal a, and hence has a least element (3. That /3 is easily seen to be the least element of Y. Thus every non-empty subclass of ON has a least element. Here are some additional facts about ordinals that are easy to establish: Proposition 4.2.7 (a) If a is an ordinal, then a = {{3 : (3 < a}. 53 Ordinals (b) If a, (3 are ordinals, then exactly one of the following is the case: either a is an initial segment of f3, or a = 13, or f3 is an initial segment of a. (c) If C is a non-empty class off ordinals, then {^C is an ordinal. In fact, [^C is the least element of C (i.e. H C S C and H ^ = inf C ). (d) If C is a set of ordinals, then \JC is an ordinal. Furthermore, [jC = supC. (e) If a is an ordinal, so is S{a) := aL){a}. Furthermore, S{a) is the successor of a, i.e. the least ordinal which is > a. □ Exercise 4.2.8 Prove the preceding proposition. □ Since S{a) is the successor of a we write a + 1 := S{a). An ordinal a is said to be a successor ordinal if an only if there is /3 such that a = (3 + 1. An ordinal which is not a successor ordinal is called a limit ordinal. Observe that is a limit ordinal, as is uj. Indeed, uj is the smallest non-zero limit ordinal: For if a < a; is a non-zero ordinal, it is a non- zero natural number, and hence a successor ordinal. Furthermore, the successor of a natural number is a natural number, so ui is not a successor ordinal. Exercise 4.2.9 Suppose that ao < ai < 02 is a strictly increasing sequence of ordinals. Show that sup„^^ a„ is a limit ordinal. □ Exercise 4.2.10 If a is a limit ordinal, then a = sup{/3 : f3 < a}. If a is a limit ordinal, then sup a < a. □ Exercise 4.2.11 Show that the class ON of all ordinals is a proper class, i.e. not a set. [Hint: If ON is a set, then by Proposition 4.2.7, sup ON is an ordinal.] □ The result of the previous exercise — that the collection of ordinals is too big to be a set — is known as the Burali-Forti paradox, after Burali-Forti, who published it in 1897. It is probably the earliest of the paradoxes that set theory had to contend with in its initial years. The Axiom of Replacement has not yet been needed in our development of set theory, but that is about to change. Theorem 4.2.12 Every well-ordered set is isomorphic to a unique ordinal. Set Theory 54 Proof: Let {W, <) be an arbitrary well-ordered set. Define a binary relation F by {w,a) G F w £ W and a G ON and W{w) is order-isomorphic to a. Then by Theorem 4.2.5| -F is a class function. By Theorem |4.1.9 F is injective. Moreover, it is easy to see that F is strictly order-preserving. Next, we show that if w G W, then is an ordinal. (We are not claiming that W{'w) C dom(F).): Firstly, since W{w) is a set, so is F[Vl^(u))], by the Axiom of Replacement. Now 7^ ON, because ON is not a set (by Exercise 4.2.11). If 7 is the least ordinal which is not an element of F[W{w)], then certainly F[VF(t(;)] = {j3 e ON : /3 < 7} = 7. In the same way it can be shown that is an ordinal. Now we demonstrate that dom(F) = W: For else W — dom(F) has a least element w. Then ^ := F[VF(tt;)] is an ordinal, and F : W{w) — )• ^ is a order-preserving bijection, and hence an order-isomorphism. Hence {w, ^) S -F — contradiction. Thus if we define a := F[W], then F is an order-isomorphism from W onto a. Hence every well-ordered set is order-isomorphic to some ordinal. That ordinal must be unique. because no two distinct ordinals are order-isomorphic (by Theorem 4.1.9 and Proposition Remarks 4.2.13 Recall that the Axiom of Infinity asserted: There is an inductive set. The set oj of natural numbers is then the smallest inductive set. A set X was defined to be finite if it can be placed into bijective correspondence with some n £ ui, i.e. if \X\ = n for some n £ uj. A set was said to be infinite if it is not finite. We showed that a set X is infinite if and only if for every n G uj there is an injection n ^ X. We now want to show that, in the theory ZFC 0-5 + Replacement, the existence of an inductive set is equivalent to the existence of an infinite set. First suppose that there exists an inductive set X. It follows that u is a set (namely the intersection of the non-empty class of all inductive sets). li n G to then n C w, so that there is an obvious injection n ^ to. Hence OJ is infinite. This proves that if there is an inductive set, then there is an infinite set. Conversely, suppose that there is an infinite set X. Certainly every natural number is an ordinal. But it is, at this stage, conceivable that ON is the class of all natural numbers, i.e. that u = ON is a proper class. (Obviously, if ON contains some ordinal a > uj, then OJ G ON as well, by transitivity, and hence u would be a set. So if a; is a proper class, then OJ = ON.) Let X := {Y C. X '.Y is finite}. By Power Set and Separation, Af is a set. Define a binary relation F by (Y,n) e F <?=^ n£ujA\Y\=n Then F is clearly a class function (i.e. if {Y,ni) G F and {Y,n2) G F, then m = 712). By the Axiom of Replacement, F[J\!] is a set. We claim that F[X] = uj. If not, there is a least m G OJ such that m F[Af], because oj = ON is a well-ordered class. Now m 7^ 0, since £ X. Thus m = n + 1 for some n £ oj. Hence n G -F[Af], i.e. there is y C X such that \Y\ = n. But since X is infinite, we cannot have y = X, and thus there is x £ X — Y . Then |yu{x}| = n + l, so 71 + 1 G F[X] — contradiction. Thus F[X] = is a set, i.e. an inductive set exists. □ 55 Ordinals Remarks 4.2.14 Observe that since a; is an ordinal, we may inductively define new ordinals CO + n (for n G w) by oj + := iu CO + {n + I) = S{uj + n) The class uU {u + n : n € uj} is clearly transitive an well-ordered by €, and thus ought to be an ordinal: call it a; + The problem is that the Axioms ZFC 0-7 are not strong enough to prove that {a; + n : n G a;} is a set. It is, at this stage, conceivable that u + oj = ON is the class of all ordinals. Consider, however, the formula (p{n, y) which asserts that n € a; and that y = u + n. Clearly ip(n,y) defines a function F, in the sense that if (p{n,y\) and ip{n,y2), then yi = 7/2- By the Axiom of Replacement, F[uj] = {y : 3n E co {(p(n, y)} is a set, i.e. {cj + n : n G w} is a set. Then a; + a; := a; U {a; + n : n G is a set as well. Clearly the ordinal a; + a; is a limit ordinal. □ 4.3 Transfinite Induction and Recursion Induction is a major tool for proving results about natural numbers. Recursion is a major tool for constructing functions whose domain is the set of natural numbers. The aim of this section is to extend these tools to the class ON. The Induction Principle for the natural numbers reads as follows: If X C w is such that (i) G X, and (ii) n e X ^ n + 1 e X, then X = uj. Each non-zero natural number is a successor ordinal. In general, however, there are also limit ordinals that need to be taken care of. Hence: Theorem 4.3.1 (Transfinite Induction) Let X. be a class of ordinals. Suppose that (i) G X; (ii) If a then it follows that a + 1 G X; (Hi) If a is a limit ordinal and ^ G X for all (3 < a, then it follows that a G X. Then X = ON. Proof: If X / ON, there is a least ordinal a such that a X. By (i), a / 0. By (ii), a is not a successor ordinal. By (iii), a is not a limit ordinal — contradiction, as a limit ordinal is, by definition, an ordinal which is not a successor ordinal. H Exercise 4.3.2 Let X be a class of ordinals. Suppose that Va G ON (a C X — >■ a G X). Show that X = ON. □ Set Theory 56 Next, we discuss transfinite recursion. Recall that our second version of the Recursion Theorem for natural numbers read as follows: If / : X^'^ — )■ X is a function (where X"^^ is the set of all finite sequences in X, i.e. the set of all function from a natural number to the set X), then there is a unique function g : oj ^ X such that g{n) = f{g\n) = f{[g{Q),...,g{n-l)}) where |xo, . . • , ic^-i] is the function from the set n := {0, 1, . . . , n — 1} to the set X given by k ^ Xk. In particular, ^(0) = /(0), ^(l) = /([<?(0)1), g{2) = f{\g{Q),g{l)}), etc. Each g{n) is a function of the previously defined values ^(O), . . . , g(n — 1). To extend this result to the transfinite realm, we introduce similar notation and terminol- ogy: If a is an ordinal, then an a-sequence (or transfinite sequence of length q) is a function with domain the set a = : ^ < a}. We write: \x^ : ^ < aj is the function with domain the set a = : ^ < a} given by ^ i-^ Given a sequence s := [x^ : ^ < a] and a set x, we obtain the sequence s U {{a,x)} with domain a + 1, and denote it by {x^ : < aj^x. We also write : G ON] for a class function on ON. Suppose now that -F is a class function, defined on the class V^*^'^ of all transfinite sequence^ i.e. F assigns to each transfinite sequence of sets [x^ : < a] a set -F(|x^ : ^ < a]). We can then create a "sequence" |x^ : ^ € ON] recursively as follows: Xa := F{lx^ : C < a]) so that Xa is defined as a function of previously defined values x^, (^ < a). Thus: Xo:=F{0), xi:=F([xo]), X2 := F([xo, xi]), This shows how to define x„ recursively for all n < a; — and we knew how to do that already. But now we continue: By the Axiom of Replacement, {xn : n < cj} is a set, and hence the function |x„ : n < ujj exists (i.e. is a set). We thus have Xui ■■= F{lxn ■■ n < u)}), x^+1 := F([x„ : n < a;]^X(^), x^+2 := F(|x„ : n < wJ^x^^Xo^+i), In general, if q G ON is such that xg has already been defined for all ^ < a, then by the Axiom of Replacement, the function [x^ : ^ < a] exists (i.e. is a set), and we may define Xa :=F(|x^ :^<a]). In this way we build up a class function G : ON — )• V : a i— )• x^ Theorem 4.3.3 (Transfinite Recursion) Let F be a class function (on VJ. Then there is a unique function G on ON such that G{a) = F{G \ a) for all a G ON (*) Proof: Define a first-order formula ip{a, h) which asserts that a G ON, and /i is a function h := |x^ : < a} with domain a such that For all /3 < a, /i(/3) = F{h \ (3), i.e. x,3 = F{lx^ : ? < /3]) ^We include the finite sequences in the class of transfinite sequences. 57 Ordinals We show that tp defines a class function: If both 'ip(a, hi) and ip{a, hold, then hi = h2, i.e. the function h is unique (assuming it exists). Indeed, suppose that (3q is the least ordinal < a such that hiiPo) / /i2(/3o). Then /ii(/3o) = : ^ < /3ol) = F(I/i2(0 : ? < /^o) = h2{Po) — contradiction. In particular, it follows by the Axiom of Replacement that if hp) holds for all /3 < a, then {hp : /3 < a} is a set. Now let (p{a,x) = 3h [-(/'(a,/!) Ax = F{h)]. If both (p{a,x) and {p{a,y), hold, then X = F(h) = y, where h is the uniqe function such that ip{a,h). Hence ip defines a class function G{a) = X <;=^ ip{a,x) It remains to show that dom(G) = ON, i.e. that G{a) is defined for every a G ON. If not, let ao be the least ordinal such that oq dom(G). Then for each f3 < ao, there is a unique function hp : fi ^ W such that hp). We now consider two cases: Case 1: Suppose that ao = /? + 1 is a successor ordinal. In that case there is hp = \x^ : ^ < j3\ such that tp{f3,hp) holds. If we define xp = F{hp), and then define h := : ^ < /SJ^xp, then clearly V'(q^O) h) holds also, so that oq € dom(G) after all. Case 2: Suppose that ao is a limit ordinal. By the Axiom of Replacement, {hp : f3 < ao} is a set. It is easy to show that if ?? < /? < ao, then h^j = hp fa. It follows that h := IJ/3<ao a function with domain ao. Furthermore, if we define Xp := h{(3) for /3 < ao then h = [xg : < ao], and hp = [x^ ■ S, < P} ^ot all /3 < ao- Clearly, therefore, '4j{aQ,h) holds, and hence ao S dom(G) in this case also. We have now shown that there exists a class function G : ON — )■ V with the required property {k). Suppose that G' is another. If G 7^ G', let a be the smallest ordinal such that G(a) ^ G'{a). Then G f a = G' f a, and hence G(a) = F(G f a) = F{G' \ a) = G'{a) — contradiction. Hence G = G', i.e. there is a uniqe function with the property (★). H 4.4 Ordinal Arithmetic 4.4.1 Limits Definition 4.4.1 (a) A transfinite sequence [7^ : < a] of ordinals is said to be strictly increasing if T] < ^ implies 7,, < 7^ (b) If A > is a limit ordinal, then the limit of a strictly increasing sequence [7^ : ^ < A] of ordinals is defined by hm 75 := sup{7^ : ^ < A} = M 75 c — (This limit exists, by Proposition 4.2.7 (c) A strictly increasing sequence [7^ : ^ < a] (or [7^ : C £ ON]) of ordinals is said to be normal if it is "continuous" , i.e. if for for every limit ordinal A in its domain we have 7a = lim 75 i.e. hm7^ = 71;^^ Set Theory 58 □ Similarly, we say that a class function F : ON — )• ON is normal if it is strictly increasing and continuous. Of course, any sequence is a function. Proposition 4.4.2 Suppose that [7^ : ^ G ON] is a normal sequence of ordinals. (a) 7g > C for all^e ON. (b ) jx is a limit ordinal if X is a limit ordinal. (c) The sequence has arbitrarily large fixed points.- For every a G ON there is fi > a such that = j3. □ Exercise 4.4.3 Prove Proposition 4.4.2 [Hint for (c): Define /3o := 7Q>/3n+i := Iji^^P ■= 1™ l^n] □ 4.4.2 Addition Definition 4.4.4 (Ordinal Addition) We define a class function + : ON x ON — ON by transfinite recursion as follows. Let a £ ON. (i) a + := a. (a) a + (/3 + 1) := (a + /3) + 1 for successor ordinals (3 + 1. (Hi) a + B := limfa + £) for limit ordinals /3 > 0. □ Observe that, for each a G ON, we have defined above a function Sa ■ ON — )• ON : /3 1— )• a+/3. It is easy to see that each Sa{-) is a normal function (i.e. strictly increasing and continuous). It is also easy to see that ordinal addition coincides with the usual addition on the set of natural numbers. Furthermore, ordinal addition is associative: Lemma 4.4.5 For all a,f3,j £ ON, we have a + (/3 + 7) = (a + /3) + 7 Proof: By induction on 7: For all ordinals a, (3 we have a+(/3 + 0) = a + /3 = (a + /3) + Next, if 7 is such that q + (/3 + 7) = (a + /3) + 7 holds for all a, /3, then a + (/3+(7 + l)) = a + ((/3 + 7) + l) = (a + (/3 + 7)) + 1 = ((a + /3)+7) + l = (a + /3) + (7 + l) and hence the result holds for successor ordinals 7 + 1. Finally if 7 is limit, and a + {fi + rf) = (a + /?) + ?? for all r] < ^, then a + (/3 + 7) = a + lim(/3 + r?) = lim(a + {(3 + 77)) = lim((a + f3) + i]) = {a + (3) + j and hence the result holds for limit ordinals. 59 Ordinals H However, ordinal addition fails to be commutative! For example: 1 + a; = lim (1 + n) = a; u + 1 := S(uj) > u n<u! so 1 + UJ ^ UJ + 1. Consider now uj + 1. This is the least ordinal which is > Hence it is order-isomorphic to the set of natural numbers, followed by a new maximum element. Similarly, a; + 2 = (a; + l) + l has another additional element added to the top: It is order- isomorphic to io followed by two new elements. Next, consider oj + oj = lim{uj + n). This is the least ordinal which is > cj + n for every n < uj. n<ui Clearly, therefore, u + oj \s order-isomorphic to w, followed by another copy of u. We make this precise. Suppose that (X, -<x) and {Y, -<y) are disjoint totally ordered sets. We can form their linear sum (Xuy, -<) by gluing X, Y together in such a way that every element of X precedes every element of Y . This means that the linear sum of X, Y is order-isomorphic to an ordering consisting of X followed by Y. Thus we define ^ on X U y as follows: u^v ^ X {JY then u ~< V u,v E X and u -<x v or u, v G y and u -<y v or u e X and v eY Observe that ~< = U -(y U {X xY). If {X,^x), {Y,^y) are totally ordered sets, but X CiY ^ 0, then we can make them disjoint by a standard trick. Let X := {0} x X, and define -<x by {0,x) -<x (0,a;') if and only if X -<x x' . Clearly (X, -<x) and (X, -<^) are order-isomorphic. Similarly, let Y := {1} x Y and define -<y by (l^y) ~<y O-^v') if '^^^Y V u' ■ Then X, Y are disjoint and order isomorphic to X, Y. Observe that the linear sum of X,y is a sort of lexicographic ordering: For i,j G {0, 1} and u, u G X U y we have: (z, u) < {j, v) i < j V {i = j /\u <v) where you have to keep track to which set each < belongs. Proposition 4.4.6 If a, 15 ^ ON, then a + 15 is order-isomorphic to the linear sum of a and 15 (in that order). Proof: By induction on (5. H Exercise 4.4.7 Let a,/3,7 G ON. (a) Show that if /3 < 7, then a + /3 < a + 7. Also find an example where /3 < 7, but /? + a = 7 + a. Set Theory 60 (b) Show that if a < /?, then there is a unique ordinal 7 so that a + 7 = /?. Also find an example where a < j3, but the equation x + a = j3 has many solutions x. [Hint: The set : a < ^ < /3} is well-ordered, so is order-isomorphic to some ordinal 7.] □ 4.4.3 Multiplication Definition 4.4.8 (Ordinal Multiphcation) We define a class function ■ : ON x ON — > ON by trans finite recursion as follows. Let a G ON. (i) a • := 0. (a) a ■ {P + 1) := a ■ P + a for successor ordinals (3 + 1. (Hi) a - fi := lim(a • ^) for limit ordinals P > 0. □ Observe that, for each a G ON, we have defined above a function P„ : ON — t- ON : /? a-/3. It is easy to see that each -Pa(') is a normal function (i.e. strictly increasing and continuous). It is also easy to see that • coincides with the usual multiplication on the set of natural numbers. Furthermore, ordinal multiplication is associative: Lemma 4.4.9 For all a,/3,7 G ON, we have a • (/3 ■ 7) = (a ■ /3) • 7 □ Exercise 4.4.10 Prove the preceding Lemma. □ Observe that 2 ■ u = lim 2 ■ n = Lo uj-2 = u}-{1 + 1)=uj-1 + uj = u + co n<ijj Thus ordinal multiplication also fails to be commutative: 2 ■ u ^ cj ■ 2. Indeed 2-oj looks like w-many copies of the total ordering 2: Each element of oj is replaced by a copy of 2. On the other hand, co ■ 2 looks like 2 copies of co: Each element of 2 is replaced by a copy of UJ. We make this precise: Suppose that {X, -<x) and {Y, -<y) are totally ordered sets. The linear product of these sets is the set X xY equipped with the reverse lexicographical ordering -<, i.e. {x,y) ~< {x',y') y -<y y or y = y' and x -<x x' 61 Ordinals Thus, if we keep y €Y fixed, the subset Xy := {{x,y) : x G X} C X x y is a copy of X. We see furthermore that if y -<y y' , then each element of the copy Xy precedes each element of the copy Xy' in the reverse lexicographic ordering. Thus, roughly, {X xY,^) consists of "F copies of X" . Proposition 4.4.11 If a,P E ON, then a - 13 is order-isomorphic to the linear product of a and P (in that order). Proof: By induction on jS. H Exercise 4.4.12 Let a,^,7 G ON. (a) Show that a • (/? + 7) = a • /3 + a • 7. Also find an example where (a + /3) • 7 7^ a • 7 + /3 • 7. (b) Show that if /3 < 7 and a > 0, then a • /3 < ct ■ 7. Also find an example where /3 < 7 and a > 0, but /3 • a = 7 • a. □ 4.4.4 Exponentiation Definition 4.4.13 (Ordinal Exponentiation) We define a class function ON x ON — >■ ON by transfinite recursion as follows. Let a G ON. (i) ftO := 1. (ii) a^'^^ := ■ a for successor ordinals /3 + 1. (Hi) = lim for limit ordinals (3 > 0. □ Observe that, for each a G ON, we have defined above a function : ON — )• ON : P a^. It is easy to see that each Ea{-) is a normal function (i.e. strictly increasing and continuous). It is also easy to see that this function coincides with the usual exponentiation on the set of natural numbers. Exercise 4.4.14 Show that if a,/3, 7 G ON, then □ Set Theory 62 4.5 Project: Goodstein's Theorem Goodstein's Theorem is a a result about the natural numbers involving just the basic arith- metic operations. To state it, we introduce some defintions and terminology. Wc describe here the procedure for writing a natural number in "superbase" form. Given a natural number n > 2, every m G w has a unique representation in base n, i.e. there are unique /cq, . . . , fc^ G {0, . . . , n — 1} such that d m = ki ■ Each exponent i in the n*-term of the above representation may be an arbitrary natural number. In particular, it is possible that i > n. In that case, we can write i itself in base n, i.e. i = X]j=i h ' - some of the exponents j are > n, one repeats the procedure. Clearly, after finitely many steps, the number m will have an expression in which no numbers > n appear. This is the representation of m in superbase n. For example, if we we want to write the number m = 1 026 in superbase form, with base n = 2. Then 1 026 = 2^° + 2^ = 22'+2^ +2'= 22^'+^+2^ + 2^ No numbers > 2 appear in this representation of 1 026 in superbase 2. Problem 1. Write 59 106 in superbase 3. □ Now consider the Goodstein algorithm for generating a sequence of natural numbers (m„) starting from an arbitrary natural number m: Step 1. Let mi := m. Step 2. Write mi in superbase 2. Replace all occurrences of the symbol '2' by the symbol '3', to get a new number. Subtract 1 from this new number, to get the number m2- Step 3. Write m2 in superbase 3. Replace all occurrences of the symbol '3' by the symbol '4', to get a new number. Subtract 1 from this new number, to get the number m^. Step 4. Write m^, in superbase 4. Replace all occurrences of the symbol '4' by the symbol '5', to get a new number. Subtract 1 from this new number, to get the number m^. Step 5. : We illustrate this for m = mi =5. Step 1. mi = 2^ + 1, so m2 = 33 + 1 - 1 = 27. 63 Ordinals Step 2. m2 = 3^, so ma = 4^ - 1 = 255. Step 3. mg = 3 • 43 + 3 • 42 + 3 • 4 + 3. Hence 7714 = 3- 5=^ + 3- 52 + 3- 5 + 3 = 467. Continuing, we get 775, 1197, 1751, 2454, 3 325, 4382, 5 643, 7126, 8849, 10830, 13087, 15 637, 18499, 21691, 25 231, ... Theorem (Goodstein's Theorem) The sequence {mn)n will eventually reach the number 0, for any starting value m = mi. □ The above theorem was proved in 1944. The remarkable fact about Goodstein's Theo- rem is this: Even though it is a statement involving only the basic arithmetic of the natural numbers, it requires transfinite ordinals in order to prove it. In 1981, Baker and Paris proved that Goodstein's theorem implies the consistency of first-order Peano Arithmetic. By Godel's Second Incompleteness Theorem, PA is not strong enough to prove its own consis- tency. Hence Goodstein's Theorem is not provable in PA. It follows that transfinite methods are necessary even for proving results in number theory! Goodstein's theorem is unprovable without recourse to the transfinite. We now formalize the Goodstein algorithm. Definition: Given a base n>2, define the function Sn oj ^ oj recursively by d d 5„(0) := Sn[Y.^i- nf) :=Y,h-{n + 1)^"« 1=0 i=0 Then, for each n > 1, define functions Qn ■ uj lo recursively by gi{m) := m gn+i{m) := Sn+i{gn{m)) - 1 □ Problem 2: Show by induction that the function Sn effects the replacing of n by n + 1, so that k ■ {n+ ly if i < n fc • (n + 1)"+^ if i = ra Snik ■ n') = etc. Conclude that the sequence {gn{fn))n is precisely the Goodstein sequence starting with m. □ Problem 3. (a) Calculate the first few terms mi,m2, ■ ■ ■ ,mioo, ... of the Goodstein sequence {gn{^))n starting with mi = 5. (Play around, and push it as far as the recursive depth permitted by your programming language will allow. Observe that the terms don't seem to be converging to 0.. . . ) Set Theory 64 (b) We now try to understand how and why the sequence m„ will eventually reach 0. (i) Suppose that, for some m, it is the case that gni'm) in base n + 1. For which k is gk{m) = 0? (ii) Suppose that, for some m, it is the case that gni'm) which k is gki^n) = 0? (iii) Suppose that, for some m, it is the case that gnim) which k is gk{m) = 0? (c) Calculate how long it will take for (m„)n to reach zero when m = 4. Hint: There's doubling at work: m2 = 2-3^ + 2-3+2. Then = 2-6^ + l-6 + 5 has linear coefficient reduced by 1. Then mu = 2 ■ 12^ + 0-12 + 11 has linear coefficient reduced once more. Then 717.23 = 1 • 24^ + 23-24 + 23 has the quadratic coefficient reduced by 1. Then 77747 will have the linear coefficient reduced from 23 to 22. Doubling 22 more times, the quadratic coefficient of nin will disappear when n = 1 + 3 + 6 + 12 + 24 - -- + 3- 2^^. In that case m„+i = (77 + 1) - (77 + l)-*^ + (77 + 1) - (77 + 1)°. Double (77 + 1) more times to remove the linear coeficient altogether. . . ] □ To prove that eventually reaches zero, we use transfinite ordinal arithmetic. For 77 > 2, define a function \I'„ : w — )■ ON as follows: d d *„(0):=0 *„(^A;i-77^) :=^A:i-a;*"« i=0 i=0 For example, *2(10) = ^2(2^ + 2^) = u;*2(3) +^^2(1) ^ ^^{2^+2") ^^^^{2°) ^ ^u,*2(i)+<^*2(o) ^ Problem 4: (a) Show that for all 2 < 77, 777 < co, *„(777) = *„+l(5n(?77)) (b) Show that each is a strictly increasing function, i.e. that ^ni'm + 1) > *n('T7) for all 2 < 77, 777 < a;. (c) Show that if 2 < 77, 777 < a; and gni'm) > 0, then *n+2(5n+l("i)) < *n+l(5nM) (d) Now prove Goodstein's Theorem: V777 377 igni'm) = 0). = 77 = - (77 + 1)^ + 77 - (77 + 1)° = 2- (77 + 1)^ + 0- (77 + 1)°. For = 77 - (77 + 1)"^ + 77 - (77 + l)'^. For □ Chapter 5 Cardinals 5.1 Cardinality: Basic Definitions Ordinals are for counting (i.e. enumerating) sets. Such an enumeration automatically induces a well-ordering on a set X: Here is the first element xq, here is the next xi,. . . , here is the the next Xi^,. . . . We do not yet know whether or not every set X can be enumerated in such a fashion. If X is well-orderable (i.e. if there exists a well-ordering relation on X), then X can be thus enumerated, because every well-ordering is order-isomorphic to a unique ordinal. However, even when we do know that a set X can be well-ordered, the well-order induced by the enumeration of its elements is not generally unique (unless X is finite). For example, the set Lo of natural numbers can be enumerated in the usual fashion: 0,1,2,3,... (1) It can also be enumerated by first listing all the even numbers, and then all the odd numbers: 0,2,4,. ..,1,3,5,... (2) We can enumerate it also by first listing all the positive numbers which are not powers of some prime, followed by all the prime powers, followed by zero, as follows: 1, 6, 10, 12, 15, . . . , 2, 4, 8, 16, ... , 3, 9, 27, 81, . . . , 5, 25, 125, . . . , 7, 49, (3) The order-type of the above enumerations are: (1) w, (2) u + uj, and (3) lo-uj + 1. Obviously, the size of the set of natural numbers does not depend on how we enumerate it. To measure the size of a set, we therefore need a different notion, namely that of cardinality. The idea is that two sets A, B have the same size if there is a correspondence between the elements of A and B, such that to each element a ^ A there corresponds a unique element b & B, and vice versa. Definition 5.1.1 (a) We say that two sets A,B have the same cardinality (or the same cardinal number) and write \A\ = \B\ there exists a bijection A^ B In that case we say that A, B are equivalent, or equipollent, or have the same power. 65 Set Theory 66 (b) We say that the cardinahty of a set A is smaller than the cardinality of B, and write \A\ < \B\ there exists an injection A B We also write \A < \B\ to mean \A\ < \B\ but \A\ / \B\. □ Recall that in the chapter in natural numbers, we wrote |X| = n if there is a bij action from X to the set n. In particular |n| = n for all n € c<j. We also showed that \n\ = \m\ if and only if n = m. Thus we may define the finite cardinal numbers to be the natural numbers. For infinite sets we do not yet have a candidate for the object \X\, i.e. the symbol \X\ is (at this stage) meaningless when taken in isolation. We have given meaning to statements of the form \X\ = \Y\ and \X\ < \Y\, where \X\ does not occur in isolation. Thus a statement such as \X\ = \X\, though quite easy to prove, does not follow trivially from the logical rules that govern the symbol =. What we are asserting when we say that \X\ = \X\ is not that the object \X\ is equal to itself — we do not yet have an object \X\ — but that there is a bijection from X to X. Equality doesn't come into it at all. Cardinal numbers |X| as definite objects can be defined. However, such a definition requires either the Axiom of Choice (in which case l-'^l is the least ordinal which is able to enumerate X), or else the Axiom of Foundation (in which case l-'^^l is the family of all sets of minimal rank which are equivalent to X — the definition of rank requires the Axiom of Foundation) . In this chapter we avoid the use of both these axioms. The next exercise shows that the relations defined above behave as their notation suggests. Exercise 5.1.2 (a) Show that equivalence of sets is an equivalence relation on the class of all sets: (i) 1^1 = 1^1; (ii) If ^1 = \B\, then \B\ = \A\; (iii) If \A\ = \B\, and \B\ = \C\, then \A\ = \C\. (b) Show that |A| < (c) Show that if \A\ = \A'\ and \B\ = \B'\, then \A\ < \B\ if and only if \A'\ < \B'\. (d) Show that if |^| < \B\ and \B\ < |C|, then |^| < |C|. □ We say that a set X is countable if and only if |X| < \lo\, i.e. if and only if there is an injection X ^ oj. Clearly, each natural number is countable, as is u itself. A set which is not countable is said to be uncountable. Exercise 5.1.3 (a) Show (without using (AC)) that a set X is countable if and only if there is a surjection oj -» X. Thus X is countable if and only if it can be enumerated by natural numbers, possibly with repetitions X := {xn : n G a;} 67 Cardinals (b) Show that a subset of a countable set is countable. (c) Show that X is countable if and only if either X is finite (i.e. \X\ = n for some n G w) or \X\ = \oj\. Thus we cannot have n < \X\ < \uj\ for all n G w g [Hint: Let a; -» X, where X is infinite. Define h : oj ^ X recursively by h{ri) = g{m.m.{k E u : g{k) ran(/i f n)}) and show that /i is a bijection.] (d) Show that the union of two countable sets is countable. Conclude that the union of a finite family of countable sets is countable]^ (e) Show that the cartesian product of two countable sets is countable. Conclude that the cartesian product of a finite family of countable sets is countable. [Hint: First show that u x uj ^ u : (z, j) i— )• 2* • (2j + 1) — 1 is a bijection. (You may assume that the usual laws of arithmetic hold.)] □ Intuition for the sizes of sets may also suggest to you that < is antisymmetric, i.e. that if 1^1 < \B\ and \B\ < \A\, then |j4| = \B\. However, this is not at all obvious: One has to show that if there is an injection from A to B and an injection from B to A, then there is a bijection from A to B. Nevertheless, we will proceed to show that it is the case. The proof of this fact is the first (in these notes) that deserves the appellation "beautiful" . Theorem 5.1.4 (Schroder-Bernsteirj^ Theorem) If A,B are sets such that \ A\ < \B\ and \B\ < \ A\, then \ A\ = \B\. i Proof: We first reduce the problem to the situation where B (1 A. Suppose that A ^ B j and B ^ A are injections. Define B := j[B], and Ai := (j o i)[^]. Then AiOBCA, \Ai\ = \A\, \B\ = \B\ If we can show that \A\ = \B\, then also |^| = \B\. We have thus reduced the problem to the following situation: Given that Ai <^ B <^ A f and that \ Ai\ = \ A\, we have to show that |^| = \B\, i.e. that there exists a bijection A B. g Now since |^| = \Ai\, there is a bijection A Ai, so that Ai = g[A]. Now define two sequences of sets |A„ : N < ujj and : n < ojj hy induction as follows: Ao := A An+i := g[An] Bq := B Bn+i := g[Bn] ^You may know that the union of a countable family of countable sets is countable, but to prove that, you need (AC). ^Also known as the Cantor-Bernstein Theorem. Set Theory 68 Since 5 is a bijection one can think oi Aq, Ai, A2, ■ ■ ■ as copies of the same set A. Similarly Bo, Bi, B2, ■ ■ ■ are copies of B. A peek at Figure 5.1 may clarify matters. Moreover, Ao^Bo^Ai^Bi^---^An^Bn^ An+l 5 Bn+1 5 . . . When we form set differences of successive sets, we obtain disjoint sets Aq — Bq, Bq — Ai, Ai-Bi, B1 — A2, ... An — Bn, Bn — An+l, Thus we can write A, B as disjoint unions of these set differences. A= \J {An-Bn)U U {Bn-An+l) B= [j - 5„) U |J (S^-^+i) 0<n<aj 0<n<aj l<n<aj 0<n<tj Now observe that g (suitably restricted) is a bijection from each set difference to the next one of the same type. To be precise. g[An — Bn] = An+1 — Bn+1 g[Bn — An+l] = Bn+1 — An+2 f We can now construct the required bijection A B. The basic idea is to use g to move each point X G An — Bn to the corresponding point f{x) G An+i — Bn+i- Points x G Bn — An+i are left unmoved, however. Thus we define g{x) ifxe IJ {An-B, 0<n<ui X if X G U {Bn- An +1; 0<n<t>; Now observe that / is a bijection from Uo<n<aj(^"~^") onto Ui<n<aj(^"~-^")- clearly is also a bijection from Uo<n<a;(^n ~ ^n+i) onto itself. It follows easily that / : A B is a bijection. Remarks 5.1.5 With the Schroder-Bernstein Theorem in place, we see that < defines a partial ordering on the equivalence classes of =. Another intuitively plausible fact is the Law of Trichotomy, which asserts that < induces a total ordering: If A, B are sets, then Either |^| < \B\ or |^| = \B\ or \A\ > \B\ Cantor assumed that the Law of Trichotomy held, but was unable to supply a proof. In fact, the Law of Trichotomy is equivalent to (AC) as we shall see in a later chapter. Recall that a set X is Dedekind infinite if and only if there is a bijection from X onto a proper subset of X. We have seen in an exercise that every Dedekind infinite set is infinite. Furthermore, we saw that a set X is Dedekind infinite if and only if there is an injection uj X , i.e. if and only if \uj\ < \X\. To prove that an infinite set is Dedekind infinite required (AC) however. Thus without (AC) it is possible that \X\ > n for all n £ uj, yet \X\ ^ In that case X,u} are incomparable elements in the ordering <. Cardinals Figure 5.1: Illustration of the Schroder-Bernstein Theorem. The sets An are circles, and the sets Bn are rectangles. Furthermore the sets An — Bn are shaded white, whereas the sets Bn — Aji^i arc shaded grey. The function / moves each point in a white region to the corresponding point in the next white region. It leaves points in the grey regions unmoved. Set Theory 70 □ Right now, we do not yet know that there are uncountable sets. The next theorem, owing to Cantor, shows that there are: Theorem 5.1.6 For every set X, we have \X\ < \V{X)\ Proof: Clearly X — > V{X) : x i-> {x} defines an injection form X into 'P{X). Hence \X\ < \V{X)\. Let f : X 'Pi^) be an arbitrary function. We show that / cannot be a surjection: Indeed, define an element Y eV(X) by Y --{xeX f{x)} If y G ran(/), then Y = f{y) for some y E X. Then y G y ^ ye fiy) ^ y^Y — contradiction. Hence Y ran(/), i.e. / is not surjcctivc. Since / was arbitrary, it follows that there is no surjection — and thus no bijection — from X onto P{X). Hence \X\ / H As a consequence, V{oj) is an uncountable set. 5.2 Cardinal Arithmetic in ZF We stated in the previous section that we do not (yet) have candidates for the cardinality \X\ of an infinite set X. Wc will now proceed as if we can assign a cardinal number k = |X| to a set X, and define arithmetic operations on these cardinals. Each statement about cardinal arithmetic can be translated into a statement about operations on sets: The notion of cardinal number is not necessary, but it is convenient. We use the letters k, A, /x, u to denote cardinals. Recall that is the set of all functions from B to A. Definition 5.2.1 Suppose that k = \A\, A = \B\. (a) K + X := 1^4 U where we require that An B = 0. (b) K-X:= \AxB\. (c) := \A^\. □ Remarks 5.2.2 (a) Note in the definition of cardinal addition that if A, B arc not disjoint, one can always replace them with equivalent sets which are disjoint. Simply choose x ^ y, and define A' := A x {x}, B' := B x {y}. Then \A\ = \A'\, \B\ = \B'\, and A' n B' = 0. 71 Cardinals (b) Clearly = 1 and , I'' = 1. Furthermore = if k > 0. Observe, however, that there is a unique function — )• 0, so that O'^ = 1. Exercise 5.2.3 Check that the above definitions make sense, i.e. are independent of the choice of A, B. Thus check that (a) \i Ar\B = 0, Ar\B = 0,\A\ = \A\ and \B\ = \B\, then \AU B\ = \A\J B\. (b) If \A\ = \A\ and \B\ = \B\, then \Ay.B\ = \Ay. B\. (c) If \A\ = \A\ and \B\ = \Bl then \A^\ = \A^\. □ The next proposition shows that cardinal arithmetic is better behaved than ordinal arithmetic. Proposition 5.2.4 (a) k + X = X + k. (b) {k + x) + = K + {x + (c) K ■ X = X ■ K. (d) {k ■ X) • fi = K ■ {X • fi). (e) K ■ {X + fl) = K ■ X + K ■ fl. (f) K-^+M = . ^/._ (g) [k ■ XY = ■ Xf". (h) If n < X, then K + /_i<A + /i, n ■ fj, < X ■ and k,^ < X^. (i) // < A < fi, then < k^. □ Exercise 5.2.5 Prove Proposition 5.2.4 [E.g. to prove k^^^" = ■ k^, let \A\ = k, \B\ = A, \C\ = n with BnC = 0, and show that there is a bijection from A^'^'-^ onto A^ x A'-' .] □ Exercise 5.2.6 Recall from Exercise 5.1.3 that if X is an infinite countable set, then \X\ = We thus denote the cardinality of an infinite countable set by the symbol lo. Show that UJ + U = UJ-LO = U}. □ The following theorem relates the cardinalities of power sets to cardinal arithmetic. Theorem 5.2.7 // \X\ = k, then \V{X)\ = 2'^ . Hence k < 2'^ for all cardinals k. Set Theory 72 Proof: We have to show that there is a bijection from V{X) onto 2^ , where 2 := {0, 1}. Observe that every subset Y C X induces a unique characteristic function (or indicator function) xy ■ X 2 which, for each x £ X, indicates whether or not x gY. Thus: f 1 ifxeY Xy{x) ■= \ .r . for X G X 10 if X ^Y Conversely, every function / : X — )• 2 indicates a unique subset Xj := {x £ X : f{x) = 1}. There is therefore a correspondence between subsets of X and functions X — t- 2. To be precise, define ^> : V{X) ^2^ -.Y ^ xy{-) ^ -.2^ ^ V{X) : f ^ Xf Then it is easy to see that <I> o vj; = [d^x and ^I/ o $ = idp(x)- Hence ^' = and $ is a bijection. It follows by Theorem 5.1.6 that k < 2'^ for all cardinals k. 5.3 Alephs We are now ready to define the notion of cardinal number for certain sets: The alephs are the cardinal numbers of well-orderable sets. Recall that every well-ordering is order-isomorphic to a unique ordinal. Thus if X is well- orderable, then there is a bijection a X from some ordinal a to the set X. Conversely, if there is a bijection / : a X : x^, then / induces a well-ordering on X: Define x^ < x^ if and only if ^ < 77. Thus a set is well-orderable if it and only if it can be placed into a bijective correspondence with some ordinal. Equivalently, a set X is well-orderable if and only if it can be enumerated by an ordinal, i.e. if and only if there is an q G ON such that X = {x^ : ^ < a}. The order-type of an enumeration need not be unique: At the beginning of this chapter, we showed how to enumerate the set of natural numbers by various ordinals: w, w + a; and w X a; + 1. Since the size of the set of natural numbers is unaffected by how we order it, we have |a;| = |a; + a;| = |w-a; + l| Many other enumerations of the set of natural numbers are possible, but it is clear that the natural numbers cannot be enumerated by an ordinal which is < a;: oj is the smallest ordinal which can enumerate the set of natural numbers. This leads to the following definition: Definition 5.3.1 (a) An ordinal a is said to be a cardinal number (or just cardinal) if and only if for all ordinals /3: /3 < a =^ < \a\ (b) If a set X is a well-orderable set, then \X\ is the smallest ordinal which enumerates it, i.e. IXI = smallest ordinal a for which \X\ = \a\ □ 73 Cardinals Thus a is a cardinal if and only if there is no bijection f : /3 ^ a for any f3 < a, i.e. if and only a cannot be enumerated by any /3 < a. Moreover, if X is well — orderable, then \X\ is a cardinal. An ordinal a which is a cardinal is also called an initial ordinal: It is the first ordinal to attain the size |a|. Clearly 0, 1, 2, . . . , w are cardinal numbers. Furthermore, uj is the smallest infinite cardinal. Remarks 5.3.2 (a) The set of cardinals is therefore a subclass of the set of ordinals. As such, it inherits the well-ordering on ON. Observe that if a, /3 are cardinals, then a< (3 in ON |a| < |/3| By Definition 5.3.1 we see that a < (3 implies |a| < and \a\ < Conversely, if a, /3 are cardinals then it cannot be the case that /3 < a, and hence a < p. There is therefore no chance of confusion if we use the same symbol < for the order relation on both the class of ordinals and the class of cardinals. (b) We have defined arithmetic operations on both the class of ordinals and the class of cardinals. Here, however, care must be taken. The ordinal product uj ■ co is not at all the same as the cardinal product u ■ to. Indeed, for the ordinal product we have uj ■ uj > oj, yet |a; • wl = \uj\. Thus the ordinal product w • a; is an ordinal which is not a cardinal. The cardinal product is simply uj ■ uj = u. It is moreover not hard to see that for ordinal exponentiation, uj^ := Unew'^" count- able, i.e. that for ordinal exponentiation uj^ is a countable set. However, for cardinal exponentiation, we clearly have |a;|l^l > 21'^' > |a;|, i.e. for cardinal exponentiation, uj^ is an uncountable set. The two interpretations of uj^ do not even have the same cardinality! Thus when doing arithmetic, you have to keep track of whether you are dealing with ordinals or cardinals. Only for the natural numbers do ordinal and cardinal arithmetic coincide. □ Here is an easy result: Proposition 5.3.3 Every infinite cardinal is a limit ordinal. Proof: If a > w is infinite, then the map / : a + 1 — )• a defined by /(7) := < is a bijection, i.e |a + 1| = \a\. Thus a + 1 is not a cardinal. if 7 = a 7 + 1 if 7 < w 7 if u; < 7 < a Set Theory 74 As we stated earher, clearly 0, 1, 2, . . . , a; are cardinal numbers. Are there any others? We know that \X\ < \V{X)\. However, we generally do not know if V{X) is a well-orderable set, even if X is well-orderable. Thus though \X\ may be a well-defined cardinal number (in the sense of Definition 5.3.1), the object |'P(X)| may not have any meaning in isolation. Nevertheless, as we will now show, there are as many cardinals as there are ordinals. Since ON is a proper class, so is the class of cardinals. Let X be any set (not necessarily well-orderable). Then V{X) is a set. Now if Y <Z X, then a well-ordering of y is a subset of 1" x y. Hence for each Y G V^X), the class := {-< G V{Y X Y) well-orders Y} is a set, by Separation. It follows that W := IJyeVix) ^ s^^' possible well-orderings of subsets of X . We can define a class function F which assigns to each well-ordering -< G W the unique ordinal to which it is order-isomorphic. By the Axiom of Replacement, -F[W] is a set. Thus |J -F[W] = supF[W] is an ordinal, which may or may not belong to -F[W]. In any case, the successor ordinal of sup-F[W] cannot be a member of F[W]. Thus {q G on : a F[W]} is a non-empty class of ordinals, and therefore has a least member — call it h{X). Now observe that, by definition, a G F[yV] if and only if X has a subset which can be enumerated by a, i.e. a G -F[yV] <;=^ there is an injection from a to X Hence h{X) is the least ordinal a for which there exists no injection from a into X. The reasoning above shows that the ordinal h{X) always exists, for any set X — well- orderable or not. Thus: Definition 5.3.4 If X is a set, define the Hartog's number h{X) of X as follows: h{X) := least ordinal a such that there is no injection a X □ Proposition 5.3.5 For any set X , h{X) is a cardinal number. Moreover, h{X) ^ \X\. Proof: If a < h(X), then there is an injection a ^ X. If it was the case that |a| = |/i(X)|, there would exist an injection h{X) ^ X — contradiction. Hence \a\ < \h{X)\ for all a < h{X). Similarly, if |/i(^)| < X, then there would exist an injection h{X) ^ X — contradiction. Hence \h{X)\ ^ |A:|. H Observe that we are not saying that h{X) > \X\. That would be the case if the Law of Trichotomy holds, i.e. if (AC) holds. However, without (AC), it is possible that \X\, h{X) are incomparable in size. Proposition 5.3.6 For every a G ON there is a cardinal which is greater than a. Indeed, h{a) is the least cardinal > a. 75 Cardinals Proof: Both a, h{a) are ordinals. Thus either h{a) < a, or else a < h{a). Since ^ we must have a < h{a). Thus h{a) is a cardinal which is > a. If /3 is a cardinal and /3 < h{a), then there is an injection f3 ^ a, and hence |/3| < \a\. Because /3 is a cardinal, it then follows that (3 < a. Hence h{a) is the least cardinal > a. H If K is a cardinal, then is the least cardinal > k, i.e. the successor of k in the class of cardinals. We denote it by := We already remarked that the order relation on the class of cardinals is a restriction of the order relation on the class of ordinals. Here is another result linking the two — The supremum of a set C of cardinals does not depend on whether you regard C as a set of ordinals or a set of cardinals: Proposition 5.3.7 If C is a set of cardinals, then sup C :=\JC is a cardinal. Proof: We know that a := (J C is an ordinal, and that it is the least ordinal which is > k for every k £ C. Now suppose that f3 < a. By definition of a, there is k G C such that /3 < K < a. Because k is a cardinal, it follows that |/3| < |k| < \a\. Hence |/3| < \a\ whenever /3 < a. We can now define an enumeration of the class of all infinite cardinals, the aZep/i^ ^^o,^^l,^^2, • • . . . . • • • - • • by transfinite recursion: b^o := ^0 is the set of natural numbers 'Ra+i '■= ^a+i ■= succcssor ordinals a + 1 "i^x = '■= supji^Q, : a < A} if A is a limit ordinal We use "i^a and cj^ interchangeably. The ^l;-notation is typically used when we want to emphasize the size of a set, whereas the w-notation when we are concerned also with the order-type. Observe that ui is the least uncountable ordinal, i.e. that a < wi if and only if \a\ < |a;|. Then U2 is the least ordinal whose cardinality is > |a;i|, etc. Exercise 5.3.8 Show that the map F : ON — )■ ON : a i— t- Wq, is normal, i.e. strcitly increasing and continuous. Deduce that uja > a for all a G ON. Also conclude that there are arbitrarily large ordinals for which = a. □ Next, we observe that {^^Q, : a £ ON} enumerates all the infinite cardinals: ''Aleph N is the first letter of the Hebrew alphabet. The next two, Beth 3, and Gimel 1 will also make an appearance in cardinal arithmetic. Set Theory 76 Proposition 5.3.9 // k is an infinite cardinal, then there is a unique a £ ON such that Exercise 5.3.10 Prove Proposition 5.3.9 [Hint: Sliow tiiat if k is a cardinal, then k < i^K+i- Now show by induction that the following holds for all all a If K is a cardinal such that k < ^^Q,, then k = H« for some ordinal (3 < a. □ The alephs were introduced by Cantor. We emphasize once again that the alephs are the sizes (cardinalities) of well-orderable infinite sets. An infinite set X is well-orderable if and only if \X\ = Hq, for some a. For Cantor, the restriction to well-orderable sets posed no problem: Already in 1883 he had proposed as a Law of Thought that every set is well- orderable. However, this proposal met with much opposition, and thus Cantor tried to find a proof. The first proof was, in fact, found by Zermelo, but his proof used a new principle for set formation — the Axiom of Choice. We shall have much more to say about this in the next chapter. 5.4 The Canonical Well-Ordering of ON x ON Recall that, given a well-ordered set(X, <) , we defined in Lemma 4.1.2 the canonical well- ordering {X X X,\z). We can define the canonical well-ordering of the class ON x ON in the same manner: Given a,/3,a',P' G ON, we put {a, 13) < {a',f3') max{a, /3} < max{a', /3'} or max{a, (3} = max{a', /3'} and a < a' or max{a,/3} = max{a',/3'}, a = a , and (3 < [3' It is straightforward to verify that this does indeed define a well-ordering of ON x ON, i.e. a total ordering with the property that every non-empty subclass of ON x ON has a least element. Observe that if 7 G ON, then the restriction < f (7 x 7) of the canonical well-ordering on ON X ON to the set 7 x 7 is precisely the canonical well-ordering of 7 x 7. Here are some observations that will be useful in the sequel: • Observe that the initial segment {{(,,r]) : ??) < (a,/3)} is a set, as it is a subset of (max{a,/3} + 1) x max{a,/3} + 1). • Further observe that (^, T]) < (0, a) <;=^ ^ < a A rj < a i.e. {(^, rj) : (^, r]) < (0, a)} = a x a Thus each a x a is an initial segment of the canonical well-ordering of ON x ON. 77 Cardinals • Since every well-ordered set is order-isomorphic to a unique ordinal, we may define a map r : ON x ON — )• ON as follows: T{a, /?) is the order-type of the initial segment induced by {a, (3) in the canonical well-ordering of ON x ON, i.e. r(a,/3) := order-type of {{C,v) ■ i^^v) < • r is an order-preserving injection from ON x ON into ON: No well-ordered set is order-isomorphic to an initial segment of itself. • If /3 < a, then r(0,/3) < r(0,a): This is because r(0, a) is the order-type of of the initial segment a x a of ON x ON. Since /3 x /3 is an initial segment of ON x ON and /3 X /3 C a X a, we see that /3 x /3 is an initial segment of a x a. • ran(r) is downwards-closed: If 7 G ran(r) and < 7, then 6 S ran(r). This is because some initial segment of ON x ON has order-type 7, and an initial segment of that initial segment must have order-type 5. a r is surjective: The map $ : ON — t- ON $(a) > Q for all a £ ON, by Theorem 4.1.6 r(0, a) is strictly increasing, and hence It follows that a < T{0,a) G ran(r). Since ran(r) is downwards-closed, it follows that a G ran(r), for all a G ON. We reiterate: The map F : ON x ON — t- ON is an order-preserving bijection. We refer to r as the canonical bijection from ON x ON to ON. Exercise 5.4.1 Consider the map ^' : ON — )■ ON : a 1— )• r(a x a). Show that ^' is a normal function. Conclude that r(a x a) > a for all a G ON. Further deduce that there are arbitrarily large ordinals for which T{a x a) = a. □ With the previous exercise in mind, the next proposition should not be too surprising: Proposition 5.4.2 r{uja x ^a) = for all a G ON. Proof: Recall that clIq, X clIq, = : {£,,tj) < (0,u;q,)} is an initial segment of ON x ON, and that T{uja X Wa) = r(0,a;Q,) > uja- Suppose that a is the least ordinal such that T{uja x uja) u)(x, SO that r(wQ, X oja) > uJa- Then there exists (^,7?) G Wq, x such that T{(,,t]) = oja- Since cj^ is a limit ordinal, we can find an ordinal 5 so that ^, r/ < J < Wq. Then J x 5 is an initial segment of ON x ON. Since (^, 77) £ 6 x 5, we have uJa C T{5 x 6). Since F is injective, it follows that Ua < |F((5 x 5)\ = \6 x 5\. Now \6\ = Uf^ for some /3 < a, and by the minimality assumption F(a;^ x ujfs) = u^p. Since F is injective, we see that ujp = \T{u)p x ujp)\ = \ujp x ujp\. It follows that Wq, < |(5 X 5| = \ujp X w/jI = ojp — contradiction, since cj^ < Wq. Hence there is no ordinal a such that T{uja x Wa) 7^ Wq,. Set Theory 78 5.5 Arithmetic of Alephs Cardinal addition and multiplication are trivial: Theorem 5.5.1 Proof: Since T{u}a x <^a) = and F is injective, it follows that Thus b^Q, • Kq, = b^Q, for all a G ON Now if 7 = max{Q,/3}, then by Proposition 5.2.4 it follows that = max{H„, H^} < + < H.^ + = 2 • < H7 • ^^-^ = Similarly, = max{^^a, ^^/j} < Ha • < ^7 • = ^7 Remarks 5.5.2 (a) It is important to note that the above theorem holds only for alephs. If the Axiom of Choice holds, then every infinite cardinal is an aleph, as we shall soon be able to show. However, the statement For all infinite i.e. the statement that for all infinite sets A we have |^ x j4| = |^|, is equivalent to (AC). (b) Finite sums and products of alephs are now easy to calculate. Exponentiation is altogether more difficult. We know, for example, that 2*^'^ > Hq. We do not know if 2*^'^ is an aleph, i.e. if 2^0 = H„ for some a G ON. With (AC), 2^" will be an aleph. Even with (AC), however, we are not able to determine the value of (x for which 2^^' — Hq;. The axioms of ZFC place very few constraints on the possible values of a. a might be any of 1,2,3,. ... It cannot be u. It can be any of a; + 1, cj + 2, w + S, . . . , or for that matter Hi, H2, H3, .... Again, we cannot have a = H(^. The assumption that a = 1 — i.e. that 2^° = Hi — is due to Cantor, and is known as the Continuum Hypothesis. Here, the word continuum refers to the set M of real numbers. The Generalized Continuum Hypothesis is the assertion that 2^" = H„+i for ah a G ON □ The next theorem shows what 2^° has to do with the continuum. Theorem 5.5.3 IMI = 2^0 79 Cardinals □ Exercise 5.5.4 We prove the preceding theorem. Assume here that the basic properties of the sets Q, M of rational and real numbers are known. (We have not yet established that Q, M can be formalized inside ZFC. However, such a formalization must of course capture all these known properties.) (a) Show that |Q| = Nq. (b) Observe that if r G M, then r = sup{g G Q : g < r} — a known property. Conclude that |M| < 2^0. (c) An element / G 2'^ is a function / : w — )■ {0, 1}, which can be thought of as a sequence of O's and I's. With each such function is associated a real number whose decimal expansion consists of digits given by this sequence. Show that 2^° < |M|. (d) Finally, conclude that |M| = 2^°. □ 5.6 Project: Cardinal Arithmetic and The Axiom of Choice Problem 1: We prove a theorem of Tarski (1924): Theorem: In ZF, (AC) holds if and only if = k for every infinite cardinal k. 1.1 First, explain why if (AC) holds, then = k for every infinite cardinal. 1.2 Next, we show that if k is an infinite cardinal and if K is an aleph, then K + K = K • K implies k < H or K > k So suppose that K + ii = K-ii. Let K be an infinite set such that \K\ = k, and let {A, <) be a well-ordered set such that |yl| = N. (i) Explain why there exist disjoint sets Ki and Ai such that \Ki\ = k, \Ai\ = K and K X A = KiUAi. (ii) Explain why exactly one of the following possibilities must be the case: Either (1) 3k£ Kya€ A {{k, a) G Ki), or (2) yk£ K3a£ A {{k,a) G ^i). (iii) Show that in case (1), we have H < k. (iv) Suppose now that case (2) holds. For each k £ K, let be the least element of A such that {k,ak) G A^. Observe that {{k,ak) ■ k G K} C Ai. Conclude that k < i^. Set Theory 80 1.3 Now suppose that k'^ = k for every infinite cardinal. Let X be any infinite set, with cardinaUty A = \X\, and let K := h{X) be the Hartog's number of X. Observe that K is indeed an aleph, as the notation suggests. (i) Show that (A + H)^ > A • H. (ii) Conclude that A + K = A • (iii) Explain why we cannot have H < A. Conclude that A < (iv) Deduce that X can be well-ordered. Hence explain why (AC) holds. □ Problem 2: We prove a theorem of Sierpihski (1947): Theorem: In ZF, (GCH) implies (AC). Recall that k <2'^ for any cardinal k. The Generalized Continuum Hypothesis (GCH) is the following assertion: If k is an infinite cardinal, then there are no cardinals strictly between K and 2'^. To be precise If K, A are infinite cardinals such that « < A < 2**, then either A = k or A = 2**. 2.1 We first prove, without assuming either (AC) or (GCH), Lemma 1: If k is an infinite cardinal, and if h{K) is the Hartog's number of k, then Let X be a set such that \X\ = k. Then C is a partial order relation on V{X). Thus if C V{X), then {X , C) is a partial ordering. It may happen that {X, C) is a well- ordering, or it may not. Define W ■.= {X <Z V{X) : {X, C) is a well-ordering} (i) Show that W C P(7^(X)). (ii) Define an equivalence relation on W by X Kiy [X, C) is order-isomorphic to {y, C) and let E := W/ « be the set of equivalence classes of «. Explain why E C V{V{V{X))). Conclude that |E| < 22'". 81 Cardinals (iii) Each E £E, is a collection of all subsets of V{X) that have the the same order-type. Thus with each E & K we can associate a unique ordinal oe- Thus E can itself be regarded as a well-ordered set, with £' ^ F if and only if as < ap- Conclude that |E| is an aleph. (iv) Show that |E| ^ \X\. Conclude that |E| > h{K). [Hint: Let ij be the order-type of the well-ordered set E. Suppose that |E| < \X\. Then there is an injection / : 77 — )■ X : ^ x^. Define Xjs := {x^ : ^ < /?}, and X := {Xp : /3 < ry}. Observe that (Af, C) has order-type ry. The initial segment of E determined by the equivalence class of X then has order-type rj. . .] (v) Finally, conclude that h{tC) < 2^ . 2.2 Now prove the following easy results in cardinal arithmetic, not assuming (AC) or (GCH): (i) Lemma 2: If k > Hq, then 2 ■ k = k. Remark: Saying k > is not the same as saying that k is infinite, but that it is Dedekind infinite (i.e. has a countable subset). [Hint: Show 1 + k = k.] (ii) Lemma 3: IfK> Ko, then 2" + k = 2'^. (iii) Lemma 4: If X, k are cardinals such that 2 ■ k = k and X + k > 2'^, then X> 2'^. [Hint: Choose disjoint sets Ki,K2,L so that \Ki\ = \K2\ = k, \L\ = X. Justify the equations \LUKi\ = X + K = 2'^ = 2''+'^ = \r{Ki U K2)\ Let f : L\J Ki ViKiU K2) he a bijection. If £; C ^2, define E* ■.= EU{xeKi :x^f{x)} Show that the map V{K2) -)• V{Ki \J K2) : E ^ E* is injective. Further show that each E* f[Ki\- Conclude that each E* = f{y) for some y E L. Deduce that > \V{K2)\ = 2\] 2.3 Henceforth, assume (GCH): Let k > b^o be an infinite cardinal, and let /i(k) be the Hartog's number of k. Define a sequence {pn)n<uj of cardinals inductively as follows: PQ •■= Pn+i = 2''" Prove: Lemma 5: If k> and h{n) < Pn+i, then either K is an aleph, or h(^K) ^ p^- (i) Observe that 2 ■ pn = Pn, by Lemma 2. (ii) Show that p„ < /i(k) + pn < 2^^" for all n. (iii) Conclude from (GCH) that we have twho possible cases: Either (1) /i(k) + Pn = Pn, or else (2) h{K)+pn = 2P^. (iv) Show that in case (1), we have hi^K) ^ pn- Set Theory 82 (v) Show that in case (2), we have h{K) > 2''" > k by Lemma 4. Conclude that k is an aleph. 2.4 Finally, let X be any infinite set. Define k = 2^o+l^l. (i) Observe that Lemma 1 states that < p3. Also observe that ^ po. Use Lemma 5 to conclude that k is an aleph. (ii) Deduce that X can be well-ordered. Hence explain why (AC) holds. □ Chapter 6 The Axiom Of Choice Recall: Axiom of Choice: Every family of non-empty sets has a choice function: If is a family of non-empty sets, then there is a function f on. X which chooses from each X E X an element fix) G X. Exercise 6.0.1 Show by induction that every finite family X of non-empty sets has a choice function. Thus (AC) is needed only for infinite X. □ 6.1 Choice Functions, Partitions and Cartesian Products Our first result shows that for (AC) to hold, it suffices that it holds for every every family X of mutually disjoint non-empty sets: Proposition 6.1.1 Let (AC)* be the following statement: (AC)*: If X is a partition (i.e. a family of non-empty pairwise disjoint sets), then there is a set C which contains precisely one element from each X E X (i.e. C CiX is a singleton for every X E X). Then (AC) and (AC)* are equivalent in ZF. Proof: Clearly (AC) impUes (AC)*. Conversely, suppose that X is a family of non-empty sets, not necessarily pairwise disjoint. For each X E X, let X := {X} x X. Then X := {X : X € X} is a family of pairwise disjoint sets. By (AC)*, there is a set C which contains precisely one element from each X £ X. Then C is a set of ordered pairs, and thus a binary relation. By definition, if {X, x) G C, then x e X. Moreover, C is clearly a function: If {X, x) e C and {X, y) E C then {X, x), {X, y) e C Ci X, and hence x = y. It follows that C is a choice function on X. H Proposition 6.1.2 In ZF, the following are equivalent: 83 Set Theory 84 (a) (AC) (h) Every surjection has a right inverse. f Proof: Suppose that (AC) holds, and that X Y is a surjection. For each y £ Y define Xy := f~^[{y}]- Then X := {Xy : y £ Y} is a family of non-empty sets which partitions X. By (AC), X has a choice function h : X ^ y]X with the property that h{Xy) £ Xy for ah y £ Y. In particular, f{h{Xy)) = y. Define g : Y ^ X : y ^ h{Xy). Then {fog){y) = f{h{Xy))=y. Conversely, suppose that every surjection has a right inverse. Let X he a family of disjoint non-empty sets, and let f :[_} X ^ X he defined as follows: If x G Af , then f{x) is the unique X £ X so that x £ X Then / is a surjection, and thus has a right inverse g : X ^yjX. Clearly g{X) £ X, i.e. g is a choice function for the partition X. By Proposition 6.1.2, (AC) holds Recall that the cartesian product Aq x Ai of two sets is defined to be the set of all ordered pairs (ao,ai), where oq £ Aq and ai £ Ai. We can then define products with more factors inductively: ji+l n ^fc := ^fc X An+i i.e. tIq x • • • x ^„ x An+i := (tIq x X • • • X An) X An+l k=0 k=0 However, this allows us to define products with only finitely many factors. How should we define a product nie/ ^« with infinitely many factors, i.e. where the index set / has infinite cardinality? More generally, if ^ is a non-empty collection of sets, can we define Y\ Here the intention is that if ^ = {Ai : i £ I}, then = Yliei (just as = IJie/ ^*-) Clearly we have 13^=0 {(^O) oi) ■ ao £ Aq A ai £ Ai. Thus we ought to have YlAi := {{ai : i £ I) -.yi £ I {ai £ Ai)} The problem is that we have not yet defined ordered tuples {ai : i £ I) of infinite length. However, we can clearly think of each such tuple as a function, i.e. {ai : i £ I) is the function f : I ^ \^ Ai defined by f{i) = ai Thus the function / chooses, for each i E J, an element f{i) £ Ai. Then Ai :={/:/ is a function I A- \^ Ai such that f{i) £ Ai for all i £ 1} i&I i£l i.e. the cartesian product Hie/ °^ functions / which choose, for each i £ I, an element f{i) £ Ai. There is an obvious natural bijection between the old definition of x as a set of ordered pairs and the new definition as a set of functions {0, 1} ^ AqU Ai: The ordered pair 85 Axiom of Choice (ao, ai) £ AqX Ai corresponds to the function {0, 1} ^ AqU Ai : i-^ oq, 1 ai- Henceforth we shall identify the two objects without further explanation. It will usually be clear from context which definition of product is meant. Now if ^ is a non-empty non-indexed collection of sets, then we can index A by itself: A := {Xa : Ae A} where Xa := A Then H = Has^ °^ functions f : A ^ \JA with the property that f{A) E Xa, i.e. with the property that f{A) G A. Thus if / G f^^, then / is a function which chooses, for each ^4 G ^ an element f{A) E A, i.e.: Definition 6.1.3 is the set of all choice functions on A. □ The following proposition is now obvious: Proposition 6.1.4 The following statements are equivalent in ZF: (a) (AC) (b) A product of a non-empty family of non-empty sets is non-empty: Whenever A is a non-empty family of non-empty sets, then Y\A ^ 0. □ Proposition 6.1.4 is the reason that (AC) was also called the multiplicative axiom in the past. 6.2 Well— Ordering Cantor proposed as an obvious Law of Thought the Well-Ordering Principle: Every set can be well-ordered. The intuition behind this law was simple. Suppose that X is a set. Pick an element xq E X. If any elements are left over, pick an element xi from X — {xq}. If any are left over, pick an X2 €z X — {xq, xi}. This defines a sequence by transfinite recursion: If x^ have been chosen for ^ < a, then pick an element Xq E X — {x^ : ^ < a}. Just keep doing this until you run out of elements. If this happens at stage f3, then X = {x^ : ^ < /?}. This enumeration of X induces well-ordering of X in an obvious manner. The discovery of the paradoxes generated increased skepticism about the foundations of set theory, and this led Cantor to seek a proof of his Well-Ordering Principle. The first such proof was given by Zermelo in 1904 (and again in 1908). In doing so, Zermelo explicitly formulated the Axiom of Choice. Proposition 6.2.1 Let X be a set. In ZF the following are equivalent: (a) The set X can be well-ordered. (b) The family X := V{X) — {0} of non-empty subsets of X has a choice function. Set Theory 86 Proof: (b) =^ (a): Cantor's intuition can be implemented as follows: Suppose that F is a choice function for X := V{X) — {0}. Pick a set c such that c ^ X. We extend to a function on V by defining F{x) = c \i x ^ X . Define a function G on ON by transfinite recursion follows: G{a) = F(X-ran(G \ a)) Observe that this means that G(a) & X — {G(^) : ^ < a} as long as X — ran(G \ a) ^ 0. Now if G{a) 7^ c for any a G ON, then G is injective, and hence the function G^^ : ran(G) — t- ON exists, and is a bijection. Since ran(G) C X, we see that = ON is a set, by Replacement — contradiction. Hence there is a G ON such that G{a) = c. If ag is the least such a, then X — ran(G \ qq) = 0, and hence X = {G(C) : S, < ckq}. Define a relation -< on X by G(0 ^ G{r]) if and only if ^ < r] Then G : (oq, <) — {X, -<) is an order-isomorphism, so -< well-orders X. (a) (b): If {X, -<) is a well-ordering, define F on X := V{X) - {0} by: F(Y) := -< -least element of Y To be precise F is the set F = {{Y,y) e X X X :y eV A eY {y ^ z)} which exists by Separation. Then F is clearly a choice function on X. H As an immediate corollary we have: Theorem 6.2.2 In ZF the following are equivalent: (a) (AC) (h) Every set can he well-ordered. Proof: (a) =^ (b): Suppose that X is a set. By (AC), V{X) — {0} has a choice function. By proposition 6.2. 1[ X can be well-ordered. (b) =^ (a): Suppose that X is a. family of non-empty sets. Let X = \] X . Then X C V{X) — {0}. Since X can be well-ordered, there is a choice function F for V{X) — {0}. Then F \ X is a, choice function for X . Recall that the alephs ^^q, were defined to be the cardinals of the well-orderable sets. (AC) implies that every set is well-orderable. In particular, for every infinite set X there is a unique a G ON such that \X\ = Nq,. Hence (AC) considerably simplifies cardinal arithmetic of all sets. We shall investigate cardinal arithmetic in ZFC in a later section. 87 Axiom of Choice 6.3 Zorn's Lemma and Other Maximal Principles We wish to show how, by introducing a certain axiom on sets of sets instead of the well-ordering theorem, one is enabled to make the proofs shorter and more algebraic. — Max Zorn, A Remark on Method in Transfinite Algebra, 1935 Recall the following: Definition 6.3.1 Let (P, <) be a partially ordered set. (i) If X C P, then an element ti G P is an upper bound for X \i\/x ^ X [x < u). (ii) An element m G P is called a maximal element of P if G P {m < x). (iii) A non-empty subset C C P is called a chain in P if C is totally ordered by <, i.e. for all x,y £ C, either x < y or y < x. (iv) A chain C in P is said to be a maximal chain if it is not a proper subset of any other chain in P. □ Theorem 6.3.2 In ZF, the following are equivalent: (a) (AC) (b) Zorn's lemma (1935): If every chain in a non-empty partially ordered set (P, <) has an upper bound, then P has a maximal element. (c) Hausdorff's Maximal Principle (1909): Every chain in a non-empty partially ordered set can be extended to a maximal chain. Proof: (a) (b): Let (P, <) be a partially ordered set with the property that every chain in P has an upper bound, and let P be a choice function for the family V{P) — {0}. Also choose some set q ^ P. For ^ C P, define 'W A := {x £ P : \/a £ A [a < x)} . By transfinite induction, define a map G on ON as follows: If a G ON, G{a) :-- P(tt ran(G \ a) if tt ran(G \ a) ^ a else To explain in simple English, suppose that := G'(^) has been defined for ^ < a. Then Oq, := G{a) is an element chosen (by F) from {x G P : < a (ag < x)}, i.e. is an element of P so that aa > for all ^ < a, provided such an element exists. If such an element does not exists, then aa ■= q- If Oq, 7^ for all a G ON, then : a G ON] is a sequence of distinct elements of P, i.e. G : ON — t- P is an injection, which is impossiblej^ Let A be the least ordinal such that ^By a similar argument as in the proof of Proposition 6.2. l[ Here is another proof: If h{P) is the Hartog's number of P, then {aa : a < h{P)l would be an injection from h{P) to P, contradicting the very definition of Hartog's number. Set Theory 88 a\ = q. Then C := {a^ : ^ < A} is a chain in P, and therefore has an upper bound u in P. We claim that u is a maximal element of P: For if not, then there is a; G P such that u < x. Hence tt {o^ : ^ < A} / 0, so oa G -P, i.e. q e P — contradiction. (b) ^ (c): Let C be a chain in a partially ordered set (P, <). Let r := {D C P : D is a chain in P, and D D C} Then (P,(^) is a partially ordered set, ordered by set inclusion. We show that every chain in P has an upper bound in P. So suppose that C := {Ci : i G /} is a chain in P. Define U ■.= {JC. Then (i) [/ is a chain in (P, <): For if x, y G U, there are i,j G / such that x £ d and y G Cj. Since C is a chain in (P, C), we have either Ci C Cj or Cj C Cj. Suppose without loss of generality that Cj C Cj. Then both x,y G Cj. Since Cj is a chain in (P, <), we have must have either x < y oi y < x. (ii) J7 G P: Clearly C CU. (iii) U is an upper bound of C in P: If d G C, then Ci C\JC = U. It follows from Zorn's Lemma that P has a maximal element Cm Then clearly Cm is a maximal chain in (P, <) which extends C. (c) =^ (a): We show that every family of non-empty sets has a choice function. Let ^ be a family of non-empty sets. Let F be the collection of all partial choice functions, i.e. F :={/:/ is a function, dom(/) C X, and f{X) G X for all X G dom(/)} Order F by inclusion, so that (F, C) is a partial ordering. By the Hausdorff Maximal Principle, there is a maximal chain C in F. Let P := IJC. Then (i) P is a function: For if [X,x), {X,y) G P, then there exist /, 5 G C so that {X,x) G / and {X,y) G g. Since C is a chain in (F, C), either / C (7 or 3 C /. Suppose without loss of generality that f ^ g. Then both {X, x), {X, y) G g. Since 5 is a function, x = y, as required. (ii) P G F: Indeed, if {X,x) G P, there is / G C so that {X,x) G /. Observe that F{X) = X = f{X). Since / is a partial choice function, f{X) G X, and thus F{X) G X. (iii) dom(P) = X. For suppose there is X E X — dom(P). As each element of X is non- empty, one may choose x e X. Then G := FU{{X, x)} G F also. Moreover, D := CU{G} is a chain, since / C G for all / G F. Then D is a chain which properly extends C — contradiction, since C is a maximal chain. Thus P is a choice function with domain X. 89 Axiom of Choice Remarks 6.3.3 Various maximum principles were discovered in the years since Zermelo proposed the Axiom of Choice, by Hausdorff, Kuratowski, Tukey, Teichmiiller, Zorn and others. Zom's Lemma is the (AC)-equivalent that has become the industry standard, a tool required by all mathematicians working with infinite sets. Although it goes by the moniker of "Lemma", Zorn actually proposed it as an axiom. Indeed, Zorn proposed the following maximum principle. (MP): Let ^ be a family of sets partially ordered by C and closed under the taking of unions of C-chains. Then A has a maximum element. We saw in Chapter 2 that every partially ordered set {X, <) is order- isomorphic to a family of sets ordered by inclusion: Define X := {], x : x E X}, where X x := {y E x : y < x}. Then {X, <) and {X, C) arc order-isomorphic. In particular, and chain in X corresponds to one in X, any maximal element in X corresponds to one in X, and vice versa. Thus the maximum principle (MP) is equivalent to Zorn's Lemma. Observe that the proof of Zorn's lemma from (AC) required transfinite recursion. Zorn's lemma is a method by which one can avoid transfinite recursion — it is built in, as it were. This usually makes the technique more palatable to the working mathematician. □ Here follows an example of the use of Zorn's lemma: Theorem 6.3.4 Every vector space has a basis. Let F be a vector space, and let P be the family of all linearly independent subsets of V. We claim that (P, C) satisfies the hypotheses of Zorn's Lemma. To see this let C := {Cj : i E 1} is a chain in P. We must show that C has an upper bound. Define U = \JC Then: (i) [7 G P: For if vi,...,Vn € U, then there arc G / such that € Cj^. Since C is a chain, the finite set {Cjj, . . . ,Cj^} has a maximum element, say Cj .. Then each vi, . . . ,Vn G Cij . Since Ci- is a linearly independent subset of V, the vectors vi, . . . ,Vn are linearly independent. Hence ?7 is a linearly independent subset of V also. (ii) U is an upper bound for C in (P, C) Each Q G C has Ci C U. By Zorn's Lemma, P has a maximal element B, i.e. a maximal linearly independent subset. If w G F is such that v ^ B, then B U {v} is not linearly independent. Hence there are 6i, . . . , 6„ G -B and scalars ai, . . . , a„ so that aibi H — • + anbn + v = 0. It is now easy to see that each v eV is a linear combination of members of so that 5 is a basis. H We end with a short list of results that depend on (AC) for their validity. • Every vector space has a basis. Moreover, any two bases have the same cardinality. • Tychonov's Theorem in topology: The product Hie/ of arbitrary family of com- pact spaces is compact (in the product topology). Tychonov's Theorem is equivalent to (AC). Set Theory 90 • Nielsen-Schreier Theorem in group theory: A subgroup of a free group is free. • Hahn-Banach Theorem in functional analysis. This leads to various infinite-dimensional separating hyperplanc theorems, which assert that any two disjoint closed convex sets can be separated by a hyperplane. • Every field has a unique algebraic closure (up to isomorphism). • Compactness Theorem in logic: If every finite subset of a set S of first-order sentences has a model, then E has a model. 6.4 Cardinal Arithmetic in ZFC (AC) simplifies cardinal arithmetic considerably. It also extends its range to infinite sums and products. Recall that, by definition, \X\ < \Y\ if and only if there is an injection X ^ Y. Every injection has a left inverse (without using (AC)) which is a surjection. Using (AC), every surjection has a right-inverse, which is an injection. Hence (using (AC)), \X\ < \Y\ if and only if there is a surjection Y ^ X . Also recall that the alephs arc the cardinals of wcU-orderable sets: If X is well-order able, then there is an order isomorphism (and thus a bijection) from some ordinal to X. Then \X\ is defined to be the least ordinal for which there is a bijection onto X. Now, since the Axiom of Choice implies that every set can be well-ordered, every infinite cardinal |X| is an aleph, i.e. for every infinite set X there is an a € ON such that |X| = K^. Thus the cardinals form a well-ordered class 0<1<2<. ..,<i^o< i^i <••-,< i^u;<--- Every infinite cardinal appears on the list '■ ol G ON], i.e. the class of ordinals and the class of alcphs are "order-isomorphic" as classes. We defined the alephs inductively as follows Ko := Kq+i := for successor ordinals a + 1 Ka '■= sup{Kq, : a < A} for limit ordinals A For obvious reasons, we call a cardinal of the form Kq+i a successor cardinal. A cardinal of the form where A is a limit ordinal, is called a limit cardinal. Every cardinal is either a limit cardinal or a successor cardinal. For example ^uj-,^uji are limit cardinals, whereas ^i., ^^j^i^^^^^j^^.^j^^ are successor car- dinals. In particular, any two cardinals are comparable (this is the Law of Trichotomy): If X,Y are sets, then either |X| < |y| or |y| < \X\. For if \X\ = and \Y\ = K^, then < if and only if a < /3. Furthermore, as we showed earlier that + = • = ^a-, using the canonical well-ordering on ON x ON, we must have « + = = for every infinite cardinal k. It follows easily that n + \ = K ■ \ = max{K, A} 91 Axiom of Choice for any two infinite cardinals k,X. Here is an oft-used fact that reUes on (AC): Theorem 6.4.1 (a) The union of a countable family of countable sets is countable. (b) If X is a family of sets, then \[jX\< \X\-svip{\X\ -.XeX] Proof: (a) Let X := {X„ : n < cj} be a countable collection of countable sets. For each n < oj, there is a surjection /„ : a; ^ Xn : m i-> Xn,m which lists the elements of i.e. Xn ■■= {fn{m) :m<u} = {xn,m : m < u;} Choose such an for each n < u. We then have a surjection / : a; X a; -» [J A" : (n, m) Xn,m Since we know |a; x a;| = Kq • Kq = Kq, we see that (J is countable. (b) Let K := \X\ and let A := sup{|X| : X G X}. Thus we can enumerate X := {Xa : a < k} Then, for each a < k, we can choose an enumeration Xa ■= {xa,i3 '■ f3 < Xa} whcrc = Aq, := \Xa\ < A Fix xq g\JX, and define a map p : k x X ^ \JX hy , f Xa,fi if ^ < Aa I xq else Then p is surjective, so | [J -Y] < k • A. H Proposition 6.4.2 If2<K<X and X is infinite, then = 2^ Proof: 2^ < < (2'')^ = 2'^'^ = 2^, since k-X = max{K, A} = A. H It follows, for example, that = 2^°. With (AC), we can define infinite sums and products of cardinals as follows: Definition 6.4.3 (a) Let {ki : i G /} be an indexed set of cardinals. Let {Xi : i G 1} be a family of pairwise disjoint sets such that \Xi\ = Ki. Define Set Theory 92 (b) Let {Ki : i G 1} be an indexed set of cardinals. Let {Xi : i G /} be a family of sets such that \Xi\ = Ki. Define llKi = \llXi\ iei iei where Hie/ -^i set of all choice functions, i.e. the set of all functions f : I ^ [Jiei with the property that f{i) G Xi for all iei. (c) If K, A are cardinals, define K^'^ := sup{/c^ : // a cardinal, /j, < X} □ Remarks 6.4.4 Observe that without (AC), it does not follow that J2iei '^i well-defined. One could have \Xi\ = \Yi\ for all i G /, without having lUie/^'^il = lUie/^^l (where {Xi : i G /} and {1^ : i G /} are families of pairwise disjoint sets). The reason is that to construct a bijection which shows that | IJjg/ Xj| = | IJjg/ Yi\, one first has to choose bijections fi:Xi>—^ Yi that show \Xi\ = \Yi\. Thus (AC) is necessary to make the definition of Yliei '^i independent of the sets Xj. A similar remark holds for Hie/ □ Remarks 6.4.5 1. Observe that if for all i G I, and if \I\ = A, then Ki = K ■ X iei i.e. sums and products behave as they ough to. For if X is a set of cardinality \X\ = k, and if we set Xi := X x {i} for all i G /, then IJie/ X x I are the same set. Now by definition X^jgj Ki = \ Ujg/ -^^l k ■ X = \X x I\. 2. Further observe that if for all i G /, and if |/| = A, then YlKi = K^ iei i.e. products and exponentials behave as they ought to: For if X is a set of cardinality \X\ = K, and if we set Xi := X for all i & I, then Yiiel -^i °^ functions from / to X, i.e. Hie/^i ^^'^ -^^ ^^^^ same set. Now by definition. Hie/ '^i — \ W.iei-^i\ and K^ = \X^\. 3. Next, observe that iei iei For if \Xi\ = Ki, then there is an obvious bijection ^ : nie/^'^i'* ~^ (Wiei ■^i)'^ defined as follows. If / G Yli^iXf, then f : I ^ Uiel lias fi := f{i) : X ^ Xi for each i G /. Thus fi{S,) G Xi for each i G I and ^ < A. Define / : A ^ Hiei^i fm) = Mo = fm)- 93 Axiom of Choice 4. Similarly, □ Proposition 6.4.6 (a) Suppose that A is an infinite cardinal. If Ki > for all i < X, then Ki = X - sup Ki (b) Suppose that X is an infinite cardinal. : i < XJ is an increasing sequence of non-zero cardinals, then JJ = (sup Ki)^ Proof: (a) Let a := "^i^x '^i' ^ supj^i : i < A}. Clearly ^ < a. Observe that a = Ki < n = X ■ jJL < X ■ a (★) i<\ i<X Further note that, since each > 0, X = ^\ <^Ki = a i<\ i<\ Hence A ■ cr = cr, so by (★) it follows that a = X - jjl. (b) Let vr := Hka let := supjKj : i < A}. Observe that since the Ki are non-zero, we have tt > /x. Now since Ki < jj, for all i, we have To prove the reverse inequality, recall that ^^^x A = A x A = A. Thus we may partition A into a family {Aj : j < A} of pairwise disjoint sets with cardinalities \Aj\ = A. Then TT = n «i = n ^ ) = n = i<\ j<\ ieAj j<\ (The fact that Y\i<\ = n7<A(n7eA ^*^y establish.) If A is a set and k a cardinal, define [A]^ := {X C A : \X\ = A} [A]<^ — {X^A: \X\ < X} Proposition 6.4.7 (a) If \A\ = k > X, then K^ Set Theory 94 (b) If \A\ = K> X, and A is an infinite cardinal, then [A\<^ Proof: (a) If / G A^, then / : A -> ^, so / C A x A, and |/| = A. Since \\ x A\ = X ■ k = k, we see that < |[A x = It follows that = \A^\. On the other hand, for every X G [A]^, there is at least one function Fx : X ^ A whose range is X. Hence [^]^| < \A^\ = K^. (b) We have [A]<' = u [Ar /U a cardinal <A SO by (a) I [^]^| = ^ = A • sup{«;'' : n a cardinal < A} = A • k^-^ = II a cardinal <\ H Lastly, we state and prove: Theorem 6.4.8 (Konig's Theorem) // < Aj for all i G I, then iel i&I Proof: Suppose, on the contrary that Hie/ — ^iei l"'^*! ~ Since | Hie/ -^il ^ Sis/ '^i^ '^^^ ^^'^ * partition {1^ : z G /} of Ilie/ "'^i ^^^h that < Kj. Here, the Yi are pairwise disjoint, and IJie/ ^ ~ Tlie/ For each i G /, let Pj be the projection of Yi onto the i*^ coordinate, i.e. Observe that Pi C Xj. Now |Pj| < \Yi\ = Ki < Aj = \Xi\, so there is Xi £ Xi — Pi for each i G I. Define h : I —> [Ji^j Xi : i ^ Xi. Then certainly h G Ilie/"'^*' Since h{i) Pi for any i G /, we see that h ^Yi for any i G /. It follows that h G niei" "'^^ ~ Uie/ ^ — contradiction. H Corollary 6.4.9 For every cardinal k, we have k < 2". Proof: 1 < 2. Now i<K i<K; H Observe that we require the inequality Ki < Aj to be strict in Konig's Theorem. For example, if Kn = Xn = 2^° for all n < Kq, then 2^0 = Ko • 2^° = 2^0 n<Ko and Y[ 2^0 = (2^0)^0 = 2^0 n<Ko 95 Axiom of Choice 6.5 Length, Area, Volume and (AC) "Length" is a non-negative real number associated with certain subsets of the real line M, i.e. it is a function £(•) which assigns to a subset C M its length C{E). Intuitively, the length function should have certain properties: l£. If C M is bounded, then <£(£;)< oo. 2c- C{-) is additive: If F are disjoint bounded subsets of M, then C{E U F) = C{E) + More generally, £(•) is countably additive: if Ei, E2, ■ ■ ■ En, ... (n G N) is a sequence of mutually disjoint bounded subsets M, then >C(U^i ^n) = X^^i J~-{En)- 3c- is translation invariant: If a set F is obtained by shifting a set E, then C{F) = C{E)- Ac- If is a interval, then C{E) = length of interval. Using an argument due to Vitali in 1905, we show that it is impossible to assign a length to every bounded subset of M, i.e. there is no function which satisfies each of the properties (l)-(4) of C{-) above, and which is defined for every bounded subset of M. Thus there are subsets of M which have no length. This does not mean that these sets have zero length; it means that there is no number which can be called their length, and which is consistent with (l)-(4). Example 6.5.1 Define an equivalence relation ~ on M by X ~ y *^=^ y — X G Q Let {Ei : i G /} be an enumeration of the equivalence classes of ~. Note that if x G M, then there exists ^ G Q such that < x + g < 1. Now since x ~ x + 5, we see that for every x there is y G [0, 1] such that x ~ y. Thus [0, 1] H / for every i G /. Now piclij^for each i G / one Xj G [0, 1] n Ei, and define a Vitali set H hy H := {xj :G /}. Thus for each y G M there is a unique i £ I such that y ^ Xi- For q £ Q, define H + q := {xi + q : i G /}. First note that the H + q are mutually disjoint: For if y G {H + q) r\ {H + g') for rational numbers g, g', then y = Xi + q = xj + q' for some i,j G /, and thus Xj = xj + {q' — q), i.e. Xj ~ Xj- It follows that Xj = Xj, thus that q = q' , and thus that H + q = H + q' . Next, we claim that for each y G M there is a unique g G Q such that y G H + q := {xi + q : i £ I}. Indeed, existence follows from the fact that there is an i G / such that y ^ Xi, so that q := y — Xi has the property that y G H + q. Uniqueness follows from the disjointness of the H + q- Now let {qn : n G N} be an enumeration of Q Pi [—1, 1]. Note that if x G [0, 1], there is a unique i £ I such that x — Xj G Q H [—1, 1], so that x G UneN(-^ + Since H C [0, 1], we also have UneN(-^ + Qn) ^ [-1,2]. Thus: [0,1] c\J{H + qn)^ [-1,2] neN ■^This requires the Axiom of Choice. Set Theory 96 Now suppose that the Vitah set H has an length i.e. that C{H) exists. Each H + qn is a translation of if, and thus C{H+qn) = C{H) for all n G N. Now since [0, 1] C \J^^^{H+qn) C [-1,2], it follows that The fact that 1 < Y.n=i^^^) implies that C{H) > 0, whereas the fact that E^=i'C(^) < 3 < oo implies that C{H) — contradiction. Hence C M is a bounded set to which a length cannot be assigned. Just so, under (AC), some subsets of fail to have an area, and some subsets of fail to have a volume. The latter seems particularly counterintuitive: Given any subset of M^, to find its volume, just dump it in a full bucket of water, and measure how much water flows out. It gets even worse, as will be shown in the next section. 6.6 Project: The Banach— Tarski Theorem The paradoxical decomposition of the unit ball U in due to Banach and Tarski (1924) is one reason why the Axiom of Choice may be regarded with some suspicion. It says that the unit ball U can be broken up into finitely many pieces (into five pieces, in fact), and that these pieces can be reassembled to form two new unit balls! First, we introduce some terminology and notation so that we can phrase the Banach- Tarski Theorem. • An isometry on is a distance preserving map, i.e. a map / : ^ such that ||/(x) — /(y)|| = ||x — y|| for all x, y G M^. Observe that any isometry is clearly a bijection. Translations and rotations (and compositions thereof) are isometries. • Two sets A^B C are congruent if and only if there is an isometry / : — )■ so that f[A\ = B. This means that A, B have the same shape — think of congruent triangles — and that / moves the figure A to the figure B. We write A = B \i A, B are congruent sets. • We say that a set A can be reassembled into a set B if there is a partition (^i)i<n of A and a partition {Bi)i^n of B (into the same finite number of pieces) such that Ai = Bi for all i < n. In that case, we write A^ B. The idea is that the set A can be broken up into pieces Ai, that each piece Ai can be moved by an isometry to Bi, and that the pieces Bi make up the set B. Thus A can be broken up into finitely many pieces, and the pieces can be reassembled to form B. \n=l / Furthermore, the sets H + qn are mutually disjoint, so □ 97 Axiom of Choice Theorem: (Banach-Tarski) The unit ball U in can be partitioned into two disjoint sets U = U1UU2 such that Ui ^ U and U2 ^ U □ Problem 1. (a) Show that « is an equivalence relation on P(M^). (b) Show that if (^j)i<n is a partition of A and {Bj)i^n is a partition of B such that Ai ~ Bi for all i < n, then Am B. (c) Prove a "Schroder-Bernstein Theorem" for ss: If A^ C B C A and A, then S A. [Hint: Let (Aj)i<„ and (^j^)j<„ be partitions of A and A^, and let fi : Ai ^ A\ be isometrics so that fi[Ai] = A\. Define / = Ui<n /»' / agrees with /j on ^Ij. Define ^° = A, = /[A'^], 5° = 5, B"+i = /[B'^] Let C := Un<a;(^" - -S")- Observe that f[C] C C, and that /[C] !==j C. Further observe that A = C\J{A-C) &ud B = f[C\ U (A - C), and deduce that A S.] □ Our next aim is to partition the unit ball U into five pieces which can be reassembled to form two unit balls. To do that we will partition the unit sphere S into pieces having certain properties. {S is the boundary of U .) Let c denote the centre of the unit ball. Each X <^ S determines a unique X <^U as follows: X := set of all points in ?7 — {c} whose projection along a ray from c to the surface S lies in X. Clearly, \i X,Y C. S are such that X then also X KiY . Consider now two axes do,di in centred at c. For dcfinitcncss, assume that do is the Z-axis, and that di lies in the XZ-plsme at an angle 6 to the Z-axis. Consider two rotations: ijj = Rotation by 7r/3 about do (p = Rotation by tt about di Observe that tp, ip are isometries which map S to S. Let / be the identity. Note that if)^ = I and (p^ = /, so that (p~^ = ip,ip'''^ = ip^- Consider the group G of all rotations generated by ip, (p. Take, for example, the element a := (poipoipoloil^oip. Using the facts that 1]/' = I and p'^ = /, we can reduce this to a = o ip. In this way, every element of G has an irreducible representation of the form (p o ip^^ o (p o tp^^ o . . . or tp^^ o (p o tp^^ o (p o . . . Each such an irreducible representation has a length: 1{I)=0, = l{',p^^) = 1, l{ip o ^p^'^) = l{ip^^ o ^) = 2, l{p) o ^p^^ o p)) = 2, . . . Set Theory 98 Now the group G depends on the angle 9 between (Iq and di. We want to show that there is so that every element of G has a unique irreducible representation. Equivalently, we want to show that we can choose 9 so that no non-trivial irreducible representation yields the identity /. Problem 2. (a) Show that each element of a G G can be represented by an orthogonal matrix (rotation matrix) whose entries are bivariate polynomials in cos 9 and sin 9 (i.e. for each entry aij of the matrix a there is a polynomial Pij{x, y) so that a^j = pij{cos9, sin 9)). [Hint: A rotation by tt about di can be effected by rotating the axis di by 9 onto the Z-axis, then rotating by tt, and then rotating the axis by —9 back to its original position.] (b) Show that an irreducible representation a G G of length > 1 yields the identity element if and only if z := e*^ is a solution to a system of polynomials. [Hint: cos^ = (c) Deduce that for each a G G there are at most finitely many 9 so that a yields the identity element. Conclude that there exists a so that no irreducible representation of length > 1 yields the identity element. □ Remarks: We've tried to avoid group theory in the above. What we were actually doing, should you know some group theory, is to show that the free product Z2 * Z3 of the groups (if) = Z2 and (ip) = Z3 is embeddable into the group SO(M^). If Go is the group of rotations generated by ip, tp when the angle between do, di is 9, then there is an obvious homomorphism Tr^) : Z2 * Z3 — )■ Ge- We showed that there exists 9 such that ng is an isomorphism. □ Henceforth, fix a so that no irreducible representation of length > 1 yields the identity of the corresponding group G. Then every element of G has a unique irreducible representation. We define a partition G = ^Ui?UGby induction on the length of irreducible representations. The elements of length < 1 are classified as follows: Now classify the elements of length > 2 according to the following rules: • If a starts with ip^'^, then (1) a e A =^ ifoae B. (2) a e BUG =^ (poae A. Observe that if a starts with (p, i.e. if a = (po/3, then ipoa has irreducible representation /3, since (po a = (p o o (3 = p. Since the length of /3 is < the length of a, o a = ^ has already been classified. 99 Axiom of Choice • If a starts with then tpo a e B and il^"'^ o a e C. ijjo a e C and 'tp~^ o a e A. ijjo a e A and ^"-^ o a G -B. If a starts with tp"^^, then ■0='=^ o a has aheady been classified. Problem 3: Show that (p[A] = BuC ip[A]=B tp-\A] = C [Hint: To show ip[A] C B L) C , let a G A. If q starts with ip, then a = ip o (3 for some /3 G i? U C, by rule (2). Hence ipoa = (3GB\JC. Else, if a starts with i^^^, then by rule (1), ^poa e B C BUC. Hence (p o a e B U C ioi all a e A. Conversely, to show B U C C ip[A], suppose that a £ B L) C. If a starts with (p, then a = ip o p for some (3 £ A (and thus ip o /3 E B), so a E p>[A]. Else if a starts with ip^^, then o a G A, by (2), so a = o (99 o q) G <^[A]. Hence a G <^[A] for all a G -B U C. The other equations can be proved similarly.] □ We now use the partition G = AUBUC to construct a partition S = X\JY\JZ\jQof the unit sphere. This, in turn, will induce a partition of the unit ball B. Problem 4: We show that there is a partition S = XUYUZUQof the unit sphere S, where \Q\=^o and X = Y = Z = YUZ (a) Observe that |G| = b^o- Also note that every rotation of S other than the identity has exactly two fixed points. Let Q be the set of all x G 5 so that a; is a fixed point of some aeG- {I}. Conclude that \Q\ = \Ho\. (b) For X G 5 - Q, let := {a{x) : a G G} Define a binary relation Ron S — Q by xRy < — > x e Py. Show that R is an equivalence relation. (c) The Axiom of Choice implies that there is a set M which contains exactly one element of each equivalence class of R. Define X --A-M ■-{a{x):xeM,aeA}, Y := B ■ M, Z = C-M where G = AU B U C is the partition of G constructed earlier. Show that ip[X] = YUZ, iPlX] = Y, ^~'^[X] = Z {3) aeA =^ {A) aeB =^ (5) a G C =^ (d) Conclude that X = Y = Z = YUZ. Set Theory 100 □ Now recall that if X C S, then X is the set of all points inU — {c} whose projection along a ray from the centre c onto the surface S lies in X. The partition S = XUYUZUQofS therefore induces a partition U = XUYUZUQU{c} of the unit ball. Moreover, clearly X f-^ Y Z Y U Z Problem 5: (a) Show that Y U Z ^ X \J {Y U Z). Deduce that X^Y^Z^YUZ^XUYUZ (b) Show that U ^ X\jQU{c}. (c) Show that there is a rotation tt G such that Tr[Q] Ci Q = 0. Conclude that 7t[Q] C XUYUZ. Using the fact that Z Ki XUYUZ, deduce that there is T C such that [Hint: Choose an axis of rotation whose poles are not members of Q. For each angle 5 G [0, 27r), let its denote a rotation by S about this axis. For q G Q, define Aq := {S G [0, 27r) : Trs{q) G Q}. Then A := [j^^g Ag is countable. Choose S G [0, 27r)-A, and let tt := tts-] (d) Let p & Z — T. Observe that U^XUQU{c}^Yufu{p}CYUZCU Conclude that YUZ ^U. (e) Finally note that U = UiUU2, where Ui := X[jQU{c} and U2:=YUZ. This concludes the proof of the Banach-Tarski Theorem. □