Skip to main content

Full text of "Set Theory"

See other formats


UMIVERSITEIT-STELLEIMBOSCH-UIMIVERSITY 

jou kennisvennoot • your knowledge partner 



SET THEORY 

P. Ouwehand 



Department of Mathematical Sciences 
Stellenbosch University 



2012 



i 



ii 



ZFC 

ZFC 0. Set Existence: There is a set. 

3x{x = x) 

ZFC 1. Extensionality: If sets x, y have the same elements, then x = y. 

Vx Vy [V2;(2; Gx-H-zGy)— >-a; = y] 

ZFC 2.tp Separation Schema: If P{x) is a property of sets (with parameter p) describable by a 
first-order sentence ip{x,p), then for any set Z and any p there is a set F = {x G Z : P{x)} 
which consists precisely of those elements of Z which have the property P. 

yzyp^y^x^xEy-^-^xEzA if{x,p)) 

ZFC 3. Pairing: For any x,y there is a set z = {x,y} whose elements are precisely x and y. 

Vx Vy 3z [Vit; {w E z -(-^ w = x'V w = y)] 

ZFC 4. Union: For every set x there is a set y = |J x which is the union of all the elements of x. 

Vx 3y Vz G y ■<->■ 3w G x (z G w)] 

ZFC 5. Power Set: For any set x there is a set y = V{x) whose elements are precisely all the subsets 
of X. 

Vx 3y G y -H- z C x) 

ZFC 6. Infinity: There is an infinite set. 

3x [0 G X A Vy (y G X ^ y U {y} G x)] 

ZFC 7. Choice: Every family of non-empty sets has a choice function: If x is a family of non-empty 
sets, then there is a function f on x which chooses from each it; G x an element f{w) G w. 

Vx \iw G X ^ 0) ^ 3/ ( / is a function A dom(/) = x A ^/w G x {f{w) G w))] 

ZFC 8.,^ Replacement Schema: A relation R{u, v) between sets which is describable by a first- 
order sentence (^(n, v,p) (with parameter p) is said to be functionlike if for every u there is 

at most one v such that R{u, v). 

If R is functionlike and x is a set, then the image z = R[x] of x under i? is a set. 



(yu Vv Vu; [ip{u, v,p) A <^(n, — >■ t; = w]^ — >■ Vx 3y V?; (?; G y ■<->■ 3n G x (^(-u, 



ZFC 9. Foundation: Every non-empty set has a G-minimal element. 

Vx [x 7^ -)• 3y G X (y n x) = 0] 



iii 



These notes are for a short course in set theory at the undergraduate level at Stellenbosch 
University. No pretense at orignality is claimed. Though amplified by material from a number 
of additional sources, the debt to the first few chapters of the book Set Theory, by Thomas 
Jech, Springer 2003, should be easily discernible. 



iv 



Contents 



1 


The Axioms of Set Theory 




1.1 Introduction! 




1.2 Naive Set Theory 




1.3 The Iterative Conception of Set 




1.4 The Axioms of ZF(]I 




1.5 Sets and ClassesI 



12 Relations and FunctionsI 

12.1 Ordered Pairs and Finite Cartesian ProductsI 

12.2 Relationsl 

12.3 FunctionsI 

2.4 Equivalence Relations 

12.5 Order Relationsl 

13 Natural NumbersI 

3.1 The Set uj of Natural NumbersI 

3.2 ((J, g) is a Transitive Well-Ordered Set . . . . 

3.3 RecursionI 

13.4 Finite and Infinite SetsI 

14 Ordinalsl 

4.1 Well-Orderings 

14.2 Ordinalsl 

14.3 Transfinite Induction and Recursion! 

14.4 Ordinal Arithmetic! 

14.4.1 Limits! 

14.4.2 Addition! 

4.4.3 Multiplication 

4.4.4 Exponentiation 

4.5 Project: Goodstein's Theorem 



15 Cardinals I 

5.1 Cardinality: Basic Definitions 

15.2 Cardinal Arithmetic in ZF! 

5.3 Alephs 

5.4 The Canonical Well-Ordering of ON x ON 

5.5 Arithmetic of Alephs 



V 



vi 



CONTENTS 



5.6 Project: Cardinal Arithmetic and The Axiom of Choice 79 



The Axiom Of Choicel 



83 

83 

6.2 Well-Ordering] 85 



6.1 Choice Functions, Partitions and Cartesian Products 



6.3 



Zorn's Lemma and Other Maximal Principles] 87 

90 

95 

96 



6.4 Cardinal Arithmetic in ZFCl 



6.5 Length, Area, Volume and (AC) 



6.6 Project: The Banach-Tarski Theorem 



Chapter 1 

The Axioms of Set Theory 



1.1 Introduction 

Every mathematician needs to know some set theory. "Set theory", according to (the set 

theorist) Kenneth Kuncn, "is a theory of everything". AH of mathematics can be done within 
the framework of set theory. This means that all mathematical objects can be interpreted as 
sets, and all mathematical statements about these objects can be phrased in the language of 
sets. Not only can this be done: Typically, it is done. 

We assume here that, to some extent, you are already aware of this. For example, a group 
is an ordered tuple {G, , e), where 

• G is a set. 

• ■ : G X G ^ G is a function which plays the role of group multiplication. 

• T-Gisa function which plays the role of multiplicative inverse 

• e G G is an element which plays the role of multiplicative identity element. 

You probably know that a function / : ^4 — t- i? is a set: It a subset of the Cartesian product 
A X B, where we regard (a, 6) G / if and only if /(a) = b. The Cartesian product A x B of 
two sets is itself a set, namely the set of ordered pairs (a, b) where a & A and b G B. An 
ordered pair (a, b) can be interpreted as a set also, namely (a, b) := {{a}, {a, b}}. Ordered 
tuples of higher order can be defined iteratively, 

(a, 6, c) := ((a, 6), c) = {{(a, 6)}, {(a, 6), c}} = {{{{a}, {a, 6}}}, {{{a}, {a, 6}}, c}} 

The group laws can then be written as equality of certain functions: For example, the asso- 
ciative law {ab)c = a{bc) is the assertion that the functions given by the compositions 

G X G X G ^ G X G ^ G : (a,b,c) ^ {ab, c) {ab)c 

and 

G X G X G ^ G X G ^ G : {a,b,c) ^ {a,bc) ^ a{bc) 

are equal. Since functions are sets, the associative law for G is equivalent to the assertion that 
two sets are equal. In principle, everything about groups can be formulated in the language of 
sets. It will look horrendous, it is true, but it can be done. Thus the whole group (G, , e) 

1 



Set Theory 



2 



is a set — not just the base set G on which the group operations are defined — the whole 
object {G, -'^ , e). And each element of (G, , e) is a set,. . . and each element of an element 
of (G, -'^ , e) is a set, and. . . , 

In the same way all the objects you encounter in mathematics — natural numbers, com- 
plex numbers, topological spaces, differentiable manifolds, tangent bundles. Boolean algebras, 
measure spaces, stochastic processes, pseudo-differential operators — can be interpreted as 
sets. We do not say that these objects are sets; indeed, we prefer to completely avoid meta- 
physical speculation about what mathematical objects might be, and in what sense they might 
be said to exist. We simply mean that we can construct sets that behave like these mathemat- 
ical objects. This kind of thing is not new to you: You're probably quite used to regarding 
real numbers as points on a line. But a point is not the same thing as a real number. "A 
point," says Euclid, "is that which has no parts". A real number, on the other hand, is an 
element of a complete ordered field. But, thought of in the right way, the points on a line can 
be said to behave like the real numbers, and thus can be interpreted as real numbers. 

So we will try to show that all of mathematics can be done in the set-theoretic universe. 
Many of the set-theoretic ideas required are rather straightforward, and can be developed 
without recourse to the axiomatic method (though every mathematician must be aware of 
the potential pitfalls associated with a purely "naive " view of sets) . 

But apart from its intrinsic importance as a fundamental vehicle for for doing other math- 
ematics, set theory is also an independent branch of mathematics in its own right, regarded as 
one of the four main branches of mathematical logic. In other words, set theory — like group 
theory or number theory — is worth pursuing for its own sake: "Set theory is the mathemat- 
ical science of the infinite" , according to the set theorist Thomas Jech. You probably already 
know that the natural numbers N and the rational numbers Q both form countable sets — i.e. 
they have the same size (cardinality): |N| = |Q|. If you know that, you probably also know 
that the set of real numbers M is uncountable, and thus that infinity comes in different sizes, 
with |N| < |M|. Some historians date the birth of the autonomous subject Set Theory to that 
day (7 December 1873) that Georg Cantor — who is regarded as the creator of the subject 
— wrote to Richard Dedekind to inform him of this fact. A short while later, in 1878, Cantor 
framed the Continuum Problem: Is there a set A whose size is strictly between that of the 
natural numbers and the real numbers, i.e. such that |N| < |^| < |M|? Cantor was unable 
to settle this question, though he hypothesized that there is no such set A — the Continuum 
Hypothesis. The M^er-mathematician David Hilbert was one of the first to appreciate the 
power of set theory, and its importance to general mathematics. Accordingly the Continuum 
Problem was first on the famous and influential list of 23 problems that he presented at the 
Second International Congress of Mathematicians in Paris, 1900. This innocuous-appearing 
question initially resisted all efforts, and was finally settled in a rather surprising way: It was 
proved by Kurt Godel in 1938 that the Continuum Hypothesis cannot be disproved, and then 
by Paul Cohen in 1963 that the Continuum Hypothesis cannot be proved! In order to obtain 
such results, it was necessary to write down a precise list of axioms for set theory — the 
Zermelo-Praenkel Axioms — so that there would be agreement on what constitutes a valid 
set-theoretic proof. What Godel and Cohen proved was that the Continuum Hypothesis is 
independent of these axioms. (There remains open the possibility that mathematicians of the 
future will discover a new plausible axiom for set theory, which will then settle the Continuum 
Problem in one way or the otheirl) Set theory is riddled with problems like these, where it 



^Indeed, some mathematicians believe that the Continuum Hypothesis was recently settled by the set 



3 



Axioms of Set Theory 



can be proved that the problem is unsolvabl^ Such independence proofs require methods 
from logic (model theory, proof theory, recursion theory), and thus an axiomatic approach is 
here indispensable. 

This chapter is concerned with the foundations of set theory, and a little philosophjj^ 
will be unavoidable. In fact, the title of the first ever monograph dedicated to the sub- 
ject — Cantor's Grundlagen einer allgemeinen Mannigfaltigkeitslehre: Ein mathematisch- 
philosophischer Versuch in der Lehren des Unendlichen, 1883 — makes clear that, in set 
theory, mathematics and philosophy were inextricably intertwined right from the start. 



Exercise 1.1.1 All active branches of mathematics have problems that are unsolved. It is 
usually tacitly agreed that such an unproved statement is either true or false. But set theory 
has problems that are provably unsolvable. This leads to a philosophical question: Do you 
believe that the Continuum Hypothesis must be either true or false — we just do not know 
which? Or do you believe that it simply doesn't make sense to ask whether the Continuum 
Hypothesis is true or false? Can you back up your belief with an argument? 



□ 



1.2 Naive Set Theory 

Classifying objects into sets is a basic operation of the human mind. It is more basic even 
than counting: Before you can count some objects, you first have to decide which objects to 
count, i.e. you first have to mentally group them together, and separate them from objects 
you do not want to count. Nothing seems more natural, therefore, than to collect objects 
which share a common property into a set. Naive Set Theory (NST) is a formal theory which 
allows us to do just that. It has the following axioms: 



NST 1. Extensionality: If two sets X,Y have the same elements, then X = Y. 
NST 2.p Full Separation: If P(x) is a property of sets, then there is a set 

Y = {x: P{x)} 

which consists precisely of those x which possess the property P. 

Observe that the Full Separation Axiom is not a single axiom, but an axiom schema: We 
have an axiom for every possible property P(-) (though right now we won't be too precise 
about exactly what constitutes a "property"). Thus there are infinitely many axioms. 

The first axiom states that a set is determined by its members, and by nothing else. In a 
sense, it tells us what it means to be a set. The second axiom allows us to show that many 
sets actually exist. For example: 

theorist Hugh Woodin, and that it is false. Others dispute this. 

^Assuming that the axioms of set theory are consistent, that is. But Godel proved that we cannot prove 
that the axioms of a set theory are consistent without assuming the consistency of some more powerful theory. 
In other words, it is impossible to prove from the axioms of set theory that set theory is consistent. 

''l highly recommend browsing the online Stanford Encyclopedia of Philosophy, http: //plato . Stanford, 
ledu for good introductions to the philosphy of mathematics, and much more besides. 



Set Theory 



4 



• If P{x) = {x x), then {x : P{x)} is a set with no elements (since x = x for any x). By 
the Axiom of Extensionahty any two empty sets are equal, because they have exactly 
the same elements. Hence a unique empty set exists, and we denote it by 0. 

• If P(x) = (x = x), then {x : P{x)} is the set of all sets: Any set x is equal to itself. 

• If P{x) = (x = a V X = 6), then {x : P{x)} is the set {a, b}. 

• If P{x) = {\/z £ X {z £ a)), then {x : P{x)} is the power set of a, i.e. the set of all 
subsets of a. 

• If P{x) = (x is a real number > 0), then {x : P{x)} is the set of all real numbers greater 
or equal to zero. (At this point, we do not yet know that there are any real numbers, 
so it is at this stage possible that this set is empty.) 

Towards the end of the 19th Century, mathematicians became increasingly more concerned 
with the foundations of mathematical knowledge, and the status of mathematical "reality" . 
Some logicians — amongst whom were Gottlob Frege and Bertrand Russell — tried to reduce 
mathematics to logic. In June 1902, Russell wrote a letter to Frege, warmly expressing his 
appreciation for the latter's work, but adding that "There is just one point where I have 
encountered a difficulty" . Russell asked Frege to consider the property 

P{x) = {x ^ x) 

If y = {x : X ^ x} exists (which, according to the Full Separation Axiom, it must), then there 
are two possibilities: Either y £ y or y ^ y. 

(i) If y G y, then P{y) must hold (since y consists of exactly those sets which have property 
P). In that case y ^ y — contradiction. 

(ii) li y ^ y, then certainly P{y) holds. So y is one of the sets that has property P, and 
therefore y £ {x : P{x)}, i.e. y £ y — contradiction. 

Thus the theory NST is inconsistent! Russell's paradox had devastating consequences for 

Frege's programme to reduce arithmetic to logic. 

Russell's paradoi^ (also called Russell's antinomy) is just the simplest example of the 

problems that may arise from a too liberal notion of set. Cantor himself was aware of such 

problems by 1899 at the latest, and dealt with this by denying the status of sethood to certain 

collections (multiplicities). 

For a multiplicity can be such that the assumption that all of its elements 
'are together' leads to a contradiction, so that it is impossible to conceive 
of the multiplicity as 'one finished thing'. Such multiplicities, I call 
absolutely infinite or inconsistent multiplicities. As we can readily see, 
the 'totality of everything thinkable ', for example, is such a multiplicity; 
later still other examples will turn up. 

If on the other hand, the totality of the elements of a multiplicity can be 
though of without contradiction as 'being together', so that they can be 
gathered together into 'one thing', I call it a consistent multiplicity or 
set. 

— Cantor, letter to Dedekind, 1899. 

^Apparently, Ernst Zermelo was aware of Russell's paradox some years before. But it was Russell who 
understood its devastating consequences, and who brought it to the attention of the wider world. 



5 



Axioms of Set Theory 



Thus, according to Cantor, the 'totahty of everything thinkable' is inconsistent: There is no 
set of all set^ 

If we can't just lump anything thinkable together into a set, then what can we do? A 
number of different approaches have been suggested. Some mathematicians and philosophers 
thought that Russell's paradox indicated that there were problems not just with an intuitive 
notion of set, but with logic itself. (It is easy to rephrase Russell's paradox in purely logical 
term^. The intuitionists, for example, thought that the basic laws of logic do not apply 
to infinite collections. Amongst others, they denied the validity of the Law of the Excluded 
Middl^ Other mathematicians thought that classical logic was inherently sound, and that 
one need merely analyse the notion of set with greater care and precision, if one wants to 
avoid contradictions and inconsistencies. This is the approach we will take. But avoiding an 
inconsistent set theory is harder than one might think. Firstly, a number of set theories have 
been proposed with the aim of avoiding Russell's paradox, but which subsequently turned out 
to be inconsistent after all — notably the system ML (Mathematical Logic) proposed in 1940 
by the great logician Willard van Orman Quine. And many of these set theories are artificial: 
They are, as Russell commented, not "such as even the cleverest logician would have thought 
of if he had not known of the contradictions." 

In the next section, we will look at an intuitively compelling view of the set-theoretic 
universe, in which sets are built up in stages, with each set made up from simpler sets. 
This picture of the set-theoretic universe then suggests certain axioms for set formation, the 
Zermelo-Fraenkel Axioms, or ZF. If ZF set theory is consistent, then we may never know it: 
Godel's 2""^ Incompleteness Theorem states (roughly) that a theory that is strong enough to 
contain basic arithmetic cannot prove its own consistency. If ZF is inconsistent, then one day 
we may find a contradiction. However, ZF has been used for nearly a century already, and 
none has been found as yelrl 



^How did Cantor come to this conclusion? Define the power set V{X) to be the set of all subsets of X. 
Cantor proved that \X\ < \'P{X)\. Now if X is the set of all sets, then Vi^X) C X, since every member 
of 'P{X) is a set. But then [^(X)! < \X\, and hence the assumption that all sets can be collected into a 
single set-of-all-sets leads to contradiction. Cantor knew about this by 1899 at the latest, though it was only 
published in 1932. 

^ Here's how: "It seems to make perfect sense to inquire of a property whether it applies to itself or not. 
The property of being red, for instance, does not apply to itself, since red is surely not red, whereas (the 
property of being) abstract, being itself abstract, applies to itself. Calling the property of not applying to 
itself 'impredicable', we arrive at the paradoxical consequence that impredicable is impredicable if and only 
if impredicable is not impredicable. The property-theoretical (logical) variant is as paradoxical as the set- 
theoretical (mathematical) one." — Foundations of Set Theory, by Fraenkel, Bar-Hillel and Levy, 2nd revised 
edition, Springer 1973. 

^This is the law which asserts that for any statement A, either A is true or A is false. The intuitionists 
therefore denied that the tautology A V ^A is true. 

*In September 2011, the mathematician Edward Nelson announced that arithmetic (to be precise, Peano 
Arithmetic (PA)) is inconsistent. His proof turned out to have an error. Nelson is hardly a "lightweight" 
mathematician, and this episode reveals that serious mathematicians may have legitimate qualms about the 
consistency of even the most basic and longstanding mathematical theories. 

By Godel's 2"^* Incompleteness Theorem, PA is not strong enough to prove its own consistency. ZF is strong 
enough to prove that PA is consistent [ZF h Con(Pyl)), but since we don't know if ZF is consistent, that proof 
is also suspect. (In 1936 the logician Gerhard Gentzen proved that PA is consistent. His proof goes beyond 
ordinary arithmetic by requiring a transfinite induction, but only up to a countable ordinal.) 



Set Theory 



6 



1.3 The Iterative Conception of Set 

Definition: By a "set" we mean any collection M into a whole of 
definite distinct objects m (which are called the "elements" of M) of our 
perception or of our thought. 

— Georg Cantor, 1895 

(First sentence of Contributions to the Founding of the Theory of Trans- 
finite Numbers.) 



A set, according to Cantor, is 'any collection . . . , into a whole of def- 
inite, well- distinguished objects... of our intuition or thought.' Cantor 
also defined a set as a 'many, which can be thought of as one, i.e., a 
totality of definite elements that can be combined into a whole by a law. ' 
One might object to the first definition on the grounds that it uses the 
concepts of collection and whole, which are notions no better understood 
than that of set, that there ought to be sets of objects that are not objects 
of our thought, that 'intuition' is a term laden with a theory of knowl- 
edge that no one should believe, that any object is 'definite, ' that there 
should be sets of ill- distinguished objects, such as waves and trains, etc., 
etc. And one might object to the second on the grounds that a many' is 
ungrammatical, that if something is a many" it should hardly be thought 
of as one, thai totality is as obscure as set, that it is far from clear how 
laws can combine anything into a whole, that there ought to be other 
combinations into a whole than those effected by laws, ' etc., etc. 

— George Boolos, 1971 

(Opening lines The Iterative Conception of Set, 
J. Philosophy 68 no.8, 1971, pp. 215-231.) 

Russell's paradox involved the multiplicity of all sets which are not elements of themselves. 
Can a set belong to itself? 

Looked at one way, the answer appears to be "Yes!" : Let A be the set of all sets definable, 
in English, using fewer than 100 words. We have just defined A in English using fewer than 
100 words. Hence A E A. However, "definable in English" is a rather obscure term. English 
has just finitely many words, so there are just finitely many sentences of length < 100 words. 
Thus there are at most finitely many natural numbers definable in < 100 words. Now let n 
be the smallest natural number not definable in fewer than 100 words Contradiction. . . 

Another example is the set 

X:= {{{...0...}}} 

where there are infinitely many brackets arount the empty set. Then clearly {X} = X, so 

X ex. 

But looked at from another angle, the answer to our question appears to be "No!" , however: 
In order to collect some objects into a set — i.e in order to make a set — one has to have 
those objects first. Thus the elements which make up a set are, in some sense, more primitive 
than the set itself. As an analogy, one can think of the natural numbers as being "created" 
from the number by the operation of repeatedly adding 1. We start with zero, then we make 



7 



Axioms of Set Theory 



1, then we make 2, then we make 3, etc. Now it must not be thought that natural numbers 
are continuously being created — "make" is just a manner of speaking. In the same way, 
one can think of sets of being made up as follows: We start off with some primitive objects 
called atoms (also called individuals, or urelemente), which are not sets, and have no elements 
themselves. At stage 0, we construct all sets of atoms. Then at stage 1, we construct all sets 
made up of atoms and sets created at stage 0. At stage 2, we construct all sets consisting of 
atoms and sets created at stages and 1. Thus if we start with two atoms a, b, then we get 
the following: 

Stage 0: New sets 0, {a}, {b}, {a, b}. 

Stage 1: New sets {0}, {{a}}, {{6}}, {{a, b}}, {0, {a}}, {0, {b}}, {0, {a, b}}, {0, {a}, {6}}, . . . 
Stage 2: New sets {{0}}, {0, {0}}, {{{a}}}, ■ ■ • 
Stage 3: . . . 

We do this for all the stages 0,1,2, But it doesn't stop there! Once we have all these sets, 

we can construct sets made up of atoms and sets constructed at some finite stage. Thereafter 
we can construct sets made up of atoms and sets just constructed, at the first infinite stage, 
for example the set 

{0, {0} , {{0}},{{{0}}},...} w 

1 bracket 2 brackets 3 brackets 

You may have noticed that we have not said anything about the nature of the atoms. 
What should or could atoms be? We might take them to be basic objects that we need in 
mathematics, e.g. natural - or real numbers, or points and lines. However, it turns out that 
we do not need atoms at all! Instead, we can build up all of mathematics from nothing, i.e. 
from the empty set. To do this, define a pure set as follows: 

• Atoms are not pure. 

• A set is pure if and only if all its elements are pure. 

Thus at stage 0, we form the set 0, which is pure — vacuously, because none of its members 
are impure. At stage 1, the set {0} is pure. At stage 2, we have the pure sets {{0}} and 
{0, {0}}, etc. The infinite set (★) constructed at the first infinite stage is clearly pure as well. 

We will only be concerned with pure sets. Thus when we say "set" , we mean "pure set" . 
This means that in our mathematical universe, everything is a set. 

Now observe that sets are made again and again. For example, is a subset of all sets, 
and hence is created anew at each stage. But for each set there is a least stage, i.e. a stage 
at which the set appears for the first time. 

We thus have the following intuitions about set formation via stages: 

(i) There is a least stage at which we have the set 0. 

(ii) Every stage a has a successor a + 1. At stage a + l we can construct sets whose elements 
are sets constructed at stage a and earlier. This means that if a collection (multiplicity) 
has elements all of which are created at stage a or earlier, then that collection is a set, 
and is "created" by stage a + 1. 



Set Theory 



8 



(iii) Having made sets at a whole lot of stages, there is least a new stage, at which we can 
construct sets whose elements are sets constructed at an earlier stage. 

(iv) A collection (multiplicity) is a set if and only if it is formed at some stage. 

(v) For every set, there is a least stage at which that set is formed. 

Thus we have stages 0, 1, 2, 3, . . . , n . . . . After that comes a stage — call it stage oj. After 
stage uj come stages 

a; + 1, a; + 2, . . . , 2a;, 2a; + 1, ... , 3a;, ...,a;-a;,...,a;,...,a;,...,a; ,...,a; 

Thereafter comes another stage — call it uji, etc. These stages conform to Cantor's two 
"generating principle^': (1) Every stage a has an immediate successor stage a + 1, and (2) 
given any sequence of stages without a last element, there is a stage which comes immediately 
afterwards. 

If a, /3 are stages, we write a < /3 to mean that stage a comes before stage /3. When we 
form a set at some stage /3, its elements must be sets that have been constructed at some stage 
a < /3. It is therefore impossible for a set to belong to itself. Furthermore, it is impossible to 
get a loop such as 

Xi G X2 G X3 • • • G e Xi 

Now we consider a property that will turn out to play a very important role in the future 
development of set theory: Let A be any non-empty collection of stages, and let F be the 
collection of all stages that come strictly before any of the stages in A, i.e. 
F := {7 : V(5 G A (7 < (5)}. We can then consider two cases: 

• Either F is empty. In that case if the zeroth stage belongs to A, so is the least 
element of A. 

• Else F is non-empty. In that case intuition (iii) above says that there is a least stage 
a which lies above all 7 G F. We now claim that a is the least element of A. To see 
this, note first that each 5 G A lies strictly above all 7 G F. Since a is the least stage 
which lies above all 7 G F, it follows that V5 G A (a < (5). Next observe that it cannot 
be the case that V(5 G A (a < (5), for then q G F. Hence there is some /3 G A such that 
(3 < a. But since /3 G A we also have a < (3, and thus q = /3. In particular, a G A. 
Since V5 G A (a < 5), we see that a is the least element of A. 

In both cases, therefore, we have shown that A has a least element: 

Any non-empty collection of stages has a least element. 

All of the above is at the intuitive level, and must be made formal. And we will accomplish 
this: For example, the intuitive idea of a stage will be captured by the formal notion of an 
ordinal. The fact that each set is "created" at some stage will be be evident when we consider 
the cumulative hierarchy of sets and the Axiom of Foundation. 



Cantor, Grundlagen einer allgemeinen Manmchfaltgkeitslehre, 1883. 



9 



Axioms of Set Theory 



1.4 The Axioms of ZFC 

Set theory is that branch of mathematics whose task it is to investi- 
gate mathematically the fundamental notions "number", "order", and 
"function", taking them in their pristine, simple form, and to develop 
thereby the logical foundations of all arithmetic and analysis; thus it 
constitutes an indispensable component of the science of mathematics. 
At present, however, the very existence of the discipline seems to be 
threatened by certain contradictions, or "antinomies" , that can be de- 
rived from its principles — principles necessarily governing our think- 
ing, it seems - and to which no entirely satisfactory solution has yet 
been found. [. . . jUnder these circumstances there is at this point nothing 
left for us to do but to proceed in the opposite direction, and starting 
from set theory as it is historically given, to seek out the principles re- 
quired for establishing the foundations of this mathematical discipline. 
In solving the problem we must, on the one hand, restrict these prin- 
ciples sufficiently to exclude all contradictions and, on the other, take 
them sufficiently wide to retain all that is valuable in this theory. 
Now in the present paper I intend to show how the entire discipline 
created by Cantor and Dedekind can be reduced to a few definitions and 
seven principles, or axioms, which appear to be mutually independent. 

— Ernst Zermelo, 1908 

(Opening lines of Untersuchungen iiber die Grundlagen der Mengen- 
lehren I, Mathcmatischcn Annalen 65, pp. 261-281, 1908. Translation 
in van Heijenoort, From Frege to Godel) 

You may already have encountered various systems of axioms, such as the axioms for a vector 
space, or of a field. A model of a system of axioms is a mathematical object that satisfies 
these axioms. Thus a field is a model of the field axioms. 
Systems of axioms can be separated into two classes: 

I. Some systems of axioms arc obtained by studing a number of mathematical objects/theories 
and abstracting certain common elements. Examples of such arc the axioms of group 
theory, of field theory, of vector spaces, of topology, of lattice theory, of probability 
theory. . . . These systems of axioms have many models, i.e. there are many different 
kinds of group, many different kinds of field, many different kinds of vector space, .... 
Moreover, these axioms are intended to have many models. 

II. Other systems of axioms are meant to describe and characterize one kind of mathematical 
objects, i.e. these axioms have an intended model. The axioms of arithmetic are meant 
to capture just the set N of natural numbers with the operations +, x and relation < 
on N. The axioms of a complete ordered field are intended to capture the set of real 
numbers. Euclid's axioms of geometry are intended to characterize space. 

In some cases, systems of axioms succeed in characterizing exactly the intended model. 
Notably the axioms for a complete ordered field exactly capture the reals, in that any two 

models of these axioms are isomorphic (i.e. essentially "the same"). In many cases, it is 
impossible to completely capture an object with a set of axioms: If a set of first-order 



Set Theory 



10 



axioms has just one an infinite model, it will have many non-isomorphic models " 

The axioms of set theory are of the second kind: They are intended to describe the universe 
of all sets. Below, we will write down these axioms, and briefly justify them in terms of the 
iterative conception of sets. In the chapters that follow, we will develop the consequences 
of these axioms in some detail, and (hopefully) convince you that all of mathematics can be 
interpreted in the universe of sets. 

The first set of axioms for set theory were presented by Ernst Zermelo in 1908, in a paper 
entitled Investigations in the Foundations of Set Theory I. An additional schema of axioms 
was suggested by Abraham Fraenkel in 1922 (and independently, by Thoralf Skolem in 1923). 
We shall phrase these axioms in the language of first order predicate logic. The language of 
set theory has at its disposal the following symbols, whose meaning is assumed to be already 
known to you. We have at our disposal a countable collection of 

• Variables: vo,vi,V2,V3, . . . 

As non-logical symbols, we have the two binary predicate symbols 

• Equality: = 

• Membership: G 

The atomic formulas of our theory are expressions 

X = y X £ y 

and these can be combined into formulas by means of 

• Logical connectives: A (conjunction, and),V (disjunction, or),-i (negation, not), — )■ (im- 
plication, implies), <-)• (equivalence, if and only if). 

• Quantifiers: V (for all), 3 (there exists). 

• Brackets: (,) 

It is assumed that you already understand what a (well-formed) formula is, when an instance 
of a variable in a formula is free or bound, etc. All of this is part of our metatheory, the basic 
background theory in which we discuss our formal set theory. It's obvious — if you stop 
to think for a minute — that one cannot develop any theory (mathematical, or otherwise) 
from absolutely nothing. We have to assume that we have a common background language 
in which to develop our formal theory, and a common basis of knowledge that we can use as 
springboard from which to start. For us, this metalanguage and metatheory includes a natural 
language (English) augmented by a few mathematical notions, such as the intuitive notion 
of natural number, the notion of a finite set, and very elementary first-order logic. This 
foundation, involving only intuitively obvious finitary notions and - reasoning, is (hopefully) 
obvious a priori, and its validity not in doubt 

^"This is the Lowenheim- Skolem Theorem, a basic result in the branch of mathematical logic called model 
theory. It follows from this theorem that if the axioms of set theory are consistent, then there is a countable 
model of set theory, i.e. a model in which there re just countably many sets — a result that Skolem found 
highly paradoxical. 

^^If you do doubt it, your mathematics will look very different. For example, for the intuitionists every 
function / : R — ^ R is continuous. This follows logically from certain assumptions that are quite natural. 



11 



Axioms of Set Theory 



Apart from the above symbols, we will introduce symbols for defined notions as we proceed. 
Many of these will undoubtedly be familiar to you, such as U, n, 0, — for union, intersection, 
empty set and set difference. We will use x,y, z,X,Y, Z, . . . as symbols for variables, and 
also use brackets [,], j , , to make expressions easier to understand. We will also employ 

expressions such as x ^ y rather than the formally correct = y), and we will omit brackets 
wherever it makes an expression easier to read without risk of ambiguity. Thus, for example, 
we will write 

Vx 3y (x y) instead of Vfo (3fi {^{vq G vi))) 

The latter is a well-formed sentence of our formal language, but the former is much easier to 
read. 

We also assume that you are aware of various logical equivalences, such as 

(/? V ■i/' -'(-'(^ A -I'i/') and Mx ip (-k/^) 

etc. 

Furhermore, we introduce the following notation: 

(Vx y) (f means Vx(x G y — >■ (p) 

and 

(3x & y) if means 3x {x & y A ip) 



ZFC 0: The Axiom of Set Existence 



ZFC 0: 




There is a set. 


3x {x = x) 



Without this axiom, we have nothing to talk about. (For some systems of axioms, models 
with no elements are acceptable, and even desirable. Category theorists, for example, like to 
work with categories that have initial objects, and these can be empty. But our axioms are 
meant to characterize a single "object" : The universe of all sets — and we believe that that 
universe is non-empty.) This axiom is not strictly necessary, as it is implied by the Axiom 
Infinity. 

ZFC 1: The Axiom of Extensionality 
ZFC 1: 

If two sets have the same elements, then they are equal. 

Vx Vy [Vz (z G X -H- 2; G y) — >■ X = y] 

The extension of a property is the collection of all objects which have that property, i.e. to 
which that property extends. 
It is not a priori clear that 



{Moons of Venus} = {Moons of Mercury} 



Set Theory 



12 



or that 

{x'^ ■.xeR} = : y G M} 

The intension of those definitions of sets — i.e. the way in which they are defined — 
is different. However their extension is the same. Both the sets {Moons of Venus} and 
{Moons of Mercury} are empty, and are thus equal to 0. Both the sets {x'^ : x € M} and 
{y^^ : y G M} consist exactly of all non-negative real numbers, and are thus equal to [0,oo). 
The Axiom of Extensionality states the intension is not important when it comes to set 
equality, but only the extension. 

We may now introduce a new symbol C for the relation of inclusion, a binary relation on 
sets which is defined as follows. 

Definition 1.4.1 We say that a set j; is a subset of y and write a; C y if and only if every 
element of x is an element of y: 

X y ■^=^ \/z {z £ X ^ z £ y) 

We also write y ^ x instead of x C y. 

We say that x is a proper subset of y, and write x C y (or y ^ x), if x y and x ^ y. 

□ 

The Axiom of Extensionality can now be written thus: 

2;^yAyCx— 7>x = y 

Remarks 1.4.2 • The binary relations E and C are sometimes confused, especially by 
beginners. They mean quite different things, however. Indeed, observe that it is always 
the case that x C x, but that it is seldom the case that x G x. (Indeed, never: That 
X G X contradicts the iterative conception of sets was discussed in §1.3[ and will follow 
formally from the Axiom of Foundation. ) 

• Another role of the Axiom of Extensionality is to limit the universe of discourse to sets. 
In particular, it says that if two objects x, y of our universe have the same elements, 
then X = y. It therefore follows, for example, that there are no atoms. (For if a is an 
atom, then a and have the same elements, so a = — contradiction.) 

□ 

ZFC 2: The Axiom Schema of Separation 

The remaining axioms, apart from the Axiom of Foundation, are axioms of set formation, i.e. 
they assert that certain sets exist, or that certain collections (multiplicities) are sets. The 
guiding principle, expounded by Cantor, Russell and Zermelo, is that of limitation of size. 
Sets are collections that are "not too big." This prevents one forming a "set of all sets", for 
example. 

ZFC 2^: 

Let if{x,pi,...,pn) be first-order formula with free variables amongst x,pi, . . . ,pn- For 
any set z, there is a set y consisting of all those x G z which have the property (p. 

Wz Vpi ... ypn ByWx {x £ y -ir^ X £ z A ^p{x,pi, . . . ,pn)) 



13 



Axioms of Set Theory 



Given a set z and a property we may separate all those elements of z which have the 
property if, and put them together to form a new set y. By the Axiom of Extensionality, 
there is just one such set y. We therefore may introduce some new notation: Let 

{x e z : (p{x,p)} 

denote the unique set y whose existence is asserted by the Axiom Schema of Separation. 
Observe that y is a sub-collection of z. The Separation Axiom asserts that the collection 
(multiplicity) y is a set. 

It is easy to justify the existence of such a set y from the iterative conception of sets: 
Given the set z, we know that it is "created" at some stage a. All the elements x of z are 
therefore created at some stage strictly earlier than a. In particular, the elements of y are 
created at some stage strictly earlier than a, as y C. z. But at stage a we make sets out of all 
those collections whose elements are sets created at an earlier stage. In particular, we make 
a set out of the collection y at (or possibly even before) stage a. 

Observe that Separation is not a single axiom, but an axiom schema: We have one axiom 
for every first-order ip{x,p). However, given any formula (or string of symbols) it is a 
mechanical procedure to check whether or not $ is an instance of the separation axiom. 

The Axiom Schema of Separation is also called the Axiom Schema of Comprehension or 
the Aussonderung-axiom in the literature. It will, in fact, follow from the Axiom Schema of 
Replacement (ZFC 7), so it isn't strictly necessary. But it is easier to use than Replacement, 
and is on the original list of axioms that Zermelo published in 1908. 

Exercise 1.4.3 (a) Prove that the existence of a unique empty set follows from the axioms 
ZFC 0-2. We may therefore define to be the unique set with no elements. 

(b) Prove that ZFC 0-2 imply that every set x has at least one subset y which is not an 
element of x, i.e. prove that Vx 3y (y C a; A y x). Deduce that the collection of all sets 
is not a set. 

(c) Recall that if a, b are sets, then their intersection a n 6 is the collection of all elements 
which belong to both a and b. Further recall that a — 6 is the collection of all elements 
which belong to a but not to b. Show that ZFC 0-2 imply that if a, b are sets, then so 
are aCib and a — b. 



□ 



Remarks 1.4.4 In the exercise above, we formally introduced the symbols 0,n,— . Previ- 
ously, we introduced the symbol C and the notation {x £ a : f}. We will use these introduced 
notions as though they are part of our formal language. They can always be eliminated, 
however. For example, 

\/x 3y {y C x) is shorthand for ^x 3y \/z {z e y ^ z £ x) 

and 

{yea: ip{y,p)}r]b = is shorthand for 3z ^y {y e z y e a A ip{y,p)) A-i3x {x e z A x e b) 

□ 



Set Theory 



14 



ZFC 3: The Axiom of Pairing 
ZFC 3: 

Given two sets x, y there is a set z whose members are precisely x,y. 

Va; Vy Vi/; (w^z-^w = x\/w = y) 

By the Axiom of Extensionality, any two sets whose members are precisely x, y are equal, 
and thus we may define the unordered pair {x, y} to be the unique set whose elements are 
precisely x and y, i.e. 

z = {x, y} is shorthand for xEz/KyEzA Vu; {w E z — >■ w = x V w = y) 

We then define the singleton {x} to be the set {x,x}. 

The Axiom of Pairing is easy to justify via the iterative conception of sets: If a and b are 
sets, then each is created at some stage, say stages a and p. Then {a, b} will exist at any 
stage 7 which is strictly greater than both a and /3. 

Remarks 1.4.5 The Axiom of Pairing is often stated in the following weaker form: Given 
any two sets x, y, there exists a set z that they both belong to. However, together with the 
Separation axiom, this weaker form implies our form. For if z is a set which has x, y amongst 
its elements, then 

{x, y} := {w & z : w = xV w = y} 

□ 

ZFC 4: The Axiom of Union 
ZFC 4: 

For any set x, there is a set y whose elements are precisely the elements of elements of x. 

Vx 3y Vz [z G y -H- 3w G x {z £ w)] 

By the Axiom of Extensionality, there is a unique such set y, which is called the union of the 
set X, and denoted by 

y = (Jx 

or (more commonly) by 

y = [^{w : w & x} or y = vj 

We can justify this axiom as follows: If x is a set, then it is formed at some stage a. Each 

element of x is therefore created at some stage < a, and thus the elements of elements of x 
are created before stage a as well. Now the union of a; — i.e. the collection of all elements 
of elements of a; — is a collection of elements that are all created strictly before stage a, and 
hence |J x is formed as a set by stage a at the latest. 

Remarks 1.4.6 The Axiom of Union may be stated in the following weaker form: Given 

any set x, there exists a set z that has amongst its elements all the elements of elements of x. 
Together with the Separation axiom this weaker form implies our form. For if z such a set, 
then 

[J X := {y G z : 3w G x (y G w)} 



15 



Axioms of Set Theory 



□ 

We may now define (with the aid of the Axiom of Pairing) the sets 

xUy :=[J{x,y} x U y U z :=\^{x,y, z} xi U • • • U j;„ = |J{xi, . . . , j;„} 

etc. 

Exercise 1.4.7 (a) Suppose that C := {y : (p{y,p)} is the cohection of all sets that have 
some propertjj^ given by the first-order formula C need not be a set itself — it might 
be "too big" — but each of its elements is a set. Define the intersection P| C of C to be 
the collection of all sets z with the property that z belongs to each element of C, i.e. 

f]C = {z:\/y€C{zGy)} 

Show that ZFC 0-2 imply that if the collection C is non-empty, then P| C is a set. 
Further, use ZFC 0-3 to define the intersections x Ciy, x (1 y Ci z etc. Observe that the 
Axiom of Union is not required for the existence of intersections. 

(b) Show that P| is not a set. 

□ 

We may also write y = P| C as f]{y : y £ C} oi y = Clyec U- Further, x Ciy := {^{x, y}, 

etc. 



Exercise 1.4.8 If the following identities and inclusions are unfamiliar, now^^ is the time to 
prove them. 

(a) x C y n z iff X C y and x'^z; x^yUziSx^y and x ^ z. 

(b) 2;Cyiff2;ny = xiffxUy = ?/iffx — y = 0. 

(c) The distributive laws: x Ci {y U z) = {x Ci y) U {x Ci z) and xU {y Ci z) = (x U y) Ci {x U z). 

(d) De Morgan's Laws: x — {y U z) = {x — y) (1 {x — z) and x — {y (1 z) = {x — y) U {x — z). 

ZFC 5: The Axiom of Power Set 



ZFC 5: 

For any set x, there is a set y whose elements are precisely the subsets x. 

yx 3y \/z {z £ y -f-)- z C x) 



By the Axiom of Extensionality, there is a unique such set y, which is called the power set of 
the set X, and denoted by 

y = V{x) 

The Axiom of Power Set may be justified as follows: If x is a set formed at stage a, then 
all the elements of x are formed before stage a. But then each subset of x is a collection of 
elements that are created before stage a, so each subset of x is made into a set by stage a at 
the latest. So the collection of all subsets of x is a collection of sets that all exist by stage a, 
and thus 'P(x) is made into a set by stage a + 1 at the latest. 



^■^The technical term for such a collection is class — cf. [|1.5 
"Yes, NOW! 



Set Theory 



16 



Exercise 1.4.9 Show that if x,y are sets, then {x,y} € V{V{x) UV{y)). 

Given the above, do we need the Axiom of Pairing? Or does Pairing foUow from Union 
and Power Set? 



ZFC 6: The Axiom of Infinity 

"But concerning your proof, I protest above all against the use of an 
infinite quantity as something completed, which is never permissible in 
mathematics. The infinite is merely a way of speaking, the true meaning 
being a limit which certain ratios approach indefinitely close, while others 
are permitted to increase without restriction. " 

— Gauss, letter to H. Schumacher, 1831 



□ 



" The existence of the infinite will never again be deniable while that of 
the finites is nevertheless upheld. If one permits either to fall, one must 
do away as well with the other. " 

— Cantor, Grundlagen, 1883. 

The axioms ZFC 0-5 ahow us to prove that there are infinitely many sets, but they 
do not allow us to prove that there is an infinite set. This is not hard to see: The axioms 
ZFC 0-2 allows us to deduce the existence of possibly just one set, namely 0. Further, 
axioms ZFC 2-5 allow us to create new sets from old, by taking subsets, pairs, unions, and 
power set. But if we have at our disposal only finite sets (i.e. if we are at a stage where 
only finite sets have been created), then the operations of taking subsets, pairing, union and 
power set are not capable of forming infinite sets. Indeed, even ZFC 0-5 together with ZFC 
7-9 are not sufficient to prove the existence of an infinite set, and the existence of such a 
set is, in fact, denied by the intuitionists and finitists. Following Aristotle, the concept of 
infinity was separated into two distinct versions. On the one hand, there is the idea of an 
actual infinity, i.e. a completed, finished, definite thing consisting of infinitely many elements 
or parts. On the other, there is potential infinity, where one has a sequence of things which 
is never completed or finished. Aristotle allowed only potential infinities as non-paradoxical, 
and many mathematicians and philosophers agreed. Certainly, this point of view, eloquently 
expressed in the quote of Gauss above, had much to recommend it, and under its guidance 
mathematicians were able to develop completely rigorous definitions of limits without the use 
of infinitesimal^^ 

It was Cantor who made the use of the actual infinity respectable in mathematics. It was 
Zermelo who first understood that its existence had to be asserted: 



ZFC 6: 




There is an infinite set. 




3x [0 £ X A yy {y G X - 


■> yU{y}e x)] 



The philosopher George Berkeley ridiculed the use of infinitesimals: "May we not call them the ghosts 
of departed quantities?" Strangely enough, Cantor too was resolutely opposed to the use of infinitely small 
numbers, and "refuted" them with arguments that could be applied equally to his infinitely large transfinite 
numbers. But the logician Abraham Robinson resurrected them and legitimised their use with his invention 
of the hyperreals and non-standard analysis. 



17 



Axioms of Set Theory 



Let us analyze this axiom: Define an operation S{-) on sets by 

S{x) := xU {x} 

This means that 

y = S{x) yz{z£y-^z£xVz = x) 

By the Axioms of Pairing and Union, S{x) is a set whenever x is. For reasons that wiU 
become clear later, the function S{-) is called the successor function. The Axiom of Infinity 
states that there is a set which has the following amongst its elements: 

0, 5(0), S{S{0)), S{S{S{0))),... 

"Clearly", such a set must be infinite. However, we have not yet defined what wc mean by 
an infinite set — this requires the formal notion of a natural number. Let us say that a set 
X is inductive if it satisfies the condition in the Axiom of Infinity, i.e. if 

0GxAVy(yGx^ S{y) G x) 

The Axiom of Infinity therefore asserts that there is an inductive set. (The set of natural 
numbers will then be defined to be the smallest inductive set.) Once wc have defined what 
we mean by an infinite set, we shall be able to prove that an inductive set is indeed infinite. 
Moreover, we shall show that if an infinite set exists, then so does an inductive set. Thus the 
existence of an infinite set will turn out to be equivalent to the existence of an inductive set. 

Let us quickly see how to justify this axiom via the iterative conception of sets: Observe 
that is formed at stage 0. Then S{0) = {0} is a set whose elements arc formed at stage 0, 
so S{0) is formed at stage 1. Then S{S{0)) = {0,{0}} is a set whose elements are formed 
at stage 1 and earlier so SS{0)) is made into a set at stage 2. Thus: 

is formed at stage 

S{0) is formed at stage 1 

SS{0) is formed at stage 2 

SSS{0) is formed at stage 3 



These sets therefore are all formed before the first infinite stage oj, and hence the inductive 
set 

{0,S{0),SS{0),SSS{0),...} 

is formed at stage cj. 



ZFC 7: The Axiom of Choice 
ZFC 7: 

Every family of non-empty sets has a choice function: If x is a family of non-empty sets, 
then there is a function f on x which chooses from each w E x an element f{w) G w. 

\/x [Vu) G a; (w 7^ 0) — )• 3/ ( / is a function A dom(/) = a; A \/w G x {f{w) G w))] 



Set Theory 



18 



In the next chapter, we will see that functions can be interpreted in the universe of sets, so 
that to the statement "/ is a function" can be written in the language of set theory. Observe 
that if / is a choice function on x, then / : x — )• |J x. (Of course, not every / : x — )• |J x is a 
choice function.) 

The Axiom of Choice — formulated explicitly by Zermelo in 1904 — is an immensely 
powerful tool. It seems intuitively obvious, and yet has generated so much controversy that it 
deserves (and will get) a chapter on its own. The reason that this axiom was initially eyed with 
such suspicion by a large number of mathematicians is that the axiom is non-constructive. It 
simply asserts the existence of an object — a choice function — without saying how such an 
object might be constructed. The axioms ZFC 2-6 have a slightly different character: They 
assert that certain well-defined collections are sets. The Axiom of Choice (abbreviated AC) , 



on the other hand, asserts the existence of an undefined set And this is precisely the source 



of its power: If one could define a choice function for a particular family x, then one would 
not need the AC to assert its existence. 



Remarks 1.4.10 It is important to note that one does not need AC to "choose" an element 
from a non-empty set. This fact has often been misunderstood. The fact is that AC doesn't 
really allow you to "choose"; that's just a manner of speaking. What AC does do is assert 
that a certain set exists. Thus if w is non-empty set, then x := {w} is family of non-empty 
sets, and AC says that there is a function / : x — t- |J x such that f{w) G w. But we don't 
need AC for that ! If it; is non-empty, then (by definition of "non-empty" and first-order 
logic) there is an element a G w. We may not know what a looks like — i.e. we may not 
explicitly be able to choose a particular a — but there is such an a. It will be clear after 
you've read the next chapter that the function / : x — )• |Jx which has f{w) = a exists, i.e. a 
choice function exists. 

In the same way, it can be shown that any finite family of non-empty sets has a choice 
function (because it follows from the Axioms of Pairing and Union show that every finite 
collection of sets is a set). But a problem arises when we have an infinite family x of non- 
empty sets: It is always possible to choose an element from each w G x. What is not clear is 
that one can do this "in one go", i.e. simultaneously for all w £ x, via some function /. 

□ 



I'm not sure if one can justify the Axiom of Choice via the iterative conception of sets. 
However, in 1938 Kurt Godel proved the following result: If ZF (i.e. set theory without AC) 
is consistent, then so is ZFC (set theory with AC). Put differently, AC cannot introduce an 
inconsistency into set theory. If ZFC is inconsistent, then ZF was already inconsistent also. 



ZFC 8: The Axiom of Replacement 

The remaining two axioms — Replacement and Foundation — are very important for the 
subject of set theory itself, and for metamathematics. They play only a tiny role in ordinary 
mathematics however. 

I do not see know to derive the Axiom Schema of Replacement from the iterative concep- 
tion of sets. It is easy to derive it from the idea of limitation of size, however. By that we 
mean that a set is a collection that is not "too big" . Here's how: 



We shall soon see — and you are no doubt aware already — that functions are sets. Remember, sets are 
all there is. 



19 



Axioms of Set Theory 



Suppose that ip{x, y,p) is a first-order formula in the language of set theory which defines 
a "function" . Here x, y play the role of independent and variable, and p the role of a fixed 
parameter. Informally, such a formula defines a "function" F when for any x there is at most 
one y such that ip{x,y,p) holds. In that case we have a "function" F given by 



y = F{x) if and only if ip{x, y,p) holds 

The Axiom of Relacement asserts that if F is such a "function" and if X is a set, then the 
direct image 

F[X] := {F{x) :xeX} = {y:3xeX ^{x, y,p)} 

is a set also. The use of quotation marks around the word function is essential here. We shall 
see in the next chapter how to interpret functions as sets. But it will follow from ZFC 0—5 
that the direct image of a set under a function is a set — The Axiom of Replacement isn't 
necessary for that. The function F we use here may be "too big" to be a set, however. 
Nevertheless, even though F may be too big, its image F[X\ is surely no bigger than X. So 
if X is a set, then F[X] can't be too big to be a set. 

All this may be confiising at first. Fortunately wc don't need to talk about the size of a 
"function" F to formulate the Axiom Schema of Replacement: 



ZFC 8(^: 

Suppose that (p{x,y,pi, . . . ,pn) is a first-order formula with free variables amongst x,y 
and p := {pi, . . . ,pn)- Suppose that for every x there is at most one y so that (p{x,y,p). 
If X is a set, then the collection {y :3x e X (p{x, y,p)} is a set. 

\/p VixMyMz {ip{x,y,p)/\Lp{x,z,p) ^ y = z) — >■ VX3y Vy (y G y ■<->■ -^x ^ X ip{x,y,p))\ 



Observe that Replacement — like Separation — is an Axiom Schema. We have one axiom 
for every ip. 

The Axiom Schema of Replacement was suggested independently by Abraham Praenkel 
and Thoralf Skolem in 1922. It allows the creation of many, many new sets. Whereas the 
other axioms of ZFC allow one to form new sets that are at most a few levels higher than 
the sets from which they are created, the Axiom of Replacement allows one to form sets at 
very high levels. (E.g. it is possible that the sets X is at a "low level" and that the created 
set F[X\ at a very "high level". ) In particular, it necessary to develop a theory of ordinal 
numbers and transfinite induction. As we shall see, the ordinals form the backbone that 
supports the entire set-theoretic universe, so this axiom is indispensable to the set theorist. 



Set Theory 



20 



ZFC 9: The Axiom of Foundation 

A well-known scientist (some say it was Bertrand Russell) once gave a 
public lecture on astronomy. He described how the earth orbits around 
the sun and how the sun, in turn, orbits around the center of a vast 
collection of stars called our galaxy. At the end of the lecture, a little old 
lady at the back of the room got up and said: "What you have told us is 
rubbish. The world is really a fiat plate supported on the back of a giant 
tortoise. " The scientist gave a superior smile before replying, "What is 
the tortoise standing on?" "You're very clever, young man, very clever," 
said the old lady. "But it's turtles all the way down!" 

— Stephen Hawking, A Brief History of Time 

In the set-theoretic universe, its sets all the way down. Every element of a set is a set. 
Every element of an element of a set is a set. But how far is all the way down'? Not that far, 
as it turns out. 

According to the iterative conception of sets, sets are made in stages, with each set being 
made up of elements that have been formed at a strictly earlier stage. We saw also that any 
non-empty collection of stages has a least element. Suppose now that a; is a non-empty set. 
Let r be the collection of all stages associated with an element of x, i.e. 

r = {7 : There is x G X such that z is formed at stage 7} 

The r is a non-empty collection of stages, so has a least clement a. Let y G X be an element 
formed at stage a. If z G y, then z is formed at a stage /3 < a. Now /3 F, because a is 
least, and hence z ^ x. It follows that if 2; G y, then z ^ x, i.e. that y Ci x = 0. We have 
thus shown (informally, from the iterative conception of sets), that for every non-empty set 



• there is y E x such that y n x = 0. 




We now take this to be an axiom: 




ZFC 9: 




Vx (x 7^ - 


■> 3y G x(y n x = 0) 



Exercise 1.4.11 Show using the Axiom of Foundation that "the turtles don't go very far 
down" in ZFC: There are is no sequence of sets xi, X2, X3, . . . such that 



. . . Xn+l G x„ G . . . X3 G X2 G Xi 
It follows that every G-decreasing chain 

xi 3 X2 3 xs 3 . . . 

is of finite length. (Technically, we say that the G-relation is wellfounded.) 



□ 



21 



Axioms of Set Theory 



The Axiom of Foundation is also referred to as the Axiom of Regularity in the hterature. 
It was introduced by John von Neumann in 1925. While it is of scant use to the ordinary 
mathematician, it is of great import to the set theorist, as we shall see in a later chapter. In 
essence, together with Replacement, it allows us to formalize the iterative conception of set 
within ZFC itself, and build up the universe of sets from nothing, by iterating the operations 
of power set and union along the ordinals. 



1.5 Sets and Classes 

The version of set theory that we have developed (i.e. ZFC) has just one type of object, 
namely sets. Some set theories allow for another kind of object, however, namely classes. We 
will use this terminology only informally. Thus, informally, a class is a collection of the form 

C := {x : ipix,p)} 

i.e. C is the collection of all sets that have property (p. Here x,p are sets, with p playing the 
role of a parameter. 

For example, the universe of all sets — denoted by V — is the class 

V := {x : X = x} 

Some classes are "too big" to be sets: Russell's paradox (together with Separation) shows 
that V is not a set. But every set is a class, for if y is a set, then the class determined by the 
formula ip{x, y) = x G y is clearly y itself: 

{x : X ey} = y 

A class that is not a set is called a proper class. Thus V is a proper class. Later, we shall see 
that the class On of all ordinals is a proper class also. 

The theory NBG (von Neumann-Bernays-Godel) is a commonly used set theory allows 
both classes and sets as objects. (This is a conservative extension of ZFC: Any theorem about 
sets is NBG is also a theorem of ZFC, and vice versa. Thus NBG says nothing about sets 
that ZFC doesn't already know.) 

Inside ZFC, however, classes are just a way of speaking: If 

C := {x : ipix,p)} 

is a class, then when we say x G C, we mean simply that the statement (p{x,p) is true. Thus 
any mention of C can be avoided, simply by using its defining formula (/?. 

Exercise 1.5.1 Observe that every element of a class is a set. Show that a class is a set if 
and only if it is an element of some class. 

□ 



In Exercise 1 1.4. 7 it was shown that the intersection of a non-empty class is a set, i.e. that 



if C 7^ 0, then f] C is a set. 

The Axiom Schema of Separation states essentially that a subclass of a set is a set: 



{x £ z : ip{x,p)} = zri{x : ip{x,p)} = z CiC 



Set Theory 22 



The Axiom Schema of Replacement can be formulated as follows: If F is a class functior 
and X is a set, the direct image F[X] is a set. 



^''i.e. a "function" between classes. For example, the successor function S{x) :— xU {x} is a class function 
S : V V. 



Chapter 2 

Relations and Functions 



In this chapter, we revise some elementary notions in mathematics, and show how they can 
be interpreted in the universe of sets. It is assumed that you've encountered the material in 
§2.1-2.5 before, so we move at speed. For the development below, the axioms ZFC — 5 
usually suffice. The Axiom of Choice (AC, ZFC 7) also occurs in a few exercises, but its use 
is always stated explicitly. 

2.1 Ordered Pairs and Finite Cartesian Products 

We can use the notion of an unordered pair to define the notion of an ordered pair. An ordered 
pair (x, y) is an object which, intuitively, contains x, y in that order. The essential property 
that we want ordered pairs to possess is the following: 

(x, y) = (a, b) if and only if x = a and y = h 

In particular, (x,y) ^ {y,x) (unless x = y). But by the Axiom of Extensionality 

{x,y} = {y,x} 

We can, however, interpret ordered pairs as sets, i.e. find sets which exhibit the correct 
behaviour. We follow the Polish mathematician Kazimierz Kuratowski, and define 

{x,y) := {{x},{x,y}} 

Exercise 2.1.1 Show that Kuratowski's definition of orded pair results in the correct be- 
haviour, i.e. that (x, y) = (a, h) if and only if x = a and y = b. Also show that the following 
alternative definitions of ordered pair fail to exhibit the desired behaviour: 

{a,b):={{a},{b}}, {a,b) := {a,{b}} 

Finally, give one other definition of ordered pair that does exhibit the correct behaviour. 

□ 



23 



Set Theory 



24 



Observe that the set {x, y) is a concrete reahzation of the abstract idea of an ordered pair 
{x,y). This is not to say that is {x,y), but merely that the set {x,y) behaves as an 

ordered pair should. 

Once we have defined ordered pairs as sets, we are able to find interpretations of ordered 
triples and quadruples, 

{x, y, z) ■= {{x, y),z), {x, y, z, w) := {{x, y, z),w), 

and, in generaQ 

{Xi, . . .,Xn+l) ■■= {{Xl, . . .,Xn),Xn+l) 

Given two sets X, Y, we may define the Cartesian product X xY X and Y as follows: 

X X Y = {z : 3x £ X 3y £ y {z = {x,y))} 

We then define 

X xY X Z := {X xY) X Z 

etc. Observe that in that case 

X xY X Z = {{x, y,z) : X e X,y eY,z e Z} 

Similarly, we define X"^ := X x X, X^ := X x X x X , etc. 

We have not yet shown that Cartesian products are sets. That is the objective of the next 
exercise: 

Exercise 2.1.2 Let X,Y be sets. 

(a) Write down a first-order formula pair{z, x, y) which states that z = {x, y}. 

(b) Write down a first-order formula ordered_pair{z , x, y) in the langage of set theory which 
states that z = {x,y). 

(c) Show that ii x £ X and y £Y, then {x, y) £ V{V{X U Y)). 

(d) Show that X x y is a set. 

□ 

Here are some easy exercises about the arithmetic of Cartesian products: 

Exercise 2.1.3 (a) {X \JY) x Z = {X x Z)U {Y x Z), {X r\Y) x Z = {X x Z)r\{Y x Z), 
{X -Y) X Z = {X X Z) - {Y X Z), 

(b) (A X B) n {X X Y) = {An X) X {B nY). 

(c) X xY = ii and only if X = or F = 0. 

□ 

^We are not using natural numbers in this definition, but giving a schema for defining ordered tuples of 
arbitrary length. If you want to define an ordered 100-tuple, this requires a formula containing 100 variables. 
The concept/number 100 is not required. However, in order to define all finite ordered tuples simultaneously, 
using induction, we need the concept of natural number. This will have to wait until a later chapter. 



25 



Relations and Functions 



2.2 Relations 



In mathematics, one studies many relations between various objects x, y. For example: 

• X < y where x, y are numbers. 

• X £ y, where x, y are sets. 

• X = y, where x,y are natural numbers/real numbers/sets/matrices/. . . 

• x\y {x divides y), where x,y are integers, or polynomials. 

• X is a root of y, where x is a real number, and y a real function. 

If R stands for some relation, we often write xRy to mean that x, y satisfy that relation. 
Relations need not be binary, i.e. they need not be between just two objects. Furthermore, 



• x,y,z are related if and only if x is congruent to y modulo z, where x, y, z are natural 
numbers. 

• x,y,z are related if and only if x(y) = z, where x is a real function, and y,z are real 
numbers. 

• x,y,z are related if and only if the dot product of x and y is z, where x, y are two 
n-dimensional vectors, and z is a real number. 

• X, y, z, w are related if and only if the point (x, y) lies on a circle of radius z and centre 
w. 

Observe that the order of x,y,z,. . . may matter. If the relation is strict inequality <, 
then X is related to y if and only if x < y. This does not mean y is also related to x. 

Given some relation, we can interpret it as a set R as follows: Let R be the set of all 
ordered pairs (x, y) with the property that x is related to y. Thus 

Definition 2.2.1 A binary relation i? is a set of ordered pairs. When {x,y) G R, we may 
indicate this by writing xRy. 

An n-ary relation i? is a set of ordered n-tuples. When (xi, . . . , Xn) G R, we may indicate 
this by writing R{xi, . . . , x„). 



For example, the relation of strict inequality on the natural number^ N := {0, 1, 2, 3, ... } is 
the set < defined by 



<:= {(0,1), (0,2), (0,3), (0,4),..., (1,2), (1,3), (1,4),..., (2,3), (2, 4),..., (3, 4),...} 




□ 



Definition 2.2.2 Suppose that i? is a binary relation. 



■^Informally speaking, that is. In the end of course, they will be of the same type, since everything is a set. 
''We will define the natural numbers later in this chapter. 



Set Theory 



26 



(a) The domain and range of R are defined by 

dom(i2) := {x : 3y (xRy)} ran(i?) := {y : 3x (xRy)} 

(b) If A is a set, then the image of A under R is defined by 

R[A] := {y G ran(i?) -.JxeA {xRy)} 

(c) If 5 is a set, then the inverse image of B under R is defined by 

R-^[B] := {x e dom(i?) :3yeB (xRy)} 

(d) If A is a set, then the restriction of i? to A is the binary relation R \ A defined by 

R\ A:= {{x,y) e R : x e A} 

□ 

If X,Y are sets, then any subset R C X x Y is a relation, with dom(i?) C X, and 
ran(i?) C y. In that case, we say that i? is a relation from X to Y. If R C X x X, we say 
that R is a relation on X. 

For any set X there is a identity relation Idx on X defined by 

Idx := {{x,x) :xeX} 

so that X Idx y if and only if x,y e X and x = y. (Thus Idx is just the relation of equality, 
restricted to the set X.) 

Exercise 2.2.3 Let i? be a binary relation. 

(a) Show that if {x, y) € i?, then x, y G |J |J i?. Conclude that that dom(i?) and ran(i?) are 
sets. 

(b) Show that R C dom(i?) x ran(i?). 

(c) Define the inverse R"^ of R by 

(x, y) e R~^ iff xRy 
Show that R~^ is a set, and thus a binary relation. 

(d) If S is another binary relation, define the composition S o R of R with S by 

{x,z) €SoR iff 3y {xRy A ySz) 
Show that o is a set, and thus a binary relation. 

(e) Show that composition of binary relations is associative: 

To{SoR) = {ToS)oR 



27 



Relations and Functions 



(f) Consider the relation < on the set of natural numbers. What is < ^1 Show that < 
o <=<. 

(g) Show that if dom(i?) = X, ran(i?) C Y, then Idy o R = R = Roldx- 

□ 

Observe that xS o Rz if and only if there is y such that xRySz — the order of i?, S is 
reversed! (This is why some texts write Ro S where we write S o R. We have reasons for 
preferring the current version, however, as will become apparent in the next section.) 

Definition 2.2.4 A binary relation i? on a set X is 

(a) Reflexive if Vx G X (xRx). 

(b) Symmetric if Vx, y E X {xRy — > yRx). 

(c) Asymmetric if Vx, y £ X {xRy — -^{yRx)). 

(d) Antisymmetric if Vx, y E X {xRy A yRx —^x = y). 

(e) Transitive if Vx, y,z E X {xRy A yRz — >■ xRz). 

(f) Connected if Vx, y E X {xRy V x = y V yRx). 

□ 

Exercise 2.2.5 Suppose that i? is a binary relation on a set X. 

(a) Show that -R is reflexive if and only if Idx ^ ^• 

(b) Show that R is symmetric if and only if R~^ C R. 

(c) Show that R is asymmetric if and only if i? fl = 0. 

(d) Show that R is transitive if and only if R o R C R. 

(e) Show that R is connected if and only if i? U R~^ U Idx = X x X. 

□ 

2.3 Functions 

Informally, a function / can be thought to induce a binary relation between elements x,y, 
namely: x and y are related if /(x) = y. But not every binary relation is a function, for 
functions arc single-valued, i.e. if /(x) = y and /(x) = z, then we must have y = z. This 
leads to the following definition: 

Definition 2.3.1 A function (or map) / is a binary relation with the property that if (x, y) G 

/ and (x, z) £ f, then y = z. 

We may write /(x) = y instead of (x, y) G / or x / y. 



Set Theory 



28 



□ 



When / is a function with dom(/) = X and ran(/) C Y, we indicate this by writing 

/ : X — 7- y. We may also write x ^ y to mean that f{x) = y (when / is clear from context). 

If X, Y are sets, then the collection of all functions f : X ^ Y is denoted by Y-^ . (Some 
authors use ^Y.) 

Exercise 2.3.2 Show that if X, Y are sets, then X^ is a set. 



Observe that the identity relation Idx on a set X is a function, with Idx : X ^ X : x x. 
Obviously, it will be referred to as the identity function as well. 

Since a function is a binary relation, the notions of domain, range, inverse, image, inverse 
image, restriction and composition make sense. In particular any function / will possess an 
inverse relation f~^. However, /^^ need not be a function. 

Definition 2.3.3 Let f : X ^ Y he a function. 

(a) If the binary relation f~^ is a function f :Y X, i.e. if f~^ is a function and dom(/~^) = 
Y, then we say that / is invertible. 

(b) We say f : X ^ Y is injective (or one-to-one) if Vx, x' G X f{x) = f{x') — >■ a; = x'). We 
may write f : X ^ Y (or f : X ^ Y) to assert that / is an injection. 

(c) We say that f : X Y is surjective (or onto) if Y = ran(/), i.e. if Vy G 3a; € 
X {f{x) = y). Wc may write f : X ^Y to assert that / is a surjcction. 

(d) We say that f : X Y is a bijective if / is both injective and surjective. We may write 
/ : X >— » y to assert that / is a bijection. 



Exercise 2.3.4 Let / : X — > y be a function. 

(a) Show that a composition of in/sur/bijcctions is an in/sur/bijection. 

(b) Show that the relation f~^ is a function if and only if / is injective. 

(c) Show that if / is invertible, then / o f~^ = Idy, and o / = Idx- 

(d) Say that a function g : Y ^ X is a left inverse of / if ^ o / = Idx- Show that / has a 
left inverse if and only if it is an injection. 

(e) Say that a function h : Y ^ X is a right inverse of / if and only if f o h = Idy. Using 
AC, show that a function has a right inverse if and only if it is a surjection. 

(f) Show that if / has both a left inverse g and a right inverse h, then / is invertible, and 



□ 



□ 



9 = h = f 



-1 



(g) 



Show that / is invertible if and only if it is a bijection. 



□ 



29 



Relations and Functions 



Suppose that X is a sets, and that F : I ^ X : i Xi is a, surjection. Then 

X = {F{i) ■.iel} = {xi:iel} 

The function F is called an indexing of the set X. The set / is called the index set, each i G I 
is called an index, and F{i) = xi is called the i*^ term. 

Note that any set can be indexed by some set. For example, a set X can be indexed by 
itself: Let I = X, and let Xi = i. Later on, however, we shall often use ordinals, a kind of 
transfinite number, to index sets. 

If X = {xi : i e I}, we will write 

[jx=ijx, n^=n^^ 

etc. 

Exercise 2.3.5 Let / : X — >■ y be a function. Show that the inverse image f~^[-] is a 
function 

/-![.] : p(y) ^ : B ^ f-^[B] := {x e X : f{x) G B} 

which preserves the basic set-theoretic operations U'D'"' (^-S- show that /~^[Uie/^] ~ 
Uielf'^i^il for any family Yi G V{Y).) 

□ 

Associated with any product X xY are associated projections itx-,t^y defined by 

-Kx X xY ^ X : {x,y) ^ X Try ■ X x Y ^ Y : {x,y) y 

Exercise 2.3.6 Suppose that fx'-A^X,fY'-A^Y are functions with common domain 
A. Show that there is a unique function f : A ^ XxY such that fx = T^x^f and /y = nyof- 
[Hint: Let f C. A x {X x Y) consist of all ordered pairs (a, {fx{o), fyio))) where a ^ A. 

□ 

2.4 Equivalence Relations 

Recall that any set X carries an identity relation Idx defined by 

Idx := {{x,x) : X G X} 

which is just the relation of equality, restricted to the set X. 

The relation of equality is a class relation which satsifies following properties: 

• = is reflexive: x = x. 

• = is symmetric, li x = y, then y = x. 

• = is transitive: li x = y and y = z, then x = z. 

A relation which satisfies these same properties is called an equivalence relation. To be precise: 



Set Theory 



30 



Definition 2.4.1 A binary relation R on a set X is an equivalence relation if and only if 

(i) R is reflexive, i.e. \/x G X [xRx). 

(ii) R is symmetric, i.e. Vx,y G X — t- yRx). 

(iii) i? is transitive, i.e. \/x,y,z £ X (xRy A yi?2; — t- xRz). 

□ 

It is easy to see that if n is a non-zero integer, then the relation =„ of equivalence modul^ 
n is an equivalence relation on the set Z of integers. 

An equivalence relation R can be thought of as a generalization of the identity relation: 
If xRy, then x, y are in a certain sense identified, i.e. regarded as "equal" for the purpose at 
hand. When we collect together all the elements which are made "equal" we get equivalence 
classes: 

Definition 2.4.2 If R is an equivalence relation on a set X and x £ X, then the equivalence 
class of X modulo R is the set defined by: 

[x\r ■.= {y£X: xRy} 

□ 

Some authors may write x/R instead of [x]ji. 

The following device yields a fruitful way of looking at equivalence relations: 

Definition 2.4.3 A family V of non-empty sets is called a partition of a set X if and only 
if: 

(i) [j{P:PeV} = X. 

(ii) Any two distinct members oiV are disjoint, i.e. P,Q £ V and P ^ Q implies PCiQ = 0. 

□ 

The members P €z V are sometimes called the blocks of the partition. Observe that (i) of the 
above definition states that each x £ X belongs to at least one block, whereas (ii) states that 
no x £ X belongs to more than one block. Thus each x £ X belongs to exactly one block. 

Exercise 2.4.4 (a) Every equivalence relation on X induces a partition of X: Let R be an 
equivalence relation on a set X. Define a family of sets Vr by 

Vr := {[x]r -.xeX} 

Show that Vr is a partition of X. 
^Recall that a =„ b if and only if 6 — a is divisible by n. 



31 



Relations and Functions 



(b) Every partition of a set X induces an equivalence relation on X: Let 7^ be a partition of 
X. Define a binary relation R^p on X by 

xR-py if and only if {x e P ^y e P) 

(i.e. X R-p y if and only if x, y belong to the same block of V.) Show that R-p is an 
equivalence relation. Further show that the blocks of V are precisely the equivalence 
classes of R-p, i.e. that [x]ii^ = P if and only if x G P. 

(c) Show that the constructions in (a), (b) above are inverses of each other, in the following 
sense: Starting from an equivalence relation P, the construction in (a) yields a partition 
Vr. If we then apply the construction in (b) to the partition Vr, we get an equivalence 
relation R-p^. This relation is precisely the original equivalence relation P, i.e. Pp^ = P. 
Similarly, Vr-p = V. 

□ 

If P is an equivalence relation on a set X, then the set of equivalence classes — which we 
above denoted by Vr — is usually denoted by X/ P. The equivalence class of an element x 
— above denoted by [x\r — is often denoted by x/R. 

Here is another reason why equivalence relations are ubiquitous in mathematics: 

Exercise 2.4.5 (a) Every function induces an equivalence relation on its domain: Suppose 
that / : X — > y is a function. Define a binary relation P on X by 

xRy iff fix) = f{y) 

The relation P is called the kernel of /, and denoted P = ker/. Show that ker/ is an 
equivalence relation. 

(b) Assuming the Axiom of Choice, every equivalence relation on a set X is induced by a 
function with domain X: Show that if P is an equivalence relation, then there is a 
function / with domain X such that P = ker /. 

□ 

The following exercise gives two equivalents of AC: 

Exercise 2.4.6 Show that the following are equivalent: 

(a) AC 

(b) For every partition V there is a set C which has precisely one element in common with 
each block P gV. 

(c) Every surjection has a right inverse. 



□ 



Set Theory 



32 



2.5 Order Relations 

Definition 2.5.1 (a) A binary relation < on a set X is said to be a partial order relation if 
< is: 

(i) reflexive, i.e. x < x, 

(ii) antisymmetric, i.e. x < y and y < x imply x = y, and 

(iii) transitive, i.e. x < y and y < z imply x < z. 

(b) A binary relation < on a set X is said to be a strict partial order relation if < is 

(i) asymmetric, i.e. x < y implies y ^ x. 

(ii) transitive, i.e. x < y and y < z imply x < z. 

(c) A total ordering is a connected partial ordering. A strict total ordering is a connected 
strict partial ordering. A (strictly) totally ordered set is also called a chain. 

□ 

A partially ordered set is a pair {X, <) where < is a partial ordering on X. 

Observe that Idx is a partial ordering on X, and that is a strict partial ordering on X. 
Given a partial ordering <, we can define a strict partial ordering < by 

X < y if and only if x < y A x ^ y 

i.e. < := < - Idx- 

Conversely, given a strict partial ordering <, we can define a partial ordering < by 

X < y if and only if x < y V x = y 

i.e. < := < u Idx- 

Moreover, < is total if and only if < is total. 
Definition 2.5.2 Let < be a partial ordering on X. 

(a) Let Y C X. An element yo e Y is said to be a <-minimal element of Y if there is no 

y G Y such that y < yo- 

We will refer to yo simply as a minimal element of y if < is clear from context. 

(b) Let y C X. An element yo eY is said to be the <-minimum element of Y if and only if 
Vy G F (yo < v)- 

We will refer to yo simply as the minimum element of y, or the least element of y, if < 
is clear from context. 

(c) The partial ordering < is said to be well-founded if every non-empty subset of X has an 
<— minimal element. 

(d) A total ordering < is called a well-ordering if every non-empty subset of X has a <- 
minimum element. 



□ 



33 



Relations and Functions 



Exercise 2.5.3 Let {X, <) be a partial ordering, and let Y C X). 

(a) Every minimum element of y is a minimal element. The converse may be false. 

(b) A subset of a partial ordering can have at most one minimum element. 

(c) If (X, <) is a chain, then every minimal element of Y is also a minimum element of Y. 

□ 

Definition 2.5.4 Let {A, <) and {B, ^) be partially ordered sets. 

(a) A function f : B is order-preserving if and only if 

x<y ^ f{x) ^ f{y) 

(b) {A, <) and (B, ^) are said to be order-isomorphic if there is a bijection f : A ^ B 
with the property that both /, ^ are order preserving. Such an / is called an order- 
isomorphism. 

□ 

Exercise 2.5.5 (a) Let <) and (B,^) be partially ordered sets. Show that a bijection 
f : A ^ B is an order-isomorphism if and only 

x<y ^ f{x) < f{y) 

(b) Give an example to show that not every order-preserving bijection is an order-isomorphism. 

(c) Show that {A, <) and {B, :<) are chains, then any order-preserving bijection is an order- 
isomorphism. 

(d) Show that if X is a set and X is a family of subsets of X, then the inclusion relation C 
is a partial ordering on X, i.e. {X, C) is a partial ordering. 

(e) Let {X, <) be a partial ordered set. For x e X, define 

I X := {y e X : y < x} and X := {], x : x & X} 
Show that {X, <) and (X, C) are order-isomorphic. 

(Thus, in some sense, C is the only partial order relation that exists, up to isomorphism.) 

□ 



Set Theory 34 



Chapter 3 

Natural Numbers 



God created the natural numbers; everything else is the work of man. 

— attributed to Leopold Kronecker 

It seems that Kronecker may have almost literally believed this. When Lindemann proved 
in 1882 that vr is trancendental (i.e. not the root of a polynomial with rational coefficients), 
Kronecker opined that the proof was indeed beautiful, but since irrational numbers do not 
exist, Lindemann's proof had no meaning. 

Certainly, Kronecker was the first and most vociferous critic of Cantor's transfinite sets[^ 

In 1889, Giuseppe Peano listed five axioms with the intention of providing a rigorous 
foundation for the natural numbers. The Peano Postulates are: 

I. is a natural number. 

II. Every natural number has a successor s(n) which is also a natural number. 
(The intention is that s(n) = n + 1.) 

III. is not the successor of any natural number. 

IV. If two natural numbers have the same successor, then they are identical, i.e. if s(n) = 
s{m), then n = m. 

V. If X is a set of natural numbers such that: 

(i) G X, 

(ii) If n £ X, then s{n) £ X, 

then every natural number belongs to X. 

(This is the Principle of Mathematical Induction.) 

The informal idea is that the set of natural numbers is the set 

{0, s(0), s{s{0)), s{s{s{0))), ...} 

so that every natural number is obtained from by finitely many applications of the successor 
function s(-). The point of this chapter is to create the natural numbers: We shall construct 
an object which satisfies the Peano Postulates inside the set-theoretic universe. 

^Doron Zeilberger is a well-known modern mathematician who holds similar opinions and publishes them 
regularly. They're worth reading. Go have a look on the www. 



35 



Set Theory 



36 



In this chapter, we work mainly with the theory ZF~, which is ZFC without the Axioms 
of Choice and Foundation. AC will be used occasionally, but its use will in all cases be stated 
expplicitly. 

3.1 The Set lu of Natural Numbers 

Let us briefly recall some definitions made when we introduced the Axiom of Infinity: We 
defined the successor function S{-) on sets by 

S{x) := xU {x} 

(Some authors write x~^ instead of S{x).) We defined a set X to be inductive if 

(i) e X, and 

(ii) X is closed under S{-): U x £ X, then S{x) £ X. 
The Axiom of Infinity is the statement 

There exists an inductive set 
Now let I be the class of all inductive sets: 

I := {x : Ind(j;)} where Ind(x) = £ x A {y £ x ^ S{y) £ x) 



By the Axiom of Infinity, I is non-empty. It follows from Exercise 1.4.7 that P|I is a set. 
(Indeed, if A is any inductive set, then f]I = {x £ A : Ind(x)} is a set by Separation.) 
We will give this set a name: 

and call u the set of natural numbers. Another commonly used symbol for the set of natural 
numbers is N. 

It is easy to see that uj is itself an inductive set. It follows that uj is the smallest inductive 
set: If X is any inductive set, then u <Z x. 

Exercise 3.1.1 (a) Show that the intersection of a non-empty class of inductive sets is an 
inductive set. 

(b) Conclude that uj is the smallest inductive set: If x is any inductive set, then a; C x. 

□ 

Since £ u and u is closed under S{-), we see that u must have the following elements 
(which we simultaneously provide with names): 

:= 

1 := 5(0) = {0} 
2:= 5(1) = {0,1} 
3:= 5(2) = {0,1,2} 
4:= 5(3) = {0,1,2,3} 



= {0,l,2,...,n-l} 



37 



Natural Numbers 



Observe that the set n has (informal) n elements. 

If n S cj, we write n + 1 := S{n) (i.e. n + 1 is another notation for S{n) := nU {n}). 
The following result is easy: 

Theorem 3.1.2 (Principle of Mathematical Induction) Suppose that X <^ u has the proper- 
ties that 

(i) e X. 

(n) IfneX, then n+1 e X. 
Then X = uj. 

□ 

Exercise 3.1.3 Prove the Principle of Mathematical Induction. 

□ 

Let us investigate the structure of the set uj in greater detail. We want to verify that uj, S 
satisfy the Peano Postulates. We won't take the simplest route — things are about to get 
quite technical — but the route we follow is good preparation for the general study of the 
ordinals. In particular, we will also study the order relation on the natural numbers. 

3.2 (cj, g) is a Transitive Well— Ordered Set 

Above, we saw that 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2} . . . etc. Thus, intuitively, m < n if 
and only if m G n. In other words, it seems that the G-relation acts as a strict total ordering 
< on the set of natural numbers. Of course, the set of natural numbers yields the primary 
example of a well-ordered seQ It is the fact that the natural numbers are well-ordered 
that makes possible proofs by induction and definitions by recursion, so the importance well- 
ordering is hard to overstate. 

In this section, we make precise and verify the above assertions for our interpretation uj 
of the set of natural numbers. We begin with some definitions: 

Definition 3.2.1 (a) A set X is transitive if each element of X is also a subset of X: 

Trans(X) = Vx € X[x C X) 
(This means that if y G x and x G X, then also y €z X.) 

(b) An element y S x is said to be an ^-minimal element of x if there is no z G y is such that 
z G X. Put differently, y G x is an G-minimal element of x if y H x = 0. 

(c) A set X is well-founde^\i every non-empty subset z C x has a G-minimal element. 

^Recall that a total ordering (X, <) is a well-ordering if every non-empty subset of X has a least element. 
We will investigate well-ordered sets in more detail in the next chapter. 

''By the Axiom of Foundation, every non-empty set has an €-minimal element, i.e. every set is well-founded. 
But we do not want to use this axiom here if we can avoid it. 



Set Theory 



38 



□ 



Lemma 3.2.2 (a) If x is transitive, then so is S{x). 
(h) If X is well-founded, then x ^ x. 

(c) If X is transitive and well-founded, then so is S{x). 

(d) If x,y are sets such that (i): x is transitive and well-founded, and (ii): S{x) = S{y), 
then X = y. 

Proof: (a) If y G S{x) = x U {x}, then either y G x or y = x. Since x is transitive, y C x in 
either case, and thus y C S{x) also. 

(b) {x} is a non-empty subset of x, and thus has a S-minimal element, i.e. there is 
y € {x} so that y n {x} = 0. But then x = y, so x (1 {x} = 0, contradicting the fact that 
X G X n {x}. 

(c) We have already seen that S{x) is transitive if x is. Now suppose further that x is 
well-founded, and that y Q S{x) is non-empty. We distinguish two cases: 

Case (i): If y fix is non-empty, then since x is well-founded, yDx has an S-minimal element 
z, i.e. there there is z € y D x such that z Ciy D x = 0. We claim that z is an G-minimal 
element of y also. For suppose that there is w £ z Dy. Then certainly w G z. But as z G x 
and X is transitive, we must have w G x as well. But then w G z D y D x — contradiction, 
because z Ci y Ci x = 0. 

Case (ii): If y D x = 0, then y = {x}. But since we know that x x, we see that x is a 
E-minimal element of y. 

(d) Suppose X ^ y. Then x G y and y G x. Since x is transitive, x G x. But since x is 
well-founded, x x — contradiction. 

H 



Lemma 3.2.3 (a) If X is inductive, then {x G X : x CI X} is inductive also. 
Hence uj is transitive. 

(h) If X is inductive, then {x G X : x is transitive and well-founded} is inductive also. 
Hence every element n G u is transitive and well-founded. 

(c) UJ is well-founded. 

(d) If X is inductive, then {x G X : x = V 3y G X (x = S{y))} is inductive also. 
Hence every non-0 element of oj is the successor of some element ofuj. 

Proof: (a) Let y := {x € X : x C X}. Then gY. Now suppose that x gY. Then x G X, 
so S{x) G X, because X is inductive. Further, x C X, and hence S{x) C X. Since S{x) G X 
and S{x) C X, we see that S{x) G Y. Hence Y is inductive. 

It follows that {n G a; : n C w} is an inductive subset of co. Since uj is the smallest inductive 
set, we see that n C a; holds for every n G oj. Thus oj is transitive. 

(b) Let Y = {x G X : X is transitive and well-founded}. Certainly gY. Now if x G y, 
then ^(x) G X (because X is inductive). Moreover, ^(x) is transitive and well-founded (by 



Lemma 3.2.2). Hence S{x) G Y, and thus Y is inductive. 



39 



Natural Numbers 



It follows that {n £ uj : n is transitive and well-founded} is an inductive subset of w, and 
thus that it is equal to w. 

(c) Let X C a; be a non-empty set, and let n £ X. If nOX = 0, then n is an G-minimal 
element of X. Else n H X is a non-empty subset of the well-founded set n, and therefore 
nf] X has an G~minimal element m. We claim that m is an G-minimal element of X also. 
Indeed, since m G n and n is transitive, we have m <^ n, and hence mriX = nirinr]X = 0. 

(d) Let Y := {x G X : X = V 3y £ X {x = S{y))}, so that a non-empty element 
of X belongs to Y if and only if it is the successor of some element of X. Then clearly 
G Y. Moreover, if x £ Y, then clearly S{x) is the successor of some element of X, and 
hence S{x) G Y also. 

As above, we see that {n G a; : n = V 3m E w (n = S{m))} is an inductive subset of u, and 
hence is equal to u. 



We can now see that uj, S do indeed satisfy Peano's axioms: 
I. £ CO, because u is inductive. 
II. If n G w, then S{n) G to, because u is inductive. 

III. / S{n) for any n, because is the empty set, whereas n £ S{n). 

IV. If S{n) = S{m), then n = m, by Lemmas 3.2.2[ d) and 3.2.3| ^b). 
V. The Principle of Mathematical Induction holds, by Theorem [3.1.2 



Now let's investigate the order relation on the set of natural numbers. Define a binary 
relation < on a; as follows: 

m < n if and only if m £ n 

As usual, "n < m" means ^'n = m V n < m". 

Observe that 0<1<2<3<..., because 0g1g2g3g 

Theorem 3.2.4 (oj, £) is a transitive strict well-ordered set. 

□ 

We already know that cj is a transitive well-founded set, and that the same is true for 
any n £ uj. The proofs of the remaining assertions of this theorem are given in the following 
exercise: 

Exercise 3.2.5 (a) Show that < n for all n £ uj. 

[Hint: Let X := {n £ uj : < n} and show that X is an inductive subset of uj.] 

(b) Show that for all m,n £ uj we have m < n + 1 if and only if m < n. 

(c) Show that for all m,n £ uj, we have m < n if and only if m + 1 < n. 
[Hint: Let X := {n £ uj : Vm £ uj (m < n — )• m + 1 < n)}.] 

(d) Show that the relation < is transitive: li k < m and m < n, then k < m. 



Set Theory 



40 



(e) Show that the relation < is asymmetric: li n < m, then m ^ n. 

(f ) Show that the relation < is connected: For any n, m, either n < m oi n = m oi m < n. 
[Hint: Show, with the aid of (a)-(c) that the set {n e oj : Vm e oj {n < m V n = 
m V m < n} is inductive.] 

(g) Show that if X C a; is non-empty, then X has a <-least element. 

□ 

We thus see that w is well-ordered by the G-relation. Furthermore, if n G w, then n C w, 
since cj is transitive. Since every subset of a well-ordered set is also well-ordered, we see that 
each n G w is also well-ordered by G. Thus if either x e oj ot x = oj, then x has the following 
properties: 

(i) X is transitive. 

(ii) The G-relation well-orders x. 

This means that every n E u, as well as oj itself, is an ordinal: An ordinal is a transitive set 
which is well-ordered by the G-relation. So we have ordinals 

0, 1,2,3, ...,a; 

But now it follows easily that ^(t^'), SS{u}), SSS{lu), . . . are transitive and well-ordered by G 
also. This yields transfinite ordinals 

u + l,uj + 2,uj + 3, . . . 

We will examine ordinals more closely in the next chapter, where we shall see that every 
well-ordering is order-isomorphic to a unique ordinal. 

We remark that assuming the Axiom of Foundation, the G-relation is a well-ordering as 
soon as it is a linear ordering. However, we have not used Foundation in this chapter. 

3.3 Recursion 

In practically all branches of mathematics there are instances of definitions by induction. 
Typically, one is given an initial object xq and then a rule to transform a given object Xn into 
a new object Xn+i- The following theorem shows that our definition of the natural numbers 
allows for such definitions: 

Theorem 3.3.1 (Recursion Theorem — Version I) 

Let f : X X be a function, and let xq E X. Then there exists a unique function g : uj ^ X 
with the properties that 

3(0) = xo g{n + 1) = f{g{n)) (*) 

□ 

The idea is that ^(0) = xo, xi = g{l) = f{g{0)) = f{xo), X2 = g(2) = f{g{l)) = f{xi), 

etc. 

The proof of the above theorem is given by the next exercise: 



41 



Natural Numbers 



Exercise 3.3.2 (a) First prove uniqueness: If gi : uj ^ X and , g2 '■ uj ^ X are two functions 
which both satisfy (*) of Theorem 3.3.1, then gi = g2- 

(b) Let be the set of ah subsets of F C x X with the following properties: 

(i) (0,xo) G F; 

(ii) If (n, x) G F, then (n + 1, f{x)) G F. 

Let (7 := f^J-- Explain why g is a set, and why g £ J^. 

(c) Since g is a set of ordered pairs it is a binary relation. Show that dom(g) = oo. 

(d) It remains to show that the relation g is a function, i.e. that if (n,x), (n, y) G g, then 
X = y. Define 

S := {n £ oj : Vx, y £ X {{n, x) £ g A {n, y) £ g ^ x = y)} 
We must show that S = oj. First show that £ S. 

[Hint: Suppose that (0, y) £ g for some y ^ xq. Show that G — {(0, y)} G J^.] 

(e) Finally, show that S = u. 

[Hint: Show that S is inductive. If n G 5, there is a unique x such that (n, x) G G, and 
hence certainly (n + 1, /(x)) G G. Suppose that {n + l,y) £ G for some y / x. Show that 
G-{{n + l,y)}£T] 

□ 

In some definitions by induction, one needs not just the current value Xn to define Xn+i, 
but also earlier values Xm for m < n. For example, the Fibonacci sequence is defined by 

Xq = Xi = 1 Xn+l = Xn-l + Xn for n > 1 

In general, we may want to define function of xq, xi, . . . , x^j. The way to do this is 

as follows: Given a set X, let 

X^'^ := {h : h is a function with dom(/i) G uj and ran(/i) C X} 

It is not hard to see that X^^ is a set, as each h £ X^^ is a subset oi u x X. If /i G X^^ has 
dom(/i) = n, and h{k) = £ X for k < n, then we indicate this state of affairs by writing 

h = {ho, . . . ,/i„_i] 

i.e. we may think of an element of X'^^ as being finite sequence of elements of X. This is 
merely a temporary notation to make the discussion that follows easier to understand. Note 
that the empty sequence | ] := G X'^^. 

Now let / : X^^ X he a function. Operating on an intuitive level, we can inductively 
define a sequence 

2^0 = /(ID Xi = f{lxoj) X2 = /([xo,xiI) . . . X„+i = /(|xo,xi . . . ,x„]) 
If we (still operating intuitively) define a function g : uj ^ X : n ^ Xn, then we have 

g(0)=xo = /([l) = /(<7fO) 



Set Theory 



42 



because g \ = = I}. 
Next, 

<7(l) = xi = /(Ixol) = /(gri) 

because 5 f 1 is the finite sequence 1(7(0)] = |xol- 
Then, 

5(2) = X2 = /([xo,xil) = /(5 r2) 

because g \ 2 = lg{0),g{l)} = [xcxi], etc. 

The fohowing theorem states that the above construction can be accomphshed within the 
confines of ZF~ : 

Theorem 3.3.3 (Recursion Theorem — Version II) 

For any set X and any function f : X"^^ ^ X there is a unique function g : uj ^ X such 
that 

g{n) = f{g\n) = f{{g{d),g[l),...,g{n -1)1) for all n ^ uj 
Proof: Uniqueness of g follows easily by induction. To prove that g exists, we use Theorem 



3.3. 1| First, define F : X<'^ X<'^ by F{h) := /i U {(dom(/i), /(/i))}. This means that 
F([xo, . . . , x„_i|) := {xo, . . . , Xn-i, filxo, • • • , a;n-il)l 



By Theorem 3.3.1 there is a unique G : uj ^ X^^ such that 

G(0) = I1 G(n + l) = F(G(n)) 

Thus G(n + 1) = G{n) U {(n, f{G{n)))}. Observe that each G{n) is a function G{n) : n — t- X, 
and that G{n + 1) f n = G{n). Now let g := Uran(G) = [j^^^ G{n). Then g : co ^ X, and 
g \ n = G{n) for all n £ uj. Hence g{n) = G{n + l)(n) = f{G{n)) = f{g \ n), as required. 



Now that we've constructed the natural numbers, it is possible to define the operations 
of addition and multiplication on the set of natural numbers by recursion. However, we will 
leave this till a later chapter, where we shall define these operations on the class of all ordinals. 
We can also: 

• Construct the integers Z as a set of equivalence classes of ordered pairs of natural 
numbers. 

• Construct the rational numbers Q as a set of equivalence classes of integers. 

• Construct the real numbers M, as the set of Dedekind cuts {L, U), where L, U are non- 
empty subsets of Q with the property that U has no minimum element, and for all / S L 
and u G U we have / < u. 

• Construct the complex numbers C as the set of ordered pairs of real numbers. 

Of course, we must also define all the usual operations and relations on these sets as well. All 
this can be done, but we won't pursue it in this course. Try Edmund Landau's Foundations 
of Analysis for a mercilessly rigorous approach. 



43 



Natural Numbers 



3.4 Finite and Infinite Sets 

Observe that if n G u, then n = {m : m < n}, because < is G. Further observe that if 
m,n G u, then m < n if and only if m C n, because G is transitive on the set of natural 
numbers. 

Definition 3.4.1 Suppose that n G A set X is said to have n elements if and only if there 
is a bijection from X to the set n. In that case we say that the cardinality of X is n, and 
write |X| = n. A set X is said to be finite if there is n G a; such that \X\ = n. A set is said 
to be infinite if and only if it is not finite. 

□ 

Thus a set X has n elements if it can be indexed by the set n, i.e. can be written 
X := {xi : i G n}. Since the G is <, this can also be written as X := {xi : i < n}. This state 
of affairs may also be indicated by writing X := {xq, xi, . . . , Xn-i}- Clearly each n G a; is a 
finite set, and |n| = n. 

Tlieorem 3.4.2 (a) The notion of "number of elements of a finite set" is well-defined: If 
\X\ = n and \X\ = m, then n = m. 

(h) A subset of a finite set is finite: If X is finite, and y C X. Then Y is finite. In addition, 
\Y\ < \X\. 

(c) The image of a finite set is finite: If X is finite and f is a function, then f[X] is finite. 
In addition \f[X]\<\X\. 

(d) The union of a finite family of finite sets is finite. 

(e) The power set of a finite set is finite. 

(f) A set X is infinite if and only if for every n £ oj there is an injection f : n ^ X. 

(g) (Using AC) A set X is infinite if and only if there is an injection f : oj ^ X . 



□ 



The proofs of the above assertions are contained in the exercises that follow. First, how- 
ever, we state and prove a Lemma: 

Lemma 3.4.3 Let n £ uj and let X be a proper subset of n. Then there is no bijection 
f : X n. 

Proof: Let m G a; be least such that there a bijection / : m X onto a proper subset 



X ^ m. Then m > (because = has no proper subsets), and hence by Lemma 3.2.3 'd) 
there \s n £ uj such that m = n + 1. 

Now consider two cases: 
Case 1: If n G X, there is k < n such that f{k) = n. Let g : n + 1 n + 1 be the bijection 
which has g{k) = n, g{n) = k, and which has g{m) = m for all m ^ {k,n}. (Thus g is the 
permutation which swaps k and n, and leaves all other elements of n + 1 unmoved.) Then 
f '■= f ° 9 ^ bijection f : n + 1 >^ X with f{n) = n. It is easy to see that / \ n maps n 



Set Theory 



44 



onto X — {n}, a proper subset of n. But this contradicts the minimahty of m = n + 1. 
Case 2: li n ^ X, then X C n. It is now easy to see that / \ n maps n onto X — a 
proper subset of n — again contradicting the minimahty of m. 

-\ 

Exercise 3.4.4 (a) Prove Theorem 3.4.2[ a): Show that if \X\ = n and \X\ = m, then 
n = m. 

[Hint: First show that if n 7^ m, then either n ^ m or else m ^ n. Now use Lemma 



3.4.3 



(b) Show (without using the Axiom of Choice) that the foUowing are equivalent for n,m £ ui: 

(i) n < m. 

(ii) There is an injection n ^ m. 

(iii) There is a surjection m ^ n. 

[Hint: (ii) implies (i): If / : n ^ m and m < n, then / is a bijection from n onto 
ran/ ^ n. 

(iii) implies (ii): If g : m ^ n, one can define a right inverse f : n ^ m without using 
AC] 

(c) Suppose that X, Y are finite sets. Show (without using AC) that the following are equiv- 
alent: 

(i) 1^1 < \y\- 

(ii) There is an injection f : X Y. 

(iii) There is a surjection g lY ^ X. 

[Hint: Use (b).] 

(d) Prove Theorem 3.4.2K b): Show that if X is finite and Y C X, then Y is finite, and 
\Y\ < \X\. 



(e) Prove Theorem 3.4.2 c): Suppose that X is a finite set, and that / is a function. Show 
that f[X]\ is a finite set, and that |/[^]| < |^|. 

□ 

Exercise 3.4.5 (a) Suppose that X, Y are finite sets. Show that X UY is finite. 

[Hint: Recall that we haven't defined addition on the natural numbers, only the successor 
function. Suppose that m is least for which there are finite sets X, Y with |X| = m and 
such that X UY infinite. Show that m = n + 1 for some n. Let x €z X, and consider 
X' ■.= X - {x} and Y' ■.= YU {x}.] 

(b) Prove Theorem [3A2](d) : Show that if X = {Xq, Xi, . . . , Xn-i} is a finite family of finite 
sets, then |J = |Jfc=o -^k is finite. 

(c) Prove Theorem |3.4.2[ e): Show that if X is finite, then V{X) is finite. 

[Hint: Induction on \X\. Observe that if x G X, then V{X) = XVjy, where X := {Y (1 
X ■ X ^Y] a.ndy := {Y X : X e Y].] 



45 



Natural Numbers 



□ 

Exercise 3.4.6 Prove Theorem 3.4.2[ f): Show that a set X is infinite if and only if there is 
an injection f : n ^ X for every n £ u. Conclude that every inductive set is infinite. 

□ 

A set X is said to be Dedekind infinite if there is a bijection from X onto a proper subset 
of X, and Dedekind finite otherwise. 

Exercise 3.4.7 (a) Show that every finite set is Dedekind finite, and that every Dedekind 
infinite set is infinite. 

(b) Prove (without using AC) that a set X is Dedekind infinite if and only if there is an 
injection uj X. 

[Hint: Suppose that Y X is a proper subset of X, and that f : X Y is an 
injection. Let xq £ X — Y, and define g : u) ^ X hy 

g{0) = xo g{n + l) = figin)) 

If g : uj ^ X, consider f : X ^ X hy 



fix) 



where S is the successor function. 



X if X ^ ian{g) 
g{S{g'^{x))) ifxGran(5) 



(c) Prove (with the aid of AC) that a set X is (in)finite if and only if it is Dedekind (in)finite. 
Conclude that Theorem 3.4.2[ g) holds. 

[Hint: Suppose that X is the set of all non-empty subsets of X, and that F : X ^ [JX 
is a choice function. Consider g : uj ^ X defined by 

g{n) := F{X — ran(g \ n)) 

and use Theorem [3X2t f).] 

□ 



Set Theory 46 



Chapter 4 

Ordinals 



4.1 Well— Orderings 

Let (X, <) be a totally ordered set. If x £ X, then we define the initial segment given by x 
by: 

X{x) := {y £ X : y < x} 

Thus X{x) is the set of all elements that are strictly below x. Clearly X{x) is a totally 
ordered set also (with the relation <f X(x) x X{x) inherited from X). 

Lemma 4.1.1 Suppose that (X, <) is a total ordering. Define X := {X{x) : x G X}. Then 
the map f : X ^ X : x ^ X{x) is an order-isomorphism from (X, <) to {X ^ C). 

Proof: It suffices to show that x < y if and only if X{x) C X{y). It is obvious from the 
transitivity of < that X[x) C X{y) whenever x < y. Conversely, suppose that X{x) C X{y). 
If X ^ y, then y < x, so y G X{x), and hence y G X(y) — contradiction. 

H 

When two partial orderings {X, <) and (Y, -<) are order-isomorphic, we say that they 
have the same order-type. 

Recall that a total ordering (W, <) is called a well-ordering if every non-empty subset 
of X has a <-least element. Clearly, the set of natural numbers, together with its usual 
ordering, is a well-ordering. 

The following lemma gives two methods for constructing well-orderings on products. It 
will prove useful when we discuss the arithmetic of ordinal and cardinal numbers. 

Lemma 4.1.2 (a) Suppose that {X, <) and {Y, -<) are strict partially (totally, well-) ordered 
sets. Define the lexicographic ordering C on X x Y by 

(xi,yi) C (x2,y2) xi < X2 V (xi = X2 Ayi-<y2) 

Then \Z is a strict partial(totally, well-) ordering on X x Y . 

^Lexicography: The editing or making of a dictionary. 
In a dictionary, words are ordered lexicographically. To decide which word occurs first in the dictionary, you 
look at their first letters. If these are the same, you look at their second letters. . . 



47 



Set Theory 



48 



(b) Suppose that {X,<) is a strict well-ordered set. Define the canonical ordering C on the 
product X X X by 

max{x,y} < max{x',y'} 
max{x, y} = max{x', y'} and x < x 
max{x, y} = maxja;', y'}, x = x\ and y < y' 

Then \Z is a strict well-ordering on X x X . 



{x,y) C {x',y') 



or 
or 



□ 



Exercise 4.1.3 Consider (w, <), the set of natural numbers equipped with its usual order 
relation. 

(a) Write down the first ten elements in the lexicographic ordering \Zi of oj x uj. Hence show 
that (w X w, C;) is a well-ordering, but not order-isomorphic to (w, <). 

[Hint: Observe that the element (1,0) is not the successor of any {n,m) & uj x oj.] 

(b) Prove Lemma 4.1.2| ^a). 

(c) Write down the first ten elements in the canonical ordering Cc of u x uj. Hence show that 
(a; X UJ, Cc) is order-isomorphic to {uj, <), and thus a well-ordered set. 

(d) Prove Lemma 4.1.2[ b). 

□ 



Exercise 4.1.4 (a) Show that a finite total ordering is a well-ordering. 

[Hint: You must show that if {X, <) is a finite total ordering, and if y is a non-empty 
subset of X, then Y has a least element. Do this by induction on \Y\.] 

(b) Here is a useful characterization of well-ordered sets: A total ordering (X, <) a well- 
ordering if and only if there is no sequence (xn) of elements of X such that xi)x2 > xs > 
.... Prove this (using AC). 

[Hint: One direction is easy. For the other, suppose that Y C X has no least element, 
and let F be a choice function for family the non-empty subsets of Y. Define g : uj ^ Y 
recursively by g{n) = F{Y - {g{0), g{l), . . .,g{n- 1)}).] 



□ 



It is precisely well-ordering that allows arguments by induction: 

Theorem 4.1.5 (Induction on a Well-Ordering) Let (W, <) be a well-ordered set. Let E C 
W be a set with the following properties. 

(i) The least element ofW is a member of E. 

(a) For all X G X, ifW{x) C E, then x G E. 



Then E = W. 



49 



Ordinals 



Proof: Else let wo be the least element oi W — E. Then W{wo) C E, so wq £ W — 
contradiction. 

H 

In preparation for the definition of the transfinite ordinals, we will now begin our inves- 
tigation of the structure of well-ordered sets. Most of the results that follow in this section 
are due to Cantor. 

Theorem 4.1.6 Let {W,<) be a well-ordered set. 

(a) Let f : W ^ W be a strict order-preserving map. Then x < f{x) for all x £ W. 

(b) Idw is the only order-automorphism from W to W. 

Proof: (a) Let E := {x £ W : x ^ f{^)}- If E ^ 0, then E has a least element xq. 
Since xq ^ /(xq), we have /(xq) < xq, and hence /(/(xq)) < f{xo). But then /(xq) € E, 
contradicting the fact that xq is the least element of E. Hence E = 0. 

(b) If / is an order-preserving automorphism, then both the maps /, f~^ are strict order- 
preserving. Hence x < /(x) and /(x) < /^^(/(x)). 

H 

Corollary 4.1.7 If (W, <) and {V, :<) are order-is amorphic well-ordered sets, then there is 
a unique order-isomorphism from W onto V. 

□ 



Corollary 4.1.8 No well-ordering (W, <) is order-is amorphic to an initial segment W{wo) 
of itself. 

□ 

Theorem 4.1.9 Let (W, <) and {V, :<) be well-ordered sets. Then exactly one of the following 
is the case: 

(a) W,V are order-is amorphic. 

(b) W is order-isomarphic to an initial segment of V , i.e. there is vq G V such that W is 
order-is amorphic to V{vo). 

(c) V is order-isomarphic to an initial segment ofW, i.e. there isw^^W such that W{wq) 
is order-is amorphic to V . 

Proof: Define a set of ordered pairs / by 

/ := {{w,v) £ W X V : W{w) is order-isomorphic to V{v)} 

Observe first that / is a function: For suppose that {w, vi) and {w, V2) both belong to /, where 
vi / f2- Without loss of generality, we may assume that vi -< V2. Then V{vi) and ^(^2) are 
order-isomorphic (as each is order-isomorphic to W{w)). But then, if h : ^(^2) — ?• V{vi) is 



an order-isomorphism, we must have h{vi) -< vi, which contradicts Theorem 4.1.6 



Set Theory 



50 



Next observe that / is a strict order-preserving map: For suppose that {wi,vi), {w2, V2) € 
/ are such that wi < W2- We want to show that vi -< V2- Now there is an order-isomorphism 
h : W{w2) Viv2), and this (when restricted) yields an order-isomorphism / f W{wi) : 
W{wi) — >■ V{h{wi)). Hence {wi,h{wi)) G /, so that h{wi) = vi. But as h{wi) G V{v2), we 
see that vi -< V2, as required. 

In particular, it follows that / is one-to-one. 

Observe that dom(/) is downwards-closed: If wi < W2 and W2 G dom(/), then wi G 
dom(/) as well. Indeed, there isv2&V such that W{w2) is order-isomorphic to V{v2)- Since 
wi G W{w2), there must be vi G ^(^2) such that W{wi) is order-isomorphic to ^(^1), so 
that wi G dom(/). 

In the same way, it can be shown that ran(/) is downwards-closed. 

Now we distinguish three cases: 

Case 1: dom/ ^ W: In that case, let wq be the <-lcast element of W — dom/. We claim 
that / is an order-isomorphism from W{wo) onto V. Now because dom(/) is downwards- 
closed, there is no w G dom(/) such that u'o < w. Hence dom(/) = W{wo). If / is not sur- 
jective, let vq be the ^-least element ofV — ran(/). Then clearly W{wo) is order-isomorphic 
to V{vo) — contradicting the fact that wq ^ dom(/). Hence / is a strictly order-preserving 
map with dom(/) = W{wo) and ran(/) = V. 

Case 2: ran/ ^ V: In that case, let vq be the -< -least element of F — ran(/). Arguing as 
in Case 1, it follows that / is an order-isomorphism from W onto V{vO). 

Case 3: dom/ = W and ran/ = V. In that case, / is an order-isomorphism of W onto V. 

H 

4.2 Ordinals 

Recall that the set uj of natural numbers was shown to have the following properties: 

(i) a; is a transitive set: Vn (ra G a; n C w). 

(ii) UJ is well-ordered by the G-rclation: The relation < on u defined by n < m <^=^ n £ m 
is a strict total-ordering on lo, and every non-empty subset of u has a <-least element. 

Furthermore, we saw that every n E lo has precisely the same properties: Each n G a; is 
transitive, and well-ordered by G. 

This suggests a definition, due to John von Neumann: 

Definition 4.2.1 A set is said to be an ordinal if and only if it is transitive and well-ordered 
by G. 

□ 

Thus UJ is an ordinal, as is each n e uj. 

Exercise 4.2.2 Recall that n = {m : m < n} for all n G uj. Thus uj{n) = n for all n G w, 
where uj{n) is the initial segment of uj determined by n. In fact, this property characterizes 
the ordinals: Show that a strict well-ordering {W, <) is an ordinal if and only if W{x) = x 
for all X gW. 

[Hint: (<^): v < w iS v £ W{w) iSvGw. li v £w £W, then v G W{w) C W.] 



51 



Ordinals 



□ 



We shall use the first few letters of the Greek alphabet to denote generic ordinals. Since 
G is a strict order relation, we shall also write a < (3 when a G (3. 

Observe that if a is a non-zero ordinal and xq is the first (i.e. the least) element of a, 
then xq = = 0. For if x G xq, then x G a (by transitivity of a), and x < xq (because < is 
G), which contradicts the fact that xq is the least element of a. Suppose now that xi is the 
next element in a, assuming it has one, i.e. xi is the least element in a which is > xq. Then 

X G xi <^=^> X < xi <;=^ X = Xo 

and hence xi = {xq} = {0} = 1. Now if X2 is the next element of a (again, assuming there is 
one), then 

X G X2 <^=^ X < X2 <^=^ X = Xo V X = Xl 

and so X2 = {xo, xi} = {0, 1} = 2. 

It thus appears that each ordinal a starts 0<1<2<3<.... Each of these elements is 
itself an ordinal. 



In Remarks 4.2.13, we will explain that the existence of an inductive set is equivalent to 
the existence of an infinite set. Thus keep the Axiom of Infinity in mind while reading the 
proofs of the next few propositions and theorems, and observe that these proofs do not rely 
on this axiom. 



Proposition 4.2.3 // a is an ordinal and 13 £ a, then /3 is an ordinal. 



Proof: Suppose that a is an ordinal, and that (3 G a. We first show that f3 is transitive, 
i.e. that if 7 G (3, then 7 C Now if 7 G /3 and 5 G 7, then (by transitivity of a, we have 
S,^, 13 G a. Since < is G on a, we have 5 < 7 < /?, and since < is transitive, we have 6 < /3, 
i.e. (5 G /3. Thus 7 C /3. 

H 



Now observe that every ordinal is a transitive wellfounded set: Every non-empty subset 
of an ordinal has a least element, which must be an G-minimal element. If X, Y are transitive 
wellfounded sets, then a G-isomorphism from X to y is defined to be a bijective map vr : 
X ^ Y with the property that 

Xl G X2 <^=^ vr(xi) G 7r(x2) (xi,X2GX) 

Note that if a, f3 are ordinals, then an G-isomorphism from ct to /3 is precisely an order- 
isomorphism. 

Also observe that if vr : X — )■ y is an G-isomorphism between transitive wellfounded sets, 
and X G X, then 7r(x) = {vr(x') : x' G x} = 7r[x]: Indeed, if y' G vr(x), then, there is some 
x' G X such that 7r(x') = y' . It follows that 7r(x') G vr(x), and hence that x' G x. 

The following proposition will be useful. 



Proposition 4.2.4 Suppose that X,Y are transitive well-founded sets, and that tt : X ^ Y 
is an G-isomorphism. Then 7r(x) = x for all x G X . Hence X = Y . 



Set Theory 



52 



Proof: If {x £ X : 7r{x) 7^ x} is non-empty, it has a 6-minimal element xq. Now if x £ xq, 
then also x £ X, and hence 7t{x) = x. Thus 

7r(xo) = {vr(x) : x G xq} = {x : x G xq} = xq 

— contradiction. Hence {x E X : 7r(x) 7^ x} = 0. 

H 

Theorem 4.2.5 // a, /3 are ordinals, then exactly one of the following is the case: 
(i) a = f3, or 
(a) a £ 13, or 
(Hi) P £ a. 

Proof: a, /3 are both transitive sets well-ordered by G, and hence wellfounded. Hence at 
most one of the above alternatives can be the case. 



By Theorem 4.1.9|either (i) a, (3 are order-isomorphic, or (ii) a is isomorphic to an initial 



segment of (3, or (iii) /3 is order-isomorphic to an initial segment of a. 

Suppose that a is isomorphic to an initial segment of (3, i.e. that there is f3o £ (3 and an 
order-isomorphism vr : a /3(/3o)- Observe that 

/3(/3o) := {7 G /3 : 7 < /?o} = {7 G /3 : 7 e /3o} = /3n /3o = /3o 
because /3q C /3. Thus vr is an G-isomorphism from the transitive wellfounded set a onto the 



transitive wellfounded set /3o- By Proposition 4.2 A we have a = /3o, and thus a £ f3. 



The remaining two cases can be dealt with in a similar fashion. 

H 

The next corollary follows immediately: 
Corollary 4.2.6 If a, f3 are ordinals, then a £ f3 if and only if a'^ /3. 

□ 

We denote the class of all ordinals by 

ON := {a : Q is an ordinal} 
We shall soon see that ON is a proper class, i.e. not a set. For the moment, however, note 



that ON is a class which is well-ordered by £: By Theorem 4.2.5 G is a strict linear ordering 
on ON. To see that it is a well-ordering, suppose that Y is a non-empty subclass of ON, 
and let a G Y. If a H Y = 0, then a is the least element of Y. Else, a H Y is a non-empty 
subset of the ordinal a, and hence has a least element (3. That /3 is easily seen to be the least 
element of Y. Thus every non-empty subclass of ON has a least element. 
Here are some additional facts about ordinals that are easy to establish: 



Proposition 4.2.7 (a) If a is an ordinal, then a = {{3 : (3 < a}. 



53 



Ordinals 



(b) If a, (3 are ordinals, then exactly one of the following is the case: either a is an initial 
segment of f3, or a = 13, or f3 is an initial segment of a. 

(c) If C is a non-empty class off ordinals, then {^C is an ordinal. In fact, [^C is the least 
element of C (i.e. H C S C and H ^ = inf C ). 

(d) If C is a set of ordinals, then \JC is an ordinal. Furthermore, [jC = supC. 

(e) If a is an ordinal, so is S{a) := aL){a}. Furthermore, S{a) is the successor of a, i.e. 
the least ordinal which is > a. 



□ 



Exercise 4.2.8 Prove the preceding proposition. 

□ 



Since S{a) is the successor of a we write a + 1 := S{a). An ordinal a is said to be a 
successor ordinal if an only if there is /3 such that a = (3 + 1. An ordinal which is not a 
successor ordinal is called a limit ordinal. Observe that is a limit ordinal, as is uj. Indeed, 
uj is the smallest non-zero limit ordinal: For if a < a; is a non-zero ordinal, it is a non- 
zero natural number, and hence a successor ordinal. Furthermore, the successor of a natural 
number is a natural number, so ui is not a successor ordinal. 



Exercise 4.2.9 Suppose that ao < ai < 02 is a strictly increasing sequence of ordinals. 
Show that sup„^^ a„ is a limit ordinal. 



□ 



Exercise 4.2.10 If a is a limit ordinal, then a = sup{/3 : f3 < a}. If a is a limit ordinal, 
then sup a < a. 

□ 



Exercise 4.2.11 Show that the class ON of all ordinals is a proper class, i.e. not a set. 
[Hint: If ON is a set, then by Proposition 4.2.7, sup ON is an ordinal.] 



□ 



The result of the previous exercise — that the collection of ordinals is too big to be a set 
— is known as the Burali-Forti paradox, after Burali-Forti, who published it in 1897. It is 
probably the earliest of the paradoxes that set theory had to contend with in its initial years. 



The Axiom of Replacement has not yet been needed in our development of set theory, but 
that is about to change. 



Theorem 4.2.12 Every well-ordered set is isomorphic to a unique ordinal. 



Set Theory 



54 



Proof: Let {W, <) be an arbitrary well-ordered set. Define a binary relation F by 



{w,a) G F 



w 



£ W and a G ON and W{w) is order-isomorphic to a. 



Then by Theorem 4.2.5| -F is a class function. By Theorem |4.1.9 F is injective. Moreover, it 
is easy to see that F is strictly order-preserving. 

Next, we show that if w G W, then is an ordinal. (We are not claiming that 

W{'w) C dom(F).): Firstly, since W{w) is a set, so is F[Vl^(u))], by the Axiom of Replacement. 
Now 7^ ON, because ON is not a set (by Exercise 4.2.11). If 7 is the least ordinal 

which is not an element of F[W{w)], then certainly F[VF(t(;)] = {j3 e ON : /3 < 7} = 7. 

In the same way it can be shown that is an ordinal. 

Now we demonstrate that dom(F) = W: For else W — dom(F) has a least element w. 
Then ^ := F[VF(tt;)] is an ordinal, and F : W{w) — )• ^ is a order-preserving bijection, and 
hence an order-isomorphism. Hence {w, ^) S -F — contradiction. 

Thus if we define a := F[W], then F is an order-isomorphism from W onto a. Hence 
every well-ordered set is order-isomorphic to some ordinal. That ordinal must be unique. 



because no two distinct ordinals are order-isomorphic (by Theorem 4.1.9 and Proposition 



Remarks 4.2.13 Recall that the Axiom of Infinity asserted: There is an inductive set. The 
set oj of natural numbers is then the smallest inductive set. A set X was defined to be finite 
if it can be placed into bijective correspondence with some n £ ui, i.e. if \X\ = n for some 
n £ uj. A set was said to be infinite if it is not finite. We showed that a set X is infinite if 
and only if for every n G uj there is an injection n ^ X. 

We now want to show that, in the theory ZFC 0-5 + Replacement, the existence of an 
inductive set is equivalent to the existence of an infinite set. First suppose that there exists 
an inductive set X. It follows that u is a set (namely the intersection of the non-empty class 
of all inductive sets). li n G to then n C w, so that there is an obvious injection n ^ to. Hence 
OJ is infinite. This proves that if there is an inductive set, then there is an infinite set. 

Conversely, suppose that there is an infinite set X. Certainly every natural number is 
an ordinal. But it is, at this stage, conceivable that ON is the class of all natural numbers, 
i.e. that u = ON is a proper class. (Obviously, if ON contains some ordinal a > uj, then 
OJ G ON as well, by transitivity, and hence u would be a set. So if a; is a proper class, then 
OJ = ON.) Let X := {Y C. X '.Y is finite}. By Power Set and Separation, Af is a set. Define 
a binary relation F by 

(Y,n) e F <?=^ n£ujA\Y\=n 

Then F is clearly a class function (i.e. if {Y,ni) G F and {Y,n2) G F, then m = 712). By 
the Axiom of Replacement, F[J\!] is a set. We claim that F[X] = uj. If not, there is a least 
m G OJ such that m F[Af], because oj = ON is a well-ordered class. Now m 7^ 0, since 
£ X. Thus m = n + 1 for some n £ oj. Hence n G -F[Af], i.e. there is y C X such that 
\Y\ = n. But since X is infinite, we cannot have y = X, and thus there is x £ X — Y . Then 
|yu{x}| = n + l, so 71 + 1 G F[X] — contradiction. Thus F[X] = is a set, i.e. an inductive 
set exists. 



□ 



55 



Ordinals 



Remarks 4.2.14 Observe that since a; is an ordinal, we may inductively define new ordinals 

CO + n (for n G w) by 

oj + := iu CO + {n + I) = S{uj + n) 

The class uU {u + n : n € uj} is clearly transitive an well-ordered by €, and thus ought to be 
an ordinal: call it a; + The problem is that the Axioms ZFC 0-7 are not strong enough to 
prove that {a; + n : n G a;} is a set. It is, at this stage, conceivable that u + oj = ON is the 
class of all ordinals. 

Consider, however, the formula (p{n, y) which asserts that n € a; and that y = u + n. 
Clearly ip(n,y) defines a function F, in the sense that if (p{n,y\) and ip{n,y2), then yi = 7/2- 
By the Axiom of Replacement, F[uj] = {y : 3n E co {(p(n, y)} is a set, i.e. {cj + n : n G w} is a 
set. Then a; + a; := a; U {a; + n : n G is a set as well. Clearly the ordinal a; + a; is a limit 
ordinal. 

□ 

4.3 Transfinite Induction and Recursion 

Induction is a major tool for proving results about natural numbers. Recursion is a major 
tool for constructing functions whose domain is the set of natural numbers. The aim of this 
section is to extend these tools to the class ON. 

The Induction Principle for the natural numbers reads as follows: If X C w is such that 
(i) G X, and (ii) n e X ^ n + 1 e X, then X = uj. 

Each non-zero natural number is a successor ordinal. In general, however, there are also limit 
ordinals that need to be taken care of. Hence: 

Theorem 4.3.1 (Transfinite Induction) Let X. be a class of ordinals. Suppose that 

(i) G X; 

(ii) If a then it follows that a + 1 G X; 

(Hi) If a is a limit ordinal and ^ G X for all (3 < a, then it follows that a G X. 
Then X = ON. 

Proof: If X / ON, there is a least ordinal a such that a X. By (i), a / 0. By (ii), a is 
not a successor ordinal. By (iii), a is not a limit ordinal — contradiction, as a limit ordinal 
is, by definition, an ordinal which is not a successor ordinal. 

H 

Exercise 4.3.2 Let X be a class of ordinals. Suppose that Va G ON (a C X — >■ a G X). 
Show that X = ON. 



□ 



Set Theory 



56 



Next, we discuss transfinite recursion. Recall that our second version of the Recursion 
Theorem for natural numbers read as follows: If / : X^'^ — )■ X is a function (where X"^^ is 
the set of all finite sequences in X, i.e. the set of all function from a natural number to the 
set X), then there is a unique function g : oj ^ X such that 

g{n) = f{g\n) = f{[g{Q),...,g{n-l)}) 

where |xo, . . • , ic^-i] is the function from the set n := {0, 1, . . . , n — 1} to the set X given by 
k ^ Xk. In particular, ^(0) = /(0), ^(l) = /([<?(0)1), g{2) = f{\g{Q),g{l)}), etc. Each g{n) 
is a function of the previously defined values ^(O), . . . , g(n — 1). 

To extend this result to the transfinite realm, we introduce similar notation and terminol- 
ogy: If a is an ordinal, then an a-sequence (or transfinite sequence of length q) is a function 
with domain the set a = : ^ < a}. We write: 

\x^ : ^ < aj is the function with domain the set a = : ^ < a} given by ^ i-^ 

Given a sequence s := [x^ : ^ < a] and a set x, we obtain the sequence s U {{a,x)} with 
domain a + 1, and denote it by {x^ : < aj^x. 

We also write : G ON] for a class function on ON. 

Suppose now that -F is a class function, defined on the class V^*^'^ of all transfinite 
sequence^ i.e. F assigns to each transfinite sequence of sets [x^ : < a] a set -F(|x^ : ^ < a]). 
We can then create a "sequence" |x^ : ^ € ON] recursively as follows: 

Xa := F{lx^ : C < a]) 

so that Xa is defined as a function of previously defined values x^, (^ < a). Thus: 

Xo:=F{0), xi:=F([xo]), X2 := F([xo, xi]), 

This shows how to define x„ recursively for all n < a; — and we knew how to do that already. 
But now we continue: By the Axiom of Replacement, {xn : n < cj} is a set, and hence the 
function |x„ : n < ujj exists (i.e. is a set). We thus have 

Xui ■■= F{lxn ■■ n < u)}), x^+1 := F([x„ : n < a;]^X(^), x^+2 := F(|x„ : n < wJ^x^^Xo^+i), 

In general, if q G ON is such that xg has already been defined for all ^ < a, then by the 
Axiom of Replacement, the function [x^ : ^ < a] exists (i.e. is a set), and we may define 
Xa :=F(|x^ :^<a]). 

In this way we build up a class function G : ON — )• V : a i— )• x^ 

Theorem 4.3.3 (Transfinite Recursion) Let F be a class function (on VJ. Then there is a 
unique function G on ON such that 

G{a) = F{G \ a) for all a G ON (*) 

Proof: Define a first-order formula ip{a, h) which asserts that a G ON, and /i is a function 
h := |x^ : < a} with domain a such that 

For all /3 < a, /i(/3) = F{h \ (3), i.e. x,3 = F{lx^ : ? < /3]) 
^We include the finite sequences in the class of transfinite sequences. 



57 



Ordinals 



We show that tp defines a class function: If both 'ip(a, hi) and ip{a, hold, then hi = h2, 
i.e. the function h is unique (assuming it exists). Indeed, suppose that (3q is the least ordinal 
< a such that hiiPo) / /i2(/3o). Then /ii(/3o) = : ^ < /3ol) = F(I/i2(0 : ? < /^o) = 

h2{Po) — contradiction. 

In particular, it follows by the Axiom of Replacement that if hp) holds for all /3 < a, 
then {hp : /3 < a} is a set. 

Now let (p{a,x) = 3h [-(/'(a,/!) Ax = F{h)]. If both (p{a,x) and {p{a,y), hold, then 
X = F(h) = y, where h is the uniqe function such that ip{a,h). Hence ip defines a class 
function 

G{a) = X <;=^ ip{a,x) 

It remains to show that dom(G) = ON, i.e. that G{a) is defined for every a G ON. If not, 
let ao be the least ordinal such that oq dom(G). Then for each f3 < ao, there is a unique 
function hp : fi ^ W such that hp). We now consider two cases: 

Case 1: Suppose that ao = /? + 1 is a successor ordinal. In that case there is hp = \x^ : ^ < j3\ 
such that tp{f3,hp) holds. If we define xp = F{hp), and then define h := : ^ < /SJ^xp, 
then clearly V'(q^O) h) holds also, so that oq € dom(G) after all. 

Case 2: Suppose that ao is a limit ordinal. By the Axiom of Replacement, {hp : f3 < ao} is a 
set. It is easy to show that if ?? < /? < ao, then h^j = hp fa. It follows that h := IJ/3<ao 
a function with domain ao. Furthermore, if we define 

Xp := h{(3) for /3 < ao 

then h = [xg : < ao], and hp = [x^ ■ S, < P} ^ot all /3 < ao- Clearly, therefore, '4j{aQ,h) 
holds, and hence ao S dom(G) in this case also. 

We have now shown that there exists a class function G : ON — )■ V with the required 
property {k). Suppose that G' is another. If G 7^ G', let a be the smallest ordinal such that 
G(a) ^ G'{a). Then G f a = G' f a, and hence G(a) = F(G f a) = F{G' \ a) = G'{a) — 
contradiction. Hence G = G', i.e. there is a uniqe function with the property (★). 

H 



4.4 Ordinal Arithmetic 
4.4.1 Limits 

Definition 4.4.1 (a) A transfinite sequence [7^ : < a] of ordinals is said to be strictly 
increasing if 

T] < ^ implies 7,, < 7^ 

(b) If A > is a limit ordinal, then the limit of a strictly increasing sequence [7^ : ^ < A] of 
ordinals is defined by 

hm 75 := sup{7^ : ^ < A} = M 75 

c — 



(This limit exists, by Proposition 4.2.7 



(c) A strictly increasing sequence [7^ : ^ < a] (or [7^ : C £ ON]) of ordinals is said to be 
normal if it is "continuous" , i.e. if for for every limit ordinal A in its domain we have 

7a = lim 75 i.e. hm7^ = 71;^^ 



Set Theory 



58 



□ 

Similarly, we say that a class function F : ON — )• ON is normal if it is strictly increasing 
and continuous. Of course, any sequence is a function. 

Proposition 4.4.2 Suppose that [7^ : ^ G ON] is a normal sequence of ordinals. 

(a) 7g > C for all^e ON. 

(b ) jx is a limit ordinal if X is a limit ordinal. 

(c) The sequence has arbitrarily large fixed points.- For every a G ON there is fi > a such 
that = j3. 

□ 



Exercise 4.4.3 Prove Proposition 4.4.2 



[Hint for (c): Define /3o := 7Q>/3n+i := Iji^^P ■= 1™ l^n] 



□ 



4.4.2 Addition 

Definition 4.4.4 (Ordinal Addition) We define a class function + : ON x ON — ON by 

transfinite recursion as follows. Let a £ ON. 

(i) a + := a. 

(a) a + (/3 + 1) := (a + /3) + 1 for successor ordinals (3 + 1. 



(Hi) a + B := limfa + £) for limit ordinals /3 > 0. 



□ 



Observe that, for each a G ON, we have defined above a function Sa ■ ON — )• ON : /3 1— )• a+/3. 
It is easy to see that each Sa{-) is a normal function (i.e. strictly increasing and continuous). 
It is also easy to see that ordinal addition coincides with the usual addition on the set of 
natural numbers. 

Furthermore, ordinal addition is associative: 

Lemma 4.4.5 For all a,f3,j £ ON, we have 

a + (/3 + 7) = (a + /3) + 7 

Proof: By induction on 7: For all ordinals a, (3 we have 

a+(/3 + 0) = a + /3 = (a + /3) + 

Next, if 7 is such that q + (/3 + 7) = (a + /3) + 7 holds for all a, /3, then 

a + (/3+(7 + l)) = a + ((/3 + 7) + l) = (a + (/3 + 7)) + 1 = ((a + /3)+7) + l = (a + /3) + (7 + l) 

and hence the result holds for successor ordinals 7 + 1. 

Finally if 7 is limit, and a + {fi + rf) = (a + /?) + ?? for all r] < ^, then 

a + (/3 + 7) = a + lim(/3 + r?) = lim(a + {(3 + 77)) = lim((a + f3) + i]) = {a + (3) + j 
and hence the result holds for limit ordinals. 



59 



Ordinals 



H 

However, ordinal addition fails to be commutative! For example: 

1 + a; = lim (1 + n) = a; u + 1 := S(uj) > u 

n<u! 

so 1 + UJ ^ UJ + 1. 

Consider now uj + 1. This is the least ordinal which is > Hence it is order-isomorphic 
to the set of natural numbers, followed by a new maximum element. 

Similarly, a; + 2 = (a; + l) + l has another additional element added to the top: It is order- 
isomorphic to io followed by two new elements. 

Next, consider oj + oj = lim{uj + n). This is the least ordinal which is > cj + n for every n < uj. 

n<ui 

Clearly, therefore, u + oj \s order-isomorphic to w, followed by another copy of u. 
We make this precise. 



Suppose that (X, -<x) and {Y, -<y) are disjoint totally ordered sets. We can form their 
linear sum (Xuy, -<) by gluing X, Y together in such a way that every element of X precedes 
every element of Y . This means that the linear sum of X, Y is order-isomorphic to an ordering 
consisting of X followed by Y. 

Thus we define ^ on X U y as follows: u^v ^ X {JY then 



u ~< V 



u,v E X and u -<x v 
or u, v G y and u -<y v 
or u e X and v eY 



Observe that ~< = U -(y U {X xY). 

If {X,^x), {Y,^y) are totally ordered sets, but X CiY ^ 0, then we can make them 
disjoint by a standard trick. Let X := {0} x X, and define -<x by {0,x) -<x (0,a;') if and 
only if X -<x x' . Clearly (X, -<x) and (X, -<^) are order-isomorphic. 

Similarly, let Y := {1} x Y and define -<y by (l^y) ~<y O-^v') if '^^^Y V u' ■ Then 
X, Y are disjoint and order isomorphic to X, Y. 

Observe that the linear sum of X,y is a sort of lexicographic ordering: For i,j G {0, 1} and 
u, u G X U y we have: 



(z, u) < {j, v) i < j V {i = j /\u <v) 

where you have to keep track to which set each < belongs. 



Proposition 4.4.6 If a, 15 ^ ON, then a + 15 is order-isomorphic to the linear sum of a and 
15 (in that order). 

Proof: By induction on (5. 

H 



Exercise 4.4.7 Let a,/3,7 G ON. 

(a) Show that if /3 < 7, then a + /3 < a + 7. 

Also find an example where /3 < 7, but /? + a = 7 + a. 



Set Theory 



60 



(b) Show that if a < /?, then there is a unique ordinal 7 so that a + 7 = /?. 

Also find an example where a < j3, but the equation x + a = j3 has many solutions x. 
[Hint: The set : a < ^ < /3} is well-ordered, so is order-isomorphic to some ordinal 7.] 

□ 

4.4.3 Multiplication 

Definition 4.4.8 (Ordinal Multiphcation) We define a class function ■ : ON x ON — > ON 
by trans finite recursion as follows. Let a G ON. 

(i) a • := 0. 

(a) a ■ {P + 1) := a ■ P + a for successor ordinals (3 + 1. 
(Hi) a - fi := lim(a • ^) for limit ordinals P > 0. 

□ 

Observe that, for each a G ON, we have defined above a function P„ : ON — t- ON : /? a-/3. 
It is easy to see that each -Pa(') is a normal function (i.e. strictly increasing and continuous). 
It is also easy to see that • coincides with the usual multiplication on the set of natural 
numbers. 

Furthermore, ordinal multiplication is associative: 
Lemma 4.4.9 For all a,/3,7 G ON, we have 

a • (/3 ■ 7) = (a ■ /3) • 7 

□ 

Exercise 4.4.10 Prove the preceding Lemma. 

□ 

Observe that 

2 ■ u = lim 2 ■ n = Lo uj-2 = u}-{1 + 1)=uj-1 + uj = u + co 

n<ijj 

Thus ordinal multiplication also fails to be commutative: 2 ■ u ^ cj ■ 2. 

Indeed 2-oj looks like w-many copies of the total ordering 2: Each element of oj is replaced 

by a copy of 2. 

On the other hand, co ■ 2 looks like 2 copies of co: Each element of 2 is replaced by a copy of 

UJ. 

We make this precise: 

Suppose that {X, -<x) and {Y, -<y) are totally ordered sets. The linear product of these 
sets is the set X xY equipped with the reverse lexicographical ordering -<, i.e. 



{x,y) ~< {x',y') 



y -<y y 

or y = y' and x -<x x' 



61 



Ordinals 



Thus, if we keep y €Y fixed, the subset Xy := {{x,y) : x G X} C X x y is a copy of X. We 
see furthermore that if y -<y y' , then each element of the copy Xy precedes each element of 
the copy Xy' in the reverse lexicographic ordering. Thus, roughly, {X xY,^) consists of "F 
copies of X" . 

Proposition 4.4.11 If a,P E ON, then a - 13 is order-isomorphic to the linear product of a 
and P (in that order). 

Proof: By induction on jS. 

H 



Exercise 4.4.12 Let a,^,7 G ON. 

(a) Show that a • (/? + 7) = a • /3 + a • 7. 

Also find an example where (a + /3) • 7 7^ a • 7 + /3 • 7. 

(b) Show that if /3 < 7 and a > 0, then a • /3 < ct ■ 7. 

Also find an example where /3 < 7 and a > 0, but /3 • a = 7 • a. 

□ 



4.4.4 Exponentiation 

Definition 4.4.13 (Ordinal Exponentiation) We define a class function ON x ON — >■ ON 
by transfinite recursion as follows. Let a G ON. 

(i) ftO := 1. 

(ii) a^'^^ := ■ a for successor ordinals /3 + 1. 
(Hi) = lim for limit ordinals (3 > 0. 



□ 



Observe that, for each a G ON, we have defined above a function : ON — )• ON : 
P a^. It is easy to see that each Ea{-) is a normal function (i.e. strictly increasing and 
continuous). It is also easy to see that this function coincides with the usual exponentiation 
on the set of natural numbers. 



Exercise 4.4.14 Show that if a,/3, 7 G ON, then 

□ 



Set Theory 



62 



4.5 Project: Goodstein's Theorem 

Goodstein's Theorem is a a result about the natural numbers involving just the basic arith- 
metic operations. To state it, we introduce some defintions and terminology. 

Wc describe here the procedure for writing a natural number in "superbase" form. Given 
a natural number n > 2, every m G w has a unique representation in base n, i.e. there are 
unique /cq, . . . , fc^ G {0, . . . , n — 1} such that 

d 

m = ki ■ 

Each exponent i in the n*-term of the above representation may be an arbitrary natural 
number. In particular, it is possible that i > n. In that case, we can write i itself in base n, 
i.e. i = X]j=i h ' - some of the exponents j are > n, one repeats the procedure. Clearly, 
after finitely many steps, the number m will have an expression in which no numbers > n 
appear. This is the representation of m in superbase n. 

For example, if we we want to write the number m = 1 026 in superbase form, with base 
n = 2. Then 

1 026 = 2^° + 2^ = 22'+2^ +2'= 22^'+^+2^ + 2^ 
No numbers > 2 appear in this representation of 1 026 in superbase 2. 

Problem 1. Write 59 106 in superbase 3. 

□ 



Now consider the Goodstein algorithm for generating a sequence of natural numbers (m„) 
starting from an arbitrary natural number m: 



Step 1. Let mi := m. 

Step 2. Write mi in superbase 2. 

Replace all occurrences of the symbol '2' by the symbol '3', to get a new number. 
Subtract 1 from this new number, to get the number m2- 

Step 3. Write m2 in superbase 3. 

Replace all occurrences of the symbol '3' by the symbol '4', to get a new number. 
Subtract 1 from this new number, to get the number m^. 

Step 4. Write m^, in superbase 4. 

Replace all occurrences of the symbol '4' by the symbol '5', to get a new number. 
Subtract 1 from this new number, to get the number m^. 

Step 5. : 



We illustrate this for m = mi =5. 
Step 1. mi = 2^ + 1, so m2 = 33 + 1 - 1 = 27. 



63 



Ordinals 



Step 2. m2 = 3^, so ma = 4^ - 1 = 255. 

Step 3. mg = 3 • 43 + 3 • 42 + 3 • 4 + 3. Hence 7714 = 3- 5=^ + 3- 52 + 3- 5 + 3 = 467. 

Continuing, we get 775, 1197, 1751, 2454, 3 325, 4382, 5 643, 7126, 8849, 10830, 13087, 
15 637, 18499, 21691, 25 231, ... 

Theorem (Goodstein's Theorem) 

The sequence {mn)n will eventually reach the number 0, for any starting value m = mi. 

□ 

The above theorem was proved in 1944. The remarkable fact about Goodstein's Theo- 
rem is this: Even though it is a statement involving only the basic arithmetic of the natural 
numbers, it requires transfinite ordinals in order to prove it. In 1981, Baker and Paris 
proved that Goodstein's theorem implies the consistency of first-order Peano Arithmetic. By 
Godel's Second Incompleteness Theorem, PA is not strong enough to prove its own consis- 
tency. Hence Goodstein's Theorem is not provable in PA. It follows that transfinite methods 
are necessary even for proving results in number theory! Goodstein's theorem is unprovable 
without recourse to the transfinite. 

We now formalize the Goodstein algorithm. 

Definition: Given a base n>2, define the function Sn oj ^ oj recursively by 

d d 

5„(0) := Sn[Y.^i- nf) :=Y,h-{n + 1)^"« 

1=0 i=0 

Then, for each n > 1, define functions Qn ■ uj lo recursively by 

gi{m) := m gn+i{m) := Sn+i{gn{m)) - 1 

□ 



Problem 2: Show by induction that the function Sn effects the replacing of n by n + 1, so 
that 

k ■ {n+ ly if i < n 
fc • (n + 1)"+^ if i = ra 



Snik ■ n') = 
etc. 



Conclude that the sequence {gn{fn))n is precisely the Goodstein sequence starting with 

m. 

□ 



Problem 3. 

(a) Calculate the first few terms mi,m2, ■ ■ ■ ,mioo, ... of the Goodstein sequence {gn{^))n 
starting with mi = 5. (Play around, and push it as far as the recursive depth permitted 
by your programming language will allow. Observe that the terms don't seem to be 
converging to 0.. . . ) 



Set Theory 



64 



(b) We now try to understand how and why the sequence m„ will eventually reach 0. 

(i) Suppose that, for some m, it is the case that gni'm) 
in base n + 1. For which k is gk{m) = 0? 

(ii) Suppose that, for some m, it is the case that gni'm) 
which k is gki^n) = 0? 

(iii) Suppose that, for some m, it is the case that gnim) 
which k is gk{m) = 0? 

(c) Calculate how long it will take for (m„)n to reach zero when m = 4. 

Hint: There's doubling at work: m2 = 2-3^ + 2-3+2. Then = 2-6^ + l-6 + 5 has linear 
coefficient reduced by 1. Then mu = 2 ■ 12^ + 0-12 + 11 has linear coefficient reduced 
once more. Then 717.23 = 1 • 24^ + 23-24 + 23 has the quadratic coefficient reduced by 1. 
Then 77747 will have the linear coefficient reduced from 23 to 22. Doubling 22 more times, 
the quadratic coefficient of nin will disappear when n = 1 + 3 + 6 + 12 + 24 - -- + 3- 2^^. 
In that case m„+i = (77 + 1) - (77 + l)-*^ + (77 + 1) - (77 + 1)°. Double (77 + 1) more times to 
remove the linear coeficient altogether. . . ] 

□ 

To prove that eventually reaches zero, we use transfinite ordinal arithmetic. For 77 > 2, 
define a function \I'„ : w — )■ ON as follows: 

d d 
*„(0):=0 *„(^A;i-77^) :=^A:i-a;*"« 

i=0 i=0 

For example, *2(10) = ^2(2^ + 2^) = u;*2(3) +^^2(1) ^ ^^{2^+2") ^^^^{2°) ^ ^u,*2(i)+<^*2(o) ^ 
Problem 4: 

(a) Show that for all 2 < 77, 777 < co, 

*„(777) = *„+l(5n(?77)) 

(b) Show that each is a strictly increasing function, i.e. that ^ni'm + 1) > *n('T7) for all 
2 < 77, 777 < a;. 

(c) Show that if 2 < 77, 777 < a; and gni'm) > 0, then 

*n+2(5n+l("i)) < *n+l(5nM) 

(d) Now prove Goodstein's Theorem: V777 377 igni'm) = 0). 



= 77 = - (77 + 1)^ + 77 - (77 + 1)° 
= 2- (77 + 1)^ + 0- (77 + 1)°. For 

= 77 - (77 + 1)"^ + 77 - (77 + l)'^. For 



□ 



Chapter 5 

Cardinals 



5.1 Cardinality: Basic Definitions 

Ordinals are for counting (i.e. enumerating) sets. Such an enumeration automatically induces 
a well-ordering on a set X: Here is the first element xq, here is the next xi,. . . , here is the 
the next Xi^,. . . . We do not yet know whether or not every set X can be enumerated in such 
a fashion. If X is well-orderable (i.e. if there exists a well-ordering relation on X), then X 
can be thus enumerated, because every well-ordering is order-isomorphic to a unique ordinal. 
However, even when we do know that a set X can be well-ordered, the well-order induced 
by the enumeration of its elements is not generally unique (unless X is finite). For example, 
the set Lo of natural numbers can be enumerated in the usual fashion: 

0,1,2,3,... (1) 

It can also be enumerated by first listing all the even numbers, and then all the odd numbers: 

0,2,4,. ..,1,3,5,... (2) 

We can enumerate it also by first listing all the positive numbers which are not powers of 
some prime, followed by all the prime powers, followed by zero, as follows: 

1, 6, 10, 12, 15, . . . , 2, 4, 8, 16, ... , 3, 9, 27, 81, . . . , 5, 25, 125, . . . , 7, 49, (3) 

The order-type of the above enumerations are: (1) w, (2) u + uj, and (3) lo-uj + 1. 

Obviously, the size of the set of natural numbers does not depend on how we enumerate it. 
To measure the size of a set, we therefore need a different notion, namely that of cardinality. 
The idea is that two sets A, B have the same size if there is a correspondence between the 
elements of A and B, such that to each element a ^ A there corresponds a unique element 
b & B, and vice versa. 

Definition 5.1.1 (a) We say that two sets A,B have the same cardinality (or the same 
cardinal number) and write 

\A\ = \B\ there exists a bijection A^ B 

In that case we say that A, B are equivalent, or equipollent, or have the same power. 



65 



Set Theory 



66 



(b) We say that the cardinahty of a set A is smaller than the cardinality of B, and write 

\A\ < \B\ there exists an injection A B 

We also write \A < \B\ to mean \A\ < \B\ but \A\ / \B\. 

□ 

Recall that in the chapter in natural numbers, we wrote |X| = n if there is a bij action 
from X to the set n. In particular |n| = n for all n € c<j. We also showed that \n\ = \m\ if and 
only if n = m. Thus we may define the finite cardinal numbers to be the natural numbers. 
For infinite sets we do not yet have a candidate for the object \X\, i.e. the symbol \X\ is 
(at this stage) meaningless when taken in isolation. We have given meaning to statements of 
the form \X\ = \Y\ and \X\ < \Y\, where \X\ does not occur in isolation. Thus a statement 
such as \X\ = \X\, though quite easy to prove, does not follow trivially from the logical rules 
that govern the symbol =. What we are asserting when we say that \X\ = \X\ is not that 
the object \X\ is equal to itself — we do not yet have an object \X\ — but that there is a 
bijection from X to X. Equality doesn't come into it at all. 

Cardinal numbers |X| as definite objects can be defined. However, such a definition 
requires either the Axiom of Choice (in which case l-'^l is the least ordinal which is able to 
enumerate X), or else the Axiom of Foundation (in which case l-'^^l is the family of all sets 
of minimal rank which are equivalent to X — the definition of rank requires the Axiom of 
Foundation) . In this chapter we avoid the use of both these axioms. 

The next exercise shows that the relations defined above behave as their notation suggests. 

Exercise 5.1.2 (a) Show that equivalence of sets is an equivalence relation on the class of 
all sets: 

(i) 1^1 = 1^1; 

(ii) If ^1 = \B\, then \B\ = \A\; 

(iii) If \A\ = \B\, and \B\ = \C\, then \A\ = \C\. 

(b) Show that |A| < 

(c) Show that if \A\ = \A'\ and \B\ = \B'\, then \A\ < \B\ if and only if \A'\ < \B'\. 

(d) Show that if |^| < \B\ and \B\ < |C|, then |^| < |C|. 

□ 

We say that a set X is countable if and only if |X| < \lo\, i.e. if and only if there is an 
injection X ^ oj. Clearly, each natural number is countable, as is u itself. A set which is not 
countable is said to be uncountable. 

Exercise 5.1.3 (a) Show (without using (AC)) that a set X is countable if and only if there 
is a surjection oj -» X. 

Thus X is countable if and only if it can be enumerated by natural numbers, possibly 
with repetitions 

X := {xn : n G a;} 



67 



Cardinals 



(b) Show that a subset of a countable set is countable. 

(c) Show that X is countable if and only if either X is finite (i.e. \X\ = n for some n G w) 
or \X\ = \oj\. Thus we cannot have 

n < \X\ < \uj\ for all n G w 

g 

[Hint: Let a; -» X, where X is infinite. Define h : oj ^ X recursively by 

h{ri) = g{m.m.{k E u : g{k) ran(/i f n)}) 
and show that /i is a bijection.] 

(d) Show that the union of two countable sets is countable. Conclude that the union of a 
finite family of countable sets is countable]^ 

(e) Show that the cartesian product of two countable sets is countable. Conclude that the 
cartesian product of a finite family of countable sets is countable. 

[Hint: First show that u x uj ^ u : (z, j) i— )• 2* • (2j + 1) — 1 is a bijection. (You may 
assume that the usual laws of arithmetic hold.)] 

□ 

Intuition for the sizes of sets may also suggest to you that < is antisymmetric, i.e. that 
if 1^1 < \B\ and \B\ < \A\, then |j4| = \B\. However, this is not at all obvious: One has to 
show that if there is an injection from A to B and an injection from B to A, then there is a 
bijection from A to B. Nevertheless, we will proceed to show that it is the case. The proof 
of this fact is the first (in these notes) that deserves the appellation "beautiful" . 

Theorem 5.1.4 (Schroder-Bernsteirj^ Theorem) 

If A,B are sets such that \ A\ < \B\ and \B\ < \ A\, then \ A\ = \B\. 

i 

Proof: We first reduce the problem to the situation where B (1 A. Suppose that A ^ B 

j 

and B ^ A are injections. Define B := j[B], and Ai := (j o i)[^]. Then 

AiOBCA, \Ai\ = \A\, \B\ = \B\ 

If we can show that \A\ = \B\, then also |^| = \B\. 

We have thus reduced the problem to the following situation: Given that Ai <^ B <^ A 

f 

and that \ Ai\ = \ A\, we have to show that |^| = \B\, i.e. that there exists a bijection A B. 

g 

Now since |^| = \Ai\, there is a bijection A Ai, so that Ai = g[A]. Now define two 
sequences of sets |A„ : N < ujj and : n < ojj hy induction as follows: 

Ao := A An+i := g[An] Bq := B Bn+i := g[Bn] 

^You may know that the union of a countable family of countable sets is countable, but to prove that, you 
need (AC). 

^Also known as the Cantor-Bernstein Theorem. 



Set Theory 



68 



Since 5 is a bijection one can think oi Aq, Ai, A2, ■ ■ ■ as copies of the same set A. Similarly 
Bo, Bi, B2, ■ ■ ■ are copies of B. A peek at Figure 5.1 may clarify matters. 
Moreover, 

Ao^Bo^Ai^Bi^---^An^Bn^ An+l 5 Bn+1 5 . . . 

When we form set differences of successive sets, we obtain disjoint sets 

Aq — Bq, Bq — Ai, Ai-Bi, B1 — A2, ... An — Bn, Bn — An+l, 

Thus we can write A, B as disjoint unions of these set differences. 

A= \J {An-Bn)U U {Bn-An+l) B= [j - 5„) U |J (S^-^+i) 



0<n<aj 



0<n<aj 



l<n<aj 



0<n<tj 



Now observe that g (suitably restricted) is a bijection from each set difference to the next 
one of the same type. To be precise. 



g[An — Bn] = An+1 — Bn+1 



g[Bn — An+l] = Bn+1 — An+2 



f 



We can now construct the required bijection A B. The basic idea is to use g to move each 
point X G An — Bn to the corresponding point f{x) G An+i — Bn+i- Points x G Bn — An+i 
are left unmoved, however. Thus we define 



g{x) ifxe IJ {An-B, 

0<n<ui 

X if X G U {Bn- An 



+1; 



0<n<t>; 



Now observe that / is a bijection from Uo<n<aj(^"~^") onto Ui<n<aj(^"~-^")- clearly 
is also a bijection from Uo<n<a;(^n ~ ^n+i) onto itself. It follows easily that / : A B is a 
bijection. 



Remarks 5.1.5 With the Schroder-Bernstein Theorem in place, we see that < defines a 
partial ordering on the equivalence classes of =. Another intuitively plausible fact is the Law 
of Trichotomy, which asserts that < induces a total ordering: If A, B are sets, then 

Either |^| < \B\ or |^| = \B\ or \A\ > \B\ 

Cantor assumed that the Law of Trichotomy held, but was unable to supply a proof. In fact, 
the Law of Trichotomy is equivalent to (AC) as we shall see in a later chapter. 

Recall that a set X is Dedekind infinite if and only if there is a bijection from X onto a 
proper subset of X. We have seen in an exercise that every Dedekind infinite set is infinite. 
Furthermore, we saw that a set X is Dedekind infinite if and only if there is an injection 
uj X , i.e. if and only if \uj\ < \X\. To prove that an infinite set is Dedekind infinite 
required (AC) however. Thus without (AC) it is possible that \X\ > n for all n £ uj, yet 
\X\ ^ In that case X,u} are incomparable elements in the ordering <. 




Cardinals 



Figure 5.1: Illustration of the Schroder-Bernstein Theorem. The sets An are circles, and 
the sets Bn are rectangles. Furthermore the sets An — Bn are shaded white, whereas the 
sets Bn — Aji^i arc shaded grey. The function / moves each point in a white region to the 
corresponding point in the next white region. It leaves points in the grey regions unmoved. 



Set Theory 



70 



□ 

Right now, we do not yet know that there are uncountable sets. The next theorem, owing to 
Cantor, shows that there are: 

Theorem 5.1.6 For every set X, we have 

\X\ < \V{X)\ 

Proof: Clearly X — > V{X) : x i-> {x} defines an injection form X into 'P{X). Hence 

\X\ < \V{X)\. 

Let f : X 'Pi^) be an arbitrary function. We show that / cannot be a surjection: 
Indeed, define an element Y eV(X) by 

Y --{xeX f{x)} 

If y G ran(/), then Y = f{y) for some y E X. Then 

y G y ^ ye fiy) ^ y^Y 

— contradiction. Hence Y ran(/), i.e. / is not surjcctivc. 

Since / was arbitrary, it follows that there is no surjection — and thus no bijection — 
from X onto P{X). Hence \X\ / 

H 

As a consequence, V{oj) is an uncountable set. 

5.2 Cardinal Arithmetic in ZF 

We stated in the previous section that we do not (yet) have candidates for the cardinality \X\ 
of an infinite set X. Wc will now proceed as if we can assign a cardinal number k = |X| to 
a set X, and define arithmetic operations on these cardinals. Each statement about cardinal 
arithmetic can be translated into a statement about operations on sets: The notion of cardinal 
number is not necessary, but it is convenient. We use the letters k, A, /x, u to denote cardinals. 
Recall that is the set of all functions from B to A. 

Definition 5.2.1 Suppose that k = \A\, A = \B\. 

(a) K + X := 1^4 U where we require that An B = 0. 

(b) K-X:= \AxB\. 

(c) := \A^\. 

□ 

Remarks 5.2.2 (a) Note in the definition of cardinal addition that if A, B arc not disjoint, 
one can always replace them with equivalent sets which are disjoint. Simply choose x ^ y, 
and define A' := A x {x}, B' := B x {y}. Then \A\ = \A'\, \B\ = \B'\, and A' n B' = 0. 



71 



Cardinals 



(b) Clearly = 1 and , I'' = 1. Furthermore = if k > 0. Observe, however, that there 
is a unique function — )• 0, so that O'^ = 1. 

Exercise 5.2.3 Check that the above definitions make sense, i.e. are independent of the 
choice of A, B. Thus check that 

(a) \i Ar\B = 0, Ar\B = 0,\A\ = \A\ and \B\ = \B\, then \AU B\ = \A\J B\. 

(b) If \A\ = \A\ and \B\ = \B\, then \Ay.B\ = \Ay. B\. 

(c) If \A\ = \A\ and \B\ = \Bl then \A^\ = \A^\. 

□ 

The next proposition shows that cardinal arithmetic is better behaved than ordinal arithmetic. 
Proposition 5.2.4 (a) k + X = X + k. 

(b) {k + x) + = K + {x + 

(c) K ■ X = X ■ K. 

(d) {k ■ X) • fi = K ■ {X • fi). 

(e) K ■ {X + fl) = K ■ X + K ■ fl. 

(f) K-^+M = . ^/._ 

(g) [k ■ XY = ■ Xf". 

(h) If n < X, then K + /_i<A + /i, n ■ fj, < X ■ and k,^ < X^. 

(i) // < A < fi, then < k^. 

□ 



Exercise 5.2.5 Prove Proposition 5.2.4 



[E.g. to prove k^^^" = ■ k^, let \A\ = k, \B\ = A, \C\ = n with BnC = 0, and show that 
there is a bijection from A^'^'-^ onto A^ x A'-' .] 



□ 



Exercise 5.2.6 Recall from Exercise 5.1.3 that if X is an infinite countable set, then \X\ = 
We thus denote the cardinality of an infinite countable set by the symbol lo. Show that 

UJ + U = UJ-LO = U}. 

□ 

The following theorem relates the cardinalities of power sets to cardinal arithmetic. 
Theorem 5.2.7 // \X\ = k, then \V{X)\ = 2'^ . Hence k < 2'^ for all cardinals k. 



Set Theory 



72 



Proof: We have to show that there is a bijection from V{X) onto 2^ , where 2 := {0, 1}. 

Observe that every subset Y C X induces a unique characteristic function (or indicator 
function) xy ■ X 2 which, for each x £ X, indicates whether or not x gY. Thus: 

f 1 ifxeY 
Xy{x) ■= \ .r . for X G X 

10 if X ^Y 

Conversely, every function / : X — )• 2 indicates a unique subset Xj := {x £ X : f{x) = 1}. 
There is therefore a correspondence between subsets of X and functions X — t- 2. To be 
precise, define 

^> : V{X) ^2^ -.Y ^ xy{-) ^ -.2^ ^ V{X) : f ^ Xf 

Then it is easy to see that <I> o vj; = [d^x and ^I/ o $ = idp(x)- Hence ^' = and $ is a 
bijection. 

It follows by Theorem 5.1.6 that k < 2'^ for all cardinals k. 



5.3 Alephs 

We are now ready to define the notion of cardinal number for certain sets: The alephs are the 
cardinal numbers of well-orderable sets. 

Recall that every well-ordering is order-isomorphic to a unique ordinal. Thus if X is well- 
orderable, then there is a bijection a X from some ordinal a to the set X. Conversely, if 
there is a bijection / : a X : x^, then / induces a well-ordering on X: Define x^ < x^ 
if and only if ^ < 77. 

Thus a set is well-orderable if it and only if it can be placed into a bijective correspondence 
with some ordinal. Equivalently, a set X is well-orderable if and only if it can be enumerated 
by an ordinal, i.e. if and only if there is an q G ON such that X = {x^ : ^ < a}. 

The order-type of an enumeration need not be unique: At the beginning of this chapter, 
we showed how to enumerate the set of natural numbers by various ordinals: w, w + a; and 
w X a; + 1. Since the size of the set of natural numbers is unaffected by how we order it, we 
have 

|a;| = |a; + a;| = |w-a; + l| 

Many other enumerations of the set of natural numbers are possible, but it is clear that the 
natural numbers cannot be enumerated by an ordinal which is < a;: oj is the smallest ordinal 
which can enumerate the set of natural numbers. This leads to the following definition: 

Definition 5.3.1 (a) An ordinal a is said to be a cardinal number (or just cardinal) if and 
only if for all ordinals /3: 

/3 < a =^ < \a\ 

(b) If a set X is a well-orderable set, then \X\ is the smallest ordinal which enumerates it, 
i.e. 

IXI = smallest ordinal a for which \X\ = \a\ 



□ 



73 



Cardinals 



Thus a is a cardinal if and only if there is no bijection f : /3 ^ a for any f3 < a, i.e. if and 
only a cannot be enumerated by any /3 < a. Moreover, if X is well — orderable, then \X\ is a 
cardinal. 

An ordinal a which is a cardinal is also called an initial ordinal: It is the first ordinal to 
attain the size |a|. 

Clearly 0, 1, 2, . . . , w are cardinal numbers. Furthermore, uj is the smallest infinite cardinal. 



Remarks 5.3.2 (a) The set of cardinals is therefore a subclass of the set of ordinals. As 
such, it inherits the well-ordering on ON. Observe that if a, /3 are cardinals, then 



a< (3 in ON 



|a| < |/3| 



By Definition 5.3.1 we see that a < (3 implies |a| < 

and \a\ < 



Conversely, if a, /3 are cardinals 



then it cannot be the case that /3 < a, and hence a < p. 

There is therefore no chance of confusion if we use the same symbol < for the order 
relation on both the class of ordinals and the class of cardinals. 



(b) We have defined arithmetic operations on both the class of ordinals and the class of 
cardinals. Here, however, care must be taken. The ordinal product uj ■ co is not at all the 
same as the cardinal product u ■ to. Indeed, for the ordinal product we have uj ■ uj > oj, 
yet |a; • wl = \uj\. Thus the ordinal product w • a; is an ordinal which is not a cardinal. 
The cardinal product is simply uj ■ uj = u. 

It is moreover not hard to see that for ordinal exponentiation, uj^ := Unew'^" count- 
able, i.e. that for ordinal exponentiation uj^ is a countable set. However, for cardinal 
exponentiation, we clearly have |a;|l^l > 21'^' > |a;|, i.e. for cardinal exponentiation, uj^ is 
an uncountable set. The two interpretations of uj^ do not even have the same cardinality! 

Thus when doing arithmetic, you have to keep track of whether you are dealing with 
ordinals or cardinals. Only for the natural numbers do ordinal and cardinal arithmetic 
coincide. 



□ 



Here is an easy result: 
Proposition 5.3.3 Every infinite cardinal is a limit ordinal. 
Proof: If a > w is infinite, then the map / : a + 1 — )• a defined by 



/(7) := < 

is a bijection, i.e |a + 1| = \a\. Thus a + 1 is not a cardinal. 



if 7 = a 
7 + 1 if 7 < w 
7 if u; < 7 < a 



Set Theory 



74 



As we stated earher, clearly 0, 1, 2, . . . , a; are cardinal numbers. Are there any others? We 
know that \X\ < \V{X)\. However, we generally do not know if V{X) is a well-orderable set, 
even if X is well-orderable. Thus though \X\ may be a well-defined cardinal number (in the 



sense of Definition 5.3.1), the object |'P(X)| may not have any meaning in isolation. 

Nevertheless, as we will now show, there are as many cardinals as there are ordinals. Since 
ON is a proper class, so is the class of cardinals. 

Let X be any set (not necessarily well-orderable). Then V{X) is a set. Now if Y <Z X, 
then a well-ordering of y is a subset of 1" x y. Hence for each Y G V^X), the class 

:= {-< G V{Y X Y) well-orders Y} 

is a set, by Separation. It follows that W := IJyeVix) ^ s^^' possible 

well-orderings of subsets of X . We can define a class function F which assigns to each 
well-ordering -< G W the unique ordinal to which it is order-isomorphic. By the Axiom of 
Replacement, -F[W] is a set. Thus |J -F[W] = supF[W] is an ordinal, which may or may 
not belong to -F[W]. In any case, the successor ordinal of sup-F[W] cannot be a member of 
F[W]. Thus {q G on : a F[W]} is a non-empty class of ordinals, and therefore has a 
least member — call it h{X). 

Now observe that, by definition, a G F[yV] if and only if X has a subset which can be 
enumerated by a, i.e. 

a G -F[yV] <;=^ there is an injection from a to X 

Hence h{X) is the least ordinal a for which there exists no injection from a into X. 

The reasoning above shows that the ordinal h{X) always exists, for any set X — well- 
orderable or not. Thus: 



Definition 5.3.4 If X is a set, define the Hartog's number h{X) of X as follows: 
h{X) := least ordinal a such that there is no injection a X 

□ 

Proposition 5.3.5 For any set X , h{X) is a cardinal number. Moreover, h{X) ^ \X\. 

Proof: If a < h(X), then there is an injection a ^ X. If it was the case that |a| = |/i(X)|, 
there would exist an injection h{X) ^ X — contradiction. Hence \a\ < \h{X)\ for all 
a < h{X). 

Similarly, if |/i(^)| < X, then there would exist an injection h{X) ^ X — contradiction. 
Hence \h{X)\ ^ |A:|. 

H 

Observe that we are not saying that h{X) > \X\. That would be the case if the Law of 
Trichotomy holds, i.e. if (AC) holds. However, without (AC), it is possible that \X\, h{X) 
are incomparable in size. 



Proposition 5.3.6 For every a G ON there is a cardinal which is greater than a. Indeed, 
h{a) is the least cardinal > a. 



75 



Cardinals 



Proof: Both a, h{a) are ordinals. Thus either h{a) < a, or else a < h{a). Since ^ 
we must have a < h{a). Thus h{a) is a cardinal which is > a. 

If /3 is a cardinal and /3 < h{a), then there is an injection f3 ^ a, and hence |/3| < \a\. 
Because /3 is a cardinal, it then follows that (3 < a. Hence h{a) is the least cardinal > a. 

H 

If K is a cardinal, then is the least cardinal > k, i.e. the successor of k in the class 
of cardinals. We denote it by 

:= 

We already remarked that the order relation on the class of cardinals is a restriction of the 
order relation on the class of ordinals. Here is another result linking the two — The supremum 
of a set C of cardinals does not depend on whether you regard C as a set of ordinals or a set 
of cardinals: 

Proposition 5.3.7 If C is a set of cardinals, then sup C :=\JC is a cardinal. 

Proof: We know that a := (J C is an ordinal, and that it is the least ordinal which is > k 
for every k £ C. Now suppose that f3 < a. By definition of a, there is k G C such that 
/3 < K < a. Because k is a cardinal, it follows that |/3| < |k| < \a\. Hence |/3| < \a\ whenever 
/3 < a. 

We can now define an enumeration of the class of all infinite cardinals, the aZep/i^ 

^^o,^^l,^^2, • • . . . . • • • - • • 

by transfinite recursion: 

b^o := ^0 is the set of natural numbers 

'Ra+i '■= ^a+i ■= succcssor ordinals a + 1 

"i^x = '■= supji^Q, : a < A} if A is a limit ordinal 

We use "i^a and cj^ interchangeably. The ^l;-notation is typically used when we want to 
emphasize the size of a set, whereas the w-notation when we are concerned also with the 
order-type. 

Observe that ui is the least uncountable ordinal, i.e. that a < wi if and only if \a\ < |a;|. 
Then U2 is the least ordinal whose cardinality is > |a;i|, etc. 

Exercise 5.3.8 Show that the map F : ON — )■ ON : a i— t- Wq, is normal, i.e. strcitly 
increasing and continuous. Deduce that uja > a for all a G ON. Also conclude that there 
are arbitrarily large ordinals for which = a. 

□ 

Next, we observe that {^^Q, : a £ ON} enumerates all the infinite cardinals: 

''Aleph N is the first letter of the Hebrew alphabet. The next two, Beth 3, and Gimel 1 will also make an 
appearance in cardinal arithmetic. 



Set Theory 



76 



Proposition 5.3.9 // k is an infinite cardinal, then there is a unique a £ ON such that 



Exercise 5.3.10 Prove Proposition 5.3.9 



[Hint: Sliow tiiat if k is a cardinal, then k < i^K+i- Now show by induction that the following 
holds for all all a 

If K is a cardinal such that k < ^^Q,, then k = H« for some ordinal (3 < a. 



□ 

The alephs were introduced by Cantor. We emphasize once again that the alephs are 
the sizes (cardinalities) of well-orderable infinite sets. An infinite set X is well-orderable if 
and only if \X\ = Hq, for some a. For Cantor, the restriction to well-orderable sets posed 
no problem: Already in 1883 he had proposed as a Law of Thought that every set is well- 
orderable. However, this proposal met with much opposition, and thus Cantor tried to find 
a proof. The first proof was, in fact, found by Zermelo, but his proof used a new principle 
for set formation — the Axiom of Choice. We shall have much more to say about this in the 
next chapter. 



5.4 The Canonical Well-Ordering of ON x ON 



Recall that, given a well-ordered set(X, <) , we defined in Lemma 4.1.2 the canonical well- 
ordering {X X X,\z). We can define the canonical well-ordering of the class ON x ON in the 
same manner: Given a,/3,a',P' G ON, we put 



{a, 13) < {a',f3') 



max{a, /3} < max{a', /3'} 
or max{a, (3} = max{a', /3'} and a < a' 
or max{a,/3} = max{a',/3'}, a = a , and (3 < [3' 



It is straightforward to verify that this does indeed define a well-ordering of ON x ON, i.e. 
a total ordering with the property that every non-empty subclass of ON x ON has a least 
element. 

Observe that if 7 G ON, then the restriction < f (7 x 7) of the canonical well-ordering on 
ON X ON to the set 7 x 7 is precisely the canonical well-ordering of 7 x 7. 
Here are some observations that will be useful in the sequel: 

• Observe that the initial segment {{(,,r]) : ??) < (a,/3)} is a set, as it is a subset of 
(max{a,/3} + 1) x max{a,/3} + 1). 

• Further observe that 

(^, T]) < (0, a) <;=^ ^ < a A rj < a i.e. {(^, rj) : (^, r]) < (0, a)} = a x a 



Thus each a x a is an initial segment of the canonical well-ordering of ON x ON. 



77 



Cardinals 



• Since every well-ordered set is order-isomorphic to a unique ordinal, we may define a 
map r : ON x ON — )• ON as follows: T{a, /?) is the order-type of the initial segment 
induced by {a, (3) in the canonical well-ordering of ON x ON, i.e. 



r(a,/3) := order-type of {{C,v) ■ i^^v) < 



• r is an order-preserving injection from ON x ON into ON: No well-ordered set is 
order-isomorphic to an initial segment of itself. 



• If /3 < a, then r(0,/3) < r(0,a): This is because r(0, a) is the order-type of of the 
initial segment a x a of ON x ON. Since /3 x /3 is an initial segment of ON x ON and 
/3 X /3 C a X a, we see that /3 x /3 is an initial segment of a x a. 



• ran(r) is downwards-closed: If 7 G ran(r) and < 7, then 6 S ran(r). This is because 
some initial segment of ON x ON has order-type 7, and an initial segment of that 
initial segment must have order-type 5. 



a 



r is surjective: The map $ : ON — t- ON 
$(a) > Q for all a £ ON, by Theorem 4.1.6 



r(0, a) is strictly increasing, and hence 
It follows that a < T{0,a) G ran(r). 
Since ran(r) is downwards-closed, it follows that a G ran(r), for all a G ON. 

We reiterate: The map F : ON x ON — t- ON is an order-preserving bijection. We refer 
to r as the canonical bijection from ON x ON to ON. 



Exercise 5.4.1 Consider the map ^' : ON — )■ ON : a 1— )• r(a x a). Show that ^' is a 
normal function. Conclude that r(a x a) > a for all a G ON. Further deduce that there are 
arbitrarily large ordinals for which T{a x a) = a. 



□ 



With the previous exercise in mind, the next proposition should not be too surprising: 
Proposition 5.4.2 r{uja x ^a) = for all a G ON. 

Proof: Recall that clIq, X clIq, = : {£,,tj) < (0,u;q,)} is an initial segment of ON x ON, and 

that T{uja X Wa) = r(0,a;Q,) > uja- Suppose that a is the least ordinal such that T{uja x uja) 
u)(x, SO that r(wQ, X oja) > uJa- Then there exists (^,7?) G Wq, x such that T{(,,t]) = oja- 
Since cj^ is a limit ordinal, we can find an ordinal 5 so that ^, r/ < J < Wq. Then J x 5 is an 
initial segment of ON x ON. Since (^, 77) £ 6 x 5, we have uJa C T{5 x 6). Since F is injective, 
it follows that Ua < |F((5 x 5)\ = \6 x 5\. Now \6\ = Uf^ for some /3 < a, and by the minimality 
assumption F(a;^ x ujfs) = u^p. Since F is injective, we see that ujp = \T{u)p x ujp)\ = \ujp x ujp\. 
It follows that 

Wq, < |(5 X 5| = \ujp X w/jI = ojp 
— contradiction, since cj^ < Wq. Hence there is no ordinal a such that T{uja x Wa) 7^ Wq,. 



Set Theory 



78 



5.5 Arithmetic of Alephs 

Cardinal addition and multiplication are trivial: 
Theorem 5.5.1 

Proof: Since T{u}a x <^a) = and F is injective, it follows that 
Thus 

b^Q, • Kq, = b^Q, for all a G ON 



Now if 7 = max{Q,/3}, then by Proposition 5.2.4 it follows that 

= max{H„, H^} < + < H.^ + = 2 • < H7 • ^^-^ = 

Similarly, 

= max{^^a, ^^/j} < Ha • < ^7 • = ^7 



Remarks 5.5.2 (a) It is important to note that the above theorem holds only for alephs. If 
the Axiom of Choice holds, then every infinite cardinal is an aleph, as we shall soon be 
able to show. However, the statement 

For all infinite 

i.e. the statement that for all infinite sets A we have |^ x j4| = |^|, is equivalent to (AC). 

(b) Finite sums and products of alephs are now easy to calculate. Exponentiation is altogether 
more difficult. We know, for example, that 2*^'^ > Hq. We do not know if 2*^'^ is an aleph, 
i.e. if 2^0 = H„ for some a G ON. With (AC), 2^" will be an aleph. Even with (AC), 
however, we are not able to determine the value of (x for which 2^^' — Hq;. 

The axioms of ZFC place very few constraints on the possible values of a. a might be 
any of 1,2,3,. ... It cannot be u. It can be any of a; + 1, cj + 2, w + S, . . . , or for that matter 
Hi, H2, H3, .... Again, we cannot have a = H(^. 

The assumption that a = 1 — i.e. that 2^° = Hi — is due to Cantor, and is known as 
the Continuum Hypothesis. Here, the word continuum refers to the set M of real numbers. 
The Generalized Continuum Hypothesis is the assertion that 

2^" = H„+i for ah a G ON 

□ 

The next theorem shows what 2^° has to do with the continuum. 
Theorem 5.5.3 

IMI = 2^0 



79 



Cardinals 



□ 

Exercise 5.5.4 We prove the preceding theorem. Assume here that the basic properties of 
the sets Q, M of rational and real numbers are known. (We have not yet established that Q, M 
can be formalized inside ZFC. However, such a formalization must of course capture all these 
known properties.) 

(a) Show that |Q| = Nq. 

(b) Observe that if r G M, then r = sup{g G Q : g < r} — a known property. Conclude that 
|M| < 2^0. 

(c) An element / G 2'^ is a function / : w — )■ {0, 1}, which can be thought of as a sequence of 
O's and I's. With each such function is associated a real number whose decimal expansion 
consists of digits given by this sequence. Show that 2^° < |M|. 

(d) Finally, conclude that |M| = 2^°. 

□ 

5.6 Project: Cardinal Arithmetic and The Axiom of Choice 



Problem 1: 

We prove a theorem of Tarski (1924): 

Theorem: In ZF, (AC) holds if and only if = k for every infinite cardinal k. 

1.1 First, explain why if (AC) holds, then = k for every infinite cardinal. 

1.2 Next, we show that if k is an infinite cardinal and if K is an aleph, then 

K + K = K • K implies k < H or K > k 

So suppose that K + ii = K-ii. Let K be an infinite set such that \K\ = k, and let {A, <) 
be a well-ordered set such that |yl| = N. 

(i) Explain why there exist disjoint sets Ki and Ai such that \Ki\ = k, \Ai\ = K and 
K X A = KiUAi. 

(ii) Explain why exactly one of the following possibilities must be the case: Either 

(1) 3k£ Kya€ A {{k, a) G Ki), or 

(2) yk£ K3a£ A {{k,a) G ^i). 

(iii) Show that in case (1), we have H < k. 

(iv) Suppose now that case (2) holds. For each k £ K, let be the least element of A 
such that {k,ak) G A^. Observe that {{k,ak) ■ k G K} C Ai. Conclude that k < i^. 



Set Theory 



80 



1.3 Now suppose that k'^ = k for every infinite cardinal. Let X be any infinite set, with 
cardinaUty A = \X\, and let K := h{X) be the Hartog's number of X. Observe that K is 
indeed an aleph, as the notation suggests. 

(i) Show that (A + H)^ > A • H. 

(ii) Conclude that A + K = A • 

(iii) Explain why we cannot have H < A. Conclude that A < 

(iv) Deduce that X can be well-ordered. Hence explain why (AC) holds. 

□ 



Problem 2: 

We prove a theorem of Sierpihski (1947): 

Theorem: In ZF, (GCH) implies (AC). 

Recall that k <2'^ for any cardinal k. The Generalized Continuum Hypothesis (GCH) is 
the following assertion: If k is an infinite cardinal, then there are no cardinals strictly between 
K and 2'^. To be precise 

If K, A are infinite cardinals such that « < A < 2**, then either A = k or A = 2**. 

2.1 We first prove, without assuming either (AC) or (GCH), 

Lemma 1: If k is an infinite cardinal, and if h{K) is the Hartog's number of k, then 



Let X be a set such that \X\ = k. Then C is a partial order relation on V{X). Thus 
if C V{X), then {X , C) is a partial ordering. It may happen that {X, C) is a well- 
ordering, or it may not. Define 

W ■.= {X <Z V{X) : {X, C) is a well-ordering} 

(i) Show that W C P(7^(X)). 

(ii) Define an equivalence relation on W by 

X Kiy [X, C) is order-isomorphic to {y, C) 

and let 

E := W/ « 

be the set of equivalence classes of «. Explain why E C V{V{V{X))). Conclude 
that |E| < 22'". 



81 



Cardinals 



(iii) Each E £E, is a collection of all subsets of V{X) that have the the same order-type. 
Thus with each E & K we can associate a unique ordinal oe- Thus E can itself be 
regarded as a well-ordered set, with £' ^ F if and only if as < ap- Conclude that 
|E| is an aleph. 

(iv) Show that |E| ^ \X\. Conclude that |E| > h{K). 

[Hint: Let ij be the order-type of the well-ordered set E. Suppose that |E| < \X\. 
Then there is an injection / : 77 — )■ X : ^ x^. Define Xjs := {x^ : ^ < /?}, and 
X := {Xp : /3 < ry}. Observe that (Af, C) has order-type ry. The initial segment of 
E determined by the equivalence class of X then has order-type rj. . .] 

(v) Finally, conclude that h{tC) < 2^ . 

2.2 Now prove the following easy results in cardinal arithmetic, not assuming (AC) or (GCH): 

(i) Lemma 2: If k > Hq, then 2 ■ k = k. 

Remark: Saying k > is not the same as saying that k is infinite, but that it is 
Dedekind infinite (i.e. has a countable subset). 
[Hint: Show 1 + k = k.] 

(ii) Lemma 3: IfK> Ko, then 2" + k = 2'^. 

(iii) Lemma 4: If X, k are cardinals such that 2 ■ k = k and X + k > 2'^, then X> 2'^. 
[Hint: Choose disjoint sets Ki,K2,L so that \Ki\ = \K2\ = k, \L\ = X. Justify the 
equations 

\LUKi\ = X + K = 2'^ = 2''+'^ = \r{Ki U K2)\ 
Let f : L\J Ki ViKiU K2) he a bijection. If £; C ^2, define 

E* ■.= EU{xeKi :x^f{x)} 

Show that the map V{K2) -)• V{Ki \J K2) : E ^ E* is injective. Further show 
that each E* f[Ki\- Conclude that each E* = f{y) for some y E L. Deduce that 
> \V{K2)\ = 2\] 

2.3 Henceforth, assume (GCH): Let k > b^o be an infinite cardinal, and let /i(k) be the 
Hartog's number of k. Define a sequence {pn)n<uj of cardinals inductively as follows: 

PQ •■= Pn+i = 2''" 

Prove: 

Lemma 5: If k> and h{n) < Pn+i, then either K is an aleph, or h(^K) ^ p^- 

(i) Observe that 2 ■ pn = Pn, by Lemma 2. 

(ii) Show that p„ < /i(k) + pn < 2^^" for all n. 

(iii) Conclude from (GCH) that we have twho possible cases: Either 

(1) /i(k) + Pn = Pn, or else 

(2) h{K)+pn = 2P^. 

(iv) Show that in case (1), we have hi^K) ^ pn- 



Set Theory 



82 



(v) Show that in case (2), we have h{K) > 2''" > k by Lemma 4. Conclude that k is an 
aleph. 

2.4 Finally, let X be any infinite set. Define k = 2^o+l^l. 

(i) Observe that Lemma 1 states that < p3. Also observe that ^ po. Use 
Lemma 5 to conclude that k is an aleph. 

(ii) Deduce that X can be well-ordered. Hence explain why (AC) holds. 



□ 



Chapter 6 

The Axiom Of Choice 



Recall: 

Axiom of Choice: Every family of non-empty sets has a choice function: If is a family of 
non-empty sets, then there is a function f on. X which chooses from each X E X an element 
fix) G X. 

Exercise 6.0.1 Show by induction that every finite family X of non-empty sets has a choice 
function. 

Thus (AC) is needed only for infinite X. 

□ 

6.1 Choice Functions, Partitions and Cartesian Products 

Our first result shows that for (AC) to hold, it suffices that it holds for every every family X 
of mutually disjoint non-empty sets: 

Proposition 6.1.1 Let (AC)* be the following statement: 

(AC)*: If X is a partition (i.e. a family of non-empty pairwise disjoint sets), then 
there is a set C which contains precisely one element from each X E X (i.e. C CiX 
is a singleton for every X E X). 

Then (AC) and (AC)* are equivalent in ZF. 

Proof: Clearly (AC) impUes (AC)*. 

Conversely, suppose that X is a family of non-empty sets, not necessarily pairwise disjoint. 
For each X E X, let X := {X} x X. Then X := {X : X € X} is a family of pairwise disjoint 
sets. By (AC)*, there is a set C which contains precisely one element from each X £ X. Then 
C is a set of ordered pairs, and thus a binary relation. By definition, if {X, x) G C, then x e X. 
Moreover, C is clearly a function: If {X, x) e C and {X, y) E C then {X, x), {X, y) e C Ci X, 
and hence x = y. It follows that C is a choice function on X. 

H 

Proposition 6.1.2 In ZF, the following are equivalent: 



83 



Set Theory 



84 



(a) (AC) 

(h) Every surjection has a right inverse. 

f 

Proof: Suppose that (AC) holds, and that X Y is a surjection. For each y £ Y define 
Xy := f~^[{y}]- Then X := {Xy : y £ Y} is a family of non-empty sets which partitions 
X. By (AC), X has a choice function h : X ^ y]X with the property that h{Xy) £ Xy 
for ah y £ Y. In particular, f{h{Xy)) = y. Define g : Y ^ X : y ^ h{Xy). Then 
{fog){y) = f{h{Xy))=y. 

Conversely, suppose that every surjection has a right inverse. Let X he a family of disjoint 
non-empty sets, and let f :[_} X ^ X he defined as follows: 

If x G Af , then f{x) is the unique X £ X so that x £ X 

Then / is a surjection, and thus has a right inverse g : X ^yjX. Clearly g{X) £ X, i.e. g 



is a choice function for the partition X. By Proposition 6.1.2, (AC) holds 



Recall that the cartesian product Aq x Ai of two sets is defined to be the set of all ordered 
pairs (ao,ai), where oq £ Aq and ai £ Ai. We can then define products with more factors 
inductively: 



ji+l n 

^fc := ^fc X An+i i.e. tIq x • • • x ^„ x An+i := (tIq x X • • • X An) X An+l 
k=0 k=0 

However, this allows us to define products with only finitely many factors. How should we 
define a product nie/ ^« with infinitely many factors, i.e. where the index set / has infinite 
cardinality? More generally, if ^ is a non-empty collection of sets, can we define Y\ Here 
the intention is that if ^ = {Ai : i £ I}, then = Yliei (just as = IJie/ ^*-) 
Clearly we have 13^=0 {(^O) oi) ■ ao £ Aq A ai £ Ai. Thus we ought to have 

YlAi := {{ai : i £ I) -.yi £ I {ai £ Ai)} 

The problem is that we have not yet defined ordered tuples {ai : i £ I) of infinite length. 
However, we can clearly think of each such tuple as a function, i.e. 

{ai : i £ I) is the function f : I ^ \^ Ai defined by f{i) = ai 

Thus the function / chooses, for each i E J, an element f{i) £ Ai. Then 

Ai :={/:/ is a function I A- \^ Ai such that f{i) £ Ai for all i £ 1} 

i&I i£l 

i.e. the cartesian product Hie/ °^ functions / which choose, for each i £ I, 

an element f{i) £ Ai. 

There is an obvious natural bijection between the old definition of x as a set of 
ordered pairs and the new definition as a set of functions {0, 1} ^ AqU Ai: The ordered pair 



85 



Axiom of Choice 



(ao, ai) £ AqX Ai corresponds to the function {0, 1} ^ AqU Ai : i-^ oq, 1 ai- Henceforth 
we shall identify the two objects without further explanation. It will usually be clear from 
context which definition of product is meant. 

Now if ^ is a non-empty non-indexed collection of sets, then we can index A by itself: 

A := {Xa : Ae A} where Xa := A 

Then H = Has^ °^ functions f : A ^ \JA with the property that 

f{A) E Xa, i.e. with the property that f{A) G A. Thus if / G f^^, then / is a function 
which chooses, for each ^4 G ^ an element f{A) E A, i.e.: 

Definition 6.1.3 is the set of all choice functions on A. 

□ 

The following proposition is now obvious: 
Proposition 6.1.4 The following statements are equivalent in ZF: 

(a) (AC) 

(b) A product of a non-empty family of non-empty sets is non-empty: Whenever A is a 
non-empty family of non-empty sets, then Y\A ^ 0. 

□ 



Proposition 6.1.4 is the reason that (AC) was also called the multiplicative axiom in the 
past. 



6.2 Well— Ordering 

Cantor proposed as an obvious Law of Thought the Well-Ordering Principle: Every set can 
be well-ordered. 

The intuition behind this law was simple. Suppose that X is a set. Pick an element 
xq E X. If any elements are left over, pick an element xi from X — {xq}. If any are left over, 
pick an X2 €z X — {xq, xi}. This defines a sequence by transfinite recursion: If x^ have been 
chosen for ^ < a, then pick an element Xq E X — {x^ : ^ < a}. Just keep doing this until you 
run out of elements. If this happens at stage f3, then X = {x^ : ^ < /?}. This enumeration of 
X induces well-ordering of X in an obvious manner. 

The discovery of the paradoxes generated increased skepticism about the foundations of 
set theory, and this led Cantor to seek a proof of his Well-Ordering Principle. The first such 
proof was given by Zermelo in 1904 (and again in 1908). In doing so, Zermelo explicitly 
formulated the Axiom of Choice. 

Proposition 6.2.1 Let X be a set. In ZF the following are equivalent: 

(a) The set X can be well-ordered. 

(b) The family X := V{X) — {0} of non-empty subsets of X has a choice function. 



Set Theory 



86 



Proof: (b) =^ (a): Cantor's intuition can be implemented as follows: Suppose that F is a 
choice function for X := V{X) — {0}. Pick a set c such that c ^ X. We extend to a 
function on V by defining F{x) = c \i x ^ X . Define a function G on ON by transfinite 
recursion follows: 

G{a) = F(X-ran(G \ a)) 

Observe that this means that G(a) & X — {G(^) : ^ < a} as long as X — ran(G \ a) ^ 0. 

Now if G{a) 7^ c for any a G ON, then G is injective, and hence the function G^^ : 
ran(G) — t- ON exists, and is a bijection. Since ran(G) C X, we see that = ON is 

a set, by Replacement — contradiction. Hence there is a G ON such that G{a) = c. If ag 
is the least such a, then X — ran(G \ qq) = 0, and hence X = {G(C) : S, < ckq}. Define a 
relation -< on X by 

G(0 ^ G{r]) if and only if ^ < r] 

Then G : (oq, <) — {X, -<) is an order-isomorphism, so -< well-orders X. 
(a) (b): If {X, -<) is a well-ordering, define F on X := V{X) - {0} by: 

F(Y) := -< -least element of Y 

To be precise F is the set 

F = {{Y,y) e X X X :y eV A eY {y ^ z)} 

which exists by Separation. 

Then F is clearly a choice function on X. 

H 

As an immediate corollary we have: 

Theorem 6.2.2 In ZF the following are equivalent: 
(a) (AC) 

(h) Every set can he well-ordered. 

Proof: (a) =^ (b): Suppose that X is a set. By (AC), V{X) — {0} has a choice function. 
By proposition 6.2. 1[ X can be well-ordered. 



(b) =^ (a): Suppose that X is a. family of non-empty sets. Let X = \] X . Then X C 
V{X) — {0}. Since X can be well-ordered, there is a choice function F for V{X) — {0}. Then 
F \ X is a, choice function for X . 



Recall that the alephs ^^q, were defined to be the cardinals of the well-orderable sets. (AC) 
implies that every set is well-orderable. In particular, for every infinite set X there is a unique 
a G ON such that \X\ = Nq,. Hence (AC) considerably simplifies cardinal arithmetic of all 
sets. We shall investigate cardinal arithmetic in ZFC in a later section. 



87 



Axiom of Choice 



6.3 Zorn's Lemma and Other Maximal Principles 

We wish to show how, by introducing a certain axiom on sets of sets 
instead of the well-ordering theorem, one is enabled to make the proofs 
shorter and more algebraic. 

— Max Zorn, 

A Remark on Method in Transfinite Algebra, 1935 

Recall the following: 
Definition 6.3.1 Let (P, <) be a partially ordered set. 

(i) If X C P, then an element ti G P is an upper bound for X \i\/x ^ X [x < u). 

(ii) An element m G P is called a maximal element of P if G P {m < x). 

(iii) A non-empty subset C C P is called a chain in P if C is totally ordered by <, i.e. for 
all x,y £ C, either x < y or y < x. 

(iv) A chain C in P is said to be a maximal chain if it is not a proper subset of any other 
chain in P. 

□ 

Theorem 6.3.2 In ZF, the following are equivalent: 

(a) (AC) 

(b) Zorn's lemma (1935): If every chain in a non-empty partially ordered set (P, <) has 
an upper bound, then P has a maximal element. 

(c) Hausdorff's Maximal Principle (1909): Every chain in a non-empty partially ordered 
set can be extended to a maximal chain. 

Proof: (a) (b): Let (P, <) be a partially ordered set with the property that every chain 
in P has an upper bound, and let P be a choice function for the family V{P) — {0}. Also 
choose some set q ^ P. For ^ C P, define 'W A := {x £ P : \/a £ A [a < x)} . By transfinite 
induction, define a map G on ON as follows: If a G ON, 



G{a) :-- 



P(tt ran(G \ a) if tt ran(G \ a) ^ 
a else 



To explain in simple English, suppose that := G'(^) has been defined for ^ < a. Then 
Oq, := G{a) is an element chosen (by F) from {x G P : < a (ag < x)}, i.e. is an element 
of P so that aa > for all ^ < a, provided such an element exists. If such an element does 
not exists, then aa ■= q- 

If Oq, 7^ for all a G ON, then : a G ON] is a sequence of distinct elements of P, 
i.e. G : ON — t- P is an injection, which is impossiblej^ Let A be the least ordinal such that 

^By a similar argument as in the proof of Proposition 6.2. l[ Here is another proof: If h{P) is the Hartog's 
number of P, then {aa : a < h{P)l would be an injection from h{P) to P, contradicting the very definition of 
Hartog's number. 



Set Theory 



88 



a\ = q. Then C := {a^ : ^ < A} is a chain in P, and therefore has an upper bound u in P. 
We claim that u is a maximal element of P: For if not, then there is a; G P such that u < x. 
Hence tt {o^ : ^ < A} / 0, so oa G -P, i.e. q e P — contradiction. 

(b) ^ (c): Let C be a chain in a partially ordered set (P, <). Let 

r := {D C P : D is a chain in P, and D D C} 

Then (P,(^) is a partially ordered set, ordered by set inclusion. We show that every chain 
in P has an upper bound in P. So suppose that C := {Ci : i G /} is a chain in P. Define 
U ■.= {JC. Then 

(i) [/ is a chain in (P, <): For if x, y G U, there are i,j G / such that x £ d and y G Cj. 
Since C is a chain in (P, C), we have either Ci C Cj or Cj C Cj. Suppose without loss 
of generality that Cj C Cj. Then both x,y G Cj. Since Cj is a chain in (P, <), we have 
must have either x < y oi y < x. 

(ii) J7 G P: Clearly C CU. 

(iii) U is an upper bound of C in P: If d G C, then Ci C\JC = U. 

It follows from Zorn's Lemma that P has a maximal element Cm Then clearly Cm is a maximal 
chain in (P, <) which extends C. 

(c) =^ (a): We show that every family of non-empty sets has a choice function. Let ^ be a 
family of non-empty sets. Let F be the collection of all partial choice functions, i.e. 

F :={/:/ is a function, dom(/) C X, and f{X) G X for all X G dom(/)} 

Order F by inclusion, so that (F, C) is a partial ordering. By the Hausdorff Maximal Principle, 
there is a maximal chain C in F. Let P := IJC. Then 

(i) P is a function: For if [X,x), {X,y) G P, then there exist /, 5 G C so that {X,x) G / 
and {X,y) G g. Since C is a chain in (F, C), either / C (7 or 3 C /. Suppose without 
loss of generality that f ^ g. Then both {X, x), {X, y) G g. Since 5 is a function, x = y, 
as required. 

(ii) P G F: Indeed, if {X,x) G P, there is / G C so that {X,x) G /. Observe that 
F{X) = X = f{X). Since / is a partial choice function, f{X) G X, and thus F{X) G X. 

(iii) dom(P) = X. For suppose there is X E X — dom(P). As each element of X is non- 
empty, one may choose x e X. Then G := FU{{X, x)} G F also. Moreover, D := CU{G} 
is a chain, since / C G for all / G F. Then D is a chain which properly extends C — 
contradiction, since C is a maximal chain. 

Thus P is a choice function with domain X. 



89 



Axiom of Choice 



Remarks 6.3.3 Various maximum principles were discovered in the years since Zermelo 
proposed the Axiom of Choice, by Hausdorff, Kuratowski, Tukey, Teichmiiller, Zorn and 
others. 

Zom's Lemma is the (AC)-equivalent that has become the industry standard, a tool 
required by all mathematicians working with infinite sets. Although it goes by the moniker 
of "Lemma", Zorn actually proposed it as an axiom. Indeed, Zorn proposed the following 
maximum principle. 

(MP): Let ^ be a family of sets partially ordered by C and closed under the taking 
of unions of C-chains. Then A has a maximum element. 

We saw in Chapter 2 that every partially ordered set {X, <) is order- isomorphic to a family 
of sets ordered by inclusion: Define X := {], x : x E X}, where X x := {y E x : y < x}. Then 
{X, <) and {X, C) arc order-isomorphic. In particular, and chain in X corresponds to one in 
X, any maximal element in X corresponds to one in X, and vice versa. Thus the maximum 
principle (MP) is equivalent to Zorn's Lemma. 

Observe that the proof of Zorn's lemma from (AC) required transfinite recursion. Zorn's 
lemma is a method by which one can avoid transfinite recursion — it is built in, as it were. 
This usually makes the technique more palatable to the working mathematician. 

□ 

Here follows an example of the use of Zorn's lemma: 
Theorem 6.3.4 Every vector space has a basis. 

Let F be a vector space, and let P be the family of all linearly independent subsets of V. We 

claim that (P, C) satisfies the hypotheses of Zorn's Lemma. To see this let C := {Cj : i E 1} 
is a chain in P. We must show that C has an upper bound. Define U = \JC Then: 

(i) [7 G P: For if vi,...,Vn € U, then there arc G / such that € Cj^. Since C 
is a chain, the finite set {Cjj, . . . ,Cj^} has a maximum element, say Cj .. Then each 
vi, . . . ,Vn G Cij . Since Ci- is a linearly independent subset of V, the vectors vi, . . . ,Vn 
are linearly independent. Hence ?7 is a linearly independent subset of V also. 

(ii) U is an upper bound for C in (P, C) Each Q G C has Ci C U. 

By Zorn's Lemma, P has a maximal element B, i.e. a maximal linearly independent 
subset. If w G F is such that v ^ B, then B U {v} is not linearly independent. Hence there 
are 6i, . . . , 6„ G -B and scalars ai, . . . , a„ so that aibi H — • + anbn + v = 0. It is now easy to 
see that each v eV is a linear combination of members of so that 5 is a basis. 

H 

We end with a short list of results that depend on (AC) for their validity. 

• Every vector space has a basis. Moreover, any two bases have the same cardinality. 

• Tychonov's Theorem in topology: The product Hie/ of arbitrary family of com- 
pact spaces is compact (in the product topology). Tychonov's Theorem is equivalent to 
(AC). 



Set Theory 



90 



• Nielsen-Schreier Theorem in group theory: A subgroup of a free group is free. 

• Hahn-Banach Theorem in functional analysis. This leads to various infinite-dimensional 
separating hyperplanc theorems, which assert that any two disjoint closed convex sets 
can be separated by a hyperplane. 

• Every field has a unique algebraic closure (up to isomorphism). 

• Compactness Theorem in logic: If every finite subset of a set S of first-order sentences 
has a model, then E has a model. 

6.4 Cardinal Arithmetic in ZFC 

(AC) simplifies cardinal arithmetic considerably. It also extends its range to infinite sums 
and products. 

Recall that, by definition, \X\ < \Y\ if and only if there is an injection X ^ Y. Every 
injection has a left inverse (without using (AC)) which is a surjection. Using (AC), every 
surjection has a right-inverse, which is an injection. Hence (using (AC)), \X\ < \Y\ if and 
only if there is a surjection Y ^ X . 

Also recall that the alephs arc the cardinals of wcU-orderable sets: If X is well-order able, 
then there is an order isomorphism (and thus a bijection) from some ordinal to X. Then \X\ 
is defined to be the least ordinal for which there is a bijection onto X. Now, since the Axiom 
of Choice implies that every set can be well-ordered, every infinite cardinal |X| is an aleph, 
i.e. for every infinite set X there is an a € ON such that |X| = K^. 

Thus the cardinals form a well-ordered class 

0<1<2<. ..,<i^o< i^i <••-,< i^u;<--- 

Every infinite cardinal appears on the list '■ ol G ON], i.e. the class of ordinals and the 

class of alcphs are "order-isomorphic" as classes. 
We defined the alephs inductively as follows 

Ko := 

Kq+i := for successor ordinals a + 1 
Ka '■= sup{Kq, : a < A} for limit ordinals A 

For obvious reasons, we call a cardinal of the form Kq+i a successor cardinal. A cardinal of 
the form where A is a limit ordinal, is called a limit cardinal. Every cardinal is either a 
limit cardinal or a successor cardinal. 

For example ^uj-,^uji are limit cardinals, whereas ^i., ^^j^i^^^^^j^^.^j^^ are successor car- 
dinals. 

In particular, any two cardinals are comparable (this is the Law of Trichotomy): If X,Y 
are sets, then either |X| < |y| or |y| < \X\. For if \X\ = and \Y\ = K^, then < if 
and only if a < /3. 

Furthermore, as we showed earlier that + = • = ^a-, using the canonical 
well-ordering on ON x ON, we must have « + = = for every infinite cardinal k. It 
follows easily that 

n + \ = K ■ \ = max{K, A} 



91 



Axiom of Choice 



for any two infinite cardinals k,X. 

Here is an oft-used fact that reUes on (AC): 

Theorem 6.4.1 (a) The union of a countable family of countable sets is countable. 

(b) If X is a family of sets, then 

\[jX\< \X\-svip{\X\ -.XeX] 

Proof: (a) Let X := {X„ : n < cj} be a countable collection of countable sets. For each 
n < oj, there is a surjection /„ : a; ^ Xn : m i-> Xn,m which lists the elements of i.e. 

Xn ■■= {fn{m) :m<u} = {xn,m : m < u;} 

Choose such an for each n < u. We then have a surjection 

/ : a; X a; -» [J A" : (n, m) Xn,m 

Since we know |a; x a;| = Kq • Kq = Kq, we see that (J is countable. 

(b) Let K := \X\ and let A := sup{|X| : X G X}. Thus we can enumerate 

X := {Xa : a < k} 
Then, for each a < k, we can choose an enumeration 

Xa ■= {xa,i3 '■ f3 < Xa} whcrc = Aq, := \Xa\ < A 
Fix xq g\JX, and define a map p : k x X ^ \JX hy 

, f Xa,fi if ^ < Aa 

I xq else 

Then p is surjective, so | [J -Y] < k • A. 

H 

Proposition 6.4.2 If2<K<X and X is infinite, then 

= 2^ 

Proof: 2^ < < (2'')^ = 2'^'^ = 2^, since k-X = max{K, A} = A. 

H 

It follows, for example, that = 2^°. 

With (AC), we can define infinite sums and products of cardinals as follows: 

Definition 6.4.3 (a) Let {ki : i G /} be an indexed set of cardinals. Let {Xi : i G 1} be a 
family of pairwise disjoint sets such that \Xi\ = Ki. Define 



Set Theory 



92 



(b) Let {Ki : i G 1} be an indexed set of cardinals. Let {Xi : i G /} be a family of sets such 
that \Xi\ = Ki. Define 

llKi = \llXi\ 

iei iei 

where Hie/ -^i set of all choice functions, i.e. the set of all functions f : I ^ [Jiei 
with the property that f{i) G Xi for all iei. 

(c) If K, A are cardinals, define 

K^'^ := sup{/c^ : // a cardinal, /j, < X} 

□ 

Remarks 6.4.4 Observe that without (AC), it does not follow that J2iei '^i well-defined. 
One could have \Xi\ = \Yi\ for all i G /, without having lUie/^'^il = lUie/^^l (where 

{Xi : i G /} and {1^ : i G /} are families of pairwise disjoint sets). The reason is that to 
construct a bijection which shows that | IJjg/ Xj| = | IJjg/ Yi\, one first has to choose bijections 
fi:Xi>—^ Yi that show \Xi\ = \Yi\. 

Thus (AC) is necessary to make the definition of Yliei '^i independent of the sets Xj. A 
similar remark holds for Hie/ 

□ 

Remarks 6.4.5 1. Observe that if for all i G I, and if \I\ = A, then 

Ki = K ■ X 

iei 

i.e. sums and products behave as they ough to. For if X is a set of cardinality \X\ = k, 
and if we set Xi := X x {i} for all i G /, then IJie/ X x I are the same set. Now 

by definition X^jgj Ki = \ Ujg/ -^^l k ■ X = \X x I\. 

2. Further observe that if for all i G /, and if |/| = A, then 

YlKi = K^ 

iei 

i.e. products and exponentials behave as they ought to: For if X is a set of cardinality 
\X\ = K, and if we set Xi := X for all i & I, then Yiiel -^i °^ functions from 

/ to X, i.e. Hie/^i ^^'^ -^^ ^^^^ same set. Now by definition. Hie/ '^i — \ W.iei-^i\ 
and K^ = \X^\. 

3. Next, observe that 

iei iei 

For if \Xi\ = Ki, then there is an obvious bijection ^ : nie/^'^i'* ~^ (Wiei ■^i)'^ defined 
as follows. If / G Yli^iXf, then f : I ^ Uiel lias fi := f{i) : X ^ Xi for each 
i G /. Thus fi{S,) G Xi for each i G I and ^ < A. Define / : A ^ Hiei^i 

fm) = Mo = fm)- 



93 



Axiom of Choice 



4. Similarly, 

□ 

Proposition 6.4.6 (a) Suppose that A is an infinite cardinal. If Ki > for all i < X, then 

Ki = X - sup Ki 

(b) Suppose that X is an infinite cardinal. : i < XJ is an increasing sequence of non-zero 

cardinals, then 

JJ = (sup Ki)^ 

Proof: (a) Let a := "^i^x '^i' ^ supj^i : i < A}. Clearly ^ < a. Observe that 

a = Ki < n = X ■ jJL < X ■ a (★) 

i<\ i<X 

Further note that, since each > 0, 

X = ^\ <^Ki = a 

i<\ i<\ 

Hence A ■ cr = cr, so by (★) it follows that a = X - jjl. 

(b) Let vr := Hka let := supjKj : i < A}. Observe that since the Ki are non-zero, 

we have tt > /x. Now since Ki < jj, for all i, we have 

To prove the reverse inequality, recall that ^^^x A = A x A = A. Thus we may partition A 
into a family {Aj : j < A} of pairwise disjoint sets with cardinalities \Aj\ = A. Then 

TT = n «i = n ^ ) = n = 

i<\ j<\ ieAj j<\ 

(The fact that Y\i<\ = n7<A(n7eA ^*^y establish.) 



If A is a set and k a cardinal, define 

[A]^ := {X C A : \X\ = A} [A]<^ — {X^A: \X\ < X} 
Proposition 6.4.7 (a) If \A\ = k > X, then 



K^ 



Set Theory 



94 



(b) If \A\ = K> X, and A is an infinite cardinal, then 

[A\<^ 



Proof: (a) If / G A^, then / : A -> ^, so / C A x A, and |/| = A. Since \\ x A\ = X ■ k = k, 
we see that < |[A x = It follows that = \A^\. On the other hand, for 

every X G [A]^, there is at least one function Fx : X ^ A whose range is X. Hence 
[^]^| < \A^\ = K^. 
(b) We have 

[A]<' = u [Ar 

/U a cardinal <A 

SO by (a) 

I [^]^| = ^ = A • sup{«;'' : n a cardinal < A} = A • k^-^ = 

II a cardinal <\ 

H 

Lastly, we state and prove: 
Theorem 6.4.8 (Konig's Theorem) // < Aj for all i G I, then 

iel i&I 

Proof: Suppose, on the contrary that Hie/ — ^iei l"'^*! ~ 

Since | Hie/ -^il ^ Sis/ '^i^ '^^^ ^^'^ * partition {1^ : z G /} of Ilie/ "'^i ^^^h that < Kj. 
Here, the Yi are pairwise disjoint, and IJie/ ^ ~ Tlie/ 

For each i G /, let Pj be the projection of Yi onto the i*^ coordinate, i.e. 

Observe that Pi C Xj. Now |Pj| < \Yi\ = Ki < Aj = \Xi\, so there is Xi £ Xi — Pi for each 
i G I. Define h : I —> [Ji^j Xi : i ^ Xi. Then certainly h G Ilie/"'^*' Since h{i) Pi for any 
i G /, we see that h ^Yi for any i G /. It follows that h G niei" "'^^ ~ Uie/ ^ — contradiction. 

H 

Corollary 6.4.9 For every cardinal k, we have k < 2". 
Proof: 1 < 2. Now 

i<K i<K; 

H 

Observe that we require the inequality Ki < Aj to be strict in Konig's Theorem. For 
example, if Kn = Xn = 2^° for all n < Kq, then 

2^0 = Ko • 2^° = 2^0 

n<Ko 

and 

Y[ 2^0 = (2^0)^0 = 2^0 
n<Ko 



95 



Axiom of Choice 



6.5 Length, Area, Volume and (AC) 

"Length" is a non-negative real number associated with certain subsets of the real line M, i.e. 
it is a function £(•) which assigns to a subset C M its length C{E). Intuitively, the length 
function should have certain properties: 



l£. If C M is bounded, then <£(£;)< oo. 

2c- C{-) is additive: If F are disjoint bounded subsets of M, then C{E U F) = C{E) + 

More generally, £(•) is countably additive: if Ei, E2, ■ ■ ■ En, ... (n G N) is a sequence 
of mutually disjoint bounded subsets M, then >C(U^i ^n) = X^^i J~-{En)- 

3c- is translation invariant: If a set F is obtained by shifting a set E, then C{F) = 
C{E)- 

Ac- If is a interval, then C{E) = length of interval. 

Using an argument due to Vitali in 1905, we show that it is impossible to assign a length 
to every bounded subset of M, i.e. there is no function which satisfies each of the properties 
(l)-(4) of C{-) above, and which is defined for every bounded subset of M. Thus there are 
subsets of M which have no length. This does not mean that these sets have zero length; it 
means that there is no number which can be called their length, and which is consistent with 
(l)-(4). 

Example 6.5.1 Define an equivalence relation ~ on M by 

X ~ y *^=^ y — X G Q 

Let {Ei : i G /} be an enumeration of the equivalence classes of ~. Note that if x G M, then 
there exists ^ G Q such that < x + g < 1. Now since x ~ x + 5, we see that for every x 
there is y G [0, 1] such that x ~ y. Thus [0, 1] H / for every i G /. 

Now piclij^for each i G / one Xj G [0, 1] n Ei, and define a Vitali set H hy H := {xj :G /}. 
Thus for each y G M there is a unique i £ I such that y ^ Xi- 

For q £ Q, define H + q := {xi + q : i G /}. First note that the H + q are mutually disjoint: 
For if y G {H + q) r\ {H + g') for rational numbers g, g', then y = Xi + q = xj + q' for some 
i,j G /, and thus Xj = xj + {q' — q), i.e. Xj ~ Xj- It follows that Xj = Xj, thus that q = q' , 
and thus that H + q = H + q' . 

Next, we claim that for each y G M there is a unique g G Q such that y G H + q := {xi + q : 
i £ I}. Indeed, existence follows from the fact that there is an i G / such that y ^ Xi, so that 
q := y — Xi has the property that y G H + q. Uniqueness follows from the disjointness of the 
H + q- 

Now let {qn : n G N} be an enumeration of Q Pi [—1, 1]. Note that if x G [0, 1], there is a 
unique i £ I such that x — Xj G Q H [—1, 1], so that x G UneN(-^ + Since H C [0, 1], we 
also have UneN(-^ + Qn) ^ [-1,2]. Thus: 

[0,1] c\J{H + qn)^ [-1,2] 

neN 



■^This requires the Axiom of Choice. 



Set Theory 



96 



Now suppose that the Vitah set H has an length i.e. that C{H) exists. Each H + qn is a 
translation of if, and thus C{H+qn) = C{H) for all n G N. Now since [0, 1] C \J^^^{H+qn) C 
[-1,2], it follows that 



The fact that 1 < Y.n=i^^^) implies that C{H) > 0, whereas the fact that E^=i'C(^) < 
3 < oo implies that C{H) — contradiction. Hence C M is a bounded set to which a 
length cannot be assigned. 



Just so, under (AC), some subsets of fail to have an area, and some subsets of fail to 
have a volume. The latter seems particularly counterintuitive: Given any subset of M^, to 
find its volume, just dump it in a full bucket of water, and measure how much water flows 
out. 

It gets even worse, as will be shown in the next section. 

6.6 Project: The Banach— Tarski Theorem 

The paradoxical decomposition of the unit ball U in due to Banach and Tarski (1924) is 
one reason why the Axiom of Choice may be regarded with some suspicion. It says that the 
unit ball U can be broken up into finitely many pieces (into five pieces, in fact), and that 
these pieces can be reassembled to form two new unit balls! 

First, we introduce some terminology and notation so that we can phrase the Banach- 
Tarski Theorem. 

• An isometry on is a distance preserving map, i.e. a map / : ^ such that 
||/(x) — /(y)|| = ||x — y|| for all x, y G M^. Observe that any isometry is clearly a 
bijection. 

Translations and rotations (and compositions thereof) are isometries. 

• Two sets A^B C are congruent if and only if there is an isometry / : — )■ 

so that f[A\ = B. This means that A, B have the same shape — think of congruent 
triangles — and that / moves the figure A to the figure B. 

We write A = B \i A, B are congruent sets. 

• We say that a set A can be reassembled into a set B if there is a partition (^i)i<n of A 
and a partition {Bi)i^n of B (into the same finite number of pieces) such that Ai = Bi 
for all i < n. 

In that case, we write A^ B. 

The idea is that the set A can be broken up into pieces Ai, that each piece Ai can be 
moved by an isometry to Bi, and that the pieces Bi make up the set B. Thus A can be 
broken up into finitely many pieces, and the pieces can be reassembled to form B. 



\n=l / 
Furthermore, the sets H + qn are mutually disjoint, so 





□ 



97 



Axiom of Choice 



Theorem: (Banach-Tarski) 

The unit ball U in can be partitioned into two disjoint sets U = U1UU2 such that 

Ui ^ U and U2 ^ U 

□ 



Problem 1. 

(a) Show that « is an equivalence relation on P(M^). 

(b) Show that if (^j)i<n is a partition of A and {Bj)i^n is a partition of B such that Ai ~ Bi 
for all i < n, then Am B. 

(c) Prove a "Schroder-Bernstein Theorem" for ss: If A^ C B C A and A, then S A. 
[Hint: Let (Aj)i<„ and (^j^)j<„ be partitions of A and A^, and let fi : Ai ^ A\ be 
isometrics so that fi[Ai] = A\. Define / = Ui<n /»' / agrees with /j on ^Ij. Define 

^° = A, = /[A'^], 5° = 5, B"+i = /[B'^] 

Let C := Un<a;(^" - -S")- Observe that f[C] C C, and that /[C] !==j C. Further observe 
that A = C\J{A-C) &ud B = f[C\ U (A - C), and deduce that A S.] 

□ 

Our next aim is to partition the unit ball U into five pieces which can be reassembled to 
form two unit balls. To do that we will partition the unit sphere S into pieces having certain 
properties. {S is the boundary of U .) Let c denote the centre of the unit ball. Each X <^ S 
determines a unique X <^U as follows: 

X := set of all points in ?7 — {c} whose projection along a ray 
from c to the surface S lies in X. 

Clearly, \i X,Y C. S are such that X then also X KiY . 

Consider now two axes do,di in centred at c. For dcfinitcncss, assume that do is the 
Z-axis, and that di lies in the XZ-plsme at an angle 6 to the Z-axis. Consider two rotations: 

ijj = Rotation by 7r/3 about do (p = Rotation by tt about di 

Observe that tp, ip are isometries which map S to S. 

Let / be the identity. Note that if)^ = I and (p^ = /, so that (p~^ = ip,ip'''^ = ip^- 
Consider the group G of all rotations generated by ip, (p. Take, for example, the element 

a := (poipoipoloil^oip. Using the facts that 1]/' = I and p'^ = /, we can reduce this to 
a = o ip. In this way, every element of G has an irreducible representation of the form 

(p o ip^^ o (p o tp^^ o . . . or tp^^ o (p o tp^^ o (p o . . . 

Each such an irreducible representation has a length: 

1{I)=0, = l{',p^^) = 1, l{ip o ^p^'^) = l{ip^^ o ^) = 2, l{p) o ^p^^ o p)) = 2, . . . 



Set Theory 



98 



Now the group G depends on the angle 9 between (Iq and di. We want to show that 
there is so that every element of G has a unique irreducible representation. Equivalently, 
we want to show that we can choose 9 so that no non-trivial irreducible representation yields 
the identity /. 

Problem 2. 

(a) Show that each element of a G G can be represented by an orthogonal matrix (rotation 
matrix) whose entries are bivariate polynomials in cos 9 and sin 9 (i.e. for each entry aij 
of the matrix a there is a polynomial Pij{x, y) so that a^j = pij{cos9, sin 9)). 

[Hint: A rotation by tt about di can be effected by rotating the axis di by 9 onto the 
Z-axis, then rotating by tt, and then rotating the axis by —9 back to its original position.] 

(b) Show that an irreducible representation a G G of length > 1 yields the identity element 
if and only if z := e*^ is a solution to a system of polynomials. 



[Hint: cos^ = 




(c) Deduce that for each a G G there are at most finitely many 9 so that a yields the identity 
element. Conclude that there exists a so that no irreducible representation of length 
> 1 yields the identity element. 

□ 

Remarks: We've tried to avoid group theory in the above. What we were actually doing, 
should you know some group theory, is to show that the free product Z2 * Z3 of the groups 

(if) = Z2 and (ip) = Z3 is embeddable into the group SO(M^). If Go is the group of rotations 
generated by ip, tp when the angle between do, di is 9, then there is an obvious homomorphism 
Tr^) : Z2 * Z3 — )■ Ge- We showed that there exists 9 such that ng is an isomorphism. 

□ 

Henceforth, fix a so that no irreducible representation of length > 1 yields the identity of 
the corresponding group G. Then every element of G has a unique irreducible representation. 
We define a partition G = ^Ui?UGby induction on the length of irreducible representations. 
The elements of length < 1 are classified as follows: 

Now classify the elements of length > 2 according to the following rules: 

• If a starts with ip^'^, then 

(1) a e A =^ ifoae B. 

(2) a e BUG =^ (poae A. 

Observe that if a starts with (p, i.e. if a = (po/3, then ipoa has irreducible representation 
/3, since (po a = (p o o (3 = p. Since the length of /3 is < the length of a, o a = ^ has 
already been classified. 



99 



Axiom of Choice 



• If a starts with then 

tpo a e B and il^"'^ o a e C. 
ijjo a e C and 'tp~^ o a e A. 
ijjo a e A and ^"-^ o a G -B. 

If a starts with tp"^^, then ■0='=^ o a has aheady been classified. 
Problem 3: Show that 

(p[A] = BuC ip[A]=B tp-\A] = C 

[Hint: To show ip[A] C B L) C , let a G A. If q starts with ip, then a = ip o (3 for some 
/3 G i? U C, by rule (2). Hence ipoa = (3GB\JC. Else, if a starts with i^^^, then by rule 
(1), ^poa e B C BUC. Hence (p o a e B U C ioi all a e A. 

Conversely, to show B U C C ip[A], suppose that a £ B L) C. If a starts with (p, then 
a = ip o p for some (3 £ A (and thus ip o /3 E B), so a E p>[A]. Else if a starts with ip^^, then 

o a G A, by (2), so a = o (99 o q) G <^[A]. Hence a G <^[A] for all a G -B U C. 
The other equations can be proved similarly.] 

□ 

We now use the partition G = AUBUC to construct a partition S = X\JY\JZ\jQof 
the unit sphere. This, in turn, will induce a partition of the unit ball B. 

Problem 4: We show that there is a partition S = XUYUZUQof the unit sphere S, 
where 

\Q\=^o and X = Y = Z = YUZ 

(a) Observe that |G| = b^o- Also note that every rotation of S other than the identity has 
exactly two fixed points. Let Q be the set of all x G 5 so that a; is a fixed point of some 
aeG- {I}. Conclude that \Q\ = \Ho\. 

(b) For X G 5 - Q, let 

:= {a{x) : a G G} 

Define a binary relation Ron S — Q by xRy < — > x e Py. Show that R is an equivalence 
relation. 

(c) The Axiom of Choice implies that there is a set M which contains exactly one element 
of each equivalence class of R. Define 

X --A-M ■-{a{x):xeM,aeA}, Y := B ■ M, Z = C-M 

where G = AU B U C is the partition of G constructed earlier. Show that 

ip[X] = YUZ, iPlX] = Y, ^~'^[X] = Z 



{3) aeA =^ 
{A) aeB =^ 
(5) a G C =^ 



(d) Conclude that X = Y = Z = YUZ. 



Set Theory 



100 



□ 

Now recall that if X C S, then X is the set of all points inU — {c} whose projection along 
a ray from the centre c onto the surface S lies in X. The partition S = XUYUZUQofS 
therefore induces a partition 

U = XUYUZUQU{c} 

of the unit ball. Moreover, clearly 

X f-^ Y Z Y U Z 

Problem 5: 

(a) Show that Y U Z ^ X \J {Y U Z). Deduce that 

X^Y^Z^YUZ^XUYUZ 

(b) Show that U ^ X\jQU{c}. 

(c) Show that there is a rotation tt G such that Tr[Q] Ci Q = 0. Conclude that 7t[Q] C 
XUYUZ. Using the fact that Z Ki XUYUZ, deduce that there is T C such that 

[Hint: Choose an axis of rotation whose poles are not members of Q. For each angle 
5 G [0, 27r), let its denote a rotation by S about this axis. For q G Q, define 
Aq := {S G [0, 27r) : Trs{q) G Q}. Then A := [j^^g Ag is countable. Choose S G [0, 27r)-A, 
and let tt := tts-] 

(d) Let p & Z — T. Observe that 

U^XUQU{c}^Yufu{p}CYUZCU 
Conclude that YUZ ^U. 

(e) Finally note that U = UiUU2, where Ui := X[jQU{c} and U2:=YUZ. This concludes 
the proof of the Banach-Tarski Theorem. 

□