Full text of "Set Theory"

See other formats

UMIVERSITEIT-STELLEIMBOSCH-UIMIVERSITY

jou kennisvennoot • your knowledge partner

SET THEORY

P. Ouwehand

Department of Mathematical Sciences
Stellenbosch University

2012

i

ii

ZFC

ZFC 0. Set Existence: There is a set.

3x{x = x)

ZFC 1. Extensionality: If sets x, y have the same elements, then x = y.

Vx Vy [V2;(2; Gx-H-zGy)— >-a; = y]

ZFC 2.tp Separation Schema: If P{x) is a property of sets (with parameter p) describable by a
first-order sentence ip{x,p), then for any set Z and any p there is a set F = {x G Z : P{x)}
which consists precisely of those elements of Z which have the property P.

yzyp^y^x^xEy-^-^xEzA if{x,p))

ZFC 3. Pairing: For any x,y there is a set z = {x,y} whose elements are precisely x and y.

Vx Vy 3z [Vit; {w E z -(-^ w = x'V w = y)]

ZFC 4. Union: For every set x there is a set y = |J x which is the union of all the elements of x.

Vx 3y Vz G y ■<->■ 3w G x (z G w)]

ZFC 5. Power Set: For any set x there is a set y = V{x) whose elements are precisely all the subsets
of X.

Vx 3y G y -H- z C x)

ZFC 6. Infinity: There is an infinite set.

3x [0 G X A Vy (y G X ^ y U {y} G x)]

ZFC 7. Choice: Every family of non-empty sets has a choice function: If x is a family of non-empty
sets, then there is a function f on x which chooses from each it; G x an element f{w) G w.

Vx \iw G X ^ 0) ^ 3/ ( / is a function A dom(/) = x A ^/w G x {f{w) G w))]

ZFC 8.,^ Replacement Schema: A relation R{u, v) between sets which is describable by a first-
order sentence (^(n, v,p) (with parameter p) is said to be functionlike if for every u there is

at most one v such that R{u, v).

If R is functionlike and x is a set, then the image z = R[x] of x under i? is a set.

(yu Vv Vu; [ip{u, v,p) A <^(n, — >■ t; = w]^ — >■ Vx 3y V?; (?; G y ■<->■ 3n G x (^(-u,

ZFC 9. Foundation: Every non-empty set has a G-minimal element.

Vx [x 7^ -)• 3y G X (y n x) = 0]

iii

These notes are for a short course in set theory at the undergraduate level at Stellenbosch
University. No pretense at orignality is claimed. Though amplified by material from a number
of additional sources, the debt to the first few chapters of the book Set Theory, by Thomas
Jech, Springer 2003, should be easily discernible.

iv

Contents

1

The Axioms of Set Theory

1.1 Introduction!

1.2 Naive Set Theory

1.3 The Iterative Conception of Set

1.4 The Axioms of ZF(]I

1.5 Sets and ClassesI

12 Relations and FunctionsI

12.1 Ordered Pairs and Finite Cartesian ProductsI

12.2 Relationsl

12.3 FunctionsI

2.4 Equivalence Relations

12.5 Order Relationsl

13 Natural NumbersI

3.1 The Set uj of Natural NumbersI

3.2 ((J, g) is a Transitive Well-Ordered Set . . . .

3.3 RecursionI

13.4 Finite and Infinite SetsI

14 Ordinalsl

4.1 Well-Orderings

14.2 Ordinalsl

14.3 Transfinite Induction and Recursion!

14.4 Ordinal Arithmetic!

14.4.1 Limits!

4.4.3 Multiplication

4.4.4 Exponentiation

4.5 Project: Goodstein's Theorem

15 Cardinals I

5.1 Cardinality: Basic Definitions

15.2 Cardinal Arithmetic in ZF!

5.3 Alephs

5.4 The Canonical Well-Ordering of ON x ON

5.5 Arithmetic of Alephs

V

vi

CONTENTS

5.6 Project: Cardinal Arithmetic and The Axiom of Choice 79

The Axiom Of Choicel

83

83

6.2 Well-Ordering] 85

6.1 Choice Functions, Partitions and Cartesian Products

6.3

Zorn's Lemma and Other Maximal Principles] 87

90

95

96

6.4 Cardinal Arithmetic in ZFCl

6.5 Length, Area, Volume and (AC)

6.6 Project: The Banach-Tarski Theorem

Chapter 1

The Axioms of Set Theory

1.1 Introduction

Every mathematician needs to know some set theory. "Set theory", according to (the set

theorist) Kenneth Kuncn, "is a theory of everything". AH of mathematics can be done within
the framework of set theory. This means that all mathematical objects can be interpreted as
sets, and all mathematical statements about these objects can be phrased in the language of
sets. Not only can this be done: Typically, it is done.

We assume here that, to some extent, you are already aware of this. For example, a group
is an ordered tuple {G, , e), where

• G is a set.

• ■ : G X G ^ G is a function which plays the role of group multiplication.

• T-Gisa function which plays the role of multiplicative inverse

• e G G is an element which plays the role of multiplicative identity element.

You probably know that a function / : ^4 — t- i? is a set: It a subset of the Cartesian product
A X B, where we regard (a, 6) G / if and only if /(a) = b. The Cartesian product A x B of
two sets is itself a set, namely the set of ordered pairs (a, b) where a & A and b G B. An
ordered pair (a, b) can be interpreted as a set also, namely (a, b) := {{a}, {a, b}}. Ordered
tuples of higher order can be defined iteratively,

(a, 6, c) := ((a, 6), c) = {{(a, 6)}, {(a, 6), c}} = {{{{a}, {a, 6}}}, {{{a}, {a, 6}}, c}}

The group laws can then be written as equality of certain functions: For example, the asso-
ciative law {ab)c = a{bc) is the assertion that the functions given by the compositions

G X G X G ^ G X G ^ G : (a,b,c) ^ {ab, c) {ab)c

and

G X G X G ^ G X G ^ G : {a,b,c) ^ {a,bc) ^ a{bc)

are equal. Since functions are sets, the associative law for G is equivalent to the assertion that
two sets are equal. In principle, everything about groups can be formulated in the language of
sets. It will look horrendous, it is true, but it can be done. Thus the whole group (G, , e)

1

Set Theory

2

is a set — not just the base set G on which the group operations are defined — the whole
object {G, -'^ , e). And each element of (G, , e) is a set,. . . and each element of an element
of (G, -'^ , e) is a set, and. . . ,

In the same way all the objects you encounter in mathematics — natural numbers, com-
plex numbers, topological spaces, differentiable manifolds, tangent bundles. Boolean algebras,
measure spaces, stochastic processes, pseudo-differential operators — can be interpreted as
sets. We do not say that these objects are sets; indeed, we prefer to completely avoid meta-
physical speculation about what mathematical objects might be, and in what sense they might
be said to exist. We simply mean that we can construct sets that behave like these mathemat-
ical objects. This kind of thing is not new to you: You're probably quite used to regarding
real numbers as points on a line. But a point is not the same thing as a real number. "A
point," says Euclid, "is that which has no parts". A real number, on the other hand, is an
element of a complete ordered field. But, thought of in the right way, the points on a line can
be said to behave like the real numbers, and thus can be interpreted as real numbers.

So we will try to show that all of mathematics can be done in the set-theoretic universe.
Many of the set-theoretic ideas required are rather straightforward, and can be developed
without recourse to the axiomatic method (though every mathematician must be aware of
the potential pitfalls associated with a purely "naive " view of sets) .

But apart from its intrinsic importance as a fundamental vehicle for for doing other math-
ematics, set theory is also an independent branch of mathematics in its own right, regarded as
one of the four main branches of mathematical logic. In other words, set theory — like group
theory or number theory — is worth pursuing for its own sake: "Set theory is the mathemat-
ical science of the infinite" , according to the set theorist Thomas Jech. You probably already
know that the natural numbers N and the rational numbers Q both form countable sets — i.e.
they have the same size (cardinality): |N| = |Q|. If you know that, you probably also know
that the set of real numbers M is uncountable, and thus that infinity comes in different sizes,
with |N| < |M|. Some historians date the birth of the autonomous subject Set Theory to that
day (7 December 1873) that Georg Cantor — who is regarded as the creator of the subject
— wrote to Richard Dedekind to inform him of this fact. A short while later, in 1878, Cantor
framed the Continuum Problem: Is there a set A whose size is strictly between that of the
natural numbers and the real numbers, i.e. such that |N| < |^| < |M|? Cantor was unable
to settle this question, though he hypothesized that there is no such set A — the Continuum
Hypothesis. The M^er-mathematician David Hilbert was one of the first to appreciate the
power of set theory, and its importance to general mathematics. Accordingly the Continuum
Problem was first on the famous and influential list of 23 problems that he presented at the
Second International Congress of Mathematicians in Paris, 1900. This innocuous-appearing
question initially resisted all efforts, and was finally settled in a rather surprising way: It was
proved by Kurt Godel in 1938 that the Continuum Hypothesis cannot be disproved, and then
by Paul Cohen in 1963 that the Continuum Hypothesis cannot be proved! In order to obtain
such results, it was necessary to write down a precise list of axioms for set theory — the
Zermelo-Praenkel Axioms — so that there would be agreement on what constitutes a valid
set-theoretic proof. What Godel and Cohen proved was that the Continuum Hypothesis is
independent of these axioms. (There remains open the possibility that mathematicians of the
future will discover a new plausible axiom for set theory, which will then settle the Continuum
Problem in one way or the otheirl) Set theory is riddled with problems like these, where it

^Indeed, some mathematicians believe that the Continuum Hypothesis was recently settled by the set

3

Axioms of Set Theory

can be proved that the problem is unsolvabl^ Such independence proofs require methods
from logic (model theory, proof theory, recursion theory), and thus an axiomatic approach is
here indispensable.

This chapter is concerned with the foundations of set theory, and a little philosophjj^
will be unavoidable. In fact, the title of the first ever monograph dedicated to the sub-
ject — Cantor's Grundlagen einer allgemeinen Mannigfaltigkeitslehre: Ein mathematisch-
philosophischer Versuch in der Lehren des Unendlichen, 1883 — makes clear that, in set
theory, mathematics and philosophy were inextricably intertwined right from the start.

Exercise 1.1.1 All active branches of mathematics have problems that are unsolved. It is
usually tacitly agreed that such an unproved statement is either true or false. But set theory
has problems that are provably unsolvable. This leads to a philosophical question: Do you
believe that the Continuum Hypothesis must be either true or false — we just do not know
which? Or do you believe that it simply doesn't make sense to ask whether the Continuum
Hypothesis is true or false? Can you back up your belief with an argument?

□

1.2 Naive Set Theory

Classifying objects into sets is a basic operation of the human mind. It is more basic even
than counting: Before you can count some objects, you first have to decide which objects to
count, i.e. you first have to mentally group them together, and separate them from objects
you do not want to count. Nothing seems more natural, therefore, than to collect objects
which share a common property into a set. Naive Set Theory (NST) is a formal theory which
allows us to do just that. It has the following axioms:

NST 1. Extensionality: If two sets X,Y have the same elements, then X = Y.
NST 2.p Full Separation: If P(x) is a property of sets, then there is a set

Y = {x: P{x)}

which consists precisely of those x which possess the property P.

Observe that the Full Separation Axiom is not a single axiom, but an axiom schema: We
have an axiom for every possible property P(-) (though right now we won't be too precise
about exactly what constitutes a "property"). Thus there are infinitely many axioms.

The first axiom states that a set is determined by its members, and by nothing else. In a
sense, it tells us what it means to be a set. The second axiom allows us to show that many
sets actually exist. For example:

theorist Hugh Woodin, and that it is false. Others dispute this.

^Assuming that the axioms of set theory are consistent, that is. But Godel proved that we cannot prove
that the axioms of a set theory are consistent without assuming the consistency of some more powerful theory.
In other words, it is impossible to prove from the axioms of set theory that set theory is consistent.

''l highly recommend browsing the online Stanford Encyclopedia of Philosophy, http: //plato . Stanford,
ledu for good introductions to the philosphy of mathematics, and much more besides.

Set Theory

4

• If P{x) = {x x), then {x : P{x)} is a set with no elements (since x = x for any x). By
the Axiom of Extensionahty any two empty sets are equal, because they have exactly
the same elements. Hence a unique empty set exists, and we denote it by 0.

• If P(x) = (x = x), then {x : P{x)} is the set of all sets: Any set x is equal to itself.

• If P{x) = (x = a V X = 6), then {x : P{x)} is the set {a, b}.

• If P{x) = {\/z £ X {z £ a)), then {x : P{x)} is the power set of a, i.e. the set of all
subsets of a.

• If P{x) = (x is a real number > 0), then {x : P{x)} is the set of all real numbers greater
or equal to zero. (At this point, we do not yet know that there are any real numbers,
so it is at this stage possible that this set is empty.)

Towards the end of the 19th Century, mathematicians became increasingly more concerned
with the foundations of mathematical knowledge, and the status of mathematical "reality" .
Some logicians — amongst whom were Gottlob Frege and Bertrand Russell — tried to reduce
mathematics to logic. In June 1902, Russell wrote a letter to Frege, warmly expressing his
appreciation for the latter's work, but adding that "There is just one point where I have
encountered a difficulty" . Russell asked Frege to consider the property

P{x) = {x ^ x)

If y = {x : X ^ x} exists (which, according to the Full Separation Axiom, it must), then there
are two possibilities: Either y £ y or y ^ y.

(i) If y G y, then P{y) must hold (since y consists of exactly those sets which have property
P). In that case y ^ y — contradiction.

(ii) li y ^ y, then certainly P{y) holds. So y is one of the sets that has property P, and
therefore y £ {x : P{x)}, i.e. y £ y — contradiction.

Thus the theory NST is inconsistent! Russell's paradox had devastating consequences for

Frege's programme to reduce arithmetic to logic.

Russell's paradoi^ (also called Russell's antinomy) is just the simplest example of the

problems that may arise from a too liberal notion of set. Cantor himself was aware of such

problems by 1899 at the latest, and dealt with this by denying the status of sethood to certain

collections (multiplicities).

For a multiplicity can be such that the assumption that all of its elements
'are together' leads to a contradiction, so that it is impossible to conceive
of the multiplicity as 'one finished thing'. Such multiplicities, I call
absolutely infinite or inconsistent multiplicities. As we can readily see,
the 'totality of everything thinkable ', for example, is such a multiplicity;
later still other examples will turn up.

If on the other hand, the totality of the elements of a multiplicity can be
though of without contradiction as 'being together', so that they can be
gathered together into 'one thing', I call it a consistent multiplicity or
set.

— Cantor, letter to Dedekind, 1899.

^Apparently, Ernst Zermelo was aware of Russell's paradox some years before. But it was Russell who
understood its devastating consequences, and who brought it to the attention of the wider world.

5

Axioms of Set Theory

Thus, according to Cantor, the 'totahty of everything thinkable' is inconsistent: There is no
set of all set^

If we can't just lump anything thinkable together into a set, then what can we do? A
number of different approaches have been suggested. Some mathematicians and philosophers
thought that Russell's paradox indicated that there were problems not just with an intuitive
notion of set, but with logic itself. (It is easy to rephrase Russell's paradox in purely logical
term^. The intuitionists, for example, thought that the basic laws of logic do not apply
to infinite collections. Amongst others, they denied the validity of the Law of the Excluded
Middl^ Other mathematicians thought that classical logic was inherently sound, and that
one need merely analyse the notion of set with greater care and precision, if one wants to
avoid contradictions and inconsistencies. This is the approach we will take. But avoiding an
inconsistent set theory is harder than one might think. Firstly, a number of set theories have
been proposed with the aim of avoiding Russell's paradox, but which subsequently turned out
to be inconsistent after all — notably the system ML (Mathematical Logic) proposed in 1940
by the great logician Willard van Orman Quine. And many of these set theories are artificial:
They are, as Russell commented, not "such as even the cleverest logician would have thought

In the next section, we will look at an intuitively compelling view of the set-theoretic
universe, in which sets are built up in stages, with each set made up from simpler sets.
This picture of the set-theoretic universe then suggests certain axioms for set formation, the
Zermelo-Fraenkel Axioms, or ZF. If ZF set theory is consistent, then we may never know it:
Godel's 2""^ Incompleteness Theorem states (roughly) that a theory that is strong enough to
contain basic arithmetic cannot prove its own consistency. If ZF is inconsistent, then one day
we may find a contradiction. However, ZF has been used for nearly a century already, and
none has been found as yelrl

^How did Cantor come to this conclusion? Define the power set V{X) to be the set of all subsets of X.
Cantor proved that \X\ < \'P{X)\. Now if X is the set of all sets, then Vi^X) C X, since every member
of 'P{X) is a set. But then [^(X)! < \X\, and hence the assumption that all sets can be collected into a
published in 1932.

^ Here's how: "It seems to make perfect sense to inquire of a property whether it applies to itself or not.
The property of being red, for instance, does not apply to itself, since red is surely not red, whereas (the
property of being) abstract, being itself abstract, applies to itself. Calling the property of not applying to
itself 'impredicable', we arrive at the paradoxical consequence that impredicable is impredicable if and only
if impredicable is not impredicable. The property-theoretical (logical) variant is as paradoxical as the set-
theoretical (mathematical) one." — Foundations of Set Theory, by Fraenkel, Bar-Hillel and Levy, 2nd revised
edition, Springer 1973.

^This is the law which asserts that for any statement A, either A is true or A is false. The intuitionists
therefore denied that the tautology A V ^A is true.

*In September 2011, the mathematician Edward Nelson announced that arithmetic (to be precise, Peano
Arithmetic (PA)) is inconsistent. His proof turned out to have an error. Nelson is hardly a "lightweight"
mathematician, and this episode reveals that serious mathematicians may have legitimate qualms about the
consistency of even the most basic and longstanding mathematical theories.

By Godel's 2"^* Incompleteness Theorem, PA is not strong enough to prove its own consistency. ZF is strong
enough to prove that PA is consistent [ZF h Con(Pyl)), but since we don't know if ZF is consistent, that proof
is also suspect. (In 1936 the logician Gerhard Gentzen proved that PA is consistent. His proof goes beyond
ordinary arithmetic by requiring a transfinite induction, but only up to a countable ordinal.)

Set Theory

6

1.3 The Iterative Conception of Set

Definition: By a "set" we mean any collection M into a whole of
definite distinct objects m (which are called the "elements" of M) of our
perception or of our thought.

— Georg Cantor, 1895

(First sentence of Contributions to the Founding of the Theory of Trans-
finite Numbers.)

A set, according to Cantor, is 'any collection . . . , into a whole of def-
inite, well- distinguished objects... of our intuition or thought.' Cantor
also defined a set as a 'many, which can be thought of as one, i.e., a
totality of definite elements that can be combined into a whole by a law. '
One might object to the first definition on the grounds that it uses the
concepts of collection and whole, which are notions no better understood
than that of set, that there ought to be sets of objects that are not objects
of our thought, that 'intuition' is a term laden with a theory of knowl-
edge that no one should believe, that any object is 'definite, ' that there
should be sets of ill- distinguished objects, such as waves and trains, etc.,
etc. And one might object to the second on the grounds that a many' is
ungrammatical, that if something is a many" it should hardly be thought
of as one, thai totality is as obscure as set, that it is far from clear how
laws can combine anything into a whole, that there ought to be other
combinations into a whole than those effected by laws, ' etc., etc.

— George Boolos, 1971

(Opening lines The Iterative Conception of Set,
J. Philosophy 68 no.8, 1971, pp. 215-231.)

Russell's paradox involved the multiplicity of all sets which are not elements of themselves.
Can a set belong to itself?

Looked at one way, the answer appears to be "Yes!" : Let A be the set of all sets definable,
in English, using fewer than 100 words. We have just defined A in English using fewer than
100 words. Hence A E A. However, "definable in English" is a rather obscure term. English
has just finitely many words, so there are just finitely many sentences of length < 100 words.
Thus there are at most finitely many natural numbers definable in < 100 words. Now let n
be the smallest natural number not definable in fewer than 100 words Contradiction. . .

Another example is the set

X:= {{{...0...}}}

where there are infinitely many brackets arount the empty set. Then clearly {X} = X, so

X ex.

But looked at from another angle, the answer to our question appears to be "No!" , however:
In order to collect some objects into a set — i.e in order to make a set — one has to have
those objects first. Thus the elements which make up a set are, in some sense, more primitive
than the set itself. As an analogy, one can think of the natural numbers as being "created"
from the number by the operation of repeatedly adding 1. We start with zero, then we make

7

Axioms of Set Theory

1, then we make 2, then we make 3, etc. Now it must not be thought that natural numbers
are continuously being created — "make" is just a manner of speaking. In the same way,
one can think of sets of being made up as follows: We start off with some primitive objects
called atoms (also called individuals, or urelemente), which are not sets, and have no elements
themselves. At stage 0, we construct all sets of atoms. Then at stage 1, we construct all sets
made up of atoms and sets created at stage 0. At stage 2, we construct all sets consisting of
atoms and sets created at stages and 1. Thus if we start with two atoms a, b, then we get
the following:

Stage 0: New sets 0, {a}, {b}, {a, b}.

Stage 1: New sets {0}, {{a}}, {{6}}, {{a, b}}, {0, {a}}, {0, {b}}, {0, {a, b}}, {0, {a}, {6}}, . . .
Stage 2: New sets {{0}}, {0, {0}}, {{{a}}}, ■ ■ •
Stage 3: . . .

We do this for all the stages 0,1,2, But it doesn't stop there! Once we have all these sets,

we can construct sets made up of atoms and sets constructed at some finite stage. Thereafter
we can construct sets made up of atoms and sets just constructed, at the first infinite stage,
for example the set

{0, {0} , {{0}},{{{0}}},...} w

1 bracket 2 brackets 3 brackets

You may have noticed that we have not said anything about the nature of the atoms.
What should or could atoms be? We might take them to be basic objects that we need in
mathematics, e.g. natural - or real numbers, or points and lines. However, it turns out that
we do not need atoms at all! Instead, we can build up all of mathematics from nothing, i.e.
from the empty set. To do this, define a pure set as follows:

• Atoms are not pure.

• A set is pure if and only if all its elements are pure.

Thus at stage 0, we form the set 0, which is pure — vacuously, because none of its members
are impure. At stage 1, the set {0} is pure. At stage 2, we have the pure sets {{0}} and
{0, {0}}, etc. The infinite set (★) constructed at the first infinite stage is clearly pure as well.

We will only be concerned with pure sets. Thus when we say "set" , we mean "pure set" .
This means that in our mathematical universe, everything is a set.

Now observe that sets are made again and again. For example, is a subset of all sets,
and hence is created anew at each stage. But for each set there is a least stage, i.e. a stage
at which the set appears for the first time.

We thus have the following intuitions about set formation via stages:

(i) There is a least stage at which we have the set 0.

(ii) Every stage a has a successor a + 1. At stage a + l we can construct sets whose elements
are sets constructed at stage a and earlier. This means that if a collection (multiplicity)
has elements all of which are created at stage a or earlier, then that collection is a set,
and is "created" by stage a + 1.

Set Theory

8

(iii) Having made sets at a whole lot of stages, there is least a new stage, at which we can
construct sets whose elements are sets constructed at an earlier stage.

(iv) A collection (multiplicity) is a set if and only if it is formed at some stage.

(v) For every set, there is a least stage at which that set is formed.

Thus we have stages 0, 1, 2, 3, . . . , n . . . . After that comes a stage — call it stage oj. After
stage uj come stages

a; + 1, a; + 2, . . . , 2a;, 2a; + 1, ... , 3a;, ...,a;-a;,...,a;,...,a;,...,a; ,...,a;

Thereafter comes another stage — call it uji, etc. These stages conform to Cantor's two
"generating principle^': (1) Every stage a has an immediate successor stage a + 1, and (2)
given any sequence of stages without a last element, there is a stage which comes immediately
afterwards.

If a, /3 are stages, we write a < /3 to mean that stage a comes before stage /3. When we
form a set at some stage /3, its elements must be sets that have been constructed at some stage
a < /3. It is therefore impossible for a set to belong to itself. Furthermore, it is impossible to
get a loop such as

Xi G X2 G X3 • • • G e Xi

Now we consider a property that will turn out to play a very important role in the future
development of set theory: Let A be any non-empty collection of stages, and let F be the
collection of all stages that come strictly before any of the stages in A, i.e.
F := {7 : V(5 G A (7 < (5)}. We can then consider two cases:

• Either F is empty. In that case if the zeroth stage belongs to A, so is the least
element of A.

• Else F is non-empty. In that case intuition (iii) above says that there is a least stage
a which lies above all 7 G F. We now claim that a is the least element of A. To see
this, note first that each 5 G A lies strictly above all 7 G F. Since a is the least stage
which lies above all 7 G F, it follows that V5 G A (a < (5). Next observe that it cannot
be the case that V(5 G A (a < (5), for then q G F. Hence there is some /3 G A such that
(3 < a. But since /3 G A we also have a < (3, and thus q = /3. In particular, a G A.
Since V5 G A (a < 5), we see that a is the least element of A.

In both cases, therefore, we have shown that A has a least element:

Any non-empty collection of stages has a least element.

All of the above is at the intuitive level, and must be made formal. And we will accomplish
this: For example, the intuitive idea of a stage will be captured by the formal notion of an
ordinal. The fact that each set is "created" at some stage will be be evident when we consider
the cumulative hierarchy of sets and the Axiom of Foundation.

Cantor, Grundlagen einer allgemeinen Manmchfaltgkeitslehre, 1883.

9

Axioms of Set Theory

1.4 The Axioms of ZFC

Set theory is that branch of mathematics whose task it is to investi-
gate mathematically the fundamental notions "number", "order", and
"function", taking them in their pristine, simple form, and to develop
thereby the logical foundations of all arithmetic and analysis; thus it
constitutes an indispensable component of the science of mathematics.
At present, however, the very existence of the discipline seems to be
threatened by certain contradictions, or "antinomies" , that can be de-
rived from its principles — principles necessarily governing our think-
ing, it seems - and to which no entirely satisfactory solution has yet
been found. [. . . jUnder these circumstances there is at this point nothing
left for us to do but to proceed in the opposite direction, and starting
from set theory as it is historically given, to seek out the principles re-
quired for establishing the foundations of this mathematical discipline.
In solving the problem we must, on the one hand, restrict these prin-
ciples sufficiently to exclude all contradictions and, on the other, take
them sufficiently wide to retain all that is valuable in this theory.
Now in the present paper I intend to show how the entire discipline
created by Cantor and Dedekind can be reduced to a few definitions and
seven principles, or axioms, which appear to be mutually independent.

— Ernst Zermelo, 1908

(Opening lines of Untersuchungen iiber die Grundlagen der Mengen-
lehren I, Mathcmatischcn Annalen 65, pp. 261-281, 1908. Translation
in van Heijenoort, From Frege to Godel)

You may already have encountered various systems of axioms, such as the axioms for a vector
space, or of a field. A model of a system of axioms is a mathematical object that satisfies
these axioms. Thus a field is a model of the field axioms.
Systems of axioms can be separated into two classes:

I. Some systems of axioms arc obtained by studing a number of mathematical objects/theories
and abstracting certain common elements. Examples of such arc the axioms of group
theory, of field theory, of vector spaces, of topology, of lattice theory, of probability
theory. . . . These systems of axioms have many models, i.e. there are many different
kinds of group, many different kinds of field, many different kinds of vector space, ....
Moreover, these axioms are intended to have many models.

II. Other systems of axioms are meant to describe and characterize one kind of mathematical
objects, i.e. these axioms have an intended model. The axioms of arithmetic are meant
to capture just the set N of natural numbers with the operations +, x and relation <
on N. The axioms of a complete ordered field are intended to capture the set of real
numbers. Euclid's axioms of geometry are intended to characterize space.

In some cases, systems of axioms succeed in characterizing exactly the intended model.
Notably the axioms for a complete ordered field exactly capture the reals, in that any two

models of these axioms are isomorphic (i.e. essentially "the same"). In many cases, it is
impossible to completely capture an object with a set of axioms: If a set of first-order

Set Theory

10

axioms has just one an infinite model, it will have many non-isomorphic models "

The axioms of set theory are of the second kind: They are intended to describe the universe
of all sets. Below, we will write down these axioms, and briefly justify them in terms of the
iterative conception of sets. In the chapters that follow, we will develop the consequences
of these axioms in some detail, and (hopefully) convince you that all of mathematics can be
interpreted in the universe of sets.

The first set of axioms for set theory were presented by Ernst Zermelo in 1908, in a paper
entitled Investigations in the Foundations of Set Theory I. An additional schema of axioms
was suggested by Abraham Fraenkel in 1922 (and independently, by Thoralf Skolem in 1923).
We shall phrase these axioms in the language of first order predicate logic. The language of
set theory has at its disposal the following symbols, whose meaning is assumed to be already
known to you. We have at our disposal a countable collection of

• Variables: vo,vi,V2,V3, . . .

As non-logical symbols, we have the two binary predicate symbols

• Equality: =

• Membership: G

The atomic formulas of our theory are expressions

X = y X £ y

and these can be combined into formulas by means of

• Logical connectives: A (conjunction, and),V (disjunction, or),-i (negation, not), — )■ (im-
plication, implies), <-)• (equivalence, if and only if).

• Quantifiers: V (for all), 3 (there exists).

• Brackets: (,)

It is assumed that you already understand what a (well-formed) formula is, when an instance
of a variable in a formula is free or bound, etc. All of this is part of our metatheory, the basic
background theory in which we discuss our formal set theory. It's obvious — if you stop
to think for a minute — that one cannot develop any theory (mathematical, or otherwise)
from absolutely nothing. We have to assume that we have a common background language
in which to develop our formal theory, and a common basis of knowledge that we can use as
springboard from which to start. For us, this metalanguage and metatheory includes a natural
language (English) augmented by a few mathematical notions, such as the intuitive notion
of natural number, the notion of a finite set, and very elementary first-order logic. This
foundation, involving only intuitively obvious finitary notions and - reasoning, is (hopefully)
obvious a priori, and its validity not in doubt

^"This is the Lowenheim- Skolem Theorem, a basic result in the branch of mathematical logic called model
theory. It follows from this theorem that if the axioms of set theory are consistent, then there is a countable
model of set theory, i.e. a model in which there re just countably many sets — a result that Skolem found

^^If you do doubt it, your mathematics will look very different. For example, for the intuitionists every
function / : R — ^ R is continuous. This follows logically from certain assumptions that are quite natural.

11

Axioms of Set Theory

Apart from the above symbols, we will introduce symbols for defined notions as we proceed.
Many of these will undoubtedly be familiar to you, such as U, n, 0, — for union, intersection,
empty set and set difference. We will use x,y, z,X,Y, Z, . . . as symbols for variables, and
also use brackets [,], j , , to make expressions easier to understand. We will also employ

expressions such as x ^ y rather than the formally correct = y), and we will omit brackets
wherever it makes an expression easier to read without risk of ambiguity. Thus, for example,
we will write

Vx 3y (x y) instead of Vfo (3fi {^{vq G vi)))

The latter is a well-formed sentence of our formal language, but the former is much easier to

We also assume that you are aware of various logical equivalences, such as

(/? V ■i/' -'(-'(^ A -I'i/') and Mx ip (-k/^)

etc.

Furhermore, we introduce the following notation:

(Vx y) (f means Vx(x G y — >■ (p)

and

(3x & y) if means 3x {x & y A ip)

ZFC 0: The Axiom of Set Existence

ZFC 0:

There is a set.

3x {x = x)

Without this axiom, we have nothing to talk about. (For some systems of axioms, models
with no elements are acceptable, and even desirable. Category theorists, for example, like to
work with categories that have initial objects, and these can be empty. But our axioms are
meant to characterize a single "object" : The universe of all sets — and we believe that that
universe is non-empty.) This axiom is not strictly necessary, as it is implied by the Axiom
Infinity.

ZFC 1: The Axiom of Extensionality
ZFC 1:

If two sets have the same elements, then they are equal.

Vx Vy [Vz (z G X -H- 2; G y) — >■ X = y]

The extension of a property is the collection of all objects which have that property, i.e. to
which that property extends.
It is not a priori clear that

{Moons of Venus} = {Moons of Mercury}

Set Theory

12

or that

{x'^ ■.xeR} = : y G M}

The intension of those definitions of sets — i.e. the way in which they are defined —
is different. However their extension is the same. Both the sets {Moons of Venus} and
{Moons of Mercury} are empty, and are thus equal to 0. Both the sets {x'^ : x € M} and
{y^^ : y G M} consist exactly of all non-negative real numbers, and are thus equal to [0,oo).
The Axiom of Extensionality states the intension is not important when it comes to set
equality, but only the extension.

We may now introduce a new symbol C for the relation of inclusion, a binary relation on
sets which is defined as follows.

Definition 1.4.1 We say that a set j; is a subset of y and write a; C y if and only if every
element of x is an element of y:

X y ■^=^ \/z {z £ X ^ z £ y)

We also write y ^ x instead of x C y.

We say that x is a proper subset of y, and write x C y (or y ^ x), if x y and x ^ y.

□

The Axiom of Extensionality can now be written thus:

2;^yAyCx— 7>x = y

Remarks 1.4.2 • The binary relations E and C are sometimes confused, especially by
beginners. They mean quite different things, however. Indeed, observe that it is always
the case that x C x, but that it is seldom the case that x G x. (Indeed, never: That
X G X contradicts the iterative conception of sets was discussed in §1.3[ and will follow
formally from the Axiom of Foundation. )

• Another role of the Axiom of Extensionality is to limit the universe of discourse to sets.
In particular, it says that if two objects x, y of our universe have the same elements,
then X = y. It therefore follows, for example, that there are no atoms. (For if a is an
atom, then a and have the same elements, so a = — contradiction.)

□

ZFC 2: The Axiom Schema of Separation

The remaining axioms, apart from the Axiom of Foundation, are axioms of set formation, i.e.
they assert that certain sets exist, or that certain collections (multiplicities) are sets. The
guiding principle, expounded by Cantor, Russell and Zermelo, is that of limitation of size.
Sets are collections that are "not too big." This prevents one forming a "set of all sets", for
example.

ZFC 2^:

Let if{x,pi,...,pn) be first-order formula with free variables amongst x,pi, . . . ,pn- For
any set z, there is a set y consisting of all those x G z which have the property (p.

Wz Vpi ... ypn ByWx {x £ y -ir^ X £ z A ^p{x,pi, . . . ,pn))

13

Axioms of Set Theory

Given a set z and a property we may separate all those elements of z which have the
property if, and put them together to form a new set y. By the Axiom of Extensionality,
there is just one such set y. We therefore may introduce some new notation: Let

{x e z : (p{x,p)}

denote the unique set y whose existence is asserted by the Axiom Schema of Separation.
Observe that y is a sub-collection of z. The Separation Axiom asserts that the collection
(multiplicity) y is a set.

It is easy to justify the existence of such a set y from the iterative conception of sets:
Given the set z, we know that it is "created" at some stage a. All the elements x of z are
therefore created at some stage strictly earlier than a. In particular, the elements of y are
created at some stage strictly earlier than a, as y C. z. But at stage a we make sets out of all
those collections whose elements are sets created at an earlier stage. In particular, we make
a set out of the collection y at (or possibly even before) stage a.

Observe that Separation is not a single axiom, but an axiom schema: We have one axiom
for every first-order ip{x,p). However, given any formula (or string of symbols) it is a
mechanical procedure to check whether or not \$ is an instance of the separation axiom.

The Axiom Schema of Separation is also called the Axiom Schema of Comprehension or
the Aussonderung-axiom in the literature. It will, in fact, follow from the Axiom Schema of
Replacement (ZFC 7), so it isn't strictly necessary. But it is easier to use than Replacement,
and is on the original list of axioms that Zermelo published in 1908.

Exercise 1.4.3 (a) Prove that the existence of a unique empty set follows from the axioms
ZFC 0-2. We may therefore define to be the unique set with no elements.

(b) Prove that ZFC 0-2 imply that every set x has at least one subset y which is not an
element of x, i.e. prove that Vx 3y (y C a; A y x). Deduce that the collection of all sets
is not a set.

(c) Recall that if a, b are sets, then their intersection a n 6 is the collection of all elements
which belong to both a and b. Further recall that a — 6 is the collection of all elements
which belong to a but not to b. Show that ZFC 0-2 imply that if a, b are sets, then so
are aCib and a — b.

□

Remarks 1.4.4 In the exercise above, we formally introduced the symbols 0,n,— . Previ-
ously, we introduced the symbol C and the notation {x £ a : f}. We will use these introduced
notions as though they are part of our formal language. They can always be eliminated,
however. For example,

\/x 3y {y C x) is shorthand for ^x 3y \/z {z e y ^ z £ x)

and

{yea: ip{y,p)}r]b = is shorthand for 3z ^y {y e z y e a A ip{y,p)) A-i3x {x e z A x e b)

□

Set Theory

14

ZFC 3: The Axiom of Pairing
ZFC 3:

Given two sets x, y there is a set z whose members are precisely x,y.

Va; Vy Vi/; (w^z-^w = x\/w = y)

By the Axiom of Extensionality, any two sets whose members are precisely x, y are equal,
and thus we may define the unordered pair {x, y} to be the unique set whose elements are
precisely x and y, i.e.

z = {x, y} is shorthand for xEz/KyEzA Vu; {w E z — >■ w = x V w = y)

We then define the singleton {x} to be the set {x,x}.

The Axiom of Pairing is easy to justify via the iterative conception of sets: If a and b are
sets, then each is created at some stage, say stages a and p. Then {a, b} will exist at any
stage 7 which is strictly greater than both a and /3.

Remarks 1.4.5 The Axiom of Pairing is often stated in the following weaker form: Given
any two sets x, y, there exists a set z that they both belong to. However, together with the
Separation axiom, this weaker form implies our form. For if z is a set which has x, y amongst
its elements, then

{x, y} := {w & z : w = xV w = y}

□

ZFC 4: The Axiom of Union
ZFC 4:

For any set x, there is a set y whose elements are precisely the elements of elements of x.

Vx 3y Vz [z G y -H- 3w G x {z £ w)]

By the Axiom of Extensionality, there is a unique such set y, which is called the union of the
set X, and denoted by

y = (Jx

or (more commonly) by

y = [^{w : w & x} or y = vj

We can justify this axiom as follows: If x is a set, then it is formed at some stage a. Each

element of x is therefore created at some stage < a, and thus the elements of elements of x
are created before stage a as well. Now the union of a; — i.e. the collection of all elements
of elements of a; — is a collection of elements that are all created strictly before stage a, and
hence |J x is formed as a set by stage a at the latest.

Remarks 1.4.6 The Axiom of Union may be stated in the following weaker form: Given

any set x, there exists a set z that has amongst its elements all the elements of elements of x.
Together with the Separation axiom this weaker form implies our form. For if z such a set,
then

[J X := {y G z : 3w G x (y G w)}

15

Axioms of Set Theory

□

We may now define (with the aid of the Axiom of Pairing) the sets

xUy :=[J{x,y} x U y U z :=\^{x,y, z} xi U • • • U j;„ = |J{xi, . . . , j;„}

etc.

Exercise 1.4.7 (a) Suppose that C := {y : (p{y,p)} is the cohection of all sets that have
some propertjj^ given by the first-order formula C need not be a set itself — it might
be "too big" — but each of its elements is a set. Define the intersection P| C of C to be
the collection of all sets z with the property that z belongs to each element of C, i.e.

f]C = {z:\/y€C{zGy)}

Show that ZFC 0-2 imply that if the collection C is non-empty, then P| C is a set.
Further, use ZFC 0-3 to define the intersections x Ciy, x (1 y Ci z etc. Observe that the
Axiom of Union is not required for the existence of intersections.

(b) Show that P| is not a set.

□

We may also write y = P| C as f]{y : y £ C} oi y = Clyec U- Further, x Ciy := {^{x, y},

etc.

Exercise 1.4.8 If the following identities and inclusions are unfamiliar, now^^ is the time to
prove them.

(a) x C y n z iff X C y and x'^z; x^yUziSx^y and x ^ z.

(b) 2;Cyiff2;ny = xiffxUy = ?/iffx — y = 0.

(c) The distributive laws: x Ci {y U z) = {x Ci y) U {x Ci z) and xU {y Ci z) = (x U y) Ci {x U z).

(d) De Morgan's Laws: x — {y U z) = {x — y) (1 {x — z) and x — {y (1 z) = {x — y) U {x — z).

ZFC 5: The Axiom of Power Set

ZFC 5:

For any set x, there is a set y whose elements are precisely the subsets x.

yx 3y \/z {z £ y -f-)- z C x)

By the Axiom of Extensionality, there is a unique such set y, which is called the power set of
the set X, and denoted by

y = V{x)

The Axiom of Power Set may be justified as follows: If x is a set formed at stage a, then
all the elements of x are formed before stage a. But then each subset of x is a collection of
elements that are created before stage a, so each subset of x is made into a set by stage a at
the latest. So the collection of all subsets of x is a collection of sets that all exist by stage a,
and thus 'P(x) is made into a set by stage a + 1 at the latest.

^■^The technical term for such a collection is class — cf. [|1.5
"Yes, NOW!

Set Theory

16

Exercise 1.4.9 Show that if x,y are sets, then {x,y} € V{V{x) UV{y)).

Given the above, do we need the Axiom of Pairing? Or does Pairing foUow from Union
and Power Set?

ZFC 6: The Axiom of Infinity

"But concerning your proof, I protest above all against the use of an
infinite quantity as something completed, which is never permissible in
mathematics. The infinite is merely a way of speaking, the true meaning
being a limit which certain ratios approach indefinitely close, while others
are permitted to increase without restriction. "

— Gauss, letter to H. Schumacher, 1831

□

" The existence of the infinite will never again be deniable while that of
the finites is nevertheless upheld. If one permits either to fall, one must
do away as well with the other. "

— Cantor, Grundlagen, 1883.

The axioms ZFC 0-5 ahow us to prove that there are infinitely many sets, but they
do not allow us to prove that there is an infinite set. This is not hard to see: The axioms
ZFC 0-2 allows us to deduce the existence of possibly just one set, namely 0. Further,
axioms ZFC 2-5 allow us to create new sets from old, by taking subsets, pairs, unions, and
power set. But if we have at our disposal only finite sets (i.e. if we are at a stage where
only finite sets have been created), then the operations of taking subsets, pairing, union and
power set are not capable of forming infinite sets. Indeed, even ZFC 0-5 together with ZFC
7-9 are not sufficient to prove the existence of an infinite set, and the existence of such a
set is, in fact, denied by the intuitionists and finitists. Following Aristotle, the concept of
infinity was separated into two distinct versions. On the one hand, there is the idea of an
actual infinity, i.e. a completed, finished, definite thing consisting of infinitely many elements
or parts. On the other, there is potential infinity, where one has a sequence of things which
is never completed or finished. Aristotle allowed only potential infinities as non-paradoxical,
and many mathematicians and philosophers agreed. Certainly, this point of view, eloquently
expressed in the quote of Gauss above, had much to recommend it, and under its guidance
mathematicians were able to develop completely rigorous definitions of limits without the use
of infinitesimal^^

It was Cantor who made the use of the actual infinity respectable in mathematics. It was
Zermelo who first understood that its existence had to be asserted:

ZFC 6:

There is an infinite set.

3x [0 £ X A yy {y G X -

■> yU{y}e x)]

The philosopher George Berkeley ridiculed the use of infinitesimals: "May we not call them the ghosts
of departed quantities?" Strangely enough, Cantor too was resolutely opposed to the use of infinitely small
numbers, and "refuted" them with arguments that could be applied equally to his infinitely large transfinite
numbers. But the logician Abraham Robinson resurrected them and legitimised their use with his invention
of the hyperreals and non-standard analysis.

17

Axioms of Set Theory

Let us analyze this axiom: Define an operation S{-) on sets by

S{x) := xU {x}

This means that

y = S{x) yz{z£y-^z£xVz = x)

By the Axioms of Pairing and Union, S{x) is a set whenever x is. For reasons that wiU
become clear later, the function S{-) is called the successor function. The Axiom of Infinity
states that there is a set which has the following amongst its elements:

0, 5(0), S{S{0)), S{S{S{0))),...

"Clearly", such a set must be infinite. However, we have not yet defined what wc mean by
an infinite set — this requires the formal notion of a natural number. Let us say that a set
X is inductive if it satisfies the condition in the Axiom of Infinity, i.e. if

0GxAVy(yGx^ S{y) G x)

The Axiom of Infinity therefore asserts that there is an inductive set. (The set of natural
numbers will then be defined to be the smallest inductive set.) Once wc have defined what
we mean by an infinite set, we shall be able to prove that an inductive set is indeed infinite.
Moreover, we shall show that if an infinite set exists, then so does an inductive set. Thus the
existence of an infinite set will turn out to be equivalent to the existence of an inductive set.

Let us quickly see how to justify this axiom via the iterative conception of sets: Observe
that is formed at stage 0. Then S{0) = {0} is a set whose elements arc formed at stage 0,
so S{0) is formed at stage 1. Then S{S{0)) = {0,{0}} is a set whose elements are formed
at stage 1 and earlier so SS{0)) is made into a set at stage 2. Thus:

is formed at stage

S{0) is formed at stage 1

SS{0) is formed at stage 2

SSS{0) is formed at stage 3

These sets therefore are all formed before the first infinite stage oj, and hence the inductive
set

{0,S{0),SS{0),SSS{0),...}

is formed at stage cj.

ZFC 7: The Axiom of Choice
ZFC 7:

Every family of non-empty sets has a choice function: If x is a family of non-empty sets,
then there is a function f on x which chooses from each w E x an element f{w) G w.

\/x [Vu) G a; (w 7^ 0) — )• 3/ ( / is a function A dom(/) = a; A \/w G x {f{w) G w))]

Set Theory

18

In the next chapter, we will see that functions can be interpreted in the universe of sets, so
that to the statement "/ is a function" can be written in the language of set theory. Observe
that if / is a choice function on x, then / : x — )• |J x. (Of course, not every / : x — )• |J x is a
choice function.)

The Axiom of Choice — formulated explicitly by Zermelo in 1904 — is an immensely
powerful tool. It seems intuitively obvious, and yet has generated so much controversy that it
deserves (and will get) a chapter on its own. The reason that this axiom was initially eyed with
such suspicion by a large number of mathematicians is that the axiom is non-constructive. It
simply asserts the existence of an object — a choice function — without saying how such an
object might be constructed. The axioms ZFC 2-6 have a slightly different character: They
assert that certain well-defined collections are sets. The Axiom of Choice (abbreviated AC) ,

on the other hand, asserts the existence of an undefined set And this is precisely the source

of its power: If one could define a choice function for a particular family x, then one would
not need the AC to assert its existence.

Remarks 1.4.10 It is important to note that one does not need AC to "choose" an element
from a non-empty set. This fact has often been misunderstood. The fact is that AC doesn't
really allow you to "choose"; that's just a manner of speaking. What AC does do is assert
that a certain set exists. Thus if w is non-empty set, then x := {w} is family of non-empty
sets, and AC says that there is a function / : x — t- |J x such that f{w) G w. But we don't
need AC for that ! If it; is non-empty, then (by definition of "non-empty" and first-order
logic) there is an element a G w. We may not know what a looks like — i.e. we may not
explicitly be able to choose a particular a — but there is such an a. It will be clear after
you've read the next chapter that the function / : x — )• |Jx which has f{w) = a exists, i.e. a
choice function exists.

In the same way, it can be shown that any finite family of non-empty sets has a choice
function (because it follows from the Axioms of Pairing and Union show that every finite
collection of sets is a set). But a problem arises when we have an infinite family x of non-
empty sets: It is always possible to choose an element from each w G x. What is not clear is
that one can do this "in one go", i.e. simultaneously for all w £ x, via some function /.

□

I'm not sure if one can justify the Axiom of Choice via the iterative conception of sets.
However, in 1938 Kurt Godel proved the following result: If ZF (i.e. set theory without AC)
is consistent, then so is ZFC (set theory with AC). Put differently, AC cannot introduce an
inconsistency into set theory. If ZFC is inconsistent, then ZF was already inconsistent also.

ZFC 8: The Axiom of Replacement

The remaining two axioms — Replacement and Foundation — are very important for the
subject of set theory itself, and for metamathematics. They play only a tiny role in ordinary
mathematics however.

I do not see know to derive the Axiom Schema of Replacement from the iterative concep-
tion of sets. It is easy to derive it from the idea of limitation of size, however. By that we
mean that a set is a collection that is not "too big" . Here's how:

We shall soon see — and you are no doubt aware already — that functions are sets. Remember, sets are
all there is.

19

Axioms of Set Theory

Suppose that ip{x, y,p) is a first-order formula in the language of set theory which defines
a "function" . Here x, y play the role of independent and variable, and p the role of a fixed
parameter. Informally, such a formula defines a "function" F when for any x there is at most
one y such that ip{x,y,p) holds. In that case we have a "function" F given by

y = F{x) if and only if ip{x, y,p) holds

The Axiom of Relacement asserts that if F is such a "function" and if X is a set, then the
direct image

F[X] := {F{x) :xeX} = {y:3xeX ^{x, y,p)}

is a set also. The use of quotation marks around the word function is essential here. We shall
see in the next chapter how to interpret functions as sets. But it will follow from ZFC 0—5
that the direct image of a set under a function is a set — The Axiom of Replacement isn't
necessary for that. The function F we use here may be "too big" to be a set, however.
Nevertheless, even though F may be too big, its image F[X\ is surely no bigger than X. So
if X is a set, then F[X] can't be too big to be a set.

All this may be confiising at first. Fortunately wc don't need to talk about the size of a
"function" F to formulate the Axiom Schema of Replacement:

ZFC 8(^:

Suppose that (p{x,y,pi, . . . ,pn) is a first-order formula with free variables amongst x,y
and p := {pi, . . . ,pn)- Suppose that for every x there is at most one y so that (p{x,y,p).
If X is a set, then the collection {y :3x e X (p{x, y,p)} is a set.

\/p VixMyMz {ip{x,y,p)/\Lp{x,z,p) ^ y = z) — >■ VX3y Vy (y G y ■<->■ -^x ^ X ip{x,y,p))\

Observe that Replacement — like Separation — is an Axiom Schema. We have one axiom
for every ip.

The Axiom Schema of Replacement was suggested independently by Abraham Praenkel
and Thoralf Skolem in 1922. It allows the creation of many, many new sets. Whereas the
other axioms of ZFC allow one to form new sets that are at most a few levels higher than
the sets from which they are created, the Axiom of Replacement allows one to form sets at
very high levels. (E.g. it is possible that the sets X is at a "low level" and that the created
set F[X\ at a very "high level". ) In particular, it necessary to develop a theory of ordinal
numbers and transfinite induction. As we shall see, the ordinals form the backbone that
supports the entire set-theoretic universe, so this axiom is indispensable to the set theorist.

Set Theory

20

ZFC 9: The Axiom of Foundation

A well-known scientist (some say it was Bertrand Russell) once gave a
public lecture on astronomy. He described how the earth orbits around
the sun and how the sun, in turn, orbits around the center of a vast
collection of stars called our galaxy. At the end of the lecture, a little old
lady at the back of the room got up and said: "What you have told us is
rubbish. The world is really a fiat plate supported on the back of a giant
tortoise. " The scientist gave a superior smile before replying, "What is
the tortoise standing on?" "You're very clever, young man, very clever,"
said the old lady. "But it's turtles all the way down!"

— Stephen Hawking, A Brief History of Time

In the set-theoretic universe, its sets all the way down. Every element of a set is a set.
Every element of an element of a set is a set. But how far is all the way down'? Not that far,
as it turns out.

According to the iterative conception of sets, sets are made in stages, with each set being
made up of elements that have been formed at a strictly earlier stage. We saw also that any
non-empty collection of stages has a least element. Suppose now that a; is a non-empty set.
Let r be the collection of all stages associated with an element of x, i.e.

r = {7 : There is x G X such that z is formed at stage 7}

The r is a non-empty collection of stages, so has a least clement a. Let y G X be an element
formed at stage a. If z G y, then z is formed at a stage /3 < a. Now /3 F, because a is
least, and hence z ^ x. It follows that if 2; G y, then z ^ x, i.e. that y Ci x = 0. We have
thus shown (informally, from the iterative conception of sets), that for every non-empty set

• there is y E x such that y n x = 0.

We now take this to be an axiom:

ZFC 9:

Vx (x 7^ -

■> 3y G x(y n x = 0)

Exercise 1.4.11 Show using the Axiom of Foundation that "the turtles don't go very far
down" in ZFC: There are is no sequence of sets xi, X2, X3, . . . such that

. . . Xn+l G x„ G . . . X3 G X2 G Xi
It follows that every G-decreasing chain

xi 3 X2 3 xs 3 . . .

is of finite length. (Technically, we say that the G-relation is wellfounded.)

□

21

Axioms of Set Theory

The Axiom of Foundation is also referred to as the Axiom of Regularity in the hterature.
It was introduced by John von Neumann in 1925. While it is of scant use to the ordinary
mathematician, it is of great import to the set theorist, as we shall see in a later chapter. In
essence, together with Replacement, it allows us to formalize the iterative conception of set
within ZFC itself, and build up the universe of sets from nothing, by iterating the operations
of power set and union along the ordinals.

1.5 Sets and Classes

The version of set theory that we have developed (i.e. ZFC) has just one type of object,
namely sets. Some set theories allow for another kind of object, however, namely classes. We
will use this terminology only informally. Thus, informally, a class is a collection of the form

C := {x : ipix,p)}

i.e. C is the collection of all sets that have property (p. Here x,p are sets, with p playing the
role of a parameter.

For example, the universe of all sets — denoted by V — is the class

V := {x : X = x}

Some classes are "too big" to be sets: Russell's paradox (together with Separation) shows
that V is not a set. But every set is a class, for if y is a set, then the class determined by the
formula ip{x, y) = x G y is clearly y itself:

{x : X ey} = y

A class that is not a set is called a proper class. Thus V is a proper class. Later, we shall see
that the class On of all ordinals is a proper class also.

The theory NBG (von Neumann-Bernays-Godel) is a commonly used set theory allows
both classes and sets as objects. (This is a conservative extension of ZFC: Any theorem about
sets is NBG is also a theorem of ZFC, and vice versa. Thus NBG says nothing about sets

Inside ZFC, however, classes are just a way of speaking: If

C := {x : ipix,p)}

is a class, then when we say x G C, we mean simply that the statement (p{x,p) is true. Thus
any mention of C can be avoided, simply by using its defining formula (/?.

Exercise 1.5.1 Observe that every element of a class is a set. Show that a class is a set if
and only if it is an element of some class.

□

In Exercise 1 1.4. 7 it was shown that the intersection of a non-empty class is a set, i.e. that

if C 7^ 0, then f] C is a set.

The Axiom Schema of Separation states essentially that a subclass of a set is a set:

{x £ z : ip{x,p)} = zri{x : ip{x,p)} = z CiC

Set Theory 22

The Axiom Schema of Replacement can be formulated as follows: If F is a class functior
and X is a set, the direct image F[X] is a set.

^''i.e. a "function" between classes. For example, the successor function S{x) :— xU {x} is a class function
S : V V.

Chapter 2

Relations and Functions

In this chapter, we revise some elementary notions in mathematics, and show how they can
be interpreted in the universe of sets. It is assumed that you've encountered the material in
§2.1-2.5 before, so we move at speed. For the development below, the axioms ZFC — 5
usually suffice. The Axiom of Choice (AC, ZFC 7) also occurs in a few exercises, but its use
is always stated explicitly.

2.1 Ordered Pairs and Finite Cartesian Products

We can use the notion of an unordered pair to define the notion of an ordered pair. An ordered
pair (x, y) is an object which, intuitively, contains x, y in that order. The essential property
that we want ordered pairs to possess is the following:

(x, y) = (a, b) if and only if x = a and y = h

In particular, (x,y) ^ {y,x) (unless x = y). But by the Axiom of Extensionality

{x,y} = {y,x}

We can, however, interpret ordered pairs as sets, i.e. find sets which exhibit the correct
behaviour. We follow the Polish mathematician Kazimierz Kuratowski, and define

{x,y) := {{x},{x,y}}

Exercise 2.1.1 Show that Kuratowski's definition of orded pair results in the correct be-
haviour, i.e. that (x, y) = (a, h) if and only if x = a and y = b. Also show that the following
alternative definitions of ordered pair fail to exhibit the desired behaviour:

{a,b):={{a},{b}}, {a,b) := {a,{b}}

Finally, give one other definition of ordered pair that does exhibit the correct behaviour.

□

23

Set Theory

24

Observe that the set {x, y) is a concrete reahzation of the abstract idea of an ordered pair
{x,y). This is not to say that is {x,y), but merely that the set {x,y) behaves as an

ordered pair should.

Once we have defined ordered pairs as sets, we are able to find interpretations of ordered

{x, y, z) ■= {{x, y),z), {x, y, z, w) := {{x, y, z),w),

and, in generaQ

{Xi, . . .,Xn+l) ■■= {{Xl, . . .,Xn),Xn+l)

Given two sets X, Y, we may define the Cartesian product X xY X and Y as follows:

X X Y = {z : 3x £ X 3y £ y {z = {x,y))}

We then define

X xY X Z := {X xY) X Z

etc. Observe that in that case

X xY X Z = {{x, y,z) : X e X,y eY,z e Z}

Similarly, we define X"^ := X x X, X^ := X x X x X , etc.

We have not yet shown that Cartesian products are sets. That is the objective of the next
exercise:

Exercise 2.1.2 Let X,Y be sets.

(a) Write down a first-order formula pair{z, x, y) which states that z = {x, y}.

(b) Write down a first-order formula ordered_pair{z , x, y) in the langage of set theory which
states that z = {x,y).

(c) Show that ii x £ X and y £Y, then {x, y) £ V{V{X U Y)).

(d) Show that X x y is a set.

□

Here are some easy exercises about the arithmetic of Cartesian products:

Exercise 2.1.3 (a) {X \JY) x Z = {X x Z)U {Y x Z), {X r\Y) x Z = {X x Z)r\{Y x Z),
{X -Y) X Z = {X X Z) - {Y X Z),

(b) (A X B) n {X X Y) = {An X) X {B nY).

(c) X xY = ii and only if X = or F = 0.

□

^We are not using natural numbers in this definition, but giving a schema for defining ordered tuples of
arbitrary length. If you want to define an ordered 100-tuple, this requires a formula containing 100 variables.
The concept/number 100 is not required. However, in order to define all finite ordered tuples simultaneously,
using induction, we need the concept of natural number. This will have to wait until a later chapter.

25

Relations and Functions

2.2 Relations

In mathematics, one studies many relations between various objects x, y. For example:

• X < y where x, y are numbers.

• X £ y, where x, y are sets.

• X = y, where x,y are natural numbers/real numbers/sets/matrices/. . .

• x\y {x divides y), where x,y are integers, or polynomials.

• X is a root of y, where x is a real number, and y a real function.

If R stands for some relation, we often write xRy to mean that x, y satisfy that relation.
Relations need not be binary, i.e. they need not be between just two objects. Furthermore,

• x,y,z are related if and only if x is congruent to y modulo z, where x, y, z are natural
numbers.

• x,y,z are related if and only if x(y) = z, where x is a real function, and y,z are real
numbers.

• x,y,z are related if and only if the dot product of x and y is z, where x, y are two
n-dimensional vectors, and z is a real number.

• X, y, z, w are related if and only if the point (x, y) lies on a circle of radius z and centre
w.

Observe that the order of x,y,z,. . . may matter. If the relation is strict inequality <,
then X is related to y if and only if x < y. This does not mean y is also related to x.

Given some relation, we can interpret it as a set R as follows: Let R be the set of all
ordered pairs (x, y) with the property that x is related to y. Thus

Definition 2.2.1 A binary relation i? is a set of ordered pairs. When {x,y) G R, we may
indicate this by writing xRy.

An n-ary relation i? is a set of ordered n-tuples. When (xi, . . . , Xn) G R, we may indicate
this by writing R{xi, . . . , x„).

For example, the relation of strict inequality on the natural number^ N := {0, 1, 2, 3, ... } is
the set < defined by

<:= {(0,1), (0,2), (0,3), (0,4),..., (1,2), (1,3), (1,4),..., (2,3), (2, 4),..., (3, 4),...}

□

Definition 2.2.2 Suppose that i? is a binary relation.

■^Informally speaking, that is. In the end of course, they will be of the same type, since everything is a set.
''We will define the natural numbers later in this chapter.

Set Theory

26

(a) The domain and range of R are defined by

dom(i2) := {x : 3y (xRy)} ran(i?) := {y : 3x (xRy)}

(b) If A is a set, then the image of A under R is defined by

R[A] := {y G ran(i?) -.JxeA {xRy)}

(c) If 5 is a set, then the inverse image of B under R is defined by

R-^[B] := {x e dom(i?) :3yeB (xRy)}

(d) If A is a set, then the restriction of i? to A is the binary relation R \ A defined by

R\ A:= {{x,y) e R : x e A}

□

If X,Y are sets, then any subset R C X x Y is a relation, with dom(i?) C X, and
ran(i?) C y. In that case, we say that i? is a relation from X to Y. If R C X x X, we say
that R is a relation on X.

For any set X there is a identity relation Idx on X defined by

Idx := {{x,x) :xeX}

so that X Idx y if and only if x,y e X and x = y. (Thus Idx is just the relation of equality,
restricted to the set X.)

Exercise 2.2.3 Let i? be a binary relation.

(a) Show that if {x, y) € i?, then x, y G |J |J i?. Conclude that that dom(i?) and ran(i?) are
sets.

(b) Show that R C dom(i?) x ran(i?).

(c) Define the inverse R"^ of R by

(x, y) e R~^ iff xRy
Show that R~^ is a set, and thus a binary relation.

(d) If S is another binary relation, define the composition S o R of R with S by

{x,z) €SoR iff 3y {xRy A ySz)
Show that o is a set, and thus a binary relation.

(e) Show that composition of binary relations is associative:

To{SoR) = {ToS)oR

27

Relations and Functions

(f) Consider the relation < on the set of natural numbers. What is < ^1 Show that <
o <=<.

(g) Show that if dom(i?) = X, ran(i?) C Y, then Idy o R = R = Roldx-

□

Observe that xS o Rz if and only if there is y such that xRySz — the order of i?, S is
reversed! (This is why some texts write Ro S where we write S o R. We have reasons for
preferring the current version, however, as will become apparent in the next section.)

Definition 2.2.4 A binary relation i? on a set X is

(a) Reflexive if Vx G X (xRx).

(b) Symmetric if Vx, y E X {xRy — > yRx).

(c) Asymmetric if Vx, y £ X {xRy — -^{yRx)).

(d) Antisymmetric if Vx, y E X {xRy A yRx —^x = y).

(e) Transitive if Vx, y,z E X {xRy A yRz — >■ xRz).

(f) Connected if Vx, y E X {xRy V x = y V yRx).

□

Exercise 2.2.5 Suppose that i? is a binary relation on a set X.

(a) Show that -R is reflexive if and only if Idx ^ ^•

(b) Show that R is symmetric if and only if R~^ C R.

(c) Show that R is asymmetric if and only if i? fl = 0.

(d) Show that R is transitive if and only if R o R C R.

(e) Show that R is connected if and only if i? U R~^ U Idx = X x X.

□

2.3 Functions

Informally, a function / can be thought to induce a binary relation between elements x,y,
namely: x and y are related if /(x) = y. But not every binary relation is a function, for
functions arc single-valued, i.e. if /(x) = y and /(x) = z, then we must have y = z. This

Definition 2.3.1 A function (or map) / is a binary relation with the property that if (x, y) G

/ and (x, z) £ f, then y = z.

We may write /(x) = y instead of (x, y) G / or x / y.

Set Theory

28

□

When / is a function with dom(/) = X and ran(/) C Y, we indicate this by writing

/ : X — 7- y. We may also write x ^ y to mean that f{x) = y (when / is clear from context).

If X, Y are sets, then the collection of all functions f : X ^ Y is denoted by Y-^ . (Some
authors use ^Y.)

Exercise 2.3.2 Show that if X, Y are sets, then X^ is a set.

Observe that the identity relation Idx on a set X is a function, with Idx : X ^ X : x x.
Obviously, it will be referred to as the identity function as well.

Since a function is a binary relation, the notions of domain, range, inverse, image, inverse
image, restriction and composition make sense. In particular any function / will possess an
inverse relation f~^. However, /^^ need not be a function.

Definition 2.3.3 Let f : X ^ Y he a function.

(a) If the binary relation f~^ is a function f :Y X, i.e. if f~^ is a function and dom(/~^) =
Y, then we say that / is invertible.

(b) We say f : X ^ Y is injective (or one-to-one) if Vx, x' G X f{x) = f{x') — >■ a; = x'). We
may write f : X ^ Y (or f : X ^ Y) to assert that / is an injection.

(c) We say that f : X Y is surjective (or onto) if Y = ran(/), i.e. if Vy G 3a; €
X {f{x) = y). Wc may write f : X ^Y to assert that / is a surjcction.

(d) We say that f : X Y is a bijective if / is both injective and surjective. We may write
/ : X >— » y to assert that / is a bijection.

Exercise 2.3.4 Let / : X — > y be a function.

(a) Show that a composition of in/sur/bijcctions is an in/sur/bijection.

(b) Show that the relation f~^ is a function if and only if / is injective.

(c) Show that if / is invertible, then / o f~^ = Idy, and o / = Idx-

(d) Say that a function g : Y ^ X is a left inverse of / if ^ o / = Idx- Show that / has a
left inverse if and only if it is an injection.

(e) Say that a function h : Y ^ X is a right inverse of / if and only if f o h = Idy. Using
AC, show that a function has a right inverse if and only if it is a surjection.

(f) Show that if / has both a left inverse g and a right inverse h, then / is invertible, and

□

□

9 = h = f

-1

(g)

Show that / is invertible if and only if it is a bijection.

□

29

Relations and Functions

Suppose that X is a sets, and that F : I ^ X : i Xi is a, surjection. Then

X = {F{i) ■.iel} = {xi:iel}

The function F is called an indexing of the set X. The set / is called the index set, each i G I
is called an index, and F{i) = xi is called the i*^ term.

Note that any set can be indexed by some set. For example, a set X can be indexed by
itself: Let I = X, and let Xi = i. Later on, however, we shall often use ordinals, a kind of
transfinite number, to index sets.

If X = {xi : i e I}, we will write

[jx=ijx, n^=n^^

etc.

Exercise 2.3.5 Let / : X — >■ y be a function. Show that the inverse image f~^[-] is a
function

/-![.] : p(y) ^ : B ^ f-^[B] := {x e X : f{x) G B}

which preserves the basic set-theoretic operations U'D'"' (^-S- show that /~^[Uie/^] ~
Uielf'^i^il for any family Yi G V{Y).)

□

Associated with any product X xY are associated projections itx-,t^y defined by

-Kx X xY ^ X : {x,y) ^ X Try ■ X x Y ^ Y : {x,y) y

Exercise 2.3.6 Suppose that fx'-A^X,fY'-A^Y are functions with common domain
A. Show that there is a unique function f : A ^ XxY such that fx = T^x^f and /y = nyof-
[Hint: Let f C. A x {X x Y) consist of all ordered pairs (a, {fx{o), fyio))) where a ^ A.

□

2.4 Equivalence Relations

Recall that any set X carries an identity relation Idx defined by

Idx := {{x,x) : X G X}

which is just the relation of equality, restricted to the set X.

The relation of equality is a class relation which satsifies following properties:

• = is reflexive: x = x.

• = is symmetric, li x = y, then y = x.

• = is transitive: li x = y and y = z, then x = z.

A relation which satisfies these same properties is called an equivalence relation. To be precise:

Set Theory

30

Definition 2.4.1 A binary relation R on a set X is an equivalence relation if and only if

(i) R is reflexive, i.e. \/x G X [xRx).

(ii) R is symmetric, i.e. Vx,y G X — t- yRx).

(iii) i? is transitive, i.e. \/x,y,z £ X (xRy A yi?2; — t- xRz).

□

It is easy to see that if n is a non-zero integer, then the relation =„ of equivalence modul^
n is an equivalence relation on the set Z of integers.

An equivalence relation R can be thought of as a generalization of the identity relation:
If xRy, then x, y are in a certain sense identified, i.e. regarded as "equal" for the purpose at
hand. When we collect together all the elements which are made "equal" we get equivalence
classes:

Definition 2.4.2 If R is an equivalence relation on a set X and x £ X, then the equivalence
class of X modulo R is the set defined by:

[x\r ■.= {y£X: xRy}

□

Some authors may write x/R instead of [x]ji.

The following device yields a fruitful way of looking at equivalence relations:

Definition 2.4.3 A family V of non-empty sets is called a partition of a set X if and only
if:

(i) [j{P:PeV} = X.

(ii) Any two distinct members oiV are disjoint, i.e. P,Q £ V and P ^ Q implies PCiQ = 0.

□

The members P €z V are sometimes called the blocks of the partition. Observe that (i) of the
above definition states that each x £ X belongs to at least one block, whereas (ii) states that
no x £ X belongs to more than one block. Thus each x £ X belongs to exactly one block.

Exercise 2.4.4 (a) Every equivalence relation on X induces a partition of X: Let R be an
equivalence relation on a set X. Define a family of sets Vr by

Vr := {[x]r -.xeX}

Show that Vr is a partition of X.
^Recall that a =„ b if and only if 6 — a is divisible by n.

31

Relations and Functions

(b) Every partition of a set X induces an equivalence relation on X: Let 7^ be a partition of
X. Define a binary relation R^p on X by

xR-py if and only if {x e P ^y e P)

(i.e. X R-p y if and only if x, y belong to the same block of V.) Show that R-p is an
equivalence relation. Further show that the blocks of V are precisely the equivalence
classes of R-p, i.e. that [x]ii^ = P if and only if x G P.

(c) Show that the constructions in (a), (b) above are inverses of each other, in the following
sense: Starting from an equivalence relation P, the construction in (a) yields a partition
Vr. If we then apply the construction in (b) to the partition Vr, we get an equivalence
relation R-p^. This relation is precisely the original equivalence relation P, i.e. Pp^ = P.
Similarly, Vr-p = V.

□

If P is an equivalence relation on a set X, then the set of equivalence classes — which we
above denoted by Vr — is usually denoted by X/ P. The equivalence class of an element x
— above denoted by [x\r — is often denoted by x/R.

Here is another reason why equivalence relations are ubiquitous in mathematics:

Exercise 2.4.5 (a) Every function induces an equivalence relation on its domain: Suppose
that / : X — > y is a function. Define a binary relation P on X by

xRy iff fix) = f{y)

The relation P is called the kernel of /, and denoted P = ker/. Show that ker/ is an
equivalence relation.

(b) Assuming the Axiom of Choice, every equivalence relation on a set X is induced by a
function with domain X: Show that if P is an equivalence relation, then there is a
function / with domain X such that P = ker /.

□

The following exercise gives two equivalents of AC:

Exercise 2.4.6 Show that the following are equivalent:

(a) AC

(b) For every partition V there is a set C which has precisely one element in common with
each block P gV.

(c) Every surjection has a right inverse.

□

Set Theory

32

2.5 Order Relations

Definition 2.5.1 (a) A binary relation < on a set X is said to be a partial order relation if
< is:

(i) reflexive, i.e. x < x,

(ii) antisymmetric, i.e. x < y and y < x imply x = y, and

(iii) transitive, i.e. x < y and y < z imply x < z.

(b) A binary relation < on a set X is said to be a strict partial order relation if < is

(i) asymmetric, i.e. x < y implies y ^ x.

(ii) transitive, i.e. x < y and y < z imply x < z.

(c) A total ordering is a connected partial ordering. A strict total ordering is a connected
strict partial ordering. A (strictly) totally ordered set is also called a chain.

□

A partially ordered set is a pair {X, <) where < is a partial ordering on X.

Observe that Idx is a partial ordering on X, and that is a strict partial ordering on X.
Given a partial ordering <, we can define a strict partial ordering < by

X < y if and only if x < y A x ^ y

i.e. < := < - Idx-

Conversely, given a strict partial ordering <, we can define a partial ordering < by

X < y if and only if x < y V x = y

i.e. < := < u Idx-

Moreover, < is total if and only if < is total.
Definition 2.5.2 Let < be a partial ordering on X.

(a) Let Y C X. An element yo e Y is said to be a <-minimal element of Y if there is no

y G Y such that y < yo-

We will refer to yo simply as a minimal element of y if < is clear from context.

(b) Let y C X. An element yo eY is said to be the <-minimum element of Y if and only if
Vy G F (yo < v)-

We will refer to yo simply as the minimum element of y, or the least element of y, if <
is clear from context.

(c) The partial ordering < is said to be well-founded if every non-empty subset of X has an
<— minimal element.

(d) A total ordering < is called a well-ordering if every non-empty subset of X has a <-
minimum element.

□

33

Relations and Functions

Exercise 2.5.3 Let {X, <) be a partial ordering, and let Y C X).

(a) Every minimum element of y is a minimal element. The converse may be false.

(b) A subset of a partial ordering can have at most one minimum element.

(c) If (X, <) is a chain, then every minimal element of Y is also a minimum element of Y.

□

Definition 2.5.4 Let {A, <) and {B, ^) be partially ordered sets.

(a) A function f : B is order-preserving if and only if

x<y ^ f{x) ^ f{y)

(b) {A, <) and (B, ^) are said to be order-isomorphic if there is a bijection f : A ^ B
with the property that both /, ^ are order preserving. Such an / is called an order-
isomorphism.

□

Exercise 2.5.5 (a) Let <) and (B,^) be partially ordered sets. Show that a bijection
f : A ^ B is an order-isomorphism if and only

x<y ^ f{x) < f{y)

(b) Give an example to show that not every order-preserving bijection is an order-isomorphism.

(c) Show that {A, <) and {B, :<) are chains, then any order-preserving bijection is an order-
isomorphism.

(d) Show that if X is a set and X is a family of subsets of X, then the inclusion relation C
is a partial ordering on X, i.e. {X, C) is a partial ordering.

(e) Let {X, <) be a partial ordered set. For x e X, define

I X := {y e X : y < x} and X := {], x : x & X}
Show that {X, <) and (X, C) are order-isomorphic.

(Thus, in some sense, C is the only partial order relation that exists, up to isomorphism.)

□

Set Theory 34

Chapter 3

Natural Numbers

God created the natural numbers; everything else is the work of man.

— attributed to Leopold Kronecker

It seems that Kronecker may have almost literally believed this. When Lindemann proved
in 1882 that vr is trancendental (i.e. not the root of a polynomial with rational coefficients),
Kronecker opined that the proof was indeed beautiful, but since irrational numbers do not
exist, Lindemann's proof had no meaning.

Certainly, Kronecker was the first and most vociferous critic of Cantor's transfinite sets[^

In 1889, Giuseppe Peano listed five axioms with the intention of providing a rigorous
foundation for the natural numbers. The Peano Postulates are:

I. is a natural number.

II. Every natural number has a successor s(n) which is also a natural number.
(The intention is that s(n) = n + 1.)

III. is not the successor of any natural number.

IV. If two natural numbers have the same successor, then they are identical, i.e. if s(n) =
s{m), then n = m.

V. If X is a set of natural numbers such that:

(i) G X,

(ii) If n £ X, then s{n) £ X,

then every natural number belongs to X.

(This is the Principle of Mathematical Induction.)

The informal idea is that the set of natural numbers is the set

{0, s(0), s{s{0)), s{s{s{0))), ...}

so that every natural number is obtained from by finitely many applications of the successor
function s(-). The point of this chapter is to create the natural numbers: We shall construct
an object which satisfies the Peano Postulates inside the set-theoretic universe.

^Doron Zeilberger is a well-known modern mathematician who holds similar opinions and publishes them
regularly. They're worth reading. Go have a look on the www.

35

Set Theory

36

In this chapter, we work mainly with the theory ZF~, which is ZFC without the Axioms
of Choice and Foundation. AC will be used occasionally, but its use will in all cases be stated
expplicitly.

3.1 The Set lu of Natural Numbers

Let us briefly recall some definitions made when we introduced the Axiom of Infinity: We
defined the successor function S{-) on sets by

S{x) := xU {x}

(Some authors write x~^ instead of S{x).) We defined a set X to be inductive if

(i) e X, and

(ii) X is closed under S{-): U x £ X, then S{x) £ X.
The Axiom of Infinity is the statement

There exists an inductive set
Now let I be the class of all inductive sets:

I := {x : Ind(j;)} where Ind(x) = £ x A {y £ x ^ S{y) £ x)

By the Axiom of Infinity, I is non-empty. It follows from Exercise 1.4.7 that P|I is a set.
(Indeed, if A is any inductive set, then f]I = {x £ A : Ind(x)} is a set by Separation.)
We will give this set a name:

and call u the set of natural numbers. Another commonly used symbol for the set of natural
numbers is N.

It is easy to see that uj is itself an inductive set. It follows that uj is the smallest inductive
set: If X is any inductive set, then u <Z x.

Exercise 3.1.1 (a) Show that the intersection of a non-empty class of inductive sets is an
inductive set.

(b) Conclude that uj is the smallest inductive set: If x is any inductive set, then a; C x.

□

Since £ u and u is closed under S{-), we see that u must have the following elements
(which we simultaneously provide with names):

:=

1 := 5(0) = {0}
2:= 5(1) = {0,1}
3:= 5(2) = {0,1,2}
4:= 5(3) = {0,1,2,3}

= {0,l,2,...,n-l}

37

Natural Numbers

Observe that the set n has (informal) n elements.

If n S cj, we write n + 1 := S{n) (i.e. n + 1 is another notation for S{n) := nU {n}).
The following result is easy:

Theorem 3.1.2 (Principle of Mathematical Induction) Suppose that X <^ u has the proper-
ties that

(i) e X.

(n) IfneX, then n+1 e X.
Then X = uj.

□

Exercise 3.1.3 Prove the Principle of Mathematical Induction.

□

Let us investigate the structure of the set uj in greater detail. We want to verify that uj, S
satisfy the Peano Postulates. We won't take the simplest route — things are about to get
quite technical — but the route we follow is good preparation for the general study of the
ordinals. In particular, we will also study the order relation on the natural numbers.

3.2 (cj, g) is a Transitive Well— Ordered Set

Above, we saw that 1 = {0}, 2 = {0, 1}, 3 = {0, 1, 2} . . . etc. Thus, intuitively, m < n if
and only if m G n. In other words, it seems that the G-relation acts as a strict total ordering
< on the set of natural numbers. Of course, the set of natural numbers yields the primary
example of a well-ordered seQ It is the fact that the natural numbers are well-ordered
that makes possible proofs by induction and definitions by recursion, so the importance well-
ordering is hard to overstate.

In this section, we make precise and verify the above assertions for our interpretation uj
of the set of natural numbers. We begin with some definitions:

Definition 3.2.1 (a) A set X is transitive if each element of X is also a subset of X:

Trans(X) = Vx € X[x C X)
(This means that if y G x and x G X, then also y €z X.)

(b) An element y S x is said to be an ^-minimal element of x if there is no z G y is such that
z G X. Put differently, y G x is an G-minimal element of x if y H x = 0.

(c) A set X is well-founde^\i every non-empty subset z C x has a G-minimal element.

^Recall that a total ordering (X, <) is a well-ordering if every non-empty subset of X has a least element.
We will investigate well-ordered sets in more detail in the next chapter.

''By the Axiom of Foundation, every non-empty set has an €-minimal element, i.e. every set is well-founded.
But we do not want to use this axiom here if we can avoid it.

Set Theory

38

□

Lemma 3.2.2 (a) If x is transitive, then so is S{x).
(h) If X is well-founded, then x ^ x.

(c) If X is transitive and well-founded, then so is S{x).

(d) If x,y are sets such that (i): x is transitive and well-founded, and (ii): S{x) = S{y),
then X = y.

Proof: (a) If y G S{x) = x U {x}, then either y G x or y = x. Since x is transitive, y C x in
either case, and thus y C S{x) also.

(b) {x} is a non-empty subset of x, and thus has a S-minimal element, i.e. there is
y € {x} so that y n {x} = 0. But then x = y, so x (1 {x} = 0, contradicting the fact that
X G X n {x}.

(c) We have already seen that S{x) is transitive if x is. Now suppose further that x is
well-founded, and that y Q S{x) is non-empty. We distinguish two cases:

Case (i): If y fix is non-empty, then since x is well-founded, yDx has an S-minimal element
z, i.e. there there is z € y D x such that z Ciy D x = 0. We claim that z is an G-minimal
element of y also. For suppose that there is w £ z Dy. Then certainly w G z. But as z G x
and X is transitive, we must have w G x as well. But then w G z D y D x — contradiction,
because z Ci y Ci x = 0.

Case (ii): If y D x = 0, then y = {x}. But since we know that x x, we see that x is a
E-minimal element of y.

(d) Suppose X ^ y. Then x G y and y G x. Since x is transitive, x G x. But since x is

H

Lemma 3.2.3 (a) If X is inductive, then {x G X : x CI X} is inductive also.
Hence uj is transitive.

(h) If X is inductive, then {x G X : x is transitive and well-founded} is inductive also.
Hence every element n G u is transitive and well-founded.

(c) UJ is well-founded.

(d) If X is inductive, then {x G X : x = V 3y G X (x = S{y))} is inductive also.
Hence every non-0 element of oj is the successor of some element ofuj.

Proof: (a) Let y := {x € X : x C X}. Then gY. Now suppose that x gY. Then x G X,
so S{x) G X, because X is inductive. Further, x C X, and hence S{x) C X. Since S{x) G X
and S{x) C X, we see that S{x) G Y. Hence Y is inductive.

It follows that {n G a; : n C w} is an inductive subset of co. Since uj is the smallest inductive
set, we see that n C a; holds for every n G oj. Thus oj is transitive.

(b) Let Y = {x G X : X is transitive and well-founded}. Certainly gY. Now if x G y,
then ^(x) G X (because X is inductive). Moreover, ^(x) is transitive and well-founded (by

Lemma 3.2.2). Hence S{x) G Y, and thus Y is inductive.

39

Natural Numbers

It follows that {n £ uj : n is transitive and well-founded} is an inductive subset of w, and
thus that it is equal to w.

(c) Let X C a; be a non-empty set, and let n £ X. If nOX = 0, then n is an G-minimal
element of X. Else n H X is a non-empty subset of the well-founded set n, and therefore
nf] X has an G~minimal element m. We claim that m is an G-minimal element of X also.
Indeed, since m G n and n is transitive, we have m <^ n, and hence mriX = nirinr]X = 0.

(d) Let Y := {x G X : X = V 3y £ X {x = S{y))}, so that a non-empty element
of X belongs to Y if and only if it is the successor of some element of X. Then clearly
G Y. Moreover, if x £ Y, then clearly S{x) is the successor of some element of X, and
hence S{x) G Y also.

As above, we see that {n G a; : n = V 3m E w (n = S{m))} is an inductive subset of u, and
hence is equal to u.

We can now see that uj, S do indeed satisfy Peano's axioms:
I. £ CO, because u is inductive.
II. If n G w, then S{n) G to, because u is inductive.

III. / S{n) for any n, because is the empty set, whereas n £ S{n).

IV. If S{n) = S{m), then n = m, by Lemmas 3.2.2[ d) and 3.2.3| ^b).
V. The Principle of Mathematical Induction holds, by Theorem [3.1.2

Now let's investigate the order relation on the set of natural numbers. Define a binary
relation < on a; as follows:

m < n if and only if m £ n

As usual, "n < m" means ^'n = m V n < m".

Observe that 0<1<2<3<..., because 0g1g2g3g

Theorem 3.2.4 (oj, £) is a transitive strict well-ordered set.

□

We already know that cj is a transitive well-founded set, and that the same is true for
any n £ uj. The proofs of the remaining assertions of this theorem are given in the following
exercise:

Exercise 3.2.5 (a) Show that < n for all n £ uj.

[Hint: Let X := {n £ uj : < n} and show that X is an inductive subset of uj.]

(b) Show that for all m,n £ uj we have m < n + 1 if and only if m < n.

(c) Show that for all m,n £ uj, we have m < n if and only if m + 1 < n.
[Hint: Let X := {n £ uj : Vm £ uj (m < n — )• m + 1 < n)}.]

(d) Show that the relation < is transitive: li k < m and m < n, then k < m.

Set Theory

40

(e) Show that the relation < is asymmetric: li n < m, then m ^ n.

(f ) Show that the relation < is connected: For any n, m, either n < m oi n = m oi m < n.
[Hint: Show, with the aid of (a)-(c) that the set {n e oj : Vm e oj {n < m V n =
m V m < n} is inductive.]

(g) Show that if X C a; is non-empty, then X has a <-least element.

□

We thus see that w is well-ordered by the G-relation. Furthermore, if n G w, then n C w,
since cj is transitive. Since every subset of a well-ordered set is also well-ordered, we see that
each n G w is also well-ordered by G. Thus if either x e oj ot x = oj, then x has the following
properties:

(i) X is transitive.

(ii) The G-relation well-orders x.

This means that every n E u, as well as oj itself, is an ordinal: An ordinal is a transitive set
which is well-ordered by the G-relation. So we have ordinals

0, 1,2,3, ...,a;

But now it follows easily that ^(t^'), SS{u}), SSS{lu), . . . are transitive and well-ordered by G
also. This yields transfinite ordinals

u + l,uj + 2,uj + 3, . . .

We will examine ordinals more closely in the next chapter, where we shall see that every
well-ordering is order-isomorphic to a unique ordinal.

We remark that assuming the Axiom of Foundation, the G-relation is a well-ordering as
soon as it is a linear ordering. However, we have not used Foundation in this chapter.

3.3 Recursion

In practically all branches of mathematics there are instances of definitions by induction.
Typically, one is given an initial object xq and then a rule to transform a given object Xn into
a new object Xn+i- The following theorem shows that our definition of the natural numbers
allows for such definitions:

Theorem 3.3.1 (Recursion Theorem — Version I)

Let f : X X be a function, and let xq E X. Then there exists a unique function g : uj ^ X
with the properties that

3(0) = xo g{n + 1) = f{g{n)) (*)

□

The idea is that ^(0) = xo, xi = g{l) = f{g{0)) = f{xo), X2 = g(2) = f{g{l)) = f{xi),

etc.

The proof of the above theorem is given by the next exercise:

41

Natural Numbers

Exercise 3.3.2 (a) First prove uniqueness: If gi : uj ^ X and , g2 '■ uj ^ X are two functions
which both satisfy (*) of Theorem 3.3.1, then gi = g2-

(b) Let be the set of ah subsets of F C x X with the following properties:

(i) (0,xo) G F;

(ii) If (n, x) G F, then (n + 1, f{x)) G F.

Let (7 := f^J-- Explain why g is a set, and why g £ J^.

(c) Since g is a set of ordered pairs it is a binary relation. Show that dom(g) = oo.

(d) It remains to show that the relation g is a function, i.e. that if (n,x), (n, y) G g, then
X = y. Define

S := {n £ oj : Vx, y £ X {{n, x) £ g A {n, y) £ g ^ x = y)}
We must show that S = oj. First show that £ S.

[Hint: Suppose that (0, y) £ g for some y ^ xq. Show that G — {(0, y)} G J^.]

(e) Finally, show that S = u.

[Hint: Show that S is inductive. If n G 5, there is a unique x such that (n, x) G G, and
hence certainly (n + 1, /(x)) G G. Suppose that {n + l,y) £ G for some y / x. Show that
G-{{n + l,y)}£T]

□

In some definitions by induction, one needs not just the current value Xn to define Xn+i,
but also earlier values Xm for m < n. For example, the Fibonacci sequence is defined by

Xq = Xi = 1 Xn+l = Xn-l + Xn for n > 1

In general, we may want to define function of xq, xi, . . . , x^j. The way to do this is

as follows: Given a set X, let

X^'^ := {h : h is a function with dom(/i) G uj and ran(/i) C X}

It is not hard to see that X^^ is a set, as each h £ X^^ is a subset oi u x X. If /i G X^^ has
dom(/i) = n, and h{k) = £ X for k < n, then we indicate this state of affairs by writing

h = {ho, . . . ,/i„_i]

i.e. we may think of an element of X'^^ as being finite sequence of elements of X. This is
merely a temporary notation to make the discussion that follows easier to understand. Note
that the empty sequence | ] := G X'^^.

Now let / : X^^ X he a function. Operating on an intuitive level, we can inductively
define a sequence

2^0 = /(ID Xi = f{lxoj) X2 = /([xo,xiI) . . . X„+i = /(|xo,xi . . . ,x„])
If we (still operating intuitively) define a function g : uj ^ X : n ^ Xn, then we have

g(0)=xo = /([l) = /(<7fO)

Set Theory

42

because g \ = = I}.
Next,

<7(l) = xi = /(Ixol) = /(gri)

because 5 f 1 is the finite sequence 1(7(0)] = |xol-
Then,

5(2) = X2 = /([xo,xil) = /(5 r2)

because g \ 2 = lg{0),g{l)} = [xcxi], etc.

The fohowing theorem states that the above construction can be accomphshed within the
confines of ZF~ :

Theorem 3.3.3 (Recursion Theorem — Version II)

For any set X and any function f : X"^^ ^ X there is a unique function g : uj ^ X such
that

g{n) = f{g\n) = f{{g{d),g[l),...,g{n -1)1) for all n ^ uj
Proof: Uniqueness of g follows easily by induction. To prove that g exists, we use Theorem

3.3. 1| First, define F : X<'^ X<'^ by F{h) := /i U {(dom(/i), /(/i))}. This means that
F([xo, . . . , x„_i|) := {xo, . . . , Xn-i, filxo, • • • , a;n-il)l

By Theorem 3.3.1 there is a unique G : uj ^ X^^ such that

G(0) = I1 G(n + l) = F(G(n))

Thus G(n + 1) = G{n) U {(n, f{G{n)))}. Observe that each G{n) is a function G{n) : n — t- X,
and that G{n + 1) f n = G{n). Now let g := Uran(G) = [j^^^ G{n). Then g : co ^ X, and
g \ n = G{n) for all n £ uj. Hence g{n) = G{n + l)(n) = f{G{n)) = f{g \ n), as required.

Now that we've constructed the natural numbers, it is possible to define the operations
of addition and multiplication on the set of natural numbers by recursion. However, we will
leave this till a later chapter, where we shall define these operations on the class of all ordinals.
We can also:

• Construct the integers Z as a set of equivalence classes of ordered pairs of natural
numbers.

• Construct the rational numbers Q as a set of equivalence classes of integers.

• Construct the real numbers M, as the set of Dedekind cuts {L, U), where L, U are non-
empty subsets of Q with the property that U has no minimum element, and for all / S L
and u G U we have / < u.

• Construct the complex numbers C as the set of ordered pairs of real numbers.

Of course, we must also define all the usual operations and relations on these sets as well. All
this can be done, but we won't pursue it in this course. Try Edmund Landau's Foundations
of Analysis for a mercilessly rigorous approach.

43

Natural Numbers

3.4 Finite and Infinite Sets

Observe that if n G u, then n = {m : m < n}, because < is G. Further observe that if
m,n G u, then m < n if and only if m C n, because G is transitive on the set of natural
numbers.

Definition 3.4.1 Suppose that n G A set X is said to have n elements if and only if there
is a bijection from X to the set n. In that case we say that the cardinality of X is n, and
write |X| = n. A set X is said to be finite if there is n G a; such that \X\ = n. A set is said
to be infinite if and only if it is not finite.

□

Thus a set X has n elements if it can be indexed by the set n, i.e. can be written
X := {xi : i G n}. Since the G is <, this can also be written as X := {xi : i < n}. This state
of affairs may also be indicated by writing X := {xq, xi, . . . , Xn-i}- Clearly each n G a; is a
finite set, and |n| = n.

Tlieorem 3.4.2 (a) The notion of "number of elements of a finite set" is well-defined: If
\X\ = n and \X\ = m, then n = m.

(h) A subset of a finite set is finite: If X is finite, and y C X. Then Y is finite. In addition,
\Y\ < \X\.

(c) The image of a finite set is finite: If X is finite and f is a function, then f[X] is finite.

(d) The union of a finite family of finite sets is finite.

(e) The power set of a finite set is finite.

(f) A set X is infinite if and only if for every n £ oj there is an injection f : n ^ X.

(g) (Using AC) A set X is infinite if and only if there is an injection f : oj ^ X .

□

The proofs of the above assertions are contained in the exercises that follow. First, how-
ever, we state and prove a Lemma:

Lemma 3.4.3 Let n £ uj and let X be a proper subset of n. Then there is no bijection
f : X n.

Proof: Let m G a; be least such that there a bijection / : m X onto a proper subset

X ^ m. Then m > (because = has no proper subsets), and hence by Lemma 3.2.3 'd)
there \s n £ uj such that m = n + 1.

Now consider two cases:
Case 1: If n G X, there is k < n such that f{k) = n. Let g : n + 1 n + 1 be the bijection
which has g{k) = n, g{n) = k, and which has g{m) = m for all m ^ {k,n}. (Thus g is the
permutation which swaps k and n, and leaves all other elements of n + 1 unmoved.) Then
f '■= f ° 9 ^ bijection f : n + 1 >^ X with f{n) = n. It is easy to see that / \ n maps n

Set Theory

44

onto X — {n}, a proper subset of n. But this contradicts the minimahty of m = n + 1.
Case 2: li n ^ X, then X C n. It is now easy to see that / \ n maps n onto X — a
proper subset of n — again contradicting the minimahty of m.

-\

Exercise 3.4.4 (a) Prove Theorem 3.4.2[ a): Show that if \X\ = n and \X\ = m, then
n = m.

[Hint: First show that if n 7^ m, then either n ^ m or else m ^ n. Now use Lemma

3.4.3

(b) Show (without using the Axiom of Choice) that the foUowing are equivalent for n,m £ ui:

(i) n < m.

(ii) There is an injection n ^ m.

(iii) There is a surjection m ^ n.

[Hint: (ii) implies (i): If / : n ^ m and m < n, then / is a bijection from n onto
ran/ ^ n.

(iii) implies (ii): If g : m ^ n, one can define a right inverse f : n ^ m without using
AC]

(c) Suppose that X, Y are finite sets. Show (without using AC) that the following are equiv-
alent:

(i) 1^1 < \y\-

(ii) There is an injection f : X Y.

(iii) There is a surjection g lY ^ X.

[Hint: Use (b).]

(d) Prove Theorem 3.4.2K b): Show that if X is finite and Y C X, then Y is finite, and
\Y\ < \X\.

(e) Prove Theorem 3.4.2 c): Suppose that X is a finite set, and that / is a function. Show
that f[X]\ is a finite set, and that |/[^]| < |^|.

□

Exercise 3.4.5 (a) Suppose that X, Y are finite sets. Show that X UY is finite.

[Hint: Recall that we haven't defined addition on the natural numbers, only the successor
function. Suppose that m is least for which there are finite sets X, Y with |X| = m and
such that X UY infinite. Show that m = n + 1 for some n. Let x €z X, and consider
X' ■.= X - {x} and Y' ■.= YU {x}.]

(b) Prove Theorem [3A2](d) : Show that if X = {Xq, Xi, . . . , Xn-i} is a finite family of finite
sets, then |J = |Jfc=o -^k is finite.

(c) Prove Theorem |3.4.2[ e): Show that if X is finite, then V{X) is finite.

[Hint: Induction on \X\. Observe that if x G X, then V{X) = XVjy, where X := {Y (1
X ■ X ^Y] a.ndy := {Y X : X e Y].]

45

Natural Numbers

□

Exercise 3.4.6 Prove Theorem 3.4.2[ f): Show that a set X is infinite if and only if there is
an injection f : n ^ X for every n £ u. Conclude that every inductive set is infinite.

□

A set X is said to be Dedekind infinite if there is a bijection from X onto a proper subset
of X, and Dedekind finite otherwise.

Exercise 3.4.7 (a) Show that every finite set is Dedekind finite, and that every Dedekind
infinite set is infinite.

(b) Prove (without using AC) that a set X is Dedekind infinite if and only if there is an
injection uj X.

[Hint: Suppose that Y X is a proper subset of X, and that f : X Y is an
injection. Let xq £ X — Y, and define g : u) ^ X hy

g{0) = xo g{n + l) = figin))

If g : uj ^ X, consider f : X ^ X hy

fix)

where S is the successor function.

X if X ^ ian{g)
g{S{g'^{x))) ifxGran(5)

(c) Prove (with the aid of AC) that a set X is (in)finite if and only if it is Dedekind (in)finite.
Conclude that Theorem 3.4.2[ g) holds.

[Hint: Suppose that X is the set of all non-empty subsets of X, and that F : X ^ [JX
is a choice function. Consider g : uj ^ X defined by

g{n) := F{X — ran(g \ n))

and use Theorem [3X2t f).]

□

Set Theory 46

Chapter 4

Ordinals

4.1 Well— Orderings

Let (X, <) be a totally ordered set. If x £ X, then we define the initial segment given by x
by:

X{x) := {y £ X : y < x}

Thus X{x) is the set of all elements that are strictly below x. Clearly X{x) is a totally
ordered set also (with the relation <f X(x) x X{x) inherited from X).

Lemma 4.1.1 Suppose that (X, <) is a total ordering. Define X := {X{x) : x G X}. Then
the map f : X ^ X : x ^ X{x) is an order-isomorphism from (X, <) to {X ^ C).

Proof: It suffices to show that x < y if and only if X{x) C X{y). It is obvious from the
transitivity of < that X[x) C X{y) whenever x < y. Conversely, suppose that X{x) C X{y).
If X ^ y, then y < x, so y G X{x), and hence y G X(y) — contradiction.

H

When two partial orderings {X, <) and (Y, -<) are order-isomorphic, we say that they
have the same order-type.

Recall that a total ordering (W, <) is called a well-ordering if every non-empty subset
of X has a <-least element. Clearly, the set of natural numbers, together with its usual
ordering, is a well-ordering.

The following lemma gives two methods for constructing well-orderings on products. It
will prove useful when we discuss the arithmetic of ordinal and cardinal numbers.

Lemma 4.1.2 (a) Suppose that {X, <) and {Y, -<) are strict partially (totally, well-) ordered
sets. Define the lexicographic ordering C on X x Y by

(xi,yi) C (x2,y2) xi < X2 V (xi = X2 Ayi-<y2)

Then \Z is a strict partial(totally, well-) ordering on X x Y .

^Lexicography: The editing or making of a dictionary.
In a dictionary, words are ordered lexicographically. To decide which word occurs first in the dictionary, you
look at their first letters. If these are the same, you look at their second letters. . .

47

Set Theory

48

(b) Suppose that {X,<) is a strict well-ordered set. Define the canonical ordering C on the
product X X X by

max{x,y} < max{x',y'}
max{x, y} = max{x', y'} and x < x
max{x, y} = maxja;', y'}, x = x\ and y < y'

Then \Z is a strict well-ordering on X x X .

{x,y) C {x',y')

or
or

□

Exercise 4.1.3 Consider (w, <), the set of natural numbers equipped with its usual order
relation.

(a) Write down the first ten elements in the lexicographic ordering \Zi of oj x uj. Hence show
that (w X w, C;) is a well-ordering, but not order-isomorphic to (w, <).

[Hint: Observe that the element (1,0) is not the successor of any {n,m) & uj x oj.]

(b) Prove Lemma 4.1.2| ^a).

(c) Write down the first ten elements in the canonical ordering Cc of u x uj. Hence show that
(a; X UJ, Cc) is order-isomorphic to {uj, <), and thus a well-ordered set.

(d) Prove Lemma 4.1.2[ b).

□

Exercise 4.1.4 (a) Show that a finite total ordering is a well-ordering.

[Hint: You must show that if {X, <) is a finite total ordering, and if y is a non-empty
subset of X, then Y has a least element. Do this by induction on \Y\.]

(b) Here is a useful characterization of well-ordered sets: A total ordering (X, <) a well-
ordering if and only if there is no sequence (xn) of elements of X such that xi)x2 > xs >
.... Prove this (using AC).

[Hint: One direction is easy. For the other, suppose that Y C X has no least element,
and let F be a choice function for family the non-empty subsets of Y. Define g : uj ^ Y
recursively by g{n) = F{Y - {g{0), g{l), . . .,g{n- 1)}).]

□

It is precisely well-ordering that allows arguments by induction:

Theorem 4.1.5 (Induction on a Well-Ordering) Let (W, <) be a well-ordered set. Let E C
W be a set with the following properties.

(i) The least element ofW is a member of E.

(a) For all X G X, ifW{x) C E, then x G E.

Then E = W.

49

Ordinals

Proof: Else let wo be the least element oi W — E. Then W{wo) C E, so wq £ W —

H

In preparation for the definition of the transfinite ordinals, we will now begin our inves-
tigation of the structure of well-ordered sets. Most of the results that follow in this section
are due to Cantor.

Theorem 4.1.6 Let {W,<) be a well-ordered set.

(a) Let f : W ^ W be a strict order-preserving map. Then x < f{x) for all x £ W.

(b) Idw is the only order-automorphism from W to W.

Proof: (a) Let E := {x £ W : x ^ f{^)}- If E ^ 0, then E has a least element xq.
Since xq ^ /(xq), we have /(xq) < xq, and hence /(/(xq)) < f{xo). But then /(xq) € E,
contradicting the fact that xq is the least element of E. Hence E = 0.

(b) If / is an order-preserving automorphism, then both the maps /, f~^ are strict order-
preserving. Hence x < /(x) and /(x) < /^^(/(x)).

H

Corollary 4.1.7 If (W, <) and {V, :<) are order-is amorphic well-ordered sets, then there is
a unique order-isomorphism from W onto V.

□

Corollary 4.1.8 No well-ordering (W, <) is order-is amorphic to an initial segment W{wo)
of itself.

□

Theorem 4.1.9 Let (W, <) and {V, :<) be well-ordered sets. Then exactly one of the following
is the case:

(a) W,V are order-is amorphic.

(b) W is order-isomarphic to an initial segment of V , i.e. there is vq G V such that W is
order-is amorphic to V{vo).

(c) V is order-isomarphic to an initial segment ofW, i.e. there isw^^W such that W{wq)
is order-is amorphic to V .

Proof: Define a set of ordered pairs / by

/ := {{w,v) £ W X V : W{w) is order-isomorphic to V{v)}

Observe first that / is a function: For suppose that {w, vi) and {w, V2) both belong to /, where
vi / f2- Without loss of generality, we may assume that vi -< V2. Then V{vi) and ^(^2) are
order-isomorphic (as each is order-isomorphic to W{w)). But then, if h : ^(^2) — ?• V{vi) is

an order-isomorphism, we must have h{vi) -< vi, which contradicts Theorem 4.1.6

Set Theory

50

Next observe that / is a strict order-preserving map: For suppose that {wi,vi), {w2, V2) €
/ are such that wi < W2- We want to show that vi -< V2- Now there is an order-isomorphism
h : W{w2) Viv2), and this (when restricted) yields an order-isomorphism / f W{wi) :
W{wi) — >■ V{h{wi)). Hence {wi,h{wi)) G /, so that h{wi) = vi. But as h{wi) G V{v2), we
see that vi -< V2, as required.

In particular, it follows that / is one-to-one.

Observe that dom(/) is downwards-closed: If wi < W2 and W2 G dom(/), then wi G
dom(/) as well. Indeed, there isv2&V such that W{w2) is order-isomorphic to V{v2)- Since
wi G W{w2), there must be vi G ^(^2) such that W{wi) is order-isomorphic to ^(^1), so
that wi G dom(/).

In the same way, it can be shown that ran(/) is downwards-closed.

Now we distinguish three cases:

Case 1: dom/ ^ W: In that case, let wq be the <-lcast element of W — dom/. We claim
that / is an order-isomorphism from W{wo) onto V. Now because dom(/) is downwards-
closed, there is no w G dom(/) such that u'o < w. Hence dom(/) = W{wo). If / is not sur-
jective, let vq be the ^-least element ofV — ran(/). Then clearly W{wo) is order-isomorphic
to V{vo) — contradicting the fact that wq ^ dom(/). Hence / is a strictly order-preserving
map with dom(/) = W{wo) and ran(/) = V.

Case 2: ran/ ^ V: In that case, let vq be the -< -least element of F — ran(/). Arguing as
in Case 1, it follows that / is an order-isomorphism from W onto V{vO).

Case 3: dom/ = W and ran/ = V. In that case, / is an order-isomorphism of W onto V.

H

4.2 Ordinals

Recall that the set uj of natural numbers was shown to have the following properties:

(i) a; is a transitive set: Vn (ra G a; n C w).

(ii) UJ is well-ordered by the G-rclation: The relation < on u defined by n < m <^=^ n £ m
is a strict total-ordering on lo, and every non-empty subset of u has a <-least element.

Furthermore, we saw that every n E lo has precisely the same properties: Each n G a; is
transitive, and well-ordered by G.

This suggests a definition, due to John von Neumann:

Definition 4.2.1 A set is said to be an ordinal if and only if it is transitive and well-ordered
by G.

□

Thus UJ is an ordinal, as is each n e uj.

Exercise 4.2.2 Recall that n = {m : m < n} for all n G uj. Thus uj{n) = n for all n G w,
where uj{n) is the initial segment of uj determined by n. In fact, this property characterizes
the ordinals: Show that a strict well-ordering {W, <) is an ordinal if and only if W{x) = x
for all X gW.

[Hint: (<^): v < w iS v £ W{w) iSvGw. li v £w £W, then v G W{w) C W.]

51

Ordinals

□

We shall use the first few letters of the Greek alphabet to denote generic ordinals. Since
G is a strict order relation, we shall also write a < (3 when a G (3.

Observe that if a is a non-zero ordinal and xq is the first (i.e. the least) element of a,
then xq = = 0. For if x G xq, then x G a (by transitivity of a), and x < xq (because < is
G), which contradicts the fact that xq is the least element of a. Suppose now that xi is the
next element in a, assuming it has one, i.e. xi is the least element in a which is > xq. Then

X G xi <^=^> X < xi <;=^ X = Xo

and hence xi = {xq} = {0} = 1. Now if X2 is the next element of a (again, assuming there is
one), then

X G X2 <^=^ X < X2 <^=^ X = Xo V X = Xl

and so X2 = {xo, xi} = {0, 1} = 2.

It thus appears that each ordinal a starts 0<1<2<3<.... Each of these elements is
itself an ordinal.

In Remarks 4.2.13, we will explain that the existence of an inductive set is equivalent to
the existence of an infinite set. Thus keep the Axiom of Infinity in mind while reading the
proofs of the next few propositions and theorems, and observe that these proofs do not rely
on this axiom.

Proposition 4.2.3 // a is an ordinal and 13 £ a, then /3 is an ordinal.

Proof: Suppose that a is an ordinal, and that (3 G a. We first show that f3 is transitive,
i.e. that if 7 G (3, then 7 C Now if 7 G /3 and 5 G 7, then (by transitivity of a, we have
S,^, 13 G a. Since < is G on a, we have 5 < 7 < /?, and since < is transitive, we have 6 < /3,
i.e. (5 G /3. Thus 7 C /3.

H

Now observe that every ordinal is a transitive wellfounded set: Every non-empty subset
of an ordinal has a least element, which must be an G-minimal element. If X, Y are transitive
wellfounded sets, then a G-isomorphism from X to y is defined to be a bijective map vr :
X ^ Y with the property that

Xl G X2 <^=^ vr(xi) G 7r(x2) (xi,X2GX)

Note that if a, f3 are ordinals, then an G-isomorphism from ct to /3 is precisely an order-
isomorphism.

Also observe that if vr : X — )■ y is an G-isomorphism between transitive wellfounded sets,
and X G X, then 7r(x) = {vr(x') : x' G x} = 7r[x]: Indeed, if y' G vr(x), then, there is some
x' G X such that 7r(x') = y' . It follows that 7r(x') G vr(x), and hence that x' G x.

The following proposition will be useful.

Proposition 4.2.4 Suppose that X,Y are transitive well-founded sets, and that tt : X ^ Y
is an G-isomorphism. Then 7r(x) = x for all x G X . Hence X = Y .

Set Theory

52

Proof: If {x £ X : 7r{x) 7^ x} is non-empty, it has a 6-minimal element xq. Now if x £ xq,
then also x £ X, and hence 7t{x) = x. Thus

7r(xo) = {vr(x) : x G xq} = {x : x G xq} = xq

— contradiction. Hence {x E X : 7r(x) 7^ x} = 0.

H

Theorem 4.2.5 // a, /3 are ordinals, then exactly one of the following is the case:
(i) a = f3, or
(a) a £ 13, or
(Hi) P £ a.

Proof: a, /3 are both transitive sets well-ordered by G, and hence wellfounded. Hence at
most one of the above alternatives can be the case.

By Theorem 4.1.9|either (i) a, (3 are order-isomorphic, or (ii) a is isomorphic to an initial

segment of (3, or (iii) /3 is order-isomorphic to an initial segment of a.

Suppose that a is isomorphic to an initial segment of (3, i.e. that there is f3o £ (3 and an
order-isomorphism vr : a /3(/3o)- Observe that

/3(/3o) := {7 G /3 : 7 < /?o} = {7 G /3 : 7 e /3o} = /3n /3o = /3o
because /3q C /3. Thus vr is an G-isomorphism from the transitive wellfounded set a onto the

transitive wellfounded set /3o- By Proposition 4.2 A we have a = /3o, and thus a £ f3.

The remaining two cases can be dealt with in a similar fashion.

H

The next corollary follows immediately:
Corollary 4.2.6 If a, f3 are ordinals, then a £ f3 if and only if a'^ /3.

□

We denote the class of all ordinals by

ON := {a : Q is an ordinal}
We shall soon see that ON is a proper class, i.e. not a set. For the moment, however, note

that ON is a class which is well-ordered by £: By Theorem 4.2.5 G is a strict linear ordering
on ON. To see that it is a well-ordering, suppose that Y is a non-empty subclass of ON,
and let a G Y. If a H Y = 0, then a is the least element of Y. Else, a H Y is a non-empty
subset of the ordinal a, and hence has a least element (3. That /3 is easily seen to be the least
element of Y. Thus every non-empty subclass of ON has a least element.
Here are some additional facts about ordinals that are easy to establish:

Proposition 4.2.7 (a) If a is an ordinal, then a = {{3 : (3 < a}.

53

Ordinals

(b) If a, (3 are ordinals, then exactly one of the following is the case: either a is an initial
segment of f3, or a = 13, or f3 is an initial segment of a.

(c) If C is a non-empty class off ordinals, then {^C is an ordinal. In fact, [^C is the least
element of C (i.e. H C S C and H ^ = inf C ).

(d) If C is a set of ordinals, then \JC is an ordinal. Furthermore, [jC = supC.

(e) If a is an ordinal, so is S{a) := aL){a}. Furthermore, S{a) is the successor of a, i.e.
the least ordinal which is > a.

□

Exercise 4.2.8 Prove the preceding proposition.

□

Since S{a) is the successor of a we write a + 1 := S{a). An ordinal a is said to be a
successor ordinal if an only if there is /3 such that a = (3 + 1. An ordinal which is not a
successor ordinal is called a limit ordinal. Observe that is a limit ordinal, as is uj. Indeed,
uj is the smallest non-zero limit ordinal: For if a < a; is a non-zero ordinal, it is a non-
zero natural number, and hence a successor ordinal. Furthermore, the successor of a natural
number is a natural number, so ui is not a successor ordinal.

Exercise 4.2.9 Suppose that ao < ai < 02 is a strictly increasing sequence of ordinals.
Show that sup„^^ a„ is a limit ordinal.

□

Exercise 4.2.10 If a is a limit ordinal, then a = sup{/3 : f3 < a}. If a is a limit ordinal,
then sup a < a.

□

Exercise 4.2.11 Show that the class ON of all ordinals is a proper class, i.e. not a set.
[Hint: If ON is a set, then by Proposition 4.2.7, sup ON is an ordinal.]

□

The result of the previous exercise — that the collection of ordinals is too big to be a set
— is known as the Burali-Forti paradox, after Burali-Forti, who published it in 1897. It is
probably the earliest of the paradoxes that set theory had to contend with in its initial years.

The Axiom of Replacement has not yet been needed in our development of set theory, but

Theorem 4.2.12 Every well-ordered set is isomorphic to a unique ordinal.

Set Theory

54

Proof: Let {W, <) be an arbitrary well-ordered set. Define a binary relation F by

{w,a) G F

w

£ W and a G ON and W{w) is order-isomorphic to a.

Then by Theorem 4.2.5| -F is a class function. By Theorem |4.1.9 F is injective. Moreover, it
is easy to see that F is strictly order-preserving.

Next, we show that if w G W, then is an ordinal. (We are not claiming that

W{'w) C dom(F).): Firstly, since W{w) is a set, so is F[Vl^(u))], by the Axiom of Replacement.
Now 7^ ON, because ON is not a set (by Exercise 4.2.11). If 7 is the least ordinal

which is not an element of F[W{w)], then certainly F[VF(t(;)] = {j3 e ON : /3 < 7} = 7.

In the same way it can be shown that is an ordinal.

Now we demonstrate that dom(F) = W: For else W — dom(F) has a least element w.
Then ^ := F[VF(tt;)] is an ordinal, and F : W{w) — )• ^ is a order-preserving bijection, and
hence an order-isomorphism. Hence {w, ^) S -F — contradiction.

Thus if we define a := F[W], then F is an order-isomorphism from W onto a. Hence
every well-ordered set is order-isomorphic to some ordinal. That ordinal must be unique.

because no two distinct ordinals are order-isomorphic (by Theorem 4.1.9 and Proposition

Remarks 4.2.13 Recall that the Axiom of Infinity asserted: There is an inductive set. The
set oj of natural numbers is then the smallest inductive set. A set X was defined to be finite
if it can be placed into bijective correspondence with some n £ ui, i.e. if \X\ = n for some
n £ uj. A set was said to be infinite if it is not finite. We showed that a set X is infinite if
and only if for every n G uj there is an injection n ^ X.

We now want to show that, in the theory ZFC 0-5 + Replacement, the existence of an
inductive set is equivalent to the existence of an infinite set. First suppose that there exists
an inductive set X. It follows that u is a set (namely the intersection of the non-empty class
of all inductive sets). li n G to then n C w, so that there is an obvious injection n ^ to. Hence
OJ is infinite. This proves that if there is an inductive set, then there is an infinite set.

Conversely, suppose that there is an infinite set X. Certainly every natural number is
an ordinal. But it is, at this stage, conceivable that ON is the class of all natural numbers,
i.e. that u = ON is a proper class. (Obviously, if ON contains some ordinal a > uj, then
OJ G ON as well, by transitivity, and hence u would be a set. So if a; is a proper class, then
OJ = ON.) Let X := {Y C. X '.Y is finite}. By Power Set and Separation, Af is a set. Define
a binary relation F by

(Y,n) e F <?=^ n£ujA\Y\=n

Then F is clearly a class function (i.e. if {Y,ni) G F and {Y,n2) G F, then m = 712). By
the Axiom of Replacement, F[J\!] is a set. We claim that F[X] = uj. If not, there is a least
m G OJ such that m F[Af], because oj = ON is a well-ordered class. Now m 7^ 0, since
£ X. Thus m = n + 1 for some n £ oj. Hence n G -F[Af], i.e. there is y C X such that
\Y\ = n. But since X is infinite, we cannot have y = X, and thus there is x £ X — Y . Then
|yu{x}| = n + l, so 71 + 1 G F[X] — contradiction. Thus F[X] = is a set, i.e. an inductive
set exists.

□

55

Ordinals

Remarks 4.2.14 Observe that since a; is an ordinal, we may inductively define new ordinals

CO + n (for n G w) by

oj + := iu CO + {n + I) = S{uj + n)

The class uU {u + n : n € uj} is clearly transitive an well-ordered by €, and thus ought to be
an ordinal: call it a; + The problem is that the Axioms ZFC 0-7 are not strong enough to
prove that {a; + n : n G a;} is a set. It is, at this stage, conceivable that u + oj = ON is the
class of all ordinals.

Consider, however, the formula (p{n, y) which asserts that n € a; and that y = u + n.
Clearly ip(n,y) defines a function F, in the sense that if (p{n,y\) and ip{n,y2), then yi = 7/2-
By the Axiom of Replacement, F[uj] = {y : 3n E co {(p(n, y)} is a set, i.e. {cj + n : n G w} is a
set. Then a; + a; := a; U {a; + n : n G is a set as well. Clearly the ordinal a; + a; is a limit
ordinal.

□

4.3 Transfinite Induction and Recursion

Induction is a major tool for proving results about natural numbers. Recursion is a major
tool for constructing functions whose domain is the set of natural numbers. The aim of this
section is to extend these tools to the class ON.

The Induction Principle for the natural numbers reads as follows: If X C w is such that
(i) G X, and (ii) n e X ^ n + 1 e X, then X = uj.

Each non-zero natural number is a successor ordinal. In general, however, there are also limit
ordinals that need to be taken care of. Hence:

Theorem 4.3.1 (Transfinite Induction) Let X. be a class of ordinals. Suppose that

(i) G X;

(ii) If a then it follows that a + 1 G X;

(Hi) If a is a limit ordinal and ^ G X for all (3 < a, then it follows that a G X.
Then X = ON.

Proof: If X / ON, there is a least ordinal a such that a X. By (i), a / 0. By (ii), a is
not a successor ordinal. By (iii), a is not a limit ordinal — contradiction, as a limit ordinal
is, by definition, an ordinal which is not a successor ordinal.

H

Exercise 4.3.2 Let X be a class of ordinals. Suppose that Va G ON (a C X — >■ a G X).
Show that X = ON.

□

Set Theory

56

Next, we discuss transfinite recursion. Recall that our second version of the Recursion
Theorem for natural numbers read as follows: If / : X^'^ — )■ X is a function (where X"^^ is
the set of all finite sequences in X, i.e. the set of all function from a natural number to the
set X), then there is a unique function g : oj ^ X such that

g{n) = f{g\n) = f{[g{Q),...,g{n-l)})

where |xo, . . • , ic^-i] is the function from the set n := {0, 1, . . . , n — 1} to the set X given by
k ^ Xk. In particular, ^(0) = /(0), ^(l) = /([<?(0)1), g{2) = f{\g{Q),g{l)}), etc. Each g{n)
is a function of the previously defined values ^(O), . . . , g(n — 1).

To extend this result to the transfinite realm, we introduce similar notation and terminol-
ogy: If a is an ordinal, then an a-sequence (or transfinite sequence of length q) is a function
with domain the set a = : ^ < a}. We write:

\x^ : ^ < aj is the function with domain the set a = : ^ < a} given by ^ i-^

Given a sequence s := [x^ : ^ < a] and a set x, we obtain the sequence s U {{a,x)} with
domain a + 1, and denote it by {x^ : < aj^x.

We also write : G ON] for a class function on ON.

Suppose now that -F is a class function, defined on the class V^*^'^ of all transfinite
sequence^ i.e. F assigns to each transfinite sequence of sets [x^ : < a] a set -F(|x^ : ^ < a]).
We can then create a "sequence" |x^ : ^ € ON] recursively as follows:

Xa := F{lx^ : C < a])

so that Xa is defined as a function of previously defined values x^, (^ < a). Thus:

Xo:=F{0), xi:=F([xo]), X2 := F([xo, xi]),

This shows how to define x„ recursively for all n < a; — and we knew how to do that already.
But now we continue: By the Axiom of Replacement, {xn : n < cj} is a set, and hence the
function |x„ : n < ujj exists (i.e. is a set). We thus have

Xui ■■= F{lxn ■■ n < u)}), x^+1 := F([x„ : n < a;]^X(^), x^+2 := F(|x„ : n < wJ^x^^Xo^+i),

In general, if q G ON is such that xg has already been defined for all ^ < a, then by the
Axiom of Replacement, the function [x^ : ^ < a] exists (i.e. is a set), and we may define
Xa :=F(|x^ :^<a]).

In this way we build up a class function G : ON — )• V : a i— )• x^

Theorem 4.3.3 (Transfinite Recursion) Let F be a class function (on VJ. Then there is a
unique function G on ON such that

G{a) = F{G \ a) for all a G ON (*)

Proof: Define a first-order formula ip{a, h) which asserts that a G ON, and /i is a function
h := |x^ : < a} with domain a such that

For all /3 < a, /i(/3) = F{h \ (3), i.e. x,3 = F{lx^ : ? < /3])
^We include the finite sequences in the class of transfinite sequences.

57

Ordinals

We show that tp defines a class function: If both 'ip(a, hi) and ip{a, hold, then hi = h2,
i.e. the function h is unique (assuming it exists). Indeed, suppose that (3q is the least ordinal
< a such that hiiPo) / /i2(/3o). Then /ii(/3o) = : ^ < /3ol) = F(I/i2(0 : ? < /^o) =

In particular, it follows by the Axiom of Replacement that if hp) holds for all /3 < a,
then {hp : /3 < a} is a set.

Now let (p{a,x) = 3h [-(/'(a,/!) Ax = F{h)]. If both (p{a,x) and {p{a,y), hold, then
X = F(h) = y, where h is the uniqe function such that ip{a,h). Hence ip defines a class
function

G{a) = X <;=^ ip{a,x)

It remains to show that dom(G) = ON, i.e. that G{a) is defined for every a G ON. If not,
let ao be the least ordinal such that oq dom(G). Then for each f3 < ao, there is a unique
function hp : fi ^ W such that hp). We now consider two cases:

Case 1: Suppose that ao = /? + 1 is a successor ordinal. In that case there is hp = \x^ : ^ < j3\
such that tp{f3,hp) holds. If we define xp = F{hp), and then define h := : ^ < /SJ^xp,
then clearly V'(q^O) h) holds also, so that oq € dom(G) after all.

Case 2: Suppose that ao is a limit ordinal. By the Axiom of Replacement, {hp : f3 < ao} is a
set. It is easy to show that if ?? < /? < ao, then h^j = hp fa. It follows that h := IJ/3<ao
a function with domain ao. Furthermore, if we define

Xp := h{(3) for /3 < ao

then h = [xg : < ao], and hp = [x^ ■ S, < P} ^ot all /3 < ao- Clearly, therefore, '4j{aQ,h)
holds, and hence ao S dom(G) in this case also.

We have now shown that there exists a class function G : ON — )■ V with the required
property {k). Suppose that G' is another. If G 7^ G', let a be the smallest ordinal such that
G(a) ^ G'{a). Then G f a = G' f a, and hence G(a) = F(G f a) = F{G' \ a) = G'{a) —
contradiction. Hence G = G', i.e. there is a uniqe function with the property (★).

H

4.4 Ordinal Arithmetic
4.4.1 Limits

Definition 4.4.1 (a) A transfinite sequence [7^ : < a] of ordinals is said to be strictly
increasing if

T] < ^ implies 7,, < 7^

(b) If A > is a limit ordinal, then the limit of a strictly increasing sequence [7^ : ^ < A] of
ordinals is defined by

hm 75 := sup{7^ : ^ < A} = M 75

c —

(This limit exists, by Proposition 4.2.7

(c) A strictly increasing sequence [7^ : ^ < a] (or [7^ : C £ ON]) of ordinals is said to be
normal if it is "continuous" , i.e. if for for every limit ordinal A in its domain we have

7a = lim 75 i.e. hm7^ = 71;^^

Set Theory

58

□

Similarly, we say that a class function F : ON — )• ON is normal if it is strictly increasing
and continuous. Of course, any sequence is a function.

Proposition 4.4.2 Suppose that [7^ : ^ G ON] is a normal sequence of ordinals.

(a) 7g > C for all^e ON.

(b ) jx is a limit ordinal if X is a limit ordinal.

(c) The sequence has arbitrarily large fixed points.- For every a G ON there is fi > a such
that = j3.

□

Exercise 4.4.3 Prove Proposition 4.4.2

[Hint for (c): Define /3o := 7Q>/3n+i := Iji^^P ■= 1™ l^n]

□

Definition 4.4.4 (Ordinal Addition) We define a class function + : ON x ON — ON by

transfinite recursion as follows. Let a £ ON.

(i) a + := a.

(a) a + (/3 + 1) := (a + /3) + 1 for successor ordinals (3 + 1.

(Hi) a + B := limfa + £) for limit ordinals /3 > 0.

□

Observe that, for each a G ON, we have defined above a function Sa ■ ON — )• ON : /3 1— )• a+/3.
It is easy to see that each Sa{-) is a normal function (i.e. strictly increasing and continuous).
It is also easy to see that ordinal addition coincides with the usual addition on the set of
natural numbers.

Lemma 4.4.5 For all a,f3,j £ ON, we have

a + (/3 + 7) = (a + /3) + 7

Proof: By induction on 7: For all ordinals a, (3 we have

a+(/3 + 0) = a + /3 = (a + /3) +

Next, if 7 is such that q + (/3 + 7) = (a + /3) + 7 holds for all a, /3, then

a + (/3+(7 + l)) = a + ((/3 + 7) + l) = (a + (/3 + 7)) + 1 = ((a + /3)+7) + l = (a + /3) + (7 + l)

and hence the result holds for successor ordinals 7 + 1.

Finally if 7 is limit, and a + {fi + rf) = (a + /?) + ?? for all r] < ^, then

a + (/3 + 7) = a + lim(/3 + r?) = lim(a + {(3 + 77)) = lim((a + f3) + i]) = {a + (3) + j
and hence the result holds for limit ordinals.

59

Ordinals

H

However, ordinal addition fails to be commutative! For example:

1 + a; = lim (1 + n) = a; u + 1 := S(uj) > u

n<u!

so 1 + UJ ^ UJ + 1.

Consider now uj + 1. This is the least ordinal which is > Hence it is order-isomorphic
to the set of natural numbers, followed by a new maximum element.

Similarly, a; + 2 = (a; + l) + l has another additional element added to the top: It is order-
isomorphic to io followed by two new elements.

Next, consider oj + oj = lim{uj + n). This is the least ordinal which is > cj + n for every n < uj.

n<ui

Clearly, therefore, u + oj \s order-isomorphic to w, followed by another copy of u.
We make this precise.

Suppose that (X, -<x) and {Y, -<y) are disjoint totally ordered sets. We can form their
linear sum (Xuy, -<) by gluing X, Y together in such a way that every element of X precedes
every element of Y . This means that the linear sum of X, Y is order-isomorphic to an ordering
consisting of X followed by Y.

Thus we define ^ on X U y as follows: u^v ^ X {JY then

u ~< V

u,v E X and u -<x v
or u, v G y and u -<y v
or u e X and v eY

Observe that ~< = U -(y U {X xY).

If {X,^x), {Y,^y) are totally ordered sets, but X CiY ^ 0, then we can make them
disjoint by a standard trick. Let X := {0} x X, and define -<x by {0,x) -<x (0,a;') if and
only if X -<x x' . Clearly (X, -<x) and (X, -<^) are order-isomorphic.

Similarly, let Y := {1} x Y and define -<y by (l^y) ~<y O-^v') if '^^^Y V u' ■ Then
X, Y are disjoint and order isomorphic to X, Y.

Observe that the linear sum of X,y is a sort of lexicographic ordering: For i,j G {0, 1} and
u, u G X U y we have:

(z, u) < {j, v) i < j V {i = j /\u <v)

where you have to keep track to which set each < belongs.

Proposition 4.4.6 If a, 15 ^ ON, then a + 15 is order-isomorphic to the linear sum of a and
15 (in that order).

Proof: By induction on (5.

H

Exercise 4.4.7 Let a,/3,7 G ON.

(a) Show that if /3 < 7, then a + /3 < a + 7.

Also find an example where /3 < 7, but /? + a = 7 + a.

Set Theory

60

(b) Show that if a < /?, then there is a unique ordinal 7 so that a + 7 = /?.

Also find an example where a < j3, but the equation x + a = j3 has many solutions x.
[Hint: The set : a < ^ < /3} is well-ordered, so is order-isomorphic to some ordinal 7.]

□

4.4.3 Multiplication

Definition 4.4.8 (Ordinal Multiphcation) We define a class function ■ : ON x ON — > ON
by trans finite recursion as follows. Let a G ON.

(i) a • := 0.

(a) a ■ {P + 1) := a ■ P + a for successor ordinals (3 + 1.
(Hi) a - fi := lim(a • ^) for limit ordinals P > 0.

□

Observe that, for each a G ON, we have defined above a function P„ : ON — t- ON : /? a-/3.
It is easy to see that each -Pa(') is a normal function (i.e. strictly increasing and continuous).
It is also easy to see that • coincides with the usual multiplication on the set of natural
numbers.

Furthermore, ordinal multiplication is associative:
Lemma 4.4.9 For all a,/3,7 G ON, we have

a • (/3 ■ 7) = (a ■ /3) • 7

□

Exercise 4.4.10 Prove the preceding Lemma.

□

Observe that

2 ■ u = lim 2 ■ n = Lo uj-2 = u}-{1 + 1)=uj-1 + uj = u + co

n<ijj

Thus ordinal multiplication also fails to be commutative: 2 ■ u ^ cj ■ 2.

Indeed 2-oj looks like w-many copies of the total ordering 2: Each element of oj is replaced

by a copy of 2.

On the other hand, co ■ 2 looks like 2 copies of co: Each element of 2 is replaced by a copy of

UJ.

We make this precise:

Suppose that {X, -<x) and {Y, -<y) are totally ordered sets. The linear product of these
sets is the set X xY equipped with the reverse lexicographical ordering -<, i.e.

{x,y) ~< {x',y')

y -<y y

or y = y' and x -<x x'

61

Ordinals

Thus, if we keep y €Y fixed, the subset Xy := {{x,y) : x G X} C X x y is a copy of X. We
see furthermore that if y -<y y' , then each element of the copy Xy precedes each element of
the copy Xy' in the reverse lexicographic ordering. Thus, roughly, {X xY,^) consists of "F
copies of X" .

Proposition 4.4.11 If a,P E ON, then a - 13 is order-isomorphic to the linear product of a
and P (in that order).

Proof: By induction on jS.

H

Exercise 4.4.12 Let a,^,7 G ON.

(a) Show that a • (/? + 7) = a • /3 + a • 7.

Also find an example where (a + /3) • 7 7^ a • 7 + /3 • 7.

(b) Show that if /3 < 7 and a > 0, then a • /3 < ct ■ 7.

Also find an example where /3 < 7 and a > 0, but /3 • a = 7 • a.

□

4.4.4 Exponentiation

Definition 4.4.13 (Ordinal Exponentiation) We define a class function ON x ON — >■ ON
by transfinite recursion as follows. Let a G ON.

(i) ftO := 1.

(ii) a^'^^ := ■ a for successor ordinals /3 + 1.
(Hi) = lim for limit ordinals (3 > 0.

□

Observe that, for each a G ON, we have defined above a function : ON — )• ON :
P a^. It is easy to see that each Ea{-) is a normal function (i.e. strictly increasing and
continuous). It is also easy to see that this function coincides with the usual exponentiation
on the set of natural numbers.

Exercise 4.4.14 Show that if a,/3, 7 G ON, then

□

Set Theory

62

4.5 Project: Goodstein's Theorem

Goodstein's Theorem is a a result about the natural numbers involving just the basic arith-
metic operations. To state it, we introduce some defintions and terminology.

Wc describe here the procedure for writing a natural number in "superbase" form. Given
a natural number n > 2, every m G w has a unique representation in base n, i.e. there are
unique /cq, . . . , fc^ G {0, . . . , n — 1} such that

d

m = ki ■

Each exponent i in the n*-term of the above representation may be an arbitrary natural
number. In particular, it is possible that i > n. In that case, we can write i itself in base n,
i.e. i = X]j=i h ' - some of the exponents j are > n, one repeats the procedure. Clearly,
after finitely many steps, the number m will have an expression in which no numbers > n
appear. This is the representation of m in superbase n.

For example, if we we want to write the number m = 1 026 in superbase form, with base
n = 2. Then

1 026 = 2^° + 2^ = 22'+2^ +2'= 22^'+^+2^ + 2^
No numbers > 2 appear in this representation of 1 026 in superbase 2.

Problem 1. Write 59 106 in superbase 3.

□

Now consider the Goodstein algorithm for generating a sequence of natural numbers (m„)
starting from an arbitrary natural number m:

Step 1. Let mi := m.

Step 2. Write mi in superbase 2.

Replace all occurrences of the symbol '2' by the symbol '3', to get a new number.
Subtract 1 from this new number, to get the number m2-

Step 3. Write m2 in superbase 3.

Replace all occurrences of the symbol '3' by the symbol '4', to get a new number.
Subtract 1 from this new number, to get the number m^.

Step 4. Write m^, in superbase 4.

Replace all occurrences of the symbol '4' by the symbol '5', to get a new number.
Subtract 1 from this new number, to get the number m^.

Step 5. :

We illustrate this for m = mi =5.
Step 1. mi = 2^ + 1, so m2 = 33 + 1 - 1 = 27.

63

Ordinals

Step 2. m2 = 3^, so ma = 4^ - 1 = 255.

Step 3. mg = 3 • 43 + 3 • 42 + 3 • 4 + 3. Hence 7714 = 3- 5=^ + 3- 52 + 3- 5 + 3 = 467.

Continuing, we get 775, 1197, 1751, 2454, 3 325, 4382, 5 643, 7126, 8849, 10830, 13087,
15 637, 18499, 21691, 25 231, ...

Theorem (Goodstein's Theorem)

The sequence {mn)n will eventually reach the number 0, for any starting value m = mi.

□

The above theorem was proved in 1944. The remarkable fact about Goodstein's Theo-
rem is this: Even though it is a statement involving only the basic arithmetic of the natural
numbers, it requires transfinite ordinals in order to prove it. In 1981, Baker and Paris
proved that Goodstein's theorem implies the consistency of first-order Peano Arithmetic. By
Godel's Second Incompleteness Theorem, PA is not strong enough to prove its own consis-
tency. Hence Goodstein's Theorem is not provable in PA. It follows that transfinite methods
are necessary even for proving results in number theory! Goodstein's theorem is unprovable
without recourse to the transfinite.

We now formalize the Goodstein algorithm.

Definition: Given a base n>2, define the function Sn oj ^ oj recursively by

d d

5„(0) := Sn[Y.^i- nf) :=Y,h-{n + 1)^"«

1=0 i=0

Then, for each n > 1, define functions Qn ■ uj lo recursively by

gi{m) := m gn+i{m) := Sn+i{gn{m)) - 1

□

Problem 2: Show by induction that the function Sn effects the replacing of n by n + 1, so
that

k ■ {n+ ly if i < n
fc • (n + 1)"+^ if i = ra

Snik ■ n') =
etc.

Conclude that the sequence {gn{fn))n is precisely the Goodstein sequence starting with

m.

□

Problem 3.

(a) Calculate the first few terms mi,m2, ■ ■ ■ ,mioo, ... of the Goodstein sequence {gn{^))n
starting with mi = 5. (Play around, and push it as far as the recursive depth permitted
by your programming language will allow. Observe that the terms don't seem to be
converging to 0.. . . )

Set Theory

64

(b) We now try to understand how and why the sequence m„ will eventually reach 0.

(i) Suppose that, for some m, it is the case that gni'm)
in base n + 1. For which k is gk{m) = 0?

(ii) Suppose that, for some m, it is the case that gni'm)
which k is gki^n) = 0?

(iii) Suppose that, for some m, it is the case that gnim)
which k is gk{m) = 0?

(c) Calculate how long it will take for (m„)n to reach zero when m = 4.

Hint: There's doubling at work: m2 = 2-3^ + 2-3+2. Then = 2-6^ + l-6 + 5 has linear
coefficient reduced by 1. Then mu = 2 ■ 12^ + 0-12 + 11 has linear coefficient reduced
once more. Then 717.23 = 1 • 24^ + 23-24 + 23 has the quadratic coefficient reduced by 1.
Then 77747 will have the linear coefficient reduced from 23 to 22. Doubling 22 more times,
the quadratic coefficient of nin will disappear when n = 1 + 3 + 6 + 12 + 24 - -- + 3- 2^^.
In that case m„+i = (77 + 1) - (77 + l)-*^ + (77 + 1) - (77 + 1)°. Double (77 + 1) more times to
remove the linear coeficient altogether. . . ]

□

To prove that eventually reaches zero, we use transfinite ordinal arithmetic. For 77 > 2,
define a function \I'„ : w — )■ ON as follows:

d d
*„(0):=0 *„(^A;i-77^) :=^A:i-a;*"«

i=0 i=0

For example, *2(10) = ^2(2^ + 2^) = u;*2(3) +^^2(1) ^ ^^{2^+2") ^^^^{2°) ^ ^u,*2(i)+<^*2(o) ^
Problem 4:

(a) Show that for all 2 < 77, 777 < co,

*„(777) = *„+l(5n(?77))

(b) Show that each is a strictly increasing function, i.e. that ^ni'm + 1) > *n('T7) for all
2 < 77, 777 < a;.

(c) Show that if 2 < 77, 777 < a; and gni'm) > 0, then

*n+2(5n+l("i)) < *n+l(5nM)

(d) Now prove Goodstein's Theorem: V777 377 igni'm) = 0).

= 77 = - (77 + 1)^ + 77 - (77 + 1)°
= 2- (77 + 1)^ + 0- (77 + 1)°. For

= 77 - (77 + 1)"^ + 77 - (77 + l)'^. For

□

Chapter 5

Cardinals

5.1 Cardinality: Basic Definitions

Ordinals are for counting (i.e. enumerating) sets. Such an enumeration automatically induces
a well-ordering on a set X: Here is the first element xq, here is the next xi,. . . , here is the
the next Xi^,. . . . We do not yet know whether or not every set X can be enumerated in such
a fashion. If X is well-orderable (i.e. if there exists a well-ordering relation on X), then X
can be thus enumerated, because every well-ordering is order-isomorphic to a unique ordinal.
However, even when we do know that a set X can be well-ordered, the well-order induced
by the enumeration of its elements is not generally unique (unless X is finite). For example,
the set Lo of natural numbers can be enumerated in the usual fashion:

0,1,2,3,... (1)

It can also be enumerated by first listing all the even numbers, and then all the odd numbers:

0,2,4,. ..,1,3,5,... (2)

We can enumerate it also by first listing all the positive numbers which are not powers of
some prime, followed by all the prime powers, followed by zero, as follows:

1, 6, 10, 12, 15, . . . , 2, 4, 8, 16, ... , 3, 9, 27, 81, . . . , 5, 25, 125, . . . , 7, 49, (3)

The order-type of the above enumerations are: (1) w, (2) u + uj, and (3) lo-uj + 1.

Obviously, the size of the set of natural numbers does not depend on how we enumerate it.
To measure the size of a set, we therefore need a different notion, namely that of cardinality.
The idea is that two sets A, B have the same size if there is a correspondence between the
elements of A and B, such that to each element a ^ A there corresponds a unique element
b & B, and vice versa.

Definition 5.1.1 (a) We say that two sets A,B have the same cardinality (or the same
cardinal number) and write

\A\ = \B\ there exists a bijection A^ B

In that case we say that A, B are equivalent, or equipollent, or have the same power.

65

Set Theory

66

(b) We say that the cardinahty of a set A is smaller than the cardinality of B, and write

\A\ < \B\ there exists an injection A B

We also write \A < \B\ to mean \A\ < \B\ but \A\ / \B\.

□

Recall that in the chapter in natural numbers, we wrote |X| = n if there is a bij action
from X to the set n. In particular |n| = n for all n € c<j. We also showed that \n\ = \m\ if and
only if n = m. Thus we may define the finite cardinal numbers to be the natural numbers.
For infinite sets we do not yet have a candidate for the object \X\, i.e. the symbol \X\ is
(at this stage) meaningless when taken in isolation. We have given meaning to statements of
the form \X\ = \Y\ and \X\ < \Y\, where \X\ does not occur in isolation. Thus a statement
such as \X\ = \X\, though quite easy to prove, does not follow trivially from the logical rules
that govern the symbol =. What we are asserting when we say that \X\ = \X\ is not that
the object \X\ is equal to itself — we do not yet have an object \X\ — but that there is a
bijection from X to X. Equality doesn't come into it at all.

Cardinal numbers |X| as definite objects can be defined. However, such a definition
requires either the Axiom of Choice (in which case l-'^l is the least ordinal which is able to
enumerate X), or else the Axiom of Foundation (in which case l-'^^l is the family of all sets
of minimal rank which are equivalent to X — the definition of rank requires the Axiom of
Foundation) . In this chapter we avoid the use of both these axioms.

The next exercise shows that the relations defined above behave as their notation suggests.

Exercise 5.1.2 (a) Show that equivalence of sets is an equivalence relation on the class of
all sets:

(i) 1^1 = 1^1;

(ii) If ^1 = \B\, then \B\ = \A\;

(iii) If \A\ = \B\, and \B\ = \C\, then \A\ = \C\.

(b) Show that |A| <

(c) Show that if \A\ = \A'\ and \B\ = \B'\, then \A\ < \B\ if and only if \A'\ < \B'\.

(d) Show that if |^| < \B\ and \B\ < |C|, then |^| < |C|.

□

We say that a set X is countable if and only if |X| < \lo\, i.e. if and only if there is an
injection X ^ oj. Clearly, each natural number is countable, as is u itself. A set which is not
countable is said to be uncountable.

Exercise 5.1.3 (a) Show (without using (AC)) that a set X is countable if and only if there
is a surjection oj -» X.

Thus X is countable if and only if it can be enumerated by natural numbers, possibly
with repetitions

X := {xn : n G a;}

67

Cardinals

(b) Show that a subset of a countable set is countable.

(c) Show that X is countable if and only if either X is finite (i.e. \X\ = n for some n G w)
or \X\ = \oj\. Thus we cannot have

n < \X\ < \uj\ for all n G w

g

[Hint: Let a; -» X, where X is infinite. Define h : oj ^ X recursively by

h{ri) = g{m.m.{k E u : g{k) ran(/i f n)})
and show that /i is a bijection.]

(d) Show that the union of two countable sets is countable. Conclude that the union of a
finite family of countable sets is countable]^

(e) Show that the cartesian product of two countable sets is countable. Conclude that the
cartesian product of a finite family of countable sets is countable.

[Hint: First show that u x uj ^ u : (z, j) i— )• 2* • (2j + 1) — 1 is a bijection. (You may
assume that the usual laws of arithmetic hold.)]

□

Intuition for the sizes of sets may also suggest to you that < is antisymmetric, i.e. that
if 1^1 < \B\ and \B\ < \A\, then |j4| = \B\. However, this is not at all obvious: One has to
show that if there is an injection from A to B and an injection from B to A, then there is a
bijection from A to B. Nevertheless, we will proceed to show that it is the case. The proof
of this fact is the first (in these notes) that deserves the appellation "beautiful" .

Theorem 5.1.4 (Schroder-Bernsteirj^ Theorem)

If A,B are sets such that \ A\ < \B\ and \B\ < \ A\, then \ A\ = \B\.

i

Proof: We first reduce the problem to the situation where B (1 A. Suppose that A ^ B

j

and B ^ A are injections. Define B := j[B], and Ai := (j o i)[^]. Then

AiOBCA, \Ai\ = \A\, \B\ = \B\

If we can show that \A\ = \B\, then also |^| = \B\.

We have thus reduced the problem to the following situation: Given that Ai <^ B <^ A

f

and that \ Ai\ = \ A\, we have to show that |^| = \B\, i.e. that there exists a bijection A B.

g

Now since |^| = \Ai\, there is a bijection A Ai, so that Ai = g[A]. Now define two
sequences of sets |A„ : N < ujj and : n < ojj hy induction as follows:

Ao := A An+i := g[An] Bq := B Bn+i := g[Bn]

^You may know that the union of a countable family of countable sets is countable, but to prove that, you
need (AC).

^Also known as the Cantor-Bernstein Theorem.

Set Theory

68

Since 5 is a bijection one can think oi Aq, Ai, A2, ■ ■ ■ as copies of the same set A. Similarly
Bo, Bi, B2, ■ ■ ■ are copies of B. A peek at Figure 5.1 may clarify matters.
Moreover,

Ao^Bo^Ai^Bi^---^An^Bn^ An+l 5 Bn+1 5 . . .

When we form set differences of successive sets, we obtain disjoint sets

Aq — Bq, Bq — Ai, Ai-Bi, B1 — A2, ... An — Bn, Bn — An+l,

Thus we can write A, B as disjoint unions of these set differences.

A= \J {An-Bn)U U {Bn-An+l) B= [j - 5„) U |J (S^-^+i)

0<n<aj

0<n<aj

l<n<aj

0<n<tj

Now observe that g (suitably restricted) is a bijection from each set difference to the next
one of the same type. To be precise.

g[An — Bn] = An+1 — Bn+1

g[Bn — An+l] = Bn+1 — An+2

f

We can now construct the required bijection A B. The basic idea is to use g to move each
point X G An — Bn to the corresponding point f{x) G An+i — Bn+i- Points x G Bn — An+i
are left unmoved, however. Thus we define

g{x) ifxe IJ {An-B,

0<n<ui

X if X G U {Bn- An

+1;

0<n<t>;

Now observe that / is a bijection from Uo<n<aj(^"~^") onto Ui<n<aj(^"~-^")- clearly
is also a bijection from Uo<n<a;(^n ~ ^n+i) onto itself. It follows easily that / : A B is a
bijection.

Remarks 5.1.5 With the Schroder-Bernstein Theorem in place, we see that < defines a
partial ordering on the equivalence classes of =. Another intuitively plausible fact is the Law
of Trichotomy, which asserts that < induces a total ordering: If A, B are sets, then

Either |^| < \B\ or |^| = \B\ or \A\ > \B\

Cantor assumed that the Law of Trichotomy held, but was unable to supply a proof. In fact,
the Law of Trichotomy is equivalent to (AC) as we shall see in a later chapter.

Recall that a set X is Dedekind infinite if and only if there is a bijection from X onto a
proper subset of X. We have seen in an exercise that every Dedekind infinite set is infinite.
Furthermore, we saw that a set X is Dedekind infinite if and only if there is an injection
uj X , i.e. if and only if \uj\ < \X\. To prove that an infinite set is Dedekind infinite
required (AC) however. Thus without (AC) it is possible that \X\ > n for all n £ uj, yet
\X\ ^ In that case X,u} are incomparable elements in the ordering <.

Cardinals

Figure 5.1: Illustration of the Schroder-Bernstein Theorem. The sets An are circles, and
the sets Bn are rectangles. Furthermore the sets An — Bn are shaded white, whereas the
sets Bn — Aji^i arc shaded grey. The function / moves each point in a white region to the
corresponding point in the next white region. It leaves points in the grey regions unmoved.

Set Theory

70

□

Right now, we do not yet know that there are uncountable sets. The next theorem, owing to
Cantor, shows that there are:

Theorem 5.1.6 For every set X, we have

\X\ < \V{X)\

Proof: Clearly X — > V{X) : x i-> {x} defines an injection form X into 'P{X). Hence

\X\ < \V{X)\.

Let f : X 'Pi^) be an arbitrary function. We show that / cannot be a surjection:
Indeed, define an element Y eV(X) by

Y --{xeX f{x)}

If y G ran(/), then Y = f{y) for some y E X. Then

y G y ^ ye fiy) ^ y^Y

— contradiction. Hence Y ran(/), i.e. / is not surjcctivc.

Since / was arbitrary, it follows that there is no surjection — and thus no bijection —
from X onto P{X). Hence \X\ /

H

As a consequence, V{oj) is an uncountable set.

5.2 Cardinal Arithmetic in ZF

We stated in the previous section that we do not (yet) have candidates for the cardinality \X\
of an infinite set X. Wc will now proceed as if we can assign a cardinal number k = |X| to
a set X, and define arithmetic operations on these cardinals. Each statement about cardinal
arithmetic can be translated into a statement about operations on sets: The notion of cardinal
number is not necessary, but it is convenient. We use the letters k, A, /x, u to denote cardinals.
Recall that is the set of all functions from B to A.

Definition 5.2.1 Suppose that k = \A\, A = \B\.

(a) K + X := 1^4 U where we require that An B = 0.

(b) K-X:= \AxB\.

(c) := \A^\.

□

Remarks 5.2.2 (a) Note in the definition of cardinal addition that if A, B arc not disjoint,
one can always replace them with equivalent sets which are disjoint. Simply choose x ^ y,
and define A' := A x {x}, B' := B x {y}. Then \A\ = \A'\, \B\ = \B'\, and A' n B' = 0.

71

Cardinals

(b) Clearly = 1 and , I'' = 1. Furthermore = if k > 0. Observe, however, that there
is a unique function — )• 0, so that O'^ = 1.

Exercise 5.2.3 Check that the above definitions make sense, i.e. are independent of the
choice of A, B. Thus check that

(a) \i Ar\B = 0, Ar\B = 0,\A\ = \A\ and \B\ = \B\, then \AU B\ = \A\J B\.

(b) If \A\ = \A\ and \B\ = \B\, then \Ay.B\ = \Ay. B\.

(c) If \A\ = \A\ and \B\ = \Bl then \A^\ = \A^\.

□

The next proposition shows that cardinal arithmetic is better behaved than ordinal arithmetic.
Proposition 5.2.4 (a) k + X = X + k.

(b) {k + x) + = K + {x +

(c) K ■ X = X ■ K.

(d) {k ■ X) • fi = K ■ {X • fi).

(e) K ■ {X + fl) = K ■ X + K ■ fl.

(f) K-^+M = . ^/._

(g) [k ■ XY = ■ Xf".

(h) If n < X, then K + /_i<A + /i, n ■ fj, < X ■ and k,^ < X^.

(i) // < A < fi, then < k^.

□

Exercise 5.2.5 Prove Proposition 5.2.4

[E.g. to prove k^^^" = ■ k^, let \A\ = k, \B\ = A, \C\ = n with BnC = 0, and show that
there is a bijection from A^'^'-^ onto A^ x A'-' .]

□

Exercise 5.2.6 Recall from Exercise 5.1.3 that if X is an infinite countable set, then \X\ =
We thus denote the cardinality of an infinite countable set by the symbol lo. Show that

UJ + U = UJ-LO = U}.

□

The following theorem relates the cardinalities of power sets to cardinal arithmetic.
Theorem 5.2.7 // \X\ = k, then \V{X)\ = 2'^ . Hence k < 2'^ for all cardinals k.

Set Theory

72

Proof: We have to show that there is a bijection from V{X) onto 2^ , where 2 := {0, 1}.

Observe that every subset Y C X induces a unique characteristic function (or indicator
function) xy ■ X 2 which, for each x £ X, indicates whether or not x gY. Thus:

f 1 ifxeY
Xy{x) ■= \ .r . for X G X

10 if X ^Y

Conversely, every function / : X — )• 2 indicates a unique subset Xj := {x £ X : f{x) = 1}.
There is therefore a correspondence between subsets of X and functions X — t- 2. To be
precise, define

^> : V{X) ^2^ -.Y ^ xy{-) ^ -.2^ ^ V{X) : f ^ Xf

Then it is easy to see that <I> o vj; = [d^x and ^I/ o \$ = idp(x)- Hence ^' = and \$ is a
bijection.

It follows by Theorem 5.1.6 that k < 2'^ for all cardinals k.

5.3 Alephs

We are now ready to define the notion of cardinal number for certain sets: The alephs are the
cardinal numbers of well-orderable sets.

Recall that every well-ordering is order-isomorphic to a unique ordinal. Thus if X is well-
orderable, then there is a bijection a X from some ordinal a to the set X. Conversely, if
there is a bijection / : a X : x^, then / induces a well-ordering on X: Define x^ < x^
if and only if ^ < 77.

Thus a set is well-orderable if it and only if it can be placed into a bijective correspondence
with some ordinal. Equivalently, a set X is well-orderable if and only if it can be enumerated
by an ordinal, i.e. if and only if there is an q G ON such that X = {x^ : ^ < a}.

The order-type of an enumeration need not be unique: At the beginning of this chapter,
we showed how to enumerate the set of natural numbers by various ordinals: w, w + a; and
w X a; + 1. Since the size of the set of natural numbers is unaffected by how we order it, we
have

|a;| = |a; + a;| = |w-a; + l|

Many other enumerations of the set of natural numbers are possible, but it is clear that the
natural numbers cannot be enumerated by an ordinal which is < a;: oj is the smallest ordinal
which can enumerate the set of natural numbers. This leads to the following definition:

Definition 5.3.1 (a) An ordinal a is said to be a cardinal number (or just cardinal) if and
only if for all ordinals /3:

/3 < a =^ < \a\

(b) If a set X is a well-orderable set, then \X\ is the smallest ordinal which enumerates it,
i.e.

IXI = smallest ordinal a for which \X\ = \a\

□

73

Cardinals

Thus a is a cardinal if and only if there is no bijection f : /3 ^ a for any f3 < a, i.e. if and
only a cannot be enumerated by any /3 < a. Moreover, if X is well — orderable, then \X\ is a
cardinal.

An ordinal a which is a cardinal is also called an initial ordinal: It is the first ordinal to
attain the size |a|.

Clearly 0, 1, 2, . . . , w are cardinal numbers. Furthermore, uj is the smallest infinite cardinal.

Remarks 5.3.2 (a) The set of cardinals is therefore a subclass of the set of ordinals. As
such, it inherits the well-ordering on ON. Observe that if a, /3 are cardinals, then

a< (3 in ON

|a| < |/3|

By Definition 5.3.1 we see that a < (3 implies |a| <

and \a\ <

Conversely, if a, /3 are cardinals

then it cannot be the case that /3 < a, and hence a < p.

There is therefore no chance of confusion if we use the same symbol < for the order
relation on both the class of ordinals and the class of cardinals.

(b) We have defined arithmetic operations on both the class of ordinals and the class of
cardinals. Here, however, care must be taken. The ordinal product uj ■ co is not at all the
same as the cardinal product u ■ to. Indeed, for the ordinal product we have uj ■ uj > oj,
yet |a; • wl = \uj\. Thus the ordinal product w • a; is an ordinal which is not a cardinal.
The cardinal product is simply uj ■ uj = u.

It is moreover not hard to see that for ordinal exponentiation, uj^ := Unew'^" count-
able, i.e. that for ordinal exponentiation uj^ is a countable set. However, for cardinal
exponentiation, we clearly have |a;|l^l > 21'^' > |a;|, i.e. for cardinal exponentiation, uj^ is
an uncountable set. The two interpretations of uj^ do not even have the same cardinality!

Thus when doing arithmetic, you have to keep track of whether you are dealing with
ordinals or cardinals. Only for the natural numbers do ordinal and cardinal arithmetic
coincide.

□

Here is an easy result:
Proposition 5.3.3 Every infinite cardinal is a limit ordinal.
Proof: If a > w is infinite, then the map / : a + 1 — )• a defined by

/(7) := <

is a bijection, i.e |a + 1| = \a\. Thus a + 1 is not a cardinal.

if 7 = a
7 + 1 if 7 < w
7 if u; < 7 < a

Set Theory

74

As we stated earher, clearly 0, 1, 2, . . . , a; are cardinal numbers. Are there any others? We
know that \X\ < \V{X)\. However, we generally do not know if V{X) is a well-orderable set,
even if X is well-orderable. Thus though \X\ may be a well-defined cardinal number (in the

sense of Definition 5.3.1), the object |'P(X)| may not have any meaning in isolation.

Nevertheless, as we will now show, there are as many cardinals as there are ordinals. Since
ON is a proper class, so is the class of cardinals.

Let X be any set (not necessarily well-orderable). Then V{X) is a set. Now if Y <Z X,
then a well-ordering of y is a subset of 1" x y. Hence for each Y G V^X), the class

:= {-< G V{Y X Y) well-orders Y}

is a set, by Separation. It follows that W := IJyeVix) ^ s^^' possible

well-orderings of subsets of X . We can define a class function F which assigns to each
well-ordering -< G W the unique ordinal to which it is order-isomorphic. By the Axiom of
Replacement, -F[W] is a set. Thus |J -F[W] = supF[W] is an ordinal, which may or may
not belong to -F[W]. In any case, the successor ordinal of sup-F[W] cannot be a member of
F[W]. Thus {q G on : a F[W]} is a non-empty class of ordinals, and therefore has a
least member — call it h{X).

Now observe that, by definition, a G F[yV] if and only if X has a subset which can be
enumerated by a, i.e.

a G -F[yV] <;=^ there is an injection from a to X

Hence h{X) is the least ordinal a for which there exists no injection from a into X.

The reasoning above shows that the ordinal h{X) always exists, for any set X — well-
orderable or not. Thus:

Definition 5.3.4 If X is a set, define the Hartog's number h{X) of X as follows:
h{X) := least ordinal a such that there is no injection a X

□

Proposition 5.3.5 For any set X , h{X) is a cardinal number. Moreover, h{X) ^ \X\.

Proof: If a < h(X), then there is an injection a ^ X. If it was the case that |a| = |/i(X)|,
there would exist an injection h{X) ^ X — contradiction. Hence \a\ < \h{X)\ for all
a < h{X).

Similarly, if |/i(^)| < X, then there would exist an injection h{X) ^ X — contradiction.
Hence \h{X)\ ^ |A:|.

H

Observe that we are not saying that h{X) > \X\. That would be the case if the Law of
Trichotomy holds, i.e. if (AC) holds. However, without (AC), it is possible that \X\, h{X)
are incomparable in size.

Proposition 5.3.6 For every a G ON there is a cardinal which is greater than a. Indeed,
h{a) is the least cardinal > a.

75

Cardinals

Proof: Both a, h{a) are ordinals. Thus either h{a) < a, or else a < h{a). Since ^
we must have a < h{a). Thus h{a) is a cardinal which is > a.

If /3 is a cardinal and /3 < h{a), then there is an injection f3 ^ a, and hence |/3| < \a\.
Because /3 is a cardinal, it then follows that (3 < a. Hence h{a) is the least cardinal > a.

H

If K is a cardinal, then is the least cardinal > k, i.e. the successor of k in the class
of cardinals. We denote it by

:=

We already remarked that the order relation on the class of cardinals is a restriction of the
order relation on the class of ordinals. Here is another result linking the two — The supremum
of a set C of cardinals does not depend on whether you regard C as a set of ordinals or a set
of cardinals:

Proposition 5.3.7 If C is a set of cardinals, then sup C :=\JC is a cardinal.

Proof: We know that a := (J C is an ordinal, and that it is the least ordinal which is > k
for every k £ C. Now suppose that f3 < a. By definition of a, there is k G C such that
/3 < K < a. Because k is a cardinal, it follows that |/3| < |k| < \a\. Hence |/3| < \a\ whenever
/3 < a.

We can now define an enumeration of the class of all infinite cardinals, the aZep/i^

^^o,^^l,^^2, • • . . . . • • • - • •

by transfinite recursion:

b^o := ^0 is the set of natural numbers

'Ra+i '■= ^a+i ■= succcssor ordinals a + 1

"i^x = '■= supji^Q, : a < A} if A is a limit ordinal

We use "i^a and cj^ interchangeably. The ^l;-notation is typically used when we want to
emphasize the size of a set, whereas the w-notation when we are concerned also with the
order-type.

Observe that ui is the least uncountable ordinal, i.e. that a < wi if and only if \a\ < |a;|.
Then U2 is the least ordinal whose cardinality is > |a;i|, etc.

Exercise 5.3.8 Show that the map F : ON — )■ ON : a i— t- Wq, is normal, i.e. strcitly
increasing and continuous. Deduce that uja > a for all a G ON. Also conclude that there
are arbitrarily large ordinals for which = a.

□

Next, we observe that {^^Q, : a £ ON} enumerates all the infinite cardinals:

''Aleph N is the first letter of the Hebrew alphabet. The next two, Beth 3, and Gimel 1 will also make an
appearance in cardinal arithmetic.

Set Theory

76

Proposition 5.3.9 // k is an infinite cardinal, then there is a unique a £ ON such that

Exercise 5.3.10 Prove Proposition 5.3.9

[Hint: Sliow tiiat if k is a cardinal, then k < i^K+i- Now show by induction that the following
holds for all all a

If K is a cardinal such that k < ^^Q,, then k = H« for some ordinal (3 < a.

□

The alephs were introduced by Cantor. We emphasize once again that the alephs are
the sizes (cardinalities) of well-orderable infinite sets. An infinite set X is well-orderable if
and only if \X\ = Hq, for some a. For Cantor, the restriction to well-orderable sets posed
no problem: Already in 1883 he had proposed as a Law of Thought that every set is well-
orderable. However, this proposal met with much opposition, and thus Cantor tried to find
a proof. The first proof was, in fact, found by Zermelo, but his proof used a new principle
for set formation — the Axiom of Choice. We shall have much more to say about this in the
next chapter.

5.4 The Canonical Well-Ordering of ON x ON

Recall that, given a well-ordered set(X, <) , we defined in Lemma 4.1.2 the canonical well-
ordering {X X X,\z). We can define the canonical well-ordering of the class ON x ON in the
same manner: Given a,/3,a',P' G ON, we put

{a, 13) < {a',f3')

max{a, /3} < max{a', /3'}
or max{a, (3} = max{a', /3'} and a < a'
or max{a,/3} = max{a',/3'}, a = a , and (3 < [3'

It is straightforward to verify that this does indeed define a well-ordering of ON x ON, i.e.
a total ordering with the property that every non-empty subclass of ON x ON has a least
element.

Observe that if 7 G ON, then the restriction < f (7 x 7) of the canonical well-ordering on
ON X ON to the set 7 x 7 is precisely the canonical well-ordering of 7 x 7.
Here are some observations that will be useful in the sequel:

• Observe that the initial segment {{(,,r]) : ??) < (a,/3)} is a set, as it is a subset of
(max{a,/3} + 1) x max{a,/3} + 1).

• Further observe that

(^, T]) < (0, a) <;=^ ^ < a A rj < a i.e. {(^, rj) : (^, r]) < (0, a)} = a x a

Thus each a x a is an initial segment of the canonical well-ordering of ON x ON.

77

Cardinals

• Since every well-ordered set is order-isomorphic to a unique ordinal, we may define a
map r : ON x ON — )• ON as follows: T{a, /?) is the order-type of the initial segment
induced by {a, (3) in the canonical well-ordering of ON x ON, i.e.

r(a,/3) := order-type of {{C,v) ■ i^^v) <

• r is an order-preserving injection from ON x ON into ON: No well-ordered set is
order-isomorphic to an initial segment of itself.

• If /3 < a, then r(0,/3) < r(0,a): This is because r(0, a) is the order-type of of the
initial segment a x a of ON x ON. Since /3 x /3 is an initial segment of ON x ON and
/3 X /3 C a X a, we see that /3 x /3 is an initial segment of a x a.

• ran(r) is downwards-closed: If 7 G ran(r) and < 7, then 6 S ran(r). This is because
some initial segment of ON x ON has order-type 7, and an initial segment of that
initial segment must have order-type 5.

a

r is surjective: The map \$ : ON — t- ON
\$(a) > Q for all a £ ON, by Theorem 4.1.6

r(0, a) is strictly increasing, and hence
It follows that a < T{0,a) G ran(r).
Since ran(r) is downwards-closed, it follows that a G ran(r), for all a G ON.

We reiterate: The map F : ON x ON — t- ON is an order-preserving bijection. We refer
to r as the canonical bijection from ON x ON to ON.

Exercise 5.4.1 Consider the map ^' : ON — )■ ON : a 1— )• r(a x a). Show that ^' is a
normal function. Conclude that r(a x a) > a for all a G ON. Further deduce that there are
arbitrarily large ordinals for which T{a x a) = a.

□

With the previous exercise in mind, the next proposition should not be too surprising:
Proposition 5.4.2 r{uja x ^a) = for all a G ON.

Proof: Recall that clIq, X clIq, = : {£,,tj) < (0,u;q,)} is an initial segment of ON x ON, and

that T{uja X Wa) = r(0,a;Q,) > uja- Suppose that a is the least ordinal such that T{uja x uja)
u)(x, SO that r(wQ, X oja) > uJa- Then there exists (^,7?) G Wq, x such that T{(,,t]) = oja-
Since cj^ is a limit ordinal, we can find an ordinal 5 so that ^, r/ < J < Wq. Then J x 5 is an
initial segment of ON x ON. Since (^, 77) £ 6 x 5, we have uJa C T{5 x 6). Since F is injective,
it follows that Ua < |F((5 x 5)\ = \6 x 5\. Now \6\ = Uf^ for some /3 < a, and by the minimality
assumption F(a;^ x ujfs) = u^p. Since F is injective, we see that ujp = \T{u)p x ujp)\ = \ujp x ujp\.
It follows that

Wq, < |(5 X 5| = \ujp X w/jI = ojp
— contradiction, since cj^ < Wq. Hence there is no ordinal a such that T{uja x Wa) 7^ Wq,.

Set Theory

78

5.5 Arithmetic of Alephs

Cardinal addition and multiplication are trivial:
Theorem 5.5.1

Proof: Since T{u}a x <^a) = and F is injective, it follows that
Thus

b^Q, • Kq, = b^Q, for all a G ON

Now if 7 = max{Q,/3}, then by Proposition 5.2.4 it follows that

= max{H„, H^} < + < H.^ + = 2 • < H7 • ^^-^ =

Similarly,

= max{^^a, ^^/j} < Ha • < ^7 • = ^7

Remarks 5.5.2 (a) It is important to note that the above theorem holds only for alephs. If
the Axiom of Choice holds, then every infinite cardinal is an aleph, as we shall soon be
able to show. However, the statement

For all infinite

i.e. the statement that for all infinite sets A we have |^ x j4| = |^|, is equivalent to (AC).

(b) Finite sums and products of alephs are now easy to calculate. Exponentiation is altogether
more difficult. We know, for example, that 2*^'^ > Hq. We do not know if 2*^'^ is an aleph,
i.e. if 2^0 = H„ for some a G ON. With (AC), 2^" will be an aleph. Even with (AC),
however, we are not able to determine the value of (x for which 2^^' — Hq;.

The axioms of ZFC place very few constraints on the possible values of a. a might be
any of 1,2,3,. ... It cannot be u. It can be any of a; + 1, cj + 2, w + S, . . . , or for that matter
Hi, H2, H3, .... Again, we cannot have a = H(^.

The assumption that a = 1 — i.e. that 2^° = Hi — is due to Cantor, and is known as
the Continuum Hypothesis. Here, the word continuum refers to the set M of real numbers.
The Generalized Continuum Hypothesis is the assertion that

2^" = H„+i for ah a G ON

□

The next theorem shows what 2^° has to do with the continuum.
Theorem 5.5.3

IMI = 2^0

79

Cardinals

□

Exercise 5.5.4 We prove the preceding theorem. Assume here that the basic properties of
the sets Q, M of rational and real numbers are known. (We have not yet established that Q, M
can be formalized inside ZFC. However, such a formalization must of course capture all these
known properties.)

(a) Show that |Q| = Nq.

(b) Observe that if r G M, then r = sup{g G Q : g < r} — a known property. Conclude that
|M| < 2^0.

(c) An element / G 2'^ is a function / : w — )■ {0, 1}, which can be thought of as a sequence of
O's and I's. With each such function is associated a real number whose decimal expansion
consists of digits given by this sequence. Show that 2^° < |M|.

(d) Finally, conclude that |M| = 2^°.

□

5.6 Project: Cardinal Arithmetic and The Axiom of Choice

Problem 1:

We prove a theorem of Tarski (1924):

Theorem: In ZF, (AC) holds if and only if = k for every infinite cardinal k.

1.1 First, explain why if (AC) holds, then = k for every infinite cardinal.

1.2 Next, we show that if k is an infinite cardinal and if K is an aleph, then

K + K = K • K implies k < H or K > k

So suppose that K + ii = K-ii. Let K be an infinite set such that \K\ = k, and let {A, <)
be a well-ordered set such that |yl| = N.

(i) Explain why there exist disjoint sets Ki and Ai such that \Ki\ = k, \Ai\ = K and
K X A = KiUAi.

(ii) Explain why exactly one of the following possibilities must be the case: Either

(1) 3k£ Kya€ A {{k, a) G Ki), or

(2) yk£ K3a£ A {{k,a) G ^i).

(iii) Show that in case (1), we have H < k.

(iv) Suppose now that case (2) holds. For each k £ K, let be the least element of A
such that {k,ak) G A^. Observe that {{k,ak) ■ k G K} C Ai. Conclude that k < i^.

Set Theory

80

1.3 Now suppose that k'^ = k for every infinite cardinal. Let X be any infinite set, with
cardinaUty A = \X\, and let K := h{X) be the Hartog's number of X. Observe that K is
indeed an aleph, as the notation suggests.

(i) Show that (A + H)^ > A • H.

(ii) Conclude that A + K = A •

(iii) Explain why we cannot have H < A. Conclude that A <

(iv) Deduce that X can be well-ordered. Hence explain why (AC) holds.

□

Problem 2:

We prove a theorem of Sierpihski (1947):

Theorem: In ZF, (GCH) implies (AC).

Recall that k <2'^ for any cardinal k. The Generalized Continuum Hypothesis (GCH) is
the following assertion: If k is an infinite cardinal, then there are no cardinals strictly between
K and 2'^. To be precise

If K, A are infinite cardinals such that « < A < 2**, then either A = k or A = 2**.

2.1 We first prove, without assuming either (AC) or (GCH),

Lemma 1: If k is an infinite cardinal, and if h{K) is the Hartog's number of k, then

Let X be a set such that \X\ = k. Then C is a partial order relation on V{X). Thus
if C V{X), then {X , C) is a partial ordering. It may happen that {X, C) is a well-
ordering, or it may not. Define

W ■.= {X <Z V{X) : {X, C) is a well-ordering}

(i) Show that W C P(7^(X)).

(ii) Define an equivalence relation on W by

X Kiy [X, C) is order-isomorphic to {y, C)

and let

E := W/ «

be the set of equivalence classes of «. Explain why E C V{V{V{X))). Conclude
that |E| < 22'".

81

Cardinals

(iii) Each E £E, is a collection of all subsets of V{X) that have the the same order-type.
Thus with each E & K we can associate a unique ordinal oe- Thus E can itself be
regarded as a well-ordered set, with £' ^ F if and only if as < ap- Conclude that
|E| is an aleph.

(iv) Show that |E| ^ \X\. Conclude that |E| > h{K).

[Hint: Let ij be the order-type of the well-ordered set E. Suppose that |E| < \X\.
Then there is an injection / : 77 — )■ X : ^ x^. Define Xjs := {x^ : ^ < /?}, and
X := {Xp : /3 < ry}. Observe that (Af, C) has order-type ry. The initial segment of
E determined by the equivalence class of X then has order-type rj. . .]

(v) Finally, conclude that h{tC) < 2^ .

2.2 Now prove the following easy results in cardinal arithmetic, not assuming (AC) or (GCH):

(i) Lemma 2: If k > Hq, then 2 ■ k = k.

Remark: Saying k > is not the same as saying that k is infinite, but that it is
Dedekind infinite (i.e. has a countable subset).
[Hint: Show 1 + k = k.]

(ii) Lemma 3: IfK> Ko, then 2" + k = 2'^.

(iii) Lemma 4: If X, k are cardinals such that 2 ■ k = k and X + k > 2'^, then X> 2'^.
[Hint: Choose disjoint sets Ki,K2,L so that \Ki\ = \K2\ = k, \L\ = X. Justify the
equations

\LUKi\ = X + K = 2'^ = 2''+'^ = \r{Ki U K2)\
Let f : L\J Ki ViKiU K2) he a bijection. If £; C ^2, define

E* ■.= EU{xeKi :x^f{x)}

Show that the map V{K2) -)• V{Ki \J K2) : E ^ E* is injective. Further show
that each E* f[Ki\- Conclude that each E* = f{y) for some y E L. Deduce that
> \V{K2)\ = 2\]

2.3 Henceforth, assume (GCH): Let k > b^o be an infinite cardinal, and let /i(k) be the
Hartog's number of k. Define a sequence {pn)n<uj of cardinals inductively as follows:

PQ •■= Pn+i = 2''"

Prove:

Lemma 5: If k> and h{n) < Pn+i, then either K is an aleph, or h(^K) ^ p^-

(i) Observe that 2 ■ pn = Pn, by Lemma 2.

(ii) Show that p„ < /i(k) + pn < 2^^" for all n.

(iii) Conclude from (GCH) that we have twho possible cases: Either

(1) /i(k) + Pn = Pn, or else

(2) h{K)+pn = 2P^.

(iv) Show that in case (1), we have hi^K) ^ pn-

Set Theory

82

(v) Show that in case (2), we have h{K) > 2''" > k by Lemma 4. Conclude that k is an
aleph.

2.4 Finally, let X be any infinite set. Define k = 2^o+l^l.

(i) Observe that Lemma 1 states that < p3. Also observe that ^ po. Use
Lemma 5 to conclude that k is an aleph.

(ii) Deduce that X can be well-ordered. Hence explain why (AC) holds.

□

Chapter 6

The Axiom Of Choice

Recall:

Axiom of Choice: Every family of non-empty sets has a choice function: If is a family of
non-empty sets, then there is a function f on. X which chooses from each X E X an element
fix) G X.

Exercise 6.0.1 Show by induction that every finite family X of non-empty sets has a choice
function.

Thus (AC) is needed only for infinite X.

□

6.1 Choice Functions, Partitions and Cartesian Products

Our first result shows that for (AC) to hold, it suffices that it holds for every every family X
of mutually disjoint non-empty sets:

Proposition 6.1.1 Let (AC)* be the following statement:

(AC)*: If X is a partition (i.e. a family of non-empty pairwise disjoint sets), then
there is a set C which contains precisely one element from each X E X (i.e. C CiX
is a singleton for every X E X).

Then (AC) and (AC)* are equivalent in ZF.

Proof: Clearly (AC) impUes (AC)*.

Conversely, suppose that X is a family of non-empty sets, not necessarily pairwise disjoint.
For each X E X, let X := {X} x X. Then X := {X : X € X} is a family of pairwise disjoint
sets. By (AC)*, there is a set C which contains precisely one element from each X £ X. Then
C is a set of ordered pairs, and thus a binary relation. By definition, if {X, x) G C, then x e X.
Moreover, C is clearly a function: If {X, x) e C and {X, y) E C then {X, x), {X, y) e C Ci X,
and hence x = y. It follows that C is a choice function on X.

H

Proposition 6.1.2 In ZF, the following are equivalent:

83

Set Theory

84

(a) (AC)

(h) Every surjection has a right inverse.

f

Proof: Suppose that (AC) holds, and that X Y is a surjection. For each y £ Y define
Xy := f~^[{y}]- Then X := {Xy : y £ Y} is a family of non-empty sets which partitions
X. By (AC), X has a choice function h : X ^ y]X with the property that h{Xy) £ Xy
for ah y £ Y. In particular, f{h{Xy)) = y. Define g : Y ^ X : y ^ h{Xy). Then
{fog){y) = f{h{Xy))=y.

Conversely, suppose that every surjection has a right inverse. Let X he a family of disjoint
non-empty sets, and let f :[_} X ^ X he defined as follows:

If x G Af , then f{x) is the unique X £ X so that x £ X

Then / is a surjection, and thus has a right inverse g : X ^yjX. Clearly g{X) £ X, i.e. g

is a choice function for the partition X. By Proposition 6.1.2, (AC) holds

Recall that the cartesian product Aq x Ai of two sets is defined to be the set of all ordered
pairs (ao,ai), where oq £ Aq and ai £ Ai. We can then define products with more factors
inductively:

ji+l n

^fc := ^fc X An+i i.e. tIq x • • • x ^„ x An+i := (tIq x X • • • X An) X An+l
k=0 k=0

However, this allows us to define products with only finitely many factors. How should we
define a product nie/ ^« with infinitely many factors, i.e. where the index set / has infinite
cardinality? More generally, if ^ is a non-empty collection of sets, can we define Y\ Here
the intention is that if ^ = {Ai : i £ I}, then = Yliei (just as = IJie/ ^*-)
Clearly we have 13^=0 {(^O) oi) ■ ao £ Aq A ai £ Ai. Thus we ought to have

YlAi := {{ai : i £ I) -.yi £ I {ai £ Ai)}

The problem is that we have not yet defined ordered tuples {ai : i £ I) of infinite length.
However, we can clearly think of each such tuple as a function, i.e.

{ai : i £ I) is the function f : I ^ \^ Ai defined by f{i) = ai

Thus the function / chooses, for each i E J, an element f{i) £ Ai. Then

Ai :={/:/ is a function I A- \^ Ai such that f{i) £ Ai for all i £ 1}

i&I i£l

i.e. the cartesian product Hie/ °^ functions / which choose, for each i £ I,

an element f{i) £ Ai.

There is an obvious natural bijection between the old definition of x as a set of
ordered pairs and the new definition as a set of functions {0, 1} ^ AqU Ai: The ordered pair

85

Axiom of Choice

(ao, ai) £ AqX Ai corresponds to the function {0, 1} ^ AqU Ai : i-^ oq, 1 ai- Henceforth
we shall identify the two objects without further explanation. It will usually be clear from
context which definition of product is meant.

Now if ^ is a non-empty non-indexed collection of sets, then we can index A by itself:

A := {Xa : Ae A} where Xa := A

Then H = Has^ °^ functions f : A ^ \JA with the property that

f{A) E Xa, i.e. with the property that f{A) G A. Thus if / G f^^, then / is a function
which chooses, for each ^4 G ^ an element f{A) E A, i.e.:

Definition 6.1.3 is the set of all choice functions on A.

□

The following proposition is now obvious:
Proposition 6.1.4 The following statements are equivalent in ZF:

(a) (AC)

(b) A product of a non-empty family of non-empty sets is non-empty: Whenever A is a
non-empty family of non-empty sets, then Y\A ^ 0.

□

Proposition 6.1.4 is the reason that (AC) was also called the multiplicative axiom in the
past.

6.2 Well— Ordering

Cantor proposed as an obvious Law of Thought the Well-Ordering Principle: Every set can
be well-ordered.

The intuition behind this law was simple. Suppose that X is a set. Pick an element
xq E X. If any elements are left over, pick an element xi from X — {xq}. If any are left over,
pick an X2 €z X — {xq, xi}. This defines a sequence by transfinite recursion: If x^ have been
chosen for ^ < a, then pick an element Xq E X — {x^ : ^ < a}. Just keep doing this until you
run out of elements. If this happens at stage f3, then X = {x^ : ^ < /?}. This enumeration of
X induces well-ordering of X in an obvious manner.

The discovery of the paradoxes generated increased skepticism about the foundations of
set theory, and this led Cantor to seek a proof of his Well-Ordering Principle. The first such
proof was given by Zermelo in 1904 (and again in 1908). In doing so, Zermelo explicitly
formulated the Axiom of Choice.

Proposition 6.2.1 Let X be a set. In ZF the following are equivalent:

(a) The set X can be well-ordered.

(b) The family X := V{X) — {0} of non-empty subsets of X has a choice function.

Set Theory

86

Proof: (b) =^ (a): Cantor's intuition can be implemented as follows: Suppose that F is a
choice function for X := V{X) — {0}. Pick a set c such that c ^ X. We extend to a
function on V by defining F{x) = c \i x ^ X . Define a function G on ON by transfinite
recursion follows:

G{a) = F(X-ran(G \ a))

Observe that this means that G(a) & X — {G(^) : ^ < a} as long as X — ran(G \ a) ^ 0.

Now if G{a) 7^ c for any a G ON, then G is injective, and hence the function G^^ :
ran(G) — t- ON exists, and is a bijection. Since ran(G) C X, we see that = ON is

a set, by Replacement — contradiction. Hence there is a G ON such that G{a) = c. If ag
is the least such a, then X — ran(G \ qq) = 0, and hence X = {G(C) : S, < ckq}. Define a
relation -< on X by

G(0 ^ G{r]) if and only if ^ < r]

Then G : (oq, <) — {X, -<) is an order-isomorphism, so -< well-orders X.
(a) (b): If {X, -<) is a well-ordering, define F on X := V{X) - {0} by:

F(Y) := -< -least element of Y

To be precise F is the set

F = {{Y,y) e X X X :y eV A eY {y ^ z)}

which exists by Separation.

Then F is clearly a choice function on X.

H

As an immediate corollary we have:

Theorem 6.2.2 In ZF the following are equivalent:
(a) (AC)

(h) Every set can he well-ordered.

Proof: (a) =^ (b): Suppose that X is a set. By (AC), V{X) — {0} has a choice function.
By proposition 6.2. 1[ X can be well-ordered.

(b) =^ (a): Suppose that X is a. family of non-empty sets. Let X = \] X . Then X C
V{X) — {0}. Since X can be well-ordered, there is a choice function F for V{X) — {0}. Then
F \ X is a, choice function for X .

Recall that the alephs ^^q, were defined to be the cardinals of the well-orderable sets. (AC)
implies that every set is well-orderable. In particular, for every infinite set X there is a unique
a G ON such that \X\ = Nq,. Hence (AC) considerably simplifies cardinal arithmetic of all
sets. We shall investigate cardinal arithmetic in ZFC in a later section.

87

Axiom of Choice

6.3 Zorn's Lemma and Other Maximal Principles

We wish to show how, by introducing a certain axiom on sets of sets
instead of the well-ordering theorem, one is enabled to make the proofs
shorter and more algebraic.

— Max Zorn,

A Remark on Method in Transfinite Algebra, 1935

Recall the following:
Definition 6.3.1 Let (P, <) be a partially ordered set.

(i) If X C P, then an element ti G P is an upper bound for X \i\/x ^ X [x < u).

(ii) An element m G P is called a maximal element of P if G P {m < x).

(iii) A non-empty subset C C P is called a chain in P if C is totally ordered by <, i.e. for
all x,y £ C, either x < y or y < x.

(iv) A chain C in P is said to be a maximal chain if it is not a proper subset of any other
chain in P.

□

Theorem 6.3.2 In ZF, the following are equivalent:

(a) (AC)

(b) Zorn's lemma (1935): If every chain in a non-empty partially ordered set (P, <) has
an upper bound, then P has a maximal element.

(c) Hausdorff's Maximal Principle (1909): Every chain in a non-empty partially ordered
set can be extended to a maximal chain.

Proof: (a) (b): Let (P, <) be a partially ordered set with the property that every chain
in P has an upper bound, and let P be a choice function for the family V{P) — {0}. Also
choose some set q ^ P. For ^ C P, define 'W A := {x £ P : \/a £ A [a < x)} . By transfinite
induction, define a map G on ON as follows: If a G ON,

G{a) :--

P(tt ran(G \ a) if tt ran(G \ a) ^
a else

To explain in simple English, suppose that := G'(^) has been defined for ^ < a. Then
Oq, := G{a) is an element chosen (by F) from {x G P : < a (ag < x)}, i.e. is an element
of P so that aa > for all ^ < a, provided such an element exists. If such an element does
not exists, then aa ■= q-

If Oq, 7^ for all a G ON, then : a G ON] is a sequence of distinct elements of P,
i.e. G : ON — t- P is an injection, which is impossiblej^ Let A be the least ordinal such that

^By a similar argument as in the proof of Proposition 6.2. l[ Here is another proof: If h{P) is the Hartog's
number of P, then {aa : a < h{P)l would be an injection from h{P) to P, contradicting the very definition of
Hartog's number.

Set Theory

88

a\ = q. Then C := {a^ : ^ < A} is a chain in P, and therefore has an upper bound u in P.
We claim that u is a maximal element of P: For if not, then there is a; G P such that u < x.
Hence tt {o^ : ^ < A} / 0, so oa G -P, i.e. q e P — contradiction.

(b) ^ (c): Let C be a chain in a partially ordered set (P, <). Let

r := {D C P : D is a chain in P, and D D C}

Then (P,(^) is a partially ordered set, ordered by set inclusion. We show that every chain
in P has an upper bound in P. So suppose that C := {Ci : i G /} is a chain in P. Define
U ■.= {JC. Then

(i) [/ is a chain in (P, <): For if x, y G U, there are i,j G / such that x £ d and y G Cj.
Since C is a chain in (P, C), we have either Ci C Cj or Cj C Cj. Suppose without loss
of generality that Cj C Cj. Then both x,y G Cj. Since Cj is a chain in (P, <), we have
must have either x < y oi y < x.

(ii) J7 G P: Clearly C CU.

(iii) U is an upper bound of C in P: If d G C, then Ci C\JC = U.

It follows from Zorn's Lemma that P has a maximal element Cm Then clearly Cm is a maximal
chain in (P, <) which extends C.

(c) =^ (a): We show that every family of non-empty sets has a choice function. Let ^ be a
family of non-empty sets. Let F be the collection of all partial choice functions, i.e.

F :={/:/ is a function, dom(/) C X, and f{X) G X for all X G dom(/)}

Order F by inclusion, so that (F, C) is a partial ordering. By the Hausdorff Maximal Principle,
there is a maximal chain C in F. Let P := IJC. Then

(i) P is a function: For if [X,x), {X,y) G P, then there exist /, 5 G C so that {X,x) G /
and {X,y) G g. Since C is a chain in (F, C), either / C (7 or 3 C /. Suppose without
loss of generality that f ^ g. Then both {X, x), {X, y) G g. Since 5 is a function, x = y,
as required.

(ii) P G F: Indeed, if {X,x) G P, there is / G C so that {X,x) G /. Observe that
F{X) = X = f{X). Since / is a partial choice function, f{X) G X, and thus F{X) G X.

(iii) dom(P) = X. For suppose there is X E X — dom(P). As each element of X is non-
empty, one may choose x e X. Then G := FU{{X, x)} G F also. Moreover, D := CU{G}
is a chain, since / C G for all / G F. Then D is a chain which properly extends C —
contradiction, since C is a maximal chain.

Thus P is a choice function with domain X.

89

Axiom of Choice

Remarks 6.3.3 Various maximum principles were discovered in the years since Zermelo
proposed the Axiom of Choice, by Hausdorff, Kuratowski, Tukey, Teichmiiller, Zorn and
others.

Zom's Lemma is the (AC)-equivalent that has become the industry standard, a tool
required by all mathematicians working with infinite sets. Although it goes by the moniker
of "Lemma", Zorn actually proposed it as an axiom. Indeed, Zorn proposed the following
maximum principle.

(MP): Let ^ be a family of sets partially ordered by C and closed under the taking
of unions of C-chains. Then A has a maximum element.

We saw in Chapter 2 that every partially ordered set {X, <) is order- isomorphic to a family
of sets ordered by inclusion: Define X := {], x : x E X}, where X x := {y E x : y < x}. Then
{X, <) and {X, C) arc order-isomorphic. In particular, and chain in X corresponds to one in
X, any maximal element in X corresponds to one in X, and vice versa. Thus the maximum
principle (MP) is equivalent to Zorn's Lemma.

Observe that the proof of Zorn's lemma from (AC) required transfinite recursion. Zorn's
lemma is a method by which one can avoid transfinite recursion — it is built in, as it were.
This usually makes the technique more palatable to the working mathematician.

□

Here follows an example of the use of Zorn's lemma:
Theorem 6.3.4 Every vector space has a basis.

Let F be a vector space, and let P be the family of all linearly independent subsets of V. We

claim that (P, C) satisfies the hypotheses of Zorn's Lemma. To see this let C := {Cj : i E 1}
is a chain in P. We must show that C has an upper bound. Define U = \JC Then:

(i) [7 G P: For if vi,...,Vn € U, then there arc G / such that € Cj^. Since C
is a chain, the finite set {Cjj, . . . ,Cj^} has a maximum element, say Cj .. Then each
vi, . . . ,Vn G Cij . Since Ci- is a linearly independent subset of V, the vectors vi, . . . ,Vn
are linearly independent. Hence ?7 is a linearly independent subset of V also.

(ii) U is an upper bound for C in (P, C) Each Q G C has Ci C U.

By Zorn's Lemma, P has a maximal element B, i.e. a maximal linearly independent
subset. If w G F is such that v ^ B, then B U {v} is not linearly independent. Hence there
are 6i, . . . , 6„ G -B and scalars ai, . . . , a„ so that aibi H — • + anbn + v = 0. It is now easy to
see that each v eV is a linear combination of members of so that 5 is a basis.

H

We end with a short list of results that depend on (AC) for their validity.

• Every vector space has a basis. Moreover, any two bases have the same cardinality.

• Tychonov's Theorem in topology: The product Hie/ of arbitrary family of com-
pact spaces is compact (in the product topology). Tychonov's Theorem is equivalent to
(AC).

Set Theory

90

• Nielsen-Schreier Theorem in group theory: A subgroup of a free group is free.

• Hahn-Banach Theorem in functional analysis. This leads to various infinite-dimensional
separating hyperplanc theorems, which assert that any two disjoint closed convex sets
can be separated by a hyperplane.

• Every field has a unique algebraic closure (up to isomorphism).

• Compactness Theorem in logic: If every finite subset of a set S of first-order sentences
has a model, then E has a model.

6.4 Cardinal Arithmetic in ZFC

(AC) simplifies cardinal arithmetic considerably. It also extends its range to infinite sums
and products.

Recall that, by definition, \X\ < \Y\ if and only if there is an injection X ^ Y. Every
injection has a left inverse (without using (AC)) which is a surjection. Using (AC), every
surjection has a right-inverse, which is an injection. Hence (using (AC)), \X\ < \Y\ if and
only if there is a surjection Y ^ X .

Also recall that the alephs arc the cardinals of wcU-orderable sets: If X is well-order able,
then there is an order isomorphism (and thus a bijection) from some ordinal to X. Then \X\
is defined to be the least ordinal for which there is a bijection onto X. Now, since the Axiom
of Choice implies that every set can be well-ordered, every infinite cardinal |X| is an aleph,
i.e. for every infinite set X there is an a € ON such that |X| = K^.

Thus the cardinals form a well-ordered class

0<1<2<. ..,<i^o< i^i <••-,< i^u;<---

Every infinite cardinal appears on the list '■ ol G ON], i.e. the class of ordinals and the

class of alcphs are "order-isomorphic" as classes.
We defined the alephs inductively as follows

Ko :=

Kq+i := for successor ordinals a + 1
Ka '■= sup{Kq, : a < A} for limit ordinals A

For obvious reasons, we call a cardinal of the form Kq+i a successor cardinal. A cardinal of
the form where A is a limit ordinal, is called a limit cardinal. Every cardinal is either a
limit cardinal or a successor cardinal.

For example ^uj-,^uji are limit cardinals, whereas ^i., ^^j^i^^^^^j^^.^j^^ are successor car-
dinals.

In particular, any two cardinals are comparable (this is the Law of Trichotomy): If X,Y
are sets, then either |X| < |y| or |y| < \X\. For if \X\ = and \Y\ = K^, then < if
and only if a < /3.

Furthermore, as we showed earlier that + = • = ^a-, using the canonical
well-ordering on ON x ON, we must have « + = = for every infinite cardinal k. It
follows easily that

n + \ = K ■ \ = max{K, A}

91

Axiom of Choice

for any two infinite cardinals k,X.

Here is an oft-used fact that reUes on (AC):

Theorem 6.4.1 (a) The union of a countable family of countable sets is countable.

(b) If X is a family of sets, then

\[jX\< \X\-svip{\X\ -.XeX]

Proof: (a) Let X := {X„ : n < cj} be a countable collection of countable sets. For each
n < oj, there is a surjection /„ : a; ^ Xn : m i-> Xn,m which lists the elements of i.e.

Xn ■■= {fn{m) :m<u} = {xn,m : m < u;}

Choose such an for each n < u. We then have a surjection

/ : a; X a; -» [J A" : (n, m) Xn,m

Since we know |a; x a;| = Kq • Kq = Kq, we see that (J is countable.

(b) Let K := \X\ and let A := sup{|X| : X G X}. Thus we can enumerate

X := {Xa : a < k}
Then, for each a < k, we can choose an enumeration

Xa ■= {xa,i3 '■ f3 < Xa} whcrc = Aq, := \Xa\ < A
Fix xq g\JX, and define a map p : k x X ^ \JX hy

, f Xa,fi if ^ < Aa

I xq else

Then p is surjective, so | [J -Y] < k • A.

H

Proposition 6.4.2 If2<K<X and X is infinite, then

= 2^

Proof: 2^ < < (2'')^ = 2'^'^ = 2^, since k-X = max{K, A} = A.

H

It follows, for example, that = 2^°.

With (AC), we can define infinite sums and products of cardinals as follows:

Definition 6.4.3 (a) Let {ki : i G /} be an indexed set of cardinals. Let {Xi : i G 1} be a
family of pairwise disjoint sets such that \Xi\ = Ki. Define

Set Theory

92

(b) Let {Ki : i G 1} be an indexed set of cardinals. Let {Xi : i G /} be a family of sets such
that \Xi\ = Ki. Define

llKi = \llXi\

iei iei

where Hie/ -^i set of all choice functions, i.e. the set of all functions f : I ^ [Jiei
with the property that f{i) G Xi for all iei.

(c) If K, A are cardinals, define

K^'^ := sup{/c^ : // a cardinal, /j, < X}

□

Remarks 6.4.4 Observe that without (AC), it does not follow that J2iei '^i well-defined.
One could have \Xi\ = \Yi\ for all i G /, without having lUie/^'^il = lUie/^^l (where

{Xi : i G /} and {1^ : i G /} are families of pairwise disjoint sets). The reason is that to
construct a bijection which shows that | IJjg/ Xj| = | IJjg/ Yi\, one first has to choose bijections
fi:Xi>—^ Yi that show \Xi\ = \Yi\.

Thus (AC) is necessary to make the definition of Yliei '^i independent of the sets Xj. A
similar remark holds for Hie/

□

Remarks 6.4.5 1. Observe that if for all i G I, and if \I\ = A, then

Ki = K ■ X

iei

i.e. sums and products behave as they ough to. For if X is a set of cardinality \X\ = k,
and if we set Xi := X x {i} for all i G /, then IJie/ X x I are the same set. Now

by definition X^jgj Ki = \ Ujg/ -^^l k ■ X = \X x I\.

2. Further observe that if for all i G /, and if |/| = A, then

YlKi = K^

iei

i.e. products and exponentials behave as they ought to: For if X is a set of cardinality
\X\ = K, and if we set Xi := X for all i & I, then Yiiel -^i °^ functions from

/ to X, i.e. Hie/^i ^^'^ -^^ ^^^^ same set. Now by definition. Hie/ '^i — \ W.iei-^i\
and K^ = \X^\.

3. Next, observe that

iei iei

For if \Xi\ = Ki, then there is an obvious bijection ^ : nie/^'^i'* ~^ (Wiei ■^i)'^ defined
as follows. If / G Yli^iXf, then f : I ^ Uiel lias fi := f{i) : X ^ Xi for each
i G /. Thus fi{S,) G Xi for each i G I and ^ < A. Define / : A ^ Hiei^i

fm) = Mo = fm)-

93

Axiom of Choice

4. Similarly,

□

Proposition 6.4.6 (a) Suppose that A is an infinite cardinal. If Ki > for all i < X, then

Ki = X - sup Ki

(b) Suppose that X is an infinite cardinal. : i < XJ is an increasing sequence of non-zero

cardinals, then

JJ = (sup Ki)^

Proof: (a) Let a := "^i^x '^i' ^ supj^i : i < A}. Clearly ^ < a. Observe that

a = Ki < n = X ■ jJL < X ■ a (★)

i<\ i<X

Further note that, since each > 0,

X = ^\ <^Ki = a

i<\ i<\

Hence A ■ cr = cr, so by (★) it follows that a = X - jjl.

(b) Let vr := Hka let := supjKj : i < A}. Observe that since the Ki are non-zero,

we have tt > /x. Now since Ki < jj, for all i, we have

To prove the reverse inequality, recall that ^^^x A = A x A = A. Thus we may partition A
into a family {Aj : j < A} of pairwise disjoint sets with cardinalities \Aj\ = A. Then

TT = n «i = n ^ ) = n =

i<\ j<\ ieAj j<\

(The fact that Y\i<\ = n7<A(n7eA ^*^y establish.)

If A is a set and k a cardinal, define

[A]^ := {X C A : \X\ = A} [A]<^ — {X^A: \X\ < X}
Proposition 6.4.7 (a) If \A\ = k > X, then

K^

Set Theory

94

(b) If \A\ = K> X, and A is an infinite cardinal, then

[A\<^

Proof: (a) If / G A^, then / : A -> ^, so / C A x A, and |/| = A. Since \\ x A\ = X ■ k = k,
we see that < |[A x = It follows that = \A^\. On the other hand, for

every X G [A]^, there is at least one function Fx : X ^ A whose range is X. Hence
[^]^| < \A^\ = K^.
(b) We have

[A]<' = u [Ar

/U a cardinal <A

SO by (a)

I [^]^| = ^ = A • sup{«;'' : n a cardinal < A} = A • k^-^ =

II a cardinal <\

H

Lastly, we state and prove:
Theorem 6.4.8 (Konig's Theorem) // < Aj for all i G I, then

iel i&I

Proof: Suppose, on the contrary that Hie/ — ^iei l"'^*! ~

Since | Hie/ -^il ^ Sis/ '^i^ '^^^ ^^'^ * partition {1^ : z G /} of Ilie/ "'^i ^^^h that < Kj.
Here, the Yi are pairwise disjoint, and IJie/ ^ ~ Tlie/

For each i G /, let Pj be the projection of Yi onto the i*^ coordinate, i.e.

Observe that Pi C Xj. Now |Pj| < \Yi\ = Ki < Aj = \Xi\, so there is Xi £ Xi — Pi for each
i G I. Define h : I —> [Ji^j Xi : i ^ Xi. Then certainly h G Ilie/"'^*' Since h{i) Pi for any
i G /, we see that h ^Yi for any i G /. It follows that h G niei" "'^^ ~ Uie/ ^ — contradiction.

H

Corollary 6.4.9 For every cardinal k, we have k < 2".
Proof: 1 < 2. Now

i<K i<K;

H

Observe that we require the inequality Ki < Aj to be strict in Konig's Theorem. For
example, if Kn = Xn = 2^° for all n < Kq, then

2^0 = Ko • 2^° = 2^0

n<Ko

and

Y[ 2^0 = (2^0)^0 = 2^0
n<Ko

95

Axiom of Choice

6.5 Length, Area, Volume and (AC)

"Length" is a non-negative real number associated with certain subsets of the real line M, i.e.
it is a function £(•) which assigns to a subset C M its length C{E). Intuitively, the length
function should have certain properties:

l£. If C M is bounded, then <£(£;)< oo.

2c- C{-) is additive: If F are disjoint bounded subsets of M, then C{E U F) = C{E) +

More generally, £(•) is countably additive: if Ei, E2, ■ ■ ■ En, ... (n G N) is a sequence
of mutually disjoint bounded subsets M, then >C(U^i ^n) = X^^i J~-{En)-

3c- is translation invariant: If a set F is obtained by shifting a set E, then C{F) =
C{E)-

Ac- If is a interval, then C{E) = length of interval.

Using an argument due to Vitali in 1905, we show that it is impossible to assign a length
to every bounded subset of M, i.e. there is no function which satisfies each of the properties
(l)-(4) of C{-) above, and which is defined for every bounded subset of M. Thus there are
subsets of M which have no length. This does not mean that these sets have zero length; it
means that there is no number which can be called their length, and which is consistent with
(l)-(4).

Example 6.5.1 Define an equivalence relation ~ on M by

X ~ y *^=^ y — X G Q

Let {Ei : i G /} be an enumeration of the equivalence classes of ~. Note that if x G M, then
there exists ^ G Q such that < x + g < 1. Now since x ~ x + 5, we see that for every x
there is y G [0, 1] such that x ~ y. Thus [0, 1] H / for every i G /.

Now piclij^for each i G / one Xj G [0, 1] n Ei, and define a Vitali set H hy H := {xj :G /}.
Thus for each y G M there is a unique i £ I such that y ^ Xi-

For q £ Q, define H + q := {xi + q : i G /}. First note that the H + q are mutually disjoint:
For if y G {H + q) r\ {H + g') for rational numbers g, g', then y = Xi + q = xj + q' for some
i,j G /, and thus Xj = xj + {q' — q), i.e. Xj ~ Xj- It follows that Xj = Xj, thus that q = q' ,
and thus that H + q = H + q' .

Next, we claim that for each y G M there is a unique g G Q such that y G H + q := {xi + q :
i £ I}. Indeed, existence follows from the fact that there is an i G / such that y ^ Xi, so that
q := y — Xi has the property that y G H + q. Uniqueness follows from the disjointness of the
H + q-

Now let {qn : n G N} be an enumeration of Q Pi [—1, 1]. Note that if x G [0, 1], there is a
unique i £ I such that x — Xj G Q H [—1, 1], so that x G UneN(-^ + Since H C [0, 1], we
also have UneN(-^ + Qn) ^ [-1,2]. Thus:

[0,1] c\J{H + qn)^ [-1,2]

neN

■^This requires the Axiom of Choice.

Set Theory

96

Now suppose that the Vitah set H has an length i.e. that C{H) exists. Each H + qn is a
translation of if, and thus C{H+qn) = C{H) for all n G N. Now since [0, 1] C \J^^^{H+qn) C
[-1,2], it follows that

The fact that 1 < Y.n=i^^^) implies that C{H) > 0, whereas the fact that E^=i'C(^) <
3 < oo implies that C{H) — contradiction. Hence C M is a bounded set to which a
length cannot be assigned.

Just so, under (AC), some subsets of fail to have an area, and some subsets of fail to
have a volume. The latter seems particularly counterintuitive: Given any subset of M^, to
find its volume, just dump it in a full bucket of water, and measure how much water flows
out.

It gets even worse, as will be shown in the next section.

6.6 Project: The Banach— Tarski Theorem

The paradoxical decomposition of the unit ball U in due to Banach and Tarski (1924) is
one reason why the Axiom of Choice may be regarded with some suspicion. It says that the
unit ball U can be broken up into finitely many pieces (into five pieces, in fact), and that
these pieces can be reassembled to form two new unit balls!

First, we introduce some terminology and notation so that we can phrase the Banach-
Tarski Theorem.

• An isometry on is a distance preserving map, i.e. a map / : ^ such that
||/(x) — /(y)|| = ||x — y|| for all x, y G M^. Observe that any isometry is clearly a
bijection.

Translations and rotations (and compositions thereof) are isometries.

• Two sets A^B C are congruent if and only if there is an isometry / : — )■

so that f[A\ = B. This means that A, B have the same shape — think of congruent
triangles — and that / moves the figure A to the figure B.

We write A = B \i A, B are congruent sets.

• We say that a set A can be reassembled into a set B if there is a partition (^i)i<n of A
and a partition {Bi)i^n of B (into the same finite number of pieces) such that Ai = Bi
for all i < n.

In that case, we write A^ B.

The idea is that the set A can be broken up into pieces Ai, that each piece Ai can be
moved by an isometry to Bi, and that the pieces Bi make up the set B. Thus A can be
broken up into finitely many pieces, and the pieces can be reassembled to form B.

\n=l /
Furthermore, the sets H + qn are mutually disjoint, so

□

97

Axiom of Choice

Theorem: (Banach-Tarski)

The unit ball U in can be partitioned into two disjoint sets U = U1UU2 such that

Ui ^ U and U2 ^ U

□

Problem 1.

(a) Show that « is an equivalence relation on P(M^).

(b) Show that if (^j)i<n is a partition of A and {Bj)i^n is a partition of B such that Ai ~ Bi
for all i < n, then Am B.

(c) Prove a "Schroder-Bernstein Theorem" for ss: If A^ C B C A and A, then S A.
[Hint: Let (Aj)i<„ and (^j^)j<„ be partitions of A and A^, and let fi : Ai ^ A\ be
isometrics so that fi[Ai] = A\. Define / = Ui<n /»' / agrees with /j on ^Ij. Define

^° = A, = /[A'^], 5° = 5, B"+i = /[B'^]

Let C := Un<a;(^" - -S")- Observe that f[C] C C, and that /[C] !==j C. Further observe
that A = C\J{A-C) &ud B = f[C\ U (A - C), and deduce that A S.]

□

Our next aim is to partition the unit ball U into five pieces which can be reassembled to
form two unit balls. To do that we will partition the unit sphere S into pieces having certain
properties. {S is the boundary of U .) Let c denote the centre of the unit ball. Each X <^ S
determines a unique X <^U as follows:

X := set of all points in ?7 — {c} whose projection along a ray
from c to the surface S lies in X.

Clearly, \i X,Y C. S are such that X then also X KiY .

Consider now two axes do,di in centred at c. For dcfinitcncss, assume that do is the
Z-axis, and that di lies in the XZ-plsme at an angle 6 to the Z-axis. Consider two rotations:

ijj = Rotation by 7r/3 about do (p = Rotation by tt about di

Observe that tp, ip are isometries which map S to S.

Let / be the identity. Note that if)^ = I and (p^ = /, so that (p~^ = ip,ip'''^ = ip^-
Consider the group G of all rotations generated by ip, (p. Take, for example, the element

a := (poipoipoloil^oip. Using the facts that 1]/' = I and p'^ = /, we can reduce this to
a = o ip. In this way, every element of G has an irreducible representation of the form

(p o ip^^ o (p o tp^^ o . . . or tp^^ o (p o tp^^ o (p o . . .

Each such an irreducible representation has a length:

1{I)=0, = l{',p^^) = 1, l{ip o ^p^'^) = l{ip^^ o ^) = 2, l{p) o ^p^^ o p)) = 2, . . .

Set Theory

98

Now the group G depends on the angle 9 between (Iq and di. We want to show that
there is so that every element of G has a unique irreducible representation. Equivalently,
we want to show that we can choose 9 so that no non-trivial irreducible representation yields
the identity /.

Problem 2.

(a) Show that each element of a G G can be represented by an orthogonal matrix (rotation
matrix) whose entries are bivariate polynomials in cos 9 and sin 9 (i.e. for each entry aij
of the matrix a there is a polynomial Pij{x, y) so that a^j = pij{cos9, sin 9)).

[Hint: A rotation by tt about di can be effected by rotating the axis di by 9 onto the
Z-axis, then rotating by tt, and then rotating the axis by —9 back to its original position.]

(b) Show that an irreducible representation a G G of length > 1 yields the identity element
if and only if z := e*^ is a solution to a system of polynomials.

[Hint: cos^ =

(c) Deduce that for each a G G there are at most finitely many 9 so that a yields the identity
element. Conclude that there exists a so that no irreducible representation of length
> 1 yields the identity element.

□

Remarks: We've tried to avoid group theory in the above. What we were actually doing,
should you know some group theory, is to show that the free product Z2 * Z3 of the groups

(if) = Z2 and (ip) = Z3 is embeddable into the group SO(M^). If Go is the group of rotations
generated by ip, tp when the angle between do, di is 9, then there is an obvious homomorphism
Tr^) : Z2 * Z3 — )■ Ge- We showed that there exists 9 such that ng is an isomorphism.

□

Henceforth, fix a so that no irreducible representation of length > 1 yields the identity of
the corresponding group G. Then every element of G has a unique irreducible representation.
We define a partition G = ^Ui?UGby induction on the length of irreducible representations.
The elements of length < 1 are classified as follows:

Now classify the elements of length > 2 according to the following rules:

• If a starts with ip^'^, then

(1) a e A =^ ifoae B.

(2) a e BUG =^ (poae A.

Observe that if a starts with (p, i.e. if a = (po/3, then ipoa has irreducible representation
/3, since (po a = (p o o (3 = p. Since the length of /3 is < the length of a, o a = ^ has

99

Axiom of Choice

• If a starts with then

tpo a e B and il^"'^ o a e C.
ijjo a e C and 'tp~^ o a e A.
ijjo a e A and ^"-^ o a G -B.

If a starts with tp"^^, then ■0='=^ o a has aheady been classified.
Problem 3: Show that

(p[A] = BuC ip[A]=B tp-\A] = C

[Hint: To show ip[A] C B L) C , let a G A. If q starts with ip, then a = ip o (3 for some
/3 G i? U C, by rule (2). Hence ipoa = (3GB\JC. Else, if a starts with i^^^, then by rule
(1), ^poa e B C BUC. Hence (p o a e B U C ioi all a e A.

Conversely, to show B U C C ip[A], suppose that a £ B L) C. If a starts with (p, then
a = ip o p for some (3 £ A (and thus ip o /3 E B), so a E p>[A]. Else if a starts with ip^^, then

o a G A, by (2), so a = o (99 o q) G <^[A]. Hence a G <^[A] for all a G -B U C.
The other equations can be proved similarly.]

□

We now use the partition G = AUBUC to construct a partition S = X\JY\JZ\jQof
the unit sphere. This, in turn, will induce a partition of the unit ball B.

Problem 4: We show that there is a partition S = XUYUZUQof the unit sphere S,
where

\Q\=^o and X = Y = Z = YUZ

(a) Observe that |G| = b^o- Also note that every rotation of S other than the identity has
exactly two fixed points. Let Q be the set of all x G 5 so that a; is a fixed point of some
aeG- {I}. Conclude that \Q\ = \Ho\.

(b) For X G 5 - Q, let

:= {a{x) : a G G}

Define a binary relation Ron S — Q by xRy < — > x e Py. Show that R is an equivalence
relation.

(c) The Axiom of Choice implies that there is a set M which contains exactly one element
of each equivalence class of R. Define

X --A-M ■-{a{x):xeM,aeA}, Y := B ■ M, Z = C-M

where G = AU B U C is the partition of G constructed earlier. Show that

ip[X] = YUZ, iPlX] = Y, ^~'^[X] = Z

{3) aeA =^
{A) aeB =^
(5) a G C =^

(d) Conclude that X = Y = Z = YUZ.

Set Theory

100

□

Now recall that if X C S, then X is the set of all points inU — {c} whose projection along
a ray from the centre c onto the surface S lies in X. The partition S = XUYUZUQofS
therefore induces a partition

U = XUYUZUQU{c}

of the unit ball. Moreover, clearly

X f-^ Y Z Y U Z

Problem 5:

(a) Show that Y U Z ^ X \J {Y U Z). Deduce that

X^Y^Z^YUZ^XUYUZ

(b) Show that U ^ X\jQU{c}.

(c) Show that there is a rotation tt G such that Tr[Q] Ci Q = 0. Conclude that 7t[Q] C
XUYUZ. Using the fact that Z Ki XUYUZ, deduce that there is T C such that

[Hint: Choose an axis of rotation whose poles are not members of Q. For each angle
5 G [0, 27r), let its denote a rotation by S about this axis. For q G Q, define
Aq := {S G [0, 27r) : Trs{q) G Q}. Then A := [j^^g Ag is countable. Choose S G [0, 27r)-A,
and let tt := tts-]

(d) Let p & Z — T. Observe that

U^XUQU{c}^Yufu{p}CYUZCU
Conclude that YUZ ^U.

(e) Finally note that U = UiUU2, where Ui := X[jQU{c} and U2:=YUZ. This concludes
the proof of the Banach-Tarski Theorem.

□ 