Here comes trouble.

In lesson 2 we studied sets and in lesson 3 we studied their subsets. In the previous

lesson we studied sets whose elements are themselves sets, since after all a set is indeed a “thing”, and “things” are what make up sets. Whether it feels like it or not, we’ve actually come a long way in those three lessons—so long, in fact, that we almost can’t go any further. Let me explain.

As you may have noticed, everything that we’ve defined and discussed so far has been in some sense as “obvious” as it can be. We started in lesson 2 by considering what could be the most fundamental mathematical object, and naturally our answer was “anything”. In other words, we can take anything we’d like in the whole universe and even those things **not** in the universe and consider whatever this “thing” is as an element. We then looked for the second most fundamental concept, and again naturally came to

the conclusion that “a collection of things” is the natural next step. After considering

one “thing” (an element), what could be more obvious than wanting to consider “some

things” (a set)? All of this seems very intuitive, and indeed it is. More importantly, it is

extremely general, and therefore extremely powerful. But remember that when it comes

to mathematics, intuitiveness means nothing—all that is important is logical precision

and consistency. Let us explore whether or not we have these things here. We’ll indeed

find that we don’t quite have logical consistency yet, and this is a problem that has led to

an entire group of people studying “set theory”, which is primarily the study of how to

make our notions of sets logically sound (regardless of what our intuition says about the

matter).

First let us consider the following “warm-up” paradox. Suppose you’re a librarian and

you’re given two blank books. Suppose also that you’re asked to record every book in

the library into one of these two blank books in such a way that every book is recorded

once and only once in one of the two books. Let us get more specific. Suppose you

label your two blank books “book A” and “book B”, and you’re told to write down

every book in the library in either book A or book B in such a way that **every book is**

** recorded only once** (i.e., you don’t miss any books, and you don’t double count any

books). There are several ways you could do this. You could write all of them down in

book A exactly once, leaving book B empty. This does fulfill the requirements! You

could also record each book into book A if it has the color red somewhere on its cover,

and all the books that have red nowhere on their covers into B. Since every book either

has red or it doesn’t, you’ll get each and every book exactly once, and you’ll therefore

fulfill the requirement.

Suppose you let me come up with the rule, however. Seeing as I’m not the brightest bulb in the barrel, I come up with the following rule: **Record every book that contains its** ** own title somewhere within its text into book A, and record the books that do not ****contain their own title in their text into book B**. Thus, since the word “Harry Potter” shows up inside the text of the Harry Potter series, all of the Harry Potter books get recorded into A. Any book that goes from start to finish without mentioning its own title, however, gets recorded into B. Let’s notice that although this is a somewhat odd rule, it **should** satisfy the requirements: any book either contains its title within its pages or it doesn’t, and therefore every book will be recorded into either A or B. There is no trick up my sleeve here (or anywhere)—this is a perfectly good rule.

Now you, being smarter than I, realize that book A and book B are also “books in the library” and therefore, according to the requirement, need to be categorized into either A or B. But you’re confused, and rightfully so. What book does book B get recorded into? Recall that book B is the book that records all of the books that do not have their own title within their pages. We know that book B needs to go** somewhere**, because our rule should apply to every book in the library. Let us therefore do what mathematicians do best: guess. Suppose we record book B into book A, so that we’re **supposing** (guessing) that book B does contain its title within its pages (seeing as it’s recorded into A). But if book B is recorded in A, then it is not recorded in B since we can’t double count our books. And if it’s not recorded in B, then it does not contain its title. But this contradicts the fact that it’s in A, which says that it does contain its title!

Therefore we’re supposed to assume that our guess was wrong—book B is not recorded in A, like we had guessed. But that means that book B is recorded in book B. Thus, book B contains its title within its own pages, and this contradicts the fact that any book recorded in book B does not contain itself within its pages! So we can’t even record book B into book B! Book B is simply un-recordable; it belongs neither in A nor in B. But this is strange, because our rule was supposed to record every book just like we were ordered.

In fact, this is much worse than strange—it is catastrophic. Again, there really is no trick up my sleeve. There’s no fancy play on words or hidden meaning. What you get is what you see, and what you see is a total breakdown in what seemed to be full-proof logic.

Therein lies the problem! Our logic only **seemed** to be full proof. In our own minds, every link in the chain worked perfectly. We had a perfectly well- defined problem, we came up with a perfectly well-defined rule, we used what seemed to be perfect logic, and we found ourselves led to a perfectly inescapable paradox. In order to do mathematics, we need a system of logic that is free of such paradoxes. Seeing as mathematics doesn’t bend to suit our intuition, we’re forced to bend our intuition in order to suit mathematics. But before we discuss how this is done, let us see how this paradox translates into the mathematical language that we’ve developed so far.

We can reformulate this entire paradox into the more precise language of set theory. We know that we can discuss sets of sets, where sets are themselves the individual elements of some set. Moreover, we know that a set is just any collection of elements that can be distinguished from each other. Thus, since we know how to distinguish any two sets and/or tell if they’re the same set, we can make up arbitrary collections of sets and define some new set whose elements are those prior sets. In particular, we could theoretically define the set of all sets. There’s nothing in our formalism that forbids us from doing so.

If we define the set of all sets, then we’re forced to realize that this set contains itself. After all, it is the set that contains **all sets** as elements. Since it is a set itself, it must contain itself as an element. Thus, the set of all sets contains itself. (For the curious reader, there goes the solution to the second exercise in lesson 2—now you know why I said it’s not obvious!) Therefore we know that some sets contain themselves, and some sets don’t. We’ve at least found one set that does contain itself. Of course, most sets don’t contain themselves. For example, the set {1,2,3,4} doesn’t contain itself because it only contains the numbers 1, 2, 3, and 4, whereas the set itself is a set of four numbers! Note also that the set {a} doesn’t contain itself either, because it contains only the element “a”, whereas the set itself is the set made up of only the element “a”. In other words, we know that we have to distinguish between the set and its elements, just as we’ve always done: a set of 10 apples does not contain itself because “the set of 10 apples” is not an element of this set (the elements of the set are apples, not sets of apples).

But since we know that there is at least one set that contains itself, we know that it should be an interesting question to study “the set of all sets that contain themselves”. Similarly, we could consider “the set of all sets that don’t contain themselves”. In other words, you hand me a set, I ask if it contains itself or not, we figure out that answer, and we place it in the appropriate set. Notice that we should be able to place every set in either “the set of all sets that contain themselves” or ”the set of all sets that don’t contain themselves”, since every set either contains itself or it doesn’t (this is the same as saying every book is either red or not red).

Accordingly, the set of all sets that do not contain themselves should be in one of these two sets. Let’s see which one. If “the set of all sets that do not contain themselves” is in “the set of all sets that do contain themselves”, then “the set of all sets that do not contain themselves” must contain itself . But this means that it both contains itself and doesn’t contain itself, because it is both in “the set of all sets that do contain themselves” and “the set of all sets that don’t contain themselves”. This is clearly a contradiction.

Thus, “the set of all sets that do not contain themselves” should be in “the set of all sets that do not contain themselves”, since it has to be in one or the other (and we just showed that it can’t be in “the other”). But then “the set of all sets that do not contain themselves” actually does contain itself! We therefore have another contradiction! We’ve found a set that cannot be categorized into either of these two sets, despite the fact that these two sets were supposed to contain, between the two of them, every imaginable set! Again, there is no trick up my sleeve. This is a genuine flaw in how we’ve been manipulating sets!

In order to avoid contradictions like these, we have to be a lot more careful about how we define and manipulate sets, despite the fact that what we were doing before seemed so obvious and intuitive. Again, it’s not up to us to decide how mathematics should behave, but rather up to us to align our logic and intuition to its behavior.

The details of how we should describe and manipulate sets take us far away from the

type of math that we would like to explore here, so we will leave this for now. We’ll

eventually find a way out of these problems using what’s called Category Theory, but

that won’t be for a while. The truth is that a good deal of mathematics can be discovered

by using this intuitive description of sets, and so that’s what we’ll do. Note that this most

certainly does not mean that the mathematics that we’ll discover together is in some way

placed on un-sturdy foundations. The math that we’ll describe will hold even with the

more careful, detailed, and technical definitions of sets. All we’re doing is avoiding that

tedium, but we’re not “dumbing down” mathematics, or deriving untrue statements. So long as we don’t deal with sets that lead to contradictions like the one above (which is known as Russell’s paradox), we’ll be deriving unassailably true and eternal statements.

The whole point of this lesson was to show that even in the seemingly naive and simple

theory of sets, there is immense subtlety and beauty (I find logical difficulties such as

these extremely gorgeous, and if the reader disagrees she should not lose heart—there

will be more beautiful things to come). For now, we happily admit the existence of these

subtleties, but we leave them behind and bull-doze our way into cooler and cooler

math…

I’m really loving your explanations so far, but the way you’ve presented Russell’s paradox feels a little misleading. The way you’ve built up to the paradox, you give the impression that allowing sets to contain themselves is what leads to Russell’s paradox; this is not the case. The trouble in “the set of all sets that don’t contain themselves” is the first part: “The set of all sets such that…”

[technical stuff: ZF without the axiom of foundation still avoids Russell’s paradox, but you can define a set A={A}, which is completely fine. Leaving out foundation really only means not every set lives in the cumulative hierarchy, which is a bit silly, so we gain a lot of convenience and lose nothing by adding foundation.]

I’m glad you’re liking it so far, and thanks for the insight on this lesson. What I’ve tried to do with this lesson and the ones leading up to it is show, or at least allude to the fact that the problems start with considering sets of sets. This I think is exactly what you mean when you say the problem starts with “the set of all sets such that”. We can of course consider sets of sets of numbers or sets of sets of apples without having to be too careful, but sets of sets with more abstract definitions we have to be more careful about. It is the technical ZF stuff that gets into figuring out just how careful we have to be, but at the end of the day (it seems to me at least) we begin to set ourselves up for trouble by considering sets of sets at all. In this lesson I use the fact that this consideration allows us to consider the set of sets that do (resp. do not) contain themselves, but this is just one example of a set of sets.

You clearly know all of this math, but I just wanted to clarify what I was viewing as “the real problem”. Of course, what ever “actually is” the real problem is way more subtle than what I’ve described, and that’s why there are so many people working on it. This lesson is here primarily to show how not be careful, even with some seemingly simple ideas, can lead to trouble. In short, Russell’s paradox can be seen by considering sets that contain themselves, but I tried to hint to the fact that it’s not containing yourself that’s the problem in its own right, but rather the much more general concept of sets of sets at all.

And thanks again for the insight!

“and record the books that do not contain their own title in their text into book B”

(…)

And if it’s not recorded in B, then it does not contain its title. “.

i’m a bit confused!

Right, great question, thanks for asking it. So the first sentence says that we take each book and ask “does this book contain the same words as its title anywhere within its text?”, and if it does do this, then we put record it in book A, and if it doesn’t, it gets recorded into book B. But then we want to ask this same question about book B itself. Namely, we want to pick up book B and look through its pages and see if it contains the words “Book B” anywhere within its interior pages. If it does not contain those words, then it does not contain its own title. Thus, if book B is itself not recorded in book B, meaning it is recorded in book A (because it must be recorded somewhere), then the words “book B” do not appear within the pages of book B. Accordingly, book B does not contain its own title.

Does that clear things up? If not, please let me know because a) I’m happy to try again and b) this is indeed somewhat confusing, and thus warranting of a healthy amount of confusion!

Thanks a lot for your interesting teaching. I have a question:

what is the difference between ( B is subset of A) and ( set A contains set B)?

I feel the answer by intuition, but I need more details.

Good question, and in fact there is no difference at all. The statement “B is a subset of A” is a completely equivalent statement to “the set A contains the set B”. No difference at all, just two different ways of conveying the exact same idea. Does that make sense? If not please don’t hesitate to ask!

Think it should be phrased as set A contains the elements of set B, rather than set B itself. eg.

B = {1, 2}

A = {1, 2, 3}

If set A contains set B, then it probably looks like this:

A = { B, 3} = { {1,2}, 3} which won’t make set B a subset of A.

If set A contains the elements of set B, then it probably looks like this:

A = {1, 2, 3}

Maybe it’s the language that is confusing rather than the maths! Or should one just say that A is the superset of B?

You’re entirely correct, though the distinction is already implied. In general, when it’s clear what we mean by “A contains B”, i.e., when it’s clear whether we mean “A contains B as a SUBSET” or “A contains B as an ELEMENT” (which, as you rightfully point out, are two very different statements), we simply say “A contains B”. As you can go back and check, whenever we say “A contains B”, it should be clear as to which meaning we adopt. If not, please do let me know. It is just too tedious to always specify the difference especially when the difference is clear (the mathematical community has implicitly agreed upon this convention).

Jpsk, Thanks a lot for your comment. TBOM, Thanks a lot for your reply.

LOL, sometimes we’re so used to some terminology or notation (sin^2 x is another notorious example) that we forget how much confusion they could create if viewed from a unfamiliar perspective!

To contain B meaning to contain it as an element definitely makes more sense linguistically, but I guess the use of subsets was so frequent that mathematicians stuck to it meaning containing its elements.

Unrestricted Comprehension

========================

It’s not just that there’s “nothing in our formalism that forbids us from” defining sets like the set of all sets and the set of all the sets that contain themselves (as an element, that is); I think you could mention our implicit assumption of unrestricted comprehension (i.e., `∀w_1,…,w_n ∃B ∀x (x∈B ⇔ φ(x,w1,…,wn))`) as it would make explicit where the trouble is coming from.

Russel’s Paradox

==============

Also, I agree with Coryory that the build-up from the set of all sets is a bit mis-leading! I realize that you’re just trying to transition smoothly between topics, but I myself only *really* could get it after reading the Wikipedia intro and then the paragraph in exclusion. As with bugs, the shorter the repro steps, the clearer the heart of the problem.

Meta

====

Also, forgive me for nitpicking, but it bother’s me to read ‘full-proof’; Google Dictionary and Urban Dictionary confirm my suspicion that it is a misspelling of ‘foolproof’ (meaning that the word ‘fool-proof’ is not fool-proof itself ;). As always, thanks for the posts! (I’m never able to cover more than one a day, as they force me to think and consequently send me down a rabbit hole into Wikipedia!)

Thanks as always for the reply. Even though discussing Unrestricted Comprehension may make this whole discussion a bit more “rigorous” (quotes because at the level of axiomatic set theory it’s tough to know what we means by that word), it would go against the philosophy of this site, which is to maintain a level of simplicity (as well as rigor) to both give properly give a feel for what higher math is all about while also being simple enough for non-experts and people not interested in spending hours figuring things out. So describing the assumption of unrestricted comprehension would be way too far off topic here, and completely scare away people in the exact way that this site is meant not to.

I understand, and so did the people behind the concept of footnotes: showing critical but specialized information to only those who knew or cared enough since 1849 (complete made it up as my Google-fu couldn’t find the answer before my 5-minute attention span timed out).

[ And if it’s not recorded in B, then it does not contain its title. ????]

If a book is not in the group that does not contain its own title, then it is in the book that does contain its own title (namely Book A). By definition then, it does contain its own title.

I don’t get it.

Exactly, that’s the conclusion one would want to draw. But, if we then ask what happens if it is in the group of books that do contain their own title, we ALSO get an impossibility. Therefore, the book can be classified in neither group! That’s the paradox. The paradox is that we cannot put it in EITHER group even though both groups were supposed to be sufficient to cover all books.

Namely, if we suppose book B does contain its own title, then it must be recorded in the group of books that contain their own title, which is book A, which means its NOT recorded in book B, which means it does not actually contain its own title!

Does that make sense? (And don’t worry, it’s supposed to take some time to sink in, it’s a paradox that helped make the whole world of conventional math come crashing down over a century ago.)

Aha, the penny drops.

It took me a while to get my head around the fundamental idea that the title “Book B” is listed in the book “Book B”. If this is so, then the title “Book B” must be listed in Book A because it contains its own title. But “Book B” can’t be listed in both Book A and Book B.

Thanks for explaining that.

Awesome! That’s exactly it 🙂 Always happy to help and don’t hesitate to ask anything else. And PS: “The penny drops” is a new one for me—I haven’t heard that phrase before…I’ll start using it!

Hello,

I’m really thankful to know this site, I start to learn true beauty of maths from your site.

For this lesson, I would like to ask you a question : Why do you have to guess if the book B (or A) will be recorded in book A or B ? For me, because the book A and B don’t have a title (they are blank in the begining) if I have to put them in book A or B, it will be definitely in book B. But to regard to the recording requirement, the books recorded must have a title so neither A nor B will be recorded because they don’t have a title.

Thank you for reading and replying,

Huyen

Hi there! I’m glad you’re liking the site 🙂 As for your question, I guess I’m a little confused, because books A and B do indeed have titles: their titles are “Book A” and “Book B”. Namely, suppose we gave more interesting names to the books, like “Alice in Wonderland” and “Beauty and the Beast”. And suppose we decided to record every book that contains its own title in “Alice in Wonderland” and every book that does not contain its own title in “Beauty and the Beast”. It doesn’t matter whether or not these books already have any text in them—we could suppose that they’re full-fledged books already and that we’re just using the last page or the margins to record all the books. Even in this circumstance, we can still ask the question: What book does “Beauty and the Beast” get recorded into? If it is the case that “Beauty and the Beast”, as well as “Alice in Wonderland” don’t

alreadyhave the phrase “Beauty and the Beast” in them, then we have the same paradox. Namely, we cannot record “Beauty and the Beast” in itself, for then it would contain its own title, and we cannot record “Beauty and the Beast” into “Alice in Wonderland”, for then “Beauty and the Beast” would not contain its own title and therefore should not be recorded in “Alice in Wonderland”. We therefore see that this paradox has nothing to do with whether or not books A and B already have text in them (i.e., are empty or not), or what their titles are. It only has to do with the recording scheme itself, which is a problem. Does that make sense? If not, I’m happy to try again! 🙂