The Thing Explainer Challenge

Randall Munroe of xkcd fame has a new book coming out, in which he explains various concepts using a small repository of “simple” words (this is based on this xkcd comic). He recently posted this blog post, where he reveals a word checker program that he wrote to help him with the task.

So I figured, why not use this for explaining mathematical theorems?

The Challenge

Use the xkcd “simpler writer” to present a mathematical theorem. Bonus points for showing the proof as well. The only three words which are allowed outside the checker’s scope are the titles “Definition”, “Theorem” and “Proof”. (This is a variant of “Maths, Just in Short Words” by David Roberts from last year.)

Example: Cantor’s Theorem

Theorem. Given a set A, there is no means to give each member of A a single set whose members are members of A, such that every set whose members are members of A is given to some member of A.

Proof. Suppose that we have some way of giving each member of A a set whose members are members of A. Consider now the set whose members are exactly those who do not belong to the set they were given. If a member of A is given that set, then it is a member of the set it was given if and only if it does not belong to the set it was given. This is not possible, of course. So we found a set whose members are members of A, and it was not given to any member of A.
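For readers who like to see the argument run, here is a small sketch (outside the simple-words game) that checks the proof by brute force on a tiny finite set. The function names and the three-element set are my own illustrative choices, and a finite check is of course no replacement for the proof.

```python
from itertools import product

def diagonal_set(members, given):
    """Members that do not belong to the set they were given."""
    return frozenset(a for a in members if a not in given[a])

def every_assignment_misses_diagonal(members):
    """Try every way of giving each member a subset of `members`
    and check that the diagonal set is never given to anyone."""
    members = list(members)
    # All subsets of `members`, built from every True/False pattern.
    subsets = [
        frozenset(m for m, keep in zip(members, bits) if keep)
        for bits in product([False, True], repeat=len(members))
    ]
    for choice in product(subsets, repeat=len(members)):
        given = dict(zip(members, choice))
        if diagonal_set(members, given) in given.values():
            return False
    return True

# One concrete assignment on the hypothetical set {0, 1, 2}.
example_given = {0: frozenset({0}), 1: frozenset(), 2: frozenset({0, 1})}
example_diagonal = diagonal_set([0, 1, 2], example_given)

# All 8**3 = 512 assignments on a three-element set.
all_miss = every_assignment_misses_diagonal([0, 1, 2])
```

In the concrete assignment, 0 belongs to its own set while 1 and 2 do not, so the diagonal set is {1, 2}, which is indeed none of the three sets handed out.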

Feel free to add your simple writing of mathematical statements, their proofs and otherwise in the comments!

Name that number

In the best TV show ever produced, Patrick McGoohan plays the mysterious No. Six. He lives in The Village, where former spies are held. The people there are essentially captive, and they all have numbers instead of names. But he is not a number! He is a free man!

We find a similar concept in Zelda’s poem “Every man has a name” (לכל איש יש שם), which in Israel is closely associated with the Holocaust and with assigning numbers to people. But alas, we are all numbers in some database. Our ID numbers, employer number, the index under which you appear in the database. You are your phone number, and your bank account number. You are the aggregation of all these numbers. And more.

But we are not here to talk about people, we are here to talk about numbers. Numbers have names as well. $1$ is the name of the unit, $\pi$ is the name of the ratio between a circle’s circumference and its diameter, and so on.

One common fallacy is that “the set of nameable numbers is countable”. Joel Hamkins explains it well. But let me offer a slightly different approach as to why this is an inaccurate state of affairs. And while we’re at it, we can investigate what it means for a real number to have a name.

The naive notion of nameable numbers. We like to think about nameable numbers as numbers which we can explicitly define and name. Like $\pi$, like $e$. But that is not really hitting the nail on the head. What we do think about is that there is a mathematical definition for the number. So $\pi$ is some line integral, and $e$ is a certain limit, or defined from $\exp$ (which in turn can be defined as a unique function satisfying a differential equation, or as the inverse of a logarithm (which in turn can be defined as an integral)).

Let’s examine the definition of $e$ as $\exp(1)$, where $\exp$ is the inverse function of $\log$ with $\log t=\int_1^t\frac1x\operatorname{d}x$. We have integrals, we have inverse functions, and then we have evaluation. This is not a first-order definition over the ordered field $\Bbb R$. It’s barely even a second-order definition.
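To make the three layers (integral, inverse function, evaluation) concrete, here is a rough numerical sketch, with the integral taken from $1$ so that $\log 1=0$. The function names, the midpoint rule, the bisection bracket $[1,4]$ and the step counts are all my own assumptions for illustration; none of this is a definition, only an approximation of one.

```python
def log_integral(t, steps=10_000):
    """Approximate the integral of 1/x from 1 to t (t > 0), midpoint rule."""
    h = (t - 1.0) / steps
    return sum(h / (1.0 + (i + 0.5) * h) for i in range(steps))

def exp_by_inversion(y, lo=1.0, hi=4.0, iterations=40):
    """Approximate exp(y) by bisection: find t in [lo, hi] with log t = y.

    Assumes log(lo) <= y <= log(hi); log is increasing, so bisection applies.
    """
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if log_integral(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Evaluation: e is exp(1).
e_approx = exp_by_inversion(1.0)
```

Each layer quantifies over something bigger than a single real number (a function to integrate, then a function to invert), which is exactly why this does not fit into a first-order formula over the ordered field.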

So even naively, we cannot really make any appeal to first-order definability over $\Bbb R$ as an ordered field (if we could, then the nameable numbers would be exactly the real algebraic numbers; since $\pi$ and $e$ are transcendental, they would be without names). Maybe we can say that a real number has a name if there is some $n$-th order logic definition of it over $\Bbb R$. And of course, that sounds pretty good. And we can prove, now, that there are only countably many definitions, so only countably many reals will have names, and therefore most reals are undefinable, unnameable, you name it.
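The counting step in that argument is just the fact that definitions are finite strings over a finite alphabet, and such strings can be listed one after another. A minimal sketch, assuming a toy two-letter alphabet in place of the symbols of an actual formal language (the name `all_strings` is mine):

```python
from itertools import count, islice, product

def all_strings(alphabet):
    """Yield every finite string over `alphabet`, shortest first."""
    for length in count(0):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)

# The first few entries of the list; every string shows up at some finite position.
first_seven = list(islice(all_strings("ab"), 7))
```

Filtering this list down to the strings that happen to be well-formed definitions leaves a countable list, which is all the naive argument uses.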

Here we run into two problems very, very quickly. The first is independence: using second-order logic we can write the statement “The continuum hypothesis holds”, so if we are permitted to use second-order definitions, we can define the real which is $1$ if the continuum hypothesis is true, and $0$ if it is false. Is this real a nameable real? You know that it is either $0$ or $1$, but you have no practical way, sans adding more axioms to mathematics, to know which value it has.

Okay, so maybe that’s cheating, maybe you want to be able to prove from $\ZFC$ that a nameable number has a certain value. By this I mean, of course, that we should be able to decide with an algorithm whether a given rational number is smaller or larger than our nameable real. That sounds like a very cogent requirement of nameable reals. But it is again insufficient, simply because we can point at Chaitin’s constant and say that it has a fairly definite definition. You could argue against this, and that is fine, but to say that Chaitin’s constant is not a nameable number is essentially to say that only computable reals are nameable numbers. That is a valid approach to constructive mathematics, but it is not without difficulties. And since we want our mathematics to be simple and easy to use, computable analysis is not the road to take here.
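To see what this requirement asks for, here is a sketch of such a decision procedure for one particular real, namely $e$. It uses the standard bounds $s_n\lt e\lt s_n+\frac1{n\cdot n!}$ for $n\geq1$, where $s_n=\sum_{k=0}^n\frac1{k!}$, and exact rational arithmetic; the function name is my own. It terminates on every rational input because $e$ is irrational. For Chaitin’s constant no such procedure exists, which is the whole point.

```python
from fractions import Fraction

def compare_with_e(q):
    """Return -1 if the rational q is smaller than e, and 1 if it is larger."""
    q = Fraction(q)
    partial_sum = Fraction(1)  # s_0 = 1/0!
    term = Fraction(1)         # 1/n! for the current n
    n = 0
    while True:
        n += 1
        term /= n                       # now 1/n!
        partial_sum += term             # now s_n
        upper = partial_sum + term / n  # s_n + 1/(n * n!), strictly above e
        if q <= partial_sum:
            return -1  # q <= s_n < e
        if q >= upper:
            return 1   # q >= s_n + 1/(n * n!) > e
```

For instance, `compare_with_e(Fraction(27, 10))` answers "smaller" and `compare_with_e(Fraction(272, 100))` answers "larger", each after a handful of terms.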

The second problem, which is equally bad, is that $n$-th order logic definitions are simply not sufficient. It is not hard to come up with a real number which is not definable by any $n$-th order formula, but has a definite definition. Definite enough that we may consider it nameable. (And we should consider it nameable if we allow the “continuum hypothesis real” to be nameable!)

Set theory to the rescue? Okay, this is not really about set theory. This is about a foundational theory. Whichever it is that you like. If it allows some notion of real numbers which is “adequate”, then it will usually come with (or be bi-interpretable with) some mathematical world in which these real numbers live. So we can ask that a nameable number be a number which is definable in such a fixed world. And here comes the kick-ass result, that it is possible that every real number is definable. Does it mean that the real numbers are countable? Yes, but not necessarily inside the model. Since the definitions now are not “inside” the mathematical world, but rather “outside” that world (they are in the foundational theory’s meta-theory), we cannot quantify over the definitions and we cannot definably match a real with its definition.

So we get a universe where every real number is definable. And that is pretty amazing. Note that we’ve switched from the problematic nameable to the mathematical “definable”. Because the notion of “nameable” is not really a mathematical notion. We like to think about numbers with names as numbers which come up organically, and naturally, from our lives or nature. But the truth is that we cannot know or not know what had, have, will or hadn’t, haven’t and won’t come up and how these numbers are aligned with our perception.

And the major issue with names and definitions is that they live in the meta-theory. If you work with the real numbers, your meta-theory is “mathematics as you see it” or however you choose to formalize the notion of the real numbers and proofs and so on (let’s say, for the sake of things and since it’s my website, that $\ZFC$ is your choice). So now $\ZFC$ is your meta-theory. But once you allow definitions not to come from some language about the real numbers, but to come from the entire wrath, might and power of your meta-theory, suddenly you find yourself appealing to its meta-theory.

If you’ve gone cross-eyed, don’t worry. You’re in good company. And what is my conclusion, then? That we should remember that we formalize mathematics for a reason, and that some concepts like “nameable” are too ideal (in the Platonic sense of the word) to be given an explicit interpretation. Just like any definition of a chair is either circular (a chair is a chair), excessive (non-chair objects satisfy it), or insufficient (some chairs do not satisfy the definition).

How to solve your problems

Anyone who peruses mathematical Q&A sites, or has had students come to office hours or send questions via other means (email, designated forums, carrier pigeons, or written on a note tied to a brick tossed into your office) knows the following statement: “I don’t know where to begin”, or at least one of its variants.

Richard Feynman, who was this awesome guy who did a lot of cool things (and also some physics (but I won’t hold it against him today)), has a famous three-step algorithm for solving any problem.

  1. Write the problem down.
  2. Think. Real. Hard.
  3. Write the solution down.

While Feynman’s algorithm is quite simplistic, it really hits the nail on the head. But still, we seem to fail in solving a lot of our problems. Young students especially. Most of them might argue that the second step is really difficult, unless your name starts with Richard and ends with Feynman. While that’s not entirely wrong, what most people miss most of the time is the first step.

Writing the problem down does not mean just writing down the actual question as given to you in the exercise sheet, or writing the theorem that you wish to prove. It means that you have to unwind the definitions, and unwind exactly what you have to verify, until you hit a sufficiently strong bedrock of understanding.

For example, if you have to prove that $\aleph_1\leq2^{\aleph_0}$, then you need to first understand what all those symbols mean. $\aleph_1$ is the cardinal of the least uncountable ordinal, $\omega_1$; $2^{\aleph_0}$ is the cardinal of $\mathcal P(\Bbb N)$; $\leq$ means that there is an injection from $\omega_1$ into $\mathcal P(\Bbb N)$. So we need to show that there is such an injection. If you’re unclear as to what $\mathcal P$, “uncountable” and “injection” mean, then you have more definitions to unfold here.

Now we think. We create this graph of theorems and associations. What do we know about a power set? We know that $|X|\lt|\mathcal P(X)|$. Therefore the cardinal $2^{\aleph_0}$ is uncountable. We know that the axiom of choice implies that every two cardinals are comparable. We know that $\aleph_1$ is the smallest uncountable cardinal, therefore it cannot be strictly larger than $2^{\aleph_0}$. Ah, so we know that $\aleph_1\leq2^{\aleph_0}$.
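Written out as a chain, with each step labelled by the fact it uses, the argument above reads roughly as follows (the layout is mine, the content is just the paragraph above):

$$\begin{aligned}
&\text{(Cantor)} && \aleph_0=|\Bbb N|\lt|\mathcal P(\Bbb N)|=2^{\aleph_0},\ \text{so } 2^{\aleph_0} \text{ is uncountable.}\\
&\text{(Choice)} && \text{Either } \aleph_1\leq2^{\aleph_0} \text{ or } 2^{\aleph_0}\lt\aleph_1.\\
&\text{(Minimality)} && \text{If } 2^{\aleph_0}\lt\aleph_1, \text{ then } 2^{\aleph_0} \text{ is an uncountable cardinal below } \aleph_1, \text{ which is impossible.}\\
&\text{(Conclusion)} && \aleph_1\leq2^{\aleph_0}.
\end{aligned}$$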

The above description seems to include some redundancies. Students expect questions without any redundant details. Last year when I had a question with a minor additional detail (some function didn’t need to be surjective), I got complaints from students that they didn’t have to use that information. But that’s actually a good thing: to know and understand that some details are not needed, but that maybe they are there to guide you towards some nontrivial piece of information.

We didn’t use the definition of $\leq$, but rather some abstract theorem about it and the cardinals involved; and we didn’t really use $\mathcal P(\Bbb N)$ and $\omega_1$; not to mention that we sort of skipped over a few trivial things. But was mentioning them really redundant information?

However, there are two fine points here. The first is that these details help give us a more complete picture of the problem, even if you don’t use all of them. The second is that the theorems that we did cite relied on that information implicitly. So it is always good to refresh your memory with these things. Of course, when you’re well trained in a particular topic, you have a rather comprehensive bedrock, which is why your brain already made those connections and it was trivial to prove that $\aleph_1\leq2^{\aleph_0}$.

Now you might wonder, we could have written so many more details. We could have appealed to a dozen other definitions and theorems. Why these ones? Well. That requires practice. You will probably write all the theorems and definitions at first. And with time, and solutions, your brain will train itself to find those quicker connections where you don’t have to go all the way down to the turtles. Instead you will only have to write the immediate definitions and one or two theorems.

And when you’ve got to that point, you know that you’re ready for the next level of exercises.

In any case, this is why I always tell my students in introductory courses that the first thing to do when you read an exercise is to read it and understand it. And when we do a homework question from the week before on the board, I usually copy it on the blackboard and ask my students what the first thing to do here is. Many times someone will shout “You do this mathematical manipulation” to which I always say “No. We first read and understand the problem!” and then we review the question, the relevant definitions, and then we usually move to the earlier suggestion. I strongly suggest that every TA, or professor who solves problems on the board with students, do the same. It keeps students involved and the break from “dive into the proof” is always refreshing to students, even if they won’t admit it.

Finally, since we brought up Feynman and education, here is a marvelous video of him explaining why he does not want to explain magnets and magnetic force. It’s taken from “Fun to Imagine” (see this page), and if you have the time, you should watch the entire thing or at least his explanation of fire, which culminates in the poetic imagery that when you burn a piece of wood, the light and heat from the fire is the light that came from the sun, and was the energy that broke apart the carbon dioxide into carbon and oxygen which are reuniting in the flames. Awesome.