Eternal preliminaries part 2, filters and ultrafilters

Last time I wrote about the basic structures, (partial) semigroups. But algebra in the Stone-Cech compactification deals, well, with the Stone-Cech compactification. I will try to ignore the general theory of compactifications because we only deal with a very simple case — discrete spaces. Suffice it to say that any elementary topology book should have a chapter on compactifications if you want to read more.

Filters and ultrafilters

Already at the end of the first part, I needed to refer to the notions of filters. I don’t want to talk too much about filters or ultrafilters formally because a) we’re going to talk about them all the time anyway and b) wikipedia (to which I link) is much better at giving you a concise but broad overview than I am. Let me give you a cheat sheet though.

  • A family of subsets of $S$ is a filter if it does not contain $\emptyset$, is closed under taking finite intersections and supersets.
  • A family of subsets of $S$ has the (strong or infinite) finite intersection property or (i)FIP if the intersection of any finite subfamily is non-empty (infinite).
  • A family with (i)FIP generates a (free) filter by closing it under finite intersections and supersets.
  • A filter $F$ is an ultrafilter iff it is a maximal filter (with respect to inclusion) iff it is prime $(A\cup B \in F \leftrightarrow A \in F \mbox{or} B\in F)$ iff $(\forall A\subseteq S) A\in F \mbox{or} S\setminus A \in F$.
  • We identify $s\in S$ with the principal ultrafilter $\{ A\subseteq S : s\in A\}$.

Filters can be considered as 0-1-valued, finitely-additive measures (or rather the measure 1 sets of such a measure) in which case ultrafilters are complete measures which is an idea I might use in “prose” every once in a while. You can also consider them as a form of universal quantifiers which gives another intuition. A useful shorthand will be “almost all (with respect to some ultrafilter $p$)” or “$p$-many” or “$p$-large” etc. instead of, say, “there exists a set in the filter such that for every element in that set…”.

We’ll get back to this later. One final note, $F,G,H$ are usually denoting filters, $p,q,u$ ultrafilters. We’ll discuss many filters explicitly later (and in part 1 we already considered the filter $\sigma(S)$) but the existence of ultrafilters is a tricky business that requires quite a bit of the axiom of choice.

The Ultrafilter Lemma Every filter can be extended to an ultrafilter.

Proof.

  • The set of filters is partially ordered by inclusion.
  • The union of any chain of filters is a filter.
  • Apply Zorn’s Lemma to find a maximal element.

This theorem is weaker than the axiom of choice, but very strong already in itself (looking up that link I just learned that Tarski himself proved the existence of non-principal ultrafilters in 1930; wikipedia. awesome.). Of course, the real power lies in the three characterizations of ultrafilters in the cheat sheet, so let’s prove the difficult one

A filter $F$ is maximal iff $F$ is prime, i.e., $(\forall A\subseteq S) A\in F \mbox{ or } S\setminus A \in F$
Proof.

  • If $F$ is maximal, $A\subseteq S$, then either $A \in F$ or $S\setminus A \in F$.
    • If there exists $B\in F$ such that $A \cap B = \emptyset$, then $B \subseteq S\setminus A$, so $S\setminus A \in F$ and we’re done.
    • Otherwise every $B\in F$ has $A \cap B \neq \emptyset$ in which case $F\cup \{A\}$ has the FIP, hence generates a filter.
    • But $F$ was maximal, so the generated filter cannot gain new elements.
    • In other words, $A\in F$.
  • If $F$ is prime, then $F$ is maximal.
    • Consider any family $G$ such that $F \subseteq G$, but there exists $A \in G \setminus F$.
    • Since $A \notin F$, then by primeness of $F$, $S \setminus A \in F$.
    • Therefore, $A, S\setminus A$ are both in $G$.
    • In other words, $G$ is not a filter — in other words, $F$ is maximal.

The Stone-Cech compactification (apologies for missing haceks)

The set of ultrafilters is often denoted by $\beta S$ and it turns out to be the Stone-Cech compactification, i.e., the maximal compactification of $S$, because $S$ is discrete. There’s a gazillion things to be said about $\beta S$. To get started, we should celebrate the most practical and in fact characterizing property of the Stone-Cech compactification.

Universal Property of $\beta S$ If $X$ is compact and Hausdorff, $f: S \rightarrow X$ continuous (in our case, any map is), then there exists a unique continuous map $\beta f: \beta S \rightarrow X$ that extends $f$. We usually identify $\beta f$ with $f$ for convenience.

The easiest way to do this in our setting, is to take the limit along ultrafilters. But for now we don’t need to.

An interlude about extensions If $f: S \rightarrow S$, then we can describe the image quite nicely, namely \[ f(p) = \{ B : (\exists A \in p) f[A] \subseteq B \}. \]

Often this definition is given by $ f(p)= \{ B: f^{-1}[B] \in p\}$ but I think this is a perfect example of the stupid tendency of mathematicians to write a definition as efficiently as possible even though the compression does more harm than good — as a student it always confused the hell out of me and I mixed it up with preimage filters (which are more difficult to define unless $f$ is surjective). To remember: $f(p)$ is the unique ultrafilter generated by the family $ (f[A])_{A\in p}$, the filter generated by the images. Yes, it’s longer to write down, it’s not as self-contained a definition, but really: it does make more sense that way, no? And who’d think the self-contained definition in itself helps anyone understand anything anyway…

Extending the semigroup operation

We want to extend our (partial) semigroup operation to $\beta S$. The trouble is that the extension won’t be unique and from a theoretical point of view each of those non-unique extensions can be defined using different techniques (resulting in the same kind of extension). The problem of uniqueness also leads to four different descriptions when it comes to the continuity of the operation, but let’s first get started.

I “grew up” with the book by Neil Hindman and Dona Strauss, so I tend to follow their set up (regarding which kind of extension we want).

Using the universal property of $\beta S$

  • For each $s\in S$, we can consider $\lambda_s: S \rightarrow S\subseteq \beta S, t\mapsto s \cdot t$, i.e., multiplication with a fixed left-hand side.
  • This is a continuous map (since any map on $S$ is), so we can extend it to $\beta \lambda_s : \beta S \rightarrow \beta S$; we simply write $s \cdot q$ for this.
  • Now switch it around and for $q\in \beta S$ consider this extended multiplication with $q$ fixed on the right hand side, i.e., the map $\rho_q: S \rightarrow \beta S, q \mapsto s \cdot q$.
  • Again this is a continuous map (since any map on $S$ is), so we can extend it to $\beta \rho_q : \beta S \rightarrow \beta S$; and for this we write $\rho_q(p) = p \cdot q$.
  • Tada, we have our operation.

Of course, this gives us no tangible clue as to what such a product of ultrafilters actually looks like. But at least one thing is easy — multiplication with a fixed right hand side is continuous! I call this right-topological. You can see that we might start symmetrically and then we end up with a different operation (though very similar to our own). Also, some people like to call the above continuity left-topological (because its continuous in the left hand argument). So, lots of confusion… we’ll stick to this one.

The brute force definition

There’s thankfully a way to give the same definition by brute force (which is my favourite way to write it down), but let’s think about it naively. We have two ultrafilters $p,q$ and we have our operation $\cdot$. So why not just take $A \in p$ and $B\in q$ and look at all possible products $A \cdot B$? Collect all these $\{ A \cdot B: A \in p, B \in q\}$ and we get a nice little filter. Are we done? Well, the problem is that this will pretty much never give you an ultrafilter (if it does you either have a very simple operation or (say in $\mathbb{N}$) very, very special ultrafilters).

So what do we need to do? We need to complicate things (and if you try to write down to check where the above attempt of a definition fails, this complication comes naturally). Later I’ll introduce some notation to make nicer general nonsense, but let’s take a look first.

Extending multiplication to $\beta S$ For a semigroup $(S, \cdot) $ and $p,q \in \beta S$ we define the product $p \cdot q$ by \[ A \in p\cdot q \Leftrightarrow (\exists V\in p)(\exists {(W_v)}_{v\in V} \mbox{ in } q) \bigcup_{v\in V} v \cdot W_v \subseteq A. \]

Ok, quite a beast. Don’t despair! Remember what we tried first: sets of the form $V \cdot W = \bigcup_{v\in V} v \cdot W$. What the above definition tells us is that we need to allow the $W$ to be more flexible — possibly different for each $v$!

There is a different angle to look at this: the tensor (or Fubini) product of ultrafilters.

Tensor product of ultrafilters For $p,q \in \beta S$ define $p \otimes q \in \beta (S \times S)$ by \[ A\in p\otimes q \Leftrightarrow \{ s: \{ t : (s,t) \in A \} \in p \} \in q \]

Not much better, eh? Let’s take a look though: the tensor product is contains sets $A$ such that the first projection of $A$ lies in $p$ and additionally almost all fibers (in the sense of $p$) of the first projection lie in $q$. So you might say that the sets are $p$-large horizontally and $q$-large vertically.

What has this to do with the product we defined before? Well, the tensor product live on $S \times S$ and the multiplication is a map from $S \times S$ to $S$. So looking at the continuous extension $\beta \cdot$, i.e., the extension to $\beta (S\times S)$ (which is different from $\beta S \times \beta S$ btw) we can simply take the image, $\beta \cdot(p \otimes q)$. If you look at the “interlude” earlier regarding such images, you’ll notice that we get exactly the ultrafilter described in the brute force definition.

Still not happy? Yeah, I know that feeling… Ok, let me offer my favourite general nonsense notation.

  • For $s \in S, A\subseteq S$ define $s^{-1}A := \{ t \in S: st \in A\}$ (note: don’t have to be able to invert to define this…)
  • For $A\subseteq S, q \in \beta S$ define $A^{-q} := \{ s \in S: s^{-1}A \in q \}$
  • Then $A \in p \cdot q$ if and only if $A^{-q} \in p$.

Alright, much shorter now. But does it help? I don’t know. I certainly don’t claim to “really” understand this operation (but there’s a certain limit since, well, it’s on ultrafilters after all…). My notation for the set $A^{-q}$ is not standard (but there’s no notation, so I made it up for my thesis). This set consists of those elements that (inverse-)shift $A$ to make it an element of $q$. If $p$ contains it, we can expect elements in $p$ to contain elements that shift elements of $q$ into $A$ — which is maybe an idea.

One advantage is that you can check a few things more easily with the brute force definition.

  • The operation is right-continuous — $A^{-q}$ is exactly the neighbourhood that shows the continuity of $\rho_q$ with respect to $A$.
  • The operation is associative — just write it out.

Phew, that was a lot. But we’re finally ready to get to some real theorems!

Van Douwen spaces

At the winterschool Alan Dow gave quite challenging tutorials. He also mentioned something about van Douwen spaces.

Van Douwen Spaces

As formulated here

van Douwen space A countable $S$ is a van Douwen space if it is crowded (i.e. has no isolated points) and there is a 1-to-1 function from $\mathbf{N}$ to $S$ that extends to a $\leq$2-to-1 function from $\beta \mathbf{N}$ to $\beta S$.

What caught my interest was that there is an example that has something to do with idempotent ultrafilters. Let me introduce something first.

A partial order On the idempotent ultrafilters (on $\mathbf{N}$) define a partial ordering by
\[ p \leq_r q \Leftrightarrow q + p =p. \]

Digressing

This partial order (as well as its left counterpart and their intersection) is quite important in the algebra in the Stone-Cech-compactification. Mostly because this order has minimal idempotents which are central to the field. (pardon the pun) Oops, after ignoring its definition in my last post this is not a pun. So let me add: a set is in fact central if it is an element of a minimal idempotent. Central, get it? Ah, well…

Strongly right maximal

For van Douwen spaces it is useful to go in the other direction. There exist many right-maximal elements in this order, but even more can be said.

Strongly right maximal idempotents An idempotent ultrafilters $p \in \beta \mathbf{N}$ is strongly right maximal if
\[ q+ p =p \Rightarrow q= p. \]

Yevhen Zelenyuk once gave an example of a right-maximal that is not strongly right maximal assuming CH or MA (and even less). In any case these idempotents are very nice and thanks to Igor Protasov exist under ZFC alone. Nevertheless it is an open question whether consistenly all right-maximal idempotents are strongly right-maximal, i.e., if non-strongly but right-maximal idempotents exist under ZFC alone.

Back to van Douwen spaces

Anyhow, the main point is that strongly right maximal idempotents have an orbit that is a van Douwen space!

Let $p$ be strongly right maximal. Then $\mathbf{N} + p$ is a van Douwen space.

And this is what Alan Dow mentioned. Ignoring the crowdedness, this is really easy for in fact more holds in this case.

If $p$ is strongly right maximal, then
\[ \rho_p: \mathbf{N} \rightarrow \beta \mathbf{N}, n \mapsto n+ p \]
is injective, hence also its continuous extension to $\beta \mathbf{N}$ (which is naturally onto the orbit $\mathbf{N} +p$).

So in fact, it is not just a $\leq$2-to-one function, but an injective function. Strange, isn’t it? Strongly right maximality really only speaks of injectivity at $p$, but this is already enough.

Proof

The proof needs some basic stuff such as ‘multiplication with fixed right hand side is continuous’. Oh, and you need to know that natural numbers are cancelative…

  • Since $ ( \mathbf{N} , + ) $ is cancelative, the maps $ \lambda_{n} = n + \cdot$ are injective for all $n$.
  • Since $\lambda_n$ is continuous (on a discrete space), its extension to $\beta \mathbf{N}$ is injective as well.
  • Then $\rho_p$ is injective on $\mathbf{N}$.
    • If $n < k \in \mathbf{N}$ had $n+ p = k + p$, then by the above steps $k-n + p = p$.
    • Since $p$ is strongly right maximal, this would imply $n-k = p$ — which is absurd since $p$ is idempotent, hence free.
  • But then by continuity the whole of $\rho_p$ is injective.

I like that. Now, my favourite kind of idempotent ultrafilters are strongly summable ultrafilters. Those were the first examples of strongly right maximal idempotents, however their existence is independent of ZFC. On the other hand, they have much stronger properties and I would not be surprised if this affected their orbit, i.e., if that van Douwen space is not special somehow.

Understanding the Central Sets Theorem

To write the first post on the new domain I thought I might just write a little about what I’ve been studying recently — the Central Sets Theorem.

This theorem dates back to the 70s and the original formulation and proof are due to Hillel Furstenberg. In its current form as found say in De, Hindman, Strauss it is probably the strongest algebraic partition theorem around. I had encountered the theorem many times before, in books, lectures, papers and talks but I never truly developed an understanding for it. Since I recently felt it might give me an edge in a problem I’m working on I decided to take a better look.

Detour 1 — metamathematics

How do you achieve an understanding of a theorem? In an incomplete list I would include the following

  • Understand its most important application or corollary
  • Understand its statement
  • Understand its proof
  • Improve its proof
  • Understand how to come up with the proof
  • Give a different proof
  • Improve the theorem

I would say this list is in increasing order of understanding but that’s open for discussion.

I might write about the history (and applications) of the Central Sets Theorem some other time, but here I want to focus on its formulation; in fact, I don’t even want to write about what it means to be central (sorry) except that it is a partition regular notion.

Formulation

So, what does the usual formulation look like?

Central Sets Theorem
Imagine you are given finitely many sequences in a commutative semigroup $(S,+)$, say $\mathbf{y^0}, \ldots, \mathbf{y^\alpha}$ as well as a central set $C \subseteq S$.
Then you can find a sequence $\mathbf{a}$ in $S$ as well as a sequence $\mathbf{h}$ of non-empty, disjoint and finite subsets of $\mathbb{N}$ such that for $\beta \leq \alpha$ \[ FS ( {a_n} + {\sum_{i \in h_n} y_i^\beta} ) \subseteq C. \]

Complicated, no? I mean, a random bunch of sequences, some strange set and you find some other sequence and some weird subsets of of the natural numbers and then the IP-set of some strange sums are in that strange set — ye what?

Let’s cut it down a little and just consider the case $\alpha = 0$.

simple Central Sets Theorem
Imagine you are given a sequence $\mathbf{y}$ in a commutative semigroup $(S,+)$ as well as a central set $C \subseteq S$.
Then you can find a sequence $\mathbf{a}$ in $S$ as well as a sequence $\mathbf{h}$ of non-empty, disjoint and finite subsets of $\mathbb{N}$ such that \[ FS ( {a_n} + {\sum_{i \in h_n} y_i} ) \subseteq C. \]

Detour 2 — oversimplification

Even this special case of the standard formulation somehow focuses on aspects that get me sidetracked. So I attempted to formulate it in a way that gives (me) better focus.

Now, the theorem says all kinds of complicated things about the existence of a sequence of disjoint finite subsets of $\mathbb{N}$. Can I get around this? I thought I should be able to. Let’s start with a much weaker version of the theorem.

A weak simple Central Sets Theorem
Imagine you are given a subsemigroup $T \subseteq \mathbb{N}$ as well as a central set $C \subseteq \mathbb{N}$.
Then you can find a sequence $\mathbf{a}$ in $\mathbb{N}$ as well as a sequence $\mathbf{b}$ in $T$ so that \[ FS ( {a_n} + {b_n} ) \subseteq C. \]

I find this weaker version much easier to understand. It just says that I can always translate infinitely many elements from a given subsemigroup into the central set; additionally the finite sums stay within the set.

This is much weaker than the statement before. Of course, given a sequence $\mathbf{y}$ we could consider the generated subsemigroup and use the weaker version. But this would not guarantee the result of applying the Central Sets Theorem — Furstenberg’s theorem gives much more control over which elements are picked since there are no repititions in the sums of the generators.

Partial Semigroups

So where does this leave us? Well, when I hear finite subsets of $\mathbb{N}$ I think of my favourite structure — in fact the favourite structure for a lot of algebra in the Stone-Cech compactification on $\mathbb{N}$, the semigroup $\delta \mathbb{F}$. But let’s step back a little. The best way to think about $\delta \mathbb{F}$ is in terms of partial semigroups.

(Adequate) Partial Semigroups
A partial semigroup operation on a set $S$ is a map $\cdot: S \times S \rightarrow S$ such that associativity $s \cdot (t \cdot u) = (s \cdot t) \cdot u$ holds in the sense that if one side is defined so is the other and they are equal. A partial semigroup is adequate if the sets
\[
\sigma(s) := \{ t\in S : {s \cdot t} \mbox{ is defined} \}
\]
generate a filter, i.e., finitely many elements have a common compatible element.

This notion was introduced by Bergelson, Blass and Hindman in the 90s. It tells us that the operation, although partial, is associative in a strong way. Additionally, it makes sure the operation is not just empty but defined for many elements (well, ok it could be just one for all, but that’s not the point).

For ultrafilters the critical point is the following.

The semigroup $\delta S$
Given an adequate partial semigroup and $p,q$ ultrafilters containing all $\sigma(s)$. Then the operation
\[
p \cdot q = \{ A \subseteq S : \{ s : \{ t : s \cdot t \in A \} \in q \} \in p \}
\]
is well-defined and associative and semi-continuous. In other words, $\delta S$ is a closed semi-continuous semigroup.

Now this is somewhat surprising. Even though our operation is partial, these ultrafilters are a full semigroup! With all the bells and whistles it takes to do algebra in the Stone-Cech compactification.

What does this have to do with the Central Sets Theorem?

Denote the non-empty, finite subsets of $\mathbb{N}$ by $\mathbb{F}$. Consider the restriction of $\cup$ on $\mathbb{F}$ defined by
\[
s + t \mbox{ defined } \Longleftrightarrow \max(s) \cap \min(t) = \emptyset.
\]
Then in fact this constitutes a partial semigroup, adequate at that.

This partial semigroup structure could be called the free partial semigroup in the following sense: given any sequence $\mathbf{s}$ in any semigroup $S$ we can consider the induced partial semigroup on the set of finite sums ${FS( \mathbf{s} ) }$: we only allow sums where the index sets are disjoint (so that we are closed under our partial operation). Then all $FS$-sets are naturally isomorphic (in the appropriate sense of partial semigroups).

The weak version revisited

To come back to the weak version of the Central sets theorem — partial semigroups are exactly what it talks about. So let us reformulate,

simple Central Sets Theorem
Imagine we are given a partial subsemigroup $T$ of $(S,+)$ as well as a central set $C \subseteq \mathbb{N}$. Then we find sequences $\mathbf{a}$ in $\mathbb{N}$ and $\mathbf{t} \in T$ such that $FS ( {t_n} ) \subseteq T$ and
\[
{FS( a_{n} + t_{n}) \in C.}
\]

Now this sounds much closer to the original theorem. Since any sequence generates a partial semigroup on its $FS$-set (isomorphic to $\mathbb{F}$), this is in fact the Central Sets Theorem for just one sequence.

Leaving the simplification

However, the actual theorem is more than just some kind of induction on the above version. It is considerably stronger and here it is time to let go of the simplifications of partial semigroups again. For the theorem really does talk about $FS$-sets, i.e., partial semigroups isomorphic to $\mathbb{F}$. The strength lies in the fact that the infinite sequences can be chosen uniformly in the sense that we pick from the different partial semigroups in the same prescribed way.

Central Sets Theorem
Imagine you are given finitely many $FS$-sets in a commutative semigroup $(S,+)$, say ${FS( {\mathbf{y^0}} )}, {\ldots}, {FS( {\mathbf{y^\alpha}} )}$ as well as a central set $C \subseteq S$.
Then you can find a sequence $\mathbf{a}$ in $S$ as well as one disjoint sequence $\mathbf{h}$ in $\mathbb{F}$ such that for all $\beta \leq \alpha$ \[ FS ( {a_n} + {\sum_{i \in h_n} y_i^\beta} ) \subseteq C. \]

To see this strength at work it is time to look at the classical application.

Central sets in $( \mathbb{N},+)$ contain arbitrarily long arithmetic progressions
Take $\mathbf{y^\beta}$ to be the multiples of $\beta$ (for $\beta \leq \alpha$). Then the central set theorem guarantees we find $a_1, h_1$ such that for all $\beta \leq \alpha$ \[ (a_1 + \beta \cdot \sum_{i\in h_1} i) \in C.\]

For this application is obviously critical that the to-be-translated elements can be chosen uniformly. That’s all for now but I hope I can write a follow up some other time.

Matrices vs. idempotent ultrafilters part 2.5

Note: there seems to be some problematic interaction between the javascripts I use and blogspot’s javascripts which prevents longer posts from being displayed correctly. As long as I don’t understand how to fix this, I will simply split the posts.

We can also describe size and the algebraic structure.

  1. $A$ with $F_1$ ($F_2$) generates a right (left) zero semigroup (hence of size $2$, except for $x=0$).
  2. $A$ with $F_3$ or $F_4$ generates a semigroup with $AB$ nilpotent (of size $4$, except for $x=0$, where we have the null semigroup of size $3$).
  3. $A$ with $G_i$ generate (isomorphic) semigroups of size $8$. These contain two disjoint right ideals, two disjoint left ideals generated by $A$ and $B$ respectively.

Luckily enough, we get something very similar from our alternative for $A$.

Proposition In case $A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$ the solutions for $B$ being of rank one consist of five one – dimensional families namely (for $x\in \mathbf{Q}$)
\[
H_1(x) = \begin{pmatrix} 1 & x \\ 0 & 0 \end{pmatrix},
H_2(x) = \begin{pmatrix} x+1 & x \\ ( – x – 1) & – x \end{pmatrix},
H_3(x) = \begin{pmatrix} 0 & x \\ 0 & 1 \end{pmatrix},
H_4(x) = \begin{pmatrix} ( – x+1) & ( – x+1) \\ x & x \end{pmatrix},
\]
\[
H_5(x) = \begin{pmatrix} ( – x+1) & ( – x – 1 – \frac{2}{x – 2}) \\ x – 2 & x \end{pmatrix} , x \neq 2.
\]

As before we can describe size and structure.

  1. $A$ with $H_1$ ($H_2$) generates a right (left) zero semigroup (as before).
  2. $A$ with $H_3$ or $H_4$ generates a semigroup with $AB$ nilpotent (as before).
  3. $A$ with $H_5$ generates the same $8$ element semigroup (as before).

Finally, it might be worthwhile to mention that the seemingly missing copies of the $8$ element semigroup are also dealt with; e.g. $ – G_i$ generates the same semigroup as $G_i$ etc.

It is striking to see that the orders of all finite semigroups generated by rational idempotent two by two matrices are either $2^k,2^k + 1$ or $2^k + 2$.

At first sight it seems strange that we cannot find other semigroups with two generators like this. As another friend commented, there’s just not enough space in the plane. I would love to get some geometric idea of what’s happening since my intuition is very poor. But that’s all for today.