Monthly Archives: October 2011

epub, mathjax and the iPad — another attempt

It’s a funny thing. I don’t even own an iPad. But a lot of people are interested in getting an epub file with mathjax working on the iPad.

Why is that? Well, as far as I could find out the iPad remains the only “hardware” that does not block javascript within an epub file (epub uses html for its content but javascript is designated “should not” in the epub2 standard). Of course it’s really the software, iBooks, but mentioning the iPad will be much better SEO. ;)

Incidentally, the only other software I know of that does not block javascript is the fantastic Calibre. Calibre’s reader does not seem to care at all about enforcing the epub standard; it just renders everything it finds (but I’ll get to that later).

So what happened?

A while ago, after an email exchange which is now mostly available online, I finally created an epub with a complete mathjax installation. Unfortunately, it was a fluke. The file was not reliably rendered on the iPad, most likely because of its size (MathJax has 30,000 files for ~20MB unzipped). So Davide Cervone suggested cutting out unnecessary files which iBooks should not need.

This led to a result that rendered reliably — unfortunately it rendered in a most irritating fashion: half a line below the intended one, writing happily across any other text on the next line, trailing out of the margin etc. That’s far from perfect, obviously.

In the meantime, Davide was able to use my epub file to run some tests, and yesterday he told us that things are looking much better now that he can work on the issues.

Of course, iOS5 was released last week. It’s not clear to me if iBooks already supports epub3, but I know that Safari now supports (some) MathML so there’s a chance that iBooks would (since it uses the webkit variant of Safari to render html). So when I had a quick chance last Friday to get my hands on a friend’s freshly updated iPad, I cooked up a quick test file and it rendered; it wasn’t perfect but not totally bad either. With my luck, of course, this will also be a fluke and I won’t know before I get my hands on that iPad again…

In the meantime, and for posterity, here’s how I create epub files (for the pros: get ready to laugh at a dilettante).

The tools

Get your hands on

  • pandoc
  • calibre
  • ecub
  • mathjax

That’s it. (Well, unless you don’t know what those are and how to use them; I won’t cover how to install and run these.)

All but ecub is open source, ecub is at least free for personal use — and of course everything runs on Linux, MacOS and Windows (I mostly use linux and sometimes a Mac; I can’t make guarantees for Windows).

Creating a minimal epub file with pandoc

I love pandoc (ecub was a great help, too, more about that later) so I’ll focus on it.

As you may know, here at Booles’ Rings I write using markdown and MathJax. I use pandoc whenever I want to convert this kind of content into something else (like LaTeX). But pandoc (as its name suggests) can handle much more.

So hit it! Take your favorite test html file (I use this post).

pandoc test.html -o test.epub

That should give you a working epub file — it ain’t fancy, but it’ll do for testing. Be warned that pandoc does not check if your (x)html actually validates. Since the iPad is picky about having valid epub files you should double check (I totally failed the first time and it took me ages to remember this…).

Fortunately, you installed calibre which includes a binary of epub-fix from the epub-tools by the fabulous people over at threepress.

So you find the epub-fix binary and run

epub-fix --epubcheck test.epub

If epub-fix finds errors, fix them: go into the epub file (which is just a zip file) and fix the (most likely html) file that throws an error; in the post I use, the check should complain about a part of the vimeo embedding.
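If you would rather script the "go into the epub file" step, here is a minimal Python sketch using only the standard library. The function names `unpack_epub` and `repack_epub` are made up for this illustration. The one detail worth knowing is that the epub container format wants the `mimetype` file to be the first zip entry and stored uncompressed, so the sketch writes it first.

```python
import zipfile
from pathlib import Path


def unpack_epub(epub_path, target_dir):
    """Extract every file of the epub (a plain zip archive) into target_dir."""
    with zipfile.ZipFile(epub_path) as archive:
        archive.extractall(target_dir)


def repack_epub(source_dir, epub_path):
    """Zip source_dir back into an epub file.

    The `mimetype` file is written first and without compression,
    everything else is added afterwards with normal compression.
    """
    source_dir = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as archive:
        archive.write(source_dir / "mimetype", "mimetype",
                      compress_type=zipfile.ZIP_STORED)
        for path in sorted(source_dir.rglob("*")):
            name = path.relative_to(source_dir).as_posix()
            if path.is_file() and name != "mimetype":
                archive.write(path, name, compress_type=zipfile.ZIP_DEFLATED)
```

A typical round trip would be `unpack_epub("test.epub", "work")`, editing the offending file inside `work`, and then `repack_epub("work", "test.epub")` before running the check again. Keep the output file outside the folder you are zipping.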

When epub-fix is happy, send the file over to the iPad for a test spin (I use Dropbox for ease of sync). If even a simple test file does not work, throw your epub into threepress’s online validator just to be sure.

Oh, one more thing: remember to always delete your file from iBooks before you load its updated version. In my experience, iBooks does not update the file when something with the same metadata is already in the iBooks library (or maybe just sometimes, I don’t know, just watch out for that).

Slimming down mathjax

Well, right now we have a nice epub. But if you view it anywhere it will have your typical LaTeX commands all over the place — we need to add mathjax!

Davide Cervone gave me some advice to reduce a mathjax installation to a mere 1.3MB.

  • remove the MathJax/fonts/HTML-CSS/TeX/eot, svg, and png directories
  • remove the two OTF-files that start with “MathJax_Win” (guess why…)
  • remove the MathJax/unpacked, test, and docs directories
  • If you are only using TeX input (not MathML), then use the TeX-AMS_HTML-full configuration file.
  • In that case, remove the MathJax/jax/input/MathML and MathJax/jax/output/NativeMML directories as well as MathJax/extensions/mml2jax.js and MathJax/extensions/jsMath2jax.js.
  • remove the “FontWarnings” and “v1.0-warnings” extensions, as well as all the configuration files you are not using.
  • remove the MathJax/jax/output/HTML-CSS/fonts/STIX directory
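The list above can be turned into a small cleanup script. What follows is only a sketch: the function name `slim_mathjax` is invented here, the file names for the “FontWarnings” and “v1.0-warnings” extensions are my assumption about how they are stored on disk, and unused configuration files are not touched because which ones you need depends on your setup. Run it on a copy of your MathJax folder.

```python
import shutil
from pathlib import Path

# Directories from the list above, relative to the MathJax folder.
ALWAYS_REMOVE_DIRS = [
    "fonts/HTML-CSS/TeX/eot",
    "fonts/HTML-CSS/TeX/svg",
    "fonts/HTML-CSS/TeX/png",
    "unpacked",
    "test",
    "docs",
    "jax/output/HTML-CSS/fonts/STIX",
]
# Assumed file names of the two warning extensions.
ALWAYS_REMOVE_FILES = [
    "extensions/FontWarnings.js",
    "extensions/v1.0-warnings.js",
]
# Only safe to drop when the content uses TeX input and no MathML.
TEX_ONLY_REMOVE_DIRS = ["jax/input/MathML", "jax/output/NativeMML"]
TEX_ONLY_REMOVE_FILES = ["extensions/mml2jax.js", "extensions/jsMath2jax.js"]


def slim_mathjax(root, tex_only=True):
    """Delete the parts of a MathJax folder named above; return what was removed."""
    root = Path(root)
    dirs = ALWAYS_REMOVE_DIRS + (TEX_ONLY_REMOVE_DIRS if tex_only else [])
    files = ALWAYS_REMOVE_FILES + (TEX_ONLY_REMOVE_FILES if tex_only else [])
    removed = []
    for rel in dirs:
        if (root / rel).is_dir():
            shutil.rmtree(root / rel)
            removed.append(rel)
    for rel in files:
        if (root / rel).is_file():
            (root / rel).unlink()
            removed.append(rel)
    # The two OTF fonts whose names start with "MathJax_Win", wherever they live.
    for font in sorted(root.rglob("MathJax_Win*.otf")):
        font.unlink()
        removed.append(font.relative_to(root).as_posix())
    return removed
```

Missing paths are skipped silently, so the script can be run on MathJax versions that lay things out slightly differently; check the returned list to see what was actually removed.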

Now that your MathJax installation is small and tidy, just copy the remaining files into a suitable folder (how about “mathjax”?) inside the epub — an epub file is simply a zip file after all.
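Since the epub is a zip file, the copying can also be done by appending to the archive. This is a sketch with an invented helper name, `add_folder_to_epub`; it assumes the html files sit at the top level of the epub, so that a folder called “mathjax” next to them matches a relative script path like mathjax/MathJax.js. Adjust the prefix if your epub keeps its content in a subfolder.

```python
import zipfile
from pathlib import Path


def add_folder_to_epub(epub_path, folder, prefix="mathjax"):
    """Append every file below `folder` to the epub under `prefix`/.

    Returns the archive names that were added. Names already present in
    the archive are skipped, because appending would create duplicates.
    """
    folder = Path(folder)
    added = []
    with zipfile.ZipFile(epub_path, "a", compression=zipfile.ZIP_DEFLATED) as archive:
        existing = set(archive.namelist())
        for path in sorted(folder.rglob("*")):
            if not path.is_file():
                continue
            name = f"{prefix}/{path.relative_to(folder).as_posix()}"
            if name in existing:
                continue
            archive.write(path, name)
            added.append(name)
    return added
```

The files added this way are not yet listed in the manifest, so the epub will not validate until that is repaired.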

While you’re at it, you should add a suitable MathJax configuration to the html files in your epub file. If you’re using my post from above, you should add

<script type="text/x-mathjax-config">
  // Accept dollar signs as well as backslash delimiters for TeX input.
  MathJax.Hub.Config({
    tex2jax: {
      inlineMath: [ ['$','$'], ["\\(","\\)"] ],
      displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
      processEscapes: true
    }
  });
</script>
<script type="text/javascript" src="mathjax/MathJax.js?config=TeX-AMS_HTML-full"></script>

If you don’t use dollar signs for inline math, just take the last line.
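If your epub has more than a handful of html files, adding the script tag by hand gets tedious. Here is a rough Python sketch for doing it on the text of one file; the helper name `add_mathjax_script` is made up, and it simply looks for the closing head tag instead of parsing the markup, so treat it as a starting point only.

```python
# The loader line from above; the path assumes a "mathjax" folder next to the html file.
MATHJAX_SCRIPT = (
    '<script type="text/javascript" '
    'src="mathjax/MathJax.js?config=TeX-AMS_HTML-full"></script>'
)


def add_mathjax_script(html_text, snippet=MATHJAX_SCRIPT):
    """Return html_text with the snippet inserted just before </head>.

    Text that already mentions MathJax.js is returned unchanged, so the
    function can be run twice without doubling the script tag.
    """
    if "MathJax.js" in html_text:
        return html_text
    head_end = html_text.find("</head>")
    if head_end == -1:
        raise ValueError("no </head> found; add the script tag by hand")
    return html_text[:head_end] + snippet + "\n" + html_text[head_end:]
```

To apply it, read each html file from the unpacked epub, pass its text through the function, and write the result back.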

Fixing your epub

After this copying, we’ll have to repair our epub file. An important fact about epub: all files must be listed in the manifest (OPF) file. Since we don’t want to do that manually, we use epub-fix again.

epub-fix --unmanifested --epubcheck test.epub

The “unmanifested” option (you guessed it) ensures that all files are added to the manifest. Beware: don’t try this on a full MathJax! Epub-fix will slow down after the first 1,000 files…
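To see what “unmanifested” means in practice, here is a small Python sketch that lists the files in an epub which the manifest does not mention. This is not how epub-fix works internally (I have not looked), and the function name `unmanifested_files` is invented; it only assumes the usual OPF 2.0 namespace and takes the first .opf file it finds.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote

OPF_NS = "{http://www.idpf.org/2007/opf}"


def unmanifested_files(epub_path):
    """Return the archive names that are missing from the OPF manifest."""
    with zipfile.ZipFile(epub_path) as archive:
        names = [n for n in archive.namelist() if not n.endswith("/")]
        opf_name = next(n for n in names if n.endswith(".opf"))
        package = ET.fromstring(archive.read(opf_name))
    # Manifest hrefs are relative to the folder that holds the OPF file.
    opf_dir = posixpath.dirname(opf_name)
    listed = set()
    for item in package.iter(OPF_NS + "item"):
        href = unquote(item.get("href", ""))
        listed.add(posixpath.normpath(posixpath.join(opf_dir, href)))
    # mimetype, the OPF itself and META-INF are not expected in the manifest.
    ignored = {"mimetype", opf_name}
    return sorted(
        n for n in names
        if n not in listed and n not in ignored and not n.startswith("META-INF/")
    )
```

Right after copying MathJax into the epub, this should report every MathJax file; after a successful repair the list should be empty.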

Now transfer your file to the iPad and, lo and behold, some mathjax will render! Of course, you’ll find that this is not working properly: the rendering is broken right now. (As mentioned earlier, Davide is working on it.)

iOS5 to the rescue?

Now this post gets flaky. As I wrote earlier, I have only had one test run with an iOS5 iPad, so this might not work. But the process is worth documenting.

As I said above, the thing about iOS5 is that Safari, and hence iBooks, finally has some MathML support.

Since pandoc is incredibly versatile you won’t be surprised that it can produce MathML and that it is aware of MathJax. So all we have to do is modify our earlier command.

pandoc test.html --mathml -o test.epub

This way, the html now has mathml instead of the LaTeX commands. Just shoot this over to your iPad and see how it renders. What I remember from my quick test with my post mentioned earlier was that some characters would render twice (which I had also seen with that unreliable full install of MathJax). Also, MathJax’s support for commands like \color obviously won’t work without adding MathJax again.

Alternatively, you could try using MathJax’s mathml-rendering and see what happens (I hope to test that next week).

But what if I want to have it all?

As I wrote, I also created an epub file that had a full mathjax install inside of it. This is a terrible idea because a) it rendered only sometimes on the iPad and b) every other ebook viewer rejected it or crashed.

But if you cannot resist (or want to modify my approach), here’s a hurried how-to. Since epub-fix will come to a grinding halt adding 30,000 files to a manifest, use ecub instead.

Start ecub and use the new-project wizard, it’s pretty self-explanatory. Two points might be worth pointing out:

  • At “Choose import method” you’ll want “from an existing html file”.
  • At “Convert text files” check only “add any HTML file found” and “Also find files in folders under your project folder” (this step will take a short while).

After you’re back at the main window, you’ll still need to “compile” your epub file. This will take a long time. So long, in fact, you’ll think ecub is hanging. To convince yourself that it isn’t, go to the project folder you designated in the wizard and watch the 30,000 files be copied into the folder and then watch content.opf grow in size (end result is ~3.5MB).

Where do we stand?

So for now, we have two broken ways to display mathematical content in an epub file on the iPad: use slimmed down MathJax or use MathML directly. Neither works perfectly but the key point is: they work in principle. Now we can look into the specifics to make things work better. Davide is looking into the mathjax side of things and with webkit (hence Safari, hence iBooks) there’s reason to hope that mathml support will improve, too.

Of course, what I really want is an Android reader with javascript or mathml support…

And that’s it for today. Any questions?


Addendum

Here are two files at your disposal.

Grigorieff forcing collapsing the continuum

This is a short technical post, more a note-to-self so that I know where to look this up if I ever need it again. It is also somewhat of a correction of something I said during my talk in Toronto in June.

Grigorieff forcing

If you don’t remember, here’s the quick and dirty (i.e. traditional) way to define Grigorieff (or Gregorieff depending on your choice of latinization) forcing: it consists of partial functions on $\omega$ which are defined on a “small” set, i.e., a set in the dual ideal of a filter. For simplicity, let’s focus on ultrafilters.

Grigorieff Forcing Given an ultrafilter $U$ on $\omega$, let $$\mathbb{P}_U = \{ f: A \rightarrow 2 : \omega \setminus A \in U \}.$$
Partially order such functions by $f\leq g$ iff $f \supseteq g$, i.e., $f$ has more information.

You can think of Grigorieff conditions as perfect binary trees with complete branching on an ultrafilter set and “parallel movement” elsewhere. But I said quick and dirty is enough here, so let’s not worry too much.

Grigorieff forcing is famous for being the forcing that Shelah used for the first model without P-points. One of the reasons this is possible is as follows.

Theorem (Shelah) If $U$ is a P-point, then $\mathbb{P}_U$ is proper and $\omega^\omega$-bounding. In particular, $\mathbb{P}_U$ does not collapse $\omega_1$.

Last week, David Chodounsky let me know that Bohuslav Balcar showed him the following “folklore” result.

Optimality If $U$ is not a P-point, then $\mathbb{P}_U$ collapses the continuum.

This result is mentioned in Jech’s Multiple Forcing book, but without proof and I have never seen one published. (Which, to tell the truth, is the reason I thought it was wrong but more about that later).

A proof

  • If $U$ is not a P-point, then there exists a partition $\omega = \bigcup_{n \in \omega} I_n$ such that every $A\in U$ intersects infinitely many $I_n$ in an infinite set.
  • In particular, no $I_n \in U$ and, without loss, all $I_n$ are infinite.
  • In the ground model $V$, let’s enumerate each $P(I_n)$ as $(A^n_\alpha)_{\alpha < \mathfrak{c}}$ (bijectively).
  • First observation The generic $\dot G \subseteq \omega$ has $$\Vdash I_n \cap \dot G \in V$$ for every $n$.
    • Fix $n$.
    • Since $I_n\notin U$, we can decide any condition $f$ arbitrarily on $I_n$.
    • In other words, there’s a dense set of conditions $g$ with $dom(g) \supseteq I_n$.
    • But any condition in this dense set forces what we want, i.e., $g \Vdash \dot G \cap I_n = g^{-1}(1) \cap I_n$ — which is a set in the ground model.
  • So let $G$ be a generic over $V$.
  • Second observation In $V[G]$, we can define a map $H: \omega \rightarrow (2^\omega)^V$, mapping $n$ to $\alpha$ with $A^n_\alpha = G \cap I_n$.
    • Check that this is possible because this intersection is a ground model set, hence appears in the enumerations we fixed earlier.
  • Third observation $H$ is cofinal.
    • Given any $\alpha < \mathfrak{c}$, we want to find $n$ such that $H(n) > \alpha$.
    • For a density argument, fix any condition $f$.
    • Since $dom(f)$ is a small set, we can find $n$ such that $$|I_n \setminus dom(f)| = \omega.$$
    • Therefore, we can find $A \subseteq I_n \setminus dom(f)$ such that $A \cup (f^{-1}(1) \cap I_n) = A^n_\beta$ for some $\beta > \alpha$: there are $\mathfrak{c}$ many choices for $A$ but fewer than $\mathfrak{c}$ many indices up to $\alpha$.
    • Extend $f$ to a condition $g$ defined on all of $I_n$ such that $g^{-1}(1) \cap I_n = A^n_\beta$.
    • Then $g \Vdash \dot G \cap I_n = A^n_\beta$, as desired.
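
One step the proof uses twice without comment is that a condition can always be extended to all of $I_n$. Writing $g$ for an extension of $f$ with $dom(g) = dom(f) \cup I_n$ (the name $g$ is just my notation here), the only thing to check is that the complement of the new domain is still in $U$:

```latex
\[
  \omega \setminus \operatorname{dom}(g)
  \;=\; \bigl(\omega \setminus \operatorname{dom}(f)\bigr) \cap \bigl(\omega \setminus I_n\bigr)
  \;\in\; U .
\]
```

The first set is in $U$ because $f$ is a condition, the second because $I_n \notin U$ and $U$ is an ultrafilter; filters are closed under finite intersections, so $g$ is again a condition no matter which values we pick on $I_n \setminus dom(f)$.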

An honest mistake

David and I thought we had a proof that Grigorieff forcing with a stable ordered union ultrafilter is proper and $\omega^\omega$-bounding. This is, of course, impossible, and with this knowledge we could find the mistake in our proof. We still think that “morally” speaking there should be an analogous forcing for the union filter world. But that’s a different story.

A posting on wordpress-for-scientists

I just finished a long posting at the mailing list/google group WordPress for Scientists. This was spawned by today’s meeting with Sam and this week’s trouble with the papercite plugin. We really need to find a different solution. So I’m hoping both for a discussion here at Booles’ Rings as well as some help from the smart people on the mailing list.

Hello.

I was hoping for some advice and discussion regarding citation related plugins.

Since this has gotten a little longer: I will first describe the problem and then add some questions.

Over at boolesrings.org we have had some problems this week. At Booles’ Rings we’re experimenting with wordpress for academic homepages (of mathematicians). We’re essentially trying to find out what is useful and/or necessary for an academic web presence via wordpress.

Obviously, citations are important for documenting our own work and writing about other people’s work.

Since we’re all mathematicians, there’s the strong need for bibtex import which is why papercite is popular — it makes the move from BibTeX to wordpress very easy. Unfortunately, papercite is very buggy and we would like to replace it.

We’re faced with the question:

What do we need a citation plugin to do?

Practically speaking,

  1. bibtex import (but no dependence/sync)
    • We have to start somewhere and that’s where most people (in mathematics) come from.
  2. personal IDs for shortcode use
    • we’re human and we like to write ThatFamousPaper instead of cryptic ids
      (I think mathematical writing is very different from scientific writing in this respect — papers can be holy objects…)
  3. a GUI to look up/search for new citations
    • Sometimes, you barely remember the paper’s title.
    • DOIs are cumbersome to look up anyway
    • Searching multiple sources (google scholar, mendeley, mathscinet, pubmed) would be nice while writing a post
    • Maybe even links-to-citation functionality when quoting online sources (blogs, mathoverflow etc)
  4. Reversibility
    • the citation in html (in a post) should include some form of metadata that can be processed automatically (pingbacks, aggregation, citation counts etc)

QUESTION 1: Do we have such a plugin?

1b) What plugins have which functionality?

  • Kcite is excellent when you have the DOI (well, depends on the DOI actually)
  • bibtex-importer does a great job using links, giving a local search GUI, but shouldn’t citations be pages or a taxonomy?
  • papercite offers the familiarity of keeping on as we do in LaTeX
  • wpcitulike, bibliplugin seem to offer good external reference sources
  • zotpress seems to have almost everything, but requires zotero
  • teachpress and scholarpress have too much overhead

1c) Is there a plugin that uses Mendeley’s api?

QUESTION 2: How do we want citations to work?

Ok, this is in hopes for a discussion. My amateur thoughts.

  • reference management should be done by professionals not through personally hacked bibtex files (we mathematicians have a bad habit…)
  • references should be stored professionally, i.e., in the wp-database or in a professional outside tool (mendeley, zotero, citeulike) (take papercite as a terrible example relying on some random bibtex file somewhere)
  • even if an outside tool is used, actually referenced citations should always be stored in the database.
  • citations should be hardcoded into the post (when I review a preprint, I don’t want the reference to change to the published version later)

Well, this has become more of a blog post… I guess I’ll cross post it at boolesrings.org/krautzberger…

In any case, I hope I made a little bit of sense. Any help is greatly appreciated!

Best,
Peter.

Formal proofs are our democracy

Reading papers can lead to horrible acts. Today, I felt like mutilating a famous quote.

Source: Library of Congress, Reproduction number LC-USW33-019093-C  via http://en.wikipedia.org/wiki/File:Sir_Winston_S_Churchill.jpg

Many forms of communicating mathematics have been tried and will be tried in this world of sin and woe. No one pretends that formal proofs are perfect or all-wise. Indeed, it has been said that formal proofs are the worst form of communicating mathematics except all those other forms that have been tried from time to time.

When I come up with a mathematical result, I have the strong urge to share it, to communicate it. As a trained mathematician, I resort to the established mode of communication, formal proof.

This has two problems.

Formalizing is tricky

On the one hand, I will make a mistake formalizing my thoughts. Of course, we mathematicians are in the terrible habit of finding that perfectly acceptable (have you noticed that there are no retractions in mathematics because of mistakes?). Almost all the time, even though a formal proof might be wrong or incomplete, it’s considered fixable and the result “essentially” correct (case in point: Perelman). It seems a majority agrees that there’s much more to a mathematician’s result than what might be written on paper.

In the same vein but much worse is the effect of formal proof on mathematical writing. Most papers are badly written and most proofs are written the wrong way around (like $\varepsilon$-$\delta$ proofs that start with a choice of $\delta$) or badly structured in other ways. It seems a lot of people are not aware that a formal proof is a miserable tool for communicating mathematics and has to be used very carefully to facilitate communication. Such care would, of course, clash with the all-encompassing publish-or-perish pressure that has led to the terrible style of “getting the least publishable unit past a referee with minimal effort”.

All of the above is really just one big problem and luckily it is one that could be fixed by a functioning scientific community (unluckily, it most likely won’t be fixed).

Formalizing is impossible

The second problem, however, seems intrinsic and insurmountable.

Source http://en.wikipedia.org/wiki/File:Moby_Dick_p510_illustration.jpg

Formal proofs cannot capture what I think when I think mathematics. The problem is that I cannot share my mathematical insights in their entirety since they are a complex combination of rational and emotional thought, intuitions, memory, successful failures and so on.

There might be a way to overcome this problem. Then again, there might be a better form of government than democracy.