Not really blogging

A. Makelov

Category Archives: Computer Science

Arbitrarily biasing a coin in 2 expected tosses

Here’s a neat probability trick that I learned from Konstantin Matveev and which, I think, everybody mildly interested in math should know about:

Problem. Given a fair coin, how do you (efficiently) generate an event E with probability 1/5?

Solution. We can, of course, toss the coin three times, giving us a total of 8 possibilities, then discard our least favorite 3 of them, and weigh the remaining  5 possibilities equally. This algorithm requires an expected number of tosses equal to 3\times 8/5=24/5. But, what if instead of 1/5, we have 1/1000000? You can easily see that the expected number of tosses to emulate a probability of 1/n grows logarithmically with n. But even worse, what if we had 1/\pi? Well, here’s a trick that gets rid of both of these problems: let

\frac{1}{5} = \displaystyle\sum_{i=1}^\infty \frac{a_i}{2^i}

for a_i\in \{0,1\} be the binary expansion of 1/5. Then, start tossing the coin until it lands heads, at some time I. If a_I=1, declare that E has occurred; otherwise, E has not occurred. Then clearly

\mathbb{P}[E]=\displaystyle\sum_{i=1}^\infty \mathbb{P}[E\ \big| \ I=i]\mathbb{P}[I=i]=\displaystyle\sum_{i=1}^\infty \frac{a_i}{2^i}=\frac{1}{5}

Furthermore, notice that \mathbb{E}[I]=2, regardless of the probability we want to emulate! Well, that seems pretty efficient. When you think about it some more, it really appears to be mind-boggling – you can emulate extremely small, or irrational, probabilities with just two expected tosses. Moreover, you don’t need to have the binary expansion of the probability in advance – you can pass the next digit depending on the status of your experiment.

Combining this with a standard unbiasing technique, say von Neumann unbiasing,, this gives you a very simple procedure that given a biased coin that lands heads with probability 0<p<1, allows you to simulate a biased coin that lands heads with probability 0<q<1 for any other q. Any binary source of randomness is convertible to any other such source.

But we haven’t said anything about the efficiency of unbiasing. There, we can’t do as well as in biasing: there is a fundamental obstacle, the information-theoretic limit. Roughly speaking, the amount of information a biased coin tells us is always strictly less than the amount of information we get from an unbiased coin – this is why biasing is easier than unbiasing. Fortunately, there is a procedure that lets us extract an unbiased stream of bits that on average achieves the best performance theoretically possible: see this paper by Mitzenmacher to learn more.

I guess the moral of all this is the following: if you’re stuck on a deserted island with \pi -1 other people, you need to decide who the first to be eaten is, and all you have in your random arsenal is a suspicious-looking coin handed to you by one of your shipmates, do not despair – you can still make sure you have a fair chance of surviving the day.

Reflections on “The library of Babel” and computational complexity

“…mirrors and copulation are abominable, because they increase the number or men.”

“Tlön, Uqbar, Orbis Tertius”, Jorge Luis Borges

You can tell that Borges was very fond of reflections, and now I intend to try to make him happy.

In short, the Cosmic Coincidence Control Center (and it seems that I’m included in that number?) was extremely busy last week. After finishing my first-ever short story, that feeble imitation of Borges, bearing the following arrogant dedication “While this story was being written, I thought I had stolen Borges’ style; but now I know – he stole my idea”, I was ruthlessly hunted down – so after all it was me who stole something, but hey, who is to say.

First, I decided to write my first paper for the science fiction class I’m taking (which is absolutely fun, thanks to this guy) on “The library of Babel”. OK, I can take that – after all, you might argue that I have free will and whatnot, so in fact it was not a coincidence.

Next, I randomly decided to watch a video by vihart called “Twelve tones”, cause, you know, it seemed to be her most popular one. And – bam! – there was “The library” again.

After that, I was even more randomly reading the chapter on randomized algorithms from the book on computational complexity by Oded Goldreich, and guess what, the quote at the beginning was:

I owe this almost atrocious variety to an institution which other republics
do not know or which operates in them in an imperfect and secret manner:
the lottery

Jorge Luis Borges, “The Lottery in Babylon”

I know, it’s not a library, it’s a lottery, but a lottery is just the closest equivalent of a library to people doing randomized algorithmis – after all, a bunch of monkeys randomly typing on a bunch of typewriters will produce the works of Shakespeare at some point. And a Babylon is like a baby Babel anyway.

Finally, it turned out that the book I blogged about last week, “Orphans of the sky”, is way too similar to “The library of Babel” – something I realized only after re-reading the library (or rather, “The library”. haha). It’s not just that both things came out in 1941 (yeah, I don’t know, it’s crazy), but they both construct extremely similar settings, visually and conceptually. Read them and you’ll know – don’t want to spoil anything!

All in all, it was pretty obvious that Borges was after me, and that he wouldn’t leave me alone unless I wrote something about the library and about computational complexity. So here we are now.

What is this library anyway? The premise of the story is simple enough: a library which contains all possible books 410 pages long, conveniently stacked in a seemingly infinite array of identical hexagonal galleries, which comprise all the world. It has the complete works of Shakespeare, the biographies of all people that have ever lived on Earth, the proofs of a bunch of conjectures in mathematics, these same proofs with the last line wrong, “The library of Babel”, etc. Sure, it’s a big place. It also has people randomly walking up and down and thinking they have it all figured, arguing that, you see, a pentagonal gallery would be fundamentally impossible, so that’s why galleries are hexagonal.

But I don’t really want to talk about the social metaphors of the library (a decent subject in its own right); rather, I like to think of it as a representative of a somewhat underrepresented part of SF, something you might reasonably call “math fiction”.  Borges wrote several other stories with a strong flavor of mathematics – “The Aleph”, “The garden of forking paths”, “Blue tigers”, “The book of sand”, to name a few amazing ones.

Is MF SF? I would argue that it is, for:

1) math is as good a science as any of your usual ‘favorites’ in SF – physics, chemistry, biology – and in fact, it is the language underlying all of them, a language of even greater expressive power

2) yes, all the ‘falsifiable hypothesis blabla’ stuff does apply to mathematics, and in fact, modern mathematics seems to rely more and more on simulations and experiments

3) MF has already sneaked in SF: there are works that arguably classify as MF which have won a bunch of awards. I know for I’ve read one such – “Permutation city” by Greg Egan, which I strongly recommend to people interested in the computational aspects of consciousness.

YAY MATH FICTION! So, “The library of Babel” uses a very simple mathematical idea – “the set of all sequences of a given length, in a given set of symbols” – to achieve very interesting and complicated effects, and that makes it great math fiction. Suppose you wanted to write a book, and you had some reasonably good idea of what you want it to be about, and you knew it wouldn’t be longer than 410 pages. It then seems very plausible that, if someone hands you a book and you read it, it will be qualitatively easier for you to tell if that’s the book (or a book) you want to write. Then, if you just go to the library and read all books (for there is a very big, but finite number of such books), you will finally find one that suits you!  So you’ll have achieved a qualitative improvement by increasing your efforts only quantitatively. Essentially, it might seem that you’ve written a book without writing it!

This has two consequences: one philosophical, one computational. First, is an author just a treasure-hunter? Does an author create a work, or has the work been there all the time, and the author is `merely’ the one who found it? What the hell?

But hey, that’s not a big deal. What if we try to write books in the way described above? What if we try to do math the way described above – if we want to prove a theorem, we just go through all possible proofs of a given length, for all lengths, until we find one that works? Then mathematical discovery will be more or less fully automated! Ideas of the sort motivated the computational revolution that was just starting at the time Borges wrote his story, and they shape much of modern computational complexity theory.

As for the point of the above example in this context – we might need some new, more practical definitions of quantitative and qualitative differences after all. Especially, when you’re searching for something, going through all possibilities should count as qualitatively more expensive than looking at a single one – and that’s some intuition for where the distinction between polynomial and exponential time in computer science came from. Here’s a nice paper on that topic that I don’t really understand (yeah, I don’t really understand either): Why Philosophers Should Care About Computational Complexity

mathbabe

Exploring and venting about quantitative issues

Rational Altruist

Adventures of a would-be do-gooder.

Gowers's Weblog

Mathematics related discussions

What's new

Updates on my research and expository papers, discussion of open problems, and other maths-related topics. By Terence Tao

Stefan's Blog (archive, for news see blog.krastanov.org)

Most of the blog moved to blog.krastanov.org