Archive for the ‘proof’ Category

P ≠ NP?

Tuesday, August 10th, 2010

A few days ago, Vinay Deolalikar, a Principal Research Scientist at HP Labs, began circulating a draft of a paper entitled “P ≠ NP”. The mathematics and computer science communities immediately erupted in a frenzy of excitement and activity.

The million dollar question: why the excitement? Well, that’s exactly it: it is a million-dollar question. The P vs NP problem is one of the seven Millenium Prize Problems established in 2000 by the Clay Mathematics Institute; each problem carries with it a prize of $1,000,000. One of the seven — the Poincaré Conjecture — has already been solved, by the Russian mathematician Grigori Perelman (who famously refused to accept both a Fields Medal and the $1 million). And it’s not even really about the $1 million prize; the P vs NP problem is widely agreed to be the most significant (and difficult) open question in theoretical computer science, so for someone to solve it would be a Really Big Deal.

So, did Deolalikar really solve it? Will he win the $1 million? Well… it’s way too early to tell! Here are a few things to keep in mind:

  • Since P vs NP is such a famous problem, there are tons of “attempts” at solving it published all the time. Most are by people who either have an overinflated estimate of their mathematical understanding, or believe the solution was told to them in a dream by benevolent aliens, or both. Hence they aren’t even really worth serious mathematicians’ time to read. Sad but true. But this is clearly not the case here: Deolalikar is a respected researcher who has done related work in the past, and his 100-page paper is well-written and demonstrates a good understanding of the relevant issues (you don’t have to take my word for it). Even if the proof ends up having some sort of fatal flaw, it’s clear that he has some interesting new ideas that may lead to other discoveries. Hence the excitement.
  • Unfortunately, there have already been some possible flaws pointed out. But keep in mind that any paper of this magnitude is bound to contain tons of errors, omissions, and typos (if you don’t believe me, try writing one sometime!). Only time will tell whether these flaws turn out to be fatal problems that invalidate Deolalikar’s entire approach, mistakes that can be fixed relatively easily, or just misunderstandings on the part of the people who pointed them out.
  • Supposing the flaws are fixable and Deolalikar’s proof ends up being accepted by the mathematical community, it will still be quite a few years before he could possibly win the $1 million. First, the proof needs to be published in a major mathematical journal (which will probably take at least a year), then there is a mandatory two-year waiting period, and then a special committee has to examine the proof and decide whether to award the prize! So don’t hold your breath.

At this point, you may also be wondering what the heck the P vs NP problem actually is, and why it is so important. Fortunately, it’s the one Millenium Prize problem that I am actually qualified to explain, and in the following post or two I intend to do just that! (Unfortunately I am emphatically not qualified to explain anything about Deolalikar’s attempted proof.) I’ve been intending for quite a while to write about some interesting topics in the intersection of mathematics and computer science (my day job, after all, is as a computer scientist!) and this will provide the perfect excuse.

Irrationality of pi: the integral that wasn’t

Thursday, February 11th, 2010

And now for the punchline! Today we’ll show that, for large enough values of n,


0 < \int_0^\pi f(x) \sin (x)\, dx < 1,

completing the proof of the irrationality of \pi.

First, let’s show that f(x) \sin(x) is positive when 0 < x < \pi. We know that \sin(x) is positive for 0 < x < \pi. But I claim that f(x) is too. Remember that


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!}.

n! and x^n are clearly positive when x is positive; and a - bx is also positive when x < \pi = a/b. From here we simply note that if a function is positive over an entire interval, the integral of the function over that interval will be positive as well.

For the second part, note first that for 0 < x < \pi,


\displaystyle f(x) \sin(x) < \frac{\pi^n a^n}{n!}.

Why is this? Well, clearly x^n < \pi^n (since x < \pi), and also a - bx < a (since x > 0) and hence (a - bx)^n < a^n, so we conclude that


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!} < \frac{\pi^n a^n}{n!}.

This doesn’t yet include the \sin(x), but notice that multiplying by \sin(x) can only make things smaller, since \sin(x) is at most 1. Now, here’s the slightly sneaky part: I claim that we can make \frac{\pi^n a^n}{n!} as small as we want by making n big enough. Why is this? Notice that we can rewrite it as


\displaystyle \frac{\pi^n a^n}{n!} = \frac{\pi a}{1} \cdot \frac{\pi a}{2} \cdot \frac{\pi a}{3} \dots \frac{\pi a}{n}.

Now, a—the “denominator” of \pi—might be very large. It might have fourteen million zillion digits. But no matter how big a is, there will of course be an integer z which is bigger than \pi a, so \frac{\pi a}{z} < 1. And then \frac{\pi a}{z + 1} < 1, and \frac{\pi a}{z + 2} < 1, and so on... of course, multiplying by something less than one makes things smaller. And it might take a really really long time to cancel out the enormous product \frac{\pi a}{1} \cdot \frac{\pi a}{2} \dots \frac{\pi a}{z}, but if we just wait patiently it will get smaller and smaller... and eventually there will come some n for which


\displaystyle \frac{\pi a}{1} \cdot \frac{\pi a}{2} \dots \frac{\pi a}{z} \cdot \frac{\pi a}{z + 1} \dots \frac{\pi a}{n} < 1.

Actually, even this isn’t quite small enough: we want the integral from 0 to \pi of f(x) \sin(x) to be less than 1. But that’s not a problem; to ensure that we can just pick n big enough so that f(x) < 1/\pi (if the graph of f(x) fits inside a \pi by 1/\pi box, then its integral on this interval must be less than the area of the box).

Voila! An integral which is an integer absurdly between 0 and 1, all because we assumed \pi was rational.

The inescapable conclusion, which probably would have driven the ancient Greeks crazy, is that \pi is irrational!

Irrationality of pi: the impossible integral

Saturday, February 6th, 2010

We’re getting close! Last time, we defined a new function F(x) and showed that F(0) and F(\pi) are both integers, and that F^{\prime\prime}(x) + F(x) = f(x). So, consider the following:

 $ \begin{align*} &\frac{d}{dx} [ F'(x) \sin x - F(x) \cos x ] \\ &= F^{\prime\prime}(x)\sin x + F'(x) \cos x \\ & \qquad - F'(x) \cos x + F(x) \sin x \\ &= F^{\prime\prime}(x) \sin x + F(x) \sin x \\ &= [F^{\prime\prime}(x) + F(x)]\sin x \\ &= f(x) \sin x. \end{align*} $

The first step uses the product rule for differentiation (recalling that \frac{d}{dx}\sin x = \cos x and \frac{d}{dx}\cos x = - \sin x); the last step is what we showed last time. Now we see the point of defining F(x): it’s just so that we have a convenient way to talk about the antiderivative of f(x) \sin x. We could just do everything directly in terms of alternating sums of derivatives of f(x)… but it’s much clearer this way, don’t you agree?

Now that we know the antiderivative of f(x)\sin x, we can use the Fundamental Theorem of Calculus to compute the following integral:

 $ \begin{align*} &\int_0^\pi f(x)\sin x dx \\ &= \left[ F'(x) \sin x - F(x) \cos x \right]_0^\pi \\ &= F'(\pi) \sin \pi - F(\pi) \cos \pi \\ & \qquad - F'(0) \sin 0 + F(0) \cos 0 \\ &= F(\pi) + F(0). \end{align*} $

Note that the value of this integral is an integer, since both F(\pi) and F(0) are integers. But next time we’ll show that it is also strictly between 0 and 1 (for a suitable choice of n), which is clearly nonsense!

Irrationality of pi: curiouser and curiouser

Saturday, January 30th, 2010

I’ve been remiss in posting here lately, which I will attribute to Christmas and New Year travelling and general craziness, and then starting a new semester craziness… but things have settled down a bit, so here we go again!

Since it’s been a while since my last post in this series, here’s a quick recap: I’m presenting a proof by Ivan Niven that \pi is irrational, that is, that it cannot be represented as the ratio of two integers (and hence its decimal expansion goes on forever without repeating). My first post just gave some background and an outline of the general argument. In my second post, we began by assuming that \pi is rational, and defined the function


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!}

(really, a family of functions, one for each value of n) where a and b are the “numerator” and “denominator” of \pi. We then showed that f(0) = f(\pi) = 0, and in fact that f(x) is symmetric, with f(\pi - x) = f(x). In my third post, we showed that all the derivatives of f(x) take on integer values when evaluated at both 0 and \pi. We’re about halfway there! Today we’ll continue by defining a new function F(x) in terms of f(x), and show some of its properties. Recall too our overall plan: we’re going to wind up with an integral which is strictly greater than 0, strictly less than 1, and also an integer! Since this is clearly nonsense (there are no integers between 0 and 1) we will conclude that our initial assumption—that \pi is rational—was bogus, and that \pi must be irrational after all.

So without further ado, here’s our new function F(x). Actually, this too is technically a family of functions F_n(x), one for each n; but again, everything we prove about it will be true no matter what n is.


\displaystyle F(x) = f(x) - f^{(2)}(x) + f^{(4)}(x) - \dots + (-1)^n f^{(2n)}(x).

In words, F(x) is the alternating sum of all the even derivatives of f(x). (I say “all” because, as noted in my last post, any derivative of f(x) higher than 2n is zero.) Using Sigma notation, we can also write this more concisely as


\displaystyle F(x) = \sum_{i = 0}^n (-1)^i f^{(2i)}(x).

There are a few things to note. First, think what happens when we evaluate F(0): since all the derivatives of f(x) take on integer values at 0, and F(x) is just a sum of a bunch of derivatives of f(x), F(0) must be an integer too. Of course, the same thing goes for F(\pi).

Next, consider


F^{\prime\prime}(x) + F(x).

Since the derivative of a sum is the sum of the derivatives, we can compute F^{\prime\prime}(x) as


F^{\prime\prime}(x) = f^{(2)}(x) - f^{(4)}(x) + \dots + (-1)^{n-1}f^{(2n)}(x).

That is, f(x) turns into f^{(2)}(x), -f^{(2)}(x) turns into -f^{(4)}(x), and so on. “But wait a minute,” you say. “Shouldn’t the (-1)^n f^{(2n)}(x) at the end of F(x) turn into (-1)^n f^{(2n+2)}(x) in F^{\prime\prime}(x)?” In fact, it does—but as noted before, f^{(2n+2)}(x) is zero, so that term just goes away. Now we note that every term of F(x) has a corresponding term in F^{\prime\prime}(x) of the opposite sign, except f(x), which has no corresponding term. So when we add F(x) and F^{\prime\prime}(x), everything cancels except f(x):


F^{\prime\prime}(x) + F(x) = f(x).

Astute readers will note a funny resemblance between the definition of F(x) and the Taylor series for \cos(x)… and indeed, next time we’ll start making some connections with our old trigonometric friends, \sin and \cos.

Irrationality of pi: derivatives of f

Sunday, December 20th, 2009

In my previous post in this series, we defined the function


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!}

and showed that f(0) = f(\pi) = 0. Today we’ll show the surprising fact that, for every positive integer i, although f^{(i)}(0) and f^{(i)}(\pi) are not necessarily zero, they are always integers. (The notation f^{(i)} means the ith derivative of f; that is, take the derivative of f, then the derivative of that, then the derivative of that, … i times.) Put more succinctly: every derivative of f takes on integer values at x = 0 and x = \pi.

Why might this be surprising? It’s surprising because of the n! in the denominator of f. For example, consider the function (which I just made up):


g(x) = \frac{x^3 + 5x}{6}.

It’s easy to see that g(0) = 0. But let’s take the derivative: g'(x) = \frac{3x^2 + 5}{6}, so g'(0) = 5/6, which is clearly not an integer. For the derivatives of f to always give an integer at x = 0 (let alone at x = \pi) there must be some fancy canceling going on!

For now we will consider only f^{(i)}(0) (we’ll come back to f^{(i)}(\pi) later). Of course, substituting 0 for x causes every term containing x to disappear, so f^{(i)}(0) is just the constant term of f^{(i)}(x). Hence, we must show that the constant term of f^{(i)}(x) is always an integer.

Consider the numerator of f(x), that is,


n!f(x) = x^n (a - bx)^n

Note that (a - bx)^n, when expanded out, is a polynomial of the form a^n - \dots + (-b)^n x^n, where the ellipsis contains a bunch of terms with integer coefficients and powers of x between 1 and n-1. (In fact, we could use the Binomial Theorem to compute the precise coefficients—but it really doesn’t matter; all we will care about is that they are integers.) Multiplying by x^n, we see that


n!f(x) = a^n x^n - \dots + (-b)^n x^{2n}

so n!f(x) is a polynomial with terms of degree n through 2n, and hence so is f(x), since dividing by n! changes the coefficients but not the exponents. (Note that f(x) has no constant term, so f(0) = 0—but we already knew that.)

Recall that the derivative of x^k is k x^{k-1}, so taking the derivative of a polynomial reduces each of the exponents by one. So the first derivative of f(x) is a polynomial with terms of degree n-1 through 2n - 1 (and hence a constant term of zero); the second derivative has terms of degree n-2 through 2n - 2 (still no constant term); and so on. We can see that none of the first n-1 derivatives of f(x) will have a constant term, so f^{(i)}(0) = 0 (which is certainly an integer) for i < n. What about the nth derivative and higher? This is where the fancy canceling comes in!

As we noted above, when expanded out f(x) is a sum of a bunch of terms of the form


\displaystyle \frac{c_i x^{n+i}}{n!}

where 0 \leq i \leq n and c_i is some integer. When we take the derivative, this term will turn into (n+i) c_i x^{n+i-1}/n!; if we take the derivative again, it will become (n+i)(n+i-1) c_i x^{n+i-2}/n!; another derivative gives us (n+i)(n+i-1)(n+i-2) c_i x^{n+i-3}/n!, and so on. Do you see what is happening? After taking the derivative exactly n+i times, we will end up with the constant term


\displaystyle \frac{(n+i)! c_i}{n!}

and here’s our fancy canceling: (n+i)! is clearly divisible by n!, so this is some integer times c_i, which is also an integer. Voila! Said a different way, and more succinctly: since each term of f(x) has degree at least n, by the time we have taken the derivative enough times for it to yield a constant term, the n! will be canceled from the denominator, since we will have taken the derivative at least at each power of x from n down to 1.

Finally, if we take the derivative of f more than 2n times, we get 0, so no problems there.

Great, so f^{(i)}(0) is always an integer. But what about f^{(i)}(\pi)? Well, remember, last time we showed that f(\pi - x) = f(x). If we take the derivative of both sides with respect to x (being careful to use the chain rule on the left side, noting that the derivative of \pi - x with respect to x is -1), we get


 $ \begin{align*}\frac{d}{dx}f(\pi - x) &= \frac{d}{dx} f(x) \\ -f'(\pi - x) &= f'(x) \end{align*} $

We can repeat this process to find that f^{(2)}(\pi - x) = f^{(2)}(x) (the two negatives cancel on the left side), -f^{(3)}(\pi - x) = f^{(3)}(x), and so on. But the extra negative sign for odd derivatives doesn’t really matter: in either case, f^{(i)}(\pi) = \pm f^{(i)}(\pi - \pi) = \pm f^{(i)}(0), which is an integer.

Getting closer! Next time, we will define another special function F(x) in terms of f(x) and its derivatives; this function F(x) will help us compute \int_0^\pi f(x) \sin (x) dx—which (if you recall the punchline) will turn out to be an integer strictly between 0 and 1 (which is impossible).

Irrationality of pi: the unpossible function

Saturday, December 12th, 2009

Recall from my last post what we are trying to accomplish: by assuming that \pi is a rational number, we are going to define an unpossible function! So, without further ado:

Suppose \pi = \frac{a}{b}, where a and b are positive integers. Define the function f like this:


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!}.

(In case you’ve forgotten, n!, pronounced “n factorial,” is the product of all the numbers from 1 to n.) “OK… but… what is n?” I hear you ask. Good question. The short answer is, it doesn’t matter: n can be any positive integer. We will show a bunch of things that are true about f no matter what n is. Later, we will see that we get a contradiction only for values of n which are “big enough.” But that’s OK; since everything we prove up to that point will be true no matter what n is, we can pick a value of n which is as big as we like.

Let’s explore some properties of f(x). First, it’s easy to see that f(0) = \frac{0^n a^n}{n!} = 0. It’s not too hard to see that f(\pi) = 0 as well (remembering that \pi = a/b, of course, which means that a-b\pi = a - a = 0):

 $ \begin{align*}f(\pi) &= \frac{\pi^n (a - b\pi)^n}{n!} \\ &= \frac{\pi^n 0^n}{n!} = 0.\end{align*} $

So f(x) has zeros at x = 0 and x = \pi. But more is true: in fact, f(x) is symmetric (a mirror reflection of itself) around the line x = \pi/2. That is,


f(x) = f(\pi - x) = f(a/b - x).

Let’s prove this:

 $ \begin{align*}f(a/b - x) &= \frac{(a/b - x)^n(a - b(a/b - x))^n}{n!} \\ &= \frac{(a/b - x)^n(a - a + bx)^n}{n!} \\ &= \frac{(a/b - x)^n b^n x^n}{n!} \\ &= \frac{(a - bx)^n x^n}{n!} = f(x). \end{align*} $

“I don’t see what’s so unpossible about f so far,” you say? Patience! (Of course, it isn’t really f itself which is the problem; the problem is our insistence that f is actually defined in terms of the “numerator” and “denominator” of \pi…)

Next time, we’ll see that the derivatives of f also have some special properties.

Irrationality of pi

Monday, December 7th, 2009

Everyone knows that \pi—the ratio of any circle’s diameter to its circumference—is irrational, that is, cannot be written as a fraction a/b. This also means that \pi’s decimal expansion goes on forever and never repeats …but have you ever seen a proof of this fact, or did you just take it on faith?

The irrationality of \pi was first proved (according to modern standards of rigor) in 1768 by Lambert, but his proof was rather complicated. A more elementary proof, using only basic calculus, was given in 1947 by Ivan Niven. You can read his original paper here, but it’s rather terse! Just as I did for Calkin and Wilf’s paper, Recounting the Rationals, I plan to write a series of posts explaining Niven’s proof in a bit more detail, with some accompanying intuition. I’ll assume a basic knowledge of calculus; if you don’t know calculus, just hang tight for a few posts!

Here’s the basic outline of the proof. We begin by supposing that \pi is rational: in particular, suppose \pi = a/b for some integers a and b. We’ll then use these values of a and b to define a special function f(x), about which we will show the following:


\int_0^\pi f(x) \sin(x) dx is an integer, AND

0 < \int_0^\pi f(x) \sin(x) dx < 1.

But this is absurd! There are no integers greater than zero and less than one. The inescapable conclusion will be that our initial assumption—that \pi = a/b—was false.

In my next post, we’ll define the special function f(x) and begin exploring some of its properties.

The hyperbinary sequence and the Calkin-Wilf tree

Sunday, October 18th, 2009

And now, the amazing conclusion to this series of posts on Neil Calkin and Herbert Wilf’s paper, Recounting the Rationals, and the answers to all the questions about the hyperbinary sequence. Hold on to your hats!

The Calkin-Wilf Tree

First, recall the Calkin-Wilf tree, defined as follows: the “root” of the tree is the fraction 1/1, and every node labeled i/j in the tree has two “children”, with the left child labeled i/(i+j), and the right labeled (i+j)/j. Here’s a picture of the first four levels of the tree (of course, the tree is infinite):



Now wait just one minute… those numbers look familiar! Let’s list those fractions in order, one level at a time, from left to right:


\displaystyle \frac 1 1, \frac 1 2, \frac 2 1, \frac 1 3, \frac 3 2, \frac 2 3, \frac 3 1, \frac 1 4, \frac 4 3, \frac 3 5, \frac 5 2, \frac 2 5, \frac 5 3, \frac 3 4, \frac 4 1 \dots

That’s right, it’s the hyperbinary sequence! In particular, the numerators are the hyperbinary sequence starting at index 0, and the denominators follow the same sequence, offset by one.

…but why?! The Calkin-Wilf tree is defined by a recursive rule for manipulating fractions, and the hyperbinary sequence is defined in terms of ways to write numbers as sums of powers of two… it certainly isn’t a priori obvious why these should have anything to do with each other!

Labelling the CW-tree

Let’s label the nodes in the Calkin-Wilf tree with positive integers, starting with 1 at the root and proceeding in order by levels, from left to right. This means that the fraction at node n is h(n-1)/h(n)—at least, that’s what we’re trying to prove! Here’s an illustration of our numbering scheme:



Now, what do you notice about the relationship between the label on a node and the labels on its children? It’s not too hard to guess the pattern: the left child of the node labelled k is labelled 2k, and the right child is labelled 2k+1. How can we prove this?

Well, it’s certainly true for the root note: it’s labelled 1, and its children are labelled 2 and 3. Now we note that if the pattern holds for the node immediately preceding node k—that is, the children of node k-1 are 2k-2 and 2k-1—then the children of node k come immediately after the children of k-1, so they are numbered 2k and 2k+1. If you think about it, this is true regardless of whether k-1 and k are on the same level, or k-1 is the end of one level and k is the beginning of the next.

We can think of labelling each edge of the Calkin-Wilf tree with either a 0 or a 1: 0 for left edges, and 1 for right edges. Then taking all the zeros and ones along the path from the root to any node and sticking an extra 1 at the beginning gives us the label of that node in binary! For example, the path that goes left, left, right corresponds to 0, 0, 1, and adding an extra 1 to the front gives us 1001–which is indeed the binary representation of 9!



So, is it really true that the fraction at node n is always h(n-1)/h(n)? It is indeed, and we now have the tools we need to see why. First of all, the root node of the tree—that is, node 1—is (by definition) 1/1, which is indeed h(0)/h(1). Next, suppose that the fraction at node k is h(k-1)/h(k); we need to show that the fraction at the left child (which is node 2k) is h(2k-1)/h(2k), and the fraction at the right child is h(2k)/h(2k+1). But this is now easy: by the way the Calkin-Wilf tree is defined, the left child is

\displaystyle\frac{h(k-1)}{h(k-1) + h(k)} = \frac{h(2k-1)}{h(2k)}.

The equality, of course, follows directly from the recurrence relation for the hyperbinary sequence! Likewise, the right child is

\displaystyle \frac{h(k-1) + h(k)}{h(k)} = \frac{h(2k)}{h(2k+1)}.

Solving the \phi(n) mystery

As pointed out by JM, this connection with the Calkin-Wilf tree is exactly what we need to prove Fergal Daly’s conjecture about the number of primary occurrences of any given number.

We defined a primary occurrence simply as a number at an even position in the hyperbinary sequence. Of course, even-labelled nodes in the Calkin-Wilf tree are exactly the ones which are left children, and so the primary occurrences are the denominators of left children (equivalently, the numerators of right children). Also, since left children are always of the form i/(i+j), left children always have denominators bigger than their numerators. Finally, as you may recall, we showed earlier that every positive rational number occurs somewhere in the Calkin-Wilf tree, in lowest terms, exactly once.

Putting this all together: each primary occurrence of n corresponds to a fraction m/n in the Calkin-Wilf tree, where m < n. And since every fraction occurs exactly once in lowest terms, there is exactly one such fraction m/n for each positive integer m < n which is relatively prime to n. Of course, by definition, there are \phi(n) such numbers, so this is the proof of Fergal’s conjecture! There are exactly \phi(n) primary occurrences of n, since that’s how many reduced fractions there are between 0 and 1 which have a denominator of n.

Computing primary occurrences

The connection with the Calkin-Wilf tree also allows us to compute h^{-1}—that is, given any n, we can efficiently (in O(n \log n) time) compute all the primary occurrences k such that h(k) = n!

How does this work? Well, if you remember, the Euclidan Algorithm gives us a way to find our way up the Calkin-Wilf tree from any starting fraction. Combining this with the relationship we discovered earlier between the label on a node and the labels on its children allows us to compute the label on the node where we started.

As an example, let’s compute one of the primary occurrences of 5. In particular, let’s do the one that corresponds to the fraction 3/5. Since 3 is smaller than 5, 3/5 must be the left child of 3/(5-3) = 3/2. 3 is bigger than 2, so 3/2 is the right child of (3-2)/2 = 1/2. Finally, 1/2 is the left child of 1/1. Now we can work backwards: 1/1 is node 1; 1/2 is to the left, so it’s node 2*1 = 2; 3/2 is to the right of that, so it’s node 2*2 + 1 = 5; and 3/5 is to the left again, so it’s node 2*5 = 10. Therefore h(10) = 5 is a primary occurrence of 5! We can also think of this more succinctly by recalling that paths in the Calkin-Wilf tree correspond to binary representations of the labels. So we can run the Euclidean Algorithm, getting a zero or one at each step, and then reverse them and put an extra 1 at the beginning to get the binary representation of the primary occurrence. To be more precise:

  1. Start with the pair (m,n) where m < n is relatively prime to n. Call the current pair of numbers (a,b).
  2. Given (a,b), if a < b then write (a,b) \stackrel{0}{\to} (a,b-a). Otherwise write (a,b) \stackrel{1}{\to} (a-b,b).
  3. Repeat step 2 until reaching (1,1).
  4. Read off the primary occurrence of n by prefixing 1 to the reverse of the binary digits written over the arrows.

All of this leads to the following (somewhat mind-blowing) observation: the position of each primary occurrence of n in the hyperbinary sequence is a proof (encoded in binary) that n is relatively prime to the number that comes immediately before it.

To tie this all together, here’s a little Haskell program I wrote to compute primary occurrences:

import System.Environment (getArgs)
import Control.Monad (forM_)
import Data.Maybe (catMaybes)
import Data.List (sort)

-- Find a fraction in the Calkin-Wilf tree.  findCW a b computes the
-- location of a/b in the Calkin-Wilf tree, or returns Nothing if a
-- and b are not relatively prime.
findCW :: Integer -> Integer -> Maybe Integer
findCW 1 1 = Just 1
findCW a b | a < b  = left  `fmap` findCW a (b-a)
           | a > b  = right `fmap` findCW (a-b) b
           | a == b = Nothing

left  x = 2*x
right x = 2*x + 1

-- Find the primary occurrences of n.
primaryOccs :: Integer -> [Integer]
primaryOccs n = sort . catMaybes $ [ findCW m n | m <- [1..n] ]

main :: IO ()
main = do (from:to:_) <- getArgs
          forM_ [read from..read to] $ \i -> do
            putStr $ show i ++ ": "
            putStrLn . unwords . map show . primaryOccs $ i

If there’s interest, I could write another post explaining how this program works in a bit more detail, but hopefully you can mostly understand what it’s doing from the preceding discussion. Let’s check that it works:

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 2 16
2: 2
3: 4 6
4: 8 14
5: 10 12 16 30
6: 32 62
7: 18 22 24 28 64 126
8: 20 26 128 254
9: 34 46 48 60 256 510
10: 38 56 512 1022
11: 36 40 54 58 66 94 96 124 1024 2046
12: 44 50 2048 4094
13: 42 52 70 78 112 120 130 190 192 252 4096 8190
14: 68 80 110 122 8192 16382
15: 72 118 258 382 384 508 16384 32766
16: 92 98 134 158 224 248 32768 65534

Yup, looks good! And to show that it really is efficient, let’s use it to compute the primary occurrences of 1000 (yes, all 400 of them!):

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 1000 1000
1000: 39770 42278 56024 58532 68396 70568 71444 75674 76688 85046 86480 92576 104030
... 386 more primary occurrences...
1071508607186267320948425049060001810561404811705533607443750388370351051124936
1224931983788156958581275946729175531468251871452856923140435984577574698574803
9345677748242309854210746050623711418779541821530464749835819412673987675591655
43946077062914571196477686542167660429831652624386837205668069374

On my computer this runs in about 0.03 seconds. Sweet! Note that the position of the final primary occurrence of 1000 has 303 digits–that’s approximately one centillion. I certainly wouldn’t want to go through the first one centillion numbers in the hyperbinary sequence looking for 1000 by hand, would you?

Further Reading

I hope you’ve enjoyed this little (1.5-year-long!) excursion into the fascinating world opened up by Calkin and Wilf’s paper. (I’ve sure enjoyed writing it!) If you’re interested in exploring more, I suggest looking at Roland Backhouse and João Ferreira’s paper, Recounting the Rationals, Twice! which also explains the connection to another famous tree of fractions, the Stern-Brocot tree. If you’re interested in the computational aspect of the hyperbinary numbers, have a look at Enumerating the Rationals, by Jeremy Gibbons, David Lester, and Richard Bird.

Hyperbinary conjecture seeking proof for a good time, long walks on the beach

Monday, October 12th, 2009

Here’s the latest progress on the hyperbinary sequence. We’re trying to figure out the inverse relation of the function h: given a particular number n, where does it occur in the hyperbinary sequence? That is, what are the values of k for which h(k) = n?

There are infinitely many, but in a previous post I argued why we only need to find occurrences at even positions of the sequence, which we call primary occurrences. I have no idea how easy or hard it is to give a general method for finding all primary occurrences. But some progress has been made:

  • Brendan proved by induction that h(2^k - 2) = k and h(2^k) = k+1. These correspond to the numbers that occur right next to 1 in the sequence (we saw earlier that h(2^k-1) = 1).
  • Brendan also proved that h(2^{p+q} - 2^q) = 1 + pq. This is impressive, since this pattern certainly isn’t obvious (at least, it wasn’t to me!).
  • Fergal Daly conjectured that the number of primary occurrences of n is \phi(n). \phi denotes the so-called Euler totient function; \phi(n) is defined to be the number of positive integers smaller than n which are relatively prime to n. An explanation of this fact, if it is true (and it really looks like it might be!) would probably go a long way towards finding a general method for computing h^{-1}!

Can anyone find a proof of Fergal’s conjecture?

More hyperbinary fun

Monday, September 28th, 2009

When I originally posed Challenge #12, a certain Dave posted a series of comments with some explorations and partial solutions to part II (the hyperbinary sequence). Although I gave the “solution” in my last post, no solution to any problem is ever the last word—there’s always more to explore, more connections to see! It’s worth taking another post to explore some of Dave’s discoveries and conjectures, and how they connect to my last post. I’ve used different notations, made up some terms, tried to simplify things where possible, and so on, but all the ideas in this post are fundamentally Dave’s.

First things first

Dave first noticed that

h(2^k - 1) = 1

for all k \geq 0. I noted this in my previous post as well, but said only that this is “apparent” from thinking about the recursive construction I described. However, it’s worth going through a proof of this more formally; it’s a good example of a proof by induction. First, let’s recall the recurrence relation that defines the hyperbinary sequence:

$ \begin{align*}h(0) &= 1 \\ h(2n+1) &= h(n) && \text{(O)}\\ h(2n+2) &= h(n+1) + h(n) && \text{(E)}\end{align*} $

Now we can show that h(2^k - 1) = 1 for k \geq 0. First, the base case: when k=0, h(2^0 - 1) = h(0) = 1; check!

Now, let’s suppose that h(2^k - 1) = 1 for some particular k \geq 0; this is called the inductive hypothesis, which I will abbreviate as IH. we want to show that h(2^{k+1} - 1) = 1 as well. 2^{k+1} - 1 is obviously odd, so we want to apply rule (O). The only trick is to get it into the right form first:

$ \begin{align*}h(2^{k+1} - 1) &= h(2^{k+1} - 2 + 1) && \text{\{ arithmetic \}}\\ &= h(2(2^k - 1) + 1) && \text{\{ factor \}}\\ &= h(2^k - 1) && \text{\{ rule (O) \}} \\ &= 1 && \text{\{ IH \}}\end{align*} $

Presto! Since the thing we wanted to show is true when k=0 and is true for k+1 whenever it is true for k, by induction it must be true for all k \geq 0.

Second things, third things…

But Dave then went on to notice that this sort of pattern isn’t unique to 1. For example,

h(3 \cdot 2^k - 1) = 2

for k \geq 0, and

h(5 \cdot 2^k - 1) = h(7 \cdot 2^k - 1) = 3

also for k \geq 0. Dave conjectured that this generalizes, and indeed it does: most generally, we can say that

h(p \cdot 2^k - 1) = h(p - 1)

for k \geq 0. That is, once a certain number occurs at position p - 1, it will keep occurring in a regular pattern after that, at every position of the form p \cdot 2^k - 1. Note that h(2^k - 1) = 1 is a special case with p = 1.

Why is this? Intuitively, it’s because of the “copying”—the blue numbers in my previous post. Once a number occurs, it will keep getting copied, but with the distance between the copies doubling each time. But again, this is worth proving formally. I’ll let you do this one—the proof by induction is similar to the proof above!

But why do occurrences of 1 correspond to p = 1, and occurrences of 2 to p = 3, and occurrences of 3 to p = 5 and p = 7, and so on? Are there any patterns to be found here? This is what Dave explored next; let’s follow.

Primary occurrences and the inverse of h

Let’s think a little more about this equation h(p \cdot 2^k - 1) = h(p - 1). Let’s suppose that p is odd (if it isn’t, we can always take another factor of two out of it to combine with the 2^k). Notice that p \cdot 2^k - 1 is always odd, except when k = 0 (in which case p \cdot 2^k - 1 = p - 1, which is even). Moreover, we can always write any odd number in the form p \cdot 2^k - 1, where p is odd: just add one, and factor out as many copies of 2 as possible. For example, 11 = 12 - 1 = 3 \cdot 2^2 - 1.

This means that if q is any odd number, we can write it in the form q = p \cdot 2^k - 1, and so h(q) = h(p-1): that is, every odd position in the hyperbinary sequence is just copied from an earlier even position. Of course, this can also be seen from the recursive construction in my previous post; but now we have really proved it! And what about the even positions? The values of the hyperbinary sequence at even positions are obtained by adding, rather than copying. If n occurs at an even position, we will call it a primary occurrence of n.

The nice thing about primary occurrences is that (a) there are only finitely many primary occurrences of any particular n (challenge: prove this!), and (b) once we know the positions of the primary occurrences of n, we know the positions of all the occurrences of n: they are at positions of the form p \cdot 2^k - 1, where p - 1 is a primary occurrence.

So, given n, can we find all its primary occurrences? Essentially we are asking if we can compute h^{-1} (by which I mean the inverse relation of h; since h is not injective its inverse is obviously not a function), but leaving out non-primary occurrences since we can easily find those from primary ones. Let’s start by making a table with some small values of n:

n Primary occurrences
1 0
2 2
3 4, 6
4 8, 14
5 10, 12, 16, 30
6 32, 62
7 18, 22, 24, 28, 64, 126
8 20, 26, 128, 254
9 34, 46, 48, 60, 256, 510
10 38, 56, 512, 1022
11 36, 40, 54, 58, 66, 94, 96, 124, 1024, 2046
12 44, 50, 2048, 4094
13 42, 52, 70, 78, 112, 120, 130, 190, 192, 252, 4096, 8190
14 68, 80, 110, 122, 8192, 16382
15 72, 118, 258, 382, 384, 508, 16384, 32766
16 92, 98, 134, 158, 224, 248, 32768, 65534

Do you see any patterns? If so, can you prove them by induction? I’ll talk about a few in my next post. But I still don’t know whether there’s an effective way to compute h^{-1}(n)!