Archive for the ‘convergence’ Category

Irrationality of pi: the integral that wasn’t

Thursday, February 11th, 2010

And now for the punchline! Today we’ll show that, for large enough values of n,


0 < \int_0^\pi f(x) \sin (x)\, dx < 1,

completing the proof of the irrationality of \pi.

First, let’s show that f(x) \sin(x) is positive when 0 < x < \pi. We know that \sin(x) is positive for 0 < x < \pi. But I claim that f(x) is too. Remember that


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!}.

n! and x^n are clearly positive when x is positive; and a - bx is also positive when x < \pi = a/b. From here we simply note that if a function is positive over an entire interval, the integral of the function over that interval will be positive as well.

For the second part, note first that for 0 < x < \pi,


\displaystyle f(x) \sin(x) < \frac{\pi^n a^n}{n!}.

Why is this? Well, clearly x^n < \pi^n (since x < \pi), and also a - bx < a (since x > 0) and hence (a - bx)^n < a^n, so we conclude that


\displaystyle f(x) = \frac{x^n(a - bx)^n}{n!} < \frac{\pi^n a^n}{n!}.

This doesn’t yet include the \sin(x), but notice that multiplying by \sin(x) can only make things smaller, since \sin(x) is at most 1. Now, here’s the slightly sneaky part: I claim that we can make \frac{\pi^n a^n}{n!} as small as we want by making n big enough. Why is this? Notice that we can rewrite it as


\displaystyle \frac{\pi^n a^n}{n!} = \frac{\pi a}{1} \cdot \frac{\pi a}{2} \cdot \frac{\pi a}{3} \dots \frac{\pi a}{n}.

Now, a—the “denominator” of \pi—might be very large. It might have fourteen million zillion digits. But no matter how big a is, there will of course be an integer z which is bigger than \pi a, so \frac{\pi a}{z} < 1. And then \frac{\pi a}{z + 1} < 1, and \frac{\pi a}{z + 2} < 1, and so on... of course, multiplying by something less than one makes things smaller. And it might take a really really long time to cancel out the enormous product \frac{\pi a}{1} \cdot \frac{\pi a}{2} \dots \frac{\pi a}{z}, but if we just wait patiently it will get smaller and smaller... and eventually there will come some n for which


\displaystyle \frac{\pi a}{1} \cdot \frac{\pi a}{2} \dots \frac{\pi a}{z} \cdot \frac{\pi a}{z + 1} \dots \frac{\pi a}{n} < 1.

Actually, even this isn’t quite small enough: we want the integral from 0 to \pi of f(x) \sin(x) to be less than 1. But that’s not a problem; to ensure that we can just pick n big enough so that f(x) < 1/\pi (if the graph of f(x) fits inside a \pi by 1/\pi box, then its integral on this interval must be less than the area of the box).

Voila! An integral which is an integer absurdly between 0 and 1, all because we assumed \pi was rational.

The inescapable conclusion, which probably would have driven the ancient Greeks crazy, is that \pi is irrational!

Square roots with pencil and paper: the Babylonian method

Monday, May 18th, 2009

Everyone knows how to add, subtract, multiply and divide with pencil and paper; but do you know how to find square roots without a calculator? (Incidentally, I highly recommend reading The Feeling of Power by Isaac Asimov, a short story about a future in which humans are so reliant on computers that they have forgotten how to do arithmetic.)

An obvious method is to guess and check while keeping track of lower and upper bounds. For example, if we wanted to find the square root if 7, we might start by guessing that the square root is 2. Computing 2^2 = 4, we see that 2 is too small. So we try 3: 3^2 = 9, so 3 is too big! So we know the square root of 7 must be somewhere in between 2 and 3. Let’s try 2.5: 2.5^2 = 6.25. So 2.5 is too small, and the square root of 7 is somewhere between 2.5 and 3. We might try 2.7 next (too big), and so on.

This works, but it is extremely tedious and inefficient! We can cut the search range in half at each step, but this means that on average we only add a single new decimal place every 3.3 steps or so (3.3 \approx \log_2 10). Not to mention that at each step we have to compute the square of increasingly long numbers. There are at least two better methods; I’ll share one of them today and one in a future post.

The first method is often called the “Babylonian method” since it was known to the ancient Babylonians. Here’s how it works. Say we are trying to find the square root of N. Just like with the guess and check method, we start out with some guess R. Then we compute a new value for R as follows:


\displaystyle R' = \frac{R + N/R}{2}.

Repeating this process will result in closer and closer approximations to \sqrt{N}.

Let’s try an example, again using N = 7 and R_0 = 2 as our initial guess. We can compute a few iterations of the process according to the above formula:

 $ \begin{align*} R_1 &= (R_0 + 7/R_0)/2 = (2 + 7/2)/2 \\ &= (11/2)/2 = 11/4 \\ R_2 &= (R_1 + 7/R_1)/2 = (11/4 + 28/11)/2 \\ &= (121/44 + 112/44)/2 = 233/88 \\ R_3 &= (R_2 + 7/R_2)/2 = (233/88 + (7 \cdot 88)/233)/2 \\ &= (54289/20504 + 54208/20504)/2 \\ &= 108497/41008 \\ R_4 &= \dots = 23543191457/8898489952. \end{align*} $

How close did we get? The true value of \sqrt{7}, to 15 decimal places, is

\sqrt{7} = 2.645751311064591\dots

(Incidentally, I computed this using Wolfram|Alpha by typing “sqrt 7 to 15 digits”.) Here are our approximations, with the correct decimal places in bold:

 $ \begin{align*} R_0 &= \mathbf{2} \\ R_1 &= \mathbf{2}.75 \\ R_2 &= \mathbf{2.64}77\overline{27} \\ R_3 &= \mathbf{2.64575}20483808037\dots \\ R_4 &= \mathbf{2.645751311064}6933\dots \end{align*} $

Wow! That converges pretty fast. In fact, this method converges quadratically—the number of correct decimal places approximately doubles with every step!

So, why does this work? Well, first of all, note that if R = (R + N/R)/2 (that is, if R is a fixed point of this operation), then

 $ \begin{align*} 2R &= R + N/R \\ R &= N/R \\ R^2 &= N \\ R &= \sqrt{N}. \end{align*} $

Also, it is not too hard to see that \sqrt{N} must lie in between R and N/R, since R \cdot N/R = N; so taking their average (which is essentially what the Babylonian method does) will necessarily give us a better approximation to \sqrt{N} at each step.

The Babylonian method is one of the fastest-converging methods for computing square roots, but it can be somewhat inconvenient. You have to choose whether to do all the calculations with fractions and then convert to a decimal representation at the end (as I did above), which means you have to deal with multiplying rather large numbers; or use a decimal representation throughout, which means you have to do some annoying long division. There’s another method which doesn’t converge as quickly but can be much more convenient, since it explicitly uses decimal notation and involves somewhat more manageable operations; I’ll describe this other method in an upcoming post.

For more reading on the Babylonian method and a number of related generalizations, check out this MathPages article.

Predicting pi: pretty graphs and convergents

Friday, January 16th, 2009

Recall the challenge I posed in a previous post: given the sequence of integers \floor{\pi}, \floor{2\pi}, \floor{3\pi}, \ldots, what can you learn about \pi (assuming you didn’t know anything about it before)? The answer, as explained in another post, is that you can learn \pi to whatever precision you like, if you wait long enough. Not only that, but you can do better than is initially obvious. You can learn n digits of the decimal expansion of \pi if you wait for \floor{10^n \pi}, but this is throwing away too much information: instead, noting that \floor{n \pi} \leq n\pi < \floor{n \pi} + 1, we get upper and lower bounds on \pi from each term in the sequence:


\displaystyle\frac{\floor{n\pi}}{n} \leq \pi < \frac{\floor{n\pi} + 1}{n}.

Let’s graph these upper and lower bounds for various values of n:

Upper and lower bounds for pi, n = 1 to 200

Neato! We can view this in a slightly different way by graphing the error (the amount under or over \pi) of the approximations instead of the approximations themselves:

Upper and lower bound errors for pi, n = 1 to 200

So, that looks pretty cool… but the best approximations quickly get so close to \pi that it’s hard to see what’s really going on. So now we’re going to (a) zoom way out—the graph will go to n = 50000 instead of just 200, and (b) use a log-log scale—that is, we’ll graph the logarithm of both the error and n.

log-log plot of upper and lower bound errors for pi, n = 1 to 50000

Cool! Now we can see some more interesting structure at the bottom of the graph (in the previous graph, the approximations quickly got so close to the axis that it was impossible to tell what was going on; this is why plotting the logarithm of the error helps). We can see that most of the approximations are in that big mass sloping gently down and to the right. This represents about how well we would do at approximating pi if we just looked at the 10^nth elements of the sequence. But the interesting thing is the approximations which are not part of that big mass. At several places—most notably at around n=5, n=100, and n=30000—there are downward spikes, representing approximations which jump out as being way better than most of the other ones at that point. These are precisely the convergents of \pi, which are (in a specific technical sense) the best rational approximations to \pi. This is why, by looking at all the approximations and choosing the best ones, we can do much better than just looking at the 10^nth elements of the sequence (imagine a line drawn through the bottommost points in the graph; it’s much steeper than the average slope of the big mass of approximations above it). In a future post I hope to write a bit more about these so-called “convergents” and where they come from; I’ve actually written a bit about them before, but didn’t point it out at the time!

log-log plot of upper and lower bound errors for pi, n = 100 to 30000

Predicting Pi: solution

Thursday, October 23rd, 2008

Now for the solution to the question in my previous post, which asked what you can learn about \pi, given the sequence of integers \floor{\pi}, \floor{2\pi}, \floor{3\pi}, \dots.

Nick Johnson commented:

Well, the obvious thing one can learn given just |(10^n)r| is the first n digits of the decimal expansion of r.

For example, when we are given \floor{100\pi} = 314, we can say with confidence that \pi \approx 3.14\dots. If we wait long enough, we can learn as many decimal digits of \pi as we want; in particular, to learn the first n decimal digits of \pi, we can just wait for the 10^nth item of the list.

This seems like an awfully long time to wait, though. Not only that, but if we only look at the first, tenth, hundredth, thousandth,… elements of the sequence, we would be ignoring almost all the information in the sequence! Intuitively, it seems that we should be able to do better by taking more of the sequence into account. And indeed, we can.

The key is to realize that


\floor{n\pi} \leq n\pi < \floor{n\pi} + 1.

Do you see why? Taking the floor of something never makes it bigger, so \floor{n\pi} \leq n\pi (in fact, it can never be equal since \pi is irrational, but that’s not important for our purposes). Also, taking the floor of something reduces it by some amount less than one, so adding one to the floor gives us something bigger than the original; hence n\pi < \floor{n\pi} + 1. Dividing through by n, we find that


 \displaystyle \frac{\floor{n\pi}}{n} \leq \pi < \frac{\floor{n\pi} + 1}{n}.

That is, if the nth element of the sequence is k, then we know that \pi must be between k/n and (k + 1)/n. Of course, this works for any number r, not just for \pi.

Let’s see how this works. The first few numbers in the sequence \floor{\pi}, \floor{2\pi}, \floor{3\pi}, \dots are 3, 6, 9, \dots. After seeing the 3, we know that 3/1 \leq \pi < 4/1. After seeing the 6, we know that 6/2 \leq \pi < 7/2 -- so we have a better upper bound now (3.5) than we did before (4). After seeing the 9, we know 9/3 \leq \pi < 10/3, and so on. Notice that our lower bound hasn't changed yet. That won't happen until we get to \floor{8\pi} = 25, when we learn that 25/8 \leq \pi < 26/8. So our new lower bound is 3.125. But the upper bound at this step, 26/8 = 3.25, is actually worse than the upper bound that we would have found on the previous step, namely 22/7 = 3.142857... So in general, the upper and lower bounds that we find in each step might not be better than all the previous ones; we can just keep the greatest lower bound and the least upper bound that we've seen so far.

So, how good are these bounds? I wrote a little program to compute the best upper and lower bounds for various points in the sequence. Here's a table showing the best lower and upper bounds at various points in the sequence, and the decimal digits they give us confidence about.

Term Best bounds \pi so far
10 \frac{25}{8} < \pi < \frac{22}{7} 3.1
100 \frac{311}{99} < \pi < \frac{22}{7} 3.14
1000 \frac{2818}{897} < \pi < \frac{355}{113} 3.1415
10000 \frac{31218}{9937} < \pi < \frac{355}{113} 3.141592
100000 \frac{208341}{66317} < \pi < \frac{312689}{99532} 3.141592653
1000000 \frac{3126535}{995207} < \pi < \frac{1146408}{364913} 3.1415926535

As you can see, this method does much better than the method of just looking at powers of ten! At n = one million, we already know ten decimal digits of pi; by looking just at the one millionth element of the sequence, we would only know six digits. It turns out that in general, it is the case (which I will not prove) that on average, we can get approximately twice as many decimal digits by finding best upper and lower bounds this way instead of just looking at powers of ten.

In a future post I hope to make a graph of these upper and lower bounds and talk a little bit more about what’s going on—it turns out to be pretty interesting stuff!

The Mandelbrot Set

Monday, May 8th, 2006

For those of you already familiar with the Mandelbrot Set, I suppose this will be like visiting with an old friend. For those of you who aren’t — you’re in for a treat!

Okay, you say, that looks pretty cool I guess, but…huh? Well, to answer the fundamental question of “huh?” we need to dust off our Complex Number Skills. (If you don’t know what a complex number is — or need a refresher — read my explanation of complex numbers.)

Here’s what you do. Pick some complex number c and define the function

f(z) = z^2 + c

(Note that z often denotes a complex number.) Now start with z = 0 and iterate the function f, by taking each value output from the function and putting it back into the function. In other words, find f(0), f(f(0)), f(f(f(0))), f(f(f(f(0)))), and so on. For example, let’s pick c = 1 + i and follow this process for a few steps:

 $ \begin{eqnarray*} f(0) & = & 0^2 + (1 + i) = 1 + i \\ f(f(0)) & = & f(1 + i) = (1 + i)^2 + (1 + i) = 1 + 3i \\ f(f(f(0))) & = & f(1 + 3i) = (1 + 3i)^2 + (1 + i) = \dots \\ & \vdots & \end{eqnarray*} $

and so on. This is a very simple process, and can be worked out by hand fairly quickly. It can be worked out by a computer in the blink of an eye.

For some values of c, iterating the function f will tend to produce complex numbers that just get bigger and bigger. For other values of c, iterating the function will produce complex numbers that stay relatively small, no matter how long you keep iterating the function. (Of course there are technical definitions corresponding to the phrases “bigger and bigger” and “stay relatively small”, but for now we won’t worry about what they are.) Try starting with the value c = 0 + 0.1i to see an example of the latter.

Well, now we’re ready to define the Mandelbrot set: the Mandelbrot set is the set of all complex numbers c for which iterating the function f(z) = z^2 + c produces complex numbers that stay relatively small.

We can make a picture of the Mandelbrot set by letting each complex number a + bi correspond to the point with coordinates (a,b). For example, a computer can easily make a picture of the Mandelbrot set by looking at each point on the screen one by one, deciding which complex number c that point corresponds to, then (say) coloring the point black if c is in the Mandelbrot set, and white otherwise. (Often, instead of just white, programs will choose different colors for points which are not in the Mandelbrot set, based on how many iterations the program had to do before it could decide whether the point was in the set or not.)

You might think that with such a simple function, the picture would be simple as well — like a circle, or a parabola, or something like that. But in fact you get that crazy thing shown above. Iteration can make even the simplest functions behave in very complex ways!

In fact, the Mandelbrot set is what is known as a fractal, an object which is infinitely detailed and contains copies (or near-copies) of itself on all different scales. This means that (theoretically) you can keep zooming into the Mandelbrot forever, and you will always see details just as fine and complex as you do at the “top level”. Moreover, as you zoom in, you will find structures that appear to be tiny copies of the entire picture.

But don’t take my word for it — here are some nice zoomed-in pictures of the Mandelbrot set, and you can find lots more with Google image search. You might also want to download some software for viewing fractals to be able to play around with it yourself.

The most amazing thing is that no one made up these pictures — they have existed forever, built into the mathematical structure of the universe, just waiting for someone to come along and iterate a certain function and make a picture out of it. And in fact, it’s only been since the invention of computers that we’ve been able to do such things (although it’s easy to carry out the iteration described above for a particular value of c by hand, to do it for enough different values of c to make a decent picture would take so long, and be so mind-numbingly tedious, as to make it practically impossible.)

More about the Mandelbrot set, if you’re interested:

Convergence

Monday, March 20th, 2006

Let’s dig a little deeper behind the solutions to Challenges #1 and #2. What on earth does it mean for an infinite expression to have a “value”? Well, as noted in the solution to Challenge #1, what we’re really talking about is the value of the expression at various “stopping points” along the way: what happens to these values as the stopping points get further and further along?

Let’s look again at the infinite continued fraction from Challenge #1. To make things easier to read (and so they won’t take up as much space), we’ll use a common notation for continued fractions:

[a,b,c,d,\dots] = a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \ddots}}}

So, for example, the expression from Challenge #1 can be written as [1,1,1,1,\dots], and the second expression from Challenge #2 can be written as [1,3,1,3,\dots]. As another example, [2,1,3] (note there are no ellipses) means 2 + \frac{1}{1 + \frac{1}{3}} which is equal to 11/4.

Now, let’s analyze various “stopping points” along the way to the infinite expression [1,1,1,1,\dots], starting with 1, then 1 + \frac{1}{1}, then 1 + \frac{1}{1 + \frac{1}{1}}, and so on:

 $ \begin{eqnarray*} [1] & = & 1 \\ \left [ 1,1 \right ] & = & 2 \\ \left [ 1,1,1 \right ] & = & 3/2 = 1.5 \\ \left [ 1,1,1,1 \right ] & = & 5/3 \approx 1.66667 \\ \left [ 1,1,1,1,1 \right ] & = & 8/5 = 1.6 \\ \left [ 1,1,1,1,1,1 \right ] & = & 13/8 = 1.625 \\ \left [ 1,1,1,1,1,1,1 \right ] & = & 21/13 \approx 1.61538 \\ \vdots & = & \vdots \end{eqnarray*} $

(Try confirming these values for yourself.) As you can see, it seems like these numbers are getting closer and closer to something… here’s a graphical view of what’s going on:

The red squares indicate the stopping-point values that we calculated: the leftmost square (in the bottom left) is the value at the first stopping point, and the values progress to the right. It’s easy to see that they seem to be quickly zooming in on a particular value (indicated by the horizontal blue line), somewhere around 1.62.

The math word for this is convergence — the red squares (the values of successive stopping-points along the way to [1,1,1,1,\dots]) converge to a particular value. Technically, this means that we can get as close as we want to that particular value, as long as we are willing to pick a stopping point that is far enough along. The stopping point values will keep getting closer and closer (converging) to this particular value forever, even though they will never actually reach it exactly.

Note also that this is why the “value” of 1 + 1 + 1 + 1 + \dots is “infinity”. If we look at the value at successive stopping points (i.e. 1, 1+1, 1+1+1, etc.), they are not getting closer and closer to anything; they are simply getting bigger and bigger:

Stopping point values for 1+1+1+1+...

Saying that the value of something is “infinity” is really just a shorthand way of saying that it does not converge to anything (it diverges).

So when we ask for the “value” of the infinite expression [1,1,1,1,\dots], we are really asking: what do the stopping point values converge to?

Well, what do they converge to? You already know the answer to this: they converge to \frac{1 + \sqrt{5}}{2} \approx 1.61803, otherwise known as \phi!

There is much, much more going on here as well… but that will have to wait for another post. As always, feel free to comment with questions, ideas, comments, or anything at all (except spam).