Archive for the ‘induction’ Category

The hyperbinary sequence and the Calkin-Wilf tree

Sunday, October 18th, 2009

And now, the amazing conclusion to this series of posts on Neil Calkin and Herbert Wilf’s paper, Recounting the Rationals, and the answers to all the questions about the hyperbinary sequence. Hold on to your hats!

The Calkin-Wilf Tree

First, recall the Calkin-Wilf tree, defined as follows: the “root” of the tree is the fraction 1/1, and every node labeled i/j in the tree has two “children”, with the left child labeled i/(i+j), and the right labeled (i+j)/j. Here’s a picture of the first four levels of the tree (of course, the tree is infinite):



Now wait just one minute… those numbers look familiar! Let’s list those fractions in order, one level at a time, from left to right:


\displaystyle \frac 1 1, \frac 1 2, \frac 2 1, \frac 1 3, \frac 3 2, \frac 2 3, \frac 3 1, \frac 1 4, \frac 4 3, \frac 3 5, \frac 5 2, \frac 2 5, \frac 5 3, \frac 3 4, \frac 4 1 \dots

That’s right, it’s the hyperbinary sequence! In particular, the numerators are the hyperbinary sequence starting at index 0, and the denominators follow the same sequence, offset by one.

…but why?! The Calkin-Wilf tree is defined by a recursive rule for manipulating fractions, and the hyperbinary sequence is defined in terms of ways to write numbers as sums of powers of two… it certainly isn’t a priori obvious why these should have anything to do with each other!

Labelling the CW-tree

Let’s label the nodes in the Calkin-Wilf tree with positive integers, starting with 1 at the root and proceeding in order by levels, from left to right. This means that the fraction at node n is h(n-1)/h(n)—at least, that’s what we’re trying to prove! Here’s an illustration of our numbering scheme:



Now, what do you notice about the relationship between the label on a node and the labels on its children? It’s not too hard to guess the pattern: the left child of the node labelled k is labelled 2k, and the right child is labelled 2k+1. How can we prove this?

Well, it’s certainly true for the root note: it’s labelled 1, and its children are labelled 2 and 3. Now we note that if the pattern holds for the node immediately preceding node k—that is, the children of node k-1 are 2k-2 and 2k-1—then the children of node k come immediately after the children of k-1, so they are numbered 2k and 2k+1. If you think about it, this is true regardless of whether k-1 and k are on the same level, or k-1 is the end of one level and k is the beginning of the next.

We can think of labelling each edge of the Calkin-Wilf tree with either a 0 or a 1: 0 for left edges, and 1 for right edges. Then taking all the zeros and ones along the path from the root to any node and sticking an extra 1 at the beginning gives us the label of that node in binary! For example, the path that goes left, left, right corresponds to 0, 0, 1, and adding an extra 1 to the front gives us 1001–which is indeed the binary representation of 9!



So, is it really true that the fraction at node n is always h(n-1)/h(n)? It is indeed, and we now have the tools we need to see why. First of all, the root node of the tree—that is, node 1—is (by definition) 1/1, which is indeed h(0)/h(1). Next, suppose that the fraction at node k is h(k-1)/h(k); we need to show that the fraction at the left child (which is node 2k) is h(2k-1)/h(2k), and the fraction at the right child is h(2k)/h(2k+1). But this is now easy: by the way the Calkin-Wilf tree is defined, the left child is

\displaystyle\frac{h(k-1)}{h(k-1) + h(k)} = \frac{h(2k-1)}{h(2k)}.

The equality, of course, follows directly from the recurrence relation for the hyperbinary sequence! Likewise, the right child is

\displaystyle \frac{h(k-1) + h(k)}{h(k)} = \frac{h(2k)}{h(2k+1)}.

Solving the \phi(n) mystery

As pointed out by JM, this connection with the Calkin-Wilf tree is exactly what we need to prove Fergal Daly’s conjecture about the number of primary occurrences of any given number.

We defined a primary occurrence simply as a number at an even position in the hyperbinary sequence. Of course, even-labelled nodes in the Calkin-Wilf tree are exactly the ones which are left children, and so the primary occurrences are the denominators of left children (equivalently, the numerators of right children). Also, since left children are always of the form i/(i+j), left children always have denominators bigger than their numerators. Finally, as you may recall, we showed earlier that every positive rational number occurs somewhere in the Calkin-Wilf tree, in lowest terms, exactly once.

Putting this all together: each primary occurrence of n corresponds to a fraction m/n in the Calkin-Wilf tree, where m < n. And since every fraction occurs exactly once in lowest terms, there is exactly one such fraction m/n for each positive integer m < n which is relatively prime to n. Of course, by definition, there are \phi(n) such numbers, so this is the proof of Fergal’s conjecture! There are exactly \phi(n) primary occurrences of n, since that’s how many reduced fractions there are between 0 and 1 which have a denominator of n.

Computing primary occurrences

The connection with the Calkin-Wilf tree also allows us to compute h^{-1}—that is, given any n, we can efficiently (in O(n \log n) time) compute all the primary occurrences k such that h(k) = n!

How does this work? Well, if you remember, the Euclidan Algorithm gives us a way to find our way up the Calkin-Wilf tree from any starting fraction. Combining this with the relationship we discovered earlier between the label on a node and the labels on its children allows us to compute the label on the node where we started.

As an example, let’s compute one of the primary occurrences of 5. In particular, let’s do the one that corresponds to the fraction 3/5. Since 3 is smaller than 5, 3/5 must be the left child of 3/(5-3) = 3/2. 3 is bigger than 2, so 3/2 is the right child of (3-2)/2 = 1/2. Finally, 1/2 is the left child of 1/1. Now we can work backwards: 1/1 is node 1; 1/2 is to the left, so it’s node 2*1 = 2; 3/2 is to the right of that, so it’s node 2*2 + 1 = 5; and 3/5 is to the left again, so it’s node 2*5 = 10. Therefore h(10) = 5 is a primary occurrence of 5! We can also think of this more succinctly by recalling that paths in the Calkin-Wilf tree correspond to binary representations of the labels. So we can run the Euclidean Algorithm, getting a zero or one at each step, and then reverse them and put an extra 1 at the beginning to get the binary representation of the primary occurrence. To be more precise:

  1. Start with the pair (m,n) where m < n is relatively prime to n. Call the current pair of numbers (a,b).
  2. Given (a,b), if a < b then write (a,b) \stackrel{0}{\to} (a,b-a). Otherwise write (a,b) \stackrel{1}{\to} (a-b,b).
  3. Repeat step 2 until reaching (1,1).
  4. Read off the primary occurrence of n by prefixing 1 to the reverse of the binary digits written over the arrows.

All of this leads to the following (somewhat mind-blowing) observation: the position of each primary occurrence of n in the hyperbinary sequence is a proof (encoded in binary) that n is relatively prime to the number that comes immediately before it.

To tie this all together, here’s a little Haskell program I wrote to compute primary occurrences:

import System.Environment (getArgs)
import Control.Monad (forM_)
import Data.Maybe (catMaybes)
import Data.List (sort)

-- Find a fraction in the Calkin-Wilf tree.  findCW a b computes the
-- location of a/b in the Calkin-Wilf tree, or returns Nothing if a
-- and b are not relatively prime.
findCW :: Integer -> Integer -> Maybe Integer
findCW 1 1 = Just 1
findCW a b | a < b  = left  `fmap` findCW a (b-a)
           | a > b  = right `fmap` findCW (a-b) b
           | a == b = Nothing

left  x = 2*x
right x = 2*x + 1

-- Find the primary occurrences of n.
primaryOccs :: Integer -> [Integer]
primaryOccs n = sort . catMaybes $ [ findCW m n | m <- [1..n] ]

main :: IO ()
main = do (from:to:_) <- getArgs
          forM_ [read from..read to] $ \i -> do
            putStr $ show i ++ ": "
            putStrLn . unwords . map show . primaryOccs $ i

If there’s interest, I could write another post explaining how this program works in a bit more detail, but hopefully you can mostly understand what it’s doing from the preceding discussion. Let’s check that it works:

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 2 16
2: 2
3: 4 6
4: 8 14
5: 10 12 16 30
6: 32 62
7: 18 22 24 28 64 126
8: 20 26 128 254
9: 34 46 48 60 256 510
10: 38 56 512 1022
11: 36 40 54 58 66 94 96 124 1024 2046
12: 44 50 2048 4094
13: 42 52 70 78 112 120 130 190 192 252 4096 8190
14: 68 80 110 122 8192 16382
15: 72 118 258 382 384 508 16384 32766
16: 92 98 134 158 224 248 32768 65534

Yup, looks good! And to show that it really is efficient, let’s use it to compute the primary occurrences of 1000 (yes, all 400 of them!):

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 1000 1000
1000: 39770 42278 56024 58532 68396 70568 71444 75674 76688 85046 86480 92576 104030
... 386 more primary occurrences...
1071508607186267320948425049060001810561404811705533607443750388370351051124936
1224931983788156958581275946729175531468251871452856923140435984577574698574803
9345677748242309854210746050623711418779541821530464749835819412673987675591655
43946077062914571196477686542167660429831652624386837205668069374

On my computer this runs in about 0.03 seconds. Sweet! Note that the position of the final primary occurrence of 1000 has 303 digits–that’s approximately one centillion. I certainly wouldn’t want to go through the first one centillion numbers in the hyperbinary sequence looking for 1000 by hand, would you?

Further Reading

I hope you’ve enjoyed this little (1.5-year-long!) excursion into the fascinating world opened up by Calkin and Wilf’s paper. (I’ve sure enjoyed writing it!) If you’re interested in exploring more, I suggest looking at Roland Backhouse and João Ferreira’s paper, Recounting the Rationals, Twice! which also explains the connection to another famous tree of fractions, the Stern-Brocot tree. If you’re interested in the computational aspect of the hyperbinary numbers, have a look at Enumerating the Rationals, by Jeremy Gibbons, David Lester, and Richard Bird.

More hyperbinary fun

Monday, September 28th, 2009

When I originally posed Challenge #12, a certain Dave posted a series of comments with some explorations and partial solutions to part II (the hyperbinary sequence). Although I gave the “solution” in my last post, no solution to any problem is ever the last word—there’s always more to explore, more connections to see! It’s worth taking another post to explore some of Dave’s discoveries and conjectures, and how they connect to my last post. I’ve used different notations, made up some terms, tried to simplify things where possible, and so on, but all the ideas in this post are fundamentally Dave’s.

First things first

Dave first noticed that

h(2^k - 1) = 1

for all k \geq 0. I noted this in my previous post as well, but said only that this is “apparent” from thinking about the recursive construction I described. However, it’s worth going through a proof of this more formally; it’s a good example of a proof by induction. First, let’s recall the recurrence relation that defines the hyperbinary sequence:

$ \begin{align*}h(0) &= 1 \\ h(2n+1) &= h(n) && \text{(O)}\\ h(2n+2) &= h(n+1) + h(n) && \text{(E)}\end{align*} $

Now we can show that h(2^k - 1) = 1 for k \geq 0. First, the base case: when k=0, h(2^0 - 1) = h(0) = 1; check!

Now, let’s suppose that h(2^k - 1) = 1 for some particular k \geq 0; this is called the inductive hypothesis, which I will abbreviate as IH. we want to show that h(2^{k+1} - 1) = 1 as well. 2^{k+1} - 1 is obviously odd, so we want to apply rule (O). The only trick is to get it into the right form first:

$ \begin{align*}h(2^{k+1} - 1) &= h(2^{k+1} - 2 + 1) && \text{\{ arithmetic \}}\\ &= h(2(2^k - 1) + 1) && \text{\{ factor \}}\\ &= h(2^k - 1) && \text{\{ rule (O) \}} \\ &= 1 && \text{\{ IH \}}\end{align*} $

Presto! Since the thing we wanted to show is true when k=0 and is true for k+1 whenever it is true for k, by induction it must be true for all k \geq 0.

Second things, third things…

But Dave then went on to notice that this sort of pattern isn’t unique to 1. For example,

h(3 \cdot 2^k - 1) = 2

for k \geq 0, and

h(5 \cdot 2^k - 1) = h(7 \cdot 2^k - 1) = 3

also for k \geq 0. Dave conjectured that this generalizes, and indeed it does: most generally, we can say that

h(p \cdot 2^k - 1) = h(p - 1)

for k \geq 0. That is, once a certain number occurs at position p - 1, it will keep occurring in a regular pattern after that, at every position of the form p \cdot 2^k - 1. Note that h(2^k - 1) = 1 is a special case with p = 1.

Why is this? Intuitively, it’s because of the “copying”—the blue numbers in my previous post. Once a number occurs, it will keep getting copied, but with the distance between the copies doubling each time. But again, this is worth proving formally. I’ll let you do this one—the proof by induction is similar to the proof above!

But why do occurrences of 1 correspond to p = 1, and occurrences of 2 to p = 3, and occurrences of 3 to p = 5 and p = 7, and so on? Are there any patterns to be found here? This is what Dave explored next; let’s follow.

Primary occurrences and the inverse of h

Let’s think a little more about this equation h(p \cdot 2^k - 1) = h(p - 1). Let’s suppose that p is odd (if it isn’t, we can always take another factor of two out of it to combine with the 2^k). Notice that p \cdot 2^k - 1 is always odd, except when k = 0 (in which case p \cdot 2^k - 1 = p - 1, which is even). Moreover, we can always write any odd number in the form p \cdot 2^k - 1, where p is odd: just add one, and factor out as many copies of 2 as possible. For example, 11 = 12 - 1 = 3 \cdot 2^2 - 1.

This means that if q is any odd number, we can write it in the form q = p \cdot 2^k - 1, and so h(q) = h(p-1): that is, every odd position in the hyperbinary sequence is just copied from an earlier even position. Of course, this can also be seen from the recursive construction in my previous post; but now we have really proved it! And what about the even positions? The values of the hyperbinary sequence at even positions are obtained by adding, rather than copying. If n occurs at an even position, we will call it a primary occurrence of n.

The nice thing about primary occurrences is that (a) there are only finitely many primary occurrences of any particular n (challenge: prove this!), and (b) once we know the positions of the primary occurrences of n, we know the positions of all the occurrences of n: they are at positions of the form p \cdot 2^k - 1, where p - 1 is a primary occurrence.

So, given n, can we find all its primary occurrences? Essentially we are asking if we can compute h^{-1} (by which I mean the inverse relation of h; since h is not injective its inverse is obviously not a function), but leaving out non-primary occurrences since we can easily find those from primary ones. Let’s start by making a table with some small values of n:

n Primary occurrences
1 0
2 2
3 4, 6
4 8, 14
5 10, 12, 16, 30
6 32, 62
7 18, 22, 24, 28, 64, 126
8 20, 26, 128, 254
9 34, 46, 48, 60, 256, 510
10 38, 56, 512, 1022
11 36, 40, 54, 58, 66, 94, 96, 124, 1024, 2046
12 44, 50, 2048, 4094
13 42, 52, 70, 78, 112, 120, 130, 190, 192, 252, 4096, 8190
14 68, 80, 110, 122, 8192, 16382
15 72, 118, 258, 382, 384, 508, 16384, 32766
16 92, 98, 134, 158, 224, 248, 32768, 65534

Do you see any patterns? If so, can you prove them by induction? I’ll talk about a few in my next post. But I still don’t know whether there’s an effective way to compute h^{-1}(n)!

Challenge #12 solution, part II

Saturday, September 19th, 2009

Yes, that’s right, that Challenge #12, posted one year, five months, and a day ago. You see, I have this nasty habit of starting things and not finishing them… well, better late than never!

Question two of the aforementioned challenge asked this:

Given a positive integer n, in how many ways can n be written as a sum of powers of two, when each power is allowed to occur at most twice? Say h(n) denotes the number of different ways to write n as a sum of powers of two with at most two copies of each power allowed. What can you say about the function h?

These representations using at most two copies of each power of two are sometimes called hyperbinary representations. For example, 11 can be written as 2^3 + 2^1 + 2^0 or as 2^2 + 2^2 + 2^1 + 2^0. It turns out these are the only hyperbinary representations of 11, so h(11) = 2.

How to analyze this? We can use a similar approach to that outlined in the solution to part III (which asked a similar question, but allowing any number of copies of each power of two). First let’s think about odd numbers, that is, numbers of the form 2n + 1. Any representation of an odd number must include an odd number of copies of 2^0 (otherwise the sum won’t be odd!), but since at most two copies are allowed, there must be exactly one copy of 2^0. If we take away the 2^0 and divide everything else by two, we get a valid representation of n. The reverse works as well: if we start with any valid representation of n, multiply everything by two, and add a copy of 2^0, we get a valid representation of 2n + 1. This means that

h(2n+1) = h(n).

What about even numbers? A valid representation of 2n + 2 has either two copies of 2^0, or none. If it has none, we can simply divide by two to get a valid representation of n + 1 (and vice versa); if two, we can take away the copies of 2^0 and then divide by two to get a representation of n, and vice versa. Therefore,

h(2n+2) = h(n+1) + h(n).

These two equations, h(2n+1) = h(n) and h(2n+2) = h(n+1) + h(n), along with the fact that h(0) = 1, are enough to completely characterize h(n). We can easily compute h(n) for the first few values of n:

1,1,2,1,3,2,3,1,4,3,5,2,5,3,4,1,5,4,7,3,8,5,7,2,7,5,8,3,7,4,5,1…

It’s immediately striking, of course, that h(n) isn’t strictly increasing: that is, just because m > n doesn’t mean h(m) > h(n). In fact, even as the values of h(n) continue to get bigger on average, there are still some values of 1 hiding in there. In fact, h(0) = h(1) = h(3) = h(7) = h(15) = 1, and the obvious conjecture is that h(2^k - 1) = 1 for every k \geq 0.

The next thing we notice, after staring a bit, is that the h(n) sequence contains a lot of palindromes! Every section sandwiched in between two 1’s seems to read the same forwards and backwards. Let’s make some graphs so we can see these patterns more easily: below I’ve made graphs showing h(n) as n goes from 1 to 100, 1000, and 10000, respectively.

The hyperbinary sequence through n=100

The hyperbinary sequence through n=1000

The hyperbinary sequence through n=10000

Pretty! We can clearly see the symmetric structure here, and the way that it occasionally (every 2^k - 1) jumps back down to 1.

Here’s another way to think about building the sequence. We start with the first two elements:

1,\rrr{1}

Now we copy the red part onto the end of the sequence (the copy will be blue), leaving a blank space in front of each blue number:

1,\rrr{1},\blnk,\bbb{1}

Now we fill in each blank (in green) by adding the two numbers on either side of it:

1,\rrr{1},\gggr{2},\bbb{1}

Now the blue and green part becomes the new red part:

1,1,\rrr{2},\rrr{1}

Now we repeat! Copy the red part to the end, leaving a blank in front of each copied number:

1,1,\rrr{2},\rrr{1},\blnk,\bbb{2},\blnk,\bbb{1}

Fill in each blank by adding the two numbers on either side:

1,1,\rrr{2},\rrr{1},\gggr{3},\bbb{2},\gggr{3},\bbb{1}

And again: blue and green become the new reds, copy the red numbers to the end with a blank in front of each:

1,1,2,1,\rrr{3},\rrr{2},\rrr{3},\rrr{1},\blnk,\bbb{3},\blnk,\bbb{2},\blnk,\bbb{3},\blnk,\bbb{1}

And fill in the blanks by adding:

1,1,2,1,\rrr{3},\rrr{2},\rrr{3},\rrr{1},\gggr{4},\bbb{3},\gggr{5},\bbb{2},\gggr{5},\bbb{3},\gggr{4},\bbb{1}

And so on. It takes a little bit of thought, but it can be seen that the method I’ve described here is really just a different way of expressing the equations h(2n+1) = h(n) and h(2n+2) = h(n+1) + h(n)! But thinking about it this way makes it apparent why a 1 always shows up at h(2^k - 1), and also why we end up with palindromes.

I’ll stop there for now; in an upcoming post I’ll bring things full circle by showing how these numbers relate to the Calkin-Wilf tree of rationals.

Recounting the Rationals, part IV

Wednesday, February 6th, 2008

Continuing a series about the Calkin-Wilf tree (see those links for some background), today I’d like to show why all the rationals in the tree must be in lowest terms. Let’s start off with a little number theory!

What do we mean when we say that a fraction is “in lowest terms”? We simply mean that the numerator and denominator have no common factors other than 1. So 3/7 is in lowest terms, but 14/21 isn’t, since the numerator and denominator are both divisible by 7. You will sometimes also hear the term relatively prime to describe two numbers which have no common factors. It doesn’t really have anything to do with being prime; for example, 6 and 49 are relatively prime (and hence 6/49 is in lowest terms), even though neither 6 nor 49 is prime.

A more formal way to talk about all of this is in terms of the greatest common divisor, or GCD. The GCD of two numbers m and n, often written \gcd(m,n), is the largest number which evenly divides both m and n. For example, \gcd(12,30) = 6, since 6 is the largest number which evenly divides both 12 and 30. So, in particular, we can say that m and n are relatively prime when \gcd(m,n) = 1.

You should be able to verify a few simple properties of the GCD function yourself. For example, \gcd(m,n) = \gcd(n,m), and \gcd(km, kn) = k \cdot \gcd(m,n). Another slightly less obvious property, but more important for our purposes, is that \gcd(m,n) = \gcd(m, m \pm n). Why is this? Well, suppose \gcd(m,n) = d. That means d divides both m and n, so we can write m = dr and n = ds. Therefore m+n = dr + ds = d(r+s), so d divides m+n too! So, d is a common divisor of m and m+n. Note also that if anything bigger than d was a common divisor of m and m+n, then this larger divisor would also divide (m+n) - m = n, so it would be a common divisor of m and n — but we already know that d is the greatest common divisor of m and n. Therefore, d really is the greatest common divisor of m and m+n, and \gcd(m,n) = \gcd(m, m+n). The argument to show that \gcd(m,n) = \gcd(m,m-n) is exactly the same.

Applying this to the Calkin-Wilf tree is a piece of cake. We want to show that for every i/j in the Calkin-Wilf tree, it is the case that \gcd(i,j) = 1. Well, it’s certainly true for the root node, 1/1: \gcd(1,1) = 1. And supposing it is true for some node i/j, it must also be true for the child nodes, since we know that \gcd(i,j) = \gcd(i,i+j) = \gcd(i+j,j). And we’re done! This is a nice example of a proof by induction: all we have to do is prove that something is true for some starting value, and that whenever it is true for one thing it must be true for the next, and voilà! We have proved it for an infinite series of cases.

So, we now know that the Calkin-Wilf tree contains every positive rational, in lowest terms, exactly once! Next time, I’ll talk about a way to compute the GCD of two numbers, called the Euclidean Algorithm—which is probably a lot different than the way you learned to compute GCDs in elementary school—and its cool connection to the Calkin-Wilf tree.

Golden powers

Monday, September 10th, 2007


So, we know from a previous challenge that \gr^2 = \gr + 1. That’s a pretty interesting property, which is shared only by its cousin, \grh. I wonder whether other powers of \gr have special properties too? Let’s see:

\gr^3 = \gr \cdot \gr^2 = \gr (\gr + 1) = \gr^2 + \gr = 2\gr + 1.

Interesting! What about \gr^4?

\gr^4 = \gr \cdot \gr^3 = \gr (2\gr + 1) = 2 \gr^2 + \gr = 3\gr + 2.

And \gr^5?

\gr^5 = \gr \cdot \gr^4 = \gr (3\gr + 2) = 3 \gr^2 + 2\gr = 5\gr + 3.

Curiouser and curiouser! Are you noticing a pattern? (Try working out \gr^6 to see if it fits the pattern you suspect.)

Hopefully, you noticed that Fibonacci numbers are involved here. If this pattern holds, we would expect to find that \gr^6 = 8\gr + 5, \gr^7 = 13\gr + 8, and so on. To state it more generally, we suspect that

\gr^n = F_n \gr + F_{n-1}, when n \geq 2.

Is this true? It turns out, amazingly, that it is, and we can prove it using a technique called mathematical induction. The idea of mathematical induction is sort of like knocking over a chain of dominoes. First, we’ll show that the above statement is true for some particular value of n (the base case; for this statement, we’ll use n = 2). Then, we’ll show that if the statement is true for any given value of n, then it must be true for n + 1 as well (the inductive step). Do you see how it’s like knocking over dominoes? If the statement is true when n is 2, then it must be true for 3 as well, which means it must be true for 4, which means… and so on, for all positive integers!

Ready? First, recall that the Fibonacci numbers F_n are defined as follows:

 $ \begin{align*} F_1 &= F_2 = 1 \\ F_n &= F_{n-1} + F_{n-2} & (n > 2). \end{align*} $

Now for the base case: when n is 2, the statement says that \gr^2 = F_2 \gr + F_1; this is true since F_2 and F_1 are both equal to 1.

Finally, the inductive step. We’ll begin by supposing that \gr^n = F_n \gr + F_{n-1} for some n \geq 2, and see if this allows us to conclude that the statement is true for n + 1 as well. The method we used above to evaluate \gr^3, \gr^4 and so on suggests a method of attack:

 $ \begin{align*} \gr^{n+1} &= \gr \cdot \gr^n \\ &= \gr \cdot (F_n \gr + F_{n-1}) \\ &= F_n \gr^2 + F_{n-1} \gr \\ &= F_n (\gr + 1) + F_{n-1} \gr \\ &= (F_n + F_{n-1}) \gr + F_n \\ &= F_{n+1} \gr + F_n \end{align*} $

So, since we’ve proved that the statement is true when n is 2, and we’ve also proved that it must be true for n + 1 whenever it’s true for n, it must be true for all n \geq 2!

Next up: we’ll use this neat relationship between \gr and the Fibonacci numbers to discover another one!