Archive for the ‘solutions’ Category

The broken weight problem: solutions and further exploration

Tuesday, May 11th, 2010

First of all, let me say to all my readers how fantastic it felt to post a puzzle, after not posting anything for two months, and get eighteen thoughtful, insightful comments in just three days; it’s every blogger’s dream. You all are fantastic and make this a lot of fun — thanks for reading!

I thought I’d take a post just to summarize some of the responses and solutions to the broken weight problem. As many commenters realized, the solution is that the weights are of sizes 1, 3, 9, and 27. Hmm, powers of three… coincidence? Of course not!

As several commenters noted, something involving base three readily presents itself if we realize that there are three possibilities for each weight: it can be on the left of the balance scale, on the right, or not on the scale at all. Since one side of the scale corresponds to adding the weight and the other side to subtracting, we are essentially writing numbers in base three, but using the digits -1, 0, and 1 instead of the usual 0, 1, 2. For example, 25 can be written as 10(-1)1, that is, 1 \cdot 27 + 0 \cdot 9 - 1 \cdot 3 + 1 \cdot 1 (if we were really going to use this system we’d want to come up with a better symbol for -1). In fact, this is known as balanced ternary, and it is a fact (as proved by a few commenters) that n digits of balanced ternary allow us to uniquely represent every integer between \pm \frac{3^n - 1}{2}. With four digits (as in the problem) we can uniquely represent every integer between -40 and 40. There was a bit of confusion in the comments about being able to represent some integers in more than one way, but I think if you try it out you will find that this is not the case.

From this problem (generalizing it to arbitrary numbers of weights), we can see that


\displaystyle 1 + 3 + \dots + 3^{n-1} = \frac{3^n - 1}{2}.

JM noted that this generalizes to


\displaystyle 1 + b + b^2 + \dots + b^{n-1} = \frac{b^n - 1}{b - 1}

and wondered whether this has anything to do with solving the problem, or with problems like it. Indeed it does; here’s one for you: suppose you are tasked with designing a set of weights. The weights should make it possible to weigh as many different integer weights as possible, without leaving any out, just like the weights 1, 3, 9, 27 make it possible to weigh every integer weight up to 40 without leaving any out. The one difference is that you want to have two identical copies of each weight. What is the best you can do?

I’ve left the problem slightly vague on purpose, but I hope you will have fun solving it and figuring out what it has to do with JM’s observation! Can you come up with other interesting generalizations of the problem?

Finally, Sam Shah posted a link to a description of his experience using the problem in a real-life problem-solving session.

The hyperbinary sequence and the Calkin-Wilf tree

Sunday, October 18th, 2009

And now, the amazing conclusion to this series of posts on Neil Calkin and Herbert Wilf’s paper, Recounting the Rationals, and the answers to all the questions about the hyperbinary sequence. Hold on to your hats!

The Calkin-Wilf Tree

First, recall the Calkin-Wilf tree, defined as follows: the “root” of the tree is the fraction 1/1, and every node labeled i/j in the tree has two “children”, with the left child labeled i/(i+j), and the right labeled (i+j)/j. Here’s a picture of the first four levels of the tree (of course, the tree is infinite):



Now wait just one minute… those numbers look familiar! Let’s list those fractions in order, one level at a time, from left to right:


\displaystyle \frac 1 1, \frac 1 2, \frac 2 1, \frac 1 3, \frac 3 2, \frac 2 3, \frac 3 1, \frac 1 4, \frac 4 3, \frac 3 5, \frac 5 2, \frac 2 5, \frac 5 3, \frac 3 4, \frac 4 1 \dots

That’s right, it’s the hyperbinary sequence! In particular, the numerators are the hyperbinary sequence starting at index 0, and the denominators follow the same sequence, offset by one.

…but why?! The Calkin-Wilf tree is defined by a recursive rule for manipulating fractions, and the hyperbinary sequence is defined in terms of ways to write numbers as sums of powers of two… it certainly isn’t a priori obvious why these should have anything to do with each other!

Labelling the CW-tree

Let’s label the nodes in the Calkin-Wilf tree with positive integers, starting with 1 at the root and proceeding in order by levels, from left to right. This means that the fraction at node n is h(n-1)/h(n)—at least, that’s what we’re trying to prove! Here’s an illustration of our numbering scheme:



Now, what do you notice about the relationship between the label on a node and the labels on its children? It’s not too hard to guess the pattern: the left child of the node labelled k is labelled 2k, and the right child is labelled 2k+1. How can we prove this?

Well, it’s certainly true for the root note: it’s labelled 1, and its children are labelled 2 and 3. Now we note that if the pattern holds for the node immediately preceding node k—that is, the children of node k-1 are 2k-2 and 2k-1—then the children of node k come immediately after the children of k-1, so they are numbered 2k and 2k+1. If you think about it, this is true regardless of whether k-1 and k are on the same level, or k-1 is the end of one level and k is the beginning of the next.

We can think of labelling each edge of the Calkin-Wilf tree with either a 0 or a 1: 0 for left edges, and 1 for right edges. Then taking all the zeros and ones along the path from the root to any node and sticking an extra 1 at the beginning gives us the label of that node in binary! For example, the path that goes left, left, right corresponds to 0, 0, 1, and adding an extra 1 to the front gives us 1001–which is indeed the binary representation of 9!



So, is it really true that the fraction at node n is always h(n-1)/h(n)? It is indeed, and we now have the tools we need to see why. First of all, the root node of the tree—that is, node 1—is (by definition) 1/1, which is indeed h(0)/h(1). Next, suppose that the fraction at node k is h(k-1)/h(k); we need to show that the fraction at the left child (which is node 2k) is h(2k-1)/h(2k), and the fraction at the right child is h(2k)/h(2k+1). But this is now easy: by the way the Calkin-Wilf tree is defined, the left child is

\displaystyle\frac{h(k-1)}{h(k-1) + h(k)} = \frac{h(2k-1)}{h(2k)}.

The equality, of course, follows directly from the recurrence relation for the hyperbinary sequence! Likewise, the right child is

\displaystyle \frac{h(k-1) + h(k)}{h(k)} = \frac{h(2k)}{h(2k+1)}.

Solving the \phi(n) mystery

As pointed out by JM, this connection with the Calkin-Wilf tree is exactly what we need to prove Fergal Daly’s conjecture about the number of primary occurrences of any given number.

We defined a primary occurrence simply as a number at an even position in the hyperbinary sequence. Of course, even-labelled nodes in the Calkin-Wilf tree are exactly the ones which are left children, and so the primary occurrences are the denominators of left children (equivalently, the numerators of right children). Also, since left children are always of the form i/(i+j), left children always have denominators bigger than their numerators. Finally, as you may recall, we showed earlier that every positive rational number occurs somewhere in the Calkin-Wilf tree, in lowest terms, exactly once.

Putting this all together: each primary occurrence of n corresponds to a fraction m/n in the Calkin-Wilf tree, where m < n. And since every fraction occurs exactly once in lowest terms, there is exactly one such fraction m/n for each positive integer m < n which is relatively prime to n. Of course, by definition, there are \phi(n) such numbers, so this is the proof of Fergal’s conjecture! There are exactly \phi(n) primary occurrences of n, since that’s how many reduced fractions there are between 0 and 1 which have a denominator of n.

Computing primary occurrences

The connection with the Calkin-Wilf tree also allows us to compute h^{-1}—that is, given any n, we can efficiently (in O(n \log n) time) compute all the primary occurrences k such that h(k) = n!

How does this work? Well, if you remember, the Euclidan Algorithm gives us a way to find our way up the Calkin-Wilf tree from any starting fraction. Combining this with the relationship we discovered earlier between the label on a node and the labels on its children allows us to compute the label on the node where we started.

As an example, let’s compute one of the primary occurrences of 5. In particular, let’s do the one that corresponds to the fraction 3/5. Since 3 is smaller than 5, 3/5 must be the left child of 3/(5-3) = 3/2. 3 is bigger than 2, so 3/2 is the right child of (3-2)/2 = 1/2. Finally, 1/2 is the left child of 1/1. Now we can work backwards: 1/1 is node 1; 1/2 is to the left, so it’s node 2*1 = 2; 3/2 is to the right of that, so it’s node 2*2 + 1 = 5; and 3/5 is to the left again, so it’s node 2*5 = 10. Therefore h(10) = 5 is a primary occurrence of 5! We can also think of this more succinctly by recalling that paths in the Calkin-Wilf tree correspond to binary representations of the labels. So we can run the Euclidean Algorithm, getting a zero or one at each step, and then reverse them and put an extra 1 at the beginning to get the binary representation of the primary occurrence. To be more precise:

  1. Start with the pair (m,n) where m < n is relatively prime to n. Call the current pair of numbers (a,b).
  2. Given (a,b), if a < b then write (a,b) \stackrel{0}{\to} (a,b-a). Otherwise write (a,b) \stackrel{1}{\to} (a-b,b).
  3. Repeat step 2 until reaching (1,1).
  4. Read off the primary occurrence of n by prefixing 1 to the reverse of the binary digits written over the arrows.

All of this leads to the following (somewhat mind-blowing) observation: the position of each primary occurrence of n in the hyperbinary sequence is a proof (encoded in binary) that n is relatively prime to the number that comes immediately before it.

To tie this all together, here’s a little Haskell program I wrote to compute primary occurrences:

import System.Environment (getArgs)
import Control.Monad (forM_)
import Data.Maybe (catMaybes)
import Data.List (sort)

-- Find a fraction in the Calkin-Wilf tree.  findCW a b computes the
-- location of a/b in the Calkin-Wilf tree, or returns Nothing if a
-- and b are not relatively prime.
findCW :: Integer -> Integer -> Maybe Integer
findCW 1 1 = Just 1
findCW a b | a < b  = left  `fmap` findCW a (b-a)
           | a > b  = right `fmap` findCW (a-b) b
           | a == b = Nothing

left  x = 2*x
right x = 2*x + 1

-- Find the primary occurrences of n.
primaryOccs :: Integer -> [Integer]
primaryOccs n = sort . catMaybes $ [ findCW m n | m <- [1..n] ]

main :: IO ()
main = do (from:to:_) <- getArgs
          forM_ [read from..read to] $ \i -> do
            putStr $ show i ++ ": "
            putStrLn . unwords . map show . primaryOccs $ i

If there’s interest, I could write another post explaining how this program works in a bit more detail, but hopefully you can mostly understand what it’s doing from the preceding discussion. Let’s check that it works:

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 2 16
2: 2
3: 4 6
4: 8 14
5: 10 12 16 30
6: 32 62
7: 18 22 24 28 64 126
8: 20 26 128 254
9: 34 46 48 60 256 510
10: 38 56 512 1022
11: 36 40 54 58 66 94 96 124 1024 2046
12: 44 50 2048 4094
13: 42 52 70 78 112 120 130 190 192 252 4096 8190
14: 68 80 110 122 8192 16382
15: 72 118 258 382 384 508 16384 32766
16: 92 98 134 158 224 248 32768 65534

Yup, looks good! And to show that it really is efficient, let’s use it to compute the primary occurrences of 1000 (yes, all 400 of them!):

[brent@archimedes:~/teaching/mlt/hyperbin]$ ./prim-occs 1000 1000
1000: 39770 42278 56024 58532 68396 70568 71444 75674 76688 85046 86480 92576 104030
... 386 more primary occurrences...
1071508607186267320948425049060001810561404811705533607443750388370351051124936
1224931983788156958581275946729175531468251871452856923140435984577574698574803
9345677748242309854210746050623711418779541821530464749835819412673987675591655
43946077062914571196477686542167660429831652624386837205668069374

On my computer this runs in about 0.03 seconds. Sweet! Note that the position of the final primary occurrence of 1000 has 303 digits–that’s approximately one centillion. I certainly wouldn’t want to go through the first one centillion numbers in the hyperbinary sequence looking for 1000 by hand, would you?

Further Reading

I hope you’ve enjoyed this little (1.5-year-long!) excursion into the fascinating world opened up by Calkin and Wilf’s paper. (I’ve sure enjoyed writing it!) If you’re interested in exploring more, I suggest looking at Roland Backhouse and João Ferreira’s paper, Recounting the Rationals, Twice! which also explains the connection to another famous tree of fractions, the Stern-Brocot tree. If you’re interested in the computational aspect of the hyperbinary numbers, have a look at Enumerating the Rationals, by Jeremy Gibbons, David Lester, and Richard Bird.

More hyperbinary fun

Monday, September 28th, 2009

When I originally posed Challenge #12, a certain Dave posted a series of comments with some explorations and partial solutions to part II (the hyperbinary sequence). Although I gave the “solution” in my last post, no solution to any problem is ever the last word—there’s always more to explore, more connections to see! It’s worth taking another post to explore some of Dave’s discoveries and conjectures, and how they connect to my last post. I’ve used different notations, made up some terms, tried to simplify things where possible, and so on, but all the ideas in this post are fundamentally Dave’s.

First things first

Dave first noticed that

h(2^k - 1) = 1

for all k \geq 0. I noted this in my previous post as well, but said only that this is “apparent” from thinking about the recursive construction I described. However, it’s worth going through a proof of this more formally; it’s a good example of a proof by induction. First, let’s recall the recurrence relation that defines the hyperbinary sequence:

$ \begin{align*}h(0) &= 1 \\ h(2n+1) &= h(n) && \text{(O)}\\ h(2n+2) &= h(n+1) + h(n) && \text{(E)}\end{align*} $

Now we can show that h(2^k - 1) = 1 for k \geq 0. First, the base case: when k=0, h(2^0 - 1) = h(0) = 1; check!

Now, let’s suppose that h(2^k - 1) = 1 for some particular k \geq 0; this is called the inductive hypothesis, which I will abbreviate as IH. we want to show that h(2^{k+1} - 1) = 1 as well. 2^{k+1} - 1 is obviously odd, so we want to apply rule (O). The only trick is to get it into the right form first:

$ \begin{align*}h(2^{k+1} - 1) &= h(2^{k+1} - 2 + 1) && \text{\{ arithmetic \}}\\ &= h(2(2^k - 1) + 1) && \text{\{ factor \}}\\ &= h(2^k - 1) && \text{\{ rule (O) \}} \\ &= 1 && \text{\{ IH \}}\end{align*} $

Presto! Since the thing we wanted to show is true when k=0 and is true for k+1 whenever it is true for k, by induction it must be true for all k \geq 0.

Second things, third things…

But Dave then went on to notice that this sort of pattern isn’t unique to 1. For example,

h(3 \cdot 2^k - 1) = 2

for k \geq 0, and

h(5 \cdot 2^k - 1) = h(7 \cdot 2^k - 1) = 3

also for k \geq 0. Dave conjectured that this generalizes, and indeed it does: most generally, we can say that

h(p \cdot 2^k - 1) = h(p - 1)

for k \geq 0. That is, once a certain number occurs at position p - 1, it will keep occurring in a regular pattern after that, at every position of the form p \cdot 2^k - 1. Note that h(2^k - 1) = 1 is a special case with p = 1.

Why is this? Intuitively, it’s because of the “copying”—the blue numbers in my previous post. Once a number occurs, it will keep getting copied, but with the distance between the copies doubling each time. But again, this is worth proving formally. I’ll let you do this one—the proof by induction is similar to the proof above!

But why do occurrences of 1 correspond to p = 1, and occurrences of 2 to p = 3, and occurrences of 3 to p = 5 and p = 7, and so on? Are there any patterns to be found here? This is what Dave explored next; let’s follow.

Primary occurrences and the inverse of h

Let’s think a little more about this equation h(p \cdot 2^k - 1) = h(p - 1). Let’s suppose that p is odd (if it isn’t, we can always take another factor of two out of it to combine with the 2^k). Notice that p \cdot 2^k - 1 is always odd, except when k = 0 (in which case p \cdot 2^k - 1 = p - 1, which is even). Moreover, we can always write any odd number in the form p \cdot 2^k - 1, where p is odd: just add one, and factor out as many copies of 2 as possible. For example, 11 = 12 - 1 = 3 \cdot 2^2 - 1.

This means that if q is any odd number, we can write it in the form q = p \cdot 2^k - 1, and so h(q) = h(p-1): that is, every odd position in the hyperbinary sequence is just copied from an earlier even position. Of course, this can also be seen from the recursive construction in my previous post; but now we have really proved it! And what about the even positions? The values of the hyperbinary sequence at even positions are obtained by adding, rather than copying. If n occurs at an even position, we will call it a primary occurrence of n.

The nice thing about primary occurrences is that (a) there are only finitely many primary occurrences of any particular n (challenge: prove this!), and (b) once we know the positions of the primary occurrences of n, we know the positions of all the occurrences of n: they are at positions of the form p \cdot 2^k - 1, where p - 1 is a primary occurrence.

So, given n, can we find all its primary occurrences? Essentially we are asking if we can compute h^{-1} (by which I mean the inverse relation of h; since h is not injective its inverse is obviously not a function), but leaving out non-primary occurrences since we can easily find those from primary ones. Let’s start by making a table with some small values of n:

n Primary occurrences
1 0
2 2
3 4, 6
4 8, 14
5 10, 12, 16, 30
6 32, 62
7 18, 22, 24, 28, 64, 126
8 20, 26, 128, 254
9 34, 46, 48, 60, 256, 510
10 38, 56, 512, 1022
11 36, 40, 54, 58, 66, 94, 96, 124, 1024, 2046
12 44, 50, 2048, 4094
13 42, 52, 70, 78, 112, 120, 130, 190, 192, 252, 4096, 8190
14 68, 80, 110, 122, 8192, 16382
15 72, 118, 258, 382, 384, 508, 16384, 32766
16 92, 98, 134, 158, 224, 248, 32768, 65534

Do you see any patterns? If so, can you prove them by induction? I’ll talk about a few in my next post. But I still don’t know whether there’s an effective way to compute h^{-1}(n)!

Challenge #12 solution, part II

Saturday, September 19th, 2009

Yes, that’s right, that Challenge #12, posted one year, five months, and a day ago. You see, I have this nasty habit of starting things and not finishing them… well, better late than never!

Question two of the aforementioned challenge asked this:

Given a positive integer n, in how many ways can n be written as a sum of powers of two, when each power is allowed to occur at most twice? Say h(n) denotes the number of different ways to write n as a sum of powers of two with at most two copies of each power allowed. What can you say about the function h?

These representations using at most two copies of each power of two are sometimes called hyperbinary representations. For example, 11 can be written as 2^3 + 2^1 + 2^0 or as 2^2 + 2^2 + 2^1 + 2^0. It turns out these are the only hyperbinary representations of 11, so h(11) = 2.

How to analyze this? We can use a similar approach to that outlined in the solution to part III (which asked a similar question, but allowing any number of copies of each power of two). First let’s think about odd numbers, that is, numbers of the form 2n + 1. Any representation of an odd number must include an odd number of copies of 2^0 (otherwise the sum won’t be odd!), but since at most two copies are allowed, there must be exactly one copy of 2^0. If we take away the 2^0 and divide everything else by two, we get a valid representation of n. The reverse works as well: if we start with any valid representation of n, multiply everything by two, and add a copy of 2^0, we get a valid representation of 2n + 1. This means that

h(2n+1) = h(n).

What about even numbers? A valid representation of 2n + 2 has either two copies of 2^0, or none. If it has none, we can simply divide by two to get a valid representation of n + 1 (and vice versa); if two, we can take away the copies of 2^0 and then divide by two to get a representation of n, and vice versa. Therefore,

h(2n+2) = h(n+1) + h(n).

These two equations, h(2n+1) = h(n) and h(2n+2) = h(n+1) + h(n), along with the fact that h(0) = 1, are enough to completely characterize h(n). We can easily compute h(n) for the first few values of n:

1,1,2,1,3,2,3,1,4,3,5,2,5,3,4,1,5,4,7,3,8,5,7,2,7,5,8,3,7,4,5,1…

It’s immediately striking, of course, that h(n) isn’t strictly increasing: that is, just because m > n doesn’t mean h(m) > h(n). In fact, even as the values of h(n) continue to get bigger on average, there are still some values of 1 hiding in there. In fact, h(0) = h(1) = h(3) = h(7) = h(15) = 1, and the obvious conjecture is that h(2^k - 1) = 1 for every k \geq 0.

The next thing we notice, after staring a bit, is that the h(n) sequence contains a lot of palindromes! Every section sandwiched in between two 1’s seems to read the same forwards and backwards. Let’s make some graphs so we can see these patterns more easily: below I’ve made graphs showing h(n) as n goes from 1 to 100, 1000, and 10000, respectively.

The hyperbinary sequence through n=100

The hyperbinary sequence through n=1000

The hyperbinary sequence through n=10000

Pretty! We can clearly see the symmetric structure here, and the way that it occasionally (every 2^k - 1) jumps back down to 1.

Here’s another way to think about building the sequence. We start with the first two elements:

1,\rrr{1}

Now we copy the red part onto the end of the sequence (the copy will be blue), leaving a blank space in front of each blue number:

1,\rrr{1},\blnk,\bbb{1}

Now we fill in each blank (in green) by adding the two numbers on either side of it:

1,\rrr{1},\gggr{2},\bbb{1}

Now the blue and green part becomes the new red part:

1,1,\rrr{2},\rrr{1}

Now we repeat! Copy the red part to the end, leaving a blank in front of each copied number:

1,1,\rrr{2},\rrr{1},\blnk,\bbb{2},\blnk,\bbb{1}

Fill in each blank by adding the two numbers on either side:

1,1,\rrr{2},\rrr{1},\gggr{3},\bbb{2},\gggr{3},\bbb{1}

And again: blue and green become the new reds, copy the red numbers to the end with a blank in front of each:

1,1,2,1,\rrr{3},\rrr{2},\rrr{3},\rrr{1},\blnk,\bbb{3},\blnk,\bbb{2},\blnk,\bbb{3},\blnk,\bbb{1}

And fill in the blanks by adding:

1,1,2,1,\rrr{3},\rrr{2},\rrr{3},\rrr{1},\gggr{4},\bbb{3},\gggr{5},\bbb{2},\gggr{5},\bbb{3},\gggr{4},\bbb{1}

And so on. It takes a little bit of thought, but it can be seen that the method I’ve described here is really just a different way of expressing the equations h(2n+1) = h(n) and h(2n+2) = h(n+1) + h(n)! But thinking about it this way makes it apparent why a 1 always shows up at h(2^k - 1), and also why we end up with palindromes.

I’ll stop there for now; in an upcoming post I’ll bring things full circle by showing how these numbers relate to the Calkin-Wilf tree of rationals.

Distributing cookies: solutions

Tuesday, April 14th, 2009

And now for some solutions to the cookie distribution problem. I’m actually going to describe four different methods of solution, and thereby (re)discover some nice combinatorial identities along the way. This is what I love about combinatorics—you discover all this rich structure just by coming up with different ways to count things!

Solution 1: Cookies and Fences

The first solution, as described in a comment by Mike, consists in considering permutations of cookies and fences.

Cookies and fences

If we set up the cookies in a row, we can use four fences to separate them into five sections; the first student will get the first section of cookies, the second student will get the second section, and so on. In the example above, student A gets one cookie, student B gets two, student C doesn’t get any, and students D and E get three and four, respectively. It’s easy to see that every distribution of cookies corresponds to an arrangement of fences, and vice versa. So, how many ways are there to arrange the fences? Think of it this way: we have fourteen “slots”, and into each slot we can put either a cookie or a fence. Choosing a distribution is just choosing in which of the fourteen slots to put the four fences, and there are \binom{14}{4} = 1001 ways to choose four out of fourteen slots. (If you don’t know about binomial coefficients, you can read about them here.) Incidentally, this is also the number of ways to distribute four fences to eleven students; just use the cookies to divide the fences into eleven sections (most of which will, of course, be empty). Although I have no idea why students would want fences.

This can be easily generalized: the above argument shows that there are \binom{c + s - 1}{s - 1} ways to distribute c cookies to s students.

So, the problem is solved, right? What more can there be to say about it?

…plenty! This is a nice elegant solution, but it takes some cleverness (and/or luck, and/or experience with this sort of problem) to see it, and in fact it isn’t the first solution I came up with. Looking at some different methods of solution will also illuminate some of the rich structure of binomial coefficients.

Solution 2: Don’t Be So Mean

It seems mean that sometimes some students don’t get any cookies at all. Wouldn’t it be much nicer if every student got at least one cookie (even if they still might be distributed unevenly)? The key insight, as pointed out by Jonathan, is this: the number of ways to distribute ten cookies to five students is the same as the number of ways to distribute fifteen cookies to five students in such a way that each student gets at least one cookie.

Why is this? Well, suppose you have distributed fifteen cookies to five students so that each student has at least one. Now take one cookie away from each student—and you have now distributed ten cookies to five students. Conversely, if you have distributed ten cookies to five students somehow, if you now give each student an extra cookie, you have distributed fifteen cookies and each student has at least one. Each distribution of ten cookies corresponds to precisely one distribution of fifteen cookies where each student gets at least one, and vice versa, so counting one is the same as counting the other. Now, how many ways are there to distribute fifteen cookies like this? We can line the cookies up and imagine choosing points at which to divide them; to divide them into five groups, we have to choose exactly four out of the fourteen possible places to divide the cookies. (There is at least one cookie between any two dividing points, which means that each student has to get at least one cookie.)

Cookies and dividers (each student gets at least one cookie)

The above illustration shows a distribution in which students A, B, C, D, and E get two, three, one, four, and five cookies respectively. Note that this is the distribution of fifteen cookies that corresponds to the distribution of ten cookies illustrated with the fences previously.

So, the number of ways to distribute fifteen cookies to five students so that each gets at least one is \binom{14}{4}; more generally, the number of ways to distribute c cookies to s students so each gets at least one is \binom{c-1}{s-1}. Also, the total number of ways to distribute c cookies to s students is the same as the number of ways to distribute c+s cookies to s students so that each gets at least one.

Solution 3: One at a time, please!

Another way of approaching the problem comes courtesy of meichenl. To distribute cookies to s students, we first have to give some number of cookies to the first student; we can give her anywhere from zero to ten cookies. If we give her k cookies, there are c – k cookies left to distribute to the other s – 1 students. If we let D(c,s) stand for the number of ways of Distributing c cookies to s students, we can write this observation as


\displaystyle D(c,s) = \sum_{k=0}^c D(c - k, s - 1).

That is, we split the distribution up into cases based on how many cookies we give to the first student; in each case we’ve reduced the problem to a simpler distribution problem. Adding everything up, we get the above recurrence—we’ve defined D(c,s) recursively, that is, in terms of itself. This definition actually gets off the ground because we can identify some base cases, that is, simple situations in which we know the answer without breaking things down further. In particular, D(0,s) = 1 for any s (there’s only one way to distribute zero cookies—namely, to give zero cookies to each student), and D(c,1) = 1 for any c (there’s only one way to distribute cookies to a single lucky student!).

So, once we have this recurrence, what do we do with it? We can use it for direct calculation, as Mark James apparently did; he also used the recurrence in a computer program to print all the distributions. As observed by Daniel Klein, this sort of recurrence is also perfectly suited for a programming technique called dynamic programming; it’s easy to turn this recurrence into a computer program to compute D(c,s) for any values of c and s (although in this particular case, as we’ve seen, there are more efficient ways to compute D(c,s)).

But here’s another interesting thing: we already know what D(c,s) is, from solutions 1 and 2: it’s \binom{c+s-1}{s-1}. So if we substitute this into the above recurrence, we get


\displaystyle \binom{c+s-1}{s-1} = \sum_{k=0}^c \binom{c - k + s - 2}{s - 2}

If we substitute d + 1 in place of s – 1 everywhere, we can make this a little nicer without changing anything (d is just a new variable I made up, it doesn’t stand for anything in particular):


\displaystyle \binom{c+d+1}{d+1} = \sum_{k=0}^c \binom{c + d - k}{d}

We’ve discovered an identity involving sums of binomial coefficients! Neat! Now, you might think, “OK, let’s prove this by induction.” But we need do no such thing. We have already argued that the expression on the left and the expression on the right count the same things—so they must be equal! This is the epitome of a combinatorial proof: to show that two things are equal, just show that they represent two different ways of counting the same thing.

Solution 4: How mean do you want to be?

The last solution (at least, the last solution I will discuss in this post!) comes courtesy of Dave. Instead of breaking the distribution into cases according to how many cookies we give to the first student (as in the previous solution), we can break it into cases according to how many students get any cookies: we can give all the cookies to one student, or give them all to two students, or three, and so on.

The first case is easy: if we give all the cookies to one student, there are 5 ways to choose which student to give the cookies to, and only one way to give them the cookies. What if we give all the cookies to two students? Well, there are \binom{5}{2} ways to choose which two students get cookies. Now we want to distribute the ten cookies to these two students in such a way that each student gets at least one cookie (otherwise we are back in the first case). As previously noted in solution 2, there are \binom{9}{1} ways to do this. If three students get cookies, there are \binom{5}{3} ways to choose the three students, and \binom{9}{2} ways to distribute the cookies, and so on. Generalizing, we see that


\displaystyle D(c,s) = \sum_{k=1}^s \binom{s}{k} \binom{c-1}{k-1}.

Egads! We’ve discovered another combinatorial identity, this one involving a sum of products of binomial coefficients!


\displaystyle \binom{c+s-1}{s-1} = \sum_{k=1}^s \binom{s}{k} \binom{c-1}{k-1}.

There’s yet another method of solution I’m aware of—the theory of combinatorial species and generating functions—but that can wait for another post, I think we’ve had quite enough combinatorial goodness for one day!

Chessboard counting: solutions and further challenges

Monday, March 30th, 2009

And now for some solutions to the chessboard counting challenges.

  1. The first challenge was to count the number of squares of any size on an 8×8 chessboard. The key here (as with many counting problems) is to break the problem down in an appropriate way. In this case, we can think about counting each size square separately. There is only one square with sides of length 8. There are 4 squares of size 7—they can be in one of two positions horizontally, and one of two positions vertically. It’s not hard to see that there are 3^2 = 9 squares of size 6, 4^2 = 16 of size 5, and so on. So the total number of squares is 1^2 + 2^2 + \dots + 8^2 = 204.
  2. This pattern continues; on an n by n chessboard, there are 1^2 + 2^2 + \dots + n^2 = \sum_{k=1}^n k^2 total squares. There’s actually a nice closed formula for this sum for general n: \frac{n(n+1)(2n+1)}{6}. (Showing how to derive this formula is a good subject for another post. We can check it for n = 8: (8 \cdot 9 \cdot 17)/6 = 4 \cdot 3 \cdot 17 = 204, as expected.)
  3. Counting rectangles instead of squares turns out not to be too much harder. There are 8 \cdot 8 1 by 1 rectangles, 8 \cdot 7 1 by 2 rectangles, 7 \cdot 8 2 by 1 rectangles, and so on. In general, there are (8 - j + 1)(8 - k + 1) j by k rectangles. So the total number of rectangles is

    \scriptstyle (8 \cdot 8 + 7 \cdot 8 + \dots + 1 \cdot 8) + (8 \cdot 7 + \dots + 1 \cdot 7) + \dots + (8 \cdot 1 + \dots + 1 \cdot 1)

    That looks annoying to add up, but there’s a clever trick we can pull. Notice that we can factor an 8 out of the first parenthesized expression above, a 7 out of the second, and so on, yielding

    \scriptstyle 8(8 + 7 + \dots + 1) + 7(8 + 7 + \dots + 1) + \dots + 1(8 + 7 + \dots + 1)

    But now we can factor (8 + 7 + \dots + 1) out of this, yielding

    (8 + 7 + \dots + 1)^2 = 36^2 = 1296.

    Ah, that’s much nicer!

  4. For a general m by n chessboard, the same method yields

    (m + (m-1) + \dots + 1)(n + (n-1) + \dots + 1)

    total rectangles. Using the closed formula for the sum of the numbers from 1 through n, we can rewrite this as mn(m+1)(n+1)/4.

See the comments to the previous post for some further analysis. Also in the comments, Veky posed the following follow-up challenge: how many squares (or rectangles) are there if we also allow squares (rectangles) which are not parallel to the sides of the chess board—the only restriction is that their vertices must occur at “crosshair” points in between the squares of the chess board?

Predicting Pi: solution

Thursday, October 23rd, 2008

Now for the solution to the question in my previous post, which asked what you can learn about \pi, given the sequence of integers \floor{\pi}, \floor{2\pi}, \floor{3\pi}, \dots.

Nick Johnson commented:

Well, the obvious thing one can learn given just |(10^n)r| is the first n digits of the decimal expansion of r.

For example, when we are given \floor{100\pi} = 314, we can say with confidence that \pi \approx 3.14\dots. If we wait long enough, we can learn as many decimal digits of \pi as we want; in particular, to learn the first n decimal digits of \pi, we can just wait for the 10^nth item of the list.

This seems like an awfully long time to wait, though. Not only that, but if we only look at the first, tenth, hundredth, thousandth,… elements of the sequence, we would be ignoring almost all the information in the sequence! Intuitively, it seems that we should be able to do better by taking more of the sequence into account. And indeed, we can.

The key is to realize that


\floor{n\pi} \leq n\pi < \floor{n\pi} + 1.

Do you see why? Taking the floor of something never makes it bigger, so \floor{n\pi} \leq n\pi (in fact, it can never be equal since \pi is irrational, but that’s not important for our purposes). Also, taking the floor of something reduces it by some amount less than one, so adding one to the floor gives us something bigger than the original; hence n\pi < \floor{n\pi} + 1. Dividing through by n, we find that


 \displaystyle \frac{\floor{n\pi}}{n} \leq \pi < \frac{\floor{n\pi} + 1}{n}.

That is, if the nth element of the sequence is k, then we know that \pi must be between k/n and (k + 1)/n. Of course, this works for any number r, not just for \pi.

Let’s see how this works. The first few numbers in the sequence \floor{\pi}, \floor{2\pi}, \floor{3\pi}, \dots are 3, 6, 9, \dots. After seeing the 3, we know that 3/1 \leq \pi < 4/1. After seeing the 6, we know that 6/2 \leq \pi < 7/2 -- so we have a better upper bound now (3.5) than we did before (4). After seeing the 9, we know 9/3 \leq \pi < 10/3, and so on. Notice that our lower bound hasn't changed yet. That won't happen until we get to \floor{8\pi} = 25, when we learn that 25/8 \leq \pi < 26/8. So our new lower bound is 3.125. But the upper bound at this step, 26/8 = 3.25, is actually worse than the upper bound that we would have found on the previous step, namely 22/7 = 3.142857... So in general, the upper and lower bounds that we find in each step might not be better than all the previous ones; we can just keep the greatest lower bound and the least upper bound that we've seen so far.

So, how good are these bounds? I wrote a little program to compute the best upper and lower bounds for various points in the sequence. Here's a table showing the best lower and upper bounds at various points in the sequence, and the decimal digits they give us confidence about.

Term Best bounds \pi so far
10 \frac{25}{8} < \pi < \frac{22}{7} 3.1
100 \frac{311}{99} < \pi < \frac{22}{7} 3.14
1000 \frac{2818}{897} < \pi < \frac{355}{113} 3.1415
10000 \frac{31218}{9937} < \pi < \frac{355}{113} 3.141592
100000 \frac{208341}{66317} < \pi < \frac{312689}{99532} 3.141592653
1000000 \frac{3126535}{995207} < \pi < \frac{1146408}{364913} 3.1415926535

As you can see, this method does much better than the method of just looking at powers of ten! At n = one million, we already know ten decimal digits of pi; by looking just at the one millionth element of the sequence, we would only know six digits. It turns out that in general, it is the case (which I will not prove) that on average, we can get approximately twice as many decimal digits by finding best upper and lower bounds this way instead of just looking at powers of ten.

In a future post I hope to make a graph of these upper and lower bounds and talk a little bit more about what’s going on—it turns out to be pretty interesting stuff!

Challenge #12 solution, part III

Thursday, April 24th, 2008

And now for the solution to problem #3 from Challenge #12, which asked: how many ways are there to write a positive integer n as a sum of powers of two, with no restrictions on how many powers of two may be used?

(more…)

Challenge #12 solution, part I

Monday, April 21st, 2008

I’ll begin by providing an answer to the first of the three questions I posed in a previous post.

(more…)

Perfect numbers, part III

Tuesday, November 27th, 2007

This is the last in a series of posts about perfect numbers. A quick recap of the series so far: in part I, I defined perfect numbers as positive integers n for which \sigma(n) = 2n, where \sigma(n) denotes the sum of the divisors of n. I also revealed that if n is factored into prime powers as n = p_1^{\alpha_1} p_2^{\alpha_2} \dots p_m^{\alpha_m}, then \sigma(n) can be calculated as follows:

\displaystyle \sigma(n) = \sum_{i=1}^m \frac{p_i^{\alpha_i + 1} - 1}{p_i - 1}

In part II, we saw where this formula actually comes from. Finally, in the challenge interlude, DB and Steve Gilberg (and you?) found that the first four perfect numbers seem to all be of the form 2^k (2^{k+1} - 1); DB additionally claimed that this only works when 2^{k+1} - 1 is prime.

Well, it’s true: 2^k (2^{k+1} - 1) is perfect whenever 2^{k+1} - 1 is prime. But you don’t have to take my word for it—let’s prove it! Since n is perfect if \sigma(n) = 2n, we want to show that \sigma(2^k (2^{k+1} - 1)) = 2^{k+1} (2^{k+1} - 1) if 2^{k+1} - 1 is prime. Applying the formula for \sigma(n), this is just some straightforward algebra. (Note: by “straightforward” I don’t mean “you’re dumb if you don’t see it immediately”; I mean “if you’re patient and persistent, you should be able to work it out for yourself”. Mathematicians are fond of saying that things are “obvious” or “straightforward”, but this is what they actually mean.)

 $ \begin{align*} \sigma(2^k (2^{k+1} - 1)) &= \frac{2^{k+1} - 1}{2 - 1} \cdot \frac{(2^{k+1} - 1)^2 - 1}{(2^{k+1} - 1) - 1} \\ &= (2^{k+1} - 1) \cdot \frac{2^{2k+2} - 2^{k+2} + 1 - 1}{2^{k+1} - 2} \\ &= (2^{k+1} - 1) \cdot \frac{2^{k+1}(2^{k+1} - 2)}{2^{k+1} - 2} \\ &= (2^{k+1} - 1) \cdot 2^{k+1} \end{align*} $

Voila! Just what we wanted to show. Note that the restriction that 2^{k+1} - 1 must be prime is very important: the formula for \sigma(n) assumes that n is factored as a product of prime powers, so the computation above is invalid if 2^{k+1} - 1 can be factored further.

So, the question becomes, when is a number of the form 2^m - 1 prime? Well, first of all, m must be prime; if m can be factored as m = pq, then 2^m - 1 can also be factored, as (2^q - 1)(2^{q(p-1)} + 2^{q(p-2)}} + \dots + 2^q + 1). But is it enough for m to be prime?

Notice that the first four perfect numbers correspond to the first four primes, 2, 3, 5, and 7:

 $ \begin{align*} 6 &= 2^1 (2^2 - 1) \\ 28 &= 2^2 (2^3 - 1) \\ 496 &= 2^4 (2^5 - 1) \\ 8128 &= 2^6 (2^7 - 1) \end{align*} $

The next one, however, dashes our hopes: 2^{11} - 1 = 2047 = 23 \cdot 89. Now, it is important to note that this doesn’t necessarily mean that 2^{10} (2^{11} - 1) = 2096128 isn’t perfect: we have only proven above that 2^k (2^{k+1} - 1) is perfect when 2^{k+1} - 1 is prime, which doesn’t necessarily say anything about what happens when it isn’t. However, it turns out that this is indeed true. In fact, more is true: all even perfect numbers must be of the form 2^k (2^{k+1} - 1), with 2^{k+1} - 1 prime. There are no other kinds of even perfect numbers. So, we saw that 11 doesn’t give us a perfect number, but it turns out that the next three primes (13, 17, and 19) do:

 $ \begin{align*} 2^{12} (2^{13} - 1) &= 33550336 \\ 2^{16}(2^{17} - 1) &= 8589869056 \\ 2^{18}(2^{19} - 1) &= 137438691328 \end{align*} $

are all perfect. But then a few more primes get skipped; the next perfect number corresponds to 31.

Primes of the form 2^m - 1 might ring a bell for long-time readers of this blog: these are the Mersenne primes! As of right now, we know of 44 Mersenne primes, and therefore we know about 44 perfect numbers. The largest known Mersenne prime has almost ten million digits, so the largest known perfect number has about twice that many!

Now, what about odd perfect numbers? Are there any odd numbers which equal the sum of their proper divisors?

…No one knows!!