A Solution to a Generalized Pepys’s Problem

Samuel Pepys, the famed English diarist, proposed the following problem to Isaac Newton, in order to gauge his chances in a wager:

Which experiment is more probable?
A. You roll 6 fair die and get at least 1 dice that shows up as “6″;
B. You roll 12 fair die and get at least 2 die that show up as “6″;
C. You roll 18 fair die and get at least 3 die that show up as “6″.

The correct answer turns out to be experiment A. Based upon the correspondence between Newton (who provided the correct solution) and Pepys, the wagerer was not immediately convinced of the answer. When challenged, Newton attempted to give an intuitive explanation of the answer, though it is technically incorrect/incomplete (Newton probably knew this). Here it is, paraphrased from the Ye Olde Englishe version:

Imagine that B and C toss their dice in groups of six. A is most favorable because it requires a 6 in only one group toss, while B and C require a 6 in each of their group tosses.

Astute readers will recognize that this intuitive explanation neglects situations where C might get two 6′s in the first or third set, for example.

This problem has been discussed often enough that it has been titled the “Newton-Pepys Problem“, and one is likely to find it given as an elementary probability exercise. I myself came across a more general version of Pepys problem: I win if, in 6n independent rolls of dice I acquire at least n 6′s. You win if, in 6(n + 1) independent rolls of dice you acquire at least n + 1 6′s. Who is more likely to win?

When n = 1 or 2, we know from Newton that I am more likely to win (yay!), but what about for all n? The answer is of course that I still am more likely to win, but how goes the proof? It is unclear whether Newton had a general solution, as his letters to Pepys indicate a purely numerical approach (he calculated the probabilities by hand for A, B and C). My guess is that he had an elegant solution that unfortunately could not fit in the margin of his letter.

And surprisingly, the solution to this general Pepys’s problem cannot fit within the confines of a Wikipedia entry or Wolfram Mathworld page either! The latter aludes to a general solution, which involves this horrifying “regularised hypergeometric function” – what the hell is that?! No, no, there has to be a better way.

Let D_k be the random variable that counts the occurrences of 6′s in k independent dice rolls (fair die). Then we are comparing probabilities \mathbb{P}(D_{6n} \geq n) vs. \mathbb{P}(D_{6n + 6} \geq n+1). Thanks to independence, we can write D_{6n + 6} = D_{6n} + D_6.

We can forget that the dice have six sides, and instead pretend they’re coins, where you get heads (6) with probability 1/6, and get tails (1,…, 5) with probability 5/6. Then the D_k has a binomial distribution:

\mathbb{P}(D_{6n} \geq n) = \sum_{j=n}^{6n}{\binom{6n}{j} p^{j}(1-p)^{6n-j} }
\mathbb{P}(D_{6n+6} \geq n+1) = \sum_{j=n+1}^{6n+6}{\binom{6n+6}{j} p^{j}(1-p)^{6n+6-j} }

with p = 1/6. But treating D_{6n + 6} as a convolution of D_{6n} and D_6, we can instead write:

\mathbb{P}(D_{6n+6} \geq n+1) = \sum_{j=n+1}^{6n+6} \sum_{y = 0}^{6} \mathbb{P}(D_{6n} = j - y) \mathbb{P}(D_{6} = y)

By implementing the change of variables z = j - y, we have

\mathbb{P}(D_{6n+6} \geq n+1) = \sum_{z=n+1}^{6n}{\mathbb{P}(D_{6n} = z) } + \Lambda

where \Lambda = \sum_{z=n-5}^{n}{\mathbb{P}(D_{6n} = z) \sum_{y=n-z+1}^{6} {\mathbb{P}(D_6 = y) }}. That is the first trick.

Note that \mathbb{P}(D_{6n} \geq n) = \sum_{z=n}^{6n} {\mathbb{P}(D_{6n} = z) }, so in order to compare  \mathbb{P}(D_{6n} \geq n) vs. \mathbb{P}(D_{6n + 6} \geq n+1) we can compare \Lambda against \mathbb{P}(D_{6n} = n).

Assuming that n > 6 (we’ll leave cases n = 1,\ldots,6 to the reader), \mathbb{P}(D_{6n} = n - h) < \mathbb{P}(D_{6n} = n) for h = 1,\ldots,5.  Here comes the second trick. Note that \sum_{z=n-5}^{n}{\sum_{y=n-z+1}^{6} {\mathbb{P}(D_6 = y) }} = \sum_{j=0}^{6}{\binom{6}{j} jp^j(1-p)^{6 - j}} = 6p = 1, which follows from this identity. Thus, we know that in the \Lambda sum, the \mathbb{P}(D_{6n} = z) are being weighted by numbers that sum up to 1, and so the average must be smaller than the largest probability, which is \mathbb{P}(D_{6n} = n). Thus, \Lambda < \mathbb{P}(D_{6n} = n), and so \mathbb{P}(D_{6n} \geq n) > \mathbb{P}(D_{6n + 6} \geq n+1). I win!

Advertisement

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s