Main
Sequence Page 5
A Little Probability: Casting Out Dice
Cut to the chase: My current best solution
Cut to the chase: C Program
for Empirical Values
Cut to the chase: C
Program for Theoretical Values
Contents
Introduction
Methods
First Guess for <T(N)>
Some Notation
PDF(1)
CDF(1)
<T(1)>
<T(1)> != <Tbad(N)>
<T(N)>
Loose Ends
Spontaneous Processes
Expectation in Physics
Where Next?
Introduction
If you've ever played a dice game (say Risk
for example, or D&D) where
rolling high numbers is desirable, you may feel a small twinge of
happiness every time you throw a die and it comes up 6. I do,
and so one day I found myself playing a silly game with a big handful
of standard dice. I would throw them all at once and set aside any dice
that
came up 6 from the group; those were the good dice. Then I would scoop
up the remainder (those
that weren't showing 6 pips) and
throw these,
repeating the remove-the-6's procedure after each throw. Of course the
idea is to try and get all the dice to read 6 in as few throws as
possible.
Clearly this is a game of skill.
After playing this game for a little while I wondered how many throws
of the dice it ought to take
in order to complete
the game. This led to a study of probability written up on
this page. Since I'm re-discovering these ideas and trying to stay
reasonably concise, this page will refer to some background concepts
listed here.
To clarify the statement of the game: What we seek is an expectation
value <T> which is the number of times we expect to throw a
handful of N dice such that each die comes up a '6' at least once.
Another way to think of <T> is in terms of an experiment
conducted many times. Suppose we have 17 dice and we throw them
repeatedly, pulling out the ones that come up 6 until we have none
left. (We could also paint them blue and keep throwing them until all
the dice are blue. Since they are assumed to operate independently it
doesn't matter if we keep throwing the ones that have already come up
six.)
Suppose this first trial with the 17 dice took 12 throws to eliminate
all the dice from the game; this is the result for the first
experiment. We repeat this experiment many times, each time recording
how many throws were necessary to finish the game. Finally after we've
done the experiment
many times we average the number of throws we counted for each game to
get <T(17)>. We'd like to arrive at a general formula for any
number of dice, and we can call this <T(N)>.
Here's the thing: When first looking at this problem I found a solution
to the basic
question that worked pretty well for a large initial number of dice. In
fact I
imagined at the time that I had solved the problem. But strangely the
solution failed miserably for one or two or
three dice. And furthermore the rationale used to produce the
formula was very suspicious! Refining my original guess to one that
actually works for any number of dice was then my new objective, and
this effort produced dozens of pages of mistakes and blind alleys and
incorrect results... but I think I finally got it.
So: The question again:
Beginning
with N dice, how
many throws <T(N)> will be necessary (on average) to produce all
6's?
This is the same as asking what is <T(N)> where each die has come
up 6 at least once?
The failure of my original hypothesis (for small N) strikes me as an
amusing microcosm of an aspect of science. Reading science history it
seems the rule rather than an exception that the early answers to any
question are
found wanting and it takes some time before a deeper and more
complete explanation is found.
Methods
Let's face it, the novelty of counting the number
of throws necessary to reduce 30 dice to zero wears thin in a hurry,
and that's just one data point.
I have used a computer program
that rolls many dice very quickly to arrive at my empirical or
experimental values for <T(N)>. I can then compare these results
to theoretical tries to see how my theory looks. The theoretical values
I can calculate using a different computer program since I wish to
compare <T(N)>theory with <T(N)>experiment
for at least N = 1, 2, 3, ..., 150.
First
Guess for <T(N)>
There are proper rules for probability, and then there are made-up
hokey rules. I'll start with made-up hokey rules to arrive at that
initial incorrect
solution. Here is a first question leading toward the incorrect
solution:
What is <T(N)> if
the dice behaved perfectly but were not discrete?
This is a really bizarre and unclear statement. For one thing, perfect
behavior implies the dice are discrete; a die can come up a 6 or not;
it can not come up 30% 6 and 20% 5 and 50% 1, 2, 3, or 4. So is
this question nonsense, a contradiction in terms? Actually what I mean
by 'not discrete' is this:
Suppose the dice are capable
of being cut into
pieces, 'fractional
dice'. Suppose further that of any number of dice (like say 19.4 of
them) can be thrown and exactly 1/6 of them will come up 6 and will
therefore be eliminated from the game. This is what I
mean by perfectly behaved but non-discrete dice: They distribute their
rolls evenly among all 6
possible outcomes, even though there is a non-integral number of dice.
As an aside: I am essentially using a large-numbers
approximation here by assuming 1/6th of my dice can be eliminated.
Small wonder this solution does not work well for small numbers of
dice! Here is a digression
on the
definitions of discrete and continuous types of probability.
To continue with this derivation: Suppose I start with N dice
and 1/6 of them turn up 6's. Then after 1
throw I have R(1) = (5/6)xN
dice
left. After 2 throws I have R(2) =
(5/6)x(5/6)xN
dice remaining. After t throws
I have R(t) = N x (5/6)t
dice remaining. If I
set R(t) = 0 and solve for t then t will be my answer
for <T(N)>. Simple!
(5/6)t x N = 0
This unfortunately has the solution t
=
+infinity. But let's not give up hope. We could fudge a little
and say "Let's see
how many
throws it will take to wind up with 1 die remaining." Now my equation
will have a finite solution:
(5/6)t x N = 1 and t = ln(N) / ln(6/5) = 0.182 ln
N.
Then I say "Hah! I have tricked the problem into telling me how many
rolls it takes to eliminate N-1
dice, since I began with N and
I now
have 1 left. So I can say
<T(N-1)> = 0.182 ln N
or:
<T(N)> = 0.182 ln (N+1)
Of course this is not quite correct because I had that extra die riding
along on my way to eliminating N
dice, and it might have come up 6 in there somewhere in which case...
I'm done before I finish... but this is very confusing! Another
thing I could do is take <T(N-1)> = 0.182 ln N and add 6 more
throws to get rid of that last die. I should
expect to have to throw a die six times in order to expect it to come
up six at least once. Or maybe only three times since it could come up
in the first 3 throws or the second three throws with equal probability.
You see how muddled my reasoning is becoming here? This is all
just casting about for a clue. It's not correct!
What if I
use this fractional-dice idea and start with N dice and roll
until I have just
1/2 a die left? In this case since I can't really have half a die left, maybe
it goes poof and becomes zero dice precisely at that point. So using
the same equation I'll
suppose that I'm finished as soon as I have 1/2 a die left:
(5/6)t x N = 1/2 and t = ln(2N) / ln(6/5) = 0.182
ln
(2N).
This is my phony answer which seems to work quite well for N = 20, 21,
22, ... and so on for larger values of N, the number of dice I begin
with. Is that weird or what?
To reiterate: I start with a large number
of dice, I assume that every time I roll them precisely 1/6 of them go
away even if this is a non-integer value, and once I have 0.5 dice left
it collapses to zero dice and I'm done and that is my value of
<T(N)>. Very very suspicious argument!
As I mentioned this formula does not work
for smaller numbers of dice, like for N
= 1 it gives <T(1)> = 3.8.
What is <T(1)> really?
This I can calculate precisely by other
means, so I'll go on to that next. Once more to summarize the
bad answer:
<Tbad(N)>
= ln(2N) / ln(6/5).
Some
Notation
I do not have the means at this time to render the math in a
pretty way. Therefore I will resort to three pieces of notational
subsitution.
Sum (i = 1 to N) { f(i) }
<==> Summation as index i
goes from 1 to
N of some function f(i).
Lim (t -> +inf) { g(t) }
<==> The limit as t
goes to +infinity of some
function g(t).
(a b) <==> 'a choose b' or Combination(a b), equal to a! /
((a-b)! b!).
What is
pdf1(t)?
Note: In the above
question, 't' means 'upon failing to throw a 6 t-1 times in a row,
throw t comes up 6'.
Note: In the above question, the
subscript 1 means "using one die".
From here I am jumping into the use of a probability distribution
function for the discrete case of fair dice. There is a note on this here.
Now I'm trying to proceed properly to eventually arrive at an
expression for <T(N)>
that is correct
for all N. I'll start with <T(1)> which is easy enough to
get and to do that I proceed
in four steps.
[1]
The first step is easy: Throw away
the rubbish from the previous section and just start with P(6) = 1/6. This is the assumption
that the dice are all fair, that the probability or rolling a 6 is one
in six.
[2]
The expression for <T(1)>
depends on the discrete
probability density function pdf1(t).
Step 2 then is calculating pdf1(t).
[3]
Then I should verify it
sums to 1. This sum is called the
cumulative distribution function cdf,
so showing cdf1(t ->
+infinity) = 1 is step 3.
[4]
In step 4 I'll derive <T(1)>.
Step
2 Question: What is pdf1(t), the probability of rolling a 6
on throw t?
(By the rules of the game I must take into account that
I might
already have rolled a 6 on an earlier throw
in which case the
probability on throw t
is zero
because the die has been eliminated from play.)
To do this in the "counting
ways" method I can imagine that a die is thrown t-1 times without ever coming up 6.
Here are some spaces for writing down the results of these t-1 throws:
__
__ __ __ __ __ __ ... __
1 2 3 4
5 6 7 t-1
What rolls can go in these spaces? On throw 1 I could have got a 1 or a
2 or a 3 or a 4 or a 5, 5 ways. Same thing for roll 2 and for roll 3
and so on. Since these throws are independent of one another, citing Idea 4
about independent events, the total number of ways of producing these
desired t-1 throws is obtained
by multiplying the ways for each throw: 5 x 5 x 5 x 5 x 5 x ... x 5 = 5t-1 = ways to reach the desired result
(no 6's came up).
To count the total number of ways of throwing a die t-1 times I do exactly the same
thing only this time allowing 6's as well, to arrive at total ways = 6t-1.
Finally once we've established the probability of arriving at throw t without having rolled a six, which
is the number of possible ways
divided by the total number of ways,
we ask: What is the probability of now finally rolling a 6 on throw t? Answer: 1/6. So the aggregate
probability on t throws of:
Not rolling 6 on throw 1 and Not
rolling 6 on throw 2 and ... and Not rolling 6 on throw t-1 and Rolling
6 on throw t is:
P(6
on throw t) = (5/6)t-1 x (1/6) = pdf1(t)
for one die.
Great. We have a probability distribution function for a single die,
and this is the cornerstone of the calculation of <T(1)>.
Step [3]:
Does
cdf1(t -> +infinity) = 1?
Bearing in mind that
cdf(t) = SUM(i=1 to t) { pdf(t) }.
Step 3 Question:
Does the cumulative distribution function or cdf for
the pdf derived in Step 2 indeed sum to 1?
In full I want to calculate LIM(tau
-> +infinity) { SUM(t=1 to tau) { pdf1(t)
=
(1/6) x (5/6)t-1 } }.
The solution is found from a
very pretty idea in algebra. The idea is that the series
S = 1 + a + a2 + a3 + a4 +
... + at
can be multiplied by -a to produce a new series
T = -a - a2 - a3 - a4 - a5 -
... - at - at+1.
When these two series S and T are added together, only the first and
last terms remain: S + T = 1 - at+1. The rest of the terms
'telescope' away to nothing. Hence writing T as -aS:
S (1 - a) =
1 - at+1
or
S = (1 - at+1) / (1 - a)
(I got this nice idea from Apostol's Calculus-I.) As a result
the sum S for a=(5/6) is:
S = (1/6) x (1 - (5/6)t+1) / (1
- (5/6)).
In the limit t -> +infinity (where it is vital to note that 5/6 is
less than
one) this becomes
S = (1/6) x (1) / (1 - (5/6)) = 1.
That is, the cdf corresponding
to pdf1(t) converges
nicely to one as t goes to
infinity, meaning that if I throw a
single die for a really really long time, the probability that I will
sooner or later roll a six is very good. So far so good.
What is
<T(1)>?
<T(1)> = 6. But I should add some text here.
Notice
that <T(1)> and <Tbad(N)> do not agree
Perhaps there is a conceptual framework that permits us to take
the "dropping out 1/6th" approach in the limit of large N that requires
a different way of finishing. But this is taking a false result and
trying to shove it into a round hole. Better is to start over from the
ground up and derive a correct answer.
What is <T(N)>?
Going from <T(1)> to <T(N)> is not simple, but it's
not too horrible either. As for T<1>: I figured out a PDF, made
sure it was normalized (summed to 1), and from there wrote the
expression for <T(N)>, this time as the double sum where t =
1,..., inf and i = 1, ..., N.
<T(N)> = Sum (t = 1, inf) { t x Sum (i = 1, inf) { F1 x F2 x F3 x
F4 } }
The internal Sum over the index i is the expression for the pdf. It
remains to specify the four factors F1, F2, F3, F4. They represent the
probability of 'finishing the game' on throw t. That is, assuming 'i'
dice remain after t-1 throws, what is the probabiity they all come up
'6' on throw t? This is a compounded probability, hence four factors;
it must not only deal with the last question but should also include
the probability of arriving at that state, having precisely 'i' dice
remaining in the game.
F1 = (1/6)^i
This is the probability of rolling 'i' sixes on throw t.
F2 = N choose i
This is the number of ways of selecting i dice from the original pool
of N dice.
F3 = ((5/6)^(t-1))^i
This is the probability that i dice were thrown t-1 times and
never came up 6.
F4 = (1-(5/6)^(t-1))^(N-i)
This is the probability that the remaining (N-i) dice were thrown (t-1)
times and came up 6 at least once.
So that's it, and it agrees well with experiment. Can it be simplified
algebraically? Approximated? I don't know just yet.
Here
is code for the experimental calculation of <T(N> and here is code for the
theoretical calculation.
Some
Loose Ends
Plot <T>experimental versus <T>theoretical.
Does <T> match <Tbad> for large N?
Does <T> match the cheater's-solution that uses N+1?
BThm and Fibonacci
Radioactive Decay
How does this relate to
spontaneous processes?
Since our dice behave a little like unstable atoms waiting
around to fall apart (come up 6) there is a nice connectivity between
the dice problem and radioactive decay. Is the early "wrong" guess more
applicable when we consider huge numbers of atoms or dice?
How does one approach time
expectation in physics?
The various stages in the proton chain discussed on Page 2 each have
associated times, i.e. how long it takes for them to occur in the
center of a star. Each process or reaction happens at a characteristic
rate.
WHY??? What makes one particle likely to change structure in a matter
of
minutes while another requires a million years on average? (A
partial answer: It depends upon the
energy structure of the process in question. I think that many
reactions require tunneling through a potential barrier and the harder
this is the longer it will take to occur.)
The connection of course is that the dice problem considered here,
though discrete, employs ideas useful in probability theory for
physical processes. Or so I hope. If I can think about rolling dice and
make the connection from problem to pdf to expectation, then the hope
is that the thoughts will carry through to a derivation of the Planck
radiation law (for example).
Where next?
So far in this sequence of web pages I've thought about cosmic
rays, photoelectrons, synthesis of
elements in stars, and cellular automata (in addition to
rolling a few handfuls of dice). At this point I think it's important
to try to solve a more difficult problem from physical theory. I
have an idea of what that problem should be and it comes from this
realization: I don't understand something properly if I can use my
poor understanding to make up things that don't exist. That is, the
things I
make up that don't exist don't exist for a reason, so if I don't know
the reason then I'm missing something.
Example: The photon is a little quantum of energy that flies
along, and as it moves it produces or is
a little electric field that
oscillates, plus it produces or is a
little magnetic field at right angles
that also oscillates. Furthermore electrical charges that accelerate
produce magnetic fields and magnetic fields that change with time
produce what? Electric fields? We have light, electrical fields,
charges, magnetic fields all bound together so I figure I ought to be
able to construct a magnetic field
that can bend the path of a particle of light, or an electric field
that does this. Let's call these devices magnetic prisms and
electric prisms. The difficult problem I would like to solve is: How to
build an electric and/or a magnetic prism capable of bending visible
light.
Because this idea is (a) difficult and worse (b) probably next to
impossible, I
expect to exert some serious effort to fail at it properly. That is,
this proposition of understanding magnetic prisms will require a degree
of competence and sophistication. So
before starting that I'll need a warm-up in the form of two
intermediate
steps, the next two pages of this Main Sequence.
The first warm-up is
easy: Calculate a lot of interesting physical results like the peak
emission wavelength of the sun and that from the tip of my nose. The
second step will be harder: Count how many photons there are in a given
magnetic
field (and determine their energies) and from that calculate how long
it will take a permanent magnet to run out
of energy if I put it to work.
Previous
Next
Home