## Monday Math Madness: We have a winner!

For the very first Monday Math Madness contest we got 13 submissions. Of the 13, 6 were correct. For the record, I solved the problem by enumerating the various cases where 3 categories were represented and computing and adding their probabilities. I also verified my solution to the problem by writing a computer program to enumerate all 243 (3^5) permutations of 3 categories and 5 blog posts and count the ones were all 3 blog categories were represented. So, I'm pretty confident I got the right answer ðŸ™‚

I used the random number generator at random.org to generate random integers between 1 and 13 (inclusive). It took two random numbers to get a number that corresponded to a correct answer. (All correct answers were well explained.) That second number was 8, and the 8th answer we received was correct.

Our winner is Austin Betz. Great job, Austin! I'll be sending you a $10 gift certificate to Amazon.com. And, Blinkdagger will be posting a new contest next Monday.

This was the problem:

A popular blog has just three categories: brilliant, insightful, and clever. Every blog post belongs to exactly one of the three categories and the category for each post is selected at random. What is the probability of reading at least one post from each category if a reader reads exactly five posts?

The answer to the problem is 150/243, or 50/81.

Here is Austin's solution. Note the clever use of 5-digit binary numbers to solve part of the problem:

This one could get kinda messy if you try to count up all the ways you can read at least one of each, so I'd go the other way: What is the probability that of the five selected posts, at least one of the categories is not in the reading list?

To answer that, I'll count up all the ways that you could read five posts with either all of them from the same category, or all of them from only two categories.

Of course, for each category, there is only one way that one could read posts solely from that category, giving 3 possible solutions to the rephrased question.

To find all the ways to have a heterogeneous reading list from two categories, I considered the binary numbers from 00001 to 11110, which would correspond to the numbers 1 to 30 in decimal. Since these binary numbers give all the heterogeneous, five-digit combinations of 0 and 1, it will also give the number of ways that I could choose 5 readings from 2 categories. Of course, I'll have to consider all 3 ways that I could exclude one of the categories, so I'll multiply those 30 possibilities by 3 to get 90 heterogeneous reading lists from 2 of the 3 categories.

So, the probability of selecting 5 readings and omiiting at least one category would be ( (# of homogeneous reading lists) + (# of heterogeneous reading lists from 2 categories) ) / (# of ways to select 5 posts from 3 categories) = ( 3 + 90 ) / (3^5) = 93/273 = 31/81

To answer the original question, I'll subtract my answer from 1 to get 1 - 31/81 = 50/81, or about 62%

Pat Bellow, of the Mathwords site, gets my applause for the most interesting solution. Pat solved the problem two ways, one of the ways was using Markov state matrices. I've never heard of such an approach; I guess I could have had more fun in probability class if my teacher had known about this approach. If Pat is willing, I'll post notes from him on how he solved this problem as a separate post.

Richard Berlin also had an innovative approach that seems to have some things in common with Pat Bellow's approach:

We probably need to assume that the total number of entries is large; otherwise the effect of a choice can substantially effect future probabilities).

The full set of possibilities make a tree. We'll describe them as 2-tuples (probability, # differences)

The first choice has a 1.0 probability of being different from what's already been seen.

1 (1.0, 1)

At the second choice, there are two possibilities that would be different, one which would be the same

2 (2/3, 2) (1/3, 1)

At the third choice, the probabilities depend on the probability of reaching this point in the decision tree.

So there's a 2/3 chance of starting with the "2 different" condition and a 1/3 probability of starting with the "both same" condition. If the are both different, you have a 1/3 chance of picking a third different which gives 2/3 * 1/3 = 2/9, and a 2/3 chance of picking one that is the same as one of the first two, for 2/3 * 2/3 = 4/9. In the other branch of the tree, it's 2/3 * 1/3 and 1/3 * 1/3.

3 (2/9, 3) (4/9, 2) (2/9, 2) (1/9, 1)

Fortunately, we can prune the first duple above, because we've reached 3. The others are

4 - (4/27, 3) (8/27, 2) (2/27, 3) (4/27,2) (2/27, 2) (1/27, 1)

Two more nodes can now be pruned. It might be worth stopping a moment though, and seeing what the answer would be if we only chose 4 articles.

The nodes that haven't reached 3 yet have probabilities

8/27 + 4/27 + 2/27 + 1/27 and the probability of success is obtained by subtracting that value from 1.

The pattern suggests a closed-form solution:

1 - (2^n - 1)/3^(n - 1)

The fifth branch looks like this:

5 - - (8/81, 3) (16/81, 2) - (4/81, 3) (8/81,2) (2/81, 3) (4/81, 2) (2/81, 2) (1/81, 1)

Counting up the nodes that haven't reached 3 yet and summing their probabilities gives us

(16 + 8 + 4 + 2 + 1) / 81 = 31/81.

which confirms our proposed closed-form solution.

The probability of reading at least one of each kind with a sample of 5 is therefore

1 - (2^5 - 1)/3^4 = 1 - 31/81 = 50/81

Check the Blinkdagger site next week for the second Monday Math Madness contest problem.

QuanMarch 11th, 2008 - 12:38

Congratulations Austin, and great job hosting the first contest Sol. Daniel and I are looking forward to hosting the next contest at blinkdagger. Keep those submissions coming!

efriqueMarch 11th, 2008 - 19:56

Heh.

I didn’t expect to win this (since my explanation was so terse, being a slightly longer version of “it’s just a standard multinomial”, followed by some substituting in the formula (multiple of a sum of two multinomial probabilities) to get the answer, but it did at least have the virtue of being brief.

pat ballewMarch 12th, 2008 - 08:59

I LOVE the use of binomials by Richard Berlin. I scribbled pages of notes trying to find a nice way to apply five digit base three numbers and just got cross-eyed…

Nice work, Richard.

Pat

SolMarch 15th, 2008 - 08:42

@Quan: I’m looking forward to the next contest too.

@efrique: Terseness of explanation is not a concern if the explanation is clear.

@Pat: I also liked Richard’s approach. I’d need to spend some time with the binomial theorem and the inclusion exclusion principle to demonstrate that both get you to a solution. Also, my next post will be of your very interesting approach to solving the problem.

JimApril 14th, 2008 - 11:09

I thought of a solution but did not get the same answer.

Assume the categories are a,b,c

The sample space is then 5 a’s , 5 b’s , 5 c’s

The total number of patters u can make of this space taking 5 at a time is: 15C5 = !15/(!10*!5)= 3003

Thus the total number of possible events is 3003 (i.e. he can read 5 articles in these many different ways.)

However, all these ways do not have a,b and c within them. So subtract the number of patterns that don’t have all 3 i.e

3003 – [Patterns(only a) + Patterns(only b) + Patterns(only c) + Patterns(only a and b) + Patterns(only a and c) + Patterns(only b and c)]

=

3003 – [1 + 1 + 1 + 3*10C5]

= 3003 – [3 + 3*252]

= 3003 – [3 + 756]

= 2244

Hence the probability of getting a favorable pattern with a,b and c in it is:

2244/3002 = 0.7472

The answers is different so am not sure if it’s correct.

Jim

pat ballewApril 14th, 2008 - 22:58

Jim,

I think there is one major problem with this approach, and another small arithmetic problem… first, the arithmetic because it is easier…

I think you double counted some cases.. for example, the 10 choose 5 for “patterns only a and b” would include one each of “Only a” and “Only b”.. these would also be recounted again in the “patterns only a and c” and yet again in the “patterns only a and c”.

The problem I have with your combinations approach is that the probablility of event a being selected is dependent on what happened before. (the probability you hear type a on the second listening depends on what you heard on the first listening. I think the intention (from the use of the word “random” was that these probabilities would be independent from trial to trial.

Hope that helps explain the difference in your answer.

Pat