UPDATED! See below.
My correspondence with a reader concerning the Ask Marilyn column continues. The emails are copied below. I have removed the reader's name and deleted only some friendly asides and such. I have more comments about the Marilyn vos Savant column after the emails.
10/30/2011
READER: In any event, I'm not sure what you're saying. In your 1st blog post, you
seem to say Marilyn actually is correct.
10/30/2011
VP: In her first answer, she writes "It was far more likely to have been that mix [emphasis added] than a series of ones." In my view, when she writes 'that mix', she is referring to that one specific roll, and that roll has the same probability as all 1s. That's why I claimed "Marilyn's first answer is wrong" in my first post.
However, in her answer to George Alland, she changes her answer to "It was far more likely to be (b), a jumble [emphasis added] of numbers." She has changed the conditions of the problem from considering one particular throw to a roll that is jumbled. That's why I wrote in my first post "Her second answer (the "jumble of numbers" is more likely) is correct..."
As you wrote, Marilyn may sometimes be ambiguous, and when she is confined to one small column in Parade magazine, that is all too easy.
10/30/2011
READER: You are correct that when Marilyn writes "that mix", she is talking
about the specific series (b).
However, what you are omitting in your analysis of Marilyn's answer,
is that "that mix", viz., (b), is indeed "far more likely" GIVEN THAT
the rolled series must be either (a) or (b).
The "GIVEN THAT" clause is crucial in determining likelihood. I had
stated this key point in my first response to your blog entry, along
with the other key point that the writing down of the series occurs
after the 20 die rolls.
Changing Marilyn's problem by omitting the "GIVEN THAT" clause
constraint, would make your probability analysis correct and your
ambiguity complaint reasonable.
BTW, I'm not a die-hard Marilyn fan. When she messes up, e.g., when
she claimed that Wiles' proof of Fermat's Last Theorem was invalid,
I'm the first to throw a stone.
11/2/2011
VP: I must admit that I'm at a loss. I fail to grasp how the phrase 'given that' affects the probabilities. Could you explain further?
The reader points to the original problem as stated by Marilyn. I reread it and I see that there's an even more egregious error. Marilyn writes 'It’s (b) because the roll has already occurred.' This implies there is some conditional probability.
As far as my understanding of probability goes, there are three issues here. (1) What is the probability of rolling a die twenty times and getting one specific outcome out of the 3,656,158,440,062,976 possible outcomes? (2) What is the probability of rolling a die twenty times and getting a particular mix of 1s, 2s, 3s, 4s, 5s, and 6s? And (3) this problem does not involve any conditional probabilities.
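The arithmetic behind points (1) and (2) can be checked with a short Python sketch. It assumes a fair six-sided die, and it reads "a particular mix" in point (2) as "any ordering with the same count of each face as series (b)"; that reading is my assumption, not something stated in the column. The names SERIES_A, SERIES_B, total_outcomes, and arrangements_with_same_counts are made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

SERIES_A = "1" * 20                  # twenty ones
SERIES_B = "66234441536125563152"    # the mix quoted in the column


def total_outcomes(rolls: int, sides: int = 6) -> int:
    """Number of distinct ordered outcomes for `rolls` throws of a die."""
    return sides ** rolls


def arrangements_with_same_counts(series: str) -> int:
    """How many orderings share this series' count of each face
    (a multinomial coefficient)."""
    result = factorial(len(series))
    for count in Counter(series).values():
        result //= factorial(count)
    return result


n = total_outcomes(20)

# Point (1): any one specific ordered series, (a) or (b), has the same chance.
p_specific = Fraction(1, n)

# Point (2): chance of rolling *some* ordering with the same face counts.
p_counts_a = Fraction(arrangements_with_same_counts(SERIES_A), n)
p_counts_b = Fraction(arrangements_with_same_counts(SERIES_B), n)

print("possible outcomes:", n)
print("P(one specific series):", float(p_specific))
print("P(same face counts as (a)):", float(p_counts_a))
print("P(same face counts as (b)):", float(p_counts_b))
```

Under these assumptions the two specific series are equally likely, while the class of rolls sharing (b)'s face counts is far larger than the class sharing (a)'s, which contains only the all-ones roll itself.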
Are there any readers who can find some oversight, misconception, and/or goof on my part?
UPDATE 11/2/2011 Email
First note that Marilyn doesn't explicitly use the words "given that". However, the meaning of her wording involves the same idea, viz., conditional probability.
You can google something like: "given that" probability to find numerous examples using the phrase "given that" in this conditional probability context, e.g., http://www.mathgoodies.com/lessons/vol6/conditional.html
OK, let's move to Marilyn's article. I've carefully chosen wording and formatting to make what's going on easier to understand.
The 1st half of Marilyn's article basically says:
The specific mix of numbers (b) 66234441536125563152
is as likely to appear next, as
the specific series (a) 11111111111111111111,
GIVEN THAT
I've already written down (a) and (b).
Hopefully, you agree with this wording and the correctness of the statement, so far.
The 2nd half of Marilyn's article basically says:
The specific mix of numbers (b) 66234441536125563152
is more likely to have been the rolled series, than
the specific series (a) 11111111111111111111,
GIVEN THAT
I wrote down (a) and (b) after I finished rolling the die,
AND
the series I rolled is indeed either (a) or (b).
Please take a moment to confirm that this captures the meaning of the 2nd half of Marilyn's article.
Now, do you also see how the "given that" clause for the 2nd half fundamentally changes the likelihood of (a) vs. (b), even though Marilyn still compares explicitly "that mix", 66234441536125563152, with the all ones series?
Note that Marilyn is NOT saying that, if we run the entire experiment again, 66234441536125563152 would again be the series written down on the piece of paper.
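One way to make the "given that" argument concrete is Bayes' rule. The sketch below is only an illustration under stated assumptions: the die is fair, so rolling (a) and rolling (b) have equal prior probability, and the person writing the two series down invents the non-rolled "decoy" series with some probability that we have to guess. The function name posterior_rolled_b and both decoy probabilities are hypothetical; nothing in the column supplies these numbers.

```python
from fractions import Fraction


def posterior_rolled_b(p_decoy_is_a: Fraction, p_decoy_is_b: Fraction) -> Fraction:
    """P(the rolled series was (b) | the paper shows (a) and (b)).

    p_decoy_is_a: assumed chance the writer invents (a) as the decoy
                  when (b) was actually rolled.
    p_decoy_is_b: assumed chance the writer invents (b) as the decoy
                  when (a) was actually rolled.
    The equal priors P(roll = a) = P(roll = b) = 6**-20 cancel out.
    """
    return p_decoy_is_a / (p_decoy_is_a + p_decoy_is_b)


# Hypothetical case 1: both decoys are equally likely to be invented.
# Then the paper carries no information and the answer is one half.
even = posterior_rolled_b(Fraction(1, 2), Fraction(1, 2))

# Hypothetical case 2: the writer picks all ones as a decoy 1% of the time,
# but would pick the exact jumble (b) only by blind chance.
skewed = posterior_rolled_b(Fraction(1, 100), Fraction(1, 6 ** 20))

print("equal decoy chances:", float(even))
print("all-ones is a salient decoy:", float(skewed))
```

The two cases show where the disagreement lies: with no assumption about how the decoy is chosen, (a) and (b) remain equally likely, and the conditional argument only favors (b) once one assumes the writer is much more likely to invent all ones than to invent that exact jumble.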