Chance News 104: Difference between revisions

Latest revision as of 00:01, 29 July 2015

March 1, 2015 to May 15, 2015

Quotations

"Regression to the mean is so powerful that once-in-a-generation talent basically never sires once-in-a-generation talent. It explains why Michael Jordan’s sons were middling college basketball players and Jakob Dylan wrote two good songs....

"The Bush family’s dominance [in presidential politics] would be the basketball equivalent of Michael Jordan being the father of LeBron James and Kevin Durant — and of Michael Jordan’s father being Walt Frazier....In other words, it is virtually impossible, statistically speaking, that Bushes are consistently the most talented people to lead our country. Same for Chelsea Clinton or any other member of a political dynasty thought to be possible presidential timber."

-- Seth Stephens-Davidowitz, in: Just how nepotistic are we?, New York Times, 21 March 2015

Submitted by Bill Peterson

[Note. For an accessible popular account of regression to the mean, see What does regression to the mean, mean?, ABC Science (Australian Broadcasting Corporation), 12 May 2015.]


“Keeping an open mind is a virtue – but, as the space engineer James Oberg once said, not so open that your brains fall out.”

“You can often see error bars in public opinion polls .... Imagine a society in which every speech in the Congressional Record, every television commercial, every sermon had an accompanying error bar or its equivalent.”

-- Carl Sagan in The Demon-Haunted World, 1996

Submitted by Margaret Cibes


"In the context of most observational studies, worrying about whether p < 0.05 or > 0.05 is like worrying about whether you made your bed when your house is burning."

-- Donald Berry, quoted by Gary Schwitzer at Health News Review

Submitted by Paul Alper


“Why are governments so eager to protect their citizens against dread risks, from cows to swine, and so hesitant to protect the very same people against the risk of financial disaster from investment banking?”

Gerd Gigerenzer in Risk Savvy: How to Make Good Decisions, 2014

Submitted by Margaret Cibes


"The Will Rogers phenomenon: Improved diagnostic methods that artificially increase the prevalence of a disease can improve the apparent prognosis of a patient without the measurement parameters having been changed. [Translator’s note: If cancers are diagnosed earlier but there is no change the time of death, the survival time will appear to have increased. The name comes from a Will Rogers quotation: When the Okies left Oklahoma and moved to California, they raised the average intelligence level in both states.]"

in: Placebo, are you there?, at the Science-Based Medicine web site

Submitted by Paul Alper


Paul Meehl once stated that "Sir Ronald [Fisher] has befuddled us, mesmerized us, and led us down the primrose path. I believe that the almost universal reliance on merely refuting the null hypothesis as the standard method for corroborating substantive theories in the soft areas is a terrible mistake, is basically unsound, poor scientific strategy, and one of the worst things that ever happened in the history of psychology."

quoted by R. Chris Fraley here.

[The quotation can be found in A Paul Meehl Reader: Essays on the Practice of Scientific Psychology, Routledge (2006), p. 72]

Submitted by Paul Alper

Forsooth

A Vancouver demographer comments, tongue-in-cheek, on the result of making Canada’s census long form voluntary in 2010:
“Because of the move to the voluntary NHS, Canada is a richer, whiter, more educated country now.”
Note that the response rate dropped from 98.5 percent in 2006 to 68.6 in 2011.

“The Tragedy of Canada’s Census”
The Wall Street Journal, February 26, 2015

"The percentage of students scoring at/above Proficient in 3rd grade math increased .... Curiale posted the highest gain ..., improving from 27.0 percent to 51.9 percent, an increase of 24.9% percent. …. In 6th grade, the percentage of students scoring at/above Goal ... increased from 28.0 percent to 39.4 percent, a gain of 11.4 percent."

"2013 CAPT Results Show Increases and CMT Results Show Decreases"
CT State Department of Education CSDE News, August 13, 2013

“A few years ago I performed surgery to correct a displaced abomasums ... in a dairy cow .... Ben, the owner of the farm, asked how likely the cow was to have problems after the surgery. Trying to put it in terms that he could relate to I said, ‘If we did this procedure on 100 cows, I expect about 10 to 15 would not completely recover within a few weeks of surgery.’ He paused a moment and said, ‘Well that’s good because I only have 35 cows.’’”

Gerd Gigerenzer in Risk Savvy: How to Make Good Decisions, 2014

".... President Dwight Eisenhower express[ed] astonishment and alarm on discovering that fully half of all Americans have below average intelligence ....”

Carl Sagan in The Demon-Haunted World, 1996

“Texas GOP Representative Pete Sessions recently claimed on house floor that the cost of each [Obamacare] enrollee was costing the U.S. treasury $5 million. He came up with that estimate by taking a $108 billion estimate cost and dividing by 12 million new enrollees. The only problem with that? 108 billion divided by 12 million equals about 9,000. So he was only off by about $4,991,000.”

“GOP Rep claims Obamacare costing $5 million per enrollee”
Daily Kos, March 25, 2015

Submitted by Margaret Cibes


“I just bought jumbo rolls of toilet paper--big bargain. It says on label: 12 mega rolls equals 48 regular rolls. On the other side of the label it says: use four times less.”

Personal correspondence, March 21, 2015

Submitted by Margaret Cibes at the suggestion of Howard Mayer


"The percent of binge drinkers and the gun death rate both nearly achieved significance at the p = 0.1 level."

Daniel Zelterman in “A Groaning Demographic” Significance, December 2014

(Extract from Mathematical Modeling of Zombies, ed. Robert Smith? [sic]

University of Ottawa Press, 2014)

Submitted by Margaret Cibes at the suggestion of James Greenwood

Transitivity, correlation and causation

Theorem 1 of the article cited by Paul Alper in the previous issue, "Is the Property of Being Positively Correlated Transitive?" (The American Statistician, Vol. 55, No. 4, November, 2001), depends on the existence of non-observed independent random variables U, V, and W which cause the correlations between X=U+V, Y=W+V, and Z=W-U to be non-transitive. An interesting question is whether this relates back to the difference between causation and correlation.

The answer turns out to be no: we can get the same sort of result even in the presence of causative relationships between X, Y and Z. Here’s an example:

  • X is N(0,1);
  • Y = X + U, where U is N(0,1) and independent of X;
  • Z = Y - 1.5*X.

The correlation coefficients between X and Y and between Y and Z are both positive but the correlation coefficient between X and Z is negative.
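The signs can be checked directly from the definitions. Since Z = Y - 1.5*X = U - 0.5*X, the covariances are Cov(X,Y) = 1, Cov(Y,Z) = 0.5 and Cov(X,Z) = -0.5, which give correlations of roughly 0.71, 0.32 and -0.45. The short simulation below is only an illustrative sketch of the same check; the sample size, the seed and the helper names `pearson` and `simulate` are assumptions made for this example and are not part of the original submission.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def simulate(n=100_000, seed=104):
    """Draw X, U ~ N(0,1) independently, then build Y = X + U and Z = Y - 1.5*X."""
    rng = random.Random(seed)          # fixed seed (arbitrary) for repeatability
    xs = [rng.gauss(0, 1) for _ in range(n)]
    us = [rng.gauss(0, 1) for _ in range(n)]
    ys = [x + u for x, u in zip(xs, us)]
    zs = [y - 1.5 * x for x, y in zip(xs, ys)]
    return pearson(xs, ys), pearson(ys, zs), pearson(xs, zs)


r_xy, r_yz, r_xz = simulate()
print(f"corr(X,Y) = {r_xy:+.3f}")   # theory: 1/sqrt(2)
print(f"corr(Y,Z) = {r_yz:+.3f}")   # theory: 0.5/sqrt(2.5)
print(f"corr(X,Z) = {r_xz:+.3f}")   # theory: -0.5/sqrt(1.25)
```

Here Y is caused by X, and Z is caused by X and Y, yet the two positive correlations still sit alongside a negative one.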

Stan Lipovetsky’s follow-up letter (The American Statistician, 56:4, 341-342, 2002) hints at this but does not include an actual example.

Submitted by Emil M Friedman

Baseball, medicine and politics

Thanks to John Allen Paulos for sending the following link:

Who's Counting: Non-transitivity in baseball, medicine, gambling and politics
by John Allen Paulos, ABCNews.com, 5 December 2010

This installment of John's "Who's Counting" column describes several real-world illustrations of non-transitivity in correlation.

Among these is an analysis from the aforementioned American Statistician article. Looking at the 2000 batting data from the New York Yankees, it was found that the number of triples hit by a player correlated positively with the number of base hits he had, which in turn correlated positively with the number of home runs he hit; however, the number of triples a player hit correlated negatively with the number of home runs he hit. As John explains, good hitters get base hits of all kinds, so it is not surprising that home runs and triples are positively correlated with total hits. But triples tend to be the result of speed, while home runs require power, and powerfully built sluggers tend not to be fast runners.

See the column for further discussion, including an example of non-transitive dice, the potential for non-transitive preferences in three-way elections, and the potential pitfalls resulting from the large number of correlations in medical data.
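For readers who have not met non-transitive dice, here is a small sketch. The three-sided dice A, B and C below are a standard textbook set chosen for illustration; they are an assumption of this example and not necessarily the dice discussed in the column. The code enumerates all face pairs to get exact win probabilities.

```python
from fractions import Fraction
from itertools import product

# Hypothetical illustrative dice (each face equally likely).
A = (2, 4, 9)
B = (1, 6, 8)
C = (3, 5, 7)


def win_probability(first, second):
    """Exact probability that a roll of `first` beats a roll of `second`."""
    wins = sum(1 for a, b in product(first, second) if a > b)
    return Fraction(wins, len(first) * len(second))


print("P(A beats B) =", win_probability(A, B))
print("P(B beats C) =", win_probability(B, C))
print("P(C beats A) =", win_probability(C, A))
```

Each probability comes out to 5/9, so A tends to beat B, B tends to beat C, and yet C tends to beat A.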

IQ and breast-feeding

A propos this last comment about medical data (though not transitivity per se) we received the following link from Douglas Rogers, with the comment "so many variables..."

Breastfeeding raises IQ… and some worrying questions
by Dean Burnett, Guardian, 18 March 2015

A long-term study found that the length of time babies were breastfed was positively associated with both IQ and financial success in later life. This article discusses the potential for confounding with such variables as the parents' income and education, the mother's age and health, and the baby's weight at birth. The researchers took great care to consider alternative explanations for the observed effects, but conceded that they "could not completely rule out the possibility mothers who breastfed helped their babies’ development in other ways. 'Some people say it is not the effect of breastfeeding but it is the mothers who breastfeed who are different in their motivation or their ability to stimulate the kids,' Horta [lead author on the study] told the Guardian."

TED Talk: Mathematics of Love

“The Mathematics of Love”, by Hannah Fry, April 2014
(17 min video, transcript provided)

Fry, an aerodynamicist, discusses three topics related to mating, based on recent statistical studies:

Topic #1: How to win at online dating (presenting oneself on social media in order to be popular)
Topic #2: How to pick the perfect partner (timing one’s choice)
Topic #3: How to avoid divorce (analogy to nations headed for war)

One study she refers to is “Why I Don’t Have a Girlfriend”, by economist Peter Backus, who uses the Drake equation to estimate the number of potential girlfriends for him.

Submitted by Margaret Cibes

Statistics of mate selection

On a related note (see above post) Paul Alper sent what he describes as some "provocative quotations" from Andrew Gelman's blog:

In general I think these literatures have too much focus on data analysis and not enough on data collection. (15 March 2015)

Gelman is reacting to a paper entitled "Do Women’s Mate Preferences Change Across the Ovulatory Cycle? A Meta-Analytic Review" (by Gildersleeve, Haselton, and Fales, Psychological Bulletin, September 2014 [1]).

He writes:

But if we seek to learn anything positive about fecundity and psychology, then I don’t think preregistration will get you anywhere, unless you improve the data collection and design of your studies. Preregistration is a great way to get a sense of what information you have but not necessarily a great way to learn anything new. To put it another way, preregistration removes bias, which is great, but it does not increase effect size or reduce variance. Except in the indirect sense that, if you know your study is preregistered, you also realize that it will not be so easy to attain statistical significance, which in turn might motivate you to perform more careful studies with cleaner effects and less noise. But this will happen only if you get serious about your design and data collection, not if you think of your study as a static entity.

He concludes, "All the replication and preregistration in the world won’t help you, if you’re tied to a weak, noisy design."

Health-care advice diverges

From “Personal genetic testing service launches in the UK,” Significance, February 2015:

A study, published in Genetics in Medicine in 2013..., compared 23andMe’s analysis of risk factors to that of two similar services .... It found differences in the way each company scored risks for specific people and diseases, including examples where different companies placed the same people in entirely opposite risk categories.

From The Wall Street Journal, March 23, 2015:

Headline #1: “Are Low-Salt Diets Necessary (or Healthy) for Most People?”
Op-ed responses from doctors:

Yes: Less Salt Reduces the Risk of Heart Disease
No: A Low salt Diet is Neither Safe Nor Feasible

Headline #2: “Should All Adults Take a Daily Aspirin?”
Op-ed responses from doctors:

Yes: The Evidence Is Clear It Reduces Deaths From Cancer
No: The Risks Are Large, and Increase as a Person Ages

Headline #3: “Is a Paleo Diet Healthy?”
Op-ed responses from doctors:

Yes: It Helps Control Weight, and Lowers Risks of Cancer
No: You Lose Too Much Pleasure – For Dubious Benefits

Submitted by Margaret Cibes

KinTape

Ask Well: Does kinesiology tape really work?
by Gretchen Reynolds, "Well" blog, New York Times, 27 March 2015

The technical paper referred to is

Kinesiology tape does not facilitate muscle performance: A deceptive controlled trial
by K.Y. Poon et al., Manual Therapy, February 2015 (Vol 20, Issue 1, pp. 130–133)

The study began with 46 participants, of whom 30 completed it; each was blindfolded so as not to see what kind of taping was done: "Thirty healthy participants performed isokinetic testing of three taping conditions: true facilitative KinTape, sham KinTape, and no KinTape." Here are the ANOVA results for Normalized Peak Torque (NPT), Normalized Total Work (NTW), and Time to Peak Torque (TPT):

All the participants were confirmed to be ignorant about KinTape at the debriefing after the experiment. None of them used KinTape prior to the study and they had never heard of the application of KinTape in any circumstances. NPT, NTW, and TPT in different conditions were shown in Table 1. There was no significant difference in NPT between all three taping conditions at 60° (F(2,87) = 0.05, p = 0.96) and 180°/s (F(2,87) = 0.41, p = 0.66). Similar results were found in NTW (F(2,87) = 0.27, p = 0.76; F(2,87) = 0.53, p = 0.59) and TPT (F(2,87) = 0.03, p = 0.98; F(2,87) = 0.32, p = 0.73) at slow and fast contraction speed respectively.

With such high p-values (and correspondingly low F-values), the conclusion is:

The present study demonstrated that the KinTape application did not generate higher peak torque, yield greater total work, or shorten time to peak torque in healthy young adults. Positive results in the previous studies of KinTape may be attributed to the placebo effects.
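The link between the quoted F-values and p-values can be checked by hand. When the numerator has 2 degrees of freedom, the upper-tail probability of the F distribution has the closed form (1 + 2F/d)^(-d/2), where d is the denominator degrees of freedom (here 87, which matches 30 participants times 3 conditions minus 3 group means). The sketch below simply applies that formula to the six reported F-values; the function name is made up for this illustration.

```python
def f_upper_tail_df1_2(f_value, df2):
    """P(F > f_value) for an F distribution with (2, df2) degrees of freedom.

    Closed form valid only when the numerator degrees of freedom equal 2.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)


# (F-value, p-value) pairs as quoted from the paper, all with F(2, 87).
reported = [(0.05, 0.96), (0.41, 0.66), (0.27, 0.76),
            (0.53, 0.59), (0.03, 0.98), (0.32, 0.73)]

for f_value, p_reported in reported:
    p_computed = f_upper_tail_df1_2(f_value, 87)
    print(f"F = {f_value:.2f}: computed p = {p_computed:.3f}, reported p = {p_reported:.2f}")
```

The computed values agree with the reported ones to within about 0.01; the small gaps for the two smallest F-values are what one would expect if the paper's F-values were rounded to two decimals.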

Submitted by Paul Alper

Parenting time

Yes, your time as a parent does make a difference
by Justin Wolfers, "Upshot" blog, New York Times, 1 April 2015

Why a claim about the irrelevance of parenting time doesn’t add up
by Justin Wolfers, "Upshot" blog, New York Times, 2 April 2015

In this pair of articles, Wolfers seeks to debunk a study on parenting time that was widely reported in the media (he cites articles from The Washington Post, The Guardian, and NBC News, among others). The study in question failed to find a significant correlation between parental time spent with children and positive outcomes later in life, such as better grades in school or higher scores on standardized tests. A common theme in all the media reports was that "quality beats quantity" in parenting time.

Wolfers's first article pointed out that the study in question was based on a survey that questioned parents about two specific days, one during the week and one on the weekend. He compares this with trying to predict your income by looking at a particular day: the result would vary wildly based on whether the day in question happened to be a payday. Similarly, he quotes developmental psychologists saying, “What you did yesterday should not be taken as representative of what you did last year.”

The second article responds to some readers' objections, and gives a particularly careful discussion of statistical issues related to "errors in variables." Wolfers acknowledges that randomness in the sample could reasonably be expected to guarantee that the average parenting time balances out correctly in the measure, in that some parents will respond about more time-intensive days, and others about less intensive days. The real problem, he explains, comes in correlating this with another measure. To illustrate, he constructs a set of three scatterplots, the first showing a positive correlation between the time you spent with your children today and the time you typically spend, the second showing a positive correlation between test scores and the time you typically spend, but the third showing near-zero correlation between test scores and the time you spend with your children today.
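The three-scatterplot argument can be imitated with a small simulation. Everything numeric below is an assumption made for illustration (standardized "typical time", a day-to-day noise standard deviation of 5, a slope of 0.5 for test scores); none of it comes from Wolfers's articles or from the study. The point is only that a noisy one-day measurement attenuates the correlation with the outcome.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(2015)   # arbitrary fixed seed
n = 50_000

# Hypothetical model: typical parenting time drives test scores,
# but the survey only observes one very noisy day.
typical = [rng.gauss(0, 1) for _ in range(n)]
today = [t + rng.gauss(0, 5) for t in typical]            # typical time plus daily noise
scores = [0.5 * t + rng.gauss(0, 1) for t in typical]     # outcome depends on typical time

r_today_typical = pearson(today, typical)
r_scores_typical = pearson(scores, typical)
r_scores_today = pearson(scores, today)

print(f"corr(today, typical)  = {r_today_typical:.3f}")
print(f"corr(scores, typical) = {r_scores_typical:.3f}")
print(f"corr(scores, today)   = {r_scores_today:.3f}")
```

Under these assumptions the first two correlations are clearly positive, while the third is roughly their product and therefore much closer to zero, even though parenting time matters by construction.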

Submitted by Bill Peterson

“Cochrane Collaboration”, Wikipedia

“The Cochrane Collaboration is an independent, non-profit, non-governmental organization consisting of a group of more than 31,000 volunteers in more than 120 countries. .... [T]he group conducts systematic reviews of randomized controlled trials of health-care interventions, which it publishes in The Cochrane Library.”

[Image: CochraneLogo.png, the logo of the Cochrane Collaboration]

“The logo of the Cochrane Collaboration illustrates a meta analysis of data from seven randomized controlled trials ..., comparing one health care treatment with a placebo in a forest plot. The diagram shows the results of a systematic review and meta-analysis on [an] inexpensive course of corticosteroid given to women about to give birth too early .... This treatment reduces the odds of the babies of such women dying from the complications of immaturity by 30–50%. Because no systematic review of these trials had been published until 1989, most obstetricians had not realized that the treatment was so effective and therefore many premature babies have probably suffered or died unnecessarily.”

What is a forest plot?: “[A] graphical display designed to illustrate the relative strength of treatment effects in multiple quantitative scientific studies addressing the same question. The name derives from the phrase to ‘see the forest for the trees’. .... [The horizontal lines show a] plot of the measure of effect ... for each of these studies ... incorporating confidence intervals represented by horizontal lines. A vertical line representing no effect is ... plotted. If the confidence intervals for individual studies overlap with this line, it demonstrates that at the given level of confidence their effect sizes do not differ from no effect for the individual study.”

Submitted by Margaret Cibes at the suggestion of James Greenwood

A stopping problem

Henk Tijms: Dropping balls Into bins
By Gary Antonick, "Numberplay" blog, New York Times, 9 February 2015

Here is the puzzle, as posed by Prof Tijms:

A game machine is used to drop balls into four bins. The balls are dropped one at a time and any ball will land at random into one of the bins. You can turn off the machine whenever you wish. At the end of the game you win a dollar for every bin containing exactly one ball and you lose half a dollar for every bin containing two or more balls. What stopping rule will maximize your expected gain? In other words, when should you turn off the machine?

He notes that B=4 bins is the first case that gets very complicated, and asks for a heuristic for the general case. Later he cites a careful reader's correction of his initial solution in the B=2 case: the optimal strategy is to continue until each bin has at least one ball, for an expected winning of 1.25. Indeed, it is quite instructive to read through the discussion in the comments section, and the reactions from the proposer.
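The B=2 value of 1.25 can be reproduced with a short dynamic program. The sketch below is one possible formulation, not Tijms's own solution: it assumes the state can be summarized by the number of empty bins and the number of bins holding exactly one ball, and it handles balls that land in an already overfull bin (which leave the state unchanged) by conditioning on the next ball that does change the state. The function and variable names are invented for this example.

```python
from fractions import Fraction
from functools import lru_cache


def optimal_value(bins):
    """Expected gain under optimal stopping, starting with `bins` empty bins."""
    half = Fraction(1, 2)

    @lru_cache(maxsize=None)
    def value(empty, single):
        multiple = bins - empty - single
        stop_now = single - half * multiple      # +1 per single bin, -1/2 per overfull bin
        if empty + single == 0:
            return stop_now                      # further balls cannot change anything
        gain = Fraction(0)
        if empty:
            # Ball lands in an empty bin: that bin now holds exactly one ball.
            gain += Fraction(empty, bins) * value(empty - 1, single + 1)
        if single:
            # Ball lands in a single bin: that bin now holds two balls.
            gain += Fraction(single, bins) * value(empty, single - 1)
        # A ball in an overfull bin leaves the state unchanged, so condition it away.
        keep_going = gain / (1 - Fraction(multiple, bins))
        return max(stop_now, keep_going)

    return value(bins, 0)


print("B=2:", optimal_value(2))   # prints 5/4, matching the 1.25 quoted above
print("B=4:", optimal_value(4))
```

Exact fractions are used so that the B=2 result can be compared with 1.25 without rounding error.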

We are grateful to Henk Tijms for sending this link, along with another balls-and-urns story, this one about drawings for the schedule of European soccer tournaments. It is especially timely in light of the recent travails of the governing organization FIFA (Fédération Internationale de Football Association)!

Was the UEFA Champions League draw rigged?—Bayesian analysis by Henk Tijms
William M. Briggs statistics blog, 5 April 2013

For an accessible general introduction to the theory of optimal stopping see:

Knowing when to stop
by Theodore Hill, American Scientist, March-April 2009

Hill's discussion includes variations on the classical Secretary (a.k.a. Marriage or Dowry) Problem, the Chow-Robbins game (which is also discussed in the "Numberplay" article), and more.

Submitted by Bill Peterson

Benefits of coffee

Douglas Rogers sent a link to the following:

More consensus on coffee’s benefits than you might think
by Aaron E. Carroll, "Upshot" blog, New York Times, 11 May 2015

Carroll catalogs the large number of studies and meta-analyses showing that moderate coffee consumption is associated with a number of health benefits, despite a widespread public perception that coffee can't be good for you. You can read Carroll's followup to readers' questions about the article here.

For an earlier discussion dispelling some myths about coffee, see

Sorting out coffee’s contradictions
by Jane Brody, New York Times, 5 August 2008

See also Coffee and a Forsooth! from Chance News 95.

Mean vs. median: Galton and the ox

Mike Olinick sent a link to the following story (and also links to the Galton papers cited there):

Voting on the ox
by Michel Balinski, The New York Review of Books, 7 May 2015

This is a letter in response to an earlier article, Is the right choice a good bargain? (5 March 2015), which argued that “statistical groups do especially well in answering factual questions.” Citing a 1907 publication by none other than Francis Galton, that article gave the following quotation: "The ox weighed 1,198 pounds; the average estimate…was 1,197 pounds, more accurate than any individual’s guess.”

But Balinski notes that the alleged quotation does not appear in Galton's paper! In fact, he cites an earlier paper by Galton, which includes the following:

How can the right conclusion be reached…? That conclusion is clearly not the average of all the estimates, which would give a voting power to “cranks” in proportion to their crankiness…. I wish to point out that the estimate to which least objection can be raised is the middlemost estimate, the number of votes that it is too high being exactly balanced by the number of votes that it is too low.

In other words, the median is resistant to outliers!
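A tiny numerical illustration of Galton's point follows. The guesses are invented for this example (they are not Galton's data); a single "crank" estimate is added to show how differently the mean and the median respond.

```python
from statistics import mean, median

# Hypothetical weight guesses in pounds, made up for illustration.
guesses = [1150, 1180, 1195, 1200, 1210, 1230, 1250]
with_crank = guesses + [12000]        # one wildly high estimate

print("mean without crank:  ", round(mean(guesses), 1))
print("median without crank:", median(guesses))
print("mean with crank:     ", round(mean(with_crank), 1))
print("median with crank:   ", median(with_crank))
```

The single extreme guess more than doubles the mean but moves the median by only a few pounds.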

Tooth fairy science

Paul Alper sent a link to a slideshare presentation entitled The Psychoneuroimmunology of cancer: Tooth fairy science. On slide #5 we read:

Tooth Fairy Science seeks explanations for things before establishing that those things actually exist. Tooth Fairy Scientists mistakenly think that if they have collected data that is consistent with their hypothesis, then they have collected data that confirms their hypothesis.

Paul found additional background on this idea at the Skeptic's Dictionary website, which gives the following example:

You could measure how much money the Tooth Fairy leaves under the pillow, whether she leaves more cash for the first or last tooth, whether the payoff is greater if you leave the tooth in a plastic baggie versus wrapped in Kleenex. You can get all kinds of good data that is reproducible and statistically significant. Yes, you have learned something. But you haven’t learned what you think you’ve learned, because you haven’t bothered to establish whether the Tooth Fairy really exists.

The term is attributed to Harriet Hall, a retired doctor who writes the Skepdoc blog.