Chance News 79: Difference between revisions

From ChanceWiki
 
(77 intermediate revisions by 2 users not shown)
Line 1: Line 1:
November 3, 2011 to December 22, 2011
==Quotations==
"...risk, essentially, is measurable whereas uncertainty is not measurable.
Line 4: Line 5:
"In Mr. Cain’s case, I think we are dealing with an instance where there is considerable ''uncertainty''."


<div align=right>--Nate Silver, writing in [http://fivethirtyeight.blogs.nytimes.com/2011/10/27/herman-cain-outlier/#more-18209 Herman Cain, outlier]<br>
FiveThirtyEight blog, ''New York Times'', 27 October 2011
</div>


Line 16: Line 18:
"Another fundamental error: when you have such little data, you should almost never throw any of it out, and you should be especially wary of doing so when it happens to contradict your hypothesis."


<div align=right>--Nate Silver, writing in [http://fivethirtyeight.blogs.nytimes.com/2011/10/27/herman-cain-and-the-hubris-of-experts/?hp Herman Cain and the Hubris of Experts]<br>
FiveThirtyEight blog, <i>The New York Times</i>, 27 October 2011</div>


Submitted by Margaret Cibes


==Forsooth==
“The most important statistics in football are wins and losses and whether or not a team can outscore his opponent.”


<div align = right>Mike Leach, in [http://www.sportsfordorks.com/ ''Sports for Dorks:  College Football''] (p. 14)</div>
The book is excerpted in the NYT College Sports Blog [http://thequad.blogs.nytimes.com/2011/12/01/inside-the-mind-of-mike-leach/ 1 December] and [http://thequad.blogs.nytimes.com/2011/12/02/sports-for-dorks-and-missing-ingredient-in-football/?scp=1&sq=statistical&st=cse 2 December]
Submitted by Bill Peterson
-----
“I think we’re in trouble.  ….  Look at the difference between the top 1 percent and the bottom 95.” 
<div align=right>Republican presidential primary candidate Buddy Roemer
on the Occupy Wall Street “99%” issue<br>
in an interview with Rachel Maddow, November 28, 2011 [http://www.youtube.com/watch?v=kLHvSG0hcFM]</div>
Submitted by Margaret Cibes
----
In [http://www.nytimes.com/2011/12/18/us/reframing-the-debate-over-using-phones-while-driving.html?pagewanted=1&sq=risk%20analysis&st=cse&scp=1 Reframing the debate over using phones behind the wheel] (''New York Times'', 17 December 2011), we read,
"Part of the lure of smartphones...is that they randomly dispense valuable information. People do not know when an urgent or interesting e-mail or text will come in, so they feel compelled to check all the time."  The following sidebar appears in the online version of the article:
http://community.middlebury.edu/~wpeterso/Chance_News/images/CN79_twitter.png
So in case anyone is looking for distractions...
Submitted by Bill Peterson


==Fraud may just be the tip of the iceberg==


[http://www.nytimes.com/2011/11/03/health/research/noted-dutch-psychologist-stapel-accused-of-research-fraud.html Fraud Case Seen as a Red Flag for Psychology Research] by Benedict Carey, ''New York Times'', November 2, 2011.


A recently revealed case about fraud may point to a much larger problem.  
Line 54: Line 81:


This is perhaps doubly ironic, in that the psychologists have been caught making psychological errors.
For more on all this see [http://chronicle.com/article/As-Dutch-Research-Scandal/129746/ Fraud scandal fuels debate over practices of social psychology]
by Christopher Shea, ''Chronicle of Higher Education'', 13 November 2011


Submitted by Paul Alper
Line 76: Line 106:
Many people will remember that the infamous Monty Hall problem first gained national attention after appearing in an "Ask Marilyn" [http://www.marilynvossavant.com/articles/gameshow.html column in 1990].  (More nostalgia:  In the [http://www.dartmouth.edu/~chance/chance_news/recent_news/chance_news_1.01.html inaugural issue of Chance News], Laurie Snell described a [http://www.condenaststore.com/-sp/In-your-case-Dave-there-s-a-choice-elective-surgery-outpatient-medicin-Prints_i8542829_.htm New Yorker cartoon] inspired by Monty's game show, Let's Make a Deal).  One important lesson from that discussion was that the host's behavior mattered, and that the problem was not well-defined without a model for how he chooses a door to open.   


In that spirit, it might be relevant to consider how the strings of "rolls" are produced.  Marilyn's answer suggests that she was already planning to write down a string of twenty 1s along with her string of twenty actual rolls.  If you know that is going to happen, then even before the roll has occurred, you might be prepared to guess in advance that the real string will not be the one consisting of twenty 1s.
 
Unrelated to Marilyn's column, this theme came up in a [http://dilbert.com/strips/comic/2001-10-25/ Dilbert cartoon on random number generators].


'''Discussion'''<br>
#The other lesson from the Monty Hall discussion was not to jump to the conclusion that Marilyn is wrong.  So what do you think she had in mind when she wrote "because the roll has already occurred..."?
#I've just rolled a die twenty times (OK, I used R to simulate 20 rolls).  Which of the following do you think it is:  (i) 14152653532346264333;  or (ii) 61655214235336553132?  Does your answer change if someone points out that (i) consists of the digits of pi after the decimal point, skipping the 0s, 7s, 8s, and 9s?
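
The claim about string (i) in the second question is easy to check. Here is a minimal Python sketch; the hard-coded digits of pi and the helper name <code>die_faces_from_pi</code> are our own illustrative choices and are not part of the original discussion.

```python
# First 50 decimal digits of pi (after the "3."), hard-coded for illustration.
PI_DECIMALS = "14159265358979323846264338327950288419716939937510"


def die_faces_from_pi(count, digits=PI_DECIMALS):
    """Return the first `count` digits of pi that could be die faces (1-6)."""
    faces = [d for d in digits if d in "123456"]  # drop the 0s, 7s, 8s and 9s
    if len(faces) < count:
        raise ValueError("not enough digits supplied")
    return "".join(faces[:count])


# Compare the printed string with option (i) in the discussion question.
print(die_faces_from_pi(20))
```

Under a fair-die model, any particular string of twenty faces has the same probability, (1/6)^20. What changes once the pattern is noticed is how plausible it seems that the string came from a die at all.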


Submitted by Bill Peterson
===Comment===
Paul Alper wrote to point out an analogy with a famous classroom experiment, in which the instructor leaves the room while students compile lists of 200 "tosses" of a fair coin.  Half the students toss a real coin, while the other half produce a string of imagined tosses.  Upon return, the teacher classifies the strings as real or fake, depending on the length of the longest run.  The imagined strings typically will not include long runs, but with probability 0.965 a real string of 200 tosses will contain a run of at least six consecutive heads or six consecutive tails (see discussion in [http://www.dartmouth.edu/~chance/chance_news/recent_news/chance_news_7.07.html#Benford's%20law archives of the Chance Newsletter]; this activity is also described in the chapter "Streaky Behavior" in Scheaffer et al., ''Activity Based Statistics'').
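
The 0.965 figure can be checked with a short calculation. The sketch below is one possible approach (not taken from the sources cited above): it tracks the length of the current run toss by toss and accumulates the probability that no run of the target length has appeared yet. The function name <code>prob_run_at_least</code> is our own.

```python
def prob_run_at_least(n_tosses, run_length):
    """Probability that n_tosses fair-coin flips contain a run of at least
    run_length identical outcomes (heads or tails)."""
    if run_length <= 1:
        return 1.0 if n_tosses >= 1 else 0.0
    if n_tosses < run_length:
        return 0.0
    # state[j] = P(no qualifying run so far, current run has length j + 1)
    state = [0.0] * (run_length - 1)
    state[0] = 1.0  # after the first toss the current run has length 1
    for _ in range(n_tosses - 1):
        new_state = [0.0] * (run_length - 1)
        new_state[0] = 0.5 * sum(state)        # next toss differs: run restarts
        for j in range(run_length - 2):
            new_state[j + 1] = 0.5 * state[j]  # next toss matches: run grows
        # A match from the longest tracked state completes a run and is dropped.
        state = new_state
    return 1.0 - sum(state)


# Should land close to the 0.965 quoted in the text.
print(round(prob_run_at_least(200, 6), 3))
```

Changing the second argument shows how quickly the probability falls off for longer runs, which is what makes the longest run a useful classifier in the classroom activity.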
==Grading health news reports==
Gary Schwitzer's invaluable website [http://www.healthnewsreview.org/ HealthNewsReviews.org] provides weekly reviews of news stories from the health field.  (The project is sponsored by the [http://informedmedicaldecisions.org/ Foundation for Informed Medical Decision Making].)
A recent story on the site gives the following overall summary of performance of the news media in providing accurate coverage. Schwitzer writes
"After 5 years and 7 months, and after reviewing 1,648 stories and publishing nearly 1,300 blog posts, we've revised the site (for the second time)."
Below is how these 1,648 stories fared on his rating system:
http://ih.constantcontact.com/fs055/1102072836906/img/117.png
The stories rated above come from 20 news organizations, including newspapers, magazines and web sources. When it comes to TV presentations of medical results, however, HealthNewsReviews.org has thrown in the towel and won't be reviewing them because, "After 3.5 years and 228 network TV health segments reviewed, we can make the data-driven statement that many of the stories are bad and they’re not getting much better."
Submitted by Paul Alper
==The goal of reproducibility==
[http://online.wsj.com/article/SB10001424052970203764804577059841672541590.html Scientists' elusive goal: Reproducing study results]<br>
by Gautam Naik, ''Wall Street Journal'', 2 December 2011
The ''WSJ'' says "This is one of medicine's dirty secrets: Most results, including those that appear in top-flight peer-reviewed journals, can't be reproduced."  The article includes the following graphic summarizing the (largely unsuccessful) attempts by Bayer to reproduce published findings.
<center> http://si.wsj.net/public/resources/images/P1-BD631_REPROD_NS_20111201165702.jpg</center>
The article goes on to discuss various reasons for this state of affairs, including pressure on researchers to publish, the increasing complexity of medical experiments, and the well-known bias of journals toward publishing only positive results.  Some of these issues were discussed in [http://www.causeweb.org/wiki/chance/index.php/Chance_News_5:_Sept_1_to_Sept_30_05#Just_how_reliable_are_scientific_papers.3F CN 5], which focuses on John Ioannidis's 2005 article in ''PLoS Medicine'', [http://www.plosmedicine.org/article/info:doi/10.1371/journal.pmed.0020124 Why most published research findings are false].
The more reliable popular media have finally been convinced to carry pretty accurate statements about interpreting confidence intervals in polling results.  Maybe the media - and even science journals themselves - need to be encouraged to carry a cigarette-like warning about study results:  "Caution:  Since science is an inductive process whose conclusions depend upon strong evidence that is reproducible, readers should take into account that any conclusions are preliminary, and should not be acted upon until further experiments have reinforced them."
Submitted by Margaret Cibes
==QL in the Media Contest finalists==
In [http://www.causeweb.org/wiki/chance/index.php/Chance_News_77#QL_in_the_Media_Contest CN 77], Margaret Cibes noted that the MAA SIGMAA on Quantitative Literacy was running a contest for best and worst examples of QL in the media.  They have posted the entries [http://sigmaa.maa.org/ql/contest.php here], where viewers are invited to cast their votes.
Suggested by Priscilla Bremser
==Dilbert: Wally as "lurking variable"==
See the cartoon [http://dilbert.com/strips/comic/2011-11-28/ here].
==Teaching stats with sports==
[http://www.nytimes.com/2011/11/06/education/edlife/at-moneyball-u-what-are-the-odds.html?_r=1&scp=6&sq=probabilities&st=cse At Moneyball U, what are the odds?]<br>
by Alan Schwarz, ''New York Times'', 4 November 2011
The title is a reference to the movie "Moneyball," which features the role of probability and statistics in the world of baseball.  Nevertheless, the story leads with the comment that "Watching a baseball telecast may not be the best way to learn basic probability."  Indeed, when a good hitter has gone hitless in his last 10 at-bats, announcers can't resist saying that he is now "due for a hit."  The appeal is to a mythical Law of Averages that will even things out in the short run, but of course the real Law of Large Numbers promises no such thing. Similarly, fans will offer a variety of explanations for the much-discussed [http://www.knowledgerush.com/kr/encyclopedia/Sophomore_jinx/ sophomore jinx], a phenomenon which can generally be accounted for by regression to the mean.
Given that many students can be readily engaged by such conversations, college courses have appeared that teach statistical concepts in the context of sports.  The article describes such offerings at Stanford, Ohio State, Bowling Green, Louisiana Tech, and James Madison University, among others.  It is a mistake, however, to assume that all students are sports fans.  The article relates an anecdote from the James Madison course.  When the professor asked if we should be surprised that the last 14 opening coin tosses at the Super Bowl have all been won by the N.F.C., one student asked "What's the N.F.C.?"  Fortunately, this student still seemed to appreciate the lighter atmosphere of the class.  Or, as Prof. Jim Cochran of Louisiana Tech observes, "You want them to demand data, to look for evidence, to test hypotheses. You have students who would otherwise not come close to this discipline. It’s very valuable."
The article catalogs a variety of probability and statistics techniques that are readily illustrated with sports data.  We were intrigued by a reference to Simpson's paradox, which "helps explain why the Cincinnati Reds, despite having the best record in the National League during the strike-split 1981 season, didn’t make the playoffs."  Details can be found in Prof. Cochran's article [http://archive.ite.journal.informs.org/Vol5No1/Cochran/ Bowie Kuhn's worst nightmare] (''INFORMS Transactions on Education'', Vol 5, No 1, Sept 2004).  This is a sophisticated discussion which applies integer programming to explore the possibility that aggregation paradoxes will arise. As Cochran states in the abstract, "although the case deals with a baseball-related problem, it is relatively self-contained and requires no understanding of how baseball is played, so students who are unfamiliar with the sport will not be seriously disadvantaged."
'''Notes'''<br> 
#On the topic of Simpson's paradox, Tom Moore's 2006 article [http://www.amstat.org/publications/jse/v14n1/datasets.moore.html Paradoxes in Film Ratings] (''Journal of Statistics Education'' Vol 14, No 1) presents another interesting illustration, and also includes a [http://www.amstat.org/publications/jse/v14n1/datasets.moore.html#Table3 table] with references to favorite examples. 
#In an interview for a 2007 NPR story, [http://www.npr.org/templates/story/story.php?storyId=7320273 Why people probably don't understand probability], Andrew Gelman discussed the Super Bowl coin toss data (the streak then stood at 10).  He also described the classroom coin-tossing activity referenced [http://www.causeweb.org/wiki/chance/index.php/Chance_News_79#Comment above].
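
As a rough guide to the "should we be surprised?" question about the Super Bowl coin toss, the following Python sketch works out the arithmetic for one fixed window of 14 tosses, assuming fair and independent tosses. The variable names are ours. The chance of seeing such a streak somewhere in a longer history of games is larger than either figure, which is part of what makes the question a good one for class.

```python
from fractions import Fraction

STREAK = 14  # length of the N.F.C. streak mentioned in the article

# One named conference wins 14 specified fair tosses in a row.
p_named = Fraction(1, 2) ** STREAK

# Either conference does so: the first toss is free, the next 13 must match it.
p_either = Fraction(1, 2) ** (STREAK - 1)

print(p_named, p_either)
```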
Submitted by Bill Peterson
==Probabilist for president?==
[http://online.wsj.com/article/SB10001424052970204262304577068080651797906.html?KEYWORDS=jeff+lawman  “A Pony for Every American?  New Hampshire Primary Has It All”]<br>
<i>The Wall Street Journal</i>, December 6, 2011<br>
A New Hampshire mathematician and Republican presidential primary candidate stated:
<blockquote>“I will accept any top-tier candidate's neutrally administered aptitude challenge that assesses the mental, physical and ethical qualities of leadership ….  No other candidate comes close to my structured problem-solving abilities and demonstrated proficiency in probabilistic risk assessment.”  ….  <br>
“[I will] balance the federal budget through a mathematically superior tax platform that combines personal income, flat taxes, progressive taxes and capital gains into one elegant solution that no other candidate has formulated or is capable of generating.”</blockquote>
Submitted by Margaret Cibes
==Queues==
[http://online.wsj.com/article/SB10001424052970204770404577082933921432686.html?KEYWORDS=ray+a+smith#project%3DLINES120811%26articleTabs%3Darticle “Find the Best Checkout Line”]<br>
by Ray A. Smith, <i>The Wall Street Journal</i>, December 8, 2011<br>
This article discusses queuing issues and several recent studies about them.  It includes a 6-minute [http://online.wsj.com/article/SB10001424052970204770404577082933921432686.html?KEYWORDS=ray+a+smith#project%3DLINES120811%26articleTabs%3Dvideo video] about wait time perceptions and their effects on customers, and a [http://online.wsj.com/article/SB10001424052970204770404577082933921432686.html?KEYWORDS=ray+a+smith#project%3DLINES120811%26articleTabs%3Dinteractive graphic] showing a formula for expected wait time:  “Average wait time = average number of people in line divided by their arrival rate.”  (Operations researchers will recognize this as an illustration of [http://orforum.blog.informs.org/2011/07/12/littles-law-as-viewed-on-its-50th-anniversary/ Little's Law].)
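
The relationship quoted from the graphic can be illustrated with a small simulation. The Python sketch below is our own and is not from the article: it simulates a single first-come, first-served checkout with exponential arrival and service times (both assumptions, as are the parameter values), and it measures the average number of customers in the system and the average time in the system, service included, rather than only the time spent in line. Over a run that starts and ends with an empty system, the average number present equals the arrival rate times the average time spent.

```python
import random


def simulate_queue(n_customers=2000, arrival_rate=1.0, service_rate=1.25, seed=0):
    """Simulate a single-server FIFO queue; return (L, lam, W) where
    L = time-average number in system, lam = observed arrival rate,
    W = average time each customer spends in the system."""
    rng = random.Random(seed)
    clock = 0.0
    server_free_at = 0.0
    arrivals, departures = [], []
    for _ in range(n_customers):
        clock += rng.expovariate(arrival_rate)       # next arrival time
        start = max(clock, server_free_at)            # wait if server is busy
        server_free_at = start + rng.expovariate(service_rate)
        arrivals.append(clock)
        departures.append(server_free_at)
    horizon = departures[-1]  # FIFO: the last customer leaves last

    # Area under the number-in-system curve, found by sweeping the events.
    events = sorted([(t, +1) for t in arrivals] + [(t, -1) for t in departures])
    area, in_system, last_time = 0.0, 0, 0.0
    for time, change in events:
        area += in_system * (time - last_time)
        in_system += change
        last_time = time

    L = area / horizon
    lam = n_customers / horizon
    W = sum(d - a for a, d in zip(arrivals, departures)) / n_customers
    return L, lam, W


L, lam, W = simulate_queue()
print(L / lam, W)  # the two values should agree up to rounding error
```

Rearranged, this is the statement in the graphic: average time equals average number present divided by the arrival rate.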
Submitted by Margaret Cibes
==Population pyramids==
[http://www.census.gov/population/international/data/idb/informationGateway.php “International Data Base”]<br>
Chance readers may be interested in population pyramids.  They are nice examples of distributions, as well as opportunities for comparisons of age and gender characteristics within one country, or in the same country over different years, or across different countries.  For the U.S., they illustrate clearly what Atul Gawande calls “the ‘rectangularization’ of survival”:
<blockquote>Throughout most of human history, a society's population formed a sort of pyramid: young children represented the largest portion – the base – and each successively older cohort represented a smaller and smaller group.  In 1950, children under the age of five were eleven per cent of the U.S. population, adults aged forty-five to forty-nine were six per cent, and those over eighty were one percent.  Today [2007], we have as many fifty-year olds as five-year-olds.  In thirty years, there will be as many people over eighty as there are under five.</blockquote>
See Gawande’s article, [http://www.newyorker.com/reporting/2007/04/30/070430fa_fact_gawande “The Way We Age Now”], from <i>The New Yorker</i>, April 30, 2007.<br>
To view population pyramids, go to the U.S. Census Bureau's "International Data Base" website [http://www.census.gov/population/international/data/idb/informationGateway.php], select a country and year (1950-2050), and choose the tab “Population Pyramids.”<br>
Submitted by Margaret Cibes
'''Note''':  Instructions for creating your own pyramid plots in SAS or R are provided in a post by Nick Horton at the [http://sas-and-r.blogspot.com/2010/08/example-83-pyramid-plots.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+SASandR+%28SAS+and+R%29 SAS and R Blogspot].
==Modeling the financial world: some caveats==
[http://online.wsj.com/article/SB10001424052970203430404577094760894401548.html?KEYWORDS=burton+g+malkiel “Physics Envy”]<br> 
Book review of <i>Models Behaving Badly</i>, in <i>The Wall Street Journal</i>, December 14, 2011
Emanuel Derman, author of <i>Models Behaving Badly</i>, is a Columbia professor who was trained as a physicist and later worked at Goldman Sachs.  He writes that when people try to create financial models that involve human behavior, they “are trying to force the ugly stepsister's foot into Cinderella's pretty glass slipper”:
<blockquote>Although financial models employ the mathematics and style of physics, they are fundamentally different from the models that science produces. Physical models can provide an accurate description of reality. Financial models, despite their mathematical sophistication, can at best provide a vast oversimplification of reality.</blockquote>
Derman also has a 2009 blog post, [http://www.wilmott.com/blogs/eman/index.cfm/2009/1/8/The-Financial-Modelers-Manifesto “The Financial Modelers’ Manifesto”], which includes the following:<br>
“I will remember that I didn't make the world, and it doesn't satisfy my equations.”<br>
“Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.”<br>
“I will never sacrifice reality for elegance without explaining why I have done so.”<br>
“Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.”<br>
“I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.”<br>
Derman also states in his book, “[I]n physics you're playing against God, and He doesn't change His laws very often. In finance, you're playing against God's creatures."
Submitted by Margaret Cibes
==Graphic on campaign contributions==
[http://campaignstops.blogs.nytimes.com/2011/12/19/deep-pockets-deeply-political/?hp Deep pockets, deeply political]<br>
by Charles Blow, ''New York Times'', 19 December 2011
Blow describes a recent report [http://sunlightfoundation.com/blog/2011/12/13/the-political-one-percent-of-the-one-percent/ The political one percent of the one percent], by the Sunlight Foundation, an organization dedicated to transparency in government.  Blow has developed the following graphic summarizing the trends in campaign contributions by those in the top one-percent-of-one-percent of the income distribution.
http://graphics8.nytimes.com/images/2011/12/19/opinion/cs-blow-donors/cs-blow-donors-blog533.jpg
Submitted by Paul Alper

Latest revision as of 21:21, 26 April 2013

November 3, 2011 to December 22, 2011

Quotations

"...risk, essentially, is measurable whereas uncertainty is not measurable.

"In Mr. Cain’s case, I think we are dealing with an instance where there is considerable uncertainty."

--Nate Silver, writing in Herman Cain, outlier

FiveThirtyEight blog, New York Times, 27 October 2011

Submitted by Paul Alper


"Experts have a poor understanding of uncertainty. Usually, this manifests itself in the form of overconfidence: experts underestimate the likelihood that their predictions might be wrong. …. [E]xperts who use terms like “never” and “certain” too often are playing Russian roulette with their reputations."

"I used to be annoyed when the margin of error was high in a forecasting model that I might put together. Now I view it as perhaps the single most important piece of information that a forecaster provides. When we publish a forecast on FiveThirtyEight, I go to great lengths to document the uncertainty attached to it, even if the uncertainty is sufficiently large that the forecast won’t make for punchy headlines."

"Another fundamental error: when you have such little data, you should almost never throw any of it out, and you should be especially wary of doing so when it happens to contradict your hypothesis."

--Nate Silver, writing in Herman Cain and the Hubris of Experts
FiveThirtyEight blog, The New York Times, 27 October 2011

Submitted by Margaret Cibes

Forsooth

“The most important statistics in football are wins and losses and whether or not a team can outscore his opponent.”

Mike Leach, in Sports for Dorks: College Football (p. 14)

The book is excerpted in the NYT College Sports Blog 1 December and 2 December

Submitted by Bill Peterson


“I think we’re in trouble. …. Look at the difference between the top 1 percent and the bottom 95.”

Republican presidential primary candidate Buddy Roemer

on the Occupy Wall Street “99%” issue

in an interview with Rachel Maddow, November 28, 2011 [1]

Submitted by Margaret Cibes


In Reframing the debate over using phones behind the wheel (New York Times, 17 December 2011), we read, "Part of the lure of smartphones...is that they randomly dispense valuable information. People do not know when an urgent or interesting e-mail or text will come in, so they feel compelled to check all the time." The following sidebar appears in the online version of the article:

http://community.middlebury.edu/~wpeterso/Chance_News/images/CN79_twitter.png

So in case anyone is looking for distractions...

Submitted by Bill Peterson

Fraud may just be the tip of the iceberg

Fraud Case Seen as a Red Flag for Psychology Research by Benedict Carey, New York Times, November 2, 2011.

A recently revealed case about fraud may point to a much larger problem.

A well-known psychologist in the Netherlands whose work has been published widely in professional journals falsified data and made up entire experiments, an investigating committee has found. Experts say the case exposes deep flaws in the way science is done in a field, psychology, that has only recently earned a fragile respectability.

The psychologist accused of fraud took advantage of some common practices in the field.

Dr. Stapel was able to operate for so long, the committee said, in large measure because he was “lord of the data,” the only person who saw the experimental evidence that had been gathered (or fabricated). This is a widespread problem in psychology, said Jelte M. Wicherts, a psychologist at the University of Amsterdam. In a recent survey, two-thirds of Dutch research psychologists said they did not make their raw data available for other researchers to see. “This is in violation of ethical rules established in the field,” Dr. Wicherts said.

The field also appears to be rather careless about its statistical analyses.

In an analysis published this year, Dr. Wicherts and Marjan Bakker, also at the University of Amsterdam, searched a random sample of 281 psychology papers for statistical errors. They found that about half of the papers in high-end journals contained some statistical error, and that about 15 percent of all papers had at least one error that changed a reported finding — almost always in opposition to the authors’ hypothesis.

This is not a surprise to psychologists.

Researchers in psychology are certainly aware of the issue. In recent years, some have mocked studies showing correlations between activity on brain images and personality measures as “voodoo” science, and a controversy over statistics erupted in January after The Journal of Personality and Social Psychology accepted a paper purporting to show evidence of extrasensory perception. In cases like these, the authors being challenged are often reluctant to share their raw data. But an analysis of 49 studies appearing Wednesday in the journal PLoS One, by Dr. Wicherts, Dr. Bakker and Dylan Molenaar [available here], found that the more reluctant that scientists were to share their data, the more likely that evidence contradicted their reported findings.


Submitted by Steve Simon

Remark

Andrew Gelman's blog has often considered questions of cheating in science. The following quote from E. J. Wagenmakers, a Dutch professor at Amsterdam University, appeared in a post from September 9 of this year:

Diederik Stapel was not just a productive researcher, but he also made appearances on Dutch TV shows. The scandal is all over the Dutch news. Oh, one of the courses he taught was on something like 'Ethical behavior in research', and one of his papers is about how power corrupts. It doesn’t get much more ironic than this. I should stress that the extent of the fraud is still unclear.

This is perhaps doubly ironic, in that the psychologists have been caught making psychological errors.

For more on all this see Fraud scandal fuels debate over practices of social psychology by Christopher Shea, Chronicle of Higher Education, 13 November 2011

Submitted by Paul Alper

Another Remark

"Much of Prof. Stapel's work made it into newspapers in no small part because he delivered scientific evidence for contentions journalists wanted to believe …..”

Eric Felten reporting[2] in The Wall Street Journal, November 4, 2011

Other stories include “Diederik Stapel; The Lying Dutchman”, in The Washington Post and “Massive Fraud Uncovered in Work by Social Psychologist”, the latter an article reprinted in the Scientific American, with permission from Nature. Both articles are dated November 1, 2011.

Submitted by Margaret Cibes

Marilyn tackles a dice problem

Ask Marilyn, by Marilyn vos Savant, Parade, 23 October 2011

It has been a while since we've reported on an "Ask Marilyn" story. In the Sunday column referenced above, a reader asks:

I’m a math instructor and I think you’re wrong about this question [originally from Marilyn's July 23 column]: “Say you plan to roll a die 20 times. Which result is more likely: (a) 11111111111111111111; or (b) 66234441536125563152?” You said they’re equally likely because both specify the number for each of the 20 tosses. I agree so far. However, you added, “But let’s say you rolled a die out of my view and then said the results were one of those series. Which is more likely? It’s (b) because the roll has already occurred. It was far more likely to have been that mix than a series of ones.” I disagree. Each of the results is equally likely—or unlikely. This is true even if you are not looking at the result.

Marilyn responds: "My answer was correct. To convince doubting readers, I have, in fact, rolled a die 20 times and noted the result, digit by digit. It was either: (a) 11111111111111111111; or (b) 63335643331622221214. Do you still believe that the two series are equally likely to be what I rolled?"

Many people will remember that the infamous Monty Hall problem first gained national attention after appearing in an "Ask Marilyn" column in 1990. (More nostalgia: In the inaugural issue of Chance News, Laurie Snell described a New Yorker cartoon inspired by Monty's game show, Let's Make a Deal). One important lesson from that discussion was that the host's behavior mattered, and that the problem was not well-defined without a model for how he chooses a door to open.

In that spirit, it might be relevant to consider how the strings of "rolls" are produced. Marilyn's answer suggests that she was already planning to write down a string of twenty 1s along with her string of twenty actual rolls. If you know that is going to happen, then even before the roll has occurred, you might be prepared to guess in advance that the real string will not be the one consisting of twenty 1s.

Unrelated to Marilyn's column, this theme came up in a Dilbert cartoon on random number generators.

Discussion

  1. The other lesson from the Monty Hall discussion was not to jump to the conclusion that Marilyn is wrong. So what do you think she had in mind when she wrote "because the roll has already occurred..."?
  2. I've just rolled a die twenty times (OK, I used R to simulate 20 rolls). Which of the following do you think it is: (i) 14152653532346264333; or (ii) 61655214235336553132? Does your answer change if someone points out that (i) consists of the digits of pi after the decimal point, skipping the 0s, 7s, 8s, and 9s?

Submitted by Bill Peterson

Comment

Paul Alper wrote to point out an analogy with a famous classroom experiment, in which the instructor leaves the room while students compile lists of 200 "tosses" of a fair coin. Half the students toss a real coin, while the other half produce a string of imagined tosses. Upon return, the teacher classifies the strings as real or fake, depending on the length of the longest run. The imagined strings typically will not include long runs, but with probability 0.965 a real string of 200 tosses will contain a run of at least six consecutive heads or six consecutive tails (see the discussion in the archives of the Chance Newsletter; this activity is also described in the chapter "Streaky Behavior" in Scheaffer et al., Activity Based Statistics).
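
The 0.965 figure can be checked with a short dynamic program. The sketch below is one possible way to do the calculation, not the method used in the cited sources; the function name is hypothetical. It tracks the length of the current run and discards any sequence whose run reaches the target length.

```python
def prob_run_at_least(n: int, run: int) -> float:
    """P(n fair-coin tosses contain a run of >= `run` heads or >= `run` tails)."""
    if n < 1:
        return 0.0
    if run <= 1:
        return 1.0
    # state[k] = P(no long run so far and the current run has length k + 1)
    state = [0.0] * (run - 1)
    state[0] = 1.0  # after the first toss the current run has length 1
    for _ in range(n - 1):
        new = [0.0] * (run - 1)
        new[0] = 0.5 * sum(state)        # toss differs: run restarts at 1
        for k in range(run - 2):
            new[k + 1] = 0.5 * state[k]  # toss matches: run grows by 1
        # a match from the longest allowed run would reach `run`,
        # so that probability mass is dropped from `state`
        state = new
    return 1.0 - sum(state)

if __name__ == "__main__":
    print(round(prob_run_at_least(200, 6), 3))
```

The same function can be used to see how quickly the probability falls for longer runs or shorter sequences, which is what makes the longest run a useful test for spotting imagined strings.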

Grading health news reports

Gary Schwitzer's invaluable website HealthNewsReviews.org provides weekly reviews of news stories from the health field. (The project is sponsored by the Foundation for Informed Medical Decision Making.)

A recent story on the site gives the following overall summary of performance of the news media in providing accurate coverage. Schwitzer writes "After 5 years and 7 months, and after reviewing 1,648 stories and publishing nearly 1,300 blog posts, we've revised the site (for the second time)." Below is how these 1,648 stories fared on his rating system:

http://ih.constantcontact.com/fs055/1102072836906/img/117.png

The stories rated above come from 20 news organizations, including newspapers, magazines and web sources. When it comes to TV presentations of medical results, however, HealthNewsReviews.org has thrown in the towel and won't be reviewing them because, "After 3.5 years and 228 network TV health segments reviewed, we can make the data-driven statement that many of the stories are bad and they’re not getting much better."

Submitted by Paul Alper

The goal of reproducibility

Scientists' elusive goal: Reproducing study results
by Gautam Naik, Wall Street Journal, 2 December 2011

The WSJ says "This is one of medicine's dirty secrets: Most results, including those that appear in top-flight peer-reviewed journals, can't be reproduced." The article includes the following graphic summarizing the (largely unsuccessful) attempts by Bayer to reproduce published findings.

http://si.wsj.net/public/resources/images/P1-BD631_REPROD_NS_20111201165702.jpg

The article goes on to discuss various reasons for this state of affairs, including pressure on researchers to publish, the increasing complexity of medical experiments, and the well-known bias of journals toward publishing only positive results. Some of these issues were discussed in CN 5, which focuses on John Ioannidis's 2005 article in PLoS Medicine, Why most published research findings are false.

The more reliable popular media have finally been convinced to carry pretty accurate statements about interpreting confidence intervals in polling results. Maybe the media - and even science journals themselves - need to be encouraged to carry a cigarette-like warning about study results: "Caution: Since science is an inductive process whose conclusions depend upon strong evidence that is reproducible, readers should take into account that any conclusions are preliminary, and should not be acted upon until further experiments have reinforced them."

Submitted by Margaret Cibes

QL in the Media Contest finalists

In CN 77, Margaret Cibes noted that the MAA SIGMAA on Quantitative Literacy was running a contest for best and worst examples of QL in the media. They have posted the entries here, where viewers are invited to cast their votes.

Suggested by Priscilla Bremser

Dilbert: Wally as "lurking variable"

See the cartoon here.

Teaching stats with sports

At Moneyball U, what are the odds?
by Alan Schwarz, New York Times, 4 November 2011

The title is a reference to the movie "Moneyball," which features the role of probability and statistics in the world of baseball. Nevertheless, the story leads with the comment that "Watching a baseball telecast may not be the best way to learn basic probability." Indeed, when a good hitter has gone hitless in his last 10 at-bats, announcers can't resist saying that he is now "due for a hit." The appeal is to a mythical Law of Averages that will even things out in the short run, but of course the real Law of Large Numbers promises no such thing. Similarly, fans will offer a variety of explanations for the much-discussed sophomore jinx, a phenomenon which can generally be accounted for by regression to the mean.
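
Regression to the mean is easy to see in a simulation. The sketch below rests on invented assumptions that are not from the article: each player has a fixed true batting ability drawn from a normal distribution (mean 0.265, standard deviation 0.02), gets 500 independent at-bats per season, and plays two seasons with no change in ability. All names and parameter values are hypothetical.

```python
import random

def simulate_sophomore_jinx(n_players=2000, at_bats=500,
                            top_fraction=0.1, seed=0):
    """Return (first-season, second-season) mean average of the top rookies."""
    rng = random.Random(seed)

    def season(ability):
        # batting average over one season of independent at-bats
        hits = sum(rng.random() < ability for _ in range(at_bats))
        return hits / at_bats

    players = []
    for _ in range(n_players):
        ability = rng.gauss(0.265, 0.02)  # assumed spread of true ability
        players.append((season(ability), season(ability)))

    # select the best performers of season one, then look at season two
    players.sort(key=lambda p: p[0], reverse=True)
    top = players[: max(1, int(n_players * top_fraction))]
    first = sum(p[0] for p in top) / len(top)
    second = sum(p[1] for p in top) / len(top)
    return first, second

if __name__ == "__main__":
    first, second = simulate_sophomore_jinx()
    print(f"top rookies: season 1 = {first:.3f}, season 2 = {second:.3f}")
```

Because the top group is chosen partly on luck, its second-season average drops toward the league mean even though no player's ability has changed.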

Given that many students can be readily engaged by such conversations, college courses have appeared that teach statistical concepts in the context of sports. The article describes such offerings at Stanford, Ohio State, Bowling Green, Louisiana Tech, and James Madison University, among others. It is a mistake, however, to assume that all students are sports fans. The article relates an anecdote from the James Madison course. When the professor asked if we should be surprised that the last 14 opening coin tosses at the Super Bowl have all been won by the N.F.C., one student asked "What's the N.F.C.?" Fortunately, this student still seemed to appreciate the lighter atmosphere of the class. Or, as Prof. Jim Cochran of Louisiana Tech observes, "You want them to demand data, to look for evidence, to test hypotheses. You have students who would otherwise not come close to this discipline. It’s very valuable."
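
The arithmetic behind the Super Bowl question is short enough to write out. This is a minimal sketch assuming fair, independent tosses; the function names are hypothetical.

```python
from fractions import Fraction

def prob_named_side_sweeps(n_tosses: int) -> Fraction:
    """P(a conference named in advance wins every one of n fair tosses)."""
    if n_tosses < 1:
        raise ValueError("n_tosses must be at least 1")
    return Fraction(1, 2) ** n_tosses

def prob_either_side_sweeps(n_tosses: int) -> Fraction:
    """P(the same conference, whichever it is, wins all n fair tosses)."""
    return 2 * prob_named_side_sweeps(n_tosses)

if __name__ == "__main__":
    print(prob_named_side_sweeps(14))   # 1/16384
    print(prob_either_side_sweeps(14))  # 1/8192
```

Whether the streak is surprising also depends on how the question was chosen: a streak noticed after the fact, among many sequences one might have examined, is less remarkable than either number suggests.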

The article catalogs a variety of probability and statistics techniques that are readily illustrated with sports data. We were intrigued by a reference to Simpson's paradox, which "helps explain why the Cincinnati Reds, despite having the best record in the National League during the strike-split 1981 season, didn’t make the playoffs." Details can be found in Prof. Cochran's article Bowie Kuhn's worst nightmare (INFORMS Transactions on Education, Vol 5, No 1, Sept 2004). This is a sophisticated discussion which applies integer programming to explore the possibility that aggregation paradoxes will arise. As Cochran states in the abstract, "although the case deals with a baseball-related problem, it is relatively self-contained and requires no understanding of how baseball is played, so students who are unfamiliar with the sport will not be seriously disadvantaged."
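
A split season makes this kind of aggregation effect easy to reproduce. The win-loss records below are invented for illustration; they are not taken from Cochran's article and have not been checked against the actual 1981 standings. Team A has the best overall record but finishes first in neither half.

```python
from fractions import Fraction

# Hypothetical (wins, losses) for each half of a split season
records = {
    "A": [(35, 21), (31, 21)],
    "B": [(36, 21), (27, 26)],
    "C": [(28, 29), (33, 20)],
}

def win_pct(wins: int, losses: int) -> Fraction:
    return Fraction(wins, wins + losses)

def half_winner(half: int) -> str:
    """Team with the best winning percentage in the given half (0 or 1)."""
    return max(records, key=lambda team: win_pct(*records[team][half]))

def overall_winner() -> str:
    """Team with the best winning percentage over both halves combined."""
    def overall(team: str) -> Fraction:
        wins = sum(w for w, _ in records[team])
        losses = sum(l for _, l in records[team])
        return win_pct(wins, losses)
    return max(records, key=overall)

if __name__ == "__main__":
    print("first half:", half_winner(0))
    print("second half:", half_winner(1))
    print("overall:", overall_winner())
```

If playoff berths go only to the winners of each half, team A is left out despite its superior combined record.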

Notes

  1. On the topic of Simpson's paradox, Tom Moore's 2006 article Paradoxes in Film Ratings (Journal of Statistics Education Vol 14, No 1) presents another interesting illustration, and also includes a table with references to favorite examples.
  2. In an interview for a 2007 NPR story, Why people probably don't understand probability, Andrew Gelman discussed the Super Bowl coin toss data (the streak then stood at 10). He also described the classroom coin-tossing activity referenced above.

Submitted by Bill Peterson

Probabilist for president?

“A Pony for Every American? New Hampshire Primary Has It All”
The Wall Street Journal, December 6, 2011

A New Hampshire mathematician and Republican presidential primary candidate stated:

”I will accept any top-tier candidate's neutrally administered aptitude challenge that assesses the mental, physical and ethical qualities of leadership …. No other candidate comes close to my structured problem-solving abilities and demonstrated proficiency in probabilistic risk assessment." ….
“[I will] balance the federal budget through a mathematically superior tax platform that combines personal income, flat taxes, progressive taxes and capital gains into one elegant solution that no other candidate has formulated or is capable of generating.”

Submitted by Margaret Cibes

Queues

“Find the Best Checkout Line”
by Ray A. Smith, The Wall Street Journal, December 8, 2011

This article discusses queuing issues and several recent studies about them. It includes a 6-minute video about wait time perceptions and their effects on customers, and a graphic showing a formula for expected wait time: “Average wait time = average number of people in line divided by their arrival rate.” (Operations researchers will recognize this as an illustration of Little's Law.)
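
The formula in the graphic is a one-line calculation. The sketch below assumes a stable queue in which the long-run averages exist; the function name and the example numbers are hypothetical.

```python
def average_wait(avg_people_in_line: float, arrival_rate: float) -> float:
    """Little's Law rearranged: W = L / lambda.

    avg_people_in_line: long-run average number of people in line (L)
    arrival_rate: average arrivals per unit time (lambda)
    Returns the average wait in the same time unit as the arrival rate.
    """
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return avg_people_in_line / arrival_rate

if __name__ == "__main__":
    # e.g. 6 people in line on average, 2 arrivals per minute
    print(average_wait(6, 2), "minutes")
```

The relationship involves only averages, so it does not depend on the particular distribution of arrival or service times.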

Submitted by Margaret Cibes

Population pyramids

“International Data Base”

Chance readers may be interested in population pyramids. They are nice examples of distributions, as well as opportunities for comparisons of age and gender characteristics within one country, or in the same country over different years, or across different countries. For the U.S., they illustrate clearly what Atul Gawande calls “the ‘rectangularization’ of survival”:

Throughout most of human history, a society's population formed a sort of pyramid: young children represented the largest portion – the base – and each successively older cohort represented a smaller and smaller group. In 1950, children under the age of five were eleven per cent of the U.S. population, adults aged forty-five to forty-nine were six per cent, and those over eighty were one percent. Today [2007], we have as many fifty-year olds as five-year-olds. In thirty years, there will be as many people over eighty as there are under five.

See Gawande’s article, “The Way We Age Now”, from The New Yorker, April 30, 2007.

To view population pyramids, go to the U.S. Census Bureau's "International Data Base" website, select a country and year (1950-2050), and choose the tab “Population Pyramids.”

Submitted by Margaret Cibes

Note: Instructions for creating your own pyramid plots in SAS or R are provided in a post by Nick Horton at the SAS and R Blogspot.

Modeling the financial world: some caveats

“Physics Envy”

Book review of Models Behaving Badly, in The Wall Street Journal, December 14, 2011

Emanuel Derman, author of Models Behaving Badly, is a Columbia professor who was trained as a physicist and later worked at Goldman Sachs. He writes that when people try to create financial models that involve human behavior, they “are trying to force the ugly stepsister's foot into Cinderella's pretty glass slipper”:

Although financial models employ the mathematics and style of physics, they are fundamentally different from the models that science produces. Physical models can provide an accurate description of reality. Financial models, despite their mathematical sophistication, can at best provide a vast oversimplification of reality.

Derman also posted “The Financial Modelers’ Manifesto” online in 2009; it includes:
“I will remember that I didn't make the world, and it doesn't satisfy my equations.”
“Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.”
“I will never sacrifice reality for elegance without explaining why I have done so.”
“Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.”
“I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.”

Derman also states in his book, “[I]n physics you're playing against God, and He doesn't change His laws very often. In finance, you're playing against God's creatures."

Submitted by Margaret Cibes

Graphic on campaign contributions

Deep pockets, deeply political
by Charles Blow, New York Times, 19 December 2011

Blow describes a recent report The political one percent of the one percent, by the Sunlight Foundation, an organization dedicated to transparency in government. Blow has developed the following graphic summarizing the trends in campaign contributions by those in the top one-percent-of-one-percent of the income distribution.

http://graphics8.nytimes.com/images/2011/12/19/opinion/cs-blow-donors/cs-blow-donors-blog533.jpg

Submitted by Paul Alper