Chance News 77
September 6, 2011 to October 2, 2011
"Killing the publication [Statistical Abstract of the United States] for the sake of a tiny saving would be a truly gratuitous step toward a dumbed-down country."
Submitted by Bill Peterson
"Incidentally, every criminal hacker I've ever heard of has been a male. I don't know if this is significant, but it's intriguing."
Submitted by Paul Alper
"The Rays have won 21 games in a row when scoring five runs or more."
This may be true, but it is quite a statistic nonetheless! (In a related vein, see Seven statistical cliches used by baseball announcers from Chance News #1.)
Submitted by Peter Doyle
“Although 'No More Deaths' [a humanitarian organization] conducted thousands of interviews, in places like Nogales they could only speak with a fraction of the migrants who crossed. This raised the issue of how representative their sample was, said Katerina Sinclair, a statistical consultant on the report.
"But in Naco, a smaller town, they were able to speak with enough migrants to have a representative sample."
Submitted by Emil Posavac
In Piece of falling satellite could hit you -- but it's not likely (Detroit Free Press, 11 September 2011) we read
Statistically, there's a 1 in 3,200 chance someone could be hit by one of the 26 objects...expected to crash into Earth's surface [in September].
The headline writer for the print edition of the newspaper created a Forsooth by rewriting this statistic in the secondary headline, beneath "NASA warns: Look out below!", as
There's a 1 in 3,200 chance a piece of satellite will hit you.
Submitted by Jerry Grossman
The Burlington [VT] Free Press, 13 September 2011, p B1, ran a story about hurricane damage with the headline:
"Irene's bill in Vermont could top half-billion dollars"
In the print edition, the story continued on a p. B3, with the secondary headline:
"COST: Irene repairs could top 500,000 dollars"
Submitted by Priscilla Bremser
Phil Everson recently reminded ISOSTAT readers of a pretty funny animated video (3.5 minutes) about a biostatistician showing up for a last-minute consultation with a statistician:
See it here.
Submitted by Margaret Cibes
What is the payoff for high tech education?
Classroom of Future, Stagnant Scores, by Matt Richtel, New York Times, September 3, 2011
Technology has changed how we teach our children.
Amy Furman, a seventh-grade English teacher here, roams among 31 students sitting at their desks or in clumps on the floor. They’re studying Shakespeare’s “As You Like It” — but not in any traditional way. In this technology-centric classroom, students are bent over laptops, some blogging or building Facebook pages from the perspective of Shakespeare’s characters. One student compiles a song list from the Internet, picking a tune by the rapper Kanye West to express the emotions of Shakespeare’s lovelorn Silvius.
The class, and the Kyrene School District as a whole, offer what some see as a utopian vision of education’s future. Classrooms are decked out with laptops, big interactive screens and software that drills students on every basic subject. Under a ballot initiative approved in 2005, the district has invested roughly $33 million in such technologies.
These technology upgrades are being implemented in many other school districts. The problem is that all this investment in technology does not appear to have any payoff.
Since 2005, scores in reading and math have stagnated in Kyrene, even as statewide scores have risen. To be sure, test scores can go up or down for many reasons. But to many education experts, something is not adding up — here and across the country. In a nutshell: schools are spending billions on technology, even as they cut budgets and lay off teachers, with little proof that this approach is improving basic learning.
The backers of this new technology alternate between fighting the metrics
Some backers of this idea say standardized tests, the most widely used measure of student performance, don’t capture the breadth of skills that computers can help develop.
“The data is pretty weak. It’s very difficult when we’re pressed to come up with convincing data,” said Tom Vander Ark, the former executive director for education at the Bill and Melinda Gates Foundation and an investor in educational technology companies. When it comes to showing results, he said, “We better put up or shut up.”
and relying on intuition
“My gut is telling me we’ve had growth,” said David K. Schauer, the superintendent here. “But we have to have some measure that is valid, and we don’t have that.”
1. Why is it difficult to get valid data to measure the impact of technology upgrades?
2. Should technology upgrades wait until there is quantitative proof of their value?
Submitted by Steve Simon
Data science is hot
Data scientist: The hot new gig in tech
by Michal Lev-Ram, CNNMoney online, 6 September 2011
This is an online reprint of an article that appeared in Fortune magazine on 5 September. After decades of jokes about "lies, damned lies and statistics", recent years have seen statistics jobs listed on numerous lists of top employment opportunities. So as the semester begins, it's nice to be able to share with students some stories about our new, higher profile.
The emerging positions have a variety of names. Last month the Wall Street Journal (4 August 2011) ran a story entitled Business schools plan leap into data, where we read:
Faced with an increasing stream of data from the Web and other electronic sources, many companies are seeking managers who can make sense of the numbers through the growing practice of data analytics, also known as business intelligence. Finding qualified candidates has proven difficult, but business schools hope to fill the talent gap.
Submitted by Bill Peterson, based on a post from R Bloggers.
“The NFL’s Best-Looking Team”
by Reed Albergotti, The Wall Street Journal, September 8, 2011
At the Journal's request, researchers at Ursinus College in Pennsylvania analyzed the facial structure of a sampling of 320 NFL starters (five offensive and five defensive players from each team). They also threw in two of the most photographed personalities on any team, the owner and the head coach.
The researchers programmed a computer to measure attractiveness with respect to facial symmetry. Their results gave first-place ranking to one of the lowest-performing teams, the Buffalo Bills (“99.495533373140” out of 100 possible points). And they ranked at the bottom a much more successful team, the Kansas City Chiefs (“94.609018741541”). The top and bottom player positions were, respectively, kicker and wide receiver. Apparently “average people” score in the “high 80s.”
The article includes the summary score (to 12 decimal places) for each team, as well as the names of the players, coaches and owners in the survey.
A blogger commented: “This article is proof that some people should not have computers and access to data.”
Submitted by Margaret Cibes
As further evidence in support of that last assertion, Paul Alper sent a link to the ARK CODE Home Page, where we read:
As a U.S. Coast Guard officer/military planner with Hebrew skills, I needed to know if Drosnin's predictions [from The Bible Code ] were valid; and suspected that a real Code would include the location of the Ark.
Worst graph of the year
Andrew Gelman's blog awarded (Worst) graph of the year honors to the following:
This graphic was reproduced in a Wired magazine article (14 September 2011), as part of a PowerPoint presentation that was supposedly shown at an FBI training program (with a disclaimer stating that the views expressed do not necessarily reflect the views of the US government!). The commentary on the blog is all worth reading. Andrew concludes with this: "Perhaps the same crew can arrange a presentation for the Army Corps of Engineers, discussing the techniques used in the parting of the Red Sea?"
Submitted by Paul Alper
More on satellite debris
Apparently the satellite Forsooth is more than a headline editing issue. Priscilla Bremser wrote to say that this has now been discussed on the SIGMAA-QL discussion list, where one contributor (Martha Smith) reported hearing on the radio that "The odds that someone will be hit by debris are 1 in 3200." Here is how the problem is diagnosed in that posting:
It [the original headline] referred to "The probability of the event 'someone will be hit by debris'," rather than "The probability of the event, 'a randomly chosen person will be hit by debris'." But we often use the language "The probability that someone ..." with the latter meaning.
It was also noted there that other news stories have been more careful with the distinction. For example, an NPR story said
NASA put the chances that somebody somewhere on Earth will get hurt at 1 in 3,200. But any one person's odds of being struck have been estimated at 1 in 21 trillion.
Update: As reported in the NYT, on the day of the expected re-entry, NASA was still seeking to clarify this point:
NASA’s Twitter feed emphatically said: “The chances that you (yes, I mean YOU) will be hit by a piece of the #UARS satellite today are one in several trillion. Very unlikely.”
Do you see where the 1 in 21 trillion figure comes from?
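The arithmetic behind that question can be sketched in a few lines of Python. The world-population figure used below (roughly 6.9 billion in 2011) is an assumption, not stated in the article:

```python
# If the chance that *someone, somewhere* is hit is 1 in 3,200, and any
# of the world's ~6.9 billion people (assumed 2011 figure) is equally
# likely to be that someone, then one specific person's odds are
# diluted by the population size.
collective_odds = 3_200     # 1 in 3,200 that anyone at all is hit
world_population = 6.9e9    # assumed 2011 world population

individual_odds = collective_odds * world_population
print(f"1 in {individual_odds:.2e}")  # 1 in 2.21e+13
```

That is about 1 in 22 trillion, in the same ballpark as NASA's quoted 1 in 21 trillion (the exact figure depends on the population estimate used).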
The theory that would not die
Sharon Bertsch McGrayne has written a remarkable book entitled The theory that would not die; How Bayes’ Rule Cracked the Enigma Code, Hunted Down Russian Submarines and Emerged Triumphant From Two Centuries of Controversy. In Chance News #76 there is a reference to a highly positive review of the book in the New York Times by John Allen Paulos; the review is very readable. The headline of his review, “The Mathematics of Changing Your Mind,” is an interesting way of viewing Bayesianism. His opening paragraph is
Sharon Bertsch McGrayne introduces Bayes’s theorem in her new book with a remark by John Maynard Keynes: “When the facts change, I change my opinion. What do you do, sir?”
Paulos continues with
Specifically Bayes’s theorem states (trumpets sound here) that the posterior probability of a hypothesis is equal to the product of (a) the prior probability of the hypothesis and (b) the conditional probability of the evidence given the hypothesis, divided by (c) the probability of the new evidence.
Viewed this way, there can be no objections to Bayesianism. It is simply a way (the only consistent way!) of combining newly acquired results with what was known previously to produce what is now known. However, when the word “known” is replaced by the word “believed,” the dispute begins. McGrayne describes in great detail how the fur has been flying ever since Bayes’s theorem was posthumously introduced to the world in 1763, and indeed continues to fly to this day. No one doubts the validity of Bayes’s theorem. What is in question is Bayesian inference, a procedure which starts with a prior, modifies (multiplies) it by a likelihood, and finishes with a posterior.
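Paulos's verbal statement translates directly into a one-line computation. The sketch below uses purely illustrative numbers (a 1% prior, and evidence that appears 90% of the time under the hypothesis and 10% of the time otherwise):

```python
prior = 0.01            # (a) P(H): prior probability of the hypothesis
p_e_given_h = 0.90      # (b) P(E|H): probability of the evidence given H
p_e_given_not_h = 0.10  # P(E|~H): probability of the evidence given not-H

# (c) total probability of the new evidence
p_e = p_e_given_h * prior + p_e_given_not_h * (1 - prior)

# Bayes's theorem (trumpets sound here): posterior = (a) * (b) / (c)
posterior = p_e_given_h * prior / p_e
print(round(posterior, 4))  # 0.0833
```

Note how weakly the evidence moves a skeptical prior: a 1% prior belief rises only to about 8%, which is exactly the kind of sensitivity to the prior that fuels the dispute described above.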
The objections to using priors are manifold, and in order to avoid even mentioning the beast, frequentism arose, championed by Fisher and Neyman who, according to McGrayne, evidently detested each other as much as they individually disliked Bayesianism. One objection which she does not seem to mention is that a seemingly innocuous, non-informative (flat, vague) prior might be far from non-informative after a transformation. A standard example: specifying ignorance (flatness) for the standard deviation results in a non-flat (informative) prior for the variance. The usual Bayesian justification for the lack of concern about priors being personally subjective is that with enough data, the influence of the choice of prior disappears.
Neither McGrayne nor Paulos stresses the schism between Bayesians and frequentists in the teaching arena. Many textbooks, for example in general statistics and business statistics (invariably heavy tomes that they are), may mention Bayes’s theorem but fail to mention Bayesian inference at all. The eager student will be under the impression that the be-all and end-all of statistical existence is p-values and confidence intervals, concepts widely derided by Bayesians. On the other hand, a Bayesian book might sneeringly refer to a p-value or a confidence interval in passing, in order to indicate why each is misleading and deficient.
The book makes for a fascinating read--try telling this to people you know who are not statisticians! For those who prefer to watch, there is a video of McGrayne summarizing her book here. She is speaking about Bayes to an audience at Google, which in fact uses Bayes’s theorem extensively.
- As strange as it seems, the first applications of Bayes’s theorem were to billiards and to the existence of God. The ecclesiastical aspect focused on the conversion of Prob (data | God exists) to Prob (God exists | data); this was considered quite controversial because at that time there was no allowance for debating God’s non-existence. Discuss whether or not things have changed much regarding God’s existence since 1763.
- In her knee-buckling Chapter 14, “The Navy Searches,” she details the use of Bayes to find nuclear weapons that inadvertently dropped from the sky during the cold war: “unknown to the public, the incident at Palomares [Spain] was at least the twenty-ninth serious accident involving the air force and nuclear weapons.” Unfortunately for the Bayesian advocates, her punch line is a quotation from a U.S. admiral who says “Scientifically, the big thing in my mind was that Bayes was a sidelight to the H-bomb search.”
- T-shirts exist with the text, “Bayes, fighting spam since 1763.” Do a search to see why Bayes is useful against spam. Go here to see how spam got its name.
- If p-value and confidence intervals have the faults attributed to them by Bayesians, why then do frequentists dominate?
- On page 254 of McGrayne's book there is an appendix written by Michael J. Campbell where he likens the clash between Bayesians and frequentists to religious disagreements. He concludes with a pun: “Talking of religion, I am reminded of a strip of cartoons about Bayesians that appeared some time ago. They showed a series of monks. One was looking lost, one was dressed as a soldier, one was holding a guide book and one had his tongue stuck out. They were respectively, a vague prior, a uniform prior, an informative prior and, of course, an improper prior.”
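For readers pursuing the spam question above, here is a minimal sketch of the idea behind Bayesian spam filtering: apply Bayes's theorem to a word in the message. All the probabilities below are invented for illustration, not drawn from any real corpus:

```python
# A message contains the word "viagra": how likely is it spam?
# Illustrative, assumed figures -- not from any real mail corpus.
p_spam = 0.5               # prior: half of all incoming mail is spam
p_word_given_spam = 0.40   # the word appears in 40% of spam messages
p_word_given_ham = 0.001   # ...and in 0.1% of legitimate messages

# Total probability of seeing the word, then Bayes's theorem
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(round(p_spam_given_word, 3))  # 0.998
```

A real filter combines such per-word posteriors across many words (the "naive Bayes" classifier), but the 1763 theorem is doing all the work.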
Submitted by Paul Alper
Suspicious brain-imaging studies
Brain imaging studies report more positive findings than their numbers can support. This is fishy.
by Ben Goldacre, Bad Science blog, 13 August 2011
Ben Goldacre is a British journalist and author with a hard-hitting approach towards undisciplined science and weak use of statistics. His Bad Science reports are featured weekly in the Guardian. The brain-imaging story referenced above contains a thought-provoking discussion of the wider system of science. He notes plenty of news stories telling us, for example, that one part of the brain is bigger or smaller in people with a particular mental health problem, or even a specific job, but asks whether these size relationships are truly genuine.
Ben refers to the phenomenon of publication bias: studies with boring negative results are less likely to get written up, and subsequently to be published. He reports the work of Professor John Ioannidis, “a godlike figure in the field of ‘research about research’”. Professor Ioannidis collected a large representative sample of these anatomical studies, counted up how many positive results they reported, and compared this to how many similarly positive results could plausibly have been expected, simply from the sizes of the studies. The answer was that there were twice as many positive findings as could realistically have been expected from the amount of data used.
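Ioannidis's excess-significance check rests on a simple observation: if each study's statistical power is known, the expected number of positive findings is just the sum of the powers. A toy calculation, with powers invented purely for illustration:

```python
# Hypothetical statistical powers of ten small imaging studies, i.e.
# the probability each would detect a genuinely present effect given
# its sample size. These figures are invented for illustration.
powers = [0.3, 0.2, 0.25, 0.4, 0.35, 0.3, 0.2, 0.25, 0.3, 0.35]

expected_positives = sum(powers)  # what the study sizes can support
observed_positives = 6            # what this toy literature reports

print(round(expected_positives, 2))                       # 2.9
print(round(observed_positives / expected_positives, 2))  # 2.07
```

When the observed count runs at about twice the expected count, as in Ioannidis's real data, something beyond the studies' evidential strength is inflating the positive findings.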
Ben hypothesises that “wishful nudges” can creep into judgements about measuring the size of a brain area on a scan. It’s possible, he says, that within each study many brain areas are measured, to see if they’re bigger or smaller, and only then do the positive findings get reported.
Ben proposes that there’s only one way to prevent the loss of such findings: researchers would have to publicly pre-register what areas they plan to measure, before they begin, and report all findings. Otherwise he maintains that ‘The entire field might be distorted by a form of exaggeration’ despite the honest intentions of the individual researchers.
Would this pre-registration approach be a good discipline in any other areas of research?
Submitted by Ben Humphrys
QL in the Media Contest
The MAA SIGMAA on Quantitative Literacy has a QL in the Media contest. Entries should exhibit the best and worst uses of QL in any media; see the website for examples of submissions to date.
December 1 is the deadline for entries, voting will take place online during December, and winners will be announced at the Joint Math Meetings in January 2012.
Submitted by Margaret Cibes
Particle physics forsooth
Faster than light particles found, claim scientists
by Ian Sample, Guardian, 22 September 2011
This story reports on the recent CERN experiment in which some neutrinos putatively exceeded the speed of light. It contains the following candidate for a Forsooth:
"Physicists can claim a discovery if the chances of their result being a fluke of statistics are greater than five standard deviations, or less than one in a few million. The Gran Sasso team's result is six standard deviations."
Later in the story, we read this description of the findings: "The measurement amounts to the neutrinos travelling faster than the speed of light by a fraction of 20 parts per million. Since the speed of light is 299,792,458 metres per second, the neutrinos were evidently travelling at 299,798,454 metres per second." Recalling Carl Sagan's dictum that "extraordinary claims require extraordinary evidence," we should be wary of this combination of an exceedingly small p-value and really tiny effect. Indeed, the article goes on to quote project coordinator Antonio Ereditato on the need for verification before claiming a discovery: "Whenever you touch something so fundamental, you have to be much more prudent."
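The relation between "standard deviations" and tail probability that the quote garbles can be checked directly with the standard library, using the one-sided normal tail convention common in particle physics:

```python
import math

def one_sided_p(sigma):
    """One-sided upper-tail probability of a standard normal at sigma."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

p5 = one_sided_p(5)  # about 2.87e-7, i.e. one in ~3.5 million
p6 = one_sided_p(6)  # about 9.9e-10, i.e. one in ~a billion
print(f"5 sigma: 1 in {1 / p5:,.0f}")
print(f"6 sigma: 1 in {1 / p6:,.0f}")
```

So the convention is that a result qualifies as a discovery when the chance of a statistical fluke is *smaller* than the five-sigma tail (less than about one in 3.5 million), not "greater than five standard deviations" as the quote has it.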
It is also important to bear in mind the similar reaction in the scientific and lay media to Daryl Bem's ESP experiments, "Feeling the Future". Taken at face value, Bem's experiments would also overthrow the conventional belief that an effect (response) cannot precede a cause (stimulus). Bem's evaluation of his experiments is: "The odds against the possibility that the combined results are merely chance coincidences or statistical flukes are about 74 billion to 1." See earlier discussion of Bem in CN 70 and CN 71. For a detailed critique of Bem's methodology, see Tal Yarkoni's blog.
Submitted by Paul Alper