Chance News 25
Do not put faith in what statistics say until you have carefully considered what they do not say. William W. Watt
These Forsooths are from the March 2007 RSS News.
Hundreds of jobs to go at factory.
Almost 200 workers are to lose their jobs at Lincolnshire factory after rescue attempts to keep it open failed, the trade union Amicus has confirmed.
BBC news website
12 December 2006
Despite the ceaseless terrorist attacks on the country's infrastructure and particularly the oil industry, the value of the Iraqi dinar has continued to rise - in November, from D1,410 to the dollar to D1,480. That is obviously good for the vast majority of people whose pay comes in dinars.
The Spectator
30 December 2006
Paul Campbell provided the following Forsooth:
The population of the USA has topped 300 million for the first time. It reached the figure sometime in October. It passed the 200 million mark in 1967. The U.S. census bureau, which reports the figure, calculates that, if current trends continue, it is expected to reach 400 billion by 2043. This makes it an acceleration of growth...
Significance 2 (4) (Dec. 2006) 146
Jeremy Miles provided the following Forsooth:
Most modern turbines installed onshore are rated to produce between 1.5 and 1.8 MW of electricity each, which is enough to power 1000 homes for an entire year.
Significance 4 (1) (Mar. 2007) 11
Statistics in the Doctor's Office
This is an unscientific look at a number of medical statistics I have run into, so far. It is inspired by the great title of the article in Chance News 24 that included the words "medical research", and "rot" in the same breath.
When I was born, quite premature, the doctors said I had a 50-50 chance of staying alive. My father told my mother that this meant they did not know what the probability was.
I met my husband at 39, got married over the age of 40, and gave birth with my own eggs to a healthy child at the age of 43. Statistically speaking, chance was on my side.
In January 2006, I had a large node on my thyroid, and was told by my doctor to have it removed in 2 months. I then went to a specialist, who said there was a 30 percent chance it was cancer, although if it wasn't yet cancer, it would most likely develop into cancer because of its size. He said there was no chance it would shrink. I went to another specialist, who said surgery should be done by Labor Day. I went back to the first specialist, who agreed with this timetable. These days I tend to ask not "what is my chance?" but "how long do I have before I have to make a permanent decision?"
To pass the time before this surgery, which was beginning to seem inevitable, I went to a Tibetan medicine doctor, researched yoga poses to help the thyroid, and altered my diet. I also got more childcare help for my child. Come Labor Day, I did my pre-op, and then insisted on having a final meeting and sonogram with the endocrinologist. I figured that if they did a sonogram right before my going into surgery, it would be statistically highly unlikely for them to cancel surgery once everyone had been called into the room and the surgeon had taken that hour or two just for me.
I scheduled the final sonogram one week before the surgery, with an endocrinologist who works closely with the surgeon. They are on the same team. She was one of the coldest doctors I have ever met, and she had no regard for alternative medicine, so I figured she was the one to go to. Without any emotion or hint in her face, the endocrinologist scanned my throat over and over again, and then suddenly said, "Cancel the surgery. Nothing is over 1 cm. There are no nodes large enough to even have another biopsy on." (My node had been 2.6 cm, and had previously flunked a biopsy.)
I have no data to support any alternative medicine I went through. I go to this same endocrinologist every six months for check-ups. I continue to be a non-advocate for both western and eastern medicine, but have recently become a certified yoga teacher just in case someone else wants to, if nothing else, distract themselves while they are procrastinating.
Submitted by Mary Snell
Are beautiful politicians more likely to be elected?
Pretty vacancies, Tim Harford, Undercover Economist, FT magazine, 23 Feb 2007.
Beauty and the Geek - Maybe good looks do make you smarter, Tim Harford, Slate, 3 Mar 2007.
This article considers correlations between subjective beauty and electoral success. It cites a paper by Amy King and Andrew Leigh who, when studying Australian elections, wondered whether the results were driven by ageism or racism. They conclude that a candidate's beauty is the best explanation.
"Beautiful candidates are indeed more likely to be elected, with a one standard deviation increase in beauty associated with a 1.5-2 percentage point increase in voteshare."
King and Leigh highlight that their results hold up under several checks for potential biases:
- adding party fixed effects,
- dropping well-known politicians,
- using a non-Australian beauty rater,
- omitting candidates of non-Anglo Saxon appearance,
- controlling for age,
- analyzing the 'beauty gap' between candidates running in the same electorate.
They also conclude that, consistent with the theory that returns to beauty reflect discrimination, there is 'suggestive evidence that beauty matters more in electorates with a higher share of apathetic voters'.
Harford is persuaded that beautiful people are better at their jobs.
"There is no mystery as to why we want decorative Hollywood stars, but the same logic might apply to sales staff. Even a bureaucrat might be more persuasive if he or she is good-looking, and who wouldn't want persuasive employees instead of charmless ones?"
Harford also speculates that, as beautiful people have probably been treated well all their lives, this might affect abilities that have nothing to do with appearance. For example, if handsome kids get all the attention from the teacher, why would they not do better at school?
Another study (Mobius and Rosenblat) found that attractive people were more self-confident about how they might perform in a maths test, but they did not actually do any better. Harford also speculates that perhaps the self-confidence of the beautiful helps them fool employers into paying more.
Beautiful people could well be genuinely more productive but American economist Daniel Hamermesh devised a clever way to demonstrate that whatever lies behind our preference, our choices are based on skin-deep evidence. He showed that when candidates stood for election on more than one occasion, their chances of success rose simply when they used a more flattering photograph.
- In Australia, voting is compulsory and voters are given How to Vote cards depicting photos of the major party candidates as they arrive to vote. Do you think these features might affect the outcome of the test?
- A priori, would you expect the marginal effect of beauty to be larger for male candidates than for female candidates?
- Can you think of any other background factors that might ideally be controlled for in a beauty test? Might the outcome depend on the underlying profession in question or on how beauty is defined and who defines it?
- How might it be possible to test for explanations such as employers like to be surrounded by pretty staff, or voters like to see pretty politicians on TV or that we irrationally conflate beauty with useful qualities such as honesty or intelligence?
- Beautiful Politicians, Amy King, University of South Australia and Andrew Leigh, Australian National University. (One conclusion is that the marginal effect of beauty is larger for male candidates than for female candidates.)
- Links to six studies, many involving American economist Daniel Hamermesh, which assert that 'ugly' people earn less. The beauty premium seems to apply even in professions where there is no reason to expect that beauty counts.
- Why Beauty Matters, Markus M. Mobius, Harvard University and Tanya S. Rosenblat, Wesleyan University, 14 Sep 2003.
- Andrew Gelman blogs about another related paper Inferences of Competence from Faces Predict Election Outcomes.
The importance of statistics
The UN requires every country to have a national statistics office, governed by a statistics law, to guarantee quality and independence. The OECD compares national statistics on health, education and economic activity to encourage laggards. Despite this, an editorial in The Economist claims that a chance to make UK government numbers more trustworthy is being missed. It also claims that three things are needed to make any nation's national statistics trustworthy:
- Pay statisticians well, to attract good candidates away from other fields. Good statistics are not cheap: Britain's next decennial census will cost £500m ($1bn), America's $12bn.
- Statistics should be free from political interference. For example, the article quotes an extreme example where the director of Argentina's statistics agency resigned following government interference in the calculation of inflation rates.
- Figures and their accompanying explanatory text should be published on an independent timetable, free from any political spin. For example, in Norway, ministers do not see statistics in advance of their publication, as happens in the UK.
The Economist says that this issue matters more in the UK than in other countries for two reasons. The UK is fondest of setting targets, such as hospital waiting times or the number of new schools built. It is also the country where statistics are most obviously spun by politicians: crime figures, for example, come from the government's crime-fighting department, with the best gloss put on hard-to-interpret figures. For these reasons, fewer than 20% of the UK population believes official statistics. The Economist claims:
"If governments tweak some of the numbers that they are judged by, they deprive themselves of their best guide to future policymaking and create distrust of all national statistics."
Coincidentally, a three-part BBC documentary called The Trap argues (2nd programme aired on BBC2, 18 March 2007) that the UK government's obsession with setting performance targets turns people into calculating machines, with disastrous consequences. Bizarrely, the programme argues that this all started with game theory, the Nash equilibrium and the cold-war arms race.
- Are you aware of the UN-mandated 'statistics law', to guarantee the quality and independence of your country's national statistics office? Do you know where to go to find out more about it? Assuming you are familiar with it, how well do you think it has been implemented in your country's case? Is such a law really necessary, or can you trust your politicians and government officials?
- Are the three conditions listed by The Economist sufficient to make every nation's national statistics trustworthy? What others would you like to impose, ideally?
- The Trap: This series consists of three one-hour programmes which explore the concept and definition of freedom, specifically: "how a simplistic model of human beings as self-seeking, almost robotic, creatures led to today's idea of freedom." It relies heavily on game theory, including the Nash equilibrium. (See Solution to the Car Talk problem (Chance News 24) for another example of Nash equilibrium.)
Submitted by John Gavin.
Paul Campbell suggested the next article.
Mixing a night out with probability & making a fortune
Kari Lock, Williams College
Oh, New York bring back those big dippers. August 1, 1999
Both of these articles tell us how two lottery players took advantage of their knowledge of probability theory to win a lot of money.
As part of the lottery program the New York State Lottery allows you to play Quick Draw (Keno) in bars, restaurants, bowling alleys and other places. To play their version of Quick Draw you buy a ticket for $1 which has the numbers from 1 through 80 on it. Then you pick r numbers, where r is between 1 and 10. The lottery then randomly chooses 20 of the 80 numbers. The object, for the player, is to match as many of the player's r numbers with the 20 'house' numbers as possible. A new game is played every 4 or 5 minutes so a lot of plays can be made while drinking a few beers.
Catlin states that in November 1997 the New York Lottery had a promotion using the Quick Draw game. He writes, "The following is a direct quote taken from a table card advertising the special promotion":
"Win a double dip on Big Dipper Wednesdays. During our 'Big Dipper Wednesday Special' promotion November 5, 12, 19, and 26, prizes for all winning Quick Draw 4-spot tickets will be doubled!"
A 4-spot ticket means that the player chooses r = 4. The two players realized that if you bought a Quick Draw 4-spot ticket you would be playing a favorable game. In his book "Lottery Book: The truth behind the numbers" Catlin says that the players were graduate students in mathematics. Kari Lock refers to them as former students.
Let's see why this would be a favorable game.
When you play Quick Draw on an ordinary day the payoffs are:
4 Spot Game
Numbers matched    Prize per $1 played
4                  $55
3                  $5
2                  $1
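The probability p(x) of x matches is given by the hypergeometric distribution:
p(x) = C(20, x) * C(60, 4 - x) / C(80, 4), for x = 0, 1, ..., 4,
where C(n, k) is the number of ways of choosing k items from n.
From this we find that the expected payoff per dollar for the 4-spot game is
p(4)*55 + p(3)*5 + p(2)*1 = $0.597361.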
So on ordinary days, if you choose the 4-spot game you can expect to win back about 60 percent of the amount you spend. When the payoffs are doubled, your expected payoff becomes twice as much, $1.19472 per dollar, so you can expect a profit of about 19.47 percent of the amount that you bet, making this a favorable bet.
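The expected-payoff arithmetic for the 4-spot game can be checked with a short script. This is a minimal sketch, not the players' own calculation: it assumes the hypergeometric model described in the text (20 house numbers drawn from 80, 4 numbers chosen by the player) and the ordinary prizes of $55, $5 and $1. The names p_match and expected_payoff are illustrative.

```python
from math import comb

HOUSE = 20   # numbers drawn by the lottery
TOTAL = 80   # numbers printed on the ticket
SPOTS = 4    # numbers picked by the player (r = 4)

# Ordinary 4-spot prizes per $1 played: matches -> prize.
PRIZES = {4: 55, 3: 5, 2: 1}


def p_match(x, spots=SPOTS, house=HOUSE, total=TOTAL):
    """Hypergeometric probability that exactly x of the player's numbers are drawn."""
    return comb(house, x) * comb(total - house, spots - x) / comb(total, spots)


def expected_payoff(multiplier=1):
    """Expected payoff per $1 bet; multiplier=2 models the doubled-prize promotion."""
    return sum(multiplier * prize * p_match(x) for x, prize in PRIZES.items())


ordinary = expected_payoff()
doubled = expected_payoff(multiplier=2)
print(f"ordinary day: ${ordinary:.5f} per $1 bet")
print(f"promotion:    ${doubled:.5f} per $1 bet, edge {100 * (doubled - 1):.2f}%")
```

Running it reproduces the figures in the text: roughly 60 cents back per dollar on an ordinary day, and an edge of a little over 19 percent when the prizes are doubled.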
Kari Lock writes:
"When the bar opened at 10 am the first Wednesday in November, they were there and ready to go. From opening until the deal expired at midnight, for all four Wednesdays in November, these two guys feverishly played 4 Spot Quick Draw. Purchasing around 1500 tickets a day, they played the maximum amount of 20 games with each ticket, betting $5 a game. As they played more and more games, they started making a profit as predicted, and were able to use their winnings to keep purchasing more tickets. The only factors limiting the number of tickets they played were the printer--it took a certain amount of time for the matching to process and print out a ticket--and the actual process of cashing in the tickets."
After purchasing a new house and a new car, one of the guys was asked to comment on the experience. His words of wisdom after the whole event: "It shows that paying attention in math class can, in fact, be useful."
Neither author told us who the students were or offered any evidence that the story was not an urban legend. We wrote to the New York Lottery and they said that their records did not go back to 1997 but they did remember the promotion. However, further research convinced us that this is a true story and, at the same time, led us to our "source", Kari Lock.
We found an interview by Joan Garfield, of "chance enhanced course" fame, of statistician Robin Lock in the "Newsletter for the Section on Statistical Education, Volume 7, Number 1, (Winter 2001)". Joan's interview started with:
How many statistics instructors learn that their former students have applied their statistical skills to earn over $100,000 playing a lottery game? This happened to Robin Lock, Professor of Mathematics at St. Lawrence University in New York. Lock's former student and a friend applied their knowledge of probability in figuring out that the expected value of a Quick Draw lottery game at a local restaurant was greater than $0 during a special promotion.
According to Lock, these students "first raised enough cash to start play with little chance of going bust before the law of large numbers took effect to assure their expected winnings." They computed the probabilities and expectations by hand, then simulated the game many times on a computer to confirm the long run behavior. Putting the theory into practice for the remaining three days of the promotion netted the pair a profit of more than $100,000, matching almost exactly what the theory had predicted. Lock noted "Not only did they understand the application of mathematical expectation to this problem, but they had confidence in what they learned and the free time to sit all day in the restaurant playing the game." After hearing about these students' success, Lock invited them to visit his class and share the information about how they worked out the expected value, simulated the game, and decided how much to gamble.
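The simulation step that Lock describes can be sketched in a few lines. This is an illustration under stated assumptions, not the students' program: it assumes the doubled 4-spot prizes ($110, $10 and $2 per $1 bet), draws both the player's four numbers and the 20 house numbers at random, and uses hypothetical names (play_once, simulate) and an arbitrary seed.

```python
import random
from math import comb, sqrt

# Assumed promotion prizes per $1 bet: the ordinary $55, $5, $1 doubled.
DOUBLED_PRIZES = {4: 110, 3: 10, 2: 2}


def p_match(x):
    """Probability that exactly x of the player's 4 numbers are among the 20 drawn."""
    return comb(20, x) * comb(60, 4 - x) / comb(80, 4)


def play_once(rng):
    """Play one 4-spot game and return the payoff per $1 bet."""
    ticket = set(rng.sample(range(1, 81), 4))
    house = set(rng.sample(range(1, 81), 20))
    return DOUBLED_PRIZES.get(len(ticket & house), 0)


def simulate(n_games, seed=0):
    """Average payoff per $1 bet over n_games simulated games."""
    rng = random.Random(seed)
    return sum(play_once(rng) for _ in range(n_games)) / n_games


# Theoretical mean and standard deviation of the payoff on a single $1 bet.
mean = sum(prize * p_match(x) for x, prize in DOUBLED_PRIZES.items())
second_moment = sum(prize ** 2 * p_match(x) for x, prize in DOUBLED_PRIZES.items())
sd = sqrt(second_moment - mean ** 2)

n = 100_000
print(f"theory:    mean {mean:.4f}, sd {sd:.2f} per $1 bet")
print(f"simulated: mean {simulate(n):.4f} over {n} games")
print(f"standard error of the simulated mean: {sd / sqrt(n):.4f}")
```

Because the standard deviation of a single bet is several times its mean, the average only settles near its expectation after many thousands of games, which is the point of Lock's remark about raising enough cash to avoid going bust first.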
Anyone who knows Robin Lock will not be surprised to learn that his daughter Kari Lock has a gold medal in ice dancing from the US Figure Skating Association. And Kari is in her first year of the stat PhD program at Harvard.
(1) Does winning $100,000 seem plausible?
(2) Kari Lock writes:
"Their final profit after the four days of playing ended up within $100 of what they had originally computed to be their expected payoff."
How would you decide if this is about what you would expect?
Written by Laurie Snell
Extraordinary Knowing: Science, Skepticism, and the Inexplicable Powers of the Human Mind
Elizabeth Lloyd Mayer,
Random House 2007
"In nonstatistical language, the odds that pure chance was responsible for 50 percent of women getting pregnant in the prayed-for group but only 26 percent in the non-prayed-for group were less than 13 out of ten thousand. The odds that pure chance explained why 16.3 percent of the embryos successfully implanted in the prayed-for group versus only 8 percent in the non-prayed-for group were less than five out of ten thousand." --page 155 in Extraordinary Knowing: Science, Skepticism, and the Inexplicable Powers of the Human Mind by Elizabeth Lloyd Mayer.
The above quotation gives the reader barely an indication of the credulity of Mayer. In statistical terms, she is confusing the p-value with the probability that prayer had no effect. Further, the study to which she is referring has been thoroughly debunked: one of the authors, Daniel Wirth, was convicted of unrelated bank and mail fraud, and another author, Rogerio Lobo, arrived on the scene after the study was done and has now withdrawn from the study.
A better indication of the strange collection of events depicted in this book comes from how Mayer first became interested in her quest for evidence for the occult, ESP, and telekinesis. When her daughter's hand-carved harp was stolen in Oakland, she eventually turned to a dowser (whom she had not met) some 2000 miles away in Arkansas, who told her where the harp would be found. From then on, her gullibility knew no bounds. About the only element missing in the book is astrology.
- What evidence would make you believe in dowsing, astrology, or telekinesis?
- What evidence would make you disbelieve in dowsing, astrology, or telekinesis?
- On the back cover of the book may be found the following quotation from Judith Orloff, M.D.: "A fascinating look at the power of non-local awareness to transcend the limits of the linear mind." Translate that into English focusing on the words "non-local" and "linear."
- In an experiment in which all possible outcomes are due to pure chance, what is the probability that the results are due to pure chance?
Submitted by Paul Alper
Further reading and listening
Unlocking the minds
Submitted by Laurie Snell
Lung cancer screening may increase your risk of dying
Seeing is not always relieving, The Economist, 8 Mar 2007.
The aim of screening healthy people for cancer is to discover tumours when they are small and still treatable. But this sometimes leads to unnecessary treatment, because our bodies have a battery of mechanisms for stopping small tumours from becoming large. Treating tumours that would have been suppressed anyway does no good and can often be harmful, according to Peter Bach of the Memorial Sloan-Kettering Cancer Centre in New York and his colleagues, who studied the use of computed tomography (CT) to detect tumours in the lungs.
Earlier studies, by Claudia Henschke of Cornell University and her colleagues, along with another paper from Dr. Bach, reported that patients whose lung cancer had been diagnosed early by CT screening had excellent long-term survival prospects. But Dr. Bach's latest paper warns that survival data alone fail to answer a basic question: compared with what? People are bound to live longer after their diagnosis if that diagnosis is made earlier, and early diagnosis is of little value unless it results in a better prognosis.
To answer this comparative question, Dr Bach interrogated his data more thoroughly. He used statistical models based on results from studies of lung cancer that did not involve CT screening, to try to predict what would have happened to the individuals in his own study if they had not been part of that study. The results were not encouraging.
"Screening did, indeed, detect more tumours. Over the course of five years, 144 cases of lung cancer were picked up in a population of 3,200, compared with a predicted number of 44. Despite these early diagnoses, though, there was no reduction in the number of people who went on to develop advanced cancer, nor a significant drop in the number who died of the disease (38, compared with a prediction of 39). Considering that early diagnosis prompted a tenfold increase in surgery aimed at removing the cancer (the predicted number of surgical interventions was 11; the actual number was 109), and that such surgery is unsafe - 5% of patients die and another 20-40% suffer serious complications - the whole process seems to make things worse."
Dr Bach concluded that many extra cancers picked up by CT screening would never have caused clinical disease, while the most aggressive tumours — those that cause most of the 160,000 lung-cancer deaths in America each year — grow too quickly to be found early, even with annual CT screening.
In another earlier attempt to deal with lung cancer, researchers uncovered 20% more tumours in groups that underwent screening using chest X-rays than in those that did not, but the frequency of death from the disease did not differ between the two groups. This situation resembles prostate-cancer screening, where a lot of disease is identified but there is doubt over the number of lives that screening actually saves.
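The "compared with what?" point, that people are bound to live longer after diagnosis if the diagnosis is simply made earlier, can be illustrated with a toy simulation of lead-time bias. This is a sketch with entirely hypothetical numbers, not Dr Bach's model: it assumes that time from symptomatic diagnosis to death is exponential with a mean of 2 years, that screening moves diagnosis 3 years earlier, and that the date of death does not change at all.

```python
import random


def five_year_survival(lead_time_years, n=10_000, seed=0):
    """Fraction of simulated patients alive 5 years after diagnosis.

    lead_time_years is how much earlier screening finds the tumour than
    symptoms would have. By assumption, earlier detection does NOT change
    when the patient dies.
    """
    rng = random.Random(seed)
    survivors = 0
    for _ in range(n):
        # Hypothetical: years from symptomatic diagnosis to death, mean 2 years.
        symptomatic_to_death = rng.expovariate(1 / 2.0)
        # Survival measured from the (possibly earlier) date of diagnosis.
        survival_after_diagnosis = lead_time_years + symptomatic_to_death
        if survival_after_diagnosis >= 5:
            survivors += 1
    return survivors / n


# The same seed gives both groups identical dates of death.
without_screening = five_year_survival(lead_time_years=0.0)
with_screening = five_year_survival(lead_time_years=3.0)
print(f"5-year survival without screening: {without_screening:.1%}")
print(f"5-year survival with screening:    {with_screening:.1%}")
```

In this toy model every patient dies on exactly the same date in both scenarios, yet five-year survival after diagnosis looks far better with screening. That is why survival after diagnosis needs a comparison group, or a mortality endpoint, before it says anything about benefit.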
Randomized controlled trials to accurately assess the benefits of CT screening are under way over the next two years. But The Economist is pessimistic:
"The omens, however, are bad. What you do know can hurt you."
Computed Tomography Screening and Lung Cancer Outcomes, Bach et al, JAMA, Vol. 297 No. 9, March 7, 2007.
Continuing Uncertainty About CT Screening for Lung Cancer, Thomas L. Schwenk, Journal Watch - General Medicine, 6 Mar 2007.
Submitted by John Gavin.
Napping is good for your heart
It pays to sleep on the job, Stephen Pincock, Science matters, FT magazine, 24-25 Feb 2007.
Midday napping (siesta) in apparently healthy individuals is inversely associated with coronary mortality and the association was particularly evident among working men, after controlling for potential confounders.
Siesta is common in populations with low coronary mortality, but epidemiological studies to confirm the relationship have generated conflicting results. So Androniki Naska and colleagues studied a cohort of 23,681 people for an average of 6.32 years. At enrollment, these individuals had no history of coronary heart disease, stroke or cancer and had complete information on frequency and duration of midday napping, as well as on potentially confounding variables. The researchers modeled their data using Cox regression, with time to coronary death as the outcome and deaths from other causes treated as censoring events.
Overall, 792 participants died in the six years of the study, including 133 who died from heart disease. Among men and women, those taking a siesta of any frequency or duration had a coronary mortality ratio of 0.66 (95% confidence interval [CI], 0.45-0.97). The more the subjects slept the lower the ratio. Among men, the inverse association was stronger when the analysis was restricted to those who were currently working at enrollment.
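The quoted ratio and confidence interval are enough to recover an approximate standard error and p-value. This is a back-of-envelope sketch, not the authors' analysis: it assumes the interval was built symmetrically on the log scale with a normal approximation, which is the usual construction for ratios from a Cox model. The variable names are illustrative.

```python
from math import erf, log, sqrt

# Coronary mortality ratio and 95% confidence interval quoted in the text.
ratio, lower, upper = 0.66, 0.45, 0.97

# On the log scale a 95% interval spans about 2 * 1.96 standard errors.
se_log = (log(upper) - log(lower)) / (2 * 1.96)
z = log(ratio) / se_log


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


p_two_sided = 2 * normal_cdf(-abs(z))
print(f"standard error of log ratio: {se_log:.3f}")
print(f"z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

The upper limit of 0.97 sits close to 1, so the implied p-value falls only a little below 0.05. With 133 coronary deaths in total, the estimate is less precise than the headline figure of 0.66 might suggest.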
French health minister, Xavier Bertrand, has called for the country’s employers to take seriously the idea of letting their workers take siestas.
"Sleep must not be trivialised. Why not a siesta at work? It can’t be a taboo subject."
Bertrand advocates an experiment in which a limited number of volunteer companies introduce the after-lunch nap and then study the results. If it improves concentration and quality of work it might be adopted more widely.
In 2000, for example, Masaya Takahashi and Heihachiro Arito from Japan’s National Institute of Industrial Health studied the effects of a 15-minute nap after lunch on alertness and logical reasoning in 12 students who’d only been allowed four hours’ sleep the night before. They found that the students who had a nap did better in a test of cognitive function than their napless counterparts. The benefits were seen particularly in the mid-afternoon, the scientists claimed.
- Can you think of potentially 'confounding variables' that the article is referring to?
- What practical difficulties might have to be addressed to make Bertrand's napping at work suggestion a reality?
- Is a sample size of 12 sufficient to draw meaningful conclusions in the second experiment mentioned above?
- Siesta in Healthy Adults and Coronary Mortality in the General Population, Androniki Naska, et al, Arch Intern Med. 2007;167:296-301.
Submitted by John Gavin.
Is the problem with the drug or with the data?
How safe is Celebrex? By Diedtra Henderson, The Boston Globe, February 25, 2007.
The FDA routinely receives reports of adverse drug reactions, but it is unclear what it should do with these data. The case of Celebrex illustrates this issue all too well.
"From August 2004 to July 2005, as consumers grew increasingly wary about Celebrex, they filed thousands of documents with the FDA. The reports raise heart attack and stroke concerns that are similar to those linked to two other painkillers, Bextra and Vioxx. But unlike those drugs, Celebrex is still sold."
Part of the problem is that drugs are tested for efficacy on hundreds or maybe a few thousand patients. But after the drug is approved, it may be taken by a million people or more. Side effects that could never be detected in a small clinical trial may now appear.
Another problem is that the system of reporting adverse drug reactions to the FDA is voluntary. Doctors and patients can contribute a report when something bad happens, but it is unclear how often they go to the trouble.
"According to Dr. Curt Furberg, a drug-safety expert at Wake Forest University School of Medicine, the reports actually underestimate harm attributed to Celebrex, since as few as 10 percent of doctors submit the time-consuming documentation."
The reporting system can also surge with reports in response to media coverage. If the media raises general awareness of a possible link between a popular drug and a particular side effect that occurs fairly commonly, people may draw links between the two that they might otherwise have overlooked.
"Pfizer's Dr. Steven J. Romano said sudden jumps in the number of reports could have been driven by media coverage of Celebrex and related painkillers. "It is important to recognize that an increase in reporting rates does not necessarily mean an increase in incidence rates," Romano wrote in response to questions from the Globe."
The FDA has been criticized by some for allowing drugs like Celebrex to stay on the market.
"The agency's harshest critics say it has become ineffective and point to a growing list of unsafe drugs and devices that still have the FDA's stamp of approval. "They're pussycats at FDA," said Wake Forest's Furberg. "They have absolutely no clout.""
Submitted by Steve Simon
1. What types of biases is a voluntary system of adverse drug reaction reporting going to be subject to?
2. Are the biases so serious as to make such a system worthless?
3. What alternatives are there for examining the risks of adverse drug reactions?
4. The article seems to offer different opinions about the types of adverse drug reactions that are the most difficult to determine. In one paragraph, they state that slight increases in a common ailment such as heart attacks, strokes, and heart failures can be easily overlooked. Why is this difficult to detect? Would an increase in the risk of a rare side effect be more difficult to detect?