https://www.causeweb.org/wiki/chance/api.php?action=feedcontributions&user=Mmartin&feedformat=atomChanceWiki - User contributions [en]2024-03-29T11:55:01ZUser contributionsMediaWiki 1.40.0-alphahttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_31&diff=4694Chance News 312007-11-06T18:33:26Z<p>Mmartin: /* Migration statistics */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote> Statistics are no substitute for judgment.<br />
<div align=right> Henry Clay</div></blockquote><br />
==Forsooth==<br />
<br />
The following Forsooth is from the Nov. 2007 issue of RSS NEWS.<br />
<br />
<blockquote>The odds of an $18 million Lotto win are one in 30 million but in the tiny Northland town of Kaeo they've been slashed to just one in 500. The town is abuzz with gossip that it could be home to New Zealand's biggest ever Lotto winner but Far North district councillor Sue Shepherd says the 500 residents are keeping their cards, and their tickets, close to their chest.<br />
<br />
<div align=right>The Dominion Post, New Zealand<br><br />
22 May 2006 </div></blockquote><br />
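<br />
A quick calculation shows why this is a Forsooth: the "1 in 500" figure confuses the chance that the winner lives in Kaeo (given there is a winner) with the residents' chance of winning at all. Under the purely illustrative assumption that each of the 500 residents holds one ticket at 1-in-30-million odds, the chance of a local winner is:<br />

```python
# Hypothetical: 500 residents, one ticket each, 1-in-30-million odds per ticket
p_win = 1 / 30_000_000
n_tickets = 500

# P(at least one local winner) = 1 - (1 - p)^n, roughly n * p for small p
p_local = 1 - (1 - p_win) ** n_tickets
print(round(1 / p_local))  # about 1 in 60,000 -- nowhere near 1 in 500
```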
<br />
<br />
<br />
==Using Statistics to bust myths==<br />
<br />
[http://freakonomics.blogs.nytimes.com/2007/10/25/the-mythbusters-answer-your-questions/ The MythBusters Answer Your Questions] Stephen J. Dubner, Freakonomics Blog, October 25, 2007.<br />
<br />
"The MythBusters" is a television show on The Discovery Channel in which Jamie Hyneman and Adam Savage examine commonly held myths and see if they have any validity. Their prior experience was in movie special effects and stunts, and sometimes their experiments lead to big (but carefully controlled) explosions. They were interviewed on the Freakonomics blog, and a pair of the questions asked why they didn't use more statistics in their investigations.<br />
<br />
<blockquote>"Q: Often, when testing a myth, you conduct one full scale test and then draw your conclusions. I know you are both aware of the scientific method and the need to run multiple trials to fully prove or disprove a theory. How confident are you that when you’ve run one test on a myth, you can then accurately capture whether or not it is true?"</blockquote><br />
<br />
and<br />
<br />
<blockquote>"Q: How much statistics training do you guys have, and how much statistics do you use off camera? I get frustrated with the show over what appears to be a lack of statistical knowledge and rigor. (I’m thinking of the “football kick with helium” episode in particular, but the issue is sort of endemic to the show.) I realize that statistics makes for bad TV, while building machines that shoot things and break things make good TV. So the Freakonomics-y question would be: how much of this type of stuff is hidden off-camera?"</blockquote><br />
<br />
Both Jamie and Adam point out their time and budget limitations and remind us that the show has to be entertaining as well as illustrate a scientific approach to investigation. Adam does admit that he'd like to include more statistics, though.<br />
<br />
<blockquote>ADAM: These two (very difficult), questions are similar, so I’ll answer them together. I would love to get more statistics into the show, and I’ve been talking to a statistician friend about just that. It’s true that statistics are not very telegenic, and are often difficult to get across.<br />
<br />
We do worry about consistency, and it’s usually because our data sets are so small. With larger sets, we can work with things like standard deviation; but with a data set of 2, we don’t have that luxury.<br />
<br />
Also, I sense a frustration in some of these questions. I’ll say this: I don’t pretend to be a scientist. We’re not deliverers of scientific truth. But I am curious. And if there’s one complaint I have about people, it’s that most of them aren’t curious enough to look around and figure stuff out for themselves. So if you’re yelling at me at the TV, you’re involved, and as such, I’ve done my job. </blockquote><br />
<br />
===Questions===<br />
<br />
1. Is it true that statistics are not very telegenic? Are there any aspects of Statistics that would lend themselves to a medium like television?<br />
<br />
2. The Discovery Channel website has an [http://dsc.discovery.com/fansites/mythbusters/episode/episode.html episode guide]. Select a show and explain how statistics could be used to investigate the myth(s) on that episode.<br />
<br />
Submitted by Steve Simon<br />
<br />
==Migration statistics==<br />
<br />
[http://uk.reuters.com/article/domesticNews/idUKL3028018520071030 Stats office to improve data on migration flows,] Reuters, 30 October 2007.<br><br />
[http://politics.guardian.co.uk/homeaffairs/story/0,,2201872,00.html Smith apologises for foreign workers error,] Guardian Unlimited, 30 October 2007.<br><br />
[http://www.guardian.co.uk/britain/article/0,,2204561,00.html How many people live in Britain? We haven't the foggiest idea,] The Guardian, 3 November 2007.<br><br />
<br />
UK politicians were recently forced to answer the question <em>how many foreign workers were in the country?</em> but were unable to do so.<br />
The initial estimate (800,000) had to be revised upwards, not once, but twice (1.1 million, then the government's chief statistician said it was more like 1.5m), much to the government's embarrassment.<br />
<br />
The shadow pensions secretary, Chris Grayling, said<br />
<blockquote><br />
This situation just gets worse. It's clear we simply can't trust the figures or statements put out by the Government on migrant workers in the UK.<br />
Ministers need to carry out an urgent review of how they handle this data and need to clear up once and for all how many people come to work in Britain.<br />
</blockquote><br />
<br />
Then just a few hours after the government was forced to admit it had hugely <br />
underestimated the number of immigrant workers, <br />
the (UK's) national statistics office (ONS) announced changes to the way it collects migration data.<br />
Publishing an interim report into the issue, the ONS said it would increase the sample sizes for its International Passenger Survey and consider making better use of administrative data, such as school and patient registers.<br />
[http://qb.soc.surrey.ac.uk/surveys/ips/ipsintro.htm The (UK's) International Passenger Survey] currently samples around 0.3 percent of people entering and leaving the country at 16 airports, 21 ferry routes and the Channel Tunnel.<br />
The ONS said extra "filter shifts" would be introduced at specific airports from next April to reflect the higher number of migrants who arrived and departed from these airports in 2006.<br />
<br />
How does the survey work? According to Michael Blastland writing in the Evening Standard<br />
<blockquote><br />
For ferry passengers, a team in blue blazers stands at the top of each [flight] of stairs into the passenger deck and scribbles a quick description of every 10th [passenger] aboard. As the ship sails, the blazers go hunting for their sample, the woman in the green hat, the trucker in overalls by the slot machine, and ask them if they plan to stay, then extrapolate.<br />
</blockquote><br />
One objective of this survey is to estimate how many of the 2.17m jobs created since 1997 have been filled by foreign nationals; this is the statistic that caused the furore.<br />
<br />
[http://www.statistics.gov.uk/about/other_letters/richard_alldritt_23aug04.asp Richard Alldritt,] the Statistics Commission's chief executive, wants the government to spend more money on improved monitoring of travel movements. The International Passenger Survey has become a key source of migration estimates, but Alldritt said it didn't cover every port and that there was <br />
<blockquote><br />
no guarantee that those surveyed give accurate answers and the results have to be scaled up enormously.<br />
</blockquote><br />
The lack of reliable data on migrant flows has been a major headache for policymakers, complicating everything from the allocation of government resources to the setting of interest rates.<br />
<br />
The US-born National Statistician [http://en.wikipedia.org/wiki/Karen_Dunnell Karen Dunnell] said<br />
<blockquote><br />
The ONS is engaged in a major programme to improve further the quality of its migration statistics.<br />
The International Passenger Survey is a vital source of data on this, so improving the sampling of migrants is a step forward in this very important area of our work.<br />
</blockquote><br />
<br />
This week on [http://news.bbc.co.uk/1/hi/programmes/question_time/default.stm BBC's Question Time,] David Dimbleby asked the audience if they would believe any statistic mentioned by a politician and the audience roared 'No!'.<br />
<br />
===Questions===<br />
* Speculate on [http://qb.soc.surrey.ac.uk/surveys/ips/ipsintro.htm what questions might be asked] in such a survey.<br />
* What criteria might the ONS use to decide at which airports to locate their extra 'filter shifts'?<br />
* The revised figure of 1.5m included children. What is the implication of counting them as 'workers'?<br />
* Sir Andrew Green, chairman of [http://www.migrationwatchuk.com/ Migration Watch,] which campaigns against mass immigration, claimed that the rise was equivalent to a city the size of Coventry. Is it fair and unbiased to compare the size of the error in the initial estimate to a specific city? Can you think of alternative analogies?<br />
<br />
===Further reading===<br />
* The [http://www.statistics.gov.uk/ssd/surveys/international_passenger_survey.asp International Passenger Survey] is a survey of a random sample of passengers entering and leaving the UK by air, sea or the Channel Tunnel. <br />
** Over a quarter of a million face-to-face interviews are carried out each year with passengers entering and leaving the UK through the main airports, seaports and the Channel Tunnel.<br />
** There are six versions of the questionnaire depending on the mode of transport (air, sea or Eurostar) and which direction the passenger is travelling in (arrivals or departures).<br />
** The sampling procedures for air, sea and tunnel passengers are slightly different but the underlying principle for each is similar. In the absence of a readily available sampling frame, <em>time shifts</em> or crossings are sampled at the first stage. During these shifts or crossings, the travellers are counted as they pass a particular point (for example, after passing through passport control) then travellers are systematically chosen at fixed intervals from a random start. <br />
** Interviewing is carried out throughout the year; the sample represents about 1 in every 500 passengers.<br />
** The interview usually takes 3-5 minutes and contains questions about passengers’ country of residence (for overseas residents) or country of visit (for UK residents), the reason for their visit, and details of their expenditure and fares. <br />
*** There are additional questions for passengers migrating to or from the UK. <br />
*** While much of the content of the interview remains the same from one year to the next, new questions are sometimes added or appear periodically on the survey.<br />
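<br />
The second-stage selection described above, with travellers chosen at fixed intervals from a random start, is classic systematic sampling. A minimal sketch (the passenger stream and the interval of 10, echoing the ferry description quoted earlier, are purely illustrative):<br />

```python
import random

def systematic_sample(stream, interval, rng=random):
    """Select every `interval`-th item starting from a random offset,
    as in the IPS second-stage selection of travellers."""
    start = rng.randrange(interval)  # random start in [0, interval)
    return [x for i, x in enumerate(stream) if i % interval == start]

passengers = list(range(1, 1001))         # hypothetical stream of 1,000 travellers
sample = systematic_sample(passengers, 10)
print(len(sample))                        # exactly 1 in 10 are selected
```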
<br />
Submitted by John Gavin.<br />
<br />
==Item3==</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_29&diff=4522Chance News 292007-09-07T00:09:47Z<p>Mmartin: /* The Myth, the Math, the Sex */</p>
<hr />
<div>==Quotations==<br />
<br />
"There are few things that are so unpardonably neglected in our country as poker. The upper class knows very little about it. Now and then you find ambassadors who have sort of a general knowledge of the game, but the ignorance of the people is fearful. Why, I have known clergymen, good men, kind-hearted, liberal, sincere, and all that, who did not know the meaning of a "flush." It is enough to make one ashamed of the species." <br />
<br />
<div align=right>Mark Twain.</div><br />
<br />
==Forsooth==<br />
<br />
The following Forsooths are from the September 2007 issue of RSS NEWS.<br />
<br />
<center>THE BIGGEST KILLER BY FAR<br />
<blockquote>Heart disease claimed the lives of one in five men<br> and about one in six women last year, figures indicate. <div align=right>The Times <br> 26 May 2006</div></blockquote></center><br />
See the end of this Chance News for the data that was the basis for this claim.<br />
----<br />
<center><blockquote>[Hanson plc is the] Largest aggregates producer<br> in the world and 3rd largest in the USA <div align=right> Daily Telegraph<br> 3 March, 2006</div></blockquote></center><br />
<br />
----<br />
This Forsooth was suggested by Jerry Grossman.<br />
<br />
<blockquote>In addition, a person's odds of becoming obese increased by 57 percent if he or she had a friend who became obese over a certain time interval. If the two people were mutual friends, the odds increased to 171 percent. <br> <div align=right> Family, Friend May "Spread" Obesity <br>[http://www.revolutionhealth.com/news/?id=hd-606718&msc=S36090 Revolution Health]<br> July 25, 2007</div></blockquote><br />
<br />
This discussion relates to an article, [http://content.nejm.org/cgi/content/full/357/4/370 The Spread of Obesity in a Large Social Network over 32 Years], that appeared in the July 26, 2007 issue of the New England Journal of Medicine and seems to be freely available. Of course, here "increased to 171 percent" should read "increased by 171 percent." <br />
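<br />
The difference matters numerically. "Increased by" multiplies the baseline odds by one plus the stated percentage (the baseline odds of 0.10 below is purely hypothetical):<br />

```python
baseline_odds = 0.10                      # hypothetical baseline odds of becoming obese

odds_friend = baseline_odds * (1 + 0.57)  # "increased by 57%"  -> about 0.157
odds_mutual = baseline_odds * (1 + 1.71)  # "increased by 171%" -> about 0.271

# Reading "increased to 171 percent" literally would instead give 0.171,
# a much smaller effect than the article reports.
print(odds_friend, odds_mutual)
```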
<br />
Jerry remarks "The NEJM article is interesting to those of us interested<br />
in the mathematical aspects of the social network."<br />
<br />
<br />
----<br />
This forsooth was suggested by Paul Alper<br />
<br />
<blockquote>I've done 120 short-term energy outlooks, and I've probably gotten two of them right.<br><br />
<br />
<div align=right>Mark Rodekohr, a veteran Department of Energy (DOE) economist<br>Minnesota Star Tribune<br> August 12, 2007</div></blockquote><br />
<br />
==Is Poker predominantly skill or luck?==<br />
Harvard ponders just what it takes to excel at poker.<br><br />
''Wall Street Journal'', May 3, 2007, A1<br><br />
Neil King Jr.<br><br />
<br />
The WSJ article reports on a one-day meeting in the Harvard Faculty Club of poker pros, game theorists, statisticians, law students and gambling lobbyists to develop a strategy to show that poker is not predominantly a game of chance.<br />
<br />
In the [http://www.bigslickak.com/murv/index.nsf/HarvardArticle!OpenPage article] we read:<br />
<blockquote>The skill debate has been a preoccupation in poker circles since September (2006), when Congress barred the use of credit cards for online wagers. Horse racing and stock trading were exempt, but otherwise the new law hit any game "predominantly subject to chance". Included among such games was poker, which is increasingly played on Internet sites hosting players from all over the world.</blockquote><br />
<br />
This, of course, is not a new issue. For example, it is the subject of Mark Twain's short story [http://www.twainquotes.com/Galaxy/187010d.html "Science vs. Luck"], published in the October 1870 issue of The Galaxy. The Galaxy no longer exists, but co-founder Francis Church will always be remembered for his reply to Virginia's letter to the New York Sun: "Yes, Virginia, there is a Santa Claus". <br />
<br />
In Mark Twain's story a number of boys were arrested for playing "old sledge" for money. Old sledge was a popular card game in those times and often played for money. In the trial the judge finds that half the experts say that old sledge is a game of science and half that it is a game of chance. The lawyer for the boys suggests:<br />
<blockquote>Impanel a jury of six of each, Luck versus Science -- give them candles and a couple of decks of cards, send them into the jury room, and just abide by the result!</blockquote><br />
<br />
The Judge agrees to do this, and so four deacons and two dominies (clergymen) were sworn in as the "chance" jurymen, and six inveterate old seven-up professors were chosen to represent the "science" side of the issue. <br />
They retired to the jury room. When they came out, the professors had ended up with all the money. So the Judge ruled that the boys were innocent. <br />
<br />
Today more sophisticated ways to determine if a gambling game is predominantly skill or luck are being studied. Ryne Sherman has written two articles on this, <br />
[http://www.dartmouth.edu/~chance/forwiki/Sherman1 "Towards a Skill Ratio"] and <br />
[http://www.dartmouth.edu/~chance/forwiki/Sherman2 "More on Skill and Individual Differences"] <br />
in which he proposes a way to estimate luck and skill in poker and other games.<br />
<br />
To estimate skill and luck percentages Sherman uses a statistical procedure called analysis of variance (ANOVA). <br />
To understand Sherman's method of comparing luck and skill we need to understand how ANOVA works, so we will show how it works using a simple example from [http://www.unc.edu/courses/2006spring/psyc/130/001/variance.htm Variance and the Design of Experiments]. It begins with the following hypothetical data. <br />
<center><br />
<p>&nbsp;</p><br />
<table width="70%" border="1"><br />
<tr> <br />
<td><div align="center">Treatment 1</div></td><br />
<td><div align="center">Treatment 2</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">4</div></td><br />
<td><div align="center">7</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">6</div></td><br />
<td><div align="center">5</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">8</div></td><br />
<td><div align="center">8</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">4</div></td><br />
<td><div align="center">9</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">5</div></td><br />
<td><div align="center">7</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">3</div></td><br />
<td><div align="center">9</div></td><br />
</tr><br />
</table><br />
</center><br><br />
<br />
The author does not explain how the data might occur, but let's assume that these are the results of a clinical trial to determine if vitamin ME improves memory. In the study, two groups are formed from 12 participants: 6 were given a placebo and 6 were given vitamin ME. The study went on for a month. At the end of the month the two groups were given a memory test; the numbers in the first column are the number of correct answers for the placebo group and those in the second column are the number of correct answers for the vitamin ME group. ANOVA can be used to see if there is a significant difference between the groups. Here is Bill Peterson's explanation of how this works.<br />
<br />
There are two group means:<br />
<br />
Mean1 = (4+6+8+4+5+3)/6 = 30/6 = 5.0<br><br />
Mean2 = (7+5+8+9+7+9)/6 = 45/6 = 7.5<br><br />
<br />
Then a grand mean over all observations:<br><br />
Mean = (30+45)/(6+6) = 6.25<br><br />
<br />
Variance is always a sum of squared deviations divided by degrees of freedom:<br />
SS/df. This is also called a mean squared deviation, MS.<br />
<br />
ANOVA begins by expressing the deviation of each observation from the grand mean as a sum of two terms: the difference of the observation from its group mean, plus the difference of the group mean from the grand mean. Writing this out explicitly for the example, we have:<br><br><br />
(4 - 6.25) = (4 - 5.0) + (5.0 - 6.25)<br><br />
(6 - 6.25) = (6 - 5.0) + (5.0 - 6.25)<br><br />
...<br><br />
(3 - 6.25) = (3 - 5.0) + (5.0 - 6.25)<br><br />
<br />
(7 - 6.25) = (7 - 7.5) + (7.5 - 6.25)<br><br />
<br />
(5 - 6.25) = (5 - 7.5) + (7.5 - 6.25)<br><br />
...<br><br />
(9 - 6.25) = (9 - 7.5) + (7.5 - 6.25)<br><br />
<br />
The magic (actually the Pythagorean Theorem in an appropriate dimensional space)<br />
is that the sums of squares decompose in this way.<br><br />
<math><br />
(4-6.25)^2 +...+(9-6.25)^2 =<br />
[(4-5.0)^2+...+(9 - 7.5)^2] + [(5.0 - 6.25)^2+...+(7.5 - 6.25)^2] </math><br><br />
Check: 46.25 = 27.5 + 18.75<br><br />
<br />
In the usual abbreviations:<br><br />
<br />
SST = SSE + SSG<br><br />
<br />
(total sum of sqs = error sum of sqs + group sum of sqs)<br><br />
<br />
Fisher's F statistic is F = MSG/MSE. Large values of F are taken as evidence that there is a real treatment<br />
effect.<br />
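<br />
The decomposition and the F statistic can be verified in a few lines. This is only a sketch in plain Python using the hypothetical vitamin ME data; for real work one would use a statistics package:<br />

```python
placebo = [4, 6, 8, 4, 5, 3]
vitamin = [7, 5, 8, 9, 7, 9]
groups = [placebo, vitamin]

def mean(xs):
    return sum(xs) / len(xs)

grand = mean(placebo + vitamin)                       # 6.25

# Sums of squares: total = error (within groups) + group (between groups)
sst = sum((x - grand) ** 2 for g in groups for x in g)
sse = sum((x - mean(g)) ** 2 for g in groups for x in g)
ssg = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
assert abs(sst - (sse + ssg)) < 1e-9                  # 46.25 = 27.5 + 18.75

# F = MSG / MSE, with df_group = k - 1 and df_error = n - k
k = len(groups)
n = sum(len(g) for g in groups)
f_stat = (ssg / (k - 1)) / (sse / (n - k))
print(round(f_stat, 2))                               # about 6.82
```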
<br />
Now Sherman uses this same kind of decomposition for his measure of skill and of chance for a game. We illustrate how he does this using the following data from five weeks of our low-key Monday night poker games.<br />
<br />
<center><br />
<table width="90%" border="1"><br />
<tr> <br />
<td width="13%"><div align="center"></div></td><br />
<td width="13%"><div align="center">Sally</div></td><br />
<td width="12%"><div align="center">Laurie</div></td><br />
<td width="13%"><div align="center">John</div></td><br />
<td width="13%"><div align="center">Mary</div></td><br />
<td width="12%"><div align="center">Sarge</div></td><br />
<td width="12%"><div align="center">Dick</div></td><br />
<td width="12%"><div align="center">Glenn</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 1</div></td><br />
<td><div align="center">-6.75</div></td><br />
<td><div align="center">-10.10</div></td><br />
<td><div align="center">-5.75</div></td><br />
<td><div align="center">10.35</div></td><br />
<td><div align="center">9.7</div></td><br />
<td><div align="center">4.43</div></td><br />
<td><div align="center">-1.95</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 2</div></td><br />
<td><div align="center">4.35</div></td><br />
<td><div align="center">-4.25</div></td><br />
<td><div align="center">.40</div></td><br />
<td><div align="center">-.35</div></td><br />
<td><div align="center">-8.8</div></td><br />
<td><div align="center">-.15</div></td><br />
<td><div align="center">5.8</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 3</div></td><br />
<td><div align="center">6.95</div></td><br />
<td><div align="center">-4.35</div></td><br />
<td><div align="center">.18</div></td><br />
<td><div align="center">-7.75</div></td><br />
<td><div align="center">7.65</div></td><br />
<td><div align="center">-5.9</div></td><br />
<td><div align="center">3.9</div></td><br />
</tr><br />
<tr> <br />
<td height="24"><div align="center">Game 4</div></td><br />
<td><div align="center">-1.23</div></td><br />
<td><div align="center">-11.55</div></td><br />
<td><div align="center">4.35</div></td><br />
<td><div align="center">2.9</div></td><br />
<td><div align="center">4.85</div></td><br />
<td><div align="center">-3.9</div></td><br />
<td><div align="center">3.25</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 5</div></td><br />
<td><div align="center">6.35</div></td><br />
<td><div align="center">-1.5</div></td><br />
<td><div align="center">-.45</div></td><br />
<td><div align="center">-.65</div></td><br />
<td><div align="center">-.25</div></td><br />
<td><div align="center">-4.9</div></td><br />
<td><div align="center">1.42</div></td><br />
</tr><br />
</table><br />
</center><br />
<br />
<br />
To compare the amount of skill and luck in these games Sherman would have us carry out an analysis of variance in the same way we did for our example. The players are now seen in the role of treatments. Each player has a mean net gain over the set of games. For each outcome in the table we write the difference between this outcome and the overall mean as the sum of two terms: the difference between the outcome and the player's mean, plus the difference between the player's mean and the overall mean. Sherman suggests that the difference between the outcome and the player's mean is due primarily to luck, while the difference between the player's mean and the overall mean is due primarily to skill. This leads him to define the skill % as the ratio of the group sum of squares to the total sum of squares, and the luck % as the ratio of the within-group sum of squares to the total sum of squares.<br />
<br />
We could estimate the luck % and the skill % for our poker games by making the same kind of computations we made in our example. However, most statistical packages such as SAS have a program to make these computations for us. <br />
<br />
Using the SAS ANOVA program, we find that the total sum of squares is 1069.95, the between-groups sum of squares is 311.447 and the within-groups sum of squares is 758.499. Thus, from our poker games we would estimate the skill % to be 311.447/1069.95 = 29.1% and the luck % to be 758.499/1069.95 = 70.9%. Thus, not surprisingly, luck is more important than skill.<br />
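Readers without SAS can reproduce these numbers directly from the table. The following Python sketch (our own, not Sherman's code) treats the players as the groups:

```python
# Net gains for seven players over five games (the columns of the table above).
results = {
    "Sally":  [-6.75,   4.35,  6.95,  -1.23,  6.35],
    "Laurie": [-10.10, -4.25, -4.35, -11.55, -1.5],
    "John":   [-5.75,   0.40,  0.18,   4.35, -0.45],
    "Mary":   [10.35,  -0.35, -7.75,   2.9,  -0.65],
    "Sarge":  [9.7,    -8.8,   7.65,   4.85, -0.25],
    "Dick":   [4.43,   -0.15, -5.9,   -3.9,  -4.9],
    "Glenn":  [-1.95,   5.8,   3.9,    3.25,  1.42],
}

all_obs = [x for games in results.values() for x in games]
grand_mean = sum(all_obs) / len(all_obs)

sst = sum((x - grand_mean) ** 2 for x in all_obs)
ssg = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
          for g in results.values())        # between players ("skill")
ssw = sst - ssg                             # within players ("luck")

skill_pct = 100 * ssg / sst
luck_pct = 100 * ssw / sst
print(round(sst, 2), round(ssg, 3), round(ssw, 3))  # 1069.95 311.447 758.499
print(round(skill_pct, 1), round(luck_pct, 1))      # 29.1 70.9
```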
<br />
In his second article, Sherman reports the skill % he obtained using data from a number of different types of games. For example, using Major League Baseball data, the skill % was 39% for hits and 68% for home runs. For points scored in NBA basketball it was 75%, and for poker stars in weekly tournaments it was 35%.<br />
<br />
Sherman concludes his articles with the remarks:<br />
<br />
<blockquote>If two persons play the same game, why don't both achieve the same results? The purpose of last month's article and this article was to address this question. This article suggests that there are two answers to this question: Skill (or systematic variance) or Luck (or random variance). Using both the correlation approach described last month and the ANOVA approach described in this article, one can estimate the amount of skill involved in any game. Last, and maybe most importantly, Table 4 demonstrated that the skill estimates involved in playing poker (or at least tournament poker) are not very different from other sport outcomes which are widely accepted as skillful.</blockquote><br />
<br />
===Discussion questions===<br />
<br />
(1) Do you think that Sherman's measure of skill and luck in a game is reasonable? If not why not?<br />
<br />
(2) There is a form of [http://www.duplicatepoker.com/WebSite/epokerusa.aspx?page=dpg_rules duplicate poker] modeled after duplicate bridge. Do you think that the congressional decision should apply to this form of gambling? <br />
<br />
Submitted by Laurie Snell<br />
<br />
==Second chance lottery drawing==<br />
Ask Marilyn<br><br />
Parade, 5 August 2007<br><br />
Marilyn vos Savant<br />
<br />
A reader poses the following question.<br />
<blockquote><br />
Say that a state runs a lottery with scratch-off tickets and has a second-chance drawing for losing tickets. The latter are sent to a central location, where they are boxed and stored until it’s time for the drawing. An official then chooses one box and draws a ticket from it. All the other boxes are untouched. Is this fair, compared to storing all the tickets in a large container and then drawing a ticket from it?<br />
</blockquote><br />
<br />
<br />
Marilyn responds that &quot;the methods are equivalent, and both are perfectly fair: One winner was chosen at random&quot;, and suggests that the method is used purely for physical convenience. (In a state lottery, however, we imagine the whole affair would be conducted electronically.)<br />
<br />
DISCUSSION QUESTIONS:<br />
<br />
(1) Marilyn's answer is almost correct. What has been implicitly assumed here?<br />
<br />
(2) Here is a related problem (from Grinstead & Snell, [http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/book.html Introduction to Probability], p. 152, problem 23).<br />
<blockquote><br />
You are given two urns and fifty balls. Half of the balls are white and half<br />
are black. You are asked to distribute the balls in the urns with no restriction<br />
placed on the number of either type in an urn. How should you distribute<br />
the balls in the urns to maximize the probability of obtaining a white ball if<br />
an urn is chosen at random and a ball drawn out at random? Justify your<br />
answer.<br />
</blockquote><br />
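A brute-force search in Python (our own sketch; the problem of course asks for a justification, not a computation) can at least confirm the optimal split:

```python
# Enumerate all ways to split 25 white and 25 black balls between two urns
# (each urn non-empty) and find the split maximizing the chance of drawing
# a white ball when an urn is first picked at random.
best_prob, best_split = 0.0, None
for w1 in range(26):          # white balls placed in urn 1
    for b1 in range(26):      # black balls placed in urn 1
        w2, b2 = 25 - w1, 25 - b1
        if w1 + b1 == 0 or w2 + b2 == 0:
            continue          # both urns must contain at least one ball
        prob = 0.5 * w1 / (w1 + b1) + 0.5 * w2 / (w2 + b2)
        if prob > best_prob:
            best_prob, best_split = prob, (w1, b1)

# Best strategy: a single white ball alone in one urn, everything else in the other.
print(best_split, round(best_prob, 4))  # (1, 0) 0.7449
```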
<br />
Submitted by Bill Peterson<br />
<br />
==The understanding and misunderstanding of Bayesian statistics==<br />
[http://www.economist.com/science/PrinterFriendly.cfm?story_id=9645336 <em>Gambling on tomorrow</em>,] The Economist, Aug 16th 2007 <br><br />
[http://news.yahoo.com/s/nm/20070812/sc_nm/climate_uncertainty_dc <em>Scientists try new ways to predict climate risks</em>,] Reuters 12 Aug 2007.<br><br />
<em>Too late to escape climate disaster?</em>, New Scientist, 18 Aug 2007.<br><br />
<em>Earth Log - Complex lesson</em>, Daily Telegraph, 17 Aug 2007.<br><br />
<br />
The latest edition of one of the Royal Society's journals, [http://www.journals.royalsoc.ac.uk/content/102021/ Philosophical Transactions] is devoted to the science of climate modelling:<br />
<blockquote>predictions from different models are pooled to produce estimates of future climate change, together with their associated uncertainties</blockquote><br />
the Royal Society said,<br />
and it partly focusses on 'the understanding and misunderstanding' of Bayesian statistics.<br />
So this Economist article discusses the difference between the frequentist and Bayesian view of statistics, in the context of forecasting the weather.<br />
<br />
The article starts by claiming that there were just two main influences on the early development of probability theory and statistics:<br />
[http://en.wikipedia.org/wiki/Thomas_Bayes Bayes] and [http://en.wikipedia.org/wiki/Blaise_Pascal Pascal].<br />
It claims that Pascal's ideas are simple and widely understood, while Bayes's are not. <br />
Pascal adopted a [http://en.wikipedia.org/wiki/Frequency_probability frequentist] view, which The Economist characterises thus: <em>the world was that of the gambler: each throw of the dice is independent of the previous one.</em><br />
Bayes promoted what we now call [http://en.wikipedia.org/wiki/Bayesian_probability Bayesian probability,] which The Economist characterises as <em>incorporating the accumulation of experience into a statistical model in the form of prior assumptions</em>:<br />
<blockquote><br />
A good prior assumption about tomorrow's weather, for example, is that it will be similar to today's. <br />
Assumptions about the weather the day after tomorrow, though, will be modified by what actually happens tomorrow. <br />
</blockquote><br />
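The updating being described is just Bayes' rule. As a minimal illustration with invented numbers (not from the article), a prior probability of rain tomorrow is revised in the light of today's observation:

```python
# Bayes' rule: P(rain | cloudy) = P(cloudy | rain) * P(rain) / P(cloudy).
p_rain = 0.3                   # prior: chance of rain tomorrow (invented)
p_cloudy_given_rain = 0.8      # likelihood of today's cloudy sky if rain is coming
p_cloudy_given_dry = 0.4       # likelihood of a cloudy sky otherwise

p_cloudy = p_cloudy_given_rain * p_rain + p_cloudy_given_dry * (1 - p_rain)
posterior = p_cloudy_given_rain * p_rain / p_cloudy
print(round(posterior, 3))  # 0.462: the prior 0.3 is revised upward
```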
<br />
But prior assumptions can influence model outcomes in subtle ways, The Economist warns:<br />
<blockquote><br />
Since the future is uncertain, (weather) forecasts are run thousands of times, with varying parameters, to produce a range of possible outcomes. <br />
The outcomes are assumed to cluster around the most probable version of the future.<br />
The particular range of values chosen for a parameter is an example of a Bayesian prior assumption, since it may be modified in the light of experience. But the way you pick the individual values to plug into the model can cause trouble. <br />
They might, for example, be assumed to be evenly spaced, say 1,2,3,4. <br />
But in the example of snow retention, evenly spacing both rate-of-fall and rate-of-residence-in-the-clouds values will give different distributions of results. <br />
That is because the second parameter is actually the reciprocal of the first. <br />
To make the two match, value for value, you would need, in the second case, to count 1, ½, ⅓, ¼—which is not evenly spaced. <br />
If you use evenly spaced values instead, the two models' outcomes will cluster differently.<br />
</blockquote><br />
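The snow-retention point can be made concrete with a small sketch (hypothetical values, not from the article): a grid that is evenly spaced in a rate parameter is far from evenly spaced in its reciprocal, so the two "uniform" choices feed different ensembles to the model.

```python
# A model parameter expressed either as a rate r or as its reciprocal t = 1/r:
# evenly spacing one does NOT evenly space the other.
rates = [1.0, 2.0, 3.0, 4.0]               # evenly spaced rates
times = [1.0, 2.0, 3.0, 4.0]               # evenly spaced residence times

implied_times = [1.0 / r for r in rates]   # 1, 1/2, 1/3, 1/4 -- not evenly spaced
implied_rates = [1.0 / t for t in times]   # 1, 1/2, 1/3, 1/4 -- not evenly spaced

# The average residence time fed to the model therefore differs between the
# two "uniform" choices, so the outcome distributions cluster differently.
mean_t_direct = sum(times) / len(times)
mean_t_implied = sum(implied_times) / len(implied_times)
print(mean_t_direct, round(mean_t_implied, 4))  # 2.5 0.5208
```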
<br />
It goes on to claim that those who use statistical models often fail to account for the uncertainty associated with such models:<br />
<blockquote><br />
Psychologically, people tend to be Bayesian—to the extent of often making false connections. And that risk of false connection is why scientists like Pascal's version of the world. It appears to be objective. But when models are built, it is almost impossible to avoid including Bayesian-style prior assumptions in them. By failing to acknowledge that, model builders risk making serious mistakes.<br />
</blockquote><br />
<br />
One of the Philosophical Transactions authors, David Stainforth of Oxford University, says<br />
<blockquote><br />
The answer is more comprehensive assessments of uncertainty, if we are to provide better information for today's policy makers.<br />
Such assessments would help steer the development of climate models and focus observational campaigns. Together this would improve our ability to inform decision makers in the future.<br />
</blockquote><br />
<br />
===Questions===<br />
* What influences on the early development of probability theory and statistics can you think of, other than Pascal and Bayes?<br />
* Is the frequentist view of statistics nothing more than <em>each throw of the dice is independent of the previous one</em>? What other characteristics would you associate with this view of statistics? Can you offer a better one-line summary? What about a better description of Bayesian statistics than <em>incorporating the accumulation of experience into a statistical model in the form of prior assumptions</em>?<br />
* In one of the Royal Society's papers, authors David Stainforth from Oxford University and Leonard Smith from the LSE, advocate making a clearer distinction between the output of model experiments designed for improving the model and those of immediate relevance for decision making. What do you think they meant by that? Can you think of a simple example to illustrate your interpretation?<br />
* The Economist claims that scientists are not easily able to understand Bayes because of their philosophical training in the rigours of Pascal's method. How would you reply to this assertion?<br />
<br />
===Further reading===<br />
* [http://www.lse.ac.uk/collections/pressAndInformationOffice/newsAndEvents/archives/2007/ClimateChangeReport.htm Confidence, uncertainty and decision-support relevance in climate predictions,] [http://www.atm.ox.ac.uk/user/das/ David Stainforth,] Oxford University and Leonard Smith, LSE.<br />
** This [http://www.lse.ac.uk/collections/cats/papersPDFs/75_Stainforth_ConfidenceUncertaintyRelevance_2007.pdf paper] discusses the sources of uncertainty in the interpretation of climate model simulations as projections of the future. <br />
* See also [http://www.climateprediction.net/science/scientific_papers.php Climateprediction.net].<br />
<br />
Submitted by John Gavin.<br />
<br />
==The Myth, the Math, the Sex==<br />
[http://www.nytimes.com/2007/08/12/weekinreview/12kolata.html?ex=1188792000&en=4f4f1484b2912d4b&ei=5070 The Myth, the Math, the Sex].<br><br />
''The New York Times'', August 12, 2007, The Week in Review<br><br />
Gina Kolata<br />
<br />
[http://www.nytimes.com/2007/08/19/weekinreview/19kolata.html?ex=1188792000&en=2b4b9a5b0a9293b4&ei=5070 The Median, the Math and the Sex].<br><br />
''The New York Times'', August 19, 2007, The Week in Review<br><br />
Gina Kolata<br />
<br />
In the first article Gina Kolata comments that there have been numerous studies claiming to show that men have more sexual partners than women. <br />
<br />
She reports on a recent government study, which found that men have had a median of seven female sex partners while women have had a median of four. Kolata writes:<br />
<br />
<blockquote> "It is about time for mathematicians to set the record straight," said David Gale, an emeritus mathematics professor at the University of California, Berkeley.<br><br><br />
<br />
"Surveys and studies to the contrary notwithstanding, the conclusion that men have substantially more sex partners than women is not and cannot be true for purely logical reasons," Dr Gale said. He even provided a proof, writing in an e-mail message.<br><br><br />
<br />
By way of dramatization, we change the context slightly and will prove what will be called the High School Prom Theorem. We suppose that on the day after the prom, each girl is asked to give the number of boys she danced with. These numbers are then added up, giving a number G. The same information is then obtained from the boys, giving a number B.<br><br><br />
<br />
Theorem: G = B <br></blockquote><br />
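Gale's theorem is just double counting: every girl-boy dance pair is counted once in G and once in B. A toy check in Python (with invented names) illustrates:

```python
from collections import Counter

# Each dance partnership is a (girl, boy) pair; a person's reported number
# is how many distinct partners they danced with.
dances = {("Ann", "Bob"), ("Ann", "Carl"), ("Beth", "Bob"),
          ("Beth", "Carl"), ("Cara", "Bob"), ("Cara", "Dan")}

girls = Counter(girl for girl, _ in dances)  # partners reported by each girl
boys = Counter(boy for _, boy in dances)     # partners reported by each boy

G = sum(girls.values())   # sum of the girls' reported numbers
B = sum(boys.values())    # sum of the boys' reported numbers
print(G, B)  # 6 6 -- both totals count the same set of pairs, so G == B
```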
<br />
Kolata reports further:<br><br />
<br />
<blockquote>Ronald Graham, a professor of mathematics and computer science at the University of California, San Diego, agreed with Dr. Gale. After all, on average, men would have to have three more partners than women, raising the question of where all these extra partners might be.</blockquote><br />
<br />
The second Gina Kolata article deals primarily with the shower of responses pointing out that the study reported that the medians were different, and so Gale's proof is either irrelevant or not true.<br />
<br />
Of course the blogs had a field day with this mathematics. One of the best is the [http://delong.typepad.com/sdj/2007/08/why-oh-why-ca-2.html blog of Brad DeLong]. DeLong is an economist at the University of California and hence a colleague of David Gale. He blames Gina Kolata, saying that she did not tell Gale that the study reported its results in terms of medians rather than means. However, the comments on this blog are very interesting and show just how hard it is to apply mathematics to the real world. They suggest good discussion questions. <br />
<br />
===Discussion questions===<br />
<br />
(1) What explanations can you give for the results of the survey? Are they enough to explain the difference reported in this survey?<br />
<br />
(2) Did you dance with more than one person at your high school prom? <br />
<br />
(3) Is Gale's theorem true if there are more women than men or more men than women in the population sampled?<br />
<br />
(4) The article reports:<br />
<blockquote>“I have heard this question before,” said Cheryl D. Fryar, a health statistician at the National Center for Health Statistics and a lead author of the new federal report, “Drug Use and Sexual Behaviors Reported by Adults: United States, 1999-2002,” which found that men had a median of seven partners and women four. But when it comes to an explanation, she added, “I have no idea.”</blockquote> <br />
<br />
Do you think that Fryar knows the difference between mean and median?<br />
<br />
==Data for first forsooth==<br />
<br />
The Times 26 May 2006 article that is the source for the Forsooth included the following data:<br />
<br />
LEADING CAUSES OF DEATH<br><br />
<center><br />
<table width="70%" border="1"><br />
<tr><td><div align="center">MEN</div></td><td><div align="center">Total deaths</div></td><td><div align="center">Percentage</div></td></tr><br />
<tr><td>Heart disease</td><td><div align="center">49,205</div></td><td><div align="center">20.2</div></td></tr><br />
<tr><td>Cerebrovascular diseases</td><td><div align="center">19,266</div></td><td><div align="center">7.9</div></td></tr><br />
<tr><td>Cancer of trachea, bronchus & lung</td><td><div align="center">16,775</div></td><td><div align="center">6.9</div></td></tr><br />
<tr><td>Chronic lower respiratory diseases</td><td><div align="center">13,589</div></td><td><div align="center">5.6</div></td></tr><br />
<tr><td>Influenza and pneumonia</td><td><div align="center">12,209</div></td><td><div align="center">5</div></td></tr><br />
<tr><td>Prostate cancer</td><td><div align="center">9,018</div></td><td><div align="center">3.7</div></td></tr><br />
<tr><td>Cancer of colon, rectum and anus</td><td><div align="center">7,570</div></td><td><div align="center">3.1</div></td></tr><br />
<tr><td>Lymphoid cancer</td><td><div align="center">5,606</div></td><td><div align="center">2.3</div></td></tr><br />
<tr><td>Dementia and Alzheimer's</td><td><div align="center">5,076</div></td><td><div align="center">2.1</div></td></tr><br />
</table><br />
<br />
<table width="70%" border="1"><br />
<tr><td><div align="center">WOMEN</div></td><td><div align="center">Total deaths</div></td><td><div align="center">Percentage</div></td></tr><br />
<tr><td>Heart disease</td><td><div align="center">38,969</div></td><td><div align="center">16</div></td></tr><br />
<tr><td>Influenza and pneumonia</td><td><div align="center">31,366</div></td><td><div align="center">12.9</div></td></tr><br />
<tr><td>Dementia and Alzheimer's</td><td><div align="center">19,255</div></td><td><div align="center">7.9</div></td></tr><br />
<tr><td>Chronic lower respiratory diseases</td><td><div align="center">12,605</div></td><td><div align="center">5.2</div></td></tr><br />
<tr><td>Cancer of trachea, bronchus & lung</td><td><div align="center">11,895</div></td><td><div align="center">4.9</div></td></tr><br />
<tr><td>Breast cancer</td><td><div align="center">10,986</div></td><td><div align="center">4.5</div></td></tr><br />
<tr><td>Heart failure & complications, & ill-defined heart disease (not included above)</td><td><div align="center">7,212</div></td><td><div align="center">3</div></td></tr><br />
<tr><td>Cancer of colon, rectum and anus</td><td><div align="center">6,537</div></td><td><div align="center">2.7</div></td></tr><br />
</table><br />
</center></div><br />
Mmartin<br />
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_29&diff=4521<br />
Chance News 29, 2007-09-06T23:46:56Z<br />
<p>Mmartin: /* Forsooth */</p>
<hr />
<div>==Quotations==<br />
<br />
"There are few things that are so unpardonably neglected in our country as poker. The upper class knows very little about it. Now and then you find ambassadors who have sort of a general knowledge of the game, but the ignorance of the people is fearful. Why, I have known clergymen, good men, kind-hearted, liberal, sincere, and all that, who did not know the meaning of a "flush." It is enough to make one ashamed of the species". <br />
<br />
<div align=right>Mark Twain.</div><br />
<br />
==Forsooth==<br />
<br />
The following Forsooths are from the September 07 issue of the RSS NEWS.<br />
<br />
<center>THE BIGGEST KILLER BY FAR<br />
<blockquote>Heart disease claimed the lives of one in five men<br> and about one in six women last year, figures indicate. <div align=right>The Times <br> 26 May 2006</div></blockquote></center><br />
See the end of this Chance News for the data that was the basis for this claim.<br />
----<br />
<center><blockquote>[Hanson plc is the] Largest aggregates producer<br> in the world and 3rd largest in the USA <div align=right> Daily Telegraph<br> 3 March, 2006</div></blockquote></center><br />
<br />
----<br />
This Forsooth was suggested by Jerry Grossman.<br />
<br />
<blockquote>In addition, a person's odds of becoming obese increased by 57 percent if he or she had a friend who became obese over a certain time interval. If the two people were mutual friends, the odds increased to 171 percent. <br> <div align=right> Family, Friend May "Spread" Obesity <br>[http://www.revolutionhealth.com/news/?id=hd-606718&msc=S36090 Revolution Health]<br> July 25, 2007</div></blockquote><br />
<br />
This discussion relates to an article [http://content.nejm.org/cgi/content/full/357/4/370 The Spread of Obesity in a Large Social Network over 32 Years] that appeared in the July 26, 2007 issue of the New England Journal of Medicine and seems to be freely available. Of course, here the "increased to 171 percent" is "increased by 171%." <br />
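The distinction matters numerically. A quick sketch with a made-up baseline odds value (hypothetical numbers, not from the NEJM article) contrasts the two readings:

```python
# "Increased by 171%" multiplies the baseline odds by 1 + 1.71 = 2.71,
# while "increased to 171%" would multiply them by only 1.71.
baseline_odds = 0.25                   # made-up baseline odds of becoming obese
by_171 = baseline_odds * (1 + 1.71)    # odds increased BY 171%
to_171 = baseline_odds * 1.71          # odds increased TO 171% of baseline
print(round(by_171, 4), round(to_171, 4))  # 0.6775 0.4275
```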
<br />
Jerry remarks "The NEJM article is interesting to those of us interested<br />
in the mathematical aspects of the social network."<br />
<br />
<br />
----<br />
This forsooth was suggested by Paul Alper<br />
<br />
<blockquote>I've done 120 short-term energy outlooks, and I've probably gotten two of them right.<br><br />
<br />
<div align=right>Mark Rodekohr, a veteran Department of Energy (DOE) economist<br>Minnesota Star Tribune<br> August 12, 2007</div></blockquote><br />
<br />
==Is Poker predominantly skill or luck?==<br />
Harvard ponders just what it takes to excel at poker.<br><br />
''Wall Street Journal'', May 3, 2007, A1<br><br />
Neil King JR<br><br />
<br />
The WSJ article reports on a one-day meeting in the Harvard Faculty Club of poker pros, game theorists, statisticians, law students and gambling lobbyists to develop a strategy to show that poker is not predominantly a game of chance.<br />
<br />
In the [http://www.bigslickak.com/murv/index.nsf/HarvardArticle!OpenPage article] we read:<br />
<blockquote>The skill debate has been a preoccupation in poker circles since September (2006), when Congress barred the use of credit cards for online wagers. Horse racing and stock trading were exempt, but otherwise the new law hit any game "predominantly subject to chance". Included among such games was poker, which is increasingly played on Internet sites hosting players from all over the world.</blockquote><br />
<br />
This, of course, is not a new issue. For example, it is the subject of Mark Twain's short story [http://www.twainquotes.com/Galaxy/187010d.html "Science vs. Luck"], published in the October 1870 issue of The Galaxy. The Galaxy no longer exists, but co-founder Francis Church will always be remembered for his reply to Virginia's letter to the New York Sun: "Yes, Virginia, there is a Santa Claus". <br />
<br />
In Mark Twain's story a number of boys were arrested for playing "old sledge" for money. Old sledge was a popular card game in those times and often played for money. In the trial the judge finds that half the experts say that old sledge is a game of science and half that it is a game of chance. The lawyer for the boys suggests:<br />
<blockquote>Impanel a jury of six of each, Luck versus Science -- give them candles and a couple of decks of cards, send them into the jury room, and just abide by the result!</blockquote><br />
<br />
The Judge agrees to do this, and so four deacons and the two dominies (clergymen) were sworn in as the "chance" jurymen, and six inveterate old seven-up professors were chosen to represent the "science" side of the issue. <br />
They retired to the jury room. When they came out, the professors had ended up with all the money. So the Judge ruled that the boys were innocent. <br />
<br />
Today more sophisticated ways to determine if a gambling game is predominantly skill or luck are being studied. Ryne Sherman has written two articles on this, <br />
[http://www.dartmouth.edu/~chance/forwiki/Sherman1 "Towards a Skill Ratio"] and <br />
[http://www.dartmouth.edu/~chance/forwiki/Sherman2 "More on Skill and Individual Differences"] <br />
in which he proposes a way to estimate luck and skill in poker and other games.<br />
<br />
To estimate skill and luck percentages Sherman uses a statistical procedure called analysis of variance (ANOVA). <br />
To understand Sherman's method of comparing luck and skill we need to understand how ANOVA works, so we will show how it works using a simple example from [http://www.unc.edu/courses/2006spring/psyc/130/001/variance.htm Variance and the Design of Experiments]. It begins with the following hypothetical data. <br />
<center><br />
<p>&nbsp;</p><br />
<table width="70%" border="1"><br />
<tr> <br />
<td><div align="center">Treatment 1</div></td><br />
<td><div align="center">Treatment 2</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">4</div></td><br />
<td><div align="center">7</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">6</div></td><br />
<td><div align="center">5</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">8</div></td><br />
<td><div align="center">8</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">4</div></td><br />
<td><div align="center">9</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">5</div></td><br />
<td><div align="center">7</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">3</div></td><br />
<td><div align="center">9</div></td><br />
</tr><br />
</table><br />
</center><br><br />
<br />
The author does not explain how the data might occur, but let's assume that these are the results of a clinical trial to determine if vitamin ME improves memory. In the study, two groups are formed from 12 participants: 6 were given a placebo and 6 were given vitamin ME. The study went on for a month, at the end of which the two groups were given a memory test. The numbers in the first column are the numbers of correct answers for the placebo group, and those in the second column are the numbers of correct answers for the vitamin ME group. ANOVA can be used to see if there is a significant difference between the groups. Here is Bill Peterson's explanation of how this works.<br />
<br />
There are two group means:<br />
<br />
Mean1 = (4+6+8+4+5+3)/6 = 30/6 = 5.0<br><br />
Mean2 = (7+5+8+9+7+9)/6 = 45/6 = 7.5<br><br />
<br />
Then a grand mean over all observations:<br><br />
Mean = (30+45)/(6+6) = 6.25<br><br />
<br />
Variance is always a sum of square deviations divided by degree of freedom:<br />
SS/df. This is also called a mean squared deviation MS.<br />
<br />
ANOVA begins by expressing the deviation of each observation from the grand mean as a sum of two terms: the difference of the observation from its group mean, plus the difference of the group mean from the grand mean. Writing this out explicitly for the example, we have:<br><br><br />
(4 - 6.25) = (4 - 5.0) + (5.0 - 6.25)<br><br />
(6 - 6.25) = (6 - 5.0) + (5.0 - 6.25)<br><br />
...<br><br />
(3 - 6.25) = (3 - 5.0) + (5.0 - 6.25)<br><br />
<br />
(7 - 6.25) = (7 - 7.5) + (7.5 - 6.25)<br><br />
<br />
(5 - 6.25) = (5 - 7.5) + (7.5 - 6.25)<br><br />
...<br><br />
(9 - 6.25) = (9 - 7.5) + (7.5 - 6.25)<br><br />
<br />
The magic (actually the Pythagorean Theorem in an appropriate dimensional space)<br />
is that the sums of squares decompose in this way.<br><br />
<math><br />
(4-6.25)^2 +...+(9-6.25)^2 =<br />
[(4-5.0)^2+...+(9 - 7.5)^2] + [(5.0 - 6.25)^2+...+(7.5 - 6.25)^2] </math><br><br />
Check: 46.25 = 27.5 + 18.75<br><br />
<br />
In the usual abbreviations:<br><br />
<br />
SST = SSE + SSG<br><br />
<br />
(total sum of sqs = error sum of sqs + group sum of sqs)<br><br />
<br />
Fisher's F statistic is F = MSG/MSE. Large values of F are taken as evidence that there is a real treatment<br />
effect.<br />
<br />
Now Sherman uses this same kind of decomposition for his measure of skill and of chance for a game. We illustrate how he does this using the following data from five weeks of our low-key Monday night poker games.<br />
<br />
<center><br />
<table width="90%" border="1"><br />
<tr> <br />
<td width="13%"><div align="center"></div></td><br />
<td width="13%"><div align="center">Sally</div></td><br />
<td width="12%"><div align="center">Laurie</div></td><br />
<td width="13%"><div align="center">John</div></td><br />
<td width="13%"><div align="center">Mary</div></td><br />
<td width="12%"><div align="center">Sarge</div></td><br />
<td width="12%"><div align="center">Dick</div></td><br />
<td width="12%"><div align="center">Glenn</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 1</div></td><br />
<td><div align="center">-6.75</div></td><br />
<td><div align="center">-10.10</div></td><br />
<td><div align="center">-5.75</div></td><br />
<td><div align="center">10.35</div></td><br />
<td><div align="center">9.7</div></td><br />
<td><div align="center">4.43</div></td><br />
<td><div align="center">-1.95</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 2</div></td><br />
<td><div align="center">4.35</div></td><br />
<td><div align="center">-4.25</div></td><br />
<td><div align="center">.40</div></td><br />
<td><div align="center">-.35</div></td><br />
<td><div align="center">-8.8</div></td><br />
<td><div align="center">-.15</div></td><br />
<td><div align="center">5.8</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 3</div></td><br />
<td><div align="center">6.95</div></td><br />
<td><div align="center">-4.35</div></td><br />
<td><div align="center">.18</div></td><br />
<td><div align="center">-7.75</div></td><br />
<td><div align="center">7.65</div></td><br />
<td><div align="center">-5.9</div></td><br />
<td><div align="center">3.9</div></td><br />
</tr><br />
<tr> <br />
<td height="24"><div align="center">Game 4</div></td><br />
<td><div align="center">-1.23</div></td><br />
<td><div align="center">-11.55</div></td><br />
<td><div align="center">4.35</div></td><br />
<td><div align="center">2.9</div></td><br />
<td><div align="center">4.85</div></td><br />
<td><div align="center">-3.9</div></td><br />
<td><div align="center">3.25</div></td><br />
</tr><br />
<tr> <br />
<td><div align="center">Game 5</div></td><br />
<td><div align="center">6.35</div></td><br />
<td><div align="center">-1.5</div></td><br />
<td><div align="center">-.45</div></td><br />
<td><div align="center">-.65</div></td><br />
<td><div align="center">-.25</div></td><br />
<td><div align="center">-4.9</div></td><br />
<td><div align="center">1.42</div></td><br />
</tr><br />
</table><br />
</center><br />
<br />
<br />
To compare the amounts of skill and luck in these games, Sherman would have us carry out an analysis of variance in the same way we did for our example. The players are now seen in the role of treatments. Each player has a mean net gain over the set of games. For each outcome in the table we write the difference between this outcome and the overall mean as the sum of two terms: the difference between the outcome and the player's mean, plus the difference between the player's mean and the overall mean. Sherman suggests that the difference between the outcome and the player's mean is due primarily to luck, while the difference between the player's mean and the overall mean is due primarily to skill. This leads him to define the skill % as the ratio of the between-group sum of squares to the total sum of squares, and the luck % as the ratio of the within-group sum of squares to the total sum of squares.<br />
<br />
We could estimate the luck % and the skill % for our poker games by making the same kind of computations we made in our example. However, most statistical packages, such as SAS, have a program to make these computations for us. <br />
<br />
Using the SAS ANOVA program, we find that the total sum of squares is 1069.95, the between-groups sum of squares is 311.447, and the within-groups sum of squares is 758.499. Thus, from our poker game we would estimate the skill % to be 311.447/1069.95 = 29.1% and the luck % to be 758.499/1069.95 = 70.9%. Thus, not surprisingly, luck is more important than skill.<br />
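The same decomposition can be reproduced without a statistical package. Here is a minimal Python sketch using the net gains from the table above (rows are games, columns are the seven players in table order):<br />

```python
# Sherman's skill/luck decomposition for the poker table above:
# the between-player sum of squares is his "skill" component and
# the within-player sum of squares is his "luck" component.

scores = [
    [-6.75, -10.10, -5.75, 10.35,  9.70,  4.43, -1.95],  # Game 1
    [ 4.35,  -4.25,  0.40, -0.35, -8.80, -0.15,  5.80],  # Game 2
    [ 6.95,  -4.35,  0.18, -7.75,  7.65, -5.90,  3.90],  # Game 3
    [-1.23, -11.55,  4.35,  2.90,  4.85, -3.90,  3.25],  # Game 4
    [ 6.35,  -1.50, -0.45, -0.65, -0.25, -4.90,  1.42],  # Game 5
]

n_games, n_players = len(scores), len(scores[0])
grand_mean = sum(map(sum, scores)) / (n_games * n_players)
player_means = [sum(row[p] for row in scores) / n_games
                for p in range(n_players)]

# between-player ("skill") and within-player ("luck") sums of squares
ss_between = n_games * sum((m - grand_mean) ** 2 for m in player_means)
ss_within = sum((row[p] - player_means[p]) ** 2
                for row in scores for p in range(n_players))
ss_total = ss_between + ss_within

print(round(100 * ss_between / ss_total, 1))  # skill %: 29.1
print(round(100 * ss_within / ss_total, 1))   # luck  %: 70.9
```

This recovers the SAS figures quoted above: a total sum of squares of about 1069.95, with 311.447 between players and 758.499 within.<br />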
<br />
In his second article, Sherman reports the skill % he obtained using data from a number of different types of games. For example, using Major League Baseball batting data, the skill % for hits was 39% and for home runs it was 68%. For points scored in NBA basketball it was 75%, and for poker stars in weekly tournaments it was 35%.<br />
<br />
Sherman concludes his articles with the following remarks:<br />
<br />
<blockquote>If two persons play the same game, why don't both achieve the same results? The purpose of last month's article and this article was to address this question. This article suggests that there are two answers to this question: Skill (or systematic variance) or Luck (or random variance). Using both the correlation approach described last month and the ANOVA approach described in this article, one can estimate the amount of skill involved in any game. Last, and maybe most importantly, Table 4 demonstrated that the skill estimates involved in playing poker (or at least tournament poker) are not very different from those for other sport outcomes which are widely accepted as skillful.</blockquote><br />
<br />
===Discussion questions===<br />
<br />
(1) Do you think that Sherman's measure of skill and luck in a game is reasonable? If not why not?<br />
<br />
(2) There is a form of [http://www.duplicatepoker.com/WebSite/epokerusa.aspx?page=dpg_rules duplicate poker] modeled after duplicate bridge. Do you think that the congressional decision should apply to this form of gambling? <br />
<br />
Submitted by Laurie Snell<br />
<br />
==Second chance lottery drawing==<br />
Ask Marilyn<br><br />
Parade, 5 August 2007<br><br />
Marilyn vos Savant<br />
<br />
A reader poses the following question.<br />
<blockquote><br />
Say that a state runs a lottery with scratch-off tickets and has a second-chance drawing for losing tickets. The latter are sent to a central location, where they are boxed and stored until it’s time for the drawing. An official then chooses one box and draws a ticket from it. All the other boxes are untouched. Is this fair, compared to storing all the tickets in a large container and then drawing a ticket from it?<br />
</blockquote><br />
<br />
<br />
Marilyn responds that &quot;The methods are equivalent, and both are perfectly fair: One winner was chosen at random&quot;, and suggests that the method is used purely for physical convenience. (In a state lottery, however, we imagine the whole affair would be conducted electronically.)<br />
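Marilyn's equivalence claim can be made concrete with a few lines of Python (the box sizes below are hypothetical). Under the box-then-ticket scheme, a ticket's chance of winning is (1/number of boxes) times (1/its box's size), which matches the pooled chance of 1/total only when every box holds the same number of tickets:<br />

```python
from fractions import Fraction

def ticket_probs(boxes):
    """P(win) for one ticket in each box, when a box is chosen uniformly
    at random and then a ticket is drawn uniformly from that box."""
    k = len(boxes)
    return [Fraction(1, k) * Fraction(1, size) for size in boxes]

equal = [100, 100, 100]    # hypothetical equal boxes
unequal = [10, 90, 200]    # hypothetical unequal boxes

print(ticket_probs(equal))    # each ticket: 1/300, the pooled 1/sum(equal)
print(ticket_probs(unequal))  # 1/30, 1/270, 1/600 -- no longer 1/300 each
```

Either way some ticket is always drawn (the per-box probabilities, weighted by box sizes, sum to 1); what changes is whether every individual ticket has the same chance.<br />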
<br />
DISCUSSION QUESTIONS:<br />
<br />
(1) Marilyn's answer is almost correct. What has been implicitly assumed here?<br />
<br />
(2) Here is a related problem (from Grinstead & Snell, [http://www.dartmouth.edu/~chance/teaching_aids/books_articles/probability_book/book.html Introduction to Probability], p. 152, problem 23).<br />
<blockquote><br />
You are given two urns and fifty balls. Half of the balls are white and half<br />
are black. You are asked to distribute the balls in the urns with no restriction<br />
placed on the number of either type in an urn. How should you distribute<br />
the balls in the urns to maximize the probability of obtaining a white ball if<br />
an urn is chosen at random and a ball drawn out at random? Justify your<br />
answer.<br />
</blockquote><br />
<br />
<br />
Submitted by Bill Peterson<br />
<br />
==The understanding and misunderstanding of Bayesian statistics==<br />
[http://www.economist.com/science/PrinterFriendly.cfm?story_id=9645336 <em>Gambling on tomorrow</em>,] The Economist, Aug 16th 2007 <br><br />
[http://news.yahoo.com/s/nm/20070812/sc_nm/climate_uncertainty_dc <em>Scientists try new ways to predict climate risks</em>,] Reuters 12 Aug 2007.<br><br />
<em>Too late to escape climate disaster?</em>, New Scientist, 18 Aug 2007.<br><br />
<em>Earth Log - Complex lesson</em>, Daily Telegraph, 17 Aug 2007.<br><br />
<br />
The latest edition of one of the Royal Society's journals, [http://www.journals.royalsoc.ac.uk/content/102021/ Philosophical Transactions], is devoted to the science of climate modelling:<br />
<blockquote>predictions from different models are pooled to produce estimates of future climate change, together with their associated uncertainties</blockquote><br />
the Royal Society said,<br />
and it partly focusses on 'the understanding and misunderstanding' of Bayesian statistics.<br />
So this Economist article discusses the difference between the frequentist and Bayesian view of statistics, in the context of forecasting the weather.<br />
<br />
The article starts by claiming that there were just two main influences on the early development of probability theory and statistics:<br />
[http://en.wikipedia.org/wiki/Thomas_Bayes Bayes] and [http://en.wikipedia.org/wiki/Blaise_Pascal Pascal].<br />
It claims that Pascal's ideas are simple and widely understood, while Bayes's are not.<br />
Pascal adopted a [http://en.wikipedia.org/wiki/Frequency_probability frequentist] view, which The Economist characterises thus: <em>the world was that of the gambler: each throw of the dice is independent of the previous one.</em><br />
Bayes promoted what we now call [http://en.wikipedia.org/wiki/Bayesian_probability Bayesian probability,] which The Economist characterises as <em>incorporating the accumulation of experience into a statistical model in the form of prior assumptions:</em><br />
<blockquote><br />
A good prior assumption about tomorrow's weather, for example, is that it will be similar to today's. <br />
Assumptions about the weather the day after tomorrow, though, will be modified by what actually happens tomorrow. <br />
</blockquote><br />
<br />
But prior assumptions can influence model outcomes in subtle ways, The Economist warns:<br />
<blockquote><br />
Since the future is uncertain, (weather) forecasts are run thousands of times, with varying parameters, to produce a range of possible outcomes. <br />
The outcomes are assumed to cluster around the most probable version of the future.<br />
The particular range of values chosen for a parameter is an example of a Bayesian prior assumption, since it may be modified in the light of experience. But the way you pick the individual values to plug into the model can cause trouble. <br />
They might, for example, be assumed to be evenly spaced, say 1,2,3,4. <br />
But in the example of snow retention, evenly spacing both rate-of-fall and rate-of-residence-in-the-clouds values will give different distributions of results. <br />
That is because the second parameter is actually the reciprocal of the first. <br />
To make the two match, value for value, you would need, in the second case, to count 1, ½, ⅓, ¼—which is not evenly spaced. <br />
If you use evenly spaced values instead, the two models' outcomes will cluster differently.<br />
</blockquote><br />
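The Economist's snow example reduces to a few lines of Python; the parameter values below follow the article's 1, 2, 3, 4 illustration and are otherwise arbitrary:<br />

```python
# Evenly spacing a rate parameter is not the same prior as evenly
# spacing its reciprocal (a residence time), so the two ensembles of
# model runs cluster differently, as the article describes.

rates_even = [1.0, 2.0, 3.0, 4.0]            # evenly spaced rate values
times_even = [0.25, 0.50, 0.75, 1.00]        # evenly spaced residence times
rates_implied = [1 / t for t in times_even]  # [4, 2, 1.33..., 1] -- same range

print(sum(rates_even) / 4)     # 2.5
print(sum(rates_implied) / 4)  # about 2.08: a different center of mass
```

Both lists cover the same range of rates, 1 to 4, yet the implied-rate ensemble averages about 2.08 rather than 2.5, which is exactly the clustering difference the article warns about.<br />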
<br />
It goes on to claim that those who use statistical models often fail to account for the uncertainty associated with such models:<br />
<blockquote><br />
Psychologically, people tend to be Bayesian—to the extent of often making false connections. And that risk of false connection is why scientists like Pascal's version of the world. It appears to be objective. But when models are built, it is almost impossible to avoid including Bayesian-style prior assumptions in them. By failing to acknowledge that, model builders risk making serious mistakes.<br />
</blockquote><br />
<br />
One of the authors of the Philosophical Transactions papers, David Stainforth of Oxford University, says<br />
<blockquote><br />
The answer is more comprehensive assessments of uncertainty, if we are to provide better information for today's policy makers.<br />
Such assessments would help steer the development of climate models and focus observational campaigns. Together this would improve our ability to inform decision makers in the future.<br />
</blockquote><br />
<br />
===Questions===<br />
* What influences on the early development of probability theory and statistics can you think of, other than Pascal and Bayes?<br />
* Is the frequentist view of statistics nothing more than <em>each throw of the dice is independent of the previous one</em>? What other characteristics would you associate with this view of statistics? Can you offer a better one-line summary? What about a better description of Bayesian statistics than <em>incorporating the accumulation of experience into a statistical model in the form of prior assumptions</em>?<br />
* In one of the Royal Society's papers, authors David Stainforth from Oxford University and Leonard Smith from the LSE, advocate making a clearer distinction between the output of model experiments designed for improving the model and those of immediate relevance for decision making. What do you think they meant by that? Can you think of a simple example to illustrate your interpretation?<br />
* The Economist claims that scientists are not easily able to understand Bayes because of their philosophical training in the rigours of Pascal's method. How would you reply to this assertion?<br />
<br />
===Further reading===<br />
* [http://www.lse.ac.uk/collections/pressAndInformationOffice/newsAndEvents/archives/2007/ClimateChangeReport.htm Confidence, uncertainty and decision-support relevance in climate predictions,] [http://www.atm.ox.ac.uk/user/das/ David Stainforth,] Oxford University and Leonard Smith, LSE.<br />
** This [http://www.lse.ac.uk/collections/cats/papersPDFs/75_Stainforth_ConfidenceUncertaintyRelevance_2007.pdf paper] discusses the sources of uncertainty in the interpretation of climate model simulations as projections of the future. <br />
* See also [http://www.climateprediction.net/science/scientific_papers.php Climateprediction.net].<br />
<br />
Submitted by John Gavin.<br />
<br />
==The Myth, the Math, the Sex==<br />
[http://www.nytimes.com/2007/08/12/weekinreview/12kolata.html?ex=1188792000&en=4f4f1484b2912d4b&ei=5070 The Myth, the Math, the Sex].<br><br />
''The New York Times'', August 12, 2007, The Week in Review<br><br />
Gina Kolata<br />
<br />
[http://www.nytimes.com/2007/08/19/weekinreview/19kolata.html?ex=1188792000&en=2b4b9a5b0a9293b4&ei=5070 The Median, the Math and the Sex].<br><br />
''The New York Times'', August 19, 2007, The Week in Review<br><br />
Gina Kolata<br />
<br />
In the first article Gina Kolata comments that there have been numerous studies claiming to show that men have more sexual partners than women. <br />
<br />
She reports on a recent government study which found that men have had a median of seven female sex partners while women have had a median of four. Kolata writes:<br />
<br />
<blockquote> "It is about time for mathematicians to set the record straight," said David Gale, an emeritus mathematics professor at the University of California, Berkeley.<br><br><br />
<br />
&quot;Surveys and studies to the contrary notwithstanding, the conclusion that men have substantially more sex partners than women is not and cannot be true for purely logical reasons,&quot; Dr. Gale said. He even provided a proof, writing in an e-mail message.<br><br><br />
<br />
By way of dramatization, we change the context slightly and will prove what will be called the High School Prom Theorem. We suppose that on the day after the prom, each girl is asked to give the number of boys she danced with. These numbers are then added up, giving a number G. The same information is then obtained from the boys, giving a number B.<br><br><br />
<br />
Theorem: G = B <br></blockquote><br />
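Gale's theorem is the degree-sum identity for a bipartite graph: each dance is one girl-boy pair, so counting distinct partners over the girls and over the boys tallies the same pairs. A toy roster (the names and pairings below are made up) verifies this, and also shows that medians, unlike totals, need not agree:<br />

```python
from statistics import median

# girl -> set of boys she danced with (a hypothetical prom roster)
dances = {
    "G1": {"B1", "B2", "B3"},
    "G2": {"B4", "B5", "B6"},
    "G3": {"B1", "B2", "B4"},
    "G4": {"B3", "B5", "B6"},
}

girl_counts = [len(partners) for partners in dances.values()]  # [3, 3, 3, 3]
boy_counts = {}
for partners in dances.values():
    for boy in partners:
        boy_counts[boy] = boy_counts.get(boy, 0) + 1           # each boy: 2

G = sum(girl_counts)           # total reported by the girls
B = sum(boy_counts.values())   # total reported by the boys
print(G, B)                    # 12 12 -- the theorem: G = B
print(median(girl_counts), median(list(boy_counts.values())))  # 3 vs 2
```

Here the totals agree, as they must, yet the girls' median (3) differs from the boys' median (2), since there are four girls and six boys; the theorem constrains sums, not medians.<br />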
<br />
Kolata reports further:<br><br />
<br />
<blockquote>Ronald Graham, a professor of mathematics and computer science at the University of California, San Diego, agreed with Dr. Gale. After all, on average, men would have to have three more partners than women, raising the question of where all these extra partners might be.</blockquote><br />
<br />
The second Gina Kolata article deals primarily with the shower of responses pointing out that the study reported that the medians were different, and so Gale's proof is either irrelevant or not true.<br />
<br />
Of course the blogs had a field day with this mathematics. One of the best is the [http://delong.typepad.com/sdj/2007/08/why-oh-why-ca-2.html blog of Brad DeLong]; DeLong is an economist at the University of California and hence a colleague of David Gale. He blames Gina Kolata, saying that she did not tell Gale that the study reported its results in terms of medians rather than means. However, the comments on this blog are very interesting and show just how hard it is to apply mathematics to the real world. They suggest good discussion questions. <br />
<br />
===Discussion questions===<br />
<br />
(1) What explanations can you give for the results of the survey? Are they enough to explain the difference reported in this survey?<br />
<br />
(2) Did you dance with more than one person at your high school prom? <br />
<br />
(3) Is Gale's theorem true if there are more women than men, or more men than women, in the population sampled?<br />
<br />
(4) The article reports:<br />
<blockquote>"I have heard this question before," said Cheryl D. Fryar, a health statistician at the National Center for Health Statistics and a lead author of the new federal report, "Drug Use and Sexual Behaviors Reported by Adults: United States, 1999-2002," which found that men had a median of seven partners and women four. But when it comes to an explanation, she added, "I have no idea."</blockquote><br />
<br />
Do you think that Fryar knows the difference between mean and median?<br />
<br />
==Data for first forsooth==<br />
<br />
The Times 26 May 2006 article that is the source for the Forsooth included the following data:<br />
<br />
LEADING CAUSES OF DEATH<br><br />
<br />
MEN<br><br />
<table border="1" cellpadding="3"><br />
<tr> <td><div align="center">Cause of death</div></td> <td><div align="center">Total deaths</div></td> <td><div align="center">Percentage</div></td> </tr><br />
<tr> <td>Heart disease</td> <td><div align="center">49,205</div></td> <td><div align="center">20.2</div></td> </tr><br />
<tr> <td>Cerebrovascular diseases</td> <td><div align="center">19,266</div></td> <td><div align="center">7.9</div></td> </tr><br />
<tr> <td>Cancer of trachea, bronchus & lung</td> <td><div align="center">16,775</div></td> <td><div align="center">6.9</div></td> </tr><br />
<tr> <td>Chronic lower respiratory diseases</td> <td><div align="center">13,589</div></td> <td><div align="center">5.6</div></td> </tr><br />
<tr> <td>Influenza and pneumonia</td> <td><div align="center">12,209</div></td> <td><div align="center">5</div></td> </tr><br />
<tr> <td>Prostate cancer</td> <td><div align="center">9,018</div></td> <td><div align="center">3.7</div></td> </tr><br />
<tr> <td>Cancer of colon, rectum and anus</td> <td><div align="center">7,570</div></td> <td><div align="center">3.1</div></td> </tr><br />
<tr> <td>Lymphoid cancer</td> <td><div align="center">5,606</div></td> <td><div align="center">2.3</div></td> </tr><br />
<tr> <td>Dementia and Alzheimer's</td> <td><div align="center">5,076</div></td> <td><div align="center">2.1</div></td> </tr><br />
</table><br />
<br />
WOMEN<br><br />
<table border="1" cellpadding="3"><br />
<tr> <td><div align="center">Cause of death</div></td> <td><div align="center">Total deaths</div></td> <td><div align="center">Percentage</div></td> </tr><br />
<tr> <td>Heart disease</td> <td><div align="center">38,969</div></td> <td><div align="center">16</div></td> </tr><br />
<tr> <td>Influenza and pneumonia</td> <td><div align="center">31,366</div></td> <td><div align="center">12.9</div></td> </tr><br />
<tr> <td>Dementia and Alzheimer's</td> <td><div align="center">19,255</div></td> <td><div align="center">7.9</div></td> </tr><br />
<tr> <td>Chronic lower respiratory diseases</td> <td><div align="center">12,605</div></td> <td><div align="center">5.2</div></td> </tr><br />
<tr> <td>Cancer of trachea, bronchus & lung</td> <td><div align="center">11,895</div></td> <td><div align="center">4.9</div></td> </tr><br />
<tr> <td>Breast cancer</td> <td><div align="center">10,986</div></td> <td><div align="center">4.5</div></td> </tr><br />
<tr> <td>Heart failure & complications, & ill-defined heart disease (not included above)</td> <td><div align="center">7,212</div></td> <td><div align="center">3</div></td> </tr><br />
<tr> <td>Cancer of colon, rectum and anus</td> <td><div align="center">6,537</div></td> <td><div align="center">2.7</div></td> </tr><br />
</table></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_26&diff=4061Chance News 262007-05-10T18:13:28Z<p>Mmartin: /* The Numbers Guy */</p>
<hr />
<div>==Quotations==<br />
<br />
<blockquote> It is now proved beyond doubt that smoking is one of the<br />
leading causes of statistics. <br><br />
<div align=right>Fletcher Knebel<br> Reader's Digest (December 1961)<br />
</div></blockquote><br />
----<br />
<blockquote>One of the naturalists had argued that <i>On the Origin of Species</i> was too theoretical, that Darwin should have just "put his facts before us and let them rest." In response, Darwin reflected that science, to be of any service, required more than list making; it needed larger ideas that could make sense of piles of data. Otherwise, Darwin said, a geologist "might as well go into a gravel-pit and count the pebbles and describe the colours." Data without generalizations are useless; facts without explanatory principles are meaningless.<br><br />
<div align=right>Michael Shermer<br>Why Darwin Matters. The Case Against Intelligent Design. (page 1)</div></blockquote><br />
<br />
==Forsooth==<br />
<br />
The following Forsooths were in the April 2007 RSS News:<br />
<br />
<blockquote>Britain has been basking in the early onset of spring with temperatures almost twice as warm as the same time last year.<div align=right>Lucy Ballinger<br> ''Daily Mirror'' <br> 12 March 2007</div></blockquote><br />
----<br />
<blockquote>PHEW! Twice as warm as Corfu<br><br><br />
It's not often we put Corfu in the shade weatherwise, especially at this time of the year. But while the Greek holiday spot could only manage a paltry 8C (46F) yesterday, Britons basked in the sun as temperatures reached 16C (60F) yesterday.<div align=right>Stephen White<br> ''Daily Mirror'' <br> 12 March 2007</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> He (Persi Diaconis) proved that it takes seven shuffles to perfectly randomize a pack of cards.<br><br />
<div align=right>Justin Mullins<br> New Scientist <br> March 24-30, 2007, p 52</div></blockquote><br />
<br />
Contributed by Laurie Snell.<br />
----<br />
<blockquote> We were eleven people obtaining those 30.000 millions. I want the 11% that corresponds to me. <br />
<div align=right>A politician of Madrid in a phone dialogue recorded by the police. <br> El Pais <br> 20th October, 2006,</div></blockquote><br />
<br />
Contributed by Carlos Silva.<br />
----<br />
Keith Crank, Assistant Director for Research and Graduate Education, writes:<br />
<br />
<blockquote> Here is a possible entry under Forsooths in Chance News. It comes from a<br />
recent report of the Council of Graduate Schools, titled: Graduate<br />
Education The Backbone of American Competitiveness and Innovation. On<br />
page 22 of the document, it states, "..., while the majority of students<br />
who enter doctoral programs have the academic ability to complete the<br />
degree, on average only 50-60 percent of those who enter doctoral<br />
programs in the United States complete their degrees."<br><br> <br />
The report is available<br />
[http://www.cgsnet.org/portals/0/pdf/GR_GradEdAmComp_0407.pdf here]. Given the<br />
members of the committee that prepared this report, you would think<br />
someone among them would realize that 50-60% is a majority.</blockquote><br />
----<br />
From the San Francisco Chronicle, March 5, 2007<br />
[http://www.sfgate.com/cgi-bin/article.cgi?f=/c/a/2007/03/05/BAG1MOG0OJ2.DTL&hw=lottery&sn=006&sc=506]<br />
<br />
''' 'It just has to be your day' '''<br><br />
'''$355 million Mega Millions jackpot has Californians dreaming big'''<br />
<br />
<blockquote>"I realize I don't have a chance, but nobody's got a chance. So the way I look at it, I have a 50-50 chance -- either I win it or someone else wins it," reasoned Barrie Green, 60, after buying a single ticket Monday afternoon at the Merritt Restaurant and Bakery near his home in Oakland.</blockquote><br />
<br />
Contributed by Alan Shuchat<br />
<br />
----<br />
<br />
<blockquote> CANADA is to investigate claims that tens of <br />
thousands of native Indian and Inuit (First <br />
Nation) children died of tuberculosis at <br />
church-run residential schools in the early 20th <br />
century, and that their deaths were hushed up. <br />
…Their experiences were often brutal, and Canada <br />
is finalising a C$1.9 billion ($1.7 million) <br />
class-action settlement for 80,000 surviving <br />
former inmates…<br />
<div align=right> New Scientist <br> May 5, 2007, </div></blockquote><br />
<br />
Contributed by Paul Campbell<br />
<br />
==A history of smoking in the US==<br />
The Cigarette Century: The rise, fall, and deadly persistence of the product that defined America<br><br />
Allan M. Brandt, 600 pp.<br><br />
Basic Books, 2007, Amazon $23.76.<br />
<br />
Allan Brandt is Professor of the History of Medicine at Harvard Medical School and a professor in the Department of the History of Science at Harvard University. His book is a complete history of smoking in the U.S. It is divided into five chapters: ''Culture'' (how cigarettes came into the American culture), ''Science'' (the causal conundrum), ''Politics'' (the Surgeon General's report), ''Law'' (the trials of Big Tobacco), and ''Globalization'' (exporting an epidemic). While this book will clearly be the bible of smoking, you might want to start with some related videos. <br />
<br />
For an overall picture of the book you can watch [http://www.booktv.org/feature/index.asp?segid=8051&schedID=486 here] a lecture Brandt gave about his book. <br />
<br />
We are most interested in Brandt's chapter "The Causal Conundrum". For an introduction to this we recommend watching video 11 ('The Question of Causation') of [http://www.learner.org/resources/series65.html Against all Odds] (this is free, but you have to sign in). This video was made while the pioneers who recognized the association of cigarette smoking with lung cancer were still alive, and it is great to see and hear their personal involvement. Here is a sample; Doctor Dwight Harken is speaking:<br />
<br />
<blockquote>Dr. Wynder, then a student at St. Louis under Dr. Evarts Graham, came to see me and said "Camel Cigarettes cause cancer of the lungs". I couldn't believe it and, you know, you see what you look for and look for what you know -- and it never occurred to me that cigarettes caused cancer. So we went to see my patients and at that time I had quite a large practice. We discovered, to our amazement, that patients who had cancer of the lung were 17 to 1 as apt to be two-pack-a-day smokers. So here was a fact trying to tell us something. </blockquote><br />
<br />
In 1950 Wynder, with the help of his teacher Dr. Graham, collected questionnaires regarding smoking habits from hospital patients and concluded that lung cancer was associated with smoking. This belief was supported by prospective and retrospective studies by the well-known statisticians Richard Doll and Bradford Hill. As Brandt observes, this new kind of study originated in the attempt to show that smoking caused lung cancer.<br />
<br />
Neither source discusses in any detail the reasons that the two famous statisticians Joseph Berkson in the US and R. A. Fisher in the UK were not convinced that smoking caused lung cancer. Berkson explained his reasons in his article ''Smoking and lung cancer: some observations on two recent reports'', J Am Stat Assoc 1958;53:28-38.<br />
<br />
Here he argues that causation cannot be concluded from statistical studies that do not involve laboratory experiments or placebo-controlled clinical trials. He is also concerned that the studies upon which causation was concluded found an association of smoking with a wide variety of other diseases, including those for which, unlike lung cancer, there was no reason to expect an association. In addition, he had reasons to believe that the studies were not carefully carried out, as he explained in an earlier paper, ''Smoking and cancer of the lung'', Proceedings of the Mayo Clinic, Vol. 34, No. 13, pp. 367-385. <br />
<br />
The best way to understand Fisher's concerns about the claim that smoking caused lung cancer is to read the six articles he wrote about smoking and lung cancer. These are numbers 269, 270, 274, 275, 276, and 276A in ''Collected Papers of R. A. Fisher'', edited by J. H. Bennett, 1971-1974. <br />
<br />
His first article is a letter to the ''British Medical Journal'' 2: 43, (1957) in which he states, on p. 1418, that the hazards of cigarette-smoking "must be brought home to the public by all the modern devices of publicity", and, on p. 1519, "in the presence of the painstaking investigations of statisticians that seem to have closed every loophole of escape for tobacco as the villain in the piece." Concerning the first statement he writes: "This is just what some of us with research interests are afraid of. A common 'device' is to point to a real cause for anxiety, such as the increased incidence of lung cancer, and to ascribe it in urgent tones to what is possibly an entirely imaginary cause." Concerning the second statement he writes: "I believe I have seen the sources of all the evidence cited. I do see a great deal of other statisticians. Many would still feel, and I did about five years ago, that a good prima facie case has been made for further investigation. None think that the matter is already settled."<br />
<br />
While Fisher agreed that the [http://www.childrens-mercy.org/stats/definitions/retrospective.htm retrospective studies] of Hill and Doll and others suggest that smoking is associated with lung cancer, he believed that more research was necessary to establish causation. He gives examples of further research that might be done. One of the Hill and Doll papers included a question about inhaling. It found fewer inhalers among the cancer patients than among the non-cancer patients. Fisher felt that this should be studied further. He remarks that if it could be shown that inhaling was in fact strongly associated with lung cancer, this would support causation. But if not, one could not accept the simple theory that smoking causes cancer. The second area of research he suggested was to see if there are genotypic differences between the different smoking classes. If so, he says, "we might expect differences in the type or frequency of cancer they display."<br />
<br />
Fisher was a scientific consultant to the ''Tobacco Manufacturers' Standing Committee'', set up in 1956 by the UK tobacco industry to assist research on the relationship between smoking and health and to make this information available to the public. Berkson was a consultant for the similar US ''Council for Tobacco Research''. <br />
Brandt remarks that, from the industry's point of view, the establishment of the Council for Tobacco Research was a shrewd move:<br />
<br />
<blockquote> The call for new research implied that existing studies were inadequate or flawed. It made clear that there was "more to know," and it made the industry seem a committed participant in the scientific enterprise rather than a detractor. </blockquote><br />
<br />
What was the solution to Brandt's causal conundrum? For example, did the evidence justify saying that smoking caused lung cancer? In 1965 Sir Bradford Hill addressed the question of how to make such decisions in his presidential address to the Section of Occupational Medicine of the Royal Society of Medicine. You can read this address [http://www.edwardtufte.com/tufte/hill here]. In this article Hill suggests nine aspects of association that we should especially consider before deciding that the most likely interpretation is causation. Note that he does not say that we can prove causation. He discusses these in detail, but here are the short descriptions provided [http://www.childrens-mercy.org/stats/ask/causation.asp here]:<br />
<br />
1. '''Strength''' (Is the risk so large that we can easily rule out other factors?)<br><br />
2. '''Consistency''' (Have the results been replicated by different researchers and under different conditions?)<br><br />
3. '''Specificity''' (Is the exposure associated with a very specific disease as opposed to a wide range of diseases?)<br><br />
4. '''Temporality''' (Did the exposure precede the disease?)<br><br />
5. '''Biological Gradient''' (Are increasing exposures associated with increasing risks of disease?)<br><br />
6. '''Plausibility''' (Is there a credible scientific mechanism that can explain the association?)<br><br />
7. '''Coherence''' (Is the association consistent with the natural history of the disease?)<br><br />
8. '''Experimental Evidence''' (Does a physical intervention show results consistent with the association?)<br><br />
9. '''Analogy''' (Is there a similar result to which we can draw a relationship?)<br />
<br />
The Advisory Committee for the 1964 Surgeon General's report used criteria 1, 2, 3, 4, and 7 in concluding that cigarette smoking causes lung cancer. It seems that the 1964 report is no longer available from the Surgeon General's web site, but the 1967 and later reports are available [http://www.surgeongeneral.gov/library/reports.htm here]. Of course your library should have the 1964 report.<br />
<br />
===Discussion questions:===<br />
<br />
(1) Can a [http://www.childrens-mercy.org/stats/definitions/retrospective.htm retrospective study] distinguish between "smokers are more likely to get lung cancer than non-smokers" and "those who get lung cancer are more likely to be smokers"? Does it matter?<br />
<br />
(2) Why do you think the Advisory Committee for the Surgeon General report did not use all the Hill criteria?<br />
<br />
Submitted by Laurie Snell<br />
<br />
==The Numbers Guy==<br />
<br />
Annette Georgey recently wrote to the [http://www.lawrence.edu/fast/jordanj/isostat.html Isolated Statisticians]:<br />
<blockquote> <br />
A friend just alerted me to a [http://blogs.wsj.com/numbersguy/ blog] maintained by "The Numbers Guy," a columnist for the Wall Street Journal who writes about probability and statistics in the news. Although the WSJ online is available to subscribers only, the blog is available to all. It contains many great examples for the classroom, written in everyday English, such as the odds of a three-way tie in the TV game show "Jeopardy," understanding statistical significance in recent hormone studies, the Texas lottery, and more.</blockquote><br />
<br />
And don't forget statistician Andrew Gelman's wonderful blog, [http://www.stat.columbia.edu/~gelman/blog/ Statistical Modeling, Causal Inference, and Social Science].<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Lies, Damned Lies, and Drug War Statistics==<br />
A Critical Analysis of Claims Made by the Office of National Drug Control Policy<br><br />
Matthew B. Robinson and Renee G. Scherlen<br><br />
State University of New York Press, 2007<br />
<br />
Book description, from the back cover:<br />
<br />
This book critically analyzes claims made by the Office of National Drug<br />
Control Policy (ONDCP), the White House agency of accountability in the<br />
nation's drug war. Specifically, the book examines six editions of the<br />
annual National Drug Control Strategy between 2000 and 2005 to determine if<br />
ONDCP accurately and honestly presents information or intentionally distorts<br />
evidence to justify continuing the war on drugs.<br />
<br />
<blockquote> The authors have performed a valuable service to our democracy with their<br />
meticulous analysis of the White House ONDCP public statements and reports.<br />
They have pulled the sheet off what appears to be an official policy of<br />
deception using clever and sometimes clumsy attempts at statistical<br />
manipulation. This document, at last, gives us a map of the truth. <br><br />
<div align=right> Mike Gray<br><br />
Author of Drug Crazy: How We Got into This Mess and How We Can Get Out</div></blockquote><br />
<br />
<blockquote>Robinson and Scherlen make a valuable contribution to documenting how<br />
ONDCP fails to live up to basic standards of accountability and<br />
consistency. <div align=right> Ethan Nadelmann<br> Executive Director, Drug Policy Alliance</div></blockquote><br />
<br />
At Appalachian State University, Matthew B. Robinson is Associate<br />
Professor of Criminal Justice, and Renee G. Scherlen is Associate Professor<br />
of Political Science. Robinson is the author of several books, including<br />
Justice Blind? Ideals and Realities of American Criminal Justice, Second<br />
Edition.<br />
<br />
Submitted by John Finn<br />
<br />
==Excluding car bombs from a measure of sectarian violence==<br />
<br />
[http://www.realcities.com/mld/krwashington/17134253.htm Optimistic Iraq report omits bombs killing civilians] Nancy Youssef, McClatchy Newspapers, April 26, 2007.<br />
<br />
In a story widely touted in the blogosphere, the U.S. report showing a sharp decline in sectarian violence in Iraq excluded any casualties associated with car bombs.<br />
<br />
<blockquote>Car bombs and other explosive devices have killed thousands of Iraqis in the past three years, but the administration doesn't include them in the casualty counts it has been citing as evidence that the surge of additional U.S. forces is beginning to defuse tensions between Shiite and Sunni Muslims.</blockquote><br />
<br />
What's the rationale for this exclusion?<br />
<br />
<blockquote> Experts who have studied car bombings say it's no surprise that U.S. officials would want to exclude their victims from any measure of success. Car bombs are almost impossible to detect and stop, particularly in a traffic-jammed city such as Baghdad. U.S. officials in Baghdad concede that while they've found scores of car bomb factories in Iraq, they've made only a small dent in the manufacturing of these weapons. </blockquote><br />
<br />
Critics of the Bush administration have another explanation.<br />
<br />
<blockquote>"Since the administration keeps saying that failure is not an option, they are redefining success in a way that suits them," said James Denselow, an Iraq specialist at London-based Chatham House, a foreign policy think tank.</blockquote><br />
<br />
Submitted by Steve Simon<br />
<br />
==Infinite Regress, Turtles All the Way Down==<br />
<br />
No one doubts that weather forecasting is important. Most would concede that, with intensive computation using computer models based on massive data acquisition, weather forecasting has greatly improved. But would you be willing to pay $90,000 annually to "WSI, a firm that owns the Weather Channel and sells forecasts of its own to airlines and other weather-dependent companies" for its "new product called MarketFirst, a sort of forecast of the forecast"? The British magazine The Economist points out that WSI "has detected the subtle biases in both the American and European models" used by government agencies. WSI claims that "European weathermen, for example, underestimate temperatures for western America in spring and autumn." On the other hand, "American forecasters are prone to predict chillier temperatures than they should for the period from 11 to 15 days from the time of the forecast."<br />
<br />
To illustrate just how flexible capitalism is, WSI is considering "various methods of selling it [MarketFirst], including releasing it earlier to certain customers for a higher fee." Further, "Another option would be to sell a forecast of the forecast of the forecast," or as someone put it, "Turtles all the way down."<br />
<br />
===Discussion===<br />
<br />
1. Google the expression "Turtles all the way down" to determine what it refers to and why it is relevant to this wiki. <br />
<br />
2. Ira Scharf is in charge of selling MarketFirst. According to The Economist, he concedes that "The more widespread MarketFirst becomes, the less useful it will be to its subscribers." Why would this be true? <br />
<br />
3. "WSI claims that it [MarketFirst] is right 70 percent of the time." The Economist indicates that ordinary weather forecasts are less reliable. Discuss why "right 70 percent of the time" is a nebulous statement. Further, discuss how "right 70 percent of the time" compares with tossing a coin.<br />
<br />
Submitted by Paul Alper<br />
<br />
==The media usually back the wrong horse==<br />
[http://www.economist.com/finance/displaystory.cfm?story_id=9122878 Covered in shame], Buttonwood, The Economist, 3rd May 2007.<br><br />
<br />
This article discusses whether the media's coverage of a company's performance <br />
is indicative of how that company will perform in the future.<br />
It starts by mentioning some famous cases where newspapers got their predictions completely wrong. <br />
For example, Business Week's 1979 cover predicted 'The Death of Equities'<br />
when the Dow-Jones market index was at 800; today it is at 13,000. And<br />
The Economist was modest enough to remind its own readers of its infamous '$5 per barrel of oil' prediction in the late 1990s; these days oil rarely trades below $50.<br />
<br />
[http://papers.ssrn.com/sol3/papers.cfm?abstract_id=980690 An academic study] prompted this article. That study tests the belief<br />
that the media usually get their predictions wrong.<br />
Over a 20-year period, the 549 stories on individual companies<br />
that made the cover of Business Week are grouped into categories, <br />
depending on whether the coverage was very positive, neutral or very negative<br />
and the share performance in the 500 days before the cover story <br />
is compared to the following 500 days.<br />
<br />
Prior to publication, the more positive the story, <br />
the more positive the share performance.<br />
After publication, the more positive the story, <br />
the more negative the share performance.<br />
The authors summarised this as <br />
<blockquote><br />
positive stories generally indicate the end of superior performance <br />
and negative news generally indicates the end of poor performance.<br />
</blockquote><br />
<br />
The Economist refers to this phenomenon as 'recency bias', <br />
the tendency to be excessively affected by the pattern of recent data.<br />
For example, brokers may subconsciously favour 'hot stocks' when making recommendations, <br />
since they believe clients will also favour such shares. <br />
<br />
The academic study recommends a trading strategy: <br />
for those who are shorting a stock, that is betting that the price will fall, <br />
a cover exposé of that company is a good time to unwind the shorts.<br />
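The pattern the study found (strong share performance before a positive cover story, weak performance after) is exactly what regression to the mean alone would produce if extreme recent performance is partly luck. The following minimal simulation is a sketch with made-up numbers, not the study's data: each hypothetical "company" has a small persistent skill plus a large dose of one-period luck, cover stories go to the extremes of past performance, and those extremes drift back toward the middle in the next period.<br />

```python
import random

random.seed(1)

# Each period's return = persistent skill + one-off luck (luck dominates).
def period_return(skill):
    return skill + random.gauss(0, 2)

companies = [random.gauss(0, 1) for _ in range(549)]  # persistent skills
past = sorted((period_return(s), s) for s in companies)

worst = past[:50]   # would get the negative cover stories
best = past[-50:]   # would get the positive cover stories

# Future performance keeps the skill but redraws the luck.
past_best = sum(r for r, _ in best) / 50
future_best = sum(period_return(s) for _, s in best) / 50
past_worst = sum(r for r, _ in worst) / 50
future_worst = sum(period_return(s) for _, s in worst) / 50

print(past_best, future_best)    # extreme past, middling future
print(past_worst, future_worst)  # and the mirror image for the losers
```

In this toy world no cover story "jinxes" anyone; selection on noisy extremes does all the work.<br />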
<br />
===Questions===<br />
* Is what The Economist calls 'recency bias' just another name for regression to the mean? What are the differences, if any?<br />
* The study observed that there were a lot more positive than negative stories. Can you think of reasons for this bias? How might this affect the results of the study?<br />
<br />
===Further reading===<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=980690 Are Cover Stories Effective Contrarian Indicators?,] Tom Arnold, John Earl and David North, Financial Analysts Journal, Volume 63, Number 2. Also, [http://www.cfapubs.org/doi/pdf/10.2469/faj.v63.n2.4520 available here.]<br />
<br />
Submitted by John Gavin.</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=13235Chance News 222007-01-05T17:12:52Z<p>Mmartin: /* Science in the Courtroom */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and found that other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and from a faculty member who shared lab space with Poehlman, who advised<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
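For readers unfamiliar with the term: legitimate imputation fills in ''missing'' values with statistically principled estimates, and it is done openly, so that everyone knows which numbers were measured and which were modelled. Here is a minimal sketch in Python, using hypothetical HDL readings and simple mean imputation rather than the "complicated statistical model" Poehlman claimed; the key point is that imputed entries are flagged, never silently substituted for real measurements.<br />

```python
# Minimal sketch of legitimate imputation: missing values (None) are
# filled with the mean of the observed values, and every filled entry
# is reported rather than silently overwritten.
def mean_impute(values):
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    imputed, filled_at = [], []
    for i, v in enumerate(values):
        if v is None:
            imputed.append(mean)
            filled_at.append(i)  # record which entries were imputed
        else:
            imputed.append(v)
    return imputed, filled_at

hdl = [55, None, 48, 60, None, 52]  # hypothetical HDL readings
filled, flagged = mean_impute(hdl)
print(filled)   # missing entries replaced by the observed mean (53.75)
print(flagged)  # indices 1 and 4 were imputed, and we say so
```

Overwriting ''observed'' values, as Poehlman did, is the opposite of this in every respect.<br />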
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It's psychologically and emotionally difficult to confront someone who has devoted time to your professional development. Even when investigators are emotionally willing to confront their mentor, they have their own career concerns to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, data that is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic; professional users, such as the Royal Statistical Society, gave a cautious welcome to the government's announcement of independence for the UK Office for National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy, and that while government statisticians are honest people, ministers' (politicians') needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were the following suggestions for greater independence, made by bodies such as the Statistics Commission and the Royal Statistical Society:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains statistics from a 2000 report on wealth distribution by <br />
country and worldwide. The article points out (toward the end) that <br />
even though every country has seen growing income inequality in the <br />
last six years, the ''worldwide'' inequality gap may be narrowing from <br />
the year 2000 to the present. The reason is the huge growth and <br />
wealth accumulation in China and India, which raises incomes overall, <br />
even though both those countries have also seen greater inequality.<br />
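The arithmetic behind this apparent paradox is easy to reproduce. Here is a toy example with made-up incomes (two people per country, nothing to do with the report's actual figures): inequality rises within both a rich and a poor country, yet the pooled worldwide gap narrows, because the poor country's overall growth closes the gap ''between'' countries.<br />

```python
# Illustrative incomes only (not the study's data): two people per country.
rich_2000, rich_2006 = [40, 60], [38, 66]   # within-country gap widens
poor_2000, poor_2006 = [2, 4],   [8, 20]    # gap widens, but incomes soar

def gap(incomes):
    """Ratio of the top half's mean income to the bottom half's."""
    s = sorted(incomes)
    half = len(s) // 2
    return (sum(s[half:]) / half) / (sum(s[:half]) / half)

# Inequality rises inside both countries...
assert gap(rich_2006) > gap(rich_2000)
assert gap(poor_2006) > gap(poor_2000)

# ...yet pooled worldwide inequality falls, because the poor country's
# growth narrows the gap between countries.
print(gap(rich_2000 + poor_2000))  # world in 2000: 50/3, about 16.7
print(gap(rich_2006 + poor_2006))  # world in 2006: 52/14, about 3.7
```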
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest case of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days...Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist but at Harvard], had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although research fraudsters tend to have things in common--colossal egos, external as well as internal pressures, desire for fame, money, etc.--each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age, because he had the only large longitudinal study, implying that it would be publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is a statement made by Steven Heymsfield, an obesity researcher at Merck, which should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
===Discussion===<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: it is the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
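As a quick sanity check of the totals quoted above (my arithmetic, not the report's): $125 trillion of household wealth at $20,500 per person implies a population of roughly 6.1 billion, which is consistent with the world population in the year 2000.<br />

```python
# Sanity-check the report's totals: $125 trillion of household wealth
# at $20,500 per person implies a population of about 6.1 billion,
# in line with the world population in the year 2000.
total_wealth = 125e12   # dollars
per_capita = 20_500     # dollars per person
implied_population = total_wealth / per_capita
print(round(implied_population / 1e9, 1))  # about 6.1 (billions)
```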
<br />
The authors warn about the ambiguity in the definition of wealth<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household well-being and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one, where zero means everyone has an equal share and one means that one person has everything and everyone else has nothing. They claim that wealth is shared much less equitably than income: income Gini values typically range from 35% to 45%, while wealth Gini values are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
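The 89% figure for this ten-person illustration can be verified directly. A minimal sketch (not from the report; it uses the standard mean-absolute-difference formula for the Gini coefficient):<br />

```python
# Gini coefficient via the mean-absolute-difference formula:
# G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean(x)) = mad / (2 * n * total)

def gini(values):
    n = len(values)
    total = sum(values)
    mad = sum(abs(a - b) for a in values for b in values)
    return mad / (2 * n * total)

# Shares of the pie: nine people split 1% equally, one person takes 99%.
pie = [1 / 9] * 9 + [99.0]
print(gini(pie))  # exactly 0.89, matching the report's global wealth Gini
```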
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png Gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards for proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with discussion of the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from the Fibonacci series and are often given special names, such as <em>the golden ratio</em> (approx 1.618, whose reciprocal is 0.618) in architecture and design.<br />
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition, one that they appear to fail. However, chartists will not be completely discouraged, as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning:<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
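The AR(1) idea in the questions above can be sketched as follows. This is a simulation under the random-walk null hypothesis, not an analysis of real market data: if price increments are independent, the least-squares estimate of the lag-one coefficient on price differences should be close to zero.<br />

```python
import random

# Simulate a random walk: prices with independent Gaussian increments.
random.seed(42)
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] + random.gauss(0, 1))

# First differences of the price series.
diffs = [b - a for a, b in zip(prices, prices[1:])]
mean = sum(diffs) / len(diffs)

# Least-squares estimate of the AR(1) coefficient on the differences:
# regress diff[t+1] on diff[t] (both centred).
num = sum((diffs[i] - mean) * (diffs[i + 1] - mean) for i in range(len(diffs) - 1))
den = sum((d - mean) ** 2 for d in diffs)
phi = num / den

print(round(phi, 3))  # near zero: no evidence of a trend under the null
```

A significantly non-zero estimate on real price differences would be evidence of the kind of trend the chartists posit.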
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. The latter tried his hand at playing the stock markets without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>Mmartin
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=3471
Chance News 22, 2007-01-05T17:06:34Z<p>Mmartin: /* Questions */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and a faculty member who shared lab space with Poehlman who advised<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
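For contrast, legitimate imputation replaces missing values with model-based estimates while keeping the imputed entries clearly flagged. A hypothetical sketch using simple mean imputation (the data and procedure are invented for illustration, not Poehlman's):<br />

```python
# Hypothetical second-visit HDL measurements; None marks a missing value.
hdl_visit2 = [52.0, None, 47.5, None, 60.1, 55.3]

# Estimate from the observed values only (here, their mean).
observed = [v for v in hdl_visit2 if v is not None]
mean_obs = sum(observed) / len(observed)

# Fill the gaps, and record which entries were imputed so that any
# later analysis can report or exclude them.
filled = [(v if v is not None else mean_obs) for v in hdl_visit2]
imputed_flags = [v is None for v in hdl_visit2]

print(filled)
print(imputed_flags)
```

The key difference from fraud is transparency: the imputed values are labelled as such and kept apart from the observed data, never silently substituted for measurements.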
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It's psychologically and emotionally difficult to confront someone who has devoted time to your professional development. Even when investigators are emotionally willing to confront their mentor, they have their career concerns to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, data that is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
At his sentencing, Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic; professional users, such as the Royal Statistical Society, gave a cautious welcome to the government’s announcement of independence for the UK Office of National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy and that, while government statisticians are honest people, ministers' (politicians') needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were the following suggestions for greater independence, made by bodies such as the Statistics Commission and the Royal Statistical Society:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains statistics from a report on wealth distribution by country and worldwide in the year 2000. The article points out (toward the end) that even though every country has seen growing income inequality in the last six years, the <em>worldwide</em> inequality gap may be narrowing from the year-2000 figures to the present. The reason is the huge growth and wealth accumulation in China and India, which raises income overall, even though both those countries have also seen greater inequality.<br />
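The reversal described here can be reproduced with invented numbers: two tiny 'countries' in which inequality rises internally between 2000 and 2006, while inequality in the combined population falls because the poorer country catches up. A minimal sketch (the incomes are illustrative only):<br />

```python
# Gini coefficient via the mean-absolute-difference formula.
def gini(values):
    n = len(values)
    total = sum(values)
    return sum(abs(a - b) for a in values for b in values) / (2 * n * total)

# Invented incomes (thousands of dollars) for two two-person countries.
rich_2000, rich_2006 = [90, 110], [85, 115]
poor_2000, poor_2006 = [9, 11], [30, 50]

# Inequality rises within each country...
assert gini(rich_2006) > gini(rich_2000)
assert gini(poor_2006) > gini(poor_2000)

# ...yet falls for the combined population, because the poor country
# as a whole has caught up with the rich one.
print(round(gini(rich_2000 + poor_2000), 3))  # 0.434
print(round(gini(rich_2006 + poor_2006), 3))  # 0.259
```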
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest case of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days...Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist but at Harvard], had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although research fraudsters tend to have things in common--colossal egos, external as well as internal pressures, desire for fame, money, etc.--each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age because he had the only large longitudinal study, implying that it would be publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is the statement made by Steven Heymsfield, an obesity researcher at Merck, which should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
Discussion<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: it is the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
<br />
The authors warn about the ambiguity in the definition of wealth:<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household well-being and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one: zero means everyone holds the same wealth, while one means a single person holds everything and everyone else has none. They claim that wealth is shared much less equitably than income: income Ginis typically range from 35% to 45%, while wealth Ginis are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
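That equivalence can be checked directly from the definition of the Gini coefficient. The sketch below (an illustration, not the report's own code) computes the Gini as the sum of absolute pairwise differences scaled by twice the population size times the total:<br />

```python
def gini(values):
    """Gini coefficient: sum of absolute pairwise differences
    divided by 2 * n^2 * mean, which simplifies to 2 * n * total."""
    n, total = len(values), sum(values)
    abs_diffs = sum(abs(a - b) for a in values for b in values)
    return abs_diffs / (2 * n * total)

# Ten people: nine share 1% of the pie equally, one takes the remaining 99%.
group = [1 / 9] * 9 + [99]
print(round(gini(group), 2))  # 0.89, matching the reported global wealth Gini
```

With equal shares the coefficient is zero, e.g. gini([5, 5, 5]) returns 0.0.<br />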
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png Gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards for proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with a discussion of the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from Fibonacci series and are often given special names such as <em>the golden ratio</em> (approx 1.618) in architecture and design.<br />
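The arithmetic behind the 61.8% figure is easy to verify: the ratio of consecutive Fibonacci numbers converges to the golden ratio, and its reciprocal is about 0.618. A minimal sketch of that convergence (an illustration, not taken from the article):<br />

```python
def fib_ratio(n):
    """Ratio of consecutive Fibonacci numbers after n steps."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return b / a

phi = (1 + 5 ** 0.5) / 2        # the golden ratio, ~1.6180
print(round(fib_ratio(30), 4))  # 1.618
print(round(1 / phi, 4))        # 0.618: the chartists' 61.8% level
```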
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition; one that they appear to fail. However, chartists will not be completely discouraged as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning:<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
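The AR(1) test mentioned above can be sketched as follows. This is a deliberately simplified, intercept-free estimator applied to simulated white noise (standing in for price differences under the random-walk hypothesis); a real test would also need standard errors and real market data:<br />

```python
import random

def ar1_coefficient(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise (no intercept)."""
    num = sum(prev * curr for prev, curr in zip(series, series[1:]))
    den = sum(prev * prev for prev in series[:-1])
    return num / den

# Under the random-walk hypothesis, price differences are uncorrelated,
# so phi estimated on white noise should be close to zero.
random.seed(0)
diffs = [random.gauss(0, 1) for _ in range(10_000)]
print(abs(ar1_coefficient(diffs)) < 0.1)  # True: no trend detected
```

A phi estimate significantly different from zero would indicate the kind of trending that chartists claim to exploit.<br />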
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. The latter tried his hand at playing the stock markets without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>
Mmartin
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=3470
Chance News 22
2007-01-05T17:05:37Z
<p>Mmartin: /* Wealth of nations */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and found that other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow:<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and from a faculty member who shared lab space with Poehlman, who advised:<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
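By way of contrast, legitimate imputation leaves an audit trail: imputed values are flagged so that later analyses can treat them differently from observed data. A minimal sketch of mean imputation (the simplest method, chosen here only for illustration):<br />

```python
def impute_mean(values):
    """Fill missing entries (None) with the observed mean.
    Each result is paired with a flag marking whether it was imputed."""
    observed = [v for v in values if v is not None]
    mean = sum(observed) / len(observed)
    return [(mean, True) if v is None else (v, False) for v in values]

print(impute_mean([1.0, None, 3.0]))  # [(1.0, False), (2.0, True), (3.0, False)]
```

The flag is the point: analyses can then exclude or sensitivity-test the imputed entries, which is exactly what handing DeNino an unlabeled spreadsheet made impossible.<br />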
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It's psychologically and emotionally difficult to confront someone who has devoted time to your professional development. Even when an investigator is emotionally willing to confront their mentor, they have their career concerns to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, data that is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic; professional users, such as the Royal Statistical Society, gave a cautious welcome to the government’s announcement of independence for the UK Office for National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy and that, while government statisticians are honest people, ministers' (politicians') needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were the suggestions for greater independence, made by bodies such as the Statistics Commission and the Royal Statistical Society:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains statistics from a report on wealth distribution by country and worldwide, based on year-2000 data. It points out (toward the end) that even though every country has seen growing income inequality in the last six years, the *worldwide* inequality gap may be narrowing from the year-2000 figures to the present. The reason is the huge growth and wealth accumulation in China and India, which raises incomes overall, even though both countries have also seen greater inequality internally. This is the structure of Simpson's paradox: a trend that holds within every subgroup reverses in the aggregate.<br />
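A toy example (with invented numbers, not taken from the article) shows the mechanism: inequality, measured by the Gini coefficient, can rise within every country yet fall for the combined population when the poorer country's mean grows faster:<br />

```python
def gini(values):
    """Gini coefficient via sum of absolute pairwise differences / (2 * n * total)."""
    n, total = len(values), sum(values)
    return sum(abs(a - b) for a in values for b in values) / (2 * n * total)

# A poor country and a rich country, before and after a period of growth.
poor_before, poor_after = [1, 2], [1, 3]
rich_before, rich_after = [10, 20], [11, 23]

# Inequality rises within each country...
print(gini(poor_after) > gini(poor_before))  # True
print(gini(rich_after) > gini(rich_before))  # True
# ...yet falls for the combined population, because the poor
# country's mean grows faster than the rich country's.
print(gini(poor_after + rich_after) < gini(poor_before + rich_before))  # True
```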
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest case of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days...Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist, but at Harvard] had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although research fraudsters tend to have traits in common--colossal egos, external as well as internal pressures, the desire for fame, money, etc.--each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age, because he had the only large longitudinal study, implying that it would be publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is a statement made by Steven Heymsfield, an obesity researcher at Merck, which should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
Discussion<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
<br />
The authors warn about the ambiguity in the definition of wealth<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household well-being and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one, where zero means everyone holds an equal share and one means a single person holds everything. They claim that wealth is shared much less equitably than income: income Ginis typically range from 35% to 45%, while wealth Ginis are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
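The authors' ten-person illustration can be checked directly. The following is a minimal sketch (not from the report) using the mean-absolute-difference definition of the Gini coefficient:

```python
def gini(values):
    # Gini = mean absolute difference between all pairs / (2 * mean)
    n = len(values)
    mean = sum(values) / n
    mad = sum(abs(a - b) for a in values for b in values) / (n * n)
    return mad / (2 * mean)

# One person in a group of ten takes 99% of the pie;
# the other nine share the remaining 1%.
pie = [99.0] + [1.0 / 9] * 9
print(round(gini(pie), 2))  # 0.89, matching the reported global wealth Gini
```

Running this reproduces the 89% figure exactly, which suggests the authors chose their ten-person example to match the measured global value.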
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png Gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards for proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with a discussion of the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from Fibonacci series and are often given special names such as <em>the golden ratio</em> (approx 1.618) in architecture and design.<br />
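The connection between the 61.8% level and Fibonacci numbers is that ratios of consecutive terms in the series converge to the golden ratio, whose reciprocal is approximately 0.618. A quick sketch (my own illustration, not from the article):

```python
# Ratios of consecutive Fibonacci numbers converge to the golden ratio
# (~1.618); its reciprocal (~0.618) is the 61.8% "retracement" level
# that some chartists watch.
def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

ratio = fib(20) / fib(19)  # 6765 / 4181
print(round(ratio, 3))     # 1.618
print(round(1 / ratio, 3)) # 0.618
```

The convergence is rapid: by the twentieth term the ratio already agrees with the golden ratio to seven decimal places.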
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition; one that they appear to fail. However, chartists will not be completely discouraged as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
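The AR(1) test mentioned in the questions above can be sketched by hand: regress each price difference on the previous one and look at the estimated coefficient. The data below are simulated (a random walk, not the Dow series studied by Batchelor and Ramyar), so the coefficient should come out near zero:

```python
import random

def ar1_coefficient(series):
    # OLS slope of x_t on x_{t-1}: the AR(1) coefficient estimate
    x, y = series[:-1], series[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

random.seed(1)
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] + random.gauss(0, 1))  # independent increments
diffs = [q - p for p, q in zip(prices, prices[1:])]
print(round(ar1_coefficient(diffs), 3))  # near 0: no exploitable trend
```

A coefficient significantly different from zero on real price differences would suggest a trend; on this simulated walk it differs from zero only by sampling noise.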
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. Paulos himself tried his hand at playing the stock market, without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=3469Chance News 222007-01-05T16:50:46Z<p>Mmartin: /* Predecessors of Poehlman */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and a faculty member who shared lab space with Poehlman who advised<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It is psychologically and emotionally difficult to confront someone who has devoted time to your professional development. And even investigators who are willing to confront their mentor have their own careers to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, data that is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic, to which professional users, such as the Royal Statistical Society, gave a cautious welcome to the government’s announcement of independence for the UK Office of National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy and that government statisticians are honest people, but that ministers' (politicians') needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were the suggestions for greater independence made by bodies such as the Statistics Commission and the Royal Statistical Society:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains statistics from a 2000 report on wealth distribution by country and worldwide. It points out (toward the end) that even though every country has seen growing income inequality in the last six years, the ''worldwide'' inequality gap may be narrowing from the year 2000 figures to the present. The reason is the huge growth and wealth accumulation in China and India, which raises incomes overall, even though both countries have also seen greater inequality.<br />
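A minimal numeric sketch of the paradox, using made-up incomes for two equal-sized countries (all figures are hypothetical illustrations, not data from the report; the <code>gini</code> helper is a bare-bones implementation of the standard mean-absolute-difference formula):

```python
def gini(xs):
    """Gini coefficient via the mean-absolute-difference formula."""
    n = len(xs)
    mean = sum(xs) / n
    mad = sum(abs(a - b) for a in xs for b in xs) / n ** 2
    return mad / (2 * mean)

# Hypothetical (poor, rich) incomes for two equal-sized groups per country.
rich_2000, rich_2006 = [50, 100], [52, 110]   # wealthy country
poor_2000, poor_2006 = [2, 6], [8, 30]        # fast-growing poor country

# Within each country, inequality rises over the period...
assert gini(rich_2006) > gini(rich_2000)
assert gini(poor_2006) > gini(poor_2000)

# ...yet pooled "worldwide" inequality falls, because overall growth in
# the poor country narrows the gap between the two countries.
assert gini(poor_2006 + rich_2006) < gini(poor_2000 + rich_2000)
print("within-country inequality up, worldwide inequality down")
```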
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest case of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days...Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist, but at Harvard] had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although perpetrators of research fraud tend to have things in common (colossal egos, external as well as internal pressures, desire for fame, money, etc.), each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age, because he had the only large longitudinal study, implying that it would be publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is a statement made by Steven Heymsfield, an obesity researcher at Merck, which should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
Discussion<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: it is the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
<br />
The authors warn about the ambiguity in the definition of wealth<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household wellbeing and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one: zero means everyone has the same income, and one means that one person has all the income and everyone else has none. They claim that wealth is shared much less equitably than income: income Ginis typically range from 35% to 45%, while wealth Ginis are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
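The arithmetic behind this "one in ten takes 99%" illustration can be checked directly; a short sketch (the <code>gini</code> helper below is our own, computing the standard mean-absolute-difference formula, not code from the study):

```python
def gini(xs):
    """Gini coefficient via the mean-absolute-difference formula."""
    n = len(xs)
    mean = sum(xs) / n
    mad = sum(abs(a - b) for a in xs for b in xs) / n ** 2
    return mad / (2 * mean)

# One person in a group of ten takes 99% of the pie;
# the other nine share the remaining 1% equally.
pie = [99.0] + [1.0 / 9] * 9
print(round(gini(pie), 2))  # → 0.89, matching the reported global wealth Gini
```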
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png Gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards for proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with a discussion of the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from Fibonacci series and are often given special names such as <em>the golden ratio</em> (approx 1.618) in architecture and design.<br />
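For the record, the 61.8% retracement level is just the reciprocal of the golden ratio, which ratios of consecutive Fibonacci numbers converge to; a quick illustration:

```python
# Ratios of consecutive Fibonacci numbers converge on the golden ratio
# (about 1.618); its reciprocal (about 0.618) is the chartists' 61.8% level.
fib = [1, 1]
while len(fib) < 20:
    fib.append(fib[-1] + fib[-2])

print(round(fib[-1] / fib[-2], 3))  # → 1.618
print(round(fib[-2] / fib[-1], 3))  # → 0.618
```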
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition; one that they appear to fail. However, chartists will not be completely discouraged as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
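On the AR(1) question above, a minimal sketch of such a test (illustrative only: simulated random-walk prices stand in for real market data, and <code>ar1_coef</code> is a bare-bones least-squares fit of our own):

```python
import random

def ar1_coef(xs):
    """Least-squares AR(1) coefficient of a mean-removed series."""
    m = sum(xs) / len(xs)
    d = [x - m for x in xs]
    return (sum(d[t] * d[t - 1] for t in range(1, len(d)))
            / sum(x * x for x in d[:-1]))

random.seed(1)
# Under the random-walk hypothesis, price differences are i.i.d. noise,
# so the fitted AR(1) coefficient on the differences should be near zero;
# a trending market would instead show a clearly positive coefficient.
diffs = [random.gauss(0, 1) for _ in range(20000)]
phi = ar1_coef(diffs)
print(abs(phi) < 0.05)
```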
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. The latter tried his hand at playing the stock markets without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=3468Chance News 222007-01-05T16:44:05Z<p>Mmartin: /* Independence for national statistics */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and a faculty member who shared lab space with Poehlman who advised<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It's psychologically and emotionally difficult to confront someone who has devoted time to your professional development. Even when investigators are emotionally willing to confront their mentor, they have their own careers to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, data that is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic; professional users, such as the Royal Statistical Society, gave a cautious welcome to the government's announcement of independence for the UK Office of National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy and that, while government statisticians are honest people, ministers' (politicians') needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were suggestions for greater independence made by bodies such as the Statistics Commission and the Royal Statistical Society, including:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains statistics from a 2000 report on wealth distribution by country and worldwide. The article points out (toward the end) that even though every country has seen growing income inequality in the last six years, the *worldwide* inequality gap may be narrowing from the year 2000 statistics to the present. The reason is the huge growth and wealth accumulation in China and India, which raises income overall, even though both those countries have also seen greater inequality.<br />
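The mechanism is easier to see with invented numbers (a toy illustration, not the study's figures): inequality can rise inside each country while a pooled world measure falls, because the poorer country's incomes move closer to the richer country's.

```python
def gap(incomes):
    """Ratio of the total income of the top half to that of the bottom half."""
    s = sorted(incomes)
    half = len(s) // 2
    return sum(s[half:]) / sum(s[:half])

# Hypothetical two-person "countries": A is rich, B is poor.
a_2000, b_2000 = [90, 110], [9, 11]
a_2006, b_2006 = [90, 120], [40, 60]

# Within each country the gap widens ...
print(gap(a_2006) > gap(a_2000), gap(b_2006) > gap(b_2000))  # True True
# ... yet pooled worldwide, the gap narrows, because B has grown overall.
print(gap(a_2006 + b_2006) < gap(a_2000 + b_2000))  # True
```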
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest case of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days...Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist but at Harvard], had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although research frauds tend to have things in common--colossal egos, external as well as internal pressures, desire for fame, money, etc.--each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age, because he had the only large longitudinal study, implying that it would be publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is a statement made by Steven Heymsfield, an obesity researcher at Merck, which should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
===Discussion===<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: it is the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
<br />
The authors warn about the ambiguity in the definition of wealth<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household wellbeing and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one: zero means everyone has the same income, while one means that one person has all the income and everyone else has none. They claim that wealth is shared much less equitably than income: income Ginis typically range from 35% to 45%, while wealth Ginis are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
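The 89% figure in the quote can be checked with a few lines of Python. This is a minimal sketch using the mean-absolute-difference definition of the Gini coefficient, applied to the authors' own ten-person example (one person takes 99% of the pie, nine share the rest):

```python
def gini(shares):
    """Gini coefficient: mean absolute difference between all pairs,
    normalised by twice the mean share."""
    n = len(shares)
    mean = sum(shares) / n
    diff_sum = sum(abs(a - b) for a in shares for b in shares)
    return diff_sum / (2 * n * n * mean)

# One person with 99% of total wealth, nine sharing the remaining 1%.
shares = [0.99] + [0.01 / 9] * 9
print(round(gini(shares), 4))  # 0.89
```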
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five-point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards of proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from the Fibonacci series; the limiting ratio of successive terms is often given a special name, <em>the golden ratio</em> (approximately 1.618), familiar from architecture and design.<br />
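The 61.8% level is not arbitrary: it is the limit of the ratio of consecutive Fibonacci numbers (the reciprocal of the golden ratio). A quick Python check of the convergence:

```python
def fib_ratio(n):
    """Ratio F(n+1)/F(n+2) of consecutive Fibonacci numbers after n steps."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b  # advance one step along the Fibonacci sequence
    return a / b

print(round(fib_ratio(20), 3))  # 0.618
```

Whether markets actually respect this ratio is, of course, exactly what the Batchelor and Ramyar study tests.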
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition; one that they appear to fail. However, chartists will not be completely discouraged as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
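The AR(1) approach mentioned in the questions above can be sketched in plain Python. This is a toy illustration on simulated i.i.d. increments (invented data, not a market study): the slope from regressing each price difference on the previous one should be near zero when there is no first-order trend.

```python
import random

def ar1_slope(x):
    """OLS slope of x[t] regressed on x[t-1]; near zero suggests
    no first-order dependence in the series."""
    prev, curr = x[:-1], x[1:]
    mp = sum(prev) / len(prev)
    mc = sum(curr) / len(curr)
    cov = sum((a - mp) * (b - mc) for a, b in zip(prev, curr))
    var = sum((a - mp) ** 2 for a in prev)
    return cov / var

random.seed(42)
# Differences of a pure random walk: independent Gaussian increments.
diffs = [random.gauss(0.0, 1.0) for _ in range(5000)]
print(abs(ar1_slope(diffs)) < 0.1)  # a genuinely trending series would give a clearly nonzero slope
```

A real test would also need a standard error for the slope, and would face the data-snooping issues the second paper describes.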
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. The latter tried his hand at playing the stock markets without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>Mmartin<br />
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_22&diff=3467 Chance News 22, 2007-01-05T16:38:51Z<p>Mmartin: /* I wasn't making up data, I was imputing! */</p>
<hr />
<div>==Quotations==<br />
<br />
<br />
<blockquote>It would be hard to make a probability course boring.<br><br />
<div align=right><br />
William Feller<br><br />
Personal comment to Laurie Snell</div><br />
</blockquote><br />
----<br />
<blockquote> Apart from Fred, [an obstreperous rat in her psychology lab] I was sick of trying to master statistics. I had a mental block when it came to any form of mathematics. 'Rats and stats,' I complained to a fellow student one day, 'I came here to learn about people.' I wasn't the only student disgruntled. Many complained but to no avail.<br />
<div align=right><br />
Sally Morgan in her book, My Place<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote>The risk of going into cardiac arrest as a spectator, he [Dr. Siegal of Massachusetts General Hospital] said, is only about one in a million. (The applicable studies of spectators involved Super Bowl fans.)</blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote> NOAA's heating degree day forecast for December, January and February projects a 2 percent warmer winter than the 30 year average<Br><br />
<div align=right>[http://www.noaanews.noaa.gov/stories2006/s2742.htm NOAA Magazine]<br><br />
</div></blockquote><br />
<br />
The following Forsooths are from the November 2006 RSS NEWS.<br />
----<br />
<blockquote>At St John's Wood station alone, the number of CCTV cameras has jumped from 20 to 57, an increase of 300 per cent.<br />
<br><br />
<div align=right>Metro <br><br />
3 May 2006<br />
</div></blockquote><br />
----<br />
<blockquote>Now 78% of female veterinary medicine students are women, almost a complete turn-around from the previous situation.<br><br />
<div align=right><br />
The Herald (Glasgow) <br><br />
4 May 2006</div><br />
</blockquote><br />
----<br />
<blockquote>Drought to ravage half the world within 100 years<br><br><br />
Half the world's surface will be gripped by drought by the end of the century, the Met Office said yesterday.<br><br />
<div align=right><br />
Times online <br><br />
6 October 2006</div><br />
</blockquote><br />
----<br />
<br />
==I wasn't making up data, I was imputing!==<br />
<br />
An Unwelcome Discovery, by Jeneen Interlandi, The New York Times, October 22, 2006.<br />
<br />
The New York Times has an informative summary of a recent scandal involving a prominent researcher at the University of Vermont, Eric Poehlman. The Poehlman scandal represents perhaps the biggest case of research fraud in recent history.<br />
<br />
<blockquote>He presented fraudulent data in lectures and in published papers, and he used this data to obtain millions of dollars in federal grants from the National Institutes of Health — a crime subject to as many as five years in federal prison.</blockquote><br />
<br />
The first person to speak up about the possibility of fraud in Poehlman's work was one of his research assistants, Walter DeNino.<br />
<br />
<blockquote>The fall that DeNino returned to the lab, Poehlman was looking into how fat levels in the blood change with age. DeNino’s task was to compare the levels of lipids, or fats, in two sets of blood samples taken several years apart from a large group of patients. As the patients aged, Poehlman expected, the data would show an increase in low-density lipoprotein (LDL), which deposits cholesterol in arteries, and a decrease in high-density lipoprotein (HDL), which carries it to the liver, where it can be broken down. Poehlman’s hypothesis was not controversial; the idea that lipid levels worsen with age was supported by decades of circumstantial evidence. Poehlman expected to contribute to this body of work by demonstrating the change unequivocally in a clinical study of actual patients over time. But when DeNino ran his first analysis, the data did not support the premise.</blockquote><br />
<br />
<blockquote>When Poehlman saw the unexpected results, he took the electronic file home with him. The following week, Poehlman returned the database to DeNino, explained that he had corrected some mistaken entries and asked DeNino to re-run the statistical analysis. Now the trend was clear: HDL appeared to decrease markedly over time, while LDL increased, exactly as they had hypothesized.</blockquote><br />
<br />
<blockquote>Although DeNino trusted his boss implicitly, the change was too great to be explained by a handful of improperly entered numbers, which was all Poehlman claimed to have fixed. DeNino pulled up the original figures and compared them with the ones Poehlman had just given him. In the initial spreadsheet, many patients showed an increase in HDL from the first visit to the second. In the revised sheet, all patients showed a decrease. Astonished, DeNino read through the data again. Sure enough, the only numbers that hadn’t been changed were the ones that supported his hypothesis.<br />
</blockquote><br />
<br />
Poehlman brushed DeNino's concerns aside, so DeNino started asking around and found that other graduate students and postdocs had similar concerns. He got some cautionary advice from a former postdoctoral fellow<br />
<br />
<blockquote>Being associated with either falsified data or a frivolous allegation against a scientist as prominent as Poehlman could end DeNino’s career before it even began.</blockquote><br />
<br />
and a faculty member who shared lab space with Poehlman who advised<br />
<br />
<blockquote>If you’re going to do something, make sure you really have the evidence.</blockquote><br />
<br />
So DeNino started looking for the evidence.<br />
<br />
<blockquote>DeNino spent the next several evenings combing through hundreds of patients’ records in the lab and university hospital, trying to verify the data contained in Poehlman’s spreadsheets. Each night was worse than the one before. He discovered not only reversed data points, but also figures for measurements that had never been taken and even patients who appeared not to exist at all.</blockquote><br />
<br />
DeNino presented his evidence to the university counsel and the response of Poehlman (to his department chair, Burton Sobel) was rather startling.<br />
<br />
<blockquote>The accused scientist gave him the impression that nothing was wrong and seemed mostly annoyed by all the fuss. In his written response to the allegations, Poehlman suggested that the data had gotten out of hand, accumulating numerous errors because of handling by multiple technicians and postdocs over the years. “I found that noncredible, really, for an investigator of Eric’s experience,” Sobel later told the investigative panel. “There had to be a backup copy that was pure,” Sobel reasoned before the panel. “You would not have postdocs and lab techs in charge of discrepant data sets.” But Poehlman told Sobel that there was no master copy.</blockquote><br />
<br />
At the formal hearing, Poehlman had a different defense.<br />
<br />
<blockquote>First, he attributed his mistakes to his own self-proclaimed ineptitude with Excel files. Then, when pressed on how fictitious numbers found their way into the spreadsheet he’d given DeNino, Poehlman laid out his most elaborate explanation yet. He had imputed data — that is, he had derived predicted values for measurements using a complicated statistical model. His intention, he said, was to look at hypothetical outcomes that he would later compare to the actual results. He insisted that he never meant for DeNino to analyze the imputed values and had given him the spreadsheet by mistake.</blockquote><br />
<br />
The New York Times article points out how pathetic this attempted explanation was.<br />
<br />
<blockquote>Although data can be imputed legitimately in some disciplines, it is generally frowned upon in clinical research, and this explanation came across as hollow and suspicious, especially since Poehlman appeared to have no idea how imputation was done.</blockquote><br />
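For contrast, here is a minimal sketch (with hypothetical numbers) of what legitimate imputation looks like: the fill-in rule is stated openly and every imputed value is flagged, so nobody can mistake it for a measurement.

```python
# Hypothetical cholesterol values; None marks a missing measurement.
data = [182.0, None, 175.5, 190.2, None, 168.9]

observed = [x for x in data if x is not None]
mean = sum(observed) / len(observed)  # a simple, openly stated fill-in rule

# Keep a flag on every record saying whether it was measured or imputed,
# so imputed values are never analyzed as if they were real data.
imputed = [(x, "measured") if x is not None else (mean, "imputed")
           for x in data]
print(sum(1 for _, flag in imputed if flag == "imputed"))  # -> 2
```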
<br />
A large portion of the article examines how research fraud can occur in a system that is supposed to be self-correcting.<br />
<br />
First, the people who are most likely to notice fraud are junior investigators who are subordinate to their research mentor. It's psychologically and emotionally difficult to confront someone who has devoted time to your professional development. And even when investigators are emotionally willing to confront their mentor, they have their own career concerns to worry about.<br />
<br />
<blockquote>The principal investigator in a lab has the power to jump-start careers. By writing papers with graduate students and postdocs and using connections to help obtain fellowships and appointments, senior scientists can help their lab workers secure coveted tenure-track jobs. They can also do damage by withholding this support.</blockquote><br />
<br />
Every university will have a system in place to investigate claims of fraud. But there are problems here as well.<br />
<br />
<blockquote>All universities that receive public money to conduct research are required to have an integrity officer who ensures compliance with federal guidelines. But policing its scientists can be a heavy burden for a university. “It’s your own faculty, and there’s this idea of supporting and nurturing them,” says Ellen Hyman-Browne, a research-compliance officer at the Children’s Hospital of Philadelphia, a teaching hospital. Moreover, investigations cost time and money, and no institution wants to discover something that could cast a shadow on its reputation.</blockquote><br />
<br />
<blockquote>“There are conflicting influences on a university where they are the co-grantor and responsible to other investigators,” says Stephen Kelly, the Justice Department attorney who prosecuted Poehlman. “For the system to work, the university has to be very ethical.”</blockquote><br />
<br />
Poehlman himself was careful and chose areas where fraud would be especially difficult to detect. He specialized in presenting longitudinal data, which is very expensive to replicate. He also presented research results that confirmed what most researchers had suspected, rather than results that would undermine existing theories of nutrition.<br />
<br />
Poehlman was sentenced to one year and one day in federal prison, making him the first researcher to serve time in jail for research fraud.<br />
<br />
<blockquote>“When scientists use their skill and their intelligence and their sophistication and their position of trust to do something which puts people at risk, that is extraordinarily serious,” the judge said. “In one way, this is a final lesson that you are offering.”</blockquote><br />
<br />
===Questions===<br />
<br />
1. Do you have experience with a researcher changing the data values after seeing the initial analysis results? What would make you suspicious of fraud?<br />
<br />
2. Is the peer-review system of research self-correcting? What changes could be made to this system?<br />
<br />
3. When is imputation legitimate and when is it fraudulent?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Independence for national statistics==<br />
[http://www.johnkay.com/political/453 A better way to restore faith in official statistics], [http://www.johnkay.com/ John Kay], Financial Times 25 July 2006.<br><br />
<br />
[http://www.johnkay.com/political/453 John Kay], a columnist for the [http://www.ft.com Financial Times], outlines the measures needed to ensure that national statistics are truly independent. <br />
<br />
The current state of UK official statistics was covered in a previous Chance article<br />
<em>[http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9#Pick_a_number.2C_any_number Pick a number, any number,]</em> in [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_9 Chance News 9.] That article summarised a report on this topic; professional users, such as the Royal Statistical Society, gave a cautious welcome to the government’s announcement of independence for the UK Office for National Statistics (ONS). <br />
<br />
Kay's article follows up on the reaction to that report.<br />
He tells us that accurate public information is a prerequisite of democracy and that government statisticians are honest people, but ministers’ (politicians’) needs are often for propaganda rather than facts. <br />
Kay claims that decentralisation of responsibility for the production of official statistics has created a two-tier system in the UK. <br />
<blockquote><br />
statistics produced by the Office for National Statistics (ONS), which operates to internationally agreed criteria, are of higher quality than those produced by (government) departments. <br />
</blockquote><br />
The proposal to hand responsibility for all official statistics to the ONS was rejected,<br />
as were the suggestions for greater independence made by bodies such as the Statistics Commission and the Royal Statistical Society:<br />
* separating statistical information from political statements, <br />
* reducing access by ministers to new data before their release, <br />
* giving parliament a defined role in the appointment of the National Statistician. <br />
<br />
Instead, the latest news is that the ONS will be demoted to a non-ministerial department.<br />
The worst news is the abolition of the Statistics Commission, which reviews all government statistics, and has made itself unpopular with government by proving itself robustly independent. <br />
<br />
Kay also cautions that statistics may be misused in contexts other than those intended. The value of health services increases as incomes rise and it can be argued that this increases the value of health output even if outcomes and procedures are unchanged. This statistical adjustment provides no basis whatever for claims that the National Health Service is more efficient. But the assertion grabs a headline, and it is only much later that pedantic journalists and academics can discover what is actually going on. <br />
<br />
Submitted by John Gavin.<br />
<br />
==An example of Simpson's Paradox==<br />
<br />
Study finds wealth inequality is widening worldwide<br><br />
''New York Times'', Dec. 6, 2006, C-3<br><br />
Eduardo Porter<br />
<br />
The article contains stats from a 2000 report on wealth distribution by <br />
country and worldwide. The article points out (toward the end) that <br />
even though every country has seen growing income inequality in the <br />
last six years, the ''worldwide'' inequality gap may be narrowing from <br />
the year 2000 stats to the present. The reason is the huge growth and <br />
wealth accumulation in China and India, which raises income overall, <br />
even though both those countries have also seen greater inequality.<br />
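The reversal can be illustrated with toy numbers (hypothetical, not from the report): within each country the rich-poor gap widens, yet the worldwide gap narrows because incomes in the fast-growing poor country rise across the board.

```python
# Toy illustration of the aggregation reversal (made-up numbers, not the
# report's data). Each country's internal rich-poor gap widens between the
# two years, but the worldwide richest-to-poorest ratio shrinks.
incomes = {
    2000: {"rich_country": (40, 60), "poor_country": (2, 4)},    # (poor, rich)
    2006: {"rich_country": (40, 65), "poor_country": (10, 14)},
}

for year, data in incomes.items():
    gaps = {c: hi - lo for c, (lo, hi) in data.items()}
    world_ratio = max(hi for _, hi in data.values()) / min(lo for lo, _ in data.values())
    print(year, gaps, world_ratio)
# Within-country gaps grow (20 -> 25 and 2 -> 4), while the worldwide
# richest-to-poorest ratio falls (30.0 -> 6.5).
```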
<br />
Submitted by Bob Dobrow<br />
<br />
==Predecessors of Poehlman==<br />
<br />
Steve Simon's wiki, "I wasn't making up data, I was imputing!" is quite interesting and informative. Nevertheless, some elaboration is in order regarding fraud and Simon's statement that "The Poehlman scandal represents perhaps the biggest cases of research fraud in recent history." <br />
<br />
The term "recent history" is sufficiently elastic to permit quoting myself in the 1980s: <blockquote>Admittedly Slutsky is an extreme example...even after the investigation [proving fraud in many of his papers]...Robert G. Slutsky was [still] given credit for [an additional] 77 publications in his seven years with [the University of California, San Diego]...in 1984 he published at the astonishing rate of one paper every ten days..Slutsky's phenomenal productivity was encouraged, applauded and rewarded...John R. Darsee [another cardiologist but at Harvard], had about 100 papers in a period of two years and his undoing in 1981 was colleagues who secretly saw him forging the data.</blockquote> <br />
<br />
Put Slutsky and Darsee into Google.com and you will see the entire treatment. My point is that the Eric Poehlman scandal is nowhere near the biggest--Slutsky and Darsee involved entire prestigious labs. And we tend to ignore history at our peril. An extensive treatment of Slutsky, Darsee and many others (Baltimore, Imanishi-Kari, Spector, Summerlin, Long, Alsabti, Soman, Breuning, Pearce, Hermann, Brach, Schoen, not to mention more illustrative predecessors such as Newton, Mendel, Pasteur and Freud) can be found in The Great Betrayal: Fraud in Science by Horace Freeland Judson [Harcourt, Inc., 2004].<br />
<br />
Although Judson's book is a wonderful page-turner, go to www.bmj.com/cgi/content/full/329/7471/922 to see a critique of the book by Peter Wilmshurst, a British cardiologist who is very active in unearthing medical fraud. Wilmshurst suggests that "Judson paints a rosier picture of the mechanisms for dealing with research fraud than I recognize." Further, "Judson only briefly describes what may be the most common form of research misconduct: failure to publish results...for the sake of company profits."<br />
<br />
Although research fraudsters tend to have things in common--colossal egos, external as well as internal pressures, desire for fame, money, etc.--each instance is possibly unique. Poehlman evidenced a typical trait: he fabricated the data. According to the original New York Times article, his study on menopause "was almost entirely fabricated. Poehlman had tested only 2 women, not 35." On the other hand, Poehlman was downright stupid to have changed his (real, existing) cholesterol data to fit his (and others') belief that cholesterol levels worsen with age, because he had the only large longitudinal study, implying that it would have been publishable and valuable regardless of the results. The other unusual feature was that "He was only the second scientist in the United States to face criminal prosecution for falsifying research data."<br />
<br />
Buried in the NYT article is a statement made by Steven Heymsfield, an obesity researcher at Merck, that should be a guiding light for all researchers: "But deans love people who bring in money and recognition to universities, so there is Eric."<br />
<br />
===Discussion===<br />
<br />
1. Use a search engine to determine what fraud was committed by some of the predecessors of Poehlman.<br />
<br />
2. Scientists claim that peer review and duplication of results act to inhibit fraud. Pick a researcher and determine why either or both failed.<br />
<br />
3. This wiki ends with a disparaging remark about university deans. Defend them.<br />
<br />
Submitted by Paul Alper<br />
<br />
==Wealth of nations==<br />
* Winner takes (almost) all, The Economist, 9th Dec 2006.<br><br />
* [http://www.eurekalert.org/pub_releases/2006-12/unu-pss120106.php Pioneering study shows richest 2 percent own half world wealth], James Davies of the University of Western Ontario, Anthony Shorrocks and Susanna Sandstrom of UNU-WIDER and Edward Wolff of New York University.<br />
<br />
The Helsinki-based World Institute for Development Economics Research of the United Nations University (UNU-WIDER)<br />
has conducted what it claims is the most comprehensive study of personal wealth ever undertaken: it is the first of its kind to cover all countries in the world and all major components of household wealth, including financial assets and debts, land, buildings and other tangible property.<br />
<br />
[[Image:WorldWeathLevels.jpg|frame|World Wealth Levels in Year 2000: The world map shows per capita wealth of different countries. Average wealth amounted to $144,000 per person in the USA in year 2000, and $181,000 in Japan. Lower down among countries with wealth data are India, with per capita assets of $1,100, and Indonesia with $1,400 per capita. Source: UNU-WIDER.]]<br />
<br />
The report contains a plethora of statistics, such as:<br />
<br />
* The richest 2% of adults in the world own more than half of global household wealth.<br />
* The richest 1% of adults alone owned 40% of global assets in the year 2000.<br />
* The richest 10% of adults accounted for 85% of the world total. <br />
* The bottom half of the world adult population owned barely 1% of global wealth.<br />
* To be among the richest 10% of adults in the world required $61,000 in assets.<br />
* More than $500,000 was needed to belong to the richest 1% (37 million members).<br />
* Household wealth amounted to $125 trillion in the year 2000, equivalent to roughly three times the value of total global production (GDP) or to $20,500 per person. Adjusting for differences in the cost-of-living across nations raises the value of wealth to $26,000 per capita when measured in terms of purchasing power parity dollars.<br />
* Wealth levels vary widely across countries: ranging from $37,000 per person for New Zealand and $70,000 for Denmark to $127,000 for the UK (for high-income OECD nations).<br />
* North America has only 6% of the world adult population, yet it accounts for 34% of household wealth.<br />
* Wealth is more unequally distributed than income across countries. High income countries tend to have a bigger share of world wealth than of world GDP. The reverse is true of middle- and low-income nations. <br />
<br />
The authors warn about the ambiguity in the definition of wealth<br />
<blockquote><br />
One should be clear about what is meant by 'wealth'. <br />
In everyday conversation the term 'wealth' often signifies little more than 'money income'. <br />
On other occasions economists use 'wealth' to refer to the value of all household resources, <br />
including human capabilities.<br />
</blockquote><br />
The authors define wealth to mean 'the value of physical and financial assets less debts',<br />
so wealth represents the ownership of capital. <br />
They claim that capital is widely believed to have a disproportionate impact on household wellbeing and economic success, and more broadly on economic development and growth.<br />
<br />
The authors use the [http://en.wikipedia.org/wiki/Gini_coefficient Gini value] to measure inequality on a scale from zero to one: zero would mean everyone has the same income, while one would mean that one person has all the income and everyone else has none. They claim that wealth is shared much less equitably than income: income Ginis typically range from 35% to 45%, while wealth Ginis are usually between 65% and 75%.<br />
The authors claim<br />
<blockquote><br />
The global wealth Gini for adults is 89%. The same degree of inequality would be obtained if one person in a group of ten takes 99% of the total pie and the other nine share the remaining 1%.<br />
</blockquote><br />
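The quoted claim is easy to check numerically: a short sketch using the standard Gini formula confirms that one person taking 99% of a ten-person pie, with the other nine splitting 1%, does give a Gini of 0.89.

```python
def gini(shares):
    """Gini coefficient of a list of non-negative shares (need not sum to 1)."""
    xs = sorted(shares)
    n = len(xs)
    total = sum(xs)
    # Standard formula: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # with shares sorted ascending and i running from 1 to n.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n

# Nine people split 1% of the pie; one person takes the remaining 99%.
shares = [0.01 / 9] * 9 + [0.99]
print(round(gini(shares), 2))  # -> 0.89, matching the quoted global wealth Gini
```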
<br />
Surprisingly, household debt is seen as relatively unimportant in poor countries. As the authors of the study point out:<br />
<blockquote><br />
while many poor people in poor countries are in debt, their debts are relatively small in total. This is mainly due to the absence of financial institutions that allow households to incur large mortgage and consumer debts, as is increasingly the situation in rich countries. Many people in high-income countries have negative net worth and—somewhat paradoxically—are among the poorest people in the world in terms of household wealth.<br />
</blockquote><br />
For example, the bottom half of the Swedish population have a collective net worth of less than zero, although Nordic countries, in general, seem to thrive with relatively little personal wealth.<br />
<br />
===Questions===<br />
* A presentation format consisting of a list of such point-estimate statistics seems disjointed, as it swaps repeatedly between statistics for the richest and the poorest. Could the data be more meaningfully presented via a distribution?<br />
* The graph shows a discrete five point distribution. Is such a split of the data into buckets such as 'under 2000' and 'over 50000' meaningful? <br />
* Mapping the output to countries via colours shows the geographic distribution of the underlying variable, wealth. What is misleading about this graph? How might countries be scaled in size to better reflect the data?<br />
* How might switching from measuring wealth to income affect the perception of the results? (A [http://en.wikipedia.org/wiki/Image:World_Map_Gini_coefficient.png Gini measure of income inequality] is available from Wikipedia, along with time trends since the 1940s.)<br />
* Two high wealth economies, Japan and the United States, show very different patterns of wealth inequality, with Japan having a wealth Gini of 55% and the USA a wealth Gini of around 80%. Speculate on what factors might explain this difference.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Science in the Courtroom==<br />
[http://www.nytimes.com/2006/12/05/science/05law.html?ex=1322974800&en=28d0cbd0efade415&ei=5088&partner=rssnyt&emc=rss When questions of science come to a courtroom, truth has many faces]<br><br />
New York Times, 5 December 2006, F3<br><br />
Cornelia Dean<br />
<br />
This article appeared as the US Supreme Court began hearing its first case involving global warming. A case has been filed against the federal government by a group of state and local governments, together with environmental groups. These plaintiffs charge that the Environmental Protection Agency, by refusing to regulate greenhouse gas emissions, is failing to enforce the Clean Air Act.<br />
<br />
Some of the arguments involve legal technicalities, such as whether the states actually have standing to bring such a suit. But the present article is concerned with the scientific evidence, and what responsibility the Court has to educate itself about the scientific underpinnings of a case. The article draws the following distinction between statistical and legal standards for proof:<br />
<br />
<blockquote><br />
Typically, scientists don't accept a finding unless, statistically, the odds are less than 1 in 20 that it occurred by chance. This standard is higher than the typical standard of proof in civil trials (&quot;preponderance of the evidence&quot;) and lower than the standard for criminal trials (&quot;beyond a reasonable doubt&quot;).<br />
</blockquote><br />
<br />
The article provides some historical references on how the Court has previously viewed scientific testimony, beginning with a discussion of the 1923 Frye case on lie detectors, which introduced the &quot;general acceptance&quot; standard. This was updated in the 1993 case Daubert v. Merrell Dow Pharmaceuticals, which involved the drug Bendectin and its possible association with birth defects. The Court introduced the concepts of &quot;testability&quot; and &quot;peer review&quot; into its deliberations on science. In the 1997 case General Electric Company v. Joiner, the Court ruled that &quot;judges could reject evidence if there was simply too great a gap between 'the data and the opinion proffered.'&quot;<br />
<br />
The main thrust of the article, however, is that the Court still has been too slow to keep up with the explosion of scientific knowledge, which can be expected to play an ever larger role in future cases. For example, when corrected on a technical point in the discussion about carbon dioxide, Justice Scalia responded, &quot;Troposphere, whatever. I told you before I'm not a scientist.&quot;<br />
<br />
DISCUSSION QUESTIONS<br />
<br />
(1) What do you think of the suggested correspondence between the legal and statistical standards for evidence? What probability numbers would you attach to &quot;preponderance of the evidence&quot; and &quot;beyond a reasonable doubt&quot;?<br />
<br />
(2) How should a judge decide when there is too great a gap between &quot;the data and the opinion proffered&quot;?<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Magic numbers==<br />
<br />
[http://www.economist.com/research/articlesBySubject/displayStory.cfm?subjectid=2512631&story_id=7953427 Technical failure], Buttonwood, The Economist, Sep 21st 2006.<br />
<br />
In financial markets, some traders believe that markets change trend when they reach, say, 61.8% of their previous high, or 61.8% above their low. Such seemingly magical numbers are derived from Fibonacci series and are often given special names such as <em>the golden ratio</em> (approx 1.618) in architecture and design.<br />
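The 61.8% level comes straight from the golden ratio: ratios of consecutive Fibonacci numbers converge to approximately 1.618, and its reciprocal is approximately 0.618. A quick sketch:

```python
# Why 61.8%? Ratios of consecutive Fibonacci numbers converge to the
# golden ratio phi ~= 1.618, and 1/phi ~= 0.618 -- the 61.8% level.
a, b = 1, 1
for _ in range(30):      # iterate far enough for the ratio to converge
    a, b = b, a + b
phi = b / a
print(round(phi, 3))     # -> 1.618
print(round(1 / phi, 3)) # -> 0.618
```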
<br />
This article categorises such traders as follows:<br />
<blockquote><br />
Believers in Fibonacci numbers are part of a school known as [http://en.wikipedia.org/wiki/Technical_analysis technical analysis,] or chartism, which believes the future movement of asset prices can be divined from past data. <br />
Some chartists follow patterns such as <em>head and shoulders</em> and <em>double tops</em>; others focus on moving averages; a third group believes markets move in pre-determined waves. The Fibonacci fans fall into this last set.<br />
</blockquote><br />
<br />
The Economist article points out that <br />
a [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 new study], by Professor Roy Batchelor and Richard Ramyar of the Cass Business School, finds no indication that trends reverse at the 61.8% level, or indeed at any predictable milestone in American stockmarkets.<br />
<br />
Fibonacci numbers at least have the virtue of creating a testable proposition; one that they appear to fail. However, chartists will not be completely discouraged as The Economist highlights [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 another study] which claims that 58 of 92 modern studies of technical analysis produced positive results. The authors of this second paper conclude:<br />
<blockquote><br />
Despite the positive evidence ... it appears that most empirical studies are subject to various problems in their testing procedures, e.g. data snooping, ex-post selection of trading rules or search technologies and difficulties in estimation of risk and transaction costs.<br />
</blockquote><br />
<br />
The Economist article goes on to imply that the theory which dominates at any point in time may simply be a matter of fashion:<br />
<blockquote><br />
If financial markets are efficient, technical analysis should not work at all; the prevailing market price should reflect all information, including past price movements. However, academic fashion has moved in favour of behavioural finance, which suggests that investors may not be completely rational and that their psychological biases could cause prices to deviate from their 'correct' level.<br />
</blockquote><br />
<br />
The article claims that chartism probably works best in the [http://en.wikipedia.org/wiki/Foreign_exchange_market foreign-exchange market] because major participants, especially central banks, are not 'profit-maximising', leading to inefficient pricing. Furthermore, some technical predictions may be self-fulfilling; if everyone believes that the dollar will rebound at 100 yen, they will buy it as it approaches that level. <br />
<br />
But it finishes with a warning<br />
<blockquote><br />
Chartists fall prey to their own behavioural flaw, finding “confirmation” of patterns everywhere, as if they were reading clouds in their coffee futures.<br />
</blockquote><br />
<br />
===Questions===<br />
* Can you think of possible ways to alleviate the biases mentioned: 'data snooping', 'ex-post selection of trading rules' and 'transaction costs'? Which of these issues do you think is easiest to incorporate into an analysis?<br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'Critics of technical analysis include well known [http://en.wikipedia.org/wiki/Fundamental_analysis fundamental analysts.] Warren Buffett has said, <em>I realized technical analysis didn't work when I turned the charts upside down and didn't get a different answer</em> and <em>if past history was all there was to the game, the richest people would be librarians.</em>' How might you test if Buffett's assertions are true? <br />
* (from [http://en.wikipedia.org/wiki/Technical_analysis#Lack_of_evidence Wikipedia]) 'To a technician, however, Buffett paraphrased [technical analysis] when he commented in a recent conference on investing in mining companies, <em>in metals and oils, there's been a terrific [price] move. It's like most trends: at the beginning, it's driven by fundamentals, then speculation takes over ... then the speculation becomes dominant.</em>' Do you agree that Buffett is acknowledging that markets are inefficient because they trend? Would a basic, first-order, auto-regressive model (AR(1)) on price differences be sufficient to test the existence of such a trend?<br />
* Technicians argue that many investors base their future expectations on past earnings, track records, etc. Because future stock prices can be strongly influenced by investor expectations, technicians claim this means that past prices can influence future prices. Does this argument persuade you?<br />
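As a sketch of the AR(1) test mentioned in the questions above (run here on simulated prices, since no real data accompanies this article): under the random-walk hypothesis, price differences are uncorrelated, so the least-squares AR(1) coefficient on the differences should be near zero.

```python
import random

# Simulate random-walk prices: under this null hypothesis there is no
# trend, so the AR(1) coefficient on price differences should be ~0.
random.seed(42)
prices = [100.0]
for _ in range(5000):
    prices.append(prices[-1] + random.gauss(0, 1))

d = [b - a for a, b in zip(prices, prices[1:])]  # first differences
# Least-squares AR(1) estimate without intercept: regress d[t] on d[t-1].
phi_hat = sum(x * y for x, y in zip(d, d[1:])) / sum(x * x for x in d[:-1])
print(abs(phi_hat) < 0.1)  # -> True: no evidence of a trend in this sample
```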
<br />
===Further reading===<br />
* [http://www.cass.city.ac.uk/media/stories/resources/Magic_Numbers_in_the_Dow.pdf#search=%22magic%20numbers%20in%20the%20dow%22 Magic numbers in the Dow,] Roy Batchelor and Richard Ramyar, Cass Business School, City of London, Sep 2006.<br />
* [http://papers.ssrn.com/sol3/papers.cfm?abstract_id=603481 The Profitability of Technical Analysis: a review,] by Cheol-Ho Park and Scott H Irwin, University of Illinois, October 2004.<br />
* The [http://en.wikipedia.org/wiki/Random_walk_hypothesis random walk hypothesis] is at odds with technical analysis and charting. This hypothesis claims that stock price movements are a Brownian Motion with either independent or uncorrelated increments. In such a model, movements in stock prices are not dependent on past stock prices, so trends cannot exist and technical analysis has no basis. Random Walk advocates such as [http://www.math.temple.edu/~paulos John Allen Paulos] believe that technical analysis and fundamental analysis are pseudo-sciences. The latter tried his hand at playing the stock markets without success:<br />
<blockquote><br />
[http://www.math.temple.edu/~paulos/contents.html A Mathematician Plays the Stock Market] is the story of my disastrous love affair with WorldCom, but lest you dread a cloyingly personal account of how I lost my shirt (or at least had my sleeves shortened), I assure you that the book's primary purpose is to lay out, elucidate, and explore the basic conceptual mathematics of the market. I'll examine ... issues associated with the market. Is it efficient? random? Is there anything to technical analysis, fundamental analysis, and other supposedly time-tested methods of picking stocks? How can one quantify risk? What is the role of cognitive illusion and psychological foible (to which, alas, I am not immune)?<br />
...<br />
In short, what can the tools of mathematics tell us about the vagaries of the stock market?<br />
</blockquote><br />
<br />
Submitted by John Gavin.</div>Chance News 20, 2006-10-03<p>Mmartin: /* A Reader's Guide to Polls */</p>
<hr />
<div>==Quotations==<br />
<br />
<blockquote>Like dreams, statistics are a form of wish fulfillment. - Jean Baudrillard </blockquote><br />
----<br />
<blockquote>According to an article in the WSJ by Dr. Jerome Groopman of the Harvard Medical School criticizing alternative medicine: on the wall of the office of Dr. Stephen Straus who directs NCCAM, (formerly the Office of Alternative Medicine which is within the National Institutes of Health) there exists the following framed quotation, "The plural of anecdote is not evidence."<br />
This useful and insightful aphorism appears in various versions as can be seen by this website [http://bearcastle.com/blog/?m=20050808 here]. </blockquote><br />
<br />
==Forsooth==<br />
<br />
<blockquote><br />
"People who live longer have a greater chance of developing cancer in old age." Heard on the "Today" news programme on BBC Radio 4 and reported to the [http://groups.google.com/group/MedStats MEDSTATS] discussion group by Ted Harding.<br />
</blockquote><br />
<br />
The next two Forsooths are from the September RSS NEWS.<br />
----<br />
<blockquote><br />
The number of motorists willing to pay to travel on Britain's roads is falling, a survey out today reveals. More than one in four drivers were will to pay to use city centre roads in 2002, but that figure fell to just 36 per cent in 2005, a study for the RAC said.<br><br />
<div align=right>Metro <br><br />
16 March 2006<br />
</div></blockquote><br />
----<br />
<blockquote><br />
At present, Labour has a majority of 64, which means it holds 32 more seats than the other parties combined.<br><br />
<div align=right><br />
''Times on line'' <br><br />
20 March 2006 </div><br />
</blockquote><br />
<br />
==A car talk puzzle==<br />
Week of 08-21-07 <br />
<br />
The bullet holes were all over the place on the R.A.F. planes -- in the wings and the fuselage, and seemingly distributed randomly on the undersides. So, where did the R.A.F. mathematician recommend extra armor, to save future missions?<br />
<br />
==A clumsy attempt at anonymization ==<br />
<br />
A Face is Exposed for AOL Searcher No. 4417749<br><br />
''The New York Times'', August 9, 2006<br><br />
Michael Barbaro and Tom Zeller, Jr<br />
<br />
Statisticians frequently deal with confidentiality issues when deciding what type of data and what amount of detail should be withheld to protect sensitive information about individual patients or institutions. It's not an easy task and there are some subtle traps. And sometimes there are not so subtle traps.<br />
<br />
At the request of some researchers, America Online (AOL) released data on 20 million web searches performed by 650 thousand AOL users over a three-month span. They released the data not just to those researchers, but to the general public. AOL quickly realized that this was a bad idea and removed the database, but it had already been copied to many locations. It is unlikely that they will ever be able to persuade the web owners at all the other locations to take the files offline.<br />
<br />
The data was anonymized by replacing the user name with a random number. This is important, because some of the search terms are for rather sensitive items. Examples of things that people searched on are<br />
<br />
- "can you adopt after a suicide attempt" or<br />
<br />
- "how to tell your family you're a victim of incest."<br />
<br />
But replacing a name by a number did not come even close to anonymizing all of the records. The problem is that people will do web searches about things that reveal hints about themselves. Actual searches listed in the data base included things like geographic locations:<br />
<br />
- "gynecology oncologists in new york city,"<br />
<br />
- "orange county california jails inmate information,"<br />
<br />
- "employment needed- louisville ky," or<br />
<br />
- "salem probate court decisions,"<br />
<br />
or places where the searchers shopped or banked or got health care,<br />
<br />
- "gerards restaurant in dc,"<br />
<br />
- "st. margaret's hospital washington d.c.,"<br />
<br />
- "l&n federal credit union," or <br />
<br />
- "mustang sally gentlemans club,"<br />
<br />
or products that the searchers owned,<br />
<br />
- "cheap rims for a ford focus," or<br />
<br />
- "how to change brake pads on scion xb,"<br />
<br />
or their hobbies,<br />
<br />
- "knitting stitches," or<br />
<br />
- "texas hold'em poker on line seminars."<br />
<br />
It gets even more revealing when people do web searches on their relatives or even themselves.<br />
<br />
These individual searches are, according to one report, like individual pieces in a mosaic. Put enough of them together and you can get a really clear picture of who the searcher is. Can you actually identify people from their web searches? The answer is yes.<br />
<br />
According to the New York Times report, one user, with the id number 4417749 searched for<br />
<br />
- "landscapers in Lilburn, Ga," and<br />
<br />
- "homes sold in shadow lake subdivision gwinnett county georgia,"<br />
<br />
as well as the names of several people, all of whose last names were Arnold. It didn't take long for the New York Times to track down a 62-year-old widow named Thelma Arnold.<br />
<br />
<blockquote>Ms. Arnold, who agreed to discuss her searches with a reporter, said she was shocked to hear that AOL had saved and published three months’ worth of them. “My goodness, it’s my whole personal life,” she said. “I had no idea somebody was looking over my shoulder.”</blockquote><br />
<br />
This is an important lesson that statisticians have been aware of for some time. An individual piece of information by itself may not compromise someone's privacy, but will do so when it is combined with other pieces of information. Knowing that someone lives in a small town still preserves anonymity, but when that small town name appears in a database of all pediatric heart transplant cases, you have a problem.<br />
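The "mosaic" effect above can be sketched in a few lines of Python. Everything here is hypothetical toy data (the names and attributes are invented for illustration), but it shows how each quasi-identifier gleaned from searches shrinks the pool of candidates until only one person fits:<br />

```python
# Toy sketch of a "mosaic" linkage attack on anonymized search logs.
# All names and attributes below are hypothetical, invented for illustration.

# A small "public" directory: person -> attributes one could learn elsewhere
population = {
    "Thelma Arnold": {"lilburn_ga", "shadow_lake_subdivision", "age_60s"},
    "Pat Smith":     {"lilburn_ga", "age_30s"},
    "Lee Jones":     {"shadow_lake_subdivision", "age_60s"},
}

# Clues gleaned from one "anonymous" user's searches
clues = ["lilburn_ga", "shadow_lake_subdivision", "age_60s"]

candidates = set(population)
for clue in clues:
    candidates = {p for p in candidates if clue in population[p]}
    print(f"after matching '{clue}': {sorted(candidates)}")

# Each clue alone is harmless; together they single out one person.
```

No single clue identifies anyone; it is the intersection of clues that does the damage.<br />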
<br />
===Questions===<br />
<br />
1. List some of the other things that people might search on that would potentially reveal their identities.<br />
<br />
2. Could this data set be cleaned up to the point where it could be truly thought to be anonymized?<br />
<br />
3. Why would a researcher be interested in what people search for on the Internet? What sort of information would be useful for someone in Marketing?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Mean vs. Median==<br />
<br />
[http://abcnews.go.com/Technology/WhosCounting/story?id=2265555&page=1 Who's Counting: It's Mean to Ignore the Median] <br><br />
ABCNews.com, 6 August 2006 <br><br />
John Allen Paulos <br />
<br />
This latest installment of "Who's Counting" focuses on the distinction between the mean and median. Paulos begins with the familiar example of housing prices, and goes on to discuss the implications for interpreting newly released data on the performance of the US economy for 2004. Republicans point out that the economy grew at a rate of 4.2%, and complain that they are not getting enough credit for the good news. Democrats counter that real median income is falling and poverty is rising. How can both be true? Just as a few expensive houses in a neighborhood can pull the mean substantially above the median, gains by a wealthy few at the top of the income ladder can pull up the mean, even if most people are not benefiting. <br />
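The housing-price analogy is easy to check numerically. A minimal sketch, with made-up incomes, showing how a single large value drags the mean far above the median:<br />

```python
from statistics import mean, median

# Hypothetical neighborhood incomes, in thousands of dollars:
# five ordinary households plus one very wealthy one.
incomes = [40, 45, 50, 55, 60, 1000]

print(f"mean:   {mean(incomes):.1f}")    # pulled far up by the single outlier
print(f"median: {median(incomes):.1f}")  # barely notices the outlier
```

Here the mean is about four times the median, even though five of the six households earn under 61.<br />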
<br />
To show that this is happening, Paulos cites work on income distribution by economists Thomas Piketty and Emmanuel Saez. According to their calculations, the richest one percent, whose incomes exceed $315,000, gained on average nearly 17% over the year in question. However, the good news did not extend very far down the income distribution. Looking at the top five percent of all incomes, the average gain is described as "minimal." This means that the gains were concentrated near the very top. In fact, even among the top one percent, Piketty and Saez found that half of income gains went to the top tenth of the group.<br />
<br />
Paulos points out that the pattern of the income distribution can be described mathematically in terms of so-called "power laws," which apply to a variety of observed phenomena, including Internet surfing and investing. A general description of power laws from Wikipedia can be found [http://en.wikipedia.org/wiki/Pareto_distribution here]. <br><br />
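For a concrete feel for power-law data, here is a small simulation (the shape parameter is an arbitrary heavy-tailed choice, not taken from Piketty and Saez) drawing from a Pareto distribution, whose long tail keeps the sample mean well above the sample median:<br />

```python
import random
from statistics import mean, median

random.seed(0)  # reproducible illustration

# Pareto draws with shape alpha = 1.5 (arbitrary heavy-tailed choice);
# random.paretovariate returns values >= 1.
alpha = 1.5
draws = [random.paretovariate(alpha) for _ in range(100_000)]

# The theoretical median is 2**(1/alpha), about 1.59, but a few enormous
# draws in the tail pull the sample mean far above it.
print(f"sample median: {median(draws):.2f}")
print(f"sample mean:   {mean(draws):.2f}")
```

This is the income-distribution story in miniature: the typical (median) draw is modest, while the average is dominated by a handful of extreme values.<br />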
<br />
Submitted by Bill Peterson<br />
<br />
==A Reader's Guide to Polls==<br />
[http://www.nytimes.com/2006/08/27/opinion/27pubed.html Precisely False vs. Approximately Right: A Reader's Guide To Polls]<br><br />
''The New York Times'', August 27, 2006, The Public Editor<br><br />
Jack Rosenthal<br />
<br />
Jack Rosenthal, a former New York Times senior editor filling in as the guest "Public Editor", is concerned that the media often report the outcomes of a poll without explaining how the poll should be interpreted and without alerting readers when there are serious problems with the way the poll was carried out. He provides the following example:<br />
<br />
<blockquote> Last March, the American Medical Association reported an alarming rate of binge drinking and unprotected sex among college women during spring break. The report was based on a survey of "a random sample" of 644 women and supplied a scientific-sounding "margin of error of +/– 4.00 percent." Television, columnists and comedians embraced the racy report. The New York Times did not publish the story, but did include some of the data in a chart. <br><br><br />
<br />
The sample, it turned out, was not random. It included only women who volunteered to answer questions — and only a quarter of them had actually ever taken a spring break trip. They hardly constituted a reliable cross section, and there is no way to calculate a margin of sampling error for such a "sample." </blockquote><br />
<br />
For more information about this AMA survey, Rosenthal refers readers to a polling blog [http://www.mysterypollster.com/ Mystery Pollster] maintained by Mark Blumenthal, a pollster for the Democratic Party. Here we read: <br />
<br />
<blockquote> Cliff Zukin, the current president of the American Association for Public Opinion Research (AAPOR), saw the survey results printed in the Times, and wondered about how the survey had been conducted. He contacted the AMA and was referred to the methodology section of their online release. He saw the following description (which has since been scrubbed):<br />
<br />
<blockquote>The American Medical Association commissioned the survey. Fako & Associates, Inc., of Lemont, Illinois, a national public opinion research firm, conducted the survey online. A nationwide '''random sample''' of 644 women age 17 - 35 who currently attend college, graduated from college or attended, but did not graduate from college within the United States were surveyed. The '''survey has a margin of error''' of +/- 4.00 percent at the 95 percent level of confidence [emphasis added]. </blockquote><br />
<br />
Zukin then contacted Janet Williams at the AMA asking for more details on how the study was carried out. She responded: <br />
<br />
<blockquote> The poll was conducted in the industry standard for internet polls -- this was not academic research -- it was a public opinion poll that is standard for policy development and used by politicians and nonprofits.</blockquote><br />
<br />
Zukin replied: <br />
<br />
<blockquote>I'm very troubled by this methodology. As an opt-in non-probability sample, it lacks scientific validity in that your respondents are not generalizable to the population you purport to make inferences about. As such the report of the findings may be seriously misleading. I do not accept the distinction you make between academic research and a "public opinion" survey. </blockquote> <br />
<br />
In her reply Williams said:<br />
<br />
<blockquote> As far as the methodology, it is the standard in the industry and does generalize for the population. Apparently I need to reiterate that this is not an academic study and will [not ?] be published in any peer reviewed journal; this is a standard media advocacy tool that is regularly used by the American Lung Association, American Heart Association, American Cancer Society and others. </blockquote><br />
<br />
We recommend reading the full Mystery Pollster discussion of the AMA Spring Break Survey [http://www.mysterypollster.com/main/2006/03/the_ama_spring_.html Part 1] and [http://www.mysterypollster.com/main/2006/03/the_ama_spring__1.html Part 2].<br />
<br />
Rosenthal gives another example:<br />
<br />
<blockquote>Another example surfaced last week in The Wall Street Journal. It examined a “landmark survey,” conducted for liquor retailers, claiming to show that “millions of kids” buy alcohol online. A random sample? The pollster paid the teenage respondents and included only Internet users.</blockquote><br />
<br />
This survey is critiqued in Carl Bialik's "Numbers Guy" [http://online.wsj.com/public/article/SB115574573662137365.html column] in the Wall Street Journal Online, August 18, 2006.<br />
<br />
Rosenthal remarks:<br />
<br />
<blockquote> Such misrepresentations help explain why The Times recently issued a seven-page paper on polling standards for editors and reporters. "Keeping poorly done survey research out of the paper is just as important as getting good survey research into the paper," the document said.</blockquote><br />
<br />
Rosenthal says "readers, too, need to know something about polls--at least enough to sniff out good polls from bad" and so he provides a brief reader's guide. This includes understanding margin of error and being aware of problems in the way the questions are asked such as: use of double negatives, the order of the questions, the effect of strength of feeling about an issue etc.<br />
<br />
The Mystery Pollster remarks that the Times document on polling standards is apparently not in the public domain, while ABC has made its standards public in [http://abcnews.go.com/US/PollVault/story?id=145373&page=1 ABC News' Polling Methodology and Standards], and suggests that the Times should also make its standards for editors and reporters public. <br />
</blockquote><br />
<br />
===Discussion questions:===<br />
<br />
(1) The first item in the Reader's guide is to beware of too much precision. The following example is given:<br />
<br />
<blockquote>A recent Zogby Interactive poll, for instance, showed that the candidates for the Senate in Missouri were separated by 3.8 percentage points. Yet the stated margin of sampling error meant the difference between the candidates could be seven points. The survey would have to interview unimaginably many thousands for that zero point eight to be useful.</blockquote><br />
<br />
Why should we beware of too much precision?<br />
<br />
(2) The second item deals with sampling error. We read:<br />
<br />
<blockquote>The Times and other media accompany poll reports with a box explaining how the random sample was selected and stating the sampling error. Error is actually a misnomer. What this figure actually describes is a range of approximation. <br><br><br />
<br />
For a typical election sample of 1,000, the error rate is plus or minus three percentage points for each candidate, meaning that a 50-50 race could actually differ by 53 to 47. </blockquote><br />
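The "plus or minus three percentage points" quoted for a sample of 1,000 comes from the standard 95% margin-of-error formula for a proportion, which is easy to check in a few lines:<br />

```python
from math import sqrt

def margin_of_error(n, p=0.5, z=1.96):
    """95% margin of error for a sample proportion: z * sqrt(p(1-p)/n).
    p = 0.5 is the worst case, the usual convention behind quoted poll margins."""
    return z * sqrt(p * (1 - p) / n)

# A typical election sample of 1,000 gives roughly +/- 3 percentage points.
print(f"n=1000: +/- {100 * margin_of_error(1000):.1f} points")
print(f"n=4000: +/- {100 * margin_of_error(4000):.1f} points")
```

Note that quadrupling the sample size only halves the margin, which is why tiny reported differences require impractically large samples.<br />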
<br />
Do you agree that the error in "sampling error" is a misnomer? Do you see anything wrong with the second sentence?<br />
<br />
(3) Rosenthal says:<br />
<br />
<blockquote>There’s also a formula for calculating the error in comparing one survey with another. For instance, last May, a Times/CBS News survey found that 31 percent of the public approved of President Bush’s performance; in the survey published last Wednesday, the number was 36 percent. Is that a real change? Yes. After adjustment for comparative error, the approval rating has gained by at least one point.</blockquote><br />
<br />
What was the sample size?<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Risk perceptions==<br />
[http://www.wired.com/news/technology/0,71743-0.html?tw=wn_index_29 One Million Ways to Die], Ryan Singel, Wired.com, 11 Sep 2006.<br />
<br />
This on-line article compares official mortality data with the number of Americans who have been killed inside the United States by terrorism since 1995. It highlights that many threats are far more likely to kill an American than any terrorist -- at least, statistically speaking. For example, it claims that your appendix is more likely to kill you than al-Qaida is.<br />
<br />
The rankings are:<br />
<br />
S E V E R E<br />
Driving off the road: 254,419<br />
Falling: 146,542<br />
Accidental poisoning: 140,327<br />
<br />
H I G H<br />
Dying from work: 59,730<br />
Walking down the street: 52,000<br />
Accidentally drowning: 38,302<br />
<br />
E L E V A T E D<br />
Killed by the flu: 19,415 <br />
Dying from a hernia: 16,742<br />
<br />
G U A R D E D<br />
Accidental firing of a gun: 8,536 <br />
Electrocution: 5,171<br />
<br />
L O W<br />
Being shot by law enforcement: 3,949<br />
Terrorism: 3,147<br />
Carbon monoxide in products: 1,554<br />
<br />
===Questions===<br />
* The rankings are based on the number of mortalities in each category throughout the 11-year period spanning 1995 through 2005 (extrapolated from best available data). What issues might arise from extrapolation of data? Is the past data a good guide to future exposures for all of these risks?<br />
* Are the underlying populations from which the data are compiled really comparable? If you think the exposures to risk vary by threat, what adjustments might be made to standardize the data?<br />
* Why do you think the risk from certain threats is perceived to be greater or less than the statistics suggest? <br />
* If these point estimates included some estimates of variation, such as a full probability distribution, what differences might you expect to see between such distributions? Do you think that that extra information might influence your perception of risk, or even how you might define risk in the first place? <br />
<br />
===Data sources===<br />
* [http://www.nhtsa.dot.gov/people/Crash/crashstatistics/National%20Highway%20Safety%20Data%20charts.pdf National Highway and Safety Agency] (.pdf)<br />
* [http://www.cdc.gov/nchs/data/nvsr/nvsr50/nvsr50_15.pdf#search=%22National%20Vital%20Statistics%20Report%2C%20Vol.%2050%2C%20No.%2015%2C%20September%2016%2C%202002%22 National Vital Statistics Reports], Vol. 50, No. 15 (09/16/2002) (.pdf)<br />
* [http://www.cpsc.gov/library/data.html US Consumer Product Safety Commission]<br />
* [http://www.iii.org/media/facts/ the Insurance Information Institute].<br />
<br />
<br />
Submitted by John Gavin.<br />
<br />
==Exit poll inventor dies aged 71==<br />
[http://www.cnn.com/2006/US/09/02/obit.mitofsky/index.html Mitofsky, 'father of exit polling,' dies at 72], CNN 09.03.06.<br><br />
[http://www.bbc.co.uk/radio4/news/lastword_15sept2006.shtml Warren J Mitofsky, Pollster who has died aged 71], Last Word, BBC Radio 4, Friday 15th Sep 2006.<br><br />
<br />
[http://en.wikipedia.org/wiki/WARREN_MITOFSKY Warren Mitofsky], considered by many to be the "father of exit polling", changed the way the media covers elections by pioneering the use of exit polls.<br />
<br />
An exit poll is a poll of voters taken immediately after they have exited the polling stations. Unlike an opinion poll, which asks who the voter plans to vote for or some similar formulation, an exit poll asks who the voter actually voted for (from [http://en.wikipedia.org/wiki/Exit_polling Wikipedia]).<br />
<br />
[http://www.cbsnews.com/stories/2006/09/03/politics/main1963665.shtml CBS News] said<br />
<blockquote><br />
Today, the methods behind the exit polls that give voice to America’s voters, and the mathematical models that help estimate election results, are in large part the result of his ingenuity and creativity. As Dan Rather once told the nation, as a heated election night’s results poured in, "I believe in God, Country, and Warren Mitofsky." <br />
</blockquote><br />
<br />
Mitofsky’s demand for the highest standards in those methods was legendary. Murray Edelman, Mitofsky’s colleague at CBS News from 1967-1992, said <br />
<blockquote><br />
people in the field knew Warren for his creativity, his dedication, and his passion … and they have the scars to prove it.<br />
</blockquote><br />
<br />
Mitofsky always sought to build outstanding teams of researchers. In his address to NYAAPOR in 2002, he emphasized that survey research was "an eclectic field" demanding many kinds of expertise, and that in turn demanded that many diverse experts be involved. <br />
<blockquote><br />
No one person I know possess all the various skills at a high enough level necessary to conduct a survey. It takes a team of people to encompass all the areas.<br />
</blockquote><br />
<br />
He also played a key role in developing the sample survey technique known as [http://www.answers.com/topic/random-digit-dialing random digit dialing (RDD)]. RDD means a computer keeps picking numbers at random until it finds a valid one.<br />
<br />
Such computer assisted telephone interviewing (CATI) techniques are widely used for surveys. Their advantages over face-to-face interviewing are timeliness and lower cost for the same sample size and geographical coverage. Two common sampling procedures are random sampling from the telephone directory and RDD sampling. RDD sampling offers better coverage of households than telephone-book sampling, and an RDD sample can be generated quickly.<br />
<br />
For example, almost all telephone numbers in the US have ten digits: a three-digit area code, a three-digit central office code and a four-digit suffix. For each central office included in the sample, random four-digit numbers between 0001 and 9999 yield the required random telephone numbers. This includes both listed and unlisted numbers. But unlisted households tend to cluster bimodally, among high- and low-income areas, and they are also more prevalent in metropolitan areas. <br />
RDD also tends to exclude small geographical areas, as a selected telephone exchange may contain several small geographical areas, i.e. small towns. <br />
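The number-generation step described above can be sketched in a few lines. The area and central office codes below are placeholders, not a real sampling frame:<br />

```python
import random

random.seed(1)  # reproducible illustration

def rdd_number(area_code, office_code):
    """Random digit dialing sketch: keep the sampled area code and central
    office code, and attach a random four-digit suffix (0001-9999),
    reaching listed and unlisted numbers alike."""
    suffix = random.randint(1, 9999)
    return f"{area_code}-{office_code}-{suffix:04d}"

# Three draws from one hypothetical sampled exchange
sample = [rdd_number("202", "555") for _ in range(3)]
print(sample)
```

In practice many generated numbers are invalid or non-residential, so a real RDD system keeps drawing until it accumulates enough working household numbers.<br />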
<br />
===Questions===<br />
* Exit polls have historically been used as a check against and rough indicator of the degree of fraud in an election. What are the potential flaws with such samples?<br />
* In the US, exit polls can be reported before elections polls have closed. Why might this matter? For example, in the 2000 U.S. Presidential election, it was alleged that media organizations released exit poll results for Florida before the polls closed in the Florida panhandle. Could such exit polls influence the outcome that they are trying to predict and, if so, would it be a positive or negative feedback effect?<br />
* Do you think that such sampling techniques could ever be so bad that they should be banned completely, as in New Zealand, or in the UK, where publication of exit polls before the polls close is a criminal offence?<br />
* Why do you think looking up numbers in a telephone book might result in a biased sample?<br />
* On the other hand, assuming that ex-directory households will be unhappy to be cold-called regarding some survey, how do you think this attitude might influence the RDD sample?<br />
* Do you think RDD is an invasion of privacy? How might you justify it against this charge?<br />
<br />
===Further reading===<br />
* [http://www.math.temple.edu/~paulos/exit.html Final Tallies Minus Exit Polls = A Statistical Mystery!], by John Allen Paulos, OpEd in the Philadelphia Inquirer, Nov. 24, 2004.<br />
* [http://www.google.co.uk/search?num=30&hl=en&safe=off&q=warren+mitofsky+obituary&meta= Obituaries] via Google.<br />
<br />
Submitted by John Gavin.<br />
<br />
==Do man-made factors fuel hurricanes?==<br />
<br />
Man-made factors fuel hurricanes, study finds.<br><br />
Boston Globe, September 12, 2006, A1<br><br />
Beth Daley<br />
<br />
A study in the Proceedings of the National Academy of Sciences reports an 84 percent chance that human activities are responsible for most of the recent heating in the Atlantic and Pacific ocean regions where hurricanes form. Overall, oceans have warmed approximately 1 degree Fahrenheit over the last century, a change which the study says cannot be attributed to natural cycles. That claim is based on extensive computer simulations that try to model climate systems under different scenarios, including volcanoes, solar fluctuations and human effects on the atmosphere. No combination of natural factors was able to reproduce the observed warming.<br />
<br />
It is well known that warm water contributes to hurricane intensity, so the study helps bolster the case of scientists who warned that average hurricane intensity has been increasing as a result of global warming. Others caution, however, that the evidence is not yet clear. Among the objections cited in the article are concerns about underestimation of the strength of earlier storms, and questions about whether the observed warming is sufficient to explain the strength of recent storms. <br />
<br />
Since Hurricane Katrina last year, there has been a great deal of public debate about possible human influences on hurricane intensity. The August 2006 issue of the Bulletin of the American Meteorological Society (BAMS) has an excellent review of the matter, entitled [http://webster.eas.gatech.edu/Papers/Webster2006d.pdf#search=%22BAMS%20mixing%20politics%20and%20science%22 &quot;Mixing Politics and Science in Testing the Hypothesis That Greenhouse Warming Is Causing a Global Increase in Hurricane Intensity&quot;]. The authors analyze in considerable detail the structure of the arguments put forward by skeptics of climate change, taking care to distinguish valid criticisms from logical fallacies. After debunking the logical fallacies, they outline the kinds of scientific investigations that could be used to rationally settle the open questions. A sidebar in the article includes a taxonomy of logical fallacies (such as ''ad hominem fallacy'', ''begging the question'', etc.). For example, &quot;''Statistical special pleading'' occurs when the interpretation of the relevant statistic is 'massaged' by looking for ways to reclassify or requantify data from one portion of results, but not applying the same scrutiny to other categories&quot;. The article cites further [http://en.wikipedia.org/wiki/Logical_fallacy online discussion from Wikipedia]. This would make wonderful background reading in a CHANCE course (even if global warming were not on the course agenda)! <br />
<br />
Here is one nice example of discussion from the BAMS article. One argument advanced by the skeptics held that the reported doubling of the annual number of major hurricanes (Category 4 and 5) between 1970 and 2004 goes away if Category 3 storms are included along with 4 and 5. There is a simple graphic in the BAMS article that shows why this argument fails. As explained by the authors (p. 1028): <br />
<blockquote><br />
Figure 1a shows the global trends for each hurricane category, and Fig. 1b shows the global trends for 3+4+5 and 4+5 hurricanes. The comparison in Fig. 1b indicates that an inability to discriminate between category-3, -4, and -5 hurricanes introduces a maximum uncertainty of ±30% to WHCC's finding of a 100% increase in the proportion of category-4+5 hurricanes. Hence, the null hypothesis must be rejected unless we cannot distinguish category-1 from category-4 storms.<br />
</blockquote><br />
<br />
Submitted by Bill Peterson</div>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor and active freelance writer Darrell Huff published a slim (142 page) volume, which over time would become the most widely read statistics book in the history of the world. <br><br><br />
There is some irony to the world's most famous statistics book having been written by a person with no formal training in statistics, but there is also some logic to how this came to be. Huff had a thorough training for excellence in communication, and he had an exceptional commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and 1990s and asks "who would teach it?" He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem, but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths and misdirection to a more ambitious approach: to borrow from Goebbels, the Big Lie. The necessary and sufficient condition for a successful Big Lie is that the target book has at some point discussed a politically sensitive issue involving gender, race, class or the environment, and has treated this issue as a scientifically legitimate subject of investigation (note that the discussion need not be a long one, nor is it required that the target book takes a strong position, nor need the topic be relevant to the book's main argument). Once this condition is met, you can restate the book's position on this topic in a way that most people will find repugnant (e.g., women are inferior to men, blacks are inferior to whites, we don't need to worry about the environment), and then claim that this repugnant position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier effect you can get from the media. A television news show or a syndicated columnist is unlikely to repeat a technical criticism of the book, but a nicely framed Big Lie can be newsworthy. And remember: It's not just the public who won't read the target book. Hardly anybody in the media will read it either. If you can get your accusation into one important outlet, you can start a chain reaction. Others will repeat your accusation, soon it will become the conventional wisdom, and no one will remember who started it. Done right, the Big Lie can forever after define the target book in the public mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Economist Steven D. Levitt and journalist Stephen J. Dubner published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states with a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention Levitt's co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's said her mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Although the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article its big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in the comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in the risk of cardiovascular events between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts interviewed by NPR, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
Assuming you are the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value," which is based on data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
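The gap between Prob(Disease|Test +) and Prob(Test +|Disease) is easy to verify with a few lines of arithmetic. The sketch below uses hypothetical numbers (1-in-1,000 prevalence, a 5% false-positive rate, and perfect sensitivity), not figures from any particular version of the fallacy:<br />

```python
# A minimal sketch of the "Harvard Medical School" fallacy with
# hypothetical numbers: even a sensitive test yields a small
# probability of disease given a positive result when false
# positives swamp true positives.

def posterior(prior, sensitivity, false_positive_rate):
    """P(disease | test positive) via Bayes theorem."""
    p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_positive

p = posterior(prior=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(round(p, 3))  # about 0.02 -- far from the 0.95 many people guess
```

With these assumed numbers, a positive test raises the probability of disease from 0.1% to only about 2%, which is the whole point of the fallacy.<br />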
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
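The arithmetic behind Schneier's point can be sketched directly. The numbers below are made up for illustration (his column uses its own figures): even a classifier with a very low false-positive rate buries a handful of real plots under an enormous pile of false alarms.<br />

```python
# Illustrative base-rate arithmetic with hypothetical numbers: a tiny
# false-positive rate applied to a huge population still produces a
# flood of false alarms relative to the few true positives.

population = 300_000_000      # people under surveillance (hypothetical)
real_plotters = 10            # actual conspirators (hypothetical)
false_positive_rate = 0.001   # "unrealistically accurate" (hypothetical)

false_alarms = false_positive_rate * (population - real_plotters)
fraction_real = real_plotters / (real_plotters + false_alarms)

print(int(false_alarms))   # roughly 300,000 innocents flagged
print(fraction_real)       # a vanishingly small share of flags are real
```

Under these assumptions, fewer than 1 in 10,000 flagged individuals is a genuine plotter, which is Schneier's "needle-in-a-haystack" problem in miniature.<br />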
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market temporarily.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if the window is too wide, you might include events unrelated to the treatment. These unrelated events would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
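For readers unfamiliar with Kaplan-Meier curves, a minimal estimator can be written in a few lines. The sketch below uses a tiny made-up data set (follow-up times with an event/censoring flag), not data from APPROVe; the survival estimate steps down only at observed event times.<br />

```python
# Minimal Kaplan-Meier estimator on hypothetical data. Each subject has
# a follow-up time and an event flag (1 = event occurred, 0 = censored).

def kaplan_meier(times, events):
    """Return a list of (time, survival) points, one per event time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival, curve = 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = sum(1 for tt, e in data if tt == t and e == 1)
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((t, survival))
        # everyone with time == t (event or censored) leaves the risk set
        removed = sum(1 for tt, _ in data if tt == t)
        at_risk -= removed
        i += removed
    return curve

km = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 2)) for t, s in km])  # survival drops at t = 2, 3, 5
```

Plotting two such curves (one per treatment arm) is exactly what lets an analyst eyeball whether they separate early, late, or not at all, though Dr. Lagakos cautions that such visual inspection can mislead.<br />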
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
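What a treatment-by-log(time) interaction implies can be seen with made-up coefficients (these are illustrative, not estimates from the trial): a nonzero interaction term makes the hazard ratio itself a function of time.<br />

```python
import math

# Hypothetical Cox-model coefficients (NOT from the APPROVe trial):
# log hazard ratio = b_treat + b_interact * log(t). A nonzero
# interaction coefficient means the treatment's hazard ratio drifts
# over time, violating proportional hazards.

b_treat, b_interact = -0.5, 0.4

def hazard_ratio(t_months):
    """Treatment-vs-placebo hazard ratio at time t, in months."""
    return math.exp(b_treat + b_interact * math.log(t_months))

for t in (6, 18, 36):
    print(t, round(hazard_ratio(t), 2))  # the ratio rises as t grows
```

With these assumed coefficients, the hazard ratio is near 1 early on and climbs well above 2 by 36 months, which mirrors the "risk emerges late" pattern the APPROVe investigators were testing for.<br />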
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib has no increase in risk during the short term, but only long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned), nor the only study of Cox-2 inhibitors that has been criticized. Are these corrections evidence that problems with incorrect data analyses are self-correcting, or evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in the paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2794 Chance News 18 2006-07-11T16:45:14Z<p>Mmartin: /* The interaction that wasn't there */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks "who would teach it?" He is not very optimistic about this being done in statistics courses or in social science courses; we ourselves were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of ''Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Although the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that made the article have such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size="5">'''20 Years Ago'''</font><br><font size="3">'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
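The "life table" technique Cherlin describes can be sketched with a toy calculation. Here is a minimal Python sketch of the idea, using invented annual first-marriage rates (not the actual Bennett-Bloom-Craig data): apply each age's marriage rate in sequence to see what fraction of still-single women would eventually marry.<br />

```python
# Hypothetical annual chance of a first marriage at each age
# (illustrative numbers only, not from any real study).
hazard = {age: 0.10 for age in range(20, 30)}
hazard.update({age: 0.05 for age in range(30, 40)})
hazard.update({age: 0.01 for age in range(40, 50)})

def p_ever_marry_given_single_at(start_age):
    """P(eventually marries | still single at start_age), assuming no
    first marriages occur at age 50 or later."""
    p_never = 1.0
    for age in range(start_age, 50):
        p_never *= 1.0 - hazard[age]
    return 1.0 - p_never

print(p_ever_marry_given_single_at(30))  # higher: more marrying years left
print(p_ever_marry_given_single_at(35))  # lower, but hardly 5 percent
```

The point of the sketch is the mechanism, not the numbers: the method projects the future marriages of today's 20-year-olds from the past behavior of older cohorts, which is exactly the step that failed when marriage ages shifted.<br />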
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts interviewed by NPR, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
Another reason the DSMB offered for not stopping the study early was that the trial was almost over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in the speakers bureau at Merck.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
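The article does not spell out how an impact factor is computed, but the standard two-year version is citations received this year to articles the journal published in the previous two years, divided by the number of citable articles in those two years. A toy Python calculation with made-up counts shows how coerced self-citation moves the number:<br />

```python
def impact_factor(cites_to_prev_two_years, articles_prev_two_years):
    """Two-year impact factor: recent citations per recent article."""
    return cites_to_prev_two_years / articles_prev_two_years

# Hypothetical journal: 160 articles in 2004-05 drew 400 citations in 2006.
baseline = impact_factor(400, 160)

# If authors are pushed to add 80 extra citations to those same articles:
inflated = impact_factor(400 + 80, 160)

print(baseline, inflated)  # 2.5 vs 3.0
```

Since the denominator is fixed once the articles are published, every extra citation a journal can extract from submitting authors raises the ratio directly.<br />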
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
If you were the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value," which deals with data as extreme as or more extreme than what<br />
was observed. How does this square with the sentence "What is beyond the<br />
pale is to add data that didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA government wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
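Schneier's argument is a base-rate calculation, and a minimal Python sketch makes it concrete. The numbers below are hypothetical (the true rates are unknown), but the shape of the result does not depend on them: when the condition screened for is extremely rare, even a very accurate system produces almost nothing but false positives.<br />

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(plot | flagged) via Bayes theorem."""
    p_flag = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_flag

# Suppose 1 in a billion monitored communications involves a real plot,
# and the system catches 99% of plots with only a 0.1% false-alarm rate.
p = posterior(prior=1e-9, sensitivity=0.99, false_positive_rate=1e-3)
print(p)  # roughly one in a million flagged messages involves a plot
```

This is the same inversion as the Harvard Medical School fallacy above: Prob(plot | flagged) is tiny even though Prob(flagged | plot) is nearly 1, because the false positives swamp the true positives.<br />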
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial], Stephen W. Lagakos, ''The New England Journal of Medicine'', June 26, 2006 [Epub ahead of print].<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market temporarily.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if the window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
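The dilution effect can be seen with simple arithmetic. In this hypothetical sketch (invented counts, equal-sized arms assumed, so the ratio of event counts stands in for the risk ratio), widening the window adds the same number of unrelated events to each arm and pulls the observed ratio toward 1:<br />

```python
def risk_ratio(events_drug, events_placebo):
    """Ratio of event counts; a stand-in for the risk ratio when the
    two arms are the same size."""
    return events_drug / events_placebo

print(risk_ratio(30, 15))            # 2.0 with a narrow window
print(risk_ratio(30 + 20, 15 + 20))  # adding 20 unrelated events per arm: ~1.43
```

The treatment-related excess (15 events) is unchanged, but the apparent relative effect shrinks as background events accumulate.<br />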
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical, but which then diverge sharply at a certain time point, and two survival curves that are initially different, but which converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
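What the proportional-hazards assumption means, and how a late-onset risk violates it, can be illustrated numerically. Below is a minimal Python/NumPy sketch (the hazard functions and numbers are invented, not the APPROVe data): when the hazard ratio is constant, the two log(-log survival) curves stay a constant distance apart, while a hazard ratio that changes at 18 months makes the gap drift over time.<br />

```python
import numpy as np

def survival(hazard_fn, t):
    """S(t) = exp(-cumulative hazard), approximated on a time grid."""
    dt = np.diff(t, prepend=0.0)
    return np.exp(-np.cumsum(hazard_fn(t) * dt))

t = np.linspace(0.1, 36.0, 360)  # months of follow-up

def h_control(tt):               # constant baseline hazard
    return np.full_like(tt, 0.002)

def h_prop(tt):                  # proportional hazards: HR = 2 at all times
    return 2.0 * h_control(tt)

def h_late(tt):                  # excess hazard only after 18 months
    return h_control(tt) * np.where(tt < 18.0, 1.0, 2.0)

def loglog_gap(h_treat):
    """Gap between the log(-log S) curves; constant over time exactly
    when the hazards are proportional."""
    return (np.log(-np.log(survival(h_treat, t)))
            - np.log(-np.log(survival(h_control, t))))

print(loglog_gap(h_prop).std())  # essentially 0: assumption holds
print(loglog_gap(h_late).std())  # clearly positive: assumption violated
```

An interaction test like the one used in APPROVe is in effect asking whether this gap varies with time (or with log time) rather than staying flat.<br />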
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the interaction test had not been in error, there would still be problems. The presence of an interaction could imply several possible deviations from the proportional hazards assumption, and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk in the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib increases risk only in the long term, not the short term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned), nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self-correcting, or evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2793Chance News 182006-07-11T16:40:07Z<p>Mmartin: /* Discussion */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks "who would teach it". He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt (an economist) and Stephen J. Dubner (a journalist) published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
Society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that made the article have such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
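The "life table" technique Cherlin describes can be sketched in a few lines. The age-band marriage rates below are hypothetical, chosen only to show the mechanics; they are not the Bennett-Bloom-Craig figures:<br />

```python
# Life-table-style projection: apply each age band's marriage rate,
# observed in PAST cohorts, to project a CURRENT cohort forward.
# marriage_rates[band] = P(marrying during that band | still single at its start)
marriage_rates = {
    "30-34": 0.15,
    "35-39": 0.08,
    "40-44": 0.04,
    "45+":   0.02,
}

p_still_single = 1.0
for band, rate in marriage_rates.items():
    p_still_single *= 1 - rate  # must stay single through every band

p_ever_marry = 1 - p_still_single
print(f"Projected chance a single 30-year-old eventually marries: {p_ever_marry:.1%}")
```

The projection is only as good as its central assumption, that today's 20-year-olds will behave like yesterday's 40-year-olds, which is precisely where the 1986 predictions went wrong.<br />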
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of terrorist in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever, manufactured by Merck, with a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large-scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in the risk of cardiovascular events between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in the speakers bureau at Merck.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it were restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she is barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
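For readers who have not met the metric, the standard two-year impact factor is a simple ratio; the counts below are made up for illustration, not taken from any real journal:<br />

```python
# Two-year impact factor for 2005: citations received in 2005 to items the
# journal published in 2003-2004, divided by the number of citable articles
# it published in 2003-2004. (Counts are hypothetical.)
citations_in_2005_to_2003_04_items = 1200
citable_articles_2003_04 = 400

impact_factor = citations_in_2005_to_2003_04_items / citable_articles_2003_04
print(impact_factor)
```

Coerced self-citations inflate the numerator directly, which is why the practice described in the article is effective.<br />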
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors deny that the impact factor plays any role in the selection of papers. Assuming you are the editor, what would you tell would-be authors? What would you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the "p-value," which deals with data this extreme or more extreme. How does this square with the sentence "What is beyond the pale is to add data that didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA government wiretapping of phone calls and e-mails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
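The dilution effect described above is easy to see in a back-of-the-envelope ratio. A minimal sketch with made-up event counts (equal-size arms assumed; these are not APPROVe data):<br />

```python
def risk_ratio_with_window(events_trt, events_ctl, background):
    """Observed risk ratio when a too-wide event window adds the same
    number of treatment-unrelated background events to each arm of a
    trial with equal-size arms. Made-up counts, not APPROVe data."""
    return (events_trt + background) / (events_ctl + background)

# A true 2-fold excess shrinks toward 1 as unrelated events pile in:
print(risk_ratio_with_window(30, 15, 0))   # 2.0
print(risk_ratio_with_window(30, 15, 85))  # 1.15
```

The wider the window, the more background events both arms accumulate, and the closer the observed ratio drifts toward 1 even when a real treatment effect exists.<br />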
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
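Under such an interaction, the hazard ratio is no longer a single number but a function of time, HR(t) = exp(b_treat + b_inter * f(t)), with f(t) = log(t) or f(t) = t. A minimal sketch; the coefficients are hypothetical, not the APPROVe estimates:<br />

```python
import math

def hazard_ratio(t, b_treat, b_inter, log_time=True):
    """Hazard ratio at time t under a treatment-by-time interaction:
    HR(t) = exp(b_treat + b_inter * f(t)), where f(t) is log(t) or t.
    Coefficients here are hypothetical, not the APPROVe estimates."""
    f = math.log(t) if log_time else float(t)
    return math.exp(b_treat + b_inter * f)

# With b_inter > 0, the relative hazard grows over follow-up:
print(hazard_ratio(1, 0.1, 0.4))   # exp(0.1), since log(1) = 0
print(hazard_ratio(36, 0.1, 0.4))  # larger by month 36
```

Testing whether b_inter differs from zero is exactly the interaction test discussed next, and the choice between f(t) = log(t) and f(t) = t is what the disputed P values turn on.<br />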
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib carries no increased risk in the short term, but only with long-term use?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in that paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2792 Chance News 18, 2006-07-11<p>Mmartin: /* Impact Factors */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor and active freelance writer Darrell Huff published a slim (142 page) volume, which over time would become the most widely read statistics book in the history of the world. <br><br><br />
There is some irony to the world's most famous statistics book having been written by a person with no formal training in statistics, but there is also some logic to how this came to be. Huff had a thorough training for excellence in communication, and he had an exceptional commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and 1990s and asks "who would teach it". He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read "The Bell Curve" by Richard Herrnstein and Charles Murray to review it for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths and misdirection to a more ambitious approach: to borrow from Goebbels, the Big Lie. The necessary and sufficient condition for a successful Big Lie is that the target book has at some point discussed a politically sensitive issue involving gender, race, class or the environment, and has treated this issue as a scientifically legitimate subject of investigation (note that the discussion need not be a long one, nor is it required that the target book takes a strong position, nor need the topic be relevant to the book's main argument). Once this condition is met, you can restate the book's position on this topic in a way that most people will find repugnant (e.g., women are inferior to men, blacks are inferior to whites, we don't need to worry about the environment), and then claim that this repugnant position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier effect you can get from the media. A television news show or a syndicated columnist is unlikely to repeat a technical criticism of the book, but a nicely framed Big Lie can be newsworthy. And remember: It's not just the public who won't read the target book. Hardly anybody in the media will read it either. If you can get your accusation into one important outlet, you can start a chain reaction. Others will repeat your accusation, soon it will become the conventional wisdom, and no one will remember who started it. Done right, the Big Lie can forever after define the target book in the public mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size="5">'''20 Years Ago'''</font><br><font size="3">'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
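The "life table" technique described in the quote above amounts to chaining together per-age marriage hazards borrowed from an older cohort. A minimal sketch with hypothetical hazards, not the Bennett-Bloom-Craig data:<br />

```python
def prob_ever_marry(hazards):
    """P(eventually marry) for a woman single today, given per-year
    marriage hazards h_t borrowed from an older cohort:
    P = 1 - product over t of (1 - h_t).
    Hazards here are hypothetical, not the study's estimates."""
    p_never = 1.0
    for h in hazards:
        p_never *= 1.0 - h
    return 1.0 - p_never

# Two remaining years at a 50% annual hazard: 1 - 0.5 * 0.5 = 0.75
print(prob_ever_marry([0.5, 0.5]))  # 0.75
```

The method's weakness is visible in the formula: the prediction is only as good as the assumption that today's 20-year-olds will face the same hazards the older cohort did, which is exactly what Cherlin's "looking at 40-year-olds and making predictions for 20-year-olds" remark questions.<br />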
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in the risk of cardiovascular events between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial that suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in the speakers bureau at Merck.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it were restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
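The standard two-year impact factor is just a ratio: citations received in year Y to items the journal published in years Y-1 and Y-2, divided by the number of citable items it published in those two years. A minimal sketch, with invented numbers, of how coerced self-citations move it:<br />

```python
def impact_factor(citations, citable_items):
    """Two-year impact factor: citations in year Y to items from
    years Y-1 and Y-2, divided by citable items from those years."""
    return citations / citable_items

# Invented numbers for illustration: 1200 citations to 400 citable items.
honest = impact_factor(1200, 400)

# If 100 submitting authors are each pushed to add 5 citations of the
# journal itself, the numerator grows while the denominator stays fixed.
coerced = impact_factor(1200 + 100 * 5, 400)

print(honest, coerced)  # 3.0 4.25
```

Because the denominator counts only the journal's own output, every coerced citation is pure gain, which is why the practice is so tempting.<br />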
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
If you were the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value," which deals with data as extreme as or more extreme than what<br />
was observed. How does this square with the sentence "What is beyond the<br />
pale is to add data that didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
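Schneier's argument is a direct application of Bayes theorem. A minimal sketch in Python, using invented and deliberately generous accuracy figures (the real performance of such screening is unknown):<br />

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(terrorist | flagged), by Bayes' theorem (Bayesian inversion)."""
    p_flagged = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_flagged

# Invented, generous assumptions: the system flags 99% of real plotters,
# falsely flags only 1% of innocents, and 1 person in a million is a plotter.
p = posterior(prior=1e-6, sensitivity=0.99, false_positive_rate=0.01)
print(f"P(plotter | flagged) = {p:.6f}")  # roughly 1 in 10,000
```

Even with accuracy far beyond anything achievable, well over 9,999 of every 10,000 flagged people are innocent: the base rate fallacy in numbers.<br />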
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. ''The New England Journal of Medicine''. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
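The troublesome pattern can be mimicked with a toy simulation (the hazards below are invented for illustration, not the APPROVe data): give the treatment arm the same event hazard as placebo for the first 18 months, double it thereafter, and compare crude event counts in the two windows.<br />

```python
import random

rng = random.Random(2006)
FOLLOW_UP, CHANGE = 36.0, 18.0  # months

def event_time(h_early, h_late):
    """Sample from a piecewise-exponential hazard (per-month rates)."""
    t = rng.expovariate(h_early)
    return t if t <= CHANGE else CHANGE + rng.expovariate(h_late)

def count_events(n, h_early, h_late):
    """Events before and after the change point, within follow-up."""
    early = late = 0
    for _ in range(n):
        t = event_time(h_early, h_late)
        if t <= CHANGE:
            early += 1
        elif t <= FOLLOW_UP:
            late += 1
    return early, late

placebo = count_events(20_000, 0.002, 0.002)  # constant hazard throughout
treated = count_events(20_000, 0.002, 0.004)  # hazard doubles at month 18
print("event ratio, months 0-18: ", treated[0] / placebo[0])  # near 1
print("event ratio, months 18-36:", treated[1] / placebo[1])  # near 2
```

A single proportional-hazards ratio would average these two regimes into one number, which is exactly why checking the assumption matters.<br />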
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk in the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib has no increase in risk during the short term, but only long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2791
Chance News 18, 2006-07-11T16:36:25Z<p>Mmartin: /* Independence of a DSMB is questioned */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks "who would teach it". He is not very optimistic about this being done in statistics courses or in social science courses. We ourselves were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children" fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Rennee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Although the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
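The "life table" projection the researchers used is mechanically simple: take age-specific first-marriage rates observed in earlier cohorts and compound them forward. A sketch with invented rates (not the Bennett-Bloom-Craig data):<br />

```python
# Invented annual first-marriage rates for a woman still single at each
# age, declining 10% per year of age; the real study used observed
# cohort data, which is the whole point of the life-table method.
rates = {age: 0.02 * 0.9 ** (age - 40) for age in range(40, 70)}

# Probability she never marries = product of the yearly chances of
# remaining single, compounded from age 40 onward.
p_never = 1.0
for age in range(40, 70):
    p_never *= 1.0 - rates[age]

print(f"projected P(ever marries | single at 40) = {1 - p_never:.3f}")
```

The weak link, as Cherlin's remark about "looking at 40-year-olds and making predictions for 20-year-olds" suggests, is the assumption that younger cohorts will marry at the rates older ones did; when behavior shifts toward marrying later rather than never, the projection fails, as it did here.<br />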
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of terrorist in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx, a pain reliever manufactured by Merck, has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest on the Data Safety Monitoring Board (DSMB) for a large-scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding a flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts interviewed by NPR, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
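The impact factor itself is just a ratio of citations to citable items, which is what makes it so easy to game. A sketch with invented counts for a hypothetical journal:<br />

```python
def impact_factor(citations, citable_items):
    """Citations received this year to items published in the two
    prior years, divided by the number of citable items in those years."""
    return citations / citable_items

# Hypothetical journal: 200 citable items, 300 citations from elsewhere.
honest = impact_factor(300, 200)            # 1.5
# Pushing the authors of 100 new papers to add 2 self-citations each:
inflated = impact_factor(300 + 200, 200)    # 2.5
```

Two hundred coerced self-citations raise this hypothetical journal's ranking by two thirds without a single additional reader.<br />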
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
Assume you are the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is known as<br />
"p-value" which deals with data this extreme or more extreme. How does this<br />
square with the sentence " What is beyond the pale is to add data that<br />
didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, considers how data mining by means of NSA government wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
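Schneier's "27 million potential plots" arithmetic is Bayes theorem at work. A minimal sketch, with illustrative numbers assumed here (they are not Schneier's actual figures):<br />

```python
def p_plot_given_flag(prior, sensitivity, specificity):
    """Bayes theorem: P(real plot | system raises a flag)."""
    false_positive_rate = 1 - specificity
    p_flag = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / p_flag

# Assumed base rate: 1 real plot per 10 million monitored communications,
# screened by an "unrealistically accurate" 99%-sensitive, 99%-specific system.
posterior = p_plot_given_flag(prior=1e-7, sensitivity=0.99, specificity=0.99)
```

Even with these generous assumptions the posterior probability comes out to roughly one in a hundred thousand: the tiny base rate swamps the accuracy of the test, which is exactly the base rate fallacy Schneier invokes.<br />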
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
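The dilution effect can be seen with a toy relative-risk calculation; the counts below are invented for illustration and are not the APPROVe data:<br />

```python
def relative_risk(events_a, events_b, n_a, n_b):
    """Ratio of event rates between two arms with comparable follow-up."""
    return (events_a / n_a) / (events_b / n_b)

n = 1000  # patients per arm (hypothetical)
# Narrow window: only treatment-related events are captured.
rr_narrow = relative_risk(30, 15, n, n)          # 2.0
# Wide window: ~50 unrelated background events land in each arm.
rr_wide = relative_risk(30 + 50, 15 + 50, n, n)  # about 1.23
```

Widening the window leaves the excess of 15 drug-related events intact but buries it under background noise, pulling the observed relative risk from 2.0 down toward 1.<br />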
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
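In symbols (the notation here is assumed, not taken from the article), the interaction test adds a time-dependent term to the usual Cox model and asks whether its coefficient is zero:<br />

```latex
% Proportional hazards: a constant log hazard ratio \beta for treatment Z
h(t \mid Z) = h_0(t)\,\exp(\beta Z)
% Interaction with log time: the treatment effect drifts as the trial runs;
% the interaction test described above is a test of \gamma = 0
h(t \mid Z) = h_0(t)\,\exp\bigl(\beta Z + \gamma Z \log t\bigr)
```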
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months does not, in and of itself, imply that a short term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib carries no increased risk in the short term, but only over the long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2790 Chance News 18 2006-07-11T16:08:49Z<p>Mmartin: /* Discussion questions */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980's and the 1990's and asks "who would teach it." He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that made the article have such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
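The "life table" technique Cherlin mentions can be sketched in a few lines. The age-specific annual marriage rates below are invented for illustration and are not the Bennett-Bloom-Craig data:

```python
# A sketch of the "life table" idea described above, with invented
# annual marriage rates per age (NOT the Bennett-Bloom-Craig data).
# The chance of ever marrying after 35 is one minus the chance of
# staying single through every later year in the table.
rates = {35: 0.04, 36: 0.035, 37: 0.03, 38: 0.025, 39: 0.02, 40: 0.015}

p_still_single = 1.0
for age in sorted(rates):
    p_still_single *= 1 - rates[age]

p_ever_marry = 1 - p_still_single
print(round(p_ever_marry, 3))  # roughly 0.15 with these invented rates
```

Cherlin's criticism amounts to saying that the `rates` table was built from the behavior of older cohorts, so applying it to 20-year-olds assumes their future will repeat the past.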
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of the word "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in the risk of cardiovascular events between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts whom NPR interviewed, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also declined to stop the study early because the trial was nearly complete.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself or herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals Artfully Try to Boost Their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
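The "grading system" in question is easy to state. The sketch below uses the standard two-year impact-factor formula; the journal figures themselves are made up for illustration:

```python
# The two-year impact factor is the standard formula; the journal
# figures below are invented for illustration.
def impact_factor(cites_in_year, citable_items):
    """Citations received this year to articles the journal published
    in the previous two years, divided by the number of such articles."""
    return cites_in_year / citable_items

# Hypothetical journal: 480 citations in 2005 to its 2003-04 papers,
# of which there were 200.
print(impact_factor(480, 200))  # 2.4
```

Coercing each submitted paper to add a few self-citations raises the numerator without touching the denominator, which is exactly the manipulation the article describes.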
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors deny that the impact factor plays any role in the selection of papers. Assuming you are the editor, what would you tell would-be authors? What would you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard tool in frequentist inferential statistics is the "p-value," which is computed from data this extreme or more extreme, including outcomes that did not actually occur. How does this square with the sentence "What is beyond the pale is to add data that didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining via NSA government wiretapping of phone calls and e-mails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [''Minneapolis Star Tribune'', May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
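Schneier's arithmetic is an instance of Bayes' theorem. Here is a minimal sketch with invented inputs (his exact figures are not given in enough detail to reproduce):

```python
# A minimal sketch of Schneier's base-rate argument using Bayes'
# theorem. The numbers are invented: 10 real plots hidden among
# 10 billion innocent communications, with a detector that is
# 99% sensitive and 99.9% specific.
true_plots = 10
innocent = 10_000_000_000
sensitivity = 0.99        # P(flagged | plot)
false_pos_rate = 0.001    # P(flagged | innocent)

true_alarms = sensitivity * true_plots
false_alarms = false_pos_rate * innocent

# P(plot | flagged) -- what investigators actually care about
posterior = true_alarms / (true_alarms + false_alarms)
print(false_alarms)  # about 10 million false alarms
print(posterior)     # about one in a million
```

Even with an "unrealistically accurate" detector, almost every flagged communication is innocent, because the base rate of real plots is so tiny.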
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos, ''The New England Journal of Medicine'', 2006 June 26 [Epub ahead of print].<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal to the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
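The troublesome pattern, survival curves that coincide early and then diverge, can be sketched numerically. The hazards below are invented and are not the APPROVe data:

```python
import numpy as np

# A sketch (with invented hazards, NOT the APPROVe data) of the
# troublesome pattern: hazards that coincide early and then diverge,
# so no single proportional-hazards ratio describes the whole trial.
def hazard_placebo(t):
    return 1.0                      # events per 1,000 person-months

def hazard_drug(t):
    return 1.0 if t < 18 else 2.0   # same early, doubled after month 18

months = np.arange(36)
hr = np.array([hazard_drug(t) / hazard_placebo(t) for t in months])

print(hr[:18].mean())  # 1.0: no apparent excess risk in months 0-17
print(hr[18:].mean())  # 2.0: risk doubles from month 18 on
```

An interaction test asks whether the hazard ratio depends on time or on log time; for this step-shaped `hr` it clearly does, which is the kind of deviation the APPROVe investigators' test was meant to detect.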
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib has no increase in risk during the short term, but only long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk of thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2789Chance News 182006-07-11T16:06:30Z<p>Mmartin: /* Newsweek says they were wrong */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks who would teach it. He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book, which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
Society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Rennee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Though the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong'''.</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
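The "life table" technique described in the quote above — applying age-specific marriage rates from older cohorts to a younger one — amounts to chaining survival probabilities. Here is a minimal sketch; the annual marriage probabilities below are invented for illustration and are not the Bennett–Bloom–Craig data:

```python
# Hypothetical annual marriage probabilities for single women at each
# age (invented numbers, NOT the actual Bennett-Bloom-Craig estimates).
annual_marriage_prob = {
    30: 0.08, 31: 0.07, 32: 0.06, 33: 0.05, 34: 0.04,
    35: 0.035, 36: 0.03, 37: 0.025, 38: 0.02, 39: 0.015,
}

def prob_ever_marries(start_age, probs):
    """P(marries at some age in the table | still single at start_age),
    computed as 1 minus the chained probability of staying single."""
    p_still_single = 1.0
    for age in sorted(probs):
        if age >= start_age:
            p_still_single *= 1.0 - probs[age]
    return 1.0 - p_still_single

p30 = prob_ever_marries(30, annual_marriage_prob)  # about 0.35
p35 = prob_ever_marries(35, annual_marriage_prob)  # about 0.12
```

With these made-up rates, a woman still single at 30 has roughly a 35% chance of marrying by 40, while one still single at 35 has roughly 12%. The method's weakness, implicit in Cherlin's remark, is the assumption that 20-year-olds will behave like the 40-year-olds whose rates were observed.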
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006, ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial that suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in the speakers bureau at Merck.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it were restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals Artfully Try to Boost Their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
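For concreteness, the widely used two-year impact factor is the number of citations a journal's articles from the previous two years receive in a given year, divided by the number of citable articles it published in those two years. The numbers below are invented for illustration:

```python
def impact_factor(citations_to_prev_two_years, articles_prev_two_years):
    """Two-year impact factor: citations received this year to items
    published in the previous two years, divided by the number of
    citable items published in those two years."""
    return citations_to_prev_two_years / articles_prev_two_years

# A journal that published 200 papers over two years and drew 700
# citations to them this year has an impact factor of 3.5.
print(impact_factor(700, 200))  # 3.5
```

Because the denominator is small for many journals, even a few dozen coerced self-citations can visibly move the number — which is the incentive the article describes.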
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
If you were the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard tool of frequentist inference, the p-value, is the probability of obtaining data as extreme as, or more extreme than, the data actually observed. How does this square with the sentence "What is beyond the pale is to add data that didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
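The Harvard Medical School fallacy reduces to a one-line Bayes computation. The figures below (1-in-1000 prevalence, perfect sensitivity, 5% false positives) are the ones usually quoted for the classic problem and serve purely as illustration:

```python
def posterior(prior, sensitivity, false_pos_rate):
    """P(disease | test positive) via Bayes' theorem."""
    p_positive = prior * sensitivity + (1 - prior) * false_pos_rate
    return prior * sensitivity / p_positive

# Prevalence 1/1000, sensitivity 100%, false-positive rate 5%.
p = posterior(prior=0.001, sensitivity=1.0, false_pos_rate=0.05)
# P(test+ | disease) is 100%, yet P(disease | test+) is only about 2%,
# because false positives among the healthy swamp the true positives.
```

The prosecutor's fallacy has the same shape: replace "disease" with "guilty" and "test positive" with "DNA matches."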
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
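Schneier's base-rate arithmetic can be reproduced with made-up numbers (ours, not his): even a system with a very optimistic false-positive rate, scanning billions of almost entirely innocent communications, buries each genuine lead under a mountain of false alarms.

```python
# Hypothetical surveillance figures for illustration (not Schneier's).
daily_communications = 10_000_000_000  # calls/emails scanned per day
real_plot_messages = 10                # genuinely suspicious items per day
false_positive_rate = 0.0001           # a very optimistic 0.01%
detection_rate = 0.99

true_alarms = real_plot_messages * detection_rate
false_alarms = (daily_communications - real_plot_messages) * false_positive_rate

# Roughly a million false alarms per day against ten true ones:
# false alarms outnumber true ones by about 100,000 to 1.
ratio = false_alarms / true_alarms
```

This is the "base rate fallacy" the discussion questions ask about: when the base rate of real plots is tiny, even excellent accuracy yields almost nothing but false positives.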
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
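The troublesome "late divergence" pattern can be made concrete with a toy calculation (all hazard rates below are invented for illustration): under proportional hazards the ratio of hazards is constant over time, whereas in the APPROVe-like pattern the ratio equals 1 until month 18 and exceeds 1 afterward, so the survival curves coincide early and then separate.

```python
import math

def survival(hazard_fn, t, step=0.1):
    """S(t) = exp(-cumulative hazard), via simple left-endpoint integration."""
    n = int(round(t / step))
    cumulative = sum(hazard_fn(i * step) for i in range(n)) * step
    return math.exp(-cumulative)

placebo = lambda t: 0.01                              # constant monthly hazard
proportional = lambda t: 0.02                         # hazard ratio 2 at all times
late_divergence = lambda t: 0.01 if t < 18 else 0.02  # ratio 1 until month 18

s12_base, s12_late = survival(placebo, 12), survival(late_divergence, 12)
s36_base, s36_late = survival(placebo, 36), survival(late_divergence, 36)
# The late-divergence curve tracks placebo exactly through month 12,
# then falls well below it by month 36.
```

The logrank test and the Cox hazard ratio both average over the whole follow-up period, which is why a single summary number can mask the late-divergence pattern that mattered here.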
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months does not, in and of itself, imply that a short term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib has no increase in risk during the short term, but only long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
Mmartin
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2788 Chance News 18, 2006-07-11T15:58:49Z
<p>Mmartin: /* Newsweek says they were wrong */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980's and the 1990's and asks "who would teach it". He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of ''Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited'." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006 ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she is barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
Assuming you are the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value," which deals with data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
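To make the "this extreme or more extreme" idea concrete, here is a minimal sketch (our own illustration, not taken from the article) that computes a one-sided binomial p-value by summing the tail of the distribution:

```python
from math import comb

def binom_tail_p(n, k, p=0.5):
    """One-sided p-value: probability of seeing k or more successes
    in n trials -- i.e., data 'this extreme or more extreme'."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# 60 heads in 100 tosses of a supposedly fair coin:
print(round(binom_tail_p(100, 60), 4))  # about 0.028
```

Note that the sum runs over outcomes that did not actually happen, which is exactly the tension the question asks about.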
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA government wiretapping of phone calls and e-mails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
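Schneier's point is just Bayes theorem applied with a tiny base rate. A minimal sketch, with illustrative numbers of our own choosing (not Schneier's), shows how even a very accurate system is swamped by false positives:

```python
# Illustrative numbers (our assumptions, not Schneier's):
base_rate = 1 / 1_000_000   # fraction of monitored messages that are terror-related
sensitivity = 0.99          # P(flagged | terror-related)
false_pos_rate = 0.01       # P(flagged | innocent) -- an "unrealistically accurate" system

# Bayes theorem: invert P(flagged | terror-related) into P(terror-related | flagged)
posterior = (sensitivity * base_rate) / (
    sensitivity * base_rate + false_pos_rate * (1 - base_rate)
)
print(f"P(terror-related | flagged) = {posterior:.5f}")  # about 0.0001
```

Under these assumptions only about one flagged message in ten thousand is a real hit; the other 9,999 are the false positives the police must investigate.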
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos, ''The New England Journal of Medicine'', June 26, 2006 [Epub ahead of print].<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time-varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
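A crude way to see what non-proportional hazards looks like, using hypothetical counts of our own invention (not the APPROVe data), is to compare event rates per person-year between arms within each follow-up window:

```python
# Hypothetical (events, person-years) per arm and follow-up window:
windows = {
    "months 0-18":  {"drug": (10, 1000.0), "placebo": (10, 1000.0)},
    "months 18-36": {"drug": (30,  900.0), "placebo": (12,  950.0)},
}

for window, arms in windows.items():
    rate = {arm: events / py for arm, (events, py) in arms.items()}
    hr = rate["drug"] / rate["placebo"]   # crude hazard ratio within this window
    print(f"{window}: crude hazard ratio = {hr:.2f}")

# Under proportional hazards the two windows would show similar ratios;
# a ratio near 1 early and well above 1 late (as here) violates the assumption.
```

A formal test, as in the APPROVe analysis, would instead fit a Cox model with a time-by-treatment (or log-time-by-treatment) interaction term, but the window-by-window comparison conveys the same idea.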
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the interaction test had not been misreported, there would still be problems. The presence of an interaction could reflect any of several deviations from the proportional hazards assumption, not necessarily a deviation in which risk is similar for the first 18 months and dissimilar thereafter. He also points out that graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk in the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib carries no increased risk in the short term, only in the long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in the paper and adding additional data, they conclude that the risk is apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
Mmartin
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2787
Chance News 18, 2006-07-11T15:56:44Z
<p>Mmartin: /* Newsweek says they were wrong */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks "who would teach it?" He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt and Stephen J. Dubner are economists who published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children" fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited"." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006 ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts whom NPR interviewed, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers' bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret a finding that was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she is barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citiations to papers published by competitors, keeping their rivals'impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
Assuming you are the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value", which deals with data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
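Schneier's argument is the base rate fallacy in action, and it takes only a few lines of Bayes theorem to see it. The rates below are illustrative, not Schneier's exact assumptions:

```python
# Bayes' theorem with a tiny base rate: even a very accurate detector
# is swamped by false positives.  The rates below are made up for
# illustration; they are not Schneier's figures.

def p_true_given_flag(base_rate, sensitivity, false_positive_rate):
    flagged_true = base_rate * sensitivity
    flagged_false = (1 - base_rate) * false_positive_rate
    return flagged_true / (flagged_true + flagged_false)

# One real "plotter" per million records, a detector that catches 99%
# of them and falsely flags only 0.1% of innocent records:
p = p_true_given_flag(1e-6, 0.99, 0.001)
print(p)  # roughly 0.001: about 999 of every 1000 flags are false alarms
```

Exactly as in the Harvard Medical School fallacy, Prob(plot | flagged) collapses even though Prob(flagged | plot) is near 1, because innocent records outnumber real plots by a factor of a million.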
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
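The dilution effect of a too-wide window is easy to quantify with made-up counts (hypothetical numbers, not the trial's; equal person-time in each arm is assumed so that event counts stand in for rates):<br />

```python
# hypothetical in-window event counts
treated_events, placebo_events = 30, 15
rr_narrow = treated_events / placebo_events        # 2.0: the real effect

# off-treatment events a too-wide window sweeps in, landing equally
# in both arms because they are unrelated to the drug
unrelated = 20
rr_wide = (treated_events + unrelated) / (placebo_events + unrelated)
# 50/35, about 1.43: the apparent risk ratio shrinks toward 1
```
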
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
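For readers who have not met them, Kaplan-Meier curves are straightforward to compute: survival is the running product of (1 - deaths/at-risk) over the observed event times. A minimal from-scratch sketch on toy data (not the trial's data or software):<br />

```python
def kaplan_meier(data):
    """Kaplan-Meier estimate of the survival function.

    data: list of (time, event) pairs, event = 1 if the event was
    observed at that time, 0 if the subject was censored then.
    Returns [(event_time, S(t))] as a running product of (1 - d/n).
    """
    event_times = sorted({t for t, e in data if e == 1})
    curve, s = [], 1.0
    for t in event_times:
        at_risk = sum(1 for ti, _ in data if ti >= t)            # n at t
        deaths = sum(1 for ti, e in data if ti == t and e == 1)  # d at t
        s *= 1.0 - deaths / at_risk
        curve.append((t, s))
    return curve

# tiny made-up dataset: times in months, event 0 marks censoring
curve = kaplan_meier([(1, 1), (2, 0), (3, 1), (4, 1), (5, 0)])
```
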
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
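The troublesome pattern Lagakos describes can be made concrete with a toy piecewise-exponential model (hazard rates invented for illustration): both arms share the same hazard through month 18, one arm's hazard doubles thereafter, so the survival curves coincide and then diverge.<br />

```python
import math

def survival(t, hazard_early, hazard_late, change=18.0):
    """S(t) for a piecewise-constant hazard that switches at month `change`."""
    if t <= change:
        return math.exp(-hazard_early * t)
    return math.exp(-(hazard_early * change + hazard_late * (t - change)))

# placebo hazard stays at 0.001/month; treated hazard doubles after month 18
s_placebo = {t: survival(t, 0.001, 0.001) for t in (6, 18, 36)}
s_treated = {t: survival(t, 0.001, 0.002) for t in (6, 18, 36)}
# identical through month 18, separated by month 36
```

A proportional-hazards model fit to data like this would mislead, which is why an interaction between treatment and (log) time is worth testing.<br />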
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib increases risk only in the long term, and not in the short term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>

Chance News 18 (contributed by Mmartin, 2006-07-11): https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2786
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Kremer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor and active freelance writer Darrell Huff published a slim (142 page) volume, which over time would become the most widely read statistics book in the history of the world. <br><br>There is some irony to the world's most famous statistics book having been written by a person with no formal training in statistics, but there is also some logic to how this came to be. Huff had a thorough training for excellence in communication, and he had an exceptional commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement of the late 1980s and 1990s and asks "who would teach it?" He is not very optimistic about this being done in statistics courses or in social science courses; we ourselves were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem, but we're not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read "The Bell Curve" by Richard Herrnstein and Charles Murray to review it for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths and misdirection to a more ambitious approach: to borrow from Goebbels, the Big Lie. The necessary and sufficient condition for a successful Big Lie is that the target book has at some point discussed a politically sensitive issue involving gender, race, class or the environment, and has treated this issue as a scientifically legitimate subject of investigation (note that the discussion need not be a long one, nor is it required that the target book takes a strong position, nor need the topic be relevant to the book's main argument). Once this condition is met, you can restate the book's position on this topic in a way that most people will find repugnant (e.g., women are inferior to men, blacks are inferior to whites, we don't need to worry about the environment), and then claim that this repugnant position is what the book is about.<br><br>What makes the Big Lie so powerful is the multiplier effect you can get from the media. A television news show or a syndicated columnist is unlikely to repeat a technical criticism of the book, but a nicely framed Big Lie can be newsworthy. And remember: It's not just the public who won't read the target book. Hardly anybody in the media will read it either. If you can get your accusation into one important outlet, you can start a chain reaction. Others will repeat your accusation, soon it will become the conventional wisdom, and no one will remember who started it. Done right, the Big Lie can forever after define the target book in the public mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states with a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
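Neither book's dataset is reproduced here, but the idea of "controlling for confounding variables" in a linear regression can be sketched with toy numbers. In the made-up data below, the policy variable x has no true effect and the confounder z drives the outcome; least squares with z included recovers a zero coefficient for x, while a naive group comparison does not. The solver and data are illustrative only, not either author's actual model:<br />

```python
def ols(X, y):
    """Ordinary least squares via the normal equations (X'X)b = X'y,
    solved by Gaussian elimination. X: list of rows, y: list."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):                       # forward elimination w/ pivoting
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        b[i], b[p] = b[p], b[i]
        for r in range(i + 1, k):
            f = A[r][i] / A[i][i]
            for c in range(i, k):
                A[r][c] -= f * A[i][c]
            b[r] -= f * b[i]
    coef = [0.0] * k
    for i in range(k - 1, -1, -1):           # back substitution
        coef[i] = (b[i] - sum(A[i][j] * coef[j] for j in range(i + 1, k))) / A[i][i]
    return coef

# columns: intercept, policy x, confounder z; y = 1 + 0*x + 2*z exactly
X = [[1, 0, 0], [1, 0, 1], [1, 1, 1], [1, 1, 2]]
y = [1, 3, 3, 5]
coef = ols(X, y)                    # ~[1.0, 0.0, 2.0]: x's effect vanishes
naive = (3 + 5) / 2 - (1 + 3) / 2   # 2.0: ignoring z wrongly credits x
```

The naive comparison attributes to x the effect of z, because x and z are correlated in the data; including z in the regression removes that confounding.<br />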
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not name the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Rennee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
While the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited"." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
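The "life table" technique the researchers used is easy to illustrate. Assuming (purely hypothetical) annual marriage hazards taken from an older cohort, the chance that a single 30-year-old ever marries is one minus the product of the year-by-year probabilities of staying single:<br />

```python
# invented annual marriage hazards by age band, in the spirit of a
# life-table projection (illustrative numbers only, not the study's)
hazards = [((30, 35), 0.08), ((35, 40), 0.04), ((40, 60), 0.01)]

p_single = 1.0          # probability a single 30-year-old never marries
for (start, end), h in hazards:
    p_single *= (1.0 - h) ** (end - start)

p_ever_marries = 1.0 - p_single   # roughly 0.56 with these made-up rates
```

The method's weakness, as Cherlin's comment suggests, is baked in: it applies the rates observed for yesterday's 40-year-olds to today's younger women, whose marriage behavior may differ.<br />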
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006 ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig: 2.6%<br><br />
1986 Census report: 17%-23%<br><br />
1996 Census report: 40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial that suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations, the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it were restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals Artfully Try to Boost Their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
If you were the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard tool of frequentist inference, the "p-value," is computed from the<br />
probability of data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
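As a concrete illustration of question 3, consider a hypothetical coin-tossing experiment (the numbers below are invented for illustration): a p-value is the probability, computed under the null hypothesis, of data this extreme or more extreme than what was observed, including outcomes that didn't happen.<br />

```python
from math import comb

# Hypothetical example: 9 heads observed in 10 tosses of a presumed-fair coin.
# The one-sided p-value sums the probabilities of 9 *and* 10 heads, even
# though 10 heads "didn't happen" -- this is the point of the question.
n, k = 10, 9
p_value = sum(comb(n, x) for x in range(k, n + 1)) / 2**n
print(p_value)  # 11/1024, about 0.011
```

The p-value thus depends on unobserved, more extreme outcomes, which is the tension with the rule that one must not "add data that didn't happen."<br />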
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA government wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
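Schneier's argument is the familiar base-rate calculation. Here is a minimal sketch; the population, plotter count, and accuracy figures below are invented for illustration and are not the numbers from his column.<br />

```python
# Hypothetical screening numbers -- chosen only to illustrate the base-rate effect.
population = 300_000_000   # people whose communications are screened
terrorists = 1_000         # actual plotters among them
sensitivity = 0.99         # P(flagged | terrorist): an "unrealistically accurate system"
false_alarm_rate = 0.001   # P(flagged | innocent)

true_pos = terrorists * sensitivity
false_pos = (population - terrorists) * false_alarm_rate

# Bayes theorem: P(terrorist | flagged)
ppv = true_pos / (true_pos + false_pos)
print(round(ppv, 4))  # 0.0033 -- over 99.6% of flagged people are innocent
```

Even with accuracy far beyond anything achievable, the tiny base rate of true plots means nearly every flag is a false alarm, which is Schneier's point.<br />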
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health, who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market temporarily.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
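The troublesome pattern can be seen in a small simulation. The sketch below does not use the APPROVe data; the hazard rates, sample sizes, and 18-month change point are invented for illustration. It generates two arms whose hazards coincide for the first 18 months and then diverge, and computes Kaplan-Meier survival estimates by hand.<br />

```python
import numpy as np

def km_survival(times, events, at):
    """Kaplan-Meier survival estimate S(at) for right-censored data (no ties)."""
    order = np.argsort(times)
    t, e = times[order], events[order]
    n = len(t)
    s = 1.0
    for i in range(n):
        if t[i] > at:
            break
        if e[i]:                      # an observed event shrinks the survival curve
            s *= 1.0 - 1.0 / (n - i)  # n - i subjects still at risk
    return s

def sim_arm(rng, n, h_early, h_late, change=18.0, end=36.0):
    """Event times under a piecewise-constant hazard; administrative censoring at `end`."""
    target = -np.log(rng.uniform(size=n))  # cumulative hazard reached at the event
    t = np.where(target <= h_early * change,
                 target / h_early,
                 change + (target - h_early * change) / h_late)
    return np.minimum(t, end), t < end     # (observed time, event indicator)

rng = np.random.default_rng(0)
t_pbo, e_pbo = sim_arm(rng, 5000, 0.010, 0.010)  # "placebo": constant hazard
t_trt, e_trt = sim_arm(rng, 5000, 0.010, 0.030)  # "drug": hazard triples after month 18
s18 = km_survival(t_pbo, e_pbo, 18), km_survival(t_trt, e_trt, 18)
s36 = km_survival(t_pbo, e_pbo, 36), km_survival(t_trt, e_trt, 36)
```

The estimated curves are nearly identical at month 18 but clearly separated at month 36, exactly the kind of departure from proportional hazards that makes a single overall hazard ratio misleading.<br />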
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib increases risk only over the long term, not the short term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
Chance News 18 (2006-07-11T15:48:20Z)<p>Mmartin: /* Newsweek says they were wrong */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Krämer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks "who would teach it". He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on half-truths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough, the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Although the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited"." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
<br />
While Newsweek admits they were wrong one gets the impression that their real mistake was the use of terrorist in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006 ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig<br> <br />
2.6% <br><br />
1986 Census report<br><br />
17%-23%<br><br />
1996 Census report<br>40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts interviewed by NPR, who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial who suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in Merck's speakers bureau.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it were restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she is barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals Artfully Try to Boost Their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citations to papers published by competitors, keeping their rivals' impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
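For reference, the standard two-year impact factor is a simple ratio: citations received in a given year to the items a journal published in the previous two years, divided by the number of citable items it published in those two years. A minimal sketch of the arithmetic (the journal and all numbers below are hypothetical):<br />

```python
def impact_factor(citations_to_prior_two_years, citable_items_prior_two_years):
    """Two-year impact factor: citations in year Y to items published in
    years Y-1 and Y-2, divided by the citable items published in those years."""
    return citations_to_prior_two_years / citable_items_prior_two_years

# Hypothetical journal: 200 citable articles published in 2004-2005,
# cited 500 times during 2006.
print(impact_factor(500, 200))  # 2.5
```

Because the numerator counts every citation, each self-citation an author is pushed to add raises the ratio directly, which is exactly the manipulation the article describes.<br />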
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
Assuming you are the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics, the<br />
"p-value," deals with data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
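For concreteness, here is what "this extreme or more extreme" means in a simple binomial setting (the coin example and its numbers are ours, not the article's): the p-value sums probability over the observed outcome and every more extreme outcome — outcomes that did not actually happen.<br />

```python
from math import comb

def binomial_p_value_one_sided(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the observed outcome plus
    every more extreme outcome that never occurred."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# Observing 9 heads in 10 tosses of a supposedly fair coin:
print(round(binomial_p_value_one_sided(10, 9), 4))  # 0.0107
```

So the p-value does reason about data "that didn't happen" — but only hypothetically, under the null model, rather than by inserting fabricated observations into the dataset.<br />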
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
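The medical-school version can be made concrete with the usual textbook numbers (prevalence 1 in 1000, a 5% false-positive rate, perfect sensitivity — illustrative figures, not from the article):<br />

```python
def posterior_prob_disease(prevalence, sensitivity, false_positive_rate):
    """Bayes theorem: P(Disease | Test+) from P(Test+ | Disease) and the base rate."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)

# P(Test+ | Disease) = 1.0, yet P(Disease | Test+) is only about 2%,
# because false positives swamp the rare true positives.
p = posterior_prob_disease(prevalence=0.001, sensitivity=1.0, false_positive_rate=0.05)
print(round(p, 3))  # 0.02
```

The inversion is dramatic: a test that never misses the disease still leaves a positive patient with only about a 1-in-50 chance of actually having it.<br />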
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining of phone calls and emails by means of NSA government wiretapping to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
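The flavor of Schneier's arithmetic is easy to reproduce (the figures below are our own hypothetical illustration, not his exact numbers): even an absurdly good screening system, applied to a vast stream of innocent communications, buries the handful of true leads under millions of false alarms.<br />

```python
def flagged_counts(events, base_rate, detection_rate, false_alarm_rate):
    """True- and false-positive counts when screening a huge event stream."""
    real = events * base_rate
    true_pos = real * detection_rate
    false_pos = (events - real) * false_alarm_rate
    return true_pos, false_pos

# Hypothetical: 10 trillion communications a year, 10 of them part of real
# plots, a perfect detector, and a 1-in-a-million false-alarm rate.
tp, fp = flagged_counts(10e12, 10 / 10e12, 1.0, 1e-6)
print(tp, round(fp))  # ~10 true leads buried among ~10 million false alarms
```

This is the base rate fallacy at industrial scale: the false-alarm rate multiplies an enormous population of innocents, while the detection rate multiplies a tiny population of plotters.<br />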
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14 day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical but then diverge sharply at a certain time point, and two survival curves that are initially different but converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
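The troublesome pattern — curves that coincide and then separate — can be seen with a toy piecewise-constant hazard (all rates below are made-up numbers, not APPROVe data): the hazard ratio is 1 for the first 18 months and 2 afterwards, a situation the usual Cox model, with its single constant hazard ratio, cannot represent without a time-varying term.<br />

```python
from math import exp

def survival(t_months, base_rate=0.001, ratio_after_18=1.0):
    """Survival under a piecewise-constant hazard: base_rate per month up to
    month 18, base_rate * ratio_after_18 thereafter."""
    if t_months <= 18:
        return exp(-base_rate * t_months)
    return exp(-base_rate * 18 - base_rate * ratio_after_18 * (t_months - 18))

placebo_36 = survival(36)                      # hazard ratio 1 throughout
treated_36 = survival(36, ratio_after_18=2.0)  # hazard doubles after month 18

# Curves coincide through month 18, then diverge.
print(survival(18) == survival(18, ratio_after_18=2.0))  # True
print(placebo_36 > treated_36)                           # True
```

An interaction between treatment and (some function of) time is one way to let the model's hazard ratio change at or after such a point, which is exactly the test the APPROVe investigators planned.<br />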
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib increases risk only in the long term and not in the short term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_18&diff=2784 Chance News 18, 2006-07-11, <p>Mmartin: /* How to Lie with Statistics Turns Fifty */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote>Single 40-year-old women have a better chance of being killed by a terrorist than getting married.</blockquote><br />
<br />
<div align="right" >[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986</div><br />
<br />
See: [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_18#Newsweek_says_they_were_wrong Newsweek says they were wrong]<br />
<br />
==Forsooths==<br />
<br />
These Forsooths are from the June 2006 ''RSS News''.<br />
<br />
<blockquote> This summer there's about a 50 per cent probability that there will be above normal temperatures for much of Britain and Europe.<br><br />
<div align=right>''The Times''<br><br />
5 March 2004<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> To convert kilometres to miles multiply by .6214; kilometres/hour to miles/hour multiply by .6117<br><br />
<div align=right>''Schott's Almanac'', page 193, Table of Conversions.<br />
</div></blockquote><br />
----<br />
<blockquote> <br />
The BBC remains just ahead of commercial radio in the UK, with a 67% share of all listeners compared with 64%.<br />
<br><br />
<div align="right">BBC news website<br><br />
2 February 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
==Statz Rappers==<br />
[http://video.google.com/videoplay?docid=489221653835413043 A statistics class at the University of Oregon had an imaginative graduate teaching assistant.]<br />
<br />
==How to Lie with Statistics Turns Fifty==<br />
"How to Lie with Statistics Turns Fifty"<br><br />
[http://www.imstat.org/sts/issue_20_3.html Special Section: ''Statistical Science'', Vol. 20. No 3, August 2005]<br />
<br />
''The College Mathematics Journal'' (CMJ) has a column called "Media Highlights" which covers mathematics generally and its reviews often involve probability or statistical concepts. In the May 2006 issue of CMJ, Norton Starr reviews this special section of ''Statistical Science'' that recognizes the 50th birthday of Darrell Huff's famous book "How to Lie with Statistics" by asking several authors to contribute articles for this birthday party. These articles are:<br />
<br />
"Darrell Huff and Fifty Years of How to Lie with Statistics", Michael Steele.<br />
<br />
"Lies, Calculations and Constructions: Beyond How to Lie with Statistics", Joel Best.<br />
<br />
"Lying with Maps", Mark Monmonier.<br />
<br />
"How to Confuse with Statistics or: The Use and Misuse of Conditional Probabilities", Walter Krämer and Gerd Gigerenzer.<br />
<br />
"How to Lie with Bad Data", Richard D. De Veaux and David J. Hand.<br />
<br />
"How to Accuse the Other Guy of Lying with Statistics", Charles Murray.<br />
<br />
"Ephedra", Sally C. Morton.<br />
<br />
"In Search of the Magic Lasso: The Truth About the Polygraph", Stephen, E. Fienberg and Paul C. Stern.<br />
<br />
Norton gives a nice description of each of the papers including some of his own insightful comments. We will restrict ourselves to some quotes from the articles that we found particularly interesting. <br />
<br />
Michael Steele tells us the story of the life of Darrell Huff and begins with:<br />
<br />
<blockquote> In 1954 former ''Better Homes and Gardens'' editor<br />
and active freelance writer Darrell Huff published a<br />
slim (142 page) volume, which over time would become<br />
the most widely read statistics book in the history<br />
of the world. <br><br><br />
There is some irony to the world's most famous statistics<br />
book having been written by a person with no<br />
formal training in statistics, but there is also some logic<br />
to how this came to be. Huff had a thorough training<br />
for excellence in communication, and he had an exceptional<br />
commitment to doing things for himself.</blockquote><br />
<br />
In his article Joel Best reminds us of the failure of the "critical thinking" movement in the late 1980s and the 1990s and asks, "Who would teach it?" He is not very optimistic about this being done in statistics courses or in social science courses. And we were not very successful in getting people to teach our Chance course. He concludes his article with:<br />
<br />
<blockquote> We all know statistical literacy is an important problem,<br />
but we’re not going to be able to agree on its place in the curriculum. Which means that "How to Lie with Statistics" is going to continue to be needed in the years ahead. </blockquote><br />
<br />
When we read the "The Bell Curve" by Richard Herrnstein and Charles Murray to review for Chance News, it seemed to us that the reviewers in the major newspapers could not have actually read the book. So we wrote a long review of the book for Chance News ([http://www.dartmouth.edu/~chance/chance_news/recent_news/recent.html Chance News 3.15, 3.16, 4.01]).<br />
<br />
In his article Charles Murray explains six ways to knock down a book. He describes these as:<br />
<br />
<blockquote> Tough but effective strategies for making people think that the target book is an irredeemable mess, the findings are meaningless, the author is incompetent and devious and the book’s thesis is something it isn’t. </blockquote><br />
<br />
Our experience with "The Bell Curve" made us realize that we may have seen an example of his sixth way to knock down a book which he calls "THE BIG LIE" and describes as follows:<br />
<br />
<blockquote>Finally, let us turn from strategies based on halftruths<br />
and misdirection to a more ambitious approach:<br />
to borrow from Goebbels, the Big Lie.<br />
The necessary and sufficient condition for a successful<br />
Big Lie is that the target book has at some point<br />
discussed a politically sensitive issue involving gender,<br />
race, class or the environment, and has treated this issue<br />
as a scientifically legitimate subject of investigation<br />
(note that the discussion need not be a long one, nor is<br />
it required that the target book takes a strong position,<br />
nor need the topic be relevant to the book's main argument).<br />
Once this condition is met, you can restate the<br />
book's position on this topic in a way that most people<br />
will find repugnant (e.g., women are inferior to men,<br />
blacks are inferior to whites, we don't need to worry<br />
about the environment), and then claim that this repugnant<br />
position is what the book is about.<br><br><br />
What makes the Big Lie so powerful is the multiplier<br />
effect you can get from the media. A television news<br />
show or a syndicated columnist is unlikely to repeat<br />
a technical criticism of the book, but a nicely framed<br />
Big Lie can be newsworthy. And remember: It's not<br />
just the public who won't read the target book. Hardly<br />
anybody in the media will read it either. If you can get<br />
your accusation into one important outlet, you can start<br />
a chain reaction. Others will repeat your accusation,<br />
soon it will become the conventional wisdom, and no<br />
one will remember who started it. Done right, the Big<br />
Lie can forever after define the target book in the public<br />
mind.</blockquote><br />
<br />
Finally we agree with Norton's final remark in his review:<br />
<br />
<blockquote> The articles are both a compliment to and a complement of Huff's pathbreaking venture in writing. [http://www.imstat.org/sts/issue_20_3.html This issue of '' Statistical Science''] is destined to be a collector's item.</blockquote><br />
<br />
Submitted by Laurie Snell<br />
<br />
==What does "unable to replicate" mean?==<br />
<br />
[http://www.bloomberg.com/apps/news?pid=10000088&sid=a1ELJy6bUuTk&refer=culture "Freakonomics" Author and HarperCollins Sued for Defamation], Kevin Orland, April 11, 2006, Bloomberg.com.<br />
<br />
John Lott is an economist who has published a book "More Guns, Less Crime" that uses a multiple linear regression model to demonstrate that crime rates go down when states pass "concealed carry" laws. Concealed carry laws allow citizens to apply for the right to legally carry a concealed gun for their own protection. The regression model controlled for a large number of possible confounding variables. The theory is that if criminals do not know which of their victims might be armed, they would be more reluctant to mug strangers. This theory is very controversial and has come under attack from gun control advocates.<br />
<br />
Steven D. Levitt, an economist, and Stephen J. Dubner, a journalist, published a book "Freakonomics" that uses a multiple linear regression model in Chapter 4 to demonstrate that states which have a high abortion rate saw a larger drop in crime than states with a low abortion rate. The regression model controlled for a large number of possible confounding variables. The theory is that if abortion laws reduced the number of "unwanted children," fewer children would grow up in an environment of neglect and end up becoming criminals. This theory is very controversial and has come under attack from right-to-life groups.<br />
<br />
It is not too surprising that the authors of two such provocative regression models would end up in a public clash. Levitt and Dubner criticize Lott's research in their book, and Lott has responded by suing.<br />
<br />
<blockquote>Lott said in a federal lawsuit filed yesterday in Chicago that Levitt, a University of Chicago economist, defamed him when he wrote that other scholars have been unable to replicate Lott's research linking lower crime rates with the right to carry guns. The passage amounts to an allegation that Lott falsified his results, according to the suit.</blockquote><br />
<br />
There are actually much stronger allegations about fraud concerning Lott's research. Timothy Noah, for example, published an article in Slate magazine about Lott with the title "[http://www.slate.com/id/2078084/ Another firearms scholar whose dog ate his data.]"<br />
<br />
But apparently, the allegation of failure to replicate is more serious.<br />
<br />
<blockquote>The allegation "damages Lott's reputation in the eyes of the academic community in which he works, and in the minds of the hundreds of thousands of academics, college students, graduate students, and members of the general public who read 'Freakonomics,'" Lott said in the lawsuit.</blockquote><br />
<br />
The remedies suggested by Lott are rather harsh.<br />
<br />
<blockquote>Lott's suit asks for a halt in sales, a retraction in the next printing of the book and unspecified damages from Levitt and HarperCollins.</blockquote><br />
<br />
Interestingly enough the suit does not mention the co-author, Stephen Dubner.<br />
<br />
===Questions===<br />
<br />
1. What does the phrase "unable to replicate" mean to you? Does replication mean different things in economics versus medicine? Is "unable to replicate" a code phrase used to hint that the data is fraudulent?<br />
<br />
2. Why do you think that Lott sued Levitt and not Noah?<br />
<br />
3. What impact might this lawsuit have on scientific criticism?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Newsweek says they were wrong==<br />
<br />
[http://msnbc.msn.com/id/13007828/site/newsweek/ Marriage by the Numbers]<br> Newsweek, June 6, 2006,<br />
society; Pg. 40<br><br />
Daniel McGinn; With Andrew Murr, Karen Springen, Joan Raymond, Marc Bain, Alice-Azania Jarvis and Sam Register<br />
<br />
<br />
[http://msnbc.msn.com/id/12940202/site/newsweek/ Too Late for Prince Charming]<br>Newsweek, June 2, 1986, Lifestyle, Pg. 58<br><br />
Eloise Salholz, Renee Michael, Mark Starr, Shawn Doherty, Pamela Abramson, Pat Wingert.<br />
<br />
[http://www.latimes.com/news/opinion/commentary/la-oe-daum3jun03,0,6461972.column?coll=la-home-commentary Lies, damn lies and marriage statistics]<br> ''Los Angeles Times'', June 3, 2006 Editorial Pages Desk; Part B; Pg. 17 <br><br />
Meghan Daum.<br />
<br />
The 1986 Newsweek article begins with:<br />
<blockquote>HIGHLIGHT:<br>A new study reports that college-educated women who are still single at the age of 35 have only a 5 percent chance of ever getting married<br><br />
BODY:<br><br />
Her sister had heard about it from a friend who had heard about it on "Phil Donahue" that morning. Her mother got the bad news via a radio talk show later that afternoon. So by the time Harvard graduate Carol Owens, 23, sat down to a family dinner in Boston, the discussion of the man shortage had reached a feverish pitch. With six unmarried daughters, Carol's mother was sounding an alarm. "You've got to get out of the house and meet someone," she insisted. "Now." </blockquote><br />
<br />
After two more such examples the article goes on to say:<br />
<br />
<blockquote>The traumatic news came buried in an arid demographic study titled, innocently enough, "Marriage Patterns in the United States." But the dire statistics confirmed what everybody suspected all along: that many women who seem to have it all -- good looks and good jobs, advanced degrees and high salaries -- will never have mates. According to the report, white, college-educated women born in the mid-'50s who are still single at 30 have only a 20 percent chance of marrying. By the age of 35 the odds drop to 5 percent. Forty-year-olds are more likely to be killed by a terrorist: they have a minuscule 2.6 percent probability of tying the knot.</blockquote><br />
<br />
Although the study reported on white, college-educated women, it was clearly the sentence "Forty-year-olds are more likely to be killed by a terrorist" that gave the article such a big impact on the public. We read further:<br />
<br />
<blockquote>Within days, that study, as it came to be known, set off a profound crisis of confidence among America's growing ranks of single women. For years bright young women single-mindedly pursued their careers, assuming that when it was time for a husband they could pencil one in. They were wrong. "Everybody was talking about it and everybody was hysterical," says Bonnie Maslin, a New York therapist. "One patient told me 'I feel like my mother's finger is wagging at me, telling me I shouldn't have waited"." Those who weren't sad got mad. The study infuriated the contentedly single, who thought they were being told their lives were worthless without a man. "I'm not a little spinster who sits home Friday night and cries," says Boston contractor Lauren Aronson, 29. "I'm not married, but I still have a meaningful life with meaningful relationships."</blockquote><br />
<br />
On the cover of the 2006 article we see:<br />
<center><font size=5>'''20 Years Ago'''</font><br><font size=3>'''Newsweek Predicted a Single 40-Year-Old Woman <br> Had a Better Chance of Being Killed by a Terrorist <br> Than Getting Married. Why We Were Wrong.'''</font></center><br />
<br />
From the 2006 Newsweek article we read:<br />
<br />
<blockquote> To mark the anniversary of the "Marriage Crunch" cover, NEWSWEEK located 11 of the 14 single women in the story. Among them, eight are married and three remain single. Several have children or stepchildren. None divorced. Twenty years ago Andrea Quattrocchi was a career-focused Boston hotel executive and reluctant to settle for a spouse who didn't share her fondness for sailing and sushi. Six years later she met her husband at a beachfront bar; they married when she was 36. Today she's a stay-at-home mom with three kids--and yes, the couple regularly enjoys sushi and sailing. "You can have it all today if you wait--that's what I'd tell my daughter," she says. " 'Enjoy your life when you're single, then find someone in your 30s like Mommy did'." </blockquote><br />
<br />
The writers for Newsweek go on to say:<br />
<br />
<blockquote> The research that led to the highly touted marriage predictions began at Harvard and Yale in the mid-1980s. Three researchers--Neil Bennett, David Bloom and Patricia Craig--began exploring why so many women weren't marrying in their 20s, as most Americans traditionally had. Would these women still marry someday, or not at all? To find an answer, they used "life table" techniques, applying data from past age cohorts to predict future behavior--the same method typically used to predict mortality rates. "It's the staple [tool] of demography," says Johns Hopkins sociologist Andrew Cherlin. "They were looking at 40-year-olds and making predictions for 20-year-olds." The researchers focused on women, not men, largely because government statisticians had collected better age-of-marriage data for females as part of its studies on fertility patterns and birthrates.<br><br><br />
<br />
Enter NEWSWEEK. We were hardly the first to make a big deal out of their findings, which began getting heavy media attention after the Associated Press wrote about the study that February. People magazine put the study on its cover in March with the headline the new look in old maids. And NEWSWEEK's story might be little remembered if it weren't for the "killed by a terrorist" line, first hastily written as a funny aside in an internal reporting memo by San Francisco correspondent Pamela Abramson. "It's true--I am responsible for the single most irresponsible line in the history of journalism, all meant in jest," jokes Abramson, now a freelance writer who, all kidding aside, remains contrite about the furor it started. In New York, writer Eloise Salholz inserted the line into the story. Editors thought it was clear the comparison was hyperbole. "It was never intended to be taken literally," says Salholz. Most readers missed the joke. </blockquote><br />
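The "life table" projection described in the quoted passage can be sketched in a few lines of Python. The age-specific first-marriage rates below are invented for illustration (they are not the Bennett-Bloom-Craig estimates); the method simply compounds an older cohort's observed rates forward, which is precisely the assumption critics questioned when it was applied to a younger cohort.

```python
# Hypothetical sketch of a "life table" marriage projection.
# The rates are invented for illustration only.

# assumed annual probability of first marriage for a still-single woman
hypothetical_rates = {
    (30, 35): 0.040,
    (35, 40): 0.015,
    (40, 50): 0.005,
}

def eventual_marriage_prob(current_age, horizon=50):
    """Chance a currently single woman marries before `horizon`,
    assuming the older cohort's rates hold unchanged."""
    p_still_single = 1.0
    for age in range(current_age, horizon):
        rate = next((r for (lo, hi), r in hypothetical_rates.items()
                     if lo <= age < hi), 0.0)
        p_still_single *= 1.0 - rate
    return 1.0 - p_still_single

p30 = eventual_marriage_prob(30)
p35 = eventual_marriage_prob(35)
p40 = eventual_marriage_prob(40)
```

With these invented rates the projected chance of eventually marrying falls steadily with age, which is the qualitative pattern (20% at 30, 5% at 35, 2.6% at 40) the study reported.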
<br />
While Newsweek admits they were wrong, one gets the impression that their real mistake was the use of "terrorist" in their comparison.<br />
<br />
Finally, some comments by Meghan Daum from her June 3, 2006 ''Los Angeles Times'' column.<br />
<br />
<blockquote>Since at least the 1970s, we've surfed the waves of any number of media-generated declarations about what women want, what we don't want, what we're capable of and, inevitably, what it's like to figure out that we're not capable of all that stuff after all, which doesn't matter because it turns out we didn't want it anyway. <br><br><br />
<br />
Like hem lengths, scare tactics wrought by questionably massaged statistics change with the seasons. After the difficulty of marrying came the challenge of getting pregnant later in life. The panic du jour, of course, is the apparent near-impossibility of effectively raising kids while maintaining a career. Somehow this topic registers as sexier than what's happening in, say, Iraq or Darfur. In our more myopic moments, we seem to believe that people in refugee camps aren't nearly as stressed out as your average law school grad with a Baby Bjorn.</blockquote><br />
<br />
Well, we did not add anything to this story but sometimes it seems best to let the players speak for themselves.<br />
<br />
===Discussion questions===<br />
<br />
(1) The article includes several graphics giving the results of studies on women and marriage. Here is one of these. Note that the first two studies were reported at about the same time.<br />
<br />
<center>Three studies tried to gauge the odds of a<br><br />
40-year-old woman's eventually marrying.</center><br />
<br />
<center>Bennett, Bloom & Craig: 2.6%<br><br />
1986 Census report: 17%-23%<br><br />
1996 Census report: 40.8%</center><br />
<br />
Do you think that "eventually marrying" is correct? See if you can find the first two studies and see if you can explain the difference in the first two outcomes.<br />
<br />
(2) Do you think that the Newsweek editors were really surprised that their readers did not recognize their joke?<br />
<br />
<br />
<br />
Submitted by Laurie Snell<br />
<br />
==Independence of a DSMB is questioned==<br />
<br />
[http://www.npr.org/templates/story/story.php?storyId=5462419 Conflicted Safety Panel Let Vioxx Study Continue], Snigdha Prakash, June 8, 2006, National Public Radio.<br />
<br />
Vioxx is a pain reliever manufactured by Merck which has a [http://www.npr.org/templates/story/story.php?storyId=5470430 complex and controversial history.] There have been recent revelations about serious conflicts of interest in the Data Safety Monitoring Board (DSMB) for a large scale trial, the Vioxx Gastrointestinal Outcomes Research study (VIGOR). This is not the trial that resulted in Vioxx being removed from the market, but rather an earlier trial.<br />
<br />
The DSMB reviewed data in 2000 that indicated a difference in cardiovascular risk between Vioxx and the comparison drug, naproxen. If the VIGOR trial had been ended early because of an increased risk of heart problems, perhaps Vioxx would have been removed from the market four years earlier, saving countless lives and avoiding the flood of lawsuits that Merck is now facing.<br />
<br />
The DSMB, however, did not stop the study early and offered several explanations. First, the DSMB <br />
<br />
<blockquote>couldn't tell if Vioxx was causing the heart problems or if naproxen, acting like low-dose aspirin, protected people from them, making Vioxx just look risky by comparison.</blockquote><br />
<br />
This contention was disputed by several experts that NPR interviewed who pointed out that the reason for the discrepancy was irrelevant to those patients in the VIGOR trial that suffered harm as a result of their participation in the study. Also, there was no solid evidence that naproxen had a protective effect.<br />
<br />
The DSMB was also concerned about the small sample size. One of the experts disagreed with this contention also. The results were indeed statistically significant, and were consistent across all subgroups.<br />
<br />
<blockquote>Curt Furberg concedes the number of heart problems and deaths was small. But he says it's clear the results weren't due to chance. He says the patterns were the same in every population group in the study.</blockquote><br />
<br />
<blockquote>FURBERG: In old people, young people, those who have hypertension, those who don't, etc. And the findings were very, very consistent. So in my mind, this confirms that the findings are real.</blockquote><br />
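Furberg's point, that a small number of events can still be clearly inconsistent with chance, is easy to illustrate. The counts below are hypothetical (the actual VIGOR event counts are not given in the NPR story); with arms of an assumed 4,000 patients each, only 33 events in total still yield a small p-value:

```python
# Hypothetical event counts (invented for illustration; NOT the actual
# VIGOR data) showing that "small numbers" can still be statistically
# significant when the imbalance between arms is large enough.
from scipy.stats import fisher_exact

n_per_arm = 4000       # assumed number of patients in each arm
events_drug = 25       # assumed cardiovascular events on the test drug
events_comparator = 8  # assumed events on the comparator

table = [[events_drug, n_per_arm - events_drug],
         [events_comparator, n_per_arm - events_comparator]]
odds_ratio, p_value = fisher_exact(table)
```

Here only 33 events occur among 8,000 patients, yet the two-sided p-value falls well below 0.05, so a small event count does not by itself make a finding attributable to chance.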
<br />
The DSMB also did not stop the study early because the trial was almost completely over.<br />
<br />
Again, Dr. Furberg objects to this logic.<br />
<br />
<blockquote>Curt Furberg says it does take time to stop a large, multinational study, and only a few additional heart attacks or deaths could have been predicted to occur in the remaining time. But he says:</blockquote><br />
<br />
<blockquote>FURBERG: I think we have obligations -- ethical, moral obligations. You don't want to expose patients to a harmful drug in a drug study. They should not be treated like guinea pigs. They are human beings. And we need to respect their rights. </blockquote><br />
<br />
The DSMB also wanted the trial to continue because it was addressing a very important question.<br />
<br />
<blockquote>Vioxx could save lives, if the study showed that Vioxx caused less gastrointestinal bleeding.</blockquote><br />
<br />
Another expert interviewed by NPR disagreed.<br />
<br />
<blockquote>But cardiologist Paul Armstrong counters such bleeding isn't common.</blockquote><br />
<br />
<blockquote>ARMSTRONG: The frequency with which that occurs is minor, and I would say unlikely to be counterbalanced by this excess in death and cardiovascular events<br />
</blockquote><br />
<br />
There were several conflicts of interest among members of the DSMB. The chair of the DSMB owned $73,000 in Merck stock. Shortly after the DSMB finished its work, the chair received a consulting contract for 12 days of work at $5,000 per day. Although it probably wasn't as lucrative, another member of the DSMB participated in the speakers bureau at Merck.<br />
<br />
Another concern raised was the presence of a Merck statistician during all deliberations of the DSMB. It is not unusual for a company statistician to present data to the DSMB, but in most situations the statistician then removes himself/herself from any additional discussion.<br />
<br />
<br />
===Questions===<br />
<br />
1. If there is a statistically significant difference in the risk of side effects between two arms of the study, should the DSMB stop the study? Does the reason for the discrepancy have any relevance?<br />
<br />
2. Why would consistency across a wide range of subgroups in a study strengthen the credibility of a finding? How would you interpret such a finding if it was restricted to a specific subgroup? What action would be appropriate for that subgroup?<br />
<br />
3. How large a financial stake should a person have before he/she should be barred from serving on a DSMB?<br />
<br />
4. If you were serving on a DSMB, would you be troubled by the presence of a company statistician during all deliberations?<br />
<br />
5. The members of a DSMB are typically selected by the company whose drug is being studied. Is there a problem with this approach? Can you suggest an alternative method for selecting members of a DSMB?<br />
<br />
Submitted by Steve Simon<br />
<br />
==Impact Factors==<br />
[http://online.wsj.com/public/article/SB114946859930671119-eB_FW_Satwxeah21loJ7Dmcp4Rk_20070604.html?mod=rss_free Science Journals artfully try to boost their Rankings]<br><br />
''Wall Street Journal'', June 5, 2006, B1<br><br />
Sharon Begley<br />
<br />
It always comes as a shock to students fresh out of high school chemistry and physics classes--where data is deemed sacred--to be told that in statistics it is legitimate to remove outliers. What is beyond the pale is to add data that didn't happen. This obvious restriction is now being loosened in a strange way. According to this ''Wall Street Journal'' article, researchers submitting papers to a particular scientific journal are being pushed to augment their articles with bibliographic citations of that specific journal. "Scientists and editors say scientific journals increasingly are manipulating rankings--called 'impact factors'--that are based on how often papers they publish are cited by other researchers."<br />
<br />
Why? Because "Impact factors are essentially a grading system of how important the papers a journal publishes are." Besides inflating a journal's reputation, "Journals can [also] limit citiations to papers published by competitors, keeping their rivals'impact factors down." As always, follow the money: "Impact factors matter to publishers' bottom lines because librarians rely on them to make purchasing decisions. Annual subscriptions to some journals can cost upwards of $10,000."<br />
<br />
===Discussion===<br />
<br />
1. In the ''Wall Street Journal'' article, several scientific journal editors<br />
deny that the impact factor plays any role in the selection of papers.<br />
If you were the editor, what would you tell would-be authors? What would<br />
you tell your reviewers?<br />
<br />
2. The article further states, "Scientists and publishers worry that the<br />
cult of the impact factor is skewing the direction of scientific research."<br />
Elaborate.<br />
<br />
3. A standard technique in frequentist inferential statistics is the<br />
"p-value," which deals with data this extreme or more extreme. How does this<br />
square with the sentence "What is beyond the pale is to add data that<br />
didn't happen"?<br />
<br />
==Privacy vs. Security via Bayes Theorem==<br />
<br />
We're giving up privacy and getting little in return<br><br />
''Minneapolis Star Tribune'', May 31, 2006<br><br />
Bruce Schneier<br />
<br />
Bayes theorem (Bayesian inversion) is customarily introduced either via the so-called Harvard Medical School fallacy or the so-called prosecutor's fallacy. The former illustrates that the Prob(Disease|Test +)--what the patient wants to know--can be quite different from Prob(Test +|Disease)--the usual information given the patient by the doctor--when the number of false positives is large compared to the number of true positives. Likewise, the latter fallacy shows that Prob(Guilty|DNA matches) can be quite different from Prob(DNA matches|Guilty).<br />
<br />
However, we now live in an era where privacy and security have become the watchwords of the day, affording us an unexpected and possibly unpleasant application of Bayes theorem. Bruce Schneier, a specialist in computer security, argues that data mining by means of NSA government wiretapping of phone calls and emails to uncover terrorist plots is essentially fruitless because of the incredibly large number of false positives in comparison to the tiny number of true positives [Minneapolis Star Tribune, May 31, 2006]. Or, as he puts it, even an "unrealistically accurate system" will be such that "the police will have to investigate 27 million potential plots in order to find the one real terrorist plot per month. Clearly ridiculous." He concludes that "By allowing the NSA to eavesdrop on us all, we're not trading privacy for security. We're giving up privacy without getting any security in return."<br />
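Schneier's argument is a direct application of Bayes theorem. The sketch below uses illustrative numbers, not figures from his column, to show how a tiny base rate swamps even a very accurate detector:

```python
# Back-of-the-envelope Bayes' theorem calculation behind Schneier's point.
# All numbers below are illustrative assumptions.

n_messages = 10_000_000_000  # communications scanned per month (assumed)
n_plot_related = 10          # of them actually tied to a real plot (assumed)

sensitivity = 0.99            # P(flagged | plot-related), generously assumed
false_positive_rate = 0.001   # P(flagged | innocent): an "unrealistically accurate" 0.1%

true_pos = n_plot_related * sensitivity
false_pos = (n_messages - n_plot_related) * false_positive_rate

# P(plot-related | flagged): true positives as a share of everything flagged
posterior = true_pos / (true_pos + false_pos)
```

Even with these generous assumptions the posterior probability is on the order of one in a million: virtually everything the police would investigate is a false alarm, which is the base rate fallacy in action.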
<br />
===Discussion===<br />
<br />
1. Schneier maintains that "Data mining works best when you're searching for a well-defined profile, a reasonable number of attacks per year, and a low cost of false alarms. Credit-card fraud is one of data mining's success stories: All credit-card companies mine their transaction databases for data for spending patterns that indicate a stolen card. Many credit-card thieves share a pattern." What pattern do credit-card thieves tend to have? What pattern, if any, is there for terrorists? Why would you react differently to a phone call from your credit-card company checking on one of your transactions as opposed to a government official questioning the web sites you visit?<br />
<br />
2. He uses the term "base rate fallacy" to describe the imbalance between false positives and true positives. Why is this term indicative of the problem?<br />
<br />
3. In the context of uncovering terrorist plots, what is meant by false negatives and true negatives?<br />
<br />
4. He claims, "It's a needle-in-a-haystack problem, and throwing more hay on the pile doesn't make that problem any easier." What do you think he means by this image?<br />
<br />
<br />
Submitted by Paul Alper<br />
<br />
==The interaction that wasn't there==<br />
<br />
[http://content.nejm.org/cgi/reprint/NEJMp068137v1.pdf Time-to-Event Analyses for Long-Term Treatments -- The APPROVe Trial.] Stephen W. Lagakos. The New England Journal of Medicine. 2006 June 26; [Epub ahead of print]<br />
<br />
Vioxx (rofecoxib), a pain relief medication in a class of drugs known as Cox-2 inhibitors, is the story that just won't go away. On June 26, 2006, the ''New England Journal of Medicine'' (NEJM) released a publication by Stephen Lagakos re-analyzing data from a pivotal trial, the Adenomatous Polyp Prevention on Vioxx (APPROVe) trial. At the same time, the Journal published two letters critical of the original publication of the APPROVe trial (Bresalier RS, Sandler RS, Quan H, et al. Cardiovascular events associated with rofecoxib in a colorectal adenoma chemoprevention trial. NEJM 2005; 352: 1092-102, not available online.), a response from the first two authors of the original study, and a correction to the original publication. All the articles are interesting, but especially the one by Dr. Lagakos, a professor of biostatistics at the Harvard School of Public Health who was hired by NEJM to produce an independent review of the APPROVe study. He comments on a particular side effect in the trial (cardiovascular events), which was of enough concern to force Merck to take Vioxx off the market.<br />
<br />
<blockquote>Assessment of the cardiovascular data raises important issues about the analysis and interpretation of a time-to-event end point in a randomized, placebo controlled trial evaluating a long term treatment. These issues include the appropriate period of follow-up for safety outcomes after the discontinuation of treatment; the purpose and implications of checking the assumption of proportional hazards, which underlies the commonly used logrank test and Cox model; and what the results of a trial examining long-term use imply about the safety of a drug if it were given for shorter periods.</blockquote><br />
<br />
The APPROVe trial originally analyzed events during the course of treatment (up to 36 months) and any events that occurred within 14 days of discontinuation of the drug or placebo. The 14-day window after cessation of treatment is critical. If the window is too narrow, you might miss some events that were related to the treatment. On the other hand, if your window is too wide, you might include events unrelated to the treatment. These events unrelated to the treatment would presumably occur in equal numbers in both groups, diluting any effect that you might otherwise see.<br />
<br />
A short window is especially problematic if patients discontinue the drug for reasons related to the drug itself (the drug might be difficult to tolerate, for example). This causes a differential dropout rate and can produce some serious biases. Dr. Lagakos notes that the bias could end up going in either direction. There is indeed evidence of a differential drop-out rate, and Dr. Lagakos suggests some alternate analyses that should be considered in the face of this problem.<br />
<br />
Dr. Lagakos then discusses the proportional hazards assumption. This assumption is pivotal in the proper interpretation of the hazard ratio in a Cox proportional hazards model. Two examples of deviations from proportional hazards that are especially troublesome, according to Dr. Lagakos, are two survival curves that are initially more or less identical, but which then diverge sharply at a certain time point, and two survival curves that are initially different, but which converge after a particular time point. The original analysis noted the former pattern, with the two Kaplan-Meier survival curves more or less coincident for the first 18 months and then separating sharply after 18 months.<br />
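The coincide-then-diverge pattern is easy to reproduce on simulated data. The sketch below (pure NumPy, with invented hazards; this is not the APPROVe analysis code) estimates Kaplan-Meier curves for two arms that share one hazard for the first 18 months, after which the "treated" arm's hazard triples:

```python
# Minimal Kaplan-Meier sketch on simulated piecewise-exponential data.
import numpy as np

rng = np.random.default_rng(0)

def km_curve(times, events):
    """Kaplan-Meier estimate: multiply S(t) by (1 - d/n) at each event time."""
    times = np.asarray(times, float)
    events = np.asarray(events, bool)
    s = 1.0
    curve = [(0.0, 1.0)]
    for t in np.unique(times[events]):
        n_at_risk = np.sum(times >= t)
        d = np.sum((times == t) & events)
        s *= 1.0 - d / n_at_risk
        curve.append((float(t), s))
    return curve

def simulate_arm(n, hazard_before, hazard_after, change=18.0, follow_up=36.0):
    """Piecewise-exponential event times (months), censored at end of follow-up."""
    t_early = rng.exponential(1.0 / hazard_before, n)
    t_late = change + rng.exponential(1.0 / hazard_after, n)
    t = np.where(t_early < change, t_early, t_late)  # memoryless switch at 18 months
    return np.minimum(t, follow_up), t < follow_up   # (time, event indicator)

placebo = km_curve(*simulate_arm(1000, 0.005, 0.005))
treated = km_curve(*simulate_arm(1000, 0.005, 0.015))
```

Plotting the two curves shows them nearly coincident before month 18 and separating afterwards; as Dr. Lagakos warns, inspecting such a plot by eye cannot by itself establish where, or whether, proportional hazards fails.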
<br />
When you suspect a violation of proportional hazards, one approach is to model the data using time varying covariates. In particular, you can model an interaction between time and treatment or an interaction between log time and treatment.<br />
<br />
This is where things turned seriously wrong.<br />
<br />
<blockquote>The APPROVe investigators planned to use an interaction test with the logarithm of time as the primary basis for testing the proportional-hazards assumption. This test resulted in a P value of 0.07, which did not quite meet the criterion of 0.05 specified for rejecting the assumption. However, the original report of the APPROVe trial mistakenly gave the P value as 0.01, which was actually the result of an interaction test involving untransformed time. (This error is corrected in this issue of the Journal.)</blockquote><br />
<br />
Dr. Lagakos notes that even if the test for interaction was not in error, there would still be problems. Presence of an interaction could imply several possible deviations from the proportional hazards assumption and not necessarily a deviation that represents similar risk for the first 18 months and dissimilar risk thereafter. He also points out that a graphical inspection of the Kaplan-Meier curves for violations of proportional hazards is potentially misleading.<br />
<br />
Finally, Dr. Lagakos reminds us that identical survival curves during the first 12-18 months do not, in and of themselves, imply that a short-term course of rofecoxib is without risk. Many exposures, such as radiation, have a latency period, and a divergence of risk at a later time point could occur even with a brief exposure that shows no change in risk during the short term.<br />
<br />
===Questions===<br />
<br />
1. Why does the drug company (Merck) have a financial incentive to demonstrate that exposure to rofecoxib has no increase in risk during the short term, but only long term?<br />
<br />
2. This is not the only study on rofecoxib that required a clarification or retraction (see the above article, Independence of a DSMB is questioned) nor the only study of Cox-2 inhibitors that has been criticized. Are these retractions evidence that the problems with incorrect data analyses are self correcting, or is it evidence that the peer-review process is broken?<br />
<br />
Submitted by Steve Simon<br />
<br />
===Figures===<br />
<br />
The following two figures were added by Laurie Snell. The first figure is from the authors' original paper and the second from their recent correspondence in the NEJM. In the original article the authors stated that the risk for thrombotic events was not apparent until after 18 months. After correcting the errors in this paper and adding additional data, they conclude that the risk is now apparent after 3 years. <br />
<br />
<center>[[Image:vioxx1.jpg]]</center><br />
<br />
Figure 2: Kaplan–Meier Estimates of the Cumulative Incidence of Confirmed Serious Thrombotic Events.<br />
<br />
[[Image:vioxx2.jpg|center|300px|]]</div>

https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2796 Chance News 17, 2006-06-09T00:01:57Z <p>Mmartin: /* Facial Attraction */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
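A quick check, sketched in Python with exact arithmetic, shows why the original puzzle is a forsooth (any starting number of flowers satisfies the condition) while Winkler's variant pins down the number of vases. The value n = 100 below is an arbitrary choice, not part of either puzzle:

```python
# Verify both puzzles with exact fractions.
from fractions import Fraction

# Original puzzle: three vases with n flowers each; add 3 to one.
# New average = (3n + 3)/3 = n + 1, while the full vase holds n + 3,
# which is ALWAYS two more than the new average: every n works,
# so the puzzle is indeterminate and "six" is no better than any answer.
for n in range(1, 101):
    assert n + 3 == Fraction(3 * n + 3, 3) + 2

# Winkler's variant: v vases with n flowers each. The condition
# n + 3 == (v*n + 3)/v + 2 reduces to 3/v == 1, i.e. v == 3.
n = 100
solutions = [v for v in range(1, 101)
             if n + 3 == Fraction(v * n + 3, v) + 2]
```

Only v = 3 survives, whatever n is chosen, so Winkler's rephrasing has a unique answer.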
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected, and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March 2000] claimed the difference in this ratio between homosexuals and heterosexuals was statistically significant. Thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison with a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret but in Biblical days it was known as the Sea of Galilee upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the lake, but only from 1986 to 2003, and "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
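The Weibull "return period" reasoning in the Exodus wind paper can be sketched as follows. The shape, scale and threshold below are assumptions chosen purely for illustration; the paper's actual fitted parameters are not reproduced here.

```python
# Rough sketch of a Weibull return-period calculation for extreme winds.
# All parameters are illustrative assumptions.
from scipy.stats import weibull_min

shape = 2.0       # assumed Weibull shape for daily maximum wind speed
scale = 7.0       # assumed scale parameter (m/s)
threshold = 26.0  # assumed wind speed needed for a suitable storm (m/s)

# probability that any given day's wind exceeds the threshold
p_exceed_per_day = weibull_min.sf(threshold, shape, scale=scale)

# expected years between such storms
return_period_years = 1.0 / (p_exceed_per_day * 365.25)
```

With these invented parameters the storm recurs on the order of once every few thousand years, the same order of magnitude as the "roughly once every 2000 years" the authors report.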
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year-old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist mapped both onto a common subset of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
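The Economist's decision rule is simple enough to state in code: count the residents of each combined category within an output area, and label the area with whichever single group is largest. A sketch (the counts match the Economist's own worked example of 80, 60, 40, and 20 residents):

```python
# The Economist's rule, as described above: an output area takes the
# character of whichever combined category has the most residents.
def classify(counts):
    """counts: dict mapping category name -> number of residents."""
    return max(counts, key=counts.get)

# The worked example: 80 'wealthy' residents and 60, 40, 20 in the others.
area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
print(classify(area))  # wealthy
```

Note that the rule ignores both ties and the margin of victory: an area with 80 wealthy and 79 poor residents gets the same label as one with 80 and 20.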
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? For example, should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants," which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded, and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. The next week the woman comes back and, instead of paying him, asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have bars of different widths, because to do so exaggerates differences which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming research.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they might or might not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers, but only the third knew they were receiving them) shows that the difference between the outcomes is statistically significant (p = .025). <br />
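The quoted p-value can be reproduced with a hand-rolled Chi-squared test on the 2×2 table of complications for the two groups who received prayers, using the counts in the paragraph above. For one degree of freedom, the p-value is erfc(√(χ²/2)), so only the standard library is needed:

```python
import math

def chi2_2x2(a, b, c, d):
    """Chi-squared statistic (no continuity correction) for the 2x2 table
    [[a, b], [c, d]], plus its p-value for 1 degree of freedom."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-squared(1 df) survival function
    return chi2, p

# Complications / no complications: 315 of 604 uninformed prayer recipients,
# 352 of 601 patients who knew they were being prayed for.
chi2, p = chi2_2x2(315, 604 - 315, 352, 601 - 352)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 5.02, p = 0.025
```

The result agrees with the significance level reported in the article.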
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
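Dubner and Levitt's mechanism is easy to simulate: if selectors at age ten reward what they see, and what they see is mostly physical maturity (a function of age within the bracket) plus some genuine variation, then early-birthday children dominate the selected pool. A toy simulation (the maturity model and noise level are invented purely for illustration):

```python
import random

random.seed(0)  # fixed seed so the illustration is reproducible

# One cohort bracketed by calendar year: a child born in month m (1 = January)
# is (12 - m)/12 of a year older than a December-born teammate.
N = 20000
cohort = []
for _ in range(N):
    month = random.randint(1, 12)      # birth month, uniform over the year
    age_edge = (12 - month) / 12       # extra years of development vs. December-born
    talent = random.gauss(0, 0.25)     # genuine ability, arbitrary units (assumed)
    cohort.append((month, age_edge + talent))  # what a selector "sees"

# "Select" the top 10% on apparent ability and look at their birth months.
cohort.sort(key=lambda kid: kid[1], reverse=True)
selected = cohort[: N // 10]
q1_share = sum(1 for month, _ in selected if month <= 3) / len(selected)
print(f"share of selected players born Jan-Mar: {q1_share:.0%}")
```

Even with substantial genuine variation in talent, the January-to-March share among the "selected" comes out far above the 25% a uniform birth distribution would produce, in the spirit of the English data quoted above.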
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
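Jeremy Miles's point is pure arithmetic: the two statistics measure different things (the share of ''companies'' with any woman on the board versus the share of ''directors'' who are women), and either can be high while the other is low. His two hypothetical scenarios, worked through (the board sizes are his illustrative assumptions, not real data):

```python
# Scenario 1: every FTSE 100 board has 11 members, exactly one of them a woman.
ftse_companies_with_women = 100 / 100          # 100% of companies
ftse_women_directors = (100 * 1) / (100 * 11)  # ~9% of directors

# Scenario 2: 84% of the Fortune 500 have one woman on a 20-member board.
f500_companies_with_women = 420 / 500          # 84% of companies
f500_women_directors = (420 * 1) / (500 * 20)  # 4.2% of directors

print(f"FTSE: {ftse_companies_with_women:.0%} of companies, "
      f"{ftse_women_directors:.1%} of directors are women")
print(f"F500: {f500_companies_with_women:.0%} of companies, "
      f"{f500_women_directors:.1%} of directors are women")
```

Under these assumptions the FTSE, with the lower headline number, actually has the ''higher'' proportion of women directors, which is exactly why the Observer's comparison is meaningless.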
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/?p=25#comments Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2653Chance News 172006-06-08T23:58:13Z<p>Mmartin: /* Measuring poverty in London over 100 years */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[Note: the answer is given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
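Winkler's variant really does pin down a unique answer. With v vases of n flowers each, adding three to one vase makes the total vn + 3, so the new average is n + 3/v; the special vase holds n + 3, which exceeds the average by 2 exactly when 3/v = 1, i.e. v = 3, whatever n is. A brute-force check:

```python
# For each candidate number of vases v and a range of starting counts n,
# test whether the topped-up vase ends exactly 2 above the new average.
def works(v, n):
    total = v * n + 3
    average = total / v
    return (n + 3) - average == 2

solutions = {v for v in range(1, 50) if all(works(v, n) for n in range(0, 20))}
print(solutions)  # {3}
```

By contrast, the original Mensa question fixes v = 3 up front, which is why ''every'' n works and "six" is no more an answer than any other number.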
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be when sex could somehow get connected to the work; the scientist/mathematician/statistician would then suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the ratio of the length of the forefinger to that of the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March, 2000] claimed statistical significance for the difference in this ratio between homosexuals and heterosexuals--thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison with a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended--presentation found in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event), they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then, and ice could have formed every once in a while--about every 160 years. Strangely enough, much of their data for this claim comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the lake, but only from 1986 to 2003, and even then "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
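The headline figure--springs ice forming "about every 160 years"--translates into probabilities in the usual way for rare annual events: with annual probability p = 1/160, the chance of at least one freeze in a k-year window is 1 − (1 − p)^k. A quick illustration (the window lengths below are arbitrary choices for display, not values from the paper):

```python
p = 1 / 160  # annual freezing probability implied by "about every 160 years"

def at_least_once(p, years):
    """Probability of one or more freezing events within a span of `years`."""
    return 1 - (1 - p) ** years

for years in (1, 30, 160, 500):
    print(f"{years:>4} years: {at_least_once(p, years):.1%}")
```

Note that even over a full 160-year cycle the chance of at least one freeze is only about 63%, not 100% -- "once every 160 years" is a mean recurrence time, not a schedule.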
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
<br />
---DISCUSSION---<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year-old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated the number of people,<br />
within the smallest unit available from the 2001 census,<br />
who fall into each of the four new categories,<br />
the Economist takes the single largest group to represent the character of the area. <br />
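The Economist's plurality rule — label an area by its single most common category — can be sketched as follows. The function name and the area counts are illustrative, not taken from the census data:

```python
from collections import Counter

def classify_area(counts):
    """Label an area by its single largest category (plurality rule)."""
    # Counter.most_common(1) returns the (category, count) pair with
    # the highest count; ties are broken arbitrarily.
    return Counter(counts).most_common(1)[0][0]

# A hypothetical output area: 80 residents in the 'wealthy' category
# and 60, 40 and 20 in the other three combined categories.
area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
print(classify_area(area))  # -> wealthy
```

Note that the area is labelled "wealthy" even though that group is only 80 of 200 residents (40%) — exactly the concern raised in the questions below.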
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class ('the wealthy') and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants," which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences, if any, can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns readers about statistical misrepresentations. For example: the bars in a bar graph comparison should never have different widths, because doing so exaggerates differences that should depend only on heights; a graph where the origin is missing inflates differences; histograms should have bars of equal width; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he regards the misrepresentation as a fraudulent attack on the science of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers, but only the third knew they were receiving them) indicates that the difference between the outcomes is statistically significant (p = .025). <br />
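The reported p-value is easy to check with a standard two-proportion test, which for a 2×2 table is equivalent to the Chi-squared test without continuity correction. A minimal sketch using only the standard library (the function name is ours, not from the paper):

```python
import math

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value for comparing two proportions (pooled z-test)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    # Two-sided tail probability of the standard normal: erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

# 315 of 604 patients (prayed for, uncertain) vs. 352 of 601
# (prayed for, certain) developed at least one complication.
p = two_proportion_p_value(315, 604, 352, 601)
print(round(p, 3))  # -> 0.025, matching the paper's reported value
```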
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual follow-up histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
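Dubner and Levitt's mechanism can be illustrated with a toy simulation. The specific numbers below — a physical-maturity bonus worth up to one standard deviation of underlying talent, and selection of the top 10% on apparent ability — are assumptions for illustration, not estimates from the article:

```python
import random

random.seed(42)

N = 100_000
players = []
for _ in range(N):
    month = random.randint(1, 12)   # birth month, uniform over the year
    talent = random.gauss(0, 1)     # underlying soccer talent
    maturity = (12 - month) / 12    # January kids are nearly a year older
    players.append((talent + maturity, month))

# Scouts select on *apparent* ability (talent + maturity), top 10%.
players.sort(reverse=True)
selected = players[: N // 10]

early = sum(1 for _, m in selected if m <= 3)
print(f"born Jan-Mar among selected: {early / len(selected):.0%}")
```

Even though talent is spread evenly over the calendar, the selected group is heavily skewed toward January–March births, well above the 25% expected if birth month didn't matter.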
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
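Jeremy Miles's point — that "share of companies with at least one woman on the board" and "share of directors who are women" are different statistics — can be verified with his own hypothetical numbers:

```python
# UK scenario: 100 FTSE companies, 11 directors each, 1 woman per board.
uk_companies_with_women = 100 / 100            # every company qualifies
uk_share_of_directors = 100 * 1 / (100 * 11)   # 1 woman among 11 directors
print(f"UK: {uk_companies_with_women:.0%} of boards have a woman, "
      f"but only {uk_share_of_directors:.1%} of directors are women")

# US scenario: 84% of 500 companies have 1 woman on a 20-member board.
us_companies_with_women = 0.84
us_share_of_directors = (0.84 * 500 * 1) / (500 * 20)
print(f"US: {us_companies_with_women:.0%} of boards have a woman, "
      f"but only {us_share_of_directors:.1%} of directors are women")
```

So the 84%-vs-9% comparison in the Observer story contrasts two incompatible measures, which is exactly why the quotation is a forsooth.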
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/?p=25#comments Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2652Chance News 172006-06-08T23:49:54Z<p>Mmartin: /* Forsooths */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
<br />
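Both versions of the puzzle can be checked mechanically. A small sketch using exact rational arithmetic (the function name is ours):

```python
from fractions import Fraction

def excess_over_average(n, vases=3):
    """How far the topped-up vase sits above the new average,
    starting from n flowers in each of `vases` vases."""
    total = vases * n + 3              # add 3 flowers to one vase
    average = Fraction(total, vases)
    return Fraction(n + 3) - average   # special vase minus average

# Original puzzle: with 3 vases the excess is 2 for *every* n,
# so "six" is no more determined than any other answer.
assert all(excess_over_average(n) == 2 for n in range(1, 50))

# Winkler's version: the number of vases *is* determined.
solutions = [v for v in range(1, 50)
             if all(excess_over_average(n, v) == 2 for n in range(1, 10))]
print(solutions)  # -> [3]
```

Algebraically, the excess is (n + 3) − (n + 3/v) = 3 − 3/v, which is independent of n and equals 2 exactly when v = 3.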
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be when sex could somehow be connected; then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March, 2000] claimed a statistically significant difference in the ratio for homosexuals as compared to heterosexuals. Thus, an easy noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake, but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
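To put the "about every 160 years" figure in perspective: if springs ice forms with probability 1/160 in any given year, independently from year to year (a simplifying assumption — the paper's GARCH/AR model is far more involved), the chance of seeing at least one such event over a long stretch is easy to compute:

```python
p_per_year = 1 / 160   # roughly one springs-ice event per 160 years

def prob_at_least_one(years, p=p_per_year):
    """P(at least one event in `years` years), assuming independence."""
    return 1 - (1 - p) ** years

for span in (30, 100, 500):
    print(f"{span:>4} years: {prob_at_least_one(span):.1%}")
```

Even a rare once-in-160-years event becomes close to a sure thing over several centuries, which is all the authors' argument requires.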
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." As Nof put it: "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that a sufficiently strong storm would occur "roughly once every 2000 years." <br />
<br />
---DISCUSSION---<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist collapsed both into a common set of four categories.<br />
Having calculated the number of people within the smallest unit available from the 2001 census who fall into each of the four new categories, the Economist takes the single largest group to represent the character of the area. <br />
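This plurality rule amounts to an argmax over the four category counts; a minimal sketch, using the worked example the Economist gives (80 wealthy against 60, 40 and 20 in the other categories):<br />

```python
def classify_area(counts):
    """Label an output area by its single largest category, the
    Economist's plurality rule; ties are broken arbitrarily here."""
    return max(counts, key=counts.get)

# The Economist's example output area: 80 members of the wealthy
# class and 60, 40, 20 members of the other three categories.
area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
label = classify_area(area)  # classified as "wealthy"
```

Note that the rule discards the margin: an 80/79 split and an 80/0 split are classified identically.<br />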
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants" which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants, how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilic tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
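The three-standard-deviation cutoff used to drop the outlier follows directly from the reported mean and standard deviation:<br />

```python
mean_t = 88.38   # pg/ml, reported group mean
sd_t = 27.97     # pg/ml, reported standard deviation

# Any observation more than 3 SDs above the mean was treated as an
# outlier, so the cutoff was:
cutoff = mean_t + 3 * sd_t  # about 172.3 pg/ml

def z_score(x, mean, sd):
    """Standardised distance of x from the mean, in SD units."""
    return (x - mean) / sd
```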
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc." She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: the bars in a bar graph comparison should never have different widths, because doing so exaggerates a difference that should depend only on heights; a graph with a missing origin inflates differences; histograms should exhibit equal class widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels the episode is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
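The reported p-value can be reproduced from the counts above (315 of 604 with complications in the first group versus 352 of 601 in the third) with a 2x2 Chi-squared test; the sketch below uses no continuity correction, which appears to match the paper's p = .025:<br />

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and p-value (df = 1, no
    continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = 0.0
    for obs, row, col in [(a, row1, col1), (b, row1, col2),
                          (c, row2, col1), (d, row2, col2)]:
        expected = row * col / n
        stat += (obs - expected) ** 2 / expected
    p = math.erfc(math.sqrt(stat / 2))  # chi-squared survival fn, df = 1
    return stat, p

# Group 1 (uncertain, prayed for) vs. group 3 (certain, prayed for):
# rows are groups, columns are complication / no complication.
stat, p = chi2_2x2(315, 604 - 315, 352, 601 - 352)
```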
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
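Dubner and Levitt's selection story is easy to simulate. The sketch below is entirely hypothetical (equal innate ability for every birth month, a modest physical-maturity bonus for being older within the calendar-year bracket, top 10% selected), yet it is enough to produce a large January-March excess:<br />

```python
import random

def birth_month_share(n=10_000, maturity_weight=1.0, top_frac=0.10, seed=1):
    """Fraction of a selected 'elite' born in Jan-Mar when selection
    confuses within-bracket physical maturity with talent."""
    rng = random.Random(seed)
    kids = []
    for _ in range(n):
        month = rng.randint(1, 12)          # birth month, uniform
        ability = rng.gauss(0, 1)           # identical across months
        maturity = (12 - month) / 12        # older within the bracket
        kids.append((ability + maturity_weight * maturity, month))
    kids.sort(reverse=True)                 # rank by observed "talent"
    elite = kids[: int(n * top_frac)]
    return sum(1 for _, m in elite if m <= 3) / len(elite)

share = birth_month_share()  # well above the 25% a uniform spread implies
```

Setting `maturity_weight=0` removes the effect and the January-March share falls back to about a quarter.<br />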
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
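Jeremy Miles's point is pure arithmetic: the two quoted statistics measure different things, and either country can look "better" depending on board sizes. Under his hypothetical board sizes:<br />

```python
def director_share(frac_companies_with_one_woman, board_size):
    """Share of all directors who are women, assuming each 'yes'
    company has exactly one woman on a board of the given size."""
    return frac_companies_with_one_woman / board_size

# FTSE 100: one woman on every 11-member board means 100% of
# companies have a woman on the board, but only ~9% of directors.
ftse = director_share(1.0, 11)

# Fortune 500: 84% of companies with one woman on a 20-member board
# means only ~4.2% of directors are women.
f500 = director_share(0.84, 20)
```

On the companies-with-a-woman measure the US wins 84% to 100%'s favour of the UK; on the share-of-directors measure the ordering reverses, which is exactly why the comparison is meaningless.<br />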
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/?p=25#comments Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun schock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2649Chance News 172006-06-07T22:39:46Z<p>Mmartin: /* Facial Attraction */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
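Winkler's variant, unlike the original, has a unique answer: with v vases of n flowers each, the enlarged vase holds n+3 and the new average is n + 3/v, so the stated gap of 2 forces 3 - 3/v = 2, i.e. v = 3. A quick brute-force check:<br />

```python
# Find every vase count v for which the enlarged vase exceeds the new
# average by exactly 2 flowers: (n+3) - (n + 3/v) = 3 - 3/v = 2.
# Clearing fractions gives 3v - 3 = 2v, independent of n.
solutions = [v for v in range(1, 100) if 3 * v - 3 == 2 * v]
```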
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March, 2000] claimed statistical significance for the difference in the ratio between homosexuals and heterosexuals. Thus, an easy noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
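The data-dredging point generalises: if m independent comparisons are each tested at the 5% level when nothing real is going on, the chance of at least one "significant" finding is 1 - 0.95^m, which grows quickly with m. (The figure of 20 comparisons below is illustrative, not the number made in the finger-ratio study.)<br />

```python
def prob_false_positive(m, alpha=0.05):
    """Chance of at least one significant result among m independent
    null comparisons, each tested at level alpha."""
    return 1 - (1 - alpha) ** m

# With 20 independent comparisons, the odds are nearly two in three
# that at least one comes out "significant" purely by chance.
p20 = prob_false_positive(20)
```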
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's The Da Vinci Code and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely Journal of Paleolimnology [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret but in Biblical days it was known as the Sea of Galilee upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake but only from 1986 to 2003 yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
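For readers unfamiliar with the model, a GARCH(1,1) component simply lets the noise variance feed back on itself, producing the bursts of volatility seen in the residuals. The sketch below simulates the variance recursion in isolation, with arbitrary illustrative parameters rather than the authors' fitted AR(3)+GARCH(1,1) model:<br />

```python
import random

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate e_t = sigma_t * z_t with the GARCH(1,1) recursion
    sigma_t^2 = omega + alpha * e_{t-1}^2 + beta * sigma_{t-1}^2."""
    rng = random.Random(seed)
    var = omega / (1 - alpha - beta)   # start at the long-run variance
    eps = []
    for _ in range(n):
        e = var ** 0.5 * rng.gauss(0, 1)
        eps.append(e)
        var = omega + alpha * e ** 2 + beta * var
    return eps

series = simulate_garch11(500)  # exhibits volatility clustering
```

Large shocks inflate next period's variance, so quiet and wild stretches alternate, which is the "wild transitory behavior" the quoted passage describes.<br />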
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
<br />
---DISCUSSION---<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist reduced both to a common set of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class ('the wealthy') and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants," which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted like children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns readers about statistical misrepresentations. For example: bars in a bar graph comparison should never have different widths, because doing so exaggerates differences that should depend only on height; a graph with a missing origin inflates differences; histogram bars should have equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he regards it as a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
''American Heart Journal'', Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they might or might not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the counts for the first and third groups (both of whom received prayers, but only the third knew it) indeed implies that the difference between the outcomes is statistically significant (p = .025). <br />
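Readers can reproduce the quoted p-value from the counts alone with a two-proportion z-test, which is equivalent to the 2x2 Chi-squared test without continuity correction; a sketch using only the Python standard library:

```python
from math import sqrt, erfc

def two_prop_test(x1, n1, x2, n2):
    """Two-sided two-proportion z-test (equivalent to the 2x2 chi-squared
    test on the same counts, without continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return z, erfc(abs(z) / sqrt(2))  # two-sided p-value from the normal tail

# Group 1 (uncertain whether prayed for, was prayed for): 315 of 604 with complications
# Group 3 (certain of receiving prayers):                 352 of 601 with complications
z, p = two_prop_test(315, 604, 352, 601)  # p comes out near the reported .025
```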
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task").<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
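The selection mechanism Dubner and Levitt describe is easy to demonstrate with a toy simulation. Everything below (bracket size, the size of the developmental edge, the noise level, the selection fraction) is invented for illustration, not taken from the article:

```python
import random

def birth_months_of_selected(n_kids=10000, top_frac=0.10, seed=1):
    """Simulate one age bracket cut off at the calendar year: kids are born
    uniformly through the year, and 'observed ability' is developmental age
    (a January child is ~11 months older than a December child) plus noise
    standing in for genuine talent. Return birth months of the top fraction."""
    rng = random.Random(seed)
    kids = []
    for _ in range(n_kids):
        month = rng.randint(1, 12)
        ability = (12 - month) + rng.gauss(0, 4)  # edge in "months" + talent noise
        kids.append((ability, month))
    kids.sort(reverse=True)                        # coaches pick the apparent best
    return [m for _, m in kids[: int(n_kids * top_frac)]]

selected = birth_months_of_selected()
early_share = sum(1 for m in selected if m <= 3) / len(selected)  # born Jan-Mar
```

Even though every child's "talent" is drawn from the same distribution, far more than a quarter of the selected group is born in January through March, echoing the pattern in the English data.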
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
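The letter's two hypothetical scenarios are quick to check. A sketch of the arithmetic, where the board sizes of 11 and 20 are the letter writer's illustrative assumptions, not real data:

```python
# UK scenario: 100 FTSE companies, 11-member boards, exactly 1 woman on each.
ftse_share_with_women = 100 / 100                     # 100% of boards include a woman...
ftse_pct_women = 100 * (100 * 1) / (100 * 11)         # ...yet only ~9% of directors are women

# US scenario: 84% of 500 companies have one woman on a 20-member board.
f500_pct_women = 100 * (0.84 * 500 * 1) / (500 * 20)  # ~4% of directors are women
```

So a country where a higher share of companies have a woman on the board can still have a lower share of female directors, which is exactly the letter's point about the comparison being meaningless.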
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2642Chance News 172006-06-07T22:31:48Z<p>Mmartin: /* The Birth-Month Soccer Anomaly */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped, you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
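A quick numerical check, sketched in Python, confirms both points: in the Mensa puzzle the special vase ends up 2 above the new average for ''every'' starting count, so "six" is no more correct than any other number, while in Winkler's variant the number of vases really is determined:

```python
def excess_over_average(n, vases=3):
    """Add 3 flowers to one of `vases` vases that each held n flowers;
    return how far the special vase ends up above the new average."""
    total = vases * n + 3
    return (n + 3) - total / vases

# Original puzzle: the excess is 2 no matter what n was -- n is indeterminate.
assert all(excess_over_average(n) == 2 for n in range(200))

# Winkler's variant: which vase counts give an excess of exactly 2?
vase_solutions = [v for v in range(1, 200) if excess_over_average(5, vases=v) == 2]
```

Algebraically the excess is 3 - 3/v for v vases, which is independent of n and equals 2 only when v = 3.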
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be when sex could somehow be connected; then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [''Nature'', 30 March, 2000] claimed there was a statistically significant difference in the ratio for homosexuals as compared to heterosexuals--thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison with a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrity. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km<sup>2</sup> and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the lake, but only from 1986 to 2003, and even then "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
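For readers unfamiliar with this model class, an AR(3) mean equation with GARCH(1,1) innovations can be sketched as a simulation. All parameter values below are invented for illustration and are not the paper's fitted values:

```python
import random
from math import sqrt

def simulate_ar3_garch11(n, phi=(0.5, -0.2, 0.1),
                         omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate y_t = phi1*y_{t-1} + phi2*y_{t-2} + phi3*y_{t-3} + e_t,
    where e_t ~ N(0, s2_t) and s2_t = omega + alpha*e_{t-1}^2 + beta*s2_{t-1},
    so quiet and turbulent stretches alternate (volatility clustering)."""
    rng = random.Random(seed)
    s2 = omega / (1 - alpha - beta)   # start at the unconditional variance
    y, eps = [0.0, 0.0, 0.0], 0.0
    for _ in range(n):
        s2 = omega + alpha * eps ** 2 + beta * s2
        eps = rng.gauss(0.0, sqrt(s2))
        y.append(phi[0] * y[-1] + phi[1] * y[-2] + phi[2] * y[-3] + eps)
    return y[3:]

series = simulate_ar3_garch11(1000)
```

The GARCH component lets the error variance flare up and die down over time, which is exactly the "wild transitory behavior" the authors say they saw in their residuals.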
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
<br />
---DISCUSSION---<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
use complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
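The "single largest group" rule is simple enough to sketch in a few lines of Python. This is purely an illustration: the counts used are the Economist's own example figures of 80, 60, 40 and 20 residents, not actual census output.<br />

```python
# One hypothetical 2001 census output area, with residents tallied into
# the Economist's four combined categories.
counts = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}

# The area is assigned the character of its single largest group,
# however narrow the margin.
character = max(counts, key=counts.get)
print(character)  # wealthy
```

Note that the rule discards the margin entirely: an area with 80 wealthy residents and 79 poor ones would be labelled just as confidently as this one.<br />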
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants," which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths because to do so would exaggerate the difference which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025).<br />
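The quoted p-value can be checked directly from the complication counts above. The following minimal Python sketch (an illustration, not the authors' own analysis) computes the uncorrected Pearson chi-squared statistic for the 2x2 table comparing groups 1 and 3:<br />

```python
import math

# Counts from the article: group 1 (prayed for, but uncertain of it) had
# 315 complications among 604 patients; group 3 (prayed for, and knew it)
# had 352 among 601.
a, b = 315, 604 - 315   # group 1: complications, no complications
c, d = 352, 601 - 352   # group 3: complications, no complications
n = a + b + c + d

# Pearson chi-squared statistic for a 2x2 table, without the Yates
# continuity correction (which is what reproduces the quoted p = .025).
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# With 1 degree of freedom, the survival function is erfc(sqrt(x/2)).
p = math.erfc(math.sqrt(chi2 / 2))
print(round(chi2, 2), round(p, 3))  # 5.02 0.025
```

Running the same test with the Yates continuity correction gives p ≈ .029, so at this significance level even the choice of correction matters.<br />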
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task").<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
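Miles's arithmetic is easy to verify. Here is a minimal Python sketch using his hypothetical board sizes (11 for the FTSE 100, 20 for the Fortune 500), which are illustrative assumptions in his letter, not real data:<br />

```python
# Scenario 1: every one of the 100 FTSE boards has 11 members,
# exactly one of whom is a woman.
ftse_board_size = 11
pct_ftse_boards_with_women = 100.0  # every board has a female member
pct_ftse_women_directors = 100 * 1 / ftse_board_size

# Scenario 2: 84% of the 500 Fortune 500 boards of 20 members
# contain exactly one woman.
f500_boards, f500_board_size = 500, 20
women_directors = 0.84 * f500_boards * 1
pct_f500_women_directors = 100 * women_directors / (f500_boards * f500_board_size)

print(round(pct_ftse_women_directors, 1))  # 9.1
print(round(pct_f500_women_directors, 1))  # 4.2
```

So a country can lead on one headline number (share of boards with any woman) while trailing on the other (share of directors who are women): the two percentages simply measure different things.<br />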
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2641Chance News 172006-06-07T22:27:07Z<p>Mmartin: /* The Birth-Month Soccer Anomoly */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
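Both versions of the puzzle can be checked numerically. The following short Python sketch, using exact fractions, is my own illustration, not anything published by Mensa or Winkler:<br />

```python
from fractions import Fraction

def excess_over_average(n, v):
    # n flowers in each of v vases; add 3 flowers to one vase and return
    # how far that vase now sits above the new average.
    new_average = Fraction(v * n + 3, v)
    return (n + 3) - new_average

# Original Mensa question (v = 3 vases): the excess is 2 for EVERY n,
# so "six" is no better an answer than any other number.
assert all(excess_over_average(n, 3) == 2 for n in range(200))

# Winkler's variant: n is irrelevant, so instead solve for the number
# of vases v satisfying (n+3) - (n + 3/v) = 2, i.e. 3/v = 1.
print([v for v in range(1, 100) if excess_over_average(10, v) == 2])  # [3]
```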
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March, 2000] claimed the difference in the ratio between homosexuals and heterosexuals was statistically significant. Thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
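The data-dredging point is easy to demonstrate: test enough true-null hypotheses and roughly one in twenty will look "significant" at the 5% level. The following simulation illustrates the general phenomenon only; it does not model the finger-ratio study itself:<br />

```python
import random

random.seed(1)

# Under a true null hypothesis, p-values are uniformly distributed on
# [0, 1], so each "test" is just a uniform draw compared against 0.05.
trials = 10_000
false_positives = sum(random.random() < 0.05 for _ in range(trials))

print(false_positives / trials)  # close to 0.05
```

So a researcher who quietly runs twenty comparisons should expect about one spuriously "significant" result even when nothing real is going on.<br />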
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake, but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
<br />
---DISCUSSION---<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets and some of the poverty has disappered.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
<br />
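The modal-category rule is simple enough to sketch in a few lines (the category labels follow the article; the counts reuse the Economist's own 80/60/40/20 example):<br />

```python
def classify_area(counts):
    """Label an output area by its single largest resident category,
    as the Economist's method does."""
    return max(counts, key=counts.get)

# The Economist's illustrative output area.
area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
print(classify_area(area))  # -> wealthy
```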
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? For example, should the number of people in each street be taken into account?<br />
* How might missing data be handled: old streets that have disappeared, or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants" which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted like children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
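For concreteness, the three-standard-deviation cutoff the authors used works out as follows (a one-line check using the reported mean and SD):<br />

```python
mean, sd = 88.38, 27.97   # reported testosterone mean and SD, pg/ml
cutoff = mean + 3 * sd    # readings above this were treated as outliers
print(round(cutoff, 2))   # -> 172.29
```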
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns readers about statistical misrepresentations. For example: bars in a comparison graph should never have different widths, because doing so exaggerates differences that should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he considers the critique it describes a fraudulent misrepresentation of the global warming evidence.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
''American Heart Journal'', Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they might or might not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
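The quoted p-value can be reproduced with a pooled two-proportion z-test, which is equivalent to an uncorrected Chi-squared test on the 2x2 table. A minimal sketch (standard library only; the article does not say exactly which variant the authors ran, so treat this as a plausibility check):<br />

```python
import math

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value comparing x1/n1 with x2/n2 via the pooled z-test
    (equivalent to an uncorrected Chi-squared test on the 2x2 table)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = abs(p1 - p2) / se
    return math.erfc(z / math.sqrt(2))

# Group 1: prayed for, uncertain of it (315 of 604 with complications)
# Group 3: prayed for and told so      (352 of 601 with complications)
p = two_proportion_p(315, 604, 352, 601)
print(round(p, 3))  # -> 0.025
```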
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
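How large is the anomaly the relative-age story has to explain? A quick binomial calculation makes the point (the 50-player squad with 25 first-quarter birthdays is invented for illustration; the article says only that about half of England's top teenage players have January-March birthdays):<br />

```python
import math

def binom_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i)
               for i in range(k, n + 1))

# Under uniform birth dates, about 1/4 of players are born January-March.
# Hypothetical: 25 of 50 top players with first-quarter birthdays.
tail = binom_tail(50, 25, 0.25)
```

The tail probability is well under one in a thousand, so a uniform-birthday model is untenable and something like the age-bracket selection effect is needed.<br />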
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths.==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
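The letter's arithmetic is easy to verify (board sizes of 11 and 20 are the letter writer's own hypothetical figures, not real data):<br />

```python
# FTSE 100 hypothetical: 100 companies, 11 directors each, 1 woman per board.
ftse_boards_with_women = 100 / 100               # every board has a woman
ftse_share_women = (100 * 1) / (100 * 11)        # women among all directors

# Fortune 500 hypothetical: 84% of 500 boards have one woman, 20 seats each.
f500_boards_with_women = 420 / 500               # 84 per cent of boards
f500_share_women = (420 * 1) / (500 * 20)        # women among all seats

print(round(ftse_share_women * 100, 1))  # -> 9.1
print(round(f500_share_women * 100, 1))  # -> 4.2
```

So 100 per cent of boards "having women" is compatible with only 9 per cent of directors being women, which is exactly the letter's point about mixing two different denominators.<br />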
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>

Mmartin<br />
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2640<br />
Chance News 17, 2006-06-07T22:25:47Z<br />
<p>Mmartin: /* The Kindness of Strangers? */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards; in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
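The original puzzle is a Forsooth because its answer is not determined: whatever the starting count n, the special vase ends up exactly two flowers above the new average, so "six" is no better an answer than any other. A quick numerical check of the algebra:<br />

```python
def excess_over_average(n):
    """Flowers in the special vase minus the new average,
    starting from n flowers in each of the three vases."""
    vases = [n, n, n + 3]             # add three flowers to one vase
    return vases[-1] - sum(vases) / 3

# The excess is 2 for every n, so the puzzle has no unique answer.
assert all(excess_over_average(n) == 2 for n in range(1, 100))
```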
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be when sex could somehow be connected; then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [''Nature'', 30 March 2000] claimed there was a statistically significant difference in the ratio for homosexuals as compared to heterosexuals. Thus, an easy noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it offered only the typically dry--no pun intended--presentation found in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event), they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
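For readers unfamiliar with the jargon: an AR(3) model regresses each observation on its three predecessors, and a GARCH(1,1) component lets the variance of the shocks depend on the recent shocks themselves. A minimal pure-Python sketch with invented coefficients (nothing below is fitted to the Lake Kinneret data, and no 160-year prediction follows from it):<br />

```python
import random

def simulate_ar3_garch11(steps, seed=0):
    """Simulate an AR(3) series whose shocks follow a GARCH(1,1)
    variance process. All coefficients are illustrative only."""
    rng = random.Random(seed)
    phi = (0.5, -0.2, 0.1)               # AR(3) coefficients
    omega, alpha, beta = 0.1, 0.1, 0.8   # GARCH(1,1); alpha + beta < 1
    x = [0.0, 0.0, 0.0]                  # three start-up values
    h, eps = 1.0, 0.0                    # conditional variance, last shock
    for _ in range(steps):
        h = omega + alpha * eps ** 2 + beta * h   # variance update
        eps = rng.gauss(0.0, h ** 0.5)            # heteroskedastic shock
        x.append(phi[0] * x[-1] + phi[1] * x[-2] + phi[2] * x[-3] + eps)
    return x[3:]

series = simulate_ar3_garch11(500)
```

The GARCH term is what produces the "wild transitory behavior" the authors mention: quiet stretches of the simulated series alternate with bursts of volatility.<br />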
<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets and some of the poverty has disappered.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
use complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So to combine the two datasets a subset of four categories was used by the Economist.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
<br />
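The classification step is a simple plurality rule; a minimal sketch in Python, using the 80/60/40/20 example counts the Economist gives:<br />

```python
# The Economist's example: an output area with 80 'wealthy', 60 'well-off',
# 40 'middling' and 20 'poor' residents is labelled wealthy.
counts = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
area_label = max(counts, key=counts.get)  # the plurality category labels the whole area
print(area_label)  # -> wealthy
```

Note that the label ignores how close the vote is: an area with 80 wealthy and 140 residents spread over the other three categories is still labelled wealthy.<br />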
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, e.g. old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News item], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants" which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences, if any, can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants, how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of having pedophilic tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. The next week the woman comes back and, instead of paying him, asks him to find another candidate. The shadchen is surprised and says, "Wasn't he the right age, height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths because to do so would exaggerate the difference which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels the criticism it describes is a fraudulent misrepresentation of global warming research.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represent nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they might or might not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025).<br />
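The quoted p-value can be checked from the reported counts; a minimal sketch using only the Python standard library (a two-proportion z-test, which is equivalent to a 2x2 chi-squared test without continuity correction):<br />

```python
import math

# Complication counts reported above: 315 of 604 in group 1 (prayed for,
# uncertain) versus 352 of 601 in group 3 (prayed for, certain).
a, n1 = 315, 604
b, n2 = 352, 601

p1, p2 = a / n1, b / n2
pooled = (a + b) / (n1 + n2)
se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z = (p2 - p1) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
print(round(p_value, 3))  # -> 0.025
```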
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
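This selection mechanism can be illustrated with a toy simulation; every number below (the size of the maturity bonus, the 5% selection rate, the sample size) is invented for illustration, not taken from the article:<br />

```python
import random

random.seed(0)

# Toy model: each child has random talent plus a maturity bonus that grows
# with age within a January-to-December bracket. The bonus size (up to about
# one standard deviation of talent) is an assumption, not the article's figure.
kids = []
for _ in range(100_000):
    month = random.randrange(1, 13)      # birth month, 1 = January
    maturity_bonus = (12 - month) / 12   # January-born are ~11 months older
    kids.append((month, random.gauss(0, 1) + maturity_bonus))

# "Select" the top 5% for elite coaching, then look at their birth months.
kids.sort(key=lambda k: k[1], reverse=True)
top = kids[: len(kids) // 20]
first_quarter = sum(1 for m, _ in top if m <= 3) / len(top)
print(round(first_quarter, 2))  # Jan-Mar share, well above the 25% baseline
```

Even a modest maturity advantage, compounded by early selection, pushes the January-to-March share of the "elite" group far above one quarter, much as in the English data quoted above.<br />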
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths.==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
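The letter's arithmetic is easy to verify; a sketch of its two hypothetical scenarios:<br />

```python
# Scenario 1 (hypothetical, from the letter): every FTSE 100 board has
# 11 members, exactly one of them a woman.
companies, board_size, women_per_board = 100, 11, 1
pct_companies_with_woman = 100.0  # all 100 boards have a female member
pct_women_directors = 100 * women_per_board / board_size

# Scenario 2 (also hypothetical): 84% of Fortune 500 boards, each with
# 20 members, have exactly one woman.
pct_f500_women_directors = 100 * 0.84 * 1 / 20

print(round(pct_women_directors, 1))       # -> 9.1
print(round(pct_f500_women_directors, 1))  # -> 4.2
```

So the two quoted percentages measure different things entirely, which is exactly the letter's point.<br />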
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2639Chance News 172006-06-07T22:14:53Z<p>Mmartin: /* Discussion */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
</blockquote><br />
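Winkler's variant does have a unique answer, since the condition no longer cancels out of the algebra; a quick brute-force check:<br />

```python
# v vases with n flowers each; add 3 flowers to one vase.
# New average = n + 3/v, and the special vase holds n + 3.
# "Two more than the new average" means (n + 3) - (n + 3/v) == 2,
# i.e. 3/v == 1, which does not depend on n.
solutions = [v for v in range(1, 100) if 3 / v == 1]
print(solutions)  # -> [3]
```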
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [''Nature'', 30 March 2000] claimed that the difference in the ratio between homosexuals and heterosexuals was statistically significant. Thus, an easy noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it offered only the typically dry -- no pun intended -- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place ... Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [''Bulletin American Meteorological Society'', 73; 305-314]. This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was a result of the rapidly returning wave." Nof likened this event to blowing across the top of a cup of coffee: "The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [''Journal of Applied Meteorology'', 33, No 8; 1017-1025]. Here they "used the Weibull distribution ... applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that a proper storm would occur "roughly once every 2000 years."<br />
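The 1993 paper's actual parameters are not given here, so the following sketch only illustrates the type of return-period calculation involved; the Weibull shape and scale, the threshold wind speed, and the one-observation-per-day assumption are all invented for illustration:<br />

```python
import math

# Illustrative Weibull wind model -- parameters are NOT the paper's values.
k, lam = 2.0, 7.0     # shape and scale (m/s), assumed
threshold = 25.7      # wind speed needed for a crossing event (m/s), assumed
obs_per_year = 365    # one independent daily wind observation, assumed

# Weibull survival function: P(V > v) = exp(-(v/lam)^k)
p_exceed = math.exp(-((threshold / lam) ** k))
return_period_years = 1 / (p_exceed * obs_per_year)
print(round(return_period_years))  # on the order of 2000 years with these inputs
```

The point is only that a heavy exceedance threshold on a fitted wind-speed distribution translates directly into a "once every N years" statement of the kind quoted above.<br />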
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
use complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into each of the four new categories, <br />
the Economist takes the single largest group to represent the character of the area. <br />
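This winner-takes-all rule is easy to state in code. The sketch below is our illustration, not the Economist's actual procedure; the category labels and counts are taken from the example discussed in the Questions:<br />

```python
# Classify an output area by its most common category of resident,
# mimicking the winner-takes-all rule described above. The labels and
# counts here are illustrative only.
def classify_area(counts):
    """Return the category with the largest head count."""
    return max(counts, key=counts.get)

area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
print(classify_area(area))  # wealthy
```

Note that the rule ignores how close the contest is: an 80/79/1/0 split and an 80/0/0/0 split are both classified "wealthy".<br />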
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class ('the wealthy') and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants," which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted like children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences, if any, can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether the "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants, how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilic tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. The next week the woman comes back and, instead of paying him, asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths, because to do so would exaggerate the difference, which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating the Planet" because he regards the attack described below as a fraudulent misrepresentation of the science of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
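The quoted significance level can be checked directly from the counts above. A minimal sketch (ours, not the authors' analysis) using a two-proportion z-test, which for a 2×2 table is equivalent to a chi-squared test without continuity correction:<br />

```python
import math

# Two-proportion test on the counts reported above: complications in
# group 1 (315 of 604, prayed for but uncertain of it) vs. group 3
# (352 of 601, prayed for and certain of it).
def two_proportion_p(x1, n1, x2, n2):
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    # two-sided p-value from the standard normal distribution
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(round(two_proportion_p(315, 604, 352, 601), 3))  # 0.025
```

Reassuringly, the reported p = .025 is reproduced from the published counts alone.<br />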
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task").<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
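Dubner and Levitt's explanation is easy to demonstrate with a toy simulation (every parameter below is invented for illustration): give each child a random talent plus a small performance bonus that grows with age within the cohort, select the top performers, and watch the birth months of those selected skew toward the start of the year.<br />

```python
import random

# Toy simulation of the relative-age effect (all parameters invented):
# observed performance = talent + a maturity bonus that grows with age
# within the one-calendar-year cohort, so January kids get the biggest bonus.
random.seed(1)
selected_months = []
for _ in range(200):                                   # 200 cohorts
    kids = [(m, random.gauss(0, 1) + (12 - m) * 0.1)   # (birth month, performance)
            for m in range(1, 13) for _ in range(20)]  # 20 kids per month
    kids.sort(key=lambda k: k[1], reverse=True)
    selected_months += [m for m, _ in kids[:12]]       # top 5% "make the team"

birth_half = sum(1 for m in selected_months if m <= 6) / len(selected_months)
print(round(birth_half, 2))  # well above the 0.5 expected without the age bonus
```

In this toy setup well over half of the selected players are born in the first half of the year, even though talent itself has no connection to birth month.<br />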
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>
Mmartin
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2638
Chance News 17
2006-06-07T22:06:00Z
<p>Mmartin: /* Questions */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [''Nature'', 30 March 2000] claimed a statistically significant difference in the ratio for homosexuals as compared to heterosexuals. Thus, an easy noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
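The multiple-comparisons point deserves a number. If each of k independent comparisons is tested at the 5% level on pure noise, the chance of at least one "significant" result is 1 - 0.95^k; a short sketch (k = 20 is our arbitrary choice for illustration):<br />

```python
import random

# With many independent comparisons on pure noise, a p < .05 "finding"
# becomes likely: analytically, P(at least one) = 1 - 0.95**k.
k = 20
analytic = 1 - 0.95 ** k
print(round(analytic, 2))  # 0.64

# The same thing by simulation: under the null, p-values are uniform on (0, 1).
random.seed(0)
trials = 10_000
hits = sum(any(random.random() < 0.05 for _ in range(k)) for _ in range(trials))
print(round(hits / trials, 2))  # close to the analytic value
```

So with twenty comparisons, the "odd comparison" with p &lt; 5% is not odd at all: it turns up about two times in three.<br />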
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry (no pun intended) presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the lake, but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
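For readers unfamiliar with the jargon: an AR(3) model predicts each value from the previous three, while a GARCH(1,1) component lets the noise variance itself depend on recent shocks. A minimal simulation of such a process (all coefficients below are invented purely for illustration; this is not the authors' fitted model):<br />

```python
import random

# Simulate an AR(3) process with GARCH(1,1) innovations, the model class
# the authors fitted. All coefficients are invented for illustration.
random.seed(0)
phi = [0.5, -0.2, 0.1]               # AR(3) coefficients
omega, alpha, beta = 0.1, 0.2, 0.7   # GARCH(1,1) variance recursion
y, eps, sigma2 = [0.0, 0.0, 0.0], 0.0, 1.0
for _ in range(500):
    sigma2 = omega + alpha * eps ** 2 + beta * sigma2   # conditional variance
    eps = random.gauss(0, 1) * sigma2 ** 0.5            # heteroskedastic shock
    y.append(phi[0] * y[-1] + phi[1] * y[-2] + phi[2] * y[-3] + eps)
print(len(y))  # 503
```

The GARCH recursion is what produces the "wild transitory behavior" the authors mention: a large shock inflates the variance, making further large shocks temporarily more likely.<br />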
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was of a result of the rapidly returning wave." As Nof put it, "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that a proper storm would occur "roughly once every 2000 years." <br />
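The Weibull calculation runs along these lines: fit a Weibull distribution to observed wind speeds, compute the tail probability of a wind strong enough to produce the effect, and convert that into a return period. A sketch with entirely invented numbers (the paper's fitted parameters are not given in this summary):<br />

```python
import math

# Return period of an extreme wind under a Weibull model. All numbers
# below are hypothetical: P(W > w) = exp(-(w / scale) ** shape).
shape, scale = 2.0, 7.0      # hypothetical Weibull fit to wind speed (m/s)
threshold = 25.0             # hypothetical wind needed for the event
p_exceed = math.exp(-(threshold / scale) ** shape)
obs_per_year = 365           # say, one wind observation per day
return_period_years = 1 / (p_exceed * obs_per_year)
print(round(return_period_years))
```

With these made-up inputs the event is roughly a once-in-a-millennium occurrence; the paper's "once every 2000 years" figure comes from its own fitted parameters, not from this sketch.<br />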
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets and some of the poverty has disappered.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
use complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated, for the smallest unit available from the 2001 census, <br />
the number of people who fall into each of the four new categories, <br />
the Economist takes the single largest group to represent the character of the area. <br />
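The modal rule just described, in which the largest of the four category counts labels the whole output area, can be sketched in a few lines of Python. The category names and counts below are illustrative, not taken from the census data itself:

```python
def classify_area(counts):
    """Label an output area by its modal (largest) category count."""
    return max(counts, key=counts.get)

# An illustrative output area: 80 'wealthy' residents and 60, 40 and 20
# in the other three categories. The area is labelled wealthy even though
# 120 of its 200 residents are not in that class.
area = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}
label = classify_area(area)
```

Note that a rule like this ignores both ties and the absolute number of residents per street, which is exactly the concern raised in the questions below.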
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? For example, should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants" which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted like children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she ask him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc." She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns its readers about statistical misrepresentations. For example: the bars in a bar graph comparison should never have different widths, because doing so exaggerates differences that should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
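That comparison can be reproduced from the published counts. This is a minimal sketch, using the fact that Pearson's chi-squared test on a 2x2 table (without continuity correction) is equivalent to a two-sided two-proportion z-test, so the Python standard library suffices:

```python
from math import erfc, sqrt

def two_prop_chi2(x1, n1, x2, n2):
    """Pearson chi-squared test for two proportions (no continuity
    correction); equivalent to a two-sided two-proportion z-test."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z * z, p_value

# Group 1 (uncertain whether prayed for): 315 complications out of 604
# Group 3 (certain they were prayed for): 352 complications out of 601
chi2, p = two_prop_chi2(315, 604, 352, 601)
```

The statistic comes out near 5.0 on one degree of freedom, giving p ≈ .025, matching the paper; a test with Yates's continuity correction would give a slightly larger p.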
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task").<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
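The letter's point is pure arithmetic and can be checked in a couple of lines; the board sizes here are the letter-writer's hypotheticals, not actual data:

```python
# Hypothetical FTSE 100: every company has an 11-member board with
# exactly one woman, so 100% of companies have a female director...
share_of_companies_with_woman = 1.0
ftse_women_share_of_directors = 1 / 11   # ...yet only ~9% of directors are women

# Hypothetical Fortune 500: 84% of companies have one woman on a
# 20-member board, so only ~4% of all board members are women.
f500_women_share_of_directors = 0.84 * 1 / 20
```

So a high share of ''companies'' with a female director is perfectly compatible with a low share of ''directors'' who are women, which is why the two published percentages cannot be compared directly.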
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>
https://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2636 Chance News 17, 2006-06-07T22:02:30Z
<p>Mmartin: /* Walking on Water */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
</blockquote><br />
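A short sketch makes the contrast explicit: with k vases each holding n flowers, the special vase ends up 3 - 3/k above the new average for any n, so the original question (k fixed at 3, asking for n) is indeterminate, while Winkler's variant (the excess fixed at 2, asking for k) forces k = 3.

```python
def excess_over_average(k, n=10):
    """Excess of the special vase over the new average,
    for k vases each starting with n flowers."""
    vases = [n] * k
    vases[0] += 3                     # Kim adds three fresh flowers
    return vases[0] - sum(vases) / k  # algebraically 3 - 3/k, for any n

# The excess equals 2 exactly when 3 - 3/k == 2, i.e. k == 3,
# regardless of how many flowers each vase originally held.
```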
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the size of the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [''Nature'', 30 March, 2000] claimed that the difference in the ratio between homosexuals and heterosexuals was statistically significant. Thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret but in Biblical days it was known as the Sea of Galilee upon which Jesus is said to have performed one of his miracles: walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake but only from 1986 to 2003 yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place...Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was a result of the rapidly returning wave." Nof likened this event to "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that the likelihood of a proper storm would occur "roughly once every 2000 years." <br />
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets and some of the poverty has disappered.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
use complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So to combine the two datasets a subset of four categories was used by the Economist.<br />
Having calculated the number of people<br />
within the smallest unit available from the 2001 census<br />
who fall into each of the four new categories,<br />
the Economist takes the single largest group to represent the character of the area. <br />
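This modal-category rule is simple enough to sketch in a few lines of Python; the category names and counts below are illustrative, not taken from the census data:<br />

```python
# Hypothetical resident counts for one census output area, after
# collapsing the eight census categories into the four used here.
area_counts = {"wealthy": 80, "well-off": 60, "middling": 40, "poor": 20}

def classify(counts):
    """Label an area by its single largest resident group."""
    return max(counts, key=counts.get)

print(classify(area_counts))  # -> wealthy
```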
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class ('the wealthy') and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, e.g. old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants", which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths because to do so would exaggerate the difference which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represent nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers, but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
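The quoted p-value is easy to check. For a 2 × 2 table, the Pearson chi-squared statistic has one degree of freedom, and its p-value can be computed with nothing beyond the Python standard library; the counts are the ones reported in the study (315 of 604 versus 352 of 601):<br />

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic (no continuity correction) for the
    2x2 table [[a, b], [c, d]], with its df = 1 p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))  # survival function of chi-squared(1)
    return chi2, p

# Group 1 (told they might be prayed for, and were): 315 of 604 had a
# complication.  Group 3 (knew they were being prayed for): 352 of 601.
chi2, p = chi2_2x2(315, 604 - 315, 352, 601 - 352)
print(round(chi2, 2), round(p, 3))  # roughly 5.02 and 0.025
```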
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p-value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task".)<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
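A toy simulation illustrates how a within-bracket development edge alone can produce the observed birth-month skew. Everything in it is invented for illustration (the cohort size, the skill distribution, the 0.1-standard-deviation-per-month bonus, and the 5% selection cut); it is not the authors' analysis:<br />

```python
import random

random.seed(1)

# 100,000 children: a birth month (1-12) and a "true ability" score.
kids = [(random.randint(1, 12), random.gauss(0, 1)) for _ in range(100_000)]

# Perceived ability adds a development bonus that shrinks through the
# year: within one age bracket, a January child is nearly a year older
# than a December child.  The bonus size (0.1 sd per month) is invented.
perceived = [(m, skill + (12 - m) * 0.1) for m, skill in kids]

# "Select" the apparent top 5% for extra coaching and feedback.
top = sorted(perceived, key=lambda t: -t[1])[:5_000]
early_share = sum(1 for m, _ in top if m <= 3) / len(top)
print(f"selected players born Jan-Mar: {early_share:.0%}")  # well above 25%
```

Even though true ability is independent of birth month, the selected group is heavily skewed toward January-March births, much like the English data quoted above.<br />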
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
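The letter's arithmetic is easy to verify. Both board structures below are the letter writer's hypothetical ones, not real data:<br />

```python
# Hypothetical scenario 1: every FTSE 100 board has 11 members,
# exactly one of whom is a woman.  Then 100% of boards include a
# woman, yet women make up only about 9% of directors.
ftse_share = (100 * 1) / (100 * 11)
print(f"women among FTSE 100 directors: {ftse_share:.0%}")  # 9%

# Hypothetical scenario 2: 84% of 500 Fortune boards, each with 20
# members, have exactly one woman; the rest have none.
f500_share = (0.84 * 500 * 1) / (500 * 20)
print(f"women among Fortune 500 directors: {f500_share:.1%}")  # 4.2%
```

So the headline "84 per cent versus 9 per cent" can coexist with the UK boards actually having the ''larger'' share of women directors, which is the letter's point.<br />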
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2635Chance News 172006-06-07T21:49:08Z<p>Mmartin: /* Forsooths */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try but if you get stumped you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found. The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
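Winkler's variant, unlike the original, has a unique answer, which a brute-force check over the number of vases confirms (the starting count of flowers is arbitrary; the algebra works out the same for any n):<br />

```python
# Each of v vases starts with n flowers.  Adding 3 to one vase makes
# the new average n + 3/v, so that vase exceeds the average by 3 - 3/v,
# which equals 2 exactly when v = 3, whatever n is.
n = 10  # arbitrary starting count; any value gives the same answer
solutions = [v for v in range(1, 100)
             if (n + 3) - (v * n + 3) / v == 2]
print(solutions)  # -> [3]
```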
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be when sex could somehow get connected, and then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March 2000] claimed there was a statistically significant difference in the ratio for homosexuals as compared to heterosexuals. Thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous: the participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison that had a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret, but in Biblical days it was known as the Sea of Galilee, upon which Jesus is said to have performed one of his miracles, walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it offered only the typically dry (no pun intended) presentation found in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event), they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake, but only from 1986 to 2003, and yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they "added a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place ... Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [''Bulletin American Meteorological Society'', 73; 305-314]. This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was a result of the rapidly returning wave." Nof described the event this way: "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [''Journal of Applied Meteorology'', 33, No 8; 1017-1025]. Here they "used the Weibull distribution ... applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that a proper storm would occur "roughly once every 2000 years." <br />
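The return-period logic behind a claim like "roughly once every 2000 years" can be sketched with a Weibull survival function. The shape, scale and threshold values below are hypothetical stand-ins, not the parameters fitted in the 1993 paper:<br />

```python
import math

def weibull_exceedance(x, shape, scale):
    """P(wind speed > x) under a Weibull(shape, scale) model."""
    return math.exp(-((x / scale) ** shape))

# Hypothetical numbers, for illustration only: a Weibull wind model
# and a storm threshold strong enough to "part" the sea.
shape, scale = 2.0, 8.0
threshold = 30.0
daily_p = weibull_exceedance(threshold, shape, scale)

# Treating each day as an independent draw, the expected wait is:
return_period_years = 1 / (daily_p * 365)
print(f"about one such storm every {return_period_years:,.0f} years")
```

The rarer the threshold wind is under the fitted distribution, the longer the implied return period; the paper's actual fit is what produced its 2000-year figure.<br />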
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a subset of four categories.<br />
Having calculated the number of people<br />
within the smallest unit available from the 2001 census<br />
who fall into each of the four new categories,<br />
the Economist takes the single largest group to represent the character of the area. <br />
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class ('the wealthy') and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? E.g., should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News item], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants" which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted like children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and, instead of paying him, she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths because to do so would exaggerate the difference which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
American Heart Journal, Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. In addition, the researchers studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they may or may not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to include the phrase “for a successful surgery with a quick, healthy recovery and no complications” in their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
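The quoted p-value can be reproduced from the counts above. The sketch below recomputes the Pearson chi-squared statistic for the 2x2 table of the first and third groups (complication vs. no complication), with no continuity correction; applying Yates's correction would give a slightly larger p-value:

```python
import math

# Complication counts from the article: group 1 (prayed for, uninformed)
# and group 3 (prayed for, informed).
obs = [[315, 604 - 315],   # group 1: complication, no complication
       [352, 601 - 352]]   # group 3: complication, no complication

row = [sum(r) for r in obs]                          # group sizes 604, 601
col = [obs[0][0] + obs[1][0], obs[0][1] + obs[1][1]]
n = sum(row)

# Pearson chi-squared statistic: sum of (O - E)^2 / E over the four cells.
chi2 = sum((obs[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
           for i in range(2) for j in range(2))

# With 1 degree of freedom, the survival function is erfc(sqrt(x / 2)).
p = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.2f}, p = {p:.3f}")  # chi2 = 5.02, p = 0.025
```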
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than &quot;deliberate practice&quot; in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, &quot;I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it.&quot; (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for &quot;Time on Task&quot;).<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
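The relative-age mechanism is easy to simulate. The model below is a deliberately crude sketch of my own, not anything from the article: innate ability is standard normal, measured performance adds a boost proportional to how much older a child is than the youngest in a calendar-year cohort, and the "selected" players are simply the top decile of measured performance.

```python
import random

random.seed(42)  # reproducible illustration

N = 10_000
kids = []
for _ in range(N):
    birth_day = random.randrange(365)   # 0 = January 1, the oldest in a
                                        # cohort cut off at December 31
    ability = random.gauss(0.0, 1.0)    # innate talent, standard normal
    # Hypothetical maturity boost: up to two ability-units for the oldest.
    boost = 2.0 * (365 - birth_day) / 365
    kids.append((ability + boost, birth_day))

# "Select" the top 10% on measured performance, then look at birth months.
top = sorted(kids, reverse=True)[: N // 10]
frac_q1 = sum(1 for _, day in top if day < 91) / len(top)

print(f"Share of selected players born January-March: {frac_q1:.2f}")
```

Even though birthdays are uniform over the year, well over a third of the simulated selections fall in the first quarter, qualitatively matching the pattern the article reports for England.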
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
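Miles's two hypothetical board scenarios can be worked through explicitly; every number below is the letter's own illustrative assumption:

```python
# FTSE 100 scenario: 100 companies, 11 directors each, one woman per board.
ftse_pct_companies_with_women = 100 * (100 / 100)        # 100% of companies
ftse_pct_women_directors = 100 * (100 * 1) / (100 * 11)  # ~9% of directors

# Fortune 500 scenario: 84% of 500 companies have one woman on a
# 20-member board (and the rest have none).
f500_pct_companies_with_women = 84.0
f500_pct_women_directors = 100 * (500 * 0.84 * 1) / (500 * 20)  # 4.2%

print(f"FTSE: {ftse_pct_companies_with_women:.0f}% of companies, "
      f"{ftse_pct_women_directors:.1f}% of directors")
print(f"F500: {f500_pct_companies_with_women:.0f}% of companies, "
      f"{f500_pct_women_directors:.1f}% of directors")
```

The Observer compared 84% (a companies measure) with 9% (a directors measure); under these assumptions the like-for-like directors measures are 9.1% versus 4.2%, reversing the impression, which is Miles's point about meaningless comparisons.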
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartinhttps://www.causeweb.org/wiki/chance/index.php?title=Chance_News_17&diff=2634Chance News 172006-06-07T21:47:36Z<p>Mmartin: /* Quotation */</p>
<hr />
<div>==Quotation==<br />
<br />
<blockquote><br />
There are two times in a man's life when he should not speculate: when he can't afford it, and when he can. </blockquote><br />
<br />
<div align="right" > Mark Twain </div><br />
<br />
==Forsooths==<br />
<br />
Part of the fun of looking at Forsooths is trying to figure out why they are Forsooths. You should certainly try, but if you get stumped, you can read one person's idea of why they are Forsooths at the end of this Chance News. <br />
<br />
The first three Forsooths are from the May 2006 ''RSS News''.<br />
<br />
<blockquote> Of the US Fortune 500 companies, 84 percent now have women on their boards: in the UK among the directors of companies in the FTSE 100, only 9 percent are women.<br />
<br><br />
<div align="right">''The Observer''<br><br />
19 March 2006<br />
</div></blockquote><br />
<br />
----<br />
<br />
<blockquote> Thursday is the least productive day for finance workers, research has found, The start of the week is the best time with 18 per cent claiming they were most productive on a Monday.<br><br />
<div align="right">''Metro''<br><br />
26 January 2006<br />
</div></blockquote><br />
----<br />
<blockquote> Question:<br><br><br />
Kim has three vases in her living room, each containing the same number of flowers. Kim adds three fresh flowers to one vase which now has two more than the new average. How many flowers were in the vases originally?<br />
<br><br />
<div align="right">2006 Mensa puzzle calendar<br><br />
</div></blockquote><br />
[note: answer given as "six", which is quite correct of course.]<br />
----<br />
Peter Winkler pointed out that the following question is not a forsooth:<br />
<br />
<blockquote>Kim has *some* vases in her living room, each containing the same number of<br />
flowers. Kim adds three fresh flowers to one vase which now has two more than<br />
the new average. How many *vases* are there? </blockquote><br />
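Both versions of the puzzle can be checked mechanically: with v vases of n flowers each, adding three flowers to one vase leaves it 3 - 3/v above the new average, whatever n is. A short sketch, using exact fractions to avoid floating-point noise:

```python
from fractions import Fraction

def excess(n, v):
    """How far the topped-up vase sits above the new average,
    given v vases that each started with n flowers."""
    new_avg = Fraction(v * n + 3, v)
    return (n + 3) - new_avg

# Original Mensa version: three vases.  The excess is 2 for *every* n,
# so "how many flowers were in the vases originally?" has no unique answer.
assert all(excess(n, 3) == 2 for n in range(1, 101))

# Winkler's variant: the number of vases *is* determined by the excess.
solutions = [v for v in range(1, 101) if excess(0, v) == 2]  # n cancels out
print(solutions)  # [3]
```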
<br />
==Walking on Water==<br />
<br />
For the most part, scientists, mathematicians and statisticians labor in obscurity. Almost all of what they do is of no interest to the general public. The exception used to be if sex could somehow get connected; then the scientist/mathematician/statistician would suddenly be on the rolodexes of the various talk-show programs. As an example, not so long ago a statistical study regarding the ratio of the length of the forefinger to the ring finger was everywhere and anywhere. Why? Because the authors [Nature, 30 March 2000] claimed there was a statistically significant difference in the ratio for homosexuals as compared to heterosexuals. Thus, an easy, noninvasive, visual way of spotting sexual preference. The flaws in the study were numerous. The participants were chosen from gay pride celebrations in the vicinity of San Francisco, an area not known to be typical of the United States; multiple comparisons were made, and with enough data dredging it is not statistically surprising that there would be the odd comparison with a p-value less than 5%. The clinical (substantive, practical) significance was more or less zero, in keeping with the negligible effect size coupled with measurement error. Nevertheless, titillation was high enough for several weeks of joking, hand comparisons and bad puns by the public and the media.<br />
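The data-dredging complaint is quantitative. If the comparisons were independent and there were no real effects, the chance that at least one of k tests comes out "significant" at the 5% level is 1 - 0.95^k, a standard (and here purely illustrative) calculation:

```python
def familywise_error(k, alpha=0.05):
    """Chance of at least one false positive among k independent null
    tests, each run at significance level alpha."""
    return 1 - (1 - alpha) ** k

for k in (1, 5, 20):
    print(f"{k:2d} comparisons -> {familywise_error(k):.3f}")
```

With 20 comparisons the chance of "the odd comparison" below 5% is about 64%, so an isolated small p-value in a dredged study is unremarkable. (Comparisons on the same subjects are correlated, so the independence assumption makes this only a rough guide.)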
<br />
But sex, while always interesting, has given way to religion in American life. The phenomenal success of Dan Brown's ''The Da Vinci Code'' and the rise of the religious right guarantee that any scientific/mathematical/statistical research which can be tied to the Bible will bring instant celebrityhood. Even when the investigation appears in the unlikely ''Journal of Paleolimnology'' [2006, 35:417-439] and involves "a small freshwater lake (148 km squared and a mean depth of 20 m)." The current name is Lake Kinneret but in Biblical days it was known as the Sea of Galilee upon which Jesus is said to have performed one of his miracles, walking on water. To walk on water is now a phrase that has come into the English language as being synonymous with extra-human, divine talent.<br />
<br />
The paper by Nof, McKeague and Paldor is not an easy read, combining as it does analysis based on sea surface temperature, (warm and salty) springs, plume dynamics, ice dynamics and time series. The paper would never have made the talk-show circuit if it were only the typically dry--no pun intended-- presentation in such a technical journal. What sets it apart is its scientific explanation of how Jesus could manage to walk on water. In essence, after much physics, mathematics, and a bit of statistics, the authors have "proposed that the unusual local freezing process might have provided an origin to the story that Christ walked on water. Since the springs ice is relatively small, a person standing or walking on it may appear to an observer situated some distance away to be 'walking on water'." To avoid being inundated by hate mail (which they received in any event) they carefully state, "Whether this [walking on ice] happened or not is an issue for religion scholars, archeologists, anthropologists and believers to decide on."<br />
<br />
In essence, the result of most of the highly mathematical argument in the paper is that things were occasionally colder back then and ice could have formed every once in a while, about every 160 years. Strangely enough, much of their data for this allegation comes from two core samples of temperature taken 2000 km away. The justification for this strange assertion is "because this distance is not any greater than the typical weather system scale in this part of the world." They do have some data much closer to the Lake, but only from 1986 to 2003, yet "only the first 9 years of data were deemed suitable for use in the subsequent model." Because "the residual plots displayed some wild transitory behavior (as often seen, for example, in financial time series data)," they added "a GARCH(1,1) component" to an AR(3) model, resulting in the prediction of ice formation about every 160 years.<br />
<br />
In their summary, the authors carefully state, "We hesitate to draw any conclusions regarding the implications of this study to the actual events that took place... Our springs ice calculations may or may not be related to the origin of the account of Christ walking on water." Nonetheless, Nof and Paldor are not strangers to conjuring up scientific explanations for Biblical phenomena. In 1992 they wrote an article, "Are There Oceanographic Explanations for the Israelites' Crossing of the Red Sea?" [Bulletin American Meteorological Society, 73; 305-314] This time, instead of temperature, it is wind which parted the Red Sea just long enough: "It is suggested that the crossing occurred while the water receded and that the drowning of the Egyptians was a result of the rapidly returning wave." Nof described the event this way: "It's like blowing across the top of a cup of coffee. The coffee blows from one end of the cup to the other." Statistics are completely absent in this paper. However, in 1993 they published a paper, "Statistics of Wind over the Red Sea with Application to the Exodus Question" [Journal of Applied Meteorology, 33, No 8; 1017-1025]. Here they "used the Weibull distribution ...applied to winds in the part of the Indian Ocean adjacent to the Red Sea" to argue that a proper storm would occur "roughly once every 2000 years." <br />
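For flavor, this is the kind of exceedance calculation a Weibull wind model supports. Every number below (shape, scale, storm threshold) is invented for illustration; none of it comes from Nof and Paldor's papers:

```python
import math

def weibull_exceedance(w, shape, scale):
    """P(wind speed > w) under a Weibull(shape, scale) model."""
    return math.exp(-((w / scale) ** shape))

shape, scale = 2.0, 7.0   # hypothetical Weibull parameters, wind in m/s
threshold = 20.0          # hypothetical "proper storm" wind speed

p_day = weibull_exceedance(threshold, shape, scale)  # chance on a given day
return_period_years = 1 / (p_day * 365.25)

print(f"P(storm on a given day) = {p_day:.2e}; "
      f"roughly one storm every {return_period_years:.0f} years")
```

With a rarer threshold the same arithmetic yields multi-century return periods, which is the general shape of the "once every 2000 years" claim.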
<br />
===Discussion===<br />
<br />
1. Someone commented that "The reaction among Biblical scholars to Nof's theory ranged from bemused detachment to real irritation." Why the detachment and why the irritation?<br />
<br />
2. Were the Israelites lucky to have picked the exactly correct moment? What calculations do you believe they did?<br />
<br />
3. What physical phenomenon could explain the destruction of the walls of Jericho? Noah's flood? The Biblical burning bush?<br />
<br />
4. The conflict between Darwinism and Biblical fundamentalism has been much in the news the past few years. Why hasn't there been any clash between fundamentalism and aspects of chemistry such as Avogadro's number?<br />
<br />
Submitted by Paul Alper<br />
<br />
==Measuring poverty in London over 100 years==<br />
[http://www.economist.com/World/europe/displayStory.cfm?story_id=6888761 There goes the neighbourhood], <br />
From The Economist print edition, May 4th 2006.<br><br />
[http://www.economist.com/World/europe/displaystory.cfm?story_id=6893177&CFID=4152326&CFTOKEN=9692083 Booth redux], <br />
From Economist.com, May 4th 2006.<br />
<br />
This on-line article uses recent census data to graphically update a 100-year old map of poverty in London by district and street.<br />
The original project, led by the shipping magnate Charles Booth, <br />
colour-coded every street in the capital according to its social make-up.<br />
It shows the extent to which poverty depends on location<br />
and how little has changed over the past century.<br />
<br />
The article illustrates one area, north Chelsea, in 1898 and 2001,<br />
colour-coding each street as either wealthy, well-off, middling or poor.<br />
In 1898, Chelsea was socially mixed, neither especially rich nor especially poor.<br />
Today Chelsea is considered a very desirable place to live,<br />
with many wealthy streets, and some of the poverty has disappeared.<br />
But on closer inspection the Economist claims that <br />
<blockquote><br />
poverty has not been altogether banished from this part of Chelsea, <br />
nor has it moved much. <br />
Most of the poorest areas in 2001 were also poor in 1898, <br />
and in almost exactly the same places. <br />
The reason is that the worst Victorian slums have been knocked down <br />
and replaced with tracts of social housing.<br />
</blockquote><br />
<br />
Neither the original survey nor its updated version<br />
uses complicated statistical models.<br />
In 1898, researchers peered through windows and into back gardens,<br />
or asked police officers for opinions, in <br />
order to classify each street into one of seven categories<br />
from wealthy at the top to 'vicious, semi-criminal' at the bottom of the poverty scale.<br />
The 2001 census measures people's socio-economic status as one of eight categories.<br />
So, to combine the two datasets, the Economist used a common subset of four categories.<br />
Having calculated the number of people, <br />
within the smallest unit available from the 2001 census, <br />
who fall into the four new categories, <br />
the single largest group is taken to represent the character of the area. <br />
<br />
===Questions===<br />
* The Economist gives an example of its classification methodology: if an output area contains 80 members of the upper managerial and professional class 'the wealthy' and 60, 40, and 20 members, respectively, of the other three new categories, it is taken to be wealthy. Is it reasonable to base the classification of an area on the most common category of resident? For example, should the number of people in each street be taken into account?<br />
* How might missing data be handled, such as old streets that have disappeared or new streets that didn't exist in 1898?<br />
<br />
===Further reading===<br />
* [http://booth.lse.ac.uk/ The Charles Booth Online Archive] is a searchable resource giving access to archive material from the Booth collections of the British Library of Political and Economic Science (the Library of the London School of Economics and Political Science) and the University of London Library.<br />
* [http://booth.lse.ac.uk/cgi-bin/do.pl?sub=view_booth_and_barth&args=531000,180400,6,large,5 Poverty maps of London] - this interactive webpage allows viewers to zoom in on an area of London to see the original 1898 map juxtaposed with a modern view of the same area.<br />
* [http://www.statistics.gov.uk/census/ 2001 UK census]<br />
<br />
Submitted by John Gavin<br />
<br />
<br />
==Facial Attraction==<br />
<br />
In a recent [http://chance.dartmouth.edu/chancewiki/index.php/Chance_News_17#Walking_on_Water Chance News article], it is alleged that "sex, while always interesting, has given way to religion in American life" when it comes to getting research and researchers into the rolodexes of the media. That this is clearly not the case is evidenced by "Reading men's faces: women's mate attractiveness judgments track men's testosterone and interest in infants", which appeared in the ''Proceedings of the Royal Society'', 2006. In summary, it is postulated that females, when eyeing a potential mate, are able to discern from facial cues which males are likely to provide good genetic quality for offspring and which males would help raise offspring.<br />
<br />
In order to determine the genetic quality of masculinity, the authors had the males' saliva tested for testosterone. Each male also "completed an interest in infants test" in which "subjects were asked to indicate whether they preferred pictures of adult or infant faces when both were presented simultaneously in pairs." The males then "posed for digital photographs" with hairstyles excluded and "Young women subsequently rated these photos for the degree to which the men depicted liked children, as well as for physical attractiveness, masculinity, kindness, attractiveness as a short-term mate and attractiveness as a long-term mate."<br />
<br />
According to the article, "The results of this study suggest that women's perceptions of men's faces track actual characteristics of men that are theoretically important for mate choice ... the present study provides the first direct evidence that women's attractiveness judgments specifically track both men's affinity for children and men's hormone concentrations."<br />
<br />
===Discussion===<br />
1. The study started with "51 University of Chicago students who were recruited from a University website and paid $10 for their participation." The 29 "Women raters were University of California, Santa Barbara (UCSB) undergraduates who participated in exchange for course credit." Starting with this non-random sample, what inferences if any can be made to a larger population? Undergraduates, students in general, Americans, the rest of the planet? Speculate on how seriously the women did their rating.<br />
<br />
2. "Five [male] subjects who reported a gay sexual orientation and seven others who refused to have their photos taken were dropped from the data analysis." Justify and criticize this exclusion. <br />
<br />
3. The women rated the men on a scale of 1 to 7 and "a rating of 4 indicates that he is about average, a rating of 1 means he is far below average and a rating of 7 means he is far above average." Comment on whether "distance" between a 5 and a 4 is the same as the distance between a 2 and a 1. Comment on whether a 6 is twice as good as a 3. What is the similarity between this type of rating and student evaluations of instructors?<br />
<br />
4. The men were instructed "to look straight into the camera and assume a neutral facial expression." Define a neutral facial expression.<br />
<br />
5. If you were given paired photos of adults and infants how much time would be necessary to choose a preference within a given pair? If you were paid more money for participating, would you spend more time choosing? Could someone who greatly prefers infants to adults be accused of pedophilia tendencies?<br />
<br />
6. The mean testosterone for this group was 88.38 pg/ml with a standard deviation of 27.97 and was "normally distributed once an outlier three standard deviations above the mean was dropped from the sample." Have you ever had your testosterone measured? Do you have any idea what your pg/ml score is? <br />
<br />
7. The article has an abundant number of t-values and related p-values, the latter usually of the form p-value < some number. Speculate on why effect size coupled with some sort of interval doesn't seem to be present. <br />
<br />
8. One attribute that was not discussed was spirituality, a popular term in this age of religiosity. How could that be measured, either facially or otherwise?<br />
<br />
9. Why is this variant of an old Yiddish joke relevant? A young woman goes to a shadchen [matchmaker or marriage broker] to seek a husband. The shadchen is an up-to-date techie and uses a spreadsheet to find the right male. She lists all the characteristics she wants in a husband: age, height, weight, athletic ability, eye color, etc. He uses his spreadsheet to find a fellow who fits the constraints, and arranges a meeting between the two of them. Next week the woman comes back and instead of paying him she asks him to find another candidate. The shadchen is surprised and says, "Wasn't he of the right age, right height, weight, athletic ability, eye color, etc.?" She replies, "Yes, but I didn't like him."<br />
<br />
Submitted by Paul Alper<br />
<br />
==A New Statistical Misrepresentation==<br />
<br />
Every elementary statistics textbook warns the readers about statistical misrepresentations. For example: a bar graph comparison should never have different widths, because to do so would exaggerate the difference, which should depend only on heights; a graph where the origin is missing inflates differences; histograms should exhibit equal widths; when comparing contributions, per capita contribution is better than total contribution; regression graphs should avoid extrapolation. [http://select.nytimes.com/2006/05/29/opinion/29krugman.html Paul Krugman's op-ed piece] in the ''New York Times'' of May 29, 2006 referred to a flagrant misrepresentation I had never heard of. He entitled his article "Swift Boating The Planet" because he feels it is a fraudulent misrepresentation of global warming.<br />
According to Krugman, Dr. James Hansen, a climatologist at NASA, had numerically predicted rising temperatures as far back as 1988. "The original paper showed a range of possibilities, and the actual rise in temperature has fallen squarely in the middle of the range." However, his critic, Dr. Patrick Michaels, "claimed that the actual pace of global warming was falling far short of Dr. Hansen's predictions." Dr. Michaels concluded this by erasing "all the lower curves, leaving only the curve that the original paper described as being 'on the high side of reality'."<br />
<br />
===Discussion===<br />
<br />
1. Krugman claims that Dr. Michaels "has received substantial financial support from the energy industry." How does this affect your view of Dr. Michaels' assertions?<br />
<br />
2. Of Dr. Michaels' removal of the lower curves, Dr. Hansen is quoted as saying "Is this treading close to scientific fraud?" Krugman's response is "no: it isn't 'treading close,' it's fraud pure and simple." What do you believe Dr. Michaels would say to justify his removal of the lower curves?<br />
<br />
Submitted by Paul Alper<br />
<br />
== The Kindness of Strangers? ==<br />
<br />
This is a review of a recent article:<br />
<br />
[http://www.nytimes.com/2006/03/31/health/31pray.html?ex=1301461200&en=4acf338be4900000&ei=5088&partner=rssnyt&emc=rss Long-awaited study questions the power of prayer]<br><br />
The ''New York Times'', March 31, 2006, Page A1<br><br />
Benedict Carey<br />
<br />
that is based on the following paper.<br />
<br />
[http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=16569567 Study of the Therapeutic Effects of Intercessory Prayer (STEP) in cardiac bypass patients]: A multicenter randomized trial of uncertainty and certainty of receiving intercessory prayer<br />
''American Heart Journal'', Volume 151, Issue 4, April 2006, Pages 934-942<br />
Herbert Benson, MD, et al.<br />
<br />
Suppose you are about to undergo coronary artery bypass surgery. Would you want to have strangers praying for your successful recovery? And if so, would you prefer to know, or not to know, that such prayers were being offered?<br />
<br />
The results of this study, which represents nearly 10 years of research, are described in the ''New York Times'' article as “the most scientifically rigorous investigation” to date of the effects of prayer on illness and medical recovery. The researchers also studied whether patients who knew they were receiving prayers fared better than those who were told only that they might be prayed for. Leaving aside the perhaps surprising fact that “rigorous investigation” of the connection between prayer and medical recovery is deemed a worthy expenditure of research time and money, the study did produce some unexpected conclusions. While there was no difference between the recovery outcomes of the patients who were prayed for and those who were not, the patients who knew they were receiving prayers actually fared ''worse'' than those who didn’t know they were receiving prayers.<br />
<br />
In the study, roughly two-thirds of the 1802 subjects were told that they might or might not receive prayers—of these, 604 were prayed for and 597 were not. The remaining 601 patients received prayers after being told that they would receive them. Prayers began the night before surgery and continued for two weeks, and were provided by members of three Christian congregations in Massachusetts, Minnesota, and Missouri. The prayer givers, known as ''intercessors'', were asked to add the phrase “for a successful surgery with a quick, healthy recovery and no complications” to their usual prayers. The primary outcome of interest was the development of any complication within 30 days of a subject’s bypass graft surgery.<br />
<br />
At least one complication arose in 971 patients, or roughly 54% of the total. Of these, 315 were in the first group (52%), 304 were in the second group (51%), and 352 were in the last group (59%). A Chi-squared test applied to the values for the first and third groups (both of whom received prayers but only the third knew they were receiving them) indeed implies that the difference between the outcomes is significant (p = .025). <br />
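The reported significance is easy to check from the counts quoted above. A minimal sketch using only the standard library: a pooled two-proportion z-test, which is algebraically equivalent to the Chi-squared test on a 2×2 table without continuity correction.

```python
import math

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided pooled z-test for a difference in two proportions
    (equivalent to a 2x2 Chi-squared test without continuity correction)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value

# Group 1: uncertain whether prayed for, and prayed for (315 of 604 had
# complications). Group 3: knew they were prayed for (352 of 601).
p = two_proportion_p(315, 604, 352, 601)
print(round(p, 3))  # 0.025, matching the significance level reported above
```

The identity p-value(z-test) = p-value(uncorrected Chi-squared) holds because the Chi-squared statistic on a 2×2 table equals the square of the pooled z statistic.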
<br />
While the researchers state in their paper that “We have no clear explanation for the observed excess of complications in the patients who were certain that intercessors would pray for them,” the ''Times'' article suggests that a kind of “performance anxiety” may have been responsible: “It may have made them uncertain,” a co-author of the study remarks, “wondering am I so sick they had to call in their prayer team?” In addition, the authors note that a single outcome category was responsible for most of the excess complications in the third group, but they fail to mention that a Chi-squared test applied to the values for this category alone yields a p value of .011. Instead they merely remark that “the excess may be a chance finding,” a comment echoed without clarification in the ''Times'' article. One wonders if such hedging may be a reflection of the background of the lead investigator of the study, Dr. Herbert Benson. According to the ''Times'', in his work Dr. Benson has “emphasized the soothing power of personal prayer and meditation.” Moreover, most of the $2.4 million cost of the study was provided by the John Templeton Foundation, which supports research on spirituality and promotes a closer relationship between religion and science.<br />
<br />
Perhaps even more curious is the discussion in the paper about prayer and its use in the study. For example, after noting that the subjects may have had friends and family praying for them, or may have prayed for themselves, the authors note that “our study subjects may have been exposed to a large amount of non-study prayer, and this could have made it more difficult to detect the effects of prayer provided by the intercessors.” However, they do not suggest that there is any reason to believe that the amount of non-study prayer varied significantly between the three groups. Once again, one senses a reluctance to accept the results of the study, which is also conveyed in the ''Times'' article by a comment provided by Dean Marek, a chaplain at the Mayo Clinic in Rochester, Minnesota and co-author of the study: “You hear tons of stories about the power of prayer, and I don’t doubt them.” Although Marek is referring to the effects of personal prayer and the prayers of friends and family, not the prayers of strangers, the remark clearly misses a crucial point: one assumes that he doesn’t hear many stories about the prayers of friends and family that did ''not'' lead to an improved outcome, so we have no way of evaluating the efficacy of such prayers. Indeed, wasn’t the purpose of the study to investigate the validity of what is otherwise merely anecdotal reporting? Apparently the researchers don’t think so, given their comment near the end of the report: “Private or family prayer is widely believed to influence recovery from illness, and the results of this study do not challenge this belief.”<br />
<br />
===Discussion=== <br />
1. As noted above, this study cost $2.4 million. In addition, the ''Times'' reports that since 2000, the U.S. government has spent nearly the same amount on prayer research. Do you think this is money well spent? Why or why not?<br />
<br />
2. The reporter for the ''Times'' article notes that the study’s authors “left open the possibility” that their results were due to chance. Do you agree with the authors? Do you think that the reporter should have worked harder to understand and describe the significance level of the report’s findings?<br />
<br />
3. In the last sentence of the report’s discussion section the authors write, “Our study focused only on intercessory prayer as provided in this trial and was never intended to and cannot address a large number of religious questions, such as whether God exists [and] whether God answers intercessory prayers…” Why do you think they included this statement?<br />
<br />
4. How do you respond to the questions posed at the beginning of this article? <br />
<br />
Submitted by Jeanne Albert<br />
<br />
==The Birth-Month Soccer Anomaly==<br />
<br />
[http://www.nytimes.com/2006/05/07/magazine/07wwln_freak.html?ex=1304654400&en=2cf57fe91bdd490f&ei=5090&partner=rssuserland&emc=rss A Star is Made]<br><br />
''New York Times'', May 7, 2006, Sect. 6, p. 24 <br><br />
Stephen J. Dubner and Steven D. Levitt<br><br />
<br><br />
Readers may recognize Dubner and Levitt as the authors of ''Freakonomics.'' The present article opens with the curious observation that top soccer players tend to have birth-months early in the calendar year. Recent data from England, for example, show that half of the top teenage players have birthdays in January, February or March. <br />
<br />
The authors offer the following possible explanations:<br />
<blockquote><br />
(a) certain astrological signs confer superior soccer skills; <br><br />
(b) winter-born babies tend to have higher oxygen capacity, which increases soccer stamina; <br><br />
(c) soccer-mad parents are more likely to conceive children in springtime, at the annual peak of soccer mania; <br><br />
(d) none of the above.<br />
</blockquote><br />
<br />
As one might suspect, the authors' answer is (d). Their explanation flows from the larger theme of the article, which is that native ability matters a lot less than "deliberate practice" in determining what makes people successful. They cite a forthcoming book, the ''Cambridge Handbook of Expertise and Expert Performance'', which is based on research by Florida State University psychologist Anders Ericsson and his colleagues. The research spans performance in such diverse areas as sports, music, computer programming and investing. As quoted in the article, Ericsson summarizes the findings by saying, "I think the most general claim here, is that a lot of people believe there are some inherent limits they were born with. But there is surprisingly little hard evidence that anyone could attain any kind of exceptional performance without spending a lot of time perfecting it." (This, by the way, reminded us of Fred Mosteller's acronym T.O.T., for "Time on Task").<br />
<br />
As a concrete example, the article offers the following recommendation for medical training. In many specialties, performance tends to degrade over time, but not so for surgeons. The key, according to this account, is continual practice, with immediate feedback on the success of the procedure. By contrast, mammographers do not get immediate feedback on their recommendations; it may take weeks for biopsy results, and years to see whether cancer does or does not appear. The authors suggest that these professionals could enhance their skills through regular practice reading old scans, having the actual followup histories available for immediate review.<br />
<br />
With this in mind, here is the explanation proposed by Dubner and Levitt for the soccer puzzle. Youth leagues organize players by age, with brackets often defined by age at the end of the calendar year. But a child who turns ten, say, in December is nearly a year younger than one who turned ten the previous January. The greater physical development of the older child can easily be confused with native talent for the sport. And those selected (by whatever means) for increased attention gain access to the practice and feedback that are essential for reaching the top levels of performance. <br />
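The selection mechanism Dubner and Levitt describe is easy to simulate. The sketch below is a toy model of our own, not the authors': each child's observed performance is relative age within a calendar-year bracket plus noise, and the "scouts" keep the top 10%. Even with substantial noise, January–March birthdays end up heavily over-represented among those selected.

```python
import random

random.seed(1)

# Toy model: brackets cut at the calendar year, so a January-born child is
# almost a year older than a December-born child in the same cohort.
children = []
for _ in range(100_000):
    month = random.randint(1, 12)              # birth month, uniform
    relative_age = 12 - month                  # months older than a
                                               # December-born peer
    performance = relative_age + random.gauss(0, 6)  # development + noise
    children.append((performance, month))

children.sort(reverse=True)                    # best observed performers first
top = children[: len(children) // 10]          # "selected" top 10%
q1_share = sum(1 for _, m in top if m <= 3) / len(top)
print(q1_share)  # well above the 0.25 expected if birth month didn't matter
```

The noise scale of 6 is arbitrary; shrinking it makes the birth-month skew even more extreme, while a very large value washes it out, which is the point: the skew measures how much of "observed talent" is really just relative age.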
<br />
Dubner and Levitt maintain links to [http://www.freakonomics.com/times0507.html more research on this topic], as well as [http://www.freakonomics.com/times.php previous ''Freakonomics'' pieces] from the ''New York Times''.<br />
<br />
Submitted by Bill Peterson<br />
<br />
==Why the Forsooths are Forsooths==<br />
<br />
(1) [http://observer.guardian.co.uk/letters/story/0,,1739800,00.html Letter to the editor: The Observer, March 26, 2006.]<br><br />
<br />
<blockquote> In the story 'Where women get real respect' (News, last week), you said: 'Of the US Fortune 500 companies, 84 per cent now have women on their boards; in the UK among directors of companies in the FTSE 100, only 9 per cent are women.' So what?<br><br><br />
<br />
If every FTSE 100 company had 11 board members, and one of those was a woman, then 100 per cent of FTSE 100 companies would have a female board member and still only 9 per cent would be women.<br><br><br />
<br />
If 84 per cent of F500 companies have a woman on the board, and every board has 20 members, then (about) 4 per cent of F500 board members are women.<br><br><br />
Meaningless comparisons do not make an argument.<br><br />
Jeremy Miles<br><br />
University of York</blockquote><br />
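Miles's point is simple arithmetic: the two statistics measure different things, and under his hypothetical board sizes the apparently better US figure implies a ''smaller'' share of female directors. Checking his numbers (these board sizes are the letter's hypotheticals, not real data):

```python
# FTSE 100 hypothetical: every company has an 11-member board
# containing exactly 1 woman.
ftse_women_share = (100 * 1) / (100 * 11)     # women as a share of all seats
ftse_companies_with_women = 1.0               # 100 per cent of companies

# Fortune 500 hypothetical: 84% of companies have exactly 1 woman
# on a 20-member board.
f500_women_share = (0.84 * 500 * 1) / (500 * 20)
f500_companies_with_women = 0.84

print(round(ftse_women_share * 100, 1))  # 9.1 per cent of directors are women
print(round(f500_women_share * 100, 1))  # 4.2 per cent of directors are women
```

So "84 per cent of companies have a woman on the board" is entirely compatible with women holding less than half the share of board seats they hold in the FTSE comparison, which is why the newspaper's juxtaposition is meaningless.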
----<br />
(2) Zack Says: <br><br />
March 10th, 2006<br><br />
[http://zack.notsoevil.net/ Digital Home of Zack Stewart >> Puzzled]<br />
<br />
<blockquote>n = the original number of flowers in each vase.<br><br><br />
<br />
So after Kim adds 3 flowers to one vase it contains n+3 flowers. <br><br><br />
<br />
The new average is thus (n+n+n+3)/3 = (3n+3)/3 = n+1 flowers.<br><br><br />
<br />
So the special vase has (n+3) - (n+1) = 2 flowers more than the new average. <br><br><br />
<br />
All of the above is true for any n. <br><br><br />
<br />
I have to wonder what made them pick 6 as their answer - I would have gone for something interesting, like 5930912377. That way, when you turn the page over you at least get some fun shock value before you realize they're full of it. </blockquote></div>Mmartin