Chance News 10: Difference between revisions

From ChanceWiki
 
(54 intermediate revisions by 3 users not shown)
Line 93: Line 93:
Submitted by John Gavin.


==Expert political judgment: how good is it?==
[http://www.newyorker.com/critics/content/articles/051205crbo_books1 Everybody's an expert]<br>
The New Yorker, Dec. 5, 2005<br>
Louis Menand


Menand writes:
<blockquote> It is the somewhat gratifying lesson of Philip Tetlock's new book, "Expert Political Judgment: How good is it? How can we know?" (Princeton; $35), that people who make predictions their business--people who appear as experts on television, get quoted in newspaper articles, advise governments and businesses, and participate in punditry roundtables--are no better than the rest of us. When they are wrong, they're rarely held accountable, and they rarely admit it, either.  They insist that they were just off on timing, or blindsided by an improbable event, or almost right, or wrong for the right reasons. They have the same repertoire of self-justifications that everyone has, and are no more inclined than anyone else to revise their beliefs about the way the world works, or ought to work, just because they made a mistake... People who follow current events by reading the papers and newsmagazines regularly can guess what is likely to happen about as accurately as the specialist whom the papers quote. </blockquote>
 
Tetlock is a Berkeley psychologist, and his conclusions are based on a study that he started 20 years ago and ended in 2003. He chose two hundred and eighty-four people who made their living giving advice on political and economic issues. He asked them to estimate the probability that certain events would come to pass, both in areas in which they were considered experts and in areas in which they were not: would Gorbachev be ousted in a coup, would the United States go to war in the Persian Gulf, etc. By the end of the study in 2003 the experts had made 82,361 predictions.
 
For most of the questions the subjects were asked to rate the probability of three options: no change, more of something and less of something. In most cases the experts did less well than the monkey who would choose one of the three at random.
 
While this might be disappointing, Tetlock felt that he did learn why some people make better forecasts than others.  He explained this in terms of Isaiah Berlin's "The Hedgehog and the Fox", summed up by this quotation:
 
<blockquote>The fox knows many things, but the hedgehog knows one big thing.--Isaiah Berlin</blockquote>
Menand quotes from Tetlock's book:
 
<blockquote> Low scorers look like hedgehogs: thinkers who “know one big thing,” aggressively extend the explanatory reach of that one big thing into new domains, display bristly impatience with those who “do not get it,” and express considerable confidence that they are already pretty proficient forecasters, at least in the long term. High scorers look like foxes: thinkers who know many small things (tricks of their trade), are skeptical of grand schemes, see explanation and prediction not as deductive exercises but rather as exercises in flexible “ad hocery” that require stitching together diverse sources of information, and are rather diffident about their own forecasting prowess. <br><br>
A hedgehog is a person who sees international affairs to be ultimately determined by a single bottom-line force: balance-of-power considerations, or the clash of civilizations, or globalization and the spread of free markets. A hedgehog is the kind of person who holds a great-man theory of history, according to which the Cold War does not end if there is no Ronald Reagan. Or he or she might adhere to the “actor-dispensability thesis,” according to which Soviet Communism was doomed no matter what. Whatever it is, the big idea, and that idea alone, dictates the probable outcome of events. For the hedgehog, therefore, predictions that fail are only “off on timing,” or are “almost right,” derailed by an unforeseeable accident. There are always little swerves in the short run, but the long run irons them out.<br><br>
Foxes, on the other hand, don't see a single determining explanation in history. They tend, Tetlock says, “to see the world as a shifting mixture of self-fulfilling and self-negating prophecies: self-fulfilling ones in which success breeds success, and failure, failure but only up to a point, and then self-negating prophecies kick in as people recognize that things have gone too far.”</blockquote>
 
Evidently Tetlock reported that if a prediction needs two independent things to occur in order for it to be true, experts tend to find this more likely to happen than either of the independent events by itself.  He suggests that this explains the infamous Linda paradox.
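
As a reminder of why such a judgment conflicts with the rules of probability, here is a short sketch of the conjunction rule. The event names A and B are labels introduced here for illustration; they are not notation used by Tetlock or Menand.

```latex
% Conjunction rule: a joint event is never more probable than either component.
\[
  P(A \cap B) \;=\; P(A)\,P(B \mid A) \;\le\; P(A),
  \qquad
  P(A \cap B) \;=\; P(B)\,P(A \mid B) \;\le\; P(B).
\]
% If A and B are independent, this reduces to
\[
  P(A \cap B) \;=\; P(A)\,P(B) \;\le\; \min\{P(A),\, P(B)\}.
\]
```

For instance, two independent events that each have probability 0.8 occur together with probability 0.64.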
 
===Discussion===
Here is the Linda paradox.
 
Tversky and Kahneman tell the subjects in a psychological experiment that a certain Linda is 31 years old, single, majored in philosophy, was deeply concerned with issues of discrimination and social justice as a student etc. The subjects are then asked to provide a plausibility ordering of various propositions. It turns out that a large percentage finds it more plausible that Linda is a feminist and a bank teller than that she is a bank teller.
 
This has been interpreted to mean that people do not really understand probability. Do you agree with this?
 
Submitted by Laurie Snell


==The Beginner's Handbook of Dowsing==
Line 119: Line 142:


Submitted by Margaret Cibes
==Problems that arise from insufficient mortality==
[http://www.guardian.co.uk/g2/story/0,,1654529,00.html So, how long have we got?] by Tim Dowling, The Guardian (UK), 1st December 2005.<br>
[http://news.ft.com/cms/s/7a9b14ee-6ab4-11da-ba41-0000779e2340.html When old age becomes a risk factor], Jennifer Hughes and Norma Cohen, Financial Times.
Not only are we living longer, we're living longer than we'd ever imagined we would, and this, apparently, is nothing to smile about, according to this Guardian article.
[http://en.wikipedia.org/wiki/Actuary Actuaries] are the best people to answer the question "when am I going to die?".
These are mathematicians working for insurance companies, compiling statistics and supplying and interpreting the risk tables upon which the calculations of annuities, premiums, dividends and reserves are based.
For example, the [http://www.actuaries.org.uk/Display_Page.cgi?url=/library/cmi.xml Continuing Mortality Investigation] (CMI) is a voluntary UK body funded by insurance houses, who pool their data on death and dying in order to get an overall idea about the future of mortality.
The article contains an interview with Dave Grimshaw, from an actuarial firm called Barnett Waddingham,
who says:
<blockquote>
The fundamentals of pension planning both for companies and the state, the fundamentals of life insurance, the fundamentals of health provision, all depend on some sort of idea of how long people are going to live.
</blockquote>
He discusses the recent surprising changes in mortality rates, that is, in the proportion of people of a specific age in a given sample who die in a given year.
The CMI recently published figures that showed that pensioner mortality fell by 30% in just eight years - roughly double what they were predicting.
The tables show that of 10,000 males aged 65 in 1994, 181 could be expected to die within the year.
In 2002, that figure was 129.
For women the improvement was even more marked, from 110 to just 74.
There were further drops in mortality at age 75 (25%) and 85 (about 12%).
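
The quoted figures for age 65 can be checked against the headline fall of about 30% with a few lines of arithmetic. This is only a sketch: the function name <code>percent_fall</code> is made up for the illustration, and it assumes that the figures for women refer to the same years (1994 and 2002) as those for men.

```python
def percent_fall(before, after):
    """Relative fall from `before` to `after`, expressed as a percentage."""
    return 100.0 * (before - after) / before

# Expected deaths per 10,000 people aged 65, as quoted from the CMI tables.
male_fall = percent_fall(181, 129)
female_fall = percent_fall(110, 74)  # assumed to cover the same period

print(f"Males aged 65:   {male_fall:.1f}% fall")
print(f"Females aged 65: {female_fall:.1f}% fall")
```

Both results are close to the 30% figure quoted for pensioner mortality as a whole.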
When they issued a press release about it, CMI chairman Brian Ridsdale said:
<blockquote>
We're delighted to see that mortality is so much lower.
</blockquote>
However, what's good news for most of us may mean trouble for some insurers.
For companies that primarily sell life insurance, it's great news.
They get more premium payments and pay out later.
But it is creating problems for insurance companies that have pension liabilities.
The article goes into more detail about mortality issues.
Part of the reason for this change in mortality is something called the [http://en.wikipedia.org/wiki/Cohort_study Cohort Effect], whereby groups of people of a certain age show sharp falls in mortality rates that exceed those of both their predecessors and their successors.
It transpires that there is a particular UK group, those born around the year 1931, who have outstripped everyone else in terms of not dying.
The Financial Times article quotes Stephen Richard, a consultant specialising in longevity:
<blockquote>
data shows that while life expectancy for those born in 1931 initially showed improvements of 1 to 2 per cent per year, they are now improving at a rate of over 4 per cent a year
</blockquote>
Some research shows that a reduction in smoking could account for about a third of the drop.
In any case, people born later have - so far, anyway - not shown the same rates of improvement.
The unforeseen robustness of the 1931 cohort is illustrative of the problem facing the UK government: when the pension age was set at 65, a man who reached that age (and many never did) was expected to draw his pension for two to three years before shuffling obligingly off this mortal coil. A man who turned 65 in 2000 had a life expectancy of 86.
Not surprisingly, the Financial Times goes on to outline various financial instruments that might be used to mitigate longevity risk.
The Guardian article finishes with the actuary predicting an expected remaining lifetime of 48 years for the 42-year-old reporter. This pleases the reporter, who points out that an alternative actuarial table from the (UK) Government Actuary's Department only gives him another 36 years, so he is sticking with the CMI tables!
===Further reading===
[http://www.fenews.com/fen46/topics_act_analysis/topics-act-analysis.htm A matter of life and death], Mary Hardy, Financial Engineering News. This article covers the same topic from a more technical point of view and discusses various solutions that have been proposed to handle the uncertainty in estimated future mortality rates.
Submitted by John Gavin.
==Wikipedia and Britannica go head to head==
[http://www.nature.com/nature/journal/v438/n7070/full/438900a.html Internet encyclopaedias go head to head], Jim Giles, Nature 438, 900-901 (15 December 2005).<br>
[http://news.bbc.co.uk/2/hi/technology/4530930.stm Wikipedia survives research test], BBC News on-line, 15 December 2005.<br>
[http://www.boingboing.net/2005/12/14/britannica_averages_.html Britannica averages 3 bugs per entry; Wikipedia averages 4], Boingboing wiki, December 14, 2005.<br>
The science journal Nature claims that it is the first to conduct a blind, peer-reviewed test of certain entries in Wikipedia against the corresponding entries in Encyclopaedia Britannica.
The area of focus is their coverage of science.
The reviewers concluded that Britannica has a marginally lower error-rate than Wikipedia
but that there is not much to choose between the two in terms of overall accuracy.
[http://Wikipedia.org Wikipedia] is a free, rapidly growing online encyclopaedia with almost four million entries; the English-language version grew by 1,500 entries per day in October 2005.
Anyone can edit it, and this makes it controversial: if anyone can edit entries, how can users really know whether Wikipedia is as accurate as more traditional sources such as Encyclopaedia Britannica?
The Nature article highlights some examples, such as an entry
falsely suggesting that a former assistant to US Senator Robert Kennedy may have been involved in his assassination.
<!-- podcasting pioneer Adam Curry being accused of editing the entry on podcasting to remove references to competitors' work.-->
Writing in the online magazine TCS last year, former Britannica editor Robert McHenry said
<blockquote>
Opening up the editing process to all, regardless of expertise, means that reliability can never be ensured.
</blockquote>
In the study, entries were chosen from the websites of Wikipedia and Encyclopaedia Britannica on a range of scientific disciplines and sent to a relevant expert for peer review. Each reviewer examined the entry on a single subject from the two encyclopaedias and they were not told which article came from which encyclopaedia. A total of 42 usable reviews were returned out of 50 sent out and were then examined by Nature's news team.
The exercise revealed numerous errors in both encyclopaedias, but among the 42 entries tested, the difference in accuracy was not particularly great: the average science entry in Wikipedia contained around four inaccuracies; Britannica, about three.
So Nature's investigation suggests that Britannica's advantage may not be great, at least when it comes to science entries, although the inaccuracies in both sources suggest that some caution is always advisable.
Nature claims that only eight serious errors, such as misinterpretations of important concepts, were detected in the pairs of articles reviewed, four from each encyclopaedia. But their reviewers also found many factual errors, omissions or misleading statements: 162 and 123 in Wikipedia and Britannica, respectively.
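
The "around four" and "about three" inaccuracies per entry follow from the totals just quoted. The sketch below redoes that division; the variable names are chosen for illustration, and it assumes that the 162 and 123 errors were found across the same 42 reviewed pairs of entries.

```python
REVIEWED_ENTRIES = 42  # usable reviews returned to Nature

# Factual errors, omissions or misleading statements reported for each source.
error_counts = {"Wikipedia": 162, "Britannica": 123}

averages = {name: count / REVIEWED_ENTRIES for name, count in error_counts.items()}
for name, avg in averages.items():
    print(f"{name}: {avg:.2f} errors per entry")

ratio = error_counts["Wikipedia"] / error_counts["Britannica"]
print(f"Wikipedia/Britannica error ratio: {ratio:.2f}")
```

On these figures Wikipedia has roughly a third more reported errors per entry than Britannica.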
Tom Panelas, director of corporate communications at Britannica's headquarters in Chicago commented
<blockquote>
We have nothing against Wikipedia.
But it is not the case that errors creep in on an occasional basis or that a couple of articles are poorly written. There are lots of articles in that condition. They need a good editor.
</blockquote>
Several of Nature's reviewers also noted an undue prominence given to controversial scientific theories.
But Michael Twidale, an information scientist at the University of Illinois at Urbana-Champaign, says <blockquote>
Wikipedia's strongest suit is the speed at which it can be updated, a factor not considered by Nature's reviewers.
People will find it shocking to see how many errors there are in Britannica.
Print encyclopaedias are often set up as the gold standards of information quality against which the failings of faster or cheaper resources can be compared. These findings remind us that we have an 18-carat standard, not a 24-carat one.
</blockquote>
The Nature article also mentions a survey of more than 1,000 Nature authors, which found that although more than 70% had heard of Wikipedia and 17% of those consulted it on a weekly basis, fewer than 10% help to update it.
It claims that the steady trickle of scientists who have contributed to articles describe the experience as rewarding, if occasionally frustrating (see [http://www.nature.com/nature/journal/v438/n7070/box/438900a_BX1.html Challenges of being a Wikipedian]).
The co-founder of Wikipedia Jimmy Wales comments that, next year, he intends to introduce a stable version of each entry, once a specific quality threshold is reached.
===Questions===
* Is a sample size of 42 large enough to draw conclusions about the 4 million entries in Wikipedia, across 200 languages?
* Is it reasonable to extrapolate from a science-focussed test to all other categories?
* The article refers to various degrees of differences such as 'inaccuracies', 'serious errors' and 'factual errors, omissions or misleading statements'. How might results be adjusted to weight these different categories? And should different kinds of users care more about certain categories?
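
One rough way to approach the first question is to ask whether a 162 versus 123 split of errors could plausibly arise if both encyclopaedias had the same underlying error rate. The sketch below is an illustration under strong assumptions that are not in the Nature article: errors are treated as independent events, the paired entries are treated as comparable in length, and each of the 285 reported errors is treated as equally likely to fall in either source under the null hypothesis. The function name is hypothetical.

```python
from math import comb

def two_sided_binomial_p(k, n):
    """Exact two-sided p-value for k successes in n fair coin flips,
    obtained by doubling the smaller tail probability (capped at 1)."""
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2**n
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2**n
    return min(1.0, 2 * min(upper, lower))

wikipedia_errors = 162
britannica_errors = 123
total_errors = wikipedia_errors + britannica_errors

# Under the (assumed) null hypothesis, each error lands in Wikipedia with probability 1/2.
p_value = two_sided_binomial_p(wikipedia_errors, total_errors)
print(f"Two-sided p-value: {p_value:.3f}")
```

A small p-value here would say only that the split is unlikely under these simplifying assumptions. It says nothing about whether 42 science entries are representative of either encyclopaedia as a whole.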
===Further reading===
* The on-line [http://www.nature.com/nature/journal/v438/n7070/full/438900a.html Nature article] gives a list of related links.
* [http://www.nature.com/nature/journal/v438/n7070/box/438900a_BX1.html Challenges of being a Wikipedian]
* [http://tinyurl.com/8ydmh Wikipedia's bid to govern knowledge democracy], Michael Earl, Financial Times, December 18 2005.
Submitted by John Gavin.
==Mozart's musical game==
[http://www.npr.org/templates/story/story.php?storyId=5058307 The Mischievous Mozart]<br>
''NPR Performance Today'', Dec. 16, 2005<br>
Fred Child
January 27th, 2006, is the 250th anniversary of Mozart's birth. ''NPR Performance Today'' is running weekly commentaries on Mozart's works, and "The Mischievous Mozart" is one of them. Child describes a mischievous Mozart, and at the end of the program he discusses the famous Mozart Musical Dice game.
He describes the game as follows:
<blockquote>Mozart proposed a musical dice game. Mozart was fascinated by mathematics and puzzles and he wrote a game in which anyone could compose a minuet by rolling dice. This was something several composers around that time played with.
<br><br> Here's the idea. You write a number of short segments of music that are, in a particular way, interchangeable. Then anyone is able to put them in order by rolling dice.  And you don't have to have a lot of these to contribute almost infinite possibilities. Here's a complete game attributed to Mozart K.516F.<br><br>
This game that we are about to play has 176 short fragments that we put in order by rolling dice. But the number of possible combinations is something more than 100 quadrillion. There are more possible minuets here than there are stars in the sky.<br><br>
So how do we start?  Well, we roll the dice to select our first short segment of music. We have several different possibilities that can come up, one for every different roll of the dice.  If we rolled an 8 we would begin like this (the first measure is played). If we rolled a 4 we would start like this (the next measure is played).</blockquote>
Then Child constructs a minuet, which he says uses Mozart's K.516F.  He rolls the dice 14 times, resulting in the numbers 63771853896743, and remarks that the end of each phrase is always the same. You can hear the resulting minuet at the end of Child's discussion.
This is a typical media discussion of an interesting mathematics topic. It is enough to get you interested in the problem but leaves you with a lot of questions: Where did the 176 come from?  If there are 176 short fragments in the music played why were the dice only thrown 14 times?
We find a more understandable description of the mathematical version of Mozart's musical game in an article, [http://www.maa.org/mathland/mathtrek_8_21_01.html ''Mozart's Melody Machine''], that Ivars Peterson wrote for ''Science News''.
Peterson remarks:
<blockquote> Music publishing was a thriving trade during the latter part of the 18th century in Europe. Publishers vied with one another to print the works of the latest "hot" composer. Many of them looked for novel ways to entice new customers into their music shops.<br><br>
Many of these schemes involved using dice or other randomizers to select musical fragments from an array of choices. Composer Johann Philipp Kirnberger (1721-1783), a former pupil of Johann Sebastian Bach (1685-1750), suggested the use of dice for this purpose in his book The Ever-ready Composer of Polonaises and Minuets, published in 1757. About two decades later, Austrian composer Maximilian Stadler (1748-1833) put together a set of musical bars and tables for generating minuets and trios with the help of dice.<br><br>
One well-known example of such a scheme is the "Musikalisches Würfelspiel" (Musical Dice Game), first published in 1792 in Berlin. Attributed by the publisher to Wolfgang Amadeus Mozart (1756-1791), it appeared a year after the composer's death. <br><br>
The idea was to compose a 16-measure "waltz" by rolling dice to decide which measures to select from a large pool of choices. In the "Musikalisches Würfelspiel," the measures are numbered from 1 to 176, and the numbers are arranged in two charts, each consisting of 11 rows and eight columns. To select the first measure, a player would roll two dice, subtract 1 from the total, and look up the corresponding row in the first column of the first chart to determine the appropriate measure number. Subsequent rolls of the dice decide which measure to select from each of the sixteen columns to complete the melody.</blockquote>
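
The procedure Peterson describes can be sketched in a few lines of Python. This is only an illustration: the published charts are not reproduced here, so the function <code>build_placeholder_charts</code> simply numbers the 176 measures column by column, which is an assumption and not the layout of the 1792 edition. All function and variable names are hypothetical.

```python
import random

ROWS = 11             # two-dice totals 2..12, minus 1, give rows 1..11
COLS_PER_CHART = 8
CHARTS = 2            # 2 charts x 8 columns = 16 measures per waltz

def build_placeholder_charts():
    """Fill two 11x8 charts with the numbers 1..176, column by column.
    Placeholder layout only; the real charts arrange the numbers differently."""
    charts = []
    measure = 1
    for _ in range(CHARTS):
        chart = [[0] * COLS_PER_CHART for _ in range(ROWS)]
        for col in range(COLS_PER_CHART):
            for row in range(ROWS):
                chart[row][col] = measure
                measure += 1
        charts.append(chart)
    return charts

def compose_waltz(charts, rng):
    """Roll two dice per column and look up one measure number in each column."""
    waltz = []
    for chart in charts:
        for col in range(COLS_PER_CHART):
            total = rng.randint(1, 6) + rng.randint(1, 6)
            row = total - 1                      # row number 1..11
            waltz.append(chart[row - 1][col])    # convert to 0-based index
    return waltz

charts = build_placeholder_charts()
waltz = compose_waltz(charts, random.Random(2005))
print(waltz)
```

The sketch makes the counts in the description concrete: 16 pairs of dice are rolled, one per column, and each column offers 11 of the 176 measures.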
On the NPR website you are invited to make your own minuets by going to the [http://wwwhome.cs.utwente.nl/~zsofi//mozart/index.html website] of Ruttkay and Boskamp. Here you will find that the minuets have 16 measures, consistent with Peterson’s description of the game. You will also find an interesting discussion of the use of Mozart's musical game in a mathematics summer camp called "Fun in math", held in the Netherlands in 1995 and 1996 for 12- to 16-year-old "average" Dutch children. In addition you will find questions about the Mozart game used in the workshop and a movie of an orchestra, made up of those students who were also musicians, playing a random Mozart composition.
Alas, we also learn from Peterson’s article that, even when we understand the math, it may not agree with reality.  He writes:
<blockquote> Most scholars reject the publisher's claim that Mozart himself devised this particular scheme... . One Mozart manuscript actually includes what might be considered a musical game, though not played with dice. On both sides of the sheet, Mozart wrote down long strings of measures, grouped into two-bar melodies, each labeled with a letter of the alphabet and a number (1 or 2). However, other than supplying a "worked-out" example at the end of each page, he gave no instructions on how to proceed.<br><br>
Hideo Noguchi of Kobe, Japan, has tried to work out the game's rules. He speculates that Mozart's starting point was the name of an acquaintance, such as Francisca. The idea was to add "z" to the end of the name, rewrite the letters in alphabetical order, alternately assign the number 1 or 2 to each letter in succession (with certain refinements), then return to the original spelling: f1 r2 a1 n1 c2 i2 s1 c1 a1 z2. A player would then select the appropriate measures in the required order from the labeled groups to come up with a signature tune. <br><br>
Hideo Noguchi describes his findings in a paper posted [http://www.asahi-net.or.jp/~rb5h-ngc/e/k516f.htm here], which also includes facsimiles and transcriptions of the Mozart manuscript pages.</blockquote>
===Questions===
(1) Using the rules for Mozart's Musical Game described by Peterson, how many possible waltzes are there? Are there more than the number of stars?
(2) Are all waltzes equally likely to occur?  If not how could you modify the game to make them equally likely?
(3) Referring to the creation of a waltz as described above, Martin Gardner remarked, "If you fail to preserve it, it will be a waltz that will probably never be heard again." Why does he say that?
(4) Those who prefer to believe that Mozart rolled dice ask, "How could the random minuets (waltzes) made by the dice version sound so much like Mozart if he did not provide the data?" What do you think about this?
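
For questions (1) and (2), the following sketch may help in checking an answer. It assumes, following Peterson's description only, that each of the 16 columns offers 11 different measures and that the row is chosen by the total of two fair dice; if some columns of the published charts contain repeated measures, the true count would be smaller.

```python
from fractions import Fraction

ROWS = 11      # possible values of (two-dice total - 1)
COLUMNS = 16   # measures in one waltz

# Question (1): number of distinct dice sequences, one row choice per column.
total_waltzes = ROWS ** COLUMNS
print(f"Possible waltzes: {total_waltzes:.2e}")

# Question (2): two-dice totals are not equally likely.
ways = {t: 0 for t in range(2, 13)}
for a in range(1, 7):
    for b in range(1, 7):
        ways[a + b] += 1
prob = {t: Fraction(w, 36) for t, w in ways.items()}

most_likely = max(prob.values()) ** COLUMNS    # a total of 7 in every column
least_likely = min(prob.values()) ** COLUMNS   # a total of 2 or 12 in every column
print(f"Most likely waltz:  {float(most_likely):.2e}")
print(f"Least likely waltz: {float(least_likely):.2e}")
```

Choosing the row with a single fair 11-sided randomizer instead of two dice would be one way to make all waltzes equally likely.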
Suggested by Jim Strickler and submitted by Laurie Snell

Latest revision as of 08:34, 19 December 2005

Quotation

The weather man is never wrong. Suppose he says that there's an 80% chance of rain. If it rains, the 80% chance came up; if it doesn't, the 20% chance came up! - Saul Barron.

From: Statistical Quotations

Forsooth

Literary License

"'Four million ... heard it. Ten percent remember it. One percent of those matter. One percent of those do something about it. That's still' - he does the math - 'four people.'" From: _The Betrayal_, by Sabin Willett, NY: Villard (Random House), 1998.

Submitted by Margaret Cibes
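
Readers who want to check the character's math can do so with exact fractions. This is a minimal sketch; the variable names are made up for the illustration.

```python
from fractions import Fraction

heard = 4_000_000
remember = heard * Fraction(10, 100)   # ten percent remember it
matter = remember * Fraction(1, 100)   # one percent of those matter
act = matter * Fraction(1, 100)        # one percent of those do something

print(int(act))  # prints 40
```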

Logarithmic behavior as metaphor

For many years Ed Barbeau has edited a wonderful column in the College Mathematics Journal called Fallacies, Flaws, and Flimflam. In Barbeau's column in the November 2005 issue of the College Mathematics Journal, Norton Starr provides a contribution called "Logarithmic behaviour as metaphor". Norton provides examples from a wide variety of writers who say that something is growing logarithmically when they mean it is growing exponentially.

Norton says that he became interested in this when a convocation speaker at his college (Amherst) said:

As opposed to all other appetites which are stimulated by deprivation and satisfied by food, good education stimulates with plenty so that appetite for knowledge and understanding escalate logarithmically to insatiability.

Norton finds examples among faculty, newspapers, television and of course on the web. He writes:

Here are three examples of metaphorical growth from the New York Times, with the third suggesting an improved understanding on the part of this newspaper:

  • from a review of Harlow Shapley's autobiography: "if the autobiographer opts for a method he believes will grant him immortality without industry, his risks rise logarithmically."
  • from a story about corruption and drugs: "'The drug situation is a horror story, increasing logarithmically.'"
  • from a more recent story: "Street crime, fed by an explosion of drug abuse, has risen exponentially."
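
To see how different the two kinds of growth are, here is a small sketch that tabulates the natural logarithm and the exponential function at a few arbitrarily chosen values of t.

```python
import math

print(f"{'t':>4} {'log(t)':>10} {'exp(t)':>14}")
for t in (1, 2, 5, 10, 20):
    # log grows ever more slowly as t increases; exp grows ever faster.
    print(f"{t:>4} {math.log(t):>10.2f} {math.exp(t):>14.1f}")
```

As t goes from 1 to 20, log(t) stays below 3 while exp(t) passes four hundred million, which is presumably the kind of growth the writers quoted above had in mind.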

RSS Coincidence column

Norton Starr, who provides our RSSNews Forsooth items, told us that the RSSNews now has a Coincidence column. He sent us the coincidence story from the Nov 05 RSSNews and suggested some questions relating to this story.

This month's contribution is from Pam Warner of the University of Edinburgh Medical Statistics Unit who relates a story told to her by a colleague named Wilma during a recent morning coffee-break.

Wilma had been waiting in a queue for a check-out (in Edinburgh) and a little boy, accompanying the lady ahead of her in the queue, struck up a conversation with her. He asked her what her name was, and on being told that it was Wilma exclaimed that was the same name as his granny (the lady he was with).

Wilma then returned the compliment, asking him what his name was, and he said Kieran. She in turn commented that that was funny as she had a young grand-daughter (in London) whose name was Kiera.

By this time Kieran's grand-mother (the other Wilma) was involved in the proceedings, and chipped in 'Next thing you will be telling me is that Kiera's Mum's name is Pamela...', to which our Wilma could only reply astounded, 'It is!'

Questions

(1) What's the likelihood of two instances of a grandmother, her daughter (even though in this case it might be daughter-in-law) and her grandson having the same names, say in Britain?

(2) How does the above probability vary with the size of the population?

(3) How likely is it that such a pair of trios would encounter each other in person?

Pi in the News

“Expressed in digits, pi begins 3.14159…, and it runs on to an infinity of digits that never repeat. Though pi has been known for more than three thousand years, mathematicians have been unable to learn much about it. The digits show no predictable order or pattern. The Chudnovskys were hoping, very faintly, that their supercomputer might see one.” From: “Capturing the Unicorn,” by Richard Preston, in _The New Yorker_, April 11, 2005

Submitted by Margaret Cibes

Question

If you were trying to explain to your Uncle George what it would mean for the digits of pi to be a random sequence, what might you tell him?

Suggested by Laurie Snell.

How stocks share in soccer sorrows

How stocks share in soccer sorrows, Phillip Coggan, December 01, 2005, Financial Times
Sports Sentiment and Stock Returns by Alex Edmans, Diego Garcia and Oyvind Norli, Social Science Research Network. www.ssrn.com

Motivated by psychological evidence of a strong link between sporting outcomes and mood, a new study claims a statistically significant link between a nation's soccer team's results and the subsequent day's performance of that country's stock market, attributed to sudden changes in investor mood.

The study analysed about 1,200 football matches played across 39 nations, focusing on the FIFA World Cup or on Continental competitions such as the European Championship. Elimination from those competitions is, on average, associated with a stock market performance that is 38 basis points (slightly more than a third of a percentage point) worse than normal.

The Financial Times summarises the results by saying that the magnitude of the loss effect and its concentration in Western European countries with developed stock markets, suggests that investors would have obtained large excess returns by trading on these mood events. The effect seems to be strongest in small stocks, where local investor sentiment is most likely to be dominant. It seems to be unrelated to the potential economic effects - such as loss of merchandising revenue - that defeat might have.

The authors even suggest a profitable trading strategy:

One such strategy would be to short futures on both countries’ indices before an important match to exploit the asymmetry of the effect.

They also claim that the loss effect is stronger for more important games, robust to changes in estimation methodology and to the removal of outliers in the data. It seems that there is also a statistically significant stock market loss effect using cricket, rugby, ice hockey, and basketball games in countries where these sports are popular.

However, a similar positive effect was not found when teams won. That may be because sports fans have unrealistic expectations of their team's chances of success. For example, 86 per cent of fans thought England would beat Brazil in the 2002 World Cup quarter-final, even though Brazil were the world's top-ranked team and bookmakers assigned only a 42 per cent probability to an England victory. A loss is more crushing to sentiment than a win is helpful. Furthermore, success in one round still leaves the possibility of elimination in the next.

The Financial Times article goes on to say

The study seems to chime with other surveys which found that increases in heart attacks, murders, suicides and riots are associated with sporting defeat. The mood of the population is adversely affected by a loss, particularly for sports in which a large proportion of the population takes an interest.

Other academic work suggests that investors are not entirely rational and so can be affected by factors like the weather, holidays and even the annual switch to daylight-saving time in the US.

Submitted by John Gavin.

Expert political judgment: how good is it?

Everybody's an expert
The New Yorker, Dec. 5, 2005
Louis Menand

Menand writes:

It is the somewhat gratifying lesson of Philip Tetlock's new book, "Expert Political Judgment: How good is it? How can we know?" (Princeton; $35), that people who make predictions their business--people who appear as experts on television, get quoted in newspaper articles, advise governments and businesses, and participate in punditry roundtables--are no better than the rest of us. When they are wrong, they're rarely held accountable, and they rarely admit it, either. They insist that they were just off on timing, or blindsided by an improbable event, or almost right, or wrong for the right reasons. They have the same repertoire of self-justifications that everyone has, and are no more inclined than anyone else to revise their beliefs about the way the world works, or ought to work, just because they made a mistake... People who follow current events by reading the papers and newsmagazines regularly can guess what is likely to happen about as accurately as the specialist whom the papers quote.

Tetlock is a Berkeley psychologist, and his conclusions are based on a study that he started 20 years ago and ended in 2003. He chose two hundred and eighty-four people who made their living giving advice on political and economic issues. He asked them to estimate the probability that certain events would come to pass, both in areas in which they were considered experts and in areas in which they were not. For example: would Gorbachev be ousted in a coup? Would the United States go to war in the Persian Gulf? By the end of the study in 2003 the experts had made 82,361 predictions.

For most of the questions the subjects were asked to rate the probability of three options: no change, more of something and less of something. In most cases the experts did less well than a monkey choosing one of the three at random would have done.

While this might be disappointing, Tetlock felt that he did learn why some people make better forecasts than others. He explained this in terms of Isaiah Berlin's "The Hedgehog and the Fox", summed up by this quotation:

The fox knows many things, but the hedgehog knows one big thing.--Isaiah Berlin

Menand quotes from Tetlock's book:

Low scorers look like hedgehogs: thinkers who “know one big thing,” aggressively extend the explanatory reach of that one big thing into new domains, display bristly impatience with those who “do not get it,” and express considerable confidence that they are already pretty proficient forecasters, at least in the long term. High scorers look like foxes: thinkers who know many small things (tricks of their trade), are skeptical of grand schemes, see explanation and prediction not as deductive exercises but rather as exercises in flexible “ad hocery” that require stitching together diverse sources of information, and are rather diffident about their own forecasting prowess.

A hedgehog is a person who sees international affairs to be ultimately determined by a single bottom-line force: balance-of-power considerations, or the clash of civilizations, or globalization and the spread of free markets. A hedgehog is the kind of person who holds a great-man theory of history, according to which the Cold War does not end if there is no Ronald Reagan. Or he or she might adhere to the “actor-dispensability thesis,” according to which Soviet Communism was doomed no matter what. Whatever it is, the big idea, and that idea alone, dictates the probable outcome of events. For the hedgehog, therefore, predictions that fail are only “off on timing,” or are “almost right,” derailed by an unforeseeable accident. There are always little swerves in the short run, but the long run irons them out.

Foxes, on the other hand, don't see a single determining explanation in history. They tend, Tetlock says, “to see the world as a shifting mixture of self-fulfilling and self-negating prophecies: self-fulfilling ones in which success breeds success, and failure, failure but only up to a point, and then self-negating prophecies kick in as people recognize that things have gone too far.”

Evidently Tetlock reported that if a prediction needs two independent things to occur in order for it to be true, experts tend to find this more likely to happen than either of the independent events by themselves. He suggests that this explains the infamous Linda paradox.

Discussion

Here is the Linda paradox.

Tversky and Kahneman tell the subjects in a psychological experiment that a certain Linda is 31 years old, single, majored in philosophy, was deeply concerned with issues of discrimination and social justice as a student etc. The subjects are then asked to provide a plausibility ordering of various propositions. It turns out that a large percentage finds it more plausible that Linda is a feminist and a bank teller than that she is a bank teller.
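The conjunction rule at stake in the Linda example can be checked numerically. The following is a minimal sketch with made-up probabilities (they are illustrative assumptions, not figures from Tversky and Kahneman's experiment): whatever values are chosen, the probability of "bank teller and feminist" cannot exceed the probability of "bank teller".

```python
# Illustrative check of the conjunction rule: P(A and B) <= P(A).
# The numbers below are hypothetical, chosen only for illustration.

def conjunction_probability(p_a: float, p_b_given_a: float) -> float:
    """Return P(A and B) = P(A) * P(B | A)."""
    return p_a * p_b_given_a

p_teller = 0.05                 # assumed P(Linda is a bank teller)
p_feminist_given_teller = 0.90  # assumed P(feminist | bank teller)

p_both = conjunction_probability(p_teller, p_feminist_given_teller)
print(f"P(bank teller)              = {p_teller:.3f}")
print(f"P(bank teller and feminist) = {p_both:.3f}")
```

Because P(B | A) is at most 1, the product can never be larger than P(A), however strongly the description of Linda suggests that she is a feminist.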

This has been interpreted to mean that people do not really understand probability. Do you agree with this?

Submitted by Laurie Snell

The Beginner's Handbook of Dowsing

This book was written by Joseph Baum (NY: Crown Publishers, Inc., c. 1974). “Dowsing” refers to the use of a “divining rod” in search of water, ore, or oil.

Baum’s wife told the contributor that author Joe Baum, a commercial art director, was so successful in dowsing for water on their Massachusetts farm that local drillers often sought his help, and that he was 100 percent successful in locating large underground water sources.

In this book, Baum describes a test once administered to a group of oil dowsers. The test’s creator, a geologist, felt that if a dowser could find oil thousands of feet down he most certainly could spot it three feet away. The experiment involved 10 cigar boxes filled with sand, with 1 of the boxes containing a small bottle of oil buried in its sand. The boxes were shuffled and spread out on the floor. After each contestant dowsed, the boxes were reshuffled, and the dowsing repeated, until a contestant had dowsed 10 times. A dowser who correctly identified the box with oil all 10 times was guaranteed financial backing to sink a well. Some 50 dowsers tried it and the highest score was 3 successful dowsings out of 10 tries.

Questions

(1) Suppose that each contestant had experienced a long history of successful dowsing, resulting in a 90% chance of success on each trial. In this case, what would have been the probability of guessing correctly at most 3 times out of 10? Would you have expected these folks to have achieved a higher maximum score than 3 out of 10?

(2) Suppose that each contestant had had no experience with dowsing and had randomly guessed. What would have been the probability of guessing correctly at most 3 times out of 10? Would you have expected them to have done better than experienced dowsers?

(3) Was there anything about the test itself, or the assumption behind the test, which might have adversely affected the contestants’ ability to dowse successfully?
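The probabilities asked for in questions (1) and (2) are binomial tail probabilities. The following is a minimal sketch of how they might be computed, assuming the 10 trials are independent, that the "some 50 dowsers" were exactly 50, and that contestants are independent of one another. The helper name `binom_cdf` is illustrative.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

N_TRIALS = 10    # dowsing attempts per contestant (from the text)
N_DOWSERS = 50   # "some 50 dowsers" (treated here as exactly 50)

# Question (1): each trial succeeds with probability 0.9.
p_skilled = binom_cdf(3, N_TRIALS, 0.9)

# Question (2): pure guessing among 10 boxes, so 0.1 per trial.
p_guessing = binom_cdf(3, N_TRIALS, 0.1)

# Chance that none of the 50 contestants scores above 3,
# assuming contestants are independent of one another.
p_all_skilled = p_skilled ** N_DOWSERS
p_all_guessing = p_guessing ** N_DOWSERS

print(f"P(at most 3 of 10 | p = 0.9) = {p_skilled:.3g}")
print(f"P(at most 3 of 10 | p = 0.1) = {p_guessing:.3g}")
print(f"P(best of 50 is at most 3 | p = 0.9) = {p_all_skilled:.3g}")
print(f"P(best of 50 is at most 3 | p = 0.1) = {p_all_guessing:.3g}")
```

Comparing the last two printed values gives one way to judge which assumption, 90% skill or pure guessing, is more consistent with a best score of 3 out of 10.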

Submitted by Margaret Cibes

Problems that arise from insufficient mortality

So, how long have we got? by Tim Dowling, The Guardian (UK), 1st December 2005.
When old age becomes a risk factor, Jennifer Hughes and Norma Cohen, Financial Times.

Not only are we living longer, we're living longer than we'd ever imagined we would, and this, apparently, is nothing to smile about, according to this Guardian article.

Actuaries are the best people to answer the question "when am I going to die?". These are mathematicians working for insurance companies, compiling statistics and supplying and interpreting the risk tables upon which the calculations of annuities, premiums, dividends and reserves are based. For example, the Continuing Mortality Investigation (CMI) is a voluntary UK body funded by insurance houses, who pool their data on death and dying in order to get an overall idea about the future of mortality.

The article contains an interview with Dave Grimshaw, from an actuarial firm called Barnett Waddingham, who says:

The fundamentals of pension planning both for companies and the state, the fundamentals of life insurance, the fundamentals of health provision, all depend on some sort of idea of how long people are going to live.

He discusses the recent surprising changes in mortality rates, that is, the proportion of people of a specific age in a given sample who die in a given year. The CMI recently published figures that showed that pensioner mortality fell by 30% in just eight years - roughly double what they were predicting. The tables show that of 10,000 males aged 65 in 1994, 181 could be expected to die within the year. In 2002, that figure was 129. For women the improvement was even more marked, from 110 to just 74. There were further drops in mortality at age 75 (25%) and 85 (about 12%). When they issued a press release about it, CMI chairman Brian Ridsdale said:

We're delighted to see that mortality is so much lower.
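The CMI figures quoted above for age 65 can be checked with a few lines of arithmetic. This is a minimal sketch that treats the one-year mortality rate as deaths per 10,000 lives divided by 10,000. The dictionary layout and the function name `percent_fall` are illustrative.

```python
# Deaths per 10,000 lives aged 65, as quoted from the CMI tables.
deaths_per_10k = {
    "males":   {1994: 181, 2002: 129},
    "females": {1994: 110, 2002: 74},
}

def percent_fall(before: float, after: float) -> float:
    """Relative fall from `before` to `after`, as a percentage."""
    return 100.0 * (before - after) / before

for group, by_year in deaths_per_10k.items():
    rate_1994 = by_year[1994] / 10_000   # one-year mortality rate
    rate_2002 = by_year[2002] / 10_000
    fall = percent_fall(rate_1994, rate_2002)
    print(f"{group}: {rate_1994:.4f} -> {rate_2002:.4f}, fall of {fall:.1f}%")
```

Both falls come out close to the headline figure of 30% over eight years, with the fall for women somewhat larger than that for men.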

However, what's good news for most of us may mean trouble for some insurers. For companies that primarily sell life insurance, it's great news. They get more premium payments and pay out later. But it is creating problems for insurance companies that have pensions liabilities.

The article goes into more detail about mortality issues. Part of the reason for this change in mortality is something called the Cohort Effect, whereby groups of people of a certain age show sharp falls in mortality rates that exceed those of both their predecessors and their successors. It transpires that there is a particular UK group, those born around the year 1931, who have outstripped everyone else in terms of not dying. The Financial Times article quotes Stephen Richard, a consultant specialising in longevity:

data shows that while life expectancy for those born in 1931 initially showed improvements of 1 to 2 per cent per year, they are now improving at a rate of over 4 per cent a year

Some research shows that a reduction in smoking could account for about a third of the drop. In any case, people born later have - so far, anyway - not shown the same rates of improvement. The unforeseen robustness of the 1931 cohort is illustrative of the problem facing the UK government: when the pension age was set at 65, a man who reached that age (and many never did) was expected to draw his pension for two to three years before shuffling obligingly off this mortal coil. A man who turned 65 in 2000 had a life expectancy of 86.

Not surprisingly, the Financial Times goes on to outline various financial instruments that might be used to mitigate longevity risk.

The Guardian article finishes with the actuary predicting an expected remaining lifetime of 48 years for the 42-year-old reporter. This pleases the reporter, who points out that an alternative actuarial table from the (UK) Government Actuary's Department gives him only another 36 years, so he is sticking with the CMI tables!

Further reading

A matter of life and death, Mary Hardy, Financial Engineering News. This article covers the same topic from a more technical point of view and discusses various solutions that have been proposed to handle the uncertainty in estimated future mortality rates.

Submitted by John Gavin.

Wikipedia and Britannica go head to head

Internet encyclopaedias go head to head, Jim Giles, Nature 438, 900-901 (15 December 2005).
Wikipedia survives research test, BBC News on-line, 15 December 2005.
Britannica averages 3 bugs per entry; Wikipedia averages 4, Boingboing wiki, December 14, 2005.

The science journal Nature claims that it is the first to conduct a blind, peer-reviewed test of certain entries in Wikipedia against the corresponding entries in Encyclopaedia Britannica. The area of focus is their coverage of science. The reviewers concluded that Britannica has a marginally lower error rate than Wikipedia but that there is not much to choose between the two in terms of overall accuracy.

Wikipedia is a free, rapidly growing online encyclopaedia with almost four million entries; the English-language version grew by 1,500 entries per day in October 2005. Anyone can edit it, which makes it controversial: if anyone can edit entries, how can users know whether Wikipedia is as accurate as more traditional sources such as Encyclopaedia Britannica? The Nature article highlights some examples, such as an entry falsely suggesting that a former assistant to US Senator Robert Kennedy may have been involved in his assassination. Writing in the online magazine TCS last year, former Britannica editor Robert McHenry said:

Opening up the editing process to all, regardless of expertise, means that reliability can never be ensured.

In the study, entries were chosen from the websites of Wikipedia and Encyclopaedia Britannica on a range of scientific disciplines and sent to a relevant expert for peer review. Each reviewer examined the entry on a single subject from the two encyclopaedias and they were not told which article came from which encyclopaedia. A total of 42 usable reviews were returned out of 50 sent out and were then examined by Nature's news team.

The exercise revealed numerous errors in both encyclopaedias, but among the 42 entries tested, the difference in accuracy was not particularly great: the average science entry in Wikipedia contained around four inaccuracies; Britannica, about three. So Nature's investigation suggests that Britannica's advantage may not be great, at least when it comes to science entries, although the inaccuracies in both sources suggest that some caution is always advisable. Nature claims that only eight serious errors, such as misinterpretations of important concepts, were detected in the pairs of articles reviewed, four from each encyclopaedia. But their reviewers also found many factual errors, omissions or misleading statements: 162 and 123 in Wikipedia and Britannica, respectively.
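The "around four" and "about three" averages follow directly from the totals of 162 and 123 over 42 entries. A minimal sketch of the arithmetic, with illustrative variable names:

```python
N_ENTRIES = 42  # usable reviews, from the text

# Factual errors, omissions or misleading statements, from the text.
error_counts = {"Wikipedia": 162, "Britannica": 123}

errors_per_entry = {
    name: count / N_ENTRIES for name, count in error_counts.items()
}

for name, mean in errors_per_entry.items():
    print(f"{name}: {mean:.2f} errors per entry")

ratio = error_counts["Wikipedia"] / error_counts["Britannica"]
print(f"Wikipedia / Britannica error ratio: {ratio:.2f}")
```

Only the totals are reported here, not the error counts for individual entries, so this sketch cannot say how much the counts vary from entry to entry or whether the difference between the two averages is statistically significant.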

Tom Panelas, director of corporate communications at Britannica's headquarters in Chicago, commented:

We have nothing against Wikipedia. But it is not the case that errors creep in on an occasional basis or that a couple of articles are poorly written. There are lots of articles in that condition. They need a good editor.

Also several of Nature's reviewers noted an undue prominence given to controversial scientific theories.

But Michael Twidale, an information scientist at the University of Illinois at Urbana-Champaign, says

Wikipedia's strongest suit is the speed at which it can be updated, a factor not considered by Nature's reviewers. People will find it shocking to see how many errors there are in Britannica. Print encyclopaedias are often set up as the gold standards of information quality against which the failings of faster or cheaper resources can be compared. These findings remind us that we have an 18-carat standard, not a 24-carat one.

The Nature article also mentions a survey of more than 1,000 Nature authors, which found that although more than 70% had heard of Wikipedia and 17% of those consulted it on a weekly basis, fewer than 10% help to update it. It claims that the steady trickle of scientists who have contributed to articles describe the experience as rewarding, if occasionally frustrating (see Challenges of being a Wikipedian).

The co-founder of Wikipedia, Jimmy Wales, comments that, next year, he intends to introduce a stable version of each entry, once a specific quality threshold is reached.

Questions

  • Is a sample size of 42 large enough to draw conclusions about the 4 million entries in Wikipedia, across 200 languages?
  • Is it reasonable to extrapolate from a science focussed test to all other categories?
  • The article refers to various degrees of differences such as 'inaccuracies', 'serious errors' and 'factual errors, omissions or misleading statements'. How might results be adjusted to weight these different categories? And should different kinds of users care more about certain categories?

Submitted by John Gavin.

Mozart's musical game

The Mischievous Mozart
NPR Performance Today, Dec. 16, 2005
Fred Child

January 27th, 2006, is the 250th anniversary of Mozart's birth. NPR's Performance Today is airing weekly commentaries on Mozart's works. "The Mischievous Mozart" is one of these commentaries. Child describes a mischievous Mozart, and at the end of the program he discusses the famous Mozart Musical Dice game.

He describes the game as follows:

Mozart proposed a musical dice game. Mozart was fascinated by mathematics and puzzles and he wrote a game in which anyone could compose a minuet by rolling dice. This was something several composers around that time played with.



Here's the idea. You write a number of short segments of music that are, in a particular way, interchangeable. Then anyone is able to put them in order by rolling dice. And you don't have to have a lot of these to contribute almost infinite possibilities. Here's a complete game attributed to Mozart K.516F.

This game that we are about to play has 176 short fragments that we put in order by rolling dice. But the number of possible combinations is something more than 100 quadrillion. There are more possible minuets here than there are stars in the sky.

So how do we start? Well, we roll the dice to select our first short segment of music. We have several different possibilities that can come up, one for every different roll of the dice. If we rolled an 8 we would begin like this (the first measure is played). If we rolled a 4 we would start like this (the next measure is played).

Then Child constructs a minuet which he says is using Mozart's K.516F. He rolls the dice 14 times, resulting in the numbers 6, 3, 7, 7, 1, 8, 5, 3, 8, 9, 6, 7, 4, 3, and remarks that the end of each phrase is always the same. You can hear the resulting minuet at the end of Child's discussion.

This is a typical media discussion of an interesting mathematics topic. It is enough to get you interested in the problem but leaves you with a lot of questions: Where did the 176 come from? If there are 176 short fragments in the music played, why were the dice thrown only 14 times?

We find a more understandable description of the mathematical version of Mozart's musical game in an article, Mozart's Melody Machine, that Ivars Peterson wrote for Science News.

Peterson remarks:

Music publishing was a thriving trade during the latter part of the 18th century in Europe. Publishers vied with one another to print the works of the latest "hot" composer. Many of them looked for novel ways to entice new customers into their music shops.

Many of these schemes involved using dice or other randomizers to select musical fragments from an array of choices. Composer Johann Philipp Kirnberger (1721-1783), a former pupil of Johann Sebastian Bach (1685-1750), suggested the use of dice for this purpose in his book The Ever-ready Composer of Polonaises and Minuets, published in 1757. About two decades later, Austrian composer Maximilian Stadler (1748-1833) put together a set of musical bars and tables for generating minuets and trios with the help of dice.

One well-known example of such a scheme is the "Musikalisches Würfelspiel" (Musical Dice Game), first published in 1792 in Berlin. Attributed by the publisher to Wolfgang Amadeus Mozart (1756-1791), it appeared a year after the composer's death.

The idea was to compose a 16-measure "waltz" by rolling dice to decide which measures to select from a large pool of choices. In the "Musikalisches Würfelspiel," the measures are numbered from 1 to 176, and the numbers are arranged in two charts, each consisting of 11 rows and eight columns. To select the first measure, a player would roll two dice, subtract 1 from the total, and look up the corresponding row in the first column of the first chart to determine the appropriate measure number. Subsequent rolls of the dice decide which measure to select from each of the sixteen columns to complete the melody.
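The selection procedure Peterson describes can be sketched in a few lines. This is only an illustration: the real charts assign the measure numbers 1 to 176 in a particular order that is not reproduced in this article, so the table below is a placeholder that simply fills the numbers in sequence. The function names and the random seed are likewise illustrative.

```python
import random

ROWS, COLS_PER_CHART, N_CHARTS = 11, 8, 2
N_MEASURES = ROWS * COLS_PER_CHART * N_CHARTS  # 176

# PLACEHOLDER tables: the real charts use a different ordering of
# measure numbers that is not reproduced here. We simply fill
# 1..176 column by column so that every number appears once.
def placeholder_table():
    """Return 16 columns, each a list of 11 measure numbers."""
    numbers = iter(range(1, N_MEASURES + 1))
    return [[next(numbers) for _ in range(ROWS)]
            for _ in range(COLS_PER_CHART * N_CHARTS)]

def roll_waltz(table, rng):
    """Pick one measure per column using the sum of two dice minus 1."""
    waltz = []
    for column in table:
        total = rng.randint(1, 6) + rng.randint(1, 6)  # 2..12
        row = total - 1                                # 1..11
        waltz.append(column[row - 1])                  # 0-based index
    return waltz

table = placeholder_table()
waltz = roll_waltz(table, random.Random(2005))  # seeded for repeatability
print(waltz)
```

This also shows where the 176 comes from: 11 possible rows in each of 16 columns. With the real charts in place of the placeholder, the printed list would be the sequence of measure numbers to play.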

On the NPR website you are invited to make your own minuets by going to the website of Ruttkay and Boskamp. Here you will find that the minuets have 16 measures, consistent with Peterson’s description of the game. You will also find an interesting discussion of the use of Mozart's musical game in a mathematics summer camp called "Fun in math" for 12- to 16-year-old "average" Dutch children in 1995 and 1996 in the Netherlands. In addition you will find questions about the Mozart game used in the workshop and a movie of an orchestra, made up of those students who were also musicians, playing a random Mozart composition.

Alas, we also learn from Peterson’s article that, even when we understand the math, it may not agree with reality. He writes:

Most scholars reject the publisher's claim that Mozart himself devised this particular scheme... . One Mozart manuscript actually includes what might be considered a musical game, though not played with dice. On both sides of the sheet, Mozart wrote down long strings of measures, grouped into two-bar melodies, each labeled with a letter of the alphabet and a number (1 or 2). However, other than supplying a "worked-out" example at the end of each page, he gave no instructions on how to proceed.

Hideo Noguchi of Kobe, Japan, has tried to work out the game's rules. He speculates that Mozart's starting point was the name of an acquaintance, such as Francisca. The idea was to add "z" to the end of the name, rewrite the letters in alphabetical order, alternately assign the number 1 or 2 to each letter in succession (with certain refinements), then return to the original spelling: f1 r2 a1 n1 c2 i2 s1 c1 a1 z2. A player would then select the appropriate measures in the required order from the labeled groups to come up with a signature tune.

Hideo Noguchi describes his findings in a paper posted here, which also includes facsimiles and transcriptions of the Mozart manuscript pages.

Questions

(1) Using the rules for Mozart's Musical Game described by Peterson, how many possible waltzes are there? Are there more than the number of stars?

(2) Are all waltzes equally likely to occur? If not, how could you modify the game to make them equally likely?

(3) Referring to the creation of a waltz as described above, Martin Gardner remarked, "If you fail to preserve it, it will be a waltz that will probably never be heard again." Why does he say that?

(4) Those who prefer that Mozart rolled dice ask, "How could the random minuets (waltzes) made by the dice version sound so much like Mozart if he did not provide the data?" What do you think about this?
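For questions (1) and (2), the counting and the dice probabilities can be set up as follows. This is a minimal sketch that assumes every one of the 16 columns offers 11 musically distinct measures. The text does not confirm this, and Child's remark that the end of each phrase is always the same suggests some columns may offer fewer distinct choices, in which case the count would be smaller.

```python
from fractions import Fraction
from itertools import product

N_COLUMNS = 16   # measures per waltz (from Peterson's description)
N_ROWS = 11      # possible two-dice totals, 2 through 12

# Count of distinct waltzes if every column offers 11 different measures.
total_waltzes = N_ROWS ** N_COLUMNS

# Exact distribution of the sum of two fair dice.
ways = {}
for a, b in product(range(1, 7), repeat=2):
    ways[a + b] = ways.get(a + b, 0) + 1
prob = {total: Fraction(n, 36) for total, n in ways.items()}

# Most and least likely single waltz: all 7s versus all 2s (or all 12s).
p_most = prob[7] ** N_COLUMNS
p_least = prob[2] ** N_COLUMNS

print(f"possible waltzes: {total_waltzes:,}")
print(f"P(total = 7) = {prob[7]}, P(total = 2) = {prob[2]}")
print(f"most likely / least likely waltz: {p_most / p_least}")
```

Because the totals of two dice are not equally likely, neither are the waltzes; a randomizer that gives each of the 11 rows the same probability would be one way to change that.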

Suggested by Jim Strickler and submitted by Laurie Snell