Chance News 30

From ChanceWiki

Quotations

Forsooth

The following Forsooth was suggested by John Vokey.

Vitamin D can lower risk of death by 7 percent

Martin Mittelstaedt
Globe and Mail
September 11, 2007

This is an interesting example. In the article we read "people who were given a vitamin D supplement had a 7-per-cent lower risk of premature death than those who were not." and "It appears to be a life extender". So perhaps many of our forsooths come from the fact that copy editors write the headlines. (Laurie Snell)

Supplement may help treat gambling addiction

Minneapolis Star Tribune, September 12, 2007
Maura Lerner

There seems to be a never-ending supply of questionable statistical studies. Consider the recent Minneapolis Star Tribune account of September 12, 2007. A University of Minnesota researcher publishing in the September 15, 2007 issue of Biological Psychiatry treated "27 pathological gamblers for eight weeks" with an amino acid supplement, N-acetyl cysteine. "By the end, 60 percent said they had fewer urges to gamble." Of the 16 who reported a benefit, "13 remained in a follow-up study...five out the six on the supplement reported continued improvement, compared to two out of seven on a placebo." According to the researcher, "There does seem to be some effect, but you would need bigger numbers."

Here are the results of the follow-up study as seen by Minitab:

MTB > PTwo 6 5 7 2.

Test and CI for Two Proportions

Sample  X  N  Sample p
1       5  6  0.833333
2       2  7  0.285714

Difference = p (1) - p (2)

Estimate for difference: 0.547619

95% CI for difference: (0.0993797, 0.995858)

Test for difference = 0 (vs not = 0): Z = 2.39 P-Value = 0.017

Fisher's exact test: P-Value = 0.103

NOTE: The normal approximation may be inaccurate for small samples.
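The Minitab figures above can be checked by hand. The following is a minimal sketch using only the Python standard library; it assumes an unpooled standard error for the normal approximation (chosen because it reproduces the Z and confidence interval shown) and the usual two-sided rule for Fisher's exact test, which sums the probabilities of all tables no more likely than the observed one. The variable names are illustrative and are not taken from Minitab.

```python
from math import comb, erfc, sqrt

# Follow-up counts: 5 of 6 improved on the supplement, 2 of 7 on placebo.
x1, n1 = 5, 6
x2, n2 = 2, 7

p1, p2 = x1 / n1, x2 / n2
diff = p1 - p2

# Normal approximation with an unpooled standard error (an assumption that
# happens to match the output above).
se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z = diff / se
p_normal = erfc(abs(z) / sqrt(2))          # two-sided tail area
ci = (diff - 1.959964 * se, diff + 1.959964 * se)

# Fisher's exact test: with the margins fixed, the number of improvers in
# the supplement group follows a hypergeometric distribution.
improvers = x1 + x2
total = n1 + n2

def hypergeom_pmf(k):
    """Probability that k of the improvers fall in the supplement group."""
    return comb(n1, k) * comb(n2, improvers - k) / comb(total, improvers)

observed = hypergeom_pmf(x1)
k_min, k_max = max(0, improvers - n2), min(n1, improvers)
# Two-sided P-value: add up every table at most as likely as the observed one.
p_fisher = sum(
    hypergeom_pmf(k)
    for k in range(k_min, k_max + 1)
    if hypergeom_pmf(k) <= observed * (1 + 1e-9)
)

print(f"difference = {diff:.6f}, Z = {z:.2f}")
print(f"95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
print(f"normal-approximation P = {p_normal:.3f}")
print(f"Fisher exact P = {p_fisher:.3f}")
```

With only 13 subjects the two P-values fall on opposite sides of 0.05, which is the discrepancy the first discussion question asks about.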

Discussion

1. Assume you are a frequentist. Is the result statistically significant? Note the discrepancy between the exact P-value and the P-value from the normal approximation.

2. Assume you are a Bayesian and thus immune to P-values, whether exact or based on a normal approximation. Pick your priors and find the probability that there is a difference between the effect of the supplement and the effect of the placebo.

3. Aside from the choice of inference procedure, frequentist or Bayesian, what other flaws do you see in this study with regard to sample size and measurement of success?

4. Speculate as to why this study was reported in a Twin Cities newspaper and probably not elsewhere.

5. Speculate on what might happen if the 11 who did not respond to the supplement originally had been included in the follow-up study.

Submitted by Paul Alper

Excel 2007 arithmetic error

Calculation Issue Update, David Gainer, September 25, 2007.

The Excel blog at Microsoft usually talks about product enhancements and future plans for development, but on September 25 had to admit an embarrassing problem with basic arithmetic in Excel 2007. A series of calculations such as 77.1*850, 20.4*3,212.5, 10.2*6,425, and 5.1*12,850 that should normally produce a value of 65,535 instead produce a result of 100,000. The result is actually stored in an acceptable binary form, but the process of transforming the binary representation to a decimal form that is then displayed is flawed.

Although there are infinitely many rational numbers, a computer can represent only finitely many of them in its storage. For the rest, the computer has to choose a binary value that is reasonably close. Some numbers that have very simple representations in decimal, such as 0.1, do not have an exact representation in binary. Just as certain fractions, such as 1/3 and 1/7, have infinite expansions in decimal notation and have to be truncated, a larger set of fractions, including 1/10, have infinite expansions in binary, and storing them introduces a slight inaccuracy. Because of this inaccuracy, a product like 77.1*850 comes out not exactly 65,535 but slightly larger or smaller.
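The rounding described here is easy to see in any language that uses binary floating point. The sketch below uses Python, whose floats are IEEE 754 doubles; it assumes, without confirmation from the Microsoft post, that Excel's stored values behave the same way. It says nothing about Excel's display code, which is where the bug lives.

```python
from fractions import Fraction

# Fraction(float) recovers the exact rational value held in the double,
# so it can be compared with the intended decimal value 1/10.
stored_tenth = Fraction(0.1)
print(stored_tenth == Fraction(1, 10))   # False: 0.1 is only approximated
print(stored_tenth)                      # the exact ratio actually stored

# The four products quoted in the article, computed in double precision.
products = [77.1 * 850, 20.4 * 3212.5, 10.2 * 6425, 5.1 * 12850]
for value in products:
    # repr shows enough digits to tell the result apart from 65535.0
    print(repr(value), value == 65535)
```

Each product is within a tiny distance of 65,535 without being equal to it, which is the kind of stored value the display routine then has to convert to decimal.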

According to the Microsoft blog, there are six binary numbers that lie between the decimal values of 65,534.99999999995 and 65,535 that are not displayed properly in Excel 2007. Another six binary numbers that lie between 65,535.99999999995 and 65,536 also have problems. You can't enter these numbers directly in Excel, because Excel will round any directly entered values to 15 significant digits.
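The count of six in each range can be checked by walking through adjacent doubles. This is a sketch under the assumption that the affected values are ordinary IEEE 754 doubles; `doubles_strictly_between` is a hypothetical helper written for this illustration, and `math.nextafter` requires Python 3.9 or later. The lower bound is kept as an exact fraction because the decimal 65,534.99999999995 is not itself a double, and rounding it first would shift the count.

```python
import math
from fractions import Fraction

def doubles_strictly_between(lower_decimal: str, upper: float) -> int:
    """Count doubles x with lower_decimal < x < upper, comparing exactly."""
    lower = Fraction(lower_decimal)            # exact decimal bound
    count = 0
    x = math.nextafter(upper, -math.inf)       # largest double below upper
    while Fraction(x) > lower:                 # exact rational comparison
        count += 1
        x = math.nextafter(x, -math.inf)       # step down one double
    return count

below_65535 = doubles_strictly_between("65534.99999999995", 65535.0)
below_65536 = doubles_strictly_between("65535.99999999995", 65536.0)
print(below_65535, below_65536)
```

Both ranges contain the same number of doubles because the spacing between adjacent doubles is the same everywhere from 32,768 up to 65,536.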

It's probably not a coincidence that these numbers are close to 2^16. These values would have long strings of consecutive 1's in the binary representation. An entry on the Wolfram Blog, "Arithmetic is Hard--To Get Right" by Mark Sofroniou, speculates that there is a problem with propagation of carries.
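The long runs of 1's can be inspected directly. The sketch below prints the 52 stored fraction bits of 65,535 and of the double immediately below it, again assuming standard IEEE 754 doubles; `mantissa_bits` and `longest_run_of_ones` are hypothetical helpers for this illustration, and `math.nextafter` needs Python 3.9 or later. It shows the bit patterns only and does not attempt to reproduce the carry problem that the Wolfram entry speculates about.

```python
import math
import struct

def mantissa_bits(x: float) -> str:
    """Return the 52 stored fraction bits of an IEEE 754 double."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))
    return format(raw & ((1 << 52) - 1), "052b")

def longest_run_of_ones(bits: str) -> int:
    """Length of the longest block of consecutive '1' characters."""
    return max(len(run) for run in bits.split("0"))

just_below = math.nextafter(65535.0, 0.0)   # largest double below 65,535
for x in (65535.0, just_below):
    bits = mantissa_bits(x)
    print(x.hex(), bits, longest_run_of_ones(bits))
```

Stepping down by a single double turns the trailing zeros of 65,535 into a much longer block of 1's, which is the sort of pattern where a carry has to ripple through many bits.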

This bug only affects values close to 65,535 and 65,536 when they are displayed as a final result. Intermediate calculations that produce one of these unfortunate 12 numbers are unaffected because it is the process of converting the binary representation to a decimal form for display that is flawed.

The bug also does not appear to affect earlier versions of Excel. This error is reminiscent of the Pentium FDIV bug.

Questions

1. According to the Microsoft blog, there are 9.2*10^18 possible binary values, and only 12 of them are affected by this bug. Would it be safe to say that the probability that any individual would encounter this bug is 12 / (9.2*10^18) = 1.3*10^-18?

2. It is impossible to test every possible arithmetic calculation in a computer system. How would you design a testing system that evaluated a representative sample of possible arithmetic calculations?

Submitted by Steve Simon
