





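 In my March 9 diary entry I took up item 2 of the American Statistical Association's statement. "P-values do not measure the ... probability that the data were produced by random chance alone" is a statement that gives one pause at first glance, but the correct interpretation is that a p-value merely computes "the rate at which the observed value occurs, given that the null hypothesis is true."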

 In connection with this, a family member let me know that Retraction Watch has published this article. It contains an interesting exchange related to the point I took up on March 9.
2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.

Retraction Watch: Some of the principles seem straightforward, but I was curious about #2: I often hear people describe the purpose of a p value as a way to estimate the probability the data were produced by random chance alone. Why is that a false belief?

Ron Wasserstein: Let’s think about what that statement would mean for a simplistic example. Suppose a new treatment for a serious disease is alleged to work better than the current treatment. We test the claim by matching 5 pairs of similarly ill patients and randomly assigning one to the current and one to the new treatment in each pair. The null hypothesis is that the new treatment and the old each have a 50-50 chance of producing the better outcome for any pair. If that’s true, the probability the new treatment will win for all five pairs is (1/2)^5 = 1/32, or about 0.03. If the data show that the new treatment does produce a better outcome for all 5 pairs, the p-value is 0.03. It represents the probability of that result, under the assumption that the new and old treatments are equally likely to win. It is not the probability the new treatment and the old treatment are equally likely to win.

This is perhaps subtle, but it is not quibbling.  It is a most basic logical fallacy to conclude something is true that you had to assume to be true in order to reach that conclusion.  If you fall for that fallacy, then you will conclude there is only a 3% chance that the treatments are equally likely to produce the better outcome, and assign a 97% chance that the new treatment is better. You will have committed, as Vizzini says in “The Princess Bride,” a classic (and serious) blunder.
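
The arithmetic in Wasserstein's example can be checked with a short Python sketch. The first function reproduces the p-value of 1/32 as an exact one-sided sign-test tail. The second applies Bayes' rule to show that the probability of the null hypothesis given the data is a different quantity. Note that the prior (0.5 or 0.9) and the alternative win rate (0.8) are purely hypothetical values chosen for illustration; they do not appear in the interview, and the function names are my own.

```python
from fractions import Fraction
from math import comb


def sign_test_p_value(wins: int, pairs: int) -> Fraction:
    """One-sided p-value: P(X >= wins) for X ~ Binomial(pairs, 1/2)."""
    tail = sum(comb(pairs, k) for k in range(wins, pairs + 1))
    return Fraction(tail, 2 ** pairs)


def posterior_null(p_data_null: float, p_data_alt: float, prior_null: float) -> float:
    """Bayes' rule: P(null | data) for a simple null vs. a simple alternative."""
    numerator = p_data_null * prior_null
    denominator = numerator + p_data_alt * (1.0 - prior_null)
    return numerator / denominator


# The new treatment wins all 5 of 5 pairs.
p_value = sign_test_p_value(5, 5)
print(p_value)  # 1/32

# Hypothetical alternative: the new treatment wins each pair with probability 0.8.
p_data_alt = 0.8 ** 5

# P(null | data) depends on the assumed prior, so it is not the p-value.
for prior in (0.5, 0.9):  # hypothetical priors
    print(prior, posterior_null(float(p_value), p_data_alt, prior))
```

Under these assumed inputs the posterior probability of the null changes with the prior and does not match the p-value, which is the point of principle 2: the p-value is computed under the null and says nothing by itself about how probable the null is.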


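 With a two-sided test, one can say "the difference between A and B was significant," but one cannot say "B was significantly larger than A." Concluding a direction such as A > B causes no practical problem (近藤・安藤, 1967, p.16), but it must be understood that this conclusion is not a result of the test; it is a result of estimation based on confidence limits.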
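
The distinction between the two-sided test and the confidence limits can be illustrated with a minimal Python sketch. To stay within the standard library it uses a z-test with a known standard deviation, which is a simplifying assumption of mine; the data, the value of sigma, and the function name are all hypothetical. The two-sided p-value only says whether the mean difference departs from zero, while the direction is read from the confidence interval.

```python
from math import sqrt
from statistics import NormalDist, mean


def two_sided_z_test(diffs, sigma, alpha=0.05):
    """Two-sided z-test of mean(diffs) == 0 with known sigma, plus a (1 - alpha) CI."""
    n = len(diffs)
    se = sigma / sqrt(n)          # standard error of the mean difference
    d_bar = mean(diffs)
    z = d_bar / se
    std_normal = NormalDist()
    # Two-sided p-value: both tails are counted, so it carries no direction.
    p_two_sided = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    # Confidence limits for the mean difference.
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    ci = (d_bar - z_crit * se, d_bar + z_crit * se)
    return p_two_sided, ci


# Hypothetical paired differences (B minus A); sigma = 1.0 is assumed known.
diffs = [1.2, 0.8, 1.5, 0.3, 1.1, 0.9, 1.4, 0.6]
p, (lower, upper) = two_sided_z_test(diffs, sigma=1.0)

print("two-sided p:", p)                 # tells us only that A and B differ
print("95% CI for B - A:", lower, upper)  # both limits above 0 suggest B > A
```

With these made-up numbers the test rejects "no difference," and it is the interval lying entirely above zero, an estimation result, that supports the statement about which of the two is larger.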

