[Note for bibliographic reference: Melberg, Hans O. (1997), What should we
believe? A reflection on the flawed use of traditional hypothesis-testing http://www.oocities.org/hmelberg/papers/970201.htm]
What should we believe?
A reflection on the flawed use of traditional hypothesis-testing
by Hans O. Melberg
Introduction: An example
I recently read an article in the American Economic Review in which the authors
asked a sample of Russians and Americans a number of questions about their attitudes
toward market transactions ("Popular attitudes toward free markets" by R. J.
Shiller, M. Boycko and V. Korobov, AER, 81 (1991), pp. 385-400). A typical example
is the following:
"A small merchant company buys vegetables from some rural people, brings the
vegetables to the city and sells them, making from this a large profit. The company
honestly and openly tells the rural people what it is doing, and these people freely sell
the company the vegetables at the agreed price. Is this behaviour of the company, making
large profit using the rural people, acceptable from a moral point of view?" (p.
390).
In Russia 49% answered "yes", compared to 59% of the Americans, i.e. there was
a difference of 10 percentage points. However, with a t-statistic of 0.52 and 218
observations this difference is not statistically significant. The theory of traditional
statistical hypothesis testing tells us that we cannot reject the null hypothesis that the
Russians and the Americans have the same attitudes. In fact, based on a number of similar
questions with roughly similar responses - different but often statistically insignificant -
the authors of the article conclude that "the reported evidence suggests that there is
actually little ground to believe that the Soviets are characteristically more hostile
towards free-market prices." (p. 390)
My immediate reflection was that this cannot be the correct way to form beliefs. There is
a choice between two mutually exclusive alternatives. First: Russians and Americans are
similar. Second: Russians and Americans are not similar. Assume you have a survey in which
there is a difference of opinion between Russians and Americans, but this difference is
not always statistically significant. The question is then what you should believe.
It seems to me that you should believe exactly what the survey says, i.e. you should
believe that the Russians are different from the Americans. Moreover, you should hold
this belief with a strength which corresponds to the significance of the results, i.e. in
this case a moderately strong belief. (I base this on the following t-statistics reported in
the article: -0.89, -0.71, 0.06, -3.71, 0.29, 0.52, -1.28, -2.07 and more of the same.) It
seems wrong to accept the first belief (similarity) simply because you cannot show with
95% confidence that the second belief (difference) is correct.
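To make the idea of degrees concrete, here is a minimal sketch in Python that converts the
t-statistics quoted above into two-sided p-values. It is only an illustration, not the
method used in the article: it assumes a standard normal approximation to the t-distribution
(reasonable with a couple of hundred observations), and the function name two_sided_p is
my own.

```python
import math

# t-statistics quoted above from the article (the list there continues
# with "more of the same", so this is not the full set).
T_STATS = [-0.89, -0.71, 0.06, -3.71, 0.29, 0.52, -1.28, -2.07]


def two_sided_p(t):
    """Two-sided p-value, assuming a standard normal approximation."""
    # erfc(|t| / sqrt(2)) equals 2 * (1 - Phi(|t|)) for the standard normal CDF Phi.
    return math.erfc(abs(t) / math.sqrt(2))


p_values = [two_sided_p(t) for t in T_STATS]

for t, p in zip(T_STATS, p_values):
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"t = {t:6.2f}   p = {p:.4f}   ({verdict} at the 5% level)")
```

Under this approximation the dichotomous verdict labels six of the eight results "not
significant", while the p-values themselves spread over a wide range, which is exactly the
information the yes/no verdict throws away. The p-value is of course only a rough gauge of
how strongly the data speak against the null hypothesis, not the probability that either
belief is true.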
More generally
The above mistake is not isolated. Again and again I encounter examples of people who
dismiss a result because it is not statistically significant - or worse, who accept a
belief because the null hypothesis cannot be rejected. For example, in his book Rational
Expectations (Cambridge University Press, 1996, 2nd edition) Steven Sheffrin writes
about a number of articles which test the validity of rational expectations (see pp. 17, 21,
52-54, 142-3, 156). Often these articles conclude that rational expectations could not be
rejected and hence that we should believe in it. This is to commit two mistakes. The first
is to make what you want to establish the null hypothesis when it should be the alternative
hypothesis. The second is to draw a dichotomous distinction between significant and
insignificant, ignoring the fact that the strength of the evidence for a hypothesis is a
question of degree.
In the above example one might debate whether it is a mistake to make rationality the
null hypothesis. One may argue that parsimony and general considerations about the
purposive nature of humans imply that rationality should be a privileged assumption.
There are also issues which I have not discussed, such as the formulation of the
hypotheses. In the case of Russians vs. Americans the distinction is relatively clear:
Either they are different or they are not. In this case, the less likely one position is,
the more likely the other is. However, in the case of rational expectations it is
less obvious that if adaptive expectations is untrue, then rational expectations is true.
There are more than two ways of forming expectations and it might be that regressive
expectations - not rational expectations - are true when adaptive expectations are shown
to be inadequate to explain the data. In any case, the answer is one of degree - not of
absolute truth.
In conclusion
The mistake pointed out here is not really one inherent in statistical theory, since we are
always advised - even in introductory textbooks - that failure to reject the null hypothesis
does not imply acceptance of the null hypothesis. Moreover, we also learn that
statistical significance is a purely technical term which should not be confused with
economic significance. The problem is the prevailing unreflective use of the theory. It is
as if 95% were a mystical and sacred number below which nothing counts and above which
everything is true. This position is clearly incorrect.
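The arbitrariness of the cutoff can be illustrated with a small sketch in Python. The two
t-statistics below are hypothetical values that I have chosen to straddle the conventional
1.96 threshold; they do not come from any of the studies discussed. As an assumption, the
p-values use a standard normal approximation, and the function name two_sided_p is my own.

```python
import math


def two_sided_p(t):
    """Two-sided p-value, assuming a standard normal approximation."""
    return math.erfc(abs(t) / math.sqrt(2))


# Hypothetical t-statistics just below and just above the usual 1.96 cutoff.
t_just_below, t_just_above = 1.95, 1.97

p_below = two_sided_p(t_just_below)
p_above = two_sided_p(t_just_above)

for t, p in [(t_just_below, p_below), (t_just_above, p_above)]:
    verdict = "reject" if p < 0.05 else "do not reject"
    print(f"t = {t:.2f}   p = {p:.4f}   -> {verdict} the null at the 5% level")
```

The two results carry almost the same weight of evidence, yet the 5% rule gives them
opposite verdicts.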
Note: If you want to read more about the misunderstood concept of statistical
significance, there is a good article by Deirdre McCloskey and Stephen T. Ziliak
("The Standard Error of Regressions") in Journal of Economic Literature
34 (1996), pp. 97-114.