Talk:Likelihood principle
The likelihood link ( http://www.cimat.mx/reportes/enlinea/D-99-10.html ) given at the end returns a 404 error. -- 20050120 03:15.
Hello. Recently the following statement was added -- "By contrast, a likelihood-ratio test is based on the principle." This is not clear to me -- while forming a likelihood ratio is entirely consistent with the likelihood principle, appealing to the usual logic of null-hypothesis tests (rejecting the null hypothesis if the LR is too small) is not. A LR test appears to be similar to other null-hypothesis tests in that events that didn't happen have an effect on the inference, thus it appears to be inconsistent with the likelihood principle. -- I'm inclined to remove this new assertion; perhaps someone would like to argue in its favor? Happy editing, Wile E. Heresiarch 15:05, 18 Jun 2004 (UTC)
- Being a Bayesian at heart, I don't personally like standard likelihood-ratio tests any more than I like maximum likelihood as a method, but the argument might go like this: the evidence may point more to the null hypothesis or more to the alternative hypothesis. The degree to which the evidence points at one hypothesis rather than another is (thanks to the likelihood principle) expressed in the likelihood ratio. Therefore it makes sense to accept the null hypothesis if the likelihood ratio is "high enough" and to reject it if not. The value of "high enough" is a matter of choice; one approach might be to use 1 as the critical value, but for a Bayesian looking at point hypotheses the figure would best be a combination of the (inverse) ratio of the priors and the relative costs of Type I errors and Type II errors. --Henrygb 22:15, 23 Jun 2004 (UTC)
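For concreteness, here is a minimal sketch of the decision rule Henrygb describes (not from the discussion itself; the hypotheses, priors, and costs below are all illustrative numbers):

```python
# A minimal sketch of the decision rule described above, for two point
# hypotheses a (null) and b (alternative). All numbers are illustrative.
# Only the likelihood of the data actually observed enters the rule.
from scipy.stats import binom

x, n = 7, 10                 # observed: 7 successes in 10 trials
p_a, p_b = 0.5, 0.8          # point hypotheses about the success probability

# Likelihood ratio of a against b, computed from the observed data alone.
lr = binom.pmf(x, n, p_a) / binom.pmf(x, n, p_b)

# Choosing the hypothesis with the smaller posterior expected loss gives:
#   reject a  iff  lr < kappa,  where
#   kappa = (prior_b / prior_a) * (cost_II / cost_I),
# i.e. the inverse ratio of the priors times the relative costs of
# Type II and Type I errors -- the combination described above.
prior_a, prior_b = 0.5, 0.5  # illustrative prior probabilities
cost_I, cost_II = 1.0, 1.0   # illustrative error costs
kappa = (prior_b / prior_a) * (cost_II / cost_I)

print(f"LR = {lr:.3f}, kappa = {kappa:.3f}:", "reject a" if lr < kappa else "keep a")
```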
- Henry, I went ahead and removed "By contrast, a likelihood-ratio test can be based on the principle." from the article. A standard LR test, as described for example in the likelihood-ratio test article, does involve unrealized events and so it is not consistent with the likelihood principle. It might be possible to construct an unconventional LR test as described above, but that's not what is generally understood by the term, so I think it's beside the point. Regards & happy editing, Wile E. Heresiarch 14:20, 4 Aug 2004 (UTC)
- I think you need to consider what you are saying: "[the likelihood ratio] is the degree to which the observation x supports parameter value or hypothesis a against b. If this ratio is 1, the evidence is indifferent, and if greater or less than 1, the evidence supports a against b or vice versa." But you think that this does not provide any justification for a likelihood ratio test which in effect says: "If the likelihood ratio is less than some value κ, then we can decide to prefer b to a." I find that very odd; I suspect that in fact you object to how frequentists calculate κ but that is not the point about likelihood ratio tests in general. --Henrygb 17:33, 13 Aug 2004 (UTC)
- I'm willing to consider a compromise of the form "The conventional likelihood-ratio test is not consistent with the likelihood principle, although there is an unconventional LR test which is". That would make it necessary to explain just what an unconventional LR test is, which might be worthwhile. Comments? Wile E. Heresiarch 02:17, 15 Aug 2004 (UTC)
I've largely rewritten the article. It still needs work, in particular, it needs some non-Bayesian argument under "Arguments in favor of the likelihood principle". I've tried to clarify the article by separating the general principle from particular applications. It could also use some links to topics of inference in general, maybe Hume, Popper, epistemology etc if we want to get wonky about it. Wile E. Heresiarch 17:51, 2 Jan 2004 (UTC)
The remainder of the talk page here could probably be archived under a suitable title.
In my opinion, it is not true that, if two designs produce proportional likelihood functions, one should make identical inferences about a parameter from the data irrespective of the design that generated them (the likelihood principle, LP).
The situation is usually illustrated by means of the following well-known example. Consider a sequence of independent Bernoulli trials in which there is a constant probability of success p for each trial. The observation of x successes on n trials could arise in two ways: either by taking n trials yielding x successes, or by sampling until x successes occur, which happens to require n trials. According to the LP, the distinction is irrelevant. In fact, the likelihood is proportional to the same expression in each case, and the inferences about p would be the same.
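A quick numerical illustration of this proportionality (the code and grid values are illustrative, using scipy's conventions for the two distributions):

```python
# Numerical illustration of the example above: the binomial likelihood
# (n = 10 fixed) and the negative-binomial likelihood (x = 7 fixed) differ
# only by a constant factor, so their ratio does not depend on p.
# Note: scipy's nbinom counts the failures before the x-th success.
import numpy as np
from scipy.stats import binom, nbinom

x, n = 7, 10
p_grid = np.linspace(0.05, 0.95, 10)

lik_fixed_n = binom.pmf(x, n, p_grid)        # P(x successes in n trials | p)
lik_fixed_x = nbinom.pmf(n - x, x, p_grid)   # P(n - x failures before the x-th success | p)

print(lik_fixed_n / lik_fixed_x)             # constant for every p: C(10,7)/C(9,6) = 120/84
```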
Nevertheless, this point is questionable.
In particular, following the logical approach, the probability of a hypothesis h is conditional upon, or relative to, given evidence (cf. Carnap, 1962, p. 31). Quoting Carnap's own words, "the omission of any reference to evidence is often harmless". That means that probability is conditional on what is known. Now, apart from other information, the design d is actually known. Therefore, the evidence (e) comprises not only what is known to the statistician before the survey is performed (e*), but also the piece of information about d. Suppose now that i (which stands for information) is our experimental observation and h is one of the competing hypotheses; we can then use the premise above to formulate the probability of i correctly as follows:
(1) p(i|h, e*, d)
Notice that this probability is not defined without a reference to d. Thus, the probability of x successes in n Bernoulli trials differs according to whether n or x is fixed before the experiment is performed. That is, the design always enters into the inference through its occurrence in the probability of i.
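To make this concrete (a standard calculation, added here for illustration): with x successes in n trials,

p(i|h, e*, "n fixed") = C(n, x) p^x (1-p)^(n-x), whereas
p(i|h, e*, "x fixed") = C(n-1, x-1) p^x (1-p)^(n-x),

since under inverse sampling the last trial must be a success. For x = 7 and n = 10 the constants are 120 and 84 respectively, so the probability of i does indeed change with d, even though the factor involving p is identical.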
- So far so good. Note that p(i|h, e*, d) immediately simplifies to p(i|h, e*). Why? Because asserting that p(i|h, e*, d) != p(i|h, e*) is equivalent to asserting that p(d|i, h, e*) != p(d|h, e*) -- that is, knowing the experimental outcome must tell you something about the design. That's not so: I tell you that I tossed a coin 10 times and got 7 heads. Do you have any reason to believe one way or the other that I resolved to toss 10 times exactly, or to toss until getting 7 heads? No, you don't. Therefore p(i|h, e*, d) = p(i|h, e*), and the computation of p(h|i, e*) goes through as usual. Wile E. Heresiarch 17:51, 2 Jan 2004 (UTC)
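A small numerical check of this argument, under the assumption that i is the complete toss sequence rather than just the counts (the sequence and the prior over designs below are illustrative):

```python
# Numerical check of the argument above, assuming i is the full toss sequence.
# Under the "n = 10 fixed" design, any particular sequence has probability
# p^heads * (1-p)^tails. Under the "toss until x = 7 heads" design the same
# sequence is possible (it ends in a head, with 6 heads among the first 9
# tosses) and has the same probability. So with any prior over the two
# designs, p(d|i, h, e*) = p(d|h, e*): the outcome says nothing about d.
p = 0.6                              # success probability under hypothesis h
seq = "HHTHHTHTHH"                   # 7 heads in 10 tosses, ending in a head
heads, tails = seq.count("H"), seq.count("T")

lik = {                              # p(i | h, e*, d) for each design
    "n fixed at 10": p**heads * (1 - p)**tails,
    "x fixed at 7": p**heads * (1 - p)**tails if seq.endswith("H") else 0.0,
}

prior = {"n fixed at 10": 0.5, "x fixed at 7": 0.5}   # illustrative prior over designs
norm = sum(prior[d] * lik[d] for d in prior)
posterior = {d: prior[d] * lik[d] / norm for d in prior}
print(posterior)                     # both stay at 0.5: the prior is unchanged
```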
The simplified manner in which Bayes' formula has been, and still is, presented in statistics (i.e. without specifying the evidence e) has caused rather serious errors of interpretation. As a matter of fact, the correct expression of Bayes' formula is of the form:
(2) p(h|i, e*, d) ∝ p(h|e*, d) p(i|h, e*, d)
in which it is apparent that the prior depends on d. That is, in general, the prior is influenced by the knowledge available about the design.
Consequently, contrary to a widely held opinion, the likelihood principle is not a direct consequence of Bayes' theorem. In particular, the piece of information about the design is one part of the evidence and is therefore relevant to the prior.
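A minimal sketch of what this claim amounts to, with hypothetical design-dependent priors (the Beta parameters below are invented purely for illustration):

```python
# Minimal sketch of the claim above, with illustrative (hypothetical) priors:
# if the prior p(h|e*, d) depends on the design d, the posteriors differ even
# though both designs yield likelihoods proportional to p^7 * (1-p)^3.
# Beta priors are conjugate here, so the posteriors are Beta as well.
from scipy.stats import beta

x, n = 7, 10
priors = {                           # hypothetical design-dependent Beta priors (a, b)
    "n fixed at 10": (1.0, 1.0),
    "x fixed at 7": (0.5, 0.5),
}

for design, (a, b) in priors.items():
    post = beta(a + x, b + (n - x))  # Beta posterior after 7 successes, 3 failures
    print(f"{design}: posterior mean of p = {post.mean():.3f}")
# Proportional likelihoods, different posteriors: the difference comes
# entirely from letting the prior depend on d, which is the point above.
```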
References:
Carnap, R. (1962). Logical Foundations of Probability. The University of Chicago Press.
de Cristofaro, R. (2002). "The Inductive Reasoning in Statistical Inference". Communications in Statistics – Theory and Methods, 31(7), 1079–1089.