How do you derive the likelihood function for the binomial distribution for parameter estimation?


According to Miller and Freund's Probability and Statistics for Engineers, 8th ed. (pp. 217-218), the likelihood function to be maximized for the binomial distribution (Bernoulli trials) is given as

$$L(p) = \prod_{i=1}^{n} p^{x_i} (1-p)^{1-x_i}$$

How do we arrive at this equation? It seems clear enough to me for the other distributions, the Poisson and the Gaussian:

$$L(\theta) = \prod_{i=1}^{n} \text{(PDF or PMF of the distribution)}$$

But the one for the binomial is just slightly different. To put it plainly, how does

$$^nC_x\, p^x (1-p)^{n-x}$$

become

$$p^{x_i} (1-p)^{1-x_i}$$

in the likelihood function above?

Ébe Isaac

Answer:


In maximum likelihood estimation, you try to maximize $^nC_x\, p^x (1-p)^{n-x}$; however, maximizing this is equivalent to maximizing $p^x (1-p)^{n-x}$ for fixed $x$.

Actually, the likelihoods for the Gaussian and the Poisson also do not involve their leading constants, so this case is just like those as well.


Addressing the OP's Comment

Here is a bit more detail:

First, $x$ is the total number of successes, whereas $x_i$ is the outcome of a single trial (0 or 1). Therefore:

$$\prod_{i=1}^{n} p^{x_i} (1-p)^{1-x_i} = p^{\sum_{i=1}^{n} x_i}\,(1-p)^{\sum_{i=1}^{n} (1-x_i)} = p^x (1-p)^{n-x}$$

That shows how you get the factors in the likelihood (by running the above steps backwards).
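To make the collapse from the per-trial product to the binomial kernel concrete, here is a minimal numerical sketch in Python (the trial outcomes and the value of p are made up for illustration):

```python
import math

p = 0.3                         # candidate parameter value (illustrative)
trials = [1, 0, 0, 1, 1, 0, 1]  # hypothetical outcomes x_i of n Bernoulli trials
n = len(trials)
x = sum(trials)                 # total number of successes

# Per-trial product form: prod_i p^{x_i} (1 - p)^{1 - x_i}
product_form = math.prod(p**xi * (1 - p)**(1 - xi) for xi in trials)

# Collapsed binomial kernel: p^x (1 - p)^{n - x}
kernel_form = p**x * (1 - p)**(n - x)

assert math.isclose(product_form, kernel_form)
print(product_form, kernel_form)  # equal up to floating-point rounding
```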

Why does the constant go away? Informally, what most people do (including me) is just notice that the leading constant does not affect the value of $p$ that maximizes the likelihood, so we simply ignore it (effectively setting it to 1).

We can derive this by taking the log of the likelihood function and finding where its derivative is zero:

$$\ln\!\left(^nC_x\, p^x (1-p)^{n-x}\right) = \ln\!\left(^nC_x\right) + x\ln(p) + (n-x)\ln(1-p)$$

Take the derivative with respect to $p$ and set it to 0:

$$\frac{d}{dp}\left[\ln\!\left(^nC_x\right) + x\ln(p) + (n-x)\ln(1-p)\right] = \frac{x}{p} - \frac{n-x}{1-p} = 0$$

$$\frac{x}{p} = \frac{n-x}{1-p} \implies x(1-p) = (n-x)\,p \implies x = np \implies \hat{p} = \frac{x}{n}$$

Notice that the leading constant dropped out of the calculation of the MLE.
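As a quick sanity check (a sketch with made-up data, not part of the book's derivation), a grid search over $p$ shows that the likelihood with and without the constant $^nC_x$ peaks at the same place, $\hat{p} = x/n$:

```python
import math

n, x = 20, 7  # hypothetical data: 7 successes in 20 trials

def full_likelihood(p):
    # Binomial likelihood including the leading constant nCx
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def kernel(p):
    # Same likelihood with the constant dropped
    return p**x * (1 - p)**(n - x)

grid = [i / 1000 for i in range(1, 1000)]  # candidate values of p in (0, 1)
p_hat_full = max(grid, key=full_likelihood)
p_hat_kernel = max(grid, key=kernel)

print(p_hat_full, p_hat_kernel, x / n)  # all three print 0.35
```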

More philosophically, a likelihood is only meaningful for inference up to a multiplicative constant, such that if we have two likelihood functions $L_1, L_2$ with $L_1 = kL_2$, then they are inferentially equivalent. This is called the Law of Likelihood. Therefore, if we are comparing different values of $p$ using the same likelihood function, the leading term becomes irrelevant.

At a practical level, inference using the likelihood function is actually based on the likelihood ratio, not the absolute value of the likelihood. This is due to the asymptotic theory of likelihood ratios (which are asymptotically chi-square -- subject to certain regularity conditions that are often appropriate). Likelihood ratio tests are favored due to the Neyman-Pearson Lemma. Therefore, when we attempt to test two simple hypotheses, we will take the ratio and the common leading factor will cancel.
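A short sketch (same hypothetical data as above) illustrates this cancellation: the likelihood ratio for two simple hypotheses is identical whether or not the constant is included:

```python
import math

n, x = 20, 7        # hypothetical data again
p1, p2 = 0.35, 0.5  # two simple hypotheses about p

def full_likelihood(p):
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def kernel(p):
    return p**x * (1 - p)**(n - x)

# The common factor nCx cancels, so both ratios are identical
ratio_full = full_likelihood(p1) / full_likelihood(p2)
ratio_kernel = kernel(p1) / kernel(p2)
assert math.isclose(ratio_full, ratio_kernel)
print(ratio_full, ratio_kernel)
```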

NOTE: This will not happen if you are comparing two different models, say a binomial and a Poisson. In that case, the constants are important.

Of the above reasons, the first (irrelevance to finding the maximizer of $L$) most directly answers your question.


We can see that's the idea, but could you explain a little more how $^nC_x$ is removed and $n$ is replaced with 1?
Ébe Isaac
@ÉbeIsaac added some more details

$x_i$ in the product refers to an individual trial. For each individual trial, $x_i$ can be 0 or 1, and $n$ is always equal to 1, so the binomial coefficient is trivially equal to 1. Hence, in the product formula for the likelihood, the product of the binomial coefficients is 1, which is why there is no $^nC_x$ in the formula. I realised this while working it out step by step :) (Sorry about the formatting; I'm not used to writing mathematical expressions in answers... yet :))

Abhishek Tiwari
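A one-line check of the point made in the last comment: for a single trial ($n = 1$), the binomial coefficient equals 1 for either outcome, so it contributes nothing to the product:

```python
import math

# For a single Bernoulli trial (n = 1), the binomial coefficient is 1
# whether the trial is a failure (x_i = 0) or a success (x_i = 1).
print(math.comb(1, 0), math.comb(1, 1))  # 1 1
```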