Skewness of a logarithm of a gamma random variable


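Consider a gamma random variable $X\sim\Gamma(\alpha,\theta)$. There are neat formulas for the mean, variance, and skewness:

$$E[X]=\alpha\theta,\qquad \operatorname{Var}[X]=\alpha\theta^2=\frac{1}{\alpha}E[X]^2,\qquad \operatorname{Skewness}[X]=\frac{2}{\sqrt{\alpha}}.$$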

Consider now a log-transformed random variable $Y=\log(X)$. Wikipedia gives formulas for the mean and the variance:

$$E[Y]=\psi(\alpha)+\log(\theta),\qquad \operatorname{Var}[Y]=\psi_1(\alpha)$$

via the digamma and trigamma functions, which are defined as the first and second derivatives of the logarithm of the gamma function.

What is the formula for the skewness?

Will the tetragamma function appear?

(What made me wonder about this is the choice between lognormal and gamma distributions, see Gamma vs. lognormal distributions. Among other things, they differ in their skewness properties. In particular, the skewness of the log of a lognormal is trivially equal to zero, whereas the skewness of the log of a gamma is negative. But how negative?)

amoeba says Reinstate Monica
Does this help? Or this?
S. Kolassa - Reinstate Monica
I am not sure what the log-gamma distribution is. If it is related to gamma as lognormal is related to normal, then I am asking about something else (because "lognormal", confusingly, is the distribution of exp(normal), not of log(normal)).
amoeba says Reinstate Monica
@Glen_b: Honestly, I would say that calling the exponential of a normal "lognormal" is far more inconsistent and confusing. Though, unfortunately, more established.
S. Kolassa - Reinstate Monica
@Stephan see also log-logistic, log-Cauchy, log-Laplace etc. It is a more clearly established convention than the opposite.
Glen_b -Reinstate Monica
Yes; I have been careful not to say "log-gamma" anywhere in relation to this distribution for this reason. (I have used it in the past in a manner consistent with log-normal.)
Glen_b -Reinstate Monica

Answers:


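The moment generating function $M(t)$ of $Y=\ln X$ is helpful in this case, since it has a simple algebraic form. By the definition of the m.g.f., we have

$$M(t)=E[e^{t\ln X}]=E[X^t]=\frac{1}{\Gamma(\alpha)\theta^{\alpha}}\int_0^\infty x^{\alpha+t-1}e^{-x/\theta}\,dx=\frac{\theta^{t}}{\Gamma(\alpha)}\int_0^\infty y^{\alpha+t-1}e^{-y}\,dy=\frac{\theta^{t}\,\Gamma(\alpha+t)}{\Gamma(\alpha)}.$$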
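The closed form above can be checked numerically. The following is a minimal sketch in R, assuming arbitrary illustrative values for `alpha`, `theta` and `t` (none of which come from the answer); it compares the closed-form expression with a numerical integration of $E[X^t]$ against the gamma density.

```r
# Illustrative (hypothetical) parameter values, not taken from the answer.
alpha <- 2.5   # shape
theta <- 1.7   # scale
t     <- 0.8   # any t > -alpha keeps the integral finite

# Closed form: M(t) = theta^t * Gamma(alpha + t) / Gamma(alpha)
mgf_closed <- theta^t * gamma(alpha + t) / gamma(alpha)

# Numerical E[X^t] under the Gamma(shape = alpha, scale = theta) density
mgf_numeric <- integrate(
  function(x) x^t * dgamma(x, shape = alpha, scale = theta),
  lower = 0, upper = Inf
)$value

print(c(closed = mgf_closed, numeric = mgf_numeric))
```

The two printed values should agree up to the tolerance of `integrate`.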

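Let's verify the expectation and the variance you gave. Taking derivatives, we have

$$M'(t)=\frac{\Gamma'(\alpha+t)}{\Gamma(\alpha)}\theta^{t}+\frac{\Gamma(\alpha+t)}{\Gamma(\alpha)}\theta^{t}\ln(\theta)$$

and

$$M''(t)=\frac{\Gamma''(\alpha+t)}{\Gamma(\alpha)}\theta^{t}+\frac{2\,\Gamma'(\alpha+t)}{\Gamma(\alpha)}\theta^{t}\ln(\theta)+\frac{\Gamma(\alpha+t)}{\Gamma(\alpha)}\theta^{t}\ln^2(\theta).$$

Hence,

$$E[Y]=\psi^{(0)}(\alpha)+\ln(\theta),\qquad E[Y^2]=\frac{\Gamma''(\alpha)}{\Gamma(\alpha)}+2\psi^{(0)}(\alpha)\ln(\theta)+\ln^2(\theta).$$

It follows then that

$$\operatorname{Var}(Y)=E[Y^2]-E[Y]^2=\frac{\Gamma''(\alpha)}{\Gamma(\alpha)}-\left(\frac{\Gamma'(\alpha)}{\Gamma(\alpha)}\right)^2=\psi^{(1)}(\alpha).$$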

To find the skewness, note that the cumulant generating function (thanks @probabilityislogic for the tip) is

$$K(t)=\ln M(t)=t\ln\theta+\ln\Gamma(\alpha+t)-\ln\Gamma(\alpha).$$

The first cumulant is thus simply $K'(0)=\psi^{(0)}(\alpha)+\ln(\theta)$. Recall that $\psi^{(n)}(x)=d^{n+1}\ln\Gamma(x)/dx^{n+1}$, so the subsequent cumulants are $K^{(n)}(0)=\psi^{(n-1)}(\alpha)$, $n\geq 2$. The skewness is therefore

$$\frac{E[(Y-E[Y])^3]}{\operatorname{Var}(Y)^{3/2}}=\frac{\psi^{(2)}(\alpha)}{[\psi^{(1)}(\alpha)]^{3/2}}.$$
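As a quick sanity check of this formula (a worked special case, not part of the original derivation), take $\alpha=1$, so that $X$ is exponential. Using the standard values $\psi^{(1)}(1)=\zeta(2)=\pi^2/6$ and $\psi^{(2)}(1)=-2\zeta(3)$:

$$\frac{\psi^{(2)}(1)}{[\psi^{(1)}(1)]^{3/2}}=\frac{-2\,\zeta(3)}{(\pi^2/6)^{3/2}}=-\frac{12\sqrt{6}\,\zeta(3)}{\pi^{3}}\approx-1.1395.$$

This is the negative of the skewness of the standard Gumbel distribution, as one would expect, since $-\ln X$ is standard Gumbel when $X$ is a unit exponential.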

As a side note, this particular distribution appears to have been studied thoroughly by A. C. Olshen in his Transformations of the Pearson Type III Distribution; Johnson et al.'s Continuous Univariate Distributions also has a small section on it. Check those out.

Francis
$K(t)=\log[M(t)]=t\log[\theta]+\log[\Gamma(\alpha+t)]-\log[\Gamma(\alpha)]$ […] $M(t)$ […] $\text{skew}=K^{(3)}(0)=\psi^{(2)}(\alpha)$ […] $\psi^{(n)}(z)$
@probabilityislogic: very good call, changed my answer
Francis
@probabilityislogic This is a great addition, thanks a lot. I only want to note, lest some readers be confused, that the skewness is not directly given by the third cumulant: it is the third standardized moment, not the third central moment. Francis has it right in his answer, but the last formula in your comment is not quite correct.
amoeba says Reinstate Monica

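I. Direct calculation

Gradshteyn & Ryzhik [1] list explicit forms (sec. 4.358, together with 4.352.1) for

$$\int_0^\infty x^{\nu-1}e^{-\mu x}(\ln x)^p\,dx$$

for $p=2,3,4$ and $p=1$, written in terms of the $\Gamma$, $\psi$ and $\zeta$ functions; the integral for general $p$ can be expressed as a derivative of the gamma function, so it is presumably feasible to go higher. So the skewness is certainly doable but not especially "neat".

Details of the derivation of the formulas in 4.358 are in [2]. I'll quote the formulas given there, since they are slightly more clearly stated and put 4.352.1 in the same form.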

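Let $\delta=\psi(a)-\ln\mu$. Then

$$\begin{aligned}
\int_0^\infty x^{a-1}e^{-\mu x}\ln x\,dx &= \frac{\Gamma(a)}{\mu^{a}}\left\{\delta\right\}\\
\int_0^\infty x^{a-1}e^{-\mu x}\ln^2 x\,dx &= \frac{\Gamma(a)}{\mu^{a}}\left\{\delta^2+\zeta(2,a)\right\}\\
\int_0^\infty x^{a-1}e^{-\mu x}\ln^3 x\,dx &= \frac{\Gamma(a)}{\mu^{a}}\left\{\delta^3+3\zeta(2,a)\delta-2\zeta(3,a)\right\}\\
\int_0^\infty x^{a-1}e^{-\mu x}\ln^4 x\,dx &= \frac{\Gamma(a)}{\mu^{a}}\left\{\delta^4+6\zeta(2,a)\delta^2-8\zeta(3,a)\delta+3\zeta^2(2,a)+6\zeta(4,a)\right\}
\end{aligned}$$

where $\zeta(z,q)=\sum_{n=0}^\infty \frac{1}{(n+q)^z}$ is the Hurwitz zeta function (the Riemann zeta function is the special case $q=1$).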

Now on to the moments of the log of a gamma random variable.

Note first that on the log scale the scale or rate parameter of the gamma density is merely a shift parameter, so it has no impact on the central moments; we may take whichever one we're using to be 1.

If $X\sim\text{Gamma}(\alpha,1)$ then

$$E(\log^p X)=\frac{1}{\Gamma(\alpha)}\int_0^\infty \log^p x\;x^{\alpha-1}e^{-x}\,dx.$$

We can set $\mu=1$ in the above integral formulas, which gives us raw moments; we have $E(Y)$, $E(Y^2)$, $E(Y^3)$, $E(Y^4)$.

Since we have eliminated $\mu$ from the above, without fear of confusion we're now free to re-use $\mu_k$ to represent the $k$-th central moment in the usual fashion. We may then obtain the central moments from the raw moments via the usual formulas.

Then we can obtain the skewness and kurtosis as $\mu_3/\mu_2^{3/2}$ and $\mu_4/\mu_2^{2}$.
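The route just described (raw moments, then central moments, then standardized moments) can be sketched in R. This is only an illustration under stated assumptions: the helper names `hurwitz_zeta` and `log_gamma_shape_stats` are hypothetical, and because base R has no Hurwitz zeta function, the helper evaluates $\zeta(n+1,a)$ for integer $n\ge 1$ through the standard identity $\psi^{(n)}(a)=(-1)^{n+1}\,n!\,\zeta(n+1,a)$ using `psigamma`.

```r
# Hurwitz zeta at integer s >= 2, obtained from the polygamma function
# (hypothetical helper; relies on psi^(n)(a) = (-1)^(n+1) n! zeta(n+1, a)).
hurwitz_zeta <- function(s, a) {
  n <- s - 1
  (-1)^(n + 1) * psigamma(a, n) / factorial(n)
}

# Skewness and kurtosis of Y = log(X), X ~ Gamma(a, 1), via raw moments.
log_gamma_shape_stats <- function(a) {
  d  <- digamma(a)                 # delta with mu = 1
  z2 <- hurwitz_zeta(2, a)
  z3 <- hurwitz_zeta(3, a)
  z4 <- hurwitz_zeta(4, a)

  # Raw moments E(Y^p), p = 1..4, from the quoted integral formulas
  m1 <- d
  m2 <- d^2 + z2
  m3 <- d^3 + 3 * z2 * d - 2 * z3
  m4 <- d^4 + 6 * z2 * d^2 - 8 * z3 * d + 3 * z2^2 + 6 * z4

  # Central moments from raw moments (the usual formulas)
  mu2 <- m2 - m1^2
  mu3 <- m3 - 3 * m1 * m2 + 2 * m1^3
  mu4 <- m4 - 4 * m1 * m3 + 6 * m1^2 * m2 - 3 * m1^4

  c(skewness = mu3 / mu2^(3 / 2), kurtosis = mu4 / mu2^2)
}

print(log_gamma_shape_stats(1))    # log of a unit exponential
```

For `a = 1` this gives a skewness of about -1.14 and a (non-excess) kurtosis of 5.4. Algebraically the $\delta$ terms cancel, leaving $\mu_3=-2\zeta(3,a)$ and $\mu_4=3\zeta^2(2,a)+6\zeta(4,a)$.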


A note on terminology

It looks like Wolfram's reference pages write the moments of this distribution (they call it ExpGamma distribution) in terms of the polygamma function.

By contrast, Chan (see below) calls this the log-gamma distribution.


II. Chan's formulas via MGF

Chan (1993) [3] gives the mgf as the very neat $\Gamma(\alpha+t)/\Gamma(\alpha)$.

(A very nice derivation for this is given in Francis' answer, using the simple fact that the mgf of $\log(X)$ is just $E(X^t)$.)

Consequently the moments have fairly simple forms. Chan gives:

$$E(Y)=\psi(\alpha)$$

and the central moments as

$$E(Y-\mu_Y)^2=\psi'(\alpha),\qquad E(Y-\mu_Y)^3=\psi''(\alpha),\qquad E(Y-\mu_Y)^4=\psi'''(\alpha)$$

and so the skewness is $\psi''(\alpha)/(\psi'(\alpha)^{3/2})$ and kurtosis is $\psi'''(\alpha)/(\psi'(\alpha)^2)$. Presumably the earlier formulas I have above should simplify to these.

Conveniently, R offers digamma ($\psi$) and trigamma ($\psi'$) functions as well as the more general polygamma function where you select the order of the derivative. (A number of other programs offer similarly convenient functions.)

Consequently we can compute the skewness and kurtosis quite directly in R:

skew.eg <- function(a) psigamma(a,2)/psigamma(a,1)^(3/2)
kurt.eg <- function(a) psigamma(a,3)/psigamma(a,1)^2

Trying a few values of a (α in the above), we reproduce the first few rows of the table at the end of Sec 2.2 in Chan [3], except that the kurtosis values in that table are supposed to be excess kurtosis, but I just calculated kurtosis by the formulas given above by Chan; these should differ by 3.

(E.g. for the log of an exponential, the table says the excess kurtosis is 2.4, but the formula for $\beta_2$ is $\psi'''(1)/\psi'(1)^2$ ... and that is 2.4.)

Simulation confirms that as we increase sample size, the kurtosis of a log of an exponential is converging to around 5.4 not 2.4. It appears that the thesis possibly has an error.
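A simulation of this kind might look like the following. It is a minimal sketch, not the code actually used for the check above: the seed, the sample size and the variable names are arbitrary choices made here for illustration.

```r
set.seed(1)                  # arbitrary seed, for reproducibility only
n <- 1e6                     # arbitrary (large) sample size
y <- log(rexp(n))            # log of a unit exponential, i.e. log of a Gamma(1, 1)

# Sample kurtosis: fourth standardized moment (not excess kurtosis)
z <- (y - mean(y)) / sqrt(mean((y - mean(y))^2))
sample_kurtosis <- mean(z^4)

# Ratio of the fourth to the squared second cumulant at alpha = 1
cumulant_ratio <- psigamma(1, 3) / psigamma(1, 1)^2

print(c(sample = sample_kurtosis,
        cumulant_ratio = cumulant_ratio,
        cumulant_ratio_plus_3 = 3 + cumulant_ratio))
```

The sample value should land near `3 + cumulant_ratio` rather than near `cumulant_ratio` itself, which is what one expects if the ratio is an excess kurtosis.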

Consequently, Chan's formulas for central moments appear to actually be the formulas for the cumulants (see the derivation in Francis' answer). This would then mean that the skewness formula was correct as is; because the second and third cumulants are equal to the second and third central moments.

Nevertheless these are particularly convenient formulas as long as we keep in mind that kurt.eg is giving excess kurtosis.

References

[1] Gradshteyn, I.S. & Ryzhik I.M. (2007), Table of Integrals, Series, and Products, 7th ed.
Academic Press, Inc.

[2] Victor H. Moll (2007)
The integrals in Gradshteyn and Ryzhik, Part 4: The gamma function
SCIENTIA Series A: Mathematical Sciences, Vol. 15, 37–46
Universidad Técnica Federico Santa María, Valparaíso, Chile
http://129.81.170.14/~vhm/FORM-PROOFS_html/final4.pdf

[3] Chan, P.S. (1993),
A statistical study of log-gamma distribution,
McMaster University (Ph.D. thesis)
https://macsphere.mcmaster.ca/bitstream/11375/6816/1/fulltext.pdf

Glen_b -Reinstate Monica
Cool. Thanks a lot! According to the encyclopedia entry that Stephan linked to above, the final answer for skewness is $\psi''(\alpha)/\psi'(\alpha)^{3/2}$ (which almost qualifies as "neat"!). So it seems that all the scary zetas will have to cancel out.
amoeba says Reinstate Monica
Sorry only just now saw your comment (I've been editing for about an hour or so); that's correct, though if the Encyclopedia gives kurtosis the way Chan gives it in his thesis, it seems that it's wrong (as given above), but readily corrected. The neat formulas appear to be for cumulants rather than standardized central moments.
Glen_b -Reinstate Monica
Yes, the Encyclopedia does give the same formula for kurtosis.
amoeba says Reinstate Monica
Hmm, I mean to refer to the things normally denoted $\gamma_1$ and $\gamma_2$. I will fix.
Glen_b -Reinstate Monica
I should probably add the note that the Hurwitz zeta function can be expressed in terms of the polygamma function, and vice versa:
$$\psi^{(n)}(z)=(-1)^{n+1}\,\Gamma(n+1)\,\zeta(n+1,z)$$
So, the answer to @amoeba's question of "will the tetragamma function appear?" is YES.
J. M. is not a statistician