Why does the supremum of the Brownian bridge have the Kolmogorov-Smirnov distribution?

Answers:

13

$$\sqrt{n}\,\sup_x |F_n - F| = \sup_x \left| \frac{1}{\sqrt{n}} \sum_{i=1}^n Z_i(x) \right|$$

where $Z_i(x) = 1_{X_i \leq x} - E[1_{X_i \leq x}]$

by the CLT you have $G_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n Z_i(x) \rightarrow N(0, F(x)(1-F(x)))$

this is the intuition ...

The Brownian bridge $B(t)$ has variance $t(1-t)$ (http://en.wikipedia.org/wiki/Brownian_bridge); replace $t$ by $F(x)$. This is for one $x$ ...

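For $x_1, \dots, x_k$, $(G_n(x_1), \dots, G_n(x_k)) \rightarrow (B_1, \dots, B_k)$, where $(B_1, \dots, B_k)$ is $N(0, \Sigma)$ with $\Sigma = (\sigma_{ij})$, $\sigma_{ij} = \min(F(x_i), F(x_j)) - F(x_i)F(x_j)$.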
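To make the covariance formula concrete, here is a rough simulation sketch (not part of the original answer). It assumes Uniform(0,1) data so that $F(x) = x$; the function names, sample size, and number of repetitions are arbitrary illustrative choices. It estimates the covariance of $G_n(x_1)$ and $G_n(x_2)$ and compares it with $\min(F(x_1), F(x_2)) - F(x_1)F(x_2)$.

```python
import math
import random

def empirical_process(sample, x):
    """G_n(x) = sqrt(n) * (F_n(x) - F(x)) for Uniform(0,1) data, where F(x) = x."""
    n = len(sample)
    f_n = sum(1 for s in sample if s <= x) / n
    return math.sqrt(n) * (f_n - x)

def simulated_covariance(x1, x2, n=100, reps=3000, seed=0):
    """Sample covariance of (G_n(x1), G_n(x2)) over independent replications."""
    rng = random.Random(seed)
    g1, g2 = [], []
    for _ in range(reps):
        sample = [rng.random() for _ in range(n)]
        g1.append(empirical_process(sample, x1))
        g2.append(empirical_process(sample, x2))
    m1 = sum(g1) / reps
    m2 = sum(g2) / reps
    return sum((a - m1) * (b - m2) for a, b in zip(g1, g2)) / (reps - 1)

def bridge_covariance(s, t):
    """Covariance of the Brownian bridge at times s and t: min(s, t) - s * t."""
    return min(s, t) - s * t

cov_sim = simulated_covariance(0.3, 0.7)
cov_theory = bridge_covariance(0.3, 0.7)
print(f"simulated covariance {cov_sim:.4f}, bridge covariance {cov_theory:.4f}")
```

The two numbers should agree up to Monte Carlo noise, which is the finite-dimensional statement above for $k = 2$.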

The difficult part is to show that the limit of the distribution of the supremum is the distribution of the supremum of the limit... Understanding why this happens requires some empirical process theory, from books such as van der Vaart and Wellner (not easy). The theorem is called Donsker's theorem: http://en.wikipedia.org/wiki/Donsker%27s_theorem ...
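As a numerical sanity check of that limit (a sketch only, not a proof), one can simulate $\sqrt{n}\,\sup_x |F_n - F|$ and compare its distribution with the Kolmogorov distribution function $K(x) = 1 - 2\sum_{k \geq 1} (-1)^{k-1} e^{-2k^2x^2}$. The code below assumes Uniform(0,1) data, so $F(x) = x$; the function names, $n$, the number of repetitions, and the evaluation point $x_0 = 1$ are illustrative choices.

```python
import math
import random

def kolmogorov_cdf(x, terms=100):
    """K(x) = 1 - 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 x^2), for x > 0."""
    if x <= 0:
        return 0.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * x * x)
                 for k in range(1, terms + 1))
    return 1.0 - 2.0 * series

def scaled_ks_statistic(n, rng):
    """sqrt(n) * sup_x |F_n(x) - F(x)| for n Uniform(0,1) draws, where F(x) = x."""
    u = sorted(rng.random() for _ in range(n))
    # The supremum is attained just before or at a jump of F_n.
    d = max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))
    return math.sqrt(n) * d

rng = random.Random(1)
n, reps, x0 = 200, 3000, 1.0
stats = [scaled_ks_statistic(n, rng) for _ in range(reps)]
simulated = sum(1 for s in stats if s <= x0) / reps
limit = kolmogorov_cdf(x0)
print(f"P(sqrt(n) D_n <= {x0}): simulated {simulated:.3f}, Kolmogorov limit {limit:.3f}")
```

For moderate $n$ the simulated probability is only approximately equal to the limiting value, so a small discrepancy is expected.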

robin girard
Shouldn't we apply the CLT to all finite-dimensional marginal distributions?
Rasmus
You asked for an intuitive answer :) Also, I chose not to bother you with the tricky mathematical part, which is to show that convergence for all t implies convergence (in law) of the supremum... Do you want me to complete the answer?
robin girard
Dear robin girard, I think your answer is fine as it stands. Thank you!
Rasmus
1
The difficult part is actually to show weak convergence. The convergence of the suprema then follows directly from the continuous mapping theorem. This result can be found in Billingsley's "Convergence of Probability Measures". Van der Vaart and Wellner give a more general result, and their book is really, really tough :)
mpiktas
@robingirard I personally would love to see a "complete answer" with all the "tricky mathematical part[s]"
StatsPlease
6

For Kolmogorov-Smirnov, consider the null hypothesis. It says that a sample is drawn from a particular distribution. So if you construct the empirical distribution function for $n$ samples, $f(x) = \frac{1}{n} \sum_i \chi_{[X_i, \infty)}(x)$, then in the limit of infinite data it will converge to the underlying distribution.

For finite information, it will be off. If one of the measurements is q, then at x=q the empirical distribution function takes a step up. We can look at it as a random walk which is constrained to begin and end on the true distribution function. Once you know that, you go ransack the literature for the huge amount of information known about random walks to find out what the largest expected deviation of such a walk is.
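A rough way to see the "constrained random walk" picture numerically is the sketch below. It is my own illustration, under these assumptions: the walk has Gaussian steps, it is pinned to zero at both ends by subtracting the straight line through its endpoint, and the helper name `bridge_sup`, the step count, and the repetition count are arbitrary. The threshold 1.36 is the commonly tabulated approximate 5% critical value of the Kolmogorov distribution, so about 95% of the simulated walks should stay within it.

```python
import math
import random

def bridge_sup(m, rng):
    """Largest absolute deviation of an m-step Gaussian walk pinned to 0 at both ends."""
    step_sd = math.sqrt(1.0 / m)
    walk = [0.0]
    for _ in range(m):
        walk.append(walk[-1] + rng.gauss(0.0, step_sd))
    end = walk[-1]
    # Subtracting (k/m) * end forces the path back to 0 at the final step.
    return max(abs(w - (k / m) * end) for k, w in enumerate(walk))

rng = random.Random(2)
m, reps, threshold = 400, 1500, 1.36
sups = [bridge_sup(m, rng) for _ in range(reps)]
frac_within = sum(1 for s in sups if s <= threshold) / reps
print(f"fraction of pinned walks with max |deviation| <= {threshold}: {frac_within:.3f}")
```

Because the walk is only observed at finitely many steps, its maximum slightly underestimates that of the continuous-time bridge, so the fraction tends to come out a little above 0.95.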

You can do the same trick with any p-norm of the difference between the empirical and underlying distribution functions. For p=2, it's called the Cramér-von Mises test. I don't know whether the set of all such tests for arbitrary real, positive p forms a complete class of any kind, but it might be an interesting thing to look at.
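To illustrate the two norms side by side, here is a small sketch (again my own illustration) that computes the sup-norm statistic and the p=2 statistic on the same sample. It uses the standard computing formula $W^2 = \frac{1}{12n} + \sum_{i=1}^n \left(F(X_{(i)}) - \frac{2i-1}{2n}\right)^2$ for $n \int (F_n - F)^2 \, dF$. The exponential null distribution, the function names, and the sample sizes are arbitrary choices for the example.

```python
import math
import random

def kolmogorov_smirnov(sample, cdf):
    """D_n = sup_x |F_n(x) - F(x)| (the sup-norm distance)."""
    n = len(sample)
    u = sorted(cdf(x) for x in sample)
    return max(max((i + 1) / n - u[i], u[i] - i / n) for i in range(n))

def cramer_von_mises(sample, cdf):
    """W^2 = n * integral of (F_n - F)^2 dF (the squared L2 distance, scaled by n)."""
    n = len(sample)
    u = sorted(cdf(x) for x in sample)
    # With zero-based i, (2 * i + 1) / (2 * n) equals (2 * rank - 1) / (2 * n).
    return 1.0 / (12 * n) + sum((u[i] - (2 * i + 1) / (2 * n)) ** 2 for i in range(n))

def exp_cdf(x):
    """CDF of the Exponential(1) distribution, used here as the null hypothesis."""
    return 1.0 - math.exp(-x)

rng = random.Random(3)
n, reps = 50, 2000
one_sample = [rng.expovariate(1.0) for _ in range(n)]
d_n = kolmogorov_smirnov(one_sample, exp_cdf)
w2_values = [cramer_von_mises([rng.expovariate(1.0) for _ in range(n)], exp_cdf)
             for _ in range(reps)]
mean_w2 = sum(w2_values) / reps
print(f"D_n on one sample: {d_n:.3f}; mean W^2 under the null: {mean_w2:.3f}")
```

Under the null hypothesis the mean of $W^2$ is $\int_0^1 t(1-t)\,dt = 1/6$, the integrated Brownian bridge variance, which gives a quick check that the statistic is computed correctly.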

user873