One way to think about conditional expectation is as a projection onto the σ-algebra G.
(Image from Wikimedia Commons.)
This is literally true when talking about square-integrable random variables; in that case E[ξ|G] actually is the orthogonal projection of the random variable ξ onto the subspace of L²(Ω) consisting of random variables measurable with respect to G. And in fact this even turns out to be true in some sense for L¹ random variables, via approximation by L² random variables.
(See the comments for references.)
If one thinks of a σ-algebra as representing how much information we have (the interpretation that fits with the theory of stochastic processes), then a larger σ-algebra means more possible events and thus more information about the possible outcomes, while a smaller σ-algebra means fewer possible events and thus less information about the possible outcomes.
Therefore, projecting an F-measurable random variable ξ onto the smaller σ-algebra G means taking our best estimate of the value of ξ given the more limited information available from G.
In other words, given only the information in G, and not all of the information in F, E[ξ|G] is, in a precise sense, our best guess for what the random variable ξ is.
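To make "best guess" precise for square-integrable ξ, here is a sketch of the standard L² characterization (a general fact, recorded for reference rather than anything specific to the discussion above):

```latex
\begin{gather*}
  % E[\xi | \mathcal{G}] is the \mathcal{G}-measurable random variable closest
  % to \xi in mean square:
  \mathbb{E}\bigl[(\xi - \mathbb{E}[\xi \mid \mathcal{G}])^{2}\bigr]
      \le \mathbb{E}\bigl[(\xi - \eta)^{2}\bigr]
      \quad \text{for every } \mathcal{G}\text{-measurable } \eta \in L^{2}(\Omega), \\
  % equivalently (the orthogonality condition of the projection):
  \mathbb{E}\bigl[(\xi - \mathbb{E}[\xi \mid \mathcal{G}])\,\eta\bigr] = 0
      \quad \text{for every } \mathcal{G}\text{-measurable } \eta \in L^{2}(\Omega).
\end{gather*}
```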
Regarding your example, I think you might be confusing a random variable with its value. A random variable X is a function whose domain is the sample space; it is not a number. In other words, X : Ω → R, i.e. X ∈ {f | f : Ω → R}, whereas for ω ∈ Ω, X(ω) ∈ R.
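A tiny illustration of that distinction, sketched in Python using the two-coin-toss sample space that also appears in another answer below:

```python
# The sample space of two coin tosses, and a random variable X = "number of heads".
# X itself is a *function* on Omega, not a number; X(omega) is a number.
Omega = ["HH", "HT", "TH", "TT"]

def X(omega: str) -> int:
    """Random variable: count the heads in the outcome omega."""
    return omega.count("H")

print(X)        # a function -- the random variable itself
print(X("HT"))  # 1 -- a number, the value of X at the outcome "HT"
```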
The notation for conditional expectation is, in my opinion, really bad, because the conditional expectation is itself a random variable, i.e. also a function. In contrast, the (regular) expectation of a random variable is a number. The conditional expectation of a random variable is an entirely different kind of object from the expectation of that same random variable; that is, E[ξ|G] does not even "type-check" with E[ξ].
In other words, using the symbol E to denote both regular and conditional expectation is a massive abuse of notation, one that causes a lot of unnecessary confusion.
All that said, note that E[ξ|G](ω) is a number (the value of the random variable E[ξ|G] evaluated at the outcome ω), while E[ξ|Ω] is a random variable, though it turns out to be a constant (i.e. trivial/degenerate) one, because the σ-algebra generated by Ω, namely {∅, Ω}, is trivial/degenerate. Technically speaking, the constant value of this constant random variable is E[ξ], where here E denotes regular expectation (and thus a number), not conditional expectation (and thus not a random variable).
Also, you seem to be confused about what the notation E[ξ|A] means; technically speaking, it is only possible to condition on σ-algebras, not on individual events, since probability measures are defined on entire σ-algebras, not on individual events. Thus, E[ξ|A] is just (lazy) shorthand for E[ξ|σ(A)], where σ(A) stands for the σ-algebra generated by the event A, which is {∅, A, A^c, Ω}. Note that σ(A) = G = σ(A^c); in other words, E[ξ|A], E[ξ|G], and E[ξ|A^c] are all different ways to denote the exact same object.
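Concretely, when 0 < P(A) < 1, the random variable E[ξ|σ(A)] can be written out explicitly; the following is a sketch of that standard formula:

```latex
\begin{equation*}
  % E[\xi | \sigma(A)] is constant on A and constant on A^c:
  \mathbb{E}[\xi \mid \sigma(A)](\omega) =
  \begin{cases}
    \dfrac{\mathbb{E}[\xi \, \mathbf{1}_{A}]}{\mathbb{P}(A)} & \text{if } \omega \in A, \\[2ex]
    \dfrac{\mathbb{E}[\xi \, \mathbf{1}_{A^{c}}]}{\mathbb{P}(A^{c})} & \text{if } \omega \in A^{c}.
  \end{cases}
\end{equation*}
```

The two constant values are exactly the elementary conditional expectations given A and given A^c, which is what makes the shorthand harmless in practice.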
Finally, I just want to add that the intuitive explanation I gave above explains why the constant value of the random variable E[ξ|Ω] = E[ξ|σ(Ω)] = E[ξ|{∅,Ω}] is just the number E[ξ] -- the σ-algebra {∅,Ω} represents the least possible amount of information we could have, in fact essentially no information, so under this extreme circumstance the best possible guess we could have for what the random variable ξ is, is the constant random variable whose constant value is E[ξ].
Note that all constant random variables are L² random variables, and they are all measurable with respect to the trivial σ-algebra {∅,Ω}, so indeed we do have that the constant random variable E[ξ] is the orthogonal projection of ξ onto the subspace of L²(Ω) consisting of random variables measurable with respect to {∅,Ω}, as was claimed.
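One direct way to see that E[ξ] is the best constant (i.e. {∅,Ω}-measurable) approximation of ξ is the usual expand-the-square argument, sketched here:

```latex
\begin{align*}
  % For any constant c, expand (\xi - c)^2 around E[\xi] and take expectations;
  % the cross term vanishes because E[\xi - E[\xi]] = 0:
  \mathbb{E}\bigl[(\xi - c)^{2}\bigr]
    &= \mathbb{E}\bigl[(\xi - \mathbb{E}[\xi])^{2}\bigr]
       + 2\,(\mathbb{E}[\xi] - c)\,\mathbb{E}\bigl[\xi - \mathbb{E}[\xi]\bigr]
       + (\mathbb{E}[\xi] - c)^{2} \\
    &= \operatorname{Var}(\xi) + (\mathbb{E}[\xi] - c)^{2} .
  % This is minimized exactly at c = E[\xi].
\end{align*}
```

In particular, among all constants the mean squared error is minimized at c = E[ξ], which is also why the guess 1 = E[ξ] shows up in the coin example below.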
I am going to try to elaborate on what William suggested.
Let Ω be the sample space of tossing a coin twice. Define the random variable ξ to be the number of heads that occur in the experiment. Clearly, E[ξ] = 1. One way of thinking about what 1, as an expected value, represents is as the best possible estimate for ξ. If we had to guess what value ξ would take, we would guess 1. This is because E[(ξ − 1)²] ≤ E[(ξ − a)²] for any real number a.
Denote by A = {HT, HH} the event that the first toss is a head. Let G = {∅, A, A^c, Ω} be the σ-algebra generated by A. We think of G as representing what we know after the first toss. After the first toss, either heads occurred or it did not. Hence, after the first toss we are either in the event A or in the event A^c.
If we are in the event A, then the best possible estimate for ξ would be E[ξ|A] = 1.5, and if we are in the event A^c, then the best possible estimate for ξ would be E[ξ|A^c] = 0.5.
Now define the random variable η(ω) to be either 1.5 or 0.5, depending on whether or not ω ∈ A. This random variable η is a better approximation to ξ than the constant 1 = E[ξ], since E[(ξ − η)²] ≤ E[(ξ − 1)²].
What η is doing is answering the question: what is the best estimate of ξ after the first toss? Since we do not know in advance what will happen on the first toss, η must depend on whether A occurs. Once the relevant event in G is revealed to us after the first toss, the value of η is determined and provides the best possible estimate for ξ.
The problem with using ξ as its own estimate, i.e. 0 = E[(ξ − ξ)²] ≤ E[(ξ − η)²], is as follows: ξ is not well-defined after the first toss. Say the outcome of the experiment is ω, with the first toss being heads; then we are in the event A, but what is ξ(ω)? We cannot tell from the first toss alone; that value is ambiguous to us, and so ξ is not determined. More formally, we say that ξ is not G-measurable, i.e. its value is not pinned down by the information available after the first toss. Thus, η is the best possible estimate of ξ after the first toss.
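A quick numerical check of this example, sketched in Python (it just enumerates the four equally likely outcomes; the names are chosen for illustration):

```python
from fractions import Fraction

# Four equally likely outcomes of two fair coin tosses.
Omega = ["HH", "HT", "TH", "TT"]
P = {w: Fraction(1, 4) for w in Omega}

xi = {w: w.count("H") for w in Omega}   # number of heads
A = {"HH", "HT"}                        # first toss is heads

def E(f):
    """Expectation of a random variable given as a dict: outcome -> value."""
    return sum(P[w] * f[w] for w in Omega)

E_xi = E(xi)  # 1
E_xi_given_A = sum(P[w] * xi[w] for w in A) / sum(P[w] for w in A)  # 3/2
E_xi_given_Ac = (sum(P[w] * xi[w] for w in Omega if w not in A)
                 / sum(P[w] for w in Omega if w not in A))          # 1/2

# eta = E[xi | G]: constant 3/2 on A and constant 1/2 on A^c.
eta = {w: (E_xi_given_A if w in A else E_xi_given_Ac) for w in Omega}

mse_eta = E({w: (xi[w] - eta[w]) ** 2 for w in Omega})    # 1/4
mse_const = E({w: (xi[w] - E_xi) ** 2 for w in Omega})    # 1/2

print(E_xi, E_xi_given_A, E_xi_given_Ac)           # 1 3/2 1/2
print(mse_eta, mse_const, mse_eta <= mse_const)    # 1/4 1/2 True
```

The printed values agree with the numbers above: E[ξ] = 1, E[ξ|A] = 1.5, E[ξ|A^c] = 0.5, and the mean squared error of η (1/4) is indeed smaller than that of the constant guess 1 (1/2).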
Perhaps somebody here can come up with a more sophisticated example using the sample space [0,1], with ξ(ω) = ω and G some non-trivial σ-algebra.
Although you asked not to use the formal definition, I think that the formal definition is probably the best way of explaining it.
Wikipedia - conditional expectation:
Firstly, it is an H-measurable function. Secondly, it has to match the expectation of the original random variable over every measurable (sub)set in H. So for an event A, the σ-algebra is {A, A^c, ∅, Ω}, and the conditional expectation is clearly determined, as you specified in your question, by whether ω ∈ A or ω ∈ A^c. Similarly, for any discrete random variable (and combinations of them), we list out all the primitive events and assign the expectation given that primitive event.
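Written out, those two conditions say the following (a sketch in the notation used here, with H the conditioning σ-algebra and X the random variable being conditioned):

```latex
\begin{gather*}
  % (1) Measurability: E[X | \mathcal{H}] is \mathcal{H}-measurable.
  % (2) Partial averaging: it has the same integral as X over every set in \mathcal{H}:
  \int_{H} \mathbb{E}[X \mid \mathcal{H}] \, d\mathbb{P}
    \;=\; \int_{H} X \, d\mathbb{P}
  \qquad \text{for every } H \in \mathcal{H} .
\end{gather*}
```

Taking H = Ω in the second condition shows, in particular, that E[X|H] always has the same ordinary expectation as X.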
Now consider tossing a coin an infinite number of times, where at each toss i you win 1/2^i if the coin comes up tails. Your total winnings are then X = ∑_{i=1}^∞ c_i/2^i, where c_i = 1 for tails and 0 for heads. Then X is a real random variable taking values in [0,1]. After n coin tosses, you know the value of X to precision 1/2^n; e.g. after 2 coin tosses it lies in [0,1/4], [1/4,1/2], [1/2,3/4] or [3/4,1]. After every coin toss, the associated σ-algebra is getting finer and finer, and likewise the conditional expectation of X is getting more and more precise.
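Here is a small simulation of that, sketched in Python (the helper names are made up for illustration, and the infinite game is truncated at 20 tosses, which changes X by at most 1/2^20):

```python
import random

def winnings(tosses):
    """X = sum of c_i / 2^i, where c_i = 1 for tails and 0 for heads."""
    return sum(c / 2 ** i for i, c in enumerate(tosses, start=1))

def conditional_expectation(first_n):
    """E[X | first n tosses]: the part already known, plus the mean of the
    unknown tail, E[ sum_{i>n} c_i / 2^i ] = (1/2) * (1/2^n) = 1 / 2^(n+1)."""
    return winnings(first_n) + 1 / 2 ** (len(first_n) + 1)

random.seed(0)
tosses = [random.randint(0, 1) for _ in range(20)]  # 1 = tails, 0 = heads
X = winnings(tosses)

for n in range(0, 9, 2):
    estimate = conditional_expectation(tosses[:n])
    # The estimate stays within 1/2^(n+1) of X, and that bound halves with
    # every extra toss revealed -- the filtration getting finer.
    print(n, round(estimate, 6), abs(X - estimate) <= 1 / 2 ** (n + 1))
```

For n = 0 the estimate is just 1/2, the unconditional expectation of the (infinite) game; as n grows, the conditional expectation pins X down to ever smaller dyadic intervals, exactly as described above.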
Hopefully this example of a real-valued random variable with a sequence of σ-algebras getting finer and finer (a filtration) gets you away from the purely event-based intuition you are used to, and clarifies the purpose of conditional expectation.