I'm trying to get an intuition for each of the main functions in actuarial science (specifically for the Cox Proportional Hazards Model). Here's what I have so far:
- $f(x)$: starting from the start time, the probability distribution of when you will die.
- $F(x)$: just the cumulative distribution. At time $x$, what % of the population will be dead?
- $S(x)$: $1 - F(x)$. At time $x$, what % of the population will be alive?
- $h(x)$: hazard function. At a given time $x$, of the people still alive, this can be used to estimate how many people will die in the next time interval, or if interval -> 0, the 'instantaneous' probability of death.
- $H(x)$: cumulative hazard. No idea.
What's the idea behind combining hazard values, especially when they are continuous? If we use a discrete example with death rates across four seasons, and the hazard function is as follows:
- Starting at Spring, everyone is alive, and 20% will die
- Now in Summer, of those remaining, 50% will die
- Now in Fall, of those remaining, 75% will die
- Final season is Winter. Of those remaining, 100% will die
Then the cumulative hazard is 20%, 70%, 145%, 245%?? What does that mean, and why is this useful?
Answers:
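Summing the proportions dying, as you do, does not give the cumulative hazard. In continuous time, the hazard rate is the conditional probability that the event happens during a very short interval, given that it has not happened before, per unit of time:

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$

where $T$ is the time at which the event occurs.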
The cumulative hazard integrates the (instantaneous) hazard rate over ages/time. It is like summing up probabilities, but since $\Delta t$ is very small, these probabilities are also small numbers (e.g. the hazard rate of dying may be around 0.004 at ages around 30). The hazard rate is conditional on not having experienced the event before $t$, so accumulated over a population it can exceed 1.
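To tie this back to the four-season example in the question, here is a minimal Python sketch. It rests on an assumption that is not stated in the question: the hazard is constant within each season in continuous time, so a season in which a fraction q of those still alive die contributes -log(1 - q) to the cumulative hazard. The function name `cumulative_hazard` is made up for illustration.

```python
import math

# Conditional probability of dying in each season, given alive at its start
# (the percentages from the question's example).
q = {"Spring": 0.20, "Summer": 0.50, "Fall": 0.75, "Winter": 1.00}

def cumulative_hazard(cond_death_probs):
    """Return (survival, cumulative hazard) at the end of each period.

    Assumption: the continuous-time hazard is constant within a period,
    so the hazard integrated over that period is -log(1 - q).
    """
    survival, cum_hazard = [], []
    s, h = 1.0, 0.0
    for p in cond_death_probs:
        s *= 1.0 - p
        # If everyone dies in the period, the integrated hazard is infinite.
        h += math.inf if p >= 1.0 else -math.log(1.0 - p)
        survival.append(s)
        cum_hazard.append(h)
    return survival, cum_hazard

S, H = cumulative_hazard(q.values())
for season, s, h in zip(q, S, H):
    print(f"{season:6s}  S = {s:.3f}  H = {h:.3f}")
```

Under that assumption the cumulative hazard after Spring, Summer and Fall is about 0.223, 0.916 and 2.303 rather than 20%, 70% and 145%, and it is infinite once everyone has died. In each case the proportion surviving equals exp(-H).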
You may look up a human mortality life table, although this is a discrete-time formulation, and try to accumulate $m_x$ (the age-specific death rate).
If you use R, here's a little example of approximating these functions from the number of deaths in each 1-year age interval:
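As a rough illustration of that kind of calculation, here is a minimal sketch in Python rather than R. The death counts and the cohort size of 1000 are invented for illustration and are not real life-table data, and deaths are assumed to fall mid-interval on average when computing exposure.

```python
import math

# Hypothetical numbers, invented for illustration only: deaths in four
# consecutive 1-year age intervals among 1000 people alive at the start.
deaths = [4, 5, 6, 8]
n_start = 1000

alive = n_start
H = 0.0
rows = []
for d in deaths:
    # Person-years lived in the interval, assuming deaths occur mid-interval.
    exposure = alive - d / 2
    m_x = d / exposure        # death rate in the interval, approximates h(x)
    H += m_x                  # accumulated m_x (interval width 1), approximates H(x)
    alive -= d
    S = alive / n_start       # proportion surviving to the end of the interval
    rows.append((m_x, H, S))

for i, (m_x, H_x, S_x) in enumerate(rows):
    print(f"interval {i}: m_x = {m_x:.5f}  H = {H_x:.5f}  "
          f"exp(-H) = {math.exp(-H_x):.5f}  S = {S_x:.5f}")
```

With small rates like these, exp(-H) computed from the accumulated rates stays very close to the directly observed proportion surviving, which is the link between the two functions.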
Hope this helps.
The book "An Introduction to Survival Analysis Using Stata" (2nd Edition) by Mario Cleves has a good chapter on that topic.
You can find the chapter on Google Books, p. 13-15. But I would advise reading the whole of chapter 2.
Here is the short form:
I'd HAZARD a guess that it's noteworthy owing to its use in diagnostic plots:
(1) In the Cox proportional hazards model $h(x) = e^{\beta^T z} h_0(x)$, where $\beta$ and $z$ are the coefficient and covariate vectors respectively, & $h_0(x)$ is the baseline hazard function; & so $\log H(x) = \beta^T z + \log H_0(x)$, where $H_0(x)$ is the baseline cumulative hazard. If you plot the estimate $\log \hat{H}(x)$ against $x$, different covariate patterns follow parallel curves, provided the proportional hazards assumption is correct.
(2) In the Weibull model $h(x) = \frac{\alpha}{\theta}\left(\frac{x}{\theta}\right)^{\alpha-1}$, where $\theta$ & $\alpha$ are the scale & shape parameters respectively; & so $\log H(x) = \alpha \log x - \alpha \log \theta$. If you plot the estimate $\log \hat{H}(x)$ against $\log x$, you get a straight line with slope $\hat{\alpha}$ & intercept $-\hat{\alpha} \log \hat{\theta}$, provided the Weibull assumption is correct. And of course a slope near 1 suggests an exponential model might fit.
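The Weibull check in (2) can be tried on simulated data. The sketch below is only an illustration under stated assumptions: the "true" shape and scale are arbitrary values picked to generate data, there is no censoring, the estimate of the cumulative hazard is taken to be the Nelson-Aalen estimator (the answer does not say which estimator to use), and the line is fitted by plain least squares instead of being read off a plot.

```python
import math
import random

random.seed(0)

# Hypothetical "true" parameters, chosen only to generate example data.
shape_true, scale_true = 1.5, 2.0
n = 2000

# random.weibullvariate takes (scale, shape), in that order.
times = sorted(random.weibullvariate(scale_true, shape_true) for _ in range(n))

# Nelson-Aalen estimate of H at each ordered death time (no censoring):
# each death adds 1 / (number still at risk just before it).
log_t, log_H = [], []
H_hat = 0.0
for i, t in enumerate(times):
    H_hat += 1.0 / (n - i)
    log_t.append(math.log(t))
    log_H.append(math.log(H_hat))

# Ordinary least-squares line through the points (log t, log H_hat).
mean_x = sum(log_t) / n
mean_y = sum(log_H) / n
sxx = sum((x - mean_x) ** 2 for x in log_t)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_t, log_H))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Slope estimates the shape; intercept = -shape * log(scale).
shape_hat = slope
scale_hat = math.exp(-intercept / slope)
print(f"slope (shape estimate): {shape_hat:.3f}")
print(f"scale estimate:         {scale_hat:.3f}")
```

If the data really are Weibull, the fitted slope should land near the shape used to simulate them; a clearly curved pattern in the points would instead argue against the Weibull assumption.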
An intuitive interpretation of $H(x)$ is the expected number of deaths of an individual up to time $x$ if the individual were to be resurrected after each death (without resetting time to zero).
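The resurrection interpretation can be checked by simulation. This is a sketch under assumptions of my own choosing: a Weibull hazard with arbitrary parameters, and the fact that for someone alive at time s, the extra cumulative hazard accrued until the next death, H(T) - H(s), follows a unit exponential distribution. The helper names `H`, `H_inv` and `deaths_up_to` are hypothetical.

```python
import math
import random

random.seed(1)

shape, scale = 1.5, 2.0   # hypothetical Weibull parameters
x_end = 5.0               # count deaths up to this time

def H(t):
    """Weibull cumulative hazard."""
    return (t / scale) ** shape

def H_inv(v):
    """Time at which the cumulative hazard reaches v."""
    return scale * v ** (1.0 / shape)

def deaths_up_to(x):
    """Number of deaths by time x with resurrection after each death.

    Time is not reset, so after a death at time s the individual faces
    the hazard h(t) for t > s; the next death occurs when the cumulative
    hazard has grown by a further Exp(1) amount.
    """
    count, cum = 0, 0.0
    while True:
        cum += random.expovariate(1.0)
        if H_inv(cum) > x:
            return count
        count += 1

n_sim = 20000
mean_deaths = sum(deaths_up_to(x_end) for _ in range(n_sim)) / n_sim
print(f"simulated mean deaths by x = {x_end}: {mean_deaths:.3f}")
print(f"H(x) = {H(x_end):.3f}")
```

The simulated average number of deaths should sit close to H(x), which also shows why a cumulative hazard above 1 is not a contradiction: it is an expected count, not a probability.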
In paraphrasing what @Scortchi is saying, I would emphasize that the cumulative hazard function does not have a nice interpretation, and as such I would not try to use it as a way to interpret results; telling a non-statistical researcher that the cumulative hazards are different will most likely result in an "mm-hm" answer and then they'll never ask about the subject again, and not in a good way.
However, the cumulative hazard function turns out to be very useful mathematically, for example as a general way to link the hazard function and the survival function. So it's important to know what the cumulative hazard is and how it can be used in various statistical methods. But in general, I don't think it's particularly useful to think about real data in terms of cumulative hazards.
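For reference, the link mentioned here can be written out in a few lines. This is the standard continuous-time derivation, using $f$, $S$, $h$ and $H$ for the density, survival, hazard and cumulative hazard functions, and assuming $S(0) = 1$:

$$h(x) = \frac{f(x)}{S(x)} = -\frac{d}{dx} \log S(x)$$

$$H(x) = \int_0^x h(u)\,du = -\log S(x)$$

$$S(x) = \exp\{-H(x)\}$$

So a cumulative hazard of 1 corresponds to a survival probability of $e^{-1} \approx 0.37$, and $H(x)$ grows without bound as $S(x)$ approaches 0.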