How to do cross-validation with a Cox proportional hazards model?

15

Suppose I have constructed a prediction model for the occurrence of a particular disease in one dataset (the model building dataset) and now want to check how well the model works in a new dataset (the validation dataset). For a model built with logistic regression, I would calculate the predicted probability for each person in the validation dataset based on the model coefficients obtained from the model building dataset and then, after dichotomizing those probabilities at some cutoff value, I can construct a 2x2 table that allows me to calculate the true positive rate (sensitivity) and the true negative rate (specificity). Moreover, I can construct the entire ROC curve by varying the cutoff and then obtain the AUC for the ROC graph.

Now suppose that I actually have survival data. So, I used a Cox proportional hazards model in the model building dataset and now want to check how well the model works in the validation dataset. Since the baseline hazard is not a parametric function in Cox models, I do not see how I can get the predicted survival probability for each person in the validation dataset based on the model coefficients obtained in the model building dataset. So, how can I go about checking how well the model works in the validation dataset? Are there established methods for doing this? And if yes, are they implemented in any software? Thanks in advance for any suggestions!

Wolfgang
source

Answers:

9

An ROC curve is not useful in this setting, but the generalized ROC area (the c-index, which requires no dichotomization at all) is. The R rms package will compute the c-index and cross-validated or bootstrap overfitting-corrected versions of it. You can do this without holding back any data if you fully pre-specify the model or repeat a backwards stepdown algorithm at each resample. If you truly want to do external validation, i.e., if your validation sample is enormous, you can use rcorr.cens (from Hmisc, which rms loads) and val.surv (from rms).
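To make the c-index concrete, here is a minimal sketch in plain Python of Harrell's concordance index on a validation set. It is not the rms or Hmisc implementation. The coefficients `beta` and the five validation records are made-up values for illustration only. Note that it needs only the linear predictor from the model building dataset, so the unspecified baseline hazard never enters.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored data.

    A pair (i, j) is usable when subject i has an observed event strictly
    before subject j's follow-up time. The pair is concordant when the
    subject who failed earlier has the higher risk score; tied scores
    count as one half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs in the data")
    return concordant / usable


# Hypothetical coefficients estimated on the model building dataset.
beta = [0.8, -0.5]

# Hypothetical validation data: (covariates, follow-up time, event indicator).
validation = [
    ([1.0, 0.2], 2.0, 1),
    ([0.3, 1.0], 9.0, 0),
    ([0.9, 0.1], 3.5, 1),
    ([0.1, 0.8], 7.0, 1),
    ([0.5, 0.5], 5.0, 0),
]

# Linear predictor x'beta for each validation subject (higher = higher risk).
scores = [sum(b * x for b, x in zip(beta, covs)) for covs, _, _ in validation]
times = [t for _, t, _ in validation]
events = [e for _, _, e in validation]

c_index = concordance_index(times, events, scores)
print(c_index)  # 0.875 (7 of 8 usable pairs are concordant)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk scores against observed failure times.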

Frank Harrell
source
Thank you for the answer. Could you explain why an ROC curve is not useful in this setting? I have seen some prominent applications where such an approach was used (e.g., Hippisley-Cox et al. (2007). Derivation and validation of QRISK, a new cardiovascular disease risk score for the United Kingdom: prospective open cohort study. British Medical Journal, 335(7611): 136), so now I am wondering about their methods.
Wolfgang
1
Here's an analogy. Suppose one is interested in assessing how aging relates to running ability. The ROC approach would ask the question: given someone's running ability, what is the probability that they are above a certain (arbitrary) age? In a cohort study it only adds confusion to reverse the roles of the independent and dependent variables, and ROC curves also tempt one to make cutoffs on a predictor, which is known to be bad statistical practice (see biostat.mc.vanderbilt.edu/CatContinuous). Besides creating havoc, cutpoints must actually be functions of all other predictors.
Frank Harrell
Again, thanks for replying. I am not entirely convinced though. I totally agree that arbitrary categorization of a continuous variable is bad practice, but the ROC approach categorizes on all possible cutoffs and summarizes that information via the AUC. So there is no arbitrariness in that. It also seems like a standard and accepted practice for logistic regression models. So are you against the use of ROC curves in general or just in the context of survival models?
Wolfgang
2
ROC curves in general, unless you use them for what they are really intended for: mass one-time group decision making. They don't help with individual decision making, where for a given subject you condition on X=x instead of X>c (we know exact predictor values for each subject, not only that they exceed a cutoff). ROC curves also tempt even good analysts to select a cutpoint. What does the ROC curve tell you that you can't get from standard regression stats?
Frank Harrell
My experience tells me that a lot of researchers/practitioners actually want dichotomous decision rules (leaving aside whether that is useful or not). At any rate, I'll follow up on some of those R functions and see where this gets me. Thanks for the discussion.
Wolfgang
0

I know that this question is pretty old, but what I did when I encountered the same problem was to use the predict function to get a "score" for each subject in the validation set. I then split the subjects according to whether their score was higher or lower than the median and plotted the Kaplan-Meier curves for the two groups. This should show a separation of the subjects if your model is predictive. I also tested the association of the score (actually of its natural log, for normality) with survival using the coxph function from the survival package in R.
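A rough sketch of the median split in plain Python, for anyone who wants to see the mechanics. The scores, times and event indicators below are invented for illustration, and `kaplan_meier` is a hand-rolled product-limit estimator, not a function from the survival package.

```python
from statistics import median


def kaplan_meier(times, events):
    """Product-limit estimate: list of (t, S(t)) at each distinct event time."""
    surv = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Observed events exactly at time t.
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical validation-set scores (linear predictors from the fitted model),
# follow-up times, and event indicators (1 = event, 0 = censored).
scores = [1.2, 0.9, 0.7, 0.4, -0.1, -0.3, -0.6, -1.0]
times = [1.0, 2.0, 3.0, 6.0, 4.0, 8.0, 9.0, 10.0]
events = [1, 1, 1, 0, 1, 0, 1, 0]

# Split subjects at the median score.
cut = median(scores)
groups = {"high": [], "low": []}
for s, t, e in zip(scores, times, events):
    groups["high" if s > cut else "low"].append((t, e))

# One Kaplan-Meier curve per group.
curves = {}
for name, rows in groups.items():
    group_times = [t for t, _ in rows]
    group_events = [e for _, e in rows]
    curves[name] = kaplan_meier(group_times, group_events)

for name, curve in curves.items():
    print(name, [(t, round(s, 3)) for t, s in curve])
```

In this toy data the high-score group's curve drops faster than the low-score group's, which is the kind of separation described above.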

PMA
source