Apart from a few unique circumstances where we absolutely must understand the conditional mean relationship, in what situations should a researcher pick OLS over quantile regression?
I don't want the answer to be "if there is no use in understanding the tail relationships", because we could just use median regression as a substitute for OLS.
least-squares
econometrics
regression-strategies
quantile-regression
semiparametric
Frank Harrell
Answers:
If you are interested in the mean, use OLS; if in the median, use quantile regression.
One big difference is that the mean is more affected by outliers and other extreme data. Sometimes, that is exactly what you want. One example is when your dependent variable is the social capital in a neighborhood. The presence of a single person with a lot of social capital may be very important to the whole neighborhood.
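A minimal sketch of that sensitivity, using made-up social-capital scores (the numbers are purely illustrative, not from any dataset): adding one extreme resident moves the mean a lot and leaves the median where it was.

```python
import numpy as np

# Hypothetical social-capital scores for ten residents (illustrative values only).
scores = np.array([2.0, 3.0, 3.0, 4.0, 4.0, 4.0, 5.0, 5.0, 6.0, 6.0])

# Add one resident with a very large amount of social capital.
with_outlier = np.append(scores, 100.0)

# How far each summary moves when the extreme resident is added.
mean_shift = with_outlier.mean() - scores.mean()
median_shift = np.median(with_outlier) - np.median(scores)

print(f"mean shift:   {mean_shift:.2f}")
print(f"median shift: {median_shift:.2f}")
```

Whether that shift is a problem or the point depends on the question: if the extreme resident matters to the whole neighborhood, the mean is the summary that registers them.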
There seems to be some confusion in the premise of the question. The second paragraph says, "we could just use median regression as a substitute for OLS". Note that regressing the conditional median on X is (a form of) quantile regression.
If the errors in the underlying data-generating process are normally distributed (which can be assessed by checking whether the residuals are normal), then the conditional mean equals the conditional median. Moreover, any quantile you may be interested in (e.g., the 95th percentile, or the 37th percentile) can be determined for a given point in the X dimension with standard OLS methods. The main appeal of quantile regression is that it is more robust than OLS. The downside is that if all the assumptions are met, it will be less efficient (that is, you will need a larger sample size to achieve the same power, and your estimates will be less precise).
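As a sketch of how a quantile can be read off an OLS fit when the errors really are normal with constant variance: the tau-th conditional quantile is the fitted mean plus the residual SD times the standard normal quantile. The simulated data, the coefficients, and the helper name `ols_quantile` below are assumptions made for illustration.

```python
from statistics import NormalDist

import numpy as np

# Simulated data with normal, homoscedastic errors (an assumption of this sketch).
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(0.0, 5.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.5, size=n)

# Ordinary least squares fit with an intercept.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Residual standard deviation with the usual degrees-of-freedom correction.
resid = y - X @ beta
sigma_hat = np.sqrt(resid @ resid / (n - X.shape[1]))


def ols_quantile(x0, tau):
    """Parametric quantile estimate: fitted mean + sigma_hat * z_tau (hypothetical helper)."""
    z_tau = NormalDist().inv_cdf(tau)
    return beta[0] + beta[1] * x0 + sigma_hat * z_tau


q95_at_3 = ols_quantile(3.0, 0.95)

# Share of observations at or below the estimated 95th-percentile line.
coverage = np.mean(y <= ols_quantile(x, 0.95))

print(f"estimated 95th percentile at x = 3: {q95_at_3:.3f}")
print(f"share of points below the line:     {coverage:.3f}")
```

The estimate is only as good as the normality and constant-variance assumptions behind it; with skewed or heteroscedastic errors this line would not track the true quantile.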
Peter Flom gave a great and concise answer; I just want to expand on it. The most important part of the question is how to define "worse".
To define "worse", we need a metric. The functions that measure how good or bad a fit is are called loss functions.
We can define the loss function in different ways. No definition is right or wrong, but different definitions satisfy different needs. Two well-known loss functions are squared loss and absolute value loss.
If we use squared loss as the measure of success, quantile regression will be worse than OLS. On the other hand, if we use absolute value loss, quantile regression (specifically, median regression) will be better.
Peter Flom's answer makes the same point: if you are interested in the mean, use OLS; if in the median, use quantile regression.
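The link between the two loss functions and the two estimators can be checked in the simplest case, an intercept-only model. The sketch below uses a simulated skewed sample and a brute-force grid search (both are choices made for illustration): the constant that minimizes squared loss lands on the sample mean, and the constant that minimizes absolute loss lands on the sample median.

```python
import numpy as np

# A skewed sample, so that the mean and the median differ noticeably.
rng = np.random.default_rng(42)
y = rng.exponential(scale=1.0, size=501)

# Candidate constant predictions on a fine grid (step 0.0025).
candidates = np.linspace(0.0, 5.0, 2001)

# Average loss of every candidate under each loss function.
errors = y[:, None] - candidates[None, :]
squared_loss = (errors ** 2).mean(axis=0)
absolute_loss = np.abs(errors).mean(axis=0)

best_for_squared = candidates[np.argmin(squared_loss)]
best_for_absolute = candidates[np.argmin(absolute_loss)]

print(f"squared-loss minimizer:  {best_for_squared:.3f} (sample mean   {y.mean():.3f})")
print(f"absolute-loss minimizer: {best_for_absolute:.3f} (sample median {np.median(y):.3f})")
```

Each estimator wins on the loss it minimizes, which is why "worse" has no answer until the loss is fixed.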
To restate what some of the excellent responses above said, but in a slightly different way: quantile regression makes fewer assumptions. On the right-hand side of the model the assumptions are the same as with OLS, but on the left-hand side the only assumption is continuity of the distribution of Y (few ties). One could say that OLS provides an estimate of the median if the distribution of residuals is symmetric (hence median = mean), and under symmetry and not-too-heavy tails (especially under normality), OLS is superior to quantile regression for estimating the median, because of its much better precision. If there is only an intercept in the model, the quantile regression estimate is exactly the sample median, which has an efficiency of 2/π (about 0.64) relative to the mean under normality. Given a good estimate of the root mean squared error (residual SD), you can use OLS parametrically to estimate any quantile. But quantile estimates from OLS are assumption-laden, which is why we often use quantile regression.
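The 2/π figure can be checked with a small simulation. This is a sketch only: the sample size, replication count, and seed are arbitrary choices, and 2/π is the large-sample value, so a finite-sample run should land near it rather than on it.

```python
import numpy as np

# Many independent samples of standard normal data.
rng = np.random.default_rng(1)
n, reps = 101, 20000
samples = rng.normal(size=(reps, n))

# Sampling variance of each estimator of the center across replications.
var_mean = samples.mean(axis=1).var()
var_median = np.median(samples, axis=1).var()

# Relative efficiency of the median compared with the mean.
efficiency = var_mean / var_median

print(f"simulated efficiency: {efficiency:.3f}")
print(f"2 / pi:               {2 / np.pi:.3f}")
```

Put differently, under normality the median needs roughly π/2 (about 1.57) times as many observations as the mean to reach the same precision.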
If you want to estimate the mean, you can't get that from quantile regression.
If you want to estimate the mean and quantiles with minimal assumptions (though more assumptions than quantile regression) while gaining efficiency, use semiparametric ordinal regression. This also gives you exceedance probabilities. A detailed case study is in my RMS course notes, where it is shown on one dataset that the lowest average mean absolute estimation error over several parameters (quantiles and mean) is achieved by ordinal regression. But for estimating only the mean, OLS was best, and for estimating only quantiles, quantile regression was best.
Another big advantage of ordinal regression is that it is, except for estimating the mean, completely Y-transformation invariant.