After an ARMA model is fit to a time series, it is common to check the residuals via the Ljung-Box portmanteau test (among other tests). The Ljung-Box test returns a p-value. It has a parameter, h, which is the number of lags to be tested. Some texts recommend using h = 20; others recommend using h = ln(n); most do not say what h to use.
Rather than using a single value for h, suppose that I run the Ljung-Box test for all h < 50, and then pick the h that gives the minimum p-value. Is that approach reasonable? What are its advantages and disadvantages? (One obvious disadvantage is increased computation time, but that is not a problem here.) Is there any literature on this?
To elaborate slightly: if the test gives p > 0.05 for all h, then obviously the time series (the residuals) passes the test. My question concerns how to interpret the test if p < 0.05 for some values of h and not for others.
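One way to probe the "scan all h and keep the minimum p-value" idea empirically is a small Monte Carlo on pure Gaussian white noise, where any rejection is a false alarm. The sketch below is an illustration under stated assumptions, not a definitive study: it uses raw white noise rather than ARMA residuals (so the degrees of freedom equal h), scans h = 1..20 instead of h < 50 to keep it fast, and the function names are hypothetical.

```python
import math
import random

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1 (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

def ljung_box_pvalues(x, max_lag):
    """Ljung-Box p-values for h = 1..max_lag on a raw series (df = h)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    acc, pvals = 0.0, []
    for h in range(1, max_lag + 1):
        rho = sum(d[t] * d[t - h] for t in range(h, n)) / c0
        acc += rho * rho / (n - h)
        pvals.append(chi2_sf(n * (n + 2) * acc, h))
    return pvals

def rejection_rates(n=100, max_lag=20, fixed_h=10, reps=300, alpha=0.05, seed=0):
    """Share of white-noise series rejected at one fixed h vs. at the minimum p over all h."""
    rng = random.Random(seed)
    fixed = scanned = 0
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        p = ljung_box_pvalues(x, max_lag)
        fixed += int(p[fixed_h - 1] < alpha)
        scanned += int(min(p) < alpha)
    return fixed / reps, scanned / reps

fixed_rate, scanned_rate = rejection_rates()
print("fixed h:", fixed_rate, "min-p scan:", scanned_rate)
```

Because the scan rejects whenever any single h rejects, its false-alarm rate can never be below that of a fixed h; the simulation only indicates how much larger it may get in this particular setup.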
Answers:
The answer definitely depends on what you are actually trying to use the Q test for.
The common reason is to be more or less confident about the joint statistical significance of the null hypothesis of no autocorrelation up to lag h (alternatively, assuming that you have something close to a weak white noise), and to build a parsimonious model with as few parameters as possible.
Time series data usually have a natural seasonal pattern, so a practical rule of thumb is to set h to twice the seasonal period. Another choice is the forecasting horizon, if you use the model for forecasting. Finally, if you find significant departures at later lags, think about corrections (could they be due to seasonal effects, or was the data not corrected for outliers?).
It is a joint significance test, so if the choice of h is data-driven, why should I care about some small (occasional?) departures at any lag less than h, supposing of course that h is much less than n (the power of the test you mentioned)? To find a simple yet relevant model, I suggest the information criteria described below.
So it will depend on how far from the present the departure occurs. Disadvantages of accommodating distant departures: more parameters to estimate, fewer degrees of freedom, and worse predictive power of the model.
Try to estimate the model including MA and/or AR terms at the lag where the departure occurs, and additionally look at one of the information criteria (either AIC or BIC, depending on the sample size); this will give you more insight into which model is more parsimonious. Out-of-sample prediction exercises are also welcome here.
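As a rough illustration of comparing candidate models by information criteria, the sketch below ranks pure AR(p) fits by AIC and BIC. It is a simplification, not the procedure described above: it swaps in Yule-Walker fits via the Levinson-Durbin recursion (no MA terms, no terms at selected lags only), uses the approximations AIC = n log(sigma^2) + 2p and BIC = n log(sigma^2) + p log(n), and runs on a simulated AR(1) series. All names are hypothetical.

```python
import math
import random

def autocov(x, max_lag):
    """Biased sample autocovariances c_0..c_max_lag (divide by n)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[t] * d[t - k] for t in range(k, n)) / n for k in range(max_lag + 1)]

def ar_information_criteria(x, max_order):
    """Return rows (p, AIC, BIC) for Yule-Walker AR(p) fits, p = 0..max_order."""
    n = len(x)
    r = autocov(x, max_order)
    phi, var = [], r[0]                      # AR coefficients and innovation variance
    rows = [(0, n * math.log(var), n * math.log(var))]
    for p in range(1, max_order + 1):
        # Levinson-Durbin step: reflection coefficient, then coefficient update
        k = (r[p] - sum(phi[i] * r[p - 1 - i] for i in range(p - 1))) / var
        phi = [phi[i] - k * phi[p - 2 - i] for i in range(p - 1)] + [k]
        var *= 1.0 - k * k
        fit = n * math.log(var)
        rows.append((p, fit + 2 * p, fit + p * math.log(n)))
    return rows

# Simulated AR(1) series with coefficient 0.8 (illustrative data only)
rng = random.Random(42)
x, prev = [], 0.0
for _ in range(500):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    x.append(prev)

table = ar_information_criteria(x, 6)
aic_order = min(table, key=lambda row: row[1])[0]
bic_order = min(table, key=lambda row: row[2])[0]
print("AIC picks p =", aic_order, "; BIC picks p =", bic_order)
```

Since the BIC penalty exceeds the AIC penalty once log(n) > 2, BIC should never select a larger order than AIC here, which is the sense in which it favors the more parsimonious model.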
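Assume that we specify a simple AR(1) model, with all the usual properties,
$$y_t = \beta y_{t-1} + u_t$$
Denote the theoretical covariance of the error term as
$$\gamma_j \equiv E(u_t u_{t-j})$$
If we could observe the error term, then the sample autocorrelation of the error term would be defined as
$$\tilde\rho_j \equiv \frac{\tilde\gamma_j}{\tilde\gamma_0}$$
where
$$\tilde\gamma_j \equiv \frac{1}{n}\sum_{t=j+1}^{n} u_t u_{t-j}, \qquad j = 0, 1, 2, \dots$$
In practice, however, we do not observe the error term. So the sample autocorrelation related to the error term will be estimated using the residuals from estimation, as
$$\hat\rho_j \equiv \frac{\hat\gamma_j}{\hat\gamma_0}, \qquad \hat\gamma_j \equiv \frac{1}{n}\sum_{t=j+1}^{n} \hat u_t \hat u_{t-j}, \qquad j = 0, 1, 2, \dots$$
The Box-Pierce Q statistic (the Ljung-Box Q is just an asymptotically neutral scaled version of it) is
$$Q_{BP} = n\sum_{j=1}^{p} \hat\rho_j^2 = \sum_{j=1}^{p} \left[\sqrt{n}\,\hat\rho_j\right]^2$$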
Our issue is exactly whether $Q_{BP}$ can be said to have an asymptotic chi-square distribution (under the null of no autocorrelation in the error term) in this model.
For this to happen, each and every one of the $\sqrt{n}\,\hat\rho_j$ must be asymptotically standard Normal. A way to check this is to examine whether $\sqrt{n}\,\hat\rho_j$ has the same asymptotic distribution as $\sqrt{n}\,\tilde\rho_j$ (which is constructed using the true errors, and so has the desired asymptotic behavior under the null).
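For concreteness, the two Q statistics can be computed from a list of residuals as in the sketch below. It is a minimal illustration assuming zero-mean residuals (no mean is subtracted, matching the autocovariance definition used in this answer), the usual Ljung-Box weighting n(n+2)/(n-j), and made-up residual values; the function names are hypothetical.

```python
def sample_autocorr(resid, max_lag):
    """rho_hat_j = gamma_hat_j / gamma_hat_0, gamma_hat_j = (1/n) * sum_t e_t * e_{t-j}."""
    n = len(resid)
    gamma0 = sum(e * e for e in resid) / n
    return [
        sum(resid[t] * resid[t - j] for t in range(j, n)) / n / gamma0
        for j in range(1, max_lag + 1)
    ]

def box_pierce_q(rho, n):
    """Q_BP = n * sum_j rho_j^2."""
    return n * sum(r * r for r in rho)

def ljung_box_q(rho, n):
    """Q_LB = n * (n + 2) * sum_j rho_j^2 / (n - j)."""
    return n * (n + 2) * sum(r * r / (n - j) for j, r in enumerate(rho, start=1))

# Illustrative residuals (made-up numbers, not from any fitted model)
resid = [0.5, -1.2, 0.3, 0.8, -0.4, 1.1, -0.9, 0.2, -0.6, 0.7]
rho = sample_autocorr(resid, 3)
q_bp = box_pierce_q(rho, len(resid))
q_lb = ljung_box_q(rho, len(resid))
print("Q_BP =", q_bp, " Q_LB =", q_lb)
```

The two statistics differ only by the factor (n+2)/(n-j) on each term, which tends to 1 as n grows; that is the sense in which the Ljung-Box Q is an asymptotically neutral rescaling.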
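We have that
$$\hat u_t = y_t - \hat\beta y_{t-1} = u_t - (\hat\beta - \beta)\, y_{t-1}$$
where $\hat\beta$ is a consistent estimator. So
$$\hat\gamma_j = \frac{1}{n}\sum_{t=j+1}^{n} \left[u_t - (\hat\beta - \beta) y_{t-1}\right]\left[u_{t-j} - (\hat\beta - \beta) y_{t-j-1}\right]$$
$$= \tilde\gamma_j - \frac{1}{n}\sum_{t=j+1}^{n} (\hat\beta - \beta)\left[u_t y_{t-j-1} + u_{t-j} y_{t-1}\right] + \frac{1}{n}\sum_{t=j+1}^{n} (\hat\beta - \beta)^2 y_{t-1} y_{t-j-1}$$
The sample is assumed to be stationary and ergodic, and moments are assumed to exist up to the desired order. Since the estimator $\hat\beta$ is consistent, this is enough for the two sums to go to zero. So we conclude
$$\hat\gamma_j \xrightarrow{p} \tilde\gamma_j$$
This implies that
$$\hat\rho_j \xrightarrow{p} \tilde\rho_j \xrightarrow{p} \rho_j$$
But this does not automatically guarantee that $\sqrt{n}\,\hat\rho_j$ converges to $\sqrt{n}\,\tilde\rho_j$ in distribution (note that the continuous mapping theorem does not apply here, because the transformation applied to the random variables depends on $n$). In order for this to happen, we need
$$\sqrt{n}\,\hat\gamma_j \xrightarrow{d} \sqrt{n}\,\tilde\gamma_j$$
(the denominator $\gamma_0$, tilde or hat, will converge to the variance of the error term in both cases, so it is neutral to our issue).
We have
$$\sqrt{n}\,\hat\gamma_j = \sqrt{n}\,\tilde\gamma_j - \frac{1}{n}\sum_{t=j+1}^{n} \sqrt{n}(\hat\beta - \beta)\left[u_t y_{t-j-1} + u_{t-j} y_{t-1}\right] + \frac{1}{n}\sum_{t=j+1}^{n} \sqrt{n}(\hat\beta - \beta)^2 y_{t-1} y_{t-j-1}$$
So the question is: do these two sums, now multiplied by $\sqrt{n}$, go to zero in probability, so that we are left with $\sqrt{n}\,\hat\gamma_j = \sqrt{n}\,\tilde\gamma_j$ asymptotically?
For the second sum we have
$$\frac{1}{n}\sum_{t=j+1}^{n} \sqrt{n}(\hat\beta - \beta)^2 y_{t-1} y_{t-j-1} = \frac{1}{n}\sum_{t=j+1}^{n} \left[\sqrt{n}(\hat\beta - \beta)\right]\left[(\hat\beta - \beta)\, y_{t-1} y_{t-j-1}\right]$$
Since $\left[\sqrt{n}(\hat\beta - \beta)\right]$ converges to a random variable, and $\hat\beta$ is consistent, this will go to zero.
For the first sum, here too we have that $\left[\sqrt{n}(\hat\beta - \beta)\right]$ converges to a random variable, and so we have that
$$\frac{1}{n}\sum_{t=j+1}^{n} \left[u_t y_{t-j-1} + u_{t-j} y_{t-1}\right] \xrightarrow{p} E[u_t y_{t-j-1}] + E[u_{t-j} y_{t-1}]$$
The first expected value, $E[u_t y_{t-j-1}]$, is zero by the assumptions of the standard AR(1) model. But the second expected value is not, since the dependent variable depends on past errors.
So $\sqrt{n}\,\hat\rho_j$ won't have the same asymptotic distribution as $\sqrt{n}\,\tilde\rho_j$. But the asymptotic distribution of the latter is standard Normal, which is the one leading to a chi-squared distribution when squaring the random variable.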
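The difference between the two statistics can also be looked at by simulation. The sketch below is an illustration under assumptions that are mine, not the answer's: Gaussian errors, beta = 0.5, n = 200, OLS without an intercept, and lag j = 1 only. It compares the spread of the scaled lag-1 autocorrelation computed from the true errors with the one computed from the AR(1) residuals.

```python
import math
import random
import statistics

def lag1_autocorr(u):
    """gamma_1 / gamma_0 with gamma_j = (1/n) * sum_t u_t * u_{t-j} (no mean removal)."""
    n = len(u)
    g0 = sum(v * v for v in u) / n
    g1 = sum(u[t] * u[t - 1] for t in range(1, n)) / n
    return g1 / g0

def simulate(beta=0.5, n=200, reps=400, burn=50, seed=1):
    """Return (variance of sqrt(n)*rho_tilde_1, variance of sqrt(n)*rho_hat_1)."""
    rng = random.Random(seed)
    from_errors, from_resid = [], []
    for _ in range(reps):
        y_prev, ys, us = 0.0, [], []
        for t in range(burn + n + 1):
            e = rng.gauss(0.0, 1.0)
            y_prev = beta * y_prev + e
            if t >= burn:                    # drop the burn-in period
                ys.append(y_prev)
                us.append(e)
        # OLS estimate of beta in y_t = beta * y_{t-1} + u_t
        b = (sum(ys[t] * ys[t - 1] for t in range(1, n + 1))
             / sum(ys[t - 1] ** 2 for t in range(1, n + 1)))
        resid = [ys[t] - b * ys[t - 1] for t in range(1, n + 1)]
        from_errors.append(math.sqrt(n) * lag1_autocorr(us[1:]))
        from_resid.append(math.sqrt(n) * lag1_autocorr(resid))
    return statistics.pvariance(from_errors), statistics.pvariance(from_resid)

var_true, var_resid = simulate()
print("variance from true errors:", var_true)
print("variance from residuals:  ", var_resid)
```

If the two versions shared the same limiting distribution, the two variances should be close; in this setup one would expect the residual-based one to come out visibly smaller, in line with the argument above.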
Therefore we conclude, that in a pure time series model, the Box-Pierce Q and the Ljung-Box Q statistic cannot be said to have an asymptotic chi-square distribution, so the test loses its asymptotic justification.
This happens because the right-hand side variable (here the lag of the dependent variable) by design is not strictly exogenous to the error term, and we have found that such strict exogeneity is required for the BP/LB Q-statistic to have the postulated asymptotic distribution.
Here the right-hand-side variable is only "predetermined", and the Breusch-Godfrey test is then valid (for the full set of conditions required for an asymptotically valid test, see Hayashi 2000, p. 146-149).
Before you zero-in on the "right" h (which appears to be more of an opinion than a hard rule), make sure the "lag" is correctly defined.
http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm
Quoting the section below Issue 4 in the above link:
"....The p-values shown for the Ljung-Box statistic plot are incorrect because the degrees of freedom used to calculate the p-values are lag instead of lag - (p+q). That is, the procedure being used does NOT take into account the fact that the residuals are from a fitted model. And YES, at least one R core developer knows this...."
Edit (01/23/2011): Here's an article by Burns that might help:
http://lib.stat.cmu.edu/S/Spoetry/Working/ljungbox.pdf
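The degrees-of-freedom point quoted above can be made concrete with a small sketch: the same Q value is referred to a chi-square with `lag` degrees of freedom (no adjustment) and with `lag - (p+q)` degrees of freedom (adjusted for the fitted ARMA(p, q)). The Q value, the lag, and the model order below are made-up illustration numbers, and the parameter name `fitdf` is borrowed from R's `Box.test` only as a naming convention.

```python
import math

def chi2_sf(x, df):
    """P(X > x) for a chi-square variable with integer df >= 1 (closed-form series)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term = total = 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    term, total = math.sqrt(x), 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

def ljung_box_pvalue(q, lag, fitdf=0):
    """p-value of a Ljung-Box Q using lag - fitdf degrees of freedom (fitdf = p + q)."""
    df = lag - fitdf
    if df < 1:
        raise ValueError("lag must exceed the number of fitted ARMA parameters")
    return chi2_sf(q, df)

# Hypothetical numbers: Q = 18.3 at lag 10, residuals from an ARMA(1, 1) fit
q_stat, lag, p_order, q_order = 18.3, 10, 1, 1
p_naive = ljung_box_pvalue(q_stat, lag)
p_adjusted = ljung_box_pvalue(q_stat, lag, fitdf=p_order + q_order)
print("df = lag:", p_naive, " df = lag - (p+q):", p_adjusted)
```

With fewer degrees of freedom the same Q yields a smaller p-value, so ignoring the adjustment makes the test more lenient toward the fitted model than it should be.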
The thread "Testing for autocorrelation: Ljung-Box versus Breusch-Godfrey" shows that the Ljung-Box test is essentially inapplicable in the case of an autoregressive model. It also shows that the Breusch-Godfrey test should be used instead. That limits the relevance of your question and the answers (although the answers may include some generally good points).
Escanciano and Lobato constructed a portmanteau test with automatic, data-driven lag selection based on the Box-Pierce test and its refinements (which include the Ljung-Box test).
The gist of their approach is to combine the AIC and BIC criteria (common in the identification and estimation of ARMA models) to select the optimal number of lags to be used. In the introduction they suggest that, intuitively, "test[s] conducted using the BIC criterion are able to properly control for type I error and are more powerful when serial correlation is present in the first order". Tests based on AIC, by contrast, are more powerful against high-order serial correlation. Their procedure thus chooses a BIC-type lag selection when the autocorrelations seem to be small and present only at low order, and an AIC-type lag selection otherwise.
The test is implemented in the R package vrtest (see function Auto.Q).
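The BIC/AIC switch described above can be sketched as follows. This is a simplified illustration, not the vrtest implementation: it uses plain sample autocorrelations instead of the heteroskedasticity-robust ones of the paper, and the switching threshold sqrt(q log n) with q = 2.4 as well as the chi-square(1) reference distribution are stated here from my reading of the paper and should be checked against it. All function names are hypothetical.

```python
import math
import random

def autocorr(x, max_lag):
    """Sample autocorrelations rho_1..rho_max_lag of a mean-adjusted series."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d)
    return [sum(d[t] * d[t - j] for t in range(j, n)) / c0 for j in range(1, max_lag + 1)]

def auto_portmanteau(x, max_lag, q=2.4):
    """Return (selected lag, Box-Pierce Q at that lag, p-value against chi-square(1))."""
    n = len(x)
    rho = autocorr(x, max_lag)
    log_n = math.log(n)
    # Small autocorrelations everywhere -> BIC-type penalty; otherwise AIC-type penalty
    use_bic = math.sqrt(n) * max(abs(r) for r in rho) <= math.sqrt(q * log_n)
    best_lag, best_value, best_q, cum = 1, -math.inf, 0.0, 0.0
    for p, r in enumerate(rho, start=1):
        cum += n * r * r                     # Box-Pierce Q up to lag p
        value = cum - (p * log_n if use_bic else 2 * p)
        if value > best_value:               # strict '>' keeps the smallest maximizer
            best_lag, best_value, best_q = p, value, cum
    p_value = math.erfc(math.sqrt(best_q / 2.0))   # chi-square(1) upper tail
    return best_lag, best_q, p_value

# Illustrative data: an AR(1) series with coefficient 0.8
rng = random.Random(7)
series, prev = [], 0.0
for _ in range(300):
    prev = 0.8 * prev + rng.gauss(0.0, 1.0)
    series.append(prev)

lag_sel, stat, pval = auto_portmanteau(series, max_lag=20)
print("selected lag:", lag_sel, " statistic:", stat, " p-value:", pval)
```

The only tuning input left to the user is the upper bound `max_lag`; the lag actually used is chosen by the penalized criterion.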
The two most common settings are $\min(20, T-1)$ and $\ln T$, where $T$ is the length of the series, as you correctly noted.
The first one is supposed to be from the authoritative book by Box, Jenkins, and Reinsel, Time Series Analysis: Forecasting and Control, 3rd ed., Englewood Cliffs, NJ: Prentice Hall, 1994. However, here's all they say about the lags on p. 314:
It's not a strong argument or suggestion by any means, yet people keep repeating it from one place to another.
The second setting for a lag is from Tsay, R. S. Analysis of Financial Time Series. 2nd Ed. Hoboken, NJ: John Wiley & Sons, Inc., 2005, here's what he wrote on p.33:
This is a somewhat stronger argument, but there's no description of what kind of study was done. So, I wouldn't take it at face value. He also warns about seasonality:
Summarizing, if you just need to plug some lag into the test and move on, then you can use either of these settings, and that's fine, because that's what most practitioners do. We're either lazy or, more likely, don't have time for this stuff. Otherwise, you'd have to conduct your own research on the power and properties of the statistics for the series that you deal with.
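The two rules of thumb above are easy to write down. In this minimal sketch the function names are hypothetical, and rounding ln(T) to the nearest integer (with a floor of 1) is my assumption, since the rule itself does not say how to turn ln(T) into a whole number of lags.

```python
import math

def lag_box_jenkins(T):
    """Lag choice min(20, T - 1) for a series of length T."""
    return min(20, T - 1)

def lag_log_rule(T):
    """Lag choice ln(T), rounded to the nearest integer and at least 1 (assumed rounding)."""
    return max(1, round(math.log(T)))

for T in (15, 100, 500):
    print(T, lag_box_jenkins(T), lag_log_rule(T))
```

For long series the two rules diverge considerably, since ln(T) grows very slowly while the first rule saturates at 20.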
UPDATE.
Here's my answer to Richard Hardy's comment and his answer, which refers to another thread on CV started by him. You can see that the exposition in the accepted (by Richard Hardy himself) answer in that thread is clearly based on the ARMAX model, i.e. the model with exogenous regressors $x_t$:
However, the OP did not indicate that he's doing ARMAX; to the contrary, he explicitly mentions ARMA:
One of the first papers that pointed to a potential issue with the LB test was Dezhbakhsh, Hashem (1990). "The Inappropriate Use of Serial Correlation Tests in Dynamic Linear Models," Review of Economics and Statistics, 72, 126-132. Here's the excerpt from the paper:
As you can see, he doesn't object to using LB test for pure time series models such as ARMA. See also the discussion in the manual to a standard econometrics tool EViews:
Yes, you have to be careful with ARMAX models and LB test, but you can't make a blanket statement that LB test is always wrong for all autoregressive series.
UPDATE 2
Alecos Papadopoulos's answer shows why the Ljung-Box test requires the strict exogeneity assumption. He doesn't show it in his post, but the Breusch-Godfrey test (an alternative test) requires only weak exogeneity, which is better, of course. This is what Greene, Econometrics, 7th ed., says on the differences between the tests, p. 923:
... h should be as small as possible to preserve whatever power the LB test may have under the circumstances. As h increases, the power drops. The LB test is a dreadfully weak test; you must have a lot of samples, and n must be roughly 100 or more for it to be meaningful. Unfortunately I have never seen a better test, but perhaps one exists. Does anyone know of one?
Paul3nt
There's no correct answer to this that works in all situations; for the reasons others have given, it will depend on your data.
That said, after trying to figure out how to reproduce a Stata result in R, I can tell you that by default the Stata implementation uses $\min(n/2 - 2, 40)$: either half the number of data points minus 2, or 40, whichever is smaller.
All defaults are wrong, of course, and this will definitely be wrong in some situations. In many situations, this might not be a bad place to start.
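For reference, the default described above amounts to the following one-liner. The function name is hypothetical, and using integer (floor) division for odd n is my assumption about how "half the number of data points" is rounded.

```python
def stata_style_default_lag(n):
    """Default lag as described above: min(n/2 - 2, 40), with n/2 floored (assumed)."""
    return min(n // 2 - 2, 40)

for n in (30, 50, 84, 200):
    print(n, stata_style_default_lag(n))
```

Under this rule the cap of 40 starts to bind once the series has 84 or more observations.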
Let me suggest our R package hwwntest. It implements wavelet-based white noise tests that do not require any tuning parameters and have good statistical size and power.
Additionally, I have recently found "Thoughts on the Ljung-Box test", which is an excellent discussion of the topic by Rob Hyndman.
Update: Considering the alternative discussion in this thread regarding ARMAX, another incentive to look at hwwntest is the availability of a theoretical power function for one of the tests against an alternative hypothesis of ARMA(p,q) model.