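Given three vectors $a$, $b$, and $c$, is it possible that the correlations between $a$ and $b$, $a$ and $c$, and $b$ and $c$ are all negative? That is, can $\operatorname{corr}(a,b) < 0$, $\operatorname{corr}(a,c) < 0$, and $\operatorname{corr}(b,c) < 0$ all hold at once?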
correlation
correlation-matrix
Antti A
Answers:
It is possible if the vectors have length 3 or more. For example:
The correlations are:
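One concrete instance can be checked numerically. The vectors below are chosen here purely for illustration (they are an assumption, not necessarily the ones the answer used): the three standard basis vectors of length 3, whose pairwise correlations all come out negative.

```python
import numpy as np

# Illustrative vectors (an assumption, not taken from the answer above).
a = np.array([1.0, 0.0, 0.0])
b = np.array([0.0, 1.0, 0.0])
c = np.array([0.0, 0.0, 1.0])

# np.corrcoef treats each row as one variable and returns the 3x3
# Pearson correlation matrix.
corr = np.corrcoef(np.vstack([a, b, c]))

print(corr)
# Each off-diagonal entry is -0.5 (up to floating-point rounding).
```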
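We can prove that this is not possible for vectors of size 2. With only two observations, the sign of the correlation equals the sign of the covariance, which is proportional to $(a_1 - a_2)(b_1 - b_2)$, so
$$\operatorname{corr}(a,b) < 0 \iff (a_1 - a_2)(b_1 - b_2) < 0.$$
The formula makes sense: if $a_1$ is larger than $a_2$, then $b_2$ has to be larger than $b_1$ to make the correlation negative.
Similarly, for the correlations between $(a,c)$ and $(b,c)$ we get
$$(a_1 - a_2)(c_1 - c_2) < 0, \qquad (b_1 - b_2)(c_1 - c_2) < 0.$$
Clearly, these three inequalities cannot all hold at the same time: the product of their left-hand sides is $(a_1 - a_2)^2 (b_1 - b_2)^2 (c_1 - c_2)^2 \ge 0$, whereas the product of three negative numbers is negative.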
Yes, they can.
Suppose you have a multivariate normal distribution $X \in \mathbb{R}^3$, $X \sim N(0, \Sigma)$.
The only restriction on $\Sigma$ is that it has to be positive semi-definite.
So take the following example:
$$\Sigma = \begin{pmatrix} 1 & -0.2 & -0.2 \\ -0.2 & 1 & -0.2 \\ -0.2 & -0.2 & 1 \end{pmatrix}$$
Its eigenvalues are all positive (1.2, 1.2, 0.6), and you can create vectors with negative correlation.
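The eigenvalue claim and the sampling step can be checked with a short NumPy sketch. The sample size and random seed are arbitrary choices made here for illustration.

```python
import numpy as np

# Covariance (here also correlation) matrix: 1 on the diagonal,
# -0.2 everywhere else.
sigma = np.array([
    [1.0, -0.2, -0.2],
    [-0.2, 1.0, -0.2],
    [-0.2, -0.2, 1.0],
])

# eigvalsh is for symmetric matrices and returns eigenvalues in
# ascending order.
eigenvalues = np.linalg.eigvalsh(sigma)
print(eigenvalues)  # approximately [0.6, 1.2, 1.2]

# Draw samples from N(0, sigma); each column is one of the three variables.
rng = np.random.default_rng(0)  # seed chosen arbitrarily
samples = rng.multivariate_normal(mean=np.zeros(3), cov=sigma, size=10_000)

# Sample correlation matrix of the three columns.
sample_corr = np.corrcoef(samples, rowvar=False)
print(sample_corr)
```

With this many samples, the three off-diagonal sample correlations should land close to -0.2, so the three columns of `samples` are vectors with all pairwise correlations negative.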
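Let's start with a correlation matrix for 3 variables:
$$\Sigma = \begin{pmatrix} 1 & p & q \\ p & 1 & r \\ q & r & 1 \end{pmatrix}$$
Non-negative definiteness creates constraints on the pairwise correlations $p, q, r$, which can be written as
$$2pqr \ge p^2 + q^2 + r^2 - 1$$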
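For example, if $p = q = -1$, the value of $r$ is restricted by $2r \ge r^2 + 1$, which forces $r = 1$. On the other hand, if $p = q = -\frac{1}{2}$, the constraint becomes $2r^2 - r - 1 \le 0$, so $r$ can lie anywhere between the roots $\frac{1 \pm 3}{4}$, that is, $-\frac{1}{2} \le r \le 1$.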
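The constraint $2pqr \ge p^2 + q^2 + r^2 - 1$ is a quadratic inequality in $r$, and solving it gives $r = pq \pm \sqrt{(1 - p^2)(1 - q^2)}$ as the endpoints of the admissible interval. A minimal sketch of that calculation follows; the helper name `r_range` is hypothetical, introduced here only for illustration.

```python
import math


def r_range(p, q):
    """Return (low, high): the interval of r for which the matrix
    [[1, p, q], [p, 1, r], [q, r, 1]] is positive semi-definite.

    Hypothetical helper; solves r^2 - 2*p*q*r + p^2 + q^2 - 1 <= 0 for r.
    """
    if not (-1.0 <= p <= 1.0 and -1.0 <= q <= 1.0):
        raise ValueError("p and q must be valid correlations in [-1, 1]")
    half_width = math.sqrt((1.0 - p * p) * (1.0 - q * q))
    centre = p * q
    return centre - half_width, centre + half_width


print(r_range(-1.0, -1.0))  # interval collapses to the single point r = 1
print(r_range(-0.5, -0.5))  # r may range from -0.5 to 1
```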
Answering the interesting follow up question by @amoeba: "what is the lowest possible correlation that all three pairs can simultaneously have?"
Let $p = q = r = x < 0$. Find the smallest root of $2x^3 - 3x^2 + 1$, which gives $-\frac{1}{2}$. Perhaps not surprising for some.
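To spell out that step: setting all three correlations equal to $x$ in the non-negative definiteness condition (the determinant of the correlation matrix must be non-negative) gives a cubic that factors cleanly.

```latex
\begin{aligned}
1 + 2x^3 - 3x^2 &\ge 0 \\
(x - 1)^2 (2x + 1) &\ge 0 \\
x &\ge -\tfrac{1}{2}
\end{aligned}
```

Since $(x - 1)^2$ is never negative, the sign is decided by $2x + 1$, so the common correlation cannot go below $-\frac{1}{2}$.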
A stronger argument can be made if one of the correlations equals $-1$, say $r = -1$. The same inequality then reads $-2pq \ge p^2 + q^2$, that is, $(p + q)^2 \le 0$, from which we can deduce that $p = -q$. Therefore, if two correlations are $-1$, the third one must be $1$.
A simple R function to explore this:
As a function of n, f(n) starts at 0, becomes nonzero at n = 3 (with typical values around 0.06), then increases to around 0.11 by n = 15, after which it seems to stabilize.
So, not only is it possible to have all three correlations negative, it doesn't seem to be terribly uncommon (at least for uniform distributions).
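A simulation of this kind can be sketched in Python with NumPy. This is an assumption about what f(n) measures, namely the fraction of trials in which three independent uniform random vectors of length n have all three pairwise correlations negative. The trial count, seed, and vectorised layout are illustrative choices and not the original R function.

```python
import numpy as np


def f(n, trials=10_000, seed=0):
    """Estimate the probability that three independent uniform random
    vectors of length n have all pairwise correlations negative.

    Illustrative sketch; `trials` and `seed` are arbitrary defaults.
    """
    rng = np.random.default_rng(seed)
    # Shape (trials, 3, n): three vectors of length n per trial.
    x = rng.uniform(size=(trials, 3, n))
    # Centre each vector. The sign of a correlation equals the sign of
    # the covariance, so no normalisation is needed.
    x = x - x.mean(axis=2, keepdims=True)
    # Batched 3x3 matrices of (unnormalised) covariances.
    cov = x @ x.transpose(0, 2, 1)
    all_negative = (cov[:, 0, 1] < 0) & (cov[:, 0, 2] < 0) & (cov[:, 1, 2] < 0)
    return float(all_negative.mean())


for n in (2, 3, 15):
    print(n, f(n))
```

Comparing only the signs of the covariances avoids dividing by standard deviations, which keeps the case n = 2 (where every correlation is exactly plus or minus 1) free of rounding surprises.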