-
-
Notifications
You must be signed in to change notification settings - Fork 10.8k
Tweak corrcoef #7414
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tweak corrcoef #7414
Conversation
The input arrays are documented to have ndim <=2, so check for that and raise a ValueError on failure.
c[i,:] /= (d * d[i]) | ||
stddev = sqrt(d.real) | ||
c /= stddev[:, None] | ||
c /= stddev[None, :] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we checked that this is faster? (I would certainly hope that it is, but you never know :-).) Also, having read up a bit more I withdraw my worry about numerical stability, in case you want to do the multiply-by-inverse thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's about the same. Using multiplication makes it 2% faster for the test problem in the original fix. I think most of the time is always going to be in cov
. However, I prefer this version for its clarity, OTOH, you get two zero division warnings when one of the diagonals is zero (which can happen).
Seems like a nice little improvement to me. |
@@ -2517,6 +2523,12 @@ def corrcoef(x, y=None, rowvar=1, bias=np._NoValue, ddof=np._NoValue): | |||
|
|||
Notes | |||
----- | |||
Due to floating point rounding the resulting array may not be Hermitean, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't that be "Hermit**_i**_an"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed. Thanks.
The non-nan elements of the result of corrcoef should satisfy the inequality abs(x) <= 1 and the non-nan elements of the diagonal should be exactly one. We can't guarantee those results due to roundoff, but clipping the real and imaginary parts to the interval [-1, 1] improves things to a small degree. Closes numpy#7392.
This doesn't actually test much, as we don't have any inputs where that was not already the case. But at least it is there and perhaps a fuzz test can be added at a later date.
I'm going to merge this if there are no complaints. |
👍 |
Clip real and imag parts of result of corrcoef to the interval [-1, 1]. This doesn't really fix the issue with complex Hermitean results, nor is the diagonal guaranteed to be one, but it does take care of issue #7392. I put it here pending a decision of the proper course to take.