-
-
Notifications
You must be signed in to change notification settings - Fork 11.1k
ENH: isclose/allclose: support array_like atol
/rtol
#24878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
atol
/rtol
atol
/rtol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It all looks pretty good, but looking more closely I felt a simplification was possible - see inline.
numpy/core/numeric.py
Outdated
@@ -2357,8 +2359,10 @@ def within_tol(x, y, atol, rtol): | |||
# lib.stride_tricks, though, so we can't import it here. | |||
x = x * ones_like(cond) | |||
y = y * ones_like(cond) | |||
atol_f = np.broadcast_to(atol, x.shape)[finite] if atol.shape else atol |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is weird, above it is noted we are not allowed broadcast_arrays
, but here we use broadcast_to
. So, clearly that comment is out of date, which means we can just do (following the definition of cond
- rest can go):
x, y, atol, rtol = np.broadcast_arrays(x, y, atol, rtol, subok=True)
cond[finite] = within_tol(x[finite], y[finite], atol[finite], rtol[finite])
(note addition of subok=True
- useful for subclasses)
Could you check whether the tests still pass with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, this looks better to me.
It turns out that in NumPy main, isclose
is very flexible about the shape of atol
and rtol
when there are no NaNs.
import numpy as np
np.isclose([1, 2], [[2], [1]], atol=[[[0.01]], [[2]]])
# array([[[False, True],
# [ True, False]],
#
# [[ True, True],
# [ True, True]]])
Before the last commit, this would have failed if there were a NaN in one of the arrays because atol
cannot be broadcast to the shape of x
(which is really the braodcasted shape of x
and y
at this point in the code).
Now this will work even when there are NaNs.
np.isclose([np.nan, 2], [[2], [np.nan]], atol=[[[0.01]], [[3]]], equal_nan=True)
# array([[[False, True],
# [ True, False]],
#
# [[False, True],
# [ True, False]]])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks all good to me! The only thing that I think should still be done, though, is to squash the commits - maybe to two to continue to give credit to the earlier attempt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good except I think the code can be simplified even more! See inline comment.
numpy/_core/numeric.py
Outdated
@@ -2361,7 +2360,8 @@ def within_tol(x, y, atol, rtol): | |||
x = x * ones_like(cond) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two lines and the comment above can go, since we actually broadcast x
and y
already above.
numpy/_core/numeric.py
Outdated
@@ -2341,8 +2341,7 @@ def within_tol(x, y, atol, rtol): | |||
with errstate(invalid='ignore'), _no_nep50_warning(): | |||
return less_equal(abs(x-y), atol + rtol * abs(y)) | |||
|
|||
x = asanyarray(a) | |||
y = asanyarray(b) | |||
x, y, atol, rtol = np.broadcast_arrays(a, b, atol, rtol, subok=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmmpf, this is one of the cases similar to gh-24957. So, atol
and rtol
stop being Python scalars (with NEP 50) or 0-d objects (without NEP 50) and that changes promotion.
We need a solution for that, right now, we could skip it add logic for them being type(atol) in (int, float, complex)
. The question remains a bit whether we want a way to say mark_was_weak_scalar(array)
, to turn on weak promotion for an array, but... I don't really want such a mark to survive for long and it is unclear what operations should retain it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think that if NumPy is hitting this, then other libraries and users will hit it as well. Maybe we should rethink the transition and how we provide (or not) backward incompatibility shims.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the casting could be done earlier - something like
dtype=np.result_type(a, b, atol, rtol)
atol = np.asanyarray(atol, dtype)
rtol = np.asanyarray(rtol, dtype)
and then broadcast?
If broadcast is a common culprit, we might add a new keyword argument that does this implicitly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If broadcast is a common culprit, we might add a new keyword argument that does this implicitly?
Yes, please! This is something Sebastian and I have discussed before.
Oops, but I've wanted the promotion to follow NEP50, so it would be nice if the user could choose which rules to follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be the suggestion: to allow the dtype of arrays created from python scalars to follow NEP 50 promotion rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are different approaches:
- just make the current stuff a bit more convenient.
- Try to tag on the info to an array.
- Just deal with it here, the
isinstance
check and special code paths are not that terrible (and also useful for micro optimization probably).
To me no solution is obvious, unfortunately. Tagging the info on to the array or dtype, means that you have a very tricky situation of how to drag the info around and mainly about when to preserve and when to drop it.
In this case for example, you are calling arr[boolean_idx]
creating a copy, should that drop to prevent accidentally returning such an array? Assignments at least could be prevent by making it readonly, but... Also the special path to make arr > -1
work for such an array is another tricky thing.
I can somewhat think of ways to do this, but I am not sure they will be very pleasant.
Moving resolution to earlier (maybe with a helper) can help. But in some cases might also not be quite what we want because it makes sense to ask isclose(2 meters, 1.9 meters, atol=0.1, rtol=0.1 meters)
. But it is less clear that the: dtype = result_type(meters, float)
is reasonable (for the atol
)?
Taking discussion out of code: You are right, I think in the end, perhaps one just has to write more carefully... Looking at this code, really the two different paths for finite and non-finite are not needed if one is a bit more careful, which means one can avoid the broadcasting altogether -- see mdhaber#1 (@mdhaber - I made this PR to your branch so that we can keep credit for both you and the earlier commit). From that, perhaps it is useful to have a general helper routine that does
Though it may be that we simply have to see how things pan out. p.s. Just for completeness, the NEP 50 changes in astropy were not that large - astropy/astropy#15525 - with some code becoming more and other code becoming less obvious. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reapproving, though given that one of the commits is now mine, I won't merge but let @seberg have another look as well.
We did notice something related to this downstream in SciPy per: scipy/scipy#19526 I didn't open a NumPy issue, because it may actually be an improvement to sensitivity, can't tell just yet.. |
The failure is
I think this should still pass; the new failure is not an intended effect of the PR. It is probably failing because the magnitudes are not finite, so exact equality comparison is performed, but presumably the real parts are not exactly equal. What is surprising is that the new logic is similar to the old logic in this respect, so it's not jumping out at me why it didn't fail before. I'll take a closer look soon. Update: x = x * ones_like(cond)
y = y * ones_like(cond) (where import numpy as np
x = np.asarray([-8.113438e+295-np.inf*1j])
cond = np.asarray([True])
x * np.ones_like(cond) # array([nan+nanj]) Then the NaN comparison logic kicked in, and the test passed. This was not exactly correct behavior either because the original inputs would be considered close even if the real parts were not close. So I think this case has exposed a bug in both the old and new code. The obvious workaround to me is to perform real and imaginary part comparison separately, but maybe there is a more elegant solution. |
@mhvk would you like to adapt your solution to work with complex numbers in which either the real or complex part is infinite? |
I'm actually not sure what the expectation should be: the current test assumes either x, y are finite, or, if not, that they should be the same infinite (same sign). Following that logic, for complex, I guess we can that say that the infinite should point in the same direction (i.e., any of 8 possibilities). Still, for the case in question, of On the other hand, one can see Anyway, I think it makes most sense to check for "same infinity" (ignoring real for p.s. It is all a bit weird: if I do,
This actually is a python problem:
So, it looks like one has to construct the test values explicitly assigning real and imaginary... |
This is what the directional infinity check would look like:
|
Supersedes gh-14343
Closes gh-14320
gh-14320 reported that
np.allclose
accepted vectoratol
but failed when a NaN was present.gh-14343 began to add full support for array
atol
/rtol
inallclose
/isclose
, but became inactive.@seberg asked if anyone would like to finish it, hence this PR.
I preserved the original commits and made the requested changes. I also fixed the case of non-array array_like
rtol
/atol
and strengthened the tests.