
API,MAINT: Reorganize array-wrap calling and introduce return_scalar #25409


Merged (18 commits) on Jan 20, 2024

Conversation

@seberg (Member) commented Dec 17, 2023

This reorganizes how array-wrap is called. It might very mildly change the semantics for reductions, I think (and for negative priorities).

Overall, it now passes a new return_scalar=False/True when calling __array_wrap__ and deprecates any array-wrap which does not accept arr, context, return_scalar.

I have not integrated it yet, but half the reason for the reorganization is to integrate/reuse it in the array_converter helper PR (gh-24214), which stumbled over trying to make the scalar handling sane.

Forcing downstream to add return_scalar=False to the signature is a bit annoying, but e.g. our memory maps currently try to guess at it, which seems bad. I am hoping this can be part of making the scalar vs. array return more sane.

But, maybe mainly, I hope it consolidates things (together with gh-24124 mainly; if ufuncs were the only complex place we used this, it wouldn't matter much).
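
For downstream projects, adapting mostly means extending the signature. A minimal sketch of an updated subclass (the class here is hypothetical, purely for illustration; the signature matches the one this PR deprecates towards, as also visible in the traceback further down the thread):

```python
import numpy as np

class MyArray(np.ndarray):  # hypothetical subclass, for illustration only
    def __array_wrap__(self, arr, context=None, return_scalar=False):
        # return_scalar=False preserves the old behavior while NumPy may
        # still call with only (arr, context) during the deprecation period.
        arr = arr.view(type(self))
        if return_scalar:
            # NumPy signals that it would return a scalar here (arr is 0-d);
            # empty-tuple indexing converts the 0-d array to a scalar.
            return arr[()]
        return arr
```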

```diff
@@ -700,7 +700,7 @@ PyArray_Round(PyArrayObject *a, int decimals, PyArrayObject *out)
 finish:
     Py_DECREF(f);
     Py_DECREF(out);
-    if (ret_int) {
+    if (ret_int && ret != NULL) {
```
Member Author:

Sorry, unrelated and a bit annoying to trigger. But the deprecations triggered the bug...

@mhvk (Contributor) commented Dec 18, 2023

Hmm, seeing this I'm really not sure it is worth breaking every package that implemented __array_wrap__ for something so relatively small... In the end, most classes could check whether ndim==0 and then do something based on that.

Another idea, that does not break the signature (but might well break implementations), is to pass on to __array_wrap__ what numpy would return, i.e., an array or a scalar.

But there also remains something to be said for just more generally using array scalars, and let that be the API break. It might just make code simpler rather than more complex...

p.s. Should stress that I haven't thought enough about this! I should really look more at your try-0d-preservation branch...

@seberg (Member Author) commented Dec 18, 2023

It would be nice not to require this, but...

> In the end, most classes could check whether ndim == 0 and then do something based on that.

No, because 0-D doesn't imply that NumPy should be returning a scalar; after all, that is what annoys us all the time! I would like to be able to decide that np.add(1, 1) should maybe be a scalar, but np.add(np.array(1), 1) should be an array. Or arr1d.sum() vs. arr1d.sum(-1), maybe/probably.

It doesn't matter much which cases are which, so long as there is no absolutely uniform "0-d is always array/scalar" rule...
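
(To make the sum example concrete, a quick check, behavior as of current NumPy; the point is that both results are 0-d, so ndim alone cannot encode the choice:)

```python
import numpy as np

arr1d = np.arange(3)
# Both reductions produce a 0-d result, so an __array_wrap__ that only
# checks arr.ndim == 0 cannot tell the two cases apart; an explicit
# return_scalar flag carries that decision instead.
print(np.asarray(arr1d.sum()).ndim)    # 0
print(np.asarray(arr1d.sum(-1)).ndim)  # 0
```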

> is to pass on to __array_wrap__ what numpy would return, i.e., an array or a scalar.

Might work out mostly, until dtype=object hits and all is lost, since you need at least obj.view(my_subclass) to work, and it cannot. And it isn't even really reasonable to check for isinstance(obj, ndarray)...

Which was why I thought I would just call result[()] instead, but that also doesn't work, because we don't know that the result defines such indexing semantics.
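
(For reference, the result[()] idea relies on empty-tuple indexing converting a 0-d array to a scalar, which only works when the result actually has ndarray indexing semantics:)

```python
import numpy as np

res = np.array(3.0)  # 0-d ndarray
print(res[()])       # 3.0, as np.float64: () indexing yields the scalar
# A wrapped result that is not ndarray-like need not support this at all.
```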

So, I would love a different idea, but the only thing I have left is piggy-backing on context or try/except indefinitely. But I really dislike try/except indefinitely, so that would leave adding one more try/except branch, as the current one breaks subclasses that rely on context (although I hope that beyond masked arrays few exist, and that everyone who might want to use it moves to __array_ufunc__).

This probably doesn't fix the 32bit issue unfortunately, only the windows ones...
@seberg (Member Author) commented Dec 19, 2023

FWIW, I have updated the try-0d-preservation branch with these (and the array-converter) changes. There are some tests still failing, but the last commit shows that it is not impossible anymore to return scalars for scalar inputs and arrays otherwise.
(With the admitted caveat that in the strange case of e.g. dtype="i,i", a tuple is considered a scalar, when it normally is not.)
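
(A small illustration of that structured-dtype quirk, as current NumPy behaves:)

```python
import numpy as np

a = np.zeros(3, dtype="i,i")
print(a[0])        # (0, 0): displayed like a tuple
print(type(a[0]))  # numpy.void: for structured dtypes the "scalar" is a
                   # void, so a tuple-like value counts as a scalar here
```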

@mhvk (Contributor) commented Dec 19, 2023

> Which was why I thought I would just call result[()] instead, but that also doesn't work because we don't know that the result defines such indexing semantics.

Maybe we should still go with this? I think it may cause less breakage than a new argument...

Though perhaps I'm overly worried; not sure how many projects in fact have an __array_wrap__ to start with...

@seberg (Member Author) commented Dec 20, 2023

Let's put it this way: we currently have two possible paths:

  1. Functions which always return an array
  2. Functions which return a scalar if the result is 0-D.

And __array_wrap__ cannot possibly match that, because even if it relies on context being set for the second, it will miss reductions.

So to some degree, I really want to add the ability to pick the two behaviors consciously for NumPy. Now, you could just not tell __array_wrap__ about it as it rarely needs to know in practice (most implementations are already happy to just ignore it).
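
To make the two paths concrete (behavior as of current NumPy):

```python
import numpy as np

a0 = np.array(3.0)           # 0-d array
print(type(a0.copy()))       # ndarray: path 1, always an array, even 0-d
print(type(np.add(a0, a0)))  # np.float64: path 2, 0-d results become scalars
```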

> I think it may cause less breakage than a new argument...

Well, in the sense of requiring less code change? Yes, for sure. In the sense of breaking things that disregard the deprecation warning? Not sure, since I think some things like pandas Series will run into result[()] not working.

> Though perhaps I'm overly worried; not sure how many projects in fact have an __array_wrap__ to start with...

Right, and if I add another step which tries to pass arr, context when arr, context, return_scalar fails, then this is a DeprecationWarning and will not fail straight away (albeit adding a bunch of overhead).

I don't have a strong opinion; it seems tedious, but OTOH it seems bound to be annoying for someone not to be able to match NumPy arrays (e.g. mmap is such a case).
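
A rough Python rendition of that fallback, just to sketch the control flow (the real implementation is in C; the helper name is made up, and the warning text matches the one shown later in this thread):

```python
import warnings

def _call_array_wrap(wrap, arr, context, return_scalar):
    # Sketch only: try the new (arr, context, return_scalar) signature first.
    try:
        return wrap(arr, context, return_scalar)
    except TypeError:
        # Old-style signature: deprecated, but keeps working for now.
        # NOTE: a catch this broad can swallow TypeErrors raised *inside*
        # wrap itself; see the gh-25635 discussion further down.
        warnings.warn(
            "__array_wrap__ must accept context and return_scalar "
            "arguments (positionally) in the future.",
            DeprecationWarning, stacklevel=2)
        return wrap(arr, context)
```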

@seberg (Member Author) commented Jan 7, 2024

@mhvk sorry, need to come back to this; I think it is pretty important (if only to push the follow-up array_converter() PR, which is important to fix up NEP 50 niche-case bugs in the Python API).

We had discussed this in a meeting a while ago, and I don't think anyone had serious concerns with adding it. But I do value your opinion more than most when it comes to __array_wrap__. What I will say is that this is just a deprecation, so yes, everyone must update, but end-users wouldn't see it immediately (although it's slow). It could be a PendingDeprecationWarning, but that just hides it even more with no gain, I think. I will add a try for __array_wrap__(arr, context), so that it is truly just a deprecation (even if it could go through two trys before succeeding).

To me, the new kwarg still just seems easier than any other polymorphism trying to guess this without it, and only some libraries need to adapt.

@mhvk (Contributor) left a review:

@seberg - sorry to have forgotten about this! Looking with a fresh eye, I think your solution is for the best -- everything I can think of has the risk of subtle breakage.

So, now just a proper review of the code -- which looks great! One genuine comment about whether forcing wrapping for reductions is correct (it looks like the old code did not). Though if forced wrapping for reductions is OK, then all the calls to npy_apply_wrap in fact use forced wrapping, so the argument could be removed. Indeed, that suggests that if we do not want to force wrapping in reductions, we could just move the if statement to the reduction instead (or are you using it in the follow-up PR?).

```diff
@@ -337,7 +337,7 @@ def __array_wrap__(self, arr, context=None):
             return arr
         # Return scalar instead of 0d memmap, e.g. for np.sum with
         # axis=None
-        if arr.shape == ():
+        if arr.shape == () and return_scalar:
```
Contributor:

This can just be if return_scalar:, right? or does the calling code not ensure this is set only for 0-d arrays?

Member Author:

You are right, thanks. When I started I hadn't ensured that yet, but it does now (and I think it is much cleaner/better).
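
(A sketch of the resulting, simplified memmap-style check, assuming the caller now guarantees return_scalar is only set for 0-d results; the helper name is made up:)

```python
import numpy as np

def _memmap_style_wrap(arr, return_scalar=False):
    # No shape test needed anymore: the caller only sets return_scalar
    # when arr is 0-d.
    if return_scalar:
        return arr[()]
    return arr

print(_memmap_style_wrap(np.array(5), return_scalar=True))  # 5 (np.int64)
print(_memmap_style_wrap(np.arange(3)))                     # [0 1 2]
```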

```c
 * @param inputs Original input objects
 * @param out_wrap Set to the python callable or None (on success).
 * @param out_wrap_type If not NULL, set to the type belonging to the wrapper
 *        or set to NULL.
```
Contributor:

From the code, this seems no longer true: *out_wrap_type is always set. This means the rest of the comment can go, which is probably for the better!

```c
 * Wrapping is always done for ufuncs because the values stored will have been
 * mutated (guarantees calling array-wrap if any mutation may have occurred;
 * this is not true for Python calls, though).
 * UFuncs always call array-wrap with base-class arrays in any case.
```
Contributor:

I think removing the comment up to this point is OK.

Or was this comment meant for npy_apply_wrap?

Member Author:

Removed; this was a leftover, and it is better and sufficiently covered by force_wrap now...

```c
        arr = (PyArrayObject *)obj;
    }
    else {
        /* TODO: This branch should ideally be NEVER taken! */
```
Contributor:

Tests fail if you remove this? Might be worth an issue enumerating those (if not too many...).

Member Author:

I didn't really try, but the point is that as long as we convert 0-D arrays to scalars randomly, it would seem rather typical to call wrap(a + b) and the result of that can be a scalar.

I will expand the comment on why, rather than investigating, for now (that just seems more useful to me).
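
(Concretely, as current NumPy behaves:)

```python
import numpy as np

a = np.array(1)  # 0-d array
b = np.array(2)  # 0-d array
print(type(a + b))  # np.int64, not ndarray: the 0-d operation already
                    # returned a scalar, so wrap(a + b) receives a scalar.
```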

```c
        return NULL;
    for (int out_i = 0; out_i < ufunc->nout; out_i++) {
        context.out_i = out_i;
        PyObject *original_out = NULL;
```
Contributor:

I'd make this a one-liner, but up to you!

```c
PyObject *original_out = (full_args.out) ? PyTuple_GET_ITEM(full_args.out, out_i) : NULL;
```

```python
    def __array_wrap__(self, arr, context=None):
        return 'pass'

class Test2:
```
Contributor:

Can inherit from Test1 to allow removing the __array__ method.
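
(I.e., something like the following, assuming Test1 defines the __array__ method being duplicated:)

```python
class Test2(Test1):  # inherits __array__ from Test1
    def __array_wrap__(self, arr, context=None):
        return 'pass'
```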

@seberg (Member Author) left a review:

Thanks for the review; sorry I have not gotten back earlier. I think I addressed the issues (the largest one maybe being to try just passing context).

Forcing wrapping in the reduction path seemed right to me, but that said, I don't mind just keeping things as they were. It doesn't really belong here, except that I wanted to slowly try to standardize behavior more.


```c
    }

    PyObject *wrapped_result = npy_apply_wrap(
            (PyObject *)ret, out_obj, wrap, wrap_type, NULL,
            PyArray_NDIM(ret) == 0, NPY_TRUE);
```
Member Author:

I would say it is correct, but you are right, it is a subtle change. I find it odd not to do it for reductions but do it for ufuncs; that said, ufuncs are also special in passing context, so I would be happy to undo it.

(I think besides this, the wrap lookup might sometimes be unnecessary, but nothing is changed there.)

@seberg (Member Author) commented Jan 17, 2024

Never mind, I switched to not doing force-wrap, which I think leaves the reduction result unchanged. While I would like to align them, I need this more for the other PR, so I will delay thinking about it.

Doing this due to numpygh-25635 since it is super confusing with the
bad retrying...
@seberg (Member Author) commented Jan 19, 2024

gh-25635 was a good reason to chain exceptions, because this was always terrible (really, it got no worse, except now you have a chance of getting to the real error). It shows nicely how that try/except can go wrong if it isn't specific enough.

In [4]: np.sqrt(np.ma.masked)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
File ~/forks/numpy/build-install/usr/lib/python3.10/site-packages/numpy/ma/core.py:6686, in MaskedConstant.__array_wrap__(self, obj, context, return_scalar)
   6685 def __array_wrap__(self, obj, context=None, return_scalar=False):
-> 6686     return self.view(MaskedArray).__array_wrap__(obj, context)

File ~/forks/numpy/build-install/usr/lib/python3.10/site-packages/numpy/ma/core.py:3131, in MaskedArray.__array_wrap__(self, obj, context, return_scalar)
   3129     fill_value = self.fill_value
-> 3131 np.copyto(result, fill_value, where=d)
   3133 # Update the mask

TypeError: Cannot cast scalar from dtype('float64') to dtype('bool') according to the rule 'safe'

The above exception was the direct cause of the following exception:

DeprecationWarning                        Traceback (most recent call last)
Cell In[4], line 1
----> 1 np.sqrt(np.ma.masked)

DeprecationWarning: __array_wrap__ must accept context and return_scalar arguments (positionally) in the future. (Deprecated NumPy 2.0)

@mhvk (Contributor) left a review:

Mostly small comments about the logic in npy_find_array_wrap (and one ref-counting issue).

```c
    for (int i = 0; i < nin; i++) {
        PyObject *obj = inputs[i];
        if (PyArray_CheckExact(obj)) {
            if (wrap == NULL || priority < 0) {
```
Contributor:

Priority cannot be < 0 here, right? Should the initialization above be at -inf? A benefit of that is that I think the test for wrap == NULL would not be needed any more.

Contributor:

I'd also write priority < NPY_PRIORITY and use that below for priority = NPY_PRIORITY - somewhat easier to read than having to know the priority of arrays is 0.

Member Author:

Priority can be < 0; this makes sense for example for memmap and is used there.

I had initially thought about it, and decided it seemed dubious. Maybe all it changes is that -inf really means that __array_wrap__ should never be called, which would make sense (and probably nobody uses that anyway).

There is the oddity that the old code just skipped ndarrays, which is worse, because negative wraps take precedence over ndarray! (Yes, this is a change here.)

Member Author:

(Will use NPY_PRIORITY rather than hardcoding the 0.)

Contributor:

But if you initialize priority = 0, doesn't that mean that np.memmap.__array_wrap__ never gets selected? Is that OK?

Member Author:

The code here always uses the first one (if there is more than one). That is what the NULL check does.

Member Author (@seberg, Jan 19, 2024):

Ah, that is nicer though, of course! I should write i == 0; that is much clearer...

EDIT: The only reason I even initialized priority at all is because gcc incorrectly thinks it can be used uninitialized.

Member Author (@seberg, Jan 19, 2024):

Wait, no, argh... that doesn't work. How about just adding a comment early on for what the NULL check does?

EDIT: Done. I do think we can tweak this, but as of now this is the smallest variation from what we had before (which was very strange and didn't agree with the Python side).

```c
        }
    }
    else if (PyArray_IsAnyScalar(obj)) {
        if (wrap == NULL || priority < NPY_SCALAR_PRIORITY) {
```
Contributor:

Same here.

@charris changed the title on Jan 20, 2024
```c
    for (int i = 0; i < nin; i++) {
        PyObject *obj = inputs[i];
        if (PyArray_CheckExact(obj)) {
            if (wrap == NULL || priority < NPY_PRIORITY) {
```
Contributor:

I'm still confused: if you initialize priority = -inf (or some other very large negative number), will this line not just always be true when wrap is undefined? So why is the wrap == NULL needed?

Member Author:

Not if the object defines priority = -inf. Maybe that would be preferable, but 🤷.

Member Author:

Ah, I only need it on the last branch, yes. Would that seem clearer (then the last branch is still special, this way priority is only initialized to silence warnings)?

Contributor:

I think coding against something that defines priority = -inf yet defines __array_wrap__ seems unnecessary! But if you really prefer what you have, that is fine too -- my own sense is that the logic becomes clearer without the extra conditions that are needed to grab the first argument unconditionally.

Member Author:

I don't disagree; I think I just wanted to match one of the implementations inside NumPy.
Maybe it is laziness speaking: this way no user can run into this (not even hypothesis), and it's already there...?

@mhvk (Contributor) left a review:

OK, I think we've reached (more than) good enough!

You probably want to squash & merge, or reduce the number of commits a bit.

@seberg (Member Author) commented Jan 20, 2024

Thanks Marten, let me squash-merge this then, since I want to rebase the other PR on this work...

After all, while I think it fixes a bunch of issues (at least mid-term) and opens avenues, it was really mostly a split-out from the other PR.

@seberg seberg merged commit 6bd3abf into numpy:main Jan 20, 2024
@mhvk (Contributor) commented Jan 20, 2024

Nice, on to the helper! (Ping me when rebased; that one we went over a lot already, so it should be easy to do a final round of review.)

@pllim (Contributor) commented Jan 22, 2024

I thought numpy 2 would be out by now. Wasn't expecting breaking changes even now...

@seberg (Member Author) commented Jan 22, 2024

> I thought numpy 2 would be out by now. Wasn't expecting breaking changes even now...

I wish :(, although I hope it should be mostly good. You will certainly also notice gh-25168 (more than this).
