Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make AxesArray handle slicing correctly #451

Merged
merged 64 commits into from
Jan 31, 2024
Merged

Conversation

Jacob-Stevens-Haas
Copy link
Member

@Jacob-Stevens-Haas Jacob-Stevens-Haas commented Jan 7, 2024

Fixes #220

It's long made sense to expose the AxesArray interface and document it. However, when it was created, it wasn't stable enough even for every use case inside pysindy. Indexing can remove, add, and even move around the axes in an array, which meant that properties such as ax_time and n_spatial were unreliable when an array was indexed. FiniteDifference._accumulate() is an example of great code that worked which AxesArrays, but in which AxesArray gave no extra understanding in debugging because the axis labels were incorrect.

This PR makes basic indexing, integer advanced indexing, and boolean array indexing work. It does so by adding a a private attribute that contains both forwards and backwards maps of the axes, and modifying __getitem__ to detect which axes are added, removed, combined, relocated, etc. The PR also expands the ufunc tests dramatically. Item 1 below is probably all that's needed to accept the PR.

  • Determine which warnings should be shown depending on axis assignments (check which ones are raised by tests).
  • Document this well, including the shortcomings (field access for structured arrays)
  • Add issues to make transpose, roll, and reshape work for axesarrays.
  • Now that I understand views vs new-from-template, I can probably simplify __array_finalize__. Not sure I need all those conditionals.
  • I still cut down the original list of numpy ufunc tests from what numpy itself uses. Now that I understand views vs new-from-template, I can probably go back and make better decisions on what/how we test.

AxesArray needs to warn when being created with a set of axes that is
incompatible with the array data.  It also needs to handle slices that copy
or remove an axis
Improved tests
Remove axes for singleton slices using new _reverse_map attribute
Restrict changes to shape
In cases where dimensions change, chances are __array_ufunc__ took care of
creating an AxesArray.  Return it.

If there's a case where __array_function__ created an array with different
dimensions than self, it will still error.
twisted axes is when axes are not adjacent but have the same label, e.g.

arr = AxesArray(np.empty((1,1,1)), {"ax_spatial": [0,2], "ax_time": 1})
Also split apart basic and fancy indexing tests
Also rename _standardize_basic_indexer as _standardize_indexer
I added it before I knew what name mangling was for
TBD: should this fully return a new _AxisMapping and maybe a
new indexer?
Involves processing the keys several times, with increasing standardization
Modify the standardize_indexer() function
Parameterize StandardIndexer
@Jacob-Stevens-Haas Jacob-Stevens-Haas mentioned this pull request Jan 16, 2024
3 tasks
Newer sphinx gives more accurate line numbers for errors
Copy link

codecov bot commented Jan 17, 2024

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (8341efb) 93.92% compared to head (bc3fbd2) 94.40%.
Report is 2 commits behind head on master.

Files Patch % Lines
pysindy/utils/axes.py 98.28% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #451      +/-   ##
==========================================
+ Coverage   93.92%   94.40%   +0.48%     
==========================================
  Files          37       37              
  Lines        3602     3989     +387     
==========================================
+ Hits         3383     3766     +383     
- Misses        219      223       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Jacob-Stevens-Haas
Copy link
Member Author

Jacob-Stevens-Haas commented Jan 17, 2024

Issues to add to pysindy repo, discovered during this PR but not for this PR:

  • Add einsum tests
  • Clear AxesWarnings in tests (about 60 in each test_pysindy and test_sindyc, but far fewer than other warnings) and then replace warning with error
  • Add test for calling pysindy with axesarray with different axes - are we smart about comprehend axes?
  • Remove default in_ndim and axes in _AxisMapping - args should be required.
  • Add support for advanced indexing with labeled AxesArray (e.g. stencil = t[n_stencil]) or provide a better way to do this in developer docs.
  • Add support for @ - + * .T etc (currently just given labels from first)
  • Replace time variables (e.g. "t") in Differentiation with axis generic variable names
  • Clean out unnecesary pieces of documentation API documentation (e.g. pysindy.version module) ... it's a bit out of hand. One page per module or package, and package only with list of links to modules
  • Scrub ufunc_override unit
  • Decide what should happen with numpy.ndarray.view(AxesArray). Should it fail? Should it accept a second parameter if that's possible?)
  • Decide whether we should attempt to make an AxesArray from mixed-type numpy functions (e.g. np.foo((AxesArray, np.ndarray)) or just return ndarray. See array_function`. Changing this now breaks pysindy. This may affect the next issue:
  • Some nonimplemented functions that create a view, which calls __array_finalize__. Currently that method has ugly conditionals, and it might be cleaner to remove them (but doing so now breaks pysindy).

Added a bunch of tests, mostly to check ValueErrors emitted correctly
Remove default argument to _AxisMapping, AxesArray
Remove __array_wrap__
@Jacob-Stevens-Haas
Copy link
Member Author

Jacob-Stevens-Haas commented Jan 17, 2024

Alright @znicolaou, @akaptano, this is ready for your review, if desired! After this I'd suggest we release 2.0.0. One of the main points of this PR was to removes some defaults for AxesArray, so that I'm comfortable fixing issues without breaking backwards compatability.

This PR dramatically changes the AxesArray class to support slicing. This is the most common pysindy operation on arrays. Now that slicing works, AxesArray is stricter about creating arrays/array views. The strictness caused some functions in pysindy to fail; the PR patches FiniteDifference and PDELibrary as well as introduces implementations of common numpy operations on AxesArray (e.g. numpy.linalg.solve).

I'm going to leave this open for at least two weeks. I'd especially appreciate if you could take a look at the changes to the package outside of axes.py, which are many fewer and easier to review. Admittedly, some of these changes only make sense in debugging, when you can see that slices of the spatiotemporal grid carry the correct axis definition.

Conflicts dealing with swapping out scipy factorial for numpy
factorial

Also require newaxis slice to be named.
@Jacob-Stevens-Haas Jacob-Stevens-Haas merged commit 9c73768 into master Jan 31, 2024
6 checks passed
@Jacob-Stevens-Haas Jacob-Stevens-Haas deleted the axesarray-indexing branch January 31, 2024 16:50
jpcurbelo pushed a commit to jpcurbelo/pysindy_fork that referenced this pull request Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ENH: Allow AxesArray to handle slicing/indexing correctly
1 participant