-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Advice on unit-aware arithmetic #2176
Comments
cc @kmpaul who has also been working on this sort of thing recently. |
For reference, here are some of the sort of methods I've been adding that aren't currently in
|
Thanks, @jhamman, for the reference. I have, indeed, been thinking about this a lot. I've hit the same problem that you have @mcgibbon and came to the same conclusion that subclassing was the only way to go. I'm not sure I'm happy with my solution, but I wrote a (partial) implementation of a DataArray subclass to deal with units. @jhamman suggested that I pull this out into a separate repo, and I hope to do that one day... However, if you are curious, now, you can view it here: https://github.com/NCAR/PyConform/tree/rewrite/source/pyconform/physarrays Note that this implementation deals with time as just another unit, rather than requiring any special dealing with time. This works if you units package can deal with calendared times, too. I am currently using Also, note that this implementation also deals with the Let me know what you think about this. If it's a priority, I can pull this out into its own repository. |
If you implement arithmetic special methods like
It's awkward but you can still use the infix notation. Subclassing or writing a DataArray wrapper like what @kmpaul did in physarrays seems like the cleanest way to do this currently. The main challenge is to avoid relying on non-public parts of the DataArray API, but I think overwriting arithmetic would be pretty safe. In the long term, I think the best solution here is probably an extensible interface for |
@shoyer that notation might work, thanks for pointing it out! Maybe we can think of a more natural name for the accessor ("with_units"? "keep_units"? "uarray"? "u"?). I find the "storage" of units as a string in attrs to be much cleaner than any other implementation I've seen so far (like implementations that have a unit container over an underlying array, or an array of unit-aware objects). It has the added benefit that this is how units are conventionally stored in netCDF files. I don't think it's necessary to use a class other than ndarray for data storage. @kmpaul the main reason I stayed away from It may make sense for us to use some kind of stand-alone unit-aware DataArray implementation. I'd just need to be convinced that yours is well-designed, thoroughly tested, and easy to install with pip. The main things concerning me about |
Well, I'm certainly not trying to argue that the I understand your concerns about I personally think it's fine to discuss this here, unless other people would like to see this go offline. To address some of your issues:
Yes. I agree. In fact, my first implementation was a subclass of
I personally didn't see much of a benefit to a check like
As I mentioned in my post, this is just my personal preference. I think that calendared time units are...well...just units, and that the same mechanism for dealing with all other units should deal with calendared time units. It's just an aesthetic that I prefer. (That said, I'm extremely happy that someone finally dealt with non-standard calendars with
The CF Conventions define the In the end, I'm happy doing whatever the community wants with this code. I can pull it out into it's own repo (along with the unit tests, which are also in the PyConform repo). And then, after that, I have no problem with people taking it in a different direction (e.g., using |
My problem with custom classes (subclasses or composition) is that you will never get those from a I’m also looking at moving from pint to unyt, which is Yt’s unit support, brought into a standalone package. Beyond some performance benefits (I’ve heard), it has the benefit of being a ndarray subclass, which means |
@dopplershift That's a good point. It's pretty trivial to create a In
|
Thank you for the input everyone, this discussion has been very useful! Closing this issue. |
This isn't really a bug report. In
sympl
we're using DataArrays that allow unit-aware operations using the 'units' attribute as the only persistent unit storage. We usepint
as a backend to operate on unit strings, but this is never exposed to the user and could be swapped for another backend without much consequence.Basically, we currently have this implemented as a subclass
sympl.DataArray
. @dopplershift recently introduced me to the accessor interface, and I've been thinking about whether to switch over to that way of extending DataArray.The problem I have is that the new code that results from using an accessor is quite cumbersome. The issue lies in that we mainly use new implementations for arithmetic operations. So, for example, the following code:
instead becomes
This could be a little less cumbersome if we avoid a sympl namespace and instead add separate accessors for each method. At the least it reads naturally. However, there's a reason you don't generally recommend doing this.
I'm looking for advice on what is best for
sympl
to do here. Right now I'm leaning towards that we should use a subclass rather than an accessor - does this seem like an appropriate case to do so?The text was updated successfully, but these errors were encountered: