Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Dataset internals to store data variables and coordinate variables as separate dicts #9203

Open
TomNicholas opened this issue Jul 1, 2024 · 4 comments
Labels
topic-DataTree Related to the implementation of a DataTree class topic-internals

Comments

@TomNicholas
Copy link
Member

TomNicholas commented Jul 1, 2024

There's a lot of other discussion in #9063, but I wanted to pull out this suggestion for independent discussion:

  • Our list of internal attributes on a DataTree node is still not just those on a Dataset plus the inherited coordinates

This is indeed a bit of a con, but in my mind the right fix is probably to adjust the Dataset data model to using dictionaries of data_variables and coord_variables, rather than the current solution of a dict of variables and a set of coord_names. Using a separate dictionary for coord_variables would also be more aligned with how DataArray is implemented. The internal Dataset data model is a hold-over from the very early days of Xarray, before we had a notion of coordinate variables that are not indexes.

Originally posted by @shoyer in #9063 (comment)

@shoyer shoyer changed the title Refactor internals to store data variables and coordinate variables as separate dicts Refactor Dataset internals to store data variables and coordinate variables as separate dicts Jul 1, 2024
@TomNicholas
Copy link
Member Author

In fact could the coord_variables dict just be an actual xr.Coordinates object?

(idea from #9204 (comment))

@shoyer
Copy link
Member

shoyer commented Jul 2, 2024

In fact could the coord_variables dict just be an actual xr.Coordinates object?

I think Coordinates should remain a thin wrapper object. It needs access to both coordinate Variable objects and associated indexes.

@headtr1ck
Copy link
Collaborator

I was wondering about that already quite some time, haha.

Does this mean that handling of coordinates can be handled in the parent class DataWithCoords?

@TomNicholas
Copy link
Member Author

Does this mean that handling of coordinates can be handled in the parent class DataWithCoords?

You mean have all of Dataset, DataArray, and DataTree inherit from DataWithCoords, and add things like the .coords property to DataWithCoords? Relates to #9204 (comment).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-DataTree Related to the implementation of a DataTree class topic-internals
Projects
None yet
Development

No branches or pull requests

3 participants