-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Add dataclasses for data model #4
Conversation
I left the old PR up, but we should look at this one instead. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've included some inline suggestions and a couple of questions in this review, but I'm primarily having trouble following what's happening in serialization.py
, particularly as_tagged_union
. Would you be able to give me a high-level summary of what it's for and how it works?
Func = TypeVar("Func", bound=Callable) | ||
|
||
|
||
def alternative_constructor(func: Func) -> Func: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def alternative_constructor(func: Func) -> Func: | |
def register_alternative_constructor(func: Func) -> Func: |
This naming is clearer to me, but not sure if that's the case for everyone.
def _capitalized(name: str) -> str: | ||
return name[0].upper() + name[1:] | ||
|
||
return "".join((cls.__name__, *(_capitalized(arg.__name__) for arg in args))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _capitalized(name: str) -> str: | |
return name[0].upper() + name[1:] | |
return "".join((cls.__name__, *(_capitalized(arg.__name__) for arg in args))) | |
return "".join((cls.__name__, *(arg.__name__.capitalize() for arg in args))) |
return "".join((cls.__name__, *(_capitalized(arg.__name__) for arg in args))) | ||
|
||
|
||
generic_name = type_name(_get_generic_name_factory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed the docs and source for apischema
, but I can't figure out what the purpose of type_name
is. It associates types with a TypeNameFactory
? And then how is that supposed to be used?
# attempt to cast | ||
self.data_type = data_type(self.data) | ||
|
||
if (self.data_type == 'unset'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (self.data_type == 'unset'): | |
if (self.data_type == UNSET): |
class Value(Parameter): | ||
"""An Entry that stores a PV name and data pair""" | ||
data: AnyEpicsType = _unset_data | ||
data_type: str = 'unset' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
data_type: str = 'unset' | |
data_type: str = UNSET |
while origin: # Drill down through nested types, only Lists currently | ||
if origin is Union: | ||
break | ||
elif origin in (list, Sequence): | ||
hint = get_args(hint)[0] # Lists only have one type | ||
origin = get_origin(hint) | ||
is_list = True | ||
else: | ||
origin = get_origin(hint) | ||
hint = get_args(hint) | ||
|
||
# end condition | ||
if origin is None: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
while origin: # Drill down through nested types, only Lists currently | |
if origin is Union: | |
break | |
elif origin in (list, Sequence): | |
hint = get_args(hint)[0] # Lists only have one type | |
origin = get_origin(hint) | |
is_list = True | |
else: | |
origin = get_origin(hint) | |
hint = get_args(hint) | |
# end condition | |
if origin is None: | |
break | |
while origin is not None and origin is not Union: # Drill down through nested types, only Lists currently | |
if origin in (list, Sequence): | |
hint = get_args(hint)[0] # Lists only have one type | |
origin = get_origin(hint) | |
is_list = True | |
else: | |
origin = get_origin(hint) | |
hint = get_args(hint) |
This suggestion is logically equivalent to the current code, except that hint
doesn't get reassigned to get_args(hint)
in the case that origin
starts out as None
.
Although this suggestion is a little cleaner, I'm not sure this while
loop is doing what it should. For a simple type hint (e.g. int
), origin
will be None
and hint
will be ()
, ultimately resulting in a ValidationError
.
f'improper type ({type(field_value)}) found in field ' | ||
f'(expecting {hint}])' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the error message reflect that an element in the list broke validation, rather than the list as a whole?
if (origin is Union): | ||
inner_hint = get_args(hint) | ||
else: | ||
inner_hint = hint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (origin is Union): | |
inner_hint = get_args(hint) | |
else: | |
inner_hint = hint |
The docs for isinstance
say it supports Union
as the classinfo
parameter.
f'improper type ({type(field_value)}) found in field ' | ||
f'(expecting {hint}])' | ||
) | ||
elif not isinstance(field_value, inner_hint): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elif not isinstance(field_value, inner_hint): | |
elif not isinstance(field_value, hint): |
The docs for isinstance
say it supports Union
as the classinfo
parameter.
inner_hint = hint | ||
|
||
if is_list: | ||
list_comp = (isinstance(it, inner_hint) for it in field_value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
list_comp = (isinstance(it, inner_hint) for it in field_value) | |
list_comp = (isinstance(it, hint) for it in field_value) |
The docs for isinstance
say it supports Union
as the classinfo
parameter.
Closing this after extensive discussions on the data model. This material is likely still useful, but should be superseded by future work. (That I hope will include similar tests) |
Description
Motivation and Context
Continue fleshing out structure of this package's backend.
(same description as #3 that I'm leaving behind, since we should make filled objects not track their parents/origin, duplicating that data instead.)
Notes on validation
I've chosen
apischema
here since it is the beast we know, and has similar functionality to pydantic.apischema
performs type validation happens on serialization/deserialization, but not outside of that. If we are to be constructing these dataclasses we'll want to validate them before attempting to save them to a backend (whether serialized or not).To help with that, the
.validate()
method has been added. This calls theapischema
validation routine. Adding routines can be done with the@validator
decorator, quite easily. I will note that these routines get called on deserialization, meaning we do duplicate some of apischema's validation when we check type hints. I think this gives us hooks to write more validation routines, as long as those routines can also be run at deserialization. Maybe in the future we stop using apischema to gather the routines, but for now this makes sense to me.In pursuit of lazy loading
This section is also the "Why hint
Union[UUID, Entry]
everywhere?" sectionI expect lazy loading will become a requirement, especially as database size grows. I imagine a world where we have a huge root tree structure, with deep nesting of collections and snapshots. In this case, we wouldn't want to pull the whole database's contents with every app load.
Instead we might first grab the metadata information (name, description, creation time), then fill its children when their information is requested. Exactly how we accomplish this should be fleshed out in the client/backend implementations, but the data-model would have to allow "unfilled" references to
Entry
objects.How Has This Been Tested?
Added a few tests, need to flesh them out a bit
Where Has This Been Documented?
This PR, soon pre-release notes
Pre-merge checklist
docs/pre-release-notes.sh
and created a pre-release documentation page