Add the `n_protocol_dag_results` property to `ProtocolResult` #381

ianmkenney · 2024-10-30T22:25:55Z

This PR adds the n_protocol_dag_results property to the ProtocolResult class. It reflects the number of ProtocolDAGResults that were input to the _gather call.

The `gather` aggregation method removes too much information when reducing the provided protocol_dag_results. This change adds the new n_protocol_dag_results property that reflects the number of ProtocolDAGResult objects that were used to create the ProtocolResult.

…t_n_protocol_dag_results

codecov · 2024-10-30T22:27:27Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.67%. Comparing base (10fef9b) to head (6cdbb72).
Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #381   +/-   ##
=======================================
  Coverage   98.66%   98.67%           
=======================================
  Files          36       36           
  Lines        2097     2109   +12     
=======================================
+ Hits         2069     2081   +12     
  Misses         28       28

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

ianmkenney · 2024-10-30T23:14:33Z

I'm probably being a little (very) pedantic here, but since gather takes in an Iterable, which can include non-terminating generators, it's possible for gather to hang forever (before running out of memory and exiting).

Since I'm using the len function in this PR, even well-behaved generators will no longer work. An alternative would be to use list(protocol_dag_results), but this is still susceptible to the infinite generator case. I'm not sure what the best option for typing is, since there doesn't appear to be a good builtin that enforces a terminating iterable and I think defining and comparing to a custom type might be more extreme than just not allowing generators.

All of that said, I'm happy to revert the guardrail (and its test) if it seems like too much for this PR. Although I suspect type checkers will complain if I use len rather than list.

dotsdl · 2024-10-31T04:44:03Z

I think the case of infinite ProtocolDAGResults isn't remotely likely, but I see your point. We could slate changing the type hint for gather from Iterable to list for gufe 2.0, indicating that we explicitly expect a list (and therefore a finite number) of ProtocolDAGResults fed to gather.

Do you agree with this?

dotsdl · 2024-10-31T04:46:00Z

You will also need to make changes to the _to_dict and _from_dict methods in ProtocolResult reflecting these changes.

IAlibay

Also blocking on downstream testing and maybe a bit of a discussion on protocol needs.

IAlibay · 2024-10-31T09:49:59Z

gufe/protocols/protocol.py

        self._data = data
+        self._n_protocol_dag_results = (
+            n_protocol_dag_results if n_protocol_dag_results is not None else 0


I'm trying to wrap my head around the use case here, could you explain the case where None happens? I.e. at first glance setting None to zero seems incorrect if None means that the gather method did not pass through a value. In that case None means undefined and it's much cleaner than getting 0 with the idea that you got no results (but you do).

I think in the context of a Protocol's gather, you'll never see a None passed in. I can see instances where ProtocolResults are used directly, incorrectly or not, and a None would make sense. But of course I immediately throw away the None in the code you highlighted above. I suppose there is still some indecision on my part what the right signature should be and which values the ProtocolResult holds on to. I'm tempted to just drop Optional[int] and use int with default 0.

Just from the tests, I see instances where someone would make a ProtocolResult directly and I'll probably do something similar in testing parts of stratocaster downstream:

class DummyProtocolResult(gufe.ProtocolResult): def get_estimate(self): return self.data['estimate'] def get_uncertainty(self): return self.data['uncertainty'] class TestProtocolResult(GufeTokenizableTestsMixin): cls = DummyProtocolResult key = 'DummyProtocolResult-b7b854b39c1e37feabec58b4000680a0' repr = f'<{key}>' @pytest.fixture def instance(self): return DummyProtocolResult( estimate=4.2 * unit.kilojoule_per_mole, uncertainty=0.2 * unit.kilojoule_per_mole, )

Thanks @ianmkenney - is this a time critical thing? I would like to spend a bit of time to check what this means for the existing Protocols, but I don't think I'll have time until Monday :(

Not necessarily. I'll be testing stratocaster strategies based on this branch until it's merged in the hopes that we get some solution that's close in spirit to what we have here.

ianmkenney · 2024-10-31T16:20:00Z

I think the case of infinite ProtocolDAGResults isn't remotely likely, but I see your point. We could slate changing the type hint for gather from Iterable to list for gufe 2.0, indicating that we explicitly expect a list (and therefore a finite number) of ProtocolDAGResults fed to gather.

Do you agree with this?

Yeah that works, I think in the long term it will simplify typing issues.

`_to_dict` and `_from_dict` of `ProtocolResult` were also updated to account for the new n_protocol_dag_results property

Explicitly test `get_uncertainty`, `get_estimate`, and `n_protocol_dag_results`

…t_n_protocol_dag_results

dotsdl · 2024-11-12T17:01:13Z

@IAlibay downstream do you serialize ProtocolResult objects, or ProtocolDAGResult objects? The former will be impacted by this change, but the latter will not.

IAlibay · 2024-11-18T12:37:57Z

@IAlibay downstream do you serialize ProtocolResult objects, or ProtocolDAGResult objects? The former will be impacted by this change, but the latter will not.

We serialize ProtocolResult objects - since this is facing a ton of users, we'd want to be able to deserialize any 1.x ProtocolResults - is that possible? I.e. could we add a try/except in from_dict?

IAlibay

How about this?

gufe/protocols/protocol.py

First attempt to extract n_protocol_dag_results from the data dict. If a KeyError is raised, as you'd find for ProtocolResults serialized prior to this commit, set n_protocol_dag_results to the default value of 0. Co-authored-by: Irfan Alibay <[email protected]>

Test creating a ProtocolResult from a dictionary that is missing the n_protocol_dag_results key.

…ult_n_protocol_dag_results

IAlibay

lgtm thanks!

dotsdl · 2024-11-26T07:07:43Z

@ianmkenney if you can resolve conflicts, I think this one is good to go for merge!

ianmkenney added 3 commits October 30, 2024 15:02

Merge remote-tracking branch 'origin/main' into feature/protocolresul…

dc48543

…t_n_protocol_dag_results

Modified test and fixed typo

d97cf22

Added a test for the infinite generator guardrail

2aa451f

ianmkenney requested a review from dotsdl October 30, 2024 23:14

IAlibay requested changes Oct 31, 2024

View reviewed changes

IAlibay and others added 4 commits October 31, 2024 16:22

Merge branch 'main' into feature/protocolresult_n_protocol_dag_results

ef9c926

Replaced Optional[int] (default None) with int (default 0)

3b5ca83

`_to_dict` and `_from_dict` of `ProtocolResult` were also updated to account for the new n_protocol_dag_results property

Added more tests for the ProtocolResult

1c8bf8c

Explicitly test `get_uncertainty`, `get_estimate`, and `n_protocol_dag_results`

Merge remote-tracking branch 'origin/main' into feature/protocolresul…

32a8d1a

…t_n_protocol_dag_results

dotsdl added this to the Release 1.2 milestone Nov 12, 2024

dotsdl assigned ianmkenney Nov 12, 2024

IAlibay reviewed Nov 18, 2024

View reviewed changes

gufe/protocols/protocol.py Outdated Show resolved Hide resolved

ianmkenney and others added 2 commits November 20, 2024 20:38

Add test for creating a ProtocolResult from dict with missing data

8d7f3b7

Test creating a ProtocolResult from a dictionary that is missing the n_protocol_dag_results key.

ianmkenney force-pushed the feature/protocolresult_n_protocol_dag_results branch from 93adf94 to 8d7f3b7 Compare November 21, 2024 04:40

Merge remote-tracking branch 'upstream/main' into feature/protocolres…

3bac74d

…ult_n_protocol_dag_results

ianmkenney requested a review from IAlibay November 21, 2024 15:44

IAlibay approved these changes Nov 21, 2024

View reviewed changes

Merge branch 'main' into feature/protocolresult_n_protocol_dag_results

6cdbb72

ianmkenney merged commit a9b5982 into OpenFreeEnergy:main Nov 26, 2024
18 of 19 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the `n_protocol_dag_results` property to `ProtocolResult` #381

Add the `n_protocol_dag_results` property to `ProtocolResult` #381

ianmkenney commented Oct 30, 2024

codecov bot commented Oct 30, 2024 •

edited

Loading

ianmkenney commented Oct 30, 2024

dotsdl commented Oct 31, 2024 •

edited

Loading

dotsdl commented Oct 31, 2024

IAlibay left a comment

IAlibay Oct 31, 2024

ianmkenney Oct 31, 2024

IAlibay Oct 31, 2024

ianmkenney Nov 1, 2024

ianmkenney commented Oct 31, 2024

dotsdl commented Nov 12, 2024 •

edited

Loading

IAlibay commented Nov 18, 2024

IAlibay left a comment

IAlibay left a comment

dotsdl commented Nov 26, 2024

Add the n_protocol_dag_results property to ProtocolResult #381

Add the n_protocol_dag_results property to ProtocolResult #381

Conversation

ianmkenney commented Oct 30, 2024

codecov bot commented Oct 30, 2024 • edited Loading

Codecov Report

ianmkenney commented Oct 30, 2024

dotsdl commented Oct 31, 2024 • edited Loading

dotsdl commented Oct 31, 2024

IAlibay left a comment

Choose a reason for hiding this comment

IAlibay Oct 31, 2024

Choose a reason for hiding this comment

ianmkenney Oct 31, 2024

Choose a reason for hiding this comment

IAlibay Oct 31, 2024

Choose a reason for hiding this comment

ianmkenney Nov 1, 2024

Choose a reason for hiding this comment

ianmkenney commented Oct 31, 2024

dotsdl commented Nov 12, 2024 • edited Loading

IAlibay commented Nov 18, 2024

IAlibay left a comment

Choose a reason for hiding this comment

IAlibay left a comment

Choose a reason for hiding this comment

dotsdl commented Nov 26, 2024

Add the `n_protocol_dag_results` property to `ProtocolResult` #381

Add the `n_protocol_dag_results` property to `ProtocolResult` #381

codecov bot commented Oct 30, 2024 •

edited

Loading

dotsdl commented Oct 31, 2024 •

edited

Loading

dotsdl commented Nov 12, 2024 •

edited

Loading