Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a helper function for looking at variables #170

Merged
merged 6 commits into from
Oct 18, 2024

Conversation

jeremykubica
Copy link
Contributor

For common parameters, we might want to extract them from the GraphState without going through the node. This is useful in both logging and debugging. This PR adds a helper function that can extract (uniquely named) parameters from arbitrary nodes in the graph.

@jeremykubica jeremykubica requested a review from mi-dai October 15, 2024 17:12
Copy link

github-actions bot commented Oct 15, 2024

Before [745e20c] After [507b66b] Ratio Benchmark (Parameter)
2.10±1ms 2.27±0.3ms 1.08 benchmarks.TimeSuite.time_fnu_to_flam
1.50±0.07ms 1.59±0.1ms 1.06 benchmarks.TimeSuite.time_apply_passbands
4.47±1ms 4.46±1ms 1 benchmarks.TimeSuite.time_chained_evaluate
8.94±0.09ms 8.95±0.02ms 1 benchmarks.TimeSuite.time_load_passbands
31.7±0.4μs 31.8±0.2μs 1 benchmarks.TimeSuite.time_make_new_salt3_model
1.37±0.01s 1.36±0.01s 0.99 benchmarks.TimeSuite.time_make_x1_from_hostmass
123±0.7μs 120±1μs 0.98 benchmarks.TimeSuite.time_sample_x0_from_distmod
16.1±0.1μs 15.8±0.2μs 0.98 benchmarks.TimeSuite.time_sample_x1_from_hostmass
10.3±0.7ms 9.98±0.9ms 0.97 benchmarks.TimeSuite.time_evaluate_salt3_model
5.75±0.2ms 5.51±0.1ms 0.96 benchmarks.TimeSuite.time_evaluate_salt3_passbands

Click here to view all benchmarks.

Copy link
Contributor

@mi-dai mi-dai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After discussing with @hombit, we think we do not really need the collapse feature, since it's hard to define how close is close enough to be considered the same. I think we can keep just the collapse=False behavior.

@jeremykubica
Copy link
Contributor Author

After discussing with @hombit, we think we do not really need the collapse feature, since it's hard to define how close is close enough to be considered the same. I think we can keep just the collapse=False behavior.

In terms of "close enough", I had been thinking effectively an exact match. If the value is exactly the same then it is likely just information copied between nodes. But if we don't need this, then it is easier not to have it.

If we do not need collapse then we can do this a lot easier by making filtering an option in to_dict. I made that change. What do you think?

@mi-dai
Copy link
Contributor

mi-dai commented Oct 16, 2024

Looks good to me. I think we do not need collapse regardless of whether we can extract the exact same parameter.
But it seems now we are not able to do extract_parameters('node.parameter') any more? I would still like to have that feature. Also to_dict doesn't sound very intuitive compared to extract_parameters, can we still keep extract_parameters?

@jeremykubica
Copy link
Contributor Author

@mi-dai I changed the name from to_dict to extract_parameters. I kept the to_dict function since it is being used in other contexts. It just calls extract_parameters now.

I also re-added the functionality to pass a single string or pass parameter names as a positional argument. I don't think we should include the full name option though, because we can already access that with the [] notation. I added some tests to illustrate this. We should access the value of a single parameter using the full name as state["node_name.param_name"] and only use extract_parameters to get a dictionary of multiple parameters. This saves us from having to check which type is returned.

@jeremykubica jeremykubica requested a review from mi-dai October 16, 2024 14:56
assert results["a.v1"] == 1.0
assert results["a.v2"] == 2.0
assert results["c.v2"] == 5.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking something like extract_parameters(['a.v1','c.v2','v3']) which will return {'a.v1: 1, 'c.v2': 5, 'a.v3': 3, 'c.v3': 3, 'e.v3': 3}. Or extract_parameters(['a.v1','v5']) which will return {'a.v1': 1, 'v5': 7} or {'a.v1': 1, 'e.v5': 7} (either is fine but maybe the former is slightly preferred).
For example, I wanted to get a quick plot of the simulated Hubble Diagram using host redshift, so I'd want to do extract_parameters(['host.redshift','x0','x1','c']), and I don't care (or don't know) what node x0,x1 or c is from.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That behavior was the point of the collapse logic. It allowed returning a single value for a parameter using just the parameter name.

In your case of extract_parameters(['host.redshift','x0','x1','c']) how do you want to handle duplicates? If we find instances of x0 in two different nodes we could:

  1. Throw an error [initial behavior]
  2. Collapse them if they are identical
    2a. Throw an error if they are not identical
    2b. Return both in expanded form if they are not identical [second round behavior]
  3. Return all values [current behavior]
  4. Return an arbitrary one of the x0 values. [Not recommended as the behavior becomes undefined].

More concretely if someone where to call extract_parameters(['ra', 'dec', 't0']) where the host and source have two different values of ra, how would you like the code to behave?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we want to output duplicate parameter names with their node names (say a.x0, b.x0), regardless of whether they are identical in values.
If someone calls extract_parameters(['ra','dec','t0']), we'll return {'host.ra','source.ra','anything_else.ra', ...}, even some of them may be identical in their values.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to add that I think extract_parameters(node=[],parameters=[]) is still useful. Maybe parameters can be expended to include extract_parameters(parameters=['host.redshift','x0','x1','c'])?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I've got it, but let me check.

In the case of a state:

a.v1 -> 1.0
a.v2 -> 2.0
a.v3 -> 3.0
b.v1 -> 4.0
b.v2 -> 5.0
b.v4 -> 6.0
c.v1 -> 7.0

and the query extract_parameters(parameters=['a.v1', 'v2', 'v3', 'b.v4']) you would want the function to return the dictionary:

{
    "a.v1": 1.0,
    "a.v2": 2.0,
    "b.v2": 5.0,
    "v3": 3.0,
    "b.v4": 6.0,
}

Specifically:

  1. The full name are always preserved (even if the parameter does not exist in other nodes).
  2. The shortened name is preserved if the parameter is unique.
  3. The shortened name is transformed into multiple long names if the parameter is not unique.
    Is that correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented those rules. I've dropped the nodes parameter for now, because it is not clear how to incorporate that as well. We can always extract all parameters with to_dict.

@jeremykubica jeremykubica requested a review from mi-dai October 16, 2024 19:38
src/tdastro/graph_state.py Outdated Show resolved Hide resolved
src/tdastro/graph_state.py Outdated Show resolved Hide resolved
src/tdastro/graph_state.py Show resolved Hide resolved
@jeremykubica jeremykubica requested a review from mi-dai October 18, 2024 12:05
@jeremykubica jeremykubica merged commit fa2118b into main Oct 18, 2024
5 checks passed
@jeremykubica jeremykubica deleted the parameter_access_function branch October 18, 2024 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants