refactor: cache signature structure in ops.testing state classes #1499

tonyandrewmeyer · 2024-12-13T03:29:34Z

The Scenario code that limits the number of positional arguments is quite expensive. It is needed in Python 3.8, where dataclasses do not have KW_ONLY, and its cost is a performance regression from older versions of Scenario.

I considered 4 options here:

1. Manually write out the dozen or so __init__ signatures (as we had at one point in the original PR). I'd rather not for the same reasons we made the decision at that time.
2. Figure out a way to have a syntax that allows using regular dataclasses.KW_ONLY with modern Python and only use our custom code on older Python. I spent some time looking into this, and couldn't figure out a way to do it where the code still seemed readable and maintainable enough.
3. Cache the most expensive work (what's in this PR).
4. Don't bother doing anything now, and eagerly wait for the ability to drop 3.8 support. A roughly 5% performance regression felt worth fixing as long as the change is fairly straightforward (although this change only gets about half of that back).

The most expensive parts of the __new__ method are the ones that work with inspect: inspect.signature in particular, but also getting the default value of the parameters. If we assume that no-one is run-time altering the signature of the class (I believe no-one should be) then the values of these never actually change, but we are currently calculating them every time an instance of the class is created. This PR changes that to cache those three values the first time they are needed.
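
A minimal sketch of the caching idea, assuming a simplified base class. The names `MaxPositionalArgs` and `Point` are hypothetical, and the real `__new__` in Scenario does more validation than this; only the attribute names `_max_positional_args`, `_init_kw_only` and `_init_required_args` appear in the PR discussion.

```python
import inspect


class MaxPositionalArgs:
    """Hypothetical base class that caps how many arguments may be positional."""

    _max_positional_args = 1

    def __new__(cls, *args, **kwargs):
        # Check cls.__dict__ (not hasattr) so each subclass caches its own values.
        if "_init_kw_only" not in cls.__dict__:
            # The expensive part: runs once per class, not once per instance.
            parameters = inspect.signature(cls.__init__).parameters
            names = [name for name in parameters if name != "self"]
            cls._init_kw_only = set(names[cls._max_positional_args:])
            cls._init_required_args = [
                name
                for name in names
                if parameters[name].default is inspect.Parameter.empty
            ]
        if len(args) > cls._max_positional_args:
            raise TypeError(
                f"{cls.__name__} takes at most {cls._max_positional_args} "
                f"positional argument(s) but {len(args)} were given"
            )
        return super().__new__(cls)


class Point(MaxPositionalArgs):
    """Hypothetical subclass: x may be positional, y must be a keyword."""

    def __init__(self, x, y=0):
        self.x = x
        self.y = y
```

The first `Point(1, y=2)` call pays for `inspect.signature`; later calls reuse the values stored on the class. This relies on the same assumption as the PR: nobody alters the class signature at run time.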

There's one additional performance tweak in the branch that doesn't make a significant difference, but is trivial to do: when checking if the YAML files exist, skip the filesystem exists() call if we just created an empty temporary directory a few lines above, since we know the files cannot exist there.
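
The shape of that tweak, as a rough sketch: the function `prepare_charm_root` and the list of file names are illustrative assumptions, not the actual code in `_runtime.py`.

```python
import tempfile
from pathlib import Path

# Illustrative file names only; the real runtime decides which files to check.
YAML_NAMES = ("metadata.yaml", "config.yaml", "actions.yaml")


def prepare_charm_root(charm_root=None):
    """Return the charm root and the YAML files already present in it."""
    if charm_root is None:
        root = Path(tempfile.mkdtemp())
        created_empty = True
    else:
        root = Path(charm_root)
        created_empty = False

    existing = []
    for name in YAML_NAMES:
        # Short-circuit: a directory we just created cannot contain the file,
        # so the exists() call is skipped entirely in that case.
        if not created_empty and (root / name).exists():
            existing.append(name)
    return root, existing
```

The saving is one filesystem call per file per run, which matches the description of a small but trivial win.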

A drive-by: while working on this branch I noticed that Harness refers to options.yaml, which does not exist (any more?) as far as I know, so there is a docs tweak to address that.

Timing (best of 3):

Suite	main	branch
operator unit tests (no concurrency)	165.25s	161.89s
traefik scenario tests	45.49s	44.30s
kafka-k8s-operator scenario tests	4.48s	4.38s

Refs #1434

james-garner-canonical

Looks good, clever and simple solution, and a nice performance win.

I was briefly confused when tracing the new vs old logic, because required_args in __new__ stores required (as args -- not kwargs), while cls._init_required_args stores required (as args or kwargs). Idk if there's a clearer short name for _init_required_args though, and it seems easy enough to follow when not trying to go side by side (so fine for future readers).

dimaqq

High level: the time saved is marginal, but the code is rather straightforward and I'm not sure what the alternative would be. Maybe custom types where the inspected bits are inlined or hardcoded?

Anyway +1 on this.

testing/src/scenario/_runtime.py

testing/src/scenario/state.py

dimaqq · 2024-12-16T01:47:44Z

testing/src/scenario/state.py

+            ).parameters
+            cls._init_kw_only = {
+                name
+                for name in tuple(parameters)[cls._max_positional_args :]


I'm assuming this was auto-formatted by ruff, the dangling :] is kinda funny :]

It was, yeah - I lazily just wrote it out on one line and let tox -e fmt clean up.

Does the trailing happy face bother you? I could do something like move tuple(parameters) out to a separate line to get the set comprehension to be on a single line, if that would be better.
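
For illustration, the alternative layout mentioned here might look like the following. The helper `kw_only_names` is hypothetical and stands in for the class-level code in the diff.

```python
import inspect


def kw_only_names(func, max_positional_args):
    """Return the parameter names that fall after the positional limit."""
    # Pulling the tuple out first lets the comprehension fit on one line,
    # so the formatter no longer leaves a dangling ":]".
    names = tuple(inspect.signature(func).parameters)
    return {name for name in names[max_positional_args:]}
```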

tonyandrewmeyer added 3 commits December 13, 2024 16:14

options.yaml doesn't exist (any more?) - avoid referring to config.ya…

8f2bb8a

…ml or charmcraft.yaml.

Don't bother checking if a file exists if we know we just created an …

9086779

…empty directory.

Avoid the expensive inspect calls when the results will be the same f…

14532aa

…or every object.

tonyandrewmeyer requested review from dimaqq and james-garner-canonical December 13, 2024 04:18

tonyandrewmeyer marked this pull request as ready for review December 13, 2024 07:31

james-garner-canonical approved these changes Dec 15, 2024

dimaqq approved these changes Dec 16, 2024

Tweak comment per review.

b080d6a

tonyandrewmeyer merged commit eb80926 into canonical:main Dec 17, 2024
32 checks passed

tonyandrewmeyer deleted the cache-scenario-maxargs-calculations branch December 17, 2024 21:37
