Perf: Remove unnecessary `deepcopy` calls in `Schemaview.induced_slot` #296

sneakers-the-rat · 2024-02-13T05:26:13Z

`induced_slot` is one of the largest sources of slow performance in the non-graph-based generators.

Currently the test suite takes me 50 minutes to run, which is entirely too long imo. We want tested contributions, but if I have to wait 50 minutes before submitting a PR or between every change, the odds really start to climb that I'm just going to forget about it.

This gets us down to 40 minutes essentially for free (and with pydanticgen it turns a ~60s generation process into ~2-3s).

First: we can remove the `deepcopy` at the end. It's wholly unnecessary, since `induced_slot` is already deepcopied earlier in the method, and we don't need to protect something that will die immediately when the method ends.

Second: we also don't need the `deepcopy` in the middle of the method:

- `get_slot` is cached.
- It draws from `all_slots`, which is also cached.
- `all_slots` makes a (shallow) copy of `self._get_dict` as well.
- If `get_slot` doesn't find anything in `all_slots`, it makes a shallow copy of the class attribute.

So we shouldn't need `deepcopy` in `induced_slot`, since the thing we draw from should already be cached and copied: future calls to `get_slot` should get the cached copy even if we mutate it downstream. I replaced it with a regular `copy`, since that's much faster if a copy is necessary at all.
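To make the shallow-vs-deep distinction concrete, here is a minimal sketch. It uses a hypothetical `Slot` dataclass as a stand-in (it is not the real `SlotDefinition`, and the field names are assumptions). It only illustrates how `copy.copy` and `copy.deepcopy` behave in general: rebinding an attribute on a shallow copy leaves the source object alone, while nested containers stay shared.

```python
from copy import copy, deepcopy
from dataclasses import dataclass, field


@dataclass
class Slot:
    """Hypothetical stand-in for a slot definition (assumption, not linkml's class)."""
    name: str
    range: str = "string"
    annotations: dict = field(default_factory=dict)


# Pretend this object lives in a cache and must not change.
cached = Slot(name="age", annotations={"unit": "years"})

# Shallow copy: a new top-level object whose attributes point at the same values.
shallow = copy(cached)
shallow.range = "integer"  # rebinding on the copy does not touch `cached`

# Deep copy: nested containers are duplicated too, at a higher cost.
deep = deepcopy(cached)

print(cached.range)                               # still the original value
print(shallow.annotations is cached.annotations)  # nested dict is shared
print(deep.annotations is cached.annotations)     # nested dict is duplicated
```

The practical consequence is that a shallow copy is enough when downstream code only reassigns top-level attributes; in-place mutation of a shared nested container would still reach the source object.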

Third: avoid calling `get_slot` twice with differing call signatures, since the first call is only necessary if we take the first leg of the `if cls` statement.
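The shape of that change can be sketched as follows. Everything here is hypothetical: `get_slot` is a toy function that just records its calls, the `strict` keyword and the returned dicts are illustrative only, and neither function reproduces the real method body. The point is only that the first lookup moves inside the branch that uses it.

```python
calls = []  # records every lookup so the two versions can be compared


def get_slot(name, strict=False):
    """Toy stand-in for a slot lookup (assumption, not the real API)."""
    calls.append((name, strict))
    return {"name": name}


def induced_before(name, cls=None):
    slot = get_slot(name)  # always performed, even when the result is unused
    if cls is not None:
        return {**slot, "owner": cls}
    return get_slot(name, strict=True)  # second lookup, different signature


def induced_after(name, cls=None):
    if cls is not None:
        slot = get_slot(name)  # only performed in the branch that needs it
        return {**slot, "owner": cls}
    return get_slot(name, strict=True)


calls.clear()
induced_before("age")
lookups_before = len(calls)

calls.clear()
induced_after("age")
lookups_after = len(calls)

print(lookups_before, lookups_after)
```

Both versions return the same value for the same arguments; the second one simply skips the lookup whose result would be thrown away.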

Fourth: the output of `induced_slot` is also cached, but it still seems like we're getting cache misses from the difference between passing the slot name as a `str` vs. a `SlotDefinitionName`. Not fixed in this PR.
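One possible way to check for, and sidestep, this kind of miss is to normalize the key to a plain `str` before it reaches the cached function, and then read the hit/miss counters from `cache_info()`. This is a sketch only and is not what this PR does: `SlotName` is a hypothetical stand-in for `SlotDefinitionName`, and `lookup`/`_lookup` are made-up names.

```python
from functools import lru_cache


class SlotName(str):
    """Hypothetical str subclass standing in for SlotDefinitionName."""


@lru_cache(maxsize=None)
def _lookup(name: str) -> str:
    # Placeholder for an expensive computation keyed on the slot name.
    return name.upper()


def lookup(name) -> str:
    # str() on an instance of a plain str subclass returns an exact str,
    # so both spellings of the name end up under the same cache key.
    return _lookup(str(name))


lookup("age")            # first call: computed and stored
lookup(SlotName("age"))  # same key after normalization: served from cache

info = _lookup.cache_info()
print(info.hits, info.misses)
```

Inspecting `cache_info()` on the real cached method before and after a generator run would be one way to confirm whether the suspected misses actually occur.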

Cumulative time in `Schemaview.induced_slot` before: 510s (95832 calls)
Cumulative time in `Schemaview.induced_slot` after: 243s (roughly a 2.1x speed-up)
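The PR does not say how these figures were collected. Cumulative time and call counts of this kind are what the standard-library `cProfile` module reports, so a measurement along the following lines is one plausible way to reproduce them. The profiled function and workload are placeholders, not the real test suite.

```python
import cProfile
import pstats


def induced_slot_stand_in(n):
    """Placeholder for the function being measured (assumption)."""
    return sum(range(n))


def workload():
    # Placeholder for "run the generators / the test suite".
    for _ in range(100):
        induced_slot_stand_in(1000)


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

stats = pstats.Stats(profiler)

# Each entry maps (filename, lineno, funcname) to
# (primitive calls, total calls, total time, cumulative time, callers).
call_count = None
cumulative_seconds = None
for (_filename, _lineno, funcname), (_cc, nc, _tt, ct, _callers) in stats.stats.items():
    if funcname == "induced_slot_stand_in":
        call_count = nc
        cumulative_seconds = ct

print(f"{call_count} calls, {cumulative_seconds:.4f}s cumulative")
```

Running the same script on the base and head commits and comparing the cumulative column gives a before/after pair like the one quoted above.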

codecov · 2024-02-13T05:27:48Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (2e35728) 62.90% compared to head (fe8e821) 62.90%.
Report is 1 commit behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #296   +/-   ##
=======================================
  Coverage   62.90%   62.90%           
=======================================
  Files          62       62           
  Lines        8529     8529           
  Branches     2239     2239           
=======================================
  Hits         5365     5365           
  Misses       2554     2554           
  Partials      610      610


Remove unnecessary deepcopies and avoid multiple calls to get_slot

fe8e821

sneakers-the-rat mentioned this pull request Feb 13, 2024

Perf: Mark shex generator as slow, skip except for gh actions runner linkml/linkml#1921

Merged

sierra-moxon approved these changes Feb 13, 2024


cmungall merged commit c9adcae into linkml:main Feb 14, 2024
7 checks passed

sneakers-the-rat mentioned this pull request Feb 15, 2024

Pydanticgen - Modularize template linkml/linkml#1927

Merged

4 tasks

sujaypatil96 mentioned this pull request Feb 20, 2024

optimize get_classes_by_slot() in schemaview.py #300

Merged

sneakers-the-rat mentioned this pull request Jul 23, 2024

induced_slot mutates values and does a hidden deepcopy and generators have become reliant on that linkml/linkml#2219

Open
