Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: OPTIC-1407: Optimize the tasks API paginated prediction and annotation totals #6735

Merged
merged 6 commits into from
Dec 2, 2024

Conversation

bmartel
Copy link
Contributor

@bmartel bmartel commented Nov 28, 2024

PR fulfills these requirements

  • Commit message(s) and PR title follows the format [fix|feat|ci|chore|doc]: TICKET-ID: Short description of change made ex. fix: DEV-XXXX: Removed inconsistent code usage causing intermittent errors
  • Tests for the changes have been added/updated (for bug fixes/features)
  • Docs have been added/updated (for bug fixes/features)
  • Best efforts were made to ensure docs/code are concise and coherent (checked for spelling/grammatical errors, commented out code, debug logs etc.)
  • Self-reviewed and ran all changes on a local instance (for bug fixes/features)

Change has impacts in these area(s)

(check all that apply)

  • Product design
  • Backend (Database)
  • Backend (API)
  • Frontend

Describe the reason for change

At larger counts of tasks and annotations, these 2 queries would become incredibly inefficient and cause slowdowns. This PR instead uses the task column values to do aggregation and simple SUM's on the values to produce the same output.

Does this change affect performance?

Improves performance about 15-20x when looking at volumes of data where there are greater than 500K tasks, and 500K annotations per project.

What alternative approaches were there?

  • Moving the original subquery to a JOIN, which is difficult to do without dropping down to raw SQL.
  • Similarily this query while an improvement could possibly benefit further from moving this subquery to a JOIN. Same point of why not, it would likely require dropping out of the ORM to achieve as it really likes to use subqueries (which do tend to be faster in most cases, just not at certain scale)

What feature flags were used to cover this change?

  • fflag_fix_back_optic_1407_optimize_tasks_api_pagination_counts

Does this PR introduce a breaking change?

(check only one)

  • Yes, and covered entirely by feature flag(s)
  • Yes, and covered partially by feature flag(s)
  • No
  • Not sure (briefly explain the situation below)

What level of testing was included in the change?

(check all that apply)

  • e2e
  • integration
  • unit

Which logical domain(s) does this change affect?

Tasks List API

Copy link

netlify bot commented Nov 28, 2024

Deploy Preview for label-studio-docs-new-theme canceled.

Name Link
🔨 Latest commit c730e49
🔍 Latest deploy log https://app.netlify.com/sites/label-studio-docs-new-theme/deploys/6749e1b545d962000800566f

Copy link

netlify bot commented Nov 28, 2024

Deploy Preview for heartex-docs canceled.

Name Link
🔨 Latest commit c730e49
🔍 Latest deploy log https://app.netlify.com/sites/heartex-docs/deploys/6749e1b5bd4b22000809fb2d

@github-actions github-actions bot added the fix label Nov 28, 2024
Copy link

codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 90.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 76.70%. Comparing base (e5c1339) to head (c730e49).
Report is 5 commits behind head on develop.

Files with missing lines Patch % Lines
label_studio/data_manager/api.py 90.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #6735      +/-   ##
===========================================
- Coverage    76.73%   76.70%   -0.04%     
===========================================
  Files          170      170              
  Lines        13885    13919      +34     
===========================================
+ Hits         10655    10676      +21     
- Misses        3230     3243      +13     
Flag Coverage Δ
pytests 76.70% <90.00%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@bmartel bmartel merged commit f293118 into develop Dec 2, 2024
42 of 43 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants