Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handling of partial output format is slow #4579

Closed
niknetniko opened this issue Apr 17, 2023 · 9 comments · Fixed by #4639
Closed

Handling of partial output format is slow #4579

niknetniko opened this issue Apr 17, 2023 · 9 comments · Fixed by #4639
Assignees
Labels
enhancement A change that isn't substantial enough to be called a feature high priority Things we want to see implemented soon

Comments

@niknetniko
Copy link
Member

niknetniko commented Apr 17, 2023

From executing https://dodona.ugent.be/nl/courses/2263/activities/649388223/ on my local machine:

image

Looking at the code, some obvious things seem the validation of each JSON object individually. The same exercise without validating each command gives:

image

@niknetniko niknetniko added the enhancement A change that isn't substantial enough to be called a feature label Apr 17, 2023
@bmesuere bmesuere added this to Roadmap Apr 17, 2023
@github-project-automation github-project-automation bot moved this to Unplanned in Roadmap Apr 17, 2023
@bmesuere
Copy link
Member

bmesuere commented Apr 18, 2023

Validation of non-partial output can also take a while. This is from the legacy python judge from an exercise with 150 tests:
image

For exercises with smaller json output, the construction time is minimal (a few ms). It thus seems that the overhead for calling the validator is low, but runtimes quickly increase with the size of the json to check.

@bmesuere bmesuere added the high priority Things we want to see implemented soon label Apr 18, 2023
@niknetniko
Copy link
Member Author

niknetniko commented Apr 19, 2023

Using the json_schemer gem (for the partial format):

image

Modifying the code to only validate the result at the end gives:
image

I would thus propose using the second approach: validate once at the end to ensure we don't get invalid data in the database, and put a rescue block around the partial updates to catch and handle those errors.

This does mean we might be less strict in checking, since each command is not checked individually in the partial format.

@bmesuere
Copy link
Member

sounds good 👍

@niknetniko niknetniko self-assigned this Apr 19, 2023
@chvp
Copy link
Member

chvp commented Apr 19, 2023

If the checking of the full format fails, could we still check each individual partial command after? For judge development that seems quite important.

I'm also not sure how robust the result constructor is when the partial commands are invalid.

@bmesuere
Copy link
Member

Can't we keep the current behaviour in development, or do we also want these granular checks in production?

@chvp
Copy link
Member

chvp commented Apr 19, 2023

Keeping the current behaviour in development also seems fine.

@bmesuere
Copy link
Member

Switching to the new validator has improved things, but it can still be an issue for certain exercises: https://dodona.ugent.be/nl/submissions/13892016/

I propose to still implement the second part of the original PR.

Do note that there are 1000+ tests in this exercise, so this is also a case of users "consuming" all performance gains by creating more/bigger tests.

@jorg-vr
Copy link
Contributor

jorg-vr commented May 17, 2023

The fixing pull request was reverted. see #4647

@jorg-vr jorg-vr reopened this May 17, 2023
@github-project-automation github-project-automation bot moved this from Done to Todo in Roadmap May 17, 2023
@bmesuere
Copy link
Member

bmesuere commented Oct 21, 2023

Version 2 of the json schemer gem (merged in #4913) had additional performance improvements.
The legacy exercise with a validation time of 4.75 seconds in https://dodona.be/nl/submissions/13892016/ dropped to 0.58 seconds in https://dodona.be/nl/courses/2263/activities/1565418465/

There is thus no real need to implement additional strategies to speed this up. If others agree, this issue can be closed.

(do note that the prepare time is quite high again)

@jorg-vr jorg-vr closed this as completed Oct 23, 2023
@github-project-automation github-project-automation bot moved this from Todo to Done in Roadmap Oct 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A change that isn't substantial enough to be called a feature high priority Things we want to see implemented soon
Projects
Status: Done
4 participants