Motivation
Verification is an important step in determining whether a test implementation is valid, but it can be very time-consuming in practice. The new Verifier implementation aims to make verification implementations easier to understand and new verifications easier to add. Assuming that verifications will be expanded and new verifications added more easily (see new verifications for reference), the time it takes to run verification is expected to grow over time.
At the time of this writing, the `update` verification takes 31 seconds; `json` takes 4 seconds, as does `plaintext`. There are 650 test implementations, each of which will end up verifying one or more of these test types. In the best-case scenario (which is not the reality), 2,600 seconds (43 minutes) are spent verifying. In the case where 3 of the 5 test types take 30 seconds each, it looks more like 50,960 seconds (14 hours). A continuous benchmark run can be shortened by several hours via the following:
- The benchmark process will not run verification
- Assume that if a test implementation is not tagged "broken", it has passed verification
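For concreteness, here is a rough sketch of the arithmetic behind those estimates. The per-type timings come from the numbers above; the scenario shapes (one 4-second verification per implementation in the best case, four verifications per implementation in the slow case) are assumptions of this sketch, chosen because they reproduce the quoted totals.

```python
# Back-of-the-envelope verification cost, reproducing the figures quoted
# above. The scenario shapes (one verification per implementation in the
# best case, four in the slow case) are assumptions, not measurements.

IMPLEMENTATIONS = 650

# Best case: every implementation runs a single cheap (4 s) verification.
best_case = IMPLEMENTATIONS * 1 * 4
print(f"best case: {best_case:,} s (~{best_case / 60:.0f} min)")    # 2,600 s (~43 min)

# Slower case: 3/5 of test types take 30 s and 2/5 take 4 s, averaging
# 19.6 s per verification, with four verifications per implementation.
avg_seconds = (3 * 30 + 2 * 4) / 5           # 19.6 s per verification
slow_case = IMPLEMENTATIONS * 4 * avg_seconds
print(f"slow case: {slow_case:,.0f} s (~{slow_case / 3600:.0f} h)") # 50,960 s (~14 h)
```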
History
Currently (and also in the legacy implementation), running a benchmark of a given test implementation incurs the cost of verification. This is done to ensure that no time is spent benchmarking a test which will not respond correctly at the endpoint being benchmarked (e.g. if `fortune` returns a 500 instead of a 200, it should not be measured).
The legacy implementation imposed this rule because verification came as an afterthought to the benchmarking process. Originally, the legacy implementation did benchmark test implementations that returned a 500 response, for example. Eventually, the verification step was added to ensure that tests were implemented correctly, and patches were made retroactively to get existing tests to pass verification. Verification has been the standard for several years now, and we seem to be past the point where test implementations that do not pass verification get merged.
Drawbacks
- A clever, malicious contributor could open a pull request with a sophisticated black-box framework implementation (as a linked library rather than source code) that passes verification in order to get merged, but returns empty 200s for all the tests (or mounts a similar attack)
Implement a "light" verification step tailored to running a benchmark, which only checks that the service is available and returning a 200 response. This would alleviate, somewhat, the unreliable failures drawback mentioned above.
Alternatives
- Leave verification as a first step to running a benchmark