Fix freeform guardrails #918

ansoncfit · 2023-12-19T04:24:59Z

We currently limit the number of destinations and OD pairs in regional analyses with time/path results. This PR fixes a couple issues with the related guardrails:

Fixes a conditional so that when oneToOne = true, the broker checks the number of OD pairs (which equals the number of origins) against the correct limit, rather than just allowing the analysis to proceed.
Checks limit on number of destinations for path analyses in broker. Workers were applying a stricter limit (5000 destinations for path analyses) than the broker, so the broker could potentially enqueue millions of tasks that would trigger an immediate error once they reached the workers.

Small refactor for brevity/legibility: task = job.templateTask

Correct logic for checking oneToOne analyses Check limit on number of destinations for path analyses in broker

src/main/java/com/conveyal/r5/analyst/cluster/PathResult.java

abyrd · 2023-12-20T05:48:59Z

The underlying problem where tasks continue to be distributed to workers after an error was reported should be solved by #921. With that fixed, mismatches in validation rules between different components (UI, backend, workers) should be much less problematic and just result in a few error messages.

That said, it's also a good idea to make these checks more specific as in this PR.

ansoncfit · 2023-12-26T20:10:07Z

src/main/java/com/conveyal/analysis/results/MultiOriginAssembler.java

+                int nODPairs = task.oneToOne ? nOriginsTotal : nOriginsTotal * nDestinations;
+                if (task.recordTimes &&
+                    (nDestinations > MAX_FREEFORM_DESTINATIONS || nODPairs > MAX_FREEFORM_OD_PAIRS)) {
+                    throw new AnalysisServerException(String.format(


Is throw AnalysisServerException.badRequest(... preferable here?

Is it straightforward to apply filterStackTrace (moved to ExceptionUtils in #921) to the message eventually passed along to the UI?

On your second question: yes, I think so. It looks like we could apply filterStackTrace to the message parameter in each call to HttpApi#respondToException (there are five of them just above the method definition).

On your first question: it looks like calling badRequest would yield an identical AnalysisServerException instance but with type BAD_REQUEST instead of UNKNOWN. It looks like the only effect would be the string BAD_REQUEST instead of UNKNOWN in a field in the JSON returned to the UI. Assuming this does not change UI behavior in any problematic way, yes badRequest does seem to better represent the source of the exception.

Thanks for pointing out that method. I think we want to apply it to the stackTrace, not the message. I committed that below for testing on staging, but feel free to revert.

Oh right. It was late when I was reading that and I mistakenly thought the callers were passing e.getStackTrace() as the message. This was probably driven by some kind of wishful thinking because I realized filterStackTrace is so narrowly suited to output from one method (throwable.printStackTrace via ExceptionUtils.stackTraceString) that it probably shouldn't be public and applicable to the more general String type. This is a minor detail but it's located in error handling code that should be foolproof (things could rapidly get annoying to debug if the signal path for the debugging messages themselves is failing). It looks like it is failsafe and would at worst return "Unknown stack frame" but there could be some edge case we're not thinking of.
Of course in current usage, both places the method is used are filterStackTrace(errorEvent.stackTrace) so we know it's always being applied to a final field that's the output of ExceptionUtils.filterStackTrace(). This should work fine but I'm still thinking about ways to clean up the method signature.
Option A: have the public method be public static String filterStackTrace (Throwable throwable) then call that in the ErrorEvent constructor. This would cause all console log traces to be filtered. All others (Slack) are already filtered.
Option B: move filterStackTrace back to a private method on ErrorEvent and always use it via a public wrapper method that only reads from ErrorEvent.stackTrace. (Won't work because we need it in RegionalWorkResult)
Option C: Add a field to ErrorEvent so it has both the full and filtered stack trace
Option D: keep the entire Throwable in a private field on the ErrorEvent and generate the stack traces on demand

I did option C as it ensures the trace string always came from the right method but still allows for verbose stack traces in console logs. Seems to work in local testing on some missing regional results.

D looks cleaner, but it defers processing of the stack trace to the moment the log or slack message is written, thus deferring some potential failures to those background event handling threads. I'd rather front-load the stack trace processing to the point where the ErrorEvent is created, containing any delays or problems closer to the source of the original error.

abyrd

I am approving this as-is in case you want to merge it now. Feel free to add the exception changes discussed in the PR comments if you like in which case I can merge immediately after re-review.

BAD_REQUEST instead of UNKNOWN

#918 (comment)

* fix comment on NETWORK_FORMAT_VERSION * Do not record opportunities 120 minutes away * Fix freeform guardrails Correct logic for checking oneToOne analyses Check limit on number of destinations for path analyses in broker * only record job when assembler created, fixes conveyal#887 * record errors (stopping job) before other checks addresses conveyal#887 * report worker errors as much shorter stack trace move filterStackTrace method into shared utility class * record only one error per job in broker addresses conveyal#919 and conveyal#887 * Add final modifier to PathResult constants * Use specific AnalysisServerException BAD_REQUEST instead of UNKNOWN * Filter stack traces sent to UI * build filtered stack trace directly from throwable conveyal#918 (comment) * Update to 2020 Census geometries * Update seamless-census test fixtures with 2020 Census geometries * do not include stacktrace in message rethrow AnalysisServerException since they already have clear messages * Implement Timo Jaakkonen’s crossing delays (#4), upload package to DGL GitHub organisation * Jaakkonen (2013)’s values * float -> int delays * let’s try this * enough for today * too many changes for such a small patch 😬 * fixed test * shield against undefined date * 😬 * am I allowed to push to DGL’s packages? * linted, moved package destination to my own fork * moved repo to fork * build package on pull request, publish to dgl’s repo * build package on pull request, publish to dgl’s repo * build package on pull request, publish to dgl’s repo * build package on pull request, publish to dgl’s repo * build package on pull request, publish to dgl’s repo * Run on older image, as ubuntu-latest has gradle-8, which fails (#5) * Freeze gradle version (#6) * Run on older image, as ubuntu-latest has gradle-8, which fails * freeze gradle version * fix build workflow (#7) * Run on older image, as ubuntu-latest has gradle-8, which fails * freeze gradle version * refactor workflow, correct repo package address * fix build workflow (#8) * Run on older image, as ubuntu-latest has gradle-8, which fails * freeze gradle version * refactor workflow, correct repo package address * clean-up * clean-up * token * print debug info * configurable (per-repo) package registry (#9) * Run on older image, as ubuntu-latest has gradle-8, which fails * freeze gradle version * refactor workflow, correct repo package address * clean-up * clean-up * token * print debug info * flexible package repo address * remove debug statement * configurable package registry * debugging GH actions * debugging ii * fix build system (#10) * Run on older image, as ubuntu-latest has gradle-8, which fails * freeze gradle version * refactor workflow, correct repo package address * clean-up * clean-up * token * print debug info * flexible package repo address * remove debug statement * configurable package registry * debugging GH actions * debugging ii * asfd * remove debug statements * read custom OSM tags, use them for bike routing (#11) * read custom OSM tags, use them for bike routing * allocate memory * write on correct edge * Latest 7.x.x gradle * Fix bicycle speed tag reading * Allow publish step to fail (in pull requests) * Add default speeds that represent the speed limits on Finnish roads (#13) * revert Jaakkola implementation (#16) * disable turning penalties for bicycle and car, do not apply perceived length multiplier for bicycle routing (#17) * Use per-edge speeds for cycling (#19) * Use per-edge speeds for cycling * Timo Jakkola’s times (#21) * save shapes of GTFS patterns (#22) * 23 roll back jaakkola congestion timings (#24) * save shapes of GTFS patterns * Roll back Jaakkola congestion penalties * make MAX_PATH_DESTINATIONS *not* final --------- Co-authored-by: abyrd <[email protected]> Co-authored-by: ansons <[email protected]> Co-authored-by: Anson Stewart <[email protected]>

Fix freeform guardrails

6b6c767

Correct logic for checking oneToOne analyses Check limit on number of destinations for path analyses in broker

abyrd reviewed Dec 19, 2023

View reviewed changes

src/main/java/com/conveyal/r5/analyst/cluster/PathResult.java Outdated Show resolved Hide resolved

Add final modifier to PathResult constants

2481a93

ansoncfit commented Dec 26, 2023

View reviewed changes

Merge branch 'dev' into freeform-guardrails

06e479f

abyrd previously approved these changes Dec 28, 2023

View reviewed changes

Use specific AnalysisServerException

9690d9d

BAD_REQUEST instead of UNKNOWN

ansoncfit dismissed abyrd’s stale review via 9690d9d December 29, 2023 05:07

ansoncfit and others added 2 commits December 28, 2023 21:07

Filter stack traces sent to UI

9abf498

build filtered stack trace directly from throwable

6a6ab47

#918 (comment)

ansoncfit merged commit 14cbb20 into dev Dec 31, 2023
3 checks passed

ansoncfit deleted the freeform-guardrails branch December 31, 2023 00:15

abyrd mentioned this pull request Jan 3, 2024

Reduce errors sent to broker #919

Closed

rafapereirabr mentioned this pull request Mar 11, 2024

Error testing R5 v7.1 - does not have member field 'int maxDestinations' ipeaGIT/r5r#378

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix freeform guardrails #918

Fix freeform guardrails #918

ansoncfit commented Dec 19, 2023 •

edited

Loading

abyrd commented Dec 20, 2023 •

edited

Loading

ansoncfit Dec 26, 2023 •

edited

Loading

abyrd Dec 28, 2023 •

edited

Loading

abyrd Dec 28, 2023

ansoncfit Dec 29, 2023

abyrd Dec 29, 2023

abyrd Dec 29, 2023 •

edited

Loading

abyrd left a comment

Fix freeform guardrails #918

Fix freeform guardrails #918

Conversation

ansoncfit commented Dec 19, 2023 • edited Loading

abyrd commented Dec 20, 2023 • edited Loading

ansoncfit Dec 26, 2023 • edited Loading

Choose a reason for hiding this comment

abyrd Dec 28, 2023 • edited Loading

Choose a reason for hiding this comment

abyrd Dec 28, 2023

Choose a reason for hiding this comment

ansoncfit Dec 29, 2023

Choose a reason for hiding this comment

abyrd Dec 29, 2023

Choose a reason for hiding this comment

abyrd Dec 29, 2023 • edited Loading

Choose a reason for hiding this comment

abyrd left a comment

Choose a reason for hiding this comment

ansoncfit commented Dec 19, 2023 •

edited

Loading

abyrd commented Dec 20, 2023 •

edited

Loading

ansoncfit Dec 26, 2023 •

edited

Loading

abyrd Dec 28, 2023 •

edited

Loading

abyrd Dec 29, 2023 •

edited

Loading