Remove Line-Number-Based Matching for Analysis Issues and Rely Solely on Message + File #516

fabianvf · 2024-12-06T21:26:16Z

We currently rely on line numbers to differentiate multiple identical violations within a single file. The line numbers probably do not help the LLM fix the correct issue, and they add unnecessary complexity. Instead, we should simplify the approach so that each unique task is keyed off the message and file alone, rather than line numbers. Removing line-number-based matching would also eliminate the need for similarity checks.

We should:

Remove line number from the equality check for at least analyzer violations, and maybe other task types as well.
Use only the file and the violation message (and any unique variables it contains) to determine uniqueness.
Eliminate the complexity introduced by comparing multiple identical violations within the same file at different line numbers.
Confirm that this approach does not reduce the correctness of the LLM-generated solutions.
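
To make the proposal concrete, here is a minimal sketch of an equality check keyed on file and message only. The class name `ViolationTask`, its fields, and the `dedupe` helper are hypothetical and used for illustration; they are not the project's actual task types. The line number is kept on the object for display but excluded from comparison and hashing.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ViolationTask:
    """Hypothetical task type for illustration; not the project's actual class."""

    file: str
    message: str
    # Kept for reporting, but excluded from equality and hashing.
    line: int = field(default=0, compare=False)


def dedupe(tasks):
    """Keep the first task seen for each (file, message) pair."""
    seen = set()
    unique = []
    for task in tasks:
        if task not in seen:
            seen.add(task)
            unique.append(task)
    return unique
```

With this shape, two identical violations in the same file at different lines compare equal and collapse into one task, so no similarity check between them is needed. Whether that collapse affects the quality of the generated fix is the open question in the last item above.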

shawn-hurley · 2025-01-10T16:38:51Z

@fabianvf I think we fixed the issues we had with the equality check. Can we close this, or am I missing something?

fabianvf self-assigned this Dec 6, 2024
