A more general solution to model answer extraction instead of `output_regex` #358

sadra-barikbin · 2024-10-11T19:13:34Z

Hi there!

Here is a PR for model answer extraction. We currently have `output_regex`; this PR attempts to replace it with a more general solution.
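To illustrate what a more general solution than a single regex could look like, here is a minimal sketch. The commit list of this PR mentions an `AnswerExtractor` type, but its definition is not shown in this thread, so the alias and the helpers `regex_extractor`, `first_match`, and `last_line` below are hypothetical stand-ins, not lighteval's API.

```python
import re
from typing import Callable, Optional

# Hypothetical alias: maps raw model output to an extracted answer,
# or None when nothing could be extracted.
AnswerExtractor = Callable[[str], Optional[str]]


def regex_extractor(pattern: str, group: int = 1) -> AnswerExtractor:
    """Build an extractor from a regex, i.e. what a single regex option can express."""
    compiled = re.compile(pattern)

    def extract(output: str) -> Optional[str]:
        match = compiled.search(output)
        return match.group(group).strip() if match else None

    return extract


def last_line(output: str) -> Optional[str]:
    """Fallback extractor: return the last non-empty line of the output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return lines[-1] if lines else None


def first_match(*extractors: AnswerExtractor) -> AnswerExtractor:
    """Try extractors in order and return the first non-None result."""

    def extract(output: str) -> Optional[str]:
        for extractor in extractors:
            answer = extractor(output)
            if answer is not None:
                return answer
        return None

    return extract


# A regex is just one extractor among others that can be composed.
extract_answer = first_match(
    regex_extractor(r"answer is\s*\(?([A-D])\)?"),
    last_line,
)
```

Treating the extractor as a plain callable means a regex, a fallback rule, or any custom parsing function can be plugged in the same way.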

By the way, the current `output_regex` seems to be broken: it is never passed to `apply_generative_metric` in `Pipeline._compute_metrics`.

Fixes #360

sadra-barikbin · 2024-10-11T19:22:52Z

src/lighteval/tasks/lighteval_task.py

@@ -504,7 +507,19 @@ def get_metric_method_from_category(self, metric_category):
        if not self.has_metric_category[metric_category]:
            raise ValueError(f"Requested a metric category {metric_category} absent from the task list.")

-        return LightevalTask._get_metric_method_from_category(metric_category)
+        metric_method = LightevalTask._get_metric_method_from_category(metric_category)
+        # Bad hack. I had no other way.


As a workaround, I suggest passing the task as an argument to the `apply_*_metrics` functions.
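A rough sketch of that workaround, under stated assumptions: the real signatures of the `apply_*_metrics` functions are not shown in this thread, so `ToyTask` and `apply_toy_metric` below are hypothetical names that only illustrate passing the task to the metric-application function instead of wrapping the returned metric method.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class ToyTask:
    """Hypothetical stand-in for a task that may carry an answer extractor."""

    name: str
    answer_extractor: Optional[Callable[[str], str]] = None


def apply_toy_metric(prediction: str, gold: str, task: ToyTask) -> Dict[str, float]:
    """Apply the task's extractor (if any) before scoring with exact match."""
    if task.answer_extractor is not None:
        prediction = task.answer_extractor(prediction)
    return {"exact_match": float(prediction.strip() == gold.strip())}


# The task is passed explicitly, so the metric function can read its settings.
task = ToyTask("demo", answer_extractor=lambda text: text.split(":")[-1])
scores = apply_toy_metric("Answer: B", "B", task)
```

Because the task travels with the call, no patching of the metric method is needed for the extractor to reach the scoring step.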

clefourrier · 2024-10-14T07:32:24Z

Hi Sadra,
Sorry if I missed it, but did you discuss this PR in an issue before adding it?

src/lighteval/metrics/__init__.py

Co-authored-by: Nathan Habib <[email protected]>

clefourrier · 2024-12-12T11:24:30Z

Hi! We removed `output_regex` a couple of PRs ago, and we would likely want this answer extraction step to happen at the metric stage rather than in the task itself, so I'm going to close this PR.
Please discuss the features you want to add in issues first; once we approve them, feel free to go ahead.
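For readers wondering what extraction "at the metric stage" might mean in practice, here is a minimal sketch. It is an assumption about the general idea, not the maintainers' design: `exact_match`, `with_extraction`, and `after_marker` are hypothetical names and do not come from lighteval.

```python
from typing import Callable, Optional

Metric = Callable[[str, str], float]
Extractor = Callable[[str], Optional[str]]


def exact_match(prediction: str, gold: str) -> float:
    """Score 1.0 when the stripped strings are identical, else 0.0."""
    return float(prediction.strip() == gold.strip())


def with_extraction(metric: Metric, extractor: Extractor) -> Metric:
    """Wrap a metric so the answer is extracted from the raw output before scoring."""

    def wrapped(prediction: str, gold: str) -> float:
        extracted = extractor(prediction)
        if extracted is None:
            # Nothing could be extracted, so count the sample as wrong.
            return 0.0
        return metric(extracted, gold)

    return wrapped


def after_marker(output: str) -> Optional[str]:
    """Hypothetical extractor: take the text after the last '####' marker."""
    if "####" not in output:
        return None
    return output.rsplit("####", 1)[-1]


# The extraction logic lives with the metric; the task definition stays unchanged.
gsm_style_match = with_extraction(exact_match, after_marker)
```

With this arrangement, two metrics on the same task could use different extractors, which is harder to express when extraction is configured once per task.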

sadra-barikbin added 3 commits October 11, 2024 22:39

Implement the feature

a0c6bb2

Remove redundant code, now that we're on the main

c7d2b55

Fix AnswerExtractor type ann.

b802c04


NathanHB reviewed Oct 14, 2024

src/lighteval/metrics/__init__.py (outdated)

Update src/lighteval/metrics/__init__.py

3218483

Co-authored-by: Nathan Habib <[email protected]>

clefourrier closed this Dec 12, 2024
