Added new extract answer feature #148 (Open)

whitead wants to merge 4 commits into main
Conversation

whitead (Contributor) commented Dec 11, 2024

Standardizing extraction of answers

@whitead whitead requested a review from sidnarayanan December 11, 2024 17:01
@dosubot dosubot bot added the size:L label (This PR changes 100-499 lines, ignoring generated files) Dec 11, 2024
@whitead whitead requested a review from jamesbraza December 11, 2024 17:01
@dosubot dosubot bot added the enhancement label (New feature or request) Dec 11, 2024
Comment on lines +30 to +31
"If the proposed answer is empty, invalid, or ambiguous, "
"return an empty string."
Collaborator

Can you upstream some of the tests from https://github.com/Future-House/paper-qa/blob/v5.8.0/tests/test_litqa.py#L117 to here? I think we should also have "multiple options are matched" mentioned somewhere.

It would be nice if paper-qa could just import this function and use it, instead of having its own evaluation LLM prompt.
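A minimal sketch of the kind of test this suggests, assuming a hypothetical async extract_answer(proposed, options) helper and pytest-asyncio; the import path and signature are assumptions, not this PR's actual API. The empty-string expectations follow the prompt quoted above ("return an empty string" for empty or ambiguous answers):

```python
import pytest

# Hypothetical import path; the real function name/location in this PR may differ.
from aviary.utils import extract_answer


@pytest.mark.asyncio
@pytest.mark.parametrize(
    ("proposed", "options", "expected"),
    [
        ("The answer is B", ["A", "B", "C"], "B"),    # clean single match
        ("", ["A", "B", "C"], ""),                    # empty proposed answer
        ("It could be A or B", ["A", "B", "C"], ""),  # multiple options matched
    ],
)
async def test_extract_answer(proposed: str, options: list[str], expected: str) -> None:
    assert await extract_answer(proposed, options) == expected
```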

whitead (Contributor Author)

Yup - will do

Comment on lines +108 to +114
try:
    from litellm import acompletion
except ImportError as e:
    raise ImportError(
        "eval_answer requires the 'llm' extra for 'litellm'. Please:"
        " `pip install aviary[llm]`."
    ) from e
Collaborator

Can you look at how ToolSelector.__init__ takes acompletion: "Callable[..., Awaitable[ModelResponse]] | None" = None, and apply it as an input arg here?

What this does is let people use local models.

Also, feel free to YAGNI on this one
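A rough sketch of how that could look here, keeping the optional litellm fallback from the diff above; the eval_answer name, signature, default model, and return value are illustrative assumptions, not this PR's actual code:

```python
from collections.abc import Awaitable, Callable
from typing import Any


async def eval_answer(
    proposed: str,
    correct: str,
    acompletion: Callable[..., Awaitable[Any]] | None = None,  # caller-supplied, e.g. a local model
    model: str = "gpt-4o-mini",
) -> bool:
    """Sketch: judge a proposed answer, optionally via an injected completion coroutine."""
    if acompletion is None:
        # Fall back to litellm only when no completion function is injected,
        # mirroring the optional-dependency guard in the diff above.
        try:
            from litellm import acompletion
        except ImportError as e:
            raise ImportError(
                "eval_answer requires the 'llm' extra for 'litellm'. Please:"
                " `pip install aviary[llm]`."
            ) from e
    response = await acompletion(
        model=model,
        messages=[
            {
                "role": "user",
                "content": (
                    f"Proposed answer: {proposed}\nCorrect answer: {correct}\n"
                    "Reply with exactly 'yes' if they match, otherwise 'no'."
                ),
            }
        ],
    )
    return response.choices[0].message.content.strip().lower().startswith("yes")
```

Passing a callable this way keeps litellm optional while letting callers route evaluation through any local or self-hosted model that exposes the same awaitable interface.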

@@ -22,6 +22,21 @@
    "temperature": 0,
}

LLM_EXTRACT_CONFIG = {
    "prompt": (
        "You are evaluating answers for a test which has fixed options. "
Collaborator

Can you add some statement that focuses the LLM on the message history?

Otherwise, in paper-qa, we witnessed the LLM using its innate knowledge.

whitead (Contributor Author)

There is no message history here (?) Not sure what you mean?

Collaborator

I guess this function is responsible for both (1) extracting a letter and (2) ensuring it matches a multiple choice option.

What we saw in paper-qa was that, in the case of an empty string answer, the LLM would pull on its innate knowledge and could select the correct multiple choice option.

So for this:

Can you add some statement that focuses the LLM on the message history?

I guess what I should have said was: can you add a statement that focuses the LLM on just the proposed: str, and tries to avoid pulling on any innate knowledge?

whitead (Contributor Author)

Yea - I was smart and didn't put the question into these, so there's no way it could get confused and try to answer.

whitead (Contributor Author)

We simultaneously posted.

There is no question. So I don't see how it would be possible for it to attempt to answer. I don't know what else I could write to make it more clear in the prompt.

Collaborator

Ahhh I see, very clever! I follow and you're right.

Do you mind documenting that rationale somewhere in the code? Maybe a docstring in extract_answer_llm.
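A sketch of what that docstring could say, assuming the function ends up named extract_answer_llm as suggested; the signature and placeholder body are assumptions, not the PR's actual implementation:

```python
async def extract_answer_llm(proposed: str, options: list[str]) -> str:
    """Extract which of the fixed options a proposed answer corresponds to.

    The evaluation prompt deliberately contains only the proposed answer and
    the fixed options, never the original question. Because the LLM never sees
    the question, it cannot fall back on its innate knowledge to pick the
    correct option when the proposed answer is empty, invalid, or ambiguous;
    in those cases it is instructed to return an empty string.
    """
    ...  # placeholder body; see the PR diff for the real implementation
```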

@jamesbraza jamesbraza (Collaborator) left a comment

Approving, though there are still outstanding comments.

