create `Argilla` integration for `haystack`nodes #25

davidberenstein1957 · 2023-07-13T06:58:33Z

I think it would be cool to be able to monitor Haystack pipelines and nodes directly to Argilla.

https://docs.haystack.deepset.ai/docs/prompts_and_llms could be similar to https://docs.argilla.io/en/latest/guides/llms/practical_guides/use_argilla_callback_in_langchain.html

However, I can also see support for:

davidberenstein1957 · 2023-07-13T06:59:49Z

Does it make sense from your perspective?
In terms of usage, what should be the prioritization for each one of these supported tasks?

davidberenstein1957 · 2023-07-14T06:44:28Z

@TuanaCelik, what do you think about this issue?

TuanaCelik · 2023-07-17T09:52:08Z

@davidberenstein1957 this to me sounds like an amazing idea. @masci @anakin87 wdyt?

In terms of usage, I think the most interesting one might be PromptNode as you mentioned. But I can see it extending to the rest as well

anakin87 · 2023-07-17T09:59:35Z

It seems great to me!

I haven't tried Argilla Feedback and Callbacks but I think that by setting up an ArgillaCallbackHandler,
you get some "functional" observability:

graphical monitoring of your LLM activity
some data to score/annotate if you then want to fine-tune an LLM.

TuanaCelik · 2023-07-17T13:13:57Z

@anakin87 would you see that as it's own component, or would you see it as a functionality of a component.
@davidberenstein1957 are you interested in contributing such an integration. Having a package that Haystack users can install and use alongside Haystack to log to Argilla would be amazing

davidberenstein1957 · 2023-07-19T10:58:31Z

I am interested in working on this. I am not sure if a separate package makes sense or we should install it through extras? But let me know what makes sense for your part of the codebase.

TuanaCelik · 2023-07-20T12:08:41Z

Hey @davidberenstein1957 - we're going to do some investigation with @anakin87 on Monday to get a better understanding how/if the source code should change, or whether this would be possible to implement as a lone standing integration. We will comment back here with our findings 👍

davidberenstein1957 · 2023-07-21T11:23:15Z

No problem👍

TuanaCelik · 2023-07-25T10:50:08Z

Hey @davidberenstein1957 we've done some thinking and we see 2 options for you. And it's largely your choice of what type of integration you would like to build:

The Agent in Haystack aldready has callback functionality. Similar to what you did with LangChain, you could make use of this callback and implement an ArgillaCallbackHandler, similar to what Stefano suggested. If you decide to do this, and find out that our callback functionality in Agent isn't enough, we'd be happy to extend it.
Since the rest of our nodes such as PromptNode doesn't have callbacks, you could opt to create a custom node that can go before and after a node of which you would like to record input and output to. You could design this node(s) as something I could add into my pipeline, before and after a node, and it records the data to Argilla.

If you decide to go with option 2, @anakin87 had the idea that it could even be a custom node that can wrap another node like the PromptNode

We're currently desigining Haystack v2, whatever option you prefer to go for, we would be happy to update you if there's any major change that would make your integration better. E.g., if we do end up implementing callbacks for PromptNode etc.

davidberenstein1957 · 2023-08-12T07:58:19Z

Hi @TuanaCelik @anakin87 ,

I think starting off with the callback functionality would be best in terms of effort vs gain. However, I think an ArgillaNode encompassing a lot of additional tasks would be a more robust solution for the future but we work on this, whenever the first integration proves successful.

What do you think?

anakin87 · 2023-08-14T09:31:36Z

Thanks, @davidberenstein1957!
That sounds like a good idea to me...

davidberenstein1957 · 2023-08-14T09:53:52Z

@anakin87 Great! So, from your perspective, should we add the ArgillaCallbackHandler as an integration or as an addition to the main repo? Within LangChain the integration is managed in the main repository but I am not sure what you prefer.

anakin87 · 2023-08-17T13:16:37Z

Sorry! I missed this one...
We would like to have ArgillaCallbackHandler as an integration.

TuanaCelik · 2023-08-17T13:16:42Z

@anakin87 - correct me if I'm wrong, but I think currently this would be a good integration and if @davidberenstein1957 is interested in it, we can involve you in Haystack v2 discussions when we start looking into evaluation and observability. You can see our current status on the Discussions page.

davidberenstein1957 · 2023-08-21T07:02:53Z

@TuanaCelik @anakin87, I noticed that you do not directly allow for customization of the callback_manager.

For me, it would make sense to change the haystack/agents/base.py.

And introduces a base CallbackManager class and allow for passing a custom implement of this class to the Agent.init().

from abc import ABC, abstractmethod

class BaseCallbackManager(ABC):
    @abstractmethod
    def abstract_func(self, *args, **kwargs):
        pass

class CallbackManager(BaseCallbackManager):
    def __init__(self, agent_color: Color = Color.GREEN) -> None:
        self.agent_color = agent_color

    def on_tool_finish(
        self,
        tool_output: str,
        color: Optional[Color] = None,
        observation_prefix: Optional[str] = None,
        llm_prefix: Optional[str] = None,
        **kwargs: Any,
    ) -> None:
        print_text(observation_prefix)  # type: ignore
        print_text(tool_output, color=color)
        print_text(f"\n{llm_prefix}")

    def on_new_token(self, token: str, **kwargs: Any) -> None:
        print_text(token, color=self.agent_color)

    def on_agent_start(self, **kwargs: Any) -> None:
        agent_name = kwargs.pop("name", "react")
        print_text(f"\nAgent {agent_name} started with {kwargs}\n", color=self.agent_color)

    def on_agent_step(self, agent_step, **kwargs: Any) -> None:
        print_text(agent_step.prompt_node_response, end="\n", color=self.agent_color)

I, and others, would then be able to do:

class ArgillaCallbackManager(CallbackManager):
    def __init__(self, *args, **kwargs):
        pass

argilla_callbackmanager = ArgillaCallbackManager()

prompt_node = ...
agent = Agent(
    prompt_node=...,
    callback_manager=argilla_callbackmanager
)

Let me know if I can pick this up in a PR to the core package.

davidberenstein1957 · 2023-08-21T07:18:11Z

I also noticed that callback_manager.on_agent_finish() is being used within Agents but that it is not being defined as a callable. Not sure what the intended behavior would have been for this but it takes agent_step as a param.

anakin87 · 2023-08-21T09:03:19Z

@davidberenstein1957 your observations make sense to me and I would say that introducing a base CallbackManager class is reasonable.
@masci @vblagoje any thoughts on this?

davidberenstein1957 · 2023-08-22T07:05:44Z

@masci @vblagoje

For me, it makes sense to also add a couple of additional callback points and use the same callback manager for both agents and tools from the ToolManager,i.e., also include on_tool_* methods to the CallbackManager. Not sure how fixed you are on the usage of Events but perhaps we can move CallbackManager to make the function usage more specific and maintainable. Also, this would allow any user to fully customize their Callbacks and re-use base logic where needed.

davidberenstein1957 · 2023-08-29T07:29:23Z

@masci @vblagoje reminder

vblagoje · 2023-09-04T09:05:22Z

@davidberenstein1957 apologies for the late response; I never got these notifications you sent. I understand your intentions regarding the custom Argilla callback manager. In striving for simplicity and prevention of a class explosion, we used the Events class from events Python library for agent callbacks. Is there a way you can adapt to this approach for the 1.x branch, and then for 2.x, we can talk more about trade-offs of a more explicit callback mechanism and which one to use? Let us know your thoughts.

davidberenstein1957 · 2023-09-06T13:39:38Z

@vblagoje I think this normally would be a problem but the hardcoded event-call are still limited. At the moment it is not possible to get the final answer from the Events because there is no on_agent_finish(final step) method hardcoded in the Agent class so it remains difficult to add monitoring options similar to the LangChain integration.

In order to do the 1.x integration, we would need to mimic the methods that have been defined here, which I can handle in a PR if you want to.

I propose altering this line https://github.com/deepset-ai/haystack/blob/d048bb53523b11426d3c39f7cdc0aa8d15f0cf67/haystack/agents/base.py#L362

vblagoje · 2023-09-06T21:07:16Z

That seems reasonable @davidberenstein1957 Would you please follow up on this idea @anakin87 ?

davidberenstein1957 · 2023-09-07T07:28:39Z

@anakin87 I would like to handle the PR. Do you think it makes sense to also include other steps outlined by langchain or to just include the on_agent_final_answer?

anakin87 · 2023-09-07T08:14:50Z

@davidberenstein1957 As this part will be rethought and redesigned in Haystack 2.0, I would say open a PR and include the minimum set of events needed for the Argilla implementation.
(thanks for your patience 💟)

davidberenstein1957 · 2023-09-07T09:04:30Z

@anakin87 I outlined the PR. Not sure if it requires doc changes or tests?
https://github.com/deepset-ai/haystack/pull/5736/files

davidberenstein1957 · 2024-01-15T17:01:55Z

@anakin87 @TuanaCelik , do you have any pointers w.r.t. this PR? We would love to log some more info into Argilla but maybe the Agent implementation is still limited?

We hope to align further with components/node and pipeline/chain monitoring functionality in a follow-up PR :)

sdiazlor · 2024-02-21T11:05:06Z

Hi, @anakin87 @TuanaCelik, finally we decided to add the integration for Haystack 1.x with Argilla. But we're excited about the Haystack 2.0 possibilities and we plan a refactoring when agents or callbacks become available (I think that it's planned for the upcoming months). So, we are looking forward to the new updates! :)

bilgeyucel mentioned this issue Jul 14, 2023

Seek out for possible new integrations #29

Closed

davidberenstein1957 mentioned this issue Sep 7, 2023

feat: added on_agent_final_answer-support to Agent callback_manager deepset-ai/haystack#5736

Merged

davidberenstein1957 mentioned this issue Nov 15, 2023

[FEATURE] look into an integration with deepset haystack argilla-io/argilla#4235

Closed

kursathalat mentioned this issue Jan 5, 2024

[FEATURE] Create a custom ArgillaNode to monitor Nodes/Components argilla-io/argilla-haystack#3

Open

kursathalat mentioned this issue Jan 18, 2024

Add Argilla Integration #135

Closed

sdiazlor mentioned this issue Feb 21, 2024

Add Argilla integration #191

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

create `Argilla` integration for `haystack`nodes #25

create `Argilla` integration for `haystack`nodes #25

davidberenstein1957 commented Jul 13, 2023

davidberenstein1957 commented Jul 13, 2023

davidberenstein1957 commented Jul 14, 2023

TuanaCelik commented Jul 17, 2023

anakin87 commented Jul 17, 2023

TuanaCelik commented Jul 17, 2023

davidberenstein1957 commented Jul 19, 2023 •

edited

Loading

TuanaCelik commented Jul 20, 2023

davidberenstein1957 commented Jul 21, 2023

TuanaCelik commented Jul 25, 2023

davidberenstein1957 commented Aug 12, 2023 •

edited

Loading

anakin87 commented Aug 14, 2023

davidberenstein1957 commented Aug 14, 2023 •

edited

Loading

anakin87 commented Aug 17, 2023

TuanaCelik commented Aug 17, 2023

davidberenstein1957 commented Aug 21, 2023 •

edited

Loading

davidberenstein1957 commented Aug 21, 2023

anakin87 commented Aug 21, 2023

davidberenstein1957 commented Aug 22, 2023 •

edited

Loading

davidberenstein1957 commented Aug 29, 2023

vblagoje commented Sep 4, 2023

davidberenstein1957 commented Sep 6, 2023 •

edited

Loading

vblagoje commented Sep 6, 2023

davidberenstein1957 commented Sep 7, 2023

anakin87 commented Sep 7, 2023

davidberenstein1957 commented Sep 7, 2023

davidberenstein1957 commented Jan 15, 2024

sdiazlor commented Feb 21, 2024

create Argilla integration for haystacknodes #25

create Argilla integration for haystacknodes #25

Comments

davidberenstein1957 commented Jul 13, 2023

davidberenstein1957 commented Jul 13, 2023

davidberenstein1957 commented Jul 14, 2023

TuanaCelik commented Jul 17, 2023

anakin87 commented Jul 17, 2023

TuanaCelik commented Jul 17, 2023

davidberenstein1957 commented Jul 19, 2023 • edited Loading

TuanaCelik commented Jul 20, 2023

davidberenstein1957 commented Jul 21, 2023

TuanaCelik commented Jul 25, 2023

davidberenstein1957 commented Aug 12, 2023 • edited Loading

anakin87 commented Aug 14, 2023

davidberenstein1957 commented Aug 14, 2023 • edited Loading

anakin87 commented Aug 17, 2023

TuanaCelik commented Aug 17, 2023

davidberenstein1957 commented Aug 21, 2023 • edited Loading

davidberenstein1957 commented Aug 21, 2023

anakin87 commented Aug 21, 2023

davidberenstein1957 commented Aug 22, 2023 • edited Loading

davidberenstein1957 commented Aug 29, 2023

vblagoje commented Sep 4, 2023

davidberenstein1957 commented Sep 6, 2023 • edited Loading

vblagoje commented Sep 6, 2023

davidberenstein1957 commented Sep 7, 2023

anakin87 commented Sep 7, 2023

davidberenstein1957 commented Sep 7, 2023

davidberenstein1957 commented Jan 15, 2024

sdiazlor commented Feb 21, 2024

create `Argilla` integration for `haystack`nodes #25

create `Argilla` integration for `haystack`nodes #25

davidberenstein1957 commented Jul 19, 2023 •

edited

Loading

davidberenstein1957 commented Aug 12, 2023 •

edited

Loading

davidberenstein1957 commented Aug 14, 2023 •

edited

Loading

davidberenstein1957 commented Aug 21, 2023 •

edited

Loading

davidberenstein1957 commented Aug 22, 2023 •

edited

Loading

davidberenstein1957 commented Sep 6, 2023 •

edited

Loading