ChatOllama in "astream_events" is switching from "on_chat_model_stream" to "on_llm_stream" #29383

Open
weissenbacherpwc opened this issue Jan 23, 2025 · 0 comments
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature


Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

This is my astream_events code:

import asyncio
import json

from langchain_core.messages import HumanMessage

# embeddings, reranking_model, llm, CONNECTION_STRING, langfuse_handler, chain_states,
# model_type_for_astream_event, load_retriever, create_workflow_app and
# serialize_documents are defined elsewhere in my project.

# The async chain loaded with an LLM from the Groq API
async def langgraph_agentic_pipeline(question: str, collection_name: str, streamlit_username: str, chat_with_doc_use_case: bool = False):
    """
    This function represents the Langgraph RAG pipeline with Chat Model, in this case from the Groq-API.
    First the retriever is initialized.
    Afterwards the response is streamed async, based on the question.
    With Chat Models, events are called: "on_chat_model_end" or "on_chat_model_stream".
    
    params:
    question (str): The question of a user
    collection_name (str): The collection name of the vectordatabase of the confluence export
    chat_with_doc_use_case (bool): If set to True, the retriever will be loaded for the Chat-with-your-document Use Case with lower thresholds than defined in the config.
    """    
    if not chat_with_doc_use_case:
        retriever = load_retriever(
            collection_name=collection_name,
            embeddings=embeddings,
            reranking_model=reranking_model,
            CONNECTION_STRING=CONNECTION_STRING
        )   
        chain = create_workflow_app(retriever=retriever, model=llm)
    else:        
        print("Loading Retriever for 'Chat-with-your-document' use case'")
        retriever = load_retriever(
            collection_name=collection_name,
            embeddings=embeddings,
            reranking_model=reranking_model,
            reranking_top_k=10,
            threshold_after_colbert_reranking=8,
            threshold_after_cross_encoder_reranking=0.3,
            CONNECTION_STRING=CONNECTION_STRING
        )   
        chain = create_workflow_app(retriever=retriever, model=llm, langgraph_mode="basic_agentic_workflow")
    thread_id = streamlit_username # E.g. if thread_id changes from id_1 to id_2, the message history will be empty
    input_message = HumanMessage(content=question)
    config = {
            "configurable": {"thread_id": thread_id}, #for every user, a different thread_id should be selected
            'callbacks': [langfuse_handler] #if you are not using Langfuse, use 'ConsoleCallbackHandler' for 'callbacks' 
        }
    if chain_states.get(thread_id) and bool(chain_states[thread_id]):
        chain.update_state(
            config,
            chain_states[thread_id]
        )
    try:
        async for event in chain.astream_events(
            {"messages":  [input_message]},
            version="v1",
            config=config
        ):
            print(event)
            if event["event"] == f"on_{model_type_for_astream_event}_start" and event.get("metadata", {}).get("langgraph_node") == "generate":
                print("Stream started...")
                if model_type_for_astream_event == "llm":
                    prompt_length = len(event["data"]["input"]["prompts"][0])
                else:
                    prompt_length = len(event["data"]["input"]["messages"][0][0].content)
                yield f'data: {json.dumps({"type": "prompt_length_characters", "content": prompt_length})}\n\n'
                yield f'data: {json.dumps({"type": "prompt_length_tokens", "content": prompt_length / 4})}\n\n'
                
            if event["event"] == f"on_{model_type_for_astream_event}_stream" and event.get("metadata", {}).get("langgraph_node") == "generate":
                if model_type_for_astream_event == "llm":
                    chunks = event["data"]['chunk']
                else:
                    chunks = event["data"]['chunk'].content
                # Serialize the chunk to a JSON-safe format
                yield f'data: {json.dumps({"type": "chunk", "content": chunks})}\n\n'
                await asyncio.sleep(0.1) 
                
            elif event["event"] == "on_chain_end" and event.get("metadata", {}).get("langgraph_node") == "format_docs" and event["name"] == "format_docs":
                retrieved_docs = event["data"]["input"]["raw_docs"]
                serialized_docs = serialize_documents(retrieved_docs)
                yield f'data: {{"type": "docs", "content": {serialized_docs}}}\n\n'
    finally:
        states = chain.get_state(config).values
        #print(states)
        chain_states[thread_id] = states
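
To isolate the behaviour without the SSE formatting, a trimmed helper like the following (a sketch only, reusing the chain and config objects built above) can record which model stream event names show up per run:

# Sketch: collect the model stream event names emitted for one question.
async def dump_stream_event_names(chain, config, question):
    names = set()
    async for event in chain.astream_events(
        {"messages": [HumanMessage(content=question)]},
        version="v1",
        config=config,
    ):
        if event["event"] in ("on_chat_model_stream", "on_llm_stream"):
            names.add(event["event"])
    return names

In my runs this contains 'on_chat_model_stream' for the first question of a session and 'on_llm_stream' when the same question is asked again.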

Error Message and Stack Trace (if applicable)

No response

Description

Hi,

I am using ChatOllama and astream_events to stream my responses, and I see weird behaviour with the astream_events function. When I call my LLM for the first time in a session, streaming works as expected and the event type is 'on_chat_model_stream'. However, when I run the same question again, the event type switches to 'on_llm_stream'. Because of this, my code no longer matches the new event and streaming stops working.

I see there is a metadata field "ls_model_type": "chat" which might be another way to handle this; a workaround sketch based on accepting both event names follows the two event dumps below. Still, I wanted to report this weird behaviour.

First question:
{'event': 'on_chat_model_stream', 'name': 'ChatOllama', 'run_id': '80026f5b-cc9b-443a-a9f2-3f14188d25a6', 'tags': ['seq:step:1'], 'metadata': {'thread_id': 'maxiw', 'langgraph_step': 3, 'langgraph_node': 'generate', 'langgraph_triggers': ['format_docs'], 'langgraph_task_idx': 0, 'thread_ts': '1efd9a0b-1fca-6892-8002-4d1eb12a4ad1', 'ls_provider': 'ollama', 'ls_model_name': 'maxiweissenbacher/sauerkrautlm-nistral-nemo-12b-instruct:fp16', 'ls_model_type': 'chat', 'ls_temperature': 0.1}, 'data': {'chunk': AIMessageChunk(content='-Le', id='run-80026f5b-cc9b-443a-a9f2-3f14188d25a6')}, 'parent_ids': []}

Second question:
{'event': 'on_llm_stream', 'name': 'ChatOllama', 'run_id': 'f16ce854-ba43-4225-8418-fbe5c1a3a6da', 'tags': ['seq:step:1'], 'metadata': {'thread_id': 'maxiw', 'langgraph_step': 5, 'langgraph_node': 'generate', 'langgraph_triggers': ['no_docs_found_question'], 'langgraph_task_idx': 0, 'thread_ts': '1efd9a10-83d9-69dc-8004-84ec81d877c6', 'ls_provider': 'ollama', 'ls_model_name': 'maxiweissenbacher/sauerkrautlm-nistral-nemo-12b-instruct:fp16', 'ls_model_type': 'chat', 'ls_temperature': 0.1}, 'data': {'chunk': AIMessageChunk(content=' Antwort', id='run-f16ce854-ba43-4225-8418-fbe5c1a3a6da')}, 'parent_ids': []}
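
One possible workaround is to accept both event names; a minimal sketch (the helper name is mine and is not part of the pipeline above):

# Workaround sketch: treat an event as a model stream chunk if either event name matches.
def extract_stream_text(event):
    if event["event"] not in ("on_chat_model_stream", "on_llm_stream"):
        return None
    chunk = event["data"]["chunk"]
    # The chunk may be an AIMessageChunk (with .content) or already a plain string,
    # mirroring the llm/chat_model branches in the pipeline above.
    return chunk.content if hasattr(chunk, "content") else chunk

In the astream_events loop above, this would replace the two model_type_for_astream_event branches; the langgraph_node == "generate" check stays the same.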

System Info

ollama 0.4.7
langchain 0.2.7
langchain-community 0.2.7
langchain-core 0.2.43
langchain-experimental 0.0.63
langchain-groq 0.1.5
langchain-huggingface 0.0.3
langchain-ollama 0.1.1
langchain-openai 0.1.20
langchain-postgres 0.0.12

@dosubot dosubot bot added the 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature label Jan 23, 2025