Merge pull request #136 from XeonHis/develop/v0.2.1
Develop/v0.2.1: Update video understanding example with DnC operator
Showing 19 changed files with 155 additions and 68 deletions.
@@ -150,6 +150,7 @@ tests
data/
tests/mathvista
running_logs/
video_cache/
*.db

# vscode
File renamed without changes.
@@ -0,0 +1,75 @@
from pathlib import Path
from typing import List

from omagent_core.models.llms.base import BaseLLMBackend
from omagent_core.engine.worker.base import BaseWorker
from omagent_core.models.llms.prompt import PromptTemplate
from omagent_core.memories.ltms.ltm import LTM
from omagent_core.utils.registry import registry
from pydantic import Field
from omagent_core.advanced_components.workflow.dnc.schemas.dnc_structure import TaskTree
from omagent_core.utils.logger import logging
from openai import Stream


CURRENT_PATH = root_path = Path(__file__).parents[0]


@registry.register_worker()
class Conclude(BaseLLMBackend, BaseWorker):
    prompts: List[PromptTemplate] = Field(
        default=[
            PromptTemplate.from_file(
                CURRENT_PATH.joinpath("sys_prompt.prompt"), role="system"
            ),
            PromptTemplate.from_file(
                CURRENT_PATH.joinpath("user_prompt.prompt"), role="user"
            ),
        ]
    )

    def _run(self, dnc_structure: dict, last_output: str, *args, **kwargs):
        """A conclude node that summarizes and completes the root task.

        This component acts as the final node that:
        - Takes the root task and its execution results
        - Generates a final conclusion/summary of the entire task execution
        - Formats and presents the final output in a clear way
        - Cleans up any temporary state/memory used during execution

        The conclude node is responsible for providing a coherent final response that
        addresses the original root task objective based on all the work done by
        previous nodes.

        Args:
            dnc_structure (dict): The task tree containing the root task and results
            last_output (str): The final output from previous task execution
            *args: Additional arguments
            **kwargs: Additional keyword arguments

        Returns:
            dict: Final response containing the conclusion/summary
        """
        task = TaskTree(**dnc_structure)
        self.callback.info(agent_id=self.workflow_instance_id, progress='Conclude', message=f'{task.get_current_node().task}')
        chat_complete_res = self.simple_infer(
            task=task.get_root().task,
            result=str(last_output),
            img_placeholders="".join(list(self.stm(self.workflow_instance_id).get('image_cache', {}).keys())),
        )
        if isinstance(chat_complete_res, Stream):
            # Streaming response: forward each delta to the client as it arrives.
            last_output = 'Answer: '
            self.callback.send_incomplete(agent_id=self.workflow_instance_id, msg='Answer: ')
            for chunk in chat_complete_res:
                if chunk.choices[0].delta.content is not None:
                    self.callback.send_incomplete(agent_id=self.workflow_instance_id, msg=f'{chunk.choices[0].delta.content}')
                    last_output += chunk.choices[0].delta.content
                else:
                    # A delta without content closes the streamed answer.
                    self.callback.send_block(agent_id=self.workflow_instance_id, msg='')
                    last_output += ''
                    break
        else:
            # Non-streaming response: the full message is available at once.
            last_output = chat_complete_res["choices"][0]["message"]["content"]
            self.callback.send_answer(agent_id=self.workflow_instance_id, msg=f'Answer: {chat_complete_res["choices"][0]["message"]["content"]}')
        self.stm(self.workflow_instance_id).clear()
        return {'last_output': last_output}
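The streaming branch of `Conclude._run` can be exercised without a live model. The following is a minimal sketch, not the worker itself: `Delta`, `Choice`, `Chunk`, and `collect_stream` are hypothetical stand-ins that only mimic the `chunk.choices[0].delta.content` shape used above, and the callback calls are left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


# Hypothetical stand-ins for the streamed chunk objects; only the
# attribute path chunk.choices[0].delta.content is reproduced.
@dataclass
class Delta:
    content: Optional[str]


@dataclass
class Choice:
    delta: Delta


@dataclass
class Chunk:
    choices: List[Choice]


def collect_stream(chunks: Iterable[Chunk], prefix: str = "Answer: ") -> str:
    """Concatenate streamed deltas, stopping at the first delta without content."""
    parts = [prefix]
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        if content is None:
            break  # mirrors the worker: a None delta ends the answer
        parts.append(content)
    return "".join(parts)


fake_stream = [
    Chunk([Choice(Delta(text))]) for text in ("The video ", "shows a cat.", None)
]
final_answer = collect_stream(fake_stream)
print(final_answer)
```

Collecting the parts in a list and joining once avoids repeated string concatenation; the worker above uses `+=`, which behaves the same for short answers.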
13 changes: 13 additions & 0 deletions
examples/video_understanding/agent/conclude/sys_prompt.prompt
@@ -0,0 +1,13 @@
As the final stage of our task processing workflow, your role is to inform the user about the final execution result of the task.
Your task includes three parts:
1. Verify the result and ensure it is a valid answer to the user's question or task.
2. Images may be visually prompted with bounding boxes and labels; treat these annotations as important information.
3. Generate the output message: you may receive raw data, so extract the useful information and write a detailed message.

The task may complete successfully or it may fail for some reason. You just need to report the situation honestly.

*** Important Notice ***
1. Please use the language used in the question when responding.
2. Your response MUST be based on the results provided to you. Do not attempt to solve the problem on your own or try to correct any errors.
3. Do not mention your source of information. Present the response as if it were your own.
4. Handle the conversions between different units carefully.
7 changes: 7 additions & 0 deletions
examples/video_understanding/agent/conclude/user_prompt.prompt
@@ -0,0 +1,7 @@
Now, it's your turn to complete the task.

Task (The task you need to complete.): {{task}}
result (The result from former agents.): {{result}}
images: {{img_placeholders}}

Now show your super capability as a super agent that goes beyond regular AIs or LLMs!
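The `{{task}}`, `{{result}}`, and `{{img_placeholders}}` slots match the keyword arguments that `Conclude._run` passes to `simple_infer`. As a rough illustration of how such slots get filled, here is a sketch using a plain regex substitution; `render` is a hypothetical helper, not the `PromptTemplate` implementation, whose actual template engine is not shown in this commit.

```python
import re

# The three placeholder lines from user_prompt.prompt.
USER_PROMPT = (
    "Task (The task you need to complete.): {{task}}\n"
    "result (The result from former agents.): {{result}}\n"
    "images: {{img_placeholders}}"
)


def render(template: str, **values: str) -> str:
    """Replace every {{name}} slot with the matching keyword argument."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for placeholder {name!r}")
        return str(values[name])

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


rendered = render(
    USER_PROMPT,
    task="Summarize the video",
    result="A cat jumps onto a table.",
    img_placeholders="",  # empty when no images are cached
)
print(rendered)
```

Raising on a missing value, rather than leaving the slot in place, makes a mismatch between the prompt file and the worker's arguments visible immediately.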
Empty file.
7 changes: 7 additions & 0 deletions
examples/video_understanding/configs/llms/text_res_stream.yml
@@ -0,0 +1,7 @@
name: OpenaiGPTLLM
model_id: gpt-4o-mini
api_key: ${env| custom_openai_key, openai_api_key}
endpoint: ${env| custom_openai_endpoint, https://api.openai.com/v1}
temperature: 0
stream: true
response_format: text
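The `${env| ... }` values above look like environment-variable placeholders. The sketch below shows one way such a placeholder could be resolved, under the assumption (not confirmed by this commit) that the form is `${env| NAME, DEFAULT}`: read the variable `NAME` and fall back to `DEFAULT` when it is unset. `resolve_env_placeholder` is a hypothetical helper, not the OmAgent loader.

```python
import os
import re
from typing import Mapping, Optional

# Assumed placeholder form: ${env| NAME, DEFAULT} with an optional default.
ENV_PLACEHOLDER = re.compile(r"^\$\{env\|\s*([^,}]+?)\s*(?:,\s*([^}]*?)\s*)?\}$")


def resolve_env_placeholder(
    value: str, environ: Optional[Mapping[str, str]] = None
) -> str:
    """Return the environment value for a placeholder, or its default."""
    environ = os.environ if environ is None else environ
    match = ENV_PLACEHOLDER.match(value)
    if match is None:
        return value  # not a placeholder: keep the literal value
    name, default = match.group(1), match.group(2)
    if name in environ:
        return environ[name]
    if default is None:
        raise KeyError(f"environment variable {name!r} is not set")
    return default


placeholder = "${env| custom_openai_endpoint, https://api.openai.com/v1}"

# Unset variable: the default after the comma is used.
default_endpoint = resolve_env_placeholder(placeholder, environ={})

# Set variable: the environment value wins (the URL here is illustrative).
custom_endpoint = resolve_env_placeholder(
    placeholder, environ={"custom_openai_endpoint": "http://localhost:8000/v1"}
)
print(default_endpoint, custom_endpoint)
```

Passing `environ` explicitly keeps the example deterministic; with the argument omitted, the helper reads `os.environ`.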
@@ -1,4 +1,4 @@
 name: Conclude
-llm: ${sub|text_res}
+llm: ${sub|text_res_stream}
 output_parser:
   name: StrParser
18 changes: 18 additions & 0 deletions
examples/video_understanding/configs/workers/dnc_workflow.yml
@@ -0,0 +1,18 @@
- name: ConstructDncPayload
- name: StructureUpdate
- name: TaskConqueror
  llm: ${sub|json_res}
  tool_manager: ${sub|all_tools}
  output_parser:
    name: StrParser
- name: TaskDivider
  llm: ${sub|json_res}
  tool_manager: ${sub|all_tools}
  output_parser:
    name: StrParser
- name: TaskRescue
  llm: ${sub|text_res}
  tool_manager: ${sub|all_tools}
  output_parser:
    name: StrParser
- name: TaskExitMonitor
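The `${sub|...}` values in this worker list appear to reference other named configs (`json_res`, `text_res`, `all_tools`). The sketch below illustrates that idea under the assumption that a `${sub|name}` string is replaced by the config registered under `name`; `resolve_subs` is a hypothetical helper, and the `sub_configs` contents are placeholders, since the referenced files are not part of this diff.

```python
import re
from typing import Any, Dict

# Assumed placeholder form: ${sub|config_name}
SUB_PLACEHOLDER = re.compile(r"^\$\{sub\|\s*(\w+)\s*\}$")


def resolve_subs(node: Any, sub_configs: Dict[str, Any]) -> Any:
    """Recursively replace ${sub|name} strings with the named sub-config."""
    if isinstance(node, dict):
        return {key: resolve_subs(value, sub_configs) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_subs(item, sub_configs) for item in node]
    if isinstance(node, str):
        match = SUB_PLACEHOLDER.match(node)
        if match:
            # A sub-config may itself contain placeholders, so resolve it too.
            return resolve_subs(sub_configs[match.group(1)], sub_configs)
    return node


# Placeholder contents: the real json_res / all_tools files are not shown here.
sub_configs = {
    "json_res": {"placeholder": "contents of the json_res LLM config"},
    "all_tools": {"placeholder": "contents of the all_tools config"},
}

# Two entries from dnc_workflow.yml, written as already-parsed Python data.
workers = [
    {"name": "ConstructDncPayload"},
    {
        "name": "TaskConqueror",
        "llm": "${sub|json_res}",
        "tool_manager": "${sub|all_tools}",
        "output_parser": {"name": "StrParser"},
    },
]

resolved_workers = resolve_subs(workers, sub_configs)
print(resolved_workers[1]["llm"])
```

Grouping the DnC workers into one list like this lets a single file replace the per-worker configs that this commit deletes.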
5 changes: 0 additions & 5 deletions
examples/video_understanding/configs/workers/task_conqueror.yml
This file was deleted.
5 changes: 0 additions & 5 deletions
examples/video_understanding/configs/workers/task_divider.yml
This file was deleted.
1 change: 0 additions & 1 deletion
examples/video_understanding/configs/workers/task_exit_monitor.yml
This file was deleted.
This file was deleted.
4 changes: 2 additions & 2 deletions
examples/video_understanding/docs/images/video_understanding_workflow_diagram.png