support internet meme #128

Open · wants to merge 1 commit into base: develop/v0.2.1
45 changes: 45 additions & 0 deletions examples/internet_meme/agent/input_interface/input_interface.py
@@ -0,0 +1,45 @@
from pathlib import Path

from omagent_core.utils.registry import registry
from omagent_core.utils.general import read_image
from omagent_core.engine.worker.base import BaseWorker
from omagent_core.utils.logger import logging

CURRENT_PATH = Path(__file__).parents[0]


@registry.register_worker()
class InputInterface(BaseWorker):
"""Input interface processor that handles user instructions and image input.

This processor:
1. Reads user input containing question and image via input interface
2. Extracts text instruction and image path from the input
3. Loads and caches the image in workflow storage
4. Returns the user instruction for next steps
"""

def _run(self, *args, **kwargs):
# Read user input through configured input interface
user_input = self.input.read_input(workflow_instance_id=self.workflow_instance_id, input_prompt='Please tell me a question and a image.')

image_path = None
# Extract text and image content from input message
content = user_input['messages'][-1]['content']
for content_item in content:
if content_item['type'] == 'text':
user_instruction = content_item['data']
elif content_item['type'] == 'image_url':
image_path = content_item['data']

Comment on lines +28 to +34
⚠️ Potential issue

Add input validation and error handling

The content extraction lacks proper validation and error handling:

  1. No validation of user_input structure
  2. Potential KeyError if 'messages' or 'content' is missing
  3. No handling for missing text/image_url types

Consider implementing robust validation:

-        content = user_input['messages'][-1]['content']
-        for content_item in content:
-            if content_item['type'] == 'text':
-                user_instruction = content_item['data']
-            elif content_item['type'] == 'image_url':
-                image_path = content_item['data']
+        user_instruction = None
+        try:
+            content = user_input.get('messages', [])
+            if not content:
+                raise ValueError("No messages found in user input")
+            
+            last_message = content[-1].get('content', [])
+            if not last_message:
+                raise ValueError("No content found in last message")
+            
+            for content_item in last_message:
+                if not isinstance(content_item, dict) or 'type' not in content_item or 'data' not in content_item:
+                    continue
+                
+                if content_item['type'] == 'text':
+                    user_instruction = content_item['data']
+                elif content_item['type'] == 'image_url':
+                    image_path = content_item['data']
+            
+            if user_instruction is None:
+                raise ValueError("No text instruction found in input")
+        except Exception as e:
+            logging.error(f"Error processing user input: {e}")
+            raise

logging.info(f'user_instruction: {user_instruction}\nImage_path: {image_path}')
self.stm(self.workflow_instance_id)['user_instruction'] = user_instruction
# Load image from file system
if image_path:
img = read_image(input_source=image_path)

# Store image in workflow shared memory with standard key
image_cache = {'<image_0>' : img}
self.stm(self.workflow_instance_id)['image_cache'] = image_cache
Comment on lines +38 to +43
🛠️ Refactor suggestion

Enhance image handling and caching

The image loading and caching logic needs improvement:

  1. No validation of image file existence
  2. No validation of image format
  3. No size limits for image cache

Consider implementing these safeguards:

         if image_path:
+            if not Path(image_path).exists():
+                raise FileNotFoundError(f"Image file not found: {image_path}")
+            
             img = read_image(input_source=image_path)
+            
+            # Validate image size and format
+            if img.size > MAX_IMAGE_SIZE:  # Define MAX_IMAGE_SIZE constant
+                raise ValueError(f"Image size exceeds limit of {MAX_IMAGE_SIZE} bytes")
             
             # Store image in workflow shared memory with standard key
             image_cache = {'<image_0>' : img}
+            # Implement cache size management
+            self._manage_cache_size(image_cache)
             self.stm(self.workflow_instance_id)['image_cache'] = image_cache

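For reference, MAX_IMAGE_SIZE and _manage_cache_size above are not defined anywhere in this PR; the names are placeholders. A minimal sketch of what such helpers could look like, assuming read_image returns a PIL Image (so the limit is checked in pixels, since PIL's .size is a (width, height) tuple rather than a byte count) and the cache is an insertion-ordered dict:

from pathlib import Path
from PIL import Image

MAX_IMAGE_PIXELS = 4096 * 4096  # hypothetical limit; tune to deployment constraints
MAX_CACHE_ENTRIES = 8           # hypothetical cap on cached images per workflow

def validate_image(image_path: str) -> Image.Image:
    """Load an image only after checking existence and pixel-count limits."""
    path = Path(image_path)
    if not path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    img = Image.open(path)
    width, height = img.size  # PIL's .size is (width, height), not bytes
    if width * height > MAX_IMAGE_PIXELS:
        raise ValueError(f"Image exceeds {MAX_IMAGE_PIXELS} pixels")
    return img

def manage_cache_size(cache: dict) -> None:
    """Evict the oldest entries until the cache is within the configured cap."""
    while len(cache) > MAX_CACHE_ENTRIES:
        cache.pop(next(iter(cache)))  # dicts preserve insertion order in Python 3.7+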

return {'user_instruction': user_instruction}
59 changes: 59 additions & 0 deletions examples/internet_meme/agent/meme_explain/meme_explain.py
@@ -0,0 +1,59 @@
from pathlib import Path
from typing import List

from omagent_core.models.llms.base import BaseLLMBackend
from omagent_core.engine.worker.base import BaseWorker
from omagent_core.utils.registry import registry
from omagent_core.models.llms.prompt.prompt import PromptTemplate
from omagent_core.models.llms.openai_gpt import OpenaiGPTLLM

from pydantic import Field


CURRENT_PATH = Path(__file__).parents[0]


@registry.register_worker()
class MemeExplain(BaseWorker, BaseLLMBackend):
llm: OpenaiGPTLLM

prompts: List[PromptTemplate] = Field(
default=[
PromptTemplate.from_file(
CURRENT_PATH.joinpath("sys_prompt.prompt"), role="system"
),
PromptTemplate.from_file(
CURRENT_PATH.joinpath("user_prompt.prompt"), role="user"
),
]
)
Comment on lines +16 to +29
⚠️ Potential issue

Add validation for prompt file existence

The code loads prompt files without checking if they exist first. This could lead to runtime errors if the files are missing.

Consider adding validation:

     prompts: List[PromptTemplate] = Field(
         default=[
             PromptTemplate.from_file(
-                CURRENT_PATH.joinpath("sys_prompt.prompt"), role="system"
+                sys_prompt_path := CURRENT_PATH.joinpath("sys_prompt.prompt"),
+                role="system"
+            ) if sys_prompt_path.exists() else None,
             PromptTemplate.from_file(
-                CURRENT_PATH.joinpath("user_prompt.prompt"), role="user"
+                user_prompt_path := CURRENT_PATH.joinpath("user_prompt.prompt"),
+                role="user"
+            ) if user_prompt_path.exists() else None,
         ]
     )

Also, consider adding input validation for the prompt content to prevent potential security issues.
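
Note that the walrus form above would not run as written: in "X if cond else None", the condition is evaluated before X, so sys_prompt_path would still be unbound when .exists() is called. A pattern that checks existence first (a sketch; it assumes nothing about PromptTemplate beyond the from_file signature already used in this file):

def load_prompt(path: Path, role: str) -> PromptTemplate:
    """Fail fast with a clear message when a prompt file is missing."""
    if not path.exists():
        raise FileNotFoundError(f"Prompt file not found: {path}")
    return PromptTemplate.from_file(path, role=role)

prompts: List[PromptTemplate] = Field(
    default=[
        load_prompt(CURRENT_PATH.joinpath("sys_prompt.prompt"), role="system"),
        load_prompt(CURRENT_PATH.joinpath("user_prompt.prompt"), role="user"),
    ]
)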


def _run(self, *args, **kwargs):
"""Process user input and generate outfit recommendations.

Retrieves user instruction and weather information from workflow context,
generates outfit recommendations using the LLM model, and returns the
recommendations while also sending them via callback.

Args:
*args: Variable length argument list
**kwargs: Arbitrary keyword arguments

Returns:
str: Generated outfit recommendations
"""
Comment on lines +31 to +44
⚠️ Potential issue

Update docstring to reflect meme explanation functionality

The current docstring refers to "outfit recommendations" which appears to be copied from another module. This should be updated to reflect the actual meme explanation functionality.

Consider updating the docstring:

-    """Process user input and generate outfit recommendations.
+    """Process user input and generate meme explanations.
     
-    Retrieves user instruction and weather information from workflow context,
-    generates outfit recommendations using the LLM model, and returns the 
-    recommendations while also sending them via callback.
+    Retrieves user instruction and search information from workflow context,
+    generates meme explanations using the LLM model, and returns the 
+    explanation while also sending it via callback.
     
     Args:
         *args: Variable length argument list
         **kwargs: Arbitrary keyword arguments
         
     Returns:
-        str: Generated outfit recommendations
+        str: Generated meme explanation
     """

# Retrieve user instruction and optional weather info from workflow context
user_instruct = self.stm(self.workflow_instance_id).get("user_instruction")
search_info = self.stm(self.workflow_instance_id)["search_info"] if "search_info" in self.stm(self.workflow_instance_id) else None
# Generate outfit recommendations using LLM with weather and user input
chat_complete_res = self.simple_infer(info=str(search_info), name=user_instruct)

# Extract recommendations from LLM response
outfit_recommendation = chat_complete_res["choices"][0]["message"]["content"]

# Send recommendations via callback and return
self.callback.send_answer(agent_id=self.workflow_instance_id, msg=outfit_recommendation)

self.stm(self.workflow_instance_id).clear()
return outfit_recommendation

Comment on lines +45 to +59
🛠️ Refactor suggestion

Add error handling and improve code robustness

Several improvements can be made to make the code more robust:

  1. The search_info retrieval can be simplified (as suggested by static analysis)
  2. Missing error handling for LLM failures
  3. No validation of user_instruct
  4. Missing type hints

Consider these improvements:

+    def _run(self, *args, **kwargs) -> str:
         # Retrieve user instruction and optional weather info from workflow context
-        user_instruct = self.stm(self.workflow_instance_id).get("user_instruction")
+        user_instruct = self.stm(self.workflow_instance_id).get("user_instruction")
+        if not user_instruct:
+            raise ValueError("User instruction is required")
+
-        search_info = self.stm(self.workflow_instance_id)["search_info"] if "search_info" in self.stm(self.workflow_instance_id) else None
+        search_info = self.stm(self.workflow_instance_id).get("search_info")
 
         # Generate outfit recommendations using LLM with weather and user input
-        chat_complete_res = self.simple_infer(info=str(search_info), name=user_instruct)
+        try:
+            chat_complete_res = self.simple_infer(info=str(search_info), name=user_instruct)
+        except Exception as e:
+            self.stm(self.workflow_instance_id).clear()
+            raise RuntimeError(f"Failed to generate meme explanation: {str(e)}")
 
         # Extract recommendations from LLM response
-        outfit_recommendation = chat_complete_res["choices"][0]["message"]["content"]
+        try:
+            meme_explanation = chat_complete_res["choices"][0]["message"]["content"]
+        except (KeyError, IndexError) as e:
+            self.stm(self.workflow_instance_id).clear()
+            raise RuntimeError(f"Invalid response format from LLM: {str(e)}")
         
         # Send recommendations via callback and return
-        self.callback.send_answer(agent_id=self.workflow_instance_id, msg=outfit_recommendation)
+        self.callback.send_answer(agent_id=self.workflow_instance_id, msg=meme_explanation)
         
         self.stm(self.workflow_instance_id).clear()
-        return outfit_recommendation
+        return meme_explanation
🧰 Tools
🪛 Ruff (0.8.2)

47-47: SIM401: Use self.stm(self.workflow_instance_id).get("search_info", None) instead of an if block.

21 changes: 21 additions & 0 deletions examples/internet_meme/agent/meme_explain/sys_prompt.prompt
@@ -0,0 +1,21 @@
你是一个互联网网络梗百科专家。我会提供一些在网络上搜索到的关于某个梗的解释以及一些相关的使用例子,你的任务是根据网络的信息生成这个网络梗的百科页面。需要包含的信息为:

1. 网络梗的介绍,解释出处
2. 关于这个梗的3个使用案例,包括来源和使用例子的内容。如果搜到的信息没有例子,则创造三个例子,这种情况不需要输出来源。
Comment on lines +1 to +4
🛠️ Refactor suggestion

Consider adding guidelines for handling sensitive content and incomplete information

The prompt should include:

  1. Guidelines for handling sensitive, inappropriate, or offensive meme content
  2. Instructions for cases where information is incomplete or ambiguous
  3. Criteria for verifying the reliability of sources

Would you like me to propose additional prompt text addressing these concerns?
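
A possible sketch of such additions, extending the numbered list in sys_prompt.prompt (the wording below is illustrative, not finalized copy):

3. If the meme involves sensitive, offensive, or adult content, describe it neutrally, avoid repeating offensive usage verbatim, and add a brief content note.
4. If the searched information is incomplete or contradictory, state explicitly which parts are uncertain rather than presenting guesses as fact.
5. When attributing the meme's origin, prefer widely cited sources and note when the origin cannot be verified.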


输出使用如下格式:
### XXX定义
XXXX

### 使用案例
1. **例子一**:
- **来源**:XXX
- **使用例子**:XXX

2. **例子一**:
- **来源**:XXX
- **使用例子**:XXX

3. **例子一**:
- **来源**:XXX
- **使用例子**:XXX
Comment on lines +15 to +21
⚠️ Potential issue

Fix example numbering in the prompt template

The example numbering shows "例子一" (Example One) three times. This should be incremented for each example.

Apply this diff to fix the numbering:

-2. **例子一**:
+2. **例子二**:
    - **来源**:XXX
    - **使用例子**:XXX

-3. **例子一**:
+3. **例子三**:
    - **来源**:XXX
    - **使用例子**:XXX

6 changes: 6 additions & 0 deletions examples/internet_meme/agent/meme_explain/user_prompt.prompt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Now, it's your turn to complete the task.
Give anwer using the language according to the user's answer.
Comment on lines +1 to +2
⚠️ Potential issue

Fix grammar and spelling issues in the English instructions.

The instructions contain spelling and grammar errors that should be corrected.

Apply this diff to fix the issues:

 Now, it's your turn to complete the task.
-Give anwer using the language according to the user's answer.
+Give an answer using language that matches the user's response.


Input Information:
- 搜到的信息: {{info}}
- 网络梗的名称: {{name}}
37 changes: 37 additions & 0 deletions examples/internet_meme/agent/meme_searcher/meme_seacher.py
@@ -0,0 +1,37 @@
from pathlib import Path
from omagent_core.engine.worker.base import BaseWorker
from omagent_core.utils.registry import registry
from omagent_core.tool_system.manager import ToolManager
from omagent_core.utils.logger import logging

CURRENT_PATH = root_path = Path(__file__).parents[0]


@registry.register_worker()
class MemeSearcher(BaseWorker):

tool_manager : ToolManager

def _run(self, user_instruction:str, *args, **kwargs):
# Construct search query with instructions for datetime and location extraction
# search_query = "Please consider the user instruction and generate a search query for the internet memo search tool to search for the explaination according to user requirements. You MUST choose the web search tool in the tool_call to excute. When generating the search query, please include how this memo comes from how to use this memo. User Instruction: {}".format(user_instruction)

# search_query = "Please consider the user instruction and generate a search query for the internet memo search. You MUST choose the web search tool in the tool_call to excute. User Instruction: 搜索{}, 并提供相关的3个例子,需要获得三个query results".format(user_instruction)

search_query = "Please consider the user instruction and generate a search query for the internet meme search. You MUST choose the web search tool in the tool_call to excute. User Instruction: search {} meme, and provide three examples of {} usage in context,need to gie out three query results".format(user_instruction, user_instruction)
Comment on lines +16 to +21
🛠️ Refactor suggestion

Remove commented code and improve search query construction

The file contains commented-out code and a hardcoded search query format. Consider:

  1. Removing the commented-out code
  2. Moving the query template to a configuration file
  3. Adding input validation for user instructions
-        # search_query = "Please consider the user instruction and generate a search query for the internet memo search tool to search for the explaination according to user requirements. You MUST choose the web search tool in the tool_call to excute. When generating the search query, please include how this memo comes from how to use this memo. User Instruction: {}".format(user_instruction)
-
-        # search_query = "Please consider the user instruction and generate a search query for the internet memo search. You MUST choose the web search tool in the tool_call to excute. User Instruction: 搜索{}, 并提供相关的3个例子,需要获得三个query results".format(user_instruction)
-
-        search_query = "Please consider the user instruction and generate a search query for the internet meme search. You MUST choose the web search tool in the tool_call to excute. User Instruction: search {} meme, and provide three examples of {} usage in context,need to gie out three query results".format(user_instruction, user_instruction)
+        if not user_instruction or not isinstance(user_instruction, str):
+            raise ValueError("Invalid user instruction")
+        
+        search_query = self.config.get_template("meme_search").format(
+            user_instruction=user_instruction
+        )

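The self.config.get_template call above is an assumed API, not something this PR provides. If no config-template mechanism exists, even a module-level constant keeps the query text out of the method body; a minimal sketch:

# Module-level template; the placeholder and constant names are illustrative
MEME_SEARCH_QUERY_TEMPLATE = (
    "Please consider the user instruction and generate a search query for the "
    "internet meme search. You MUST choose the web search tool in the tool_call "
    "to execute. User Instruction: search {instruction} meme, and provide three "
    "examples of {instruction} usage in context; give three query results."
)

search_query = MEME_SEARCH_QUERY_TEMPLATE.format(instruction=user_instruction)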

logging.info(search_query)
# Execute memo search via tool manager and notify user
execution_status, execution_results = self.tool_manager.execute_task(
task=search_query
)
self.callback.send_block(agent_id=self.workflow_instance_id, msg='Using web search tool to search for meme information')
logging.info(execution_results)

Comment on lines +25 to +30
🛠️ Refactor suggestion

Improve error handling and logging for tool execution

The tool execution could benefit from better error handling and logging:

  1. Add timeout handling
  2. Log execution failures with details
  3. Add retry mechanism for transient failures
         execution_status, execution_results = self.tool_manager.execute_task(
-                task=search_query
+                task=search_query,
+                timeout=self.config.get("search_timeout", 30),
+                retries=self.config.get("max_retries", 3)
             )
         self.callback.send_block(agent_id=self.workflow_instance_id, msg='Using web search tool to search for meme information')
-        logging.info(execution_results)
+        if execution_status == "success":
+            logging.info("Search completed successfully: %s", execution_results)
+        else:
+            logging.error("Search failed: %s", execution_results)

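The timeout and retries keyword arguments above are assumptions; if ToolManager.execute_task does not accept them, a thin wrapper gives the same retry behavior without touching the manager's signature (a sketch):

import time

def execute_with_retries(tool_manager, task: str, retries: int = 3, backoff_s: float = 2.0):
    """Retry transient tool failures with linear backoff between attempts."""
    last_result = None
    for attempt in range(1, retries + 1):
        status, result = tool_manager.execute_task(task=task)
        if status == "success":
            return status, result
        last_result = result
        time.sleep(backoff_s * attempt)  # back off a little longer each attempt
    return "failed", last_result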

# Store successful results in workflow context or raise error
if execution_status == "success":
self.stm(self.workflow_instance_id)["search_info"] = execution_results
else:
raise ValueError("Web search tool execution failed.")

return
53 changes: 53 additions & 0 deletions examples/internet_meme/agent/simple_vqa/simple_vqa.py
@@ -0,0 +1,53 @@
from pathlib import Path
from typing import List

from omagent_core.models.llms.base import BaseLLMBackend
from omagent_core.utils.registry import registry
from omagent_core.models.llms.schemas import Message, Content
from omagent_core.utils.general import encode_image
from omagent_core.models.llms.prompt.parser import StrParser
from omagent_core.models.llms.openai_gpt import OpenaiGPTLLM
from omagent_core.engine.worker.base import BaseWorker
from omagent_core.utils.container import container


@registry.register_worker()
class SimpleVQA(BaseWorker, BaseLLMBackend):
"""Simple Visual Question Answering processor that handles image-based questions.

This processor:
1. Takes user instruction and cached image from workflow context
2. Creates chat messages containing the text question and base64-encoded image
3. Sends messages to LLM to generate a response
4. Returns response and sends it via callback
"""
llm: OpenaiGPTLLM

def _run(self, user_instruction:str, *args, **kwargs):
# Initialize empty list for chat messages
chat_message = []

# Add text question as first message
chat_message.append(Message(role="user", message_type='text', content=user_instruction))

# Retrieve cached image from workflow shared memory
if self.stm(self.workflow_instance_id).get('image_cache', None):
img = self.stm(self.workflow_instance_id)['image_cache']['<image_0>']
Comment on lines +34 to +35
🛠️ Refactor suggestion

Add error handling for missing image key in cache.

When accessing '<image_0>' in image_cache, there's a risk of a KeyError if the key doesn't exist. Consider adding a check to handle this scenario gracefully.

Apply this diff to prevent potential KeyError:

if self.stm(self.workflow_instance_id).get('image_cache', None):
-    img = self.stm(self.workflow_instance_id)['image_cache']['<image_0>']
+    image_cache = self.stm(self.workflow_instance_id)['image_cache']
+    img = image_cache.get('<image_0>')
+    if img:
+        # Add base64 encoded image as second message
+        chat_message.append(Message(
+            role="user",
+            message_type='image',
+            content=[Content(
+                type="image_url",
+                image_url={
+                    "url": f"data:image/jpeg;base64,{encode_image(img)}"
+                },
+            )]
+        ))
+    else:
+        # Handle the case where the image is missing
+        pass  # Optionally log a warning or take alternative action


# Add base64 encoded image as second message
chat_message.append(Message(role="user", message_type='image', content=[Content(
type="image_url",
image_url={
"url": f"data:image/jpeg;base64,{encode_image(img)}"
},
)]))

# Get response from LLM model
chat_complete_res = self.llm.generate(records=chat_message)

# Extract answer text from response
answer = chat_complete_res["choices"][0]["message"]["content"]

🛠️ Refactor suggestion

Add error handling for unexpected LLM response structure.

Accessing chat_complete_res["choices"][0]["message"]["content"] without checks may raise exceptions if the response is not as expected. Add error handling to manage unexpected responses.

Apply this diff to handle potential exceptions:

-answer = chat_complete_res["choices"][0]["message"]["content"]
+try:
+    answer = chat_complete_res["choices"][0]["message"]["content"]
+except (KeyError, IndexError, TypeError) as e:
+    # Handle the error: log it and provide a default response
+    self.logger.error(f"Failed to parse LLM response: {e}")
+    answer = "I'm sorry, I couldn't process your request."


# Send answer via callback and return
self.callback.send_answer(self.workflow_instance_id, msg=answer)
return answer
20 changes: 20 additions & 0 deletions examples/internet_meme/compile_container.py
@@ -0,0 +1,20 @@
from omagent_core.utils.container import container
from pathlib import Path
from omagent_core.utils.registry import registry

Comment on lines +1 to +4
⚠️ Potential issue

Remove duplicate Path import

The Path class is imported twice unnecessarily.

Apply this diff to fix the duplicate import:

from omagent_core.utils.container import container
from pathlib import Path
from omagent_core.utils.registry import registry

-# Configure import path for agent modules
-from pathlib import Path
CURRENT_PATH = Path(__file__).parents[0]

Also applies to: 10-11


# Load all registered workflow components
registry.import_module()

# Configure import path for agent modules
from pathlib import Path
CURRENT_PATH = Path(__file__).parents[0]

# Register core workflow components for state management, callbacks and input handling
container.register_stm(stm='RedisSTM')
container.register_callback(callback='AppCallback')
container.register_input(input='AppInput')
Comment on lines +14 to +16
🛠️ Refactor suggestion

Add error handling for component registration

The component registration calls lack error handling, which could lead to silent failures.

Consider wrapping the registration calls in try-except blocks:

-container.register_stm(stm='RedisSTM')
-container.register_callback(callback='AppCallback')
-container.register_input(input='AppInput')
+try:
+    container.register_stm(stm='RedisSTM')
+    container.register_callback(callback='AppCallback')
+    container.register_input(input='AppInput')
+except Exception as e:
+    logging.error(f"Failed to register components: {e}")
+    raise

# Compile container config
container.compile_config(CURRENT_PATH)
6 changes: 6 additions & 0 deletions examples/internet_meme/configs/llms/gpt.yml
@@ -0,0 +1,6 @@
name: OpenaiGPTLLM
model_id: gpt-4o
api_key: ${env| custom_openai_key, openai_api_key}
endpoint: ${env| custom_openai_endpoint, https://api.openai.com/v1}
temperature: 0
vision: true
6 changes: 6 additions & 0 deletions examples/internet_meme/configs/llms/text_res.yml
@@ -0,0 +1,6 @@
name: OpenaiGPTLLM
model_id: gpt-4o
api_key: ${env| custom_openai_key, openai_api_key}
endpoint: ${env| custom_openai_endpoint, https://api.openai.com/v1}
temperature: 0
vision: false
5 changes: 5 additions & 0 deletions examples/internet_meme/configs/tools/websearch.yml
@@ -0,0 +1,5 @@
llm: ${sub| text_res}
tools:
- name: WebSearch
bing_api_key: ${env| bing_api_key, null}
llm: ${sub|text_res}
2 changes: 2 additions & 0 deletions examples/internet_meme/configs/workers/meme_explain.yaml
@@ -0,0 +1,2 @@
name: MemeExplain
llm: ${sub| gpt}
3 changes: 3 additions & 0 deletions examples/internet_meme/configs/workers/meme_seacher.yaml
@@ -0,0 +1,3 @@
name: MemeSearcher

🛠️ Refactor suggestion

Correct the filename to meme_searcher.yaml for consistency.

The filename meme_seacher.yaml appears to have a typo. Renaming it to meme_searcher.yaml will maintain consistency with the component's name MemeSearcher and improve clarity.

llm: ${sub| text_res}
tool_manager: ${sub|websearch}
2 changes: 2 additions & 0 deletions examples/internet_meme/configs/workers/simple_vqa.yaml
@@ -0,0 +1,2 @@
name: SimpleVQA
llm: ${sub|gpt}
84 changes: 84 additions & 0 deletions examples/internet_meme/container.yaml
@@ -0,0 +1,84 @@
conductor_config:
name: Configuration
base_url:
value: http://10.8.25.26:8080
description: The Conductor Server API endpoint
env_var: CONDUCTOR_SERVER_URL
Comment on lines +3 to +6
⚠️ Potential issue

Security: Consider using HTTPS for the Conductor Server endpoint

The base URL is using HTTP which is insecure for API communication. Consider using HTTPS to ensure secure communication.

-    value: http://10.8.25.26:8080
+    value: https://10.8.25.26:8080

auth_key:
value: null
description: The authorization key
env_var: AUTH_KEY
auth_secret:
value: null
description: The authorization secret
env_var: CONDUCTOR_AUTH_SECRET
auth_token_ttl_min:
value: 45
description: The authorization token refresh interval in minutes.
env_var: AUTH_TOKEN_TTL_MIN
debug:
value: false
description: Debug mode
env_var: DEBUG
connectors:
redis_stream_client:
name: RedisConnector
host:
value: localhost
env_var: HOST
port:
value: 6379
env_var: PORT
password:
value: null
env_var: PASSWORD
username:
value: null
env_var: USERNAME
db:
value: 0
env_var: DB
redis_stm_client:
name: RedisConnector
host:
value: localhost
env_var: HOST
port:
value: 6379
env_var: PORT
password:
value: null
env_var: PASSWORD
username:
value: null
env_var: USERNAME
db:
value: 0
env_var: DB
Comment on lines +24 to +57
🛠️ Refactor suggestion

Refactor: Consolidate duplicate Redis configurations

The Redis configurations for redis_stream_client and redis_stm_client are identical. Consider using a single base configuration that can be extended or referenced to avoid duplication.

Consider restructuring using YAML anchors and aliases:

redis_base: &redis_base
  name: RedisConnector
  host:
    value: localhost
    env_var: HOST
  port:
    value: 6379
    env_var: PORT
  password:
    value: null
    env_var: PASSWORD
  username:
    value: null
    env_var: USERNAME
  db:
    value: 0
    env_var: DB

connectors:
  redis_stream_client:
    <<: *redis_base
  redis_stm_client:
    <<: *redis_base

components:
DefaultCallback:
name: DefaultCallback
bot_id:
value: ''
env_var: BOT_ID
start_time:
value: 2024-12-03_18:51:46
env_var: START_TIME
folder_name:
value: ./running_logs/2024-12-03_18:51:46
env_var: FOLDER_NAME
Comment on lines +64 to +69
⚠️ Potential issue

Remove hardcoded timestamps and paths

The configuration contains hardcoded timestamps and log paths. These should be dynamically generated at runtime rather than stored in configuration.

These values should be removed from the configuration and set programmatically in the application code:

  • start_time: 2024-12-03_18:51:46
  • folder_name: ./running_logs/2024-12-03_18:51:46

Also applies to: 77-82
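
A sketch of computing these values at startup instead of committing them to container.yaml (function and key names here are assumptions, not part of this PR):

from datetime import datetime
from pathlib import Path

def runtime_log_config() -> dict:
    """Derive start time and log folder at startup rather than from config."""
    start_time = datetime.now().strftime("%Y-%m-%d_%H:%M:%S")
    folder = Path("./running_logs") / start_time
    folder.mkdir(parents=True, exist_ok=True)
    return {"start_time": start_time, "folder_name": str(folder)}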

AppInput:
name: AppInput
AppCallback:
name: AppCallback
bot_id:
value: ''
env_var: BOT_ID
start_time:
value: 2024-12-03_18:51:46
env_var: START_TIME
folder_name:
value: ./running_logs/2024-12-03_18:51:46
env_var: FOLDER_NAME
RedisSTM:
name: RedisSTM