Merge pull request #80 from kense-lab/feat/r2r
Feat RAG enhancement
Tuanzi1015 authored Aug 28, 2024
2 parents 5e13c36 + d9f87a7 commit 2aa5c91
Showing 32 changed files with 679 additions and 99 deletions.
14 changes: 10 additions & 4 deletions .env.example
Original file line number Diff line number Diff line change
@@ -25,9 +25,6 @@ REDIS_PORT=6379
REDIS_DB=0
REDIS_PASSWORD=123456

# file service
FILE_SERVICE_MODULE=app.services.file.impl.oss_file.OSSFileService

# s3 storage
S3_ENDPOINT=http://minio:9000
S3_BUCKET_NAME=oas
@@ -52,6 +49,15 @@ BING_SEARCH_URL=https://api.bing.microsoft.com/v7.0/search
BING_SUBSCRIPTION_KEY=xxxx
WEB_SEARCH_NUM_RESULTS=5

# file service
FILE_SERVICE_MODULE=app.services.file.impl.oss_file.OSSFileService
# FILE_SERVICE_MODULE=app.services.file.impl.r2r_file.R2RFileService

# file search tool
R2R_BASE_URL=http://127.0.0.1:8000
R2R_USERNAME=[email protected]
R2R_PASSWORD=change_me_immediately
R2R_SEARCH_LIMIT=10

# secret
APP_AES_ENCRYPTION_KEY=7700b2f9c8dd982dfaddf8b47a92f1d900507ee8ac335f96a64e9ca0f018b195

46 changes: 31 additions & 15 deletions README.md
@@ -19,6 +19,8 @@ LLM applications.

It supports [One API](https://github.com/songquanpeng/one-api) for integration with more commercial and private models.

It supports the [R2R](https://github.com/SciPhi-AI/R2R) RAG engine.

## Usage

Below is an example of using the official OpenAI Python `openai` library:
@@ -40,21 +42,22 @@ assistant = client.beta.assistants.create(

## Why Choose Open Assistant API

| Feature | Open Assistant API | OpenAI Assistant API |
|--------------------------|-----------------------|----------------------|
| Ecosystem Strategy | Open Source | Closed Source |
| RAG Engine | Simple Implementation | Supported |
| Internet Search | Supported | Not Supported |
| Custom Functions | Supported | Supported |
| Built-in Tool | Extendable | Not Extendable |
| Code Interpreter | Under Development | Supported |
| LLM Support | Supports More LLMs | Only GPT |
| Message Streaming Output | Supports | Supported |
| Local Deployment | Supported | Not Supported |
| Feature | Open Assistant API | OpenAI Assistant API |
|--------------------------|--------------------|----------------------|
| Ecosystem Strategy | Open Source | Closed Source |
| RAG Engine               | Supports R2R       | Supported            |
| Internet Search | Supported | Not Supported |
| Custom Functions | Supported | Supported |
| Built-in Tool | Extendable | Not Extendable |
| Code Interpreter | Under Development | Supported |
| Multimodal | Supported | Supported |
| LLM Support | Supports More LLMs | Only GPT |
| Message Streaming Output | Supported          | Supported            |
| Local Deployment | Supported | Not Supported |

- **LLM Support**: Compared to the official OpenAI version, more models can be supported by integrating with One API.
- **Tool**: Currently supports online search; can easily expand more tools.
- **RAG Engine**: The currently supported file types are txt, pdf, html, markdown. We provide a preliminary
- **RAG Engine**: The currently supported file types are txt, html, markdown, pdf, docx, pptx, xlsx, png, mp3, mp4, etc. We provide a preliminary
implementation.
- **Message Streaming Output**: Support message streaming output for a smoother user experience.
- **Ecosystem Strategy**: Open source, you can deploy the service locally and expand the existing features.
@@ -76,6 +79,18 @@ OPENAI_API_KEY=<openai_api_key>
BING_SUBSCRIPTION_KEY=<bing_subscription_key>
```

It is recommended to configure the R2R RAG engine in place of the default RAG implementation for better RAG capabilities.
You can learn about and use R2R through the [R2R GitHub repository](https://github.com/SciPhi-AI/R2R).

```sh
# RAG config
# FILE_SERVICE_MODULE=app.services.file.impl.oss_file.OSSFileService
FILE_SERVICE_MODULE=app.services.file.impl.r2r_file.R2RFileService
R2R_BASE_URL=http://<r2r_api_address>
R2R_USERNAME=<r2r_username>
R2R_PASSWORD=<r2r_password>
```
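`FILE_SERVICE_MODULE` names the service class by its dotted import path, which suggests the implementation is resolved dynamically at startup. A minimal sketch of that pattern — the `load_service` helper and the stand-in default path are illustrative, not the project's actual code:

```python
import importlib
import os

def load_service(module_path: str):
    # Resolve a dotted "package.module.ClassName" path to the class itself.
    module_name, _, class_name = module_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)

# Stand-in default for illustration; in the real app the value would come
# from the FILE_SERVICE_MODULE environment variable.
service_cls = load_service(os.environ.get("FILE_SERVICE_MODULE", "collections.OrderedDict"))
print(service_cls.__name__)
```

Swapping `OSSFileService` for `R2RFileService` then requires no code change, only a different env value.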

### Run

#### Run with Docker Compose:
@@ -92,14 +107,14 @@ Interface documentation address: http://127.0.0.1:8086/docs

### Complete Usage Example

In this example, an AI assistant is created and run using the official OpenAI client library. If you need to explore other usage methods,
In this example, an AI assistant is created and run using the official OpenAI client library. If you need to explore other usage methods,
such as streaming output, tools (web_search, retrieval, function), etc., you can find the corresponding code under the examples directory.
Before running, you need to run `pip install openai` to install the Python `openai` library.

```sh
# !pip install openai
export PYTHONPATH=$(pwd)
python examples/run_assistant.py
export PYTHONPATH=$(pwd)
python examples/run_assistant.py
```


@@ -135,6 +150,7 @@ We mainly referred to and relied on the following projects:

- [OpenOpenAI](https://github.com/transitive-bullshit/OpenOpenAI): Assistant API implemented in Node
- [One API](https://github.com/songquanpeng/one-api): Multi-model management tool
- [R2R](https://github.com/SciPhi-AI/R2R): RAG engine
- [OpenAI-Python](https://github.com/openai/openai-python): OpenAI Python Client
- [OpenAI API](https://github.com/openai/openai-openapi): OpenAI interface definition
- [LangChain](https://github.com/langchain-ai/langchain): LLM application development library
28 changes: 22 additions & 6 deletions README_CN.md
@@ -18,6 +18,8 @@ Open Assistant API is an open-source, self-hosted AI assistant API, compatible with Open

It supports [One API](https://github.com/songquanpeng/one-api) for integration with more commercial and private models.

It supports the [R2R](https://github.com/SciPhi-AI/R2R) RAG engine.

## Usage

Below is an example of using the official OpenAI Python `openai` library:
@@ -42,18 +44,19 @@ assistant = client.beta.assistants.create(
| Feature                  | Open Assistant API    | OpenAI Assistant API |
|--------------------------|-----------------------|----------------------|
| Ecosystem Strategy       | Open Source           | Closed Source        |
| RAG Engine               | Simple Implementation | Supported            |
| RAG Engine               | Supports R2R          | Supported            |
| Internet Search          | Supported             | Not Supported        |
| Custom Functions         | Supported             | Supported            |
| Built-in Tool            | Extendable            | Not Extendable       |
| Code Interpreter         | Under Development     | Supported            |
| Multimodal               | Supported             | Supported            |
| LLM Support              | Supports More LLMs    | Only GPT             |
| Message Streaming Output | Supported             | Supported            |
| Local Deployment         | Supported             | Not Supported        |

- **LLM Support**: Compared to the official OpenAI version, more models can be supported by integrating with One API.
- **Tool**: Currently supports web search; more tools can easily be added.
- **RAG Engine**: The currently supported file types are txt, pdf, html, markdown. We provide a preliminary implementation.
- **RAG Engine**: Supports the R2R RAG engine. The currently supported file types are txt, html, markdown, pdf, docx, pptx, xlsx, png, mp3, mp4, etc.
- **Message Streaming Output**: Supports streaming message output for a smoother user experience.
- **Ecosystem Strategy**: Open source; you can deploy the service locally and extend the existing features.

@@ -71,6 +74,18 @@ OPENAI_API_KEY=<openai_api_key>

# bing search key (非必填)
BING_SUBSCRIPTION_KEY=<bing_subscription_key>
```

It is recommended to configure the R2R RAG engine in place of the default RAG implementation for better RAG capabilities.
You can learn about and use R2R through the [R2R GitHub repository](https://github.com/SciPhi-AI/R2R).

```sh
# RAG config
# FILE_SERVICE_MODULE=app.services.file.impl.oss_file.OSSFileService
FILE_SERVICE_MODULE=app.services.file.impl.r2r_file.R2RFileService
R2R_BASE_URL=http://<r2r_api_address>
R2R_USERNAME=<r2r_username>
R2R_PASSWORD=<r2r_password>
```

### Run
@@ -95,8 +110,8 @@ Api Base URL: http://127.0.0.1:8086/api/v1

```sh
# !pip install openai
export PYTHONPATH=$(pwd)
python examples/run_assistant.py
export PYTHONPATH=$(pwd)
python examples/run_assistant.py
```

### Permissions
@@ -105,7 +120,7 @@ python examples/run_assistant.py
![](docs/imgs/user.png)

1. Authentication uses a Bearer token; put ```Authorization: Bearer ***``` in the header to authenticate
2. For token management, see the token section of the API documentation; the relevant APIs require the admin token, configured via ```APP_AUTH_ADMIN_TOKEN``` (default: admin)
3. Creating a token requires the LLM base_url and api_key; assistants created with that token will use this configuration to access the LLM
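The Bearer scheme described above can be exercised with a plain stdlib request object. The endpoint path here is hypothetical; `admin` is the documented default admin token:

```python
import urllib.request

# Hypothetical endpoint path; the docs specify only the base URL and the
# Bearer scheme. "admin" is the default admin token per the docs.
req = urllib.request.Request(
    "http://127.0.0.1:8086/api/v1/assistants",
    headers={"Authorization": "Bearer admin"},
)
print(req.get_header("Authorization"))  # → Bearer admin
```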
### Tools
@@ -130,6 +145,7 @@

- [OpenOpenAI](https://github.com/transitive-bullshit/OpenOpenAI): Assistant API implemented in Node
- [One API](https://github.com/songquanpeng/one-api): Multi-model management tool
- [R2R](https://github.com/SciPhi-AI/R2R): RAG engine
- [OpenAI-Python](https://github.com/openai/openai-python): OpenAI Python Client
- [OpenAI API](https://github.com/openai/openai-openapi): OpenAI interface definition
- [LangChain](https://github.com/langchain-ai/langchain): LLM application development library
3 changes: 2 additions & 1 deletion app/api/v1/files.py
@@ -18,7 +18,8 @@
# Maximum allowed file size
max_size = 512 * 1024 * 1024
# Supported file types
file_ext = [".txt", ".md", ".pdf", ".html"]
file_ext = [".csv", ".docx", ".html", ".json", ".md", ".pdf", ".pptx", ".txt",
".xlsx", ".gif", ".png", ".jpg", ".jpeg", ".svg", ".mp3", ".mp4"]


@router.get("", response_model=ListFilesResponse)
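The widened extension whitelist and the 512 MB cap could be enforced by a helper along these lines. `is_upload_allowed` is a hypothetical sketch; the endpoint's actual validation logic falls outside this hunk:

```python
import os

max_size = 512 * 1024 * 1024  # 512 MB cap, matching the diff
file_ext = [".csv", ".docx", ".html", ".json", ".md", ".pdf", ".pptx", ".txt",
            ".xlsx", ".gif", ".png", ".jpg", ".jpeg", ".svg", ".mp3", ".mp4"]

def is_upload_allowed(filename: str, size: int) -> bool:
    # Hypothetical helper: compare the lower-cased extension against the
    # whitelist and reject empty or oversized files.
    ext = os.path.splitext(filename)[1].lower()
    return ext in file_ext and 0 < size <= max_size

print(is_upload_allowed("report.PDF", 1024))   # → True
print(is_upload_allowed("archive.zip", 1024))  # → False
```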
2 changes: 1 addition & 1 deletion app/core/runner/pub_handler.py
@@ -51,7 +51,7 @@ def read_event(channel: str, x_index: str = None) -> Tuple[Optional[str], Optional[str]]:

def _data_adjust_tools(tools: List[dict]) -> List[dict]:
def _adjust_tool(tool: dict):
if tool["type"] not in {"code_interpreter", "retrieval", "function"}:
if tool["type"] not in {"code_interpreter", "file_search", "function"}:
return {
"type": "function",
"function": {
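The `_data_adjust_tools` hunk is truncated before the wrapped payload is complete. A self-contained sketch of the renaming logic — the exact shape of the wrapped function payload is an assumption — might look like:

```python
def adjust_tool(tool: dict) -> dict:
    # Any tool type outside the three built-ins is re-published as a plain
    # function tool so clients see a valid OpenAI-style tool entry.
    if tool["type"] not in {"code_interpreter", "file_search", "function"}:
        return {"type": "function", "function": {"name": tool["type"]}}
    return tool

print(adjust_tool({"type": "web_search"})["type"])  # → function
```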
11 changes: 2 additions & 9 deletions app/core/runner/thread_runner.py
@@ -10,7 +10,6 @@
from config.config import settings
from config.llm import llm_settings, tool_settings

from app.core.doc_loaders import doc_loader
from app.core.runner.llm_backend import LLMBackend
from app.core.runner.llm_callback_handler import LLMCallbackHandler
from app.core.runner.memory import Memory, find_memory
@@ -28,8 +27,7 @@
from app.models.message import Message, MessageUpdate
from app.models.run import Run
from app.models.run_step import RunStep
from app.models.file import File
from app.providers.storage import storage
from app.models.token_relation import RelationType
from app.services.assistant.assistant import AssistantService
from app.services.file.file import FileService
from app.services.message.message import MessageService
@@ -261,11 +259,6 @@ def __generate_chat_messages(self, messages: List[Message]):
Generate chat messages from the message history
"""

def file_load(file: File):
file_data = storage.load(file.key)
content = doc_loader.load(file_data)
return f"For reference, here is is the content of file {file.filename}: '{content}'"

chat_messages = []
for message in messages:
role = message.role
@@ -274,7 +267,7 @@ def file_load(file: File):
if message.file_ids:
files = FileService.get_file_list_by_ids(session=self.session, file_ids=message.file_ids)
for file in files:
chat_messages.append(msg_util.new_message(role, file_load(file)))
chat_messages.append(msg_util.new_message(role, f'The file "{file.filename}" can be used as a reference'))
else:
for content in message.content:
if content["type"] == "text":
2 changes: 1 addition & 1 deletion app/core/runner/utils/tool_call_util.py
@@ -5,7 +5,7 @@
{
"id": "tool_call_0",
"function": {
"name": "retrieval",
"name": "file_search",
"arguments": "{\"file_keys\": [\"file_0\", \"file_1\"], \"query\": \"query\"}"
}
}
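As the docstring example above shows, `function.arguments` arrives as a JSON-encoded string, so a consumer has to decode it before reading `file_keys` or `query`:

```python
import json

# Tool call shape taken from the docstring in tool_call_util.py.
tool_call = {
    "id": "tool_call_0",
    "function": {
        "name": "file_search",
        "arguments": "{\"file_keys\": [\"file_0\", \"file_1\"], \"query\": \"query\"}",
    },
}

# Decode the arguments string into a dict before use.
args = json.loads(tool_call["function"]["arguments"])
print(args["file_keys"])  # → ['file_0', 'file_1']
```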
6 changes: 3 additions & 3 deletions app/core/tools/__init__.py
@@ -9,17 +9,17 @@
from app.core.tools.base_tool import BaseTool
from app.core.tools.external_function_tool import ExternalFunctionTool
from app.core.tools.openapi_function_tool import OpenapiFunctionTool
from app.core.tools.retrieval import RetrievalTool
from app.core.tools.file_search_tool import FileSearchTool
from app.core.tools.web_search import WebSearchTool


class AvailableTools(str, Enum):
RETRIEVAL = "retrieval"
FILE_SEARCH = "file_search"
WEB_SEARCH = "web_search"


TOOLS = {
AvailableTools.RETRIEVAL: RetrievalTool,
AvailableTools.FILE_SEARCH: FileSearchTool,
AvailableTools.WEB_SEARCH: WebSearchTool,
}

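Because `AvailableTools` subclasses `str`, a tool name arriving as a raw string converts directly into a registry key. A self-contained sketch of the lookup — the tool classes here are stand-ins for the real ones in `app.core.tools`:

```python
from enum import Enum

class AvailableTools(str, Enum):
    FILE_SEARCH = "file_search"
    WEB_SEARCH = "web_search"

# Stand-in tool classes; the real implementations live in app.core.tools.
class FileSearchTool: ...
class WebSearchTool: ...

TOOLS = {
    AvailableTools.FILE_SEARCH: FileSearchTool,
    AvailableTools.WEB_SEARCH: WebSearchTool,
}

# A raw string from a request body maps straight into the registry.
tool_cls = TOOLS[AvailableTools("file_search")]
print(tool_cls.__name__)  # → FileSearchTool
```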
17 changes: 7 additions & 10 deletions app/core/tools/retrieval.py → app/core/tools/file_search_tool.py
@@ -3,26 +3,24 @@
from pydantic import BaseModel, Field
from sqlalchemy.orm import Session

from app.core.doc_loaders import doc_loader
from app.core.tools.base_tool import BaseTool
from app.models.run import Run
from app.providers.storage import storage
from app.services.file.file import FileService


class RetrievalToolInput(BaseModel):
class FileSearchToolInput(BaseModel):
indexes: List[int] = Field(..., description="file index list to look up in retrieval")
query: str = Field(..., description="query to look up in retrieval")


class RetrievalTool(BaseTool):
name: str = "retrieval"
class FileSearchTool(BaseTool):
name: str = "file_search"
description: str = (
"Can be used to look up information that was uploaded to this assistant."
"If the user is referencing particular files, that is often a good hint that information may be here."
)

args_schema: Type[BaseModel] = RetrievalToolInput
args_schema: Type[BaseModel] = FileSearchToolInput

def __init__(self) -> None:
super().__init__()
@@ -40,13 +38,12 @@ def configure(self, session: Session, run: Run, **kwargs):
self.__keys.append(file.key)

def run(self, indexes: List[int], query: str) -> dict:
files = {}
file_keys = []
for index in indexes:
file_key = self.__keys[index]
file_data = storage.load(file_key)
# Truncate to the first 5000 characters to avoid exceeding the LLM's maximum context limit
files[file_key] = doc_loader.load(file_data)[:5000]
file_keys.append(file_key)

files = FileService.search_in_files(query=query, file_keys=file_keys)
return files

def instruction_supplement(self) -> str:
9 changes: 9 additions & 0 deletions app/libs/util.py
@@ -1,6 +1,8 @@
import uuid
from datetime import datetime

import jwt


def datetime2timestamp(value: datetime):
if not value:
@@ -26,3 +28,10 @@ def is_valid_datetime(date_str, format="%Y-%m-%d %H:%M:%S"):

def random_uuid() -> str:
return "ml-" + str(uuid.uuid4()).replace("-", "")


def verify_jwt_expiration(token):
decoded_token = jwt.decode(token, options={"verify_signature": False, "verify_exp": False})
expiration_time = datetime.fromtimestamp(decoded_token['exp'])
current_time = datetime.now()
return current_time < expiration_time
2 changes: 2 additions & 0 deletions app/models/message.py
@@ -14,6 +14,7 @@ class MessageBase(BaseModel):
object: str = Field(nullable=False, default="thread.message")
content: Optional[list] = Field(default=None, sa_column=Column(JSON))
file_ids: Optional[list] = Field(default=None, sa_column=Column(JSON))
attachments: Optional[list] = Field(default=None, sa_column=Column(JSON))  # attachments
metadata_: Optional[dict] = Field(default=None, sa_column=Column("metadata", JSON), schema_extra={"validation_alias": "metadata"})
assistant_id: Optional[str] = Field(default=None)
run_id: Optional[str] = Field(default=None)
@@ -27,6 +28,7 @@ class MessageCreate(BaseModel):
role: str = Field(sa_column=Column(Enum("assistant", "user"), nullable=False))
content: Union[str, List[dict]] = Field(nullable=False)
file_ids: Optional[list] = Field(default=None)
attachments: Optional[list] = Field(default=None, sa_column=Column(JSON))  # attachments
metadata_: Optional[dict] = Field(default=None, schema_extra={"validation_alias": "metadata"})


7 changes: 3 additions & 4 deletions app/models/run.py
@@ -16,7 +16,7 @@

class RunBase(BaseModel):
instructions: Optional[str] = Field(default=None, max_length=32768, sa_column=Column(TEXT))
model: str = Field(default=None)
model: Optional[str] = Field(default=None)
status: str = Field(
default="queued",
sa_column=Column(
@@ -70,8 +70,7 @@ class RunCreate(BaseModel):
instructions: Optional[str] = None
additional_instructions: Optional[str] = None
model: Optional[str] = None
file_ids: Optional[list] = []
metadata_: Optional[dict] = Field(default={}, alias="metadata")
metadata_: Optional[dict] = Field(default=None, schema_extra={"validation_alias": "metadata"})
tools: Optional[list] = []
extra_body: Optional[dict[str, Union[dict[str, Union[Authentication, Any]], Any]]] = {}
stream: Optional[bool] = False
@@ -97,7 +96,7 @@ def model_validator(cls, data: Any):

class RunUpdate(BaseModel):
tools: Optional[list] = []
metadata_: Optional[dict] = Field(default=None)
metadata_: Optional[dict] = Field(default=None, schema_extra={"validation_alias": "metadata"})
extra_body: Optional[dict[str, Authentication]] = {}

@model_validator(mode="before")