
Commit

Add MiniCPM-V 2.6 int4
shadowcz007 committed Aug 22, 2024
1 parent c141ba4 commit 0846013
Showing 9 changed files with 185 additions and 32 deletions.
42 changes: 26 additions & 16 deletions README.md
@@ -6,26 +6,31 @@
For business cooperation, please contact email [email protected]


##### `Latest`
- Add MiniCPM-V 2.6 int4

  This is the int4 quantized version of MiniCPM-V 2.6.
  Running the int4 version uses less GPU memory (about 7 GB).

- Mobile adaptation; reworked the Mask editor in app mode

- Add p5.js as an input node
  [workflow](./workflow/p5workflow.json)
  [workflow2](./workflow/p5-video-workflow.json)

- App mode adds batch prompt: dynamic prompts can be composed in batches and then run

![alt text](./assets/1722517810720.png)

- Add an API Key Input node for managing LLM keys, and optimize the LLM-related nodes in preparation for the upcoming agent mode

- Add SiliconflowLLM, which can use the free LLMs provided by Siliconflow

<!-- - The ChatGPT node supports Local LLM (llama.cpp); Phi3 and llama3 can now run directly from a single node. After downloading a model, place it in `models/llamafile/` -->

<!-- - The right-click menu supports text-to-text, making it easy to auto-complete prompts -->
<!--
Strongly recommended:
[Phi-3-mini-4k-instruct-function-calling-GGUF](https://huggingface.co/nold/Phi-3-mini-4k-instruct-function-calling-GGUF)
@@ -36,7 +41,6 @@ For business cooperation, please contact email [email protected]
![](./assets/prompt_ai_setup.png)
![](./assets/prompt-ai.png) -->


#### `Recommended related plugins`

[comfyui-liveportrait](https://github.com/shadowcz007/comfyui-liveportrait)
@@ -60,8 +64,8 @@ For business cooperation, please contact email [email protected]
- Workflows published as an app can be edited again from the right-click menu
- Web apps can be assigned a category, and can be edited and updated from the comfyui right-click menu
- Support for dynamic prompts
- Support for displaying output on the comfyui background (TouchDesigner style)
- If a workflow opened as a web app shows a blank page, check that the plugin directory is named comfyui-mixlab-nodes (a zip download adds a -main suffix, which must be removed)

![](./assets/微信图片_20240421205440.png)

@@ -190,7 +194,6 @@ pip install llama-cpp-python \
> The composite images node overlays a foreground image onto a background image at specified positions and scales, with optional blending modes and masking capabilities. position: "overall", "center_center", "left_bottom", "center_bottom", "right_bottom", "left_top", "center_top", "right_top"
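Below is a minimal sketch of that overlay logic, written with Pillow for illustration only; it is not the node's actual implementation, and the file names are placeholders.

```python
from PIL import Image

def composite(bg_path, fg_path, position="center_center", scale=0.5):
    """Paste a scaled foreground onto a background at a named anchor position."""
    bg = Image.open(bg_path).convert("RGBA")
    fg = Image.open(fg_path).convert("RGBA")
    fg = fg.resize((int(fg.width * scale), int(fg.height * scale)))
    anchors = {
        "left_top": (0, 0),
        "center_top": ((bg.width - fg.width) // 2, 0),
        "right_top": (bg.width - fg.width, 0),
        "left_bottom": (0, bg.height - fg.height),
        "center_bottom": ((bg.width - fg.width) // 2, bg.height - fg.height),
        "right_bottom": (bg.width - fg.width, bg.height - fg.height),
        "center_center": ((bg.width - fg.width) // 2, (bg.height - fg.height) // 2),
        "overall": (0, 0),  # assumption: "overall" stretches the foreground over the background
    }
    if position == "overall":
        fg = fg.resize(bg.size)
    x, y = anchors.get(position, (0, 0))
    bg.paste(fg, (x, y), fg)  # the foreground's alpha channel acts as the mask
    return bg

# composite("background.png", "logo.png", "right_bottom", scale=0.3).save("out.png")
```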

![layers](./assets/layers-workflow.svg)

![poster](./assets/poster-workflow.svg)
@@ -224,9 +227,16 @@ pip install llama-cpp-python \
#### TextImage

> [Download fonts](https://drxie.github.io/OSFCC/) and place them in `custom_nodes/comfyui-mixlab-nodes/assets/fonts`

#### MiniCPM-VQA Simple

This is the int4 quantized version of MiniCPM-V 2.6.
Running the int4 version uses less GPU memory (about 7 GB).

[Model](https://huggingface.co/openbmb/MiniCPM-V-2_6-int4)

![alt text](assets/1724308322276.png)
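For reference, here is a minimal sketch of querying the int4 model directly with transformers, mirroring the `model.chat` call used by the node added in this commit; `example.jpg` is a placeholder, and the exact generation arguments are assumptions.

```python
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model_id = "openbmb/MiniCPM-V-2_6-int4"
model = AutoModel.from_pretrained(
    model_id,
    trust_remote_code=True,
    attn_implementation="sdpa",
    torch_dtype=torch.float16,  # the node picks bfloat16 on GPUs that support it
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

image = Image.open("example.jpg").convert("RGB")
msgs = [{"role": "user", "content": [image, "Describe this image."]}]

with torch.no_grad():
    answer = model.chat(
        image=None,
        msgs=msgs,
        tokenizer=tokenizer,
        sampling=True,
        temperature=0.7,
    )
print(answer)
```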

### Style

@@ -269,11 +279,11 @@ Add edges to an image.

> LaMaInpainting (requires manual installation)
- The pillow dependency pulled in by simple-lama-inpainting causes a conflict, so it has temporarily been removed from the requirements; if simple-lama-inpainting is installed, the node is added automatically, otherwise it is not (a registration sketch follows this list).

from [simple-lama-inpainting](https://github.com/enesmsahin/simple-lama-inpainting)

- [Issue roundup](https://github.com/shadowcz007/comfyui-mixlab-nodes/issues/294)
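A minimal sketch of the conditional registration described above; the wrapper class and mapping names here are hypothetical, not the plugin's actual identifiers.

```python
NODE_CLASS_MAPPINGS = {}

try:
    from simple_lama_inpainting import SimpleLama  # only importable if the user installed it

    class LaMaInpaintingNode:  # hypothetical wrapper, for illustration only
        def inpaint(self, image, mask):
            return SimpleLama()(image, mask)

    NODE_CLASS_MAPPINGS["LaMaInpainting"] = LaMaInpaintingNode
except ImportError:
    # dependency missing: skip registration so the rest of the plugin still loads
    pass
```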

> rembgNode
11 changes: 11 additions & 0 deletions __init__.py
@@ -1397,5 +1397,16 @@ def mix_status(request):
except Exception as e:
    logging.info('TripoSR.available False')

try:
    # Import inside the try block so a missing optional dependency (e.g. decord)
    # does not prevent the rest of the plugin from loading
    from .nodes.MiniCPMNode import MiniCPM_VQA_Simple

    logging.info('MiniCPMNode.available')
    # logging.info( folder_paths.get_temp_directory())
    NODE_CLASS_MAPPINGS['MiniCPM_VQA_Simple'] = MiniCPM_VQA_Simple
    NODE_DISPLAY_NAME_MAPPINGS["MiniCPM_VQA_Simple"] = "MiniCPM VQA Simple"

except Exception as e:
    logging.info('MiniCPMNode.available False')


logging.info('\033[93m -------------- \033[0m')
Binary file added assets/1724308322276.png
127 changes: 127 additions & 0 deletions nodes/MiniCPMNode.py
@@ -0,0 +1,127 @@
# Referenced some code: https://github.com/IuvenisSapiens/ComfyUI_MiniCPM-V-2_6-int4

import os
import torch
import folder_paths
from transformers import AutoTokenizer, AutoModel
from torchvision.transforms.v2 import ToPILImage
from decord import VideoReader, cpu  # pip install decord
from PIL import Image


def get_model_path(n=""):
    try:
        return folder_paths.get_folder_paths(n)[0]
    except Exception:
        return os.path.join(folder_paths.models_dir, n)


class MiniCPM_VQA_Simple:
    def __init__(self):
        self.model_checkpoint = None
        self.tokenizer = None
        self.model = None
        self.device = (
            torch.device("cuda") if torch.cuda.is_available() else torch.device("cpu")
        )
        self.bf16_support = (
            torch.cuda.is_available()
            and torch.cuda.get_device_capability(self.device)[0] >= 8
        )

    @classmethod
    def INPUT_TYPES(s):
        return {
            "required": {
                "images": ("IMAGE",),
                "text": ("STRING", {"default": "", "multiline": True}),
                "seed": ("INT", {"default": -1}),  # -1 means no fixed seed
                "temperature": ("FLOAT", {"default": 0.7}),
                "keep_model_loaded": ("BOOLEAN", {"default": False}),
            },
        }

    RETURN_TYPES = ("STRING",)
    FUNCTION = "inference"
    CATEGORY = "♾️Mixlab/Image"

    def inference(
        self,
        images,
        text,
        seed,
        temperature,
        keep_model_loaded,
    ):
        if seed != -1:
            torch.manual_seed(seed)
        model_id = "openbmb/MiniCPM-V-2_6-int4"

        self.model_checkpoint = os.path.join(
            get_model_path("prompt_generator"), os.path.basename(model_id)
        )

        if not os.path.exists(self.model_checkpoint):
            from huggingface_hub import snapshot_download

            snapshot_download(
                repo_id=model_id,
                local_dir=self.model_checkpoint,
                local_dir_use_symlinks=False,
                endpoint="https://hf-mirror.com",
            )

        if self.tokenizer is None:
            self.tokenizer = AutoTokenizer.from_pretrained(
                self.model_checkpoint,
                trust_remote_code=True,
                low_cpu_mem_usage=True,
            )

        if self.model is None:
            self.model = AutoModel.from_pretrained(
                self.model_checkpoint,
                trust_remote_code=True,
                low_cpu_mem_usage=True,
                attn_implementation="sdpa",
                torch_dtype=torch.bfloat16 if self.bf16_support else torch.float16,
            )

        with torch.no_grad():
            # ComfyUI passes images as a BHWC float tensor; convert to a list of PIL images
            images = images.permute([0, 3, 1, 2])
            images = [ToPILImage()(img).convert("RGB") for img in images]
            msgs = [{"role": "user", "content": images + [text]}]

            params = {"use_image_id": False}

            result = self.model.chat(
                image=None,
                msgs=msgs,
                tokenizer=self.tokenizer,
                sampling=True,
                # top_k=top_k,
                # top_p=top_p,
                temperature=temperature,
                # repetition_penalty=repetition_penalty,
                # max_new_tokens=max_new_tokens,
                **params,
            )

            if not keep_model_loaded:
                # release the tokenizer and model so GPU memory can be reclaimed
                del self.tokenizer
                del self.model
                self.tokenizer = None
                self.model = None
                torch.cuda.empty_cache()
                torch.cuda.ipc_collect()

        return (result,)
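A minimal usage sketch of the node class above, assuming it runs inside a ComfyUI environment where `folder_paths` is importable and the weights can be downloaded; the import path and dummy tensor are illustrative only.

```python
import torch
from nodes.MiniCPMNode import MiniCPM_VQA_Simple  # path assumes the plugin's package layout

node = MiniCPM_VQA_Simple()
# ComfyUI IMAGE inputs are BHWC float tensors in [0, 1]; a single 512x512 dummy frame here.
images = torch.rand(1, 512, 512, 3)

(answer,) = node.inference(
    images=images,
    text="Describe this image.",
    seed=42,
    temperature=0.7,
    keep_model_loaded=False,
)
print(answer)
```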
8 changes: 7 additions & 1 deletion nodes/PromptNode.py
@@ -18,7 +18,13 @@
# req = request.Request("http://127.0.0.1:8188/prompt", data=data)
# request.urlopen(req)

def get_model_path(n=""):
    try:
        return folder_paths.get_folder_paths(n)[0]
    except Exception:
        return os.path.join(folder_paths.models_dir, n)

embeddings_path = get_model_path("embeddings")

def get_files_with_extension(directory, extension):

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,7 +1,7 @@
[project]
name = "comfyui-mixlab-nodes"
description = "3D, ScreenShareNode & FloatingVideoNode, SpeechRecognition & SpeechSynthesis, GPT, LoadImagesFromLocal, Layers, Other Nodes, ..."
version = "0.38.0"
version = "0.39.0"
license = "MIT"
dependencies = ["numpy", "pyOpenSSL", "watchdog", "opencv-python-headless", "matplotlib", "openai", "simple-lama-inpainting", "clip-interrogator==0.6.0", "transformers>=4.36.0", "lark-parser", "imageio-ffmpeg", "rembg[gpu]", "omegaconf==2.3.0", "Pillow>=9.5.0", "einops==0.7.0", "trimesh>=4.0.5", "huggingface-hub", "scikit-image"]

6 changes: 5 additions & 1 deletion requirements.txt
@@ -18,4 +18,8 @@ huggingface-hub
scikit-image
torchaudio
soundfile>=0.12.1
json-repair

decord
bitsandbytes
accelerate
2 changes: 1 addition & 1 deletion web/javascript/checkVersion_mixlab.js
@@ -3,7 +3,7 @@ import { app } from '../../../scripts/app.js'
const repoOwner = 'shadowcz007' // replace with the repository owner
const repoName = 'comfyui-mixlab-nodes' // replace with the repository name

const version = 'v0.39.0' // bumped from v0.37.0

fetch(`https://api.github.com/repos/${repoOwner}/${repoName}/releases/latest`)
  .then(response => response.json())
19 changes: 7 additions & 12 deletions web/javascript/image_mixlab.js
@@ -4,7 +4,7 @@ import { api } from '../../../scripts/api.js'
import { $el } from '../../../scripts/ui.js'
import { applyTextReplacements } from '../../../scripts/utils.js'

import { loadExternalScript, get_position_style } from './common.js'

function loadImageToCanvas (base64Image) {
var img = new Image()
@@ -349,9 +349,8 @@ app.registerExtension({
      draw (ctx, node, widget_width, y, widget_height) {
        Object.assign(
          this.div.style,
          get_position_style(ctx, widget_width, 44, node.size[1], 36)
        )
      }
    }

@@ -536,7 +535,7 @@ app.registerExtension({
      draw (ctx, node, widget_width, y, widget_height) {
        Object.assign(
          this.div.style,
          get_position_style(ctx, widget_width, y, node.size[1], 36)
        )
      }
    }
@@ -736,7 +735,7 @@ app.registerExtension({
      draw (ctx, node, widget_width, y, widget_height) {
        Object.assign(
          this.div.style,
          get_position_style(ctx, widget_width, 44, node.size[1], 44)
        )
      },
      serialize: false
@@ -919,14 +918,10 @@ app.registerExtension({
      type: 'div',
      name: 'preview',
      draw (ctx, node, widget_width, y, widget_height) {
        let s = get_position_style(ctx, widget_width, 44, node.size[1], 36)
        delete s.height

        Object.assign(this.div.style, s)
      },
      serialize: false
    }
