fix

zhongpei · Mar 25, 2024 · d408a3c
1 parent 0e0b1bd
commit d408a3c
Showing 7 changed files with 1,023 additions and 26 deletions.
diff --git a/README.md b/README.md
@@ -13,6 +13,9 @@
 ### Prompt Combination
 ![image](workflows/prompt_cond.png)
 
+### Reward Images (Aesthetic Evaluation)
+![image](workflows/reward_images.png)
+
 ### Introduction
 
 
@@ -21,6 +24,7 @@
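 - The `Text2GPTPrompt` node is designed to create efficient prompts that integrate keywords generated by the `moondream` series models and `wd-swinv2-tagger-v3`. It is customized for 7b-level models, including `qwen1.5-7b` and `deepseek-ai/deepseek-vl-7b-chat`.
 - Using the `hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` model, we can make full use of Qwen's potential, especially when generating various kinds of prompts, including classical poetry. The model was fine-tuned (SFT) on 35,000 task-specific examples; it is cost-effective and also runs reasonably fast on CPUs.
 - Generating Conditioning through Prompt combination (code modified from https://github.com/Extraltodeus/Vector_Sculptor_ComfyUI)
+- Reward Images (aesthetic evaluation) (https://github.com/THUDM/ImageReward)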
 
 
 
@@ -39,21 +43,22 @@ git clone https://github.com/zhongpei/Comfyui-image2prompt
 ### Step 2: Download the Models
 
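 The models are downloaded automatically on first run. If the automatic download fails, you need to download the required models manually for the plugin to work. The plugin uses the `vikhyatk/moondream1`, `vikhyatk/moondream2`, `unum-cloud/uform-gen2-qwen-500m`, and `internlm/internlm-xcomposer2-vl-7b` models from Hugging Face.
-Make sure these models are downloaded into the plugin's `custom_nodes/Comfyui_image2prompt/model` directory. Use the following links to download them:
+Make sure these models are downloaded into the plugin's `ComfyUI/models/image2text` directory. Use the following links to download them: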
 
 * [Download the moondream1 model](https://huggingface.co/vikhyatk/moondream1)
 * [Download the moondream2 model](https://huggingface.co/vikhyatk/moondream2)
 * [Download the internlm-xcomposer2-vl-7b model](https://huggingface.co/internlm/internlm-xcomposer2-vl-7b)
 * [Download the uform-gen2-qwen-500m model](https://huggingface.co/unum-cloud/uform-gen2-qwen-500m)
 * [Download Qwen-1_8B-Stable-Diffusion-Prompt](https://huggingface.co/hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt)
-* [deepseek-ai/deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat)
+* [Download deepseek-ai/deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat)
+* [Download deepseek-ai/deepseek-vl-7b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)
 In addition, if you prefer to download from a mirror site, you can set the Hugging Face endpoint to the mirror URL. Run the following commands in a terminal to use the mirror:
 
 ```bash
 export HF_ENDPOINT=https://hf-mirror.com
-huggingface-cli download --resume-download vikhyatk/moondream1 --local-dir custom_nodes/Comfyui-image2prompt/model/moondream1
-huggingface-cli download --resume-download internlm/internlm-xcomposer2-vl-7b --local-dir custom_nodes/Comfyui-image2prompt/model/internlm-xcomposer2-vl-7b
-huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --local-dir custom_nodes/Comfyui-image2prompt/model/uform-gen2-qwen-500m
+huggingface-cli download --resume-download vikhyatk/moondream1 --local-dir ComfyUI/models/image2text/moondream1
+huggingface-cli download --resume-download internlm/internlm-xcomposer2-vl-7b --local-dir ComfyUI/models/image2text/internlm-xcomposer2-vl-7b
+huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --local-dir ComfyUI/models/image2text/uform-gen2-qwen-500m
 ```
 
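 By following these steps, you ensure that the plugin can access the required models, so it can accurately convert images into prompts and enhance your ComfyUI experience.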
@@ -66,6 +71,7 @@ huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --loc
 - The `Text2GPTPrompt` node is designed to create efficient prompts that can integrate keywords generated by the `moondream` series models and `wd-swinv2-tagger-v3`, customized for models at the 7b level, including `qwen1.5-7b`.
 - Utilizing the `hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` model, we can fully leverage the potential of Qwen, especially in generating various forms of prompts, including classical poetry. This model, fine-tuned with 35,000 pieces of data for specific tasks (SFT), not only offers a high cost-performance ratio but also runs at a considerable speed on CPUs.
 - Generating Conditioning through Prompt Combination （code modify from https://github.com/Extraltodeus/Vector_Sculptor_ComfyUI）
+- Reward Images (https://github.com/THUDM/ImageReward)
 
 ### Step 1: Install the Plugin
 
@@ -79,22 +85,23 @@ This step is crucial for enabling the plugin within the ComfyUI environment, fac
 
 ### Step 2: Download the Model
 
-The model will be automatically downloaded the first time it is run. If it does not download normally, for the plugin to function properly, you need to download the necessary models. This plugin utilizes the `vikhyatk/moondream1` `vikhyatk/moondream2` `unum-cloud/uform-gen2-qwen-500m` and `internlm/internlm-xcomposer2-vl-7b` models from Hugging Face. Make sure to download these models into the plugin's `custom_nodes/Comfyui_image2prompt/model` directories, respectively. Use the following links for downloading:
+The model will be automatically downloaded the first time it is run. If it does not download normally, for the plugin to function properly, you need to download the necessary models. This plugin utilizes the `vikhyatk/moondream1` `vikhyatk/moondream2` `unum-cloud/uform-gen2-qwen-500m` and `internlm/internlm-xcomposer2-vl-7b` models from Hugging Face. Make sure to download these models into the plugin's `ComfyUI/models/image2text` directories, respectively. Use the following links for downloading:
 
 * [Download moondream1 Model](https://huggingface.co/vikhyatk/moondream1)
 * [Download moondream2 Model](https://huggingface.co/vikhyatk/moondream2)
 * [Download internlm-xcomposer2-vl-7b Model](https://huggingface.co/internlm/internlm-xcomposer2-vl-7b)
 * [Download uform-gen2-qwen-500m Model](https://huggingface.co/unum-cloud/uform-gen2-qwen-500m)
 * [Download Qwen-1_8B-Stable-Diffusion-Prompt](https://huggingface.co/hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt)
 * [Download deepseek-vl-1.3b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-1.3b-chat)
+* [Download deepseek-ai/deepseek-vl-7b-chat](https://huggingface.co/deepseek-ai/deepseek-vl-7b-chat)
 Additionally, if you prefer using a Chinese mirror site for downloading, you can set the Hugging Face endpoint to a mirror URL. Execute the following commands in your terminal to utilize the mirror:
 
 ```bash
 
-huggingface-cli download --resume-download vikhyatk/moondream1 --local-dir custom_nodes/Comfyui-image2prompt/model/moondream1
-huggingface-cli download --resume-download vikhyatk/moondream2 --local-dir custom_nodes/Comfyui-image2prompt/model/moondream2
-huggingface-cli download --resume-download internlm/internlm-xcomposer2-vl-7b --local-dir custom_nodes/Comfyui-image2prompt/model/internlm-xcomposer2-vl-7b
-huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --local-dir custom_nodes/Comfyui-image2prompt/model/uform-gen2-qwen-500m
+huggingface-cli download --resume-download vikhyatk/moondream1 --local-dir ComfyUI/models/image2text/moondream1
+huggingface-cli download --resume-download vikhyatk/moondream2 --local-dir ComfyUI/models/image2text/moondream2
+huggingface-cli download --resume-download internlm/internlm-xcomposer2-vl-7b --local-dir ComfyUI/models/image2text/model/internlm-xcomposer2-vl-7b
+huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --local-dir ComfyUI/models/image2text/model/uform-gen2-qwen-500m
 ```
 
 By completing these steps, you'll ensure that the plugin has access to the necessary models, enabling it to accurately convert images into prompts, thereby enhancing your ComfyUI experience.
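
The download commands in the README hunks above can also be scripted. The following is a minimal sketch, not part of this commit, assuming the `huggingface_hub` package is installed and that each model lives under `ComfyUI/models/image2text/<model name>`, as the updated README text states. The helper names `local_dir_for` and `download_all` are hypothetical. Two of the English-section commands add an extra `model/` segment to the target path; this sketch assumes the flat layout used by the other commands.

```python
from pathlib import PurePosixPath

# Repositories named in the README; the root follows the updated README text.
MODEL_REPOS = [
    "vikhyatk/moondream1",
    "vikhyatk/moondream2",
    "internlm/internlm-xcomposer2-vl-7b",
    "unum-cloud/uform-gen2-qwen-500m",
]
MODEL_ROOT = PurePosixPath("ComfyUI/models/image2text")


def local_dir_for(repo_id: str, root: PurePosixPath = MODEL_ROOT) -> str:
    """Map 'owner/name' to '<root>/name' (hypothetical helper)."""
    return str(root / repo_id.split("/")[-1])


def download_all(repos=MODEL_REPOS):
    """Download a snapshot of each repository into its local directory."""
    # Imported lazily so the path helper works without the package installed.
    from huggingface_hub import snapshot_download

    for repo_id in repos:
        snapshot_download(repo_id=repo_id, local_dir=local_dir_for(repo_id))
```

Calling `download_all()` from the directory that contains `ComfyUI/` performs the downloads. If `HF_ENDPOINT` is exported beforehand, as in the README, `huggingface_hub` should pick up the mirror from the environment.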
diff --git a/src/conditioning.py b/src/conditioning.py
@@ -201,15 +201,15 @@ def INPUT_TYPES(s):
                 "merge_conditioning_strength": ("FLOAT", {"default": 0.5, "min": 0, "max": 1, "step": 0.01}),
                 "merge_conditioning_strength_custom": ("STRING", {"multiline": True} ),
                 "sculptor_intensity": ("FLOAT", {"default": 1, "min": 0, "max": 10, "step": 0.1}),
-                "sculptor_method" : (["forward","backward","maximum_absolute"],),
+                "sculptor_method" : (["forward","backward","maximum_absolute"],{"default":"backward"}),
                 "token_normalization": (["none", "mean", "set at 1", "default * attention", "mean * attention", "set at attention", "mean of all tokens"],),
 
             }
         }
 
     FUNCTION = "exec"
     RETURN_TYPES = ("CONDITIONING",)
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/conditioning"
 
     def exec(
             self, 
@@ -286,7 +286,7 @@ def INPUT_TYPES(s):
     RETURN_TYPES = ("CONDITIONING",)
     FUNCTION = "encode"
 
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/conditioning"
 
     def encode(self, clip, text, token_normalization, weight_interpretation, affect_pooled='disable'):
         embeddings_final, pooled = advanced_encode(clip, text, token_normalization, weight_interpretation, w_max=1.0,
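
The two conventions changed in this file, a default value on a dropdown input and a `/`-separated `CATEGORY`, can be seen in isolation in the sketch below. `ExampleSculptorNode` is a hypothetical class written only for illustration; it is not one of the plugin's nodes, and its `exec` body is a placeholder. The sketch assumes ComfyUI's usual behavior of treating the second tuple element as the options for an input and of nesting menu entries on `/`.

```python
class ExampleSculptorNode:
    """Hypothetical node; mirrors only the conventions visible in the diff."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                # Dropdown input: list of choices, then an options dict
                # that carries the default selection.
                "sculptor_method": (
                    ["forward", "backward", "maximum_absolute"],
                    {"default": "backward"},
                ),
                "sculptor_intensity": (
                    "FLOAT",
                    {"default": 1, "min": 0, "max": 10, "step": 0.1},
                ),
            }
        }

    FUNCTION = "exec"
    RETURN_TYPES = ("STRING",)
    # The "/" is expected to place the node in a "conditioning" submenu.
    CATEGORY = "fofo🐼/conditioning"

    def exec(self, sculptor_method, sculptor_intensity):
        # Placeholder body: report which settings were chosen.
        return (f"{sculptor_method}:{sculptor_intensity}",)


# ComfyUI discovers nodes through a mapping like this one.
NODE_CLASS_MAPPINGS = {"ExampleSculptorNode": ExampleSculptorNode}
```

Without the options dict, the first entry in the list (`forward`) would presumably be preselected, which is why the commit adds an explicit default.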

diff --git a/src/image2text.py b/src/image2text.py
@@ -34,7 +34,7 @@ def INPUT_TYPES(cls):
 
     RETURN_TYPES = ("IMAGE2TEXT_MODEL",)
     FUNCTION = "get_model"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/image2prompt"
 
     def get_model(self, model, device, low_memory):       
 
@@ -81,7 +81,7 @@ def INPUT_TYPES(cls):
     OUTPUT_IS_LIST = (True,)
     RETURN_TYPES = ("STRING",)
     FUNCTION = "get_value"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/image2prompt"
 
     def get_value(self, model, image, query, custom_query, print_log):
         # Ensure custom queries are prioritized
@@ -133,7 +133,7 @@ def INPUT_TYPES(cls):
     RETURN_TYPES = ("STRING", "STRING", "STRING")
     RETURN_NAMES = ('FULL PROMPT', "PROMPT", "TAGS")
     FUNCTION = "get_value"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/image2prompt"
 
     def get_value(self, model, image, query, custom_query, print_log, score,remove_1girl):
         global GLOBAL_WdV3Model

diff --git a/src/reward.py b/src/reward.py
@@ -19,17 +19,30 @@
 class ImageBatchToList:
     @classmethod
     def INPUT_TYPES(s):
-        return {"required": {"image_batch": ("IMAGE",), }}
+        return {
+            "required": {"image_batch1": ("IMAGE",), },
+            "optional":{
+                "image_batch2": ("IMAGE",), 
+                "image_batch3": ("IMAGE",), 
+                "image_batch4": ("IMAGE",), 
+            }
+        }
 
     RETURN_TYPES = ("IMAGE",)
     RETURN_NAMES = ("IMAGES",)
     OUTPUT_IS_LIST = (True,)
     FUNCTION = "run"
 
-    CATEGORY = "fofo/Image"
+    CATEGORY = "fofo🐼/image"
 
-    def run(self, image_batch):
-        images = [image_batch[i:i + 1, ...] for i in range(image_batch.shape[0])]
+    def run(self, image_batch1,image_batch2=None,image_batch3=None,image_batch4=None):
+        images = [image_batch1[i:i + 1, ...] for i in range(image_batch1.shape[0])]
+        if image_batch2 is not None:
+            images += [image_batch2[i:i + 1, ...] for i in range(image_batch2.shape[0])]
+        if image_batch3 is not None:
+            images += [image_batch3[i:i + 1, ...] for i in range(image_batch3.shape[0])]
+        if image_batch4 is not None:
+            images += [image_batch4[i:i + 1, ...] for i in range(image_batch4.shape[0])]
         return (images, )
 
 class LoadImageRewardScoreModel:
@@ -45,7 +58,7 @@ def INPUT_TYPES(s):
     RETURN_TYPES = ("IMAGEREWARD_MODEL",)
     RETURN_NAMES = ("IMAGEREWARD_MODEL",)
     OUTPUT_NODE = True
-    CATEGORY = "fofo"
+    CATEGORY = "fofo🐼/image"
     FUNCTION = "load_model"
 
     def load_model(self, device):       
@@ -85,7 +98,7 @@ def INPUT_TYPES(s):
     RETURN_TYPES = ("IMAGE","STRING","INT")
     RETURN_NAMES = ("IMAGES","SCORES_STR","SCORES_INT")
     OUTPUT_NODE = True
-    CATEGORY = "fofo"
+    CATEGORY = "fofo🐼/image"
     FUNCTION = "reward"
 
     INPUT_IS_LIST = True
@@ -98,7 +111,7 @@ def reward(self, model, prompt, top_k , images):
         model = model[0]
         top_k = top_k[0]
         prompt = prompt[0]
-        
+
         scores = []
         print(f"image count: {len(images)}")
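
The repeated `if image_batchN is not None` branches in `ImageBatchToList.run` can be expressed as one loop. The sketch below shows that, followed by a top-k selection of the kind the `reward` node's `top_k` input suggests. It is an illustration under assumptions, not the commit's code: NumPy arrays stand in for ComfyUI `IMAGE` tensors, `batches_to_list` and `top_k_by_score` are hypothetical names, and the scorer is a brightness placeholder, because the actual scoring code is not shown in the diff.

```python
import numpy as np


def batches_to_list(*batches):
    """Split each non-None batch of shape [B, ...] into B single-image batches."""
    images = []
    for batch in batches:
        if batch is None:
            continue  # optional inputs that were not connected
        images += [batch[i:i + 1, ...] for i in range(batch.shape[0])]
    return images


def top_k_by_score(images, score_fn, top_k):
    """Return the top_k (image, score) pairs, highest score first."""
    scored = [(img, float(score_fn(img))) for img in images]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Stand-in data: NumPy arrays in place of ComfyUI IMAGE tensors.
batch1 = np.zeros((2, 4, 4, 3), dtype=np.float32)
batch2 = np.ones((3, 4, 4, 3), dtype=np.float32)
images = batches_to_list(batch1, batch2, None, None)

# Placeholder scorer (mean brightness), used only to make the sketch runnable.
best = top_k_by_score(images, lambda img: img.mean(), top_k=2)
```

Each list element keeps its leading batch dimension of 1, matching the `image_batch1[i:i + 1, ...]` slicing in the diff, so downstream nodes that expect a batch still receive one.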
 

diff --git a/src/text2prompt.py b/src/text2prompt.py
@@ -22,7 +22,7 @@ def INPUT_TYPES(cls):
 
     RETURN_TYPES = ("TEXT2PROMPT_MODEL",)
     FUNCTION = "get_model"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/prompt"
 
     def get_model(self, model, device, low_memory):     
 
@@ -84,7 +84,7 @@ def INPUT_TYPES(cls):
 
     RETURN_TYPES = ("STRING",)
     FUNCTION = "generate_text"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/prompt"
 
     def generate_text(
             self,
@@ -139,7 +139,7 @@ def INPUT_TYPES(cls):
 
     RETURN_TYPES = ("STRING",)
     FUNCTION = "generate_text"
-    CATEGORY = "fofo🐼"
+    CATEGORY = "fofo🐼/prompt"
 
     def generate_text(self, prompt,text1,text2,text1_perfix,text2_perfix,print_output):
         text1 = f"{text1_perfix}{text1}"