fix readme
zhongpei committed Mar 11, 2024
1 parent dfeeb97 commit 48453ab
Showing 1 changed file with 13 additions and 4 deletions.
17 changes: 13 additions & 4 deletions README.md
@@ -10,8 +10,15 @@

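### Introduction

Models such as moondream are used to describe the main content of an image, and the `wd-swinv2-tagger-v3` model is used to improve the accuracy of character descriptions.
`hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` is used to take full advantage of Qwen's capabilities, supporting prompt generation in many forms, including classical Chinese poetry. This model was fine-tuned for the task (SFT) on 35k samples.

- We adopted the `wd-swinv2-tagger-v3` model, which significantly improves the accuracy of character-trait descriptions and is especially suited to scenes that need finely detailed depictions of characters.
- For scene description, the `moondream1` model provides rich detail but can sometimes be verbose and inaccurate. By contrast, the `moondream2` model stands out for its concise, precise scene descriptions. When using the `Image2TextWithTags` node, we therefore recommend combining `moondream1` and `wd-swinv2-tagger-v3` for scene-centered text generation, and pairing `wd-swinv2-tagger-v3` with `moondream2` for content that focuses on character description.
- The `Text2GPTPrompt` node is designed to build efficient prompts that merge the keywords produced by the `moondream` series models and `wd-swinv2-tagger-v3`. It is tailored to 7b-class models, including `qwen1.5-7b`.
- With the `hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` model, we can make full use of Qwen's potential, particularly when generating prompts in varied forms, including classical Chinese poetry. The model was fine-tuned for the task (SFT) on 35,000 samples; it is cost-effective and also runs reasonably fast on CPU.



### Step 1: Install the Plugin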

@@ -48,8 +55,10 @@ huggingface-cli download --resume-download unum-cloud/uform-gen2-qwen-500m --loc



Models such as `moondream` are used to describe the main content of an image, and the `wd-swinv2-tagger-v3` model is used to improve the accuracy of character descriptions.
`hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` is used to take full advantage of Qwen's capabilities and supports prompt generation in many forms, including classical poetry. This model was fine-tuned for the task (SFT) on 35k samples.
- We adopted the `wd-swinv2-tagger-v3` model, which significantly improves the accuracy of character-trait descriptions and is particularly suitable for scenarios that require detailed depictions of characters.
- For scene description, the `moondream1` model offers rich detail but can sometimes be verbose and imprecise. In contrast, the `moondream2` model stands out for its concise and accurate scene descriptions. Therefore, when using the `Image2TextWithTags` node, we recommend combining `moondream1` and `wd-swinv2-tagger-v3` for scene-centered text generation, and pairing `wd-swinv2-tagger-v3` with `moondream2` for content that focuses on character description.
- The `Text2GPTPrompt` node is designed to build efficient prompts that merge the keywords generated by the `moondream` series models and `wd-swinv2-tagger-v3`. It is tailored to 7b-class models, including `qwen1.5-7b`.
- With the `hahahafofo/Qwen-1_8B-Stable-Diffusion-Prompt` model, we can make full use of Qwen's potential, especially when generating prompts in varied forms, including classical poetry. The model was fine-tuned for the task (SFT) on 35,000 samples; it is cost-effective and also runs reasonably fast on CPUs.
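
The pairing advice and the keyword-merging idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ImageDescription`, `recommend_captioner`, and `merge_caption_and_tags` are hypothetical names invented here, and the plugin's actual node code is not shown in this commit, so it may work quite differently.

```python
from dataclasses import dataclass
from typing import List


# Hypothetical container; not a type from the plugin itself.
@dataclass
class ImageDescription:
    caption: str      # free-text description, e.g. from a moondream model
    tags: List[str]   # tags, e.g. from wd-swinv2-tagger-v3


def recommend_captioner(focus: str) -> str:
    """Return the captioning model the README pairs with wd-swinv2-tagger-v3."""
    pairing = {"scene": "moondream1", "character": "moondream2"}
    if focus not in pairing:
        raise ValueError(f"focus must be one of {sorted(pairing)}, got {focus!r}")
    return pairing[focus]


def merge_caption_and_tags(desc: ImageDescription) -> str:
    """Join a caption and its de-duplicated tags into one comma-separated string."""
    seen = set()
    unique_tags = []
    for tag in desc.tags:
        key = tag.strip().lower()
        # Skip empty tags and case-insensitive repeats, keeping first-seen order.
        if key and key not in seen:
            seen.add(key)
            unique_tags.append(tag.strip())
    # Drop the caption's trailing period so it reads as one keyword phrase.
    parts = [desc.caption.strip().rstrip(".")] + unique_tags
    return ", ".join(p for p in parts if p)
```

For example, `merge_caption_and_tags(ImageDescription("A girl reading under a tree.", ["1girl", "book", "tree"]))` yields a single comma-separated string that could then be handed to a prompt-writing model. How `Text2GPTPrompt` really combines its inputs is an assumption here.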

### Step 1: Install the Plugin

