We use the BEATs model to acquire tags and an LLM to expand them into captions.
- Download the pretrained BEATs weights from BEATs
- Adapt the BEATs model into a classifier, then generate tags
accelerate config
accelerate launch --multi_gpu classfier.py
python generate_tag.py
- Use an LLM (such as GPT-4 or DeepSeek) to expand the tags into captions
python gpt/tag_caption.py
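The tag-to-caption step can be pictured as building a text prompt from a track's tags and sending it to the LLM. The sketch below covers only the prompt construction. The function name `build_caption_prompt`, the `max_words` parameter, and the prompt wording are all assumptions made for illustration; they are not taken from `gpt/tag_caption.py`, whose actual prompt may differ.

```python
def build_caption_prompt(tags, max_words=30):
    """Build an LLM prompt from a list of tags (hypothetical helper).

    Tags are stripped, lower-cased, and de-duplicated while keeping
    their original order.
    """
    cleaned = []
    for tag in tags:
        tag = tag.strip().lower()
        if tag and tag not in cleaned:
            cleaned.append(tag)
    if not cleaned:
        raise ValueError("at least one non-empty tag is required")
    # Assumed prompt wording; adjust to the LLM and caption style you need.
    return (
        f"Write one fluent English caption of at most {max_words} words "
        "describing a music track with these tags: "
        + ", ".join(cleaned)
        + ". Use only the given tags."
    )


if __name__ == "__main__":
    print(build_caption_prompt(["Rock", "guitar ", "rock", "energetic"]))
```

Normalising and de-duplicating the tags before building the prompt keeps it short and stops the LLM from over-weighting a tag that appears more than once.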
find /path -type f > output.txt
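If `find` is not available (for example on Windows), the same file list can be produced from Python. This is a minimal sketch using only the standard library. The helper names `list_files` and `write_file_list` are hypothetical, and unlike `find` the result is sorted, so the line order may differ.

```python
from pathlib import Path


def list_files(root):
    """Return sorted paths of all regular files under root, recursively.

    Symlinks are skipped to mirror `find -type f`.
    """
    return sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and not p.is_symlink()
    )


def write_file_list(root, output="output.txt"):
    """Write one file path per line to output and return the list."""
    files = list_files(root)
    text = "\n".join(files) + ("\n" if files else "")
    Path(output).write_text(text, encoding="utf-8")
    return files
```

For example, `write_file_list("/path")` writes `output.txt` in the current directory, matching the command above.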
Download the MTG dataset. You can download mtg-jamendo-dataset and use its raw_30s subset, which contains 55,701 tracks. https://huggingface.co/datasets/wanghappy/Music-tag-generation/
This project is licensed under the MIT License.