Stars
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]
AIDC-AI / AutoGPTQ
Forked from AutoGPTQ/AutoGPTQ. An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
Alibaba LangEngine is an AI application development framework written in Java.
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 100+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…
[ECCV 2024] Official Implementation of An Incremental Unified Framework for Small Defect Inspection
A programmer's guide to cooking at home (Simplified Chinese only).
ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
When do we not need larger vision models?
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"
Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Official repo of "Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs"
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
🎉 PILOT: A Pre-trained Model-Based Continual Learning Toolbox
LibAUC: A Deep Learning Library for X-Risk Optimization