Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.

Python 28 2 Updated Jan 17, 2025

The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]

Python 13 1 Updated Dec 28, 2024

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 2 Updated Nov 4, 2024
Python 3 Updated Nov 14, 2024

Alibaba LangEngine is an AI application development framework written in Java.

Java 163 14 Updated Dec 9, 2024

[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding

Python 573 42 Updated Jan 11, 2025

Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors

Python 215 12 Updated May 11, 2024
Python 165 10 Updated Dec 17, 2024

Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, InternLM3, GLM4, Llama3.3, Mistral, Yi1.5, Baichuan2, DeepSeek3, ...) and 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…

Python 5,100 444 Updated Jan 20, 2025

[ECCV 2024] Official Implementation of An Incremental Unified Framework for Small Defect Inspection

Python 35 Updated Nov 16, 2024

Programmer's guide to cooking at home (Simplified Chinese only).

Dockerfile 68,686 8,805 Updated Jan 18, 2025

ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Python 98 3 Updated Jul 18, 2024

🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.

Python 35 1 Updated Aug 21, 2024

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.

Python 6,837 524 Updated Dec 25, 2024
Python 6 Updated Aug 16, 2024

MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.

Python 79 7 Updated Oct 10, 2024

Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models

Python 76 8 Updated Jun 28, 2024

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

Python 93 11 Updated Dec 25, 2024

When do we not need larger vision models?

Python 355 11 Updated Dec 4, 2024

Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"

Python 128 30 Updated Nov 28, 2024
Python 9 Updated May 9, 2023

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Python 1,696 242 Updated Jan 20, 2025
Python 65 6 Updated Dec 6, 2024

DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling

Python 29 2 Updated Jul 12, 2024

Official repo of "Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs"

27 Updated Oct 9, 2024

(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions

Python 251 28 Updated Apr 14, 2024
Jupyter Notebook 32 3 Updated Jan 8, 2025

🎉 PILOT: A Pre-trained Model-Based Continual Learning Toolbox

Python 341 37 Updated Dec 19, 2024

GitHub Actions: complete the daily health report check-in, so easy

66 112 Updated Mar 24, 2022

Code implementation for the book 《推荐系统实践》 (Recommender Systems in Practice)

Jupyter Notebook 707 222 Updated Mar 11, 2019