Stars
A Keras implementation of YOLOv3 (Tensorflow backend)
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Statistical Learning Methods (2nd edition) [Li Hang] [notes, code, notebook, references, errata, lihang]
Use PEFT or Full-parameter to finetune 400+ LLMs (Qwen2.5, Llama3.2, GLM4, Internlm2.5, Yi1.5, Mistral, Baichuan2, DeepSeek, ...) or 150+ MLLMs (Qwen2-VL, Qwen2-Audio, Llama3.2-Vision, Llava, Inter…
Download lossless FLAC music from the Internet to your local machine according to your NeteaseCloudMusic playlist.
Open-source evaluation toolkit of large vision-language models (LVLMs), support 160+ VLMs, 50+ benchmarks
(WARNING: This repository is NO LONGER maintained) Real-time face detection and recognition based on opencv/tensorflow/mtcnn/facenet
Python implementation of Empirical Mode Decomposition (EMD) method
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Using a U-Net for image segmentation, blending predicted patches smoothly is a must to please the human eye.
When do we not need larger vision models?
🎉 PILOT: A Pre-trained Model-Based Continual Learning Toolbox
LibAUC: A Deep Learning Library for X-Risk Optimization
(AAAI 2024) BLIVA: A Simple Multimodal LLM for Better Handling of Text-rich Visual Questions
Steer LLM outputs towards a certain topic/subject and enhance response capabilities using activation engineering by adding steering vectors
Data and Code for ACL 2021 Paper "Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning"
ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
MATH-Vision dataset and code to measure Multimodal Mathematical Reasoning capabilities.
Code for Math-LLaVA: Bootstrapping Mathematical Reasoning for Multimodal Large Language Models
mattzheng / keras-yolo3-improved
Forked from qqwweee/keras-yolo3. A Keras implementation of YOLOv3 (Tensorflow backend). The simplest YOLOv3 training process.
🎉 The code repository for "Parrot: Multilingual Visual Instruction Tuning" in PyTorch.
[ECCV 2024] Official Implementation of An Incremental Unified Framework for Small Defect Inspection
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling
Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.
The code repository for "Wings: Learning Multimodal LLMs without Text-only Forgetting" [NeurIPS 2024]