Stars
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Framework agnostic sliced/tiled inference + interactive UI + error analysis plots
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Label Studio is a multi-type data labeling and annotation tool with standardized output format
FastVideo is a lightweight framework for accelerating large video diffusion models.
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
Fast and memory-efficient exact attention
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
✨✨Latest Advances on Multimodal Large Language Models
Official Code for Stable Cascade
MiniCPM-o 2.6: A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone
MiniCPM3-4B: An edge-side LLM that surpasses GPT-3.5-Turbo.
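Chinese LLM capability evaluation leaderboard: currently covers 164 large models, including commercial models such as chatgpt, gpt-4o, Google gemini, Claude3.5, Baidu ERNIE Bot, Qwen, Baichuan, iFlytek Spark, SenseTime senseChat, and minimax, as well as open-source models such as deepseek-v3, qwen2.5, llama3.3, phi-4, glm4, and internLM2.5. It provides not only a ranking of capability scores but also the raw outputs of all models!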
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Drag & drop UI to build your customized LLM flow
A community-maintained Python framework for creating mathematical animations.
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 275+ supported cars.
Autonomous coding agent right in your IDE, capable of creating/editing files, executing commands, using the browser, and more with your permission every step of the way.
Tracking and collecting papers/projects/others related to Segment Anything.
[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
LAVIS - A One-stop Library for Language-Vision Intelligence
[NeurIPS 2023] Official implementation of the paper "Segment Everything Everywhere All at Once"
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"