🎯
Focusing

Highlights

  • Pro

Organizations

@HRNet @Atten4Vis

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Python 807 72 Updated Jan 16, 2025

This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.

Python 1,145 54 Updated Nov 22, 2024

Janus-Series: Unified Multimodal Understanding and Generation Models

Python 1,328 69 Updated Nov 13, 2024

Open-source evaluation toolkit for large vision-language models (LVLMs), supporting 160+ VLMs and 50+ benchmarks

Python 1,691 237 Updated Jan 18, 2025

This repo contains the code for 1D tokenizer and generator

Jupyter Notebook 650 34 Updated Jan 18, 2025

LLaVA-UHD v2: an MLLM Integrating High-Resolution Feature Pyramid via Hierarchical Window Transformer

Python 350 15 Updated Jan 14, 2025

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

4,037 199 Updated Sep 25, 2024

Accelerating the development of large multimodal models (LMMs) with one-click evaluation module - lmms-eval.

Python 2,010 189 Updated Jan 17, 2025

Schedule-Free Optimization in PyTorch

Python 2,062 71 Updated Dec 2, 2024

Annotated version of the Mamba paper

Jupyter Notebook 469 18 Updated Feb 27, 2024

VideoSys: An easy and efficient system for video generation

Python 1,881 128 Updated Jan 1, 2025

A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.

3,809 215 Updated Jan 18, 2025

Finetune Llama 3.3, Mistral, Phi-4, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 20,817 1,468 Updated Jan 17, 2025

Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference

Python 4,427 232 Updated Jun 14, 2024

[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.

Python 1,768 125 Updated Aug 20, 2024

Official repo for VGen: a holistic video generation ecosystem for video generation building on diffusion models

Python 3,030 270 Updated Jan 10, 2025

OMG-LLaVA and OMG-Seg codebase [CVPR-24 and NeurIPS-24]

Python 1,212 50 Updated Dec 11, 2024

【CVPR 2024 Highlight】Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models

Python 1,691 134 Updated Dec 30, 2024

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 37,514 4,592 Updated Jan 18, 2025

Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"

Jupyter Notebook 939 43 Updated Aug 12, 2024

Recent LLM-based CV and related works. Welcome to comment/contribute!

851 36 Updated Jun 5, 2024

Official code for the NeurIPS 2023 paper "Switching Temporary Teachers for Semi-Supervised Semantic Segmentation"

Python 44 3 Updated Nov 16, 2023

✨✨Latest Advances on Multimodal Large Language Models

13,571 867 Updated Jan 17, 2025

The Startup CTO's Handbook, a book covering leadership, management and technical topics for leaders of software engineering teams

10,305 500 Updated May 5, 2024

[ICCV 2023] Universal Video Segmentation for VSS, VPS and VIS

Python 111 3 Updated Mar 18, 2024

[ICCV2023] DETR Doesn’t Need Multi-Scale or Locality Design

Python 193 4 Updated Nov 14, 2023

[ICCV 2023] Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment

43 3 Updated Jul 20, 2023

Official code for the TMLR paper "Understanding Self-Supervised Pretraining with Part-Aware Representation Learning"

Python 4 Updated Jan 3, 2024

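A collection of key topics and interview Q&A for computer vision algorithm roles, organized into four areas: computer vision, machine learning, image processing, and C++ fundamentals. Let's work hard together toward landing offers!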

1,644 265 Updated Nov 2, 2021

I hope this repo can help you a lot!

1,295 221 Updated Dec 1, 2023