Beihang University
Stars
Stable Diffusion web UI
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The …
Implementation of the Capsule-Forensics-v2
Public repo for 365 Data Science ML Algorithms Course
A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC
A Collection of AIGC Research Groups
Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition
[CVPR23] Official Implementation of MIC: Masked Image Consistency for Context-Enhanced Domain Adaptation
Self-supervised Augmentation Consistency for Adapting Semantic Segmentation (CVPR 2021)
[SIGGRAPH Asia 2023 (Technical Communications)] EasyVolcap: Accelerating Neural Volumetric Video Research
Robust Speech Recognition via Large-Scale Weak Supervision
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Facebook AI Research's Automatic Speech Recognition Toolkit
Effective Data Augmentation With Diffusion Models
[CVPR'24] GraphDreamer: a novel framework for generating compositional 3D scenes from scene graphs.
Code for RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion [Arxiv 2024]
Official code for ICLR 2024 paper Do Generated Data Always Help Contrastive Learning?
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
Official code for "DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion"
[ICLR 2024 spotlight] Official implementation of "InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior".
Official implementation of "LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching"
[ICML 2024] GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
[CVPR 2024 Highlight] MIGC and [TPAMI 2024] MIGC++ (Official Implementation)
Text2Room generates textured 3D meshes from a given text prompt using 2D text-to-image models (ICCV2023).
An open-source codebase for exploring autonomous driving pre-training
[ECCV 2024] Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting