- Microsoft Research Asia
- Beijing, China
Stars
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
World's First Large-scale High-quality Robotic Manipulation Benchmark
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
A family of versatile and state-of-the-art video tokenizers.
A deep learning library for video understanding research.
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Official repository of In-Context LoRA for Diffusion Transformers
Modeling, training, eval, and inference code for OLMo
Implementation of LVMAE, proposed in the paper "Extending Video Masked Autoencoders to 128 frames", in PyTorch
SimVQ: Addressing Representation Collapse in Vector Quantized Models with One Linear Layer
Apache ECharts is a powerful, interactive charting and data visualization library for the browser
Official inference repo for FLUX.1 models
DIAMOND (DIffusion As a Model Of eNvironment Dreams) is a reinforcement learning agent trained in a diffusion world model. NeurIPS 2024 Spotlight.
Official Implementation of "ADOPT: Modified Adam Can Converge with Any β2 with the Optimal Rate"
A suite of image and video neural tokenizers
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
🏆 A ranked list of awesome machine learning Python libraries. Updated weekly.
Efficient vision foundation models for high-resolution generation and perception.
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Implementation of CamTrol: Training-free Camera Control for Video Generation
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
CoTracker is a model for tracking any point (pixel) on a video.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.