Stars
Talk to any LLM with hands-free voice interaction, voice interruption, a Live2D talking face, and long-term memory, running locally across platforms
Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Official inference repo for FLUX.1 models
The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
takagen99 / Box
Forked from CatVodTVOfficial/TVBoxOSCExperimental
A library to generate LaTeX expression from Python code.
Netflix-level subtitle cutting, translation, alignment, and even dubbing - one-click fully automated AI video subtitle team
[CVPR'2024 Highlight] Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle
The RealEstate10K dataset from https://google.github.io/realestate10k
A simple compiler for the SysY language, written in Java.
COLMAP - Structure-from-Motion and Multi-View Stereo
High-Resolution Image Synthesis with Latent Diffusion Models
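Batch-generate subtitle files for local videos, with optional translation of the subtitle files into other languages; cross-platform support for Windows and macOS
A one-stop, open-source, high-quality data extraction tool for converting PDF to Markdown and JSON.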
[ECCV2024] Relightable 3D Gaussian: Real-time Point Cloud Relighting with BRDF Decomposition and Ray Tracing
A feature-rich command-line audio/video downloader
Open-Sora: Democratizing Efficient Video Production for All
Official repo for paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
🤯 Lobe Chat - an open-source, modern-design AI chat framework. Supports multiple AI providers (OpenAI / Claude 3 / Gemini / Ollama / Qwen / DeepSeek), Knowledge Base (file upload / knowledge managemen…
DeepSeek Coder: Let the Code Write Itself
A list of papers from my reading history. Robotics, Learning, Vision.