Stars
GRUtopia: Dream General Robots in a City at Scale
https://hf.co/hexgrad/Kokoro-82M
A CLI text-to-speech tool using the Kokoro model, supporting multiple languages, voices (with blending), and various input formats including EPUB books and PDF documents.
Two conversational AI agents switching from English to sound-level protocol after confirming they are both AI agents
MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba. Full multimodal LLM Android App:[MNN-LLM-Android](./apps/Android/MnnLlmChat/READ…
TextGrad: Automatic "Differentiation" via Text -- using large language models to backpropagate textual gradients.
Code for the paper: "Active Vision Might Be All You Need: Exploring Active Vision in Bimanual Robotic Manipulation"
Demo-Driven Mobile Bi-Manual Manipulation Benchmark.
SpatialLM: Large Language Model for Spatial Understanding
Magnificent app which corrects your previous console command.
The official Soundwave repository
Making a mini version of the BDX droid. https://discord.gg/UtJZsgfQGe
[CVPR 2025] HumanMM: Global Human Motion Recovery from Multi-shot Videos
A simple screen parsing tool towards a pure vision-based GUI agent
Genome modeling and design across all domains of life
Fine-tuned LLMs generate accurate 3D human avatars from textual descriptions using the SMPL-X model, enhancing customization and simulation in virtual environments.
🔥 Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos
Pretraining code for a large-scale depth-recurrent language model
MR.Q is a general-purpose model-free reinforcement learning algorithm.
Zeying-Gong / Falcon
Forked from facebookresearch/habitat-lab
Official Code for "From Cognition to Precognition: A Future-Aware Framework for Social Navigation" (ICRA 2025)
[ICLR 2024] Official code for the paper "LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts"
SpeechGPT Series: Speech Large Language Models