OCR software, free and offline
Open source no-code system for text annotation and building of text
CLIP: predict the most relevant text snippet given an image
A simple, high-quality voice conversion tool focused on ease of use
Spark-TTS Inference Code
Code for running inference and finetuning with SAM 3 model
Wan2.2: Open and Advanced Large-Scale Video Generative Model
Official inference repo for FLUX.1 models
SOTA Open Source TTS
Wan2.1: Open and Advanced Large-Scale Video Generative Model
A Powerful Native Multimodal Model for Image Generation
Offline inference engine for art, real-time voice conversations
Official inference repo for FLUX.2 models
An Open Source text-to-speech system built by inverting Whisper
Audiocraft is a library for audio processing and generation
Sample code and notebooks for Generative AI on Google Cloud
NLP Cloud serves high performance pre-trained or custom models for NER
High-Resolution Image Synthesis with Latent Diffusion Models
Offline text-to-speech synthesis for Python
HunyuanVideo: A Systematic Framework For Large Video Generation Model
Code for the paper "Evaluating Large Language Models Trained on Code"
Collection of Gemma 3 variants that are trained for performance
Dockerized FastAPI wrapper for Kokoro-82M text-to-speech model
Accurate × Fast × Comprehensive
Long-form streaming TTS system for multi-speaker dialogue generation