Fish Speech is a state-of-the-art open-source text-to-speech project that has evolved into the OpenAudio series of TTS models. The repository hosts the code and tooling for training, fine-tuning, and serving TTS models, while the current flagship models (OpenAudio-S1 and S1-mini) are distributed via Fish Audio’s playground and Hugging Face. The models are evaluated with Seed TTS metrics and report very low word error rate (WER) and character error rate (CER), which indicates that the synthesized speech is intelligible and closely matches the input text. Fish Speech emphasizes expressive, controllable voices: emotion tags, tone markers, and special audio-effect markers can be embedded in the input text to drive prosody and vocal style, covering basic emotions as well as nuanced states such as sarcastic, conciliative, or hysterical. The system is multilingual and cross-lingual: it handles multiple languages in a single input without explicit phoneme markup. The models are trained on large-scale datasets.

Features

  • SOTA multilingual TTS models with extremely low WER and CER on benchmark evaluations
  • Extensive emotion, tone, and special-effect markers to control prosody and expressiveness directly from text
  • Zero-shot and few-shot voice cloning from short reference audio segments
  • Multilingual and cross-lingual support without explicit phoneme dependency
  • Gradio WebUI and Docker setup for running a local inference server on common platforms
  • Large-scale RLHF-tuned models with flagship S1 and compact S1-mini variants for different resource budgets
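
To illustrate how emotion markers might be embedded in input text, here is a minimal Python sketch. It is not the project's API. The parenthesised "(tag)" syntax and the helper names tag_text and build_script are assumptions made for illustration, and the tag set contains only the three states named in the description above.

```python
# Hypothetical helpers: the "(tag)" marker syntax is an assumption,
# not something the project description specifies.
KNOWN_TAGS = {"sarcastic", "conciliative", "hysterical"}  # tags named in the description


def tag_text(text: str, tag: str) -> str:
    """Prefix `text` with an emotion marker, rejecting tags outside KNOWN_TAGS."""
    tag = tag.strip().lower()
    if tag not in KNOWN_TAGS:
        raise ValueError(f"unknown emotion tag: {tag!r}")
    return f"({tag}) {text.strip()}"


def build_script(lines):
    """Join (tag, sentence) pairs into one input string.

    A tag of None leaves the sentence unmarked.
    """
    parts = []
    for tag, sentence in lines:
        parts.append(sentence.strip() if tag is None else tag_text(sentence, tag))
    return " ".join(parts)


if __name__ == "__main__":
    print(build_script([(None, "Hello."), ("sarcastic", "Oh, great.")]))
```

Validating tags before synthesis keeps a misspelled marker from being read aloud as ordinary text. The real list of supported markers should come from the project's own documentation.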

Categories

Text to Speech

License

Apache License V2.0

Additional Project Details

Operating Systems

Linux, Mac, Windows

Programming Language

Python

Related Categories

Python Text to Speech Software

Registered

2025-11-28