7 releases

Uses the Rust 2024 edition

| Version | Date |
|---|---|
| 0.3.0 | Jan 20, 2026 |
| 0.2.5 | Jan 20, 2026 |
| 0.2.4 | Nov 25, 2024 |
| 0.2.2 | Sep 19, 2024 |
| 0.1.0 | Jan 24, 2024 |
#101 in Audio · 435 downloads per month · Used in natural-tts · 88KB · 1.5K SLoC
## Description

This library is a wrapper around the MSEdge Read aloud API. You can use it to synthesize text to speech with the many voices Microsoft provides.
## How to use
1. Get a `SpeechConfig` to configure the voice for text-to-speech. A `Voice` converts to a `SpeechConfig` directly, and the `get_voices_list` function returns all available voices. `Voice` and `SpeechConfig` implement `serde::Serialize` and `serde::Deserialize`. For example:

   ```rust
   use msedge_tts::tts::SpeechConfig;
   use msedge_tts::voice::get_voices_list;

   fn main() {
       let voices = get_voices_list().unwrap();
       let speech_config = SpeechConfig::from(&voices[0]);
   }
   ```

   You can also create a `SpeechConfig` yourself; make sure you use the right voice name and audio format.

2. Create a TTS `Client` or `Stream`. Both have sync and async versions; see the examples below step 3.

3. Synthesize text to speech.
### Sync Client
Call the client function `synthesize` to synthesize text to speech. It returns a `SynthesizedAudio`, from which you can get `audio_bytes` and `audio_metadata`.

```rust
use msedge_tts::{tts::client::connect, tts::SpeechConfig, voice::get_voices_list};

fn main() {
    let voices = get_voices_list().unwrap();
    for voice in &voices {
        if voice.name.contains("YunyangNeural") {
            let config = SpeechConfig::from(voice);
            let mut tts = connect().unwrap();
            let audio = tts
                .synthesize("Hello, World! 你好,世界!", &config)
                .unwrap();
            break;
        }
    }
}
```
### Async Client

Call the client function `synthesize` to synthesize text to speech. It returns a `SynthesizedAudio`, from which you can get `audio_bytes` and `audio_metadata`.

```rust
use msedge_tts::{tts::client::connect_async, tts::SpeechConfig, voice::get_voices_list_async};

fn main() {
    smol::block_on(async {
        let voices = get_voices_list_async().await.unwrap();
        for voice in &voices {
            if voice.name.contains("YunyangNeural") {
                let config = SpeechConfig::from(voice);
                let mut tts = connect_async().await.unwrap();
                let audio = tts
                    .synthesize("Hello, World! 你好,世界!", &config)
                    .await
                    .unwrap();
                break;
            }
        }
    });
}
```
### Sync Stream

Call the sender stream's `send` function to synthesize text to speech, and the reader stream's `read` function to get the data.

`read` returns `Option<SynthesizedResponse>`; the response may be `AudioBytes`, `AudioMetadata`, or `None`. This is because the MSEdge Read aloud API returns multiple data segments, metadata, and other information sequentially.
**Caution:** one `send` corresponds to multiple `read`s. The next `send` call will block until there is no more data to read, and `read` will block until you have called `send`.

```rust
use msedge_tts::{
    tts::stream::{msedge_tts_split, SynthesizedResponse},
    tts::SpeechConfig,
    voice::get_voices_list,
};
use std::{
    sync::{
        atomic::{AtomicBool, Ordering},
        Arc,
    },
    thread::spawn,
};

fn main() {
    let voices = get_voices_list().unwrap();
    for voice in &voices {
        if voice.name.contains("YunyangNeural") {
            let config = SpeechConfig::from(voice);
            let (mut sender, mut reader) = msedge_tts_split().unwrap();
            let signal = Arc::new(AtomicBool::new(false));
            let end = signal.clone();
            spawn(move || {
                sender.send("Hello, World! 你好,世界!", &config).unwrap();
                println!("synthesizing...1");
                sender.send("Hello, World! 你好,世界!", &config).unwrap();
                println!("synthesizing...2");
                sender.send("Hello, World! 你好,世界!", &config).unwrap();
                println!("synthesizing...3");
                sender.send("Hello, World! 你好,世界!", &config).unwrap();
                println!("synthesizing...4");
                end.store(true, Ordering::Relaxed);
            });
            loop {
                if signal.load(Ordering::Relaxed) && !reader.can_read() {
                    break;
                }
                let audio = reader.read().unwrap();
                if let Some(audio) = audio {
                    match audio {
                        SynthesizedResponse::AudioBytes(_) => {
                            println!("read bytes")
                        }
                        SynthesizedResponse::AudioMetadata(_) => {
                            println!("read metadata")
                        }
                    }
                } else {
                    println!("read None");
                }
            }
        }
    }
}
```
### Async Stream

Call the async sender's `send` function to synthesize text to speech, and the async reader's `read` function to get the data. `read` returns `Option<SynthesizedResponse>` as above, and `send` and `read` block as above.

```rust
use msedge_tts::{
    tts::{
        stream::{msedge_tts_split_async, SynthesizedResponse},
        SpeechConfig,
    },
    voice::get_voices_list_async,
};
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};

fn main() {
    smol::block_on(async {
        let voices = get_voices_list_async().await.unwrap();
        for voice in &voices {
            if voice.name.contains("YunyangNeural") {
                let config = SpeechConfig::from(voice);
                let (mut sender, mut reader) = msedge_tts_split_async().await.unwrap();
                let signal = Arc::new(AtomicBool::new(false));
                let end = signal.clone();
                smol::spawn(async move {
                    sender
                        .send("Hello, World! 你好,世界!", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...1");
                    sender
                        .send("Hello, World! 你好,世界!", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...2");
                    sender
                        .send("Hello, World! 你好,世界!", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...3");
                    sender
                        .send("Hello, World! 你好,世界!", &config)
                        .await
                        .unwrap();
                    println!("synthesizing...4");
                    end.store(true, Ordering::Relaxed);
                })
                .detach();
                loop {
                    if signal.load(Ordering::Relaxed) && !reader.can_read().await {
                        break;
                    }
                    let audio = reader.read().await.unwrap();
                    if let Some(audio) = audio {
                        match audio {
                            SynthesizedResponse::AudioBytes(_) => {
                                println!("read bytes")
                            }
                            SynthesizedResponse::AudioMetadata(_) => {
                                println!("read metadata")
                            }
                        }
                    } else {
                        println!("read None");
                    }
                }
            }
        }
    });
}
```
See all examples.
Dependencies: ~24–43MB, ~616K SLoC