Voicebox is an open-source desktop application for local voice synthesis and cloning. Using the Qwen3-TTS model, it builds a voice profile from just 30 seconds of audio, capturing the speaker's tone and timbre. All processing happens on your PC, with no cloud services or subscriptions required. The built-in timeline editor integrates Whisper for text editing and audio synchronization. Other features include system audio capture and the ability to create voice-driven stories. For developers, a REST API and a local server are available, so speech synthesis can be integrated into games, applications, and AI agents. Built on Tauri, Rust, and Python, it is designed for high performance and keeps all data on your own machine.
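
To give a feel for what integrating through a local REST server might look like, here is a minimal Python sketch. The description above does not specify the API, so the server address, the `/generate` path, and the `text` and `profile_id` fields are all hypothetical placeholders; consult the project's own API documentation for the real routes and parameters.

```python
import json
import urllib.request

# Everything below is a hypothetical illustration: the base URL, the
# "/generate" path, and the JSON field names are assumptions, not
# Voicebox's documented API.
BASE_URL = "http://127.0.0.1:8000"  # assumed address of the local server


def build_synthesis_request(
    text: str, profile_id: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for speech synthesis."""
    payload = json.dumps({"text": text, "profile_id": profile_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",  # hypothetical endpoint
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, profile_id: str, timeout: float = 30.0) -> bytes:
    """Send the request to the local server and return the raw response body."""
    request = build_synthesis_request(text, profile_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Because the server runs on the same machine, a game or agent could call something like `synthesize("Hello there", "narrator")` and write the returned bytes to an audio file, assuming the real endpoint returns audio data directly.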












