How does Gemini 3.5 Live Translate differ from previous live translation systems?

Unlike traditional turn-by-turn systems that wait for a speaker to finish a sentence before translating, Gemini 3.5 Live Translate processes and generates audio continuously. This allows it to stay just a few seconds behind the speaker, eliminating awkward pauses and preserving the original speaker's intonation, pace, and pitch for a more natural conversation.

Fluid, natural voice translation with Gemini 3.5 Live Translate

Google Launches Continuous Speech-to-Speech Translation

Google has released Gemini 3.5 Live Translate, a new audio model designed to provide near real-time, speech-to-speech translation. The system's primary distinction is its ability to generate translated speech continuously, staying only a few seconds behind the live speaker. This approach is a departure from conventional turn-by-turn systems that require a speaker to pause before translation begins, aiming instead for more fluid and natural multilingual conversations by preserving the original speaker’s intonation, pacing, and pitch.
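The timing difference between the two approaches can be illustrated with a toy model. The sketch below is not Google's implementation and does not call any Gemini API. It assumes hypothetical word timestamps, a fixed 2-second lag standing in for "a few seconds," and zero processing time. It compares how long a listener waits for each word's translation under each approach.

```python
# Toy timing model; not Google's implementation. All numbers are illustrative.

def turn_based_delays(sentences):
    """Per-word wait when translation starts only after each sentence ends.

    `sentences` is a list of non-empty sentences; each sentence is a list of
    the times (in seconds) at which the speaker utters its words.
    """
    delays = []
    for word_times in sentences:
        sentence_end = word_times[-1]
        # Every word in the sentence waits until the final word is spoken.
        delays.extend(sentence_end - t for t in word_times)
    return delays


def continuous_delays(sentences, lag_seconds):
    """Per-word wait when translated audio trails the speaker by a fixed lag."""
    return [lag_seconds for word_times in sentences for _ in word_times]


def mean(values):
    return sum(values) / len(values)


# Hypothetical speech: one word per second, an 8-word then a 3-word sentence.
speech = [[0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0]]
ASSUMED_LAG = 2.0  # assumption standing in for "a few seconds"

turn_based = turn_based_delays(speech)
continuous = continuous_delays(speech, ASSUMED_LAG)

print(f"turn-based: mean {mean(turn_based):.2f}s, worst {max(turn_based):.1f}s")
print(f"continuous: mean {mean(continuous):.2f}s, worst {max(continuous):.1f}s")
```

In this toy input, the turn-based wait grows with sentence length: the first word of the 8-word sentence waits 7 seconds. The continuous wait stays at the assumed 2 seconds regardless of how long the speaker talks.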

Technical Details and Phased Rollout

The Gemini 3.5 Live Translate model is being deployed across Google’s product ecosystem, targeting developers, enterprises, and consumers simultaneously. For developers, it is available in public preview via the Gemini Live API and Google AI Studio. Enterprise customers can access it in a private preview within Google Meet, where it will expand language support from five languages to more than 70. The model is also being integrated into the publicly available Google Translate app on Android and iOS. Key partners, including Grab and CJ ENM, have tested the model and cited its accuracy and low latency. The generated audio is watermarked with SynthID so that AI-generated content remains detectable.

Supported Languages: 70+ with automatic detection
Core Functionality: Continuous speech generation, preserving speaker tone
Availability: Gemini Live API and Google AI Studio (public preview), Google Meet (private preview), Google Translate app (Android and iOS)
Key Integrations: API support through platforms like Agora, Fishjam, LiveKit, and Pipecat

Ecosystem Impact and Competitive Landscape

By releasing the model as an API, Google is positioning Gemini 3.5 Live Translate as a foundational technology for the broader communications industry. This strategy allows third-party developers and platforms to build sophisticated, real-time multilingual capabilities directly into their own applications. The early adoption by real-time media platforms like LiveKit and Agora indicates a clear push to establish this technology as a core utility for any service involving live voice communication, from business meetings to live broadcasts and customer support. This move intensifies competition in the communication-platform-as-a-service (CPaaS) market by offering a powerful, easily integrated AI feature.

Google's release of Gemini 3.5 Live Translate through the Gemini Live API is less about enhancing its own translation apps than about commoditizing real-time audio translation as a core developer utility, with the aim of embedding its AI infrastructure across the communication technology stack.

>> Verify Original Transmission at Google DeepMind