Real-time translation (Beta)
In addition to transcribing the host's audio in real time, Real-Time STT also supports translation during transcription. For example, in scenarios like international conferences, you can transcribe the host's speech, translate the transcribed content, and then push both the original text and the translation back to the channel as subtitles.
Understand the tech
Following are the key features of real-time translation:
-
Instant translation
Live speech-to-text translation to keep conversations flowing seamlessly in real-time communication or live streaming. -
Multi-language support
Manage multilingual interactions with speech translation from up to two source languages into five target languages, and support for 30+ languages. -
High accuracy
Advanced Speech Recognition (ASR) captures spoken language and converts it accurately to text using sophisticated recognition technologies. -
Translated captions
Live captions are continually updated during speech, providing readable, translated text. Video Text Track (VTT) files can be stored in the cloud for future reference, AI analysis, or compliance. -
Ultra-low Latency translation
Seamless translation with an end-to-start latency of under 1 second and average end-to-end latency of under 3 seconds. -
LLM integration
Process translated text using custom large language models (LLMs) or integrate additional AI services to enhance capabilities and streamline workflows.
This page shows you how to set up translation of the transcribed content when starting a transcription task.
Prerequisites
To follow this guide, first implement basic speech-to-text transcription by following the Rest quickstart.
Implementation
When calling start
to start a transcription task, set translateConfig
to translate the transcribed text content.
Sample request
The following example shows you how to set up translation when transcribing. The example includes the URL, header, and body. To record and encrypt at the same time in the transcription task, refer to Record captions and Encrypt captions.
Parameter | Type | Description |
---|---|---|
source | string array | The source language for the translation. |
target | array | The target languages for translation. You can configure up to 5 target languages. See Supported languages. |
Sample response
Parameter | Type | Description |
---|---|---|
taskId | string | The task ID. |
createTs | number | The timestamp when the task was created. |
status | enum | Task Status:IDLE : The task is not initialized.PREPARING : The task has received an initialization request.PREPARED : Task initialization is complete.STARTING : The task is starting.CREATED : Part of the task startup is completed.STARTED : All tasks are started and completed.IN_PROGRESS : Transcription in progress.STOPPING : The task is paused.STOPPED : The ask has terminated.FAILURE_STOP : Task termination has failed. |
To query
, update
, or stop
the transcription task, refer to the Rest quickstart.
Supported languages
The translation feature of Real-Time STT supports the following languages:
Language | Parameter |
---|---|
Arabic (Egypt) | ar-EG |
Arabic (Jordan) | ar-JO |
Arabic (Saudi Arabia) | ar-SA |
Arabic (UAE) | ar-AE |
Cantonese (Traditional) | zh-HK |
Chinese (Simplified) | zh-CN |
Chinese (Traditional) | zh-TW |
English | en-US |
French | fr-FR |
German | de-DE |
Hindi | hi-IN |
Indonesian | id-ID |
Italian | it-IT |
Japanese | ja-JP |
Korean | ko-KR |
Malay | ms-MY |
Persian (Iran) | fa-IR |
Portuguese | pt-PT |
Russian | ru-RU |
Spanish | es-ES |
Thai | th-TH |
Turkish | tr-TR |
Vietnamese | vi-VN |