Azure AI Speech Translation Setup Guide
1. Provision Azure AI Speech Resource
- Go to https://portal.azure.com
- Search "Azure AI services" > Click Create under Speech service
- Fill in:
- Subscription: Your Azure sub
- Resource Group: Create or choose
- Region: Any available
- Name: Unique
- Pricing: Select F0 (Free) or S (Standard)
- Agree to Responsible AI
- Click Review + create > Create
After deployment, open the Keys and Endpoint page. You'll use the key and region in your code.
2. Prepare Development Environment
Option A: Already cloned?
Just open mslearn-ai-language in VS Code.
Option B: Clone the repo
- Open VS Code
- Press Ctrl+Shift+P → type Git: Clone → enter the repository URL
- Open the cloned folder in VS Code.
- If prompted, trust the authors.
- Don't add required assets if prompted.
3. Install Azure Speech SDK
Open a terminal in the translator folder for your chosen language and run the Speech SDK install command for C# or Python.
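The exact command (and any pinned version) is listed in the lab files; assuming the standard Azure Speech SDK packages, it will look like this:
For C#:
dotnet add package Microsoft.CognitiveServices.Speech
For Python:
pip install azure-cognitiveservices-speech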
4. Configure API Key and Region
- C#: Edit appsettings.json
- Python: Edit .env
Add your key and region from Azure (not the endpoint!).
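For illustration only: the setting names below are hypothetical and must match whatever names the starter code actually reads.
C# (appsettings.json):
{
  "SpeechKey": "your-resource-key",
  "SpeechRegion": "your-resource-region"
}
Python (.env):
SPEECH_KEY=your-resource-key
SPEECH_REGION=your-resource-region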
5. Modify Code for Translation
Open the code file for your language:
- C#: Program.cs
- Python: translator.py
Import namespaces
C#:
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;
Python:
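The starter file ships with its own import section; based on the speech_sdk alias used in the snippets below, it will look something like this:
from dotenv import load_dotenv  # assumption: the .env from step 4 is loaded with python-dotenv
import azure.cognitiveservices.speech as speech_sdk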
Configure Translation & Speech
C#:
translationConfig = SpeechTranslationConfig.FromSubscription(aiSvcKey, aiSvcRegion);
translationConfig.SpeechRecognitionLanguage = "en-US";
translationConfig.AddTargetLanguage("fr");
translationConfig.AddTargetLanguage("es");
translationConfig.AddTargetLanguage("hi");
speechConfig = SpeechConfig.FromSubscription(aiSvcKey, aiSvcRegion);
Python:
translation_config = speech_sdk.translation.SpeechTranslationConfig(ai_key, ai_region)
translation_config.speech_recognition_language = 'en-US'
translation_config.add_target_language('fr')
translation_config.add_target_language('es')
translation_config.add_target_language('hi')
speech_config = speech_sdk.SpeechConfig(ai_key, ai_region)
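The snippets that follow reference a targetLanguage variable. The starter code normally prompts for it; a minimal Python sketch of that step (the prompt wording is an assumption) looks like this:
# Ask for a target language and accept it only if it was configured above
targetLanguage = ''
while targetLanguage not in translation_config.target_languages:
    targetLanguage = input('Enter a target language (fr, es, hi): ').lower()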
6. Translate Speech
Option A: Microphone Input
Add the following inside the Translate function:
C#:
using AudioConfig audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using TranslationRecognizer translator = new TranslationRecognizer(translationConfig, audioConfig);
Console.WriteLine("Speak now...");
TranslationRecognitionResult result = await translator.RecognizeOnceAsync();
translation = result.Translations[targetLanguage];
Console.WriteLine(translation);
Python:
audio_config = speech_sdk.AudioConfig(use_default_microphone=True)
translator = speech_sdk.translation.TranslationRecognizer(translation_config, audio_config=audio_config)
print("Speak now...")
result = translator.recognize_once_async().get()
translation = result.translations[targetLanguage]
print(translation)
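Optionally (an addition, not part of the original steps), confirm that the recognizer actually produced a translation before using it; in the Python SDK a successful run reports ResultReason.TranslatedSpeech:
# Guard against silence or unrecognized speech before reading the translations
if result.reason != speech_sdk.ResultReason.TranslatedSpeech:
    print('Speech could not be translated:', result.reason)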
Option B: Use an Audio File
Install a sound playback library for your language.
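The lab files specify the exact packages and versions; assuming the commonly used ones, the install commands look like this:
C#:
dotnet add package System.Windows.Extensions
Python:
pip install playsound==1.2.2
playsound is often pinned to 1.2.2 because later releases changed behaviour on some platforms; note that System.Media.SoundPlayer is Windows-only.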
Then, add:
C#:
using System.Media;
string audioFile = "station.wav";
SoundPlayer wavPlayer = new SoundPlayer(audioFile);
wavPlayer.Play();
using AudioConfig audioConfig = AudioConfig.FromWavFileInput(audioFile);
using TranslationRecognizer translator = new TranslationRecognizer(translationConfig, audioConfig);
TranslationRecognitionResult result = await translator.RecognizeOnceAsync();
translation = result.Translations[targetLanguage];
Console.WriteLine(translation);
Python:
from playsound import playsound
audioFile = 'station.wav'
playsound(audioFile)
audio_config = speech_sdk.AudioConfig(filename=audioFile)
translator = speech_sdk.translation.TranslationRecognizer(translation_config, audio_config=audio_config)
result = translator.recognize_once_async().get()
translation = result.translations[targetLanguage]
print(translation)
7. Add Speech Synthesis
Add the following under the Synthesize translation comment:
C#:
var voices = new Dictionary<string, string>
{
["fr"] = "fr-FR-HenriNeural",
["es"] = "es-ES-ElviraNeural",
["hi"] = "hi-IN-MadhurNeural"
};
speechConfig.SpeechSynthesisVoiceName = voices[targetLanguage];
using SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer(speechConfig);
SpeechSynthesisResult speak = await speechSynthesizer.SpeakTextAsync(translation);
Python:
voices = {
"fr": "fr-FR-HenriNeural",
"es": "es-ES-ElviraNeural",
"hi": "hi-IN-MadhurNeural"
}
speech_config.speech_synthesis_voice_name = voices.get(targetLanguage)
speech_synthesizer = speech_sdk.SpeechSynthesizer(speech_config)
speak = speech_synthesizer.speak_text_async(translation).get()
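Optionally (an addition, not in the original steps), check that synthesis completed before moving on; in the Python SDK the success reason is SynthesizingAudioCompleted:
# Report any synthesis failure instead of failing silently
if speak.reason != speech_sdk.ResultReason.SynthesizingAudioCompleted:
    print(speak.reason)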
8. Run the Program
From a terminal in the translator folder, run the program for your language.
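Assuming the file names shown in step 5, the commands will be:
C#:
dotnet run
Python:
python translator.py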
Follow the prompts: enter fr, es, or hi as the target language, then speak into the microphone or let the audio file play.
Notes
- You can retrieve all translations from result.translations (see the sketch after this list).
- The Hindi output may not render properly in some consoles due to encoding.
- Azure allows both mic and file input; you can choose dynamically.
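For the first note, result.translations behaves like a dictionary keyed by language code; a minimal Python sketch:
# Print every configured translation, not just the selected one
for language, text in result.translations.items():
    print(f'{language}: {text}')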