
Azure AI Speech Translation Setup Guide

✅ 1. Provision Azure AI Speech Resource

  1. Go to https://portal.azure.com
  2. Search for "Azure AI services" > click Create under Speech service
  3. Fill in:
    • Subscription: your Azure subscription
    • Resource Group: create one or choose an existing one
    • Region: any available region
    • Name: a unique name
    • Pricing tier: F0 (Free) or S (Standard)
  4. Agree to the Responsible AI notice
  5. Click Review + create > Create

After deployment, open the Keys and Endpoint page. You'll use the key and region in your code.

✅ 2. Prepare Development Environment

Option A: Already cloned?

Just open mslearn-ai-language in VS Code.

Option B: Clone the repo

  1. Open VS Code
  2. Press Ctrl+Shift+P → type Git: Clone → enter:
    https://github.com/MicrosoftLearning/mslearn-ai-language
    
  3. Open the cloned folder in VS Code.
  4. If prompted, trust the authors.
  5. If prompted to add required assets to build and debug, select Not now.

✅ 3. Install Azure Speech SDK

Open terminal in:

Labfiles/08-speech-translation/<CSharp or Python>/translator

Run:

For C#:

dotnet add package Microsoft.CognitiveServices.Speech --version 1.30.0

For Python:

pip install azure-cognitiveservices-speech==1.30.0

✅ 4. Configure API Key and Region

  • C#: Edit appsettings.json
  • Python: Edit .env

Add your key and region from Azure (not the endpoint!).
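A minimal sketch of what the Python project's .env file might look like (the variable names here are illustrative assumptions; use whatever names the lab's starter code expects, and the same values go in appsettings.json for C#):

```
SPEECH_KEY=<your-resource-key>
SPEECH_REGION=<your-region>
```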

✅ 5. Modify Code for Translation

Open:

  • C#: Program.cs
  • Python: translator.py

Import namespaces

C#:

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

Python:

import azure.cognitiveservices.speech as speech_sdk

Configure Translation & Speech

C#:

translationConfig = SpeechTranslationConfig.FromSubscription(aiSvcKey, aiSvcRegion);
translationConfig.SpeechRecognitionLanguage = "en-US";
translationConfig.AddTargetLanguage("fr");
translationConfig.AddTargetLanguage("es");
translationConfig.AddTargetLanguage("hi");

speechConfig = SpeechConfig.FromSubscription(aiSvcKey, aiSvcRegion);

Python:

translation_config = speech_sdk.translation.SpeechTranslationConfig(ai_key, ai_region)
translation_config.speech_recognition_language = 'en-US'
translation_config.add_target_language('fr')
translation_config.add_target_language('es')
translation_config.add_target_language('hi')

speech_config = speech_sdk.SpeechConfig(ai_key, ai_region)

✅ 6. Translate Speech

🎤 Option A: Microphone Input

Add the following inside the Translate function:

C#:

using AudioConfig audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using TranslationRecognizer translator = new TranslationRecognizer(translationConfig, audioConfig);
Console.WriteLine("Speak now...");
TranslationRecognitionResult result = await translator.RecognizeOnceAsync();
translation = result.Translations[targetLanguage];
Console.WriteLine(translation);

Python:

audio_config = speech_sdk.AudioConfig(use_default_microphone=True)
translator = speech_sdk.translation.TranslationRecognizer(translation_config, audio_config=audio_config)
print("Speak now...")
result = translator.recognize_once_async().get()
translation = result.translations[targetLanguage]
print(translation)

🔊 Option B: Use Audio File

Install sound library:

C#:

dotnet add package System.Windows.Extensions --version 4.6.0

Python:

pip install playsound==1.3.0

Then, add:

C#:

using System.Media;

string audioFile = "station.wav";
SoundPlayer wavPlayer = new SoundPlayer(audioFile);
wavPlayer.Play();
using AudioConfig audioConfig = AudioConfig.FromWavFileInput(audioFile);
using TranslationRecognizer translator = new TranslationRecognizer(translationConfig, audioConfig);
TranslationRecognitionResult result = await translator.RecognizeOnceAsync();
translation = result.Translations[targetLanguage];
Console.WriteLine(translation);

Python:

from playsound import playsound

audioFile = 'station.wav'
playsound(audioFile)
audio_config = speech_sdk.AudioConfig(filename=audioFile)
translator = speech_sdk.translation.TranslationRecognizer(translation_config, audio_config=audio_config)
result = translator.recognize_once_async().get()
translation = result.translations[targetLanguage]
print(translation)

✅ 7. Add Speech Synthesis

Add the following under the Synthesize translation comment:

C#:

var voices = new Dictionary<string, string>
{
    ["fr"] = "fr-FR-HenriNeural",
    ["es"] = "es-ES-ElviraNeural",
    ["hi"] = "hi-IN-MadhurNeural"
};
speechConfig.SpeechSynthesisVoiceName = voices[targetLanguage];
using SpeechSynthesizer speechSynthesizer = new SpeechSynthesizer(speechConfig);
SpeechSynthesisResult speak = await speechSynthesizer.SpeakTextAsync(translation);

Python:

voices = {
    "fr": "fr-FR-HenriNeural",
    "es": "es-ES-ElviraNeural",
    "hi": "hi-IN-MadhurNeural"
}
speech_config.speech_synthesis_voice_name = voices.get(targetLanguage)
speech_synthesizer = speech_sdk.SpeechSynthesizer(speech_config)
speak = speech_synthesizer.speak_text_async(translation).get()
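Note that voices.get(targetLanguage) returns None for an unmapped language, which would make synthesis fail. A hedged sketch of a safer lookup with a fallback voice (the helper name voice_for and the fallback en-US-AriaNeural are assumptions, not part of the lab):

```python
# Voice map from the lab; dict.get with a default avoids handing the
# synthesizer a None voice when the language has no mapped neural voice.
voices = {
    'fr': 'fr-FR-HenriNeural',
    'es': 'es-ES-ElviraNeural',
    'hi': 'hi-IN-MadhurNeural',
}

def voice_for(target_language, fallback='en-US-AriaNeural'):
    # Fallback voice name is illustrative; any valid neural voice works.
    return voices.get(target_language, fallback)
```

Assign the result to speech_config.speech_synthesis_voice_name as before.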

✅ 8. Run the Program

From the terminal in the translator folder:

C#:

dotnet run

Python:

python translator.py

Follow the prompts, enter one of: fr, es, or hi, and speak or play your audio file.
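The prompt handling can be sketched as a small validation helper; this is a simplified stand-in for the input loop in the lab's starter code, and the function name choose_language is hypothetical:

```python
SUPPORTED = ('fr', 'es', 'hi')

def choose_language(entry, supported=SUPPORTED):
    """Normalize user input and return the language code, or None if unsupported."""
    code = entry.strip().lower()
    return code if code in supported else None

# Typical use: keep prompting until choose_language returns a code
# (then call the Translate function with it) or the user enters quit.
```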

๐Ÿ“ Notes

  • You can retrieve all translations from result.translations.
  • The Hindi output may not render properly in some consoles due to encoding.
  • Azure allows both mic and file input; you can choose dynamically.
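On the first note above: result.translations maps each target-language code to its translated text, so you can print every translation in one pass. A sketch with a hard-coded dict standing in for a real result (the sample sentences are illustrative):

```python
# Stand-in for result.translations: language code -> translated text.
translations = {
    'fr': 'Où est la gare ?',
    'es': '¿Dónde está la estación?',
    'hi': 'स्टेशन कहाँ है?',
}

for lang, text in translations.items():
    print(f'{lang}: {text}')
```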