
Azure AI Speech (Cognitive Services)

Azure AI Speech is Azure's Cognitive Services text-to-speech API, separate from Azure OpenAI. It provides high-quality neural voices, broader language support than Azure OpenAI TTS, and advanced speech customization.

When to use this vs Azure OpenAI TTS:

  • Azure AI Speech - More languages, neural voices, SSML support, speech customization
  • Azure OpenAI TTS - OpenAI models, integrated with Azure OpenAI services

Overview

| Property | Details |
|----------|---------|
| Description | Azure AI Speech is Azure's Cognitive Services text-to-speech API, separate from Azure OpenAI. It provides high-quality neural voices with broader language support and advanced speech customization. |
| Provider Route on LiteLLM | azure/speech/ |

Quick Start

LiteLLM SDK

SDK Usage
from litellm import speech
from pathlib import Path
import os

os.environ["AZURE_TTS_API_KEY"] = "your-cognitive-services-key"

speech_file_path = Path(__file__).parent / "speech.mp3"
response = speech(
    model="azure/speech/azure-tts",
    voice="alloy",
    input="Hello, this is Azure AI Speech",
    api_base="https://eastus.tts.speech.microsoft.com",
    api_key=os.environ["AZURE_TTS_API_KEY"],
)
response.stream_to_file(speech_file_path)

LiteLLM Proxy

proxy_config.yaml
model_list:
  - model_name: azure-speech
    litellm_params:
      model: azure/speech/azure-tts
      api_base: https://eastus.tts.speech.microsoft.com
      api_key: os.environ/AZURE_TTS_API_KEY
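
Once the proxy is running with this config, a client can request speech by the model_name alias. The sketch below only builds the HTTP request with the standard library. It assumes the proxy listens on http://localhost:4000, exposes the OpenAI-compatible /v1/audio/speech route, and accepts the placeholder key sk-1234. None of these values come from this page, so adjust them to your deployment. The helper name build_speech_request is hypothetical.

```python
import json
import urllib.request


def build_speech_request(
    text: str,
    proxy_url: str = "http://localhost:4000",  # assumed proxy address
    proxy_key: str = "sk-1234",                # placeholder proxy key
) -> urllib.request.Request:
    """Build (but do not send) a text-to-speech request for the proxy."""
    payload = {
        "model": "azure-speech",  # model_name from proxy_config.yaml
        "voice": "alloy",
        "input": text,
    }
    return urllib.request.Request(
        url=f"{proxy_url}/v1/audio/speech",  # assumed OpenAI-compatible route
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {proxy_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_speech_request("Hello, this is Azure AI Speech")

# To send it against a running proxy and save the audio:
# with urllib.request.urlopen(request) as resp:
#     open("speech.mp3", "wb").write(resp.read())
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.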

Setup

  1. Create an Azure Cognitive Services resource in the Azure Portal
  2. Get your API key from the resource
  3. Note your region (e.g., eastus, westus, westeurope)
  4. Use the regional endpoint: https://{region}.tts.speech.microsoft.com
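
Step 4 can be sketched as a small helper that fills the region into the endpoint template. The function name azure_tts_endpoint and its validation rule are illustrative assumptions, not part of LiteLLM.

```python
def azure_tts_endpoint(region: str) -> str:
    """Return the regional text-to-speech endpoint for an Azure region name."""
    region = region.strip().lower()
    # Region names such as "eastus" or "westeurope" are plain alphanumerics;
    # rejecting anything else is an assumption made for this sketch.
    if not region or not region.isalnum():
        raise ValueError(f"unexpected region name: {region!r}")
    return f"https://{region}.tts.speech.microsoft.com"


# The result can be passed as api_base to speech() or used in the proxy config.
api_base = azure_tts_endpoint("eastus")
```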

Voice Mapping

LiteLLM automatically maps OpenAI voice names to Azure Neural voices:

| OpenAI Voice | Azure Neural Voice | Description |
|--------------|--------------------|-------------|
| alloy | en-US-JennyNeural | Neutral and balanced |
| echo | en-US-GuyNeural | Warm and upbeat |
| fable | en-GB-RyanNeural | Expressive and dramatic |
| onyx | en-US-DavisNeural | Deep and authoritative |
| nova | en-US-AmberNeural | Friendly and conversational |
| shimmer | en-US-AriaNeural | Bright and cheerful |
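
The mapping can be pictured as a plain dictionary lookup. This is only an illustration of the table, not LiteLLM's internal code. The names OPENAI_TO_AZURE_VOICE and resolve_voice are hypothetical, and the pass-through for unmapped names is an assumption based on the fact that full Azure voice names are also accepted.

```python
# Illustrative copy of the voice mapping table.
OPENAI_TO_AZURE_VOICE = {
    "alloy": "en-US-JennyNeural",
    "echo": "en-US-GuyNeural",
    "fable": "en-GB-RyanNeural",
    "onyx": "en-US-DavisNeural",
    "nova": "en-US-AmberNeural",
    "shimmer": "en-US-AriaNeural",
}


def resolve_voice(voice: str) -> str:
    """Map an OpenAI voice name to its Azure voice; pass other names through."""
    return OPENAI_TO_AZURE_VOICE.get(voice, voice)


mapped = resolve_voice("fable")                # OpenAI name, looked up in the table
direct = resolve_voice("en-US-AriaNeural")     # full Azure name, returned unchanged
```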

Supported Parameters

All Parameters
from litellm import speech

response = speech(
    model="azure/speech/azure-tts",
    voice="alloy",              # Required: voice selection
    input="text to convert",    # Required: input text
    speed=1.0,                  # Optional: 0.25 to 4.0 (default: 1.0)
    response_format="mp3",      # Optional: mp3, opus, wav, pcm
    api_base="https://eastus.tts.speech.microsoft.com",
    api_key="your-key",
)
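
To catch bad values before a request is sent, the documented ranges can be checked up front. The following is a minimal sketch under the limits listed above. The helper check_speech_params is hypothetical and LiteLLM may apply its own validation.

```python
ALLOWED_FORMATS = {"mp3", "opus", "wav", "pcm"}


def check_speech_params(speed: float = 1.0, response_format: str = "mp3") -> dict:
    """Validate the optional parameters against the documented limits."""
    if not 0.25 <= speed <= 4.0:
        raise ValueError(f"speed must be between 0.25 and 4.0, got {speed}")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported response_format: {response_format!r}")
    return {"speed": speed, "response_format": response_format}


# The returned dict can be unpacked into speech(**params, ...).
params = check_speech_params(speed=1.5, response_format="wav")
```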

Response Formats

| Format | Azure Output Format | Sample Rate |
|--------|---------------------|-------------|
| mp3 | audio-24khz-48kbitrate-mono-mp3 | 24kHz |
| opus | ogg-48khz-16bit-mono-opus | 48kHz |
| wav | riff-24khz-16bit-mono-pcm | 24kHz |
| pcm | raw-24khz-16bit-mono-pcm | 24kHz |
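
The format names encode the audio layout, which is handy when working with headerless pcm output. As a worked example, raw-24khz-16bit-mono-pcm means 24,000 samples per second, 2 bytes per sample, and 1 channel, so one second of audio is 48,000 bytes. The helper below is a hypothetical sketch of that arithmetic.

```python
SAMPLE_RATE_HZ = 24_000   # from "24khz"
BYTES_PER_SAMPLE = 2      # from "16bit"
CHANNELS = 1              # from "mono"

BYTES_PER_SECOND = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def pcm_duration_seconds(num_bytes: int) -> float:
    """Estimate the playback length of raw-24khz-16bit-mono-pcm audio."""
    return num_bytes / BYTES_PER_SECOND


duration = pcm_duration_seconds(96_000)  # two seconds of audio
```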

Async Support

Async Usage
import asyncio
import os
from pathlib import Path

from litellm import aspeech

async def generate_speech():
    # aspeech is the awaitable counterpart of speech
    response = await aspeech(
        model="azure/speech/azure-tts",
        voice="alloy",
        input="Hello from async",
        api_base="https://eastus.tts.speech.microsoft.com",
        api_key=os.environ["AZURE_TTS_API_KEY"],
    )

    # Write the audio next to this script
    speech_file_path = Path(__file__).parent / "speech.mp3"
    response.stream_to_file(speech_file_path)

asyncio.run(generate_speech())

Regional Endpoints

Replace {region} with your Azure resource region:

  • US East: https://eastus.tts.speech.microsoft.com
  • US West: https://westus.tts.speech.microsoft.com
  • Europe West: https://westeurope.tts.speech.microsoft.com
  • Asia Southeast: https://southeastasia.tts.speech.microsoft.com

Full list of regions

Advanced Features

Custom Neural Voices

You can use any Azure Neural voice by passing the full voice name:

Custom Voice
import os

from litellm import speech

response = speech(
    model="azure/speech/azure-tts",
    voice="en-US-AriaNeural",  # Direct Azure voice name
    input="Using a specific neural voice",
    api_base="https://eastus.tts.speech.microsoft.com",
    api_key=os.environ["AZURE_TTS_API_KEY"],
)

Browse available voices in the Azure Speech Gallery.

Error Handling

Error Handling
import os

from litellm import speech
from litellm.exceptions import APIError

try:
    response = speech(
        model="azure/speech/azure-tts",
        voice="alloy",
        input="Test message",
        api_base="https://eastus.tts.speech.microsoft.com",
        api_key=os.environ["AZURE_TTS_API_KEY"],
    )
except APIError as e:
    # Raised when the Azure Speech request fails
    print(f"Azure Speech error: {e}")

Reference