Azure Speech TTS - Burki Voice AI Docs

☁️ Azure Speech: Enterprise Scale

Microsoft’s neural TTS service offers 500+ voices across 100+ languages. It integrates with the rest of the Azure ecosystem, supports SSML, and provides enterprise-grade reliability, which makes it a good fit for organizations already using Microsoft services.

Quick Setup

Create Azure Speech Resource

Go to Azure Portal
Create a new Speech resource
Select your subscription, resource group, and region
Note your Key and Region from the resource’s Keys and Endpoint page

Configure in Burki

Go to AI Configuration → TTS tab
Select Azure Speech as provider
Enter your API Key and Region (e.g., eastus, westus2)

Choose Voice & Model

Select your preferred neural voice from the dropdown

Free Tier: Azure offers 500,000 characters per month free. Neural voices are available on all tiers.

Available Models

🧠 Neural

High-Quality Neural Voices
Natural intonation and human-like speech
Quality: Premium
Best for: All production applications

📢 Standard

Standard TTS Voices
Basic text-to-speech synthesis
Quality: Good
Best for: Legacy compatibility

Recommendation: Always use Neural voices for the best quality. Standard voices are legacy and should only be used for specific compatibility needs.

Available Voices

English Voices

American English (en-US)

Jenny

Clear & Professional
Perfect for business applications
Voice ID: en-US-JennyNeural

Aria

Warm & Natural
Great for friendly interactions
Voice ID: en-US-AriaNeural

Guy

Natural & Confident
Strong, authoritative voice
Voice ID: en-US-GuyNeural

Davis

Friendly & Approachable
Ideal for customer service
Voice ID: en-US-DavisNeural

Jane

Professional
Clear business voice
Voice ID: en-US-JaneNeural

Jason

Clear
Reliable male voice
Voice ID: en-US-JasonNeural

British English (en-GB)

Sonia

Clear & Professional
British female voice
Voice ID: en-GB-SoniaNeural

Ryan

Warm
British male voice
Voice ID: en-GB-RyanNeural

Other Languages

Spanish (es-ES)

Elvira

Natural
Spanish female voice
Voice ID: es-ES-ElviraNeural

Alvaro

Clear
Spanish male voice
Voice ID: es-ES-AlvaroNeural

French (fr-FR)

Denise

Natural
French female voice
Voice ID: fr-FR-DeniseNeural

Henri

Clear
French male voice
Voice ID: fr-FR-HenriNeural

Arabic (ar-SA)

Salma

Natural
Arabic female voice
Voice ID: ar-SA-SalmaNeural

Hamed

Clear
Arabic male voice
Voice ID: ar-SA-HamedNeural

All Configured Voices

Voice	Language	Gender	Voice ID	Description
Jenny	en-US	Female	`en-US-JennyNeural`	Clear and professional
Aria	en-US	Female	`en-US-AriaNeural`	Warm and natural
Guy	en-US	Male	`en-US-GuyNeural`	Natural and confident
Davis	en-US	Male	`en-US-DavisNeural`	Friendly and approachable
Jane	en-US	Female	`en-US-JaneNeural`	Professional
Jason	en-US	Male	`en-US-JasonNeural`	Clear
Sonia	en-GB	Female	`en-GB-SoniaNeural`	Clear and professional
Ryan	en-GB	Male	`en-GB-RyanNeural`	Warm
Elvira	es-ES	Female	`es-ES-ElviraNeural`	Natural
Alvaro	es-ES	Male	`es-ES-AlvaroNeural`	Clear
Denise	fr-FR	Female	`fr-FR-DeniseNeural`	Natural
Henri	fr-FR	Male	`fr-FR-HenriNeural`	Clear
Salma	ar-SA	Female	`ar-SA-SalmaNeural`	Natural
Hamed	ar-SA	Male	`ar-SA-HamedNeural`	Clear

500+ More Voices: Azure offers hundreds of additional voices. Visit the Azure Voice Gallery for the complete list.

Voice Controls

Azure Speech provides advanced voice customization through SSML:

Speaking Rate
Pitch
SSML Support

Speaking Rate: Controls speech speed (Range: 0.5 - 2.0)

0.5: Half speed (very slow)
1.0: ✅ Normal speed (Recommended)
2.0: Double speed (very fast)

<prosody rate="+20%">
  Speaking slightly faster than normal.
</prosody>
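The 0.5 to 2.0 setting and the `rate="+20%"` form in the snippet describe the same adjustment in two notations. The sketch below shows one way to convert between them, assuming the setting is a multiplier of normal speed (so 1.2 corresponds to `+20%`). The function name `multiplier_to_prosody` is hypothetical, and this is not necessarily how Burki performs the mapping internally.

```python
def multiplier_to_prosody(value: float, low: float = 0.5, high: float = 2.0) -> str:
    """Convert a multiplier such as 1.2 into a signed SSML percentage such as '+20%'.

    Assumption: 1.0 means "unchanged", and the documented 0.5 - 2.0 range
    is enforced here rather than by the speech service.
    """
    if not low <= value <= high:
        raise ValueError(f"value {value} is outside the supported range {low} - {high}")
    percent = round((value - 1.0) * 100)
    # The '+' format flag forces an explicit sign, which SSML expects
    # for relative changes.
    return f"{percent:+d}%"


if __name__ == "__main__":
    for setting in (0.5, 1.0, 1.2, 2.0):
        print(setting, "->", multiplier_to_prosody(setting))
```

Rounding to a whole percent avoids floating-point noise such as 19.999999999999996 leaking into the markup.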

Pitch: Controls voice pitch (Range: 0.5 - 2.0)

0.5: Very low pitch
1.0: ✅ Normal pitch (Recommended)
2.0: Very high pitch

<prosody pitch="+10%">
  Speaking with slightly higher pitch.
</prosody>

SSML Support: Full SSML markup for advanced control

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <prosody rate="+10%" pitch="+5%">
      Hello! How can I help you today?
    </prosody>
  </voice>
</speak>
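When a document like the one above is assembled from dynamic text, the text has to be XML-escaped or a stray `&` or `<` will make the SSML invalid. The following is a minimal sketch of such a builder using only the Python standard library. The function name `build_ssml` and its defaults are illustrative assumptions, not part of Burki or the Azure SDK.

```python
from xml.sax.saxutils import escape, quoteattr

SSML_NAMESPACE = "http://www.w3.org/2001/10/synthesis"


def build_ssml(
    text: str,
    voice: str = "en-US-JennyNeural",
    rate: str = "+0%",
    pitch: str = "+0%",
    lang: str = "en-US",
) -> str:
    """Wrap plain text in <speak>/<voice>/<prosody>, escaping it for XML.

    Hypothetical helper: it mirrors the structure of the example above
    and does not validate voice names or prosody values.
    """
    # quoteattr() returns the value already wrapped in quotes and escaped.
    return (
        f'<speak version="1.0" xmlns="{SSML_NAMESPACE}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)}>"
        f"{escape(text)}"
        "</prosody></voice></speak>"
    )


if __name__ == "__main__":
    print(build_ssml("Questions & answers", rate="+10%", pitch="+5%"))
```

Escaping the text rather than trusting it matters most when the spoken content comes from a language model or from user input.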

Configuration Options

Audio Format

Burki automatically requests the appropriate Azure Speech output format for your telephony provider:

Twilio/Telnyx: PCM μ-law @ 8kHz (Raw8Khz8BitMonoMULaw)
Vonage: PCM 16-bit @ 16kHz (Raw16Khz16BitMonoPcm)
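The provider-to-format mapping above can be sketched as a small lookup table. The SDK format names come from the list above; the REST header values are the equivalent `X-Microsoft-OutputFormat` strings. The names `AudioFormat`, `FORMATS`, and `format_for` are hypothetical, and Burki's own implementation may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    sdk_name: str          # format name as listed above
    rest_header: str       # equivalent X-Microsoft-OutputFormat value
    sample_rate_hz: int
    bytes_per_sample: int

    def bytes_per_second(self) -> int:
        # Mono audio: one sample per tick of the sample clock.
        return self.sample_rate_hz * self.bytes_per_sample


MULAW_8K = AudioFormat("Raw8Khz8BitMonoMULaw", "raw-8khz-8bit-mono-mulaw", 8000, 1)
PCM_16K = AudioFormat("Raw16Khz16BitMonoPcm", "raw-16khz-16bit-mono-pcm", 16000, 2)

# Assumed lookup table, keyed by lowercase provider name.
FORMATS = {"twilio": MULAW_8K, "telnyx": MULAW_8K, "vonage": PCM_16K}


def format_for(provider: str) -> AudioFormat:
    """Return the audio format for a telephony provider (case-insensitive)."""
    try:
        return FORMATS[provider.strip().lower()]
    except KeyError:
        raise ValueError(f"no audio format configured for provider {provider!r}")


if __name__ == "__main__":
    for name in ("Twilio", "Vonage"):
        fmt = format_for(name)
        print(name, fmt.sdk_name, fmt.bytes_per_second(), "bytes/s")
```

The byte rates show why the formats differ: μ-law at 8 kHz carries 8,000 bytes per second, while 16-bit PCM at 16 kHz carries 32,000.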

Configuration in Burki

To use Azure Speech TTS in your assistant:

Get Azure Credentials

Create a Speech resource in Azure Portal and copy your Subscription Key and Region.

Add to Burki

Go to Settings → Provider Keys → TTS and add your Azure Speech credentials.

Configure Assistant

Edit your assistant, select Azure Speech as the TTS provider, and choose a voice.

SSML Support

Azure Speech supports SSML for advanced voice control:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
    <voice name="en-US-JennyNeural">
        <prosody rate="+10%" pitch="+5%">
            Welcome to our service!
        </prosody>
        <break time="500ms"/>
        How can I assist you today?
    </voice>
</speak>

Use SSML tags in your assistant’s responses for fine-grained control over pronunciation, emphasis, and pacing.
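Because malformed SSML is typically rejected by the service, it can help to check a response locally before sending it for synthesis. The sketch below uses Python's standard XML parser and checks only three things: that the markup is well-formed, that the root is `<speak>` in the SSML namespace, and that a `<voice>` element sits directly under it. The function name `check_ssml` is hypothetical, and passing this check does not guarantee that Azure will accept the document.

```python
import xml.etree.ElementTree as ET
from typing import List

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def check_ssml(ssml: str) -> List[str]:
    """Return a list of problems found; an empty list means none were detected."""
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as exc:
        # Unclosed tags, unescaped '&', and similar errors end up here.
        return [f"not well-formed XML: {exc}"]

    problems = []
    if root.tag != f"{{{SSML_NS}}}speak":
        problems.append("root element must be <speak> in the SSML namespace")
    if root.find(f"{{{SSML_NS}}}voice") is None:
        problems.append("no <voice> element directly under <speak>")
    return problems


if __name__ == "__main__":
    sample = (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">'
        '<voice name="en-US-JennyNeural">Welcome!<break time="500ms"/></voice>'
        "</speak>"
    )
    print(check_ssml(sample))
```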

Regional Selection

Latency Optimization: Choose the Azure region closest to your deployment for optimal latency.

Region	Location	Best For
`eastus`	East US	North America (East)
`westus2`	West US 2	North America (West)
`westeurope`	Netherlands	Europe
`southeastasia`	Singapore	Asia-Pacific
`australiaeast`	Australia East	Australia/Oceania
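If it is unclear which region is closest, one option is to time a few requests against each candidate and compare the results. The sketch below only covers the comparison step: it takes latency samples you have already collected and picks the region with the lowest median. The function names and the sample figures are illustrative assumptions; the endpoint pattern is the regional text-to-speech REST host.

```python
from statistics import median
from typing import Dict, Sequence


def tts_endpoint(region: str) -> str:
    """Regional REST endpoint for text-to-speech synthesis."""
    return f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"


def fastest_region(samples_ms: Dict[str, Sequence[float]]) -> str:
    """Pick the region whose latency samples have the lowest median.

    The median is used so that a single slow outlier does not
    disqualify an otherwise fast region.
    """
    if not samples_ms:
        raise ValueError("no latency samples provided")
    return min(samples_ms, key=lambda region: median(samples_ms[region]))


if __name__ == "__main__":
    # Made-up measurements, in milliseconds, for illustration only.
    measured = {
        "eastus": [40.0, 45.0, 200.0],
        "westeurope": [110.0, 120.0, 115.0],
    }
    best = fastest_region(measured)
    print(best, tts_endpoint(best))
```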

Pricing Overview

Tier	Characters/Month	Neural Voices	Price
Free	500,000	Yes	$0
Standard	Pay-as-you-go	Yes	$16 per 1M chars

Enterprise: Contact Azure for custom pricing on high-volume usage and reserved capacity.
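For budgeting, the table above reduces to simple arithmetic. The sketch below uses the figures from the table and assumes that the free allowance applies only to the Free tier and is not deducted from Standard usage; confirm this and the current rate on the Azure pricing page before relying on the result. The function name `estimate_monthly_cost` is hypothetical.

```python
FREE_TIER_CHARS = 500_000        # from the table above
PRICE_PER_MILLION_USD = 16.00    # from the table above


def estimate_monthly_cost(characters: int, tier: str = "standard") -> float:
    """Estimate the monthly TTS cost in USD for a given character volume."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    if tier == "free":
        if characters > FREE_TIER_CHARS:
            raise ValueError("volume exceeds the Free tier allowance")
        return 0.0
    if tier == "standard":
        # Assumption: billed linearly per character with no free deduction.
        return round(characters / 1_000_000 * PRICE_PER_MILLION_USD, 2)
    raise ValueError(f"unknown tier {tier!r}")


if __name__ == "__main__":
    print(estimate_monthly_cost(2_000_000))
```

For example, 2,000,000 characters on the Standard tier works out to 2 × $16 = $32.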

Common Issues & Solutions

Authentication Failed

Problem: API returns 401 Unauthorized

Solutions:

Verify your Azure Speech Key is correct in Settings → Provider Keys
Ensure the key is from your Speech resource (not another Azure service)
Check that the region matches your Speech resource’s region
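To test a key and region pair outside Burki, one approach is to request a short-lived access token from the regional token endpoint, which should fail with 401 when the key and region do not belong together. The sketch below is a minimal version using only the standard library. The function names are hypothetical, and the key shown in the usage example is a placeholder.

```python
import urllib.error
import urllib.request


def build_token_request(key: str, region: str) -> urllib.request.Request:
    """Build (but do not send) a token request for the given Speech key and region."""
    url = f"https://{region.strip().lower()}.api.cognitive.microsoft.com/sts/v1.0/issueToken"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the key travels in the header
        method="POST",
        headers={"Ocp-Apim-Subscription-Key": key},
    )


def credentials_work(key: str, region: str, timeout: float = 5.0) -> bool:
    """Return True if a token is issued, False on 401/403; other errors propagate."""
    try:
        with urllib.request.urlopen(build_token_request(key, region), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as exc:
        if exc.code in (401, 403):
            return False
        raise


if __name__ == "__main__":
    # Placeholder values; replace with your own key and region.
    print(credentials_work("YOUR_SPEECH_KEY", "eastus"))
```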

Voice Not Available

Problem: Selected voice doesn’t work

Solutions:

Verify the voice ID format (e.g., en-US-JennyNeural)
Check that the voice is available in your region
Ensure your subscription tier supports the selected voice
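The voice ID format check can be sketched with a regular expression. The pattern below covers only the common `locale-NameNeural` shape used by the voices listed on this page; some Azure voice names use other shapes, so a failed match is a hint to double-check the ID, not proof that it is invalid. The function name `parse_voice_id` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches IDs such as en-US-JennyNeural: language, region, then a name
# ending in "Neural". This is an assumption about the common case only.
VOICE_ID_PATTERN = re.compile(r"^([a-z]{2,3}-[A-Z]{2})-([A-Za-z]+)Neural$")


def parse_voice_id(voice_id: str) -> Optional[Tuple[str, str]]:
    """Return (locale, name) for a well-formed voice ID, or None otherwise."""
    match = VOICE_ID_PATTERN.match(voice_id.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    for candidate in ("en-US-JennyNeural", "en-us-jennyneural", "Jenny"):
        print(candidate, "->", parse_voice_id(candidate))
```

The check is case-sensitive on purpose, since a lowercased ID such as `en-us-jennyneural` is a common copy-and-paste mistake.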

High Latency

Problem: TTS response is slow

Solutions:

Select an Azure region closer to your users
Note that Burki already uses streaming synthesis for optimal performance
Consider caching common phrases
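Caching common phrases can be sketched as a small least-recently-used cache keyed by voice and text, so that greetings and other fixed prompts are synthesized once and then replayed. The class below is an illustration only: `PhraseCache` and the `synthesize` callable are hypothetical names, and the callable stands in for whatever function actually calls the TTS service.

```python
from collections import OrderedDict
from typing import Callable, Tuple


class PhraseCache:
    """Least-recently-used cache for synthesized audio."""

    def __init__(self, synthesize: Callable[[str, str], bytes], max_items: int = 256):
        if max_items < 1:
            raise ValueError("max_items must be at least 1")
        self._synthesize = synthesize
        self._max_items = max_items
        self._items: "OrderedDict[Tuple[str, str], bytes]" = OrderedDict()

    def get(self, voice: str, text: str) -> bytes:
        key = (voice, text)
        if key in self._items:
            # Mark as most recently used and skip the network call.
            self._items.move_to_end(key)
            return self._items[key]
        audio = self._synthesize(voice, text)
        self._items[key] = audio
        if len(self._items) > self._max_items:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return audio


if __name__ == "__main__":
    def fake_synthesize(voice: str, text: str) -> bytes:
        # Stand-in for a real TTS call.
        return f"{voice}:{text}".encode()

    cache = PhraseCache(fake_synthesize, max_items=2)
    cache.get("en-US-JennyNeural", "Hello!")
    cache.get("en-US-JennyNeural", "Hello!")  # served from the cache
```

The voice is part of the key because the same text produces different audio for each voice; the rate, pitch, and audio format would need to be added to the key if they vary between calls.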

Best Practices

See Also

🎯 Multilingual?

Cartesia Sonic 3 - 42 languages with voice cloning

⚡ Need Speed?

Deepgram Aura - Ultra-low ~75ms latency

🔗 Additional Resources

Azure Portal: portal.azure.com
Voice Gallery: Azure Voice Gallery
Documentation: Azure Speech Service Docs
Pricing: Azure Speech Pricing

🚀 Ready to Use Azure Speech?

Head back to your assistant configuration and set up Azure Speech for enterprise-grade TTS!
