Voice cloning allows you to create custom voices from audio samples, giving your AI assistants unique and personalized voices that match your brand or specific use cases.
Overview
Burki Voice AI's voice cloning feature enables you to:
- Upload Voice Samples: Upload high-quality audio recordings to create voice models
- Multi-Provider Support: Use ElevenLabs, Resemble AI, and other providers that support voice cloning
- Instant Voice Creation: Generate cloned voices ready for immediate use
- Voice Management: Organize, test, and manage your custom voices
- Usage Analytics: Track voice usage for billing and optimization
Voice Sample Upload
Upload audio samples with validation and processing
AI Voice Training
Provider-powered voice training with quality optimization
Usage Analytics
Track synthesis usage and voice performance
Easy Integration
Seamless integration with existing assistant configurations
Supported Providers
ElevenLabs
- Instant Voice Cloning: Create voices from single audio samples
- High Quality: Professional-grade voice synthesis
- Multiple Languages: Support for 29+ languages
- Quick Processing: Voices ready in seconds
Resemble AI
- Professional Training: Advanced voice training algorithms
- Custom Models: Highly personalized voice characteristics
- Unlimited Voices: Create as many voices as needed
- Enterprise Features: Advanced customization options
Future Providers
- Inworld AI: Coming soon with emotional voice cloning
- OpenAI: Voice cloning capabilities when available
Voice Sample Requirements
Audio Quality Guidelines
File Format Requirements
Supported Formats:
- MP3 (recommended)
- WAV (highest quality)
- FLAC (lossless)
- M4A/AAC
- OGG
Audio Specifications:
- Sample Rate: 22kHz or higher
- Bit Rate: 128kbps minimum
- Channels: Mono preferred, stereo acceptable
- File Size: Maximum 50MB
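The format and size limits above can be checked locally before uploading. The following is a minimal sketch, not the platform's own validation: the function name `check_sample_file` is hypothetical, and reading "50MB" as 50 MiB is an assumption.

```python
from pathlib import Path

# Limits taken from the requirements listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".m4a", ".aac", ".ogg"}
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024  # assumption: "50MB" read as 50 MiB


def check_sample_file(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file passes."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or 'no extension'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 50MB")
    return problems
```

For a file on disk, `check_sample_file(path, os.path.getsize(path))` would run both checks. Sample rate and bit rate are not covered here because reading them depends on the audio format.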
Recording Guidelines
Duration Requirements:
- Minimum: 10 seconds of clear speech
- Recommended: 30-60 seconds for better quality
- Maximum: 10 minutes (longer samples may not improve quality)
Recording Quality:
- Clear Speech: No background noise or music
- Natural Tone: Conversational, not monotone
- Consistent Volume: Steady audio levels throughout
- Single Speaker: Only the target voice in the recording
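For WAV recordings, the duration limits above can be checked with Python's standard `wave` module. This is a sketch under stated assumptions: the function names and the category labels ("too short", "acceptable", and so on) are illustrative and are not values returned by the platform.

```python
import wave

# Thresholds taken from the duration requirements above.
MIN_SECONDS = 10
RECOMMENDED_RANGE = (30, 60)
MAX_SECONDS = 10 * 60


def wav_duration_seconds(path: str) -> float:
    """Duration of a WAV file, computed from frame count and sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def classify_duration(seconds: float) -> str:
    """Map a duration onto the documented limits (labels are illustrative)."""
    if seconds < MIN_SECONDS:
        return "too short"
    if seconds > MAX_SECONDS:
        return "too long"
    low, high = RECOMMENDED_RANGE
    if low <= seconds <= high:
        return "recommended"
    return "acceptable"
```

Other formats such as MP3 or FLAC would need a third-party decoder to read the duration; the classification step stays the same.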
Quality Tips
For Best Results:
- Environment: Record in a quiet room with soft furnishings
- Microphone: Use a good-quality microphone positioned 6-12 inches from your mouth
- Content: Read varied sentences with different emotions
- Consistency: Maintain the same speaking style throughout
- Format: Save in WAV format for highest quality
Getting Started
Step 1: Upload Voice Sample
Navigate to your assistant's configuration and open the Voice Cloning section:
- Upload Audio File: Drag and drop or click to select your audio file
- Add Metadata: Provide a name, description, and tags
- Validation: System automatically validates audio quality
- Processing: File is uploaded and prepared for cloning
Example Upload
Step 2: Create Cloned Voice
Once your sample is uploaded, create a cloned voice:
- Select Provider: Choose ElevenLabs or Resemble AI
- Configure Options: Set voice name, language, and quality settings
- Initiate Cloning: Start the voice training process
- Monitor Progress: Track cloning status in real-time
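If cloning status is tracked from a script rather than the dashboard, the monitoring step can be sketched as a polling loop. Everything here is an assumption made for illustration: the status strings `"ready"` and `"failed"`, the function name `wait_for_voice`, and the `fetch_status` callable, which stands in for whatever call returns the current status.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"ready", "failed"}  # hypothetical status values


def wait_for_voice(
    fetch_status: Callable[[], str],
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until cloning reaches a terminal status or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Stop if the next poll would land past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"voice still '{status}' after {timeout} seconds")
        sleep(poll_interval)
```

Passing the status lookup in as a callable keeps the loop independent of any particular client, and the injectable `sleep` makes it easy to exercise without waiting.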
Example Voice Creation
Step 3: Use Cloned Voice
Once processing is complete, assign the voice to your assistant:
- Voice Selection: Choose from your cloned voices
- Testing: Preview the voice with sample text
- Assignment: Set as the assistant's default voice
- Go Live: Start using the voice in live calls
Voice Management
Voice Library
Organizing Voices
Voice Categories:
- Brand Voices: Official company voices
- Character Voices: Specific personas or characters
- Language Variants: Same voice in different languages
- Seasonal/Campaign: Temporary or promotional voices
Tagging Tips:
- Use consistent tags for easy filtering
- Include language, gender, and style descriptors
- Add use case tags (customer service, sales, etc.)
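Consistent tags pay off when you filter a voice library. The sketch below illustrates the idea with a hypothetical `VoiceEntry` record and `filter_by_tags` helper; neither reflects the platform's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class VoiceEntry:
    # Hypothetical record shape, for illustration only.
    name: str
    tags: set[str] = field(default_factory=set)


def filter_by_tags(voices: list[VoiceEntry], required: set[str]) -> list[VoiceEntry]:
    """Keep voices that carry every required tag (case-insensitive)."""
    wanted = {tag.lower() for tag in required}
    return [v for v in voices if wanted <= {t.lower() for t in v.tags}]


library = [
    VoiceEntry("Ava", {"en", "female", "customer-service"}),
    VoiceEntry("Max", {"en", "male", "sales"}),
]
sales_voices = filter_by_tags(library, {"EN", "sales"})
```

Matching is case-insensitive so that `EN` and `en` are treated as the same tag, which softens the cost of small inconsistencies in naming.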
Voice Analytics
Usage Tracking:
- Synthesis Count: Number of times voice was used
- Duration Metrics: Total audio generated
- Cost Tracking: Provider usage and billing
- Performance: Quality scores and user feedback
Analytics Insights:
- Most/least used voices
- Cost per synthesis by provider
- Quality trends over time
- User preference patterns
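A metric such as cost per synthesis by provider is a grouped average over usage records. The sketch below assumes each record is a dictionary with `provider` and `cost` keys; that shape is hypothetical and is not the platform's export format.

```python
from collections import defaultdict


def cost_per_synthesis(records: list[dict]) -> dict[str, float]:
    """Average cost of one synthesis call, grouped by provider."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        totals[record["provider"]] += record["cost"]
        counts[record["provider"]] += 1
    return {provider: totals[provider] / counts[provider] for provider in totals}
```

Calling it with a list of such records returns one average per provider name, which can then be compared across providers for the same workload.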
Voice Testing
Test your cloned voices before deployment:
- Text-to-Speech Preview: Enter sample text to hear the voice
- Quality Assessment: Evaluate clarity, naturalness, and accuracy
- Comparison Testing: Compare with original samples and other voices
- A/B Testing: Test different voices with real users
API Integration
Upload Voice Sample
Create Cloned Voice
List Cloned Voices
Best Practices
Recording Quality
Professional Recording Setup
Equipment Recommendations:
- Microphone: USB condenser microphone (Audio-Technica AT2020USB+)
- Environment: Quiet room with minimal echo
- Software: Audacity, GarageBand, or professional DAW
- Monitoring: Use headphones to monitor audio quality
Recording Techniques:
- Consistent Distance: Maintain 6-12 inches from microphone
- Proper Levels: Keep audio peaks between -12dB and -6dB
- Room Treatment: Use blankets or acoustic foam to reduce echo
- Multiple Takes: Record several versions and choose the best
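The level guideline above can be verified on a recording by measuring its peak. This sketch assumes the -12dB to -6dB range refers to dBFS (decibels relative to digital full scale) and that samples are signed integer PCM; both function names are illustrative.

```python
import math


def peak_dbfs(samples: list[int], sample_width_bytes: int = 2) -> float:
    """Peak level in dBFS for signed integer PCM samples."""
    full_scale = 2 ** (8 * sample_width_bytes - 1)  # 32768 for 16-bit audio
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")  # digital silence has no finite level
    return 20 * math.log10(peak / full_scale)


def peak_in_target_range(samples: list[int], low_db: float = -12.0, high_db: float = -6.0) -> bool:
    """True when the peak falls inside the recommended window."""
    return low_db <= peak_dbfs(samples) <= high_db
```

A 16-bit sample at half of full scale (16384) sits at roughly -6.02 dBFS, just inside the upper end of the window.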
Content Selection
Ideal Voice Sample Content:
- Varied Sentences: Different sentence structures and lengths
- Emotional Range: Include slight variations in tone
- Natural Speech: Conversational delivery rather than a read-aloud tone
- Complete Thoughts: Full sentences with natural pauses
Avoid:
- Background noise or music
- Multiple speakers
- Heavy accents (unless desired)
- Monotone or robotic delivery
- Incomplete sentences or stuttering
Voice Management
- Naming Convention: Use descriptive, consistent names
- Version Control: Keep track of voice iterations and improvements
- Usage Documentation: Document which voices work best for different scenarios
- Regular Testing: Periodically test voice quality and user satisfaction
- Cost Monitoring: Track usage and costs across different providers
Security and Privacy
- Consent: Always obtain explicit consent before using someone's voice
- Data Protection: Store voice samples securely and follow GDPR/CCPA requirements
- Access Control: Limit who can create and manage cloned voices
- Audit Trail: Keep logs of voice creation and usage
- Retention Policy: Define how long voice samples and models are stored
Troubleshooting
Common Issues
Upload Problems
File Upload Fails:
- Check that the file format is supported (MP3, WAV, FLAC, M4A, OGG)
- Ensure file size is under 50MB
- Verify audio duration is between 10 seconds and 10 minutes
- Check internet connection stability
Poor Audio Quality:
- Use a higher sample rate (22kHz+) and bit rate (128kbps+)
- Remove background noise using audio editing software
- Re-record in a quieter environment
- Check microphone positioning and levels
Voice Creation Issues
Cloning Process Fails:
- Verify provider API credentials are valid
- Check account balance with voice cloning provider
- Ensure voice sample meets provider requirements
- Contact provider support for specific error messages
Poor Voice Quality:
- Use higher-quality source audio
- Try a different provider (ElevenLabs vs Resemble AI)
- Experiment with quality enhancement settings
- Consider recording new samples with better equipment
Performance Issues
Slow Processing:
- Provider processing times vary (ElevenLabs: seconds, Resemble: minutes)
- Check provider status pages for service issues
- Large files take longer to process
- Peak usage times may cause delays
High Costs:
- Monitor usage through the analytics dashboard
- Set usage limits and alerts
- Compare provider pricing for your use case
- Optimize voice selection for cost efficiency
Provider Comparison
| Feature | ElevenLabs | Resemble AI | Coming Soon |
|---|---|---|---|
| Processing Time | Seconds | Minutes | Varies |
| Quality | Excellent | Excellent | TBD |
| Languages | 29+ | English+ | TBD |
| Cost Model | Per character | Per synthesis | TBD |
| Sample Requirements | 30s+ | 60s+ | TBD |
| Instant Preview | β | β | TBD |
| Emotional Control | Basic | Advanced | TBD |
| Enterprise Features | Limited | Full | TBD |
Use Cases
Customer Service
- Consistent Brand Voice: Maintain brand identity across all interactions
- Multilingual Support: Create voices in different languages for global support
- Personality Matching: Match voice characteristics to brand personality
Sales and Marketing
- Campaign Voices: Create specific voices for marketing campaigns
- Regional Variants: Adapt voices for different geographical markets
- Seasonal Adjustments: Modify voice characteristics for holidays or events
Entertainment and Media
- Character Voices: Create unique voices for virtual characters
- Narrator Voices: Professional voices for content narration
- Interactive Experiences: Engaging voices for games and interactive media
Enterprise Applications
- Executive Voices: Clone executive voices for consistent communication
- Training Systems: Consistent voices for e-learning and training
- Brand Ambassadors: Virtual representatives with authentic brand voices
Getting Help
Documentation
Complete TTS provider documentation
Voice Tuning
Advanced voice configuration guide
Community Support
Get help from the community
Technical Support
Contact our support team
Pro Tip: Start with ElevenLabs for quick prototyping and testing, then consider Resemble AI for production deployments requiring advanced customization and enterprise features.