
Amazon Polly - Text-to-Speech

Amazon Polly converts text into natural-sounding speech, with over 60 voices in more than 30 languages for applications and content.

Machine Learning
Pricing Model: Pay per character
Availability: All major regions
Data Sovereignty: EU regions available
Reliability: 99.9% availability SLA

What is Amazon Polly?

Amazon Polly is a text-to-speech service that converts text into natural-sounding speech. The service offers over 60 voices in more than 30 languages and is suitable for applications, accessibility features, and content creation.

Polly's Neural Text-to-Speech (NTTS) voices use deep learning to produce particularly natural-sounding speech. The API is small: a basic integration is a single request that takes text and returns audio.
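As a rough illustration of what such an integration can look like, here is a minimal sketch using the AWS SDK for Python (boto3). The helper names `build_request` and `synthesize_to_file` are hypothetical, and the voice `Joanna` is only an assumed default; voice and engine availability depend on the region.

```python
from pathlib import Path


def build_request(text: str, voice_id: str = "Joanna", engine: str = "neural") -> dict:
    """Assemble the keyword arguments for Polly's synthesize_speech call."""
    return {
        "Text": text,
        "TextType": "text",
        "VoiceId": voice_id,   # assumed voice; check availability in your region
        "Engine": engine,      # "neural" selects an NTTS voice
        "OutputFormat": "mp3",
    }


def synthesize_to_file(text: str, target: Path, **options) -> Path:
    """Send the request to Polly and write the returned audio to disk."""
    import boto3  # imported lazily so build_request works without the SDK installed

    client = boto3.client("polly")  # credentials and region come from the environment
    response = client.synthesize_speech(**build_request(text, **options))
    target.write_bytes(response["AudioStream"].read())
    return target
```

Calling `synthesize_to_file("Hello from Polly", Path("hello.mp3"))` would write an MP3 file, provided AWS credentials with Polly permissions are configured.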

Core Features

  • Neural Voices: Natural-sounding speech with NTTS technology
  • 30+ Languages: German, English, French, Spanish, and many more
  • SSML Support: Fine control over pronunciation, pauses, and emphasis
  • Speech Marks: Timing information for lip-sync and text highlighting
  • Lexicons: Custom pronunciation dictionaries
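To make the lexicon feature more concrete, the following sketch builds a small pronunciation lexicon in the W3C Pronunciation Lexicon Specification (PLS) format and uploads it with boto3. The helper names and the example entries are hypothetical; it only covers simple alias entries, not phonetic transcriptions.

```python
from xml.sax.saxutils import escape

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_lexicon(aliases: dict, lang: str = "en-US") -> str:
    """Render a PLS document that maps written forms to spoken aliases."""
    lexemes = "".join(
        f"  <lexeme><grapheme>{escape(word)}</grapheme>"
        f"<alias>{escape(spoken)}</alias></lexeme>\n"
        for word, spoken in aliases.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NS}" alphabet="ipa" xml:lang="{lang}">\n'
        f"{lexemes}</lexicon>\n"
    )


def upload_lexicon(name: str, content: str) -> None:
    """Store the lexicon in the current region under the given name."""
    import boto3  # lazy import: build_lexicon needs no AWS access

    boto3.client("polly").put_lexicon(Name=name, Content=content)
```

A stored lexicon is then applied by passing its name in the `LexiconNames` parameter of a synthesis request.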

Typical Use Cases

Voice Assistants: Speech output for chatbots, IVR systems, and smart home devices. Neural voices help conversations sound more natural.

Accessibility: Read web content, documents, and app screens aloud for visually impaired users. Audio alternatives can support WCAG conformance efforts.

Content Creation: Audio versions of articles, e-learning content, and podcasts. Automated production saves time and costs.
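Articles are often longer than a single synthesis request accepts, so an automated pipeline usually splits the text first. The sketch below packs whole sentences into chunks; the function name is hypothetical, and the default of 3000 characters is an assumption that should be checked against the current Polly quotas.

```python
import re


def split_for_synthesis(text: str, max_chars: int = 3000) -> list:
    """Greedily pack whole sentences into chunks of at most max_chars characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
            continue
        if current:
            chunks.append(current)
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately and the resulting audio segments concatenated in order.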

Benefits

  • Natural-sounding speech with Neural TTS
  • Pay-per-character without minimum fees
  • Simple REST API for quick integration
  • Support for German voices

Integration with innFactory

As an AWS Reseller, innFactory supports you with Amazon Polly. We help with integration into your applications, optimization of speech quality with SSML, and combination with other AWS services such as Amazon Lex and Amazon Connect.

Frequently Asked Questions

What is Amazon Polly?

Amazon Polly is a text-to-speech service that converts text into natural-sounding speech. It offers over 60 voices in more than 30 languages, including neural voices with high speech quality.

What are neural voices?

Neural Text-to-Speech (NTTS) uses deep learning for more natural speech synthesis. Neural voices sound more human-like than standard voices, with better intonation and emphasis.

Which output formats are supported?

Polly returns audio as MP3, OGG Vorbis, or PCM, and returns Speech Marks as JSON. Speech Marks provide timing information for lip-sync or text highlighting.
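The following sketch shows how Speech Marks might be consumed for text highlighting. Polly returns them as one JSON object per line; the helper names and the sample payload are illustrative assumptions, not output captured from the service.

```python
import json


def parse_speech_marks(payload: str) -> list:
    """Parse newline-delimited JSON speech marks into a list of dicts."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def word_timings(marks: list) -> list:
    """Return (milliseconds from start, word) pairs for word-level marks."""
    return [(m["time"], m["value"]) for m in marks if m["type"] == "word"]


def fetch_word_marks(text: str, voice_id: str = "Joanna") -> list:
    """Request word-level speech marks instead of audio."""
    import boto3  # lazy import: the parsing helpers need no AWS access

    response = boto3.client("polly").synthesize_speech(
        Text=text,
        VoiceId=voice_id,          # assumed voice
        OutputFormat="json",       # speech marks are returned as JSON lines
        SpeechMarkTypes=["word"],
    )
    return parse_speech_marks(response["AudioStream"].read().decode("utf-8"))


# Illustrative payload in the shape of word-level speech marks.
SAMPLE = (
    '{"time":0,"type":"word","start":0,"end":5,"value":"Hello"}\n'
    '{"time":420,"type":"word","start":6,"end":11,"value":"world"}\n'
)
```

A player can compare the current playback position with each `time` value to decide which word to highlight.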

How can I customize pronunciation?

SSML tags enable control over pauses, emphasis, pronunciation, and speaking rate. Lexicons store custom pronunciation dictionaries.
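As a small example of SSML control, this sketch wraps sentences in a `<prosody>` element for speaking rate and separates them with `<break>` pauses. The function name and defaults are hypothetical; which SSML tags take effect can differ between standard and neural voices.

```python
from xml.sax.saxutils import escape


def build_ssml(sentences: list, pause_ms: int = 400, rate: str = "medium") -> str:
    """Join sentences with fixed pauses and apply one speaking rate to all of them."""
    pause = f'<break time="{pause_ms}ms"/>'
    body = pause.join(escape(sentence) for sentence in sentences)
    return f'<speak><prosody rate="{rate}">{body}</prosody></speak>'


if __name__ == "__main__":
    print(build_ssml(["Welcome.", "Terms & conditions apply."], pause_ms=250, rate="slow"))
```

The resulting string is sent as the `Text` parameter of a synthesis request with `TextType` set to `ssml`.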

AWS Cloud Expertise

innFactory is an AWS Reseller with certified cloud architects. We provide consulting, implementation, and managed services for AWS.

Ready to start with Amazon Polly - Text-to-Speech?

Our certified AWS experts help you with architecture, integration, and optimization.
