SDK

TTS libraries for applications and devices.

Add speech - enhance user experience

The IVONA SDK enables integration of high-quality, advanced Text-to-Speech technology into consumer electronics, mobile applications and server solutions. The IVONA SDK includes easy-to-use IVONA TTS APIs for the development of feature-rich applications that empower intelligent voice user interfaces. The IVONA SDK is available for a variety of operating systems and hardware platforms.

Benefits for you

  • Maximum performance on each device - voices are always optimized for your platform.
  • Easy integration - code examples illustrate how to use the speech libraries.
  • Standards-based - supports W3C SSML and PLS, along with the following features:
    • Dynamic voice and language switching
    • Support for phonetic alphabets
    • Word-highlighting (alignment of text with audio)
    • Visemes (lip-sync), SSML events
    • Prosody control (volume, speed, pitch)
    • User-level pronunciation lexicon

OEM Packages

For developers and hardware manufacturers who prefer to use OS-specific TTS APIs for Android and Windows, we offer the following packages:

  • Android OEM Package: You will be able to use IVONA TTS through the standard Android TTS interface in all speech-enabled applications.
  • SAPI OEM Package: You will be able to use IVONA TTS through the standard Windows Speech API (SAPI) interface in speech-enabled applications (up to Windows 7).

Recommended SDK Uses

Consumer Devices

Deploy cutting-edge speech applications in your products.

Connected TV

Add speech to enrich dynamic multimedia content.

Automotive

Enable drivers to keep their hands on the wheel, stay connected and drive safely.

Announcement Systems

Generate real-time speech to support efficient, up-to-date announcements.

Accessibility

Remove barriers to information access and improve communication.

Education

Improve learning outcomes with Text-to-Speech.

Digital Publishing

Create professional recordings and audio content.

Text-to-Speech Cloud

If you want to embed a TTS engine into your application or you need TTS functionality, you may benefit from IVONA Speech Cloud.

IVONA SDK Specifications

Technology

BrightVoice

Natural, lifelike voices resulting from an innovative approach to unit selection technology. Reduced unnatural discontinuities, electronic noise, and audible glitches. High accuracy through sophisticated NLP algorithms built into the TTS engine. Support for natural reading of both short and long texts.

Languages and voices

See voices list at http://www.ivona.com/en/voices-list/

Prosody control

Ability to adjust volume, speech rate and pitch at runtime.
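
Because the SDK lists W3C SSML support, one standards-based way to express these adjustments is the SSML prosody element. The following is a minimal sketch in Python that only builds the SSML markup. The function name build_prosody_ssml is hypothetical, and how the markup is handed to the IVONA engine is not shown because that API is not described here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_prosody_ssml(text, volume="medium", rate="medium",
                       pitch="medium", lang="en-US"):
    """Return an SSML document that wraps `text` in a <prosody> element.

    Hypothetical helper: it produces markup only and does not call any
    TTS engine. The values use the SSML labels (for example "soft",
    "loud", "slow", "fast", "low", "high").
    """
    speak = ET.Element("speak", {
        "version": "1.0",
        "xmlns": SSML_NS,
        "xml:lang": lang,
    })
    prosody = ET.SubElement(speak, "prosody", {
        "volume": volume,
        "rate": rate,
        "pitch": pitch,
    })
    prosody.text = text  # special characters are escaped on serialization
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_prosody_ssml("Your train departs in five minutes.",
                             volume="loud", rate="slow"))
```

Building the markup with an XML library rather than string concatenation keeps characters such as "&" and "<" in the input text from breaking the document.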

Built-in domains support

IVONA TTS has built-in mechanisms to correctly pronounce texts coming from specific communicative contexts such as social text, acronyms, abbreviations and numbers.

Mixing static expressive prompts

Mechanism to mix static audio prompts with dynamically generated TTS output.

Support for phonetic alphabets

IPA, X-SAMPA, TeleAtlas®, Navteq™

Standards compliance

W3C SSML 1.0/1.1, W3C PLS 1.0 (with IVONA extensions)

Support for text highlighting

Ability to synchronize audio with text by highlighting the words and sentences as they are spoken by the TTS engine.
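
On the application side, highlighting usually comes down to finding which word is being spoken at the current playback position. The sketch below assumes the engine reports, for each word, a start time plus a character offset and length in the input text. The WordMark structure and the active_word function are hypothetical and are not the SDK's actual event format.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class WordMark:
    """Hypothetical word-alignment record (not the SDK's real format)."""
    start_ms: int      # when the word starts in the audio
    char_offset: int   # index of the word's first character in the text
    length: int        # number of characters in the word


def active_word(marks, position_ms):
    """Return the mark of the word being spoken at `position_ms`.

    `marks` must be sorted by start_ms. Returns None before the first word.
    """
    starts = [m.start_ms for m in marks]
    index = bisect_right(starts, position_ms) - 1
    return marks[index] if index >= 0 else None


if __name__ == "__main__":
    text = "Hello brave world"
    marks = [WordMark(0, 0, 5), WordMark(400, 6, 5), WordMark(800, 12, 5)]
    mark = active_word(marks, 450)
    print(text[mark.char_offset:mark.char_offset + mark.length])
```

A binary search keeps the lookup cheap even when it runs on every UI refresh for long texts.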

Support for lip synchronization

Ability to provide applications with a synchronized stream of visemes, the visual representations of speech sounds.

Requirements

Runtime memory (RAM)

5-13 MB

Storage memory

Server solutions: > 250 MB
Desktop/mobile solutions: ~ 150 MB
Embedded solutions: 60-80 MB

CPU

500 MHz

Chipset

x86 (32/64-bit); ARM 7, 8, 9, 11; SH-4

OS

Linux, Windows, Android, iOS, Mac OS X, STLinux

Product features

Audio formats

PCM 16 bit mono

Sampling rate

8 kHz, 16 kHz, 22.05 kHz
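
Raw 16-bit mono PCM carries no header, so the application has to know the sampling rate to play or store it. The sketch below uses Python's standard wave module to wrap a PCM buffer in a WAV container at one of the listed rates. The helper names are hypothetical, and the PCM bytes are assumed to come from the TTS engine as little-endian 16-bit samples.

```python
import io
import wave

# Sampling rates listed in the specification, in Hz.
SUPPORTED_RATES_HZ = (8000, 16000, 22050)
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono


def bytes_per_second(sample_rate_hz):
    """Data rate of 16-bit mono PCM at the given sampling rate."""
    return sample_rate_hz * BYTES_PER_SAMPLE * CHANNELS


def pcm_to_wav_bytes(pcm, sample_rate_hz):
    """Wrap raw 16-bit mono PCM bytes in a WAV container."""
    if sample_rate_hz not in SUPPORTED_RATES_HZ:
        raise ValueError(f"unsupported sampling rate: {sample_rate_hz}")
    if len(pcm) % BYTES_PER_SAMPLE:
        raise ValueError("PCM buffer must hold whole 16-bit samples")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(BYTES_PER_SAMPLE)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(pcm)
    return buffer.getvalue()


if __name__ == "__main__":
    one_second_of_silence = bytes(bytes_per_second(22050))
    wav_bytes = pcm_to_wav_bytes(one_second_of_silence, 22050)
    print(len(wav_bytes), "bytes")
```

At 22.05 kHz the data rate is 44,100 bytes per second, which is useful when sizing audio buffers on memory-constrained devices.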

User-level pronunciation lexicon

(with regular expression rules support)

PLS (Pronunciation Lexicon Specification) support
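
A user lexicon in W3C PLS 1.0 form is an XML document that maps written forms (graphemes) to pronunciations (phonemes). The sketch below builds a standard PLS document in Python. The function name build_pls_lexicon and the sample entry are illustrative only, and the IVONA extensions such as regular expression rules are not shown because their syntax is not described here.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C Pronunciation Lexicon Specification 1.0.
PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def build_pls_lexicon(entries, lang="en-US", alphabet="ipa"):
    """Return a PLS 1.0 document for (grapheme, pronunciation) pairs.

    Hypothetical helper: it emits plain PLS markup and does not cover
    any vendor-specific extensions.
    """
    lexicon = ET.Element("lexicon", {
        "version": "1.0",
        "xmlns": PLS_NS,
        "alphabet": alphabet,
        "xml:lang": lang,
    })
    for grapheme, pronunciation in entries:
        lexeme = ET.SubElement(lexicon, "lexeme")
        ET.SubElement(lexeme, "grapheme").text = grapheme
        ET.SubElement(lexeme, "phoneme").text = pronunciation
    return ET.tostring(lexicon, encoding="unicode")


if __name__ == "__main__":
    # Illustrative entry: a British English pronunciation written in IPA.
    print(build_pls_lexicon([("tomato", "təˈmɑːtəʊ")], lang="en-GB"))
```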