Accurate and human-like synthetic speech

IVONA Software continuously improves the accuracy and quality of its TTS voices to provide great listening experiences to all users, including business customers and individuals who use IVONA text-to-speech technologies.

IVONA TTS voices are developed using BrightVoice technology and provide lifelike, expressive reading of words, sentences, paragraphs and entire books or speeches.

BrightVoice Technology

The purpose of a Text-to-Speech system is to convert any text into natural-sounding speech. The conversion proceeds in several steps:
  • First, the text needs to be normalized. Normalization is the process of transforming text into a single canonical form; as part of it, the text is parsed into individual tokens.
  • Next, the text-to-speech system assigns each word the appropriate phonetic transcription, which reflects how the word should be pronounced in the given natural language. The synthesizer then converts these symbolic linguistic representations into sound.
  • The last step is to choose the right speech units, which ensures that the generated speech is of high quality and sounds natural.
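The first two steps can be illustrated with a minimal sketch. It is not IVONA's implementation: the function names, the tiny lexicon and the ARPAbet-style phoneme symbols are all assumptions made for illustration, and the unit selection step is not modeled.

```python
import re

# Toy pronunciation lexicon (hypothetical entries, ARPAbet-style symbols).
# A production lexicon holds many thousands of words per language.
LEXICON = {
    "hello": ["HH", "AH", "L", "OW"],
    "world": ["W", "ER", "L", "D"],
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def transcribe(tokens):
    """Pair each token with its phonemes, or None if the word is unknown."""
    return [(token, LEXICON.get(token)) for token in tokens]


def front_end(text):
    """Run tokenization followed by lexicon lookup."""
    return transcribe(tokenize(text))


if __name__ == "__main__":
    for word, phonemes in front_end("Hello, world!"):
        print(word, phonemes)
```

Words missing from the lexicon come back as None in this sketch; a full system would typically fall back to letter-to-sound rules at that point.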


BrightVoice technology brings quality to all levels of speech synthesis and provides:
  • Intelligibility: listeners can more easily understand the text that is read aloud.
  • Optimized text normalization accuracy, which allows TTS to expand acronyms, abbreviations, numbers, dates, etc. into a form that people can readily understand. IVONA TTS uses Natural Language Processing algorithms to correctly process text input.
  • Natural-sounding text-to-speech: realistic voices that are indistinguishable from human voices, because the characteristics of the voice talent are incorporated into the synthetic speech.
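To give a feel for the text normalization mentioned in the list above, here is a small sketch that expands a few abbreviations and spells out whole numbers below 1000. The abbreviation table and the function names are hypothetical, and the approach is a plain dictionary lookup, not the Natural Language Processing that IVONA TTS uses.

```python
# Hypothetical abbreviation table. Real systems need context to decide,
# for example, whether "St." means "street" or "saint".
ABBREVIATIONS = {"dr.": "doctor", "st.": "street", "etc.": "et cetera"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer from 0 to 999 in English."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else TENS[tens] + "-" + ONES[ones]
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words if rest == 0 else words + " " + number_to_words(rest)


def normalize(text):
    """Expand known abbreviations and small whole numbers, token by token."""
    out = []
    for token in text.split():
        lowered = token.lower()
        if lowered in ABBREVIATIONS:
            out.append(ABBREVIATIONS[lowered])
        elif token.isdecimal() and int(token) < 1000:
            out.append(number_to_words(int(token)))
        else:
            out.append(token)  # leave everything else unchanged
    return " ".join(out)


if __name__ == "__main__":
    print(normalize("Dr. Smith lives at 221 Baker St."))
```

Dates, currencies, ordinals and larger numbers are left untouched by this sketch; handling them well is where most of the effort in a real normalizer goes.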

RVD - Text-to-Speech development process

IVONA has developed, and continues to improve, its Rapid Voice Development (RVD) process for data-driven TTS development. The RVD process consists of well-defined steps for TTS language development, supported by a set of tools and techniques that make it possible to model linguistic characteristics including sub-vocalization, accentuation and intonation.

IVONA Software uses a crowdsourcing approach in which native speakers validate the quality of TTS at each stage of the RVD process.

Technology used in millions of apps, devices and services

We continuously improve the accuracy and quality of IVONA voices to provide rich interactive experiences to millions of people who use our Text-to-Speech technologies.
