IVONA Speech Server

Generate high-quality speech on your server.

IVONA Speech Server - Technical specification

IVONA SpeechServer
IVONA SpeechServer SAPI
Requirements
CPU requirements
x86 (32/64-bit), SPARC* (32/64-bit)
x86 (32/64-bit)
RAM
Recommended minimum: 128 MB per voice
Recommended configurations
CPU, RAM
Characters converted per second by one core
Xeon® X5670 @ 3 GHz, 64-bit, 8 GB of RAM
650 – 2900 (1)
Xeon® X5650 @ 2.67 GHz, 32-bit, 8 GB of RAM
620 – 2700 (1)
(1) Depending on the voice/language used
OS
Linux, Mac OS X*, FreeBSD*, Solaris*
Windows, Windows Server
Interfaces
command line, TCP/IP, Unix socket
SAPI 5, SAPI 4*, command line
Standards compliance
W3C SSML 1.0/1.1, W3C PLS 1.0 (with IVONA extensions)
W3C SSML 1.0/1.1, W3C PLS 1.0 (with IVONA extensions), SAPI markup (with support for mixing with SSML tags)
Key features
BrightVoice: superior speech output quality
Languages and voices
US English; British English; US Spanish; Spanish; German; French; Polish; Romanian; Welsh; Welsh English; Australian English; Italian; Icelandic; Canadian French; Brazilian Portuguese; Dutch; Danish; Portuguese**; Russian**; Japanese**; Turkish**; DE-English** (frequently used Anglicisms implemented in the German language model); Flemish**; Special Effects (SFX)/Character voices*
Sampling rate
22 kHz (up to 48 kHz*)
Audio formats
PCM 16 bit mono/stereo*
PCM 16 bit mono/stereo*
Components
speech server (daemon), tools, documentation, examples (C/C++, PHP, Perl)
SAPI component, tools, documentation
Supported events
word-highlighting (alignment of text with audio), visemes (lip-sync), SSML events
word-highlighting (alignment of text with audio), visemes (lip-sync), SAPI bookmarks, SSML events
Scalability by multiplying speech servers
Low response (start/stop) time
Prosody control: volume, speed, pitch
User-level pronunciation lexicon
(with regular expression rules support)
(with regular expression rules support)
Language detection
**
**
Phoneme mapping for mixed languages input
**
**
Text preprocessing rules for specific domains
*
*
Dynamic voice and language switching
Mixing static expressive prompts
Custom voices: voice branding or custom accent and/or style
Support for phonetic alphabets
IPA, X-SAMPA, TeleAtlas®, Navteq™
IPA, X-SAMPA, TeleAtlas®, Navteq™
Expressive TTS effects
**
**
Support for text highlighting
Contrast (feature to extract speech from background noise)
**
**
Support for handling request peaks
**
**
* Available on request
** Feature in development

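Prosody control (volume, speed, pitch) is expressed in W3C SSML through the `prosody` element. As a minimal sketch, the Python snippet below builds a standard SSML 1.0 document using only the standard library. The helper name `build_ssml` is hypothetical, and how the resulting document is passed to the server (command line, TCP/IP, Unix socket or SAPI) is not shown because this page does not describe it.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"   # W3C SSML namespace
XML_NS = "http://www.w3.org/XML/1998/namespace"   # namespace of xml:lang


def build_ssml(text, lang="en-US", volume="medium", rate="medium", pitch="medium"):
    """Wrap text in an SSML <prosody> element (hypothetical helper).

    The attribute values used here are the predefined SSML labels,
    for example rate="slow" or pitch="high".
    """
    ET.register_namespace("", SSML_NS)  # serialize SSML as the default namespace
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    prosody = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}prosody",
        {"volume": volume, "rate": rate, "pitch": pitch},
    )
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Welcome to the service.", rate="slow", pitch="high"))
```

Whether a given voice honors every prosody value is engine-specific, so the labels above should be checked against the product documentation.
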
Examples of implementations

The quality of the IVONA brand is evidenced not only by our numerous awards and honors. We are especially proud of the large number of business partners who have placed their trust in us.

Professional support

ISMB - provides expert technical assistance for businesses and includes free software updates.
Upgrade - a special upgrade offer on business products for customers who do not have an ISMB package.

IVONA Speech Server is used by: