| Feature | IVONA SDK | IVONA Speech Cloud (SaaS) |
|---|---|---|
| Requirements | | |
| Storage memory (ROM) per voice | 100-250 MB; < 50 MB**; < 20 MB** | 0 MB |
| Runtime memory (RAM) | 5-13 MB; < 5 MB** | 0 MB |
| CPU | Scalable | Under 50 MIPS |
| Chipset | X86 (32/64 bit); ARM 7,8,9,11; Strong ARM; X-Scale; Sparc (32/64 bit)*; PowerPC*; MIPS* | |
| OS | Linux, Windows, Android, Windows Mobile, Windows CE, iOS, MeeGo, Mac OS X, Solaris*, FreeBSD* | |
| API | IVONA C/C++ API, IVONA Java API, TCP/IP, Unix socket, SAPI 4*, SAPI 5 | Web Services (SOAP), IVONA C/C++ API**, IVONA Java API** |
| Key features | | |
| BrightVoice™: superior speech output quality | | |
| Languages and voices | US English (2 male, 3 female, child); British English (male, 2 female); US Spanish (male, female, child**); Spanish (male, female, child**); German (male, female); French (male, female); Polish (2 male, 3 female); Romanian (female); Welsh; Welsh English; Australian English; Italian; Dutch**; Canadian French**; Brazilian Portuguese**; Portuguese**; Icelandic**; Russian**; Korean**; Danish**; Swedish**; Japanese**; Special Effects (SFX)/Character voices* | |
| Sampling rate | 8 kHz, 22 kHz (up to 48 kHz*) | 8 kHz, 22 kHz (up to 48 kHz*) |
| Audio formats | PCM 16 bit mono/stereo*, A-law, μ-law, mp3*, vorbis (ogg)* | mp3, vorbis (ogg), PCM 16 bit mono*/stereo**, A-law, μ-law |
| Low response (start/stop) time | | |
| SDK components | Libraries, documentation, examples (C/C++, PHP), support; Supported events: word-highlighting (alignment of text with audio), visemes (lip-sync), bookmarks, SSML events | |
| Prosody control: volume, speed, pitch | | |
| User level pronunciation lexicon | (with regular expression rules support) | |
| Language detection | ** | ** |
| Phoneme mapping for mixed languages input | ** | ** |
| Text preprocessing rules for specific domains | * | * |
| Dynamic voice and language switching | | |
| Mixing static expressive prompts | | |
| Custom voices: voice branding or custom accent and/or style | | |
| Support for natural reading of long texts | | |
| Support for phonetic alphabets | IPA, X-SAMPA, TeleAtlas®, Navteq™ | IPA, X-SAMPA, TeleAtlas®, Navteq™ |
| Expressive TTS effects | ** | ** |
| Standards compliance | W3C SSML 1.0/1.1, W3C PLS 1.0 (with IVONA extensions) | W3C SSML 1.0/1.1 (with IVONA extensions) |
| Built-in audio effects | ** | |
| Support for text highlighting | | |
| REST support | N/A | |