polySVOX - Mixed-lingual Text Analysis

Within the polyglot TTS synthesis system, mixed-lingual text analysis is responsible for generating the correct phone sequence of the sentence to be uttered, correctly identifying the sentence's base language and the languages of foreign inclusions, and producing a useful syntactic structure on which subsequent sentence accentuation and phrasing rely. More information about mixed-lingual text analysis in the polySVOX system can be found in [Rom03], [PR03], [RP04], [RP06].

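The example sentences, each accompanied by its morpho-syntactic tree and synthesized speech signal, demonstrate these topics. They are grouped into the following sections: Language Detection, Identification of Syntactic Words, Disambiguation of Homographs, and Identification of Sentence Boundaries.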

Different languages in the texts and the morpho-syntactic trees are distinguished by the following colors:
English French German Italian Spanish


Language Detection

Mixed-lingual sentences contain inclusions of one or more foreign languages. The size and type of such inclusions vary widely, ranging from part of a word to a whole phrase.
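The range of inclusion sizes can be made concrete with a small sketch. It is only an illustration of the data involved, not the polySVOX analysis itself: the language labels are assigned by hand rather than detected, punctuation is left out, and the names `Morph`, `de`, and `foreign_inclusions` as well as the split of "Musicalprogramm" into "Musical" + "programm" are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Morph:
    """A word part with a hand-assigned language label (hypothetical type)."""
    text: str
    lang: str  # e.g. "de", "fr", "en"


Token = List[Morph]  # one word form = one or more morphs


def de(word: str) -> Token:
    """Helper: a word form consisting of a single German morph."""
    return [Morph(word, "de")]


def foreign_inclusions(tokens: List[Token], base: str) -> List[Tuple[str, str, str]]:
    """List foreign inclusions as (text, language, kind).

    kind is "part of word", "word", or "phrase", depending on how much
    of the sentence the inclusion covers.
    """
    found: List[Tuple[str, str, str]] = []
    run: List[str] = []   # consecutive fully foreign words of one language
    run_lang = None

    def flush() -> None:
        nonlocal run, run_lang
        if run:
            kind = "phrase" if len(run) > 1 else "word"
            found.append((" ".join(run), run_lang, kind))
        run, run_lang = [], None

    for token in tokens:
        langs = {m.lang for m in token}
        word = "".join(m.text for m in token)
        if len(langs) == 1 and base not in langs:
            # The whole word is foreign: extend or start a run.
            lang = next(iter(langs))
            if run_lang != lang:
                flush()
            run.append(word)
            run_lang = lang
        else:
            # Base-language or mixed word: close the run, report foreign parts.
            flush()
            for m in token:
                if m.lang != base:
                    found.append((m.text, m.lang, "part of word"))
    flush()
    return found


# "Man mag das als dernier cri der Mode bezeichnen."
phrase_example = [de("Man"), de("mag"), de("das"), de("als"),
                  [Morph("dernier", "fr")], [Morph("cri", "fr")],
                  de("der"), de("Mode"), de("bezeichnen")]

# "Wir sind partout nicht dazu bereit."
word_example = [de("Wir"), de("sind"), [Morph("partout", "fr")],
                de("nicht"), de("dazu"), de("bereit")]

# The word "Musicalprogramm", split by hand into an English and a German part.
morph_example = [[Morph("Musical", "en"), Morph("programm", "de")]]

print(foreign_inclusions(phrase_example, "de"))
print(foreign_inclusions(word_example, "de"))
print(foreign_inclusions(morph_example, "de"))
```

The three calls report a French phrase, a single French word, and an English word part inside a German word, matching the three inclusion sizes described above.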

Mixed-lingual sentences containing foreign words

Syntax Tree Audio File
Man mag das als dernier cri der Mode bezeichnen.
wav
Wir sind partout nicht dazu bereit.
wav
Im Wachstumsmarkt ist à la carte besser als das Menu.
wav
Aber à la longue langweilte mich das.
wav
Dies gilt grosso modo für fast alle politischen Bereiche.
wav
She's not really au fait with my ideas.
wav

Mixed-lingual words containing foreign morphemes

Syntax Tree Audio File
Das Musicalprogramm New York's wurde en passant upgedatet.
wav
Lifestyle outet sich oft als niveaulose Eventkultur.
wav


Identification of Syntactic Words

Mixed-lingual word forms containing contracted words

Syntax Tree Audio File
C'est l'Adagio d'Hammerklavier.
wav


Disambiguation of Homographs

Monolingual homographs

Syntax Tree Audio File
Modern gebaute Keller sollten nicht modern.
wav
They record the next record.
wav

Interlingual homographs

Syntax Tree Audio File
Die Greatest Nation hat die Grande Nation als tonangebende Nation abgelöst.
wav
Human Resources werden oft nicht sehr human gemanagt.
wav
Mit seiner Band bietet er musikalische Highlights am laufenden Band.
wav

Homographs of multi-word lexemes and word groups

Syntax Tree Audio File
He's in fine conditions in fine.
wav

Homographs of contracted words

Syntax Tree Audio File
It's in St. Peter's St.
wav

Interlingual homographs of contracted words

Syntax Tree Audio File
Er hat's mit Red Hat's File System probiert.
wav


Identification of Sentence Boundaries

Ambiguous punctuation symbols

Syntax Tree Audio File
He's not in St. Peter's St. Peter's at home.
wav


SVOX: Monolingual text-to-speech synthesis system for German
polySVOX: Polyglot text-to-speech synthesis system

Last updated: Tue May 30 2006 by: Harald Romsdorfer