

ISRL: Improving speech recognition thru linguistics

Recognizing continuous speech requires a language model. Usually, statistical grammars (so-called N-grams) are used, because they can easily be integrated with the statistical models at the acoustic level. However, statistical grammars have significant shortcomings when modeling highly inflectional languages such as German or French. In such cases, rule-based language models that incorporate explicit syntactic knowledge seem more adequate (see project RULAMO).
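To make the term N-gram concrete, here is a minimal sketch of a bigram model (N = 2) with add-one smoothing. It is only an illustration and not the language model used in the project; the toy corpus and the function names train_bigram and bigram_prob are assumptions made for this example.

```python
from collections import Counter

def train_bigram(sentences):
    """Count histories and bigrams over tokenized sentences with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for words in sentences:
        tokens = ["<s>"] + words + ["</s>"]
        unigrams.update(tokens[:-1])                  # counts of each history word
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # counts of adjacent word pairs
    # Words that can be predicted: all corpus words plus the end marker.
    vocab = {w for s in sentences for w in s} | {"</s>"}
    return unigrams, bigrams, vocab

def bigram_prob(prev, word, unigrams, bigrams, vocab):
    """Add-one smoothed estimate of P(word | prev)."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

# Toy corpus (assumption, for illustration only).
corpus = [
    ["der", "hund", "bellt"],
    ["der", "hund", "schläft"],
    ["die", "katze", "schläft"],
]
unigrams, bigrams, vocab = train_bigram(corpus)
p_seen = bigram_prob("der", "hund", unigrams, bigrams, vocab)
p_unseen = bigram_prob("der", "katze", unigrams, bigrams, vocab)
```

In this sketch an unseen word pair simply receives a small non-zero probability. The model only reflects how often a pair occurred in the training data and has no notion of whether the pair is syntactically well-formed.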

The aim of this project is to develop a speech recognition architecture that makes it possible to apply both statistical and explicit linguistic knowledge about the language in one system. Using two knowledge sources is motivated by the expectation that the additional information will improve speech recognition. This looks promising because the two knowledge sources are complementary: the statistical language model provides information about the frequency of word sequences, whereas the rule-based language model tells which word sequences are correct and which are not.

Currently, we use a system architecture that is common in natural language understanding systems: a word lattice serves as the interface between an acoustic recognizer and a natural language processing module. In our approach, a score derived from the syntactic structures found by the parser is used to rescore the word lattice such that grammatical phrases are slightly favored.
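The rescoring idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the word lattice is reduced to a short list of alternative paths with recognizer log scores, the parser is replaced by a hypothetical stand-in function toy_parse_score, and the weight of 0.5 is an arbitrary illustrative value.

```python
def rescore(hypotheses, parse_score, weight=0.5):
    """Add a weighted parser-derived bonus to each recognizer score and re-rank.

    hypotheses: list of (recognizer_log_score, words) pairs, standing in for
    paths through a word lattice (a simplification for this sketch).
    """
    rescored = [(score + weight * parse_score(words), words)
                for score, words in hypotheses]
    # Sort by score only, best first.
    return sorted(rescored, key=lambda item: item[0], reverse=True)

def toy_parse_score(words):
    """Hypothetical stand-in for a parser: 1.0 for accepted phrases, else 0.0."""
    accepted = {("the", "dogs", "bark"), ("the", "dog", "barks")}
    return 1.0 if tuple(words) in accepted else 0.0

# Two competing paths; the recognizer slightly prefers the ungrammatical one.
hypotheses = [
    (-10.0, ["the", "dog", "bark"]),
    (-10.2, ["the", "dogs", "bark"]),
]
best_score, best_words = rescore(hypotheses, toy_parse_score)[0]
```

With these illustrative numbers, the small bonus is enough to move the grammatical path ahead of the acoustically preferred one. Keeping the weight small matches the idea that grammatical phrases are only slightly favored, so a clearly better acoustic match can still win.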

Results of this project have been presented at several workshops (see e.g. [Beu02], [Beu03], [Beu04], [BKP05a]) and conferences (papers [BP03] and [BKP05b]). More detailed information can be found in the PhD thesis [Beu07].

Supported by: Staatssekretariat für Bildung und Forschung and NCCR IM2

In collaboration with: This project was carried out in the collaborative framework of COST Action 278 and was continued in NCCR IM2.

Last updated: Thu Oct 27 14:58:12 CEST 2016 by: Beat Pfister