Thomas Ludwig Ewender
Speech Processing Group
Computer Engineering and Networks Laboratory (TIK)
Department of Electrical Engineering
Swiss Federal Institute of Technology (ETH) Zurich
My research interests lie in speech synthesis, with a focus on signal processing and speech waveform analysis. More precisely, my recent work has focused on classifying speech segments by their quality and suitability as a basis for synthesis, and on the automatic generation of speech corpora for synthesis. On the one hand, this work involves in-depth analysis of signal properties such as fundamental frequency, pitch marks and voicing. On the other hand, extensive tools are required to build voices fully automatically, with minimal manual intervention. At the beginning of my PhD I worked briefly on natural language processing. Currently I also work on a server application for speech recognition within the framework of the CTI project POSPER: Polyglot Speech Recognizer.
Since May 2007 I have been working as a research assistant in the Speech Processing Group at ETH Zürich, Switzerland. In March 2007 I graduated from the Freie Universität Berlin, shortly after finishing my diploma thesis in the system software group at the Institute of Computer Science. From February 2004 to August 2004 I studied at the Department of Computer Science, University of Auckland, New Zealand. In the academic year that followed, as well as in the academic year 2006/07, I worked as a teaching assistant for the graduate lecture on operating systems.
Thomas Ewender, Sarah Hoffmann and Beat Pfister: Nearly Perfect Detection of Continuous F0 Contour and Frame Classification for TTS Synthesis. Proceedings of Interspeech 2009, Brighton (UK), September 2009 (BibTex). A Matlab implementation of the F0 detection algorithm is available for teaching and research purposes.
Tobias Kaufmann, Thomas Ewender and Beat Pfister: Improving Broadcast News Transcription with a Precision Grammar and Discriminative Reranking. Proceedings of Interspeech 2009, Brighton (UK), September 2009
Thomas Ewender and Beat Pfister: Accurate Pitch Marking for Prosodic Modification of Speech Segments. Proceedings of Interspeech 2010, Makuhari (Japan), September 2010 (BibTex)
Thomas Ewender and Beat Pfister: Automatically Creating a Diphone Set from a Speech Database. Proceedings of Interspeech 2011, Florence (Italy), August 2011 (BibTex)
The Speech Processing Group offers a two-term lecture on speech processing that covers speech recognition and synthesis. I regularly assist with this lecture. In the spring semester I give an undergraduate lab course on speech recognition (PPS Spracherkennung).
Autumn semesters: Lecture Sprachverarbeitung I (Speech Processing I)
Spring semesters: Lecture Sprachverarbeitung II (Speech Processing II)
Spring semester 2012: PPS Spracherkennung (undergraduate lab course on speech recognition)
Term project or master thesis (open): Detection of nasalized, creaky and breathy vowels
Term project or master thesis (open): Pitch Determination by Nonnegative Matrix Factorization
Term project (completed): Implementation and evaluation of HMM synthesis for SVOX. See here for a detailed description of the project.
Master thesis by Cédric Schaller (completed): Monitoring changes in speaking style over long time periods
Term project by Simon Simonet (completed): Detektion und Elimination von Störgeräuschen bei Frikativlauten (Detection and elimination of noise in fricative sounds)
Term project by Daniel Kaufmann (completed): Frame classification of speech using support vector machines