Thomas Ludwig Ewender

Speech Processing Group
Computer Engineering and Networks Laboratory (TIK), Department of Electrical Engineering
Swiss Federal Institute of Technology (ETH) Zurich

office: ETZ D 97.7
phone: +41 44 63 27 958



Research interests

My research interests involve speech synthesis, with a focus on signal processing and speech waveform analysis. More precisely, my recent work has focused on classifying speech segments by their quality and suitability as a basis for synthesis, and on the automatic generation of speech corpora for synthesis. On the one hand, this work includes in-depth analysis of signal properties such as fundamental frequency, pitch marks and voicing. On the other hand, extensive tools are required to build voices automatically with minimal manual intervention.

At the beginning of my PhD I worked briefly on natural language processing. Currently I also work on a server application for speech recognition within the framework of the CTI project POSPER: Polyglot Speech Recognizer.




(Short) Curriculum vitae

Since May 2007 I have been working as a research assistant in the Speech Processing Group at ETH Zurich, Switzerland.

In March 2007 I graduated from the Freie Universität Berlin, shortly after finishing my diploma thesis in the system software group at the Institute of Computer Science.

From February 2004 to August 2004 I studied at the Department of Computer Science, University of Auckland, New Zealand. In the academic year that followed, as well as in the academic year 2006/07, I worked as a teaching assistant for the graduate lecture on operating systems.




Publications


Thomas Ewender, Sarah Hoffmann and Beat Pfister:
Nearly Perfect Detection of Continuous F0 Contour and Frame Classification for TTS Synthesis
Proceedings of Interspeech 2009, Brighton (UK), September 2009 (BibTex)

A Matlab implementation of the F0 detection algorithm is available for teaching and research purposes.

Tobias Kaufmann, Thomas Ewender and Beat Pfister:
Improving Broadcast News Transcription with a Precision Grammar and Discriminative Reranking
Proceedings of Interspeech 2009, Brighton (UK), September 2009

Thomas Ewender and Beat Pfister:
Accurate Pitch Marking for Prosodic Modification of Speech Segments
Proceedings of Interspeech 2010, Makuhari (Japan), September 2010 (BibTex)

Thomas Ewender and Beat Pfister:
Automatically Creating a Diphone Set from a Speech Database
Proceedings of Interspeech 2011, Florence (Italy), August 2011 (BibTex)



Teaching


The Speech Processing Group offers a two-term lecture on speech processing that covers speech recognition and synthesis. I regularly assist with this lecture. In the spring semester I also give an undergraduate lab course on speech recognition (PPS Spracherkennung).

Autumn semesters
Vorlesung Sprachverarbeitung I

Spring semesters
Vorlesung Sprachverarbeitung II

Spring semester 2012
PPS Spracherkennung (undergraduate lab course on speech recognition)



Master Theses and Term Projects


Term project or master thesis (open)
Detection of nasalized, creaky and breathy vowels

Term project or master thesis (open)
Pitch Determination by Nonnegative Matrix Factorization

Term project (completed)
Implementation and evaluation of HMM synthesis for SVOX
See here for a detailed description of the project.

Master thesis by Cédric Schaller (completed):
Monitoring changes in speaking style over long time periods

Term project by Simon Simonet (completed):
Detektion und Elimination von Störgeräuschen bei Frikativlauten (detection and elimination of interfering noise in fricative sounds)

Term project by Daniel Kaufmann (completed):
Frame classification of speech using support vector machines