Publications
- KP12
-
T. Kaufmann and B. Pfister.
Syntactic language modeling with formal grammars.
Speech Communication (Elsevier), 2012.
(to appear).
- NP12
-
T. Naghibi and B. Pfister.
An approach to prevent adaptive beamformers from cancelling the
desired signal.
In Proceedings of ICASSP. IEEE, 2012.
- EP11
-
T. Ewender and B. Pfister.
Automatically creating a diphone set from a speech database.
In Proceedings of Interspeech, pages 2169-2172, Florence
(Italy), August 2011.
PDF (159KB)
- Ger11
-
M. Gerber.
Speech Recognition Techniques for Languages with Limited
Linguistic Resources.
PhD thesis, No. 19507, Computer Engineering and Networks Laboratory,
ETH Zurich, 2011.
PDF (1264KB)
- GKP11
-
M. Gerber, T. Kaufmann, and B. Pfister.
Extended Viterbi algorithm for optimized word HMMs.
In Proceedings of ICASSP, pages 4932-4935, Prague (Czech
Republic), May 2011.
PDF (220KB)
- Nag11
-
T. Naghibi.
VSHMI Experimentation System.
Annual Report of the SNSF project no. 200021 130224/1. TIK, ETH
Zurich, March 2011.
PDF (267KB)
- EP10
-
T. Ewender and B. Pfister.
Accurate pitch marking for prosodic modification of speech segments.
In Proceedings of Interspeech, pages 178-181, Makuhari
(Japan), September 2010.
PDF (291KB)
- HP10
-
S. Hoffmann and B. Pfister.
Fully automatic segmentation for prosodic speech corpora.
In Proceedings of Interspeech, pages 1389-1392, Makuhari
(Japan), September 2010.
PDF (204KB)
- KP10
-
T. Kaufmann and B. Pfister.
Semi-automatic extension of morphological lexica.
In Workshop Computational Linguistics - Applications, Wisla
(Poland), 2010.
PDF (117KB)
- PN10
-
B. Pfister and T. Naghibi.
Concept of the VSHMI Experimentation System.
Report of the SNSF project no. 200021 130224/1. TIK, ETH Zurich, June
2010.
PDF
- EHP09
-
T. Ewender, S. Hoffmann, and B. Pfister.
Nearly perfect detection of continuous F0 contour and frame
classification for TTS synthesis.
In Proceedings of Interspeech, pages 100-103, Brighton,
September 2009.
PDF (771KB)
- Kau09
-
T. Kaufmann.
A Rule-based Language Model for Speech Recognition.
PhD thesis, No. 18700, Computer Engineering and Networks Laboratory,
ETH Zurich, 2009.
PDF (897KB)
- KEP09
-
T. Kaufmann, T. Ewender, and B. Pfister.
Improving broadcast news transcription with a precision grammar and
discriminative reranking.
In Proceedings of Interspeech, pages 356-359, Brighton,
September 2009.
PDF (264KB)
- Rom09a
-
H. Romsdorfer.
Polyglot speech prosody control.
In Proceedings of Interspeech, pages 488-491, Brighton (United
Kingdom), September 2009.
PDF (482KB)
- Rom09b
-
H. Romsdorfer.
Polyglot Text-to-Speech Synthesis: Text Analysis & Prosody
Control.
PhD thesis, No. 18210, ETH Zurich. Shaker Verlag Aachen (ISBN
978-3-8322-8090-1), February 2009.
PDF (1756KB)
- Rom09c
-
H. Romsdorfer.
Weighted neural network ensemble models for speech prosody control.
In Proceedings of Interspeech, pages 492-495, Brighton (United
Kingdom), September 2009.
PDF (606KB)
- GP08
-
M. Gerber and B. Pfister.
Fast search for common segments in speech signals for speaker
verification.
In Proceedings of Interspeech, pages 375-378, Brisbane
(Australia), September 2008.
PDF (204KB)
- KP08
-
T. Kaufmann and B. Pfister.
Applying a grammar-based language model to a simplified
broadcast-news transcription task.
In Proceedings of ACL, pages 106-113, Columbus (Ohio), June
2008.
PDF (464KB)
- PK08
-
B. Pfister and T. Kaufmann.
Sprachverarbeitung: Grundlagen und Methoden der Sprachsynthese
und Spracherkennung.
Springer Verlag (ISBN: 978-3-540-75909-6), 2008.
- Beu07
-
R. Beutler.
Improving Speech Recognition through Linguistic Knowledge.
PhD thesis, No. 17039, Computer Engineering and Networks Laboratory,
ETH Zurich, January 2007.
PDF (2135KB)
- GBP07
-
M. Gerber, R. Beutler, and B. Pfister.
Quasi text-independent speaker verification based on pattern
matching.
In Proceedings of Interspeech, pages 1993-1996, Antwerp,
August 2007.
PDF (658KB)
- GKP07
-
M. Gerber, T. Kaufmann, and B. Pfister.
Perceptron-based class verification.
In Proceedings of NOLISP (ISCA Workshop on non linear speech
processing), Paris, May 2007.
PDF (170KB)
- KP07
-
T. Kaufmann and B. Pfister.
Applying licenser rules to a grammar with continuous constituents.
In Stefan Müller, editor, Proceedings of the 14th
International Conference on Head-Driven Phrase Structure Grammar, pages
150-162, Stanford, 2007. CSLI Publications.
PDF (73KB)
- RP07
-
H. Romsdorfer and B. Pfister.
Text analysis and language identification for polyglot text-to-speech
synthesis.
Speech Communication (Elsevier), 49(9):697-724, September
2007.
PDF (563KB)
- RP06
-
H. Romsdorfer and B. Pfister.
Character stream parsing of mixed-lingual text.
In ISCA Tutorial and Research Workshop on Multilingual Speech
and Language Processing (MultiLing 2006), Stellenbosch (South Africa), April
2006.
PDF (122KB)
- BKP05a
-
R. Beutler, T. Kaufmann, and B. Pfister.
Integrating a non-probabilistic grammar into large vocabulary
continuous speech recognition.
In Proceedings of the IEEE ASRU 2005 Workshop, pages 104-109,
San Juan (Puerto Rico), November 2005.
PDF (155KB)
- BKP05b
-
R. Beutler, T. Kaufmann, and B. Pfister.
Using rule-based knowledge to improve LVCSR.
In Proceedings of ICASSP, pages 829-832, Philadelphia (USA),
March 2005.
PDF (209KB)
- GP05
-
M. Gerber and B. Pfister.
Quasi text-independent speaker verification with neural networks.
MLMI'05 Workshop, Edinburgh (United Kingdom), July 2005.
PDF (337KB)
- Kau05
-
T. Kaufmann.
Evaluation von Grammatikformalismen in Hinblick auf die
Anwendung in der Spracherkennung.
Zwischenbericht zum Nationalfonds-Projekt 105211-104078/1: Rule-Based
Language Model for Speech Recognition. Institut TIK, ETH Zürich, September
2005.
- RP05
-
H. Romsdorfer and B. Pfister.
Phonetic labeling and segmentation of mixed-lingual prosody
databases.
In Proceedings of Interspeech, pages 3281-3284, Lisbon
(Portugal), September 2005.
PDF (224KB)
- RPB05
-
H. Romsdorfer, B. Pfister, and R. Beutler.
A mixed-lingual phonological component which drives the statistical
prosody control of a polyglot TTS synthesis system.
In S. Bengio and H. Bourlard, editors, Machine Learning for
Multimodal Interaction, pages 263-276. Springer-Verlag Heidelberg, January
2005.
PDF (237KB)
- Beu04
-
R. Beutler.
Open vocabulary CSR by linguistic knowledge.
COST 278 workshop, Mons (Belgium), January 2004.
- NP04
-
U. Niesen and B. Pfister.
Speaker verification by means of ANNs.
In Proceedings of ESANN, Bruges (Belgium), pages 145-150,
April 2004.
PDF (63KB)
- RP04
-
H. Romsdorfer and B. Pfister.
Multi-context rules for phonological processing in polyglot TTS
synthesis.
In Proceedings of Interspeech - ICSLP, pages 737-740, Jeju
Island (Korea), October 2004.
PDF (115KB)
- Beu03
-
R. Beutler.
Improve continuous speech recognition thru linguistic knowledge.
COST 278 workshop, Barcelona, February 2003.
- BP03
-
R. Beutler and B. Pfister.
Integrating statistical and rule-based knowledge for continuous
German speech recognition.
In Proceedings of Eurospeech, pages 937-940, Geneva, September
2003.
PDF (174KB)
- Gla03
-
U. Glavitsch.
Speaker Normalization with Respect to F0: a Perceptual Approach.
IM2.SP Project Report. TIK/ETH Zurich, December 2003.
PDF (206KB)
- PB03
-
B. Pfister and R. Beutler.
Estimating the weight of evidence in forensic speaker verification.
In Proceedings of Eurospeech, pages 701-704, Geneva, September
2003.
PDF (88KB)
- PR03
-
B. Pfister and H. Romsdorfer.
Mixed-lingual text analysis for polyglot TTS synthesis.
In Proceedings of Eurospeech, pages 2037-2040, Geneva,
September 2003.
PDF (52KB)
- Beu02
-
R. Beutler.
Recognition of continuously spoken German language using linguistic
knowledge.
COST 278 workshop, Eindhoven, August 2002.
- Leh02
-
G. Lehtinen.
Sprecheradaptation und Out-of-Vocabulary-Modell.
Bericht zum Projekt: Einsatz von Spracherkennung in der SAPH.
Institut TIK, ETH Zürich, April 2002.
- PW02
-
B. Pfister, E. Wehrli et al.
Lexical and Syntactic Analysis of Mixed-Lingual Sentences for
Text-to-Speech.
Final Report of SNSF Project No 21-59396.99. Institut TIK, ETH
Zürich, November 2002.
- Pfi01
-
B. Pfister.
Personenidentifizierung anhand der Stimme.
Kriminalistik, 55. Jahrgang, Heft 4, S. 287-292
(Fachzeitschrift des Hüthig Verlags, Heidelberg), April 2001.
PDF (338KB)
- PL01
-
B. Pfister and G. Lehtinen.
Schlussbericht für das Projekt COST249: Erkennung
kontinuierlicher Sprache über das Telefon.
Institut TIK, ETH Zürich, Januar 2001.
PostScript (210KB)
- TJ01
-
C. Traber and V. Jantzen.
The SVOX TTS System.
COST258 workshop, Prague, May 2001.
- Jan00
-
V. Jantzen.
Neural network-based pitch control for various sentence types.
COST258 workshop, Stockholm, April 2000.
- JWLL00
-
F.T. Johansen, N. Warakagoda, B. Lindberg, G. Lehtinen, et
al.
The COST249 SpeechDat multilingual reference recogniser.
In Proceedings of LREC'2000 (Conference on Language, Resources
and Evaluation), Athens (Greece), June 2000.
PostScript (119KB)
- LJWL00
-
B. Lindberg, F.T. Johansen, N. Warakagoda, G. Lehtinen, et
al.
A noise robust multilingual reference recogniser based on
SpeechDat(II).
In Proceedings of ICSLP, Beijing (China), October 2000.
PostScript (60KB)
- LS00
-
G. Lehtinen, S. Safra, et al.
IDAS: Interactive Directory Assistance Services.
In Proceedings of the COST249 ISCA Workshop on Voice Operated
Telecom Services, pages 51-54, Gent (Belgium), May 2000.
PostScript (128KB)
- Tra00a
-
C. Traber.
Das Sprachsynthesesystem SVOX.
11. Konferenz Elektronische Sprachsignalverarbeitung (ESSV 2000),
Cottbus, September 2000.
- Tra00b
-
C. Traber.
Spectral smoothing of diphone boundary mismatches.
COST258 workshop, Stockholm, April 2000.
- TH99
-
C. Traber, K. Huber, et al.
From multilingual to polyglot speech synthesis.
In Proceedings of Eurospeech, pages 835-838, Budapest,
September 1999.
PDF
- HPT98
-
K. Huber, B. Pfister, and Ch. Traber.
POSSY: Ein Projekt zur Realisierung einer polyglotten
Sprachsynthese.
In DAGA-Tagungsband, S. 392-393, 1998.
PostScript (33KB)
- Hub98a
-
K. Huber.
Swiss German Polyphone - Schlussbericht.
TIK-Report Nr.48. Institut TIK, ETH Zürich, Juni 1998.
- Hub98b
-
K. Huber.
Zusammenstellung der Trägerwörter für Deutsch und
Italienisch.
Bericht Nr.1 zum Projekt TTS'97. Institut TIK, ETH Zürich, Juni
1998.
- Leh98
-
G. Lehtinen.
Einsatz des konfigurierbaren Worterkenners WOROV.
Bericht Nr.2 zum Projekt: Reverse Directory Service. Institut TIK,
ETH Zürich, Januar 1998.
- LS98a
-
G. Lehtinen and S. Safra.
Generation and selection of pronunciation variants for a flexible
word recognizer.
In Proceedings of the ESCA Workshop: Modeling Pronunciation
Variation for ASR, pages 67-71, Rolduc (The Netherlands), May 1998.
PostScript (98KB)
- LS98b
-
G. Lehtinen and S. Safra.
Generierung von Aussprachevariantenregeln und Verbesserung von
Subwortmodellen für einen flexiblen Worterkenner.
In DAGA-Tagungsband, S. 400-401, March 1998.
PostScript (124KB)
- PH98
-
B. Pfister, K. Huber et al.
Das Sprachsynthesesystem SVOX und seine praktische Anwendbarkeit.
In DAGA-Tagungsband, S. 338-339, 1998.
PostScript (107KB)
- Rie98
-
M. Riedi.
Controlling Segmental Duration in Speech Synthesis Systems.
PhD thesis, No. 12487, Computer Engineering and Networks Laboratory,
ETH Zurich (TIK-Schriftenreihe Nr. 26, ISBN 3-906469-05-0), February
1998.
PostScript (3168KB)
- Saf98
-
S. Safra.
A Parsing Strategy in ARCOS-G.
Talk at the COST249 meeting in Porto, Portugal, February 12-13,
1998.
(printed in Final Report of COST249).
PDF (52KB)
- SLH98
-
S. Safra, G. Lehtinen, and K. Huber.
Modeling pronunciation variations and coarticulation with
finite-state transducers in CSR.
In Proceedings of the ESCA Workshop: Modeling Pronunciation
Variation for ASR, pages 125-130, Rolduc (The Netherlands), May 1998.
PostScript (197KB)
- LP97
-
G. Lehtinen, B. Pfister, et al.
Reverse Directory Service.
Projektbericht Nr.1, Institut TIK, ETH Zürich, September 1997.
- Rie97
-
M. Riedi.
Modeling segmental duration with multivariate adaptive regression
splines.
In Proceedings of Eurospeech, pages 2627-2630, Rhodes
(Greece), September 1997.
PostScript (162KB)
- Saf97
-
S. Safra.
Das Experimentalsystem ARCOS: Konzepte, Aufbau,
Methoden.
Zwischenbericht zum Projekt ARCOS-G. Institut für Technische
Informatik und Kommunikationsnetze, ETH Zürich, Juni 1997.
- Tra97
-
C. Traber.
Improvements of the Morpho-Syntactic Analysis of the SVOX
Text-to-Speech System.
Projektbericht, Institut für Technische Informatik und
Kommunikationsnetze, ETH Zürich, Mai 1997.
- Hut96
-
H.-P. Hutter.
Comparison of Classic and Hybrid HMM Approaches to Speech
Recognition over Telephone Lines.
PhD thesis, No. 11662, Computer Engineering and Networks Laboratory,
ETH Zurich (TIK-Schriftenreihe Nr. 15, ISBN 3 7281 2424 9), October
1996.
- Pfi96a
-
B. Pfister.
High-quality prosodic modification of speech signals.
In Proceedings of ICSLP, pages 2446-2449, Philadelphia,
October 1996.
audio examples
PDF (822KB)
- Pfi96b
-
B. Pfister.
Prosodische Modifikation von Sprachsegmenten für die
konkatenative Sprachsynthese.
Diss. Nr. 11331, TIK-Schriftenreihe Nr. 11 (ISBN 3 7281 2316 1),
ETH Zürich, März 1996.
PostScript (2987KB)
- Saf96
-
S. Safra.
Chartparsing in Continuous Speech Recognition.
Talk at the COST249 meeting in Kosice, Slovakia, February 29,
1996.
(printed in Final Report of COST249).
PDF (99KB)
- Hut95
-
H.-P. Hutter.
Comparison of a new hybrid connectionist-SCHMM approach with other
hybrid approaches for speech recognition.
In Proceedings of ICASSP. IEEE, 1995.
PDF (427KB)
- LP95
-
G. Lehtinen and B. Pfister.
Portierung des ARA-Systems auf die
SparcStation-Plattform von Sun Microsystems.
Bericht Nr.3 zum Projekt Realisation einer automatischen
Rufnummernauskunft. Institut TIK, ETH Zürich, Oktober 1995.
- Pfi95
-
B. Pfister.
The SVOX Text-to-Speech System.
Laboratory TIK, ETH Zurich, September 1995.
PDF (109KB)
- Rie95
-
M. Riedi.
A neural network-based model of segmental duration for speech
synthesis.
In Proceedings of Eurospeech, pages 599-602, Madrid (Spain),
September 1995.
PDF (365KB)
- Saf95
-
S. Safra.
Handling Pronunciation Variants and Co-articulation with Finite
State Transducers.
Talk at the COST249 meeting in Nancy, France (printed in Final
Report of COST249), March 6/7, 1995.
PDF (20KB)
- Tra95
-
C. Traber.
SVOX: The Implementation of a Text-to-Speech System for German.
PhD thesis, No. 11064, Computer Engineering and Networks Laboratory,
ETH Zurich, TIK-Schriftenreihe Nr. 7 (ISBN 3 7281 2239 4), March 1995.
PDF (1073KB)
PostScript (2271KB)
- HP94
-
H.-P. Hutter and B. Pfister.
Neuartiger hybrider SKHMM/KNN-Ansatz für die Spracherkennung.
In Studientexte zur Sprachkommunikation, Heft 11, S. 90-97. TU
Berlin, Oktober 1994.
- Hut94
-
H.-P. Hutter.
Recognizer for isolated German digits over telephone lines:
RECO.
In Final Report of COST232, 1994.
- PLC94
-
B. Pfister, G. Lehtinen, and D. Christnach.
ARA-V1: Systembeschreibung und Auswertung eines Testeinsatzes.
Bericht Nr.2 zum Projekt Realisation einer automatischen
Rufnummernauskunft. Institut für Elektronik, ETH Zürich, September 1994.
- PS94
-
B. Pfister and A. Schaub.
Automatische Rufnummern-Auskunft.
Technische Mitteilungen Telecom PTT, Mai 1994.
- Saf94
-
S. Safra.
Experimentalsystem zur Erkennung kontinuierlicher
Sprache.
Erster Bericht zum Projekt ARCOS-G. Institut für Technische
Informatik und Kommunikationsnetze, ETH Zürich, Februar 1994.
- SP94
-
S. Safra and B. Pfister.
ARCOS-G: Ein Experimentalsystem zur Erkennung kontinuierlicher
deutscher Sprache.
In Studientexte zur Sprachkommunikation, Heft 11, S. 174-181.
TU Berlin, Oktober 1994.
PostScript (613KB)
- Tra93
-
C. Traber.
Syntactic processing and prosody control in the SVOX TTS system
for German.
In Proceedings of Eurospeech, pages 2099-2102, September 1993.
- Hub91
-
K. Huber.
Messung und Modellierung der Segmentdauer für die Synthese
deutscher Lautsprache.
Diss. Nr. 9535, Institut für Elektronik, ETH Zürich, Juli 1991.
- Hub90
-
K. Huber.
A statistical model of duration control for speech synthesis.
In Proc. of the EUSIPCO, Barcelona, September 1990.
- Rus90
-
T. Russi.
A Framework for Syntactic and Morphological Analysis and its
Application in a Text-to-Speech System.
PhD thesis, No. 9328, Electronics Laboratory, ETH Zurich, December
1990.
Last updated: Tue Jan 24 09:54:29 CET 2012
by: Beat Pfister