
Research interests



My research interests are Machine Learning in Natural Language Processing, Automated Reasoning, and Knowledge Representation. More specifically, I mainly work on modeling spoken human-human and human-machine dialogue, and on grammatical inference, a branch of unsupervised machine learning:
  • Dialogue Modeling

    I am working on the classification of dialogue acts (with a focus on those defined in the DIT++ taxonomy), mainly in task-oriented human-human and human-machine spoken dialogue. I am also studying dialogue management strategies and testing my models in a dialogue system that I am developing in parallel.

    To support tagset annotation and evaluation with multiple annotators, I have developed a flexible web-based tool, DitAT, that distributes and collects data over a LAN or the Internet and calculates statistics such as inter-annotator agreement (see: Cohen's Kappa web demo). It has been used successfully in experimental setups and in classroom settings.

  • Grammatical Inference

    Another topic I am working on is the induction of structure in symbolic sequential data, with a focus on structure in natural language and in music. I have been extending, optimizing, and applying the Alignment-Based Learning (ABL) framework, a grammar induction framework that induces structure from symbolic sequences (see: ABL web demo and info). One major extension was the introduction of a new and efficient alignment learning algorithm based on generalized suffix trees (GSTs) (see: GST web demo).

    In 2006 I worked on preparing a first public release of an ABL implementation, combining prior work of Menno van Zaanen with my algorithms and extensions.

    Together with Menno van Zaanen, I recently worked on applying grammar induction to machine translation (MT). We have developed TABL (Translation using ABL), a system that induces syntactic and syntactic-semantic structure in two languages and learns a mapping between these structures, so that MT systems can be learned automatically from multi-lingual corpora.
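
For readers unfamiliar with the inter-annotator agreement statistic mentioned under Dialogue Modeling, the following is a minimal sketch of Cohen's kappa for two annotators. It is not the DitAT implementation; the function name `cohen_kappa` and the example labels are assumptions chosen for illustration and are not meant to reproduce the DIT++ tagset.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Hypothetical helper for illustration; not taken from DitAT.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: for each label, the product of the two annotators'
    # marginal proportions, summed over all labels.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one and the same label for every item, so
        # kappa is formally undefined (0/0); returning 1.0 is a convention
        # chosen for this sketch.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Made-up annotations of four utterances by two annotators.
annotator_a = ["inform", "question", "inform", "answer"]
annotator_b = ["inform", "question", "answer", "answer"]
kappa = cohen_kappa(annotator_a, annotator_b)
```

In this example the annotators agree on three of four items (observed agreement 0.75), chance agreement is 5/16, and kappa is therefore 7/11, roughly 0.64.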

Brief CV



I studied Computer Science (BSc, June 2001) and continued with a Master's in Computational Linguistics and Artificial Intelligence (MA, November 2003).

Currently, I am a Ph.D. student at the Department of Communication and Information Sciences at Tilburg University. My research is on dialogue management in human-machine interaction and is supervised by prof. dr. H. Bunt. From May to July 2006 I was affiliated as a research fellow with the Centre for Language Technology at Macquarie University (Sydney, Australia).

A full CV is available upon request.

More information is available by subjecting my name to some 'simple' data mining techniques.