Continuous Listening for Unconstrained Spoken Dialog

Tim Paek
Microsoft Research
Redmond, Washington 98052

Eric Horvitz
Microsoft Research
Redmond, Washington 98052

Eric Ringger
Microsoft Research
Redmond, Washington 98052



A major obstacle to building spoken dialog systems that listen continuously, without requiring a push-to-talk device, is distinguishing speech that is intended for the system from speech that is merely overheard. We present a decision-theoretic approach to this problem that exploits Bayesian models of spoken dialog at four levels of analysis within a domain-independent, multi-modal computational architecture called Quartet. We applied Quartet to the task of navigating PowerPoint slide shows during a spoken presentation in a prototype system called Presenter. We describe the runtime behavior of Presenter as well as the results of an experimental study comparing the performance of Presenter with that of human subjects in discriminating arbitrarily formed spoken requests for slide navigation during a recorded lecture.
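As a rough illustration of what a decision-theoretic treatment of this problem can look like, the sketch below picks between acting on an utterance and ignoring it by maximizing expected utility under a belief that the utterance was intended for the system. It is a minimal sketch only: the action names, the utility values, and the single-probability belief are hypothetical assumptions for illustration and are not taken from Quartet or Presenter.

```python
# Hypothetical sketch: the actions, states, and utility values below are
# illustrative assumptions, not parameters from the paper.

# UTILITY[action][state]: payoff of taking an action when the utterance
# was in fact "intended" for the system or merely "overheard".
UTILITY = {
    "execute": {"intended": 1.0, "overheard": -2.0},  # false alarms are costly
    "ignore": {"intended": -1.0, "overheard": 0.0},   # misses are also costly
}


def expected_utilities(p_intended):
    """Return the expected utility of each action given P(intended | evidence)."""
    belief = {"intended": p_intended, "overheard": 1.0 - p_intended}
    return {
        action: sum(belief[state] * payoff[state] for state in belief)
        for action, payoff in UTILITY.items()
    }


def best_action(p_intended):
    """Choose the action with the highest expected utility."""
    eu = expected_utilities(p_intended)
    return max(eu, key=eu.get)


if __name__ == "__main__":
    for p in (0.9, 0.3):
        print(p, best_action(p), expected_utilities(p))
```

With these assumed utilities, the system acts on a request only when its belief that the speech was directed at it is high enough to outweigh the cost of a false alarm; changing the utility table shifts that threshold.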


Keywords: Bayesian user modeling, common ground, joint activity, conversational systems, dialog systems, computational linguistics.

In: Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, November 2000.

Related Papers

E. Horvitz and T. Paek, A Computational Architecture for Conversation, Proceedings of the Seventh International Conference on User Modeling, Banff, Canada, June 1999. New York: Springer Wien, pp. 201-210.

E. Horvitz, Uncertainty, Action, and Interaction: In Pursuit of Mixed-Initiative Computing, Intelligent Systems, September/October Issue, IEEE Computer Society.

T. Paek and E. Horvitz, Conversation as Action Under Uncertainty, Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI-2000), Stanford, CA, June 2000.

T. Paek and E. Horvitz, Uncertainty, Utility, and Misunderstanding: A Decision-Theoretic Perspective on Grounding in Conversational Systems, AAAI Fall Symposium on Psychological Models of Communication in Collaborative Systems, Cape Cod, MA, November 5-7, 1999.

E. Horvitz and T. Paek, DeepListener: Harnessing Expected Utility to Guide Clarification Dialog in Spoken Language Systems, 6th International Conference on Spoken Language Processing (ICSLP 2000), Beijing, November 2000.