Probabilistic Combination of Text Classifiers Using Reliability Indicators: Models and Results

Paul N. Bennett, Susan T. Dumais, Eric Horvitz

Abstract:

The intuition that different text classifiers behave in qualitatively different ways has long motivated attempts to build a better metaclassifier via some combination of classifiers. We introduce a probabilistic method for combining classifiers that considers the context-sensitive reliabilities of contributing classifiers. The method harnesses reliability indicators, variables that provide a valuable signal about the performance of classifiers in different situations. We provide background, present procedures for building metaclassifiers that take into consideration both reliability indicators and classifier outputs, and review a set of comparative studies undertaken to evaluate the methodology.

Keywords: Text classification, classifier combination, metaclassifiers, reliability indicators.

In: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Tampere, Finland, August 2002. ACM Press.

Author Email: pbennett+www@cs.cmu.edu, sdumais@microsoft.com, horvitz@microsoft.com