We introduce GaZIR, a gaze-based interface for browsing and searching
for images. The system computes on-line predictions of image relevance
based on implicit feedback, and when the user zooms in, the images
predicted to be the most relevant are brought out. The key
novelty is that the relevance feedback is inferred from implicit cues
obtained in real time from the gaze pattern, using an estimator
learned during a separate training phase. The natural zooming
interface can be connected to any content-based information retrieval
engine operating on user feedback. We show with experiments on one
such engine that the gaze patterns contain a sufficient amount of
information to make the estimated relevance feedback a viable choice
for complementing, or even replacing, explicit point-and-click feedback.
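Conceptually, the relevance estimator is a supervised predictor mapping gaze-derived features to relevance. The sketch below is a minimal illustration only, assuming a logistic-regression estimator and hypothetical features (fixation count and durations); the actual feature set and learning method used in GaZIR are defined in the paper and may differ.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical gaze features per image (illustrative, not the paper's feature set):
    # [number of fixations, total fixation duration (s), mean fixation duration (s)]
    X_train = np.array([
        [5, 1.8, 0.36],
        [1, 0.2, 0.20],
        [7, 2.5, 0.35],
        [0, 0.0, 0.00],
    ])
    # Explicit relevance labels collected during the separate training phase
    y_train = np.array([1, 0, 1, 0])

    # Learn the gaze-to-relevance estimator off-line
    estimator = LogisticRegression()
    estimator.fit(X_train, y_train)

    # At browsing time, convert a new gaze pattern into implicit relevance feedback
    X_new = np.array([[3, 1.1, 0.37]])
    relevance_score = estimator.predict_proba(X_new)[:, 1]

The predicted relevance scores would then be passed to the retrieval engine in place of explicit point-and-click feedback.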
This work was supported in part by the PASCAL2 Network of Excellence
of the European Community.
Copyright ACM, 2009. This is the author's version of the work. It is
posted here by permission of ACM for your personal use. Not for redistribution.
The definitive version will be published in Proceedings of the 11th Conference on Multimodal Interfaces and The Sixth Workshop on Machine Learning for Multimodal Interaction (ICMI-MLMI).