Published at Robotics: Science and Systems (RSS 2014).

Also presented at AAAI 2015.

Sergio Guadarrama, Erik Rodner, Kate Saenko, Ning Zhang,
Ryan Farrell, Jeff Donahue and Trevor Darrell.

We address the problem of retrieving objects based on open-vocabulary natural language queries: given a phrase describing a specific object, e.g., "the corn flakes box", the task is to find the best match in a set of images containing candidate objects. When naming objects, humans tend to use natural language with rich semantics, including basic-level categories, fine-grained categories, and instance-level concepts such as brand names. Existing approaches to large-scale object recognition fail in this scenario, as they expect queries that map directly to a fixed set of pre-trained visual categories, e.g., ImageNet synset tags. We address this limitation by introducing a novel object retrieval method, and we also propose a method for handling open vocabularies, i.e., words not contained in the training data. Our method can combine category- and instance-level semantics in a common representation. Our approach can accurately retrieve objects based on extremely varied open-vocabulary queries.

Presentation slides

A very brief talk about the project was given by Erik at AAAI 2015. The slides can be found here.

Pre-trained models

Pre-trained ImageNet models can be found on the Caffe webpage.

Related open-source projects

  • Caffe - for category recognition with deep convolutional networks
  • GISS - Google image-by-image search (scraper)
  • Google Freebase - for query expansion
  • For the experiments in the paper, we also made use of the iq-engines API for instance-level matching, which now belongs to Yahoo


If you use the software provided on this webpage, please cite the following paper:
  author = {Sergio Guadarrama and Erik Rodner and Kate Saenko and Ning Zhang and Ryan Farrell and Jeff Donahue and Trevor Darrell},
  booktitle = {Robotics Science and Systems (RSS)},
  title = {Open-vocabulary Object Retrieval},
  year = {2014}