Discovering and Learning Novel Visual Categories

We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes. This setting is similar to semi-supervised learning, but significantly harder because there are no labelled examples for the new classes. The challenge, then, is to leverage the information contained in the labelled images in order to learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data. In this work we address this problem by combining three ideas: (1) we suggest that the common approach of bootstrapping an image representation using only the labelled data introduces an unwanted bias, and that this can be avoided by using self-supervised learning to train the representation from scratch on the union of labelled and unlabelled data; (2) we use rank statistics to transfer the model’s knowledge of the labelled classes to the problem of clustering the unlabelled images; and (3) we train the data representation by optimizing a joint objective function on the labelled and unlabelled subsets of the data, improving both the supervised classification of the labelled data and the clustering of the unlabelled data. We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
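As a rough illustration of idea (2), the sketch below shows one way rank statistics could turn feature vectors into pairwise pseudo-labels: two images are treated as belonging to the same class when the sets of their top-k ranked feature dimensions coincide. The function names, the value of `k`, and the choice to rank by raw activation value are assumptions made for this example, not details taken from the text above.

```python
import numpy as np

def topk_rank_sets(features, k):
    """Return, for each row, the sorted indices of its k largest entries."""
    # argsort on the negated array ranks dimensions from largest to smallest
    idx = np.argsort(-features, axis=1)[:, :k]
    # sort the kept indices so that two rows compare equal as sets
    return np.sort(idx, axis=1)

def pairwise_pseudo_labels(features, k=5):
    """Label a pair 1 if both rows share the same top-k dimensions, else 0.

    Hypothetical helper: `k` and the ranking rule are illustrative choices.
    """
    top = topk_rank_sets(features, k)
    # broadcast to compare every row against every other row
    same = (top[:, None, :] == top[None, :, :]).all(axis=2)
    return same.astype(np.int64)

# Toy features for three unlabelled images (4 dimensions each)
feats = np.array([
    [0.90, 0.10, 0.80, 0.00],
    [0.70, 0.20, 0.95, 0.10],
    [0.00, 0.90, 0.10, 0.80],
])
labels = pairwise_pseudo_labels(feats, k=2)
print(labels)
```

In this toy input the first two images share their two strongest dimensions and are paired together, while the third is paired only with itself. Pairwise labels of this kind could then supervise a binary "same class or not" objective on the unlabelled subset.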

Publications

[1] Kai Han*, Sylvestre-Alvise Rebuffi*, Sebastien Ehrhardt*, Andrea Vedaldi, Andrew Zisserman
Automatically Discovering and Learning New Visual Categories with Ranking Statistics
International Conference on Learning Representations (ICLR), 2020. (* indicates equal contribution.) [project page] [code]

[2] Kai Han, Andrea Vedaldi, Andrew Zisserman
Learning to Discover Novel Visual Categories via Deep Transfer Clustering
International Conference on Computer Vision (ICCV), 2019. [project page] [code]

Learning Dense Visual Correspondences

In this project, we tackle the task of establishing dense visual correspondences between images containing objects of the same category. This is a challenging task due to large intra-class variations and a lack of dense pixel-level annotations. To handle this challenge, we propose a convolutional neural network architecture, called adaptive neighbourhood consensus network (ANC-Net), that can be trained end-to-end with sparse key-point annotations. At the core of ANC-Net is our proposed non-isotropic 4D convolution kernel, which forms the building block for the adaptive neighbourhood consensus module for robust matching. We also introduce a simple and efficient multi-scale self-similarity module in ANC-Net to make the learned features robust to intra-class variations. Furthermore, we propose a novel orthogonal loss that can enforce the one-to-one matching constraint. We thoroughly evaluate the effectiveness of our method on various benchmarks, where it substantially outperforms state-of-the-art methods.
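To give some intuition for how an orthogonality penalty can encourage one-to-one matching, here is a minimal generic sketch. It assumes a soft matching matrix whose rows are distributions over target locations, and penalises the deviation of its Gram matrix from the identity. The function name and this exact formulation are assumptions for illustration; the loss used in ANC-Net may be defined differently.

```python
import numpy as np

def orthogonal_loss(match):
    """Squared Frobenius distance between match @ match.T and the identity.

    `match` is an (n, m) matrix; row i holds the matching weights of source
    location i over m target locations. Hypothetical formulation, shown only
    to illustrate the one-to-one idea.
    """
    gram = match @ match.T
    identity = np.eye(match.shape[0])
    return float(np.sum((gram - identity) ** 2))

# Each source location matches a distinct target: rows are orthonormal
one_to_one = np.eye(3)

# Source locations 0 and 1 both match target 0: rows 0 and 1 overlap
many_to_one = np.array([
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
])

print(orthogonal_loss(one_to_one))
print(orthogonal_loss(many_to_one))
```

The penalty vanishes for the one-to-one assignment and is positive when two source locations claim the same target, because their rows are no longer orthogonal.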

Publications

[1] Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce
SCNet: Learning Semantic Correspondence
International Conference on Computer Vision (ICCV), 2017. [project page] [code]

[2] Shuda Li*, Kai Han*, Theo W. Costain, Henry Howard-Jenkins, Victor Prisacariu
Correspondence Networks with Adaptive Neighbourhood Consensus
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020. (* indicates equal contribution.) [project page] [code]

[3] Xinghui Li, Kai Han, Shuda Li, Victor Prisacariu
Dual-Resolution Correspondence Networks
Conference on Neural Information Processing Systems (NeurIPS), 2020. [project page] [code]