Distributed Composite Quantization
Approximate nearest neighbor (ANN) search is a fundamental problem in computer vision, machine learning and information retrieval. Recently, quantization-based methods have drawn a lot of attention due to their superior accuracy and comparable efficiency compared with traditional hashing techniques. However, despite the prosperity of quantization techniques, they are all designed for the centralized setting, i.e., quantization is performed on the data on a single machine. This makes it difficult to scale these techniques to large-scale datasets. Built upon the Composite Quantization, we propose a novel quantization algorithm for data dis- tributed across different nodes of an arbitrary network. The proposed Distributed Composite Quantization (DCQ) decom-poses Composite Quantization into a set of decentralized sub-problems such that each node solves its own sub-problem on its local data, meanwhile is still able to attain consistent quantizers thanks to the consensus constraint. Since there is no exchange of training data across the nodes in the learning process, the communication cost of our method is low. Ex- tensive experiments on ANN search and image retrieval tasks validate that the proposed DCQ significantly improves Composite Quantization in both efficiency and scale, while still maintaining competitive accuracy.