On the Optimal Bit Complexity of Circulant Binary Embedding
Keywords:Binary Embedding, Circulant Matrix
Binary embedding refers to methods for embedding points in Rd into vertices of a Hamming cube of dimension k, such that the normalized Hamming distance well preserves the pre-defined similarity between vectors in the original space. A common approach to binary embedding is to use random projection with unstructured projection, followed by one-bit quantization to produce binary codes, which has been proven that k = O(ε-2 log n) is required to approximate the angle up to epsilon-distortion, where n is the number of data. Of particular interest in this paper is circulant binary embedding (CBE) with angle preservation, where a random circulant matrix is used for projection. It yields comparable performance while achieving the nearly linear time and space complexities, compared to embedding methods relying on unstructured projection. To support promising empirical results, several non-asymptotic analysis have been introduced to establish conditions on the number of bits to meet epsilon-distortion embedding, where one of state-of-the-art achieves the optimal sample complexity k = O(ε-3 log n) while the distortion rate ε-3 is far from the optimality, compared to k = O(ε-2 log n). In this paper, to support promising empirical results of CBE, we extend the previous theoretical framework to address the optimal condition on the number of bits, achieving that CBE with k = O(ε-2 log n) approximates the angle up to ε-distortion under mild assumptions. We also provide numerical experiments to support our theoretical results.