Paper Title
Minimizing FLOPs to Learn Efficient Sparse Representations
Paper Authors
Paper Abstract
Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification. Retrieval of such representations from a large database is, however, computationally challenging. Approximate methods based on learning compact representations have been widely explored for this problem, such as locality-sensitive hashing, product quantization, and PCA. In this work, in contrast to learning compact representations, we propose to learn high-dimensional and sparse representations that have similar representational capacity as dense embeddings while being more efficient, since sparse matrix multiplication operations can be much faster than dense multiplication. Following the key insight that the number of operations decreases quadratically with the sparsity of embeddings provided the non-zero entries are distributed uniformly across dimensions, we propose a novel approach to learn such distributed sparse embeddings via the use of a carefully constructed regularization function that directly minimizes a continuous relaxation of the number of floating-point operations (FLOPs) incurred during retrieval. Our experiments show that our approach is competitive with the other baselines and yields a similar or better speed-vs-accuracy tradeoff on practical datasets.
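To make the key insight concrete, the sketch below is a rough illustration of what a continuous FLOPs surrogate could look like, not the authors' reference implementation: the function names (`flops_surrogate`, `training_loss`), the regularization weight `lam`, and the specific choice of using the batch mean of absolute activations as a relaxed "probability that a dimension is active" are assumptions made for illustration. The idea it demonstrates comes from the abstract: the expected number of multiplications in a sparse dot product scales with the sum over dimensions of the squared activation probabilities, so for a fixed number of non-zeros the cost is lowest when activations are spread uniformly across dimensions.

```python
import numpy as np

def flops_surrogate(embeddings: np.ndarray) -> float:
    """Continuous surrogate for expected retrieval FLOPs (illustrative).

    embeddings: (batch, dim) array of activations (ideally sparse).
    For each dimension j, the mean absolute activation over the batch acts as a
    differentiable stand-in for the probability that dimension j is non-zero.
    The expected dot-product cost between two independent samples is then
    proportional to the sum of these squared per-dimension means, which (for a
    fixed activation budget) is minimized when activations are uniform across dimensions.
    """
    mean_abs = np.abs(embeddings).mean(axis=0)  # shape: (dim,)
    return float(np.sum(mean_abs ** 2))

def training_loss(task_loss: float, embeddings: np.ndarray, lam: float = 1e-3) -> float:
    """Hypothetical total objective: task loss plus the weighted FLOPs surrogate."""
    return task_loss + lam * flops_surrogate(embeddings)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dense = rng.random((128, 512))                                 # all dimensions active
    skewed = dense * (np.arange(512) < 16)                         # same batch, only 16 "hot" dimensions
    uniform = dense * (rng.random((128, 512)) < 16 / 512)          # same expected sparsity, spread out
    for name, e in [("dense", dense), ("skewed", skewed), ("uniform", uniform)]:
        print(f"{name:8s} surrogate FLOPs = {flops_surrogate(e):.4f}")
```

Running this prints a much smaller surrogate value for the `uniform` embeddings than for the `skewed` ones, even though both contain the same expected number of non-zeros, matching the claim that uniformly distributed sparsity yields the quadratic reduction in retrieval operations.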