Statistical Model-driven Similarity Hashing: Bridging Modalities for Efficient Unsupervised Retrieval

#1456 of 3028 papers in AAAI 2025

Top Authors

Mingjin Kuai, Jun Long, Zhan Yang

Abstract

Unsupervised deep cross-modal hash retrieval maps multi-modal features into binary hash codes without labels, and it has attracted interest for its storage efficiency, query speed, and ease of application. However, existing approaches suffer from two main limitations: (1) they give insufficient consideration to the similarity between text instances and fuse multi-modal similarity information either independently or redundantly; (2) they ignore the noise in adjacent correlations between multi-modal instances, which leaves the generated hash codes with limited discriminative power. To address these challenges, we propose Statistical Model-driven Similarity Hashing (SMSH). Specifically, we use Jaccard similarity to construct the text similarity matrix, which reduces the similarity error between text instances and better accounts for the asymmetry of the elements in the text features. We then integrate the original similarity information from the different modalities into a unified similarity matrix, bridging the gaps between modalities while reducing the redundant information in them. In addition, we introduce a Statistical Model-driven Similarity Enhancement (SMSE) approach, which uses a Gaussian Mixture Model to reduce noise in the similarity relations between multi-modal instances and to keep instances with lower semantic similarity as far apart as possible. Experiments on three benchmark datasets demonstrate the strong performance of SMSH.
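As a rough illustration of the similarity construction described in the abstract, the sketch below builds a Jaccard similarity matrix from binary text features and combines it with an image similarity matrix. It is not the authors' implementation: the function names, the use of cosine similarity for image features, the weighted-sum fusion, and the weight `alpha` are all assumptions made for the example, and the Gaussian Mixture Model enhancement step is not shown. Jaccard similarity compares only the tags that at least one instance has, so shared absences do not inflate the score, which is one way to read the "asymmetry" mentioned above.

```python
import numpy as np


def jaccard_similarity_matrix(tags: np.ndarray) -> np.ndarray:
    """Pairwise Jaccard similarity between rows of a binary (n, d) tag matrix."""
    tags = (tags > 0).astype(np.float64)
    intersection = tags @ tags.T  # number of tags shared by each pair
    counts = tags.sum(axis=1)     # number of tags per instance
    union = counts[:, None] + counts[None, :] - intersection
    # Two empty tag sets give 0/0; this sketch treats that case as 0 (assumption).
    return np.divide(
        intersection, union, out=np.zeros_like(intersection), where=union > 0
    )


def cosine_similarity_matrix(features: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of an (n, d) feature matrix."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)  # guard against zero vectors
    return unit @ unit.T


def fuse_similarities(
    s_image: np.ndarray, s_text: np.ndarray, alpha: float = 0.5
) -> np.ndarray:
    """Hypothetical fusion rule: a convex combination of the two matrices."""
    return alpha * s_image + (1.0 - alpha) * s_text


# Toy data: three instances, four possible text tags, 2-D image features.
text_tags = np.array([[1, 1, 0, 0],
                      [1, 0, 1, 0],
                      [0, 0, 0, 1]])
image_feats = np.array([[1.0, 0.0],
                        [1.0, 1.0],
                        [0.0, 1.0]])

s_text = jaccard_similarity_matrix(text_tags)
s_image = cosine_similarity_matrix(image_feats)
s_unified = fuse_similarities(s_image, s_text, alpha=0.5)
```

In this toy example the first two instances share one tag out of three distinct tags, so their Jaccard similarity is 1/3, while the third instance shares no tags with either and scores 0 against both.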
