
Details

半配对的多模态询问哈希方法 (EI indexed)

Semi-paired Multi-modal Query Hashing Method

Document type: Journal article

Chinese title: 半配对的多模态询问哈希方法

English title: Semi-paired Multi-modal Query Hashing Method

Authors: Yu Jun (庾骏)[1], Ma Jiangtao (马江涛)[1], Xian Yang (咸阳)[1], Hou Ruixia (侯瑞霞)[2], Sun Wei (孙伟)[3]

First author: Yu Jun (庾骏)

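Affiliations: [1] School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450000; [2] Research Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091; [3] Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081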

Year: 2024

Volume: 46

Issue: 2

Pages: 481-491

Chinese journal name: 电子与信息学报

Foreign-language journal name: Journal of Electronics & Information Technology

Indexed in: CSTPCD; EI (accession number: 20241115751161); Scopus; PKU Core Journals (2023 edition); CSCD (2023-2024)

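Funding: National Natural Science Foundation of China (32271880); Henan Province Science and Technology Research Project (222102210064); Natural Science Foundation of Henan Province (232300420150).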

Language: Chinese

Chinese keywords (translated): Multimodal information retrieval; Hashing; Semi-paired data; Cross-modal reconstruction; Binary coding

Foreign-language keywords: Multimodal retrieval; Hashing; Semi-paired data; Cross-modal reconstruction; Binary codes

Classification number: TN911.7; TP391

Abstract: Multimodal hashing converts heterogeneous multimodal data into unified binary codes. Because of its low storage cost and fast Hamming distance ranking, it has attracted wide attention in large-scale multimedia retrieval. Existing multimodal hashing methods assume that every query has complete information in all modalities from which to generate its joint hash code. In practice, however, fully complete multimodal information is hard to obtain. For semi-paired query scenarios in which some modal information is missing, this paper proposes a novel Semi-paired Query Hashing (SPQH) method to solve the joint encoding problem for semi-paired query samples. First, the method performs projection learning and cross-modal reconstruction learning to preserve semantic consistency across multimodal data. Then, the semantic similarity structure of the label space and the complementary information among modalities are captured to learn a discriminative hash function. In the query encoding stage, the learned cross-modal reconstruction matrix completes the missing modal features of unpaired samples, and the learned joint hash function then generates the hash features. Compared with state-of-the-art baseline methods, the average retrieval accuracy on the Pascal Sentence, NUS-WIDE, and IAPR TC-12 datasets improves by 2.48%. Experimental results show that the algorithm effectively encodes semi-paired multimodal query data and achieves superior retrieval performance.
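The query encoding stage described in the abstract (complete the missing modality, hash the joint features, rank by Hamming distance) can be illustrated with a small NumPy sketch. This is not the SPQH objective from the paper: the reconstruction matrices here are fit by plain ridge regression, the hash projection `W` is random, and the data is synthetic. All function and variable names are hypothetical.

```python
import numpy as np

def fit_reconstruction(src, dst, reg=1e-2):
    """Ridge-regression map R with src @ R ~= dst (assumed stand-in for the
    paper's learned cross-modal reconstruction matrix)."""
    d = src.shape[1]
    return np.linalg.solve(src.T @ src + reg * np.eye(d), src.T @ dst)

def encode_query(img, txt, R_img2txt, R_txt2img, W):
    """Fill in a missing modality (passed as None), then binarize a linear projection."""
    if img is None and txt is None:
        raise ValueError("at least one modality is required")
    if txt is None:
        txt = img @ R_img2txt   # reconstruct text features from the image
    if img is None:
        img = txt @ R_txt2img   # reconstruct image features from the text
    joint = np.concatenate([img, txt])
    return (joint @ W >= 0).astype(np.uint8)

def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance, plus the distances."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable"), dists

# Synthetic paired data: both modalities are linear views of a shared latent factor.
rng = np.random.default_rng(0)
latent = rng.standard_normal((200, 8))
img_db = latent @ rng.standard_normal((8, 16))
txt_db = latent @ rng.standard_normal((8, 10))

R_img2txt = fit_reconstruction(img_db, txt_db)
R_txt2img = fit_reconstruction(txt_db, img_db)
W = rng.standard_normal((16 + 10, 32))  # placeholder for a learned hash function

db_codes = np.stack([
    encode_query(i, t, R_img2txt, R_txt2img, W) for i, t in zip(img_db, txt_db)
])

# A fully paired query and an image-only (semi-paired) query for the same item.
code_full = encode_query(img_db[0], txt_db[0], R_img2txt, R_txt2img, W)
code_partial = encode_query(img_db[0], None, R_img2txt, R_txt2img, W)

order_full, dists_full = hamming_rank(code_full, db_codes)
order_partial, dists_partial = hamming_rank(code_partial, db_codes)
```

The point of the sketch is the control flow: an image-only query goes through the same joint hash function as a fully paired one, because the reconstruction step supplies the missing text features first. How close `code_partial` lands to `code_full` depends on how well the reconstruction matrix fits, which in the paper is learned jointly with the other objectives.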

