Abstract: The main problem in cross-modality person re-identification is the large modality discrepancy between visible and infrared images, which leads to low recognition accuracy. To address this, a method based on triple embedding expansion and feature aggregation is proposed. First, channel data augmentation is applied to the visible images to generate third-modality images as additional input. Second, a triple embedding expansion module expands the visible, infrared, and third-modality images to generate more embeddings, enlarging the embedding space and thereby further reducing the modality discrepancy. Finally, a cross-modality feature aggregation module aggregates features from different stages, highlighting the important shared features in the images while reducing the influence of irrelevant features on the model. Experimental results show that the method achieves Rank-1 and mAP of 75.10% and 71.11%, respectively, in the all-search mode of the SYSU-MM01 dataset; 92.06% and 84.44% in the visible-to-infrared mode of the RegDB dataset; and 63.77% and 66.38% in the visible-to-infrared mode of the low-illumination LLCM dataset, outperforming current methods of the same kind.
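The abstract does not specify how the channel data augmentation produces the third-modality image. A common variant in this line of work (e.g. channel-augmented joint learning, Ye et al., ICCV 2021) replaces all three colour channels of a visible image with one randomly chosen channel, yielding a grey-like image that sits between the visible and infrared modalities. A minimal NumPy sketch under that assumption (the function name and interface are illustrative, not the authors' implementation):

```python
import numpy as np

def channel_augment(rgb, rng=None):
    """Generate a pseudo third-modality image from a visible (RGB) image
    by replicating one randomly chosen colour channel across all three
    channels, producing a single-channel-like image closer in appearance
    to the infrared modality.

    rgb: H x W x 3 uint8 array.
    """
    rng = np.random.default_rng() if rng is None else rng
    c = rng.integers(0, 3)                 # pick R, G, or B at random
    single = rgb[:, :, c]                  # H x W single-channel view
    return np.stack([single] * 3, axis=2)  # replicate back to 3 channels
```

In training pipelines this kind of transform is typically applied per sample with some probability, so the network sees visible, infrared, and augmented third-modality images within the same batch.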
Basic information:
DOI: 10.16112/j.cnki.53-1223/n.2025.06.231
CLC number: TP391.41
Citation:
[1] LIU S L, XIA Y Y. Cross-modality person re-identification based on triple embedding expansion and feature aggregation[J]. Journal of Kunming University of Science and Technology (Natural Science), 2025, 50(06): 45-56. DOI: 10.16112/j.cnki.53-1223/n.2025.06.231.
Funding:
National Natural Science Foundation of China (61976028); Jiangsu Key Laboratory of Image and Video Understanding for Social Safety (J2021-2)