
Details

Research on the optimized pest image instance segmentation method based on the Swin Transformer model    (Cited by: 2)

Document type: Journal article

Chinese title: 融合Swin Transformer的虫害图像实例分割优化方法研究

English title: Research on the optimized pest image instance segmentation method based on the Swin Transformer model

Authors: 高家军[1]; 张旭[1]; 郭颖[1]; 刘昱坤[1]; 郭安琪[1]; 石蒙蒙[1]; 王鹏[1]; 袁莹[1]

First author: 高家军

Affiliation: [1] Research Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, China

Year: 2023

Volume: 47

Issue: 3

Pages: 1-10

Chinese journal title: 南京林业大学学报(自然科学版)

English journal title: Journal of Nanjing Forestry University: Natural Sciences Edition

Indexed in: CSTPCD; Scopus; Peking University Core Journals List (2020 edition); CSCD (2023-2024)

Funding: Fundamental Research Funds for the Central Non-profit Research Institutions (CAFYBB2021ZB002).

Language: Chinese

Chinese keywords: 虫害识别; Swin Transformer; Mask R-CNN; 实例分割; 土沉香; 黄野螟

English keywords: pest recognition; Swin Transformer; Mask R-CNN; instance segmentation; Aquilaria sinensis; Heortia vitessoides

Classification codes: TP39; S433; S763

Abstract: [Objective] To achieve accurate pest monitoring, an optimized image instance segmentation method incorporating the Swin Transformer is proposed, aiming to address the difficulty of recognizing and segmenting images of multiple larval individuals in complex real-world scenes. [Method] The Swin Transformer was used to replace the backbone network of the Mask R-CNN instance segmentation model, which was then applied to recognize and segment images of Heortia vitessoides larvae infesting Aquilaria sinensis. Swin Transformer and ResNet models with different structural parameters were each configured as the Mask R-CNN backbone, with the input and output dimensions of each layer adjusted accordingly, and compared experimentally. The recognition and segmentation performance of the Mask R-CNN models with the different backbones on H. vitessoides larvae was analyzed both quantitatively and qualitatively to determine the best model structure. [Result] (1) The proposed method achieved an F1 score of 89.7% and an average precision (AP) of 88.0% for pest detection (bounding-box selection), and an F1 score of 84.3% and an AP of 82.2% for pest segmentation, improvements of 8.75% and 8.40% over the original Mask R-CNN in box detection and mask segmentation, respectively. (2) For small-target pest recognition and segmentation, the method achieved an F1 score of 88.4% and an AP of 86.3% for detection, and an F1 score of 84.0% and an AP of 81.7% for segmentation, improvements of 9.30% and 9.45% over the original Mask R-CNN, respectively. [Conclusion] For image instance segmentation in complex real-world scenes, recognition and segmentation performance depends heavily on the model's ability to extract image features. The Mask R-CNN instance segmentation model integrating the Swin Transformer extracts features more effectively in the backbone network and delivers better overall recognition and segmentation results. It can provide technical support for pest identification and monitoring, and a solution for protecting agricultural, forestry, animal husbandry, and other industrial resources.
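The abstract reports F1 scores alongside AP for both detection and segmentation; the F1 score is the harmonic mean of precision and recall. A minimal sketch of that computation (the precision/recall inputs below are hypothetical illustrations, not values from the paper):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall: F1 = 2PR / (P + R)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical example values (not taken from the paper):
print(round(f1_score(0.900, 0.894), 3))  # → 0.897
```

Because the harmonic mean is dominated by the smaller of the two inputs, a high F1 requires both precision and recall to be high, which is why it is commonly paired with AP when reporting detection results.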

