Details
Clusterformer for Pine Tree Disease Identification Based on UAV Remote Sensing Image Segmentation (indexed in SCI-EXPANDED and EI) Times cited: 9
Document type: Journal article
English title: Clusterformer for Pine Tree Disease Identification Based on UAV Remote Sensing Image Segmentation
Authors: Liu, Huan[1]; Li, Wei[1]; Jia, Wen[2,3]; Sun, Hong[4]; Zhang, Mengmeng[1]; Song, Lujie[1]; Gui, Yuanyuan[1]
First author: Liu, Huan
Corresponding authors: Li, W[1]; Jia, W[2]
Affiliations: [1] Beijing Inst Technol, Sch Informat & Elect, Beijing 100081, Peoples R China; [2] Chinese Acad Forestry, Inst Forest Resource Informat Tech, Beijing 100091, Peoples R China; [3] Natl Forestry & Grassland Adm, Key Lab Forestry Remote Sensing & Informat Syst, Beijing 100091, Peoples R China; [4] Natl Forestry & Grassland Adm, Ctr Biol Disaster Prevent & Control, Shenyang 110034, Peoples R China
Year: 2024
Volume: 62
Pages: 1-15
Journal: IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING
Indexed in: EI (accession no. 20240715545644); Scopus (accession no. 2-s2.0-85184819866); WOS: SCI-EXPANDED (accession no. WOS:001164472500007)
Funding: No Statement Available
Language: English
Keywords: Cluster transformer; pine wilt identification; semantic segmentation; unmanned aerial vehicle (UAV) remote sensing
Abstract: Pine wilt disease (PWD) is one of the most prevalent pine tree diseases, resulting in both ecological and economic havoc. Unmanned aerial vehicle (UAV) remote sensing segmentation plays a crucial role in the early identification and prevention of PWD. However, deep learning segmentation models customized for PWD identification in scenarios with complex backgrounds have not been extensively explored. In this article, we propose a novel UAV remote sensing segmentation model called Clusterformer with a conventional encoder-decoder structure. The encoder comprises the specially designed cluster transformer, which includes a cluster token mixer and a spatial-channel feed-forward network (SC-FFN). The cluster token mixer utilizes constructed clusters from the feature maps to represent pixels, thereby reducing redundant and interfering information. The SC-FFN extracts multiscale spatial information through depth-wise convolutions and channel information through a multilayer perceptron (MLP) in sequence. The decoder primarily consists of the specially designed D-cluster transformer. The token mixer of the D-cluster transformer employs constructed clusters from high-level decoded tokens to represent low-level encoded tokens without relying on traditional upsampling methods such as interpolation, transpose convolution, or patch expansion. Consequently, more robust and less redundant features from high-level decoded feature maps are transferred to low-level encoded feature maps. Experimental results on two PWD datasets demonstrate that Clusterformer outperforms existing state-of-the-art segmentation models. This confirms the effectiveness and efficiency of Clusterformer in PWD identification. The code is available at https://github.com/huanliu233/Clusterformer.
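Illustrative note: the abstract's idea of using constructed clusters to represent pixels can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation (which is in the repository linked above). The function name `cluster_token_mixer`, the dot-product similarity, the single soft-assignment step, and the `temperature` parameter are all hypothetical choices made for illustration; the published model presumably uses learned projections and operates on 2-D feature maps.

```python
import math


def cluster_token_mixer(tokens, centers, temperature=1.0):
    """Represent each token by a mix of cluster features (illustrative only).

    tokens:  list of N feature vectors (one per pixel), each of length C
    centers: list of K initial cluster centers, each of length C
    Assumption: one soft-assignment step with no learned parameters.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    def softmax(scores):
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(scores)
        exps = [math.exp((s - m) / temperature) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    num_clusters = len(centers)
    num_channels = len(tokens[0])

    # 1. Soft-assign every token to the clusters by dot-product similarity.
    assign = [softmax([dot(t, c) for c in centers]) for t in tokens]

    # 2. Aggregate tokens into K cluster features (assignment-weighted means).
    clusters = []
    for k in range(num_clusters):
        weight = sum(a[k] for a in assign)
        clusters.append([
            sum(a[k] * t[ch] for a, t in zip(assign, tokens)) / weight
            for ch in range(num_channels)
        ])

    # 3. Dispatch: rebuild each token from the cluster features it is assigned to.
    mixed = [
        [sum(a[k] * clusters[k][ch] for k in range(num_clusters))
         for ch in range(num_channels)]
        for a in assign
    ]
    return mixed, clusters
```

Because each output token is rebuilt from only K cluster features rather than from all N pixels, pixel-level noise is averaged out, which is one plausible reading of the abstract's claim about reducing redundant and interfering information.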