Details
Developing Interpretable Deep Learning Model for Subtropical Forest Type Classification Using Beijing-2, Sentinel-1, and Time-Series NDVI Data of Sentinel-2 (Indexed in SCI-EXPANDED and EI)
Document type: Journal article
English title: Developing Interpretable Deep Learning Model for Subtropical Forest Type Classification Using Beijing-2, Sentinel-1, and Time-Series NDVI Data of Sentinel-2
Authors: Chen, Shudan [1,2]; Wang, Xuefeng [1,2]; Shi, Mengmeng [1,2]; Tao, Guofeng [3]; Qiao, Shijiao [3]; Chen, Zhulin [1,2]
First author: Chen, Shudan
Corresponding author: Chen, ZL [1,2]
Affiliations: [1] Chinese Acad Forestry, Inst Forest Resource Informat Tech, Beijing 100091, Peoples R China; [2] State Forestry & Grassland Adm, Key Lab Forest Management & Growth Modelling, Beijing 100091, Peoples R China; [3] Beijing Normal Univ, Adv Interdisciplinary Inst Satellite Applicat, Beijing 100875, Peoples R China
Year: 2025
Volume: 16
Issue: 11
Journal: FORESTS
Indexed in: EI (Accession No. 20254819610260); Scopus (Accession No. 2-s2.0-105023054399); SCI-EXPANDED (Accession No. WOS:001625905200001)
Funding: This research was funded by the National Key Research and Development Program of China under Grant 2023YFB3907702 and by the National Natural Science Foundation of China under Grant 32401581.
Language: English
Keywords: forest classification; multi-modal remote sensing; deep learning; feature fusion; interpretability
Abstract: Accurate forest type classification in subtropical regions is essential for ecological monitoring and sustainable management. Multimodal remote sensing data provide rich complementary information, yet the synergy between network architectures and fusion strategies in deep learning models remains insufficiently explored. This study established a multimodal deep learning framework with integrated interpretability analysis by combining high-resolution Beijing-2 RGB imagery, Sentinel-1 data, and time-series Sentinel-2 NDVI data. Two representative architectures (U-Net and Swin-UNet) were systematically combined with three fusion strategies: feature concatenation (Concat), gated multimodal fusion (GMU), and Squeeze-and-Excitation (SE). To quantify feature contributions and decision patterns, three complementary interpretability methods were also employed: Shapley Additive Explanations (SHAP), Grad-CAM++, and occlusion sensitivity. Results show that Swin-UNet consistently outperformed U-Net. The SwinUNet-SE model achieved the highest overall accuracy (OA) of 82.76%, exceeding the best U-Net model by 3.34%, with the largest improvement (5.8%) observed for mixed forest classification. The effectiveness of a fusion strategy depended strongly on the architecture: in U-Net, SE and Concat improved OA by 0.91% and 0.23% over the RGB baseline, while GMU slightly reduced it; in Swin-UNet, all three strategies achieved larger gains of 1.03% to 2.17%, and SE effectively reduced NDVI sensitivity. SHAP analysis showed that RGB features contributed most (SHAP values > 0.0015), NDVI features from winter and spring ranked in the top 50%, and Sentinel-1 features contributed the least. These findings reveal how architecture and fusion design interact to enhance multimodal forest classification.
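For readers unfamiliar with the two learned fusion strategies named in the abstract, the following is a minimal PyTorch sketch of Squeeze-and-Excitation (SE) channel reweighting over concatenated modality features and a two-modality gated multimodal unit (GMU). The module names, channel sizes, and two-branch simplification are illustrative assumptions for exposition, not the authors' implementation.

```python
# Hedged sketch of the SE and GMU fusion blocks described in the abstract.
# Channel counts, the two-modality simplification, and all names here are
# illustrative assumptions; the paper's actual architecture may differ.
import torch
import torch.nn as nn

class SEFusion(nn.Module):
    """Concatenate modality features, then reweight channels with an SE block."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)          # squeeze: global spatial average
        self.fc = nn.Sequential(                     # excitation: per-channel gates
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, feats: list[torch.Tensor]) -> torch.Tensor:
        x = torch.cat(feats, dim=1)                  # (B, C, H, W), C = sum of input channels
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                                 # channel-wise reweighting

class GMUFusion(nn.Module):
    """Gated multimodal unit for two modalities: a learned gate z controls the mix."""
    def __init__(self, c_a: int, c_b: int, c_out: int):
        super().__init__()
        self.proj_a = nn.Conv2d(c_a, c_out, kernel_size=1)
        self.proj_b = nn.Conv2d(c_b, c_out, kernel_size=1)
        self.gate = nn.Conv2d(c_a + c_b, c_out, kernel_size=1)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        h_a = torch.tanh(self.proj_a(a))             # modality-specific representations
        h_b = torch.tanh(self.proj_b(b))
        z = torch.sigmoid(self.gate(torch.cat([a, b], dim=1)))  # gate values in (0, 1)
        return z * h_a + (1 - z) * h_b               # per-location convex combination

if __name__ == "__main__":
    rgb = torch.randn(2, 64, 32, 32)                 # e.g. Beijing-2 RGB encoder features
    sar = torch.randn(2, 32, 32, 32)                 # e.g. Sentinel-1 encoder features
    fused_se = SEFusion(channels=96)([rgb, sar])     # 64 + 32 concatenated channels
    fused_gmu = GMUFusion(64, 32, 64)(rgb, sar)
    print(fused_se.shape, fused_gmu.shape)           # (2, 96, 32, 32) and (2, 64, 32, 32)
```

The design difference the abstract highlights is visible here: SE rescales channels of an already-concatenated feature map, while GMU learns an explicit per-location trade-off between modalities, which may explain why their benefit varies with the backbone architecture.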
