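Sponsored by the China Aero Geophysical Survey and Remote Sensing Center for Natural Resources
Published by the Geological Publishing House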

Information extraction of roads from remote sensing images using CNN combined with Transformer

Citation: QU Haicheng, WANG Ying, LIU Lamei, HAO Ming. 2025. Information extraction of roads from remote sensing images using CNN combined with Transformer. Remote Sensing for Natural Resources, 37(1): 38-45. doi: 10.6046/zrzyyg.2023237

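  • Funding:

    Jointly supported by the General Program of the National Natural Science Foundation of China, "Research on efficient compression methods for hyperspectral imagery oriented to preserving data characteristics" (No. 42271409), and the Basic Scientific Research Project of Higher Education Institutions of Liaoning Province, "High-efficiency object detection based on fully spiking hybrid neural networks" (No. LIKMZ20220699)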

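Details
    Author biography: QU Haicheng (1981-), male, Ph.D., associate professor. His main research interests include high-performance computing for remote sensing images and intelligent big data processing. Email: quhaicheng@lntu.edu.cn
    Corresponding author: WANG Ying (1998-), female, master's student. Her main research interests are digital image processing and pattern recognition. Email: lntuwangying@163.com
  • CLC number: U495; TP751; TP183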

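  • When extracting road information from high-resolution remote sensing images, deep neural networks struggle to learn the global contextual information and the edge details of an image at the same time. To address this, this paper proposes a cascaded neural network that learns global semantic information and local spatial details simultaneously. First, the input feature maps are fed into the two branches of a dual-branch encoder: a convolutional neural network (CNN) and a Transformer. A shuffle attention dual branch fusion block (SA-DBF) then combines the features learned by the two branches, fusing global and local information. The SA-DBF block models the features of the two branches through fine-grained interaction, and uses multiple attention mechanisms to fully extract the channel and spatial information of the feature maps while suppressing invalid noise. Tested on the public Massachusetts roads dataset, the model reaches an overall accuracy (OA), F1 score, and intersection over union (IoU) of 98.04%, 88.03%, and 65.13%, respectively. Compared with the mainstream methods U-Net and TransRoadNet, IoU improves by 2.01 and 1.42 percentage points, respectively. The experimental results show that the proposed method outperforms the compared methods and effectively improves the accuracy of road segmentation.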
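The three evaluation metrics named in the abstract (OA, IoU, F1) can be illustrated with a minimal sketch. It assumes the standard binary definitions computed for the road class from pixel-wise true/false positives and negatives; the abstract does not state the exact definitions or any class averaging used in the paper, and the function name `road_metrics` and the toy labels are hypothetical.

```python
def road_metrics(pred, truth):
    """Return (OA, IoU, F1) for the road class from flat 0/1 label sequences.

    Assumption: standard binary definitions, road = 1, background = 0.
    """
    if len(pred) != len(truth):
        raise ValueError("pred and truth must have the same length")
    if not pred:
        raise ValueError("empty input")

    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1          # road predicted, road in ground truth
        elif p and not t:
            fp += 1          # road predicted, background in ground truth
        elif t:
            fn += 1          # background predicted, road in ground truth
        else:
            tn += 1          # background in both

    total = tp + fp + fn + tn
    oa = (tp + tn) / total                      # overall accuracy
    union = tp + fp + fn
    iou = tp / union if union else 0.0          # intersection over union
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 0.0
    return oa, iou, f1


# Toy example: 8 pixels, flattened (hypothetical data)
pred = [1, 1, 0, 0, 1, 0, 0, 0]
truth = [1, 0, 0, 0, 1, 1, 0, 0]
oa, iou, f1 = road_metrics(pred, truth)
print(f"OA={oa:.4f} IoU={iou:.4f} F1={f1:.4f}")
# prints: OA=0.7500 IoU=0.5000 F1=0.6667
```

Under these definitions, IoU = F1 / (2 - F1) for a single class, so IoU never exceeds F1. OA is also dominated by the background class when roads cover only a small fraction of the pixels, which is why a high OA can coexist with a much lower IoU.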
Publication history
Received: 2023-08-02
Revised: 2024-05-09
