
LAANet: lightweight attention-guided asymmetric network for real-time semantic segmentation

Review, published in Neural Computing and Applications

Abstract

With growing demand from real-world applications such as robot navigation and autonomous driving, achieving a good trade-off among segmentation accuracy, inference speed, and model size has become a core issue in real-time semantic segmentation. In this paper, we propose a lightweight attention-guided asymmetric network (LAANet), which adopts an asymmetric encoder–decoder architecture. In the encoder, we propose an efficient asymmetric bottleneck (EAB) module to jointly extract local and contextual information. In the decoder, we propose an attention-guided dilated pyramid pooling (ADPP) module, which aggregates multi-scale context information, and an attention-guided feature fusion upsampling (AFFU) module, which fuses features from different layers. LAANet has only 0.67M parameters, yet achieves 73.6% and 67.9% mean Intersection over Union (mIoU) at 95.8 and 112.5 frames per second (FPS) on the Cityscapes and CamVid datasets, respectively. The experimental results show that LAANet achieves an optimal trade-off between segmentation accuracy, inference speed, and model size.
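To make the two reported metrics concrete, the following minimal sketch computes mIoU from a confusion matrix and converts FPS to per-frame latency. It uses the standard textbook definitions and is not taken from the paper's evaluation code. The function names `mean_iou` and `latency_ms`, the `ignore_index` parameter, and the toy labels are assumptions introduced here for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean Intersection over Union for two flat sequences of class labels.

    Hypothetical helper: standard mIoU definition, not the paper's own script.
    """
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # skip unlabeled pixels, if a void label is given
            continue
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]                                          # true positives
        fn = sum(conf[c]) - tp                                   # false negatives
        fp = sum(conf[r][c] for r in range(num_classes)) - tp    # false positives
        union = tp + fp + fn
        if union > 0:  # ignore classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def latency_ms(fps):
    """Per-frame latency in milliseconds implied by a throughput in FPS."""
    return 1000.0 / fps


if __name__ == "__main__":
    # Toy example with two classes and four pixels (illustrative data only).
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
    # Latency implied by the FPS figures quoted in the abstract.
    print(round(latency_ms(95.8), 2), round(latency_ms(112.5), 2))
```

Under these definitions, the quoted 95.8 and 112.5 FPS correspond to roughly 10.4 ms and 8.9 ms per frame; the input resolution and hardware behind those figures are not stated in the abstract.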



Acknowledgements

This work was supported by the Hebei Provincial Department of Education through the 2021 provincial postgraduate demonstration course construction project under Grant KCJSX2021024.

Author information

Corresponding author

Correspondence to Bingce Du.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Zhang, X., Du, B., Wu, Z. et al. LAANet: lightweight attention-guided asymmetric network for real-time semantic segmentation. Neural Comput & Applic 34, 3573–3587 (2022). https://doi.org/10.1007/s00521-022-06932-z
