Abstract
The geometric distortion of panoramic images invalidates saliency-detection methods based on traditional 2D convolution. "Mapped Convolution" addresses this problem effectively: it accepts a task- or domain-specific mapping function, in the form of an adjacency list, that dictates where the convolutional filters sample the input. However, when applied to panorama saliency detection, the method incurs additional computational overhead because the overlapping regions of adjacent convolution positions along the longitude are sampled repeatedly. To solve this problem, we improve the computation of "Mapped Convolution". Rather than accessing the adjacency list during the convolution, we sample the panorama according to the adjacency list only once, obtaining a sampled map. We call this sampling process the decoupled sampling of "Mapped Convolution". The sampled map is then convolved in the traditional 2D manner, so repeated sampling is avoided. We also propose an interpolation method based on the Softmax function and apply it to the interpolation step of decoupled sampling; compared with common methods such as linear interpolation, it makes our network more efficient during training. We additionally introduce a new adaptive equator-bias algorithm that allows different attention distributions at different longitudes, which is more consistent with viewers' visual behavior. Combining a U-Autoencoder network containing the decoupled sampling with the adaptive equator-bias algorithm, we construct a 360-degree visual saliency detection model. We map the original panorama onto a cube, remap it into a panorama with the cube isometric mapping method, and feed it into the network for training. The crude saliency map output by the decoder is then combined with the equator-bias map to obtain the final saliency map.
The results show that the proposed model is superior to recent state-of-the-art models in both computational speed and saliency-map prediction.
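The decoupled sampling and the Softmax-based interpolation described above can be sketched as follows. This is a hypothetical NumPy illustration, not the authors' implementation: the function names (`decoupled_sample`, `equator_bias_map`), the coordinate layout of the adjacency list, and the `temperature` parameter are all assumptions. The adjacency list is represented here as an array of fractional (row, col) sampling coordinates; each sample is interpolated from its four integer neighbours with weights given by a Softmax over negative distances, and the resulting sampled map can then be fed to any standard 2D convolution.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decoupled_sample(panorama, sample_coords, temperature=1.0):
    """Sample the panorama once at fractional (row, col) coordinates.

    panorama:      (H, W) array.
    sample_coords: (..., 2) array of fractional coordinates derived from
                   the adjacency list (hypothetical layout).
    Returns an array of shape sample_coords.shape[:-1]. Afterwards an
    ordinary 2D convolution runs on this sampled map, so the adjacency
    list is never consulted again inside the convolution.
    """
    H, W = panorama.shape
    ys, xs = sample_coords[..., 0], sample_coords[..., 1]
    y0, x0 = np.floor(ys).astype(int), np.floor(xs).astype(int)
    values, dists = [], []
    for dy in (0, 1):
        for dx in (0, 1):
            # Clip to image bounds; a real implementation would wrap
            # the longitude coordinate instead of clipping it.
            ny = np.clip(y0 + dy, 0, H - 1)
            nx = np.clip(x0 + dx, 0, W - 1)
            values.append(panorama[ny, nx])
            dists.append(np.hypot(ys - (y0 + dy), xs - (x0 + dx)))
    vals = np.stack(values, axis=-1)  # (..., 4) neighbour values
    # Softmax-based interpolation: nearer neighbours get larger weights.
    w = softmax(-np.stack(dists, axis=-1) / temperature, axis=-1)
    return (vals * w).sum(axis=-1)

def equator_bias_map(H, W, sigma=0.25):
    """Fixed latitude Gaussian bias as a baseline. The paper's adaptive
    version lets the distribution vary with longitude; that is not
    reproduced here."""
    lat = np.linspace(-np.pi / 2, np.pi / 2, H)
    bias = np.exp(-lat**2 / (2 * (sigma * np.pi) ** 2))
    return np.repeat(bias[:, None], W, axis=1)
```

Under the same assumptions, the final saliency map would be obtained by weighting the decoder's crude output with the bias map, e.g. element-wise multiplication followed by renormalization.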
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China under Grant U19A2063, and in part by the Jilin Provincial Science & Technology Development Program of China under Grant 20190302113GX.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Zhang, R., Chen, C., Zhang, J. et al. 360-degree visual saliency detection based on fast-mapped convolution and adaptive equator-bias perception. Vis Comput 39, 1163–1180 (2023). https://doi.org/10.1007/s00371-021-02395-w