MusicFactory: Application of a Convolutional Neural Network for the Generation of Soundscapes from Images

Navarro-Cáceres, Juan José; Mendes, André Sales; Blas, Hector Sánchez San; González, Gabriel Villarrubia; Navarro-Cáceres, María

doi:10.1007/978-3-031-14859-0_14

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1430)

Included in the following conference series:

International Conference on Disruptive Technologies, Tech Ethics and Artificial Intelligence

Abstract

A soundscape is an acoustic description of a specific environment. Soundscapes are therefore always tied to a visual component, since they may capture the sounds of a city, the countryside, or a domestic setting. In this work, we present a system that generates soundscapes from images. First, the system recognizes objects in the image. Second, it searches for the most suitable sounds for the entities identified in the picture. Finally, it synthesizes a soundscape by combining the short sound files found. The results of the subjective evaluation are promising and encourage further research into soundscape generation.
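The three steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the detector is replaced by a placeholder that receives ready-made (label, confidence) pairs, the sound library is a hypothetical in-memory dictionary of synthetic tones standing in for short sound files, and the names `detect_objects`, `find_sounds`, `mix`, the 0.5 confidence threshold, and the 8000 Hz sample rate are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # Hz; assumed value for this sketch


def tone(freq_hz, seconds, rate=SAMPLE_RATE):
    """Stand-in for a short sound file: a sine tone as a list of floats."""
    n = int(seconds * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


# Hypothetical sound library keyed by object label (not the paper's data).
SOUND_LIBRARY = {
    "car": tone(110.0, 1.0),
    "bird": tone(880.0, 0.5),
    "dog": tone(330.0, 0.75),
}


def detect_objects(image, threshold=0.5):
    """Step 1 placeholder: a real system would run an object detector here.

    In this sketch the 'image' is already a list of (label, confidence)
    pairs, and labels below the assumed threshold are discarded.
    """
    return [label for label, conf in image if conf >= threshold]


def find_sounds(labels, library=SOUND_LIBRARY):
    """Step 2: pick one clip per distinct label that the library covers."""
    distinct = dict.fromkeys(labels)  # removes duplicates, keeps order
    return [library[label] for label in distinct if label in library]


def mix(clips):
    """Step 3: overlay clips sample by sample and normalise the peak to 1.0."""
    if not clips:
        return []
    out = [0.0] * max(len(clip) for clip in clips)
    for clip in clips:
        for i, sample in enumerate(clip):
            out[i] += sample
    peak = max(abs(sample) for sample in out)
    return [sample / peak for sample in out] if peak > 0 else out


def generate_soundscape(image):
    """Chain the three steps: recognise objects, find sounds, combine them."""
    return mix(find_sounds(detect_objects(image)))
```

Calling `generate_soundscape([("car", 0.9), ("bird", 0.7), ("tree", 0.8)])` mixes the "car" and "bird" clips and skips "tree", which the hypothetical library does not cover. Simple overlay with peak normalisation is only one way to combine clips; the abstract does not specify how the system places or weights the sounds.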

Notes

1. https://usalinvestigacion.eu.qualtrics.com/jfe/form/SV_6LuVypXB6UbLM4S.

Acknowledgments

The research of André Filipe Sales Mendes has been co-financed by the European Social Fund and Junta de Castilla y León (Operational Programme 2014–2020 for Castilla y León, EDU/556/2019 BOCYL) and partially supported by the project “FolkAI: Disseminate Folk European Music through Artificial Intelligence” (EIN2020-112348) under the program Research Europe 2020 financed by the Economy Ministry (Spanish Government).

Author information

Authors and Affiliations

Expert Systems and Applications Lab - ESALAB, Faculty of Science, University of Salamanca, Plaza de los Caídos s/n, 37008, Salamanca, Spain
Juan José Navarro-Cáceres, André Sales Mendes, Hector Sánchez San Blas, Gabriel Villarrubia González & María Navarro-Cáceres

Corresponding author

Correspondence to María Navarro-Cáceres.

Editor information

Editors and Affiliations

Facultad de Informática, Universidad Pontificia de Salamanca, Salamanca, Spain
Daniel H. de la Iglesia
Faculty of Science, University of Salamanca, Salamanca, Spain
Juan F. de Paz Santana
Facultad de Informática, Universidad Pontificia de Salamanca, Salamanca, Spain
Alfonso J. López Rivero

About this paper

Cite this paper

Navarro-Cáceres, J.J., Mendes, A.S., Blas, H.S.S., González, G.V., Navarro-Cáceres, M. (2023). MusicFactory: Application of a Convolutional Neural Network for the Generation of Soundscapes from Images. In: de la Iglesia, D.H., de Paz Santana, J.F., López Rivero, A.J. (eds) New Trends in Disruptive Technologies, Tech Ethics and Artificial Intelligence. DiTTEt 2022. Advances in Intelligent Systems and Computing, vol 1430. Springer, Cham. https://doi.org/10.1007/978-3-031-14859-0_14

DOI: https://doi.org/10.1007/978-3-031-14859-0_14
Published: 28 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14858-3
Online ISBN: 978-3-031-14859-0
eBook Packages: Intelligent Technologies and Robotics (R0)
