
OntoScene, A Logic-Based Scene Interpreter: Implementation and Application in the Rock Art Domain

Published online by Cambridge University Press:  15 January 2020

DANIELA BRIOLA
Affiliation:
Department of Computer Sciences, Systems and Communications, University of Milano Bicocca, Italy (e-mail: daniela.briola@unimib.it)
VIVIANA MASCARDI
Affiliation:
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Italy (e-mails: viviana.mascardi@unige.it, gmaxsun89@gmail.com)
MASSIMILIANO GIOSEFFI
Affiliation:
Department of Informatics, Bioengineering, Robotics, and Systems Engineering, University of Genova, Italy (e-mails: viviana.mascardi@unige.it, gmaxsun89@gmail.com)

Abstract


We present OntoScene, a framework for understanding the semantics of visual scenes starting from the semantics of their elements and the spatial relations holding between them. OntoScene uses ontologies to represent knowledge, and Prolog both to specify the interpretation rules that domain experts may adopt and to implement the SceneInterpreter engine. Ontologies allow the designer to formalize the domain in a reusable way and make the system modular and interoperable with existing multiagent systems, while Prolog provides a solid basis for defining complex interpretation rules in a way that is accessible even to people with no background in Computational Logic. The domain selected for experimenting with OntoScene is prehistoric rock art, which provides a fascinating and challenging testbed.
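As a rough illustration of the idea of deriving a scene-level interpretation from element concepts and spatial relations, the following minimal Python sketch may help. It is not the authors' implementation, which is written in Prolog and driven by ontologies; the names `Element`, `above`, `interpret`, and `RULES`, the toy concepts `Circle` and `Square`, and the rule label are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Element:
    """A scene element with a (hypothetical) concept label and a position."""
    name: str      # identifier of the element in the scene
    concept: str   # toy concept label standing in for an ontology class
    x: float       # centre of the element, horizontal coordinate
    y: float       # centre of the element; y grows downward, as in images


def above(a: Element, b: Element) -> bool:
    """Toy spatial relation: a lies above b in image coordinates."""
    return a.y < b.y


# A rule is (label, concept of first element, relation, concept of second).
Rule = Tuple[str, str, Callable[[Element, Element], bool], str]

# Hypothetical rule: a Circle located above a Square.
RULES: List[Rule] = [("circle_over_square", "Circle", above, "Square")]


def interpret(elements: List[Element], rules: List[Rule]) -> List[Tuple[str, str, str]]:
    """Return (label, first element, second element) for every rule match."""
    found = []
    for label, c1, relation, c2 in rules:
        for a in elements:
            for b in elements:
                if a is not b and a.concept == c1 and b.concept == c2 and relation(a, b):
                    found.append((label, a.name, b.name))
    return found


scene = [Element("e1", "Circle", 0.0, 0.0), Element("e2", "Square", 0.0, 5.0)]
print(interpret(scene, RULES))
```

Keeping the rules as data, separate from the matching loop, loosely mirrors the separation the abstract describes between expert-written interpretation rules and the interpreter engine.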

Type
Original Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright
© The Author(s), 2020. Published by Cambridge University Press

Footnotes

*

We thank Prof. Henry de Lumley and Annie Echassoux for granting us permission to reproduce some figures from their book (de Lumley and Echassoux 2011), and Martine Bertéa, Rights Director of CNRS éditions, for helping us obtain their permission. We are grateful to Dr. Nicoletta Bianchi for her valuable support in the IndianaMAS project and in the activities we carried out after its conclusion. Finally, we thank the anonymous reviewers for their thorough reading and for their constructive comments.

References

Agustí, J., Puigsegur, J. and Robertson, D. 1998. A visual syntax for logic and logic programming. Journal of Visual Languages & Computing 9, 4, 399–428.
Antanas, L., van Otterlo, M., Mogrovejo, O., Antonio, J., Tuytelaars, T. and De Raedt, L. 2012. A relational distance-based framework for hierarchical image understanding. In Proceedings of the 1st International Conference on Pattern Recognition Applications and Methods, vol. 2, 206–218.
Baader, F., Bürkert, H.-J., Heinsohn, J. and Hollunder, B. 1991. Terminological knowledge representation: A proposal for a terminological logic. In International Workshop on Terminological Logics. KIT-Report 89, TU Berlin, Fachbereich Informatik.
Baldassano, C. 2015. Visual Scene Perception in the Human Brain: Connections to Memory, Categorization, and Social Cognition. Ph.D. thesis, Stanford University.
Bannour, H. and Hudelot, C. 2011. Towards ontologies for image interpretation and annotation. In 2011 9th International Workshop on Content-Based Multimedia Indexing (CBMI). IEEE, 211–216.
Bellifemine, F. L., Caire, G. and Greenwood, D. 2007. Developing Multi-Agent Systems with JADE. John Wiley & Sons.
Bianchi, N. 2011. Mount Bego prehistoric rock carvings. Adoranten, 70–80.
Bicknell, C. 1913. A Guide to the Prehistoric Rock Engravings in the Italian Maritime Alps. Tip. G. Bessone.
Briola, D. 2016. Agents and Ontologies for a Smart Management of Heterogeneous Data: The Indianamas System. Studies in Computational Intelligence, vol. 616. Springer, 25–36.
Briola, D., Deufemia, V., Mascardi, V. and Paolino, L. 2017. Agent-oriented and ontology-driven digital libraries: The Indianamas experience. Software – Practice and Experience 47, 11, 1773–1799.
Briola, D., Deufemia, V., Mascardi, V., Paolino, L. and Bianchi, N. 2014. Ontology-driven processing and management of digital rock art objects in Indianamas. In Proceedings of 5th International Conference Digital Heritage (EuroMed 2014). Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 8740, 217–227.
Briola, D., Mascardi, V. and Gioseffi, M. 2018. OntologyBeanGenerator 5.0: Extending ontology concepts with methods and exceptions. In Proceedings of the 19th Workshop From Objects to Agents, Palermo, Italy, Cossentino, M., Sabatucci, L. and Seidita, V., Eds. CEUR Workshop Proceedings, vol. 2215. CEUR-WS.org, 116–123.
Costagliola, G., Deufemia, V. and Risi, M. 2005. Sketch grammars: a formalism for describing and recognizing diagrammatic sketch languages. In Eighth International Conference on Document Analysis and Recognition (ICDAR’05), vol. 2, 1226–1230.
Crimi, C., Guercio, A., Nota, G., Pacini, G., Tortora, G. and Tucci, M. 1991. Relation grammars and their application to multi-dimensional languages. Journal of Visual Languages & Computing 2, 4, 333–346.
de Lumley, H. and Echassoux, A. 2009. The rock carvings of the chalcolithic and ancient bronze age from the Mont Bego area. The cosmogonic myths of the early metallurgic settlers in the southern alps. L’Anthropologie 113, 5, 969–1004.
de Lumley, H. and Echassoux, A. 2011. La montagne sacrée du Bego. CNRS Editions.
Di Martino, B. and Esposito, A. 2016. A rule-based procedure for automatic recognition of design patterns in UML diagrams. Software: Practice and Experience 46, 7, 983–1007.
Donahue, J., Jia, Y., Vinyals, O., Hoffman, J., Zhang, N., Tzeng, E. and Darrell, T. 2014. Decaf: A deep convolutional activation feature for generic visual recognition. In International Conference on Machine Learning, 647–655.
Dovier, A., Formisano, A. and Pontelli, E. 2005. A comparison of CLP(FD) and ASP solutions to np-complete problems. In Logic Programming, 21st International Conference, ICLP 2005, Sitges, Spain, October 2–5, 2005, Proceedings, Gabbrielli, M. and Gupta, G., Eds. Lecture Notes in Computer Science, vol. 3668. Springer, 67–82.
Forestier, G., Derivaux, S., Wemmert, C. and Gançarski, P. 2008. An evolutionary approach for ontology driven image interpretation. In Workshops on Applications of Evolutionary Computation. Springer, 295–304.
Gerber, C., Siekmann, J. H., and Vierke, G. 1999. Holonic multi-agent systems. Tech. Rep. DFKI-RR-99-03, Deutsches Forschungszentrum für Künstliche Intelligenz – GmbH, Postfach 20 80, 67608 Kaiserslautern, FRG.
Guarino, N., Oberle, D. and Staab, S. 2009. What is an ontology? In Handbook on Ontologies. Springer, 1–17.
Guérin, C., Rigaud, C., Bertet, K. and Revel, A. 2017. An ontology-based framework for the automated analysis and interpretation of comic books’ images. Information Sciences 378, C (Feb.), 109–130.
Haarslev, V. 1999. A logic-based formalism for reasoning about visual representations. Journal of Visual Languages & Computing 10, 4, 421–445.
Haarslev, V., Möller, R. and Schröder, C. 1994. Combining spatial and terminological reasoning. In Annual Conference on Artificial Intelligence. Springer, 142–153.
Haarslev, V., Möller, R. and Wessel, M. 2002. Visual spatial query languages: A semantics using description logic. In Diagrammatic Representation and Reasoning. Springer, 387–403.
Hammond, T. and Davis, R. 2007. LADDER, a sketching language for user interface developers. In ACM SIGGRAPH 2007 Courses, SIGGRAPH ’07. ACM, New York, NY, USA.
He, K., Zhang, X., Ren, S. and Sun, J. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770–778.
Helm, R. and Marriott, K. 1991. A declarative specification and semantics for visual languages. Journal of Visual Languages & Computing 2, 4, 311–331.
Henderson, J. M. and Hollingworth, A. 1999. High-level scene perception. Annual Review of Psychology 50, 1, 243–271.
Hill, E. F. 2003. Jess in Action: Java Rule-Based Systems. Manning Publications Co.
Karp, R. M. 1972. Reducibility among combinatorial problems. In Proceedings of a Symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, Miller, R. E. and Thatcher, J. W., Eds. The IBM Research Symposia Series. Plenum Press, New York, 85–103.
Karttunen, L. 1986. D-PATR: A development environment for unification-based grammars. In Proceedings of the 11th Conference on Computational Linguistics. Association for Computational Linguistics, 74–80.
Knuth, D. 2000. Dancing links. Millennial Perspectives in Computer Science 1.
Kondo, H. M., van Loon, A. M., Kawahara, J.-I. and Moore, B. C. J. 2017. Auditory and visual scene analysis: an overview. Philosophical Transactions 372, 1714 (February).
Kveraga, K. and Bar, M. Eds. 2014. Scene Vision: Making Sense of What We See. MIT Press.
Ladret, D. and Rueher, M. 1991. Vlp: A visual logic programming language. Journal of Visual Languages & Computing 2, 2, 163–188.
Li, S. and Ying, M. 2003. Region connection calculus: Its models and composition table. Artif. Intell. 145, 1–2, 121–146.
Lifschitz, V. 1999. Answer set planning. In International Conference on Logic Programming and Nonmonotonic Reasoning. Springer, 373–374.
Marr, D. 1982. Vision. W.H. Freeman, San Francisco, CA.
Marriott, K. and Stuckey, P. 1998. Programming with Constraints – An Introduction. MIT Press.
Mascardi, V., Briola, D., Locoro, A., Grignani, D., Deufemia, V., Paolino, L., Bianchi, N., de Lumley, H., Malafronte, D. and Ricciarelli, A. 2014. A holonic multi-agent system for sketch, image and text interpretation in the rock art domain. Int. J. of Innovative Computing, Information and Control 10, 1 (Feb.), 81–100.
McGuinness, D. L., Van Harmelen, F., et al. 2004. Owl web ontology language overview. W3C Recommendation 10, 10, 2004.
Meyer, B. 1992. Pictures depicting pictures on the specification of visual languages by visual grammars. In 1992 IEEE Workshop on Visual Languages, 1992. Proceedings. IEEE, 41–47.
Randell, D. A. and Cohn, A. G. 1989. Modelling topological and metrical properties in physical processes. In Proceedings of the 1st International Conference on Principles of Knowledge Representation and Reasoning (KR ’89). Toronto, Canada, May 15–18, 1989, Brachman, R. J., Levesque, H. J., and Reiter, R., Eds. Morgan Kaufmann, 357–368.
Santosh, K. C., Lamiroy, B., and Ropers, J. 2009. Inductive logic programming for symbol recognition. In 2009 10th International Conference on Document Analysis and Recognition, 1330–1334.
Sikos, L. F. 2017. Description Logics in Multimedia Reasoning. Springer.
Simonyan, K. and Zisserman, A. 2014. Very deep convolutional networks for large-scale image recognition. CoRR abs/1409.1556.
Ullman, S. 1996. High-Level Vision: Object Recognition and Visual Cognition. MIT Press, Cambridge, MA.
Wan, J., Wang, D., Hoi, S. C. H., Wu, P., Zhu, J., Zhang, Y. and Li, J. 2014. Deep learning for content-based image retrieval: A comprehensive study. In Proceedings of the 22nd ACM International Conference on Multimedia. ACM, 157–166.
Wittenburg, K., Weitzman, L. and Talley, J. 1991. Unification-based grammars and tabular parsing for graphical languages. Journal of Visual Languages & Computing 2, 4, 347–370.
Wooldridge, M. and Jennings, N. R. 1995. Intelligent agents: theory and practice. Knowledge Eng. Review 10, 2, 115–152.