1 Introduction

Since the advent of information technology, the number of multimedia documents has grown continuously. In fact, more than 80% of company and organization data is in the form of documents. Multimedia documents are characterized by rich content and complex structures, which complicates access to specific granules within them and therefore makes document retrieval a tedious task. The graph is among the most effective representation models, since it allows the representation of complex and connected data such as multimedia documents. Comparing two documents structurally means comparing the graphs that represent them, so graph theory can be of great interest in the evaluation of structural similarity.

In general, comparing two graphs amounts to finding the best matching between them. Graph theory offers two general approaches to matching: exact matching and approximate matching. In many application fields, the goal is not to show that two graphs are structurally identical but to know how similar they are. In such applications, graph similarity based on exact matching is not appropriate. For this purpose, approximate, error-tolerant graph matching based on finding the maximum common subgraph or on computing the graph edit distance has been proposed in [8, 9, 17]. In the context of images, finding granules of visual information in an image requires special techniques in order to respond to the user’s need. Image Retrieval can be based either on text or on visual content. However, a key limitation of traditional image retrieval systems is that they ignore the semantic aspect. Several works have used graphs to represent images [2, 6, 7, 32]. As noted in [1], graphs are rich data structures with the ability to represent complex and structured objects.

In this paper, we present the basic concepts of Image Retrieval and its techniques. We also carry out a comparative study of image retrieval techniques to present their advantages and drawbacks. The central problem of this work is the integration of the semantic aspect in graph-based Image Retrieval approaches.

The remainder of this paper is organized as follows. Section 2 describes the concept of Image Retrieval and its techniques, and presents the advantages and limits of each type of technique. Section 3 presents a comparative study of graph-based Image Retrieval and outlines the integration of the semantic aspect in graph-based approaches. Finally, Sect. 4 discusses our issue, which focuses on graph-based semantic Image Retrieval.

2 Image Retrieval

Information Retrieval (IR) is concerned with the acquisition, structuring, storage, retrieval, and ranking of information [13]. It is based on the information needs of the user. This task can be applied to all types of data: text, images, video, and music tracks. An image is the visual representation of an object in a given medium or on a given support.

2.1 Image Retrieval Techniques

Image Retrieval (ImR) is the area that studies how to find images in an image collection. The issue, moreover, is to rank the images by their similarity to the user’s request. ImR attracts the attention of many researchers in fields such as digital libraries, remote sensing, and astronomy. It has been a very active field since the 1970s. ImR systems can be classified into three techniques [20]:

  • Text-based Image Retrieval (TBIR) retrieves images by text queries. Finding images similar to a user query relies on an indexing process, which attaches a set of descriptors to each image. This technique uses the textual index of an image, its metadata, or the textual elements attached to the image. A lot of research has been done on TBIR, but most of it is old because greater importance has since been given to the other types of ImR. TBIR can be based on annotation [15]. The major drawback of this approach is that a descriptive annotation must be produced manually, which makes the task complex. One of the first examples of TBIR is [36], which presents a framework that annotates images with text and then uses text-based databases.

  • Content-based Image Retrieval (CBIR): over the last decades, interest in image collections has grown strongly with the development of image acquisition devices, storage capacities, and the availability of high-quality digitization techniques. CBIR is based on colors, textures, shapes, and other characteristics (depending on the user’s needs). In other words, it consists of extracting visual descriptors and retrieving by visual similarity. This technique responds to many needs in the field of ImR and overcomes the limitations of TBIR. An image can be described by a weighting function that reflects the importance of the features and varies widely according to the system and the objectives. Work [34] presents an effective content-based visual ImR system that extracts the color histogram and spatial information. Work [14] presents a CBIR approach using a computational visual attention model, based on saliency regions and on energy features of the gray-level co-occurrence and saliency structure histogram. Work [16] presents an approach using a local visual attention feature, based on a fast and efficient salient point detector and on salient point expansion.

  • Semantic-based Image Retrieval (SBIR) describes an image using semantic terms in order to determine the meaning it conveys. SBIR can be achieved by extracting visual descriptors from the image to identify its significant and interesting regions, followed by a knowledge extraction process that yields a semantic description of the image. This technique depends on several factors, which the following section focuses on. Table 1 summarizes the advantages and limits of each technique.

Table 1. Advantages and Limits of ImR techniques

According to Table 1, each technique has its advantages and limitations; therefore, the choice among TBIR, CBIR, and SBIR depends on the objective of the task carried out by the user and on the context studied. Several works aim to increase the relevance of the results and, to that end, combine more than one category. Work [25] presents a decisive content-based ImR approach for feature fusion in visual and textual images. In [30], a system is proposed that combines textual and visual statistics in a single index vector for content-based retrieval. Work [29] presents a content-based system that develops its own ontology module; it significantly increases the relevance of retrieval results by enhancing the ranking of images.
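To make the CBIR idea of "extracting visual descriptors and retrieving by visual similarity" more concrete, the following minimal sketch ranks a toy collection by color-histogram intersection. It is an illustration only and not the method of [34]: the quantization into 4 bins per channel, the function names, and the pixel data are all assumptions introduced here.

```python
from collections import Counter

def color_histogram(pixels, bins_per_channel=4):
    """Quantize RGB pixels into a normalized color histogram (assumed descriptor)."""
    step = 256 // bins_per_channel
    counts = Counter((r // step, g // step, b // step) for r, g, b in pixels)
    total = sum(counts.values())
    return {bin_: n / total for bin_, n in counts.items()}

def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of the bin-wise minima of two normalized histograms."""
    return sum(min(v, h2.get(k, 0.0)) for k, v in h1.items())

def rank_by_similarity(query_pixels, collection):
    """Return (name, score) pairs sorted from most to least similar to the query."""
    q = color_histogram(query_pixels)
    scored = [(name, histogram_intersection(q, color_histogram(px)))
              for name, px in collection.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy "images" given as flat lists of RGB pixels (hypothetical data).
collection = {
    "sunset": [(250, 120, 30)] * 6 + [(40, 40, 90)] * 2,
    "forest": [(30, 140, 40)] * 7 + [(90, 60, 20)] * 1,
}
query = [(245, 110, 20)] * 5 + [(30, 30, 80)] * 3
ranking = rank_by_similarity(query, collection)
print(ranking)  # [('sunset', 0.875), ('forest', 0.0)]
```

A descriptor of this kind captures only the color distribution, which illustrates why purely visual similarity may miss the semantic aspect discussed in the rest of the paper.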

2.2 Image Representation Models

For several years, great effort has been devoted to the study of image representation models; the most commonly used are vectors, strings, trees, and graphs. Vectors are often used in ImR: works [16, 29, 34] model the image as a vector.

A string is an ordered sequence of elements, used in ImR when the order of the elements matters. The distance between two strings is often defined by the Levenshtein edit distance [33], as in [23]. However, vector-based and string-based approaches suffer from limited modeling power and may not be suitable in all situations for modeling complex objects. Likewise, the tree-based model allows the representation of hierarchical relationships but is not practical for modeling complex relations. To solve this issue, many researchers have proposed the use of graph theory in ImR.
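As an illustration of the string distance mentioned above, the following sketch computes the Levenshtein edit distance with the standard dynamic-programming recurrence (unit cost for insertion, deletion, and substitution). Representing an image as a sequence of region labels is an assumption made here for the example and is not taken from [23].

```python
def levenshtein(a, b):
    """Edit distance between two sequences (insert, delete, substitute: cost 1 each)."""
    previous = list(range(len(b) + 1))        # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        current = [i]                         # deleting the first i elements of a
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,              # delete x
                current[j - 1] + 1,           # insert y
                previous[j - 1] + (x != y),   # substitute x by y, or match
            ))
        previous = current
    return previous[-1]

print(levenshtein("kitten", "sitting"))  # 3

# Hypothetical image strings: ordered region labels read from top to bottom.
image_a = ["sky", "sea", "sand"]
image_b = ["sky", "sand"]
print(levenshtein(image_a, image_b))     # 1
```

Because the function works on any sequence, the same code applies to character strings and to sequences of symbolic image components.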

The graph-based model makes it possible to represent all possible relations between components; the semantics associated with an arc is not limited to a typing or membership relation. Graphs offer a very rich modeling of documents and their structures. In a graph-based model, an image is represented as a set of components and a set of binary relations between these components. Graphs are widely used in many applications due to their very high expressiveness in terms of structure and semantics. Note that strings and trees are particular cases of graphs.

The application of graph theory to IR has been studied in several works because of its advantages in improving the efficiency of the IR engine. Indeed, graph-based measures make it possible to use the graph as a semantic representation model for queries and documents, and to exploit it in a semantic document search model [10, 24, 28].

3 Graph-Based Semantic Image Retrieval

3.1 Graph-Based Image Retrieval

In graph data, nodes represent entities and edges represent relationships between them. The graph structure is able to represent the meaning of entities and of the relationships between entities. This ability makes graphs increasingly popular in computer science. In general, the mathematical theory of graphs can be of great interest in measuring the similarity of objects. In [12], sub-graph isomorphism is used to show the inclusion or the equivalence of two graphs. Below, we give the mathematical definition of a graph:

Definition 1

A graph G can be defined by a pair (V, E), where V is the set of nodes of G and \(E \subseteq V \times V\) is the set of edges of G (relations between nodes).
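Definition 1 and the inclusion test mentioned above can be sketched in a few lines. The code below is a brute-force illustration under stated assumptions: edges are directed pairs, inclusion is checked as non-induced sub-graph isomorphism (every edge of the pattern must map to an edge of the target), and the scene labels are hypothetical. It is exponential in the number of nodes and is not the algorithm of [12].

```python
from itertools import permutations

def make_graph(nodes, edges):
    """Return G = (V, E) after checking that E is a subset of V x V."""
    V, E = frozenset(nodes), frozenset(edges)
    if not all(u in V and v in V for u, v in E):
        raise ValueError("every edge must connect two nodes of V")
    return V, E

def is_subgraph_isomorphic(pattern, target):
    """True if an injective node mapping sends every pattern edge to a target edge."""
    (Vp, Ep), (Vt, Et) = pattern, target
    p_nodes = sorted(Vp)
    # Try every injective assignment of pattern nodes to target nodes.
    for image in permutations(sorted(Vt), len(p_nodes)):
        mapping = dict(zip(p_nodes, image))
        if all((mapping[u], mapping[v]) in Et for u, v in Ep):
            return True
    return False

# Hypothetical image graph: components as nodes, binary relations as edges.
scene = make_graph(
    {"sun", "sky", "sea", "boat"},
    {("sun", "sky"), ("sky", "sea"), ("boat", "sea")},
)
chain2 = make_graph({"a", "b", "c"}, {("a", "b"), ("b", "c")})
chain3 = make_graph({"a", "b", "c", "d"}, {("a", "b"), ("b", "c"), ("c", "d")})

print(is_subgraph_isomorphic(chain2, scene))  # True  (sun -> sky -> sea)
print(is_subgraph_isomorphic(chain3, scene))  # False (no directed path with 3 edges)
```

Under these assumptions, two graphs can be regarded as equivalent when they have the same numbers of nodes and edges and one is included in the other.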

In the following, we review a number of works on graph-based ImR.

3.2 Factors Adding Semantic Aspect

Taking the semantic aspect into account in ImR means retrieving relevant results while considering the overall meaning of the image. SBIR aims to give an adequate interpretation of the image. The purpose of the following table is to identify the factors that add a semantic aspect to graph-based ImR.

Table 2. Approaches using graphs

Table 2 shows image-based approaches, classified according to the graph model and to the factors reflecting the semantic aspect in graph-based ImR approaches. These factors contribute to enhancing the semantic aspect of the ImR system.

In the following, these factors are grouped together to clearly identify those that implicitly or explicitly influence the semantics of ImR approaches.

To evaluate retrieval performance, work [5] designs an automatic scheme to simulate relevance feedback. The simulation system automatically classifies a database image as relevant if it belongs to the same semantic category as the initial query. The authors report experimental results showing that the relevance feedback technique improves retrieval performance for semantic categories with clear region correspondence.

In [35], relevance feedback is defined as a powerful interactive technique used to improve the performance of ImR systems. With user-provided relevant/irrelevant information on the retrieved images, the system can capture the semantic concept of the query more accurately and gradually improve retrieval precision.
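The feedback loop described in the two paragraphs above can be sketched as follows. The labeling rule (an image is relevant if it shares the semantic category of the query) follows the description of [5]; everything else is an assumption made for illustration: the feature vectors, the cosine ranking, and the Rocchio-style query update with its weights are not claimed to be the methods of [5] or [35].

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank(query, database):
    """Image ids sorted from most to least similar to the query vector."""
    return sorted(database,
                  key=lambda i: cosine(query, database[i]["features"]),
                  reverse=True)

def simulated_feedback(results, database, query_category):
    """Label a retrieved image relevant iff it shares the query's semantic category."""
    relevant = [database[i]["features"] for i in results
                if database[i]["category"] == query_category]
    irrelevant = [database[i]["features"] for i in results
                  if database[i]["category"] != query_category]
    return relevant, irrelevant

def centroid(vectors, dim):
    """Mean vector, or the zero vector when no example is available."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]

def rocchio_update(query, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Rocchio-style update (assumed): move toward relevant, away from irrelevant."""
    pos = centroid(relevant, len(query))
    neg = centroid(irrelevant, len(query))
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]

# Hypothetical database: 2-D features and a semantic category per image.
database = {
    "beach1": {"features": [0.9, 0.5], "category": "beach"},
    "beach2": {"features": [0.6, 0.4], "category": "beach"},
    "city1": {"features": [0.45, 0.55], "category": "city"},
    "city2": {"features": [0.1, 0.9], "category": "city"},
}
query, query_category, k = [0.5, 0.5], "beach", 2

first = rank(query, database)[:k]
relevant, irrelevant = simulated_feedback(first, database, query_category)
second = rank(rocchio_update(query, relevant, irrelevant), database)[:k]
print(first)   # ['city1', 'beach2']
print(second)  # ['beach2', 'beach1']
```

In this toy run, one round of simulated feedback raises the precision of the top two results from 1/2 to 2/2, which mirrors the gradual improvement described above.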

Work [18] presents an intelligent annotation-based ImR system that introduces concepts and instances, where annotations are stored as RDF triples and can be queried to find images. Annotations at the concept level make it possible to create semantic links between concepts and thereby address many challenges.
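The idea of storing annotations as triples and querying them can be illustrated with a small in-memory sketch. It does not use an RDF library and does not reproduce the system of [18]: the identifiers (`img:001`, `ex:depicts`, `ex:Landmark`, and so on) are hypothetical, and only the predicate names `rdf:type` and `rdfs:subClassOf` are borrowed from the RDF and RDFS vocabularies.

```python
# Annotations as (subject, predicate, object) triples; all names are hypothetical.
triples = {
    ("img:001", "ex:depicts", "ex:eiffel_tower"),          # instance-level annotation
    ("img:002", "ex:depicts", "ex:golden_gate"),
    ("ex:eiffel_tower", "rdf:type", "ex:Monument"),        # instance -> concept
    ("ex:golden_gate", "rdf:type", "ex:Bridge"),
    ("ex:Monument", "rdfs:subClassOf", "ex:Landmark"),     # concept-level links
    ("ex:Bridge", "rdfs:subClassOf", "ex:Landmark"),
}

def match(store, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return {t for t in store
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)}

def subconcepts(store, concept):
    """The concept itself plus everything reachable through rdfs:subClassOf."""
    found, frontier = {concept}, [concept]
    while frontier:
        current = frontier.pop()
        for s, _, _ in match(store, p="rdfs:subClassOf", o=current):
            if s not in found:
                found.add(s)
                frontier.append(s)
    return found

def images_of_concept(store, concept):
    """Images depicting an instance of the concept or of one of its subconcepts."""
    concepts = subconcepts(store, concept)
    instances = {s for s, _, o in match(store, p="rdf:type") if o in concepts}
    return sorted(s for s, _, o in match(store, p="ex:depicts") if o in instances)

print(images_of_concept(triples, "ex:Bridge"))    # ['img:002']
print(images_of_concept(triples, "ex:Landmark"))  # ['img:001', 'img:002']
```

The second query returns both images although neither is annotated with `ex:Landmark` directly, which shows how links between concepts can widen a query beyond its literal terms.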

Relevance feedback and query annotation are techniques that expand the query so that it better expresses the user’s need. Query annotation influences the graph before the retrieval process, whereas relevance feedback extends the graph to optimize the retrieval engine.

Regarding the underlying structure, most traditional methods focus on data features but ignore the underlying structure information, which plays a major role in semantic discovery, especially when the label information is unknown. Many databases have an underlying cluster or manifold structure [3].

The context of a node in the graph model also influences the semantics, although to a lesser degree: it implies taking into account the ascendant and descendant nodes in order to make a general interpretation of the image, as in [26].

Specific graphs, such as the semantic graph and the conceptual graph, are used for knowledge representation and reasoning. Work [22] presents CKSGIS, which automatically retrieves an interactive semantic graph of convigned terms that allows users to easily find related images, not limited to a specific search term. In [4, 11, 27], conceptual graphs have been used in semantic representations for ImR; they are widely used in graph-based ImR. The same holds for the scene graph, which logically structures the spatial representation of a scene, as in [21, 31].

4 Discussion and Conclusion

In the last few years, working with graph-based models has brought several advantages, owing to the architecture of the graph, its ability to model complex objects (in our case, images), and its ability to model complex relationships between these objects (e.g., the representation of multiple relationships between the same nodes).

In the ImR domain, several factors add a semantic aspect in order to increase the performance of the retrieval system and improve the relevance of the results. Based on the approaches presented in Table 2, the main problem addressed in this work is to determine to what extent graphs integrate the semantic aspect in ImR, considering not only approaches that deal with semantics explicitly but also those that express semantics implicitly. In this work, we were interested in the semantic aspect of graph-based ImR approaches, depending on the context of the study and the objectives pursued.

In this paper, we presented a state of the art of works related to graph-based ImR. In general, the paper reviews the ImR techniques (TBIR, CBIR, and SBIR) and image representation models (vectors, strings, trees, and graphs). From this we deduce that TBIR and CBIR techniques alone are not enough to achieve relevant and effective ImR and, according to the literature on image representation models, that the graph model is of great importance in ImR.

Graph-based ImR makes it possible to represent all possible relations between components; the semantics associated with an arc is not limited to a typing or membership relation. An image is represented as a set of components and a set of binary relations between these components.

The main purpose of the paper is to draw attention to graph-based approaches and their vital role in increasing the significance of the image and optimally serving the user’s interest. We have given an overview of existing approaches to ImR. We have concluded that the semantic aspect can be approached from several angles, in accordance with the study’s context and the objectives pursued. Based on this state of the art, we can conclude that graph-based approaches to ImR can open other leads for improving image-related information retrieval systems.

ImR is evolving into a promising field, but existing methods of semantic ImR must be adapted. Many challenges remain to be overcome in this domain. Our future work will focus on approaches using semantics in IR.