Semantic-based Web Image Retrieval

Hai Zhuge
China Knowledge Grid Research Group, Key Lab of Intelligent Information Processing
Institute of Computing Technology, Chinese Academy of Sciences, P.O. Box 2704, Beijing, 100080, China


The main obstacle to realizing truly semantic-based image retrieval is that the semantics of an image are difficult to describe.  The basic idea of this paper is that the semantics of the destination image can be reflected by the semantics of relevant images and the semantic relationships between them.  Starting from this idea, semantic links are proposed to reflect semantic relationships between images, and reasoning rules are derived from the semantic links to assist intelligent image retrieval.  The proposed image retrieval approach has two distinctive advantages: first, the retrieval result is a semantic clustering of relevant images rather than a list of isolated images, as output by current web search engines; second, users are able to browse images by wandering along semantic paths, supported by underlying semantic rule reasoning.


Information Grid, Knowledge Grid, Semantic link, Semantic Web, Semantic Grid, Web image retrieval.


Text-based approaches apply text-based retrieval algorithms to the annotations of images, including keywords, image captions, the text surrounding an image, the entire text of the containing page, and filenames.  Text-based retrieval systems support some natural language or topic-descriptive queries.  Content-based image retrieval approaches apply image analysis techniques to extract visual features from images.  The features are extracted at the preprocessing stage and stored in the retrieval system’s database.  The extracted features are usually of high dimensionality, so some dimension reduction is needed for these systems to scale.  Hyperlink-based approaches make use of the link structure to retrieve relevant images [1].  Their basic premise is that a page p displays or links to an image when the author of p considers the image to be of value to the viewers of the page.  These previous image retrieval approaches have almost nothing to do with the semantics of the image itself.


The proposed retrieval approach is based on the orthogonal semantic space and semantic link space as shown in Figure 1.  The orthogonal semantic space organizes resources according to the orthogonal classification.  Each point in the space determines a set of resources belonging to the same category.   The determined resources correspond to the nodes of a semantic-linked network in the semantic link space.  Normal forms have been proposed to normalize the orthogonal semantic space [3].

Figure 1. Semantic-space-based retrieval approach.
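
The role of the orthogonal semantic space can be illustrated with a minimal sketch, assuming a toy index in which each point (one coordinate per axis) maps to a set of resources.  The class name, the axis names "category" and "region", and the image filenames are hypothetical and are not taken from the platform described in [2, 3].

```python
from collections import defaultdict


class OrthogonalSemanticSpace:
    """Toy index: each point (one coordinate per axis) holds a set of resources.

    Illustrative only; the axes and the API are assumptions, not the
    authors' implementation.
    """

    def __init__(self, axes):
        self.axes = tuple(axes)
        self._points = defaultdict(set)

    def _point(self, coords):
        # Build the coordinate tuple in the fixed axis order.
        return tuple(coords[axis] for axis in self.axes)

    def add(self, resource, **coords):
        self._points[self._point(coords)].add(resource)

    def locate(self, **coords):
        # Return a copy of the resources classified under this point.
        return set(self._points.get(self._point(coords), set()))


space = OrthogonalSemanticSpace(axes=("category", "region"))
space.add("great_wall.jpg", category="architecture", region="Beijing")
space.add("forbidden_city.jpg", category="architecture", region="Beijing")
space.add("west_lake.jpg", category="landscape", region="Hangzhou")

# Resources located at one point become candidate nodes of a
# semantic-linked network in the semantic link space.
nodes = space.locate(category="architecture", region="Beijing")
```

In this sketch a query first narrows the candidates to one point of the space; the semantic links among the returned resources would then be examined in the semantic link space.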

A semantic link space consists of a set of resources and the semantic relationships defined on the resource set.  The semantic relationships are represented as links associated with semantic factors.  The representation and understanding bases of the orthogonal semantic space, the semantic link space, and the feature space keep evolving with the Semantic Web.

An image reflects its semantics in four aspects: its features, its content, its semantic category (reflected by semantic coordinates in the semantic space), and its semantic relationships with other images.

A semantic link reflects such a semantic relationship between two images.  It can be represented as a typed pointer directed from one resource or fragment (the predecessor) to another (the successor).  Semantic-linked image networks can be formed by specifying the semantic relationships between relevant images [4].  Semantic links fall into two categories: meaning-semantic links and position-semantic links.  The meaning-semantic links reflect the meaning relationship between two resources, denoted as X-a->Y.  The position-semantic links can be abstracted as X is-a-of Y, where a, called the semantic factor, belongs to {up, down, left, right, north, south, east, west, in-front, behind}.  The pairs “up” and “down”, “left” and “right”, “south” and “north”, “east” and “west”, and “in-front” and “behind” are called mutual-symmetric semantic factors.
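
As a minimal sketch of how position-semantic links might support reasoning, the following Python fragment stores links as (predecessor, factor, successor) triples and derives further links.  Two rules are assumed here for illustration and are not stated in this paper: a link with factor a from X to Y implies the link with the mutual-symmetric factor from Y to X, and two chained links with the same factor compose transitively.  The function name close_links and the node names are hypothetical.

```python
# Mutual-symmetric semantic factors, listed in both directions.
SYMMETRIC = {
    "up": "down",
    "left": "right",
    "south": "north",
    "east": "west",
    "in-front": "behind",
}
SYMMETRIC.update({v: k for k, v in list(SYMMETRIC.items())})


def close_links(links):
    """Return the closure of position-semantic links under two ASSUMED rules.

    links: iterable of (predecessor, factor, successor) triples.
    Rule 1 (symmetry):     X is-a-of Y  =>  Y is-sym(a)-of X
    Rule 2 (transitivity): X is-a-of Y and Y is-a-of Z  =>  X is-a-of Z
    """
    closed = set(links)
    while True:
        derived = set()
        for x, a, y in closed:
            derived.add((y, SYMMETRIC[a], x))
        for x, a, y in closed:
            for y2, b, z in closed:
                if y == y2 and a == b and x != z:
                    derived.add((x, a, z))
        if derived <= closed:  # nothing new: fixed point reached
            return closed
        closed |= derived


links = {("A", "left", "B"), ("B", "left", "C")}
closed = close_links(links)
```

Under these assumed rules, a user browsing from image C could reach image A along the derived link C is-right-of A even though no such link was specified explicitly.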

There are two ways to generate semantic links: 1) manually, by humans with the help of assistant tools; and 2) automatically, by analyzing the semantic relationships between relevant descriptive terms with the help of a domain ontology and other information processing techniques such as co-occurrence analysis.

The proposed search algorithm is based on the minimization of semantic-linked image networks.


We have carried out an experiment to compare recall and precision when retrieving a given set of semantic-linked image networks under the same set of query conditions.  Each network contains thirty image nodes.  Figure 2 shows how precision and recall change with the number of semantic links.  Figure 3 shows how recall and precision change with the number of types of semantic links.  We can see that retrieval efficacy depends not only on the number of semantic links but also on the types of semantic links included in a semantic-linked network.  We are carrying out larger-scale experiments with random samples to verify this phenomenon.

Figure 2. Recall and precision change with the number of semantic links.

Figure 3. Recall and precision change with the number of types of the semantic links.  
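
For reference, the two measures plotted in Figures 2 and 3 follow the standard definitions: precision is the fraction of retrieved images that are relevant, and recall is the fraction of relevant images that are retrieved.  The sketch below computes both for a single query; the function name and the image identifiers are illustrative and do not reproduce the experimental data.

```python
def precision_recall(retrieved, relevant):
    """Standard precision and recall for one query.

    retrieved: images returned by the retrieval system
    relevant:  images judged relevant to the query
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four images returned, three relevant overall,
# two of the relevant ones among those returned.
p, r = precision_recall(
    retrieved=["img1", "img2", "img3", "img4"],
    relevant=["img1", "img2", "img5"],
)
```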

The underlying premise of the proposed approach is that image retrieval efficacy depends on the providers’ semantic descriptions of the provided images.  If no semantic links are established, the proposed approach reduces to the traditional text-based or content-based approaches.  The hyperlink-based approach also depends on pre-established hyperlinks.


This paper proposes a semantic-link-based image retrieval approach.  Multiple types of semantic links, such as meaning-semantic links and position-semantic links, are proposed to form a single semantic image of versatile semantic images [5].  Relevant reasoning rules are derived from the semantic links to guide the search and to support reasoning during search and browsing.  We have implemented a prototype of the proposed approach on the Information and Knowledge Grid platform, whose orthogonal semantic space further promotes the efficiency and intelligence of image retrieval [2, 3].

We are completing and evaluating a set of primitive semantic links, investigating the normal forms for organizing semantic link networks, realizing local semantic interconnectivity under the soft-device model [5], and carrying out domain applications.


The research work was supported by the National Science Foundation of China (NSFC).


  1. R. Lempel and A. Soffer. PicASHOW: Pictorial Authority Search by Hyperlinks on the Web. WWW10, May 1-5, Hong Kong, 2001.
  2. H. Zhuge. A Knowledge Grid Model and Platform for Global Knowledge Sharing. Expert Systems with Applications, vol.22, no.4, pp.313-320, 2002.
  3. H. Zhuge. VEGA-KG: A Way to the Knowledge Web. In Proc. of 11th International World Wide Web Conference (WWW2002), May, Honolulu, Hawaii, USA, 2002.
  4. H. Zhuge. Active Document Framework: Concept and Method. In Proceedings of the 5th Asia Pacific Web Conference (APWeb2003), April, Xian, China, Springer LNCS 2642, 2003.
  5. H. Zhuge. Clustering Soft-devices in Semantic Grid. IEEE Computing in Science and Engineering, vol.4, no.6, pp.60-62, 2002.