 

Semantic Scene Modeling and Retrieval
Julia Vogel
PhD Thesis, October 2004. [pdf]
Hartung-Gorre Verlag, Konstanz.
ISBN: 3-89649-967-2

Abstract: Semantics-based image retrieval has gained increasing interest in recent years. As an area in linguistics, semantics deals with the sense and the meaning of language. In the context of content-based image retrieval, the research goal is to access the meaning of images by naming or describing the most important image regions and their relationships.

The topic of this dissertation is the semantic description, understanding, and modeling of natural scenes. The primary objective is to develop a computational image representation that reduces the semantic gap between the image understanding of humans and that of the computer. For humans, the most intuitive means of communication about images is image description. Image semantics and image description are thus closely interconnected.

We propose a semantic modeling of natural scenes that is based on the classification of local semantic concepts. Image regions are extracted on a regular 10x10 grid. The resulting patches are classified into nine concept classes that subsume the main semantic content of the database images. Images are represented through the frequency of occurrence of the semantic concepts. This semantic modeling constitutes a compact, semantic image representation that makes it possible to describe or search for specific image content, or, on a higher level, to model the semantic content of natural scene categories.
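The representation described above can be illustrated with a minimal sketch. It assumes each of the 10x10 patches has already been assigned one of the nine concept classes by some classifier. The integer labels 0 to 8, the function name `concept_occurrence_vector`, and the example grid are hypothetical and chosen for illustration only; the abstract does not name the concepts or specify the implementation.

```python
from collections import Counter

NUM_CONCEPTS = 9   # nine concept classes, as stated in the abstract
GRID_SIZE = 10     # regular 10x10 grid, as stated in the abstract


def concept_occurrence_vector(patch_labels, num_concepts=NUM_CONCEPTS):
    """Return the relative frequency of each concept over all patches.

    patch_labels is a 2D list of integer concept labels in
    range(num_concepts), one label per grid patch (hypothetical encoding).
    """
    flat = [label for row in patch_labels for label in row]
    if not flat:
        raise ValueError("patch_labels must contain at least one patch")
    if any(label < 0 or label >= num_concepts for label in flat):
        raise ValueError("patch label outside the concept vocabulary")
    counts = Counter(flat)
    return [counts.get(c, 0) / len(flat) for c in range(num_concepts)]


# Hypothetical label grid: 4 rows of concept 0, 3 rows of concept 1,
# and 3 rows of concept 2 (for instance three horizontal image bands).
example_grid = (
    [[0] * GRID_SIZE for _ in range(4)]
    + [[1] * GRID_SIZE for _ in range(3)]
    + [[2] * GRID_SIZE for _ in range(3)]
)
cov = concept_occurrence_vector(example_grid)
```

Each image is thereby reduced to a nine-element vector of frequencies that sums to one. Such a vector could serve as input to a scene classifier or to a query such as "images with at least 30% of concept 1".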

The semantic modeling has been studied intensively for categorization and retrieval of natural scenes. Depending on the classification method and on the quality of the concept detectors, good to very good categorization and retrieval performance has been obtained. In particular, it is shown that the semantic modeling leads to considerably better categorization and retrieval performance than directly employing low-level features. Nevertheless, the analysis of the mis-categorized scenes reveals that the regular semantic ambiguity of the database images calls for a typicality ranking of images rather than for hard-decision categorization.

This hypothesis is supported by two psychophysical experiments. Humans are able to categorize images consistently, but the employed database consists largely of images that can be assigned to several scene categories. However, the human participants were very consistent in ranking the database images according to their semantic typicality.

It is shown visually and quantitatively that the proposed semantic modeling is also well suited for semantic ranking of images. In particular, the typicality transition between two scene categories can be modeled. In addition, we propose a perceptually plausible distance measure that represents the most discriminant semantic concepts of each scene category. The typicality ranking obtained with this distance measure correlates highly with the human rankings.

Finally, this thesis discusses the problem of performance evaluation in content-based image retrieval systems. When searching for specific local semantic content, the retrieval results can be modeled statistically. We develop closed-form expressions for the prediction of precision and recall in our vocabulary-supported retrieval system. In addition, these expressions allow precision and recall to be optimized by up to 60%.
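For readers unfamiliar with the two evaluation measures, the following sketch computes the standard definitions of precision and recall from a set of retrieved images and a set of relevant images. It does not reproduce the closed-form prediction expressions developed in the thesis, which the abstract does not give. The function name `precision_recall` and the image identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Standard precision and recall for one retrieval result.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    An empty denominator yields 0.0 by convention in this sketch.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical image identifiers: 4 images retrieved, 3 relevant, 2 in common.
precision, recall = precision_recall([1, 2, 3, 4], [2, 4, 6])
```

In this example, two of the four retrieved images are relevant (precision 0.5), and two of the three relevant images were found (recall 2/3).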
 


BibTeX Record

@book{JuliaVogel_SemanticSceneModelingandRetrieval,
   author    = {Julia Vogel},
   title     = {Semantic Scene Modeling and Retrieval},
   publisher = {Hartung-Gorre Verlag Konstanz},
   year      = {2004},
   series    = {Selected Readings in Vision and Graphics},
   number    = {33},
   editor    = {Luc Van Gool and Gabor Szekely and Markus Gross and Bernt Schiele},
}


 
Last Update: Apr 27, 2005 by Julia Vogel