Some Ideas for Modelling Image-Text Combinations

Publication type: Report
Year of publication: 2005
Authors: Salway, A., & Martinec, R.
Institution: School of Electronics and Physical Sciences, Department of Computing
Keywords: text/image relation

The combination of different media types is a defining characteristic of multimedia, yet much research has concentrated on understanding the semantics of media types individually. Recently, systems have been developed to process correlated image and text data for tasks including multimedia retrieval, fusion, summarization, adaptation and generation. We argue that the further development and more general application of such systems require a better computational understanding of image-text combinations. In particular, we need to know more about the correspondence between the semantic content of images and the semantic content of texts when they are used together. This paper outlines a new area of multimedia research focused on modeling image-text combinations. Our aim is to develop a general theory to be applied in the development of multimedia systems that process correlated image and text data. Here, we propose a theoretical framework to describe how visual and textual information combine in terms of semantic relations between images and texts. Our classification of image-text relations is grounded in aesthetic and semiotic theory and has been developed with a view to automatic classification.

