ISO/IEC JTC 1/SC 34/WG 3

Date: 1999-04-20

 

 

 

Title: Korean Contribution to the Information Association Working Group 3 - Hypertext & Multimedia (Comments on the Multimedia Retrieval System Architecture and Image Retrieval Technique)
        


 


Source: Ki-Joung Kang,  Multimedia Technical Lab, Korea Telecom


Project:        

        
Status: To be discussed at the second meeting of ISO/IEC JTC 1/SC 34, 19-23 April 1999, Granada, Spain

 

 

 


1. Abstract

Standardization of SGML, HTML, XML and related languages has recently been actively pursued by many bodies, including international standardization organizations.
Text-based document processing techniques, revised by the members of these organizations, are widely used as information query languages on the Internet. However, as the information age proceeds, most information is taking multimedia form, composed of many data types such as text, image, graphics, audio and video. Therefore an information retrieval engine for multimedia data is required.
MPEG-7 is expected to become an International Standard in November 2000.
An architecture for a Multimedia Information Retrieval (MIR) system is proposed in this contribution, as shown in Fig.1.
In addition to MIR, an Image Retrieval System (IIR), which is a key component of the MIR architecture, is proposed in Fig.2.

2. Multimedia Information Retrieval System

As the Internet and the World Wide Web have become popular and widely used, the number of hosts and users has increased rapidly. In addition, the amount of information has increased enormously.
A great deal of information is generated and changed continuously on the Internet.
There are many text-based IRSs (Information Retrieval Systems), such as Korea Telecom's InfoCop, Shimmani and Navor in Korea, and Yahoo, Altavista, Excite, Lycos and Goo (NTT's) in other countries. All of these are text-only IRSs based on a TIR (Text Information Retrieval) processing system.
Much research has been carried out on IRSs for large-scale multimedia information (image, voice, video, etc.), but detailed standardization procedures do not yet exist. This contribution proposes a standard architecture for a multimedia information retrieval system, depicted in Fig.1.


The proposed system model is a technique for retrieving and managing multimedia data on the Internet: multimedia data are analyzed, and their characteristic features are extracted and stored. The multimedia data are then retrieved and managed according to the extracted features.
Because TIR (Text Information Retrieval) corresponds to existing text-based IRSs, new development is not always necessary. IIR (Image Information Retrieval) is used for retrieving image data, and its architecture is depicted in Fig.2.
VoIR (Voice Information Retrieval) is for retrieving voice data and ViIR (Video Information Retrieval) is for video data.
The multimedia information database (MIDB) contains the many kinds of data analyzed by each IRS (Information Retrieval System); its architecture will be proposed later.

The service scenario is as follows.
Queries entered by users are passed to each IRS and processed as searches in the MIDB (multimedia information database). The search results are then delivered to the users.
A user's request may be entered through any IRS. For example, suppose that a user enters the keyword "Benhur" through TIR. The user can then receive various types of information, such as text, image, voice and video, found in the MIDB for the given keyword.
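The service scenario can be illustrated with a minimal Python sketch. It is not part of the proposed architecture: the class names Record and MIDB, the function query_by_keyword and the sample records are hypothetical, and the database is a toy in-memory list. It only shows how a single keyword entered through one IRS could return results of several media types.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One analyzed item stored in the database (hypothetical layout)."""
    media_type: str  # "text", "image", "voice" or "video"
    title: str
    keywords: set = field(default_factory=set)


class MIDB:
    """Toy in-memory stand-in for the multimedia information database."""

    def __init__(self):
        self.records = []

    def add(self, record):
        self.records.append(record)

    def search(self, keyword):
        # Case-insensitive exact match against the stored keywords.
        key = keyword.lower()
        return [r for r in self.records if key in r.keywords]


def query_by_keyword(midb, keyword):
    """Group the hits for one keyword by media type."""
    results = {}
    for record in midb.search(keyword):
        results.setdefault(record.media_type, []).append(record.title)
    return results


midb = MIDB()
midb.add(Record("text", "Plot summary", {"benhur", "film"}))
midb.add(Record("image", "Chariot race still", {"benhur"}))
midb.add(Record("video", "Trailer", {"benhur", "trailer"}))
midb.add(Record("voice", "Unrelated interview", {"radio"}))

# One keyword yields text, image and video results; the voice record
# does not match and is left out.
hits = query_by_keyword(midb, "Benhur")
print(hits)
```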

3. The architecture of the Image Retrieval System (IIR)

Most existing Information Retrieval Systems are text-based and search for exact or similar matches of a given keyword. As multimedia information is composed of text, voice, video and so on, a different type of IRS has recently become necessary for more efficient and effective search. In particular, an image-based IRS is required.
The first image-based IRSs were text-oriented: they retrieved images using text keywords attached to them. However, this technique has many limitations, because entering keywords takes too much time and they are difficult to add or change. To solve these problems, a new method, content-based IRS, has recently been developed, in which information can be searched efficiently in terms of specific image features such as color, texture, shape and pattern.
With this new retrieval technique, image data can be searched faster and more accurately by using their various features.
The application area of this technique is very wide, including digital libraries, advertising, movies, pictures, trademarks and video.
The basic architecture of the IIR system (Fig.2) is composed of the FEM (Feature Extraction Module), the SEM (Simultaneous Estimation Module), the RM (Retrieval Module), etc. The FEM automatically analyzes various features of the given image data, such as color, texture and shape, and represents them as a small amount of feature data.
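As an illustration of how a feature extraction step can reduce an image to a small amount of feature data, the following Python sketch computes a coarse color histogram. The function name color_histogram, the bin count and the sample pixels are assumptions made for this example; the contribution does not specify which color feature the FEM uses.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Return a normalized color histogram for a list of (r, g, b) pixels.

    Each channel value (0-255) is quantized into bins_per_channel levels,
    so the feature vector always has bins_per_channel ** 3 entries,
    regardless of the image size.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    n = bins_per_channel
    hist = [0.0] * (n ** 3)
    for r, g, b in pixels:
        # Map each channel value to a bin index in the range 0..n-1.
        ri, gi, bi = r * n // 256, g * n // 256, b * n // 256
        hist[(ri * n + gi) * n + bi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


# A tiny 4-pixel "image": three reddish pixels and one bluish pixel.
image = [(250, 10, 10), (240, 20, 5), (200, 30, 30), (10, 10, 220)]
feature = color_histogram(image)
print(feature)
```

With two bins per channel the whole image is described by only 8 numbers, which is the kind of compact feature data that can be stored and compared later.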

The SEM (Simultaneous Estimation Module) calculates the similarity of image data using the extracted feature values. Depending on how the features are defined, dissimilarity may be calculated instead of similarity.
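The difference between a similarity and a dissimilarity measure can be sketched in Python as follows. Histogram intersection and L1 distance are chosen here only as common examples for normalized histograms; the contribution does not name the measures used by the SEM, and the function names and sample feature vectors are hypothetical.

```python
def histogram_intersection(a, b):
    """Similarity in [0, 1] for two normalized histograms (1 = identical)."""
    return sum(min(x, y) for x, y in zip(a, b))


def l1_distance(a, b):
    """Dissimilarity in [0, 2] for two normalized histograms (0 = identical)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(query, database):
    """Return (name, score) pairs, most similar first."""
    scored = [(name, histogram_intersection(query, feat))
              for name, feat in database.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical 4-bin feature vectors already extracted from stored images.
database = {
    "sunset": [0.75, 0.25, 0.0, 0.0],
    "forest": [0.0, 0.25, 0.75, 0.0],
    "ocean": [0.0, 0.0, 0.25, 0.75],
}
query = [0.5, 0.5, 0.0, 0.0]

ranking = rank_by_similarity(query, database)
print(ranking)
print(l1_distance(query, database["sunset"]))
```

Sorting by descending similarity or by ascending dissimilarity gives the same order for these two measures, since for normalized histograms the L1 distance equals 2 x (1 - intersection).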
The RM (Retrieval Module) processes user queries through interaction between the users and the system.
There are many different query types, such as query by example, query by user's sketch and query by color combination (proportion). Search results are displayed in order of the similarity calculated by the SEM.
 
a) Query by example : The system searches for and displays results that are similar to an image the user selects from the given examples.

b) Query by user's sketch : This method solves a problem of the "query by example" method. In general, it is unlikely that randomly selected samples include exactly the image the user requires.
The more images there are, the more serious this problem becomes. It can be solved if the user directly sketches the sample image.

c) Query by color combination : The system displays similar images when the user designates the names and proportions of colors instead of drawing a sketch.

d) Repetitive query : It is difficult to obtain the expected result in a single try. Many text-based Internet search engines have the same problem, because computers cannot perfectly understand a user's query. In this case the repetitive query method can be effective, because it uses the user's feedback. This method uses a technique that extracts features from the image data and combines them dynamically. If it is supported by a user management function that stores the tastes and characteristics of each user, the feedback procedure can be eliminated. In addition, weight values can be used to improve the accuracy of the search.

e) Query by keywords : This method uses keywords that describe features of the image and are stored in a field of the image database.
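The repetitive query described in d) can be sketched in Python as one round of feedback that re-weights the features. This is only an illustration under stated assumptions: the update rule (weight inversely proportional to the mean distance of the images marked relevant), the function names and the per-feature distances are hypothetical and are not taken from the contribution.

```python
def weighted_distance(distances, weights):
    """Combine per-feature distances (color, texture, shape) into one score."""
    return sum(weights[name] * d for name, d in distances.items())


def update_weights(relevant, eps=1e-6):
    """Re-weight features from the images the user marked as relevant.

    A feature on which the relevant images lie close to the query gets a
    larger weight. The returned weights are normalized to sum to 1.
    """
    raw = {}
    for name in relevant[0]:
        mean_d = sum(img[name] for img in relevant) / len(relevant)
        raw[name] = 1.0 / (mean_d + eps)
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


# Hypothetical per-feature distances between the query and each candidate.
candidates = {
    "img_a": {"color": 0.1, "texture": 0.8, "shape": 0.7},
    "img_b": {"color": 0.6, "texture": 0.2, "shape": 0.3},
    "img_c": {"color": 0.2, "texture": 0.9, "shape": 0.6},
}

# First try: all features weighted equally.
weights = {"color": 1 / 3, "texture": 1 / 3, "shape": 1 / 3}
first = sorted(candidates, key=lambda n: weighted_distance(candidates[n], weights))

# The user marks img_a and img_c as relevant; both are close in color,
# so the color weight grows and the search is repeated.
weights = update_weights([candidates["img_a"], candidates["img_c"]])
second = sorted(candidates, key=lambda n: weighted_distance(candidates[n], weights))

print(first, second)
```

In this toy data the first try ranks img_b on top, while the second try, after feedback, moves the two images the user marked as relevant ahead of it. Storing the learned weights per user would correspond to the user management function mentioned in d).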

4. Conclusion

The government of Korea plans to assign one IP address to each public servant within 1999. This example implies that demand for the Internet is increasing enormously and rapidly.
Information retrieval technology for multimedia data, including text, image, graphics, video and audio, will be needed for a wide range of Internet applications and businesses. The multimedia information retrieval technique proposed in this contribution combines image processing and representation techniques with text-based document processing techniques such as SGML, HTML and XML.
Multimedia information retrieval will also be used by domestic and foreign broadcasting stations and related businesses such as web video, motion pictures and images, and its economic effect may be enormous.