Title page for ETD etd-02192010-184412


Type of Document Dissertation
Author Korah, John
URN etd-02192010-184412
Title Issues of Real Time Information Retrieval in Large, Dynamic and Heterogeneous Search Spaces
Degree PhD
Department Computer Science
Advisory Committee
Advisor Name Title
Santos, Eunice E. Committee Chair
Arthur, James D. Committee Member
Borggaard, Jeffrey T. Committee Member
Ribbens, Calvin J. Committee Member
Santos, Eugene Jr. Committee Member
Keywords
  • Real time Information Retrieval
  • Large and Dynamic Search
  • Anytime Image Retrieval
  • Multiagent Resource Allocation Framework
  • Performance Analysis
Date of Defense 2009-11-03
Availability unrestricted
Abstract
The increasing size of databases found on the internet and the prevalence of real-time information in them have become important characteristics of these databases. As the information changes, the relevancy ranking of the search results also changes. Current methods in information retrieval, which are based on offline indexing, are not efficient in such dynamic search spaces and cannot quickly provide the most current results. Given the explosive growth of the internet, stove-piped approaches that deal with dynamism by simply employing large computational resources are ultimately not scalable. A new processing methodology that incorporates intelligent resource allocation strategies is required. Modeling the dynamism in the search space in real time is also essential for effective resource allocation.

To support multi-grained dynamic resource allocation, we propose a partial processing approach that uses anytime algorithms to process the documents in multiple steps. Each successive step produces a more accurate approximation of the final similarity values of the documents. Resource allocation algorithms use these partial results to select documents for processing and to decide on the number of processing steps and the computation time allocated to each step. We validate the processing paradigm by demonstrating its viability with image documents. We design an anytime image algorithm that uses a combination of wavelet transforms and machine learning techniques to map low-level visual features to higher-level concepts. Experimental validation is done by implementing the image algorithm within an established multiagent information retrieval framework called I-FGM. We also formulate a multiagent resource allocation framework for the design and performance analysis of resource allocation with partial processing. A key aspect of the framework is modeling changes in the search space as external and internal dynamism using a grid-based search space model. The search space model divides the documents, or candidates, into groups based on their partial value and portion processed. Hence, changes in the search space can be effectively represented in the search space model as the flow of agents and candidates between the grids. Using comparative experimental studies and detailed statistical analysis, we validate the search space model and demonstrate the effectiveness of the resource allocation framework.
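
The partial processing idea and the grid-based grouping described above can be illustrated with a minimal Python sketch. It is not the dissertation's algorithm or the I-FGM implementation: the `Candidate` class, the precomputed `step_scores`, the greedy `allocate` policy, and the four-bin `grid_cell` mapping are all hypothetical names and simplifications chosen for illustration, and similarity values are assumed to lie in [0, 1].

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A document processed in several anytime steps (hypothetical model)."""
    doc_id: str
    step_scores: list      # assumed similarity estimate available after each step
    steps_done: int = 0

    @property
    def partial_value(self):
        # Best available approximation of the final similarity so far.
        return self.step_scores[self.steps_done - 1] if self.steps_done else 0.0

    @property
    def portion_processed(self):
        return self.steps_done / len(self.step_scores)

    def finished(self):
        return self.steps_done == len(self.step_scores)


def grid_cell(c, bins=4):
    """Map a candidate to a (partial-value bin, portion-processed bin) cell."""
    v = min(int(c.partial_value * bins), bins - 1)
    p = min(int(c.portion_processed * bins), bins - 1)
    return (v, p)


def allocate(candidates, budget):
    """Greedy policy (an assumption): spend each processing step on the
    unfinished candidate with the highest current partial value."""
    for _ in range(budget):
        pending = [c for c in candidates if not c.finished()]
        if not pending:
            break
        best = max(pending, key=lambda c: c.partial_value)
        best.steps_done += 1
    return candidates


cands = [Candidate("a", [0.2, 0.5, 0.9]), Candidate("b", [0.1, 0.1, 0.1])]
allocate(cands, budget=4)
for c in cands:
    print(c.doc_id, c.steps_done, grid_cell(c))
```

In this toy run the promising candidate is processed to completion before the weak one receives a step, and each candidate moves between grid cells as its partial value and portion processed change. A real allocator would also have to account for documents arriving or changing while processing is under way, which this sketch leaves out.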

Files
  Filename             Size      Approximate Download Time (Hours:Minutes:Seconds)
                                 28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  KORAH_J_D_2009.pdf   2.05 Mb   00:09:30     00:04:53    00:04:16       00:02:08        00:00:10
