Title page for ETD etd-08282002-151909


Type of Document Master's Thesis
Author Pande, Ashwini K
Author's Email Address aspande@vt.edu
URN etd-08282002-151909
Title Table Understanding for Information Retrieval
Degree Master of Science
Department Computer Science
Advisory Committee
  Advisor Name            Title
  Ehrich, Roger W.        Committee Chair
  Fox, Edward Alan        Committee Member
  North, Christopher L.   Committee Member
Keywords
  • Information retrieval
  • Statistical crosscorrelation
  • Odessa digital library
  • detection heuristics
  • Table detection
Date of Defense 2002-08-19
Availability unrestricted
Abstract
This thesis proposes a novel approach for finding tables in text files containing a mixture of unstructured and structured text. Tables may be arbitrarily complex because the data in the tables may themselves be tables and because the grouping of data elements displayed in a table may be very complex. Although investigators have proposed competence models to explain the structure of tables, there are no computationally feasible performance models for detecting and parsing general structures in real data. We focus on investigating a new statistical procedure for detecting basic tables in plain text documents. The main task is to define and test this theory in the context of the Odessa Digital Library.
Files
  Filename                  Size        Approximate Download Time (Hours:Minutes:Seconds)
                                        28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  AshwiniPandeTableIR.pdf   890.86 Kb   00:04:07     00:02:07    00:01:51       00:00:55        00:00:04
