Title page for ETD etd-01052009-103100


Type of Document Master's Thesis
Author Tilley, Jason W
URN etd-01052009-103100
Title A Comparison of Statistical Filtering Methods for Automatic Term Extraction for Domain Analysis
Degree Master of Engineering
Department Computer Science
Advisory Committee
  Advisor Name            Title
  Frakes, William B.      Committee Chair
  Belli, Gabriella M.     Committee Member
  Kulczycki, Gregory W.   Committee Member
Keywords
  • domain analysis
  • term extraction
Date of Defense 2008-12-22
Availability unrestricted
Abstract
Fourteen word frequency metrics were tested to evaluate their effectiveness in identifying the vocabulary of a domain. Fifteen domain engineering projects were examined to measure how closely the vocabularies selected by these metrics matched the vocabularies produced by domain engineers. Six filtering mechanisms were also evaluated to measure their impact on selecting proper vocabulary terms. The results of the experiment show that stemming and stop word removal improve overlap scores and that term frequency is a valuable contributor to overlap. Variations on term frequency do not always improve overlap significantly.
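
The kind of pipeline the abstract describes can be illustrated with a minimal sketch. It is not the thesis's implementation: the stop list, the crude suffix stripper (a stand-in for a real stemmer such as Porter's), and the overlap definition (the fraction of the engineer-produced vocabulary that appears among the top-k terms by frequency) are all assumptions made here for illustration, and every function name is hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop list; a real study would use a larger one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "by"}


def naive_stem(word):
    """Crude suffix stripping; an assumed stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def top_terms(text, k, use_stop_words=True, use_stemming=True):
    """Return the k most frequent terms after the optional filters."""
    words = re.findall(r"[a-z]+", text.lower())
    if use_stop_words:
        words = [w for w in words if w not in STOP_WORDS]
    if use_stemming:
        words = [naive_stem(w) for w in words]
    counts = Counter(words)
    return [term for term, _ in counts.most_common(k)]


def overlap(selected, reference):
    """Assumed overlap score: share of the reference vocabulary that was selected."""
    reference = set(reference)
    if not reference:
        return 0.0
    return len(set(selected) & reference) / len(reference)


if __name__ == "__main__":
    text = "The parser parses tokens and the lexer emits tokens for the parser"
    engineer_vocabulary = {"parser", "token", "lexer"}  # made-up reference list
    filtered = top_terms(text, k=2)
    unfiltered = top_terms(text, k=2, use_stop_words=False, use_stemming=False)
    print(filtered, overlap(filtered, engineer_vocabulary))
    print(unfiltered, overlap(unfiltered, engineer_vocabulary))
```

On this toy input, the unfiltered ranking spends one of its two slots on a stop word, so it recovers less of the made-up reference vocabulary than the filtered ranking does. The example only shows the mechanics; it says nothing about the size of the effect reported in the thesis.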
Files
  Filename                  Size
  JasonThesis_5_10_09.pdf   1.25 Mb

  Approximate Download Time (Hours:Minutes:Seconds)
  28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  00:05:45     00:02:57    00:02:35       00:01:17        00:00:06
