Scholarly Communications Project


Document Type: Master's Thesis
Name: Ritesh D. Sojitra
Email address: sojitra@vt.edu
URN: 1998/01050
Title: PHRASAL DOCUMENT ANALYSIS FOR MODELING
Degree: Master of Science
Department: Electrical and Computer Engineering
Committee Chair: Dr. Walling R. Cyre
Chair's email: cyre@vt.edu
Committee Members: Dr. James R. Armstrong, Dr. F. Gail Gray
Keywords: Chunk, Information Extraction, Modeling, ModelMaker, Noun Phrase, Parser
Date of defense: September 11, 1998
Availability: Release the entire work immediately worldwide.

Abstract:

Specifications of digital hardware systems are typically written in a natural language. The objective of this research is to extract information automatically from such specifications to aid model generation for system-level design automation. This is done by automatically extracting the noun phrases and the verbs from the natural language specification statements. First, the natural language sentences are parsed using a chart parser. Then, a noun phrase and verb extractor scans the resulting charts to obtain the noun phrases with their frequencies of occurrence. The noun phrases are then classified by semantic type. The verbs are also automatically assigned their respective roots and classified. Next, each sentence is summarized as a sequence of "chunks": noun phrases, verbs and prepositions. Vectors are generated from these chunks and imported into MS Excel for plotting occurrence graphs of noun phrases and verbs with respect to the sentences in which they occur. Finally, inter-term dependencies between noun phrases, and between noun phrases and verbs, are studied. The frequencies of occurrence, the classification of chunks, the occurrence graphs and the inter-term dependencies together give useful information about the subject, the hardware components and the behavior of a system described by a natural language specification document.
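
The chunking and occurrence-vector steps described in the abstract can be illustrated with a small sketch. This is not the thesis implementation: it replaces the chart parser with a hand-written toy lexicon, and the names LEXICON, VERB_ROOTS, chunk_sentence and occurrence_vectors, as well as the two sample sentences, are assumptions made up for illustration. It only shows the general idea of reducing sentences to noun phrase, verb and preposition chunks, counting noun phrase frequencies, and building one 0/1 occurrence vector per noun phrase across sentences.

```python
from collections import Counter

# Hypothetical toy lexicon (word -> coarse tag); stands in for a real parser.
LEXICON = {
    "the": "DET", "a": "DET",
    "processor": "N", "memory": "N", "address": "N", "data": "N", "bus": "N",
    "reads": "V", "drives": "V",
    "from": "P", "onto": "P",
}

# Hypothetical verb-root table; a real system would use morphological analysis.
VERB_ROOTS = {"reads": "read", "drives": "drive"}


def chunk_sentence(sentence):
    """Summarize a sentence as a list of (kind, text) chunks: NP, V or P."""
    chunks = []
    np_words = []  # consecutive nouns collected into one noun phrase
    for word in sentence.lower().rstrip(".").split():
        tag = LEXICON.get(word)
        if tag == "N":
            np_words.append(word)
            continue
        # Any non-noun word closes the noun phrase being collected.
        if np_words:
            chunks.append(("NP", " ".join(np_words)))
            np_words = []
        if tag == "V":
            chunks.append(("V", VERB_ROOTS.get(word, word)))
        elif tag == "P":
            chunks.append(("P", word))
        # Determiners and unknown words are dropped in this sketch.
    if np_words:
        chunks.append(("NP", " ".join(np_words)))
    return chunks


def occurrence_vectors(sentences):
    """Return per-sentence chunks, NP frequencies and NP occurrence vectors."""
    all_chunks = [chunk_sentence(s) for s in sentences]
    freq = Counter(
        text for chunks in all_chunks for kind, text in chunks if kind == "NP"
    )
    # One 0/1 entry per sentence: does the noun phrase occur in it?
    vectors = {
        term: [1 if ("NP", term) in chunks else 0 for chunks in all_chunks]
        for term in freq
    }
    return all_chunks, freq, vectors


SENTENCES = [
    "The processor reads data from the memory.",
    "The processor drives the address onto the data bus.",
]
ALL_CHUNKS, NP_FREQ, NP_VECTORS = occurrence_vectors(SENTENCES)
```

Vectors of this kind could be written out as rows of a spreadsheet to plot where each term occurs in the document, which is the role MS Excel plays in the abstract.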

List of Attached Files

Thesis.pdf


The author grants to Virginia Tech or its agents the right to archive and display their thesis or dissertation in whole or in part in the University Libraries in all forms of media, now or hereafter known. The author retains all proprietary rights, such as patent rights. The author also retains the right to use in future works (such as articles or books) all or part of this thesis or dissertation.