

Type of Document: Dissertation
Author: Feng, Yuanjian
URN: etd-06072010-034446
Title: Detection and Characterization of Multilevel Genomic Patterns
Degree: PhD
Department: Electrical and Computer Engineering
Advisory Committee:
- Wang, Yue J. (Committee Chair)
- DaSilva, Luiz A. (Committee Member)
- Lu, Chang-Tien (Committee Member)
- Wyatt, Christopher L. (Committee Member)
- Xuan, Jianhua (Committee Member)
Keywords:
- Gene Expressions
- DNA Copy Number Changes
- Stability Analysis
- Regression Analysis
- Tree of Phenotypes
Date of Defense: 2010-05-26
Availability: restricted
Abstract:
DNA microarrays have become a powerful tool in genetics, molecular biology, and biomedical research. They can be used to measure the genotypes, structural changes, and gene expression of human genomes. Detecting and characterizing patterns in multilevel, high-throughput microarray genomic data poses new challenges to statistical pattern recognition and machine learning research. In this dissertation, we propose novel computational methods for analyzing DNA copy number changes and learning trees of phenotypes from DNA microarray data.
DNA copy number change is an important form of structural variation in human genomes. The copy number signals measured by high-density DNA microarrays usually have low signal-to-noise ratios and complex patterns due to the inhomogeneous composition of tissue samples. We propose a robust detection method for extracting copy number changes in a single signal profile and consensus copy number changes in the signal profiles of a population. We adapt a solution-path algorithm to efficiently solve the optimization problems associated with the proposed method. We tested the proposed method on both simulated and real CGH and SNP microarray datasets and observed improved performance compared with several widely adopted copy number change detection methods. We also propose a chromosome instability measure that summarizes the extracted copy number changes to assess the chromosomal instability of tumor genomes. The proposed measure shows distinct patterns across different subtypes of ovarian serous carcinomas and normal samples.
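As a rough illustration of what detecting copy number changes in a single noisy profile involves, the sketch below smooths a log-ratio profile with a running median and labels gains and losses by thresholding. It is a deliberately simple stand-in, not the robust detection method or the solution-path algorithm proposed in the dissertation. The function names, window size, threshold, and example values are all assumptions made for illustration.

```python
from statistics import median


def running_median(signal, half_window):
    """Smooth with a sliding median; windows are truncated at both ends."""
    n = len(signal)
    return [
        median(signal[max(0, i - half_window): min(n, i + half_window + 1)])
        for i in range(n)
    ]


def call_segments(signal, half_window=2, threshold=0.3):
    """Return (start, end, label) runs, end exclusive.

    label is +1 for gain, -1 for loss, 0 for neutral. The threshold and
    window size are illustrative assumptions, not values from the source.
    """
    smoothed = running_median(signal, half_window)
    labels = [1 if v > threshold else -1 if v < -threshold else 0
              for v in smoothed]
    segments = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the profile or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments


# Hypothetical log-ratio profile: neutral, a gained region, neutral again,
# with one isolated outlier probe (-2.0) that the median filter suppresses.
profile = [0.0, 0.1, -0.1, 0.05, 0.0,
           0.9, 1.1, 1.0, 0.95, 1.05,
           0.0, -0.05, -2.0, 0.0, 0.02]
segments = call_segments(profile)
print(segments)
```

The median filter is used here only because it tolerates isolated outlier probes; it does not handle mixed tissue composition or population-level consensus changes, which are the problems the abstract describes.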
Despite active research on complex human diseases using genomic data, little progress has been made in discovering the relational structure embedded in molecular data. We propose two stability-analysis-based methods to learn stable and highly resolved trees of phenotypes from microarray gene expression data of heterogeneous diseases. In the first method, we use a hierarchical, divisive visualization approach to explore the tree of phenotypes and leave-one-out cross-validation to select stable tree structures. In the second method, we propose a node bandwidth constraint to construct stable trees that balance the descriptive power and reproducibility of tree structures. Using a top-down merging procedure, we modify the binary tree structures learned by hierarchical group clustering methods to achieve a given node bandwidth. We use a bootstrap-based stability analysis to select stable tree structures under different node bandwidth constraints. Experimental results on two microarray gene expression datasets of human diseases show that the proposed methods can discover stable, biologically plausible trees of phenotypes that reveal the relationships between multiple diseases.
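The idea of bootstrap-based stability analysis for a tree of phenotypes can be sketched roughly as follows. The abstract does not specify the clustering or scoring used, so this example substitutes a simple agglomerative tree built on phenotype centroids and scores stability as the average fraction of reference clades that reappear in trees built from resampled data. All names, the linkage choice, and the toy data are assumptions, and the node bandwidth constraint is not modeled.

```python
import random
from itertools import combinations


def centroid(samples):
    """Mean expression vector of one phenotype's samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tree_clades(data):
    """Build an agglomerative tree over phenotypes (average linkage on
    centroids) and return its non-root clades as frozensets of names."""
    cents = {name: centroid(s) for name, s in data.items()}
    groups = [frozenset([name]) for name in sorted(data)]
    clades = set()

    def linkage(g, h):
        total = sum(sq_dist(cents[a], cents[b]) for a in g for b in h)
        return total / (len(g) * len(h))

    # Stop at two groups: the final merge would only produce the root.
    while len(groups) > 2:
        g, h = min(combinations(groups, 2), key=lambda pair: linkage(*pair))
        merged = g | h
        groups = [x for x in groups if x not in (g, h)] + [merged]
        clades.add(merged)
    return clades


def bootstrap_stability(data, n_boot=200, seed=0):
    """Average fraction of reference clades recovered after resampling
    each phenotype's samples with replacement."""
    rng = random.Random(seed)
    reference = tree_clades(data)
    if not reference:
        raise ValueError("need at least three phenotypes")
    total = 0.0
    for _ in range(n_boot):
        resampled = {name: rng.choices(s, k=len(s)) for name, s in data.items()}
        total += len(reference & tree_clades(resampled)) / len(reference)
    return total / n_boot


# Hypothetical toy data: four phenotypes forming two well-separated pairs.
data = {
    "A": [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1]],
    "B": [[0.5, 0.0], [0.6, 0.1], [0.5, 0.1]],
    "C": [[10.0, 10.0], [10.1, 10.0], [10.0, 10.1]],
    "D": [[10.5, 10.0], [10.6, 10.1], [10.5, 10.1]],
}
print(bootstrap_stability(data))
```

A score near 1 means the same groupings of phenotypes keep reappearing under resampling; comparing such scores across candidate tree structures is one plausible way to prefer the more reproducible ones.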
Files
Filename: Feng_Y_D_2010.pdf
Size: 3.54 Mb
Approximate Download Time (Hours:Minutes:Seconds):
- 28.8 Modem: 00:16:23
- 56K Modem: 00:08:25
- ISDN (64 Kb): 00:07:22
- ISDN (128 Kb): 00:03:41
- Higher-speed Access: 00:00:18
Marked files or directories are accessible from the Virginia Tech campus network only.