Title page for ETD etd-05092005-105158


Type of Document Dissertation
Author Slotta, Douglas J.
URN etd-05092005-105158
Title Evaluating Biological Data Using Rank Correlation Methods
Degree PhD
Department Computer Science
Advisory Committee
Advisor Name Title
Heath, Lenwood S. Committee Chair
Helm, Richard Frederick Committee Member
Murali, T. M. Committee Member
Potts, Malcolm Committee Member
Ramakrishnan, Naren Committee Member
Vergara, John Paul Committee Member
Keywords
  • MPSS
  • bioinformatics
  • microarrays
  • spoiler count
  • rank-order
  • feature selection
Date of Defense 2005-05-05
Availability unrestricted
Abstract
Analyses based on rank correlation methods, such as Spearman's Rho and Kendall's Tau, can provide quick insights into large biological data sets. Comparing expression levels between different technologies and models is problematic because they use different units of measure. Rank correlation provides an effective means of comparison in this setting, and it is used here to compare Massively Parallel Signature Sequencing (MPSS) transcript abundance levels with microarray signal intensities for Arabidopsis thaliana. Rank correlations can be applied to subsets of the data as well as to the entire set. Results of subset comparisons can be used to improve the capabilities of predictive models, such as Predicted Highly Expressed (PHX); this is done for Escherichia coli. Methods are given to combine predictive models based on feedback from experimental data. The problem of feature selection in supervised learning is also considered for the case where all features are drawn from a common domain and are best interpreted via ordinal comparisons with other features, rather than as numerical values. This is done for synthetic data as well as for microarray experiments examining the life cycle of Drosophila melanogaster and human leukemia cells. Two novel methods are presented, one based on Rho and one on Tau, and their efficacy is tested with synthetic and real-world data. The method based on Spearman's Rho is shown to be more effective.
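As an illustration of why rank correlation sidesteps the units problem described in the abstract, the following is a minimal sketch of the two textbook statistics it names. It is not the dissertation's implementation. The function names and the sample values (`mpss_counts`, `array_signal`) are hypothetical and chosen only for illustration. Kendall's Tau is computed in its simple tau-a form, which makes no correction for ties.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's Rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall's Tau (tau-a): (concordant - discordant) / total pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical measurements for five genes, in unrelated units.
mpss_counts = [120, 15, 300, 42, 7]        # e.g. transcript abundance
array_signal = [2.1, 0.9, 3.4, 0.4, 0.2]   # e.g. signal intensity

rho = spearman_rho(mpss_counts, array_signal)
tau = kendall_tau_a(mpss_counts, array_signal)
```

Because both statistics depend only on the ordering of the values, rescaling either measurement by any strictly increasing transformation leaves the result unchanged.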
Files
  Filename   Size      Approximate Download Time (Hours:Minutes:Seconds)
                       28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  dis.pdf    1.54 Mb   00:07:08     00:03:40    00:03:13       00:01:36        00:00:08
