Title page for ETD etd-08022005-120858


Type of Document Master's Thesis
Author Pati, Amrita
Author's Email Address apati@vt.edu
URN etd-08022005-120858
Title Modeling and Analysis of Regulatory Elements in Arabidopsis thaliana from Annotated Genomes and Gene Expression Data
Degree Master of Science
Department Computer Science
Advisory Committee
Advisor Name Title
Heath, Lenwood S. Committee Chair
Grene, Ruth Committee Member
Murali, T. M. Committee Member
Keywords
  • Motifs
  • Regulatory elements
  • Gene Expression
  • Motif Combinations
  • Itemsets
  • Genes
Date of Defense 2005-07-13
Availability restricted
Abstract
Modeling cis-elements in the upstream regions of genes is a challenging computational problem. A set of regulatory motifs present in the promoters of a set of genes can be modeled by a biclique. Combinations of cis-elements play a vital role in ensuring that the correct co-action of transcription factors binding to the gene promoter results in appropriate gene expression in response to various stimuli. Geometrical and spatial constraints in transcription factor binding also impose restrictions on the order and separation of cis-elements. Not all regulatory elements that coexist are biologically significant. If the genes in which a set of regulatory elements co-occur are tightly correlated with respect to gene expression data over a set of treatments, the regulatory element combination can be considered biologically directed.
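One way to make "tightly correlated" concrete is the mean pairwise Pearson correlation of the expression profiles of a geneset across treatments. The sketch below is only an illustration under that assumption; the abstract does not state which correlation measure or threshold the thesis uses, and the function names and expression values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def mean_pairwise_correlation(expression, geneset):
    """Average Pearson correlation over all gene pairs in the geneset.

    `expression` maps a gene name to its values over a set of treatments.
    """
    pairs = list(combinations(sorted(geneset), 2))
    if not pairs:
        raise ValueError("need at least two genes to measure coherence")
    total = sum(pearson(expression[g], expression[h]) for g, h in pairs)
    return total / len(pairs)


# Hypothetical expression values over four treatments.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

coherent = mean_pairwise_correlation(expression, {"geneA", "geneB"})
mixed = mean_pairwise_correlation(expression, {"geneA", "geneB", "geneC"})
```

Under this measure, a geneset whose score is close to 1 would count as tightly co-expressed, while a geneset that includes an oppositely regulated gene scores much lower.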

The system developed in this work, XcisClique, consists of a comprehensive infrastructure for annotated genome and gene expression data for Arabidopsis thaliana. XcisClique models cis-regulatory elements as regular expressions and detects maximal bicliques of genes and motifs, called itemsets. An itemset consists of a set of genes (called a geneset) and a set of motifs (called a motifset) such that every motif in the motifset occurs in the promoter of every gene in the geneset. XcisClique differs from existing tools of the same kind in that it offers a common platform for the integration of sequence and gene expression data. Itemsets identified by XcisClique are not only evaluated for statistical over-representation in sequence data, but are also examined with respect to the expression patterns of the corresponding geneset. Thus, the results produced are biologically directed. XcisClique is also the only tool of its kind for Arabidopsis thaliana, and it can be used for other organisms given appropriate sequence, expression, and regulatory element data. A web interface to a subset of its functionality, the source code, and supplemental material are available online at http://bioinformatics.cs.vt.edu/xcisclique.
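The itemset definition above can be illustrated with a small sketch. It is not XcisClique's implementation: the motif names, patterns, and promoter sequences are made up, and the enumeration strategy (closing the per-gene motif sets under intersection) is simply one straightforward way to list maximal bicliques on toy input.

```python
import re

# Hypothetical motifs, written as regular expressions.
motifs = {
    "m1": "ACGTG[GT]C",
    "m2": "CACGTG",
    "m3": "[AG]CCGAC",
}

# Hypothetical promoter sequences.
promoters = {
    "g1": "TTCACGTGGCAA",
    "g2": "GGCACGTGTCACCGACTT",
    "g3": "AAGCCGACTTCACGTGAA",
}


def motifs_per_gene(promoters, motifs):
    """Map each gene to the set of motifs that match its promoter."""
    compiled = {name: re.compile(pat) for name, pat in motifs.items()}
    return {
        gene: frozenset(n for n, rx in compiled.items() if rx.search(seq))
        for gene, seq in promoters.items()
    }


def maximal_itemsets(gene_motifs):
    """Enumerate maximal (geneset, motifset) bicliques.

    Every maximal motifset is an intersection of the motif sets of some
    genes, so the per-gene sets are closed under pairwise intersection.
    The geneset is then every gene whose promoter has all those motifs.
    """
    closed = set()
    for motifset in gene_motifs.values():
        closed |= {motifset} | {motifset & other for other in closed}
    closed.discard(frozenset())  # ignore the empty motifset
    return {
        (frozenset(g for g, ms in gene_motifs.items() if motifset <= ms),
         motifset)
        for motifset in closed
    }


gene_motifs = motifs_per_gene(promoters, motifs)
itemsets = maximal_itemsets(gene_motifs)

for geneset, motifset in sorted(itemsets, key=lambda p: sorted(p[1])):
    print(sorted(geneset), sorted(motifset))
```

The intersection closure can grow exponentially in the worst case, so this approach is suitable only for illustration; a genome-scale tool would need a dedicated closed-itemset or biclique mining algorithm.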

Files
Approximate download times are given as Hours:Minutes:Seconds.

  Filename                 Size        28.8 Modem   56K Modem   ISDN (64 Kb)   ISDN (128 Kb)   Higher-speed Access
  [VT] Thesis.pdf          3.47 Mb     00:16:03     00:08:15    00:07:13       00:03:36        00:00:18
  [VT] XCISCLIQUE.tar.gz   164.31 Mb   12:40:42     06:31:13    05:42:19       02:51:09        00:14:36
[VT] indicates that a file or directory is accessible from the Virginia Tech campus network only.
