Computer Science Seminar - University of Houston

Faculty Candidate 2014

Stochastic Models and Algorithms for Large-scale Comparative Genomics under Complex Evolutionary Scenarios

When: Friday, March 21, 2014
Where: PGH 232
Time: 11:00 AM

Speaker: Dr. Kevin J. Liu, Rice University

Host: Dr. Zhigang Deng

The growth of genome-scale sequence data in public databases is outpacing Moore's law. Fueled by the explosion of genomic data, comparative genomics seeks to annotate and understand the genomes of different organisms, leading to new biological and biomedical discoveries. Current comparative genomic studies typically utilize a computational pipeline consisting of several stages, including:

  1. aligning orthologs, which are genomic sequences related by evolutionary descent from a common ancestral sequence,
  2. examining evolutionary relationships among orthologs and other genomic features, and
  3. using the resulting insights to reason about the biological function and significance of the genomic features under study.

Modern comparative genomic studies face three primary challenges: biological -- the co-occurrence of multiple complex evolutionary events; mathematical -- the need to devise realistic yet tractable models of genome evolution; and computational -- the need to develop accurate and scalable algorithms and tools for conducting large-scale analyses.

My research addresses all three challenges. In this talk, I will begin by describing my Ph.D. work on large-scale sequence alignment and phylogenetic estimation (stages (1) and (2), respectively). Among the primary contributions of that work are new divide-and-conquer techniques, which are essential to accurate inference and which scale to datasets an order of magnitude or more larger (in number of sequences) than previous methods could handle. Next, I will discuss my postgraduate work on machine learning techniques for comparative genomics in the presence of complex evolutionary events, especially those that result in non-tree-like evolutionary histories (stages (2) and (3)). Highlights include PhyloNet-HMM, a novel model-based inference method that combines phylogenetic networks, which capture non-tree-like evolutionary relationships among genomes, with hidden Markov models (HMMs), which capture dependencies within genomes. An empirical analysis of mouse genomes demonstrates the performance of PhyloNet-HMM and yields a new clinical insight and a new biological discovery. Other important applications include the study of horizontal gene transfer in bacteria and its role in the spread of antibiotic resistance. I will conclude with general observations and directions for my future research.

Bio:

Kevin J. Liu is an NIH postdoctoral fellow working with Luay Nakhleh in the Department of Computer Science and Michael H. Kohn in the Department of Ecology and Evolutionary Biology at Rice University.

His research develops efficient and accurate computational algorithms and tools for large-scale comparative genomics, and applies the insights they enable to make new biological and biomedical discoveries. In 2011, he received his Ph.D. in computer science from the University of Texas at Austin, where he was supervised by Tandy Warnow in the Department of Computer Science and C. Randal Linder in the Department of Integrative Biology. His graduate work contributed new computational methods for scalable and accurate inference of large-scale multiple sequence alignments and phylogenies.