Identifying novel constrained elements by exploiting biased substitution patterns
2009 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1460-2059, Vol. 25, no 12, I54-I62 p. Article in journal (Refereed), Published
Motivation: Comparing the genomes of closely related species provides a powerful tool for identifying functional elements in a reference genome. Many methods have been developed to identify conserved sequences across species; however, existing methods model conservation only as a decrease in the rate of mutation and ignore selection acting on the pattern of mutations.
Results: We present a new approach that takes advantage of deeply sequenced clades to identify evolutionary selection by uncovering not only signatures of rate-based conservation but also substitution patterns characteristic of sequence undergoing natural selection. We describe a new statistical method for modeling biased nucleotide substitutions, a learning algorithm for inferring site-specific substitution biases directly from sequence alignments, and a hidden Markov model for detecting constrained elements characterized by biased substitutions. We show that the new approach can identify significantly more degenerate constrained sequences than rate-based methods. Applying it to the ENCODE regions, we identify as much as 10.2% of these regions as being under selection.
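To illustrate the general idea of segmenting an alignment with a hidden Markov model, the following is a minimal sketch of two-state Viterbi decoding (neutral versus constrained). It is not the model described in the article: the per-column log-likelihoods are assumed to be supplied by some substitution model, and the function name `viterbi_two_state` and the parameters `stay` and `prior_constrained` are hypothetical choices made for this example.

```python
import math

def viterbi_two_state(loglik, stay=0.95, prior_constrained=0.1):
    """Most likely state path for a two-state HMM (0 = neutral, 1 = constrained).

    loglik: list of (log P(column | neutral), log P(column | constrained)).
    stay and prior_constrained are illustrative values, not from the article.
    """
    if not loglik:
        return []
    log_stay = math.log(stay)
    log_switch = math.log(1.0 - stay)
    # trans[i][j]: log-probability of moving from state i to state j
    trans = [[log_stay, log_switch], [log_switch, log_stay]]
    init = [math.log(1.0 - prior_constrained), math.log(prior_constrained)]

    # Best log-score of any path ending in each state at the first column
    score = [init[s] + loglik[0][s] for s in (0, 1)]
    back = []  # back[t][s]: best predecessor of state s at column t + 1
    for emit in loglik[1:]:
        new_score, pointers = [], []
        for s in (0, 1):
            cands = [score[p] + trans[p][s] for p in (0, 1)]
            best = 0 if cands[0] >= cands[1] else 1
            pointers.append(best)
            new_score.append(cands[best] + emit[s])
        back.append(pointers)
        score = new_score

    # Trace back from the best final state
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

In this sketch, a run of columns that the constrained model explains better is labelled as a constrained element only when the gain in likelihood outweighs the cost of switching states twice, so isolated columns tend to be absorbed into the surrounding neutral background.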
Keywords: human genome, functional elements, vertebrate, genomes, sequence-analysis, identification, discovery, 1-percent, browser, mammals, family
Medical and Health Sciences
Identifiers: URN: urn:nbn:se:uu:diva-148838; DOI: 10.1093/bioinformatics/btp190; ISI: 000266498300008; OAI: oai:DiVA.org:uu-148838; DiVA: diva2:403338