The majority of the vertebrate genome sequence does not code for proteins. In recent years, the evolution of this noncoding fraction of the genome has attracted growing interest, and its study has been greatly facilitated by the availability of full genome sequences. The aim of this thesis is to study the evolution of the noncoding vertebrate genome through bioinformatic analysis of large-scale genomic datasets.
In a first analysis, we examined the use of sequence conservation between highly diverged genomes to infer function. We provided evidence for a turnover of the patterns of negative selection. Hence, measures of constraint based on comparisons of diverged genomes might underestimate the functional proportion of the genome.
In the following analyses, we focused on length variation as found in small-scale insertion and deletion (indel) polymorphisms and microsatellites. For indels in chicken, replication slippage is a likely mutation mechanism, as a large proportion of the indels are parts of tandem duplicates. Using a set of microsatellite polymorphisms in chicken that avoids ascertainment bias, we showed that polymorphism is positively correlated with microsatellite length and AT content. Furthermore, interruptions in the microsatellite sequence decrease the levels of polymorphism.
We also analysed the association between microsatellite polymorphism and recombination in the human genome. We found increased levels of microsatellite polymorphism in human recombination hotspots, together with similar increases in the frequencies of single nucleotide polymorphisms (SNPs) and indels. This points towards natural selection shaping the levels of variation. Alternatively, recombination may be mutagenic for all three kinds of polymorphism.
Finally, I present the program ILAPlot, a BLAST-based tool for visualisation, exploration and data extraction.
Our combined results highlight the intricate connections between evolutionary phenomena. They also emphasise the importance of length variability in genome evolution, as well as the gradual rather than discrete distinction between indels and microsatellites.