Exploring the diversity of unmapped reads from human deep sequencing
Independent thesis Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits
Student thesis
Currently, DNA and RNA sequencing are performed as standard parts of many scientific experiments. While the majority of the reads produced in these experiments map to the genome of the organism of interest, a significant fraction do not. These reads have often been viewed as uninteresting and thus discarded, sometimes explained as errors created in the sequencing process. However, there is a real possibility that these reads contain genomic sequences that belong to the organism investigated but are not yet represented in its genome, as well as information about other organisms that live and thrive in the sample material. Considering this, it is of great interest to investigate these reads to see whether they contain any usable information. In this project, the unmapped reads from SOLiD sequencing of blood and saliva from a twin pair were assembled. The assembled contigs were then compared to different BLAST databases to investigate whether similar genomic regions are reported in other species. We conclude that a large fraction of the contigs found in this assembly have homology to bacterial genes, while other contigs share similarity with genomic regions found in apes and other species closely related to humans. All in all, the results show that there is more to the unmapped reads than just sequencing errors.
Place, publisher, year, edition, pages
2012. 32 p.
genetics, medical, next generation sequencing
Identifiers
URN: urn:nbn:se:uu:diva-194782
OAI: oai:DiVA.org:uu-194782
DiVA: diva2:606422
Master Programme in Bioinformatics
2012-06-28, C2:301, BMC, Uppsala, 11:15 (English)
Feuk, Lars, Associate Professor
Josefsson, Lars-Göran, Associate Professor