Novel Bioinformatics Applications for Protein Allergology, Genome-Wide Association and Retrovirology Studies
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics. (Bongcam-Rudloff)
2010 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Recently, the number of data sources within the Life Sciences has grown exponentially, to the point that managing their integration efficiently has become a difficult problem. The data avalanche we are experiencing may mark a turning point in science, with a change of orientation from proprietary to publicly available data and a concomitant acceptance of studies based on the latter. To investigate these issues, a Network of Excellence (EMBRACE) was launched with the aim of integrating the major databases and the most popular bioinformatics software tools. The focus of this thesis is therefore the problem of seamlessly integrating varied data sources and/or distributed research tools.

In paper I, we have developed a web service to facilitate allergenicity risk assessment, based on allergen descriptors, in order to characterize proteins with the potential for sensitization and cross-reactivity.

In paper II, a web service was developed which uses a lightweight protocol to integrate human endogenous retrovirus (ERV) data within a public genome browser. This new data catalogue and many other publicly available sources were integrated and tested in a bioinformatics-rich client application.

In paper III, GeneFinder, a distributed tool for genome-wide association studies, was developed and tested. Useful information based on a particular genomic region can be easily retrieved and assessed.

Finally, in paper IV, we developed a prototype pipeline to mine the dog genome for endogenous retroviruses and to display the transcriptional landscape of these retroviral integrations. Moreover, we further characterized a group that until this point was believed to be primate-specific. Our results also revealed that the dog has been very effective in protecting itself from such integrations.

This work integrates different applications in the fields of protein allergology, biotechnology, genome association studies and endogenous retroviruses.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2010, p. 103
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 703
Keywords [en]
data integration, web services, protein allergology, risk assessment, cross reactivity, endogenous retroviruses, ERV, dog, canine, GWAS, genome-wide association studies
Identifiers
URN: urn:nbn:se:uu:diva-111932
ISBN: 978-91-554-7694-6 (print)
OAI: oai:DiVA.org:uu-111932
DiVA, id: diva2:283857
Public defence
2010-01-29, C8:301, BMC, Husargatan 3, Uppsala, 09:30 (English)
Projects
EMBRACE NoE EU FP6
Available from: 2010-01-08 Created: 2009-12-31 Last updated: 2010-01-11 Bibliographically approved
List of papers
1. EVALLER: a web server for in silico assessment of potential protein allergenicity
2007 (English) In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 35, p. W694-W700. Article in journal (Refereed) Published
Abstract [en]

Bioinformatics testing approaches for protein allergenicity, involving amino acid sequence comparisons, have grown appreciably in sophistication and performance over the last several years. EVALLER, the web server presented in this article, is based on our recently published 'Detection based on Filtered Length-adjusted Allergen Peptides' (DFLAP) algorithm, which affords in silico determination of potential protein allergenicity with high sensitivity and excellent specificity. To strengthen bioinformatics risk assessment in allergology, EVALLER provides a comprehensive outline of its judgment on a query protein's potential allergenicity. Each such textual output incorporates a scoring figure, a confidence numeral of the assignment and information on high- or low-scoring matches to identified allergen-related motifs, including their respective location in accordingly derived allergens. The interface, built on a modified Perl Open Source package, enables dynamic and color-coded graphic representation of key parts of the output. Moreover, pertinent details can be examined closely through zoomed views. The server can be accessed at http://bioinformatics.bmc.uu.se/evaller.html.

National Category
Medical and Health Sciences; Signal Processing
Research subject
Electrical Engineering with specialization in Signal Processing
Identifiers
urn:nbn:se:uu:diva-99645 (URN)
10.1093/nar/gkm370 (DOI)
000255311500128 ()
17537818 (PubMedID)
Available from: 2009-03-18 Created: 2009-03-18 Last updated: 2017-12-13 Bibliographically approved
2. Annotation and visualization of endogenous retroviral sequences using the Distributed Annotation System (DAS) and eBioX
2009 (English) In: BMC Bioinformatics, ISSN 1471-2105, E-ISSN 1471-2105, Vol. 10 Suppl. 6, p. S18- Article in journal (Refereed) Published
Abstract [en]

BACKGROUND: The Distributed Annotation System (DAS) is a widely used network protocol for sharing biological information. The distributed aspects of the protocol enable the use of various reference and annotation servers for connecting biological sequence data to pertinent annotations in order to depict an integrated view of the data for the final user.

RESULTS: An annotation server has been devised to provide information about the endogenous retroviruses detected and annotated by a specialized in silico tool called RetroTector. We describe the procedure to implement the DAS 1.5 protocol commands necessary for constructing the DAS annotation server, and we use our server to exemplify those steps. Data distribution is kept separate from visualization, which is carried out by eBioX, an easy-to-use open source program incorporating multiple bioinformatics utilities. Some well-characterized endogenous retroviruses are shown in two different DAS clients. A rapid analysis of areas free from retroviral insertions could be facilitated by our annotations.

CONCLUSION: The DAS protocol has been shown to be advantageous in the distribution of endogenous retrovirus data. The distributed nature of the protocol is also found to aid in combining annotation and visualization along a genome in order to enhance the understanding of the ERV contribution to its evolution. Reference and annotation servers are conjointly used by eBioX to provide visualization of ERV annotations as well as other data sources. Our DAS data source can be found in the central public DAS service repository, http://www.dasregistry.org, or at http://loka.bmc.uu.se/das/sources.
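
To make the client side of such an annotation server concrete, the following is a minimal Python sketch of how a DAS 1.5 "features" request could be built and how a DASGFF reply could be read. It is an illustration only, not the code of the server or of eBioX: the host example.org, the source name erv_demo, and the feature records in SAMPLE_RESPONSE are hypothetical, and the reply is embedded in the script so that no network access is needed.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical DASGFF reply in the DAS 1.5 layout; all values are made up.
SAMPLE_RESPONSE = """
<DASGFF>
  <GFF version="1.0" href="http://example.org/das/erv_demo/features">
    <SEGMENT id="1" start="1000" stop="90000">
      <FEATURE id="erv_0001" label="hypothetical ERV 1">
        <TYPE id="provirus">provirus</TYPE>
        <METHOD id="RetroTector">RetroTector</METHOD>
        <START>1200</START>
        <END>9800</END>
        <ORIENTATION>+</ORIENTATION>
      </FEATURE>
      <FEATURE id="erv_0002" label="hypothetical ERV 2">
        <TYPE id="provirus">provirus</TYPE>
        <METHOD id="RetroTector">RetroTector</METHOD>
        <START>40500</START>
        <END>47900</END>
        <ORIENTATION>-</ORIENTATION>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>
"""


def features_url(server, source, segment, start, stop):
    """Build a DAS 'features' request URL for one segment range."""
    query = urlencode({"segment": f"{segment}:{start},{stop}"}, safe=":,")
    return f"{server.rstrip('/')}/{source}/features?{query}"


def parse_features(xml_text):
    """Return one dict per FEATURE element found in a DASGFF document."""
    root = ET.fromstring(xml_text)
    features = []
    for segment in root.iter("SEGMENT"):
        for feature in segment.iter("FEATURE"):
            features.append({
                "segment": segment.get("id"),
                "id": feature.get("id"),
                "type": feature.findtext("TYPE"),
                "start": int(feature.findtext("START")),
                "end": int(feature.findtext("END")),
                # "0" is used here when a reply carries no strand information
                "orientation": feature.findtext("ORIENTATION", default="0"),
            })
    return features


if __name__ == "__main__":
    print(features_url("http://example.org/das", "erv_demo", "1", 1000, 90000))
    for feat in parse_features(SAMPLE_RESPONSE):
        print(feat["id"], feat["start"], feat["end"], feat["orientation"])
```

Because the reply is plain XML keyed by segment coordinates, a client can merge features from several annotation servers onto one reference sequence, which is the separation of data distribution from visualization that the abstract describes.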

Place, publisher, year, edition, pages
BioMed Central, 2009
National Category
Medical and Health Sciences
Identifiers
urn:nbn:se:uu:diva-106783 (URN)
10.1186/1471-2105-10-S6-S18 (DOI)
000267522200018 ()
19534743 (PubMedID)
Available from: 2009-07-03 Created: 2009-07-02 Last updated: 2017-12-13 Bibliographically approved
3. GeneFinder: "in silico" positional cloning of trait genes
(English) Manuscript (preprint) (Other (popular science, discussion, etc.))
Abstract [en]

Motivation: Positional cloning of trait genes is extremely laborious. The amount of information available on gene function in different organisms is increasing so rapidly that it is hard for a research group to collect all the relevant information from a number of data sources without performing a large number of manual and time-consuming searches.

Results: A web service application named GeneFinder was designed and implemented. It collects selected available information related to trait loci within a given chromosomal region that control a specific phenotype. The information contains details on gene function, disease conditions and tissue expression, as well as predicted gene homologies in several other species. The information gathered is further ordered by a special-purpose ranking algorithm. A web interface to the GeneFinder web service was also developed, in which the results are presented as a ranked list to ease their interpretation. We explain the design of the architecture, show how our web interface works, and finally test a candidate region.

Availability: GeneFinder is publicly available and free to use. The web interface is available at http://www.genefinder.org/.
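
The abstract does not describe the ranking algorithm itself, so the following Python sketch only illustrates the general idea of collecting genes inside a chromosomal region and ordering them by the evidence attached to them. Everything in it is an assumption made for illustration: the Gene record, the evidence categories (named after the kinds of information the abstract lists), the WEIGHTS values and the weighted-sum score are hypothetical and are not GeneFinder's actual method.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str
    chrom: str
    start: int
    end: int
    # evidence category -> number of supporting records (hypothetical layout)
    evidence: dict = field(default_factory=dict)


# Hypothetical weights; the real ranking scheme is not given in the abstract.
WEIGHTS = {"function": 1.0, "disease": 2.0, "expression": 1.0, "homology": 0.5}


def in_region(gene, chrom, start, end):
    """True if the gene overlaps the closed interval [start, end] on chrom."""
    return gene.chrom == chrom and gene.start <= end and gene.end >= start


def score(gene):
    """Weighted count of evidence records; unknown categories count as zero."""
    return sum(WEIGHTS.get(cat, 0.0) * n for cat, n in gene.evidence.items())


def rank_candidates(genes, chrom, start, end):
    """Genes overlapping the region, highest score first."""
    hits = [g for g in genes if in_region(g, chrom, start, end)]
    return sorted(hits, key=score, reverse=True)


if __name__ == "__main__":
    genes = [
        Gene("GENE_B", "chr5", 150_000, 180_000, {"expression": 1, "homology": 2}),
        Gene("GENE_A", "chr5", 100_000, 120_000, {"function": 2, "disease": 1}),
        Gene("GENE_C", "chr7", 100_000, 120_000, {"disease": 5}),
    ]
    for gene in rank_candidates(genes, "chr5", 90_000, 200_000):
        print(gene.name, score(gene))
```

A weighted sum is only one possible design; its appeal for a sketch is that each evidence source contributes independently, so adding a new data source means adding one weight rather than changing the ranking code.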

Keywords
data integration, web services, GWAS, genome-wide association studies
National Category
Medical Genetics; Bioinformatics and Systems Biology
Research subject
Medical Genetics
Identifiers
urn:nbn:se:uu:diva-112124 (URN)
Projects
EMBRACE
Available from: 2010-01-09 Created: 2010-01-09 Last updated: 2018-01-12
4. Data mining of the dog genome reveals novel Canine Endogenous Retroviruses (CfERVs)
(English) Manuscript (preprint) (Other (popular science, discussion, etc.))
Abstract [en]

Mining the dog genome for canine endogenous retroviruses (CfERV) using the program RetroTector© identified 407 CfERVs (0.15% of the total genome size). Phylogenetic analysis showed that the majority of these CfERVs belong to the gammaretroviridae (n=313) genus. In this group, we found 33 integrated CfERVs with similarity to the human HERV-Fc1. Eighteen of them had conserved open reading frames, and seven of the 18 were recent integrations (≤ 5% LTR divergence). Some of these CfERVs may have potential for active retrotransposition and could actively contribute to the plasticity of canine genomes. As in other vertebrates, betaretroviruses (n=28) were the second most common group. In addition, four spuma-like and four gypsy-like CfERVs were identified, the latter group being rare in vertebrate genomes. Moreover, we identified 55 CfERVs that could not be assigned unambiguously to any known retroviral genus. The integration landscape shows that all dog chromosomes have CfERV integrations, with non-uniform distribution both along and across chromosomes. Some regions were essentially devoid of CfERVs whereas other regions had large numbers. Notably, in a comparison between the dog and human genomes, the number of CfERVs was approximately one fifth of the number of HERVs found. Species-specific mechanisms for purging and protection against retroviral infections are suggested to act in the dog genome. The CfERV integration pattern showed that a substantial fraction of annotated genes were found within 100 kb of annotated proviruses. The majority of such integrations were placed in antisense orientation relative to the transcriptional direction of the neighboring chromosomal genes. In conclusion, our results from Canis familiaris genome analysis support the notion that different mammals may interact distinctively with endogenous retroviruses.
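
The "≤ 5% LTR divergence" criterion rests on the fact that the two long terminal repeats of a provirus are identical at the time of integration and accumulate differences afterwards, so their divergence serves as a rough age proxy. The Python sketch below shows one simple way such a percentage could be computed from two LTR sequences that have already been aligned. It is an assumption for illustration, not the calculation used in the manuscript or by RetroTector: it counts plain mismatches, skips gap columns, and applies no substitution model; the function names are hypothetical.

```python
def ltr_divergence(ltr5, ltr3):
    """Percent mismatching positions between two pre-aligned LTR sequences.

    Both strings must have the same length; columns where either sequence
    has a gap character '-' are skipped rather than counted as differences.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def is_recent_integration(ltr5, ltr3, threshold=5.0):
    """Apply the 'at most threshold percent divergence' rule from the abstract."""
    return ltr_divergence(ltr5, ltr3) <= threshold


if __name__ == "__main__":
    five_prime = "ACGT" * 10                 # 40 aligned positions, made-up sequence
    three_prime = "T" + five_prime[1:]       # one substitution
    print(ltr_divergence(five_prime, three_prime))
    print(is_recent_integration(five_prime, three_prime))
```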

Keywords
endogenous retroviruses, ERV, dog, canine
National Category
Bioinformatics and Systems Biology; Microbiology in the medical area
Research subject
Medical Virology
Identifiers
urn:nbn:se:uu:diva-112125 (URN)
Projects
EMBRACE
Available from: 2010-01-09 Created: 2010-01-09 Last updated: 2018-01-12

Open Access in DiVA

fulltext (1598 kB), 1493 downloads
File information
File name: FULLTEXT01.pdf
File size: 1598 kB
Checksum (SHA-512): be8ddcc5c014cdb2edb99981f29b1fc468aae0e2f7ae2c906936238b0b33e633d1c7458f9719e1df810cdcc4d762bc459f3ebd0f5d8359910530bb2db22a42bc
Type: fulltext
Mimetype: application/pdf

Author
Martínez Barrio, Álvaro