Uppsala University Publications
Computational detection of allergenic proteins attains a new level of accuracy with in silico variable-length peptide extraction and machine learning
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences, Clinical Pharmacology. (Cancer pharmacology and informatics (Rolf Larsson))
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Genetics and Pathology. Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences, Signal Processing. (Cancer pharmacology and informatics (Rolf Larsson))
2006 (English). In: Nucleic Acids Research, ISSN 0305-1048, E-ISSN 1362-4962, Vol. 34, no. 13, p. 3779-3793. Article in journal (Refereed), Published
Abstract [en]

Placing novel or new-in-the-context proteins on the market, as occurs with genetically modified foods, certain bio-pharmaceuticals and some household products, leads to human exposure to proteins that may elicit allergic responses. Accurate methods to detect allergens are therefore necessary to ensure consumer/patient safety. We demonstrate that a new level of accuracy in computational detection of allergenic proteins can be reached by presenting a novel detector, Detection based on Filtered Length-adjusted Allergen Peptides (DFLAP). The DFLAP algorithm extracts variable-length allergen sequence fragments and employs modern machine learning techniques in the form of a support vector machine. In particular, this new detector shows hitherto unmatched specificity when challenged with the Swiss-Prot repository, without appreciable loss of sensitivity. DFLAP is also the first reported detector that successfully discriminates between allergens and non-allergens occurring in protein families known to hold both categories. Allergenicity assessment for specific protein sequences of interest using DFLAP is possible via ulfh@slv.se.
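
The abstract names two ingredients: variable-length fragment extraction and a support vector machine. The record does not describe the actual DFLAP filtering or feature construction, so the following is only a minimal sketch of the general idea under stated assumptions. The function names, the exact-match filter against non-allergen fragments, and the binary presence features are all hypothetical illustrations, not the published method.

```python
from typing import Iterator, List, Set


def extract_peptides(sequence: str, min_len: int, max_len: int) -> Iterator[str]:
    """Yield every contiguous fragment whose length lies in [min_len, max_len]."""
    n = len(sequence)
    for length in range(min_len, min(max_len, n) + 1):
        for start in range(n - length + 1):
            yield sequence[start:start + length]


def filter_peptides(allergens: List[str], non_allergens: List[str],
                    min_len: int, max_len: int) -> List[str]:
    """Keep allergen fragments that never occur in the non-allergen set.

    Assumption: an exact-match filter stands in for whatever filtering
    DFLAP really applies.
    """
    background: Set[str] = set()
    for seq in non_allergens:
        background.update(extract_peptides(seq, min_len, max_len))
    kept: Set[str] = set()
    for seq in allergens:
        kept.update(p for p in extract_peptides(seq, min_len, max_len)
                    if p not in background)
    return sorted(kept)


def featurize(sequence: str, peptides: List[str]) -> List[int]:
    """Binary vector: 1 where the query contains the peptide exactly."""
    return [1 if p in sequence else 0 for p in peptides]


# Toy sequences, invented purely for illustration.
peptides = filter_peptides(["ACDEF"], ["CDEXX"], min_len=2, max_len=3)
features = featurize("XXDEFXX", peptides)
```

Feature vectors of this kind, computed for labelled allergens and non-allergens, could then be handed to any support vector machine implementation for training. A realistic detector would likely score approximate rather than exact matches.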

Place, publisher, year, edition, pages
2006. Vol. 34, no 13, 3779-3793 p.
National Category
Medical and Health Sciences; Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-97617
DOI: 10.1093/nar/gkl467
ISI: 000240583100028
PubMedID: 16977698
OAI: oai:DiVA.org:uu-97617
DiVA: diva2:172632
Available from: 2008-10-17 Created: 2008-10-17 Last updated: 2017-12-14 Bibliographically approved
In thesis
1. Novel Computational Analyses of Allergens for Improved Allergenicity Risk Assessment and Characterization of IgE Reactivity Relationships
2008 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Immunoglobulin E (IgE) mediated allergy is a major and seemingly increasing health problem in Western countries. The combined usage of databases of molecular and clinical information on allergens (allergenic proteins) and new experimental platforms capable of generating huge amounts of allergy-related data from a single blood test holds great potential to enhance our knowledge of this complex disease. To benefit fully from this development, however, both novel and improved methods for computational analysis are urgently required. This thesis concerns two types of important and practical computational analyses of allergens: allergenicity/IgE-cross-reactivity risk assessment and characterization of IgE-reactivity patterns. Both directions rely on development and implementation of bioinformatics and statistical learning algorithms, which are applied either to amino acid sequence information of allergenic proteins or to quantified human blood serum levels of specific IgE-antibodies to allergen preparations (purified extracts of allergenic sources, such as peanut or birch).

The main application for computational risk assessment of allergenicity is to prevent unintentional introduction of allergen-encoding transgenes in genetically modified (GM) food crops. Two separate classification procedures for potential protein allergenicity are introduced. Both protocols rely on multivariate classification algorithms that are trained to discriminate allergens from presumed non-allergens based on their amino acid sequence. Both classification procedures are thoroughly evaluated and the second protocol shows state-of-the-art performance in comparison to current top-ranked methods. Moreover, several pitfalls in performance estimation of classifiers are demonstrated and procedures to circumvent these are suggested.

Visualization and characterization of IgE-reactivity patterns among allergen preparations are enabled by application of bioinformatics and statistical learning methods to a multivariate dataset holding recorded blood serum IgE-levels of over 1000 sensitized individuals, each measured against 89 allergen preparations. Moreover, a novel framework for divisive hierarchical clustering, including graphical representation of the resulting output, is introduced, which greatly simplifies analysis of this dataset. Important IgE-reactivity relationships within several groups of allergen preparations are identified, including well-known groups of clinically relevant cross-reactivities.
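
Divisive hierarchical clustering works top-down: all items start in one group, which is split repeatedly. The thesis framework itself is not described in this record, so the sketch below shows only a generic bisecting scheme under stated assumptions. The farthest-pair seeding rule, the Euclidean distance, the function name `divisive_cluster`, and the toy reactivity profiles are hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist
from typing import Dict, Sequence, Union

Tree = Union[str, tuple]


def divisive_cluster(profiles: Dict[str, Sequence[float]]) -> Tree:
    """Recursively split items into two groups seeded by the farthest pair."""
    names = sorted(profiles)
    if len(names) == 1:
        return names[0]
    # Seeds: the two items separated by the largest Euclidean distance.
    a, b = max(combinations(names, 2),
               key=lambda pair: dist(profiles[pair[0]], profiles[pair[1]]))
    left, right = {}, {}
    for name in names:
        d_a = dist(profiles[name], profiles[a])
        d_b = dist(profiles[name], profiles[b])
        # Seed b always goes right so that both halves are non-empty,
        # even when every profile is identical.
        if name != b and d_a <= d_b:
            left[name] = profiles[name]
        else:
            right[name] = profiles[name]
    return (divisive_cluster(left), divisive_cluster(right))


# Invented two-dimensional profiles, one per allergen preparation.
toy = {
    "birch": [9.0, 0.5],
    "alder": [8.5, 0.7],
    "peanut": [0.2, 7.0],
    "soy": [0.4, 6.5],
}
tree = divisive_cluster(toy)
```

The nested tuples form a binary tree that could be drawn as a dendrogram. On the toy data the first split separates the two tree-pollen preparations from the two legume preparations.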

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2008. 65 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine, ISSN 1651-6206 ; 385
Keyword
allergens, bioinformatics, statistical learning, performance estimation, risk assessment
National Category
Biomedical Laboratory Science/Technology
Identifiers
urn:nbn:se:uu:diva-9313 (URN)
978-91-554-7308-2 (ISBN)
Public defence
2008-11-07, Lärosal IV, Universitetshuset, Uppsala, 13:00 (English)
Available from: 2008-10-17 Created: 2008-10-17 Last updated: 2009-05-12 Bibliographically approved
