Uppsala University Publications (uu.se)
New data base-independent, sequence tag-based scoring of peptide MS/MS data validates Mowse scores, recovers below threshold data, singles out modified peptides, and assesses the quality of MS/MS techniques
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences.
Uppsala University, Disciplinary Domain of Science and Technology, Technology, Department of Engineering Sciences.
2005 (English). In: Molecular & Cellular Proteomics, ISSN 1535-9476, E-ISSN 1535-9484, Vol. 4, no 8, p. 1180-1188. Article in journal (Refereed), Published
Abstract [en]

The Mascot score (M-score) is one of the conventional validity measures in data base identification of peptides and proteins by MS/MS data. Although tremendously useful, M-score has a number of limitations. For the same MS/MS data, M-score may change if the protein data base is expanded. A low M-score does not necessarily mean a poor match; it may instead reflect poor MS/MS quality. In addition, M-score does not fully exploit the combined use of the complementary fragmentation techniques collisionally activated dissociation (CAD) and electron capture dissociation (ECD). To address these issues, a new data base-independent scoring method (S-score) was designed that is based on the maximum length of the peptide sequence tag provided by the combined CAD and ECD data. The quality of MS/MS spectra assessed by S-score allows poor data (39% of all MS/MS spectra) to be filtered out before the data base search, speeding up the data analysis and eliminating a major source of false positive identifications. Spectra with below-threshold M-scores (poor matches) but high S-scores are validated. Spectra with zero M-score (no data base match) but high S-score are classified as belonging to modified sequences. As an extension of S-score, an extremely reliable sequence tag was developed based on complementary fragments simultaneously appearing in CAD and ECD spectra. Comparison of this tag with the data base-derived sequence gives the most reliable peptide identification validation to date. The combined use of M- and S-scoring provides positive sequence identification from >25% of all MS/MS data, a 40% improvement over traditional M-scoring performed on the same Fourier transform MS instrumentation. The number of proteins reliably identified from Escherichia coli cell lysate thereby increased by 29% compared with the traditional M-score approach. Finally, S-scoring provides a quantitative measure of the quality of fragmentation techniques, such as the minimum abundance of the precursor ion whose MS/MS spectrum gives the threshold S-score value of 2.
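
The decision rules stated in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `triage`, the `Verdict` labels, and the `m_threshold` parameter are hypothetical, and treating "high S-score" as "at or above the threshold value of 2" is an assumption made here for simplicity.

```python
from enum import Enum


class Verdict(Enum):
    """Hypothetical labels for the four outcomes described in the abstract."""
    POOR_SPECTRUM = "filtered out before the data base search"
    CONFIRMED = "M-score above threshold, supported by S-score"
    RECOVERED = "below-threshold M-score validated by high S-score"
    POSSIBLY_MODIFIED = "no data base match but high S-score"


def triage(m_score: float, s_score: float, m_threshold: float,
           s_threshold: float = 2.0) -> Verdict:
    """Classify one spectrum from its M-score and S-score.

    s_threshold defaults to 2, the threshold S-score value mentioned in the
    abstract; m_threshold is an assumed, user-supplied significance cutoff.
    """
    # Poor-quality spectra are discarded regardless of any data base match.
    if s_score < s_threshold:
        return Verdict.POOR_SPECTRUM
    # Good-quality spectrum with a significant data base match.
    if m_score >= m_threshold:
        return Verdict.CONFIRMED
    # Good-quality spectrum with a weak (but non-zero) data base match.
    if m_score > 0:
        return Verdict.RECOVERED
    # Good-quality spectrum with no data base match at all.
    return Verdict.POSSIBLY_MODIFIED


if __name__ == "__main__":
    print(triage(m_score=10.0, s_score=5.0, m_threshold=30.0).name)
```

In practice the S-score filter would run before the data base search, so the M-score would only be computed for spectra that pass it; the single function above merges both steps purely to keep the sketch short.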

Place, publisher, year, edition, pages
2005. Vol. 4, no 8, 1180-1188 p.
National Category
Engineering and Technology
Identifiers
URN: urn:nbn:se:uu:diva-95264
DOI: 10.1074/mcp.T500009-MCP200
PubMedID: 15911534
OAI: oai:DiVA.org:uu-95264
DiVA: diva2:169413
Available from: 2006-12-20 Created: 2006-12-20 Last updated: 2017-12-14 Bibliographically approved
In thesis
1. Characterization of Polypeptides by Tandem Mass Spectrometry Using Complementary Fragmentation Techniques
2006 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

In the growing field of proteomics, identification of proteins by tandem mass spectrometry (MS/MS) is performed by matching experimental mass spectra against calculated spectra of all possible peptides in a protein database. One problem with this approach is false-positive identifications. MS-based proteomics experiments are further affected by rather poor efficiency, typically in the range of 10-15%, which means that only a small percentage of the acquired mass spectrometric data is significantly identified and assigned a peptide sequence.

In this thesis, improved spectrum specificity is achieved by combining high-accuracy mass spectrometry with techniques that yield complementary sequence information. Performing collision-activated dissociation (CAD) and electron capture dissociation (ECD) on the same peptide ion yields such complementary sequence information. This combination is implemented in a proteomics approach, and the advantages of complementary fragmentation techniques for improving peptide identification are demonstrated. Furthermore, a novel database-independent score (S-score) is introduced, based on the maximum length of the peptide sequence tag derived from the complementary use of CAD and ECD. The S-score can be used to separate poor-quality spectra from good-quality spectra. Another aspect of the S-score is the development of the ‘reliable sequence tag’ (RST), which can be used to recover below-threshold identifications and as a reliable backbone for de novo sequencing of peptides.
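
To make the idea of a "maximum sequence tag length" more concrete, here is a minimal sketch that finds the longest ladder of peaks whose consecutive mass differences match amino acid residue masses. It is an illustration only and not the S-score as defined in the thesis: it assumes singly charged fragments from a single ion series, uses standard monoisotopic residue masses, and does not combine CAD and ECD data. The function name `longest_tag_length` and the tolerance `tol` are hypothetical.

```python
# Standard monoisotopic residue masses in Da (Leu and Ile are isobaric).
RESIDUE_MASSES = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def longest_tag_length(peaks, tol=0.01):
    """Return the number of residues in the longest mass ladder.

    peaks: iterable of fragment masses in Da (assumed singly charged,
           one ion series).
    tol:   absolute mass tolerance in Da (an assumed value).
    """
    masses = sorted(peaks)
    residue_values = sorted(set(RESIDUE_MASSES.values()))
    # best[i] = longest chain of residue-sized steps ending at peak i.
    best = [0] * len(masses)
    for i in range(len(masses)):
        for j in range(i):
            diff = masses[i] - masses[j]
            if any(abs(diff - r) <= tol for r in residue_values):
                best[i] = max(best[i], best[j] + 1)
    return max(best, default=0)


if __name__ == "__main__":
    # Ladder with steps G, A, S plus one unrelated noise peak at 500.0.
    ladder = [100.0, 157.02146, 228.05857, 315.0906, 500.0]
    print(longest_tag_length(ladder))  # prints 3
```

The quadratic dynamic programme is adequate for the few hundred peaks of a typical spectrum; a production tool would also need to handle charge states, isotope peaks, and both ion series.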

A novel proteomics-grade de novo sequencing algorithm based upon the RST has also been developed, which can retrieve peptide identifications with the highest reliability (>95%). Furthermore, a novel software tool for unbiased identification of any post-translational modifications present in a peptide sample is introduced (ModifiComb). Combining all the tools described in this thesis increases the identification specificity (>30 times), recovers false-negative identifications and increases the overall efficiency of proteomics experiments to above 40%, currently one of the highest achieved in large-scale proteomics.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2006. 65 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 252
Keyword
Analytical chemistry, Mass Spectrometry, Electron capture dissociation (ECD), Collision-activated dissociation (CAD), Proteomics, Post-translational modifications, De Novo sequencing, Bioinformatics, Analytisk kemi
National Category
Natural Sciences
Identifiers
urn:nbn:se:uu:diva-7409 (URN)
91-554-6755-5 (ISBN)
Public defence
2007-01-11, B21, BMC, Husargatan 3, Uppsala, 14:15
Available from: 2006-12-20 Created: 2006-12-20 Last updated: 2013-09-04 Bibliographically approved
2. New Proteomics Methods and Fundamental Aspects of Peptide Fragmentation
2007 (English). Doctoral thesis, comprehensive summary (Other academic)
Alternative title[sv]
Nya Proteomik Metoder och Fundamentala Aspekter av Peptid Fragmentering
Abstract [en]

The combination of collision-activated dissociation (CAD) and electron capture dissociation (ECD) yielded a 125% increase in protein identification. The S-score was developed for measuring the information content in MS/MS spectra. This measure made it possible to single out good-quality spectra that were not identified by a search engine. Poor-quality MS/MS data were filtered out, streamlining the identification process.

A proteomics-grade de novo sequencing approach was developed, making it possible to almost completely sequence 19% of all MS/MS data with 95% reliability in a typical proteomics experiment.

A new tool, ModifiComb, was developed for identifying all types of modifications in a fast, reliable way. New types of modifications have been discovered, and the extent of modifications in gel-based proteomics turned out to be greater than expected.

PhosTShunter was developed for sensitive identification of all phosphorylated peptides in an MS/MS dataset.

Application of these programs to human milk samples led to identification of a previously unreported and potentially biologically important phosphorylation site.

Peptide fragmentation has been studied. It was shown on a dataset of 15,000 MS/MS spectra that CAD and ECD have different cleavage preferences with respect to the amino acid context.

Hydrogen rearrangement involving z• species has been investigated. Clear trends have been unveiled. This information elucidated the mechanism of hydrogen transfer.

Partial side-chain losses in ECD have been studied. The potential of these ions for reliably distinguishing Leu/Iso residues was shown. Partial side-chain losses occurring far away from the cleavage site have been detected.

A strong correlation was found between the propensity of amino acids towards peptide bond cleavage in CAD and their propensity to accept backbone-backbone H-bonds in solution and form stable motifs. This indicated that the same parameter governs the formation of secondary structures in solution and directs fragmentation of peptide ions by CAD.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2007. 56 p.
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 264
Keyword
Bioinformatics, Proteomics, Peptide fragmentation, Bioinformatik
Identifiers
urn:nbn:se:uu:diva-7438 (URN)
978-91-554-6775-X (ISBN)
Public defence
2007-02-08, B21, BMC, Husargatan 3, Uppsala, 14:15
Available from: 2007-01-17 Created: 2007-01-17 Last updated: 2013-09-04 Bibliographically approved
