Uppsala University Publications (uu.se)
Local descriptors of protein structure: A systematic analysis of the sequence-structure relationship in proteins using short- and long-range interactions
Uppsala University, Disciplinary Domain of Science and Technology, Biology, Department of Cell and Molecular Biology, The Linnaeus Centre for Bioinformatics.
2009 (English). In: Proteins: Structure, Function, and Genetics, ISSN 0887-3585, E-ISSN 1097-0134, Vol. 75, no 4, 870-884 p. Article in journal (Refereed). Published
Abstract [en]

Local protein structure representations that incorporate long-range contacts between residues are often considered in protein structure comparison but have found relatively little use in structure prediction, where assembly from single backbone fragments dominates. Here, we introduce the concept of local descriptors of protein structure to characterize local neighborhoods of amino acids including short- and long-range interactions. We build a library of recurring local descriptors and show that this library is general enough to allow assembly of unseen protein structures. The library could on average re-assemble 83% of 119 unseen structures, and showed little or no performance decrease between homologous targets and targets with folds not represented among domains used to build it. We then systematically evaluate the descriptor library to establish the level of the sequence signal in sets of protein fragments of similar geometrical conformation. In particular, we test whether that signal is strong enough to facilitate correct assignment and alignment of these local geometries to new sequences. We use the signal to assign descriptors to a test set of 479 sequences with less than 40% sequence identity to any domain used to build the library, and show that on average more than 50% of the backbone fragments constituting descriptors can be correctly aligned. We also use the assigned descriptors to infer SCOP folds, and show that correct predictions can be made in many of the 151 cases where PSI-BLAST was unable to detect significant sequence similarity to proteins in the library. Although the combinatorial problem of simultaneously aligning several fragments to sequence is a major bottleneck compared with single fragment methods, the advantage of the approach is that correct alignments imply correct long-range distance constraints. The lack of these constraints is most likely the major reason why structure prediction methods fail to consistently produce adequate models when good templates are unavailable or undetectable. Thus, we believe that the current study offers new and valuable insight into the prediction of sequence-structure relationships in proteins.

Place, publisher, year, edition, pages
2009. Vol. 75, no 4, 870-884 p.
Keyword [en]
protein structure prediction, protein structure comparison, fragment-based methods, local protein substructures, sequence patterns, long-range interactions
National Category
Biological Sciences
URN: urn:nbn:se:uu:diva-129049
DOI: 10.1002/prot.22296
ISI: 000266133600008
OAI: oai:DiVA.org:uu-129049
DiVA: diva2:337503
Available from: 2010-08-06. Created: 2010-08-05. Last updated: 2010-08-06. Bibliographically approved.