uu.se Uppsala University Publications
Quantitative Chemogenomics: Machine-Learning Models of Protein-Ligand Interaction
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. (Cancer Pharmacology and Computational Medicine)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. (Cancer Pharmacology and Computational Medicine)
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Medical Sciences. (Cancer Pharmacology and Computational Medicine)
2011 (English). In: Current Topics in Medicinal Chemistry, ISSN 1568-0266, E-ISSN 1873-4294, Vol. 11, no 15, 1978-1993 p. Article, review/survey (Refereed). Published
Description
Abstract [en]

Chemogenomics is an emerging interdisciplinary field that lies at the interface of biology, chemistry, and informatics. Most of the currently used drugs are small molecules that interact with proteins. Understanding protein-ligand interaction is therefore central to drug discovery and design. In the subfield of chemogenomics known as proteochemometrics, protein-ligand-interaction models are induced from data matrices that consist of both protein and ligand information along with some experimentally measured variable. The two general aims of this quantitative multi-structure-property-relationship (QMSPR) modeling approach are to exploit sparse/incomplete information sources and to obtain more general models, covering larger parts of the protein-ligand space, than traditional approaches that focus mainly on specific targets or ligands. The data matrices, usually obtained from multiple sparse/incomplete sources, typically contain series of proteins and ligands together with quantitative information about their interactions. A useful model should ideally be easy to interpret and generalize well to new, unseen protein-ligand combinations. Achieving this requires sophisticated machine-learning methods for model induction, combined with adequate validation. This review is intended to provide a guide to methods and data sources suitable for this kind of protein-ligand-interaction modeling. An overview of the modeling process is presented, including data collection, protein and ligand descriptor computation, data preprocessing, machine-learning-model induction, and validation. Concerns and issues specific to each step in this kind of data-driven modeling are discussed.
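
The data-matrix setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the review: the descriptors and interaction values are synthetic, the protein-ligand cross-terms are one common feature construction, and ridge regression stands in for whichever machine-learning method is actually used. Each row of the matrix describes one protein-ligand pair, only part of the pairs are treated as measured, and the model is validated on held-out combinations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical descriptor tables: 10 proteins x 3 descriptors, 12 ligands x 4 descriptors.
n_prot, n_lig = 10, 12
P = rng.normal(size=(n_prot, 3))
L = rng.normal(size=(n_lig, 4))

def pair_features(p, l):
    """Protein block, ligand block, and protein-ligand cross-terms for one pair."""
    return np.concatenate([p, l, np.outer(p, l).ravel()])

# Synthetic "measured" interaction values (a stand-in for experimental data).
pairs = [(i, j) for i in range(n_prot) for j in range(n_lig)]
X = np.array([pair_features(P[i], L[j]) for i, j in pairs])
w_true = rng.normal(size=X.shape[1])
y = X @ w_true + rng.normal(scale=0.05, size=len(pairs))

# Sparse data: only 60% of the pairs are treated as measured; the rest are held out.
order = rng.permutation(len(pairs))
n_train = int(0.6 * len(pairs))
train, test = order[:n_train], order[n_train:]

def fit_ridge(X, y, lam=1e-3):
    """Closed-form ridge regression with an unpenalised intercept."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    penalty = lam * np.eye(Xb.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(Xb.T @ Xb + penalty, Xb.T @ y)

def predict(w, X):
    """Apply the fitted intercept and coefficients to a feature matrix."""
    return w[0] + X @ w[1:]

# Validation on protein-ligand combinations that were not used for fitting.
w = fit_ridge(X[train], y[train])
residual = y[test] - predict(w, X[test])
r2_heldout = 1.0 - residual @ residual / np.sum((y[test] - y[test].mean()) ** 2)
print(f"held-out R^2 on unseen protein-ligand pairs: {r2_heldout:.3f}")
```

Note that the held-out pairs here combine proteins and ligands that each appear elsewhere in the training data; predicting for entirely new proteins or ligands is a harder test and would require splitting by protein or by ligand instead.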

Place, publisher, year, edition, pages
2011. Vol. 11, no 15, 1978-1993 p.
Keyword [en]
Chemogenomics, proteochemometrics, QSAR, QMSPR, machine learning
National Category
Medical and Health Sciences
Identifiers
URN: urn:nbn:se:uu:diva-158280
DOI: 10.2174/156802611796391249
ISI: 000293855300009
OAI: oai:DiVA.org:uu-158280
DiVA: diva2:438941
Available from: 2011-09-06. Created: 2011-09-06. Last updated: 2017-12-08. Bibliographically approved.

By author/editor
Andersson, Claes R.; Gustafsson, Mats G.