Uppsala University Publications (uu.se)
Conformal Regression for Quantitative Structure-Activity Relationship Modeling: Quantifying Prediction Uncertainty
Univ Cambridge, Ctr Mol Informat, Dept Chem, Lensfield Rd, Cambridge CB2 1EW, England; IOTA Pharmaceut, St Johns Innovat Ctr, Cowley Rd, Cambridge CB4 0WS, England.
Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
Swetox, Unit of Toxicology Sciences, Karolinska Institutet, Forskargatan 20, SE-151 36 Södertälje, Sweden; Department of Computer and Systems Sciences, Stockholm University, Box 7003, SE-164 07 Kista, Sweden.
Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K.
2018 (English). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 58, no 5, p. 1132-1140. Article in journal (Refereed), Published
Abstract [en]

Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the output prediction intervals to create regressors that are as efficient (i.e., narrow) as possible. Different algorithms for estimating the prediction uncertainty were used to normalize the prediction ranges, and the different approaches were evaluated on 29 publicly available datasets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals. This approach afforded an average prediction range of 1.65 pIC50 units at the 80% confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range, with a difference of close to one log unit in bioactivity between the tightest and widest prediction ranges. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.
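The scaling idea described in the abstract can be illustrated with a minimal sketch of inductive conformal regression. This is not the authors' code: the function name `conformal_intervals`, its arguments, and the exact nonconformity score `|y - y_hat| / exp(sd)` are assumptions made for illustration, with `sd` standing in for the per-compound standard deviation across the trees of a random forest. The calibration quantile uses the usual `ceil((n + 1) * confidence)` rule.

```python
import math

def conformal_intervals(cal_true, cal_pred, cal_sd,
                        test_pred, test_sd, confidence=0.8):
    """Return (lower, upper) prediction intervals for each test point.

    Hypothetical helper: the nonconformity score |y - y_hat| / exp(sd)
    is an assumed reading of "natural exponential of the ensemble
    standard deviation", not a formula taken from the record.
    """
    # Normalized nonconformity scores on the held-out calibration set.
    scores = sorted(
        abs(y - p) / math.exp(s)
        for y, p, s in zip(cal_true, cal_pred, cal_sd)
    )
    n = len(scores)
    # Rank of the calibration score that bounds the requested confidence.
    k = math.ceil((n + 1) * confidence)
    if k > n:
        raise ValueError("calibration set too small for this confidence")
    alpha = scores[k - 1]
    # Scale the interval half-width by exp(sd) of each test prediction.
    return [
        (p - alpha * math.exp(s), p + alpha * math.exp(s))
        for p, s in zip(test_pred, test_sd)
    ]

# Toy numbers (made up for illustration, not data from the study).
intervals = conformal_intervals(
    cal_true=[5.0, 6.0, 7.0, 8.0],
    cal_pred=[5.5, 6.0, 6.0, 8.2],
    cal_sd=[0.0, 0.0, 0.0, 0.0],
    test_pred=[6.5, 6.5],
    test_sd=[0.0, 1.0],
    confidence=0.8,
)
for lower, upper in intervals:
    print(f"[{lower:.2f}, {upper:.2f}]")
```

In this sketch a test compound with a larger ensemble standard deviation receives a wider interval, while compounds on which the ensemble agrees receive a narrower one, which is the sense in which normalization can make a conformal regressor more efficient.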

Place, publisher, year, edition, pages
2018. Vol. 58, no 5, p. 1132-1140
National Category
Bioinformatics and Systems Biology
Identifiers
URN: urn:nbn:se:uu:diva-350011
DOI: 10.1021/acs.jcim.8b00054
ISI: 000433634900021
PubMedID: 29701973
OAI: oai:DiVA.org:uu-350011
DiVA, id: diva2:1203242
Funder
Swedish Research Council Formas; Swedish Foundation for Strategic Research
Available from: 2018-05-02. Created: 2018-05-02. Last updated: 2018-08-20. Bibliographically approved

Authority records BETA

Spjuth, Ola