Improved estimation of intrinsic solubility of drug-like molecules through multi-task graph transformer
2025 (English). In: Journal of Cheminformatics, E-ISSN 1758-2946, Vol. 17, no 1, article id 153
Article in journal (Refereed) Published
Abstract [en]
The aqueous solubility of a compound plays a crucial role throughout drug discovery and development. Despite numerous efforts using various machine learning models, accurately estimating aqueous solubility remains a challenge. One primary limitation is the absence of a large, single-source dataset of drug-like compounds for model training. Additionally, studies have highlighted the need for improvements in prediction algorithms and molecular representations. To address these challenges, the Johnson and Johnson (J&J) in-house solubility data were leveraged. Theoretical pH-solubility equations and in-house pKa prediction tools were used to calculate intrinsic solubility from the J&J data. A multi-task graph transformer model was developed and trained on the calculated intrinsic solubility data of 13,306 compounds along with seven relevant physicochemical properties, including solubility at pH 2/7, logP, and logD at three different pHs. When evaluated on high-quality test data, the developed model achieved a root mean square error (RMSE) of 0.61 and a coefficient of determination (R²) of 0.60, demonstrating state-of-the-art performance in estimating intrinsic solubility for drug-like compounds.
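The abstract does not state which pH-solubility equations were used. As a rough illustration of the general idea only, the sketch below assumes the textbook Henderson-Hasselbalch relationship for a monoprotic compound, where total solubility at a given pH equals intrinsic solubility scaled by one plus the ionized-to-neutral ratio. The function name `intrinsic_log_s` and its parameters are hypothetical and are not taken from the article; multiprotic and zwitterionic compounds would need additional terms.

```python
import math


def intrinsic_log_s(log_s_ph: float, ph: float, pka: float, kind: str) -> float:
    """Estimate log10 intrinsic solubility from log10 solubility measured at `ph`.

    Assumes a monoprotic compound obeying Henderson-Hasselbalch:
      acid:    S(pH) = S0 * (1 + 10**(pH - pKa))
      base:    S(pH) = S0 * (1 + 10**(pKa - pH))
      neutral: S(pH) = S0
    This is an illustrative assumption, not the article's stated method.
    """
    if kind == "acid":
        ionized_ratio = 10.0 ** (ph - pka)
    elif kind == "base":
        ionized_ratio = 10.0 ** (pka - ph)
    elif kind == "neutral":
        ionized_ratio = 0.0
    else:
        raise ValueError(f"unknown compound kind: {kind!r}")
    # Divide out the ionization factor in log space.
    return log_s_ph - math.log10(1.0 + ionized_ratio)


# Hypothetical example values: a weak acid (pKa 4.0) measured at pH 7.0.
example = intrinsic_log_s(log_s_ph=-3.0, ph=7.0, pka=4.0, kind="acid")
```

Under this assumption, a compound that is mostly ionized at the measurement pH has an intrinsic solubility well below its measured solubility, which is why the pKa estimate matters for the conversion.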
Place, publisher, year, edition, pages
BioMed Central (BMC), 2025. Vol. 17, no 1, article id 153
Keywords [en]
Graph transformer, Multi-task learning, Quantitative structure-property relationship (QSPR), Molecular property prediction, Drug-like compounds
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:uu:diva-570504
DOI: 10.1186/s13321-025-01106-0
ISI: 001592018500001
PubMedID: 41084070
Scopus ID: 2-s2.0-105018704970
OAI: oai:DiVA.org:uu-570504
DiVA, id: diva2:2009776
2025-10-28 2025-10-28 2025-10-28 Bibliographically approved