uu.se Uppsala University Publications
Publications (10 of 34)
Lapins, M., Arvidsson, S., Lampa, S., Berg, A., Schaal, W., Alvarsson, J. & Spjuth, O. (2018). A confidence predictor for logD using conformal regression and a support-vector machine. Journal of Cheminformatics, 10(1), Article ID 17.
2018 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 10, no 1, article id 17. Article in journal (Refereed). Published
Abstract [en]

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best-performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor. We also publish predicted values at the 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service.
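
As a rough illustration of how a conformal regressor turns a point prediction into an interval at a chosen confidence level, the following is a minimal sketch of split (inductive) conformal regression with the absolute residual as nonconformity measure. It is not the authors' implementation: the function name `conformal_interval`, its parameters, and the choice of nonconformity measure are assumptions made for this example, and the point predictions are taken as given rather than produced by an SVM.

```python
import math


def conformal_interval(calibration_residuals, prediction, confidence):
    """Return (lower, upper) around `prediction` at the given confidence.

    `calibration_residuals` are (observed - predicted) values on a held-out
    calibration set; the nonconformity score is their absolute value
    (an assumption for this sketch).
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    # Rank of the calibration score that gives the requested coverage.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to bound the error at this level.
        return (float("-inf"), float("inf"))
    half_width = scores[rank - 1]
    return (prediction - half_width, prediction + half_width)
```

Raising the confidence level selects a larger calibration score and therefore a wider interval, which matches the abstract's pattern of reporting one interval width at 80% confidence and another at 90%.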

Keywords
Conformal prediction, LogD, Machine learning, QSAR, RDF, Support-vector machine
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-347779 (URN); 10.1186/s13321-018-0271-1 (DOI); 000429065900001 (ISI); 29616425 (PubMedID)
Funder
EU, Horizon 2020, 731075
Available from: 2018-04-06 Created: 2018-04-06 Last updated: 2018-08-28; Bibliographically approved
Ahmed, L., Georgiev, V., Capuccini, M., Toor, S., Schaal, W., Laure, E. & Spjuth, O. (2018). Efficient iterative virtual screening with Apache Spark and conformal prediction. Journal of Cheminformatics, 10, Article ID 8.
2018 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 10, article id 8. Article in journal (Refereed). Published
Abstract [en]

BACKGROUND: Docking and scoring large libraries of ligands against target proteins forms the basis of structure-based virtual screening. The problem is trivially parallelizable, and calculations are generally carried out on computer clusters or on large workstations in a brute force manner, by docking and scoring all available ligands.

CONTRIBUTION: In this study we propose a strategy based on iteratively docking a set of ligands to form a training set, training a ligand-based model on this set, and predicting the remainder of the ligands to exclude those predicted as 'low-scoring'. Then another set of ligands is docked, the model is retrained, and the process is repeated until a certain model efficiency level is reached. Thereafter, the remaining ligands are docked or excluded based on this model. We use SVM and conformal prediction to deliver valid prediction intervals for ranking the predicted ligands, and Apache Spark to parallelize both the docking and the modeling.

RESULTS: We show on 4 different targets that conformal-prediction-based virtual screening (CPVS) is able to reduce the number of docked molecules by 62.61% while retaining an accuracy for the top 30 hits of 94% on average and a speedup of 3.7. The implementation is available as open source via GitHub (https://github.com/laeeq80/spark-cpvs) and can be run on high-performance computers as well as on cloud resources.
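
The dock, train, and exclude loop described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the spark-cpvs implementation: `iterative_screen`, `dock`, and `train` are hypothetical names, the loop stops after a fixed number of rounds instead of at a model efficiency threshold, and it runs serially rather than on Apache Spark.

```python
def iterative_screen(ligands, dock, train, batch_size, max_rounds):
    """Dock ligands in batches, retrain, and drop predicted low scorers.

    dock(ligand) returns a docking score (hypothetical callable).
    train(docked) takes {ligand: score} and returns a predicate that is
    True for ligands predicted as low-scoring (hypothetical callable).
    """
    remaining = list(ligands)
    docked = {}
    for _ in range(max_rounds):
        if not remaining:
            break
        # Dock the next batch and add it to the training set.
        batch, remaining = remaining[:batch_size], remaining[batch_size:]
        for ligand in batch:
            docked[ligand] = dock(ligand)
        # Retrain on everything docked so far, then exclude ligands
        # the model predicts as low-scoring.
        predicted_low = train(docked)
        remaining = [lig for lig in remaining if not predicted_low(lig)]
    # Dock whatever the final model did not exclude.
    for ligand in remaining:
        docked[ligand] = dock(ligand)
    return docked
```

Only ligands that were actually docked appear in the returned mapping, so its size relative to the library size gives the fraction of docking work that was spent.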

Keywords
Apache Spark, Cloud computing, Conformal prediction, Docking, Virtual screening
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-343980 (URN); 10.1186/s13321-018-0265-z (DOI); 000426699400001 (ISI); 29492726 (PubMedID)
Funder
eSSENCE - An eScience Collaboration; Swedish e‐Science Research Center; Swedish National Infrastructure for Computing (SNIC), b2015245; Swedish National Infrastructure for Computing (SNIC), SNIC 2017/13-6
Available from: 2018-03-03 Created: 2018-03-03 Last updated: 2018-05-14; Bibliographically approved
Dahlö, M., Scofield, D., Schaal, W. & Spjuth, O. (2018). Tracking the NGS revolution: managing life science research on shared high-performance computing clusters. GigaScience, 7(5), Article ID giy028.
2018 (English). In: GigaScience, ISSN 2047-217X, E-ISSN 2047-217X, Vol. 7, no 5, article id giy028. Article in journal (Refereed). Published
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-350009 (URN); 10.1093/gigascience/giy028 (DOI); 000438566200001 (ISI); 29659792 (PubMedID)
Funder
Science for Life Laboratory - a national resource center for high-throughput molecular bioscience; Swedish National Infrastructure for Computing (SNIC)
Available from: 2018-04-05 Created: 2018-05-02 Last updated: 2018-09-24; Bibliographically approved
Alogheli, H., Olanders, G., Schaal, W., Brandt, P. & Anders, K. (2017). Docking of Macrocycles: Comparing Rigid and Flexible Docking in Glide. Journal of Chemical Information and Modeling, 57(2), 190-202
2017 (English). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 57, no 2, p. 190-202. Article in journal (Refereed). Published
Abstract [en]

In recent years, there has been an increased interest in using macrocyclic compounds for drug discovery and development. For docking of these commonly large and flexible compounds to be addressed, a screening and a validation set were assembled from the PDB consisting of 16 and 31 macrocycle-containing protein complexes, respectively. The macrocycles were docked in Glide by rigid docking of pregenerated conformational ensembles produced by the macrocycle conformational sampling method (MCS) in Schrödinger Release 2015-3 or by direct Glide flexible docking after performing ring-templating. The two protocols were compared to rigid docking of pregenerated conformational ensembles produced by an exhaustive Monte Carlo multiple minimum (MCMM) conformational search and a shorter MCMM conformational search (MCMM-short). The docking accuracy was evaluated and expressed as the RMSD between the heavy atoms of the ligand as found in the X-ray structure after refinement and the poses obtained by the docking protocols. The median RMSD values for top-scored poses of the screening set were 0.83, 0.80, 0.88, and 0.58 Å for MCMM, MCMM-short, MCS, and Glide flexible docking, respectively. There was no statistically significant difference in the performance between rigid docking of pregenerated conformations produced by the MCS and direct docking using Glide flexible docking. However, the flexible docking protocol was two times faster in docking the screening set than the MCS protocol. In a final study, the new Prime-MCS method was evaluated in Schrödinger Release 2016-3. This method is faster than MCS; however, the conformations generated were found to be suboptimal for rigid docking. Therefore, on the basis of timing, accuracy, and ease of setup, standard Glide flexible docking with prior ring-templating is recommended over current gold standard protocols using rigid docking of pregenerated conformational ensembles.
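
For readers unfamiliar with the accuracy metric used in this abstract, a heavy-atom RMSD between a reference ligand and a docked pose can be computed as in the sketch below. The function name `heavy_atom_rmsd` is hypothetical, and the sketch assumes that both coordinate lists contain only heavy atoms, that atoms are listed in the same order, and that both structures share one coordinate frame (no superposition or symmetry correction is applied). How the study itself handled these points is not stated in the abstract.

```python
import math


def heavy_atom_rmsd(reference, pose):
    """RMSD in the input units between two equally ordered lists of (x, y, z)."""
    if not reference or len(reference) != len(pose):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared distances between matched atoms.
    squared = sum(
        (a - b) ** 2
        for ref_atom, pose_atom in zip(reference, pose)
        for a, b in zip(ref_atom, pose_atom)
    )
    return math.sqrt(squared / len(reference))
```

With coordinates in ångström the result is in ångström, so it is directly comparable to median values such as those quoted above.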

National Category
Medicinal Chemistry
Identifiers
urn:nbn:se:uu:diva-318050 (URN); 10.1021/acs.jcim.6b00443 (DOI); 000395226100010 (ISI); 28079375 (PubMedID)
Funder
Swedish Research Council, 521-2014-6711
Available from: 2017-03-23 Created: 2017-03-23 Last updated: 2018-03-05; Bibliographically approved
Sütterlin, S., Dahlö, M., Tellgren-Roth, C., Schaal, W. & Melhus, Å. (2017). High frequency of silver resistance genes in invasive isolates of Enterobacter and Klebsiella species. Journal of Hospital Infection, 96(3), 256-261
2017 (English). In: Journal of Hospital Infection, ISSN 0195-6701, E-ISSN 1532-2939, Vol. 96, no 3, p. 256-261. Article in journal (Refereed). Published
Abstract [en]

Background: Silver-based products have been marketed as an alternative to antibiotics, and their consumption has increased. Bacteria may, however, develop resistance to silver.

Aim: To study the presence of genes encoding silver resistance (silE, silP, silS) over time in three clinically important Enterobacteriaceae genera.

Methods: Using polymerase chain reaction (PCR), 752 bloodstream isolates from the years 1990–2010 were investigated. Age, gender, and ward of patients were registered, and the susceptibility to antibiotics and silver nitrate was tested. Clonality and single nucleotide polymorphism were assessed with repetitive element sequence-based PCR, multi-locus sequence typing, and whole-genome sequencing.

Findings: Genes encoding silver resistance were detected most frequently in Enterobacter spp. (48%), followed by Klebsiella spp. (41%) and Escherichia coli (4%). Phenotypical resistance to silver nitrate was found in Enterobacter (13%) and Klebsiella (3%) isolates. The lowest carriage rate of sil genes was observed in blood isolates from the neonatology ward (24%), and the highest in blood isolates from the oncology/haematology wards (66%). Presence of sil genes was observed in international high-risk clones. Sequences of the sil and pco clusters indicated that a single mutational event in the silS gene could have caused the phenotypic resistance.

Conclusion: Despite a restricted consumption of silver-based products in Swedish health care, silver resistance genes are widely represented in clinical isolates of Enterobacter and Klebsiella species. To avoid further selection and spread of silver-resistant bacteria with a high potential for healthcare-associated infections, the use of silver-based products needs to be controlled and the silver resistance monitored.

National Category
Public Health, Global Health, Social Medicine and Epidemiology
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-326299 (URN); 10.1016/j.jhin.2017.04.017 (DOI); 000403468000010 (ISI); 28506673 (PubMedID)
Funder
Swedish Research Council Formas, 2011-1692; Science for Life Laboratory - a national resource center for high-throughput molecular bioscience
Available from: 2017-07-05 Created: 2017-07-05 Last updated: 2017-09-14; Bibliographically approved
Capuccini, M., Ahmed, L., Schaal, W., Laure, E. & Spjuth, O. (2017). Large-scale virtual screening on public cloud resources with Apache Spark. Journal of Cheminformatics, 9, Article ID 15.
2017 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 9, article id 15. Article in journal (Refereed). Published
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-318693 (URN); 10.1186/s13321-017-0204-4 (DOI); 000396830300001 (ISI); 28316653 (PubMedID)
Projects
eSSENCE
Available from: 2017-03-06 Created: 2017-03-27 Last updated: 2018-03-14; Bibliographically approved
Rönn, R., Lindh, C., Ringberg, E., Andersson, H., Nilsson, P., Schaal, W., . . . Tyrchan, C. (2016). Cyclopropane carboxylic acid derivatives and pharmaceutical uses thereof. WO WO-2016177845-A1.
2016 (English). Patent (Other (popular science, discussion, etc.))
National Category
Pharmaceutical Chemistry
Identifiers
urn:nbn:se:uu:diva-343695 (URN)
Patent
WO WO-2016177845-A1
Available from: 2018-02-28 Created: 2018-02-28 Last updated: 2018-02-28
Alvarsson, J., Lampa, S., Schaal, W., Andersson, C., Wikberg, J. E. S. & Spjuth, O. (2016). Large-scale ligand-based predictive modelling using support vector machines. Journal of Cheminformatics, 8, Article ID 39.
2016 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 8, article id 39. Article in journal (Refereed). Published
Abstract [en]

The increasing size of datasets in drug discovery makes it challenging to build robust and accurate predictive models within a reasonable amount of time. In order to investigate the effect of dataset sizes on predictive performance and modelling time, ligand-based regression models were trained on open datasets of varying sizes of up to 1.2 million chemical structures. For modelling, two implementations of support vector machines (SVM) were used. Chemical structures were described by the signatures molecular descriptor. Results showed that for the larger datasets, the LIBLINEAR SVM implementation performed on par with the well-established libsvm with a radial basis function kernel, but with dramatically less time for model building even on modest computer resources. Using a non-linear kernel proved to be infeasible for large data sizes, even with substantial computational resources on a computer cluster. To deploy the resulting models, we extended the Bioclipse decision support framework to support models from LIBLINEAR and made our models of logD and solubility available from within Bioclipse.

Keywords
Predictive modelling; Support vector machine; Bioclipse; Molecular signatures; QSAR
National Category
Pharmaceutical Sciences; Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-248959 (URN); 10.1186/s13321-016-0151-5 (DOI); 000381186100001 (ISI); 27516811 (PubMedID)
Funder
Swedish National Infrastructure for Computing (SNIC), b2013262 b2015001; Science for Life Laboratory - a national resource center for high-throughput molecular bioscience; eSSENCE - An eScience Collaboration
Available from: 2015-04-09 Created: 2015-04-09 Last updated: 2018-08-28; Bibliographically approved
Pelcman, B., Sanin, A., Nilsson, P., No, K., Schaal, W., Öhrman, S., . . . Claesson, H.-E. (2015). 3-Substituted pyrazoles and 4-substituted triazoles as inhibitors of human 15-lipoxygenase-1. Bioorganic & medicinal chemistry letters, 25(15), 3024-3029
2015 (English). In: Bioorganic & medicinal chemistry letters, ISSN 1464-3405, Vol. 25, no 15, p. 3024-3029. Article in journal (Refereed). Published
Abstract [en]

Investigation of 1N-substituted pyrazole-3-carboxanilides as 15-lipoxygenase-1 (15-LOX-1) inhibitors demonstrated that the 1N-substituent was not essential for activity or selectivity. Additional halogen substituents on the pyrazole ring, however, increased activity. Further development led to triazole-4-carboxanilides and 2-(3-pyrazolyl) benzoxazoles, which are potent and selective 15-LOX-1 inhibitors.

National Category
Medicinal Chemistry
Identifiers
urn:nbn:se:uu:diva-256043 (URN); 10.1016/j.bmcl.2015.05.004 (DOI); 000356101700029 (ISI); 26037322 (PubMedID)
Available from: 2015-06-22 Created: 2015-06-22 Last updated: 2018-03-05; Bibliographically approved
Pelcman, B., Sanin, A., Nilsson, P., Schaal, W., Olofsson, K., Krog-Jensen, C., . . . Claesson, H.-E. (2015). N-Substituted pyrazole-3-carboxamides as inhibitors of human 15-lipoxygenase. Bioorganic & medicinal chemistry letters, 25(15), 3017-3023
2015 (English). In: Bioorganic & medicinal chemistry letters, ISSN 1464-3405, Vol. 25, no 15, p. 3017-3023. Article in journal (Refereed). Published
Abstract [en]

High-throughput screening was used to find selective inhibitors of human 15-lipoxygenase-1 (15-LOX-1). One hit, a 1-benzoyl substituted pyrazole-3-carboxanilide (1a), was used as a starting point in a program to develop potent and selective 15-LOX-1 inhibitors.

National Category
Medicinal Chemistry
Identifiers
urn:nbn:se:uu:diva-256045 (URN); 10.1016/j.bmcl.2015.05.007 (DOI); 000356101700028 (ISI); 26037319 (PubMedID)
Available from: 2015-06-22 Created: 2015-06-22 Last updated: 2018-03-05; Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-6770-0878