Uppsala University Publications
Biography [eng]

PhD in Bioinformatics from Uppsala University, 2009. Postdoctoral fellowships at Karolinska Institutet, Stockholm and Finnish Institute of Molecular Medicine (FIMM), Helsinki. Was co-director at the UPPMAX high performance computing center at Uppsala University 2011-2017, and Head of the Bioinformatics Compute and Storage facility at Science for Life Laboratory in Sweden 2011-2017. Currently employed as Senior Lecturer at the Department of Pharmaceutical Biosciences in the fields of data-intensive and translational bioinformatics, with a particular focus on how modern e-infrastructures enable the study of complex phenomena, and on predictive modeling in pharmacology, toxicology, and metabolism.

Publications (10 of 86)
Herman, S., Niemelä, V., Emami Khoonsari, P., Sundblom, J., Burman, J., Landtblom, A.-M., . . . Kultima, K. (2019). Alterations in the tyrosine and phenylalanine pathways revealed by biochemical profiling in cerebrospinal fluid of Huntington's disease subjects. Scientific Reports, 9, Article ID 4129.
2019 (English). In: Scientific Reports, ISSN 2045-2322, E-ISSN 2045-2322, Vol. 9, article id 4129. Article in journal (Refereed). Published
Abstract [en]

Huntington's disease (HD) is a severe neurological disease leading to psychiatric symptoms, motor impairment and cognitive decline. The disease is caused by a CAG expansion in the huntingtin (HTT) gene, but how this translates into the clinical phenotype of HD remains elusive. Using liquid chromatography mass spectrometry, we analyzed the metabolome of cerebrospinal fluid (CSF) from premanifest and manifest HD subjects as well as control subjects. Inter-group differences revealed that tyrosine metabolism, including tyrosine, thyroxine, L-DOPA and dopamine, was significantly altered in manifest compared with premanifest HD. These metabolites demonstrated moderate to strong associations with measures of disease severity and symptoms. Thyroxine and dopamine also correlated with the five-year risk of onset in premanifest HD subjects. Phenylalanine and purine metabolism were also significantly altered, but were less strongly associated with disease severity. Decreased levels of lumichrome were commonly found in mutated HTT carriers, and the levels correlated with the five-year risk of disease onset in premanifest carriers. These biochemical findings demonstrate that the CSF metabolome can be used to characterize molecular pathogenesis occurring in HD, which may be essential for future development of novel HD therapies.

Place, publisher, year, edition, pages
NATURE PUBLISHING GROUP, 2019
National Category
Neurology
Identifiers
urn:nbn:se:uu:diva-379886 (URN); 10.1038/s41598-019-40186-5 (DOI); 000460754600020 (ISI); 30858393 (PubMedID)
Funder
Åke Wiberg Foundation; EU, Horizon 2020, 654241
Available from: 2019-03-25. Created: 2019-03-25. Last updated: 2019-03-25. Bibliographically approved
Herman, S., Åkerfeldt, T., Spjuth, O., Burman, J. & Kultima, K. (2019). Biochemical Differences in Cerebrospinal Fluid between Secondary Progressive and Relapsing-Remitting Multiple Sclerosis. Cells, 8(2), Article ID 84.
2019 (English). In: Cells, ISSN 2073-4409, Vol. 8, no 2, article id 84. Article in journal (Refereed). Published
Abstract [en]

To better understand the pathophysiological differences between secondary progressive multiple sclerosis (SPMS) and relapsing-remitting multiple sclerosis (RRMS), and to identify potential biomarkers of disease progression, we applied high-resolution mass spectrometry (HRMS) to investigate the metabolome of cerebrospinal fluid (CSF). The biochemical differences were determined using partial least squares discriminant analysis (PLS-DA), connected to biochemical pathways, and associated with clinical and radiological measures. Tryptophan metabolism was significantly altered, with perturbed levels of kynurenate, 5-hydroxytryptophan, 5-hydroxyindoleacetate, and N-acetylserotonin in SPMS patients compared with RRMS and controls. SPMS patients had altered kynurenine compared with RRMS patients, and altered indole-3-acetate compared with controls. Regarding pyrimidine metabolism, SPMS patients had altered levels of uridine and deoxyuridine compared with RRMS and controls, and altered thymine and glutamine compared with RRMS patients. Metabolites from pyrimidine metabolism were significantly associated with disability, disease activity and brain atrophy, making them of particular interest for understanding the disease mechanisms and as markers of disease progression. Overall, these findings are of importance for the characterization of the molecular pathogenesis of SPMS and support the hypothesis that the CSF metabolome may be used to explore changes that occur in the transition between the RRMS and SPMS pathologies.

Keywords
cerebrospinal fluid, mass spectrometry, metabolomics, multiple sclerosis, pyrimidine, tryptophan
National Category
Clinical Medicine
Identifiers
urn:nbn:se:uu:diva-375564 (URN); 10.3390/cells8020084 (DOI); 000460896000006 (ISI); 30678351 (PubMedID)
Funder
Åke Wiberg Foundation; EU, Horizon 2020, 654241
Available from: 2019-01-31. Created: 2019-01-31. Last updated: 2019-04-11. Bibliographically approved
Novella, J. A., Emami Khoonsari, P., Herman, S., Whitenack, D., Capuccini, M., Burman, J., . . . Spjuth, O. (2019). Container-based bioinformatics with Pachyderm. Bioinformatics, 35, 839-846
2019 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, p. 839-846. Article in journal (Refereed). Published
National Category
Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:uu:diva-371628 (URN); 10.1093/bioinformatics/bty699 (DOI); 30101309 (PubMedID)
Available from: 2018-08-08. Created: 2018-12-21. Last updated: 2019-03-29. Bibliographically approved
Peters, K., Bradbury, J., Bergmann, S., Capuccini, M., Cascante, M., de Atauri, P., . . . Steinbeck, C. (2019). PhenoMeNal: Processing and analysis of Metabolomics data in the Cloud. GigaScience, 8
2019 (English). In: GigaScience, ISSN 2047-217X, E-ISSN 2047-217X, Vol. 8. Article in journal (Refereed). Epub ahead of print
National Category
Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:uu:diva-371635 (URN); 10.1093/gigascience/giy149 (DOI); 30535405 (PubMedID)
Available from: 2018-12-07. Created: 2018-12-21. Last updated: 2019-01-03. Bibliographically approved
Kensert, A., Harrison, P. J. & Spjuth, O. (2019). Transfer Learning with Deep Convolutional Neural Networks for Classifying Cellular Morphological Changes. SLAS discovery: advancing life sciences R & D, Article ID 2472555218818756.
2019 (English). In: SLAS discovery: advancing life sciences R & D, ISSN 2472-5560, article id 2472555218818756. Article in journal (Refereed). Epub ahead of print
Abstract [en]

The quantification and identification of cellular phenotypes from high-content microscopy images has proven to be very useful for understanding biological activity in response to different drug treatments. The traditional approach has been to use classical image analysis to quantify changes in cell morphology, which requires several nontrivial and independent analysis steps. Recently, convolutional neural networks have emerged as a compelling alternative, offering good predictive performance and the possibility to replace traditional workflows with a single network architecture. In this study, we applied the pretrained deep convolutional neural networks ResNet50, InceptionV3, and InceptionResnetV2 to predict cell mechanisms of action in response to chemical perturbations for two cell profiling datasets from the Broad Bioimage Benchmark Collection. These networks were pretrained on ImageNet, enabling much quicker model training. We obtain higher predictive accuracy than previously reported, between 95% and 97%. The ability to quickly and accurately distinguish between different cell morphologies from a scarce amount of labeled data illustrates the combined benefit of transfer learning and deep convolutional neural networks for interrogating cell-based images.

Keywords
cell phenotypes, deep learning, high-content imaging, machine learning, transfer learning
National Category
Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:uu:diva-375566 (URN); 10.1177/2472555218818756 (DOI); 30641024 (PubMedID)
Funder
Swedish Foundation for Strategic Research; Swedish National Infrastructure for Computing (SNIC)
Available from: 2019-01-31 Created: 2019-01-31 Last updated: 2019-01-31
Lapins, M., Arvidsson, S., Lampa, S., Berg, A., Schaal, W., Alvarsson, J. & Spjuth, O. (2018). A confidence predictor for logD using conformal regression and a support-vector machine. Journal of Cheminformatics, 10(1), Article ID 17.
2018 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 10, no 1, article id 17. Article in journal (Refereed). Published
Abstract [en]

Lipophilicity is a major determinant of ADMET properties and overall suitability of drug candidates. We have developed large-scale models to predict the water-octanol distribution coefficient (logD) for chemical compounds, aiding drug discovery projects. Using ACD/logD data for 1.6 million compounds from the ChEMBL database, models are created and evaluated by a support-vector machine with a linear kernel using conformal prediction methodology, outputting prediction intervals at a specified confidence level. The resulting model shows a predictive ability of [Formula: see text], with the best-performing nonconformity measure having a median prediction interval of [Formula: see text] log units at 80% confidence and [Formula: see text] log units at 90% confidence. The model is available as an online service via an OpenAPI interface and a web page with a molecular editor, and we also publish predictive values at the 90% confidence level for 91 M PubChem structures in RDF format for download and as a URI resolver service.

Keywords
Conformal prediction, LogD, Machine learning, QSAR, RDF, Support-vector machine
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-347779 (URN); 10.1186/s13321-018-0271-1 (DOI); 000429065900001 (ISI); 29616425 (PubMedID)
Funder
EU, Horizon 2020, 731075
Available from: 2018-04-06. Created: 2018-04-06. Last updated: 2018-08-28. Bibliographically approved
Svensson, F., Aniceto, N., Norinder, U., Cortes-Ciriano, I., Spjuth, O., Carlsson, L. & Bender, A. (2018). Conformal Regression for Quantitative Structure-Activity Relationship Modeling-Quantifying Prediction Uncertainty. Journal of Chemical Information and Modeling, 58(5), 1132-1140
2018 (English). In: Journal of Chemical Information and Modeling, ISSN 1549-9596, E-ISSN 1549-960X, Vol. 58, no 5, p. 1132-1140. Article in journal (Refereed). Published
Abstract [en]

Making predictions with an associated confidence is highly desirable as it facilitates decision making and resource prioritization. Conformal regression is a machine learning framework that allows the user to define the required confidence and delivers predictions that are guaranteed to be correct to the selected extent. In this study, we apply conformal regression to model molecular properties and bioactivity values and investigate different ways to scale the outputted prediction intervals to create as efficient (i.e. narrow) regressors as possible. Different algorithms to estimate the prediction uncertainty were used to normalize the prediction ranges and the different approaches were evaluated on 29 publicly available datasets. Our results show that the most efficient conformal regressors are obtained when using the natural exponential of the ensemble standard deviation from the underlying random forest to scale the prediction intervals. This approach afforded an average prediction range of 1.65 pIC50 units at the 80 % confidence level when applied to bioactivity modeling. The choice of nonconformity function has a pronounced impact on the average prediction range with a difference of close to one log unit in bioactivity between the tightest and widest prediction range. Overall, conformal regression is a robust approach to generate bioactivity predictions with associated confidence.
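
The scaling idea in this abstract can be illustrated with a minimal sketch of split (inductive) conformal regression. This is not the authors' implementation: the function names `calibrate` and `predict_interval` are hypothetical, and the point predictions and the ensemble standard deviation `sigma` are assumed to come from some underlying model (for example a random forest) supplied by the caller. The nonconformity score is the absolute residual divided by exp(sigma), which is the normalization the abstract reports as most efficient.

```python
import math

def calibrate(y_true, y_pred, sigma, confidence):
    """Return the nonconformity threshold for the requested confidence.

    Scores are |y - y_hat| / exp(sigma), computed on a held-out
    calibration set (all three sequences must have equal length).
    """
    scores = sorted(
        abs(y - p) / math.exp(s) for y, p, s in zip(y_true, y_pred, sigma)
    )
    n = len(scores)
    # 1-based rank of the calibration score that bounds the interval.
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration examples for this confidence level:
        # only an unbounded interval is valid.
        return math.inf
    return scores[rank - 1]

def predict_interval(y_pred, sigma, threshold):
    """Scale the threshold back by exp(sigma) to get a per-object interval."""
    half_width = threshold * math.exp(sigma)
    return (y_pred - half_width, y_pred + half_width)

# Toy calibration data (invented for illustration only).
threshold = calibrate(
    y_true=[1.0, 2.0, 3.0],
    y_pred=[1.5, 2.0, 2.0],
    sigma=[0.0, 0.0, 0.0],
    confidence=0.5,
)
interval = predict_interval(y_pred=2.0, sigma=0.0, threshold=threshold)
```

Objects with a larger ensemble standard deviation receive wider intervals, while objects the ensemble agrees on receive narrower ones; the calibration step is what ties the interval width to the chosen confidence level.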

National Category
Bioinformatics and Systems Biology
Identifiers
urn:nbn:se:uu:diva-350011 (URN); 10.1021/acs.jcim.8b00054 (DOI); 000433634900021 (ISI); 29701973 (PubMedID)
Funder
Swedish Research Council Formas; Swedish Foundation for Strategic Research
Available from: 2018-05-02. Created: 2018-05-02. Last updated: 2018-08-20. Bibliographically approved
Gupta, A., Harrison, P. J., Wieslander, H., Pielawski, N., Kartasalo, K., Partel, G., . . . Wählby, C. (2018). Deep Learning in Image Cytometry: A Review. Cytometry Part A
2018 (English). In: Cytometry Part A, ISSN 1552-4922, E-ISSN 1552-4930. Article in journal (Refereed). Epub ahead of print
Abstract [en]

Artificial intelligence, deep convolutional neural networks, and deep learning are all niche terms that are increasingly appearing in scientific presentations as well as in the general media. In this review, we focus on deep learning and how it is applied to microscopy image data of cells and tissue samples. Starting with an analogy to neuroscience, we aim to give the reader an overview of the key concepts of neural networks, and an understanding of how deep learning differs from more classical approaches for extracting information from image data. We aim to increase the understanding of these methods, while highlighting considerations regarding input data requirements, computational resources, challenges, and limitations. We do not provide a full manual for applying these methods to your own data, but rather review previously published articles on deep learning in image cytometry, and guide the readers toward further reading on specific networks and methods, including new methods not yet applied to cytometry data. © 2018 The Authors. Cytometry Part A published by Wiley Periodicals, Inc. on behalf of International Society for Advancement of Cytometry.

Keywords
biomedical image analysis, cell analysis, convolutional neural networks, deep learning, image cytometry, machine learning, microscopy
National Category
Medical Image Processing
Identifiers
urn:nbn:se:uu:diva-371631 (URN); 10.1002/cyto.a.23701 (DOI); 30565841 (PubMedID)
Funder
Swedish Foundation for Strategic Research
Available from: 2018-12-21 Created: 2018-12-21 Last updated: 2019-03-28
Ahmed, L., Georgiev, V., Capuccini, M., Toor, S., Schaal, W., Laure, E. & Spjuth, O. (2018). Efficient iterative virtual screening with Apache Spark and conformal prediction. Journal of Cheminformatics, 10, Article ID 8.
2018 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 10, article id 8. Article in journal (Refereed). Published
Abstract [en]

BACKGROUND: Docking and scoring large libraries of ligands against target proteins forms the basis of structure-based virtual screening. The problem is trivially parallelizable, and calculations are generally carried out on computer clusters or on large workstations in a brute force manner, by docking and scoring all available ligands.

CONTRIBUTION: In this study we propose a strategy that is based on iteratively docking a set of ligands to form a training set, training a ligand-based model on this set, and predicting the remainder of the ligands to exclude those predicted as 'low-scoring' ligands. Then, another set of ligands is docked, the model is retrained and the process is repeated until a certain model efficiency level is reached. Thereafter, the remaining ligands are docked or excluded based on this model. We use SVM and conformal prediction to deliver valid prediction intervals for ranking the predicted ligands, and Apache Spark to parallelize both the docking and the modeling.

RESULTS: We show on 4 different targets that conformal prediction based virtual screening (CPVS) is able to reduce the number of docked molecules by 62.61% while retaining an accuracy for the top 30 hits of 94% on average and a speedup of 3.7. The implementation is available as open source via GitHub (https://github.com/laeeq80/spark-cpvs) and can be run on high-performance computers as well as on cloud resources.
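
The iterative loop described under CONTRIBUTION can be sketched as follows. This is a simplified, single-process illustration and not the spark-cpvs code: `iterative_screen`, `toy_dock` and `toy_fit` are hypothetical names, the toy "model" is a plain score cutoff standing in for the SVM with conformal prediction, and the toy "efficiency" simply grows with the training set size.

```python
from typing import Callable, Dict, Iterable, Tuple

# A fitted model is represented as (predicts_low, efficiency).
Model = Tuple[Callable[[int], bool], float]

def iterative_screen(
    ligands: Iterable[int],
    dock: Callable[[int], float],
    fit: Callable[[Dict[int, float]], Model],
    batch_size: int,
    target_efficiency: float,
) -> Dict[int, float]:
    """Dock in batches, retrain, and drop ligands predicted low-scoring."""
    pool = list(ligands)
    scores: Dict[int, float] = {}
    while pool:
        # Dock the next batch and add it to the training set.
        batch, pool = pool[:batch_size], pool[batch_size:]
        for ligand in batch:
            scores[ligand] = dock(ligand)
        # Retrain on everything docked so far.
        predicts_low, efficiency = fit(scores)
        # Exclude undocked ligands that the model predicts as low-scoring.
        pool = [ligand for ligand in pool if not predicts_low(ligand)]
        if efficiency >= target_efficiency:
            break
    # Dock whatever was not excluded by the final model.
    for ligand in pool:
        scores[ligand] = dock(ligand)
    return scores

def toy_dock(ligand: int) -> float:
    # Stand-in for a docking run: the ligand id is its own score.
    return float(ligand)

def toy_fit(scores: Dict[int, float]) -> Model:
    # Stand-in model: anything 30 units below the best score seen is "low".
    cutoff = max(scores.values()) - 30.0
    efficiency = min(1.0, len(scores) / 30)
    return (lambda ligand: ligand < cutoff), efficiency

# A fixed permutation of 0..99 so the example is deterministic.
ligands = [(i * 37) % 100 for i in range(100)]
scores = iterative_screen(
    ligands, toy_dock, toy_fit, batch_size=10, target_efficiency=1.0
)
```

In this toy run the ten best ligands are always docked, while ligands excluded in an early iteration are never docked at all, which is where the reduction in docking work comes from.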

Keywords
Apache Spark, Cloud computing, Conformal prediction, Docking, Virtual screening
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-343980 (URN); 10.1186/s13321-018-0265-z (DOI); 000426699400001 (ISI); 29492726 (PubMedID)
Funder
eSSENCE - An eScience Collaboration; Swedish e-Science Research Center; Swedish National Infrastructure for Computing (SNIC), b2015245; Swedish National Infrastructure for Computing (SNIC), SNIC 2017/13-6
Available from: 2018-03-03. Created: 2018-03-03. Last updated: 2018-05-14. Bibliographically approved
Kensert, A., Alvarsson, J., Norinder, U. & Spjuth, O. (2018). Evaluating parameters for ligand-based modeling with random forest on sparse data sets. Journal of Cheminformatics, 10, Article ID 49.
2018 (English). In: Journal of Cheminformatics, ISSN 1758-2946, E-ISSN 1758-2946, Vol. 10, article id 49. Article in journal (Refereed). Published
Abstract [en]

Ligand-based predictive modeling is widely used to generate predictive models aiding decision making in e.g. drug discovery projects. With growing data sets and requirements for low modeling time comes the necessity to analyze data sets efficiently to support rapid and robust modeling. In this study we analyzed four data sets and studied the efficiency of machine learning methods on sparse data structures, utilizing Morgan fingerprints of different radii and hash sizes, and compared them with the molecular signatures descriptor at different heights. We specifically evaluated the effect these parameters had on modeling time, predictive performance, and memory requirements using two implementations of random forest: Scikit-learn and FEST. We also compared with a support vector machine implementation. Our results showed that unhashed fingerprints yield significantly better accuracy than hashed fingerprints (p <= 0.05), with no pronounced deterioration in modeling time and memory usage. Furthermore, the fast execution and low memory usage of the FEST algorithm suggest that it is a good alternative for large, high-dimensional sparse data. Support vector machines and random forest performed equally well, but results indicate that the support vector machine was better at using the extra information from larger values of the Morgan fingerprint's radius.
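
One way to see why hashed fingerprints can lose information relative to unhashed ones is the sketch below. It assumes that hashing means folding integer feature identifiers into a fixed number of positions by taking the identifier modulo the hash size; the function name `fold` and the feature identifiers are hypothetical and are not taken from the paper or from any specific toolkit.

```python
from collections import Counter
from typing import Dict

def fold(features: Dict[int, int], hash_size: int) -> Dict[int, int]:
    """Fold sparse {feature_id: count} into hash_size positions."""
    folded: Counter = Counter()
    for feature_id, count in features.items():
        # Distinct identifiers can land on the same position (a collision).
        folded[feature_id % hash_size] += count
    return dict(folded)

# Unhashed (sparse) representation: every substructure keeps its own id.
unhashed = {1025: 1, 1: 2, 7: 1}

# Hashed representation: ids 1025 and 1 collide at position 1.
hashed = fold(unhashed, hash_size=1024)
```

After folding, the two substructures that collided can no longer be told apart by a downstream model, whereas the sparse representation keeps them as separate features.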

Place, publisher, year, edition, pages
BMC, 2018
Keywords
Random forest, Support vector machines, Sparse representation, Fingerprint, Machine learning
National Category
Computer Sciences
Identifiers
urn:nbn:se:uu:diva-368445 (URN); 10.1186/s13321-018-0304-9 (DOI); 000447254000001 (ISI); 30306349 (PubMedID)
Funder
Knut and Alice Wallenberg Foundation; Swedish Research Council Formas; Swedish National Infrastructure for Computing (SNIC), SNIC 2017/7-241
Available from: 2018-12-07. Created: 2018-12-07. Last updated: 2018-12-07. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-8083-2864
