Publications from Uppsala University (uu.se)
Large-scale simulation-based experiments with stochastic models using machine learning-assisted approaches: Applications in systems biology using Markov jump processes
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Scientific Computing; Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computational Science.
ORCID iD: 0000-0002-9417-6618
2021 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Discrete stochastic systems in systems biology, such as biochemical reaction networks, can be modeled as Markov jump processes. The chemical master equation describes how the probability distribution over a biochemical system's states evolves in time. Unfortunately, analytical solutions to the chemical master equation exist only for trivial problems. However, the stochastic simulation algorithm (SSA) can generate exact sample paths. Large-scale simulation-based experiments that vary the model's parameters are computationally intensive, and for high-dimensional models this cost hinders modelers from exploring their models and inferring parameters.
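
To make the SSA concrete, the following is a minimal sketch of the direct method applied to a birth-death process. The model, the function name `gillespie_birth_death`, and the rate values are illustrative assumptions and are not taken from the thesis; the sketch only shows how one exact sample path of a Markov jump process can be generated.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, seed=None):
    """Simulate one sample path of a birth-death process with the SSA.

    Illustrative reactions (an assumption, not a model from the thesis):
        0 -> X   with propensity k_birth
        X -> 0   with propensity k_death * x
    Returns the jump times and the copy number after each jump.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        a_birth = k_birth
        a_death = k_death * x
        a_total = a_birth + a_death
        if a_total == 0.0:
            break  # no reaction can fire, so the state is absorbing
        # The waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick the reaction with probability proportional to its propensity.
        if rng.random() * a_total < a_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states


if __name__ == "__main__":
    times, states = gillespie_birth_death(10.0, 0.1, x0=0, t_end=50.0, seed=1)
    print(f"{len(times) - 1} reaction events, final copy number {states[-1]}")
```

Every reaction event is simulated individually, which is why experiments that sweep over many parameter values become expensive for larger networks.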

This thesis proposes methodologies and tools for model exploration and approximate parameter inference of high-dimensional stochastic models simulated via the SSA. We propose a smart computational workflow that uses machine learning-assisted approaches to enable model exploration of gene regulatory networks, where the objective is to assess the different qualitative behaviors present in the model.

An artificial neural network is proposed for learning the summary statistics used in approximate parameter inference. The neural network can find distinct local features in multivariate time series, which makes it applicable to more complex models involving several biological species. To account for epistemic uncertainty, we further explore Bayesian neural networks for approximate parameter inference. A classification approach is introduced that learns the proposal posterior through an adaptive sampling scheme, ultimately reducing the number of simulations required for the inference task.

We have also developed the software package Sciope to support modelers with machine learning-assisted techniques for model exploration and parameter inference. Sciope also comes with various features, such as experimental designs, traditional ABC algorithms, and a parallel backend to scale large simulation-based experiments from laptops to the cloud.

Finally, to reduce the gap between modelers and biologists, StochSS Live! has been developed. StochSS Live! is a user-friendly web-based platform that enables any practitioner to build biochemical reaction models and perform simulation by ensemble analysis, model exploration, and approximate parameter inference.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2021, p. 68
Series
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology, ISSN 1651-6214 ; 2035
Keywords [en]
bioinformatics, systems biology, stochastic simulation, model exploration, approximate parameter inference, machine learning, distributed computing
National Category
Bioinformatics (Computational Biology); Computational Mathematics
Research subject
Scientific Computing
Identifiers
URN: urn:nbn:se:uu:diva-439782; ISBN: 978-91-513-1194-4 (print); OAI: oai:DiVA.org:uu-439782; DiVA, id: diva2:1543699
Public defence
2021-06-04, 2446 ITC, Lägerhyddsvägen 2, Uppsala, 10:15 (English)
Projects
eSSENCE
Funder
NIH (National Institutes of Health); Göran Gustafsson Foundation for promotion of scientific research at Uppsala University and Royal Institute of Technology
Available from: 2021-05-11 Created: 2021-04-12 Last updated: 2022-10-31
List of papers
1. Smart computational exploration of stochastic gene regulatory network models using human-in-the-loop semi-supervised learning
2019 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 35, p. 5199-5206. Article in journal (Refereed) Published
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-392179 (URN); 10.1093/bioinformatics/btz420 (DOI); 000509361200016 (); 31141124 (PubMedID)
Projects
eSSENCE
Available from: 2019-05-29 Created: 2019-08-30 Last updated: 2021-04-12. Bibliographically approved
2. Convolutional Neural Networks as Summary Statistics for Approximate Bayesian Computation
2022 (English). In: IEEE/ACM Transactions on Computational Biology & Bioinformatics, ISSN 1545-5963, E-ISSN 1557-9964, Vol. 19, no 6, p. 3353-3365. Article in journal (Refereed) Published
Abstract [en]

Approximate Bayesian Computation (ABC) is widely used in systems biology for inferring parameters in stochastic gene regulatory network models. Its performance hinges critically on the ability to summarize high-dimensional system responses, such as time series, into a few informative, low-dimensional summary statistics. The quality of those statistics acutely impacts the accuracy of the inference task. Existing methods for selecting the best subset from a pool of candidate statistics do not scale well to large pools of several tens to hundreds of candidates. Since high-quality statistics are imperative for good performance, this becomes a serious bottleneck when performing inference on complex and high-dimensional problems.

This paper proposes a convolutional neural network architecture for automatically learning informative summary statistics of temporal responses. We show that the proposed network can effectively circumvent the statistics selection problem of the preprocessing step for ABC inference. The proposed approach is demonstrated on two benchmark problems and one challenging inference problem: learning the parameters of a high-dimensional stochastic genetic oscillator. We also study the impact of experimental design on network performance by comparing different data richness and data acquisition strategies.
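
The role that summary statistics play in ABC can be illustrated with a minimal rejection sampler. This sketch is not the method of the paper: it uses a hand-picked statistic (the sample mean) where the paper proposes a learned convolutional network, and the simulator, the uniform prior, and the names `simulate`, `summary`, and `abc_rejection` are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Hypothetical simulator: n waiting times from an exponential with rate theta."""
    return [rng.expovariate(theta) for _ in range(n)]


def summary(data):
    """Hand-picked summary statistic (an assumption): the sample mean."""
    return statistics.fmean(data)


def abc_rejection(observed, prior_low, prior_high, n_draws, epsilon, seed=None):
    """Plain rejection ABC with a uniform prior on [prior_low, prior_high].

    A draw theta is kept when the summary of data simulated under theta lies
    within epsilon of the summary of the observed data.
    """
    rng = random.Random(seed)
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(prior_low, prior_high)
        simulated = simulate(theta, len(observed), rng)
        if abs(summary(simulated) - s_obs) <= epsilon:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    data_rng = random.Random(0)
    observed = simulate(2.0, 200, data_rng)  # synthetic data, true rate 2.0
    posterior = abc_rejection(observed, 0.1, 5.0, 5000, 0.05, seed=1)
    print(f"accepted {len(posterior)} of 5000 draws")
```

The accepted draws approximate the posterior only as well as the statistic captures the information in the data, which is why the choice or learning of `summary` matters so much for richer models.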

Place, publisher, year, edition, pages
IEEE, 2022
Keywords
Approximate inference, ABC, Summary statistics, Neural Network, Regression
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-439778 (URN); 10.1109/TCBB.2021.3108695 (DOI); 000966719600030 (); 34460381 (PubMedID)
Projects
eSSENCE - An eScience Collaboration
Funder
NIH (National Institutes of Health), 2R01EB014877-04A; eSSENCE - An eScience Collaboration; Göran Gustafsson Foundation for promotion of scientific research at Uppsala University and Royal Institute of Technology
Available from: 2021-04-10 Created: 2021-04-10 Last updated: 2023-08-23. Bibliographically approved
3. Robust and integrative Bayesian neural networks for likelihood-free parameter inference
2022 (English). In: 2022 International Joint Conference on Neural Networks (IJCNN), Institute of Electrical and Electronics Engineers (IEEE), 2022, p. 1-10. Conference paper, Published paper (Refereed)
Abstract [en]

State-of-the-art neural network-based methods for learning summary statistics have delivered promising results for simulation-based likelihood-free parameter inference. Existing approaches for learning summarizing networks are mainly based on deterministic neural networks, and do not take network prediction uncertainty into account. This work proposes a robust integrated approach that learns summary statistics using Bayesian neural networks, and produces a proposal posterior density using categorical distributions. An adaptive sampling scheme selects simulation locations to efficiently and iteratively refine the predictive proposal posterior of the network conditioned on observations. This allows for more efficient and robust convergence on comparatively large prior spaces. The approximated proposal posterior can then either be processed through a correction mechanism, or be used in conjunction with a density estimator to arrive at the true posterior. We demonstrate our approach on benchmark examples.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2022
Series
IEEE International Joint Conference on Neural Networks (IJCNN), ISSN 2161-4393, E-ISSN 2161-4407
Keywords
Approximate Bayesian inference, Bayesian neural network, Summary statistics, Adaptive sampling, Classification
National Category
Computational Mathematics
Research subject
Scientific Computing
Identifiers
urn:nbn:se:uu:diva-439780 (URN); 10.1109/IJCNN55064.2022.9892800 (DOI); 000867070907037 (); 978-1-6654-9526-4 (ISBN); 978-1-7281-8671-9 (ISBN)
Conference
2022 International Joint Conference on Neural Networks (IJCNN), 18-23 July 2022, Padua, Italy
Projects
eSSENCE
Funder
eSSENCE - An eScience Collaboration; Science for Life Laboratory, SciLifeLab
Available from: 2021-04-10 Created: 2021-04-10 Last updated: 2023-01-12. Bibliographically approved
4. Scalable machine learning-assisted model exploration and inference using Sciope
2021 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 37, no 2, p. 279-281. Article in journal (Refereed) Published
Abstract [en]

Discrete stochastic models of gene regulatory networks are fundamental tools for in silico study of stochastic gene regulatory networks. Likelihood-free inference and model exploration are critical applications for studying a system using such models. However, the massive computational cost of complex, high-dimensional and stochastic modelling currently limits systematic investigation to relatively simple systems. Recently, machine-learning-assisted methods have shown great promise for handling larger, more complex models. To support both the ease of use of this new class of methods and their further development, we have developed the scalable inference, optimization and parameter exploration (Sciope) toolbox. Sciope is designed to support new algorithms for machine-learning-assisted model exploration and likelihood-free inference. Moreover, it is built from the ground up to easily leverage distributed and heterogeneous computational resources for convenient parallelism across platforms from workstations to clouds.

The Sciope Python3 toolbox is freely available on https://github.com/Sciope/Sciope, and has been tested on Linux, Windows and macOS platforms.

Supplementary information is available at Bioinformatics online.

Place, publisher, year, edition, pages
Oxford University Press, 2021
National Category
Bioinformatics (Computational Biology)
Identifiers
urn:nbn:se:uu:diva-428779 (URN); 10.1093/bioinformatics/btaa673 (DOI); 000649439900023 (); 32706854 (PubMedID)
Projects
eSSENCE
Note

btaa673

Available from: 2020-12-16 Created: 2020-12-16 Last updated: 2024-01-15. Bibliographically approved
5. Epidemiological modeling in StochSS Live!
2021 (English). In: Bioinformatics, ISSN 1367-4803, E-ISSN 1367-4811, Vol. 37, no 17, p. 2787-2788. Article in journal (Refereed) Published
Abstract [en]

We present StochSS Live!, a web-based service for modeling, simulation and analysis of a wide range of mathematical, biological and biochemical systems. Using an epidemiological model of COVID-19, we demonstrate the power of StochSS Live! to enable researchers to quickly develop a deterministic or a discrete stochastic model, infer its parameters and analyze the results.

StochSS Live! is freely available at https://live.stochss.org/

Supplementary data are available at Bioinformatics online.

Place, publisher, year, edition, pages
Oxford University Press (OUP), 2021
National Category
Bioinformatics (Computational Biology)
Research subject
Bioinformatics
Identifiers
urn:nbn:se:uu:diva-439781 (URN); 10.1093/bioinformatics/btab061 (DOI); 000697377500047 (); 33512399 (PubMedID)
Projects
eSSENCE
Funder
eSSENCE - An eScience Collaboration
Note

btab061

Available from: 2021-04-10 Created: 2021-04-10 Last updated: 2024-01-15. Bibliographically approved

Open Access in DiVA

UUthesis_Wrede,F_2021 (1263 kB), 824 downloads
File information
File name: FULLTEXT01.pdf; File size: 1263 kB; Checksum (SHA-512):
ead2bddb073b0a1977c12c8365adeaa13b4689babbc96c5a37941b581538c88cb30b5819d63840c42ec330bf4b91e48564a382c576d1b0d5ea2a60b7d3b2a47c
Type: fulltext; Mimetype: application/pdf

Authority records

Wrede, Fredrik
