Predicting tumour growth-driving interactions from transcriptomic data using machine learning
Uppsala University, Disciplinary Domain of Medicine and Pharmacy, Faculty of Medicine, Department of Immunology, Genetics and Pathology, Neurooncology and neurodegeneration.
2023 (English). Independent thesis, Advanced level (professional degree), 20 credits / 30 HE credits. Student thesis
Abstract [en]

The mortality rate among cancer patients is high, and treatments are effective in only a fraction of patients. To cure more patients, new treatments need to be developed. Immunotherapy activates the immune system to fight cancer, and one form of treatment targets immune checkpoints. If more targets are found, more patients can be treated successfully. In this project, interactions between immune and cancer cells that drive tumour growth were investigated in an attempt to find new potential targets. This was done by creating a machine learning model that identifies genes expressed in cells involved in tumour-driving interactions.

Single-cell RNA sequencing and spatial transcriptomic data from breast cancer patients were used, together with single-cell RNA sequencing data from healthy individuals. The tumour growth rate was estimated from the cumulative expression of G2/M genes. The G2/M-related genes were excluded from the analysis, since they were assumed to be cell cycle genes. The machine learning model was based on a supervised variational autoencoder architecture. This architecture made it possible to compress the input into a low-dimensional representation of the genes, called a latent space, which was able to explain the tumour growth rate. The Optuna hyperparameter optimisation framework was used to find the best combination of hyperparameters for the model. The model achieved an R² score of 0.93, indicating that the latent space explained 93% of the variance in the growth rate.
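As a rough illustration of the two quantities mentioned above, the following is a minimal sketch, not the thesis's implementation. It assumes a cell is represented as a dictionary of gene expression values, uses `TOP2A` and `MKI67` purely as example G2/M markers (the gene list actually used in the thesis is not given in the abstract), and computes R² with the standard coefficient-of-determination formula. The function names are hypothetical.

```python
# Minimal sketch under stated assumptions; names and toy data are hypothetical.

def g2m_score(cell, g2m_genes):
    """Growth-rate proxy: cumulative expression of the G2/M genes in one cell."""
    return sum(cell.get(gene, 0.0) for gene in g2m_genes)


def drop_genes(cell, excluded):
    """Remove the G2/M genes so they are not used as model input."""
    return {gene: value for gene, value in cell.items() if gene not in excluded}


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy example: one cell with two illustrative G2/M markers and one other gene.
G2M_GENES = {"TOP2A", "MKI67"}  # illustrative subset, assumed for this sketch
cell = {"TOP2A": 2.0, "MKI67": 1.0, "CD80": 5.0}

target = g2m_score(cell, G2M_GENES)    # value the model would be trained to predict
features = drop_genes(cell, G2M_GENES)  # input with G2/M genes removed
```

An R² of 1.0 means the predictions reproduce the target exactly, while 0.0 means they do no better than always predicting the mean.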

The latent space consisted of 20 variables. To find out which genes were represented in this latent space, the correlation between each latent variable and each gene was calculated. Genes that were positively or negatively correlated with a latent variable were assumed to be in the latent space and therefore involved in explaining tumour growth. Furthermore, the correlation between each latent variable and the growth rate was calculated. The up- and downregulated genes in each latent variable were kept and used to identify the pathways associated with the different latent variables. Five of the latent variables were involved in immune responses and were therefore investigated further. The genes in these five latent variables were mapped to cell types. One of these latent variables showed an upregulated immune response for positively correlated growth, indicating that immune cells were involved in promoting cancer progression. Another latent variable showed a downregulated immune response for negatively correlated growth. This indicated that if these genes were upregulated instead, the tumour would thrive. The genes found in these latent variables were analysed further. CD80, CSF1, CSF1R, IL26, IL7, IL34 and the protein NF-kappa-B were interesting finds and are known immune modulators. These could possibly be used as markers for pro-tumour immunity. Furthermore, CSF1, CSF1R, IL26, IL34 and the protein NF-kappa-B could potentially be targeted in immunotherapy.
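The gene-selection step described above can be sketched as follows. This is only an illustration: it assumes Pearson correlation and a hypothetical cut-off of 0.5, neither of which is specified in the abstract, and the gene names and values are made-up toy data.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a constant sequence carries no correlation signal
    return cov / (sd_x * sd_y)


def correlated_genes(latent, expression, threshold=0.5):
    """Split genes into positively and negatively correlated with one latent variable.

    `threshold` is a hypothetical cut-off chosen for this sketch.
    """
    positive, negative = [], []
    for gene, values in expression.items():
        r = pearson(latent, values)
        if r >= threshold:
            positive.append(gene)
        elif r <= -threshold:
            negative.append(gene)
    return positive, negative


# Toy data: one latent variable and three made-up genes across four cells.
latent = [0.1, 0.4, 0.8, 1.2]
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 5.0],  # rises with the latent variable
    "GENE_B": [5.0, 4.0, 2.0, 1.0],  # falls as the latent variable rises
    "GENE_C": [2.0, 2.0, 2.0, 2.0],  # constant, so uncorrelated
}
positive, negative = correlated_genes(latent, expression)
```

The same `pearson` function could be applied between each latent variable and the growth rate to decide whether a latent variable is associated with faster or slower growth.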

Place, publisher, year, edition, pages
2023. p. 71
Series
UPTEC X ; 23017
Keywords [en]
cancer, immunology, cell-cell interactions, deep learning, variational autoencoder, supervised variational autoencoder, tumour microenvironment, single-cell RNA sequencing, spatial transcriptomics, breast cancer, machine learning
National Category
Bioinformatics (Computational Biology)
Identifiers
URN: urn:nbn:se:uu:diva-505563
OAI: oai:DiVA.org:uu-505563
DiVA, id: diva2:1771426
External cooperation
SciLifeLab, Karolinska Institutet
Educational program
Molecular Biotechnology Engineering Programme
Supervisors
Examiners
Available from: 2023-06-21. Created: 2023-06-20. Last updated: 2023-06-21. Bibliographically approved

Open Access in DiVA

fulltext (6240 kB)
File information
File name: FULLTEXT01.pdf
File size: 6240 kB
Checksum (SHA-512): 93f277f8a3ce16e3b46ecd840b71ccd7b99064c98582ea5c344a42117f57cb5cf857737d9104ebf5dd9f8d70588a1d05cd1dcd07eaa561150c093ebebaa338c0
Type: fulltext
Mimetype: application/pdf
