Publications from Uppsala University

Publications (10 of 144)
Acerbis, M., Sladoje, N. & Lindblad, J. (2025). A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology. In: Jens Petersen; Vedrana Andersen Dahl (Ed.), Image Analysis: 23rd Scandinavian Conference, SCIA 2025, Reykjavik, Iceland, June 23–25, 2025, Proceedings, Part II. Paper presented at 23rd Scandinavian Conference on Image Analysis - SCIA, June 23-25, 2025, Reykjavik, Iceland (pp. 264-277). Cham: Springer
A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology
2025 (English). In: Image Analysis: 23rd Scandinavian Conference, SCIA 2025, Reykjavik, Iceland, June 23–25, 2025, Proceedings, Part II / [ed] Jens Petersen; Vedrana Andersen Dahl, Cham: Springer, 2025, p. 264-277. Conference paper, Published paper (Refereed).
Abstract [en]

Accurate and efficient cell detection is crucial in many biomedical image analysis tasks. We evaluate the performance of several Deep Learning (DL) methods for cell detection in Papanicolaou-stained cytological Whole Slide Images (WSIs), focusing on accuracy of predictions and computational efficiency. We examine recent off-the-shelf algorithms as well as custom-designed detectors, applying them to two datasets: the CNSeg Dataset and the Oral Cancer (OC) Dataset. Our comparison includes well-established segmentation methods such as StarDist, Cellpose, and the Segment Anything Model 2 (SAM2), alongside centroid-based Fully Convolutional Regression Network (FCRN) approaches. We introduce a suitable evaluation metric to assess the accuracy of predictions based on the distance from ground truth positions. We also explore the impact of dataset size and data augmentation techniques on model performance. Results show that centroid-based methods, particularly the Improved Fully Convolutional Regression Network (IFCRN) method, outperform segmentation-based methods in terms of both detection accuracy and computational efficiency. This study highlights the potential of centroid-based detectors as a preferred option for cell detection in resource-limited environments, offering faster processing times and lower GPU memory usage without compromising accuracy.
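
The distance-based evaluation the abstract mentions can be made concrete with a small sketch. The following is a hypothetical illustration, not the paper's exact protocol: predicted centroids are matched one-to-one to ground-truth positions (Hungarian matching), and a prediction counts as a hit only within a chosen radius; the 15-pixel threshold and the matching strategy are assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

def detection_scores(pred, gt, max_dist=15.0):
    """Precision/recall/F1 for predicted vs. ground-truth centroids.
    pred, gt: (N, 2) and (M, 2) arrays of (x, y) positions.
    max_dist: illustrative hit radius in pixels (an assumption)."""
    if len(pred) == 0 or len(gt) == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    d = cdist(pred, gt)                          # pairwise Euclidean distances
    rows, cols = linear_sum_assignment(d)        # optimal one-to-one matching
    tp = int((d[rows, cols] <= max_dist).sum())  # matches within the radius
    precision, recall = tp / len(pred), tp / len(gt)
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}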

Place, publisher, year, edition, pages
Cham: Springer, 2025
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15726
Keywords
Cell Detection, Digital Cytology, Deep Learning, Whole Slide Imaging
National Category
Medical Imaging; Computer graphics and computer vision
Identifiers
urn:nbn:se:uu:diva-567960 (URN); 10.1007/978-3-031-95918-9_19 (DOI); 001553877800019; 2-s2.0-105009771016 (Scopus ID); 978-3-031-95917-2 (ISBN); 978-3-031-95918-9 (ISBN)
Conference
23rd Scandinavian Conference on Image Analysis - SCIA, June 23-25, 2025, Reykjavik, Iceland
Funder
Vinnova, 2021-01420; Swedish Research Council, 2022-03580; Swedish Cancer Society, 22 2353 Pj; Swedish Cancer Society, 22 2357 Pj
Available from: 2025-10-01 Created: 2025-10-01 Last updated: 2025-10-01. Bibliographically approved.
Lian, W., Lindblad, J., Runow Stark, C., Hirsch, J. & Sladoje, N. (2025). Let it shine: Autofluorescence of Papanicolaou-stain improves AI-based cytological oral cancer detection. Computers in Biology and Medicine, 185, 1-14, Article ID 109498.
Let it shine: Autofluorescence of Papanicolaou-stain improves AI-based cytological oral cancer detection
2025 (English). In: Computers in Biology and Medicine, ISSN 0010-4825, E-ISSN 1879-0534, Vol. 185, p. 1-14, article id 109498. Article in journal (Refereed). Published.
Abstract [en]

Background and objectives:

Oral cancer is a global health challenge. The disease can be successfully treated if detected early, but the survival rate drops significantly for late-stage cases. There is a growing interest in a shift from the current standard of invasive and time-consuming tissue sampling and histological examination, towards non-invasive brush biopsies and cytological examination, facilitating continued risk-group monitoring. For cost-effective and accurate cytological analysis there is a great need for reliable computer-assisted, data-driven approaches. However, the infeasibility of accurate cell-level annotation hinders model performance and limits evaluation and interpretation of the results. This study aims to improve AI-based oral cancer detection by introducing additional information through multimodal imaging and deep multimodal information fusion.

Methods:

We combine brightfield and fluorescence whole slide microscopy imaging to analyze Papanicolaou-stained liquid-based cytology slides of brush biopsies collected from both healthy and cancer patients. Given the challenge of detailed cytological annotations, we utilize a weakly supervised deep learning approach relying only on patient-level labels. We evaluate various multimodal information fusion strategies, including early, late, and three recent intermediate fusion methods.
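
As a rough illustration of where the fusion strategies differ, the sketch below contrasts early fusion (modalities stacked at the input) with late fusion (separate encoders whose features are combined). It is a minimal sketch under assumed toy architectures; the intermediate-fusion methods studied in the paper, such as co-attention in CAFNet, operate between these extremes and are not reproduced here.

import torch
import torch.nn as nn

def make_encoder(in_ch):
    # Toy encoder standing in for a real backbone (an assumption).
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),   # -> (B, 32) feature vector
    )

class EarlyFusion(nn.Module):
    """Concatenate modalities at the input, then encode once."""
    def __init__(self):
        super().__init__()
        self.encoder = make_encoder(in_ch=6)     # 3 + 3 stacked channels
        self.head = nn.Linear(32, 2)
    def forward(self, brightfield, fluorescence):
        x = torch.cat([brightfield, fluorescence], dim=1)
        return self.head(self.encoder(x))

class LateFusion(nn.Module):
    """Encode each modality separately, then fuse the feature vectors."""
    def __init__(self):
        super().__init__()
        self.enc_bf = make_encoder(3)
        self.enc_fl = make_encoder(3)
        self.head = nn.Linear(64, 2)
    def forward(self, brightfield, fluorescence):
        z = torch.cat([self.enc_bf(brightfield), self.enc_fl(fluorescence)], dim=1)
        return self.head(z)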

Results:

Our experiments demonstrate that: (i) there is substantial diagnostic information to gain from fluorescence imaging of Papanicolaou-stained cytological samples, (ii) multimodal information fusion improves classification performance and cancer detection accuracy, compared to single-modality approaches. Intermediate fusion emerges as the leading method among the studied approaches. Specifically, the Co-Attention Fusion Network (CAFNet) model achieves impressive results, with an F1 score of 83.34% and an accuracy of 91.79% at cell level, surpassing human performance on the task. Additional tests highlight the importance of accurate image registration to maximize the benefits of the multimodal analysis.

Conclusion:

This study advances the field of cytopathology by integrating deep learning methods, multimodal imaging and information fusion to enhance non-invasive early detection of oral cancer. Our approach not only improves diagnostic accuracy, but also allows an efficient, yet uncomplicated, clinical workflow. The developed pipeline has potential applications in other cytological analysis settings. We provide a validated open-source analysis framework and share a unique multimodal oral cancer dataset to support further research and innovation.

Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Biomedical imaging, Multimodal microscopy, Deep learning, Multimodal information fusion, Artificial intelligence, Cytopathology
National Category
Medical Imaging; Computer Vision and Learning Systems; Cancer and Oncology
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-547008 (URN); 10.1016/j.compbiomed.2024.109498 (DOI); 2-s2.0-85211213571 (Scopus ID)
Funder
Swedish Cancer Society; Stockholm County Council; Vinnova, 2021-01420; Vinnova, 2020-03611; Vinnova, 2017-02447; Swedish Research Council, 2022-03580; Swedish Research Council, 2017-04385; Swedish Research Council, 2022-06725; Swedish Research Council, 2018-05973
Available from: 2025-01-13 Created: 2025-01-13 Last updated: 2025-10-29. Bibliographically approved.
Fraile, M., Calvo-Barajas, N., Apeiron, A. S., Varni, G., Lindblad, J., Sladoje, N. & Castellano, G. (2025). UpStory: the Uppsala Storytelling dataset. Frontiers in Robotics and AI, 12, Article ID 1547578.
UpStory: the Uppsala Storytelling dataset
2025 (English). In: Frontiers in Robotics and AI, E-ISSN 2296-9144, Vol. 12, article id 1547578. Article in journal (Refereed). Published.
Abstract [en]

Friendship and rapport play an important role in the formation of constructive social interactions, and have been widely studied in education due to their impact on learning outcomes. Given the growing interest in automating the analysis of such phenomena through Machine Learning, access to annotated interaction datasets is highly valuable. However, no dataset on child-child interactions explicitly capturing rapport currently exists. Moreover, despite advances in the automatic analysis of human behavior, no previous work has addressed the prediction of rapport in child-child interactions in educational settings. We present UpStory, the Uppsala Storytelling dataset: a novel dataset of naturalistic dyadic interactions between primary-school-aged children, with an experimental manipulation of rapport. Pairs of children aged 8–10 participate in a task-oriented activity: designing a story together, while being allowed free movement within the play area. We promote balanced collection of different levels of rapport by using a within-subjects design: self-reported friendships are used to pair each child twice, either minimizing or maximizing pair separation in the friendship network. The dataset contains data for 35 pairs, totaling 3 h 40 m of audiovisual recordings. It includes two video sources and separate voice recordings per child. An anonymized version of the dataset is made publicly available, containing per-frame head pose, body pose, and face features. Finally, we confirm the informative power of the UpStory dataset by establishing baselines for the prediction of rapport. A simple approach achieves 68% test accuracy using data from one child, and 70% test accuracy aggregating data from a pair.
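
To make the reported baselines concrete, the sketch below shows one plausible form of such a "simple approach": per-frame features are average-pooled over a session and classified linearly, with the pair variant concatenating both children's descriptors. Feature content, pooling, and classifier are illustrative assumptions, not necessarily the paper's exact setup.

import numpy as np
from sklearn.linear_model import LogisticRegression

def session_descriptor(frames):
    # frames: (T, D) per-frame head pose / body pose / face features
    return frames.mean(axis=0)                 # temporal average pooling

def pair_descriptor(frames_a, frames_b):
    # Aggregate data from both children in a pair
    return np.concatenate([session_descriptor(frames_a),
                           session_descriptor(frames_b)])

def fit_rapport_baseline(X, y):
    # X: (n_sessions, D or 2D) descriptors; y: high/low-rapport labels
    return LogisticRegression(max_iter=1000).fit(X, y)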

Place, publisher, year, edition, pages
Frontiers Media S.A., 2025
Keywords
child-child interaction, multimodal dataset, machine learning, rapport, social signals
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:uu:diva-539568 (URN); 10.3389/frobt.2025.1547578 (DOI); 001543085400001; 40761769 (PubMedID); 2-s2.0-105012487451 (Scopus ID)
Funder
Swedish Research Council, 2020-03167
Available from: 2024-10-01 Created: 2024-10-01 Last updated: 2025-08-20. Bibliographically approved.
Fraile, M., Varni, G., Lindblad, J., Sladoje, N. & Castellano, G. (2024). Are We Friends?: End-to-End Prediction of Child Rapport in Guided Play. In: Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi (Ed.), Computer Vision – ECCV 2024 Workshops: Milan, Italy, September 29–October 4, 2024, Proceedings, Part XV. Paper presented at The 18th European Conference on Computer Vision ECCV 2024, 29 September-4 October, 2024, Milano, Italy (pp. 380-392). Cham: Springer
Are We Friends?: End-to-End Prediction of Child Rapport in Guided Play
2024 (English). In: Computer Vision – ECCV 2024 Workshops: Milan, Italy, September 29–October 4, 2024, Proceedings, Part XV / [ed] Alessio Del Bue; Cristian Canton; Jordi Pont-Tuset; Tatiana Tommasi, Cham: Springer, 2024, p. 380-392. Conference paper, Published paper (Refereed).
Abstract [en]

Close, fulfilling interactions with other classmates are an important part of a child’s learning experience, and have been shown to improve educational outcomes. This is especially apparent in guided play activities in which children need to coordinate to succeed. However, until the recent publication of the UpStory dataset, no child-child interaction dataset with explicit control for rapport was available, leading to a lack of methods for automatic rapport prediction in pair play. In this study, we perform a first-of-its-kind evaluation of end-to-end Computer Vision techniques for child-child rapport prediction, and compare our Deep Learning-based results to the feature-based Machine Learning approaches reported in the UpStory paper. The results show that, under a thorough training and evaluation procedure, end-to-end learning underperforms compared to feature-based methods.

Place, publisher, year, edition, pages
Cham: Springer, 2024
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15637
Keywords
Child-child interaction, End-to-end learning, Video analysis, Social signals, Multimodal cues
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:uu:diva-539580 (URN); 10.1007/978-3-031-91581-9_27 (DOI); 001544987000027; 2-s2.0-105014494236 (Scopus ID); 978-3-031-91580-2 (ISBN); 978-3-031-91581-9 (ISBN)
Conference
The 18th European Conference on Computer Vision ECCV 2024, 29 September-4 October, 2024, Milano, Italy
Available from: 2024-10-01 Created: 2024-10-01 Last updated: 2025-10-07. Bibliographically approved.
Rudraiah, P. S., Camacho, R., Fernandez-Rodriguez, J., Fixler, D., Grimm, J., Gruber, F., . . . Zoratto, S. (2024). Correlated multimodal imaging in life sciences: lessons learnt. Frontiers in Biomaterials Science, 3
Correlated multimodal imaging in life sciences: lessons learnt
2024 (English). In: Frontiers in Biomaterials Science, E-ISSN 2813-3749, Vol. 3. Article, review/survey (Refereed). Published.
Abstract [en]

Correlated Multimodal Imaging (CMI) gathers information about the same specimen with two or more modalities that, combined, create a composite and complementary view of the sample (including insights into structure, function, dynamics and molecular composition). CMI allows one to reach beyond what is possible with a single modality, describing biomedical processes within their overall spatio-temporal context and gaining a mechanistic understanding of cells, tissues, and organisms in health and disease by untangling their molecular mechanisms within their native environment. The field of CMI has grown substantially over the last decade, and previously unanswerable biological questions have been solved by applying novel CMI workflows. To disseminate these workflows and comprehensively share the scattered knowledge present within the CMI community, an initiative was started to bring together imaging, image analysis, and biomedical scientists and to work towards an open community that promotes and disseminates the field of CMI. This community project was funded for the last 4 years by an EU COST Action called COMULIS (COrrelated MUltimodal imaging in the LIfe Sciences). In this review we share some of the showcases and lessons learnt from the action. We also briefly look ahead at how we anticipate building on this initiative.

Place, publisher, year, edition, pages
Frontiers Media S.A., 2024
Keywords
correlated multimodal imaging, preclinical hybrid imaging, correlated light and electron microscopy, image registration, correlation software, image databases and repositories
National Category
Medical Imaging; Computer Vision and Learning Systems
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-547014 (URN); 10.3389/fbiom.2024.1338115 (DOI)
Available from: 2025-01-13 Created: 2025-01-13 Last updated: 2025-06-17. Bibliographically approved.
Breznik, E., Wetzer, E., Lindblad, J. & Sladoje, N. (2024). Cross-modality sub-image retrieval using contrastive multimodal image representations. Scientific Reports, 14(1), Article ID 18798.
Cross-modality sub-image retrieval using contrastive multimodal image representations
2024 (English). In: Scientific Reports, E-ISSN 2045-2322, Vol. 14, no 1, article id 18798. Article in journal (Refereed). Published.
Abstract [en]

In tissue characterization and cancer diagnostics, multimodal imaging has emerged as a powerful technique. Thanks to computational advances, large datasets can be exploited to discover patterns in pathologies and improve diagnosis. However, this requires efficient and scalable image retrieval methods. Cross-modality image retrieval is particularly challenging, since images of similar (or even the same) content captured by different modalities might share few common structures. We propose a new application-independent content-based image retrieval (CBIR) system for reverse (sub-)image search across modalities, which combines deep learning to generate representations (embedding the different modalities in a common space) with robust feature extraction and bag-of-words models for efficient and reliable retrieval. We illustrate its advantages through a replacement study, exploring a number of feature extractors and learned representations, as well as through comparison to recent (cross-modality) CBIR methods. For the task of (sub-)image retrieval on a (publicly available) dataset of brightfield and second harmonic generation microscopy images, the results show that our approach is superior to all tested alternatives. We discuss the shortcomings of the compared methods and observe the importance of equivariance and invariance properties of the learned representations and feature extractors in the CBIR pipeline. Code is available at: https://github.com/MIDA-group/CrossModal_ImgRetrieval.
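
As a rough illustration of the bag-of-words stage of such a pipeline (the authors' actual code is at the URL above), the sketch below clusters local descriptors into a visual vocabulary, represents each image as a normalized word histogram, and ranks database images by cosine similarity. Vocabulary size and the descriptor source are assumptions.

import numpy as np
from sklearn.cluster import KMeans

def build_vocabulary(all_descriptors, n_words=256):
    # all_descriptors: (N, D) local features pooled from the training set
    return KMeans(n_clusters=n_words, n_init=10).fit(all_descriptors)

def bow_histogram(descriptors, vocab):
    words = vocab.predict(descriptors)          # assign features to visual words
    h = np.bincount(words, minlength=vocab.n_clusters).astype(float)
    return h / (np.linalg.norm(h) + 1e-9)       # L2-normalize the histogram

def rank_database(query_descriptors, db_histograms, vocab):
    # db_histograms: (n_images, n_words) of precomputed, normalized histograms
    q = bow_histogram(query_descriptors, vocab)
    scores = db_histograms @ q                  # cosine similarity
    return np.argsort(-scores)                  # best matches first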

Place, publisher, year, edition, pages
Springer Nature, 2024
National Category
Medical Imaging
Identifiers
urn:nbn:se:uu:diva-470293 (URN); 10.1038/s41598-024-68800-1 (DOI); 001318393400020; 39138271 (PubMedID)
Note

These authors contributed equally: Eva Breznik and Elisabeth Wetzer

Available from: 2022-03-22 Created: 2022-03-22 Last updated: 2025-02-09. Bibliographically approved.
Koriakina, N., Sladoje, N., Basic, V. & Lindblad, J. (2024). Deep multiple instance learning versus conventional deep single instance learning for interpretable oral cancer detection. PLOS ONE, 19(4), Article ID e0302169.
Deep multiple instance learning versus conventional deep single instance learning for interpretable oral cancer detection
2024 (English). In: PLOS ONE, E-ISSN 1932-6203, Vol. 19, no 4, article id e0302169. Article in journal (Refereed). Published.
Abstract [en]

The current medical standard for setting an oral cancer (OC) diagnosis is histological examination of a tissue sample taken from the oral cavity. This process is time-consuming and more invasive than an alternative approach of acquiring a brush sample followed by cytological analysis. Using a microscope, skilled cytotechnologists are able to detect changes due to malignancy; however, introducing this approach into clinical routine is associated with challenges such as a lack of resources and experts. To design a trustworthy OC detection system that can assist cytotechnologists, we are interested in deep learning based methods that can reliably detect cancer, given only per-patient labels (thereby minimizing annotation bias), and also provide information regarding which cells are most relevant for the diagnosis (thereby enabling supervision and understanding). In this study, we perform a comparison of two approaches suitable for OC detection and interpretation: (i) a conventional single instance learning (SIL) approach and (ii) a modern multiple instance learning (MIL) method. To facilitate systematic evaluation of the considered approaches, we, in addition to a real OC dataset with patient-level ground truth annotations, also introduce a synthetic dataset, PAP-QMNIST. This dataset shares several properties of OC data, such as image size and a large and varied number of instances per bag, and may therefore act as a proxy model of a real OC dataset, while, in contrast to OC data, it offers reliable per-instance ground truth, as defined by design. PAP-QMNIST has the additional advantage of being visually interpretable for non-experts, which simplifies analysis of the behavior of methods. For both OC and PAP-QMNIST data, we evaluate the performance of the methods utilizing three different neural network architectures. Our study indicates, somewhat surprisingly, that on both synthetic and real data, the performance of the SIL approach is better than or equal to that of the MIL approach. Visual examination by cytotechnologists indicates that the methods manage to identify cells which deviate from normality, including malignant cells as well as those suspicious for dysplasia. We share the code as open source.
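
The contrast between the two paradigms can be summarized in a short sketch. The attention pooling below follows the widely used formulation of Ilse et al.; dimensions and heads are illustrative assumptions rather than the paper's exact models.

import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """MIL: pool cell embeddings into one bag (patient) representation."""
    def __init__(self, d=128):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, 1))
        self.head = nn.Linear(d, 2)
    def forward(self, h):                        # h: (n_cells, d) bag
        a = torch.softmax(self.attn(h), dim=0)   # per-cell attention weights
        z = (a * h).sum(dim=0)                   # weighted bag embedding
        return self.head(z), a.squeeze(-1)       # bag logits + cell relevance

def sil_bag_prediction(cell_logits):
    """SIL: classify each cell independently, then aggregate predictions."""
    probs = torch.softmax(cell_logits, dim=1)    # cell_logits: (n_cells, 2)
    return probs.mean(dim=0)                     # patient-level prediction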

Place, publisher, year, edition, pages
Public Library of Science (PLoS), 2024
National Category
Other Computer and Information Science; Medical Imaging; Computer graphics and computer vision
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-527711 (URN); 10.1371/journal.pone.0302169 (DOI); 001214105000035; 38687694 (PubMedID); 2-s2.0-85191914130 (Scopus ID)
Funder
Vinnova, 2017-02447; Vinnova, 2020-03611; Swedish Research Council, 2017-04385; Vinnova, 2021-01420; Swedish Cancer Society, 22 2353 Pj; Swedish Cancer Society, 22 2357 Pj
Available from: 2024-05-06 Created: 2024-05-06 Last updated: 2025-02-10. Bibliographically approved.
Chatterjee, S., Göksel, O., Sladoje, N. & Lindblad, J. (2024). Detection of Extremely Sparse Key Instances in Whole Slide Cytology Images via Self-supervised One-class Representation Learning. In: Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal (Ed.), Pattern Recognition: 27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024, Proceedings, Part XXVII. Paper presented at 27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024 (pp. 408-421). Springer Nature
Detection of Extremely Sparse Key Instances in Whole Slide Cytology Images via Self-supervised One-class Representation Learning
2024 (English). In: Pattern Recognition: 27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024, Proceedings, Part XXVII / [ed] Apostolos Antonacopoulos, Subhasis Chaudhuri, Rama Chellappa, Cheng-Lin Liu, Saumik Bhattacharya, Umapada Pal, Springer Nature, 2024, p. 408-421. Conference paper, Published paper (Refereed).
Abstract [en]

Whole slide pathological image classification using slide-level labels often relies on multiple instance learning. Such approaches are particularly challenging to apply to whole slide cytology images, where the vast number of instances can make it difficult to identify key instances, especially when they are scarce. In this work we evaluate whether representations learnt from patches of only normal slides are effective for instance-level decision making. We aim for interpretable slide-level decision making for whole slide cytology images. We focus on the effectiveness of a self-supervised contrastive learning framework within a one-class classifier setting, assessing its ability to learn the appearances of normal cells from a limited number of normal slides and subsequently identify abnormal cells (key instances) on test slides. We evaluate our approach on a publicly available cytology dataset, achieving a Recall@400 score of 0.1938, considerably improving over the 0.1109 score obtained using a weakly supervised approach.
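
The Recall@400 figure can be read as follows: rank all cells by the one-class model's anomaly score and measure the fraction of truly abnormal cells that land in the top 400. A minimal sketch, with illustrative variable names:

import numpy as np

def recall_at_k(scores, is_abnormal, k=400):
    # scores: (N,) anomaly scores; is_abnormal: (N,) boolean ground truth
    top_k = np.argsort(-scores)[:k]              # indices of the K highest scores
    hits = is_abnormal[top_k].sum()              # key instances found in top K
    return hits / max(int(is_abnormal.sum()), 1)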

Place, publisher, year, edition, pages
Springer Nature, 2024
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 15327
National Category
Computer Vision and Learning Systems; Cancer and Oncology
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-547013 (URN); 10.1007/978-3-031-78398-2_27 (DOI); 001565106100027; 2-s2.0-85211813543 (Scopus ID); 978-3-031-78397-5 (ISBN); 978-3-031-78398-2 (ISBN)
Conference
27th International Conference, ICPR 2024, Kolkata, India, December 1–5, 2024
Funder
Swedish Cancer Society, 22 2353; Swedish Cancer Society, 22 2357; Vinnova, 2020-03611
Available from: 2025-01-13 Created: 2025-01-13 Last updated: 2025-11-25. Bibliographically approved.
Hirsch, J., Sandy, R., Hasséus, B. & Lindblad, J. (2023). A paradigm shift in the prevention and diagnosis of oral squamous cell carcinoma. Journal of Oral Pathology & Medicine, 52(9), 826-833
A paradigm shift in the prevention and diagnosis of oral squamous cell carcinoma
2023 (English). In: Journal of Oral Pathology & Medicine, ISSN 0904-2512, E-ISSN 1600-0714, Vol. 52, no 9, p. 826-833. Article in journal (Refereed). Published.
Abstract [en]

BACKGROUND: Oral squamous cell carcinoma (OSCC) is a widespread disease with only 50%-60% 5-year survival. Individuals with potentially malignant precursor lesions are at high risk.

METHODS: Survival could be increased by effective, affordable, and simple screening methods, along with a shift from incisional tissue biopsies to non-invasive brush biopsies for cytology diagnosis, which are easy to perform in primary care. Along with the explainable, fast, and objective artificial intelligence characterisation of cells through deep learning, an easy-to-use, rapid, and cost-effective methodology for finding high-risk lesions is achievable. The collection of cytology samples offers the further opportunity of explorative genomic analysis.

RESULTS: Our prospective multicentre study of patients with leukoplakia yields a vast number of oral keratinocytes. In addition to cytopathological analysis, whole-slide imaging and the training of deep neural networks, samples are analysed according to a single-cell RNA sequencing protocol, enabling mapping of the entire keratinocyte transcriptome. Mapping the changes in the genetic profile, based on mRNA expression, facilitates the identification of biomarkers that predict cancer transformation.

CONCLUSION: This position paper highlights non-invasive methods for identifying patients with oral mucosal lesions at risk of malignant transformation. Reliable non-invasive methods for screening at-risk individuals bring the early diagnosis of OSCC within reach. The use of biomarkers to decide on a targeted therapy is most likely to improve the outcome. With the large-scale collection of samples following patients over time, combined with genomic analysis and modern machine-learning-based approaches for finding patterns in data, this path holds great promise.

Place, publisher, year, edition, pages
John Wiley & Sons, 2023
Keywords
brush biopsies, explainable AI, oral cancer, oral keratinocytes, precision medicine
National Category
Cancer and Oncology
Identifiers
urn:nbn:se:uu:diva-515133 (URN); 10.1111/jop.13484 (DOI); 001067002000001; 37710407 (PubMedID)
Available from: 2023-10-27 Created: 2023-10-27 Last updated: 2024-01-24. Bibliographically approved.
Wetzer, E., Lindblad, J. & Sladoje, N. (2023). Can Representation Learning for Multimodal Image Registration be Improved by Supervision of Intermediate Layers?. In: IbPRIA 2023: Pattern Recognition and Image Analysis. Paper presented at Iberian Conference on Pattern Recognition and Image Analysis (pp. 261-275). Springer, 14062
Can Representation Learning for Multimodal Image Registration be Improved by Supervision of Intermediate Layers?
2023 (English). In: IbPRIA 2023: Pattern Recognition and Image Analysis, Springer, 2023, Vol. 14062, p. 261-275. Conference paper, Poster (with or without abstract) (Refereed).
Abstract [en]

Multimodal imaging and correlative analysis typically require image alignment. Contrastive learning can generate representations of multimodal images, reducing the challenging task of multimodal image registration to a monomodal one. Previously, additional supervision on intermediate layers in contrastive learning has improved biomedical image classification. We evaluate whether a similar approach improves the representations learned for registration, and thereby registration performance. We explore three approaches to add contrastive supervision to the latent features of the bottleneck layer in the U-Nets encoding the multimodal images, and evaluate three different critic functions. Our results show that representations learned without additional supervision on latent features perform best in the downstream task of registration on two public biomedical datasets. We investigate the performance drop by exploiting recent insights from contrastive learning in classification and self-supervised learning. We visualize the spatial relations of the learned representations by means of multidimensional scaling, and show that additional supervision on the bottleneck layer can lead to partial dimensional collapse of the intermediate embedding space.
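
One common choice of critic in this setting is the InfoNCE objective between paired embeddings of the two modalities; whether it is applied only at the bottleneck or also to intermediate-layer features is exactly the design choice studied above. The sketch below is a standard InfoNCE implementation under assumed batch pairing; it is not claimed to be any of the paper's three specific critics.

import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    # z1, z2: (B, D) embeddings of corresponding patches in the two modalities;
    # the temperature value is an illustrative assumption.
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature           # (B, B) similarity matrix
    targets = torch.arange(z1.size(0), device=z1.device)
    # Matching pairs lie on the diagonal; other entries act as negatives.
    return F.cross_entropy(logits, targets)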

Place, publisher, year, edition, pages
Springer, 2023
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 14062
Keywords
Contrastive learning, Multimodal image registration, Digital pathology
National Category
Medical Imaging Computer graphics and computer vision
Research subject
Computerized Image Processing; Machine learning
Identifiers
urn:nbn:se:uu:diva-510271 (URN); 10.1007/978-3-031-36616-1_21 (DOI); 978-3-031-36615-4 (ISBN); 978-3-031-36616-1 (ISBN)
Conference
Iberian Conference on Pattern Recognition and Image Analysis
Funder
Vinnova, 2017-02447; Vinnova, 2020-03611; Vinnova, 2021-01420
Available from: 2023-08-25 Created: 2023-08-25 Last updated: 2025-02-09
Projects
Image analysis for reliable and cost effective cancer detection [2015-05878_VR]; Uppsala University; Publications
Bajic, B., Lindblad, J. & Sladoje, N. (2019). Sparsity promoting super-resolution coverage segmentation by linear unmixing in presence of blur and noise. Journal of Electronic Imaging (JEI), 28(1), Article ID 013046.
Sparse modelling and deep learning for improved Fourier ptychographic microscopy with biomedical applications [2017-04385_VR]; Uppsala University; Publications
Koriakina, N., Sladoje, N., Basic, V. & Lindblad, J. (2024). Deep multiple instance learning versus conventional deep single instance learning for interpretable oral cancer detection. PLOS ONE, 19(4), Article ID e0302169.
Bajic, B., Lindblad, J. & Sladoje, N. (2019). Sparsity promoting super-resolution coverage segmentation by linear unmixing in presence of blur and noise. Journal of Electronic Imaging (JEI), 28(1), Article ID 013046.
Interpretable AI-based multispectral 3D-analysis of cell interrelations in cancer microenvironment for patient-specific therapy [2022-03580_VR]; Uppsala University; Publications
Acerbis, M., Sladoje, N. & Lindblad, J. (2025). A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology. In: Jens Petersen; Vedrana Andersen Dahl (Ed.), Image Analysis: 23rd Scandinavian Conference, SCIA 2025, Reykjavik, Iceland, June 23–25, 2025, Proceedings, Part II. Paper presented at 23rd Scandinavian Conference on Image Analysis - SCIA, June 23-25, 2025, Reykjavik, Iceland (pp. 264-277). Cham: Springer
Identifiers
ORCID iD: orcid.org/0000-0001-7312-8222