2025 (English). In: Computers in Biology and Medicine, ISSN 0010-4825, E-ISSN 1879-0534, Vol. 185, p. 1-14, article id 109498. Article in journal (Refereed). Published.
Abstract [en]
Background and objectives:
Oral cancer is a global health challenge. The disease can be treated successfully if detected early, but the survival rate drops significantly for late-stage cases. There is growing interest in shifting from the current standard of invasive and time-consuming tissue sampling and histological examination towards non-invasive brush biopsies and cytological examination, which facilitates continued monitoring of risk groups. Cost-effective and accurate cytological analysis calls for reliable, computer-assisted, data-driven approaches. However, the infeasibility of accurate cell-level annotation hinders model performance and limits evaluation and interpretation of the results. This study aims to improve AI-based oral cancer detection by introducing additional information through multimodal imaging and deep multimodal information fusion.
Methods:
We combine brightfield and fluorescence whole-slide microscopy imaging to analyze Papanicolaou-stained liquid-based cytology slides of brush biopsies collected from both healthy and cancer patients. Given the challenge of obtaining detailed cytological annotations, we use a weakly supervised deep learning approach relying only on patient-level labels. We evaluate several multimodal information fusion strategies, including early, late, and three recent intermediate fusion methods.
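The early and late fusion baselines mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the feature dimensions, random features, and linear classifier head are all illustrative stand-ins for the paper's learned deep networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-patch feature vectors from the two imaging modalities
# (dimensions and values are illustrative, not from the paper).
bf_feat = rng.normal(size=(4, 128))   # brightfield features, 4 patches
fl_feat = rng.normal(size=(4, 128))   # fluorescence features, 4 patches

def linear_head(x, w, b):
    """Stand-in classifier head: linear logits passed through a sigmoid."""
    z = x @ w + b
    return 1.0 / (1.0 + np.exp(-z))

# Early fusion: concatenate modality features, then classify jointly.
w_early = rng.normal(size=(256,)) * 0.05
p_early = linear_head(np.concatenate([bf_feat, fl_feat], axis=1), w_early, 0.0)

# Late fusion: classify each modality separately, then average the scores.
w_bf = rng.normal(size=(128,)) * 0.05
w_fl = rng.normal(size=(128,)) * 0.05
p_late = 0.5 * (linear_head(bf_feat, w_bf, 0.0) + linear_head(fl_feat, w_fl, 0.0))

print(p_early.shape, p_late.shape)  # one probability per patch for each strategy
```

Intermediate fusion, by contrast, mixes the modalities inside the network, between feature extraction and classification, rather than at either end as above.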
Results:
Our experiments demonstrate that (i) there is substantial diagnostic information to be gained from fluorescence imaging of Papanicolaou-stained cytological samples, and (ii) multimodal information fusion improves classification performance and cancer detection accuracy compared to single-modality approaches. Intermediate fusion emerges as the leading method among the studied approaches. Specifically, the Co-Attention Fusion Network (CAFNet) model achieves an F1 score of 83.34% and an accuracy of 91.79% at cell level, surpassing human performance on the task. Additional tests highlight the importance of accurate image registration for maximizing the benefits of the multimodal analysis.
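The co-attention idea behind models like CAFNet, where each modality attends over the other's representations before fusion, can be sketched as follows. This is a generic cross-attention sketch under assumed token shapes, not the CAFNet architecture itself; all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_tokens, context_tokens):
    """Each query token attends over the other modality's tokens."""
    d = query_tokens.shape[1]
    scores = query_tokens @ context_tokens.T / np.sqrt(d)
    return softmax(scores, axis=1) @ context_tokens

# Illustrative token sets for one brightfield/fluorescence image pair.
bf_tokens = rng.normal(size=(5, 32))  # brightfield tokens (made-up sizes)
fl_tokens = rng.normal(size=(5, 32))  # fluorescence tokens

# Co-attention runs in both directions; the attended representations
# are then pooled and concatenated into one fused feature vector.
bf_attended = cross_attention(bf_tokens, fl_tokens)
fl_attended = cross_attention(fl_tokens, bf_tokens)
fused = np.concatenate([bf_attended.mean(axis=0), fl_attended.mean(axis=0)])
print(fused.shape)  # (64,)
```

A sketch like this also makes the registration point concrete: the attention scores compare tokens across modalities, so spatially misaligned inputs weaken the learned correspondences.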
Conclusion:
This study advances the field of cytopathology by integrating deep learning methods, multimodal imaging, and information fusion to enhance non-invasive early detection of oral cancer. Our approach not only improves diagnostic accuracy but also enables an efficient yet uncomplicated clinical workflow. The developed pipeline has potential applications in other cytological analysis settings. We provide a validated open-source analysis framework and share a unique multimodal oral cancer dataset to support further research and innovation.
Place, publisher, year, edition, pages
Elsevier, 2025
Keywords
Biomedical imaging, Multimodal microscopy, Deep learning, Multimodal information fusion, Artificial intelligence, Cytopathology
National Category
Medical Imaging; Computer Vision and Learning Systems; Cancer and Oncology
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-547008 (URN)
10.1016/j.compbiomed.2024.109498 (DOI)
2-s2.0-85211213571 (Scopus ID)
Funder
Swedish Cancer Society
Stockholm County Council
Vinnova, 2021-01420
Vinnova, 2020-03611
Vinnova, 2017-02447
Swedish Research Council, 2022-03580
Swedish Research Council, 2017-04385
Swedish Research Council, 2022-06725
Swedish Research Council, 2018-05973
Available from: 2025-01-13 Created: 2025-01-13 Last updated: 2025-10-29 Bibliographically approved