2021 (English). In: Document Analysis and Recognition -- ICDAR 2021 / [ed] Lladós J., Lopresti D., Uchida S., Springer, 2021, Vol. 12824, p. 572-586. Conference paper, Published paper (Refereed)
Abstract [en]
Obtaining the original, clean forms of struck-through handwritten words can be of interest to literary scholars focusing on tasks such as genetic criticism. In addition, replacing struck-through words can have a positive impact on text recognition tasks. This work presents a novel unsupervised approach for strikethrough removal from handwritten words, employing cycle-consistent generative adversarial networks (CycleGANs). The removal performance is further improved by extending the network with an attribute-guided approach. Furthermore, two new datasets are introduced for training and evaluating the models: a synthetic multi-writer set based on the IAM database and a genuine single-writer dataset. The experimental results demonstrate the efficacy of the proposed method: the examined attribute-guided models achieve F1 scores above 0.8 on the synthetic test set, improving upon the performance of the regular CycleGAN. Despite being trained exclusively on the synthetic dataset, the examined models even produce convincing cleaned images for genuine struck-through words.
Place, publisher, year, edition, pages
Springer, 2021
Keywords
Strikethrough removal, CycleGAN, Handwritten words, Document image processing
National Category
Computer Vision and Robotics (Autonomous Systems)
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-455889 (URN)
10.1007/978-3-030-86337-1_38 (DOI)
000711880100038 ()
Conference
International Conference on Document Analysis and Recognition (ICDAR)
Funder
Riksbankens Jubileumsfond, P19-0103:1
Swedish Research Council, 2018-05973
2021-10-12, 2021-10-12, 2023-09-05. Bibliographically approved