Publications from Uppsala University
Publications (10 of 18)
Cheng, L., Frankemölle, J., Axelsson, A. & Vats, E. (2024). Uncovering the Handwritten Text in the Margins: End-to-end Handwritten Text Detection and Recognition. In: Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024). Paper presented at 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024) (pp. 111-120).
2024 (English). In: Proceedings of the 8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024), 2024, p. 111-120. Conference paper, Published paper (Refereed)
National Category
Natural Language Processing
Identifiers
urn:nbn:se:uu:diva-528458 (URN)
Conference
8th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2024)
Funder
Kjell and Marta Beijer Foundation
Available from: 2024-05-22 Created: 2024-05-22 Last updated: 2025-02-07 Bibliographically approved
La Mela, M. & Vats, E. (2023). Automatic classification of historical texts using a BERT model: News about wild berries, 1860-1910. In: Book of Abstracts, DH Benelux 2023, May 31-June 2, Brussels, Belgium. Paper presented at DH Benelux 2023, May 31-June 2, Brussels, Belgium (pp. 1-4).
2023 (English). In: Book of Abstracts, DH Benelux 2023, May 31-June 2, Brussels, Belgium, 2023, p. 1-4. Conference paper, Oral presentation with published abstract (Refereed)
Keywords
newspapers, classification, machine learning, BERT
National Category
History; Natural Language Processing
Research subject
History; Computer Science
Identifiers
urn:nbn:se:uu:diva-514487 (URN); 10.5281/zenodo.7990441 (DOI)
Conference
DH Benelux 2023, May 31-June 2, Brussels, Belgium
Available from: 2023-10-17 Created: 2023-10-17 Last updated: 2025-04-10
Kass, D. & Vats, E. (2022). AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks. In: Uchida, S., Barney, E. & Eglin, V. (Ed.), Document Analysis Systems, DAS 2022. Paper presented at 15th IAPR International Workshop on Document Analysis Systems (DAS), May 22-25, 2022, La Rochelle Univ, La Rochelle, France (pp. 507-522). Springer Nature, 13237
2022 (English). In: Document Analysis Systems, DAS 2022 / [ed] Uchida, S., Barney, E., Eglin, V., Springer Nature, 2022, Vol. 13237, p. 507-522. Conference paper, Published paper (Refereed)
Abstract [en]

This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome training data scarcity, this work leverages models pre-trained on scene text images as a starting point towards tailoring the handwriting recognition models. ResNet feature extraction and bidirectional LSTM-based sequence modeling stages together form an encoder. The prediction stage consists of a decoder and a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on a novel multi-writer dataset Imgur5K and the IAM dataset. The experimental results evaluate the performance of the HTR framework, further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available at GitHub (https://github.com/dmitrijsk/AttentionHTR).

Place, publisher, year, edition, pages
Springer Nature, 2022
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Handwritten text recognition, Attention encoder-decoder networks, Sequence-to-sequence model, Transfer learning, Multi-writer
National Category
Computer graphics and computer vision; Computer Sciences
Identifiers
urn:nbn:se:uu:diva-488224 (URN); 10.1007/978-3-031-06555-2_34 (DOI); 000870314500034 (); 978-3-031-06555-2 (ISBN); 978-3-031-06554-5 (ISBN)
Conference
15th IAPR International Workshop on Document Analysis Systems (DAS), MAY 22-25, 2022, La Rochelle Univ, La Rochelle, FRANCE
Funder
Swedish National Infrastructure for Computing (SNIC), SNIC 2021/7-47
Available from: 2022-11-14 Created: 2022-11-14 Last updated: 2025-02-01 Bibliographically approved
Heil, R., Vats, E. & Hast, A. (2022). Paired Image to Image Translation for Strikethrough Removal from Handwritten Words. In: Uchida, S., Barney, E. & Eglin, V. (Ed.), Document Analysis Systems, DAS 2022. Paper presented at 15th IAPR International Workshop on Document Analysis Systems (DAS), May 22-25, 2022, La Rochelle Univ, La Rochelle, France (pp. 309-322). Springer Nature, 13237
2022 (English). In: Document Analysis Systems, DAS 2022 / [ed] Uchida, S., Barney, E., Eglin, V., Springer Nature, 2022, Vol. 13237, p. 309-322. Conference paper, Published paper (Refereed)
Abstract [en]

Transcribing struck-through, handwritten words, for example for the purpose of genetic criticism, can pose a challenge to both humans and machines, due to the obstructive properties of the superimposed strokes. This paper investigates the use of paired image to image translation approaches to remove strikethrough strokes from handwritten words. Four different neural network architectures are examined, ranging from a few simple convolutional layers to deeper ones, employing Dense blocks. Experimental results, obtained from one synthetic and one genuine paired strikethrough dataset, confirm that the proposed paired models outperform the CycleGAN-based state of the art, while using less than a sixth of the trainable parameters.

Place, publisher, year, edition, pages
Springer Nature, 2022
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords
Strikethrough removal, Paired image to image translation, Handwritten words, Document image processing
National Category
Computer graphics and computer vision
Identifiers
urn:nbn:se:uu:diva-488232 (URN); 10.1007/978-3-031-06555-2_21 (DOI); 000870314500021 (); 978-3-031-06555-2 (ISBN); 978-3-031-06554-5 (ISBN)
Conference
15th IAPR International Workshop on Document Analysis Systems (DAS), MAY 22-25, 2022, La Rochelle Univ, La Rochelle, FRANCE
Funder
Swedish Research Council, 2018-05973; Riksbankens Jubileumsfond, P19-0103:1
Available from: 2022-11-14 Created: 2022-11-14 Last updated: 2025-02-07 Bibliographically approved
Heil, R., Vats, E. & Hast, A. (2021). Strikethrough Removal from Handwritten Words Using CycleGANs. In: Lladós J., Lopresti D., Uchida S. (Ed.), Document Analysis and Recognition -- ICDAR 2021. Paper presented at International Conference on Document Analysis and Recognition (ICDAR) (pp. 572-586). Springer, 12824
2021 (English). In: Document Analysis and Recognition -- ICDAR 2021 / [ed] Lladós J., Lopresti D., Uchida S., Springer, 2021, Vol. 12824, p. 572-586. Conference paper, Published paper (Refereed)
Abstract [en]

Obtaining the original, clean forms of struck-through handwritten words can be of interest to literary scholars focusing on tasks such as genetic criticism. In addition, replacing struck-through words can have a positive impact on text recognition tasks. This work presents a novel unsupervised approach for strikethrough removal from handwritten words, employing cycle-consistent generative adversarial networks (CycleGANs). The removal performance is further improved by extending the network with an attribute-guided approach. Furthermore, two new datasets, a synthetic multi-writer set based on the IAM database and a genuine single-writer dataset, are introduced for the training and evaluation of the models. The experimental results demonstrate the efficacy of the proposed method: the examined attribute-guided models achieve F1 scores above 0.8 on the synthetic test set, improving upon the performance of the regular CycleGAN. Despite being trained exclusively on the synthetic dataset, the examined models even produce convincing cleaned images for genuine struck-through words.

Place, publisher, year, edition, pages
Springer, 2021
Keywords
Strikethrough removal, CycleGAN, Handwritten words, Document image processing
National Category
Computer graphics and computer vision
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-455889 (URN); 10.1007/978-3-030-86337-1_38 (DOI); 000711880100038 ()
Conference
International Conference on Document Analysis and Recognition (ICDAR)
Funder
Riksbankens Jubileumsfond, P19-0103:1; Swedish Research Council, 2018-05973
Available from: 2021-10-12 Created: 2021-10-12 Last updated: 2025-02-07 Bibliographically approved
Mårtensson, L., Vats, E. & Hast, A. (2021). The Significance of Script Proportions in the Medieval Swedish Script. Arkiv för nordisk filologi
2021 (English). In: Arkiv för nordisk filologi, ISSN 0066-7668. Article in journal (Refereed) Published
National Category
Natural Language Processing
Identifiers
urn:nbn:se:uu:diva-525784 (URN)
Available from: 2024-03-29 Created: 2024-03-29 Last updated: 2025-02-07 Bibliographically approved
Hast, A. & Vats, E. (2021). Word Recognition using Embedded Prototype Subspace Classifiers on a new Imbalanced Dataset. Journal of WSCG, 29(1-2), 39-47
2021 (English). In: Journal of WSCG, ISSN 1213-6972, E-ISSN 1213-6964, Vol. 29, no 1-2, p. 39-47. Article in journal (Refereed) Published
Abstract [en]

This paper presents an approach towards word recognition based on embedded prototype subspace classification. The purpose of this paper is three-fold. Firstly, a new dataset for word recognition is presented, which is extracted from the Esposalles database consisting of the Barcelona cathedral marriage records. Secondly, different clustering techniques are evaluated for Embedded Prototype Subspace Classifiers. The dataset, containing 30 different classes of words, is heavily imbalanced, and some word classes are very similar, which renders the classification task rather challenging. For ease of use, no stratified sampling is done in advance, and the impact of different data splits is evaluated for different clustering techniques. It will be demonstrated that the original clustering technique based on scaling the bandwidth has to be adjusted for this new dataset. Thirdly, an algorithm is therefore proposed that finds k clusters, striving to obtain a certain number of feature points in each cluster, rather than finding some clusters based on scaling Silverman’s rule of thumb. Furthermore, Self Organising Maps are also evaluated as both a clustering and embedding technique.

Place, publisher, year, edition, pages
University of West Bohemia, 2021
Keywords
Subspaces, Embedded Prototypes, Clustering, Deep Learning, Self Organising Maps, t-SNE, Data splits
National Category
Computer graphics and computer vision
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-453762 (URN); 10.24132/JWSCG.2021.29.5 (DOI)
Funder
Riksbankens Jubileumsfond, NHS14-2068:1; Swedish National Infrastructure for Computing (SNIC), 2020/15-177
Available from: 2021-09-22 Created: 2021-09-22 Last updated: 2025-02-07 Bibliographically approved
Hast, A., Mårtensson, L., Vats, E. & Heil, R. (2019). Creating an Atlas over Handwritten Script Signs. In: Digital Humanities in the Nordic Countries. Paper presented at DHN 2019, March 6–8, Copenhagen, Denmark.
2019 (English). In: Digital Humanities in the Nordic Countries, 2019. Conference paper, Poster (with or without abstract) (Refereed)
National Category
Computer Sciences
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-373517 (URN)
Conference
DHN 2019, March 6–8, Copenhagen, Denmark
Available from: 2019-01-15 Created: 2019-01-15 Last updated: 2019-01-17
Hast, A., Lind, M. & Vats, E. (2019). Embedded Prototype Subspace Classification: A subspace learning framework. In: Computer Analysis of Images and Patterns, CAIP 2019, PT II. Paper presented at The 18th International Conference on Computer Analysis of Images and Patterns, CAIP 2019, September 2–6, 2019, Salerno, Italy (pp. 581-592). Springer
2019 (English). In: Computer Analysis of Images and Patterns, CAIP 2019, PT II, Springer, 2019, p. 581-592. Conference paper, Published paper (Refereed)
Abstract [en]

Handwritten text recognition is a daunting task, due to the complex characteristics of handwritten letters. Deep learning based methods have achieved significant advances in recognizing challenging handwritten texts because of their ability to learn and accurately classify intricate patterns. However, deep learning has some limitations, such as the lack of a well-defined mathematical model and a black-box learning mechanism, which pose challenges. This paper aims at going beyond black-box learning and proposes a novel learning framework called Embedded Prototype Subspace Classification, based on the well-known subspace method, to recognise handwritten letters in a fast and efficient manner. The effectiveness of the proposed framework is empirically evaluated on popular datasets using standard evaluation measures.

Place, publisher, year, edition, pages
Springer, 2019
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349 ; 11679
Keywords
Handwritten text, Subspaces, Deep learning, t-SNE
National Category
Medical Imaging; Human Computer Interaction
Identifiers
urn:nbn:se:uu:diva-393257 (URN); 10.1007/978-3-030-29891-3_51 (DOI); 000558110900051 (); 978-3-030-29891-3 (ISBN); 978-3-030-29890-6 (ISBN)
Conference
The 18th International Conference on Computer Analysis of Images and Patterns, CAIP 2019, September 2–6, 2019, Salerno, Italy
Available from: 2019-09-18 Created: 2019-09-18 Last updated: 2025-02-09 Bibliographically approved
Mårtensson, L., Vats, E., Hast, A. & Fornés, A. (2019). In search of the scribe: Letter spotting as a tool for identifying scribes in large handwritten text corpora. Human IT, 14(2), 95-120
2019 (English). In: Human IT, ISSN 1402-1501, E-ISSN 1402-151X, Vol. 14, no 2, p. 95-120. Article in journal (Refereed) Published
National Category
Computer Sciences
Research subject
Computerized Image Processing
Identifiers
urn:nbn:se:uu:diva-373929 (URN)
Available from: 2019-01-17 Created: 2019-01-17 Last updated: 2019-01-17 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0003-4480-3158