Publications from Uppsala University
AttentionHTR: Handwritten Text Recognition Based on Attention Encoder-Decoder Networks
Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of ALM, Centre for Digital Humanities Uppsala. ORCID iD: 0000-0003-4480-3158
2022 (English). In: Document Analysis Systems, DAS 2022 / [ed] Uchida, S.; Barney, E.; Eglin, V., Springer Nature, 2022, Vol. 13237, p. 507-522. Conference paper, Published paper (Refereed)
Abstract [en]

This work proposes an attention-based sequence-to-sequence model for handwritten word recognition and explores transfer learning for data-efficient training of HTR systems. To overcome training data scarcity, this work leverages models pre-trained on scene text images as a starting point for tailoring the handwriting recognition models. ResNet feature extraction and bidirectional LSTM-based sequence modeling stages together form an encoder. The prediction stage consists of a decoder and a content-based attention mechanism. The effectiveness of the proposed end-to-end HTR system has been empirically evaluated on the novel multi-writer Imgur5K dataset and the IAM dataset. The experimental results demonstrate the performance of the HTR framework, further supported by an in-depth analysis of the error cases. Source code and pre-trained models are available at GitHub (https://github.com/dmitrijsk/AttentionHTR).
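The abstract describes a decoder that attends over the encoder's per-time-step features (ResNet + bidirectional LSTM outputs) via content-based attention. The authors' actual implementation is at the GitHub link above; purely as an illustration of the mechanism, the following is a minimal NumPy sketch of one step of content-based (additive) attention, with all weight matrices, names, and dimensions chosen arbitrarily for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def content_attention(query, keys, W_q, W_k, v):
    """One step of content-based (additive) attention.

    query: (d,)   current decoder hidden state
    keys:  (T, d) encoder outputs (e.g. BiLSTM features), one per time step
    """
    # Alignment scores: compare the query against every encoder time step.
    scores = np.tanh(query @ W_q + keys @ W_k) @ v      # shape (T,)
    # Softmax over time steps (numerically stabilised).
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Context vector: attention-weighted sum of the encoder features.
    context = weights @ keys                            # shape (d,)
    return context, weights

# Illustrative dimensions only: T encoder steps, d-dimensional features.
d, T = 8, 5
W_q = rng.standard_normal((d, d))
W_k = rng.standard_normal((d, d))
v = rng.standard_normal(d)
query = rng.standard_normal(d)
keys = rng.standard_normal((T, d))

context, weights = content_attention(query, keys, W_q, W_k, v)
```

At each decoding step the context vector is combined with the decoder state to predict the next character; the attention weights form a distribution over encoder positions, which is what enables the error analyses of where the model "looks" in the word image.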

Place, publisher, year, edition, pages
Springer Nature, 2022. Vol. 13237, p. 507-522
Series
Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349
Keywords [en]
Handwritten text recognition, Attention encoder-decoder networks, Sequence-to-sequence model, Transfer learning, Multi-writer
National Category
Computer Graphics and Computer Vision; Computer Sciences
Identifiers
URN: urn:nbn:se:uu:diva-488224
DOI: 10.1007/978-3-031-06555-2_34
ISI: 000870314500034
ISBN: 978-3-031-06555-2 (electronic)
ISBN: 978-3-031-06554-5 (print)
OAI: oai:DiVA.org:uu-488224
DiVA, id: diva2:1710758
Conference
15th IAPR International Workshop on Document Analysis Systems (DAS), May 22-25, 2022, La Rochelle University, La Rochelle, France
Funder
Swedish National Infrastructure for Computing (SNIC), SNIC 2021/7-47
Available from: 2022-11-14. Created: 2022-11-14. Last updated: 2025-02-01. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text

Authority records

Kass, Dmitrijs; Vats, Ekta
