A Multilingual Evaluation of Three Spelling Normalization Methods for Historical Text
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (datorlingvistik)
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (datorlingvistik)
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (datorlingvistik)
2014 (English) In: Proceedings of the 8th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTeCH), 2014, 32-41 p. Conference paper, Published paper (Refereed)
Abstract [en]

We present a multilingual evaluation of approaches for spelling normalisation of historical text based on data from five languages: English, German, Hungarian, Icelandic, and Swedish. Three different normalisation methods are evaluated: a simplistic filtering model, a Levenshtein-based approach, and a character-based statistical machine translation approach. The evaluation shows that the machine translation approach often gives the best results, but also that all approaches improve over the baseline and that no single method works best for all languages.
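To illustrate the general idea behind a Levenshtein-based approach, the following is a minimal sketch, not the authors' implementation. It assumes normalisation means mapping a historical spelling to the closest entry in a modern word list, using unit edit costs and a fixed distance threshold. The function names, the threshold of 2, and the toy lexicon are all hypothetical and do not come from the paper.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion, substitution."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def normalise(word: str, lexicon: Sequence[str], max_distance: int = 2) -> str:
    """Return the closest lexicon entry, or the word itself if none is near.

    Assumption: ties are broken by lexicon order, and the threshold is an
    illustrative value rather than a setting taken from the paper.
    """
    if not lexicon:
        return word
    best = min(lexicon, key=lambda entry: levenshtein(word, entry))
    return best if levenshtein(word, best) <= max_distance else word


# Hypothetical modern word list used only for this example.
modern_lexicon = ["have", "upon", "love"]
print(normalise("haue", modern_lexicon))
print(normalise("vppon", modern_lexicon))
```

In this sketch a word with no lexicon entry within the threshold is left unchanged, which keeps names and rare words from being forced onto an unrelated modern form.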

Place, publisher, year, edition, pages
2014. 32-41 p.
Keyword [en]
spelling normalization, historical texts
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-239449
ISBN: 978-1-937284-85-5 (print)
OAI: oai:DiVA.org:uu-239449
DiVA: diva2:774587
Conference
14th Conference of the European Association for Computational Linguistics, EACL 2014, 26–30 April, Gothenburg, Sweden
Funder
Swedish Research Council
Available from: 2014-12-26 Created: 2014-12-26 Last updated: 2017-01-25 Bibliographically approved

Open Access in DiVA

No full text

Other links

http://aclanthology.info/volumes/proceedings-of-the-8th-workshop-on-language-technology-for-cultural-heritage-social-sciences-and-humanities-latech

Authority records BETA

Pettersson, Eva; Megyesi, Beáta; Nivre, Joakim
