Character-based PSMT for Closely Related Languages
2009 (English). In: Proceedings of the 13th Annual Conference of the European Association for Machine Translation (EAMT’09), 2009, 12-19 p. Conference paper (Refereed)
Translating unknown words between related languages using a character-based statistical machine translation model can be beneficial. In this paper, we describe a simple method to combine character-based models with standard word-based models to increase the coverage of a phrase-based SMT system. Using this approach, we show a modest improvement when translating between Norwegian and Swedish. The potential of applying character-based models to closely related languages is also illustrated by applying the character model on its own. The performance of such an approach is similar to the word-level baseline, and its output is closer to the reference in terms of string similarity.
Place, publisher, year, edition, pages
2009. 12-19 p.
Language Technology (Computational Linguistics)
Research subject Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-165940
OAI: oai:DiVA.org:uu-165940
DiVA: diva2:474916
European Association for Machine Translation (EAMT)