Inducing Baseform Models from a Swedish Vocabulary Pool
2007 (English). In: Proceedings of the 16th Nordic Conference of Computational Linguistics NODALIDA-2007, 2007, pp. 51-58. Conference paper (Refereed)
In many language technology applications, we need to map wordforms to a citation form or baseform, or the other way around, e.g. for lexicon lookup or for representational purposes.
In this paper, we used a suffix trie mapper with suffix-change probabilities, and computed wordform-baseform and baseform-wordform models from eight subsets of a ranked Swedish vocabulary. All models were evaluated in both directions on a test set, and four of the models were also evaluated for wordform-baseform mapping on five unseen texts.
For wordform-baseform mapping, the best models performed on par with state-of-the-art systems. Most models were useful in some situation, depending on the mapping direction and on time and space restrictions, but no model was best in all situations.
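To make the idea of a suffix trie mapper with suffix-change probabilities concrete, the following is a minimal sketch, not the implementation evaluated in the paper. It assumes that a suffix-change rule is obtained by stripping the longest common prefix of a training pair, that rules are counted at every trie node along the reversed word, and that lookup backs off from the longest matching suffix to shorter ones. The class name `SuffixTrieMapper`, the `max_suffix` cap, and the toy word pairs are all hypothetical.

```python
from collections import Counter


class SuffixTrieMapper:
    """Toy suffix-change mapper (illustrative only, not the paper's system)."""

    def __init__(self, max_suffix=5):
        self.max_suffix = max_suffix  # hypothetical cap on stored suffix length
        self.root = {"rules": Counter(), "children": {}}

    @staticmethod
    def _rule(source, target):
        # Strip the longest common prefix; the two remainders form the rule.
        i = 0
        while i < min(len(source), len(target)) and source[i] == target[i]:
            i += 1
        return source[i:], target[i:]

    def train(self, pairs):
        for source, target in pairs:
            rule = self._rule(source, target)
            node = self.root
            node["rules"][rule] += 1
            # Walk the source word from its last character inwards.
            for ch in reversed(source[-self.max_suffix:]):
                node = node["children"].setdefault(
                    ch, {"rules": Counter(), "children": {}}
                )
                node["rules"][rule] += 1

    def map(self, word):
        # Follow the longest known suffix, remembering every node on the way.
        path = [self.root]
        node = self.root
        for ch in reversed(word[-self.max_suffix:]):
            node = node["children"].get(ch)
            if node is None:
                break
            path.append(node)
        # Back off from the deepest node to shallower ones.
        for node in reversed(path):
            total = sum(node["rules"].values())
            for (old, new), count in node["rules"].most_common():
                if word.endswith(old):
                    # Relative frequency of the rule at this node.
                    return word[: len(word) - len(old)] + new, count / total
        return word, 0.0


# Hypothetical wordform-baseform training pairs.
pairs = [
    ("bilar", "bil"),
    ("stolar", "stol"),
    ("flickor", "flicka"),
    ("kvinnor", "kvinna"),
    ("husen", "hus"),
]
mapper = SuffixTrieMapper()
mapper.train(pairs)
print(mapper.map("båtar"))  # ('båt', 1.0)
print(mapper.map("gator"))  # ('gata', 1.0)
```

Under these assumptions, a baseform-wordform model would be obtained by training the same class on the pairs in reverse order. Restricting the training vocabulary or the suffix length trades accuracy against model size and lookup time.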
Place, publisher, year, edition, pages: 2007. 51-58 p.
Keywords: morphology, statistical model, evaluation
Language Technology (Computational Linguistics); Specific Languages
Identifiers: URN: urn:nbn:se:uu:diva-11108; ISBN: 978-9985-4-0514-7; OAI: oai:DiVA.org:uu-11108; DiVA: diva2:38876