Publications from Uppsala University (uu.se)
Improving Brill's PoS Tagger for an Agglutinative Language
Megyesi, Beata
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. Computational Linguistics (datorlingvistik).
ORCID iD: 0000-0002-4838-6518
1999 (English)
In: Proceedings of the Joint Sigdat Conference on Empirical Methods in Natural Language Processing and Very Large Corpora: EMNLP/VLC '99, 1999, p. 275-284
Conference paper, Published paper (Refereed)
Abstract [en]

In this paper, Brill's rule-based PoS tagger is tested and adapted for Hungarian. The present system is shown not to reach as high an accuracy for Hungarian as it does for English (and other Germanic languages) because of the structural differences between these languages. Hungarian, unlike English, has rich morphology, is agglutinative with some inflectional characteristics, and has fairly free word order. The tagger has the greatest difficulty with parts of speech belonging to open classes because of their complicated morphological structure. It is shown that tagging accuracy can be increased from approximately 83% to 97% simply by changing the rule-generating mechanisms, namely the lexical templates in the lexical training module.
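As a rough illustration of what a lexical template does, the sketch below learns rules of the form "if an unknown word ends in suffix S, tag it T" from a tagged word list. It is a frequency-based simplification, not Brill's error-driven learner and not the templates used in the paper, which this record does not list. The function names, the `max_suffix_len` parameter, the toy Hungarian word list, and the coarse NOUN/VERB tags are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def learn_suffix_rules(tagged_words, max_suffix_len, default_tag="NOUN", min_count=2):
    """Learn 'word ends in S -> tag T' rules (a has-suffix style template).

    Hypothetical helper: counts tags per suffix and keeps the majority tag
    when it is frequent enough and differs from the default tag.
    """
    counts = defaultdict(Counter)
    for word, tag in tagged_words:
        # Never treat the whole word as its own suffix.
        longest = min(max_suffix_len, len(word) - 1)
        for n in range(1, longest + 1):
            counts[word[-n:]][tag] += 1

    rules = {}
    for suffix, tag_counts in counts.items():
        tag, freq = tag_counts.most_common(1)[0]
        if freq >= min_count and tag != default_tag:
            rules[suffix] = tag
    return rules


def tag_unknown(word, rules, max_suffix_len, default_tag="NOUN"):
    """Tag a word not seen in training; the longest matching suffix wins."""
    for n in range(min(max_suffix_len, len(word) - 1), 0, -1):
        tag = rules.get(word[-n:])
        if tag is not None:
            return tag
    return default_tag


# Toy training data (assumed for illustration only).
toy_training = [
    ("olvastam", "VERB"),   # "I read"
    ("írtam", "VERB"),      # "I wrote"
    ("láttam", "VERB"),     # "I saw"
    ("házakban", "NOUN"),   # "in houses"
    ("kertekben", "NOUN"),  # "in gardens"
]

rules = learn_suffix_rules(toy_training, max_suffix_len=3)
guess_verb = tag_unknown("vártam", rules, max_suffix_len=3)
guess_noun = tag_unknown("asztalokban", rules, max_suffix_len=3)
```

In this sketch, `max_suffix_len` bounds how much of a word's ending a rule may inspect. For an agglutinative language, where several suffixes can be stacked on one stem, that bound is one plausible place where a template set tuned for English could fall short.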

Place, publisher, year, edition, pages
1999. p. 275-284
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:uu:diva-19669
OAI: oai:DiVA.org:uu-19669
DiVA, id: diva2:47441
Conference
Joint Sigdat Conference on Empirical Methods in Natural Language Processing and Very Large Corpora
Available from: 2006-11-30 Created: 2006-11-30 Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Authority records

Megyesi, Beata
