Publications from Uppsala University (uu.se)
Shallow Parsing with PoS Taggers and Linguistic Features.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (computational linguistics)
ORCID iD: 0000-0002-4838-6518
2002 (English)
In: Journal of Machine Learning Research: Special Issue on Shallow Parsing, Vol. 2, p. 639-668
Article in journal (Refereed), Published
Abstract [en]

Three data-driven publicly available part-of-speech taggers are applied to shallow parsing of Swedish texts. The phrase structure is represented by nine types of phrases in a hierarchical structure containing labels for every constituent type the token belongs to in the parse tree. The encoding is based on the concatenation of the phrase tags on the path from lowest to higher nodes. Various linguistic features are used in learning; the taggers are trained on the basis of lexical information only, part-of-speech only, and a combination of both, to predict the phrase structure of the tokens with or without part-of-speech. Special attention is directed to the taggers' sensitivity to different types of linguistic information included in learning, as well as the taggers' sensitivity to the size and the various types of training data sets. The method can be easily transferred to other languages.
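
The label encoding described in the abstract can be illustrated with a minimal sketch. Each token receives one label built by concatenating the tags of the phrases that contain it, starting with the lowest (innermost) phrase and moving up the tree. The separator `|`, the outside marker `O`, the tag names, and the example sentence are assumptions chosen for illustration; the abstract does not specify the exact label format, and any phrase-boundary markers the article may use are left out here.

```python
from typing import Iterator, List, Sequence, Tuple, Union

SEP = "|"      # hypothetical separator between phrase tags
OUTSIDE = "O"  # hypothetical label for tokens outside every phrase

# A leaf is (pos_tag, word); a phrase is (phrase_tag, [children]).
Node = Tuple[str, Union[str, list]]


def _walk(node: Node, ancestors: Tuple[str, ...]) -> Iterator[Tuple[str, str, str]]:
    """Yield (word, pos_tag, phrase_label) for every token below `node`."""
    tag, body = node
    if isinstance(body, str):
        # Leaf: join the enclosing phrase tags, innermost (lowest) first.
        label = SEP.join(reversed(ancestors)) if ancestors else OUTSIDE
        yield body, tag, label
    else:
        # Phrase node: descend, remembering this phrase tag on the path.
        for child in body:
            yield from _walk(child, ancestors + (tag,))


def encode_sentence(nodes: Sequence[Node]) -> List[Tuple[str, str, str]]:
    """Flatten a shallow parse into one (word, pos_tag, phrase_label) per token."""
    out: List[Tuple[str, str, str]] = []
    for node in nodes:
        out.extend(_walk(node, ()))
    return out


# Toy parse of "i det gamla huset ." with made-up tag names.
sentence = [
    ("PP", [
        ("PREP", "i"),
        ("NP", [
            ("DET", "det"),
            ("AP", [("ADJ", "gamla")]),
            ("NOUN", "huset"),
        ]),
    ]),
    ("PUNCT", "."),
]

for word, pos, label in encode_sentence(sentence):
    print(f"{word}\t{pos}\t{label}")
```

With labels in this form, shallow parsing becomes a per-token tagging task, which is what allows an ordinary part-of-speech tagger to be trained on it. The tagger's input can then be the word alone, the part-of-speech tag alone, or both, matching the three feature settings the abstract mentions.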

Place, publisher, year, edition, pages
2002. Vol. 2, p. 639-668
Keywords [en]
Chunking, Shallow parsing, Part-of-speech taggers, Hidden Markov models, Maximum entropy learning, Transformation-based learning
National Category
Natural Language Processing
Identifiers
URN: urn:nbn:se:uu:diva-19635
OAI: oai:DiVA.org:uu-19635
DiVA, id: diva2:47407
Available from: 2006-11-30
Created: 2006-11-30
Last updated: 2025-02-07

Open Access in DiVA

No full text in DiVA

Authority records

Megyesi, Beata
