Shallow Parsing with PoS Taggers and Linguistic Features.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology (computational linguistics).
ORCID iD: 0000-0002-4838-6518
2002 (English). In: Journal of Machine Learning Research: Special Issue on Shallow Parsing, Vol. 2, 639-668 p. Article in journal (Refereed), Published
Abstract [en]

Three data-driven, publicly available part-of-speech taggers are applied to shallow parsing of Swedish texts. The phrase structure is represented by nine types of phrases in a hierarchical structure, with each token labeled for every constituent type it belongs to in the parse tree. The encoding concatenates the phrase tags on the path from the lowest to the higher nodes. Various linguistic features are used in learning: the taggers are trained on lexical information only, on part-of-speech only, and on a combination of both, to predict the phrase structure of the tokens with or without part-of-speech. Special attention is paid to the taggers' sensitivity to the types of linguistic information included in learning, and to the size and type of the training data sets. The method can be easily transferred to other languages.
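
The encoding described in the abstract can be illustrated with a minimal sketch. It assumes each token comes with the list of phrase labels that cover it, ordered from the lowest node upward. The function name `encode_chunk_tag`, the `_` separator, the `O` label for tokens outside any phrase, and the example labels (`AP`, `NP`, `PP`) are all hypothetical choices made for illustration; the record does not specify the paper's actual label inventory or tag format.

```python
def encode_chunk_tag(path, sep="_", outside="O"):
    """Concatenate the phrase labels on a token's path, lowest node first.

    `path` lists the phrase types covering the token, innermost first.
    `sep` and `outside` are illustrative assumptions, not the paper's format.
    """
    return sep.join(path) if path else outside


# Hypothetical analysis of the Swedish phrase "i det stora huset"
# ("in the big house"): an adjective phrase inside a noun phrase
# inside a prepositional phrase.
tokens = [
    ("i", ["PP"]),
    ("det", ["NP", "PP"]),
    ("stora", ["AP", "NP", "PP"]),
    ("huset", ["NP", "PP"]),
    (".", []),
]

# Each token receives a single tag, so an ordinary sequence tagger
# can be trained to predict the phrase structure token by token.
tagged = [(word, encode_chunk_tag(path)) for word, path in tokens]

for word, tag in tagged:
    print(word, tag)
```

Collapsing the path into one tag per token is what lets an off-the-shelf part-of-speech tagger be reused for shallow parsing: the tagger only ever sees a word (or its part-of-speech, or both) paired with one output label.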

Place, publisher, year, edition, pages
2002. Vol. 2, 639-668 p.
Keyword [en]
Chunking, Shallow parsing, Part-of-speech taggers, Hidden Markov models, Maximum entropy learning, Transformation-based learning
National Category
Language Technology (Computational Linguistics)
URN: urn:nbn:se:uu:diva-19635
OAI: oai:DiVA.org:uu-19635
DiVA: diva2:47407
Available from: 2006-11-30 Created: 2006-11-30 Last updated: 2016-03-08

By author/editor
Megyesi, Beata