Parser Training with Heterogeneous Treebanks
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. ORCID iD: 0000-0001-8844-2126
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. ORCID iD: 0000-0002-2837-3648
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
2018 (English). In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Association for Computational Linguistics, 2018, p. 619-625. Conference paper, Published paper (Refereed)
Abstract [en]

How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for several languages and show that in many cases fine-tuning and treebank embeddings lead to substantial improvements over single treebanks or concatenation, with average gains of 2.0–3.5 LAS points. We argue that treebank embeddings should be preferred due to their conceptual simplicity, flexibility and extensibility.
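
The abstract does not spell out how a treebank embedding enters the parser. One plausible reading, sketched here purely as an illustration, is that every token's input vector is its word vector concatenated with a vector identifying the treebank the sentence comes from. All names, dimensions and the random initialisation below are assumptions made for this sketch, not details taken from the record; in a real parser both tables would be trained jointly with the rest of the model.

```python
import random


def make_embedding_table(keys, dim, seed):
    """Build a toy lookup table of randomly initialised vectors (hypothetical helper)."""
    rng = random.Random(seed)
    return {key: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for key in keys}


def token_inputs(words, treebank_id, word_emb, treebank_emb):
    """Concatenate each word vector with the vector of the sentence's source treebank."""
    tb_vec = treebank_emb[treebank_id]
    return [word_emb[word] + tb_vec for word in words]


# Toy vocabulary and two hypothetical treebanks for the same language.
sentence = ["the", "dog", "barks"]
word_emb = make_embedding_table(sentence, dim=4, seed=0)
treebank_emb = make_embedding_table(["treebank_a", "treebank_b"], dim=2, seed=1)

# The same sentence yields different inputs depending on the treebank it is tagged with.
inputs_a = token_inputs(sentence, "treebank_a", word_emb, treebank_emb)
inputs_b = token_inputs(sentence, "treebank_b", word_emb, treebank_emb)
```

Under this reading, the word part of each input is shared across treebanks while the appended part differs, so a single model can be trained on the concatenated data and still pick up treebank-specific annotation conventions.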

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2018. p. 619-625
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-362215
DOI: 10.18653/v1/P18-2098
ISI: 000493913100098
ISBN: 978-1-948087-34-6 (print)
OAI: oai:DiVA.org:uu-362215
DiVA, id: diva2:1252663
Conference
The 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, July 15 - 20, 2018.
Funder
Swedish Research Council, P2016-01817
Available from: 2018-10-02 Created: 2018-10-02 Last updated: 2019-12-06 Bibliographically approved

Authors

Stymne, Sara; de Lhoneux, Miryam; Smith, Aaron; Nivre, Joakim
