Logo: to the web site of Uppsala University

uu.sePublikasjoner fra Uppsala universitet
Endre søk
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf
Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Språkvetenskapliga fakulteten, Institutionen för lingvistik och filologi.
Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Språkvetenskapliga fakulteten, Institutionen för lingvistik och filologi.ORCID-id: 0000-0001-8844-2126
Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Språkvetenskapliga fakulteten, Institutionen för lingvistik och filologi.
Uppsala universitet, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Språkvetenskapliga fakulteten, Institutionen för lingvistik och filologi.
Vise andre og tillknytning
2019 (engelsk)Inngår i: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, s. 2755-2768Konferansepaper, Publicerat paper (Fagfellevurdert)
Abstract [en]

Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transitionbased parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation by an error analysis of parsing experiments on 13 languages.

sted, utgiver, år, opplag, sider
2019. s. 2755-2768
HSV kategori
Forskningsprogram
Datorlingvistik
Identifikatorer
URN: urn:nbn:se:uu:diva-406697ISI: 000854193302085OAI: oai:DiVA.org:uu-406697DiVA, id: diva2:1413749
Konferanse
2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), November 3-7, Hong Kong, China
Forskningsfinansiär
Swedish Research Council, 2016-01817Tilgjengelig fra: 2020-03-11 Laget: 2020-03-11 Sist oppdatert: 2025-02-07bibliografisk kontrollert
Inngår i avhandling
1. The Search for Syntax: Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
Åpne denne publikasjonen i ny fane eller vindu >>The Search for Syntax: Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
2023 (engelsk)Doktoravhandling, med artikler (Annet vitenskapelig)
Abstract [en]

Syntax — the study of the hierarchical structure of language — has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined towards developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given a provided context. In this dissertation, I describe four separate studies that seek to explore the interplay between syntactic parsers and language models upon the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models — specifically, hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which serve to highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks — the data upon which the previous experiments were based — and how it may be affected by factors such as annotation quality or tokenization decisions. Finally, I conclude the thesis with a conceptual analysis that relates the aforementioned studies to a broader body of work concerning the syntactic knowledge of language models.

sted, utgiver, år, opplag, sider
Uppsala: Acta Universitatis Upsaliensis, 2023. s. 101
Serie
Studia Linguistica Upsaliensia, ISSN 1652-1366 ; 30
Emneord
syntax, language models, dependency parsing, universal dependencies
HSV kategori
Forskningsprogram
Datorlingvistik
Identifikatorer
urn:nbn:se:uu:diva-508379 (URN)978-91-513-1850-9 (ISBN)
Disputas
2023-09-22, Humanistiska Teatern, Engelska parken, Thunbergsvägen 3C, Uppsala, 14:00 (engelsk)
Opponent
Veileder
Tilgjengelig fra: 2023-08-24 Laget: 2023-07-30 Sist oppdatert: 2025-02-07

Open Access i DiVA

Fulltekst mangler i DiVA

Andre lenker

https://www.aclweb.org/anthology/D19-1277https://www.emnlp-ijcnlp2019.org/

Person

Kulmizev, Arturde Lhoneux, MiryamNivre, Joakim

Søk i DiVA

Av forfatter/redaktør
Kulmizev, Arturde Lhoneux, MiryamNivre, Joakim
Av organisasjonen

Søk utenfor DiVA

GoogleGoogle Scholar

urn-nbn

Altmetric

urn-nbn
Totalt: 129 treff
RefereraExporteraLink to record
Permanent link

Direct link
Referera
Referensformat
  • apa
  • ieee
  • modern-language-association
  • vancouver
  • Annet format
Fler format
Språk
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Annet språk
Fler språk
Utmatningsformat
  • html
  • text
  • asciidoc
  • rtf