The Search for Syntax: Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. (Computational Linguistics)
2023 (English) Doctoral thesis, comprehensive summary (Other academic)
Description
Abstract [en]

Syntax, the study of the hierarchical structure of language, has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined to developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given a provided context. In this dissertation, I describe four separate studies that seek to explore the interplay between syntactic parsers and language models against the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models, specifically hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which serve to highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks (the data upon which the previous experiments were based) and how it may be affected by factors such as annotation quality or tokenization decisions. Finally, I conclude the thesis with a conceptual analysis that relates the aforementioned studies to a broader body of work concerning the syntactic knowledge of language models.

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2023, p. 101
Series
Studia Linguistica Upsaliensia, ISSN 1652-1366 ; 30
Keywords [en]
syntax, language models, dependency parsing, universal dependencies
National subject category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-508379
ISBN: 978-91-513-1850-9 (print)
OAI: oai:DiVA.org:uu-508379
DiVA, id: diva2:1784732
Public defence
2023-09-22, Humanistiska Teatern, Engelska parken, Thunbergsvägen 3C, Uppsala, 14:00 (English)
Opponent
Supervisors
Available from: 2023-08-24 Created: 2023-07-30 Last updated: 2023-08-24
List of papers
1. Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing – A Tale of Two Parsers Revisited
2019 (English) In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, p. 2755-2768. Conference paper, Published paper (Refereed)
Abstract [en]

Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed after the switch to neural networks and continuous representations, the basic trade-off between rich features and global optimization remains essentially the same. Moreover, we show that deep contextualized word embeddings, which allow parsers to pack information about global sentence structure into local feature representations, benefit transition-based parsers more than graph-based parsers, making the two approaches virtually equivalent in terms of both accuracy and error profile. We argue that the reason is that these representations help prevent search errors and thereby allow transition-based parsers to better exploit their inherent strength of making accurate local decisions. We support this explanation with an error analysis of parsing experiments on 13 languages.

National subject category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-406697 (URN) 000854193302085 ()
Conference
2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), November 3-7, Hong Kong, China
Funder
Swedish Research Council, 2016-01817
Available from: 2020-03-11 Created: 2020-03-11 Last updated: 2023-07-30. Bibliographically approved
2. Do Neural Language Models Show Preferences for Syntactic Formalisms?
2020 (English) In: 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), Association for Computational Linguistics, 2020, p. 4077-4091. Conference paper, Published paper (Refereed)
Abstract [en]

Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured by language models adheres to a surface-syntactic or deep syntactic style of analysis, and whether the patterns are consistent across different languages. We apply a probe for extracting directed dependency trees to BERT and ELMo models trained on 13 different languages, probing for two different syntactic annotation styles: Universal Dependencies (UD), prioritizing deep syntactic relations, and Surface-Syntactic Universal Dependencies (SUD), focusing on surface structure. We find that both models exhibit a preference for UD over SUD (with interesting variations across languages and layers) and that the strength of this preference is correlated with differences in tree shape.

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2020
National subject category
Comparative Language Studies and General Linguistics; Computer Science
Identifiers
urn:nbn:se:uu:diva-423307 (URN) 000570978204034 () 978-1-952148-25-5 (ISBN)
Conference
58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), July 5-10, 2020
Available from: 2020-10-23 Created: 2020-10-23 Last updated: 2023-07-30. Bibliographically approved
3. Attention Can Reflect Syntactic Structure (If You Let It)
2021 (English) In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics, 2021, p. 3031-3045. Conference paper, Published paper (Refereed)
Abstract [en]

Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and a lack of inflectional morphology. In this study, we present decoding experiments for multilingual BERT (mBERT) across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We show that full trees can be decoded above baseline accuracy from single attention heads, and that individual relations are often tracked by the same heads across languages. Furthermore, in an attempt to address recent debates about the status of attention as an explanatory mechanism, we experiment with fine-tuning mBERT on a supervised parsing objective while freezing different series of parameters. Interestingly, in steering the objective to learn explicit linguistic structure, we find much of the same structure represented in the resulting attention patterns, with interesting differences with respect to which parameters are frozen.
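
To make the idea of reading dependency heads off an attention head concrete, the following is a minimal sketch and not the paper's procedure. It assumes a single attention matrix is already available as nested lists, where `attention[i][j]` is the weight token `i` places on token `j`, and it uses plain greedy selection: each token takes its most-attended other token as its head. The function name `greedy_heads` and the toy matrix are hypothetical. Greedy selection does not guarantee a well-formed tree; decoding full trees, as the abstract describes, would need a spanning-tree algorithm, which is not shown here.

```python
from typing import List


def greedy_heads(attention: List[List[float]]) -> List[int]:
    """Pick, for every token, the other token it attends to most.

    attention[i][j] is assumed to be the weight token i assigns to token j.
    Returns one predicted head index per token. The result may contain
    cycles, so it is not guaranteed to be a tree.
    """
    if len(attention) < 2:
        raise ValueError("need at least two tokens to assign heads")
    heads = []
    for i, row in enumerate(attention):
        # Exclude self-attention: a token cannot be its own head.
        candidates = [(weight, j) for j, weight in enumerate(row) if j != i]
        heads.append(max(candidates)[1])
    return heads


# Toy 3-token attention matrix (made-up values, rows sum to 1).
toy_attention = [
    [0.1, 0.8, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.7, 0.1],
]
predicted = greedy_heads(toy_attention)
print(predicted)  # tokens 0 and 1 select each other, i.e. a cycle
```

In the toy example tokens 0 and 1 pick each other, which shows why a tree constraint has to be enforced separately when whole trees are the target.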

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2021
National subject category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-462768 (URN) 10.18653/v1/2021.eacl-main.264 (DOI) 000863557003010 () 978-1-954085-02-2 (ISBN)
Conference
16th Conference of the European Chapter of the Association for Computational Linguistics, 19-23 April 2021, online
Funder
Google
Available from: 2022-01-02 Created: 2022-01-02 Last updated: 2023-07-30. Bibliographically approved
4. Schrödinger's tree: On syntax and neural language models
2022 (English) In: Frontiers in Artificial Intelligence, E-ISSN 2624-8212, Vol. 5, article id 796788. Article in journal (Refereed) Published
Abstract [en]

In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP’s workhorse, displaying increasingly fluent generation capabilities and proving to be an indispensable means of knowledge transfer downstream. Due to the otherwise opaque, black-box nature of such models, researchers have employed aspects of linguistic theory in order to characterize their behavior. Questions central to syntax, the study of the hierarchical structure of language, have factored heavily into such work, shedding invaluable insights about models’ inherent biases and their ability to make human-like generalizations. In this paper, we attempt to take stock of this growing body of literature. In doing so, we observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form, as well as the conclusions they draw from their findings. To remedy this, we urge researchers to make careful considerations when investigating coding properties, selecting representations, and evaluating via downstream tasks. Furthermore, we outline the implications of the different types of research questions exhibited in studies on syntax, as well as the inherent pitfalls of aggregate metrics. Ultimately, we hope that our discussion adds nuance to the prospect of studying language models and paves the way for a less monolithic perspective on syntax in this context.

Place, publisher, year, edition, pages
Frontiers Media S.A., 2022
Keywords
neural networks, language models, syntax, coding properties, representations, natural language understanding
National subject category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-492066 (URN) 10.3389/frai.2022.796788 (DOI) 000915268600001 () 36325030 (PubMedID)
Funder
Uppsala University
Available from: 2023-01-01 Created: 2023-01-01 Last updated: 2023-07-30. Bibliographically approved
5. Investigating UD Treebanks via Dataset Difficulty Measures
2023 (English) In: Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, Dubrovnik, Croatia: Association for Computational Linguistics, 2023, p. 1076-1089. Conference paper, Published paper (Refereed)
Abstract [en]

Treebanks annotated with Universal Dependencies (UD) are currently available for over 100 languages and are widely utilized by the community. However, their inherent characteristics are hard to measure and are only partially reflected in parser evaluations via accuracy metrics like labeled attachment score (LAS). In this study, we analyze a large subset of the UD treebanks using three recently proposed accuracy-free dataset analysis methods: dataset cartography, 𝒱-information, and minimum description length. Each method provides insights about UD treebanks that would remain undetected if only LAS were considered. Specifically, we identify a number of treebanks that, despite yielding high LAS, contain very little information that is usable by a parser to surpass what can be achieved by simple heuristics. Furthermore, we make note of several treebanks that score consistently low across numerous metrics, indicating a high degree of noise or annotation inconsistency present therein.
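
For readers unfamiliar with the accuracy metric mentioned in the abstract, here is a minimal sketch of how attachment scores are commonly computed for one sentence. It is an illustration and not the evaluation code used in the paper. It assumes each token is represented as a `(head index, dependency label)` pair; the function name `attachment_scores` and the toy annotations are hypothetical. The unlabeled score (UAS) counts tokens with the correct head, and LAS additionally requires the correct label.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sentence.

    UAS: fraction of tokens whose predicted head matches the gold head.
    LAS: fraction of tokens whose predicted head and label both match.
    """
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and equally long")
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    full_hits = sum(g == p for g, p in zip(gold, pred))
    return head_hits / len(gold), full_hits / len(gold)


# Toy 3-token sentence; head 0 marks the root (an assumed convention).
gold_arcs = [(2, "nsubj"), (0, "root"), (2, "obj")]
pred_arcs = [(2, "nsubj"), (0, "root"), (2, "iobj")]
uas, las = attachment_scores(gold_arcs, pred_arcs)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In the toy sentence every head is correct but one label is wrong, so the two scores differ. Because the score is an average over tokens, it says little about where the usable information in a treebank lies, which is the gap the accuracy-free methods in the abstract are meant to address.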

Place, publisher, year, edition, pages
Dubrovnik, Croatia: Association for Computational Linguistics, 2023
Keywords
computational linguistics, syntax, universal dependencies, parsing, natural language processing
National subject category
Comparative Language Studies and General Linguistics; Computer Science
Research subject
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-508035 (URN)
Conference
The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), Dubrovnik, Croatia, May 2-6, 2023
Available from: 2023-07-18 Created: 2023-07-18 Last updated: 2023-08-11. Bibliographically approved

Open Access in DiVA

fulltext (986 kB), 984 downloads
File information
File name: FULLTEXT01.pdf. File size: 986 kB. Checksum: SHA-512
56e70b9d675551d85b1a87035bd1ac7fd22210abca43ca756b1120afaade24552275532b1e52fe93d0e9ca9d42e4a3067efb1745bbca56c5dd955b04f43da370
Type: fulltext. Mimetype: application/pdf

Person

Kulmizev, Artur
