Attention Can Reflect Syntactic Structure (If You Let It)
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
2021 (English). In: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Association for Computational Linguistics, 2021, p. 3031-3045. Conference paper, Published paper (Refereed).
Abstract [en]

Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of this work has focused almost exclusively on English, a language with rigid word order and little inflectional morphology. In this study, we present decoding experiments for multilingual BERT across 18 languages in order to test the generalizability of the claim that dependency syntax is reflected in attention patterns. We show that full trees can be decoded above baseline accuracy from single attention heads, and that individual relations are often tracked by the same heads across languages. Furthermore, to address recent debates about the status of attention as an explanatory mechanism, we experiment with fine-tuning mBERT on a supervised parsing objective while freezing different series of parameters. In steering the objective to learn explicit linguistic structure, we find much of the same structure represented in the resulting attention patterns, with notable differences depending on which parameters are frozen.
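The decoding recipe behind these experiments is compact: take the attention matrix of a single head, treat its weights as arc scores, and extract the highest-scoring tree with a maximum-spanning-arborescence (Chu-Liu/Edmonds) algorithm. The sketch below illustrates that general recipe, not the paper's released code; the checkpoint name is the standard mBERT model, but the layer/head choice is an arbitrary placeholder and subword-to-word alignment is glossed over.

```python
# Minimal sketch: decode a dependency tree from one mBERT attention head by
# treating attention weights as arc scores and extracting a maximum spanning
# arborescence (Chu-Liu/Edmonds). Illustrative only: the layer and head below
# are arbitrary, and subword-to-word alignment is ignored.
import torch
import networkx as nx
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained(
    "bert-base-multilingual-cased", output_attentions=True
)
model.eval()

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

layer, head = 7, 5  # hypothetical choice; in practice every head is scored
# outputs.attentions: tuple of (batch, heads, seq, seq); strip [CLS]/[SEP]
attn = outputs.attentions[layer][0, head, 1:-1, 1:-1]

# Directed graph with an arc h -> d weighted by how strongly the dependent d
# attends to its head h (the reverse convention is equally plausible and is
# also tested in this line of work).
n = attn.size(0)
G = nx.DiGraph()
for h in range(n):
    for d in range(n):
        if h != d:
            G.add_edge(h, d, weight=attn[d, h].item())

# Highest-scoring (possibly non-projective) tree over the attention scores
tree = nx.maximum_spanning_arborescence(G)
print(sorted(tree.edges()))
```

Comparing the decoded arcs against a gold Universal Dependencies tree (after aligning subwords to words) then yields the per-head accuracy that can be measured against a baseline across languages.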

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2021. p. 3031-3045
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-462768
DOI: 10.18653/v1/2021.eacl-main.264
ISI: 000863557003010
ISBN: 978-1-954085-02-2 (print)
OAI: oai:DiVA.org:uu-462768
DiVA, id: diva2:1623982
Conference
16th Conference of the European Chapter of the Association for Computational Linguistics, 19-23 April 2021, online
Funder
Google
Available from: 2022-01-02 Created: 2022-01-02 Last updated: 2023-07-30 Bibliographically approved
In thesis
1. The Search for Syntax: Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Syntax, the study of the hierarchical structure of language, has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined to developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given the surrounding context. In this dissertation, I describe four separate studies that explore the interplay between syntactic parsers and language models against the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models: specifically, hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks (the data upon which the previous experiments were based) and how it may be affected by factors such as annotation quality or tokenization decisions. Finally, I conclude the thesis with a conceptual analysis that relates these studies to a broader body of work concerning the syntactic knowledge of language models.
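As one concrete illustration of decoding tree structure from hidden states, the sketch below follows the structural-probe idea of Hewitt and Manning (2019): learn a low-rank linear map under which squared distances between transformed word vectors approximate distances in the gold dependency tree. The dimensions, toy gold tree, and training loop are illustrative assumptions, not the thesis's exact setup.

```python
# Sketch of a structural-probe-style distance probe (after Hewitt & Manning,
# 2019): learn B such that ||B(h_i - h_j)||^2 approximates the number of tree
# edges between words i and j. Shapes and the toy gold tree are assumptions.
import torch
import torch.nn as nn

class DistanceProbe(nn.Module):
    def __init__(self, hidden_dim: int = 768, rank: int = 128):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, rank, bias=False)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (seq_len, hidden_dim) -> predicted pairwise distances (seq, seq)
        z = self.proj(h)
        diff = z.unsqueeze(1) - z.unsqueeze(0)  # (seq, seq, rank)
        return (diff ** 2).sum(-1)

probe = DistanceProbe()
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)

# Stand-ins: random "hidden states" for a 6-word sentence, plus gold tree
# distances for a chain-shaped tree (word i attaches to word i-1), where the
# tree distance between words i and j is simply |i - j|.
hidden = torch.randn(6, 768)
idx = torch.arange(6)
gold = (idx.unsqueeze(0) - idx.unsqueeze(1)).abs().float()

# One training step with the probe's usual L1 objective
loss = (probe(hidden) - gold).abs().mean()
loss.backward()
opt.step()

# At test time, a minimum spanning tree over the predicted distance matrix
# gives an undirected approximation of the dependency tree.
```

If such a probe recovers gold tree distances well above a trivial baseline while the language model's parameters stay frozen, that is taken as evidence that the hidden states themselves encode syntactic structure.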

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2023. p. 101
Series
Studia Linguistica Upsaliensia, ISSN 1652-1366 ; 30
Keywords
syntax, language models, dependency parsing, universal dependencies
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-508379
ISBN: 978-91-513-1850-9
Public defence
2023-09-22, Humanistiska Teatern, Engelska parken, Thunbergsvägen 3C, Uppsala, 14:00 (English)
Available from: 2023-08-24 Created: 2023-07-30 Last updated: 2023-08-24

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text: https://aclanthology.org/2021.eacl-main.264

Authority records

Kulmizev, Artur; Nivre, Joakim
