Publications from Uppsala University (uu.se)
Schrödinger's tree: On syntax and neural language models
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology; RISE Research Institutes of Sweden, Kista, Sweden. ORCID iD: 0000-0002-7873-3971
2022 (English). In: Frontiers in Artificial Intelligence, E-ISSN 2624-8212, Vol. 5, article id 796788. Article in journal (Refereed). Published.
Abstract [en]

In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities and proving to be an indispensable means of knowledge transfer downstream. Due to the otherwise opaque, black-box nature of such models, researchers have employed aspects of linguistic theory in order to characterize their behavior. Questions central to syntax (the study of the hierarchical structure of language) have factored heavily into such work, shedding invaluable insights about models' inherent biases and their ability to make human-like generalizations. In this paper, we attempt to take stock of this growing body of literature. In doing so, we observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form, as well as the conclusions they draw from their findings. To remedy this, we urge researchers to make careful considerations when investigating coding properties, selecting representations, and evaluating via downstream tasks. Furthermore, we outline the implications of the different types of research questions exhibited in studies on syntax, as well as the inherent pitfalls of aggregate metrics. Ultimately, we hope that our discussion adds nuance to the prospect of studying language models and paves the way for a less monolithic perspective on syntax in this context.
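As a minimal, hypothetical illustration of the pre-train-then-fine-tune regime the abstract refers to (not code from the paper), the sketch below loads a pre-trained encoder via the Hugging Face transformers library and takes a few gradient steps on a toy binary classification task. The model name, example sentences, labels, and hyperparameters are all assumptions made for demonstration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any pre-trained encoder would do
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
# Pre-trained weights are reused; only the small classification head is new.
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

# Toy downstream task (illustrative): binary acceptability judgments.
texts = ["the dog chased the cat", "dog the the cat chased"]
labels = torch.tensor([1, 0])

batch = tokenizer(texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few gradient steps stand in for full fine-tuning
    out = model(**batch, labels=labels)
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.eval()
with torch.no_grad():
    preds = model(**batch).logits.argmax(dim=-1)
print(preds.tolist())  # task predictions from the fine-tuned model
```

In practice, fine-tuning uses full task datasets, multiple epochs, and validation-based model selection; the point here is only the division of labor between pre-trained weights and a small task-specific head.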

Place, publisher, year, edition, pages
Frontiers Media S.A., 2022. Vol. 5, article id 796788
Keywords [en]
neural networks, language models, syntax, coding properties, representations, natural language understanding
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-492066
DOI: 10.3389/frai.2022.796788
ISI: 000915268600001
PubMedID: 36325030
OAI: oai:DiVA.org:uu-492066
DiVA id: diva2:1722927
Funder
Uppsala University
Available from: 2023-01-01. Created: 2023-01-01. Last updated: 2023-07-30. Bibliographically approved.
In thesis
1. The Search for Syntax: Investigating the Syntactic Knowledge of Neural Language Models Through the Lens of Dependency Parsing
2023 (English). Doctoral thesis, comprehensive summary (Other academic).
Abstract [en]

Syntax — the study of the hierarchical structure of language — has long featured as a prominent research topic in the field of natural language processing (NLP). Traditionally, its role in NLP was confined towards developing parsers: supervised algorithms tasked with predicting the structure of utterances (often for use in downstream applications). More recently, however, syntax (and syntactic theory) has factored much less into the development of NLP models, and much more into their analysis. This has been particularly true with the nascent relevance of language models: semi-supervised algorithms trained to predict (or infill) strings given a provided context. In this dissertation, I describe four separate studies that seek to explore the interplay between syntactic parsers and language models upon the backdrop of dependency syntax. In the first study, I investigate the error profiles of neural transition-based and graph-based dependency parsers, showing that they are effectively homogenized when leveraging representations from pre-trained language models. Following this, I report the results of two additional studies which show that dependency tree structure can be partially decoded from the internal components of neural language models — specifically, hidden state representations and self-attention distributions. I then expand on these findings by exploring a set of additional results, which serve to highlight the influence of experimental factors, such as the choice of annotation framework or learning objective, in decoding syntactic structure from model components. In the final study, I describe efforts to quantify the overall learnability of a large set of multilingual dependency treebanks — the data upon which the previous experiments were based — and how it may be affected by factors such as annotation quality or tokenization decisions. Finally, I conclude the thesis with a conceptual analysis that relates the aforementioned studies to a broader body of work concerning the syntactic knowledge of language models.
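The summary above notes that dependency tree structure can be partially decoded from hidden state representations and self-attention distributions. The following sketch, which is illustrative only and not the thesis's actual experimental code, shows one simple attention-based heuristic: predicting each word's head as the word it attends to most strongly in a single, arbitrarily chosen attention head of a BERT-style model. The model name, layer and head indices, and toy gold annotation are assumptions.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumption: any BERT-style encoder
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_attentions=True)
model.eval()

# Toy sentence with hypothetical gold heads (1-indexed; 0 = root).
words = ["the", "dog", "chased", "the", "cat"]
gold_heads = [2, 3, 0, 5, 3]  # purely illustrative annotation

enc = tokenizer(" ".join(words), return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

# out.attentions holds one (batch, heads, seq, seq) tensor per layer.
layer, head = 7, 4  # assumption: arbitrary layer/head to inspect
att = out.attentions[layer][0, head]  # (seq, seq)

# Drop [CLS]/[SEP]; assumes each word maps to a single subword here.
n = len(words)
word_att = att[1:n + 1, 1:n + 1]
# Predict each word's head as its most-attended word (root cannot be predicted).
pred_heads = (word_att.argmax(dim=-1) + 1).tolist()

# Unlabeled attachment agreement between the heuristic and the toy gold heads.
uas = sum(p == g for p, g in zip(pred_heads, gold_heads)) / n
print(f"predicted heads: {pred_heads}, toy UAS: {uas:.2f}")
```

Probing studies of this kind typically aggregate over many sentences, layers, and heads, or train supervised probes on hidden states, rather than reading off a single head; the snippet only illustrates the tensor shapes involved and the idea of scoring unlabeled attachment agreement.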

Place, publisher, year, edition, pages
Uppsala: Acta Universitatis Upsaliensis, 2023. p. 101
Series
Studia Linguistica Upsaliensia, ISSN 1652-1366 ; 30
Keywords
syntax, language models, dependency parsing, universal dependencies
National Category
Language Technology (Computational Linguistics)
Research subject
Computational Linguistics
Identifiers
URN: urn:nbn:se:uu:diva-508379
ISBN: 978-91-513-1850-9
Public defence
2023-09-22, Humanistiska Teatern, Engelska parken, Thunbergsvägen 3C, Uppsala, 14:00 (English)
Available from: 2023-08-24 Created: 2023-07-30 Last updated: 2023-08-24

Open Access in DiVA

fulltext (349 kB), 238 downloads
File information
File name: FULLTEXT01.pdf
File size: 349 kB
Checksum (SHA-512): 3e8a986e069e7970858fade143e210433ea74b192b0cf741ee840c14f133b122ddd7b9b6d5ffe9827bf4f9c44575f90e9ed670d21736c87c7c4e7a22815ba417
Type: fulltext
Mimetype: application/pdf

Other links

Publisher's full text
PubMed

Authority records

Kulmizev, Artur; Nivre, Joakim
