uu.se Publications from Uppsala University
Deep active learning for data mining from conflict text corpora
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Social Sciences, Department of Peace and Conflict Research. Peace Research Institute, Oslo, Norway. ORCID iD: 0000-0002-5372-7129
(English) Manuscript (preprint) (Other academic)
Abstract [en]

High-resolution event data on armed conflict and related processes have revolutionized the study of political contention. However, most datasets of this type collect only spatio-temporal and conflict-intensity data at that level of detail. Information on conflict dynamics, such as targets, tactics, and purposes, is rarely collected because of the substantial effort required. This study proposes an inexpensive, high-performance approach to increasing the feature richness of such datasets by leveraging active learning -- an iterative process of improving a machine learning model based on guided human input at each step of the learning process. Active learning is then used to fine-tune (train in steps) a large, encoder-only language model fitted to the rich corpus of textual data underlying such datasets. This allows for the extraction of features related to conflict dynamics, such as electoral violence and attacks on religious targets. The approach achieves performance comparable to human (gold-standard) coding while reducing the necessary human annotation by as much as 99 percent.
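
The iterative loop described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: a tiny logistic regression stands in for the encoder-only language model, and uncertainty sampling is assumed as the query strategy, which the abstract does not specify. The names `train`, `predict_proba`, `active_learning`, and `oracle` (the human annotator) are hypothetical.

```python
import math
import random


def predict_proba(w, x):
    """Probability of the positive class; the last entry of w is the bias."""
    z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def train(labeled, dim, epochs=200, lr=0.5):
    """Fit logistic regression by SGD (a stand-in for fine-tuning a language model)."""
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        for x, y in labeled:
            grad = predict_proba(w, x) - y
            for i, xi in enumerate(x):
                w[i] -= lr * grad * xi
            w[-1] -= lr * grad
    return w


def active_learning(pool, oracle, dim, seed_size=4, batch_size=2, rounds=5, rng=None):
    """Label a small seed, then repeatedly query the most uncertain items.

    ASSUMPTION: uncertainty sampling (predicted probability closest to 0.5)
    is used to choose what the human annotator labels next.
    """
    rng = rng or random.Random(0)
    unlabeled = list(pool)
    rng.shuffle(unlabeled)
    labeled = [(x, oracle(x)) for x in unlabeled[:seed_size]]
    unlabeled = unlabeled[seed_size:]
    w = train(labeled, dim)
    for _ in range(rounds):
        if not unlabeled:
            break
        # Most uncertain items first.
        unlabeled.sort(key=lambda x: abs(predict_proba(w, x) - 0.5))
        queries, unlabeled = unlabeled[:batch_size], unlabeled[batch_size:]
        labeled += [(x, oracle(x)) for x in queries]  # guided human input
        w = train(labeled, dim)  # retrain on the enlarged labelled set
    return w, len(labeled)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic 2-D feature vectors; in practice these would be text items.
    pool = [(rng.random(), rng.random()) for _ in range(200)]
    oracle = lambda x: 1 if x[0] + x[1] > 1.0 else 0  # simulated annotator
    w, n_labeled = active_learning(pool, oracle, dim=2)
    accuracy = sum((predict_proba(w, x) > 0.5) == bool(oracle(x)) for x in pool) / len(pool)
    print(f"labelled {n_labeled} of {len(pool)} items, pool accuracy {accuracy:.2f}")
```

The point of the sketch is the budget: only the seed and the queried items are ever shown to the annotator, so the number of human labels is fixed by `seed_size + rounds * batch_size` regardless of pool size.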

National Category
Other Social Sciences not elsewhere specified; Peace and Conflict Studies; Political Science (excluding Public Administration Studies and Globalisation Studies); Computer Sciences
Research subject
Peace and Conflict Research; Machine learning
Identifiers
URN: urn:nbn:se:uu:diva-544706
OAI: oai:DiVA.org:uu-544706
DiVA, id: diva2:1919199
Part of project
Societies at risk: The impact of armed conflict on human development, Riksbankens Jubileumsfond
Available from: 2024-12-07 Created: 2024-12-07 Last updated: 2025-02-20
In thesis
1. Forecasting battles: New machine learning methods for predicting armed conflict
2025 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over the past decade, the field of conflict forecasting has undergone a remarkable metamorphosis, transforming from a series of isolated efforts with low predictive power into large, globe-spanning projects with impressive performance. However, despite this evolution, many challenges remain. First, while we are good at predicting absolute risks, we are poor at predicting conflict dynamics (onsets, escalations, de-escalations and terminations). Second, we are over-reliant on spatio-temporal features and mechanistic models due to the nature of the event data we use, thus excluding actor agency. Third, we handle neither data uncertainty nor model uncertainty. Fourth, we are lagging behind the state of the art in machine learning. This dissertation attempts to resolve some of these salient difficulties by contributing to six core elements of current-generation forecasting systems. First, time, by looking at the substantive effects and uncertainties of the temporal distance between data and forecast horizons. Second, space, by looking at the inherent uncertainties of high-resolution geospatial data and proposing a statistical method to address them. Third, feature space, by tackling the extreme feature sparsity in event data and proposing a novel deep active learning approach to mine features from existing large conflict-related text corpora. Fourth, substantive knowledge, by combining findings from the previous papers to take a fresh look at the microdynamics of conflict escalation. Fifth, the forecasting process itself, by building models that forecast directly from text, eliminating the intermediate step of manual data curation. Finally, the frontier of event data, by looking at whether the news-media-heavy way we collect violent fatal events can be extended to the collection of non-violent events.
Methodologically, the dissertation introduces state-of-the-art methods to the field, including the use of large language models, Gaussian processes, active learning and deep time series modelling. The six papers in the dissertation exhibit significant performance improvements, especially in forecasting dynamics.

Place, publisher, year, edition, pages
Uppsala: Uppsala University, 2025. p. 62
Series
Report / Department of Peace and Conflict Research, ISSN 0566-8808 ; 132
Keywords
conflict forecasting, predictive methodology, event data, battle events, spatial forecasting, machine learning, large language models, computational linguistics, civil war, armed conflict
National Category
Political Science (excluding Public Administration Studies and Globalisation Studies); Other Social Sciences not elsewhere specified; Peace and Conflict Studies; Computer Sciences; Social and Economic Geography
Research subject
Peace and Conflict Research; Computational Linguistics; Political Science; Social and Economic Geography; Machine learning
Identifiers
urn:nbn:se:uu:diva-545176 (URN)
978-91-506-3086-2 (ISBN)
Public defence
2025-03-21, Brusewitzsalen, Gamla Torget 6, Uppsala, 13:15 (English)
Available from: 2025-01-27 Created: 2024-12-12 Last updated: 2025-02-20

Open Access in DiVA

No full text in DiVA

Other links

Preprint at arXiv

Authority records

Croicu, Mihai
