uu.se Uppsala University Publications
1 - 22 of 22
  • 1.
    Berglund, Karl
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of Literature, Sociology of Literature.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Määttä, Jerry
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of Literature.
    Apples and Oranges? Large-Scale Thematic Comparisons of Contemporary Swedish Popular and Literary Fiction. 2019. In: Samlaren: tidskrift för svensk litteraturvetenskaplig forskning, ISSN 0348-6133, E-ISSN 2002-3871, Vol. 140, p. 228-260. Article in journal (Refereed)
    Abstract [en]

    The aim of this article is to compare thematic trends in contemporary Swedish bestselling and literary fiction with the help of a computational method—topic modelling—which extracts content themes based on statistical patterns of word usage. This procedure allows us to identify trends and patterns that are not easily discovered through manual reading. We track topics in two subsets of Swedish fiction from the period 2004–2017: 1) prose fiction on the Swedish bestseller charts, and 2) prose fiction shortlisted for the August Prize (arguably the most prestigious Swedish literary prize). The results confirm several assumptions about contemporary popular and literary fiction, such as more plot-focused themes in popular fiction and themes more connected to settings in literary fiction. But the outcomes also provide new, and more surprising knowledge, such as food and economy being the most biased themes among the non-crime fiction bestsellers, whereas themes concerning nature are most biased in the literary realm. Moreover, themes relating to sex, intimacy, and violence are biased towards literary fiction rather than popular fiction. In the light of our findings, we argue that both popular fiction and literary fiction seem to be characterised by certain thematic attributes that make it relevant to discuss them as genres also on a textual-thematic level.

  • 2.
    Berglund, Karl
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of Literature, Sociology of Literature.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Määttä, Jerry
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of Literature.
    Supplementary material for “Apples and Oranges? Large-Scale Thematic Comparisons of Contemporary Swedish Popular and Literary Fiction” (Samlaren, 2019). 2019. Report (Other academic)
    Abstract [en]

    The report provides raw listings of the results of topic modeling experiments, intended for readers interested in taking a closer look at these. Explanations and discussion are found in the main article: “Apples and Oranges? Large-Scale Thematic Comparisons of Contemporary Swedish Popular and Literary Fiction” published in the journal Samlaren, 2019.

  • 3.
    Borin, Lars
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Dahllöf, Mats
    A corpus-based grammar tutor for education in language and speech technology. 1999. In: Proceedings of the EACL '99 Post-Conference Workshop on Computer and Internet Supported Education in Language and Speech Technology, The Association for Computational Linguistics, Bergen, 1999, p. 36-43. Conference paper (Refereed)
    Abstract [en]

    We describe work in progress on a corpus-based tutoring system for education in traditional and formal grammar. It is mainly intended for language and speech technology students and gives them the opportunity to learn grammar and grammatical analysis

  • 4.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    An Annotational Approach to Compositional Semantics. 2003. In: Proceedings of the Second Workshop on Treebanks and Linguistic Theories (TLT 2003), Växjö University Press, 2003, p. 33-44. Conference paper (Refereed)
  • 5.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    An Implementation of Token Dependency Semantics for a Fragment of English. 2003. Report (Other academic)
  • 6.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Automatic prediction of gender, political affiliation, and age in Swedish politicians from the wording of their speeches: A comparative study of classifiability. 2012. In: Literary & Linguistic Computing, ISSN 0268-1145, E-ISSN 1477-4615, Vol. 27, no 2, p. 139-153. Article in journal (Refereed)
    Abstract [en]

    The present study explores automatic classification of Swedish politicians and their speeches into classes based on personal traits (gender, age, and political affiliation) as a means for measuring and analyzing how these traits influence language use. Support Vector Machines classified 200-word passages, represented by binary bag-of-word-forms vectors. Different feature selections were tried. The performance of the classifiers was assessed using test data from authors unseen in the training data. Author-level predictions derived from twenty-one text-level predictions reached an accuracy rate of 81.2% for gender, 89.4% for political affiliation, and 78.9% for age. Classification concerning each basic distinction was applied to general populations of politicians and to cohorts defined by the other classes. The outcomes suggest that the extent to which these personal traits are expressed in language use varies considerably among the different cohorts and that different traits affect different layers of the vocabulary. The accuracy rates for gender classification were higher for the right wing and older cohorts than for the opposite ones. Age prediction gave higher accuracy for the right wing cohort. Political classification gave the highest accuracy rates when all forms were included in the feature sets, whereas feature sets restricted to verbs or function words gave the highest scores for gender prediction, and the lowest ones for political classification.

  • 7.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Automatic Scribe Attribution for Medieval Manuscripts. 2018. In: Digital Medievalist, ISSN 1715-0736, E-ISSN 1715-0736, Vol. 11, no 1, p. 1-26, article id 6. Article in journal (Refereed)
    Abstract [en]

    We propose an automatic method for attributing manuscript pages to scribes. The system uses digital images as published by libraries. The attribution process involves extracting from each query page approximately letter-size components. This is done by means of binarization (ink-background separation), connected component labelling, and further segmentation, guided by the estimated typical stroke width. Components are extracted in the same way from the pages of known scribal origin. This allows us to assign a scribe to each query component by means of nearest-neighbour classification. Distance (dissimilarity) between components is modelled by simple features capturing the distribution of ink in the bounding box defined by the component, together with Euclidean distance. The set of component-level scribe attributions, which typically includes hundreds of components for a page, is then used to predict the page scribe by means of a voting procedure. The scribe who receives the largest number of votes from the 120 strongest component attributions is proposed as its scribe. The scribe attribution process allows the argument behind an attribution to be visualized for a human reader. The writing components of the query page are exhibited along with the matching components of the known pages. This report is thus open to inspection and analysis using the methods and intuitions of traditional palaeography. The present system was evaluated on a data set covering 46 medieval scribes, writing in Carolingian minuscule, Bastarda, and a few other scripts. The system achieved a mean top-1 accuracy of 98.3% as regards the first scribe proposed for each page, when the labelled data comprised one randomly selected page from each scribe and nine unseen pages for each scribe were to be attributed in the validation procedure. The experiment was repeated 50 times to even out random variation effects.
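
    The component-level attribution and page-level voting described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the function names, the plain-list feature vectors, and the reading of "strongest" attributions as those with the smallest nearest-neighbour distance are all assumptions made for the example.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def attribute_page(query_components, labelled_components, top_k=120):
    """Propose a scribe for a page (hypothetical helper).

    query_components: feature vectors extracted from the query page.
    labelled_components: (feature_vector, scribe) pairs of known origin.
    """
    hits = []
    for query in query_components:
        # Nearest-neighbour classification: each query component takes
        # the scribe label of its closest labelled component.
        distance, scribe = min(
            ((euclidean(query, vec), label) for vec, label in labelled_components),
            key=lambda hit: hit[0],
        )
        hits.append((distance, scribe))

    # Assumption: a smaller distance means a stronger attribution.
    hits.sort(key=lambda hit: hit[0])
    votes = Counter(scribe for _, scribe in hits[:top_k])
    return votes.most_common(1)[0][0]
```

    With two labelled components, `[([0.0, 0.0], "A"), ([1.0, 1.0], "B")]`, a query page whose components lie mostly near the first one is attributed to scribe "A".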

  • 8.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Clustering Writing Components from Medieval Manuscripts. 2018. In: COMHUM 2018: Book of Abstracts for the Workshop on Computational Methods in the Humanities 2018 / [ed] Piotrowski, Michael, Lausanne, 2018, p. 11-13. Conference paper (Refereed)
  • 9.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Clustering writing components from medieval manuscripts. 2019. In: Proceedings of the Workshop on Computational Methods in the Humanities 2018 / [ed] Michael Piotrowski, 2019, p. 23-32. Conference paper (Refereed)
    Abstract [en]

    This article explores a minimally supervised method for extracting components, mostly letters, from historical manuscripts, and clustering them into classes capturing linguistic equivalence. The clustering uses the DBSCAN algorithm and an additional classification step. This pipeline gives us cheap, but partial, manuscript transcription in combination with human annotation. Experiments with different parameter settings suggest that a system like this should be tuned separately for different categories, rather than rely on one-pass application of algorithms partitioning the same components into non-overlapping clusters. The method could also be used to extract features for manuscript classification, e.g. dating and scribe attribution, as well as to extract data for further palaeographic analysis.

  • 10.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Code and Data for “Classification of Medieval Documents: Determining the Issuer, Place of Issue, and Decade for Old Swedish Charters”. 2020. Data set
    Abstract [en]

    Code and data for the article Classification of Medieval Documents: Determining the Issuer, Place of Issue, and Decade for Old Swedish Charters (to appear in DHN2020, Digital Humanities in the Nordic Countries, Riga, 17-20 March 2020).

    The study based on this code and dataset is a comparative exploration of different classification tasks for Swedish medieval charters (transcriptions from the SDHK collection) and different classifier setups. In particular, we explore the identification of the issuer, place of issue, and decade of production. The experiments used features based on lowercased words and character 3- and 4-grams. We evaluated the performance of two learning algorithms: linear discriminant analysis and decision trees. For evaluation, five-fold cross-validation was performed. We report accuracy and macro-averaged F1 score. The validation made use of six labeled subsets of SDHK combining the three tasks with Old Swedish and Latin. Issuer identification for the Latin dataset (595 charters from 12 issuers) reached the highest scores, above 0.9, for the decision tree classifier using word features. The best corresponding accuracy for Old Swedish was 0.81. Place and decade identification produced lower performance scores for both languages. Which classifier design is the best one seems to depend on peculiarities of the dataset and the classification task. The present study does however support the idea that text classification is useful also for medieval documents characterized by extreme spelling variation.
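
    Two ingredients named in this abstract, character n-gram features over lowercased text and the macro-averaged F1 score, can be illustrated with a short standard-library sketch. The function names are hypothetical and the code is not taken from the published data set; it only shows what the two terms compute.

```python
from collections import Counter


def char_ngrams(text, sizes=(3, 4)):
    """Count character n-grams of the given sizes in lowercased text."""
    text = text.lower()
    counts = Counter()
    for n in sizes:
        for start in range(len(text) - n + 1):
            counts[text[start:start + n]] += 1
    return counts


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels that occur."""
    labels = set(gold) | set(predicted)
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        scores.append(2 * precision * recall / total if total else 0.0)
    return sum(scores) / len(scores)
```

    Because macro averaging weights every class equally, a classifier that does well only on the most frequent issuer or decade scores lower on macro F1 than on accuracy, which is presumably why both measures are reported.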

  • 11.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Här kommer vi! 2012. In: Språktidningen, ISSN 1654-5028, no 7, p. 44-49. Article in journal (Other (popular science, discussion, etc.))
  • 12.
    Dahllöf, Mats
    Göteborgs universitet.
    On the Semantics of Propositional Attitude Reports. 1995. Doctoral thesis, monograph (Other academic)
  • 13.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Predicting the Scribe Behind a Page of Medieval Handwriting. 2014. Conference paper (Refereed)
  • 14.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Prolog-Embedding Typed Feature Structure Grammar (PETFSG-II.2) and Grammar Tool. 2003. Report (Other academic)
  • 15.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Scribe attribution for early medieval handwriting by means of letter extraction and classification and a voting procedure for larger pieces. 2014. In: 22nd International Conference on Pattern Recognition (ICPR), 2014, p. 1910-1915. Conference paper (Refereed)
    Abstract [en]

    The present study investigates a method for the attribution of scribal hands, inspired by traditional palaeography in being based on comparison of letter shapes. The system was developed for and evaluated on early medieval Caroline minuscule manuscripts. The generation of a prediction for a page image involves writing identification, letter segmentation, and letter classification. The system then uses the letter proposals to predict the scribal hand behind a page. Letters and sequences of connected letters are identified by means of connected component labeling and split into letter-size pieces. The hand (and character) prediction makes use of a dataset containing instances of the letters b, d, p, and q, cut out from manuscript pages whose scribal origin is known. Letters are represented by features capturing the distribution of foreground. Cosine similarity is used for nearest neighbor classification. The hand behind a page is finally predicted by means of a voting procedure taking the highest scoring letter-level hits as its input. This hand prediction method was evaluated on pages from five different hands and reached an accuracy above 99% for four of them and 87% for a fifth significantly more difficult one. The hand behind single toplisted letters was correctly predicted in 83% of the cases.

  • 16.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Språklig betydelse: En introduktion till semantik och pragmatik. 1999. Book (Refereed)
  • 17.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Three papers on computational syntax and semantics. 1999. Report (Other academic)
  • 18.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Token Dependency Semantics and the Paratactic Analysis of Intensional Constructions. 2002. In: Journal of Semantics, ISSN 0167-5133, E-ISSN 1477-4593, Vol. 19, no 4, p. 333-368. Article in journal (Refereed)
    Abstract [en]

    This article introduces Token Dependency Semantics (TDS), a surface‐oriented and token‐based framework for compositional truth‐conditional semantics. It is motivated by Davidson's ‘paratactic’ analysis of semantic intensionality (‘On Saying That’, 1968, Synthèse 19: 130–146), which has been much discussed in philosophy. This is the first fully‐fledged formal implementation of Davidson's proposal. Operator‐argument structure and scope are captured by means of relations among tokens. Intensional constituent tokens represent ‘propositional’ contents directly. They serve as arguments to the words introducing intensional contexts, rather than being ‘ordinary’ constituents. The treatment of de re readings involves the use of functions (‘anchors’) assigning entities to argument positions of lexical tokens. Quantifiers are thereby allowed to bind argument places on content tokens. This gives us a simple underspecification‐based account of scope ambiguity. The TDS framework is applied to indirect speech reports, mental attitude sentences, control verbs, and modal and agent‐relative sentence adverbs in English. This semantics is compatible with a traditional view of syntax. Here, it is integrated into a Head‐driven Phrase Structure Grammar (HPSG). The result is a straightforward and ontologically parsimonious analysis of truth‐conditional meaning and semantic intensionality.

  • 19.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Two Reports on Computational Syntax and Semantics. 2003. Report (Other academic)
  • 20.
    Dahllöf, Mats
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Berglund, Karl
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Arts, Department of Literature, Sociology of Literature.
    Faces, Fights, and Families: Topic Modeling and Gendered Themes in Two Corpora of Swedish Prose Fiction. 2019. In: DHN 2019 Copenhagen, Proceedings of 4th Conference of The Association Digital Humanities in the Nordic Countries Copenhagen, March 6-8 2019 / [ed] Constanza Navaretta et al., 2019, p. 92-111. Conference paper (Refereed)
    Abstract [en]

    This paper explores topic modeling (TM) as a tool for “distant reading” of two Swedish literary corpora. We investigate what kinds of insight and knowledge a TM-based approach can provide to Swedish literary history, and which methodological difficulties are associated with this endeavour. The TM is based on 12- and 24-term chunks of selected verb and common noun lemmas. We generate models with 20, 40, and 100 topics. We also propose a method for a quantitative and qualitative gendered thematic analysis by combining TM with a study of how the topics relate to gender in characters and authors. The two corpora contain, respectively, Swedish classics (1821–1941) and recent bestsellers (2004–2017). We find that most of the topics proposed by the TM are easy to interpret as conceptual themes, and that the “same” themes appear for the two corpora and for different TM settings. The study allows us to make interesting observations concerning different aspects of gender and topic distribution.

  • 21.
    Wahlberg, Fredrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Centre for Image Analysis. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Mårtensson, Lasse
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Scandinavian Languages.
    Brun, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Centre for Image Analysis. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Data Mining Medieval Documents by Word Spotting. 2011. In: Proceedings of the 2011 Workshop on Historical Document Imaging and Processing, New York: ACM, 2011, p. 75-82. Conference paper (Refereed)
    Abstract [en]

    This paper presents novel results for word spotting based on dynamic time warping applied to medieval manuscripts in Latin and Old Swedish. A target word is marked by a user, and the method automatically finds similar word forms in the document by matching them against the target. The method automatically identifies pages and lines. We show that our method improves accuracy compared to earlier proposals for this kind of handwriting. An advantage of the new method is that it performs matching within a text line without presupposing that the difficult problem of segmenting the text line into individual words has been solved. We evaluate our word spotting implementation on two medieval manuscripts representing two script types. We also show that it can be useful by helping a user find words in a manuscript and present graphs of word statistics as a function of page number.
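
    Dynamic time warping, the matching technique this abstract builds on, can be sketched in its textbook form as below. The sketch assumes each stretch of writing has already been reduced to a one-dimensional sequence of numbers; the actual features, local distance, and any constraints used in the paper are not given in the abstract, so everything here is illustrative.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences.

    Uses absolute difference as the local distance (an assumption) and
    allows the standard match, insertion, and deletion steps.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] repeats against b[j-1]
                cost[i][j - 1],      # b[j-1] repeats against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

    Because one element may align with several elements of the other sequence, a word written slightly wider or narrower than the target can still match at low cost: `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is `0.0`.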

  • 22.
    Wahlberg, Fredrik
    et al.
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Visual Information and Interaction. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Dahllöf, Mats
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Mårtensson, Lasse
    Brun, Anders
    Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Division of Visual Information and Interaction. Uppsala University, Disciplinary Domain of Science and Technology, Mathematics and Computer Science, Department of Information Technology, Computerized Image Analysis and Human-Computer Interaction.
    Spotting words in medieval manuscripts. 2014. In: Studia Neophilologica, ISSN 0039-3274, E-ISSN 1651-2308, Vol. 86, p. 171-186. Article in journal (Refereed)
    Abstract [en]

    This article discusses the technology of handwritten text recognition (HTR) as a tool for the analysis of historical handwritten documents. We give a broad overview of this field of research, but the focus is on the use of a method called 'word spotting' for finding words directly and automatically in scanned images of manuscript pages. We illustrate and evaluate this method by applying it to a medieval manuscript. Word spotting uses digital image analysis to represent stretches of writing as sequences of numerical features. These are intended to capture the linguistically significant aspects of the visual shape of the writing. Two potential words can then be compared mathematically and their degree of similarity assigned a value. Our version of this method gives a false positive rate of about 30%, when the true positive rate is close to 100%, for an application where we search for very frequent short words in a 16th-century Old Swedish cursiva recentior manuscript. Word spotting would be of use e.g. to researchers who want to explore the content of manuscripts when editions or other transcriptions are unavailable.
