Uppsala University Publications
  • 1.
    Adams, Allison
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Dependency Parsing and Dialogue Systems: an investigation of dependency parsing for commercial application. 2017. Independent thesis, Advanced level (degree of Master (Two Years)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    In this thesis, we investigate dependency parsing for commercial application, namely for future integration in a dialogue system. To do this, we conduct several experiments on dialogue data to assess parser performance on this domain, and to improve this performance over a baseline. This work makes the following contributions: first, the creation and manual annotation of a gold-standard data set for dialogue data; second, a thorough error analysis of the data set, comparing neural network parsing to traditional parsing methods on this domain; and finally, various domain adaptation experiments that show how parsing on this data set can be improved over a baseline. We further show that dialogue data is characterized by questions in particular, and suggest a method for improving overall parsing on these constructions.

  • 2.
    Adams, Allison
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Stymne, Sara
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Learning with learner corpora: Using the TLE for native language identification. 2017. In: Proceedings of the joint workshop on NLP for Computer Assisted Language Learning and NLP for Language Acquisition, 2017, p. 1-7. Conference paper (Refereed)
    Abstract [en]

    This study investigates the usefulness of the Treebank of Learner English (TLE) when applied to the task of Native Language Identification (NLI). The TLE is effectively a parallel corpus of Standard/Learner English, as there are two versions: one based on original learner essays, and the other an error-corrected version. We use the corpus to explore how useful a parser trained on ungrammatical relations is compared to a parser trained on grammatical relations, when their output is used as features for a native language classification task. While parsing results are much better when trained on grammatical relations, native language classification is slightly better using a parser trained on the original treebank containing ungrammatical relations.

  • 3. Agić, Zeljko
    et al.
    Tiedemann, Jörg
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Merkler, Danijela
    Krek, Simon
    Dobrovoljc, Kaja
    Moze, Sara
    Cross-lingual Dependency Parsing of Related Languages with Rich Morphosyntactic Tagsets. 2014. In: Proceedings of the EMNLP’2014 Workshop on Language Technology for Closely Related Languages and Language Variants, 2014, p. 13-24. Conference paper (Refereed)
  • 4.
    Ahlbom, Viktoria
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Sågvall Hein, Anna
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Test Suites Covering the Functional Specifications of the Sub-components of the Swedish Prototype. 1999. In: Working Papers in Computational Linguistics & Language Engineering;13, ISSN 1401-923X, no 13, p. 28-. Article in journal (Other academic)
  • 5.
    Ahlbom, Viktoria
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Sågvall Hein, Anna
    Test Suites Covering the Functional Specifications of the Sub-components of the Swedish Prototype. 1999. In: Working Papers in Computational Linguistics & Language Engineering;13, ISSN 1401-923X, no 13, p. 28-. Article in journal (Other scientific)
  • 6.
    Ahrenberg, Lars and Merkel, Magnus and Ridings, Daniel and Sågvall Hein, Anna and Tiedemann, Jörg
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Automatic processing of parallel corpora: A Swedish perspective. 1999. Report (Other scientific)
    Abstract [en]

    As empirical methods have come to the fore in language technology and translation studies, the processing of parallel texts and parallel corpora has become a major issue. In this article we review the state of the art in alignment and data extraction tec

  • 7. Ahrenberg, Lars
    et al.
    Merkel, Magnus
    Sågvall Hein, Anna
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Tiedemann, Jörg
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Evaluation of LWA and UWA. 1999. Report (Other academic)
  • 8. Alemu, Atelach
    et al.
    Hulth, Anette
    Megyesi, Beata
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. Computational Linguistics.
    General-Purpose Text Categorization Applied to the Medical Domain. 2007. Report (Other academic)
    Abstract [en]

    This paper presents work where a general-purpose text categorization method was applied to categorize medical free-texts. The purpose of the experiments was to examine how such a method performs without any domain-specific knowledge, hand-crafting or tuning. Additionally, we compare the results from the general-purpose method with results from runs in which a medical thesaurus as well as automatically extracted keywords were used when building the classifiers. We show that standard text categorization techniques using stemmed unigrams as the basis for learning can be applied directly to categorize medical reports, yielding an F-measure of 83.9, and outperforming the more sophisticated methods.

  • 9. Almqvist, Ingrid
    et al.
    Sågvall Hein, Anna
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Defining ScaniaSwedish - A Controlled Language for Truck Maintenance. 1996. In: Proceedings of the First International Workshop on Controlled Language Applications, Centre for Computational Linguistics. Katholieke Universiteit Leuven, 1996. Conference paper (Refereed)
    Abstract [en]

    An approach to integrated multilingual document production is proposed. The basic idea of this approach is to use the analyzer of a modular, transfer-based machine translation system as the core of a language checker. The checker generates grammatical structures to be forwarded to the transfer and generation components for the various target languages. A precondition for such an approach is a controlled source language. The source language in focus in this presentation is ScaniaSwedish, to be defined via a standardization of the language presently used by Scania in their truck maintenance documents. Here we concentrate on the identification of the vocabulary of current ScaniaSwedish and present the results that we have achieved so far. In parallel with the inventory of the vocabulary, the competence of the language checker is developed.

  • 10. Almqvist, Ingrid
    et al.
    Sågvall Hein, Anna
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    A Language Checker of Controlled Language and its Integration in a Documentation and Translation Workflow. 2000. In: Translating and the Computer 22: Proceedings of the Twenty-second international conference, 16-17 November, 2000, London, London: Aslib, 2000, Vol. 22. Conference paper (Refereed)
  • 11.
    Andréasson, Maia
    et al.
    Department of Swedish Language, University of Gothenburg.
    Borin, Lars
    Department of Swedish Language, University of Gothenburg.
    Forsberg, Markus
    Department of Swedish Language, University of Gothenburg.
    Beskow, Jonas
    School of Computer Science and Communication, KTH.
    Carlsson, Rolf
    School of Computer Science and Communication, KTH.
    Edlund, Jens
    School of Computer Science and Communication, KTH.
    Elenius, Kjell
    School of Computer Science and Communication, KTH.
    Hellmer, Kahl
    School of Computer Science and Communication, KTH.
    House, David
    School of Computer Science and Communication, KTH.
    Merkel, Magnus
    Department of Computer Science, Linköping University.
    Forsbom, Eva
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Megyesi, Beáta
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Eriksson, Anders
    Department of Philosophy, Linguistics and Theory of Science, University of Gothenburg.
    Strömqvist, Sven
    Centre for Languages and Literature, Lund University.
    Swedish CLARIN Activities. 2009. In: Proceedings of the NODALIDA 2009 workshop Nordic Perspectives on the CLARIN Infrastructure of Language Resources / [ed] Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard, Eiríkur Rögnvaldsson and Koenraad de Smedt, Northern European Association for Language Technology (NEALT), 2009, p. 1-5. Conference paper (Refereed)
    Abstract [en]

    Although Sweden has yet to allocate funds specifically intended for CLARIN activities, there are some ongoing activities which are directly relevant to CLARIN, and which are explicitly linked to CLARIN. These activities have been funded by the Committee for Research Infrastructures and its subcommittee DISC (Database Infrastructure Committee) of the Swedish Research Council.

  • 12.
    Antomonov, Filip
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Megyesi, Beata
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Automatic Morphosyntactic Analysis of Clinical Text. 2014. Conference paper (Refereed)
    Abstract [en]

    Electronic health records, also called clinical texts, have their own linguistic characteristics and have been shown to deviate from standard language. Therefore, computational linguistics tools trained on standard language presumably do not achieve the same accuracy when applied to clinical data. In this paper, we describe a pipeline of tools for the automatic processing of clinical texts in Swedish from tokenization through part-of-speech tagging and dependency parsing. The evaluation of the components of the pipeline shows that existing NLP tools can be used, but performance drops greatly when models trained on standard language are applied to clinical data. We also present a small, syntactically annotated data set of clinical text to serve as gold standard.

  • 13.
    Axelsson, Hans
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Blom, Oskar
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Utveckling av ett svensk-engelskt lexikon inom tåg- och transportdomänen. 2006. Independent thesis, Advanced level (degree of Master (One Year)), 20 credits / 30 HE credits. Student thesis
    Abstract [en]

    This paper describes the process of building a machine translation lexicon for use in the train and transport domain with the machine translation system MATS. The lexicon will consist of a Swedish part, an English part and links between them, and is derived from a Trados translation memory which is split into a training part (90%) and a testing part (10%). The task is carried out mainly by using existing word linking software and recycling previous machine translation lexicons from other domains. In order to do this, a method is developed where focus lies on automation by means of both existing and self-developed software, in combination with manual interaction. The domain specific lexicon is then extended with a domain neutral core lexicon and a less domain neutral general lexicon. The different lexicons are automatically and manually evaluated through machine translation on the test corpus. The automatic evaluation of the largest lexicon yielded a NEVA score of 0.255 and a BLEU score of 0.190. The manual evaluation found 34% of the segments correctly translated, 37% incorrect but perfectly understandable, and 29% difficult to understand.

  • 14. Ballesteros, Miguel
    et al.
    Gómez-Rodríguez, Carlos
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Optimizing Planar and 2-Planar Parsers with MaltOptimizer. 2012. In: Revista de Procesamiento de Lenguaje Natural (SEPLN), ISSN 1135-5948, E-ISSN 1989-7553, Vol. 49, p. 171-178. Article in journal (Refereed)
  • 15. Ballesteros, Miguel
    et al.
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Going to the Roots of Dependency Parsing. 2013. In: Computational linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 39, no 1, p. 5-13. Article in journal (Refereed)
    Abstract [en]

    Dependency trees used in syntactic parsing often include a root node representing a dummy word prefixed or suffixed to the sentence, a device that is generally considered a mere technical convenience and is tacitly assumed to have no impact on empirical results. We demonstrate that this assumption is false and that the accuracy of data-driven dependency parsers can in fact be sensitive to the existence and placement of the dummy root node. In particular, we show that a greedy, left-to-right, arc-eager transition-based parser consistently performs worse when the dummy root node is placed at the beginning of the sentence (following the current convention in data-driven dependency parsing) than when it is placed at the end or omitted completely. Control experiments with an arc-standard transition-based parser and an arc-factored graph-based parser reveal no consistent preferences but nevertheless exhibit considerable variation in results depending on root placement. We conclude that the treatment of dummy root nodes in data-driven dependency parsing is an underestimated source of variation in experiments and may also be a parameter worth tuning for some parsers.

  • 16. Ballesteros, Miguel
    et al.
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    MaltOptimizer: Fast and Effective Parser Optimization. 2016. In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 22, no 2, p. 187-213. Article in journal (Refereed)
    Abstract [en]

    Statistical parsers often require careful parameter tuning and feature selection. This is a nontrivial task for application developers who are not interested in parsing for its own sake, and it can be time-consuming even for experienced researchers. In this paper we present MaltOptimizer, a tool developed to automatically explore parameters and features for MaltParser, a transition-based dependency parsing system that can be used to train parsers given treebank data. MaltParser provides a wide range of parameters for optimization, including nine different parsing algorithms, an expressive feature specification language that can be used to define arbitrarily rich feature models, and two machine learning libraries, each with their own parameters. MaltOptimizer is an interactive system that performs parser optimization in three stages. First, it performs an analysis of the training set in order to select a suitable starting point for optimization. Second, it selects the best parsing algorithm and tunes the parameters of this algorithm. Finally, it performs feature selection and tunes machine learning parameters. Experiments on a wide range of data sets show that MaltOptimizer quickly produces models that consistently outperform default settings and often approach the accuracy achieved through careful manual optimization.

  • 17.
    Basirat, Ali
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Principal Word Vectors. 2018. Doctoral thesis, monograph (Other academic)
    Abstract [en]

    Word embedding is a technique for associating the words of a language with real-valued vectors, enabling us to use algebraic methods to reason about their semantic and grammatical properties. This thesis introduces a word embedding method called principal word embedding, which makes use of principal component analysis (PCA) to train a set of word embeddings for words of a language. The principal word embedding method involves performing a PCA on a data matrix whose elements are the frequency of seeing words in different contexts. We address two challenges that arise in the application of PCA to create word embeddings. The first challenge is related to the size of the data matrix on which PCA is performed and affects the efficiency of the word embedding method. The data matrix is usually a large matrix that requires a very large amount of memory and CPU time to be processed. The second challenge is related to the distribution of word frequencies in the data matrix and affects the quality of the word embeddings. We provide an extensive study of the distribution of the elements of the data matrix and show that it is unsuitable for PCA in its unmodified form.

    We overcome the two challenges in principal word embedding by using a generalized PCA method. The problem with the size of the data matrix is mitigated by a randomized singular value decomposition (SVD) procedure, which improves the performance of PCA on the data matrix. The data distribution is reshaped by an adaptive transformation function, which makes it more suitable for PCA. These techniques, together with a weighting mechanism that generalizes many different weighting and transformation approaches used in literature, enable the principal word embedding to train high quality word embeddings in an efficient way.

    We also provide a study on how principal word embedding is connected to other word embedding methods. We compare it to a number of word embedding methods and study how the two challenges in principal word embedding are addressed in those methods. We show that the other word embedding methods are closely related to principal word embedding and, in many instances, they can be seen as special cases of it.

    The principal word embeddings are evaluated in both intrinsic and extrinsic ways. The intrinsic evaluations are directed towards the study of the distribution of word vectors. The extrinsic evaluations measure the contribution of principal word embeddings to some standard NLP tasks. The experimental results confirm that the newly proposed features of principal word embedding (i.e., the randomized SVD algorithm, the adaptive transformation function, and the weighting mechanism) are beneficial to the method and lead to significant improvements in the results. A comparison between principal word embedding and other popular word embedding methods shows that, in many instances, the proposed method is able to generate word embeddings that are better than or as good as other word embeddings while being faster than several popular word embedding methods.

  • 18. Basirat, Ali
    et al.
    Fa, Heshaam
    Constructing Linguistically Motivated Structures from Statistical Grammars. 2011. In: Proceedings of the International Conference Recent Advances in Natural Language Processing 2011, 2011, p. 63-69. Conference paper (Refereed)
  • 19.
    Basirat, Ali
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology. University of Tehran.
    Faili, Heshaam
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    A statistical model for grammar mapping. 2016. In: Natural Language Engineering, ISSN 1351-3249, E-ISSN 1469-8110, Vol. 22, no 2, p. 215-255. Article in journal (Refereed)
    Abstract [en]

    The two main classes of grammars are (a) hand-crafted grammars, which are developed by language experts, and (b) data-driven grammars, which are extracted from annotated corpora. This paper introduces a statistical method for mapping the elementary structures of a data-driven grammar onto the elementary structures of a hand-crafted grammar in order to combine their advantages. The idea is employed in the context of Lexicalized Tree-Adjoining Grammars (LTAG) and tested on two LTAGs of English: the hand-crafted LTAG developed in the XTAG project, and the data-driven LTAG, which is automatically extracted from the Penn Treebank and used by the MICA parser. We propose a statistical model for mapping any elementary tree sequence of the MICA grammar onto a proper elementary tree sequence of the XTAG grammar. The model has been tested on three subsets of the WSJ corpus that have average lengths of 10, 16, and 18 words, respectively. The experimental results show that full-parse trees with average F1-scores of 72.49, 64.80, and 62.30 points could be built from 94.97%, 96.01%, and 90.25% of the XTAG elementary tree sequences assigned to the subsets, respectively. Moreover, by reducing the amount of syntactic lexical ambiguity of sentences, the proposed model significantly improves the efficiency of parsing in the XTAG system.

  • 20.
    Basirat, Ali
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Real-valued Syntactic Word Vectors (RSV) for Greedy Neural Dependency Parsing. 2017. Conference paper (Refereed)
    Abstract [en]

    We show that a set of real-valued word vectors formed by right singular vectors of a transformed co-occurrence matrix is meaningful for determining different types of dependency relations between words. Our experimental results on the task of dependency parsing confirm the superiority of these word vectors over other sets of word vectors generated by popular methods of word embedding. We also study the effect of using these vectors on the accuracy of dependency parsing in different languages versus using more complex parsing architectures.

  • 21. Basirat, Ali
    et al.
    Tang, Marc
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Neural network and human cognition: A case study of grammatical gender in Swedish. 2017. In: Proceedings of the 13th Swedish Cognitive Science Society (SweCog) national conference, Uppsala, 2017, p. 28-30. Conference paper (Other academic)
  • 22.
    Beck, Daniel
    et al.
    University of Sheffield.
    Cohn, Trevor
    University of Melbourne.
    Hardmeier, Christian
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Specia, Lucia
    University of Sheffield.
    Learning Structural Kernels for Natural Language Processing. 2015. In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, Vol. 3, p. 461-473. Article in journal (Refereed)
    Abstract [en]

    Structural kernels are a flexible learning paradigm that has been widely used in Natural Language Processing. However, the problem of model selection in kernel-based methods is usually overlooked. Previous approaches mostly rely on setting default values for kernel hyperparameters or using grid search, which is slow and coarse-grained. In contrast, Bayesian methods allow efficient model selection by maximizing the evidence on the training data through gradient-based methods. In this paper we show how to perform this in the context of structural kernels by using Gaussian Processes. Experimental results on tree kernels show that this procedure results in better prediction performance compared to hyperparameter optimization via grid search. The framework proposed in this paper can be adapted to other structures besides trees, e.g., strings and graphs, thereby extending the utility of kernel-based methods.

  • 23. Bengoetxea, Kepa
    et al.
    Agirre, Eneko
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Zhang, Yue
    Gojenola, Koldo
    On WordNet Semantic Classes and Dependency Parsing. 2014. In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2014, p. 649-655. Conference paper (Refereed)
  • 24.
    Bengtsson, Camilla
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Borin, Lars
    Oxhammar, Henrik
    Comparing and combining part-of-speech taggers for multilingual parallel corpora. 2000. Article in journal (Other scientific)
  • 25.
    Bergman, Nicklas
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Unsupervised Normalisation of Historical Spelling: A Multilingual Evaluation. 2018. Independent thesis, Basic level (degree of Bachelor), 10 credits / 15 HE credits. Student thesis
    Abstract [en]

    Historical texts are an important resource for researchers in the humanities. However, standard NLP tools typically perform poorly on them, mainly due to the spelling variations present in such texts. One possible solution is to normalise the spelling variations to equivalent contemporary word forms before using standard tools. Weighted edit distance has previously been used for such normalisation, improving over the results of algorithms based on standard edit distance. Aligned training data is needed to extract weights, but there is a lack of such data. An unsupervised method for extracting edit distance weights is therefore desirable. This thesis presents a multilingual evaluation of an unsupervised method for extracting edit distance weights for normalisation of historical spelling variations. The model is evaluated for English, German, Hungarian, Icelandic and Swedish. The results are mixed and show a high variance depending on the different data sets. The method generally performs better than normalisation based on standard edit distance but, as expected, does not quite match the results of a model trained on aligned data. The results show an increase in normalisation accuracy compared to standard edit distance normalisation for all languages except German, which shows a slightly reduced accuracy, and Swedish, which shows similar results to the standard edit distance normalisation.
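
    As an illustration of the weighted edit distance idea mentioned in this abstract, the following is a minimal sketch and not the thesis's implementation. The function names, the weight table format, and the example weights are assumptions made for illustration; in particular, the weights are hand-picked here, whereas the thesis evaluates a method for extracting them without aligned data, which is not shown.

```python
from typing import Dict, List, Tuple

# Hypothetical weight table: (a, b) is a substitution of a by b,
# (a, "") a deletion of a, and ("", b) an insertion of b.
Weights = Dict[Tuple[str, str], float]


def weighted_edit_distance(source: str, target: str,
                           weights: Weights, default: float = 1.0) -> float:
    """Levenshtein distance in which each edit can carry its own cost."""

    def cost(a: str, b: str) -> float:
        if a == b:
            return 0.0
        return weights.get((a, b), default)

    # Row for the empty source prefix: only insertions are possible.
    prev = [0.0]
    for t in target:
        prev.append(prev[-1] + cost("", t))

    for s in source:
        curr = [prev[0] + cost(s, "")]  # delete every source character so far
        for j, t in enumerate(target, start=1):
            curr.append(min(
                prev[j] + cost(s, ""),     # deletion
                curr[j - 1] + cost("", t),  # insertion
                prev[j - 1] + cost(s, t),   # substitution or match
            ))
        prev = curr
    return prev[-1]


def normalise(word: str, lexicon: List[str], weights: Weights) -> str:
    """Pick the lexicon entry closest to `word`; ties are broken alphabetically."""
    return min(lexicon, key=lambda c: (weighted_edit_distance(word, c, weights), c))


if __name__ == "__main__":
    lexicon = ["as", "is", "us"]
    # With uniform costs all three candidates are one edit away from "vs".
    print(normalise("vs", lexicon, {}))
    # An assumed cheap v -> u substitution makes "us" the closest candidate.
    print(normalise("vs", lexicon, {("v", "u"): 0.1}))
```

    With an empty weight table the function reduces to standard edit distance, so the two normalisation variants compared in the abstract differ only in the table passed in.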

  • 26. Bertels, Ann
    et al.
    Fairon, Cédrick
    Tiedemann, Jörg
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Verlinde, Serge
    Corpus parallèles et corpus ciblés au secours du dictionnaire de traduction. 2009. In: Cahiers de lexicologie, Classiques Garnier, 2009, p. 199-219. Chapter in book (Other academic)
  • 27. Bertels, Ann
    et al.
    Fairon, Cédrick
    Tiedemann, Jörg
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Verlinde, Serge
    Corpus parallèles et corpus ciblés au secours du dictionnaire de traduction. 2009. In: Cahiers de lexicologie, Classiques Garnier, 2009, p. 199-219. Chapter in book (Other academic)
  • 28. Berthelsen, Harald
    et al.
    Megyesi, Beata
    Ensemble of Classifiers for Noise Detection in PoS Tagged Corpora. 2000. In: Proceedings of the Third International Workshop on TEXT, SPEECH and DIALOGUE, 2000, p. 27-32. Conference paper (Refereed)
    Abstract [en]

    In this paper we apply the ensemble approach to the identification of incorrectly annotated items (noise) in a training set. In a controlled experiment, memory-based, decision tree-based and transformation-based classifiers are used as a filter to detect and remove noise deliberately introduced into a manually tagged corpus. The results indicate that the method can be successfully applied to automatically detect errors in a corpus.

  • 29. Björkelund, Anders
    et al.
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Non-Deterministic Oracles for Unrestricted Non-Projective Transition-Based Dependency Parsing. 2015. In: Proceedings of the 14th International Conference on Parsing Technologies, 2015, p. 76-86. Conference paper (Refereed)
  • 30.
    Bohnet, Bernd
    et al.
    University of Birmingham.
    Nivre, Joakim
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology.
    Boguslavsky, Igor
    Russian Academy of Science.
    Farkas, Richard
    Szeged University.
    Ginter, Filip
    University of Turku.
    Hajic, Jan
    Charles University, Prague.
    Joint Morphological and Syntactic Analysis for Richly Inflected Languages. 2013. In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, Vol. 1, no 4, p. 415-428. Article in journal (Refereed)
  • 31.
    Borin, Lars
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    A corpus of written Finnish Romani texts. 2000. In: LREC 2000. Second International Conference on Language Resources and Evaluation. Workshop proceedings. Developing Language Resources for Minority Languages: Reusability and Strategic Priorities, Athens: ELRA, 2000, p. 75-82. Conference paper (Refereed)
  • 32.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Alignment and tagging. 2002. In: Parallel corpora, parallel worlds. Selected papers from a symposium on parallel and comparable corpora at Uppsala University, Sweden, 22-23 April, 1999, Amsterdam: Rodopi, 2002, p. 207-218. Chapter in book (Refereed)
  • 33.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Alignment and Tagging. 1999. In: Working Papers in Computational Linguistics & Language Engineering;20, ISSN 1401-923X, no 20, p. 10-. Article in journal (Other scientific)
  • 34.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    ... and never the twain shall meet? 2002. In: Parallel corpora, parallel worlds. Selected papers from a symposium on parallel and comparable corpora at Uppsala University, Sweden, 22-23 April, 1999, Amsterdam: Rodopi, 2002, p. 1-43. Chapter in book (Refereed)
  • 35.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Att undersöka språkmöten med datorn. 2001. In: Språkets gränser och gränslöshet. Då tankar, tal och traditioner möts. Humanistdagarna vid Uppsala universitet 2001, Uppsala: Uppsala universitet, 2001, p. 45-56. Chapter in book (Other scientific)
  • 36.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Enhancing tagging performance by combining knowledge sources. 2000. In: Korpusar i forskning och undervisning. Corpora in research and teaching. Papers from the ASLA symposium Corpora in research and teaching, Växjö: Växjö University, 2000, p. 19-31. Conference paper (Refereed)
  • 37.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    ETAP: Etablering och annotering av parallellkorpus för igenkänning av översättningsekvivalenter (ETAP: Creating and annotating a parallel corpus for the recognition of translation equivalents). 1998. In: ASLA Information, Vol. 24, no 1, p. 33-40. Article in journal (Other scientific)
  • 38.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    ETAP project status report December 2000. 2000. Report (Other scientific)
  • 39.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Linguistics isn't always the answer: Word comparison in computational linguistics. 1998. In: NODALIDA '98 Proceedings, Center for Sprogteknologi and Dept. of General and Applied Linguistics, University of Copenhagen, 1998, p. 140-151. Conference paper (Refereed)
    Abstract [en]

    String similarity metrics are important tools in computational linguistics, extensively used e.g. for comparing words in a variety of problem domains. This paper examines the sometimes made assumption that the performance of such word comparison metho

  • 40.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Pivot alignment. 2000. In: NODALIDA '99. Proceedings of the 12th "Nordiske datalingvistikkdager", Trondheim: Department of Linguistics, NTNU, 2000, p. 41-48. Conference paper (Refereed)
  • 41.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Something borrowed, something blue: Rule-based combination of POS taggers. 2000. In: Second International Conference on Language Resources and Evaluation, Athens: ELRA, 2000, p. 21-26. Conference paper (Refereed)
  • 42.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    What's in a link? Evaluating web-based language learning resources for higher education. 2002. In: International Journal of Design Sciences and Technology, Vol. 9, no 2, p. 103-112. Article in journal (Refereed)
  • 43.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Where will the standards for Intelligent Computer-Assisted Language Learning come from? 2002. In: Reports from Uppsala Learning Lab, no 5, p. 1-8. Report (Other scientific)
  • 44.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Where will the standards for intelligent computer-assisted language learning come from? 2002. In: LREC 2002. Third International Conference on Language Resources and Evaluation. Workshop Proceedings. International standards of terminology and language resources management, Las Palmas: ELRA, 2002, p. 61-68. Conference paper (Refereed)
  • 45.
    Borin, Lars
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    You'll take the high road and I'll take the low road: Using a third language to improve bilingual word alignment. 2000. In: Proceedings of the 18th International Conference on Computational Linguistics. COLING 2000, Saarbrücken: Universität des Saarlandes, 2000, p. 97-103. Conference paper (Refereed)
  • 46.
    Borin, Lars
    et al.
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Carlson, Lauri
    Santos, Diana
    Corpus based language technology for computer-assisted learning of Nordic languages: Squirrel. 2002. In: Nordisk sprogteknologi. Nordic language technology. Årbog for Nordisk Sprogteknologisk Forskningsprogram 2000-2004, København: Museum Tusculanums Forlag, Københavns Universitet, 2002, p. 257-270. Chapter in book (Other scientific)
  • 47.
    Borin, Lars
    et al.
    Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics.
    Dahllöf, Mats
    A corpus-based grammar tutor for education in language and speech technology. 1999. In: Proceedings of the EACL '99 Post-Conference Workshop on Computer and Internet Supported Education in Language and Speech Technology, The Association for Computational Linguistics, Bergen, 1999, p. 36-43. Conference paper (Refereed)
    Abstract [en]

    We describe work in progress on a corpus-based tutoring system for education in traditional and formal grammar. It is mainly intended for language and speech technology students and gives them the opportunity to learn grammar and grammatical analysis

  • 48.
    Borin, Lars (ed.)
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Parallel corpora, parallel worlds. Selected papers from a symposium on parallel and comparable corpora at Uppsala University, Sweden, 22-23 April, 1999. 2002. Book (Refereed)
  • 49.
    Borin, Lars
    et al.
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Gustavsson, Sara
    Separating the chaff from the wheat: Creating evaluation standards for web-based language-training resources. 2000. In: Learning's W.W.W. Web Based Learning. Wireless Based Learning. Web Mining. Proceedings of CAPS'3, Paris: Europia, 2000, p. 127-138. Conference paper (Refereed)
  • 50.
    Borin, Lars
    et al.
    Uppsala University, Humanistisk-samhällsvetenskapliga vetenskapsområdet, Faculty of Languages, Department of Linguistics.
    Prytz, Klas
    Tagging and Alignment. 1999. Report (Other scientific)