Uppsala University Publications
Nivre, Joakim
Publications (10 of 183)
de Marneffe, M.-C. & Nivre, J. (2019). Dependency Grammar. Annual review of linguistics, 5, 197-218
2019 (English). In: Annual review of linguistics, E-ISSN 2333-9691, Vol. 5, pp. 197-218. Article in journal (Refereed). Published.
Abstract [en]

Dependency grammar is a descriptive and theoretical tradition in linguistics that can be traced back to antiquity. It has long been influential in the European linguistics tradition and has more recently become a mainstream approach to representing syntactic and semantic structure in natural language processing. In this review, we introduce the basic theoretical assumptions of dependency grammar and review some key aspects in which different dependency frameworks agree or disagree. We also discuss advantages and disadvantages of dependency representations and introduce Universal Dependencies, a framework for multilingual dependency-based morphosyntactic annotation that has been applied to more than 60 languages.
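The core assumption sketched in the abstract, that syntactic structure consists of directed, labeled head-dependent relations forming a tree over the words of a sentence, can be made concrete with a toy example. The following sketch is illustrative only (the sentence, labels, and CoNLL-U-style columns are chosen here, not taken from the article):

```python
# Illustrative only: a toy Universal Dependencies-style annotation of
# "She reads books", using CoNLL-U-like columns (ID, FORM, HEAD, DEPREL).
conllu = "1\tShe\t2\tnsubj\n2\treads\t0\troot\n3\tbooks\t2\tobj"

def parse(block):
    tokens = []
    for line in block.strip().splitlines():
        tid, form, head, deprel = line.split("\t")
        tokens.append({"id": int(tid), "form": form,
                       "head": int(head), "deprel": deprel})
    return tokens

def is_tree(tokens):
    """A well-formed dependency analysis is a tree: exactly one root
    (head 0), and every token reaches the root via head links."""
    heads = {t["id"]: t["head"] for t in tokens}
    if sum(1 for h in heads.values() if h == 0) != 1:
        return False
    for tid in heads:
        seen, node = set(), tid
        while node != 0:
            if node in seen:          # cycle means it is not a tree
                return False
            seen.add(node)
            node = heads[node]
    return True

tokens = parse(conllu)
print([(t["form"], t["deprel"]) for t in tokens])
print(is_tree(tokens))  # True
```

The single-head and single-root constraints checked here are among the points on which, as the review notes, different dependency frameworks agree or disagree.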

Place, publisher, year, edition, pages
ANNUAL REVIEWS, 2019
Keywords
dependency grammar, dependency frameworks, dependency parsing, Universal Dependencies
HSV category
Identifiers
urn:nbn:se:uu:diva-381531 (URN), 10.1146/annurev-linguistics-011718-011842 (DOI), 000460289100010
Available from: 2019-04-11. Created: 2019-04-11. Last updated: 2019-04-11. Bibliographically approved.
Basirat, A., de Lhoneux, M., Kulmizev, A., Kurfal, M., Nivre, J. & Östling, R. (2019). Polyglot Parsing for One Thousand and One Languages (And Then Some). Paper presented at the First Workshop on Typology for Polyglot NLP, Florence, Italy, August 1, 2019.
2019 (English). Conference paper, poster (with or without abstract) (Other academic).
HSV category
Identifiers
urn:nbn:se:uu:diva-392156 (URN)
Conference
First Workshop on Typology for Polyglot NLP, Florence, Italy, August 1, 2019
Available from: 2019-08-29. Created: 2019-08-29. Last updated: 2019-08-30. Bibliographically approved.
Basirat, A. & Nivre, J. (2019). Real-valued syntactic word vectors. Journal of experimental and theoretical artificial intelligence (Print)
2019 (English). In: Journal of experimental and theoretical artificial intelligence (Print), ISSN 0952-813X, E-ISSN 1362-3079. Article in journal (Refereed). Published.
Abstract [en]

We introduce a word embedding method that generates a set of real-valued word vectors from a distributional semantic space. The semantic space is built with a set of context units (words) which are selected by an entropy-based feature selection approach with respect to the certainty involved in their contextual environments. We show that the most predictive context of a target word is its preceding word. An adaptive transformation function is also introduced that reshapes the data distribution to make it suitable for dimensionality reduction techniques. The final low-dimensional word vectors are formed by the singular vectors of a matrix of transformed data. We show that the resulting word vectors are as good as other sets of word vectors generated with popular word embedding methods.
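The general pipeline the abstract describes, co-occurrence counts over preceding-word contexts, a transformation of the data distribution, then singular vectors as low-dimensional word vectors, can be roughly sketched as follows. This is illustrative only: the corpus is a toy, and a simple log transform stands in for the paper's entropy-based context selection and adaptive transformation function:

```python
# A minimal sketch (not the authors' exact pipeline): count how often
# each word follows each other word, transform the counts, and take
# scaled singular vectors as word vectors.
import numpy as np

corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))
idx = {w: i for i, w in enumerate(vocab)}

# Count(word, preceding word): the abstract reports that the most
# predictive context of a target word is its preceding word.
C = np.zeros((len(vocab), len(vocab)))
for prev, cur in zip(corpus, corpus[1:]):
    C[idx[cur], idx[prev]] += 1

X = np.log1p(C)                      # stand-in for the adaptive transformation
U, S, Vt = np.linalg.svd(X, full_matrices=False)
k = 2                                # target dimensionality
vectors = U[:, :k] * S[:k]           # word vectors from singular vectors
print(vectors.shape)                 # (len(vocab), k)
```

On a real corpus the context set would first be pruned by the entropy-based selection the abstract mentions; here every vocabulary word serves as a context unit.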

Keywords
Word embeddings, context selection, transformation, dependency parsing, singular value decomposition, entropy
HSV category
Identifiers
urn:nbn:se:uu:diva-392095 (URN), 10.1080/0952813X.2019.1653385 (DOI)
Available from: 2019-08-29. Created: 2019-08-29. Last updated: 2019-08-29. Bibliographically approved.
Smith, A., Bohnet, B., de Lhoneux, M., Nivre, J., Shao, Y. & Stymne, S. (2018). 82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models. In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies. Paper presented at the Conference on Computational Natural Language Learning (CoNLL), October 31 - November 1, 2018, Brussels, Belgium (pp. 113-123).
2018 (English). In: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies, 2018, pp. 113-123. Conference paper, published paper (Refereed).
HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-371246 (URN)
Conference
Conference on Computational Natural Language Learning (CoNLL), October 31 - November 1, 2018, Brussels, Belgium
Available from: 2018-12-19. Created: 2018-12-19. Last updated: 2019-03-06. Bibliographically approved.
Tang, G., Sennrich, R. & Nivre, J. (2018). An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation. In: Proceedings of the Third Conference on Machine Translation. Paper presented at the Third Conference on Machine Translation, October 31 - November 1, 2018, Brussels, Belgium (pp. 26-35).
2018 (English). In: Proceedings of the Third Conference on Machine Translation, 2018, pp. 26-35. Conference paper, published paper (Refereed).
Abstract [en]

Recent work has shown that the encoder-decoder attention mechanisms in neural machine translation (NMT) are different from the word alignment in statistical machine translation. In this paper, we focus on analyzing encoder-decoder attention mechanisms in the case of word sense disambiguation (WSD) in NMT models. We hypothesize that attention mechanisms pay more attention to context tokens when translating ambiguous words. We explore the attention distribution patterns when translating ambiguous nouns. Counter-intuitively, we find that attention mechanisms are likely to distribute more attention to the ambiguous noun itself rather than context tokens, in comparison to other nouns. We conclude that attention is not the main mechanism used by NMT models to incorporate contextual information for WSD. The experimental results suggest that NMT models learn to encode contextual information necessary for WSD in the encoder hidden states. For the attention mechanism in Transformer models, we reveal that the first few layers gradually learn to "align" source and target tokens and the last few layers learn to extract features from the related but unaligned context tokens.

HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-369712 (URN)
Conference
Third Conference on Machine Translation, October 31 - November 1, 2018, Brussels, Belgium
Available from: 2018-12-17. Created: 2018-12-17. Last updated: 2019-03-06. Bibliographically approved.
Tang, G., Cap, F., Pettersson, E. & Nivre, J. (2018). An evaluation of neural machine translation models on historical spelling normalization. In: Proceedings of the 27th International Conference on Computational Linguistics. Paper presented at COLING 2018 (pp. 1320-1331).
2018 (English). In: Proceedings of the 27th International Conference on Computational Linguistics, 2018, pp. 1320-1331. Conference paper, published paper (Refereed).
HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-369710 (URN)
Conference
COLING 2018
Available from: 2018-12-17. Created: 2018-12-17. Last updated: 2018-12-17.
Smith, A., de Lhoneux, M., Stymne, S. & Nivre, J. (2018). An Investigation of the Interactions Between Pre-Trained Word Embeddings, Character Models and POS Tags in Dependency Parsing. In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Paper presented at the 2018 Conference on Empirical Methods in Natural Language Processing, October 31 - November 4, 2018, Brussels, Belgium (pp. 2711-2720).
2018 (English). In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018, pp. 2711-2720. Conference paper, published paper (Refereed).
Abstract [en]

We provide a comprehensive analysis of the interactions between pre-trained word embeddings, character models and POS tags in a transition-based dependency parser. While previous studies have shown POS information to be less important in the presence of character models, we show that in fact there are complex interactions between all three techniques. In isolation each produces large improvements over a baseline system using randomly initialised word embeddings only, but combining them quickly leads to diminishing returns. We categorise words by frequency, POS tag and language in order to systematically investigate how each of the techniques affects parsing quality. For many word categories, applying any two of the three techniques is almost as good as the full combined system. Character models tend to be more important for low-frequency open-class words, especially in morphologically rich languages, while POS tags can help disambiguate high-frequency function words. We also show that large character embedding sizes help even for languages with small character sets, especially in morphologically rich languages.

HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-371245 (URN)
Conference
The 2018 Conference on Empirical Methods in Natural Language Processing, October 31 - November 4, 2018, Brussels, Belgium
Available from: 2018-12-19. Created: 2018-12-19. Last updated: 2019-03-06. Bibliographically approved.
Nivre, J., Marongiu, P., Ginter, F., Kanerva, J., Montemagni, S., Schuster, S. & Simi, M. (2018). Enhancing Universal Dependency Treebanks: A Case Study. In: Proceedings of the Second Workshop on Universal Dependencies (UDW 2018). Paper presented at the Second Workshop on Universal Dependencies (UDW 2018), November 1, 2018, Brussels, Belgium (pp. 102-107).
2018 (English). In: Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), 2018, pp. 102-107. Conference paper, published paper (Refereed).
HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-371249 (URN)
Conference
Second Workshop on Universal Dependencies (UDW 2018), November 1, 2018, Brussels, Belgium
Available from: 2018-12-19. Created: 2018-12-19. Last updated: 2019-03-07. Bibliographically approved.
Bouma, G., Hajič, J., Haug, D., Nivre, J., Solberg, P. E. & Øvrelid, L. (2018). Expletives in Universal Dependency Treebanks. In: Proceedings of the Second Workshop on Universal Dependencies (UDW 2018). Paper presented at the Second Workshop on Universal Dependencies (UDW 2018), November 1, 2018, Brussels, Belgium (pp. 18-26).
2018 (English). In: Proceedings of the Second Workshop on Universal Dependencies (UDW 2018), 2018, pp. 18-26. Conference paper, published paper (Refereed).
HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-371248 (URN)
Conference
Second Workshop on Universal Dependencies (UDW 2018), November 1, 2018, Brussels, Belgium
Available from: 2018-12-19. Created: 2018-12-19. Last updated: 2019-03-07. Bibliographically approved.
Stymne, S., de Lhoneux, M., Smith, A. & Nivre, J. (2018). Parser Training with Heterogeneous Treebanks. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). Paper presented at the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, July 15-20, 2018 (pp. 619-625). Association for Computational Linguistics.
2018 (English). In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Association for Computational Linguistics, 2018, pp. 619-625. Conference paper, published paper (Refereed).
Abstract [en]

How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for several languages and show that in many cases fine-tuning and treebank embeddings lead to substantial improvements over single treebanks or concatenation, with average gains of 2.0-3.5 LAS points. We argue that treebank embeddings should be preferred due to their conceptual simplicity, flexibility and extensibility.
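The treebank-embedding idea described in the abstract can be sketched as follows: each token's input vector is its word embedding concatenated with a learned embedding of the treebank the sentence came from, so one parser can be trained on several treebanks while still conditioning on their annotation conventions. The embedding sizes, treebank names, and lookup tables below are hypothetical, not the authors' parser:

```python
# Hypothetical sketch of treebank embeddings as parser input features.
import numpy as np

rng = np.random.default_rng(0)
# In a real parser these tables are trained parameters; here they are
# random stand-ins with made-up dimensions (100 for words, 12 for treebanks).
word_emb = {w: rng.normal(size=100) for w in ["the", "dog", "barks"]}
treebank_emb = {tb: rng.normal(size=12) for tb in ["treebank_a", "treebank_b"]}

def token_input(word, treebank):
    """Input vector for one token: word vector + source-treebank vector."""
    return np.concatenate([word_emb[word], treebank_emb[treebank]])

vec = token_input("dog", "treebank_a")
print(vec.shape)  # (112,)
```

The same word thus gets different input representations depending on its source treebank, which is what lets a single shared model absorb heterogeneous annotation styles.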

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2018
HSV category
Research programme
Computational Linguistics
Identifiers
urn:nbn:se:uu:diva-362215 (URN), 10.18653/v1/P18-2098 (DOI), 000493913100098, 978-1-948087-34-6 (ISBN)
Conference
The 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, July 15-20, 2018
Research funder
Swedish Research Council, P2016-01817
Available from: 2018-10-02. Created: 2018-10-02. Last updated: 2019-12-06. Bibliographically approved.