Tunable Distortion Limits and Corpus Cleaning for SMT
Uppsala University, Disciplinary Domain of Humanities and Social Sciences, Faculty of Languages, Department of Linguistics and Philology (Computational Linguistics).
2013 (English). In: Proceedings of the Eighth Workshop on Statistical Machine Translation, Association for Computational Linguistics, 2013, p. 225-231. Conference paper, published paper (refereed).
Abstract [en]

We describe the Uppsala University system for the WMT13 English-to-German translation task. We use Docent, a local search decoder that translates at the document level. We extend Docent with tunable distortion limits, that is, soft constraints on the maximum distortion allowed. We also investigate cleaning of the noisy Common Crawl corpus and show that alignment-based filtering cleans it with good results. Finally, we investigate the effects of corpus selection for recasing.
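
To illustrate what a soft constraint on distortion could look like, here is a minimal sketch. It is not taken from the paper or from Docent: the function names, the span representation, and the linear penalty on the excess over the limit are all assumptions made for illustration. A hard limit would reject any hypothesis whose jump exceeds the limit; the soft variant below only charges a cost for the excess, and both `limit` and `weight` could in principle be tuned.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) source positions of one phrase


def distortion_distances(spans: List[Span]) -> List[int]:
    """Jump sizes between consecutively translated source phrases.

    `spans` lists the source spans in the order they are translated.
    The distance is the gap between the start of a phrase and the
    position right after the previously translated phrase.
    """
    distances = []
    prev_end = -1  # so that starting at position 0 costs nothing
    for start, end in spans:
        distances.append(abs(start - (prev_end + 1)))
        prev_end = end
    return distances


def soft_limit_cost(spans: List[Span], limit: int, weight: float) -> float:
    """Hypothetical soft distortion limit: charge only the excess over `limit`.

    Assumption: the cost grows linearly with the amount by which each
    jump exceeds the limit. Jumps within the limit are free.
    """
    excess = sum(max(0, d - limit) for d in distortion_distances(spans))
    return weight * excess


if __name__ == "__main__":
    # Translate source words 0-1, then jump ahead to 5-6, then back to 2-4.
    order = [(0, 1), (5, 6), (2, 4)]
    print(distortion_distances(order))
    print(soft_limit_cost(order, limit=4, weight=0.5))
```

With a limit of 4, only the backward jump in the example exceeds the limit, so only that jump contributes to the cost.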

Place, publisher, year, edition, pages
Association for Computational Linguistics, 2013. p. 225-231
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:uu:diva-207765
OAI: oai:DiVA.org:uu-207765
DiVA, id: diva2:649447
Conference
WMT 2013; 8-9 August; Sofia, Bulgaria
Available from: 2013-09-18. Created: 2013-09-18. Last updated: 2018-01-11. Bibliographically approved.

Open Access in DiVA

WMT2013 (138 kB)
File name: FULLTEXT01.pdf
File size: 138 kB
Type: fulltext
Mimetype: application/pdf
Checksum (SHA-512): 1790cdf978d6dcc46cae8b84d255fca252db12c51f23fd8582798483102fb633c89ff5041e6b8ff56647c28a2d801b33d4ab4fe928872bf9730d3ac5132161e8

Other links

http://www.aclweb.org/anthology/W13-2229

Authors

Stymne, Sara; Hardmeier, Christian; Tiedemann, Jörg; Nivre, Joakim
