To Sentences and Beyond! Paving the way for Context-Aware Machine Translation

  • Published 18 Mar 2024
  • Presentation by Rachel Wicks
    Most machine translation systems operate at the sentence level, while humans write and translate within a given context. Operating on individual sentences forces error-prone sentence segmentation into the machine translation pipeline. This limits the upper-bound performance of these systems by creating noisy training bitext. Further, many grammatical features require inter-sentential context to be translated correctly, which makes perfect sentence-level machine translation an impossible task. In this talk, we will cover the inherent limits of sentence-level machine translation. Following this, we will explore a key obstacle in the way of true context-aware machine translation: an abject lack of data. Finally, we will cover recent work that provides (1) a new evaluation dataset that specifically addresses the translation of context-dependent discourse phenomena and (2) reconstructed documents from large-scale sentence-level bitext that can be used to improve performance when translating these types of phenomena.
