Congress 1262
  • Pablo Gamallo, Marcos Garcia
  • Proceedings of the Joint Workshop on Multiword Expressions and WordNet, at ACL-2019. Florence, Italy. 2019

Unsupervised Compositional Translation of Multiword Expressions

This article describes a dependency-based strategy that uses compositional distributional semantics and cross-lingual word embeddings to translate multiword expressions (MWEs). Our unsupervised approach treats translation as a process of word contextualization that takes into account lexico-syntactic contexts and selectional preferences. This strategy is suited to translating phraseological combinations and phrases whose constituent words lexically restrict each other. Several experiments on adjective-noun and verb-object compounds show that mutual contextualization (co-compositionality) clearly outperforms other compositional methods. The paper also contributes a new, freely available dataset of English-Spanish MWEs, which is used to validate the proposed compositional strategy.
Keywords: natural language processing, unsupervised translation, multiword extraction, compositional distributional semantics
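
The idea of translation as contextualization, as summarized in the abstract, can be illustrated with a minimal sketch. It is not the authors' implementation. The vectors, the candidate lists, and the function names are all hypothetical, and a component-wise product with the partner word's vector stands in for the selectional preferences, which collapses the two directions of mutual contextualization into a single composed vector.

```python
import math
from itertools import product

# Hypothetical toy vectors in a shared English-Spanish space.
# Dimensions (assumed for illustration): weight, intensity, weather, water.
VEC = {
    "heavy":  [0.9, 0.8, 0.1, 0.1],
    "rain":   [0.1, 0.6, 0.9, 0.9],
    "pesado": [0.9, 0.2, 0.0, 0.1],
    "fuerte": [0.3, 0.9, 0.3, 0.1],
    "lluvia": [0.1, 0.6, 0.9, 0.9],
}

# Hypothetical translation candidates for each source word.
CANDIDATES = {"heavy": ["pesado", "fuerte"], "rain": ["lluvia"]}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contextualize(word, partner):
    """Restrict a word's vector by its partner (component-wise product).

    Assumption: the partner's own vector stands in for its
    selectional preferences.
    """
    return [a * b for a, b in zip(VEC[word], VEC[partner])]


def translate_word(word):
    """Out-of-context baseline: nearest candidate to the bare word vector."""
    return max(CANDIDATES[word], key=lambda c: cosine(VEC[word], VEC[c]))


def translate_pair(modifier, head):
    """Pick the target pair whose contextualized vector is closest
    to the contextualized source pair."""
    source = contextualize(modifier, head)
    pairs = product(CANDIDATES[modifier], CANDIDATES[head])
    return max(pairs, key=lambda p: cosine(source, contextualize(p[0], p[1])))


if __name__ == "__main__":
    print(translate_word("heavy"))
    print(translate_pair("heavy", "rain"))
```

With these toy values, the out-of-context baseline maps "heavy" to "pesado", whereas contextualizing it by "rain" selects the pair ("fuerte", "lluvia"). This is the kind of lexically restricted combination the abstract refers to.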