Extraction of Bilingual Cognates from Wikipedia
In this article, we propose a method to extract translation equivalents with similar spelling from comparable corpora. The method was applied to Wikipedia to extract a large number of Portuguese-Spanish bilingual terminological pairs that were not found in existing dictionaries. The resulting bilingual lexicon consists of more than 27,000 new pairs of lemmas and multiwords, with about 92% accuracy.
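To illustrate what "similar spelling" can mean in practice, here is a minimal sketch that scores candidate pairs with a normalized edit distance. The similarity measure, the threshold of 0.7, and the names `edit_distance`, `spelling_similarity`, and `cognate_candidates` are all assumptions made for illustration; the abstract does not say which measure or cut-off the authors used.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def spelling_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus edit distance over the longer length."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cognate_candidates(pt_terms, es_terms, threshold=0.7):
    """Return (pt, es, score) triples at or above a hypothetical threshold."""
    pairs = []
    for pt in pt_terms:
        for es in es_terms:
            score = spelling_similarity(pt, es)
            if score >= threshold:
                pairs.append((pt, es, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda p: -p[2])


if __name__ == "__main__":
    for pt, es, score in cognate_candidates(
        ["universidade", "cachorro"], ["universidad", "perro"]
    ):
        print(pt, es, round(score, 3))
```

With these toy inputs only the Portuguese-Spanish pair "universidade" / "universidad" passes the threshold, while "cachorro" / "perro" (translations that are not spelled alike) is filtered out. A real pipeline would also need to restrict candidates to terms drawn from comparable articles rather than comparing every term against every other.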
Publication: Congress
June 18, 2021
Authors: Gamallo P., Garcia M.
DOI: 10.1007/978-3-642-28885-2_7