Comparing explicit and predictive distributional semantic models endowed with syntactic contexts
In this article, we introduce an explicit count-based strategy to build word space models with syntactic contexts (dependencies). A filtering method is defined to reduce explicit word-context vectors. This traditional strategy is compared with a neural embedding (predictive) model also based on syntactic dependencies. The comparison was performed using the same parsed corpus for both models. Besides, the dependency-based methods are also compared with bag-of-words strategies, both count-based and predictive ones. The results show that our traditional count-based model with syntactic dependencies outperforms other strategies, including dependency-based embeddings, but just for the tasks focused on discovering similarity between words with the same function (i.e. near-synonyms).
keywords: Word similarity, Word embeddings, Count-based models, Dependency-based semantic models,