In this article, we define the outlier detection task and use it to compare neural word embeddings with transparent count-based distributional representations. Using the English Wikipedia as the text source to train the models, we observe that embeddings outperform count-based representations when their contexts are defined as bags of words. However, there are no sharp differences between the two models when word contexts are defined as syntactic dependencies.
In general, syntax-based models tend to perform better than bag-of-words models on this specific task. Similar experiments carried out for Portuguese yielded similar results. The test datasets we created for the outlier detection task in English and Portuguese are publicly released.
Keywords: distributional semantics, dependency analysis, outlier detection, similarity
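The outlier detection task described above can be sketched with a minimal compactness-based scorer: given a set of words, each word is ranked by its average cosine similarity to the rest of the set, and the word with the lowest score is flagged as the outlier. The toy 3-dimensional vectors and word list below are illustrative assumptions, not data from the article; real experiments would use embeddings trained on Wikipedia.

```python
import numpy as np

def cosine(u, v):
    # Standard cosine similarity between two dense vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def detect_outlier(vectors):
    """Return the index of the vector whose average similarity
    to the rest of the set is lowest (the presumed outlier)."""
    n = len(vectors)
    scores = []
    for i in range(n):
        sims = [cosine(vectors[i], vectors[j]) for j in range(n) if j != i]
        scores.append(sum(sims) / (n - 1))
    return int(np.argmin(scores))

# Hypothetical toy embeddings: three semantically close vectors
# plus one clearly different vector acting as the outlier.
words = ["cat", "dog", "wolf", "table"]
vecs = [np.array([0.90, 0.10, 0.00]),
        np.array([0.80, 0.20, 0.10]),
        np.array([0.85, 0.15, 0.05]),
        np.array([0.00, 0.10, 0.95])]

print(words[detect_outlier(vecs)])  # prints "table"
```

A model is then evaluated by how often the true outlier receives the lowest compactness score across many such word sets.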