eRISK: Technologies for the early prediction of signs related to psychological disorders

A number of psychological disorders severely impact modern society. For example, the WHO considers depression the largest generator of dysfunction or disability worldwide, affecting more than 300 million people. Other conditions, such as eating disorders, are also very worrying: they have a high mortality rate and affect very young people. Despite the seriousness of these disorders, in many cases people receive no treatment or receive it late. This leads to high societal, health and economic costs.

Language is a powerful indicator of personality traits and emotions, and it provides valuable clues about mental health. This project aims to develop the technologies and computational models needed for large-scale analysis of natural language (NL). More specifically, the main purpose is to study the ways people use language (and how their NL use evolves) in order to reveal early signs of psychological disorders ("eRisk: early Risk"). This project does not intend to develop automatic diagnostic technology. In fact, we think that the diagnostic tasks done by medical professionals cannot be carried out in a fully automatic way. We pursue here more realistic goals, for instance innovating in methods that detect the first signs of depression and understanding how an individual evolves from early stages to severe stages. Likewise, analyzing how NL use evolves in a person who gradually develops an eating disorder is a challenging task.

The large volume of interactions and publications available on the Internet and Social Networks makes it possible to perform massive analyses of psychological traits related to mental disorders. People suffering from psychological disorders commonly interact with other individuals, express their concerns, share their experiences and receive online help from specialized professionals. However, the analysis of online users poses challenges in several areas: information filtering and search (in order to find the user extracts that are relevant for analyzing a given psychological problem), linguistic analysis of text and psycho-linguistics, estimating the quality and reputation of contents (for example, for the purpose of recommending reputable information to people suffering from a certain disorder), and massive processing of data (distributed computing methods need to be efficient enough to work in real time).

The project is clearly multidisciplinary, covering aspects of Text Processing (Information Retrieval, Automatic Text Classification and Recommender Systems), Computational Linguistics (Discourse Analysis, advanced NL Processing) and High Performance Computing for Big Data. Furthermore, our team includes experts in Psychology, who will face challenges related to incorporating expert knowledge into the models, validating the resulting technologies and exploiting the results of eRisk within their daily routine.

The objectives of the project include a series of challenges aimed at making improvements in the fields above. The target advances focus on improving existing models and adapting them to the early detection of disorders. The area we address is novel in itself and, thus, we will also work on building experimental foundations (creation of test collections, definition of evaluation metrics, etc.).


  • O1. Develop new methods and resources for evaluating information access systems and employ them for the early prediction of risks of psychological disorders.
  • O2. Define effective textual search and filtering methods and apply them to the identification of texts relevant to different profiles of psychological disorders. Define models for analyzing topics and their temporal evolution.
  • O3. Develop linguistic resources related to different psychological disorders and implement advanced natural language processing for the analysis of user publications.
  • O4. Develop flexible and efficient solutions for massive data processing across numerous social media platforms and implement real-time analysis of online users.
  • O5. Define methods for the analysis of results, the generation of conclusions and the exploitation of the expert psychologist's knowledge, and determine ways in which the technology created in the project can assist psychologists in their daily activity.
  • O6. Develop content recommendation methods (collaborative filtering, content-based and model-based: linear, latent, embeddings) that can be adapted to the domain of psychological disorders.