A learning-based approach for the identification of sexual predators in chat logs
The existence of sexual predators who enter chat rooms or forums and try to convince children to provide sexual favours is a socially worrying issue. Manually monitoring these interactions is one way to attack this problem. However, a manual approach cannot keep pace with the high number of conversations and the huge number of chat rooms and forums where these conversations take place every day. We need tools that automatically process massive amounts of conversations and raise alerts about possible offenses. The sexual predator identification challenge within PAN 2012 is a valuable way to promote research in this area. Our team approached this task as a machine learning problem and designed several innovative sets of features that guide the construction of classifiers for identifying sexual predation. Our methods are driven by psycholinguistic, chat-based, and tf/idf features and yield very effective classifiers.
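As a rough illustration of the tf/idf feature family mentioned in the abstract, the following is a minimal sketch in plain Python. The abstract does not state the weighting variant, the tokenization, or the classifier used, so everything here is an assumption made for illustration: the function names `tokenize` and `tfidf_vectors`, the weighting tf * log(N / df), the whitespace tokenizer, and the neutral toy inputs. It is not the authors' implementation.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real system would
    # likely need a tokenizer that copes with chat slang and emoticons.
    return text.lower().split()


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency scaled by inverse document frequency.
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical toy inputs: each string stands for all messages written by
# one chat participant.
toy_documents = [
    "hello how are you today",
    "hello are you there",
    "the weather is nice today",
]
vectors = tfidf_vectors(toy_documents)
```

In a setup like the one the abstract describes, vectors of this kind would presumably be combined with the psycholinguistic and chat-based features before being passed to a classifier; how that combination is done is not specified in the text above.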
Publication: Congress
June 18, 2021
Authors: J. Parapar, D. Losada, A. Barreiro