Congress 1191
  • Pablo Gamallo, Marcos Garcia, César Piñeiro, Rodrigo Martínez-Castaño and Juan C. Pichel
  • The Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS). Valencia, Spain. 2018

LinguaKit: a Big Data-based multilingual tool for linguistic analysis and information extraction

This paper presents LinguaKit, a multilingual suite of tools for analysis, extraction, annotation and linguistic correction, as well as its integration into a Big Data infrastructure. LinguaKit lets the user perform tasks such as PoS tagging, syntactic parsing and coreference resolution, among others, and includes applications for relation extraction, sentiment analysis, summarization, extraction of multiword expressions, and entity linking to DBpedia. Most modules work in four languages: Portuguese, Spanish, English, and Galician. The system is programmed in Perl and is freely available under a GPLv3 license.
