Publications

Marcos García González

2024
Investigating Idiomaticity in Word Representations
Open Generative Large Language Models for Galician
Increasing manually annotated resources for Galician: the Parallel Universal Dependencies Treebank
2022
Evaluating Contextualized Vectors from both Large Language Models and Compositional Strategies
An exploration of the semantic knowledge in vector models: polysemy, synonymy and idiomaticity
Proxecto Nós: Artificial Intelligence at the Service of the Galician Language
SemEval-2022 Task 2: Multilingual Idiomaticity Detection and Sentence Embedding
The Nós Project: Opening routes for the Galician language in the field of language technologies
A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time
A Targeted Assessment of the Syntactic Abilities of Transformer Models for Galician-Portuguese
SemantiGal: An online visualizer of vector representations for Galician
2021
Assessing the Representations of Idiomaticity in Vector Models with a Noun Compound Dataset Labeled at Type and Token Levels
Exploring the Representation of Word Meanings in Context: A Case Study on Homonymy and Synonymy
Book Review: Embeddings in Natural Language Processing. Theory and Advances in Vector Representations of Meaning
Probing for Idiomaticity in Vector Space Models
Bertinho: Galician BERT Representations
Comparing Dependency-based Compositional Models with Contextualized Word Embedding
2014
Extracção de relações semânticas. Recursos, ferramentas e estratégias
Perldoop: Efficient Execution of Perl Scripts on Hadoop Clusters
Comparing Ranking-based and Naive Bayes Approaches to Language Detection on Tweets
Entity-Centric Coreference Resolution of Person Entities for Open Information Extraction
Análisis morfosintáctico y clasificación de entidades nombradas en un entorno Big Data
PoS-tagging the Web in Portuguese. National varieties, text typologies and spelling systems
An Entity-Centric Coreference Resolution System for Person Entities with Rich Linguistic Information
Citius: A Naive-Bayes Strategy for Sentiment Analysis on English Tweets
Multilingual corpora with coreferential annotation of person entities
2012
Técnicas de procesamiento del lenguaje natural en la Recuperación de Información
Identificação e Classificação de Entidades Mencionadas em Galego
Dependency-Based Open Information Extraction
Automatic Phonetic Transcription by Phonological Derivation
Extraction of Bilingual Cognates from Wikipedia
2011
A Weakly-Supervised Rule-Based Approach for Relation Extraction
Evaluating Various Linguistic Features on Semantic Relation Extraction
Dependency-Based Text Compression for Semantic Relation Extraction
Resolución de Correferencia de Nombres de Persona para Extracción de Información Biográfica
An Exploration of the Linguistic Knowledge for Semantic Relation Extraction in Spanish
Conversión Fonética Automática con Información Fonológica para el Gallego