HumanModel: Linguistic strategies to humanize large language models

The HumanModel project investigates linguistic strategies to improve large language models (LLMs) and bring them closer to human language learning processes. The proposal addresses key limitations of current models (such as lack of transparency, reliability issues, biases, and the dominance of English) through linguistically informed methods, human-scale corpora, external knowledge integration, and targeted linguistic evaluation. The project also focuses on the Galician-Portuguese diasystem to promote more inclusive, interpretable, and socially responsible AI technologies.

Objectives

The main objective of HumanModel is to develop more reliable, transparent, and inclusive language models by integrating linguistic knowledge into their design and training processes. To achieve this goal, the project will create human-scale corpora, explore compositional mechanisms guided by syntactic information, develop models for the Galician-Portuguese diasystem, improve reliability through external knowledge sources, reduce bias in training data, and design new linguistic evaluation datasets to assess the capabilities of language models more accurately.