Microsoft places CiTIUS in the elite of multilingual AI model development

Microsoft announces a new collaboration with three leading research centers in Spain for the development of foundational artificial intelligence models in the languages of the State: CiTIUS (University of Santiago de Compostela), Barcelona Supercomputing Center (BSC), and the HiTZ center, at the University of the Basque Country.

The technology company Microsoft has recently announced a collaboration with three leading research centers in Spain - the Barcelona Supercomputing Center (BSC), the HiTZ center of the University of the Basque Country, and CiTIUS of the University of Santiago de Compostela - to develop foundational artificial intelligence models in the official languages of the State. The announcement marks a new step in Microsoft's strategy to promote open, responsible, and linguistically diverse language technologies based on artificial intelligence, laying the foundations to lead the development of language models specific to Catalan, Basque, and Galician. The development of the model for the latter will be carried out by CiTIUS, one of the three entities selected by the multinational.

In an entry published on July 20th on the institutional blog Microsoft on the Issues, a space where the company shares its vision on global public and technological issues, Microsoft emphasizes the importance of developing multilingual foundational models that include European official languages, with the objective of "preserving cultural heritage and unlocking economic opportunities across Europe." The company stresses that this initiative will enable "closing the digital language gap" and ensure that "all European languages are represented in the AI of the future."

Microsoft's announcement places CiTIUS of USC among the leading scientific institutions in the field of multilingual artificial intelligence in the country.

The models developed will be hosted on the Azure AI Foundry platform, as part of the Microsoft Open Innovation Center (MOIC) program. The goal is to create foundational models that facilitate the integration of all the official languages of the State into advanced digital systems, ensuring their presence in the most influential AI technologies of the present and future.

The selection of CiTIUS as the institution responsible for the Galician model reinforces the center's role as a reference point in language technologies. Galicia thus joins the international map of AI innovation, with a proposal developed from an academic and public environment that focuses on diversity and digital inclusion. The collaboration also strengthens CiTIUS's leadership in the development of advanced language technologies, positioning the Galician community as a key node in inclusive digital transformation in Spain.

The presence of CiTIUS in this strategic alliance with Microsoft underlines the impact of research work done from Galicia on the global artificial intelligence stage.

'Nós': the beginning of it all...

Launched in 2020, Proxecto Nós is an initiative driven by the Xunta de Galicia and entrusted to the University of Santiago de Compostela through CiTIUS and the Institute of the Galician Language (ILG), with the aim of ensuring the presence of Galician in the digital environment of the 21st century. The project was key in creating foundational linguistic infrastructure: from open corpora and databases to machine translation, speech synthesis, and speech recognition tools. In 2023, the team presented Carballo, the first large-scale language model fully trained in Galician.

The advances achieved within 'Nós' have not only expanded the digital capabilities of Galician but have also placed Galicia on the European AI map, paving the way for this new phase of international collaboration. Before that, the project had continued through the initiatives ILENIA (Promotion of languages in Artificial Intelligence), focused on the creation of linguistic resources and tools for the official languages in Spain, and ALIA (Artificial Linguistic Intelligence for Administration), a public-private alliance to develop foundational AI models aimed at multilingual public administration. Both were financed by the Ministry of Economic Affairs and Digital Transformation, consolidating CiTIUS's role in the development of language technologies within a state strategy.

In a new milestone in this line of research, CiTIUS (a research centre co-funded by the European Union through the Galicia ERDF Programme 2021–2027) now joins an initiative defining the future of foundational AI models, contributing to technological progress with a proposal built in Galician, from Galicia, and with global projection.