Training program: 'An Introduction to Transformers Architecture'

In recent years, the Transformer neural network architecture has revolutionised Natural Language Processing. It has pushed the boundaries of tasks such as machine translation and document summarisation, and powers modern conversational agents. Its influence now extends well beyond NLP, with Transformer-based models becoming state of the art in other disciplines such as Computer Vision, Information Retrieval and Biology.
In this lecture, we introduce the architecture from an NLP perspective. We will explore its inner workings and the reasons for its success, including the Attention mechanism and its scalable training process. In addition, we will analyse its main criticisms and limitations, review the best-known models and implementations, and discuss its application in other fields.
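
As a small preview of the Attention mechanism the lecture will cover, the sketch below shows scaled dot-product attention for a single head in plain NumPy. The function name, shapes and toy data are illustrative assumptions for this note, not taken from any particular library or from the lecture materials.

```python
# Minimal sketch of scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
# Illustrative only; real Transformer layers add multiple heads, masking,
# learned projections and batching.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled for numerical stability.
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax turns scores into attention weights that sum to 1.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each output position is a weighted mixture of the value vectors.
    return weights @ V

# Toy example: a sequence of 3 token vectors of dimension 4 attending to itself.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
print(scaled_dot_product_attention(x, x, x).shape)  # (3, 4)
```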

NLP is one of the hardest subjects in Computer Science. Language is not a well-defined system with exact rules and procedures; rather, it is ambiguous and demands a great deal of implicit knowledge about the world and human culture. Tackling such a challenge from a computing perspective requires major innovation and genuinely intelligent technologies. In the last decade, the discipline has evolved remarkably fast thanks to advances in computing hardware and deep neural networks. Transformers have been the latest and biggest leap forward, providing a versatile architecture that has had a broad impact on Deep Learning and on computing as a whole.