Transformers
Transformers Architecture was introduced in the paper "Attention is All You Need" by Vaswani et al. in 2017. It revolutionized the field of Natural Language Processing (NLP) and has since been adapted for various other tasks, including computer vision and audio processing.
Large Language Models (LLMs) like GPT-3, BERT, and others are built upon the Transformer architecture. Transformers are pretty complex. The key innovation of Transformers is the self-attention mechanism, which allows the model to weigh the importance of different words in a sentence relative to each other, regardless of their position.