3. Transformers
These models, introduced in the paper "Attention Is All You Need" (Vaswani et al., 2017), rely on self-attention mechanisms and are highly effective across a wide range of NLP tasks. Variants include:
(a) Transformer (base model)
(b) BERT (Bidirectional Encoder Representations from Transformers)
(c) GPT (Generative Pretrained Transformer)
(d) T5 (Text-to-Text Transfer Transformer)
(e) BART (Bidirectional and Auto-Regressive Transformers)
(f) RoBERTa, ALBERT, DistilBERT (BERT variants that differ in training strategy or model size)
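The self-attention mechanism shared by all the variants above is scaled dot-product attention: each token's output is a weighted average of all token values, with weights given by softmax(QK^T / sqrt(d_k)). A minimal NumPy sketch (toy sizes and random inputs chosen for illustration, not from any specific model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy example: 3 tokens with embedding dimension 4.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
# Self-attention: queries, keys, and values all come from the same sequence.
out, w = scaled_dot_product_attention(X, X, X)
print(out.shape)          # (3, 4): one output vector per token
print(w.sum(axis=-1))     # each attention row sums to 1
```

In the full transformer, Q, K, and V are separate learned linear projections of X, and several such attention "heads" run in parallel; the sketch omits those projections for brevity.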