Decoder-Only Transformer
A decoder-only transformer uses only the decoder part of the Transformer architecture, where each token can attend only to the previous tokens in the sequence.
Pre-training relies solely on causal language modeling, i.e. predicting the next token given all the tokens before it.
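To make the causal masking concrete, here is a minimal sketch of masked self-attention in PyTorch. The function name, weight matrices, and dimensions are illustrative assumptions, not taken from any particular model.

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projections (illustrative)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / (k.shape[-1] ** 0.5)              # (seq_len, seq_len)
    # Causal mask: position i may only attend to positions <= i.
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v                 # (seq_len, d_head)

# Toy example: 5 tokens, model width 8, head width 4.
x = torch.randn(5, 8)
w_q, w_k, w_v = (torch.randn(8, 4) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
```

The upper-triangular mask is what enforces the "previous tokens only" constraint: every row of the attention matrix zeroes out the weights on future positions before the softmax.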
These models are best suited for text generation, as sketched below.
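As a quick illustration, assuming the Hugging Face `transformers` library is installed, the sketch below loads a public decoder-only checkpoint ("gpt2" is used only as an example) and generates a continuation.

```python
from transformers import pipeline

# Load a decoder-only checkpoint for text generation (example checkpoint, not a recommendation).
generator = pipeline("text-generation", model="gpt2")

result = generator("Decoder-only transformers are", max_new_tokens=30)
print(result[0]["generated_text"])
```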
Some examples of decoder-only models: