Causal Language Modeling
In causal language modeling, the model predicts the next token from the tokens that precede it. Each position can attend only to itself and to the tokens on its left, never to tokens on its right.
This objective is mainly used to train decoder-only transformers and to generate text at inference time.
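The left-only attention constraint is usually enforced with a lower-triangular mask applied to the attention scores before the softmax. The following is a minimal pure-Python sketch of that idea, not the implementation of any particular library; the function names `causal_mask` and `causal_attention_weights` are hypothetical and chosen only for illustration.

```python
import math


def causal_mask(seq_len):
    # mask[i][j] is True when position i may attend to position j,
    # i.e. when j is at or to the left of i.
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def causal_attention_weights(scores):
    # scores: square list of lists; row i holds the raw attention
    # scores of query position i against every key position.
    n = len(scores)
    mask = causal_mask(n)
    weights = []
    for i in range(n):
        # Subtract the largest allowed score for numerical stability.
        row_max = max(scores[i][j] for j in range(n) if mask[i][j])
        # Masked (future) positions get weight exactly 0.
        exps = [
            math.exp(scores[i][j] - row_max) if mask[i][j] else 0.0
            for j in range(n)
        ]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Illustrative input: four positions with identical raw scores.
weights = causal_attention_weights([[0.0] * 4 for _ in range(4)])
for row in weights:
    print([round(w, 2) for w in row])
```

With identical scores, row i spreads its weight evenly over the first i + 1 positions and assigns zero to every later position, so the first token attends only to itself and the last token attends to the whole sequence.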