Autoregressive Model

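Decoder-only and encoder-decoder Transformers are called autoregressive because they generate output one token at a time. Each new token is predicted by attending to the previously generated tokens (and, in the encoder-decoder case, to the encoded input as well), and it is then appended to the sequence before the next token is predicted.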
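
The generation loop described above can be sketched as follows. This is a minimal illustration, not a real Transformer: `toy_next_token_scores` is a hypothetical stand-in for a model's forward pass, `EOS = 0` is an assumed end-of-sequence token id, and greedy selection (always taking the highest-scoring token) is just one possible decoding strategy.

```python
from typing import Callable, List

EOS = 0  # assumed end-of-sequence token id (hypothetical)


def toy_next_token_scores(tokens: List[int], vocab_size: int = 5) -> List[float]:
    """Hypothetical stand-in for a Transformer forward pass.

    Scores every candidate next token using only the tokens produced so far.
    This toy version simply favours (last_token + 1) % vocab_size.
    """
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if t == target else 0.0 for t in range(vocab_size)]


def generate(
    prompt: List[int],
    score_fn: Callable[[List[int]], List[float]],
    max_new_tokens: int = 10,
) -> List[int]:
    """Greedy autoregressive decoding: one token per step."""
    tokens = list(prompt)  # copy so the caller's prompt is not modified
    for _ in range(max_new_tokens):
        scores = score_fn(tokens)  # conditioned on all previous tokens
        next_token = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_token)  # the output becomes part of the next input
        if next_token == EOS:
            break
    return tokens


print(generate([1], toy_next_token_scores))  # [1, 2, 3, 4, 0]
```

The key point is the feedback step: each predicted token is appended to `tokens`, so the next call to `score_fn` sees it as context. In a real model the scores would come from the network's output logits, and sampling could replace the greedy `max`.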


References


Related Notes