Foundation Model

Foundation Models are large models trained on broad data and are sometimes described as a step toward general-purpose AI: a single model can generate output from human instructions across many tasks. They are trained with Self-Supervised Learning, where the training signal comes from patterns in the input data itself rather than hand-labelled examples. This broad, task-agnostic training is what distinguishes them from ML architectures built for a single task.

A language foundation model learns the probability of the next word given the input and the previously generated output. A vision foundation model can sharpen a blurry image, or generate a new image from a text prompt.
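Next-word probability can be sketched with a toy bigram model. This is a hedged, minimal illustration, not how a real foundation model works internally: the point is that the "labels" (next words) come from the raw text itself, so no human annotation is needed.

```python
from collections import defaultdict, Counter

# Self-supervision: the targets (next words) come from the raw corpus itself.
corpus = "the cat sat on the mat the cat ran".split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_probs(word):
    """Probability distribution over the next word, given the current word."""
    counts = bigram_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```

In this corpus, "the" is followed by "cat" twice and "mat" once, so `next_word_probs("the")` assigns probability 2/3 to "cat". Large language models do the same thing in spirit, but condition on the full context with a neural network instead of a single previous word.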

They are called foundation models because they serve as the foundation for many task-specific models: a single pretrained model such as BERT can be adapted to multiple downstream tasks, even tasks it was never explicitly trained on.
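The reuse pattern above can be sketched as one shared, frozen encoder feeding several lightweight task heads. This is a toy stand-in (a bag-of-letters feature vector, not BERT), chosen only to show the structure: expensive pretrained representation computed once, cheap heads per task.

```python
# Hypothetical frozen "foundation" encoder shared across downstream tasks.
FEATURES = "abcdefghijklmnopqrstuvwxyz"

def encode(text):
    """Shared representation: letter counts as a crude feature vector."""
    text = text.lower()
    return [text.count(ch) for ch in FEATURES]

def vowel_ratio_head(features):
    """Downstream task 1: fraction of letters that are vowels."""
    total = sum(features)
    vowels = sum(features[FEATURES.index(v)] for v in "aeiou")
    return vowels / total if total else 0.0

def letter_count_head(features):
    """Downstream task 2: total letter count."""
    return sum(features)

# One encoding pass, reused by both task heads.
feats = encode("Foundation Model")
ratio = vowel_ratio_head(feats)
count = letter_count_head(feats)
```

With a real foundation model, `encode` would be the pretrained network and each head would be fine-tuned (or trained from scratch) on its own small labelled dataset.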

Use Cases:

  1. Language Processing
  2. Visual Comprehension
  3. Code Generation

Examples:

  1. BERT
  2. GPT
  3. BLOOM
  4. Cohere

References


Related Notes