Instruction Fine Tuning

#llm #interview

In the instruction fine tuning, the model is trained using user request and assistant response. This can be multi-turn with other turns like tool calling, tool response.

The cross entropy loss is only calculated for assistant response or the responses that are expected from the model, i.e., tool calling.

This stage is mostly used to train the model for following a specific task like summarization, tool calling, qna.

References

Related Notes

Probability vs. Likelihood
Why do we use Projection in QKV?
Word2Vec Embedding
Expected Value
Unsupervised Learning