Instruction Fine Tuning

In the instruction fine tuning, the model is trained using user request and assistant response. This can be multi-turn with other turns like tool calling, tool response.

The cross entropy loss is only calculated for assistant response or the responses that are expected from the model, i.e., tool calling.

This stage is mostly used to train the model for following a specific task like summarization, tool calling, qna.


References


Related Notes