Pre-Fill in LLMs
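Pre-fill is the initial phase of inference, in which the inference engine (e.g. vLLM) computes the query, key, and value tensors for the entire prompt in a single forward pass and saves the keys and values in the KV cache. The queries are used only to compute attention for the positions being processed, so they are not cached.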
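To make the idea concrete, here is a minimal single-head sketch in NumPy. It is not how vLLM implements pre-fill. The function names (`prefill`, `decode_step`), the dictionary-based cache, and the random weights are all assumptions chosen for illustration. It shows the whole prompt being projected to Q, K, and V at once, with only K and V kept for later decoding steps.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def prefill(prompt_emb, w_q, w_k, w_v):
    """Hypothetical pre-fill: process all T prompt tokens in one pass."""
    q = prompt_emb @ w_q  # (T, d) queries, used now and then discarded
    k = prompt_emb @ w_k  # (T, d) keys, stored in the cache
    v = prompt_emb @ w_v  # (T, d) values, stored in the cache
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: position i may not attend to positions j > i.
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    out = softmax(scores) @ v
    kv_cache = {"k": k, "v": v}  # illustrative cache layout
    return out, kv_cache

def decode_step(token_emb, w_q, w_k, w_v, kv_cache):
    """Hypothetical decode step: one new token, reusing cached K and V."""
    q = token_emb @ w_q  # (1, d)
    kv_cache["k"] = np.vstack([kv_cache["k"], token_emb @ w_k])
    kv_cache["v"] = np.vstack([kv_cache["v"], token_emb @ w_v])
    d = q.shape[-1]
    scores = q @ kv_cache["k"].T / np.sqrt(d)
    return softmax(scores) @ kv_cache["v"]

# Usage with made-up sizes: a 5-token prompt and model width 8.
rng = np.random.default_rng(0)
d_model = 8
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
prompt = rng.normal(size=(5, d_model))
prompt_out, cache = prefill(prompt, w_q, w_k, w_v)
next_token = rng.normal(size=(1, d_model))
step_out = decode_step(next_token, w_q, w_k, w_v, cache)
```

Because the keys and values of the prompt are already in the cache, the decode step only projects the single new token. Its output matches the last row of a full recomputation over all six tokens.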