Paged KV Cache

What does it do?

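  1. About 96% utilization of KV cache memory (under 4% wasted)
  2. No external fragmentation; internal fragmentation is limited to the last, partially filled block of each sequence
  3. A sequence's KV cache can live in non-contiguous memory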
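
A rough way to see where the savings come from is to compare reserving one contiguous maximum-length region per sequence with allocating fixed-size blocks on demand. The sketch below is only an illustration: the block size, the maximum length, and the sequence lengths are made-up assumptions, not measurements, and it does not reproduce the utilization figure quoted above.

```python
import math

BLOCK_SIZE = 16   # tokens per block (assumed for illustration)
MAX_LEN = 2048    # contiguous reservation per sequence (assumed)
lengths = [100, 500, 37, 1200]  # hypothetical sequence lengths, in tokens


def contiguous_utilization(lengths, max_len):
    """Every sequence reserves max_len token slots up front."""
    return sum(lengths) / (len(lengths) * max_len)


def paged_utilization(lengths, block_size):
    """Every sequence holds only the blocks it has touched so far."""
    slots = sum(math.ceil(n / block_size) * block_size for n in lengths)
    return sum(lengths) / slots


print(f"contiguous: {contiguous_utilization(lengths, MAX_LEN):.1%}")
print(f"paged:      {paged_utilization(lengths, BLOCK_SIZE):.1%}")
```

With block allocation, the only unused slots are those at the tail of each sequence's last block, so the waste per sequence is always less than one block.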

How does it do it?

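  1. Borrows paging and virtual memory from OS memory management
  2. KV blocks: fixed-size chunks, each holding the keys and values for a fixed number of tokens
  3. Block tables: per-sequence maps from logical KV blocks to physical KV blocks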
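
The three ideas above can be sketched as a toy allocator. This is a minimal sketch, not any library's actual implementation: `BlockAllocator`, `Sequence`, and `BLOCK_SIZE` are hypothetical names, and the code tracks only block bookkeeping, not the key and value tensors themselves.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 16  # tokens per KV block (assumed value for illustration)


class BlockAllocator:
    """Hands out physical block ids from a fixed pool, like an OS frame allocator."""

    def __init__(self, num_blocks: int) -> None:
        # Reversed so that pop() yields ids 0, 1, 2, ... in order.
        self.free_blocks = list(range(num_blocks - 1, -1, -1))

    def allocate(self) -> int:
        if not self.free_blocks:
            raise MemoryError("no free KV blocks")
        return self.free_blocks.pop()

    def release(self, block_id: int) -> None:
        self.free_blocks.append(block_id)


@dataclass
class Sequence:
    """One request: its token count and its block table."""

    num_tokens: int = 0
    # Index = logical block number, value = physical block id.
    block_table: list = field(default_factory=list)

    def append_token(self, allocator: BlockAllocator) -> tuple:
        """Reserve a slot for one new token; return (physical block, offset)."""
        offset = self.num_tokens % BLOCK_SIZE
        if offset == 0:  # no block yet, or the last block is full
            self.block_table.append(allocator.allocate())
        self.num_tokens += 1
        return self.block_table[-1], offset

    def locate(self, token_index: int) -> tuple:
        """Translate a logical token position to (physical block, offset)."""
        if not 0 <= token_index < self.num_tokens:
            raise IndexError(token_index)
        return self.block_table[token_index // BLOCK_SIZE], token_index % BLOCK_SIZE

    def release_all(self, allocator: BlockAllocator) -> None:
        """Return every block to the pool when the request finishes."""
        for block_id in self.block_table:
            allocator.release(block_id)
        self.block_table.clear()
        self.num_tokens = 0


# Two sequences growing in lockstep end up with interleaved physical blocks.
allocator = BlockAllocator(num_blocks=8)
a, b = Sequence(), Sequence()
for _ in range(20):
    a.append_token(allocator)
    b.append_token(allocator)

print(a.block_table)  # [0, 2]
print(b.block_table)  # [1, 3]
print(a.locate(17))   # (2, 1)
```

Each sequence sees a contiguous logical range of tokens, while its physical blocks are scattered through the pool. Because any free block can serve any sequence, freed blocks are reusable immediately and no compaction is needed.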


References

  1. https://www.youtube.com/watch?v=5ZlavKF_98U