Concept: Memory management
CXL-SpecKV: A Disaggregated FPGA Speculative KV-Cache for Datacenter LLM Serving
Offloading memory to remote accelerators improves LLM inference speed and reduces costs
