Concept: Prefix
PAT: Accelerating LLM Decoding via Prefix-Aware Attention with Resource Efficient Multi-Tile Kernel
Accelerating language model inference by reusing shared prompt cache across concurrent requests
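The summary above refers to the general idea of prefix KV-cache reuse: when many concurrent requests share the same prompt prefix (e.g. a system prompt), the prefix's KV cache is computed once and reused, so each request only prefills its own suffix. The sketch below is a toy illustration of that idea under stated assumptions, not PAT's actual attention kernel; all names (PrefixCache, compute_kv, prefill) are hypothetical.

```python
# Toy sketch of prefix KV-cache reuse across concurrent requests.
# Not PAT's kernel; names and data structures here are illustrative only.

from typing import Dict, List, Tuple


def compute_kv(tokens: Tuple[int, ...]) -> List[str]:
    """Stand-in for attention prefill: pretend each token yields one KV entry."""
    return [f"kv({t})" for t in tokens]


class PrefixCache:
    """Maps a shared prompt prefix to its already-computed KV entries."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[int, ...], List[str]] = {}

    def get_or_compute(self, prefix: Tuple[int, ...]) -> List[str]:
        if prefix not in self._cache:      # first request pays the prefill cost
            self._cache[prefix] = compute_kv(prefix)
        return self._cache[prefix]         # later requests reuse the same KV


def prefill(request_tokens: Tuple[int, ...],
            shared_prefix: Tuple[int, ...],
            cache: PrefixCache) -> List[str]:
    """Prefill one request: reuse the shared-prefix KV, compute only the suffix."""
    assert request_tokens[:len(shared_prefix)] == shared_prefix
    prefix_kv = cache.get_or_compute(shared_prefix)
    suffix_kv = compute_kv(request_tokens[len(shared_prefix):])
    return prefix_kv + suffix_kv


if __name__ == "__main__":
    system_prompt = (1, 2, 3, 4)           # prefix shared by concurrent requests
    cache = PrefixCache()
    for suffix in [(10, 11), (20,), (30, 31, 32)]:
        kv = prefill(system_prompt + suffix, system_prompt, cache)
        print(len(kv), "KV entries total;", len(suffix), "computed fresh after warm-up")
```

In this toy model, only the first request computes the shared prefix's KV entries; every later request that matches the prefix skips that work entirely, which is the source of the speedup the tagline describes.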
