Systems on mc · notes

Recent content in Systems on mc · notes
https://hk.crepuscule.uk/tags/systems/
Sun, 08 Feb 2026 15:20:00 +0800

KV cache, and why LLM inference is memory-bound
https://hk.crepuscule.uk/posts/kv-cache/
Sun, 08 Feb 2026 15:20:00 +0800
The cache that makes autoregressive decoding fast also makes it the thing that runs out of memory first.

Notes on cgroups v2: memory limits that actually hold
https://hk.crepuscule.uk/posts/cgroups-v2/
Sun, 09 Mar 2025 21:10:00 +0800
Why my OOM kills moved around after switching to the unified hierarchy, and the three knobs that matter.