I’m a postgrad in Hong Kong working on distributed systems and the infrastructure that keeps training jobs fed. This is a small digital garden — reading notes, half-finished benchmarks, and scripts I’d otherwise lose. Mostly for me; you’re welcome to read.
Reading list — June 2026
A few things that crossed my desk this month.
Data-parallel training: gradient bucketing and overlap
Why DDP feels like magic until you look at the allreduce schedule.
Reading notes: QUIC, 0-RTT and why HTTP/3 is UDP
Moving HTTP onto UDP was not insane — head-of-line blocking is the reason.
A minimal, reproducible Docker setup for ML experiments
Pin everything, mount data, never bake it. The boring setup that stopped me losing runs.
KV cache, and why LLM inference is memory-bound
The cache that makes autoregressive decoding fast is also the first thing to exhaust memory.
eBPF: tracing syscalls without a kernel module
Attaching a verified program to a tracepoint, and why this beats strace under load.
Tuning TCP BBR and fq on a high-latency link
Switching off loss-based congestion control on a long-fat path, and the fq gotcha for UDP.
WireGuard: why it's so small
~4k lines, one cipher suite, no negotiation. The design choices that make it auditable.
The TLS 1.3 handshake, in plain terms
One round trip instead of two, and why the ServerHello already carries a key share.
A first, honest look at io_uring
Two ring buffers, one syscall, and the mental model that finally made it click.