eBPF: tracing syscalls without a kernel module

strace is wonderful until you point it at a busy process and watch throughput fall off a cliff — every syscall round-trips through ptrace. eBPF lets you run a small, verified program inside the kernel at a tracepoint instead, so the data never leaves kernel space until you ask for a summary.

The shape of it

With bpftrace the whole thing is a one-liner. Count syscalls by name across the system:

bpftrace -e 'tracepoint:syscalls:sys_enter_* { @[probe] = count(); }'

Or watch which files a particular binary opens:

bpftrace -e 'tracepoint:syscalls:sys_enter_openat
  /comm == "nginx"/ { printf("%s\n", str(args->filename)); }'

Why it’s safe to run in production

Before your program is attached, the kernel verifier walks every path and checks that it terminates (any loop has to be provably bounded) and never touches memory it shouldn’t. A program that doesn’t pass simply won’t load. That’s the deal that makes it acceptable to run tracing on a production box: a bad probe gets rejected at load time instead of panicking the kernel.

Where I reach for it

Latency histograms for a specific syscall, off-CPU analysis, “which process is hammering the disk right now” — all without recompiling anything or loading a module. The learning curve is real, but bpftrace for ad-hoc questions and libbpf + CO-RE for anything you want to ship is a good split.
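As a rough sketch of the first item, here is one way a per-syscall latency histogram might look in bpftrace. It pairs the enter and exit tracepoints for read(2), keys the start timestamp by thread ID, and buckets the elapsed time in microseconds. The file name readlat.bt and the @start / @read_us map names are arbitrary choices for illustration, and the "nginx" filter just reuses the example process from above; swap in whatever you care about.

```shell
# Write the bpftrace script to a file (the name readlat.bt is arbitrary).
cat > readlat.bt <<'EOF'
// Remember when each thread of the target process entered read(2).
tracepoint:syscalls:sys_enter_read
/comm == "nginx"/
{
  @start[tid] = nsecs;
}

// On exit, bucket the elapsed time (microseconds) into a power-of-two histogram.
tracepoint:syscalls:sys_exit_read
/@start[tid]/
{
  @read_us = hist((nsecs - @start[tid]) / 1000);
  delete(@start[tid]);
}

// Drop leftover in-flight entries so only the histogram is printed at exit.
END
{
  clear(@start);
}
EOF

# Then run it as root and press Ctrl-C to print the histogram:
#   sudo bpftrace readlat.bt
```

Only the histogram buckets cross into user space when you stop the script, so the per-event cost stays in the kernel. Threads that were already inside read() when the script started are skipped by the /@start[tid]/ predicate rather than producing bogus latencies.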
