Ever wonder if the seccomp profile Docker or Kubernetes thinks it’s applying is the same one actually enforcing syscalls inside your container? Sure, we all do, right? Well, I wrote seccomp-diff, a tool that digs into a live process using ptrace, extracts the seccomp BPF bytecode, and lets you compare it with the default seccomp profile applied in the cluster or with the profiles of other running containers.
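To make the ptrace step concrete, here is a minimal sketch of how a filter can be pulled out of a live process. It is not the seccomp-diff source: the helper names `parse_filter` and `dump_filter` are made up for illustration, and it assumes Linux, root privileges (CAP_SYS_ADMIN), and a kernel that supports the `PTRACE_SECCOMP_GET_FILTER` request.

```python
import ctypes
import os
import struct

# Request numbers from <linux/ptrace.h>.
PTRACE_ATTACH = 16
PTRACE_DETACH = 17
PTRACE_SECCOMP_GET_FILTER = 0x420C

# struct sock_filter { __u16 code; __u8 jt; __u8 jf; __u32 k; }, native byte order.
SOCK_FILTER = struct.Struct("=HBBI")


def parse_filter(raw: bytes) -> list[tuple[int, int, int, int]]:
    """Split a raw buffer into (code, jt, jf, k) tuples, one per BPF instruction."""
    return [SOCK_FILTER.unpack_from(raw, off)
            for off in range(0, len(raw), SOCK_FILTER.size)]


def dump_filter(pid: int, index: int = 0) -> list[tuple[int, int, int, int]]:
    """Attach to pid and read its index-th seccomp filter (hypothetical helper)."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_int,
                            ctypes.c_void_p, ctypes.c_void_p]
    libc.ptrace.restype = ctypes.c_long

    def ptrace(request, addr, data):
        ctypes.set_errno(0)
        ret = libc.ptrace(request, pid, addr, data)
        if ret == -1:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return ret

    ptrace(PTRACE_ATTACH, None, None)
    try:
        os.waitpid(pid, 0)  # wait until the tracee has stopped
        # With a NULL buffer the kernel only reports the instruction count.
        count = ptrace(PTRACE_SECCOMP_GET_FILTER, index, None)
        buf = ctypes.create_string_buffer(count * SOCK_FILTER.size)
        ptrace(PTRACE_SECCOMP_GET_FILTER, index, ctypes.cast(buf, ctypes.c_void_p))
        return parse_filter(buf.raw)
    finally:
        libc.ptrace(PTRACE_DETACH, pid, None, None)
```

Something like `dump_filter(436762)` run as root would return the instruction tuples for the first filter attached to that process. A process can have several filters stacked, which is why the request takes an index.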
If you’re building a “sandbox” or “hardened container” or whatever you want to call a container with a restricted runtime, many security guides tell you to build your own custom seccomp profile from scratch. Setting seccomp profiles is undeniably a great way to harden your container in principle, but in practice, it loads up two guns to shoot at your foot.
One (among many) problem is that what your container applies may not match what you expect. Differences in kernel version, container runtime version, and granted capabilities can leave common issues uncovered or bypass the restrictions you were expecting entirely. For example, there have been a bunch of security issues related to io_uring, but whether or not you block these syscalls is not easy to see in your seccomp profile unless you go looking for those specific system calls.
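As a small illustration of why this is hard to eyeball, the sketch below searches a Docker/OCI-style JSON profile for the io_uring syscalls and reports any conditions attached to the matching rules. The profile here is a tiny made-up example, not Docker's real default, and `rules_mentioning` is a hypothetical helper. Note that this only tells you what the JSON asks for, not what the kernel ends up enforcing.

```python
import json

IO_URING_SYSCALLS = {"io_uring_setup", "io_uring_enter", "io_uring_register"}

# Made-up profile for illustration only.
EXAMPLE_PROFILE = json.loads("""
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "syscalls": [
    {"names": ["read", "write", "io_uring_setup"], "action": "SCMP_ACT_ALLOW"},
    {"names": ["io_uring_enter"], "action": "SCMP_ACT_ALLOW",
     "includes": {"caps": ["CAP_SYS_ADMIN"]}}
  ]
}
""")


def rules_mentioning(profile: dict, names: set) -> list:
    """Return every rule in the profile that names one of the given syscalls."""
    hits = []
    for rule in profile.get("syscalls", []):
        matched = sorted(names & set(rule.get("names", [])))
        if matched:
            hits.append({
                "names": matched,
                "action": rule.get("action"),
                # Conditions such as required capabilities decide whether
                # the rule is applied at all.
                "includes": rule.get("includes", {}),
                "excludes": rule.get("excludes", {}),
            })
    return hits


hits = rules_mentioning(EXAMPLE_PROFILE, IO_URING_SYSCALLS)
covered = {name for hit in hits for name in hit["names"]}
for hit in hits:
    print(hit)
print("left to defaultAction:", sorted(IO_URING_SYSCALLS - covered))
```

Even in this toy profile the answer is spread over three places: an unconditional allow, an allow that depends on a capability, and the default action.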
This is why seccomp-diff goes straight to the source by inspecting the process itself, giving you a list of all the system calls that are allowed along with some background information about each one. The output is the ground truth and lets you be accountable for the system calls that you’re allowing.
I’ve also thrown in a seccomp-dump tool for those that don’t care about containers:
sudo python seccomp_dump.py --dump 436762
l0000: 20 00 00 00000004 A = [4](ARCH)
l0001: 15 00 04 c000003e IF ARCH != X86_64: 6(l0006)
l0002: 20 00 00 00000000 A = [0](SYSCALL)
l0003: 35 00 01 40000000 jlt #0x40000000, l5
l0004: 15 00 01 ffffffff IF SYSCALL != 0xffffffff: KILL(l0006)
l0005: 06 00 00 7ffc0000 RETURN LOG
l0006: 06 00 00 00000000 RETURN KILL
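To show how a dump like this turns into a per-syscall answer, here is a minimal sketch of a classic BPF evaluator that handles only the four opcodes appearing above. It is an illustration under that assumption, not the decoder seccomp-dump uses. The names `FILTER`, `ACTIONS`, and `evaluate` are made up for this example.

```python
# Classic BPF opcodes used by the dump above (values from <linux/bpf_common.h>).
LD_W_ABS = 0x20   # A = seccomp_data[k]
JEQ_K = 0x15      # jump if A == k
JGE_K = 0x35      # jump if A >= k
RET_K = 0x06      # return k

AUDIT_ARCH_X86_64 = 0xC000003E
# A few seccomp return values from <linux/seccomp.h>.
ACTIONS = {0x00000000: "KILL", 0x7FFC0000: "LOG", 0x7FFF0000: "ALLOW"}

# (code, jt, jf, k) tuples transcribed from the dump above.
FILTER = [
    (0x20, 0, 0, 0x00000004),
    (0x15, 0, 4, 0xC000003E),
    (0x20, 0, 0, 0x00000000),
    (0x35, 0, 1, 0x40000000),
    (0x15, 0, 1, 0xFFFFFFFF),
    (0x06, 0, 0, 0x7FFC0000),
    (0x06, 0, 0, 0x00000000),
]


def evaluate(prog, nr, arch=AUDIT_ARCH_X86_64):
    """Run the filter for one syscall number and return its raw return value."""
    data = {0: nr, 4: arch}  # offsets of nr and arch in struct seccomp_data
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1  # jump offsets are relative to the next instruction
        if code == LD_W_ABS:
            acc = data[k]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == JGE_K:
            pc += jt if acc >= k else jf
        elif code == RET_K:
            return k
        else:
            raise NotImplementedError(f"opcode {code:#04x} not handled in this sketch")


# io_uring_setup is syscall 425 on x86_64.
print(ACTIONS[evaluate(FILTER, 425)])                   # LOG
print(ACTIONS[evaluate(FILTER, 1, arch=0x40000003)])    # KILL (wrong architecture)
```

Looping `evaluate` over every syscall number for an architecture gives the full list of actions, which is the kind of ground truth that is hard to read off a JSON profile.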
This tool was built for a talk at the last Shmoocon in 2025. It was a huge effort to build, and at least initially no LLMs were used to help me out, so I learned a ton about decoding seccomp’s BPF bytecode and how containers really interact with the kernel. It’s one thing to say I know how a container is created; it’s another to watch the syscalls flow from containerd to the shim, to runc, and eventually to your entrypoint.
If you’ve read this far, let me point out something ironic as a reward. I am starting with the premise that seccomp in JSON is hard to manage… but what format do you think I return to the web interface to populate the visual diff? That’s right:
This is a tool that bypasses seccomp JSON to extract the seccomp bytecode directly from the process, and then returns a seccomp JSON file. The irony is not lost.
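For a rough idea of that round trip, the sketch below groups per-syscall actions into a Docker/OCI-style profile dict. The exact JSON seccomp-diff sends to its web interface is not shown here, so treat the shape and the `to_profile` helper as assumptions. Picking the most common action as `defaultAction` is also just one possible design choice.

```python
import json
from collections import Counter, defaultdict


def to_profile(actions: dict, arch: str = "SCMP_ARCH_X86_64") -> dict:
    """Group {syscall name: action} pairs into an OCI-style seccomp profile."""
    # Use the most common action as the default so the rule list stays short.
    default = Counter(actions.values()).most_common(1)[0][0]
    by_action = defaultdict(list)
    for name, action in actions.items():
        if action != default:
            by_action[action].append(name)
    return {
        "defaultAction": default,
        "architectures": [arch],
        "syscalls": [
            {"names": sorted(names), "action": action}
            for action, names in sorted(by_action.items())
        ],
    }


# Made-up evaluation results for illustration only.
observed = {
    "read": "SCMP_ACT_ALLOW",
    "write": "SCMP_ACT_ALLOW",
    "io_uring_setup": "SCMP_ACT_ERRNO",
}
print(json.dumps(to_profile(observed), indent=2))
```

A profile rebuilt this way describes what the bytecode does for the syscalls that were evaluated. It will not necessarily look like the JSON that was originally handed to the runtime.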
I’m hoping by Summercamp this year I’ll be ready to wrap all of this up into some research around teams that are building custom seccomp profiles.