Snooping on proprietary protocols with Frida

Background

During one of our recent assessments, we encountered a hardware appliance using a proprietary protocol to provide its services to desktop clients. As we did not have access to the appliance, apart from what was exposed on its open ports, we decided to inspect the Windows client and study its protocol.

After a brief traffic inspection with Wireshark, we determined that the binary was talking to the appliance on port 443. In order to see the plaintext data we set up Windows to use our Burp listener as a proxy, started the application and… nothing happened.

We quickly discovered our approach was not working for several reasons: the application did not honour Windows proxy settings, the protocol in use was not HTTP, and most importantly, the application was using its own embedded certificate.

We then decided to hook the functions responsible for encryption/decryption to dump plaintext data. In this article, we will show how we found those functions in the stripped binary and used Frida’s JS API to dump plaintext traffic.

Finding suitable functions to hook

First of all, we needed to identify the SSL library used in the binary. The binary was stripped, but running “strings | grep -i ssl” on it revealed, among others, the following:

C:\Source\3rdparty\matrixssl\matrixssl\crypto\digest\sha512.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\pubkey\rsa.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\keyformat\x509.c
C:\Source\3rdparty\matrixssl\matrixssl\core\corelib.c
C:\Source\3rdparty\matrixssl\matrixssl\core\WIN32\osdep.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\pubkey\ecc.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\layer\matrix.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\digest\hmac.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\keyformat\pkcs.c
C:\Source\3rdparty\matrixssl\matrixssl\crypto\pubkey\pubkey.c
C:\Source\3rdparty\matrixssl\matrixssl\matrixssl\sslEncode.c
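
Where strings and grep are not at hand, the same search can be approximated in a few lines of JavaScript. The sketch below is illustrative only: findStrings and the sample buffer are hypothetical names and data rather than part of our tooling, and the scan only handles plain ASCII runs (it would miss UTF-16 strings).

```javascript
// Minimal strings-like scan: collect runs of printable ASCII of at least
// minLen bytes, then keep the runs containing a case-insensitive keyword.
// (findStrings is a hypothetical helper written for this example.)
function findStrings(bytes, keyword, minLen = 4) {
    const needle = keyword.toLowerCase();
    const hits = [];
    let run = '';
    const flush = () => {
        if (run.length >= minLen && run.toLowerCase().includes(needle)) {
            hits.push(run);
        }
        run = '';
    };
    for (const b of bytes) {
        if (b >= 0x20 && b <= 0x7e) {
            run += String.fromCharCode(b);   // printable ASCII extends the run
        } else {
            flush();                         // any other byte ends the run
        }
    }
    flush();
    return hits;
}

// Synthetic sample standing in for the real binary (made-up data that
// embeds one of the paths listed above between NUL bytes).
const sample = Buffer.from(
    '\x00\x01C:\\Source\\3rdparty\\matrixssl\\matrixssl\\core\\corelib.c' +
    '\x00junk\x00kernel32.dll\x00', 'latin1');
console.log(findStrings(sample, 'ssl'));
```

Under Node.js, the contents of the real binary could be passed in place of the sample buffer, for instance after loading it with fs.readFileSync().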

MatrixSSL is a lightweight SSL implementation, and it’s open source. The documentation for its API lives here.

We started to look at the source code to find points where we could stop execution and read the plaintext buffers: before encryption for outgoing data, and after decryption for incoming data. We chose two functions, matrixSslProcessedData() and matrixSslEncode().

matrixSslProcessedData(), as per the documentation, is called after the user code has finished processing incoming data, to tell the library that it no longer needs to keep that data in memory. It is defined in matrixsslApi.c as:

int32 matrixSslProcessedData(ssl_t *ssl, unsigned char **ptbuf, uint32 *ptlen)

The second parameter, **ptbuf, points at the plaintext buffer. If we stop the execution at the first instruction of this function and dump it, we can read the plaintext of the incoming traffic.

With this information, we started looking at the binary to find out where this function is. While the binary is stripped, there is a quick way to find this (and many other) MatrixSSL functions.

The function contains the following statement:

psAssert(ssl->insize > 0 && ssl->inbuf != NULL);

psAssert is defined in osdep.h. It is actually a macro, defined as:

#define psAssert(C) if (C) {; } else \
{ halAlert(); _psTraceStr("psAssert %s", __FILE__); _psTraceInt(":%d ", __LINE__); \
_psError(#C); }

The macro takes an expression C as an argument, and if the expression is not true, it transforms it (via the # stringizing operator) into a string and passes it to _psError(). This means all the arguments to psAssert() are present as strings in the binary!

With this information in mind, we could now search for references to “ssl->insize > 0 && ssl->inbuf != NULL”, which is only used in the psAssert() call in matrixSslProcessedData().
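
The string search itself can also be done from a script rather than from the disassembler. As a minimal sketch, the helper below turns the assertion text into the space-separated hex pattern format that Frida's Memory.scan() and Memory.scanSync() accept. asciiToPattern is a hypothetical name introduced for this example, and the Frida calls are left in comments because they only run inside a Frida session.

```javascript
// Convert an ASCII string into a space-separated hex byte pattern,
// the format Frida's Memory.scan()/Memory.scanSync() expect.
// (asciiToPattern is a hypothetical helper written for this example.)
function asciiToPattern(text) {
    return Array.from(text, (ch) => {
        const code = ch.charCodeAt(0);
        if (code > 0x7f) {
            throw new RangeError('non-ASCII character in pattern: ' + ch);
        }
        return code.toString(16).padStart(2, '0');
    }).join(' ');
}

// The stringized psAssert() argument we are looking for.
const assertText = 'ssl->insize > 0 && ssl->inbuf != NULL';
const assertPattern = asciiToPattern(assertText);
console.log(assertPattern);

// Inside a Frida session (not runnable under plain Node.js), the pattern
// could then be used to locate the string in the main module:
//   const m = Process.enumerateModules()[0];
//   Memory.scanSync(m.base, m.size, assertPattern)
//       .forEach((hit) => console.log('string at', hit.address));
```

This only yields the address of the string. Finding the code that references it (the lea instruction in the listing that follows) is still a job for the disassembler's cross-reference view.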

We found this function, which is our matrixSslProcessedData():

push rdi
sub rsp,20
mov rdi,r8
mov rsi,rdx
mov rbx,rcx
test rcx,rcx
je redactedexename.7FF7A910A0B7
test rdx,rdx
je redactedexename.7FF7A910A0B7
test r8,r8
je redactedexename.7FF7A910A0B7
xor eax,eax
mov qword ptr ds:[rdx],rax
mov dword ptr ds:[r8],eax
cmp dword ptr ds:[rcx+DA8],eax
jle redactedexename.7FF7A9109FEC
cmp qword ptr ds:[rcx+D90],rax
jne redactedexename.7FF7A910A01C
lea rdx,qword ptr ds:[7FF7A93619F0]
lea rcx,qword ptr ds:[7FF7A9361950]
call redactedexename.7FF7A9115750
mov edx,61D
lea rcx,qword ptr ds:[7FF7A93618EC]
call redactedexename.7FF7A9115730
lea rcx,qword ptr ds:[7FF7A9361AF0] ←Reference to our string.
call redactedexename.7FF7A9114C50 ← Call to _psError.

Putting a breakpoint on the first instruction and running the target program allowed us to see the incoming traffic in plaintext, pointed to by the R10 register:

R10 : 0000026266717DB8 &"HTTP/1.1 200 OK\r\nDate: Wed, 30 Dec 2020 09:44:01 GMT\r\nServer: …

We repeated the same process for the matrixSslEncode() function, and confirmed the approach worked for outgoing traffic too.

Dumping the plaintext buffers with Frida

While the previous approach was useful to confirm we found the right places to hook, it wasn’t very practical given the large volume of incoming and outgoing data. We decided to use the Frida API to automate the process of stopping the executable at matrixSslEncode() and matrixSslProcessedData() and dumping our plaintext buffers.

In case you don’t know Frida, it is a dynamic instrumentation toolkit. It injects a JavaScript engine into a target process and allows you to perform a variety of tasks, such as memory searching and function hooking. We’ve found it very useful in our reversing projects. The documentation for its JavaScript API is here.

We came up with the following script:

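// Hook matrixSslProcessedData() and print the plaintext buffer on entry.
function hook_processed_data(address) {
    console.log('[+] hooking ' + address);

    Interceptor.attach(address, {
        onEnter(args) {
            console.log('========INCOMING:===========');
            // In our debugging session, R10 pointed at the plaintext here.
            console.log(this.context.r10.readCString());
        }
    });
}

// The first enumerated module is typically the main executable.
const m = Process.enumerateModules()[0];

// First bytes of matrixSslProcessedData(), copied from the disassembly.
const matrixSslProcessedDataFirstBytes = '57 48 83 EC 20 49 8B F8 48 8B F2 48 8B D9 48 85 C9 0F 84 F6 00 00 00 48 85 D2 0F 84 ED 00 00 00 4D 85 C0 0F 84 E4 00 00 00 33 C0 48 89 02 41 89 00 39 81 A8 0D 00 00 7E 09 48 39 81 90 0D 00 00';

let matrixSslProcessedDataAddress = null;
Memory.scan(m.base, m.size, matrixSslProcessedDataFirstBytes, {
    onMatch(address, size) {
        console.log('[+] matrixSslProcessedData() found at:', address);
        matrixSslProcessedDataAddress = address;
        hook_processed_data(matrixSslProcessedDataAddress);
        // Cancel the scan after the first match so we hook only once.
        return 'stop';
    },
    onError(reason) {
        console.log('[-] scan error: ' + reason);
    },
    onComplete() {
        if (matrixSslProcessedDataAddress === null) {
            console.log('[-] matrixSslProcessedData() not found');
        }
    }
});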

The script scans the main module for the first bytes of matrixSslProcessedData() and sets up a hook at the matching address. When the hook is reached, it prints the C string that the R10 register points to. After confirming it worked, we added a similar hook for matrixSslEncode().
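
A byte signature like the one above embeds the 4-byte displacements of the three je instructions (the bytes after each 0F 84), which may change if the client is rebuilt. Frida scan patterns accept ?? as a wildcard byte, so one possible hardening is to mask those displacements. The sketch below is an assumption on our part rather than something the original script needed: maskPattern is a hypothetical helper, and the byte offsets were counted by hand from the signature.

```javascript
// Replace selected bytes of a Frida scan pattern with the "??" wildcard.
// `ranges` is a list of [startIndex, length] pairs, counted in bytes.
// (maskPattern is a hypothetical helper written for this example.)
function maskPattern(pattern, ranges) {
    const bytes = pattern.trim().split(/\s+/);
    for (const [start, length] of ranges) {
        if (start < 0 || start + length > bytes.length) {
            throw new RangeError('mask range outside pattern');
        }
        for (let i = start; i < start + length; i++) {
            bytes[i] = '??';
        }
    }
    return bytes.join(' ');
}

// Opening bytes of the signature used in the script, up to the third je.
const prologue = '57 48 83 EC 20 49 8B F8 48 8B F2 48 8B D9 48 85 C9 ' +
    '0F 84 F6 00 00 00 48 85 D2 0F 84 ED 00 00 00 4D 85 C0 0F 84 E4 00 00 00';

// Each "0F 84" (je rel32) is followed by a 4-byte displacement to mask.
const masked = maskPattern(prologue, [[19, 4], [28, 4], [37, 4]]);
console.log(masked);
```

The masked string can be passed to Memory.scan() in place of the original signature. The trade-off is that a looser pattern is more likely to match more than one location, so it is worth checking how many hits it produces before hooking.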

Conclusion

Starting from a stripped binary we had no information about, we were able to first investigate its internals to dump plaintext traffic in a debugger, and then automate the process using Frida’s JS API. If you have never used Frida, we encourage you to play with it. It’s a wonderful tool, and its Python bindings make it more practical than other solutions (such as Intel PIN or DynamoRIO) for quick binary instrumentation.

We also found that our target was transmitting PHP serialized objects to the server backend, but that’s a story for another time. 🙂