dga (re???)

Only worked on this challenge 2 months after the CTF ended, here’s the file if you want to try.

Basic information gathering

Upon running strings on the given file, we see the string UPX which tells us it is packed by UPX. We can easily unpack it by running

upx -d ./dga

When executing the binary, it just freezes without giving any output or responding to any input. Stopping it by pressing ctrl+c on the keyboard shows this interesting output.

  File "dga.py", line 23, in <module>
  File "dga.py", line 8, in gen_domain
  File "nistbeacon/nistbeacon.py", line 195, in get_previous
  File "nistbeacon/nistbeacon.py", line 65, in _query_nist
  File "requests/api.py", line 72, in get
  File "requests/api.py", line 58, in request
  File "requests/sessions.py", line 520, in request
  File "requests/sessions.py", line 630, in send
  File "requests/adapters.py", line 440, in send
  File "urllib3/connectionpool.py", line 601, in urlopen
  File "urllib3/connectionpool.py", line 346, in _make_request
  File "urllib3/connectionpool.py", line 852, in _validate_conn
  File "urllib3/connection.py", line 326, in connect
  File "urllib3/util/ssl_.py", line 329, in ssl_wrap_socket
  File "ssl.py", line 407, in wrap_socket
  File "ssl.py", line 817, in __init__
  File "ssl.py", line 1077, in do_handshake
  File "ssl.py", line 689, in do_handshake

Run strings on the unpacked binary, and we see the string MEIPASS. This tells us that it is a pyinstaller binary. pyinstaller is just a program that bundles a python script into a native binary, which can be either a PE or ELF file.

Extract the original code

Ideally, we would want to obtain the original python code instead of looking at this bundled version. Upon some research, I found this repository, which sadly only works for PE files, while we have an ELF file here.

Other than that, it turns out that pyinstaller comes with pyi-archive_viewer that lets us view the files in the bundle. So just run pip install pyinstaller to get it.

Compressed files

Running pyi-archive_viewer on the unpacked binary gives the following

 pos, length, uncompressed, iscompressed, type, name
[(0, 245, 312, 1, 'm', 'struct'),
 (245, 1111, 1818, 1, 'm', 'pyimod01_os_path'),
 (1356, 4363, 9378, 1, 'm', 'pyimod02_archive'),
 (5719, 7386, 18683, 1, 'm', 'pyimod03_importers'),
 (13105, 1848, 4157, 1, 's', 'pyiboot01_bootstrap'),
 (14953, 210, 254, 1, 's', 'pyi_rth_pkgres'),
 (15163, 1077, 1774, 1, 's', 'pyi_rth_multiprocessing'),
 (16240, 465, 631, 1, 's', 'dga'),
 <truncated>
 (20600859, 4142389, 4142389, 0, 'z', 'PYZ-00.pyz')]

The 2 files that caught my attention were dga and PYZ-00.pyz, the former as it has the same name as our binary, and the latter as through some research I found that a pyz file is an archive of python bytecodes. The rest were .so files and therefore truncated in the output above.

pyi-archive_viewer allows us to extract these files by issuing the x command.

Extract the pyc files from the pyz file

This step is just running scripts. Almost all the ones I found online were for PE files, but stumbled upon this article, that highlighted a script that was used to extract all the pyc files from the pyz file.

The steps applied to PYZ-00.pyz are quite straightforward in this script:

  1. Save the magic number at the start of the pyz file. This magic number is to be used later for indicating the python version used in the file metadata.
  2. Read the offset of the “table of contents”, then load it into memory as a python dictionary.
  3. Based on the table of contents, read data from the pyz file according to given offset and length, decompress it, then finally write it into a pyc file with the magic number saved in (1).

The table of contents looks like this. It can be viewed by running pyi-archive_viewer PYZ-00.pyz.

 Name: (ispkg, pos, len)
{'Crypto': (1, 17, 303),
 'Crypto.Cipher': (1, 320, 820),
 'Crypto.Cipher.AES': (0, 1140, 3400),
 'Crypto.Cipher.ARC2': (0, 4540, 2248),
 'Crypto.Cipher.DES': (0, 6788, 2125),
 'Crypto.Cipher.DES3': (0, 8913, 2666),
 'Crypto.Cipher._mode_cbc': (0, 11579, 2790),
 'Crypto.Cipher._mode_ccm': (0, 14369, 6660),
 'Crypto.Cipher._mode_cfb': (0, 21029, 2936),
 'Crypto.Cipher._mode_ctr': (0, 23965, 4341),
 'Crypto.Cipher._mode_eax': (0, 28306, 4600),
 'Crypto.Cipher._mode_ecb': (0, 32906, 2194),
 'Crypto.Cipher._mode_gcm': (0, 35100, 6909),
 'Crypto.Cipher._mode_ocb': (0, 42009, 5711),
 'Crypto.Cipher._mode_ofb': (0, 47720, 2743),
 'Crypto.Cipher._mode_openpgp': (0, 50463, 2161),
 'Crypto.Cipher._mode_siv': (0, 52624, 4580),
 'Crypto.Hash': (1, 57204, 184),
 ...

Not going to elaborate too much of the steps above, you can refer to this script if interested, the code is quite simple to understand. (Must use python3 as the challenge was bundled with the python3 version of pyinstaller)

Running this script yields a folder with 700+ pyc files, all of which are the libraries used by the program such as Crypto, hashlib, http

Decompile the extracted dga file

Well this doesn’t tell us much about the challenge yet… Remember we extracted 2 files from the unpacked binary, dga and PYZ-00.pyz. Our main program logic should be inside dga.

dga contains bytecode for the program, however we cannot just run uncompyle6 (a well known python bytecode decompiler) on it, as it is not in the right format (pyc format).

How did I know dga contains bytecode for the main program? I just assumed and performed the following steps, and luckily it turned out to be true.

Recall when extracting from the pyz file we saved a magic number. Now we can prepend this magic number along with some file metadata to the dga file, so that it becomes a valid pyc file.

What’s left is to run uncompyle6 on the pyc file and we get the following code.

# uncompyle6 version 2.11.5
# Python bytecode 3.6 (3379)
# Decompiled from: Python 2.7.10 (default, Feb 22 2019, 21:17:52) 
# [GCC 4.2.1 Compatible Apple LLVM 10.0.1 (clang-1001.0.37.14)]
# Embedded file name: dga.py
from nistbeacon import NistBeacon
import time
import hashlib

def gen_domain():
    now = int(time.time())
    now -= 100000000
    record = NistBeacon.get_previous(now)
    val = record.output_value
    h = hashlib.md5()
    h.update(str(val).encode('utf-8'))
    res = h.digest()
    domain = ''
    for ch in res:
        tmp = (ch & 15) + (ch >> 4) + ord('a')
        if tmp <= ord('z'):
            domain += chr(tmp)

    domain += '.org'


if __name__ == '__main__':
    gen_domain()

Solving the challenge?

I can’t really remember what the challenge was asking for, but it was along the lines of generating a domain name based on a given timestamp. I think we just need to modify the function above and we will get the desired domain.


References

  • https://advancedpersistentjest.com/2016/07/31/manually-unpacking-pyinstaller-python-2p6/
  • https://hshrzd.wordpress.com/2018/01/26/solving-a-pyinstaller-compiled-crackme/