Camp X: Intro to Diaphora.

I finally got around to writing this post as well as updating the official Diaphora help.odt file.

This demo shows how easy it is to detect changes and deletions in patched binaries with Diaphora. I walk you through how to look for removed/replaced functions (_strcmp, _gets) in a vulnerable binary, and how to discover the two secure replacement functions (_strncmp, _fgets) in the patched binary. Let's not waste any time and jmp right in!

Diaphora Support?

It's important to note that, for now, Diaphora is only supported on the two latest releases of IDA Pro. This means that current support is limited to IDA Pro 6.7 and 6.8.

Running Diaphora

To run Diaphora, simply unpack the compressed distribution file (or perform a git clone, which I think is easier) wherever you prefer and execute “diaphora.py” directly from the IDA Pro menu File → Script file. I like to place the directory inside the IDA "%install%/scripts" folder. Please be advised that Joxean Koret (Diaphora's author) will be releasing an update for the two newest releases of IDA that adds a hot key feature, so that you can assign a custom shortcut to open Diaphora instead of opening it manually. Once the script diaphora.py is executed, a dialog like the following one will be opened:

This dialog, although it can be a bit confusing at first, is used both for exporting the current IDA database to SQLite format and for diffing against another database exported to SQLite format.

The first field is the path of the SQLite database file that will be created with all the information extracted from the current database.

The 2nd field is the other SQLite format database to diff the current database against. If this field is left empty, Diaphora will just export the current database to SQLite format. If the 2nd field is not empty, it will diff both databases.

The other fields, the check-boxes, are explained below:

Use the decompiler if available. If the Hex-Rays decompiler is installed with IDA and the IDA Python bindings are available, Diaphora will use the decompiler to get a lot of interesting information that will help during the bindiffing process.
Export only non-IDA generated functions. Enable if you want neither sub_* functions nor library functions to be exported.
Do not export instructions and basic blocks. Export only function summaries, not all instructions. Showing differences between functions in a graph will not be available.
Use probably unreliable methods. Diaphora uses many heuristics to try to match functions in both databases being compared. However, some heuristics are not really reliable, or the ratio of similarity is very low. Check this box if you also want to see the likely unreliable matches Diaphora may find. Unreliable results are shown in their own list; they are not mixed with the “Best results” (results with a ratio of 1.00) or the “Partial results” (results with a ratio of 0.50 or higher).
Use slow heuristics. Some heuristics can be quite expensive and take a long time. For medium to big databases, this option is disabled by default, and it is recommended to leave it unchecked unless the results from a run with this option disabled are not good enough. It will likely find more and better matches than the normal, faster heuristics, but it will take significantly longer.
Relaxed calculations of difference ratios. Diaphora uses, by default, a fairly aggressive method to calculate difference ratios between matches. It's possible to relax that aggressiveness level by checking this option. Under the hood, the function SequenceMatcher.quick_ratio is used when this option is unchecked and SequenceMatcher.real_quick_ratio when this option is checked. Also, when the option is checked, Diaphora will additionally use the difference ratio of the prime numbers calculated from the AST of the pseudo-code of the two functions, taking the highest ratio from the AST, assembly and pseudo-code comparisons.
Use experimental heuristics. It says it all: experimental heuristics are enabled only if this check-box is marked. Disabled by default as they are likely not useful.
Ignore automatically generated names. Enable this option to ignore sub_* names for the 'Same name' heuristic.
Ignore all function names. Enable this option to ignore all function names for the 'Same name' heuristic.
Ignore small functions. Enable this option to ignore thunk functions, nullstubs, etc.
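
To get a feel for the two calls named in the “Relaxed calculations of difference ratios” option, here is a small sketch using Python's standard difflib module. The instruction listings are made-up sample data (not Diaphora output) and the helper name compare_ratios is my own. It only illustrates how the two methods differ; it is not how Diaphora itself wires them up.

```python
from difflib import SequenceMatcher

# Hypothetical instruction listings for one function in two builds
# (sample data for illustration only, not taken from Diaphora).
OLD_ASM = ["push ebp", "mov ebp, esp", "call _gets", "call _strcmp", "leave", "retn"]
NEW_ASM = ["push ebp", "mov ebp, esp", "call _fgets", "call _strncmp", "leave", "retn"]


def compare_ratios(a, b):
    """Return (ratio, quick_ratio, real_quick_ratio) for two sequences."""
    matcher = SequenceMatcher(None, a, b)
    return matcher.ratio(), matcher.quick_ratio(), matcher.real_quick_ratio()


exact, quick, relaxed = compare_ratios(OLD_ASM, NEW_ASM)
print("ratio=%.3f quick_ratio=%.3f real_quick_ratio=%.3f" % (exact, quick, relaxed))
```

real_quick_ratio only looks at the lengths of the two sequences, so it is always at least as high as quick_ratio, which in turn is an upper bound on ratio. That is why checking the relaxed option tends to report higher similarity.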

Diaphora quick start

Finding differences in new versions (Patch diffing)

In order to use Diaphora we need at least two binary files to compare. I will use two different versions of a small binary with buffer overflow vulnerabilities as an example.

cbd98888a848fa5a4927ef2c2cf3c94c fixed_psswd-win86.exe (primary db)
2fc23ed48120710d67f2ee94e5d18de7 vuln_psswd-win86.exe (secondary db)
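
If you want to confirm you are working with the same two samples, the MD5 values above can be checked with a few lines of Python. This is a minimal sketch using only the standard hashlib module; the helper name md5_of_file is my own, and the script assumes the two binaries sit in the current directory (it silently skips any that are missing).

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# File names as listed above; assumed to be in the current directory.
for name in ("fixed_psswd-win86.exe", "vuln_psswd-win86.exe"):
    if os.path.exists(name):
        print(md5_of_file(name), name)
```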

The file “vuln_psswd-win86.exe” is the pre-patch copy and the binary “fixed_psswd-win86.exe” is the fixed version. I start by launching IDA Pro 32-bit (idaq) and opening the file “fixed_psswd-win86.exe”. Once the initial auto-analysis finishes, launch Diaphora by running the script “diaphora.py” from the IDA Pro menu File → Script file. The following dialog will open:

We only need to care about 2 things:

Field “Export current database to SQLite”. This is the path to the SQLite database that will be created with all the information extracted from the IDA database of this binary.
Field “Use the decompiler if available”. If the Hex-Rays decompiler is available and we want to use it, we will leave this check-box marked, otherwise uncheck it.

After selecting the appropriate values, press OK. Diaphora will start exporting all the data from the IDA database. When the export process finishes, the message “Database exported.” will appear in IDA's Output Window. Now, save and close this database (fixed_psswd-win86.idb), and open the “vuln_psswd-win86.exe” binary. Wait until IDA's auto-analysis finishes, then run Diaphora as with the previous binary file. This time, in the 2nd field, the one named “SQLite database to diff”, we select the path to the .sqlite file we just exported in the previous step, as shown in the next figure:

After this, press the OK button. It will first export the current IDA database (vuln_psswd-win86.idb) to the SQLite format as understood by Diaphora and, then, right after finishing, compare both databases.
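
Since the export is an ordinary SQLite file, you can peek inside it with Python's built-in sqlite3 module. The sketch below only lists whatever tables exist, so it makes no assumptions about Diaphora's schema. The helper name list_tables and the example file name are hypothetical.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]
```

For example, list_tables("fixed_psswd-win86.sqlite") would return the table names of an export saved under that (assumed) name. Note that sqlite3.connect creates an empty file if the path does not exist, so double check the path first.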

IDA will show a wait box dialog while Diaphora runs its extensive checks, applying its heuristics and looking for unmatched functions in order to match functions in both databases, as shown in the next figure:

Also, please note that you can re-open a closed tab by following the actions shown below.

After a while a set of lists (choosers, in Hex-Rays terminology) will appear:

There is one more list that is not shown for this database, named “Unreliable matches”. This list holds all the matches that aren't considered reliable. However, in the case of this binary with symbols, there isn't even a single unreliable result. There are, however, unmatched functions in both the primary (the latest version) and the secondary database (the previous version):

The above image shows the functions not matched in the secondary database, that is: the functions removed in the latest patched version.

The second figure shows the functions in the latest version that have no match in the previous database, that is, the new functions added:

It seems the two vulnerable functions (_strcmp, _gets) were removed and replaced with two safer alternatives, _strncmp and _fgets.

Let's now take a look at the “Best matches” tab:
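
The idea behind the two “unmatched” lists can be sketched with plain set differences on function names. This is a deliberate simplification: Diaphora matches functions with many heuristics, not just names, and the helper unmatched_names and the shortened function lists below are hypothetical.

```python
def unmatched_names(primary, secondary):
    """Return (added, removed) function names between two builds.

    `primary` is the latest (patched) version, `secondary` the previous one.
    """
    primary, secondary = set(primary), set(secondary)
    added = sorted(primary - secondary)    # only in the patched build
    removed = sorted(secondary - primary)  # only in the vulnerable build
    return added, removed


# Shortened, illustrative function lists (not a full export).
FIXED = ["_main", "_strncmp", "_fgets"]
VULN = ["_main", "_strcmp", "_gets"]

added, removed = unmatched_names(FIXED, VULN)
print("added:", added)
print("removed:", removed)
```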

There are many functions in the “Best matches” tab, 24 functions to be exact, and in the primary database there are only a few. The results shown in the “Best matches” tab above are those functions matched by an exact heuristic (like “100% equal”, where all attributes are equal, or “Equal pseudo-code”, where the pseudo-code generated by the decompiler is equal) that, apparently, don't have any differences at all. If you're diffing these binaries to find fixed vulnerabilities, just skip this tab; you will be more interested in the “Partial matches” one ;)

In the Partial matches tab we have only two results:

It shows the functions matched between both databases and, in the description field, it says which heuristic matched. The results also display the similarity ratio. If you're looking for functions where a vulnerability was likely fixed, this is where you want to look. It seems that the function “_main”, for example, was slightly modified: the ratio is 0.810, which means that a small percentage of the function differs between both databases.
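
As a rough mental model of how a ratio maps to a tab, here is a tiny sketch based only on the thresholds quoted earlier (1.00 for best, 0.50 or higher for partial). Treating everything below 0.50 as unreliable is my own simplifying assumption, and classify_match is a hypothetical helper, not Diaphora code.

```python
def classify_match(ratio):
    """Map a similarity ratio in [0, 1] to a result list name."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if ratio == 1.0:
        return "Best matches"
    if ratio >= 0.5:
        return "Partial matches"
    # Assumption: anything lower is treated as unreliable here.
    return "Unreliable matches"


print(classify_match(0.810))  # the ratio reported for _main
```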

Diaphora has three different view modes:

Assembly graph (Diff assembly in a graph)
Plain assembly (Diff assembly)
Pseudo-code (Diff pseudo-code)

Let's see the differences:

Assembly graph (Diff assembly in a graph)

Right click on the result and select “Diff assembly in a graph”, the following graph will appear:

Since this is a very small binary with very few changes, it does not have any yellow nodes (minor changes). Please note that the skin I used in IDA replaces the white background with a black one, so white nodes (no changes) show up as black above.

Quick example:

Let's have a look at a bigger binary that has multiple changes below.

The nodes in yellow are those with only minor changes; the pink ones are those that are either new or heavily modified; and the blank ones are the basic blocks that were not modified at all.

Plain assembly (Diff assembly)

Now let's diff the assembly in plain text: go back to the “Partial matches” tab, right click on the function “_main” and select “Diff assembly”:

It shows the differences, in plain assembly, that one would see by using a tool like the Unix command “diff”.
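
For comparison, a diff in that same Unix style can be produced with Python's standard difflib.unified_diff. The two instruction listings below are invented for illustration (they are not the real disassembly of _main), and diff_listing is a hypothetical helper.

```python
from difflib import unified_diff

# Invented listings for illustration; not the actual _main disassembly.
OLD = ["lea eax, [ebp+buf]", "push eax", "call _gets", "add esp, 4"]
NEW = ["lea eax, [ebp+buf]", "push eax", "call _fgets", "add esp, 4"]


def diff_listing(old, new, old_name, new_name):
    """Return unified-diff lines between two instruction listings."""
    return list(
        unified_diff(old, new, fromfile=old_name, tofile=new_name, lineterm="")
    )


lines = diff_listing(OLD, NEW, "vuln_psswd-win86.exe", "fixed_psswd-win86.exe")
print("\n".join(lines))
```

Removed lines are prefixed with "-" and added lines with "+", which is the same convention the “Diff assembly” view mimics visually.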

Pseudo-code (Diff pseudo-code)

We can also diff the pseudo-code: go back to the “Partial matches” tab, right click on the function and select “Diff pseudo-code”:

As we can see, it shows all the differences in the pseudo-code in a side-by-side diff, like the assembly diff. Once you know how the three ways of viewing differences work, you can choose your favorite or use all three for specific cases.

The next post will show how to write a PoC exploit for the above vuln binary. Check back soon.

References from the helper doc are below.

Thursday, November 19, 2015
