Monday 22 June 2020

Comparative analysis between Bindiff and Diaphora - Patched Smokeloader Study Case

This article presents a comparative study case of diffing binaries using two technologies: Bindiff [1] and Diaphora [2]. We approached this topic in a Malware Analysis perspective by analyzing a (guess which malware family?) Smokeloader (! :D) campaign.

In August 2019, I spotted this campaign using patched samples of Smokeloader 2018 samples. This specific actor patched binaries to add new controllers URLs without needing to pay extra money (Smokeloader's seller charges extra-fee for C2 URL updates). This campaign was described in more detail in this previous article [3].

More details about the original samples [4][5] analyzed in this article can be found in the following tables:

 Filename:                        smokeloader_2018_unpatched.bin 
 33792 Bytes
 File type: PE32 executable (GUI) Intel 80386, for Microsoft Windows
 md5: 76d9c9d7a779005f6caeaa72dbdde445
 sha1: 34efc6312c7bff374563b1e429e2e29b5da119c2
 sha256: b61991e6b19229de40323d7e15e1b710a9e7f5fafe5d0ebdfc08918e373967d3  

 Filename:                        smokeloader_2018_patched.bin
 Size: 1202732 Bytes
 File type: PE32 executable (GUI) Intel 80386, for Microsoft Windows 
 md5: 7ba7a0d8d3e09be16291d5e7f37dcadb
 sha1: 933d532332c9d3c2e41f8871768e0b1c08aaed0c
 sha256: 6632e26a6970d8269a9d36594c07bc87d266d898bc7f99198ed081d9ff183b3f  

The following tables hold details about the unpacked code dumped from "explorer.exe" used in this article [6][7]. 

 Filename:                          explorer.exe.7e8e32c0.0x02ee0000-0x02ef3fff.dmp 
 Size: 81920 Bytes
 File type: data
 md5: 711c02bec678b9dace09bed151d4cedd
 sha1: 84d6b468fed7dd7a40a1eeba8bdc025e05538f3c
 sha256: 865c18d1dd13eaa77fabf2e03610e8eb405e2baa39bf68906d856af946e5ffe1  

 Filename:                       explorer.exe.7e8df030.0x00be0000-0x00bf3fff_patched.dmp
 Size: 81920 Bytes (yes, same size)
 File type: data
 md5: d8f23c399f8de9490e808d71d00763ef
 sha1: e1daad6cb696966c5ced8b7d6a2425ff249bf227
 sha256: 421482d292700639c27025db06a858aafee24d89737410571faf40d8dcb53288  

Summarizing the main changes implemented by this patch are: 
  • wipes out the code for decrypting C2 URLs;
  • replaces it with NOPs and hardcoded C2 URL string; and
  • preserves the original size of decryption function to not disrupt offsets;
Figure 01 and 02 presents the graph of the original code. Figure 01 is the code used for indexing a table of encrypted C2 URLs payloads. Figure 02 lists the code used for decrypting the C2 URLs. This function is called in other parts of the code (not only by the function shown in Figure 01) - this is why they are not merged in one function. We labeled functions (e.g. "__decrypt_C2_url" and "__decrypt_c2_algorithm") in this assembly code to make it easier to read. 

Figure 01 - Original code used for decrypting C2 URLs.

Figure 02 - Original encryption scheme used for decrypting C2 URLs.

Figure 03 and 04 shows the same functions on the patched version of Smokeloader. We can notice that the first function is the same as the unpatched version but the second function was replaced. This second function returns the address for the hardcoded URL in ECX. More details about how it works can be found in this article [3].

Figure 03 - Patched code used for decrypting C2 URLs.

Figure 04 - Patched code returning decrypted C2 URL string.

The next sections describe the output of Diaphora and Bindiff when diffing the samples above. 

.::[ Diffing using Diaphora ]

Diaphora is an Open Source binary diffing tool that uses SQLite as an intermediate representation for storing code and characteristics of reversed binaries [8]. It implements many diffing heuristics (strategies) directly on top of this database. The main advantage of this approach is that Diaphora is technology agnostic - this means that it does not depend on any reversing framework such as Ghidra, IDApro, or Binary Ninja

It can even compare reversing databases built up using one specific tool with projects using another tool. This characteristic facilitates collaboration among researchers. Another big advantage is that by using SQLite for describing its heuristics it makes the processing of adding a new heuristic as "simple" as writing a new SQL so more people can contribute to the growth of the project and more experimental heuristics can be quickly prototyped and verified. 

In the experiment described in this section, we used Diaphora version 2.0.2 (released in October 2019) and IDApro 7.5. Diaphora works as an IDApro Python script and can be easily executed by "File -> Script File" or "alt + F7". Figure 05 shows its main interface. 

Figure 05 - Diaphora main Dialog Interface

This interface is a little bit confusing at a first sight especially for users that did not go through the documentation before trying to use it. It expects the user to input both SQLite databases to be compared, boundaries (for the working database), and set up some checkboxes with options. 

First, we used Diaphora over the reference database (unpatched Smokeloader) for extracting characteristics and generating our reference Diaphora SQLite database. For doing this we just need to open the reference database on IDA, open Diaphora, and fill the "Export IDA database to SQLite" input field. Diaphora will export its SQLite database to the same base directory of the IDA working files. 

After executing Diaphora on our patched Smokeloader using out saved labeled database of Smokeloader 2018 as a reference in Diaphora we get four new tabs in IDA
  • Best Matches - common functions to both databases. The ones with 100% match ratio;
  • Partial Matches - all functions that are not Best matches and not unmatched; 
  • Unmatched in Primary - all functions in the first database that are not present in the second;
  • Unmatched in Secondary - all functions in the second database that are not present in the first;
Figure 06 shows the content of the "Best Matches" tab in our example. 

Figure 06 - Diaphora Best Matches tab

Each row shows function labels in primary and second databases, matching ratio (which goes from 0 to 1), amount of basic blocks in each database and information about which heuristic was used to compare both functions. Diaphora provides features to importing features (such as commends and function labels) from the reference database to the target database. Usually, you will want to copy all annotations from one database to another in this "Best Matches" tab. By checking this tab we can see that few core functions are kept intact in the patched version and we are facing two very similar applications. 

Figure 07 presents the "Partial Matches" functions.

Figure 07 - Partial Matches functions and our "__decrypt_C2_url" function right there with 0.978 matching ratio.

Diaphora provides very fine level diffing and most of these functions are basically unmatching constants. These matches are marked in yellow in the graph diff view. Figure 08 and 09 shows the "__set_file_attributes" function and its match in the primary database. We can clearly see that they actually are the same function. 

Figure 08 - Diffing "__set_file_attributes" function assembly view.

Figure 09 - Diffing "__set_file_attributes" function using graph view.

Figure 10 shows the relevant patch we are interested in the "__decrypt_C2_url" function. As we can see both functions are basically identical until the jump instruction. The function jumps to what I labeled as "__decrypt_C2_algorithm" and is used as a function in other places around the code. 

Figure 10 - "__decrypt_C2_url" graph diff pointed in the "Partial Matches" tab

What disappointed me a little bit was that Diaphora did not include the rest of the code of "__decrypt_C2_algorithm" in this function.  This move bumped up the matching ratio and this was misleading when prioritizing what to manually analyze. The "__decrypt_C2_algorithm" function shows up in the "Unmatched in Secondary" tab. This is a good thing as functions in this tab should be the priority in this kind of analysis. In our example, we got 18 functions (out of 52) to analyze marked in the "Unmatched in Secondary" tab. Figure 11 shows this tab and Figure 12 shows the graph view of this function.

Figure 11 - "Unmatched in Secondary" tab and patched "__decrypted_C2_algorithm" function

Figure 12 - Patched version of "__decrypt_C2_algorithm" function

The nicest thing about using Diaphora is that all this analysis could be easily shared by sharing compressed SQLite databases around. So that is it for the Diaphora side. 

.:: [ Diffing using BinDiff ]

BinDiff is an executable-comparison tool created by Zynamics [9] (called SABRE in earlier days) in 2007. Zynamics was acquired by Google in 2011 and BinDiff became freeware in 2016 [10][11]. BinDiff is a plugin of IDAPro and works directly over IDBs (IDA pro Database). Because of this design, BinDiff requires users to have an IDA license (#notcool). BinDiff 6.0 supports also Ghidra using this extra plugin called BinExport [12] which implements something similar to Diaphora's design.

OALabs released a very didactic video tutorial on BinDiff [13]. This video also teaches how to install BinDiff. We used BinDiff 6.0 and IDAPro 7.5 in this experiment. BinDiff adds a new option to the "File" menu in IDA and to compare two executable is as easy as opening a new IDB on top of the current one. Figure 13 shows the BinDiff option in IDA

Figure 13 - BinDiff option in IDAPro

After loading an annotated IDA database using BinDiff, it will add 4 new tabs:
  • Primary Unmatched - functions in the primary database that did not match any function in the secondary database;
  • Secondary Unmatched - functions in the secondary database that did not match any function in the primary database;
  • Statistics - general similarity information about both executables;
  • Matched Functions - all functions with matches and their respective similarity index. 
The two first tabs hold the same information as their correlated tabs in Diaphora. Statistics tab provides high-level information about the matching process. Information in this tab can be used for quick knowing if the binary is a variation of the reference database. Figure 13 shows the data presented in the Statistics tab after loading our reference Smokeloader database against the patched one using BinDiff

Figure 14 - Statistics tab and confidence and similarity indexes

BinDiff calculated a similarity index of 95% (with 99% confidence). This means that we are likely dealing with two versions of the same software. The other metrics are more about counters and general information about both databases. 

Figure 14 shows the Matched Functions tab and the similarity index for each match. It is also possible to see the heuristic used for each match. 

Figure 15 - Matched Functions tab

BinDiff managed to match 51 out of 52 functions. This is a very good result. Besides that Bindiff managed to find out the two most affected functions by the patch: (i) "__decrypt_C2_algorithm" and (ii) "__decrypt_C2_url". It is possible to zoom in and visualize changes in graphs of the function matched. 

Figure 16 shows changes in function "__decrypt_C2_url" (73% of similarity and 97% of confidence).

Figure 16 - Changes in function "__decrypt_C2_url"

As we can see, BinDiff hits the bullseye and detects exactly the changes without splitting the function into two parts. The way BinDiff organizes its graphs makes changes really easy to visualize. 

Bindiff also matches the "__decrypt_C2_algorithm" function with some other function located at "0x00BE19A5" using the "loop count" heuristic but the similarity index is only 19% and confidence is 27%. This means that Bindiff matched the "__decrypted_C2_algorithm" function, which was wiped out with the patch, with some other random function. I don't even consider this a false-positive as results clearly state that this match has low confidence. It is useful to get a spotlight pointed to this function - this for sure is a good place to start an analysis in this specific scenario. 

.:: [ Conclusions ]

For the specific malware analysis problem discussed in this article, we definitely got better results using BinDiff than Diaphora. I also feel that all fine program analysis used in heuristics applied by BinDiff makes a big difference in the final result. 

In terms of general design (not taking in consideration heuristics), I think Diaphora has more advantages than Bindiff, because of its intermediate representation using an open specification format as SQLite and SQL for modeling heuristics. Diaphora is also Open Source so there is a task force sustained by a community in order to improve heuristics and this is the way to go IMHO

Diaphora also takes boundaries as parameters and this can be very useful in case of analyzing big databases especially when analyzing patches for vulnerability development purposes. In line with that, this article focuses on malware analysis but these same technologies could also be used for analyzing vulnerabilities and its patches - ammunition to another blog post. 

.:: [ References ] 

[8] (BSides Joxean)
[13] (BinDiff OALabs)

Wednesday 10 June 2020

Unpacking Smokeloader and Reconstructing PE Programatically using LIEF

This article holds notes on my experience unpacking a Smokeloader 2020 sample. The unpacked payload is further used for composing a valid PE file. The outcome is a PE32 executable containing clean code ready for reversing. 

First things first, here is the sample used in this research:

 Size308.17 KB (315568 bytes) 
 TypePE32 executable for MS Windows (GUI) Intel 80386 32-bit 
 First seen                     2020-03-06 21:45:11 
 md5 c067e0a2d7fc6092bb77abc7f7156b60
 sha256   25959cfe4619126ab554d3111b875218f1dbfadd79eed1ed0f6a8c1900fa36e0 

You can find it in VirusTotal [1]. 

This sample does regular already documented Smokeloader checks before unpacking the main payload, such as: 
  • checks if the process is running in the context of a debugger using "kernel32.isDebuggerPresent" function [2];
  • makes a copy of ntdll.dll, loads it and uses it instead. This technique helps to evade some sandboxes and has been described already in this article here [3];
  • looks for specific patterns in registry keys to check if the sample is running under a virtualised environment.
It also performs a small profiling of the hosting machine in order to decide which payload to inject. Smokeloader has specific code for both main architectures x86 and x64. In this article, we gonna unpack the x86 payload of the above mentioned sample. 

Smokeloader has been using various techniques to inject its final payload into the user file management process "explorer.exe". The sample analysed uses RtlCreateUserThread approach in order to copy the final payload to the targeted process. This injection method is better described in this Endgame/Elastic article [4].

So our game plan is: 
  1. pause execution before the unpacked payload is executed by "explorer.exe";
  2. transplant this code to a dummy PE shell;
  3. fix PE header values and section boundaries;
  4. patching Smokeloader code preamble;
  5. test unpacked Smokeloader PE;
  6. how to do all this programatically using LIEF [5].

Smokeloader 1st stage decompresses its payload using ntdll.RtlDecompressBuffer [6] after few anti-analysis checks described above. It does not call this function from the initially loaded ntdll.dll but from a copy of it loaded afterwards. So breakpoints should be set after the binary loads the copy of ntdll.dll. Figure 01 presents a screenshot of this specific code IDA.

Figure 01: Smokeloader first stage decompression code

This code allocates a buffer with 0x2D000 bytes using ZwAllocateVirtualMemory which stores the main decompressed payload [7]. This code is still transformed before being injected into "explorer.exe". The following steps are performed during injection:
  • fetches explorer.exe PID by calling GetShellWindow and GetWindowThreadProcessId;
  • sections and maps are created in the current and remote processes using ZwCreateSection and ZwMapViewOfSection;
  • main payload is copied to local section and reflected in the remote section;
  • data section is created in the remote process for holding parameters and dynamically created Import Table;
  • A new thread is created in the remote process by invoking RtlCreateUserThread.
So, at this point, you could ask me: what is the relevance of describing all these call names to the final goal of this article? the answer is: so you can reproduce exactly what I'm describing in here. :D

Next step is setting up a break point in RtlCreateUserThread (from the copy of ntdll.dll) and dump the final payload. It is also necessary to take note of few important addresses: (i) entry point of the thread created in the remote processes and (ii) base addresses for injected code in virtual process.

Figure 02 shows a screenshot of IDApro showing the call to RtlCreateUserThread (where we should pause the execution).

Figure 02: Call to RtlCreateUserThread after injecting code into remote process

By stoping the execution on this call we can collect all data we need to move on to the next step:

 Base address code                    0x02060000
 Base address data  0x00B60000 
 Data payload  a01751fb6eb3f19d9b010818bbecc23c  [8]
 Code payload  2547231b4ae82ea9e395fb0c8a308982 [9] 
 Code entry point  0x02061734 

Code payload is the final unpacked Smokeloader code adjusted to run on Virtual Address with base equal to 0x02060000. The created thread receives the base address of the data segment (0x00B60000) as parameter ("StartParameter" parameter of "RtlCreateUserThread" call). 

Smokeloader loads all resources necessary to its execution dynamically. This article here [10] describes how Smokeloader builds up its import table and how to prepare patch an IDB to overcome this technique before starting reversing. So this main payload does not need any specific setup of imports

In this section we will use 010 Hex Editor [11] to transplant a PE header from a random executable. 010 Hex Editor has a PE format template [12]. Although any other valid PE32 binary could be used in this experiment, we used a PE header extracted from an executable listed in this Sotirov's blog post [13][14]. 

Smokeloader code payload has 0x1000 null bytes at offset zero, so we copied the first 0x1000 bytes containing the PE header from tinype.exe to this region. 

Coincidently, .text section will be already pointing to the beginning of the our payload at offset 0x1000. Probably the malware author just wiped out the PE header before creating the payload and left the null bytes there. Next step is to paste all 20480 bytes (0x5000) of our data payload in the offset 0x4400.

Figure 03 shows the new layout of our binary containing the PE header in the beginning followed by 0x3400 bytes of code payload (at offset 0x1000) and finally 0x5000 bytes of data from our data payload (at offset 0x4400).

Figure 03: Initial layout of new handcrafted PE binary

It is time to adjust our implanted PE header manually using 010 Hex Editor. At this point, all fields in this header are still set up according "tinype.exe". From now on, we gonna use this schematic as reference to PE header internal structures [15].

The first adjustment is to change the number of sections to 2 for holding code and data. This field is located in the "COFFHeader.NumberOfSections". Now our binary will list only 2 sections named ".text" and ".rdata" we can rename this second one to ".data" by changing "SectionHeaders[1].Name". 

Next step is make sure that both sections have correct permissions. "SectionHeaders[0].Characteristics" (".text") should have CODE, EXECUTE and READ flags active and "SectionHeaders[1].Characteristics" (".data") should have the INITIALIZED_DATA, READ and WRITE flags active. Still on SectionHeaders, we can setup the bounds and virtual addresses. "SectionHeaders[0].SizeOfRawData" should be set to 0x3400 (13312 Bytes), "SectionHeaders[0].PointerToRawData" should be set to "0x1000" and finally  "SectionHeaders[0].VirtualAddress" should be set to 0x1000. For "SectionHeaders[1]" (".data") we gonna set "SizeOfRawData" to 0x5000, "PointerToRawData" to 0x4400 and "VirtualAddress" to 0x5000. These changes means that these sections will be mapped in memory in base_address (defined in the OptionalHeader) shifted by each section Virtual Addresses offset. There is an Union inside these section headers called "PhysicalAddress" and "VirtualSize", these fields should hold the same value as "SizeofRawData".

Figure 04 shows a diagram of a Section header. Each section in the binary has an instance of this header associated to it. 

Figure 04: PE Section Header

Now we need to adjust few fields in the Optional Header. In this header we will need to change the following fields: 

 ImageBase Virtual Address where binary will be mapped 0x02060000
 SizeOfCode  size of .text section  0x3400 bytes
 SizeOfInitializedData  size of .text and .data sections together  0x8400 bytes 
 AddressOfEntryPoint offset of the entry point code 0x1734
 BaseOfCode .text section Virtual Address 0x1000
 BaseOfData .data section Virtual Address 0x5000
 SectionAlignment Virtual Addresses have to be multiple of this value  0x1000
 FileAlignment file offsets have to be multiple of this value 0x200
 SizeOfImage total size of binary headers + sections 0x9400
 Checksum  PE file checksum - use PE Explorer [16] or Hiew [17] to calculate this value   --------

"ImageBase" has to match the base of the code section we dumped from "explorer.exe" (0x02060000). As we will not export or import anything all "Data Directories" inside the Optional Header can be zeroed as well.

Summarising the whole process: 
  1. Transplanting PE header from a dummy PE;
  2. Fix sections sizes, boundaries, permissions and Virtual Addresses in SectionHeaders;
  3. Setup section contents;
  4. Setup Optional Header fields;
  5. Setup PE checksum;
Here is the version of our binary after following up all steps described above [18]. This binary is a valid executable and we can load it in any debugger or disassembly but we still need to change one last thing before call it a valid unpacked Smokeloader sample. 

Figure 05 shows our reconstructed PE loaded in IDApro paused on the correct Entry Point. 

Figure 05 - Reconstructed PE paused on Entry Point

We can notice that the entry point function receives an argument (0x02061737) and loads it into ECX and then calls another function located in 0x02061743 which is just below the current function. This argument is the address of the data segment. This data segment will be used for various tasks during Smokeloader execution including holding the dynamically created import table. 

If we execute this file without a valid value in ECX it will break when the main payload tries to write into the data segment (invalid address in ECX). Figure 06 shows what happens when we try to execute our binary the way it is right now. 

Figure 06 - Access Violation exception when executing unpatched reconstructed binary

The plan now is to patch this binary to load the correct address of the data segment into ECX before calling "sub_2061743". Since both functions are consecutive and the function on top does not do much - we gonna replace all 15 bytes of this function (0x02061743 - 0x02061734). Figure 07 shows the new patched code. 

Figure 07 - Code after patching

In this new code the entry point remains the same. We can see that we loaded ECX with the address of the data segment by using the push and pop instructions and then we filled the rest of the remaining bytes with NOP (0x90). We can see the beginning of the second function at the same address as before (0x02061743). Of course there are many ways to achieve this same result but this was the simplest approach we could think of. 

The final step is to update the PE checksum field inside the Optional Header again and we will have a fully unpacked Smokeloader sample. Here are the last version of our reconstructed binary [19]:

 File name                         new_pe_patched.bin
 File type  PE32 executable for MS Windows (console) Intel 80386 32-bit
 Size  37888 bytes
 md5  f401109ae24aaf47dce75266ffc049f8
 sha1  49e7ed68b9569e0e987da71b3c678974d8ed7c81
 sha256  cd42f017913034d527d90a84feebcde015e714baa03714c83f80608555e52386 

For testing our branding new reconstructed PE we ran it into Cuckoo sandbox [20] to analyse its behaviour [21]. As we can see in figure 08 and 09, the binary was executed properly and we got it checking in and contacting its controllers. 

Figure 08 - Reconstructed sample connecting back to Controllers

Figure 09 - List of API calls intercepted by Cuckoo

As we can see we got the sample connecting back to three controller URLs and many pages of intercepted API calls in the behavioural analysis. This is an indication that our unpacked and reconstructed Smokeloader sample is functional. 

So far we described how to unpack Smokeloader main payload and how to manually reconstruct a valid PE file out of this. Now we will automate what we did manually by transplanting a PE header from a dummy binary ("tinype.exe"). The Python library used in this experiment is LIEF [5]. We extended this example called "Create a PE from scratch" they have in their official documentation [22].

The following code does exactly the same as the manual approach but using LIEF.

The final binary generated by LIEF is:

 File name                         unpacked_smokeloader.exe
 File type  PE32 executable for MS Windows (console) Intel 80386 32-bit
 Size  34816 bytes
 md5  a0aebc61bc89208be0585eca4d1ed00c
 sha1  ea2f3c914dec6bb36832abc313b3fce826cdecb0
 sha256  0247de510507792fcbf425fab9dbbc2f067c25dc7e4e80a958d1ebfb0505f6e6 

We uploaded it for testing to Virustotal [23] and CAPE sandbox [24] and is a valid unpacked Smokeloader PE32 executable.