Wednesday 25 November 2015

CryptoWall 3.0

This post presents an analysis of a sample of the crypto-virus known as CryptoWall 3.0.

Table 1 shows general information about the analyzed binary.

Table 1: Artifact General Information
File name
220 KB
Binary type

The analyzed sample is a variant of the CryptoWall malware delivered as a payload by the Angler Exploit Kit. More elaborate analyses of this malware are available around the Internet [1][2][3].

CryptoWall is categorized as ransomware by most anti-virus technologies. Its main behavior is:
  • runs some "anti-debugging" techniques (e.g. verifies if the caller process is Perl or Python);
  • deactivates native OS protections;
  • deactivates backup and file snapshotting policies;
  • starts communication with the C&C server;
  • generates a temporary crypto key for each request;
  • uses this key to encrypt (RC4) the communication between the infected machine and the C&C server. This key is passed as a parameter inside each HTTP request;
  • sends a unique identification of the infected machine to the server (called "CUUID");
  • the server sends back a public key and a unique image used to link the infected machine to its server-side session;
  • the malware uses this public key to encrypt specific files on the victim's disk (according to an extensions table);
  • the malware redirects the victim to a web page where the user can pay (in Bitcoins) to obtain the private key and recover the encrypted files;
  • the malware deletes itself (files remain encrypted).
Once the requested amount is paid, the user receives a private key which can be used for recovering the encrypted files.

The first step toward understanding the general behavior of the artifact is to load it into a sandbox environment. We used the Cuckoo Sandbox for this task (in this case we used

As expected from the available documentation about this malware, we could observe all the main documented characteristics of CryptoWall.

The first piece of evidence is the large number of changes to the filesystem. According to Cuckoo, 5,069 files were modified. Another key characteristic of CryptoWall is the presence of several copies of files containing instructions on how to decrypt the files and how to make the payment. Figure 1 shows an instance of these files on the sandbox disk.

Figure 1: CryptoWall instruction files for decrypting the compromised content
Figure 2 shows the content of the plain text version of this file. We can see that the file describes, in a user friendly way, exactly what happened and how the user should proceed to make the payment.

Figure 2: CryptoWall instruction files content
There are copies of these files everywhere inside the infected machine. These files can be used as evidence to detect whether the machine was compromised by CryptoWall.

The second piece of evidence is the set of processes started by the malware. After preliminary checks and self-protection actions, CryptoWall creates an instance of "svchost.exe" and injects its main payload inside this process. All the main activities of this malware are performed by this process. We can observe this characteristic in the behavioral analysis results. Figure 3 shows the processes spawned by CW3.
Figure 3: Processes started by CryptoWall
Table 2 shows the total amount of actions (network, filesystem, registry, process, services and synchronization) performed by each of these processes.
Table 2: Total amount of actions per process

From these totals, and by analyzing the content of each action, it is possible to affirm that the main activity of the malware is performed by the svchost.exe process. This information is useful for defining a more effective approach for the static analysis of the artifact.

The next step was to analyze network traffic. The sample connects to 2 servers: 
  • - used to determine the location of the infected machine in order to provide customized payment instructions to the user;
  • - used as remote server where the infected machine identifies itself and requests the public key used for encrypting files. 
It is possible to download a PCAP file containing all network traffic captured during the sandboxed analysis, which we analyzed using Wireshark. The malware uses HTTP as its protocol. Figure 4 presents all inbound HTTP traffic registered by our sandbox environment. We could observe the initial HTTP request made to "" and the other 3 requests to the CW3 remote server "".
Figure 4: Network traffic captured by the sandbox.
The traffic transmitted to the remote server is encrypted using RC4, which is a symmetric encryption algorithm. This means that somehow the client has to send the key to the server. After reading some documentation online I could figure out the heuristic used by the malware to obfuscate the communication. This heuristic follows these steps:
  • the malware generates a random string containing 10 to 15 characters;
  • this string goes as a URL parameter in the HTTP request;
  • the sorted version of this string is used as the RC4 key to encrypt and decrypt the traffic. For instance, if the string is "jihgfedcba" the encryption key will be "abcdefghij";
  • the content is sent as a form parameter identified by the last character of the key;
  • the same key is used to decrypt both the inbound and the outbound traffic.
This heuristic means that all traffic between the remote server and the infected machine can be decrypted and analyzed - including the cryptographic key sent by the remote server. 
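
The steps above can be sketched in a few lines of gem-free Ruby. This is only an illustration: the function names (`rc4`, `cw3_key`, `cw3_param`, `cw3_decrypt`) are ours, and we assume that "the last character of the key" refers to the sorted key and that the payload travels as a hex string, as in the dumps shown later in this post.

```ruby
# Plain RC4 (key scheduling + keystream XOR), so no external gem is needed.
def rc4(key, data)
  s = (0..255).to_a
  k = key.bytes
  j = 0
  256.times do |n|
    j = (j + s[n] + k[n % k.size]) & 0xff
    s[n], s[j] = s[j], s[n]
  end
  i = j = 0
  out = data.bytes.map do |byte|
    i = (i + 1) & 0xff
    j = (j + s[i]) & 0xff
    s[i], s[j] = s[j], s[i]
    byte ^ s[(s[i] + s[j]) & 0xff]
  end
  out.pack('C*')
end

# The RC4 key is the random URL string with its characters sorted.
def cw3_key(url_string)
  url_string.chars.sort.join
end

# Assumption: the form parameter is named after the last character of the sorted key.
def cw3_param(url_string)
  cw3_key(url_string)[-1]
end

# Decrypt a hex-encoded payload using the key derived from the URL string.
def cw3_decrypt(url_string, hex_payload)
  rc4(cw3_key(url_string), [hex_payload].pack('H*'))
end
```

Since RC4 is symmetric, calling `rc4` with the same key on the decrypted text gives back the original bytes, which is why a single routine covers both directions of the traffic.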

Figure 5 shows the structure of this HTTP POST to the remote server. The random string (which generates the encryption key) and the encrypted content are marked with a red rectangle.

Figure 5: CW3 HTTP POST structure
We created a script in Ruby to decrypt CW3 traffic. Code 1 shows its source code.

Code 1: Ruby script used to decrypt CW3 communication protocol
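require 'rc4'

# Expect a single argument in the form <key>|<content>
if ARGV.size != 1
  puts "Usage: #{$0} <key>|<content>\n\n"
  exit 1
end

# Random string taken from the URL and hex-encoded payload
(key, cypher_text) = ARGV[0].split('|')
# The RC4 key is the sorted version of the random string
key = key.split('').sort.join
rc4 = RC4.new(key)
# Convert the hex string into raw bytes before decrypting
s = cypher_text.scan(/../).collect{|x| [x.hex].pack('C')}.join
puts rc4.decrypt(s)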

By using this script we were able to recover all the communication between the infected machine and the remote server. Code 2 shows the execution output for the proposed decoder.

Code 2: Execution of the Ruby decoder over the captured CW3 traffic
 rc4 mabj$ ls  
cw3_decoder.rb cw3_input.dump
rc4 mabj$ head cw3_input.dump
jg29o6mk639|b42365bcdcfddf6606f935fa0d520993e86d9ebbf0735b7b ...
02bk27i91c|1be7d5ee6b7ea8a5e44cca49ea955105f4fae250add785e8a ...
p4q4z7yqgd|dcf96963bdd535bb0aeace22c4471f9a9cca91b64dd281e17 ...
rc4 mabj$ for l in `cat cw3_input.dump`; do
> ruby cw3_decoder.rb $l
> done

The first message sends the campaign ID ("crypt13"), a unique identification ID (also known as CUUID - "B834AFC69086975FED56B5B9BB7221A0") and the location of the infected machine (IP address). Code 3 shows the acknowledgement answer for the first HTTP POST message.

Code 3: Remote server answer for the first request

The second request asks the remote server for the RSA public key used for encrypting the files. Code 4 shows the answer containing the public key and a TOR link to be used for the payment messages.

Code 4: Remote server answer for the second request
 {148|ayh2m57ruxjtwyd5.onion|1weYY4|US|-----BEGIN PUBLIC KEY-----  
-----END PUBLIC KEY-----}
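
A decrypted answer like the one in Code 4 can be split into its fields with a few lines of Ruby. This is a rough sketch: `parse_cw3_answer` is a hypothetical helper, and only the onion address and the public key are identified in this analysis, so the other field names are placeholders or guesses.

```ruby
# Split a decrypted CW3 answer of the form {a|b|c|d|PEM} into named fields.
def parse_cw3_answer(text)
  # Remove the surrounding braces
  body = text.strip.sub(/\A\{/, '').sub(/\}\z/, '')
  # Limit the split to 5 parts so the PEM block is kept whole
  fields = body.split('|', 5)
  {
    field_1: fields[0],        # meaning not identified in this analysis
    onion_host: fields[1],     # TOR link used for the payment messages
    field_3: fields[2],        # meaning not identified in this analysis
    country: fields[3],        # assumption: looks like a country code
    public_key_pem: fields[4]  # RSA public key in PEM format
  }
end
```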

After the third request the remote server sends a "splash" PNG image containing information about the payment. Figure 6 shows the customized PNG image returned by the server.

Figure 6: Customized image generated by CW3 server
It would be a nice exercise to implement this decoder as a plugin for a layer 7 packet inspection firewall or IDS, generating more accurate signatures in order to block the malware traffic that carries the public key (although this would also cause a relevant network overhead).

At this point, we were able to extract all the main information about this malware. The only unanswered question is the list of all possible control servers within this specific sample. A valid approach to obtain this list dynamically is to run the malware in a sandbox environment with all traffic blocked by a firewall. We tried this approach by using a combination of a Microsoft Windows 7 VM, Process Explorer and Procmon. It was possible to identify 50 remote servers. Table 3 shows a list of all remote servers found by sniffing the virtual machine.

Table 3: Remote Servers Used By the Provided Sample

The main binary was loaded into IDA Pro without any errors. It is possible to observe that most of the binary is composed of an encrypted payload which will be decrypted, uncompressed and injected inside svchost.exe. Figure 7 shows the segments layout for the analyzed binary.

Figure 7: Binary segments layout.
Figure 8 shows a sample of the obfuscated code stored at the ".text" segment (which is a read-only segment).
Figure 8: Obfuscated code that will be injected into "svchost.exe"
The main binary implements a few obfuscation and anti-debug strategies. Figure 9 shows an example of an obfuscation strategy implemented by the sample. The code continuously loads random values into the same register, just to make it more difficult to understand what is really loaded into each register.
Figure 9: The code contains a large amount of dead code.
We could recover the cryptographic key and analyze the malicious code obfuscated at the data segment. However a hybrid approach (described below) was more effective. 

As we already know that most of the malicious tasks are performed by the svchost.exe process, we decided to use 2 approaches to analyze this executable:
  • suspend the process and try to attach IDA to it;
  • dump the process by using Process Explorer and analyze the content.
We could find an array with 60 elements containing all remote servers used by the malware. This means that we found 10 more remote servers besides the ones we could spot by using the dynamic approach. Table 4 shows the new servers identified through static analysis of the malware. The full list with all remote servers can be found [HERE].

Table 4: Extra Remote Servers Identified by Static Analysis

Figure 10 shows the array structure containing the remote servers for the analyzed sample.

Figure 10: Memory structure used for storing references to all remote servers.
We could spot another very interesting behavior - the malware carries a list of 476 financial, governmental and NGO (!?) institutions to be used as a reference for finding sensitive data inside the user's cookies. This information can be used for intelligence purposes. Figure 11 shows the data structure in memory used to hold this information. The full list of targets can be found [HERE].
Figure 11: Targeted financial/governmental/NGOs institutions.
The injected binary is very plain: all static strings are stored in plain text, which makes it easy to inspect with just a quick look at the strings extracted from the process dump. It was possible to spot all the information gathered during the dynamic analysis.

Table 5 summarizes the findings for this analysis.

Table 5: Summary of findings for this analysis.
File Name
Trojan horse
Campaign ID
Control Servers
Control servers.txt
Crypto keys and scheme analysis
Described before

This section lists the main tools used to perform this analysis. This information is necessary to guarantee reproducibility of the experiment. 
  • IDA 5.5 (freeware version)
  • Radare2 Framework
  • SysInternals Tools
  • VirtualBox
    • Windows 7 professional
  • Cuckoo Sandbox
    • VirusTotal
  • Wireshark 1.12.7


Monday 16 November 2015

Fireeye - FlareOn 2015 (Challenge #4), Relocation Table and ASLR

This post continues the series about the FlareOn 2015 reversing challenges [1]. If you are interested, you can find my previous post with solutions for the first three challenges here [2].

I had some "extra work" due to infrastructure issues during this challenge and ended up learning new things. So I decided to dedicate a whole post to challenge #4, explaining not only the solution but also these issues.

As in the previous challenges, the objective in this task is to recover the flag (an e-mail address) from a PE-32 binary file. The main insight for this challenge is that the binary was packed using UPX [3]. There are many ways to gather evidence that a binary was packed using UPX; the main one is to examine its sections. We could obtain this information using rabin2 with the following command. Figure 01 shows the section labels for the analyzed binary.

Figure 01: Sections information for analyzed binary.
As we can notice there are 2 sections (marked with the red rectangles) called "UPX0" and "UPX1". These sections are used by UPX to store compressed data of the original binary.
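
The same check can be sketched in Ruby by reading the section table straight from the PE headers. The helper names (`pe_section_names`, `upx_packed?`) are ours and the code is only a minimal illustration of where the section names live in the file; it is not what rabin2 does internally.

```ruby
# Return the section names of a PE image given its raw bytes
# (e.g. the result of File.binread on the executable).
def pe_section_names(bytes)
  raise ArgumentError, 'not an MZ executable' unless bytes[0, 2] == 'MZ'
  pe = bytes[0x3c, 4].unpack1('V')                # e_lfanew: offset of the PE signature
  raise ArgumentError, 'PE signature not found' unless bytes[pe, 4] == "PE\0\0"
  section_count = bytes[pe + 6, 2].unpack1('v')   # NumberOfSections
  opt_size = bytes[pe + 20, 2].unpack1('v')       # SizeOfOptionalHeader
  table = pe + 24 + opt_size                      # section table follows the optional header
  (0...section_count).map do |i|
    # Each section header has 40 bytes; the first 8 hold the NUL-padded name
    bytes[table + i * 40, 8].delete("\0")
  end
end

# A binary with sections named UPX0/UPX1 was very likely packed with UPX.
def upx_packed?(bytes)
  pe_section_names(bytes).any? { |name| name.start_with?('UPX') }
end
```

Section names are easy to rename, so this is evidence of packing rather than proof.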

The standard UPX package offers a utility to unpack the packed binary. Basically this tool reads the UPX headers (out of the scope of this post) and extracts the original binary to an output file. The main advantage of using UPX is to reduce the size of an executable by compressing it. This compressed version of the binary is uncompressed in memory and executed at run-time. This means that the packed executable carries the decompression algorithm and a mechanism to change the execution flow to the uncompressed version of the binary during execution time.

At this point we already know that we have a UPX-packed binary. The packed binary executes perfectly in our test environment (Microsoft Windows 7 - 64 bits). The issue happens when we try to unpack it and execute the resulting unpacked binary in this Virtual Machine. Surprisingly, the resulting unpacked binary does not work anymore. Figure 02 shows the command line used to unpack the challenge and the checksum for the unpacked version of the binary.
Figure 02: Command line for unpacking binary and MD5
Figure 03 shows the error presented when we tried to execute the unpacked version of the binary.
Figure 03: Execution error of unpacked version of the binary
This was very curious. We analyzed the unpacking code and the uncompressing routine was fine. Then we decided to execute the unpacked binary on different platforms, such as:

  1. Microsoft Windows 7 - 32 bits
  2. Microsoft Windows XP - 32 bits

The same error occurs when we tried to execute on Windows 7 - 32 bits. The unpacked binary was executed without any error on Windows XP - 32 bits.  Figure 04 shows the execution of the unpacked binary under a Windows XP - 32 bits environment.
Figure 04: Executing the unpacked version of the binary in a Windows XP - 32 bits
This situation raises the issue investigated in this post (a little bit off-topic but interesting nonetheless):

"why the unpacked version of the binary executes on Windows XP but not on Windows 7 even though the packed version executes without any error on both platforms?"

After loading the unpacked version of the binary into IDA we could notice that the executable was compiled using some version of Visual C with the "GS" flag active [4][5]. This flag tells the compiler to inject code that checks the software against buffer overflows. The code implements a set of cookie-based routines to check that the boundaries of variables are not violated. Figure 05 shows the code inserted by the compiler to generate this security cookie.

Why are we talking about this? Because the debugger reports a memory access violation when the program tries to read the security cookie initialized and stored inside the data segment by the function "__init_security_cookie()". Figure 06 shows the exception triggered by the debugger and the defective code.

Figure 06: Exception triggered when executing the unpacked binary.
An interesting behavior is that the code is modified when executed from:


This runtime change causes the error because the absolute address of the security cookie used by the instruction (the operand that IDA shows with the "ds:" prefix) no longer points to valid data. When ASLR [8] is active the loader may place the image at a different base address, and every absolute address inside the code has to be adjusted according to the Relocation Table [6][7]. The problem is that the binary does not have a Relocation Table (probably stripped by UPX) and was compiled with the ASLR flag. This information can be confirmed by looking at the PE structure using CFF Explorer [9] and Process Explorer [10]. Figure 07 shows the information about the ASLR flag in Process Explorer. As we can see, the process has the ASLR flag active.

Figure 07: Process characteristics interface showing that ASLR flag is active 
We can double check and edit the characteristics of the binary by using CFF Explorer. Figure 08 shows the same ASLR flag in CFF Explorer, but this time the tool was used to unset it.

Figure 08: unpacked binary characteristics on CFF Explorer 
Once the flag is disabled the binary does not use ASLR and does not require the Relocation Table anymore. We can observe in Figure 09 that the unpacked binary now executes normally on Windows 7.
Figure 09: Unpacked binary executing fine after the ASLR flag was disabled
The unpacked binary was executing normally on Windows XP because this version of Windows Operating System does not implement ASLR (ASLR was implemented for the first time on Windows Vista).
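
For reference, the flag that CFF Explorer toggles is the DYNAMIC_BASE bit (0x0040) of the DllCharacteristics field in the PE optional header. A minimal Ruby sketch of the same edit is shown below; the helper names are ours, it works on the raw bytes of the file and it does not recompute the PE checksum.

```ruby
DYNAMIC_BASE = 0x0040 # IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE (the "ASLR flag")

# File offset of the DllCharacteristics field.
def dll_characteristics_offset(bytes)
  pe = bytes[0x3c, 4].unpack1('V') # e_lfanew: offset of the PE signature
  raise ArgumentError, 'PE signature not found' unless bytes[pe, 4] == "PE\0\0"
  # 4-byte signature + 20-byte COFF header, then offset 70 inside the
  # optional header (the same offset for PE32 and PE32+)
  pe + 24 + 70
end

# True when the image is marked as relocatable at load time.
def aslr_enabled?(bytes)
  off = dll_characteristics_offset(bytes)
  (bytes[off, 2].unpack1('v') & DYNAMIC_BASE) != 0
end

# Return a copy of the image with the ASLR flag cleared.
def clear_aslr(bytes)
  patched = bytes.dup
  off = dll_characteristics_offset(patched)
  flags = patched[off, 2].unpack1('v')
  patched[off, 2] = [flags & ~DYNAMIC_BASE].pack('v')
  patched
end
```

A typical use would be to read the unpacked binary with `File.binread`, pass it through `clear_aslr` and write the result to a new file with `File.binwrite`.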

No more obscure errors, so now we can finally focus on the challenge. The first thing to notice in this challenge was the presence of UPX. Another interesting characteristic is the difference in output between the unpacked and the packed versions. The only explanation for this is that the unpacking algorithm modifies the unpacked binary during execution time. Analyzing the end of the unpacking algorithm, just before it redirects the execution flow to the Original Entry Point, we found that some routines were injected to modify the unpacked binary. There are 2 modification routines:

  1. replaces a "5" by a "4" in a specific position of the memory;
  2. switches the case of an array of characters ("A" goes to "a", "B" to "b" and so on).

Figure 10 shows the changes made by the modified version of the UPX unpacking algorithm over the packed binary. It is also possible to see the jump to the Original Entry Point at address "0x408621"; this instruction changes the execution flow to the beginning of the UPX0 section.

Figure 10: changes made by the modified version of the UPX unpacking algorithm over the packed binary
In order to use the unpacked version of the binary for analyzing the challenge it is necessary to manually patch the unpacked binary and add the modification code injected inside the unpacking algorithm.

The first patch can be achieved by changing the byte at offset "0x3c4c" from "0x35" (string "5") to "0x34" (string "4"). The second patch can be achieved by changing the content from offset "0x3bb8" until offset "0x3beb": it is necessary to switch the case of all alphabetic characters ("A" goes to "a", "B" to "b" etc). Figure 11 shows the changes for the second patch.
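
Both patches can also be applied with a short script instead of a hex editor. The sketch below is only an illustration: `apply_unpack_patches` and `swap_ascii_case` are hypothetical names, and we assume that the range "0x3bb8 until 0x3beb" is inclusive at both ends.

```ruby
# Swap the case of a single ASCII letter; any other byte is returned unchanged.
def swap_ascii_case(byte)
  case byte
  when 0x41..0x5a then byte + 0x20 # 'A'..'Z' -> 'a'..'z'
  when 0x61..0x7a then byte - 0x20 # 'a'..'z' -> 'A'..'Z'
  else byte
  end
end

# Apply the two modifications that the customized UPX stub performs at run-time.
def apply_unpack_patches(bytes)
  patched = bytes.dup
  # Patch 1: the string "5" becomes "4"
  raise 'unexpected byte at 0x3c4c' unless patched.getbyte(0x3c4c) == 0x35
  patched.setbyte(0x3c4c, 0x34)
  # Patch 2: swap the case of every letter in the region (assumed inclusive)
  (0x3bb8..0x3beb).each do |offset|
    patched.setbyte(offset, swap_ascii_case(patched.getbyte(offset)))
  end
  patched
end
```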

Figure 11: second patch, which swaps the case of the characters in a specific region of memory.
By now we have restored the original binary and we are ready to start to analyze what is necessary to solve the challenge.

The first thing to notice is the verification code that checks if the binary was unpacked properly. This code checks if the static string located at "0x0040525c" (labeled as "Str" by IDA) is not equal to "5". Remember that the customized unpacking algorithm replaces this string with "4" during execution time. The routine calls the function "atoi()" to transform this string into an integer and makes the comparison. Figure 12 shows the unpacking checking routine.

Figure 12: unpacking checking routine
The second thing to notice is that the binary expects a parameter. Figure 13 shows the routine which checks the number of parameters passed to the binary.

Figure 13: Routine which checks the number of parameters passed to the binary
The third relevant piece of code is the call to the function "localtime64()", which returns a structure containing the current system time. Figure 14 shows the code which calls this routine. Another interesting thing to notice in this routine is the code that calculates an MD5 hash of the input parameter (first red rectangle). The function "sub_4012E0" calculates this MD5 hash.
Figure 14: localtime call routine.
At this point we know that the binary receives an integer parameter of at most 8 bits (because the mov instruction receives "al"). We could just ignore the rest of the code and brute force the binary with values between 0 and 255. This would give us the flag and solve the challenge. But our goal is to understand the behavior of the code, so we should move on.

The next characteristic is that the binary collects the local hour of the system and uses this integer as an index into an array containing 24 strings encoded with Base64. Each element of this array holds an encoded MD5 hash. This hash is compared with the MD5 hash calculated from the input argument; if they do not match the binary exits. Figure 15 shows the code responsible for collecting the hour in the host system.

Figure 15: collecting the hour in the host system
Figure 16 shows the function responsible for using the current hour for indexing the array with the hashes in Base64 format.

Figure 16: accessing the "hashed" hours vector  

The function "sub_401000" is responsible for decoding the content and comparing the decoded hash with the one calculated from the input argument. This is the information necessary to solve the challenge: we know that the binary expects the current hour of the host system as its parameter. Figure 17 shows the resulting output when passing the current hour as the first parameter to the binary.

Figure 17: Binary executed with the current hour of the system as a parameter.
The challenge is solved but we still want to understand the rest of the analyzed binary. The structure of the _main function flow can give us some hints. Figure 18 shows the flow graph of the main function calculated by IDA. We can observe that the main function has basically 2 sequences of code with 24 functional blocks each.

Figure 18: Flow graph of the _main function generated by IDA.
The first set of blocks is responsible for building the vector containing the encoded hashes for numbers between 1 and 24. This is used to verify if the user is passing the expected time as a parameter to the binary. The second set of blocks builds another vector containing an encrypted version of the flag for each hour. The hash representation of the hour is used as a XOR key for decrypting the flag. Figure 19 shows the final decryption routine which is used to recover the flag.
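
The two vectors and the final XOR can be modeled with a small Ruby sketch. It is only a model of the logic described above, not a reimplementation of the binary: all names are ours, and we assume that the hash is taken over the decimal text of the hour, that the raw 16-byte digest is what gets Base64-encoded and that the XOR key repeats cyclically over the encrypted flag.

```ruby
require 'digest'

# Assumption: MD5 over the decimal text of the hour (raw 16-byte digest).
def hour_hash(hour)
  Digest::MD5.digest(hour.to_s)
end

# Base64 text of the hash, as stored in the first vector.
def encoded_hour_hash(hour)
  [hour_hash(hour)].pack('m0')
end

# XOR the data with a key that repeats cyclically (assumption).
def xor_with_key(data, key)
  k = key.bytes
  data.bytes.each_with_index.map { |b, i| b ^ k[i % k.size] }.pack('C*')
end

# Validate the argument against the entry for the current hour and, when it
# matches, decrypt the corresponding flag entry. Returns nil on a mismatch.
def check_and_decrypt(argument, hash_table, encrypted_flags, current_hour)
  return nil unless encoded_hour_hash(argument) == hash_table[current_hour]
  xor_with_key(encrypted_flags[current_hour], hour_hash(argument))
end
```

Here `hash_table` and `encrypted_flags` stand for the two vectors built by the 24 blocks of the main function, modeled as hashes keyed by the hour.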

Figure 19: XOR decryption routine
That's all folks!

PS: during my journey to understand how relocation tables work, I found a very interesting article about code injection [11].

[7] Windows Internals (Chapter #4)

Sunday 11 October 2015

Fireeye - FlareOn 2015 (Challenges 1-3)

I took the day off today to solve the FireEye FlareOn reversing challenges [1] and decided to publish my notes here. These challenges were very entertaining and I strongly recommend them to anyone who is interested in reversing for fun.

In total, there are 11 challenges with different levels of difficulty, covering the most diverse kinds of technologies (from .NET to mobile reversing). This post goes through solutions for the first challenges.

.::[ Challenge 01
This challenge is a win32 executable that basically validates a password entered by the user. If the password is correct the binary outputs a flag, otherwise an error.

Figure 01: Challenge 01 XOR encryption scheme
By analyzing the code we realize that the binary compares the user input with an encrypted string located at address "0x00402140". This string has 24 characters (according to the comparison at address "0x0040105e") and is the targeted flag in this challenge.

Figure 02: Encrypted key located at position 0x00402140 (24 characters)
The software uses a XOR-based encryption scheme with the key "0x7d". This key could be found hardcoded at address "0x00401053". To recover the original password it was necessary to compute:

password = xor(encrypted_password, key) = ""

Figure 03: Challenge 01 flag
  • Binary name: i_am_happy_you_are_to_playing_the_flareon_challenge.exe
  • MD5: 7c0f16de595ae03e2928d3fa6b73b235
  • Cryptographic key: 0x7D
  • Password size: 24 characters
  • Encrypted password: 1F08131304220E114D0D183D1B111C0F18501213531E1210
  • Decrypted password (flag): ""
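
The XOR above can be reproduced with a few lines of Ruby, using the key and the encrypted bytes listed in this summary. The constant and function names are ours.

```ruby
CH1_KEY = 0x7d
CH1_ENCRYPTED_HEX = '1F08131304220E114D0D183D1B111C0F18501213531E1210'

# XOR every byte of the data with a single-byte key.
def xor_bytes(data, key)
  data.bytes.map { |b| b ^ key }.pack('C*')
end

encrypted = [CH1_ENCRYPTED_HEX].pack('H*') # hex text -> raw bytes
password = xor_bytes(encrypted, CH1_KEY)
puts password
```

Because XOR is its own inverse, applying `xor_bytes` to the password with the same key gives back the encrypted bytes.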

.::[ Challenge 02
This challenge is also a win32 binary, but without the ".exe" extension (the binary has to be renamed before being executed). The binary is designed to validate a password entered through standard input. If the password is correct the flag is output, otherwise an error.

The first detail to observe within the binary is the size of the password which is 37 characters (0x25). This information can be extracted by observing address "0x0040108e". Figure 04 shows the code responsible for validating the size of the password. 

Figure 04: Code responsible for validating the size of the password
Immediately after validating the password size the execution flows to the main decryption block. Figure 05 shows the decryption and main block used in this challenge. 

Figure 05: main decryption code 
In this block we can recognize a XOR-based cryptographic scheme. The constant "0x1c7" is located at position "0x004010a9" and its low byte (0xC7) is the XOR key. Besides the XOR operation, the block performs a set of small transformations. The data index register (edi) in this block points to the encrypted data located at address "0x004010f4".

Figure 06: Encrypted data 
The full heuristic of this block can be described by the following pseudo-code:
  1. initialize 2 "zeroed" variables: one is the summation (stored in bx) and the other one is the rotation, stored in dl (low 8 bits of edx);
  2. iterate over each character of the input password;
  3. apply a mask of 0x3 (11b) to the summation variable to obtain the rotation. This means that the outcome of this operation can be any value between 0 and 3;
  4. XOR the character with the key 0xC7;
  5. store the resulting value of the previous operation in the al register (low 8 bits of the EAX register);
  6. rotate ah left by dl bits;
  7. evaluate the expression al = al + ah + CF (Carry Flag);
  8. compare the value of al with the encrypted character pointed to by edi;
  9. if it is not equal: exit("invalid password!");
  10. otherwise:
    1. bx = bx + al (update the summation variable);
    2. go back to step 2.

In order to decrypt the encrypted content and recover the plain text password, we need to reverse the above mentioned heuristic. The reversed operation for the encryption heuristic is: 

plain_text_char = XOR ((encrypted_char - shift_var),  key)
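
The heuristic and its inverse can be sketched in Ruby as follows. This is not the script from Figure 07, just an illustration with names of our own. We assume that ah starts as 0x01 (the high byte of the constant 0x1c7) and that the Carry Flag is set when the addition runs; both are exposed as parameters because they are assumptions.

```ruby
CH2_KEY = 0xc7

# Rotate an 8-bit value left by count bits.
def rol8(value, count)
  count %= 8
  ((value << count) | (value >> (8 - count))) & 0xff
end

# shift_var from the formula above: the rotated ah plus the carry flag.
def shift_var(sum, ah: 1, carry: 1)
  rol8(ah, sum & 0x3) + carry
end

# Forward transformation, returning the list of encrypted bytes.
def encrypt(password, ah: 1, carry: 1)
  sum = 0
  password.bytes.map do |ch|
    al = ((ch ^ CH2_KEY) + shift_var(sum, ah: ah, carry: carry)) & 0xff
    sum = (sum + al) & 0xffff # bx accumulates the transformed bytes
    al
  end
end

# Inverse transformation: subtract shift_var, then XOR with the key.
def decrypt(encrypted, ah: 1, carry: 1)
  sum = 0
  plain = encrypted.map do |enc|
    ch = ((enc - shift_var(sum, ah: ah, carry: carry)) & 0xff) ^ CH2_KEY
    sum = (sum + enc) & 0xffff
    ch
  end
  plain.pack('C*')
end
```

The running sum only depends on the encrypted bytes, so the decryption can rebuild it without knowing the password in advance.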

We created a small Ruby script reversing the heuristic described above. Figure 07 shows the Ruby script created to decrypt the data.

Figure 07: decryption Ruby script 
 This script will decrypt and output the password. Figure 08 shows the output of the decryption script.
Figure 08: decryption script output

Finally we get our flag and a success message when executing the challenge binary.

Figure 09: success message when executing the binary with the right password


  • Binary name: very_success
  • MD5: d88dafdaefe27e7083ef16d241187d31
  • Cryptographic key: 0xC7
  • Password size: 37 characters
  • Decrypted password (flag): ""

.::[ Challenge 03
Challenge 03 consists of a "frozen" Python script. This characteristic can be verified in many ways, such as monitoring the filesystem, looking at the icon of the application or analyzing its strings. Figures 10 and 11 show the output of the "strings" command and of "procmon" [3] (an application from Sysinternals) with some evidence that the binary is, indeed, a frozen Python script.
Figure 10: "strings" outputting some strings containing references to Python 
Figure 11: Procmon monitoring the elfie binary and showing references to Python
The binary shows an image in the background and prompts for the flag. For this challenge the first step was to extract and inspect the content of the executable. We used the PyInstaller Extractor [4] to extract all the content inside the executable. The extractor creates a folder called "elfie.exe_extracted" with many files inside. We are interested in the file called elfie (without any extension). This file holds a big Python script full of variables with obfuscated names, each containing a piece of base64 content. Figure 12 shows a piece of this obfuscated script. At the end of this file it is possible to spot a very interesting line which concatenates all the variables, decodes the result and executes the outcome.
Figure 12: decompiled elfie python script 
If we replace the "exec()" by "print()" this script prints the source code. This source code is still obfuscated, but at least the execution flow and some strings are readable. In fact most of the code is base64 content related to the images, and the real source code has fewer than 50 lines. Figure 13 shows the flag inside the deobfuscated source code.

Figure 13: Decompiled Python source code

We can spot the flag inside this file (passed as parameter to the "reversed()" function) which is: "". Now if we type this value in the challenge elfie application we get success.  

Figure 14: Flag

  • Binary name: elfie.exe
  • MD5: 8f0400fe6d897ddbcef2aaf9f9dbd0a4
  • Decrypted password (flag): ""

.::[ References: