The Hancitor malware family has produced several variants over the past several months. The variations have been similar, but changed enough to bypass protections that are exclusively signature-based.
By understanding the underlying mechanisms of the Hancitor family, you can properly protect against the threat. Despite the signature changes over the variants, the general behavior has continued along a similar path. As a result, by defending with both signatures and behavioral protections, your enterprise will be much safer. This blog outlines how to do that.
Hancitor Malware Brief
Microsoft Office documents with embedded macros have become a common means of delivering malware to unsuspecting victims. Phishing emails exclaim how critical it is that a document be handled immediately or dire consequences will occur. Various techniques are used to encourage the user to enable macros if they are not enabled by default, and Microsoft does the rest.
Helpful APIs give the malware author the ability to run malicious code the second macros are given free rein. Macros are written in Visual Basic for Applications (VBA), a scripting language that exposes helpful functions such as Document_Open(), which executes a macro as soon as the document is loaded. Because malicious VBA scripts are so obvious when read in the clear, many malware authors resort to heavy obfuscation to prevent easy analysis of the script.
Hancitor uses obfuscated strings, API calls aliased to gibberish words, extraneous code, and embedded shellcode to delay its inevitable exposure by security personnel. A few versions of Hancitor have been found in the wild up to this point, and they have been fairly close in behavior. FireEye detailed a sample that uses PowerShell within the macro, and Didier Stevens used dynamic analysis techniques to pull apart a sample that is similar to the variant discussed below.
Hancitor Malware Summary
Hancitor is a family of malicious documents that uses VBA macros to perform various activities centered around downloading additional payloads. The VBA macros contain a large amount of obfuscation and rely on the user to have macros enabled. The Hancitor family has been seen downloading and executing remote payloads using PowerShell, process hollowing into svchost.exe and explorer.exe, and dropping embedded executables.
The payload involves both known trojans and custom code used to download and execute arbitrary payloads. The family is well written and fault-tolerant with custom protocols and verification for communication on both client and server side to ensure that the malware runs successfully. Below are details from the Hancitor sample with the following hash:
“Please God Erasus”
In order to make it harder to follow the code, the malware author added in a large amount of gibberish code. This code calls itself, modifies variables, and generally does nothing. To further obfuscate the code, it is split into two different macros and interspersed with comments containing the lyrics to Erasus by Subkulture. All of this makes the code harder to read from a static perspective. Additionally, Microsoft makes the language difficult to read by providing multiple formats for calling a function. Without stepping through and seeing which statements are important and which are not, it can be very difficult to see what the end result of the code will be.
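One low-effort way to cut through the comment noise described above is to strip full-line comments before reading the macro. The following is a minimal sketch, not a full VBA parser: it only removes lines that consist entirely of a comment (starting with `'` or `Rem`), and the macro fragment in `SAMPLE` is invented for illustration rather than taken from the Hancitor sample.

```python
import re

# Matches a line that is only a comment: optional indentation, then ' or Rem.
FULL_LINE_COMMENT = re.compile(r"^\s*('|Rem(\s|$))", re.IGNORECASE)


def strip_comment_lines(vba_source: str) -> str:
    """Drop full-line comments and blank lines from VBA macro source."""
    kept = []
    for line in vba_source.splitlines():
        if not line.strip() or FULL_LINE_COMMENT.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)


# Hypothetical macro fragment; identifiers and comment text are invented.
SAMPLE = """Sub Document_Open()
' placeholder lyric line one
    Dim counter As Long
    Rem placeholder lyric line two
    counter = counter + 1
End Sub"""

cleaned = strip_comment_lines(SAMPLE)
print(cleaned)
```

This does nothing about the junk statements themselves; those still need to be separated from the real logic by stepping through the code.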
Dim API as Junk
If junk code, extraneous code, and string obfuscation are not enough to ward off analysis, Hancitor also sets aliases for several of the API functions used in the malware. Eight API calls are imported and aliased to random words. Of those eight calls, only three are actually used.
VirtualAllocEx, EnumCalendarInfoW, and RtlMoveMemory are the only APIs used to execute the malicious code. The point of all the extra “junk” is to make it more difficult to tell at first glance what is important and what is not.
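Because VBA `Declare` statements name the real API in their `Alias` clause, the aliases can be mapped back mechanically. The sketch below is one possible approach, assuming the macro source has already been extracted as text. The `colicky` and `countywide` aliases come from this sample; `marmalade`, the argument lists, and the macro body are hypothetical and only there to make the example run.

```python
import re

# Declare [PtrSafe] Function|Sub <alias> Lib "<dll>" Alias "<real API>"
DECLARE_RE = re.compile(
    r'Declare\s+(?:PtrSafe\s+)?(?:Function|Sub)\s+(\w+)\s+'
    r'Lib\s+"([^"]+)"\s+Alias\s+"([^"]+)"',
    re.IGNORECASE,
)


def find_aliases(source: str) -> dict:
    """Map each aliased name to (library, real API name)."""
    return {m.group(1): (m.group(2), m.group(3))
            for m in DECLARE_RE.finditer(source)}


def used_aliases(source: str, aliases: dict) -> set:
    """Return the aliases referenced anywhere outside the Declare lines."""
    body = "\n".join(line for line in source.splitlines()
                     if not DECLARE_RE.search(line))
    return {name for name in aliases
            if re.search(r"\b%s\b" % re.escape(name), body, re.IGNORECASE)}


# Hypothetical, simplified macro text (argument lists shortened and invented).
SAMPLE = '''
Private Declare Function colicky Lib "kernel32" Alias "VirtualAllocEx" (ByVal a As Long) As Long
Private Declare Function countywide Lib "kernel32" Alias "EnumCalendarInfoW" (ByVal a As Long) As Long
Private Declare Function marmalade Lib "kernel32" Alias "GetTickCount" () As Long

Sub Document_Open()
    buf = colicky(0)
    countywide buf
End Sub
'''

aliases = find_aliases(SAMPLE)
used = used_aliases(SAMPLE, aliases)
for name, (lib, api) in sorted(aliases.items()):
    status = "used" if name in used else "unused"
    print("%-12s -> %s!%s (%s)" % (name, lib, api, status))
```

Listing which aliases are never referenced is a quick way to separate the decoy imports from the handful that matter.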
All of the previous obfuscation hides about 30 lines of malicious code within 350 lines of code. The majority of the 30 lines are responsible for retrieving shellcode, which is stored in a form tab within the VBA project. The Document_Open function initiates the malware as soon as the document is opened. The shellcode is retrieved, deobfuscated, and copied into space allocated with VirtualAllocEx. Or thanks to VBA syntax, the API ‘colicky’ is called (/s).

One of the signatures of Hancitor is the use of Microsoft API functions that allow the programmer to specify a callback function to handle the results of the API call. In this case, the author doesn’t care about the API call; they instead want to execute the shellcode through an API that is not likely being monitored. For this variant, the author used EnumCalendarInfoW to execute the shellcode. After this function call, we have moved out of the VBA macro into assembly execution. The macro does a few more pointless calculations and then exits.
A Hollow New World
It is rare to see shellcode launched directly from the document, but the VBA language provides access to the APIs required. In this case, the shellcode is used to perform hollowing without any files, other than the phishing document, touching the disk. Most malicious documents drop a binary that performs the hollowing separately. Because of the use of the callback functionality of EnumCalendarInfoW, it can be difficult to break into the shellcode to debug it. This is where a super handy reversing trick can pay dividends. Luckily, I happen to have one of those tricks up my sleeve. With the breakpoint set on the macro’s API call to EnumCalendarInfoW (countywide)…
We get the address of the first argument, ‘seizing’ (0x2F8116D). This is the start address for the callback function being passed to the calendar API. Looking at that memory address in Process Hacker, you can find the function prologue for the shellcode.
Now is where the fun comes into play. We could probably change that to an INT3 (0xCC) and catch the exception in a debugger, but where is the fun in that? This can also cause crashes and be unreliable. Instead, we can modify these bytes to execute an infinite loop at the entry point. This lets the malware do all the setup for us and provides a nice easy loop to break into with a debugger without fear that malicious code will be executed. By changing these two bytes to a JMP -2 (0xEB FE), we can turn the function prologue into an infinite jump to itself.
By typing EB FE into Process Hacker and clicking the ‘Write’ button in the lower left, we can write to the memory block. Note the two original bytes first, since they need to be restored later. After executing the next instruction in the macro, the shellcode is neutered and spinning in an infinite loop, and we can attach a debugger to the thread to gain control!
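For anyone wondering why EB FE loops forever, the arithmetic can be checked in a few lines. This is only a worked illustration of x86 short-jump encoding: opcode 0xEB takes a signed 8-bit displacement that is applied relative to the address of the next instruction. The address is the callback address quoted above; the helper name `short_jmp_target` is made up for this sketch.

```python
import struct


def short_jmp_target(address: int, instruction: bytes) -> int:
    """Return the destination of a two-byte x86 short jump (opcode 0xEB)."""
    if len(instruction) != 2 or instruction[0] != 0xEB:
        raise ValueError("expected a two-byte short JMP (EB xx)")
    # The second byte is a signed 8-bit displacement (0xFE is -2).
    (rel8,) = struct.unpack("b", instruction[1:2])
    # The displacement is relative to the address of the *next* instruction.
    return address + len(instruction) + rel8


ENTRY = 0x2F8116D                 # callback address from the analysis above
PATCH = bytes.fromhex("EBFE")     # the two bytes written with Process Hacker

target = short_jmp_target(ENTRY, PATCH)
print(hex(ENTRY), "->", hex(target))
```

The jump is two bytes long and the displacement is -2, so the target lands back on the jump itself.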
Break in with your favorite debugger (I’m using x64_dbg here) and set a breakpoint on the jump and hit run on the debugger. This should leave you at the shellcode entry point where we can replace our jump with the original bytes.
Now we have a nice starting point to step through the shellcode. Snapshot your VM (you’re using a VM right… RIGHT?!?) and continue on with the analysis.
The remainder of the shellcode is fairly straightforward. The malware uses the address of the Process Environment Block (PEB) to retrieve the list of loaded modules. The first one is always ntdll.dll, whose exported functions the shellcode walks to find the one it is looking for by name. It finds the undocumented ntdll function LdrLoadDll and uses this, along with its custom function-lookup code, to load the additional DLLs and functions needed to hollow out a process.
The shellcode determines whether it is running on an x86 or x64 operating system and hollows out a different process depending on the environment:
32-bit will hollow out %WINDIR%\explorer.exe
64-bit will hollow out %WINDIR%\SysWOW64\svchost.exe
The payload is downloaded from a remote C&C server that is hardcoded into the shellcode. The shellcode uses URLDownloadToCacheFile to retrieve the payload from the server and then decodes it. This differs from other versions seen up to this point. The request that is sent to the C&C server contains the operating system version. For my analysis system it looked like this:
61 is the Windows OS version (6.1) corresponding to Windows 7. The response from the server contains an encoded payload. The decoding of the payload is the same as Stevens pointed out in his writeup linked above. Each byte of the payload has 3 added to it and is then XOR’d with 0xE before the result is base64 decoded.
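The decoding steps described above can be sketched as follows. This is a literal reading of that sentence rather than code lifted from the shellcode: the order of operations (add 3, XOR with 0xE, then base64 decode) and the wrap-around at 8 bits are assumptions taken from the wording. The `encode_payload` helper is hypothetical and exists only to build a test input for the decoder.

```python
import base64


def decode_payload(blob: bytes) -> bytes:
    """Add 3 to each byte, XOR with 0x0E, then base64 decode the result."""
    # Assumption: the addition wraps at 8 bits, as byte arithmetic would.
    transformed = bytes(((b + 3) & 0xFF) ^ 0x0E for b in blob)
    return base64.b64decode(transformed)


def encode_payload(data: bytes) -> bytes:
    """Inverse of decode_payload, used here only to create a test fixture."""
    b64 = base64.b64encode(data)
    return bytes(((c ^ 0x0E) - 3) & 0xFF for c in b64)


# Round trip on harmless placeholder data.
original = b"MZ placeholder payload bytes"
wire = encode_payload(original)
recovered = decode_payload(wire)
print(recovered == original)
```

If captured traffic does not decode cleanly with this order, try base64 decoding first and applying the byte transform afterwards.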
The target hollowing process is created as a suspended process, the original code is unmapped, and the decoded payload is written in its place. The thread is then resumed and the payload is now executing on the infected system.
The Payload Payoff
The bad guys now have a running payload on the target system. This is arbitrary code and the payload can change at any time since it is downloaded from a C&C server. However, it’s worth taking a look at what is currently being used.
The payload starts like a lot of other malware. The malware loads the addresses for LoadLibrary and GetProcAddress, which are standard APIs required to bootstrap additional functions. At this point the malware does something fairly unique. The MZ header is erased from memory. This won’t have a large impact on anything, but since this payload never touches disk it could make forensics more difficult after the fact.
Next, the malware begins to check out the environment and determine if it has run before on the system or not. The first step is to get the module name of the running process, append ‘.cfg’ to the end and see if that file exists. If the cfg does exist, the C&C addresses inside it are loaded into memory. Otherwise, the embedded URLs are used. The malware then gathers the following information about the infected system:
- OS version number
- MAC address
- Windows Directory
- Volume Information
- IP address – gathered from http://api.ipify[.]org
- Processor Architecture
- Computer Name
This information is gathered into a POST request which is sent to the C&C server. The malware contains an embedded list of C&C servers to check with. It will contact each one in turn until it successfully receives a response.
A POST from my analysis machine looks like this:
GUID=15460471460388136120&BUILD=0911b&INFO=WIN-1PU82PIDOO6 @ WIN-1PU82PIDOO6\Rick&IP=0.0.0.0&TYPE=1&WIN=6.1(x32)
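For network defenders, a check-in body like the one above is easy to pull apart into fields. Here is a minimal sketch that splits the body on `&` and `=`; the field order in `EXPECTED_KEYS` is simply what appears in this one capture, and treating that order as a detection heuristic is an assumption, not a documented property of the family.

```python
# Field names in the order they appear in the captured check-in above.
EXPECTED_KEYS = ["GUID", "BUILD", "INFO", "IP", "TYPE", "WIN"]


def parse_checkin(body: str) -> dict:
    """Split a key=value&key=value check-in body into an ordered dict."""
    fields = {}
    for pair in body.split("&"):
        key, _, value = pair.partition("=")
        fields[key] = value
    return fields


def looks_like_checkin(body: str) -> bool:
    """Heuristic: same keys, in the same order, as the sample capture."""
    return list(parse_checkin(body)) == EXPECTED_KEYS


SAMPLE = (r"GUID=15460471460388136120&BUILD=0911b&"
          r"INFO=WIN-1PU82PIDOO6 @ WIN-1PU82PIDOO6\Rick&"
          r"IP=0.0.0.0&TYPE=1&WIN=6.1(x32)")

fields = parse_checkin(SAMPLE)
for key, value in fields.items():
    print("%-5s = %s" % (key, value))
print("matches sample layout:", looks_like_checkin(SAMPLE))
```

The BUILD value is worth logging separately, since it may help distinguish campaigns if it changes between variants.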
The malware validates the response from the server to ensure it came from a genuine C&C server. A custom signature is used to verify the response. Here is an example of a response:
The first four characters are used as a verification signature for the communication. First, the malware verifies that the first two characters are uppercase alphabetic characters. Second, the first character (0x5A -> Z) is subtracted from the hardcoded value 0x9B. The result of this calculation (0x9B - 0x5A = 0x41) must match the fourth character (0x41 -> A).
The same operation is performed on the second character and compared to the value of the third character. If these values match, the signature is valid and the first four characters are dropped. The remaining characters are base64 decoded and XOR’d with 0x7A. The communication above translates into:
Here is a quick Python script that will verify the signature and decrypt the traffic:

```python
import base64
import sys

if len(sys.argv) != 2:
    print('usage: %s [encoded communication]' % sys.argv[0])
    sys.exit(-1)

comms = sys.argv[1]

# Signature check: 0x9B - char 1 must equal char 4, and 0x9B - char 2 must equal char 3
if not ((0x9B - ord(comms[0])) == ord(comms[3]) and
        (0x9B - ord(comms[1])) == ord(comms[2])):
    print("\nFailed to verify server comms")
    sys.exit(-1)

# Drop the four-character signature, base64 decode, then XOR each byte with 0x7A
decoded = base64.b64decode(comms[4:])
output = ''.join(chr(b ^ 0x7A) for b in decoded)
print(output)
```
The decoded command is also verified: it must start with an opening curly brace and end with a closing one. The third character must be a semicolon and the second character must be from the set of defined commands that can be issued from the C&C server:
|Command|Action|
|---|---|
|b|Download file from URL, hollow svchost.exe, and execute it|
|c|Write URL plus embedded URLs to cfg file (svchost.exe.cfg or explorer.exe.cfg)|
|e|Download file from URL, execute file in new thread|
|l|Download file from URL, execute file in new thread with a hardcoded argument|
|n|No command – This is a pull only malware|
|r|Get temp file name, download file from URL, write to tmp file, and execute it|
Multiple URLs are separated by a ‘|’ and parsed inside the command handling algorithm. The ‘|’ allows multiple URLs to be used to download a payload. If multiple URLs are supplied, the malware stops trying further URLs once the payload has downloaded successfully. The backdoor sleeps for one minute before repeating the C&C communication loop.
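The command format described above lends itself to a small validator, which is handy when triaging decoded C&C traffic. The sketch below follows the description in this post: curly braces at both ends, a command letter in the second position, a separator in the third, and ‘|’ between URLs. The `SEPARATOR` constant mirrors the text and should be adjusted if your captures show a different delimiter; the URLs in `EXAMPLE` are placeholders, not real infrastructure.

```python
# Command letters from the table above.
COMMANDS = {"b", "c", "e", "l", "n", "r"}
# Third character as described in the text; change if captures differ.
SEPARATOR = ";"


def parse_command(decoded: str) -> tuple:
    """Validate a decoded command and return (command letter, list of URLs)."""
    if len(decoded) < 4:
        raise ValueError("command too short")
    if not (decoded.startswith("{") and decoded.endswith("}")):
        raise ValueError("missing curly braces")
    if decoded[2] != SEPARATOR:
        raise ValueError("unexpected separator")
    command = decoded[1]
    if command not in COMMANDS:
        raise ValueError("unknown command %r" % command)
    # Everything between the separator and the closing brace is the URL list.
    urls = [u for u in decoded[3:-1].split("|") if u]
    return command, urls


# Placeholder command string; the hosts are invented for illustration.
EXAMPLE = "{b;http://c2.example.invalid/a|http://backup.example.invalid/a}"

command, urls = parse_command(EXAMPLE)
print(command, urls)
```

Rejecting anything that fails these checks mirrors what the malware itself does, so strings that pass are good candidates for indicator extraction.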
VMRay sandbox gives us a good picture of the overall infection.
This malware family has produced several variants over the past couple of months. The variations have been similar, but changed enough to bypass signature-based protections. It is important to understand the underlying mechanisms of the family in order to properly protect against the threat. Despite the signature changes across the variants, the general behavior has continued along a similar path. By defending with both signatures and behavioral protections, your network will be much safer.
(Editor’s Note: Looking for more on Hancitor? Visit VMRay’s Analysis Report to read more.)