Posted on behalf of Martin Lee, Senior Software Engineer, Symantec.cloud
Occasionally, MessageLabs Intelligence is lucky enough to find an email thread contained within a malicious email that allows us to examine the conversation leading up to the attack. This particular email exchange between the attacker and the intended target allows us to understand the social engineering that leads to an attack.
The initial contact comes from an individual claiming to be a journalist from a well-known American newspaper and is sent to a media contact at a large professional services company.
Despite the name given in the email, the language used does not appear to be that of a native English speaker. The name used in the header ‘From’ address is non-English and does not correspond with the name used in the email signature. These are surprising lapses given the otherwise professional nature of the attack. However, small inconsistencies in an email are often indicators that something is amiss.
In reply, the target requests further information. The attacker responds with the actual attack, instructing the intended victim to open the attached document.
This PDF attachment contains the malicious payload.
Analyzing the attached PDF binary shows that it consists of ten separate objects.
The separate objects within the PDF file are mostly small text objects concerned with constructing the document and are of little interest. However, objects 9 and 10 are large and contain binary data, which likely makes up the majority of the malicious payload, although it is not yet clear how this data is accessed or executed.
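The object survey described above can be approximated with a few lines of Python. This is a minimal sketch, not the tool we used: the function names are hypothetical, and it simply scans the raw bytes for "N G obj" headers, so it assumes the objects are not hidden inside compressed object streams and may report false matches if the same byte pattern happens to occur inside binary data.

```python
import re

# Matches the "N G obj" header that opens each indirect object in a PDF body.
OBJ_HEADER = re.compile(rb"(\d+)\s+(\d+)\s+obj\b")


def list_pdf_objects(data: bytes):
    """Return (object_number, generation, body_length) for each object found."""
    results = []
    for match in OBJ_HEADER.finditer(data):
        end = data.find(b"endobj", match.end())
        if end == -1:
            continue  # truncated object: skip it rather than guess its size
        results.append((int(match.group(1)), int(match.group(2)), end - match.end()))
    return results


def largest_objects(data: bytes, top: int = 2):
    """Return the `top` objects with the largest bodies, biggest first."""
    return sorted(list_pdf_objects(data), key=lambda obj: obj[2], reverse=True)[:top]
```

Sorting by body length makes unusually large objects, such as the two binary objects in this sample, stand out immediately from the small structural ones.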
Object 1 contains XML Data Package (XDP) information, an XML format that can be used to describe PDF formatting information. Examining the XML content shows that part of the data refers to an image only 28.575mm (h) x 1.39mm (w). This image, referred to as “ImageField1”, is in TIFF format and is embedded within the XML as base64-encoded text, which can be easily decoded.
The decoded TIFF image is 8661 bytes in size, but something interesting happens if we try to parse the image to identify properties such as its dimensions or color depth.
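Decoding an image embedded in this way takes very little code. The sketch below assumes the XML has already been extracted from the PDF object and is well formed, and that the base64 text sits directly inside an element named after the field; the function names and the default field name argument are illustrative only.

```python
import base64
import xml.etree.ElementTree as ET


def extract_embedded_image(xml_text: str, field_name: str = "ImageField1") -> bytes:
    """Find the element called `field_name` and base64-decode its text content."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        # ElementTree prefixes namespaced tags with "{uri}"; keep the local name only.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == field_name and element.text:
            # Drop the whitespace used to wrap the base64 text across lines.
            payload = "".join(element.text.split())
            return base64.b64decode(payload)
    raise ValueError(f"no element named {field_name!r} with content was found")


def looks_like_tiff(data: bytes) -> bool:
    """TIFF files start with 'II*\\0' (little-endian) or 'MM\\0*' (big-endian)."""
    return data[:4] in (b"II*\x00", b"MM\x00*")
```

Checking the first four bytes of the decoded data is a quick way to confirm that the field really does hold a TIFF image before handing it to a parser.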
The TIFF parser crashes. The DotRange value is corrupt; it is much larger than the parser expects and causes the parser to halt unexpectedly. In fact, this is a known vulnerability, referred to as CVE-2006-3459, in certain versions of software that display TIFF image files, and it can allow attackers to execute malicious code. Some versions of a common cross-platform PDF reading program are vulnerable to this exploit, as are unpatched versions of a well-known, fashionable smartphone.
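A defensive check for this kind of corruption does not need a full image library. The sketch below walks only the first image file directory (IFD) of a TIFF file and flags a DotRange entry (tag 336) whose value count is implausibly large. The threshold of 16 is our own assumption for illustration, as are the function names; the TIFF specification expects only a couple of values per ink channel for this tag.

```python
import struct

DOTRANGE_TAG = 336            # TIFF tag number for DotRange
MAX_SANE_DOTRANGE_COUNT = 16  # assumed upper bound, chosen for illustration


def read_ifd_entries(data: bytes):
    """Return (tag, field_type, count, value_or_offset) tuples from the first IFD."""
    if data[:2] == b"II":
        endian = "<"   # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"   # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack_from(endian + "HI", data, 2)
    if magic != 42:
        raise ValueError("bad TIFF magic number")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    entries = []
    for index in range(entry_count):
        # Each IFD entry is 12 bytes: tag, type, count, value/offset.
        entry_offset = ifd_offset + 2 + 12 * index
        entries.append(struct.unpack_from(endian + "HHII", data, entry_offset))
    return entries


def suspicious_dotrange(data: bytes) -> bool:
    """True if any DotRange entry declares more values than is plausible."""
    return any(
        tag == DOTRANGE_TAG and count > MAX_SANE_DOTRANGE_COUNT
        for tag, _field_type, count, _value in read_ifd_entries(data)
    )
```

A truncated file makes `struct.unpack_from` raise `struct.error`, which a scanner would want to catch and treat as a further sign that the image is malformed.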
Viewing the TIFF image within a hex editor shows that a large area within the file is full of the hex value ‘41’; following this is some binary code.
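Long runs of a single filler byte are easy to find programmatically. The following sketch reports every run of the byte 0x41 that is at least `min_length` bytes long; the function name and the default minimum of 64 bytes are assumptions made for illustration, and a real scanner would tune the threshold to avoid flagging legitimate data.

```python
import re


def find_filler_runs(data: bytes, filler: int = 0x41, min_length: int = 64):
    """Return (offset, length) for each run of `filler` at least `min_length` long."""
    pattern = re.compile(re.escape(bytes([filler])) + b"{%d,}" % min_length)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(data)]
```

The offset at which a run ends is also where an analyst would start looking for code, since in this sample the binary code follows the filler directly.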
Viewing this binary code in a disassembler shows that this is a decryption loop (outlined in red). The hexadecimal value 41 is subtracted from each encrypted value, then adjacent values are combined to create the final decrypted value. The decryption routine then loops back to the beginning and repeats (area outlined in blue). This routine decrypts the large amounts of binary data contained within the other objects in the PDF file and causes it to be executed when complete.
We don’t need to know the contents of the binary objects to know that this is a malicious file. The presence of code exploiting a known vulnerability and a decryption loop hidden within an image file is evidence enough to show that this is malware.
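The loop described above can be modelled in Python. Note the assumptions: the disassembly only tells us that 0x41 is subtracted from each value and that adjacent values are combined, so this sketch guesses that each encoded byte carries one 4-bit half of an output byte, with the high half first. The function names are hypothetical, and the real routine may combine the values differently.

```python
def decode_pairs(encoded: bytes) -> bytes:
    """Subtract 0x41 from each byte, then merge each adjacent pair into one byte.

    Assumption: the first value of a pair is the high nibble, the second the low.
    """
    if len(encoded) % 2:
        raise ValueError("encoded data must contain an even number of bytes")
    decoded = bytearray()
    for index in range(0, len(encoded), 2):
        high = encoded[index] - 0x41
        low = encoded[index + 1] - 0x41
        if not (0 <= high <= 15 and 0 <= low <= 15):
            raise ValueError(f"byte pair at offset {index} is outside the expected range")
        decoded.append((high << 4) | low)
    return bytes(decoded)


def encode_pairs(raw: bytes) -> bytes:
    """Inverse of decode_pairs, handy for building test data."""
    encoded = bytearray()
    for value in raw:
        encoded.append(0x41 + (value >> 4))
        encoded.append(0x41 + (value & 0x0F))
    return bytes(encoded)
```

Under this assumed scheme every encoded byte falls in the range 0x41 to 0x50 (the letters A to P), so the encoded payload is twice the size of the original and looks like plain uppercase text.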
An attempt to open the PDF file with the latest patched version of the reader results in a blank document being displayed with an error warning. The malware does not execute.
If we use earlier versions of the software to open the PDF, something interesting happens.
Although the PDF Reader opens and closes as if nothing has happened, some changes are apparent by examining the registry. A brand new Windows service is created and initialized as soon as the PDF is opened. This service ensures that the newly created executable file, JYZVYH.exe, is run from its temporary directory.
Once the temporary executable file has run, the service is deleted, so the changes to the registry are only momentarily visible. The malware then installs itself, masquerading as an ‘updater’, and remains resident, attempting to connect to two URLs:
Presumably, further instructions are expected from command and control services operating from the URLs in question, which will direct the subsequent actions of the malware.
Identifying the characteristics of the malware manually is a laborious process. However, it is possible to automate the process to some degree and use computer code to analyze files for clues that they may be suspicious. Scripts can check thousands of values for correctness and search for potentially malicious code in the most unlikely of places, such as the out-of-range TIFF DotRange value and the machine code hidden within an image in this example.
The MessageLabs Intelligence heuristic virus detection system, Skeptic, comprises many thousands of such automatic analyses. The result is that new malware and even new exploits can be detected by identifying the characteristics of malicious code rather than searching for the presence of known viruses.
Anoirel Issa and Joseph Rabaiotti contributed to this blog post.