Studying malicious PDFs
The Portable Document Format (PDF) was developed by Adobe in the 1990s to present documents uniformly, regardless of the application software or operating system used. Originally proprietary, it was released as an open standard in 2008. Unfortunately, its popularity has made it a common vehicle for attackers to deliver malicious payloads. Let’s see how such documents work and how they can be analyzed.
File structure
A PDF file is organized as a tree of objects, each of which is one of eight data types:
Null object
: Represents a lack of data.
Boolean values
: Classic true/false values.
Numbers
: Both integer and real values.
Names
: These values can be recognized by a forward slash at the beginning.
Strings
: Surrounded by parentheses, or by single angle brackets when written in hexadecimal form.
Arrays
: Enclosed within square brackets.
Dictionaries
: Enclosed within double angle brackets (<< and >>).
Streams
: These are the main data storage blocks, and they support binary data. Streams can...
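To make the delimiters listed above concrete, here is a minimal sketch in Python that guesses the data type of a single serialized object from its leading characters. The function name `classify_pdf_object` is hypothetical and introduced only for illustration. It is not a PDF parser: it assumes the input is one already-isolated object given as text, and it ignores comments, indirect references, and nesting.

```python
import re

# Integers and reals as they appear in PDF syntax, e.g. 123, -98, 34.5, 4., -.002
_NUMBER = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)")


def classify_pdf_object(token: str) -> str:
    """Guess the PDF data type of one serialized object from its delimiters.

    Simplified illustration only: the input is assumed to be a single,
    already-isolated object.
    """
    t = token.strip()
    if t == "null":
        return "null"
    if t in ("true", "false"):
        return "boolean"
    if _NUMBER.fullmatch(t):
        return "number"
    if t.startswith("/"):
        return "name"
    if t.startswith("("):
        return "string"  # literal string
    if t.startswith("<<"):
        # A dictionary followed by the 'stream' keyword introduces a stream.
        if re.search(r">>\s*stream\s", t):
            return "stream"
        return "dictionary"
    if t.startswith("<"):
        return "string"  # hexadecimal string
    if t.startswith("["):
        return "array"
    return "unknown"


samples = [
    "null",
    "true",
    "-.002",
    "/JavaScript",
    "(Hello)",
    "<48656C6C6F>",
    "[1 2 /Three]",
    "<< /Type /Catalog >>",
    "<< /Length 5 >>\nstream\nhello\nendstream",
]
for sample in samples:
    print(f"{sample!r:45} {classify_pdf_object(sample)}")
```

The check for `<<` has to come before the check for a single `<`, because a dictionary and a hexadecimal string begin with the same character. Real documents need a proper tokenizer, since objects can be nested and strings may contain any of these delimiters.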