A PDF file is organized as a tree of objects, each of which is one of eight data types:
- Null object.
- Boolean values.
- Numbers.
- Names: These values can be recognized by a forward slash at the beginning.
- Strings: Enclosed in parentheses (literal strings) or in single angle brackets (hexadecimal strings).
- Arrays: Enclosed within square brackets.
- Dictionaries: Enclosed within double angle brackets (<< and >>).
- Streams: These are the main data storage blocks, and they support binary data. Streams can be compressed in order to reduce the size of the associated data.
Apart from these, comments can be added with the percent sign (%); a comment runs to the end of the line.
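To make the delimiters above concrete, here is a minimal Python sketch that guesses the type of a single, already isolated token from its leading characters. The function name `classify_pdf_token` is hypothetical, and the sketch assumes the token has been cut out of the file beforehand; it is not a full PDF tokenizer. Streams are left out because they are not recognized by a bracket of their own.

```python
import re

# Integers and reals such as 42, -3.14, +7, .5 (PDF numbers have no exponent form).
NUMBER_RE = re.compile(rb"[+-]?(\d+\.?\d*|\.\d+)")

def classify_pdf_token(token: bytes) -> str:
    """Guess the PDF data type of one isolated token (hypothetical helper)."""
    t = token.strip()
    if t == b"null":
        return "null"
    if t in (b"true", b"false"):
        return "boolean"
    if t.startswith(b"%"):
        return "comment"
    if t.startswith(b"/"):
        return "name"
    # Check the double angle bracket first, otherwise a dictionary
    # would be mistaken for a hexadecimal string.
    if t.startswith(b"<<"):
        return "dictionary"
    if t.startswith(b"<"):
        return "hexadecimal string"
    if t.startswith(b"("):
        return "literal string"
    if t.startswith(b"["):
        return "array"
    if NUMBER_RE.fullmatch(t):
        return "number"
    return "unknown"

for sample in (b"null", b"true", b"-3.14", b"/Type", b"(Hello)",
               b"<48656C6C6F>", b"[1 2 3]", b"<< /Type /Catalog >>", b"% note"):
    print(sample, "->", classify_pdf_token(sample))
```

The order of the checks matters only for the two angle-bracket cases; the remaining checks are independent of each other.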
All complex data objects (such as images or JavaScript entries) are stored using these basic data types. In many cases, an object consists of a dictionary that describes the data (for example, its type, length, and compression filter) and a stream that holds the actual data.
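The dictionary-plus-stream pairing can be illustrated with a short Python sketch. It builds a small sample object in memory and then reads the stream back using the dictionary entries. The helper name `extract_stream` is hypothetical, and the sketch makes several simplifying assumptions: /Length is a direct integer rather than a reference to another object, at most one filter (/FlateDecode, which is zlib compression) is applied, and the dictionary contains no nested dictionaries.

```python
import re
import zlib

# Build a sample object: a dictionary describing the data, then the stream itself.
payload = b"Hello, PDF stream"
compressed = zlib.compress(payload)
sample_obj = (
    b"1 0 obj\n"
    b"<< /Length " + str(len(compressed)).encode("ascii") + b" /Filter /FlateDecode >>\n"
    b"stream\n" + compressed + b"\nendstream\nendobj\n"
)

def extract_stream(obj: bytes) -> bytes:
    """Return the decoded stream data of one object (hypothetical helper)."""
    # Find the dictionary that immediately precedes the 'stream' keyword.
    match = re.search(rb"<<(.*?)>>\s*stream\r?\n", obj, re.DOTALL)
    if match is None:
        raise ValueError("no stream found in object")
    dictionary = match.group(1)

    # Assumption: /Length is a direct integer, not an indirect reference.
    length_match = re.search(rb"/Length\s+(\d+)", dictionary)
    if length_match is None:
        raise ValueError("stream dictionary has no direct /Length")
    length = int(length_match.group(1))

    # Slice by length rather than searching for 'endstream',
    # because binary data may contain any byte sequence.
    raw = obj[match.end():match.end() + length]

    # Only the FlateDecode filter is handled in this sketch.
    if b"/FlateDecode" in dictionary:
        raw = zlib.decompress(raw)
    return raw

print(extract_stream(sample_obj))
```

Real documents may chain several filters or store /Length in a separate object, so a dedicated PDF parsing tool is the safer choice for actual analysis.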
All PDF documents start with the %PDF signature, followed by a dash and the format version number (for example, %PDF-1.7).
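As a small illustration, the signature and version can be read with a regular expression. The function name `parse_pdf_version` is hypothetical, and searching the first 1024 bytes instead of only offset 0 is an assumption made here to tolerate leading junk bytes.

```python
import re

# Matches the signature and version, e.g. b"%PDF-1.7" (single-digit major.minor).
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")

def parse_pdf_version(data: bytes):
    """Return (major, minor) from the PDF header, or None if no header is found."""
    # Assumption: look only at the first 1024 bytes of the file.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

sample = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"
print(parse_pdf_version(sample))
```

A file without the signature yields None, which is a quick first check before attempting any deeper parsing.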
There are multiple keywords supported to define...