Extracting metadata from PDF documents
Document metadata is information stored within a file that describes the file itself. It can include the software used to create the document, the name of the author or organization, and the date and time the file was created or modified.
Each application stores metadata differently, and the amount of metadata in a document almost always depends on the software used to create it. In this section, we will review how to extract metadata from PDF documents with the PyPDF2 and PyMuPDF modules.
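As a first orientation, the following is a minimal sketch of how both modules expose a PDF's metadata. It assumes PyPDF2 2.0 or later (where the reader class is PdfReader) and a recent PyMuPDF release (imported as fitz). The function names and the clean_metadata helper are hypothetical, introduced here only for illustration.

```python
from typing import Any, Dict, Mapping, Optional


def clean_metadata(raw: Optional[Mapping[str, Any]]) -> Dict[str, str]:
    """Hypothetical helper: normalize a metadata mapping.

    PyPDF2 keeps the leading '/' of PDF names in its keys (for example
    '/Author'), while PyMuPDF uses plain lowercase keys (for example
    'author'). This strips the slash and drops empty entries.
    """
    cleaned: Dict[str, str] = {}
    if not raw:
        return cleaned
    for key in raw:
        value = raw[key]  # indexing lets PyPDF2 resolve indirect objects
        if value is None or value == "":
            continue
        cleaned[str(key).lstrip("/")] = str(value)
    return cleaned


def metadata_with_pypdf2(path: str) -> Dict[str, str]:
    # Imported lazily so the helper above works without PyPDF2 installed.
    from PyPDF2 import PdfReader  # assumes PyPDF2 >= 2.0

    reader = PdfReader(path)
    # reader.metadata is None when the file has no document information.
    return clean_metadata(reader.metadata)


def metadata_with_pymupdf(path: str) -> Dict[str, str]:
    import fitz  # PyMuPDF is imported under the name 'fitz'

    with fitz.open(path) as doc:
        # doc.metadata is a dict with keys such as 'author' and 'producer'.
        return clean_metadata(doc.metadata)
```

Calling metadata_with_pypdf2("sample.pdf") or metadata_with_pymupdf("sample.pdf"), where sample.pdf is a placeholder file name, would return a dictionary of whatever fields the creating application chose to store. The two results may differ in key names and in how many fields are present.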