Extracting metadata from a spreadsheet
In this recipe, we will demonstrate how to extract metadata from an Excel spreadsheet using the Apache POI API. We will use the Excel document created in the Extracting text from a spreadsheet recipe.
Getting ready
To prepare for this recipe, we need to do the following:
- Create a new Maven project.
- Add the following dependency to the project's POM file:
<dependency>
  <groupId>org.apache.poi</groupId>
  <artifactId>poi-ooxml</artifactId>
  <version>4.0.1</version>
</dependency>
- Copy the spreadsheet created in the Extracting text from a spreadsheet recipe to the root level of the project.
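Before moving on, it can help to confirm that the spreadsheet really is at the project root, since a misplaced file is a common cause of a file-not-found error at run time. The following is a minimal sketch that uses only the standard library. The class name SpreadsheetLocator and the file name Cities.xlsx are hypothetical placeholders, not names taken from the earlier recipe, so substitute the actual file name.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class SpreadsheetLocator {
    // Hypothetical file name (an assumption); replace it with the name
    // of the spreadsheet created in the earlier recipe.
    static final String SPREADSHEET_NAME = "Cities.xlsx";

    // Returns true if fileName is a regular, readable file located
    // directly under projectRoot.
    public static boolean isAvailable(Path projectRoot, String fileName) {
        Path candidate = projectRoot.resolve(fileName);
        return Files.isRegularFile(candidate) && Files.isReadable(candidate);
    }

    public static void main(String[] args) {
        // An empty relative path resolves to the current working directory,
        // which is the project root when the program is run from there.
        Path root = Paths.get("").toAbsolutePath();
        if (isAvailable(root, SPREADSHEET_NAME)) {
            System.out.println("Found " + SPREADSHEET_NAME + " in " + root);
        } else {
            System.out.println("Missing " + SPREADSHEET_NAME + " in " + root);
        }
    }
}
```

Run this from the project's root directory. If it reports the file as missing, check the file name and location before continuing with the recipe.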