Extracting metadata from files
Many authoring tools let users capture metadata from within the tool's own user interface. If certain pieces of metadata are already stored as part of the file, why force content contributors to re-key the metadata when they add the content to the repository? Alfresco can use metadata extractors to inspect the file, extract the metadata, and save the metadata in the node's properties.
A metadata extractor is a Java class, configured as a Spring bean, that is called either when content is created in the repository or when a rule action invokes it. Alfresco knows which extractor to use for a given piece of content because each metadata extractor declares the MIME types it supports.
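To make the MIME type lookup concrete, here is a minimal plain-Java sketch of the idea. It does not use the Alfresco API: the SimpleExtractor interface, the Registry class, and PlainTextExtractor are hypothetical names invented for illustration, and the real repository wires extractors together through Spring rather than by hand.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class ExtractorSelectionDemo {

    // Hypothetical interface, not Alfresco's: an extractor declares the
    // MIME types it supports and returns raw metadata for some content.
    interface SimpleExtractor {
        Set<String> supportedMimetypes();
        Map<String, String> extract(byte[] content);
    }

    // Hypothetical registry: returns the first registered extractor
    // that declares support for the requested MIME type.
    static class Registry {
        private final List<SimpleExtractor> extractors = new ArrayList<>();

        void register(SimpleExtractor extractor) {
            extractors.add(extractor);
        }

        Optional<SimpleExtractor> find(String mimetype) {
            return extractors.stream()
                    .filter(e -> e.supportedMimetypes().contains(mimetype))
                    .findFirst();
        }
    }

    // Toy extractor: reports the content length as its only metadata value.
    static class PlainTextExtractor implements SimpleExtractor {
        @Override
        public Set<String> supportedMimetypes() {
            return Set.of("text/plain");
        }

        @Override
        public Map<String, String> extract(byte[] content) {
            return Map.of("length", String.valueOf(content.length));
        }
    }

    public static void main(String[] args) {
        Registry registry = new Registry();
        registry.register(new PlainTextExtractor());

        // A supported type finds an extractor; an unsupported one does not.
        System.out.println(registry.find("text/plain").isPresent());
        System.out.println(registry.find("application/pdf").isPresent());
    }
}
```

The point of the sketch is only that selection is driven by what each extractor declares, so adding support for a new file type means registering another extractor rather than changing the lookup code.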
Metadata extractors have a default mapping that identifies which pieces of file metadata should be stored in which node properties. The property mapping can be overridden by pointing to a custom mapping in the Spring bean configuration.
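The mapping idea can be sketched in plain Java as a lookup from raw file metadata keys to node property names. This is an illustration under stated assumptions, not the Alfresco implementation: the MappingDemo class, its toNodeProperties method, and the specific key-to-property pairs are hypothetical, and in Alfresco the mapping is supplied through the Spring bean configuration rather than in code.

```java
import java.util.HashMap;
import java.util.Map;

class MappingDemo {

    // Hypothetical default mapping: raw metadata key -> node property name.
    static final Map<String, String> DEFAULT_MAPPING = Map.of(
            "author", "cm:author",
            "title", "cm:title");

    // Applies a mapping to raw metadata. Raw keys that have no entry
    // in the mapping are dropped rather than stored.
    static Map<String, String> toNodeProperties(Map<String, String> raw,
                                                Map<String, String> mapping) {
        Map<String, String> properties = new HashMap<>();
        for (Map.Entry<String, String> entry : raw.entrySet()) {
            String target = mapping.get(entry.getKey());
            if (target != null) {
                properties.put(target, entry.getValue());
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        Map<String, String> raw = Map.of(
                "author", "J. Smith",
                "title", "Report",
                "subject", "Q3");

        // With the default mapping, "subject" is not stored anywhere.
        System.out.println(toNodeProperties(raw, DEFAULT_MAPPING));

        // A custom mapping (illustrative) replaces the default and
        // routes "subject" to a node property instead.
        Map<String, String> custom = Map.of(
                "author", "cm:author",
                "subject", "cm:description");
        System.out.println(toNodeProperties(raw, custom));
    }
}
```

In this sketch the custom mapping replaces the default outright; whether an override replaces or extends the default in a real deployment depends on how the extractor bean is configured.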