UIMA integration with Solr
Solr can also be integrated with Apache UIMA (short for Unstructured Information Management Architecture), which can be used to define a custom pipeline to add metadata to documents.
Note
More information about Solr UIMA integration can be found at https://wiki.apache.org/solr/SolrUIMA.
In Solr, UIMA can be configured by following these steps:
In solrconfig.xml, we can add the following libraries:
<lib dir="../../contrib/uima/lib" />
<lib dir="../../dist/" regex="solr-uima-\d.*\.jar" />
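To see which files the regex attribute of the second <lib> directive would pick up, the pattern can be tried against a few file names. The following is only a minimal sketch in Python: it assumes that Solr tests the pattern against the whole file name, and the jar names in the list are hypothetical examples rather than the actual contents of the dist directory.

```python
import re

# Pattern copied from the regex attribute of the <lib> directive.
LIB_REGEX = r"solr-uima-\d.*\.jar"


def select_jars(filenames, pattern=LIB_REGEX):
    """Return the file names that the pattern matches in full."""
    compiled = re.compile(pattern)
    # fullmatch() requires the entire name to match (an assumption about
    # how Solr applies the pattern), not just a prefix or a substring.
    return [name for name in filenames if compiled.fullmatch(name)]


# Hypothetical file names, for illustration only.
candidates = [
    "solr-uima-4.10.0.jar",      # version digit follows "solr-uima-"
    "solr-core-4.10.0.jar",      # different module name
    "solr-uima-4.10.0.jar.bak",  # does not end in ".jar"
    "solr-uima-snapshot.jar",    # no digit after "solr-uima-"
]

selected = select_jars(candidates)
print(selected)
```

Under these assumptions, only names that start with solr-uima-, continue with a digit, and end in .jar are selected, so the directive keeps working when the Solr version number in the jar name changes.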
After adding the libraries, we can add the following language, concept, and sentence fields to schema.xml:
<field name="language" type="string" indexed="true" stored="true" required="false"/>
<field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
<field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false"/>
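As a quick sanity check on these definitions, the three <field> elements can be parsed and summarized before they go into schema.xml. This is just an illustrative Python sketch: the <fields> wrapper element is added here only so that the fragment parses as a single XML document, the summarize_fields helper is a hypothetical name, and treating a missing multiValued attribute as false is an assumption.

```python
import xml.etree.ElementTree as ET

# The three field definitions from the step above, wrapped in a <fields>
# element (an assumption made so the fragment has a single root).
FIELDS_XML = """
<fields>
  <field name="language" type="string" indexed="true" stored="true" required="false"/>
  <field name="concept" type="string" indexed="true" stored="true" multiValued="true" required="false"/>
  <field name="sentence" type="text" indexed="true" stored="true" multiValued="true" required="false"/>
</fields>
"""


def summarize_fields(xml_text):
    """Map each field name to its type and whether it is multi-valued."""
    root = ET.fromstring(xml_text)
    return {
        field.get("name"): {
            "type": field.get("type"),
            # Assumption: a missing multiValued attribute means "false".
            "multi_valued": field.get("multiValued", "false") == "true",
        }
        for field in root.iter("field")
    }


summary = summarize_fields(FIELDS_XML)
for name, info in summary.items():
    print(name, info)
```

The summary makes the difference between the fields easy to see: language holds a single string value per document, while concept and sentence are declared multi-valued and so can hold several values each.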
After adding these fields, we'll...