Importing structured data from the database
Many enterprise applications store their important data in relational databases, which makes these databases one of the main data sources for searching with Apache Solr. Apache Solr provides DataImportHandler to deal with this type of data source. With DataImportHandler, you can also load only the deltas instead of reloading the complete data set each time. Often, the import can be set up as a scheduled off-hours job to minimize the impact of indexing on day-to-day work. When near-real-time updates are required, the import has to be scheduled to run at a fixed frequency.
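As a rough illustration of the full versus delta loads described above, the following minimal Python sketch builds the HTTP request URLs that a scheduled job might call. The `full-import` and `delta-import` commands and the `clean` and `commit` parameters are standard DataImportHandler request options. The host, the core name `products`, the `/dataimport` handler path (which depends on how the handler is registered in `solrconfig.xml`), and the helper `dataimport_url` are assumptions made for this example.

```python
from urllib.parse import urlencode

# Hypothetical defaults for illustration; adjust to your deployment.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "products"  # hypothetical core name


def dataimport_url(command, base=SOLR_BASE, core=CORE, **params):
    """Build the request URL for a DataImportHandler command.

    Assumes the handler is registered at /dataimport in solrconfig.xml.
    """
    query = {"command": command}
    for key, value in params.items():
        # Solr expects lowercase "true"/"false" for boolean parameters.
        query[key] = str(value).lower() if isinstance(value, bool) else value
    return f"{base}/{core}/dataimport?{urlencode(query)}"


# Full rebuild of the index, for example as a nightly off-hours job.
full_url = dataimport_url("full-import", clean=True, commit=True)

# Incremental load that picks up only rows changed since the last import.
delta_url = dataimport_url("delta-import", commit=True)

if __name__ == "__main__":
    print(full_url)
    print(delta_url)
```

A scheduler such as cron could issue an HTTP GET against `delta_url` at a fixed interval, and reserve `full_url` for an occasional complete rebuild.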
Traditionally, DataImportHandler supports a pull mechanism, but in newer releases of Apache Solr, DataImportHandler supports push operations as well. Some of the interesting features of DataImportHandler are listed as follows:
- Imports data from RDBMS, XML, RSS, and Atom sources into Solr, driven by configuration that can span multiple tables
- Data is denormalized, and it supports full as well as incremental import...