Segment merging under control
Lucene segments are written once and read many times, and their data structures cannot be changed, apart from the information about deleted documents, which is held in one of the files. After some time, when certain conditions are met, the contents of several segments are copied into a bigger segment, and the original segments are discarded and deleted from the disk. This operation is called segment merging.
You may ask yourself, why bother about segment merging? There are a few reasons. First, the more segments an index is built from, the slower searches will be and the more memory Lucene will need. Second, because segments are immutable, information is not physically removed from them. If you delete many documents from your index, those documents are only marked as deleted and are not physically removed until a merge happens. When segment merging happens, the documents that are marked as deleted are not written into the new segment...
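The behavior described above can be illustrated with a small toy model. This is only a sketch and not Lucene's actual implementation: the names `Segment`, `ToyIndex`, `flush`, `delete`, `search`, `stored_docs`, and `merge` are hypothetical, and the sketch merges all segments at once, whereas a real merge combines only some segments when certain conditions are met.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """Write-once segment: its documents never change after creation."""
    docs: tuple  # (doc_id, text) pairs


@dataclass
class ToyIndex:
    """Hypothetical toy index, not a Lucene API."""
    segments: list = field(default_factory=list)
    # Delete markers are kept apart from the immutable segment data.
    deleted: set = field(default_factory=set)

    def flush(self, docs):
        """Write a batch of documents as a new segment."""
        self.segments.append(Segment(tuple(docs)))

    def delete(self, doc_id):
        """Only mark the document; the segment holding it is untouched."""
        self.deleted.add(doc_id)

    def search(self, term):
        """Visit every segment, skipping documents marked as deleted."""
        return [doc_id
                for seg in self.segments
                for doc_id, text in seg.docs
                if doc_id not in self.deleted and term in text]

    def stored_docs(self):
        """Count documents physically stored, including deleted ones."""
        return sum(len(seg.docs) for seg in self.segments)

    def merge(self):
        """Copy live documents into one new segment; discard the old ones."""
        live = tuple((doc_id, text)
                     for seg in self.segments
                     for doc_id, text in seg.docs
                     if doc_id not in self.deleted)
        self.segments = [Segment(live)]
        self.deleted.clear()  # the markers are no longer needed


index = ToyIndex()
index.flush([(1, "lucene segments"), (2, "merge policy")])
index.flush([(3, "segments merge"), (4, "deleted documents")])
index.delete(2)

hits_before = index.search("merge")
stored_before = index.stored_docs()
index.merge()
hits_after = index.search("merge")
stored_after = index.stored_docs()
```

In this sketch, the search returns the same hits before and after the merge, but the number of physically stored documents drops from four to three, and the search has one segment to visit instead of two.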