Faceting with hierarchical taxonomy
You will have come across e-commerce sites that show facets in a hierarchy. Let's take a look at www.amazon.com and check how hierarchy is handled there. A search for "shoes" provides the following hierarchy:
Department Shoes -> Men -> Outdoor -> Hiking & Trekking -> Hiking Boots
How is this hierarchy built into Solr and how do searches happen on it?
In earlier versions of Solr, this used to be handled by a tokenizer known as solr.PathHierarchyTokenizerFactory. Each document would contain the complete path or hierarchy leading to the document, and searches would show multiple facets for a single document.
For example, the shoes hierarchy we saw earlier can be indexed as:
doc #1 : /dept_shoes/men/outdoor/hiking_trekking/hiking_boots
doc #2 : /dept_shoes/men/work/formals/
The PathHierarchyTokenizerFactory class will break this field into one token per level of the path, such as the following:
doc #1 : /dept_shoes, /dept_shoes/men, /dept_shoes...
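To make the splitting concrete, here is a minimal Python sketch that imitates the idea behind path-hierarchy tokenization: each path yields one token per ancestor level. It is not Solr's implementation. The function name path_hierarchy_tokens is hypothetical, and the sketch assumes "/" as the delimiter and simply ignores a trailing delimiter, which may differ from what the real tokenizer emits.

```python
def path_hierarchy_tokens(path, delimiter="/"):
    """Return every ancestor prefix of `path`, shortest first.

    Illustrative only: a hypothetical helper, not part of Solr or Lucene.
    Empty segments (leading or trailing delimiters) are dropped.
    """
    parts = [segment for segment in path.split(delimiter) if segment]
    tokens = []
    for depth in range(1, len(parts) + 1):
        # Rebuild the prefix that ends at the current depth.
        tokens.append(delimiter + delimiter.join(parts[:depth]))
    return tokens


# Print the prefixes for the first example document from the text.
for token in path_hierarchy_tokens(
    "/dept_shoes/men/outdoor/hiking_trekking/hiking_boots"
):
    print(token)
```

Because every prefix becomes its own token, a facet on such a field can count a document under each level of its hierarchy, which is why a single document shows up in multiple facets.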