Semantic search with SBERT
We may already be familiar with keyword-based search (the Boolean model), where, for a given keyword or pattern, we retrieve the results that match it. Alternatively, we can use regular expressions to define more advanced patterns, such as lexico-syntactic patterns. These traditional approaches cannot handle synonyms (for example, car is the same as automobile) or word-sense problems (for example, bank as the side of a river or bank as a financial institution). The synonym case causes low recall because relevant documents are missed, while the word-sense case causes low precision because irrelevant documents are retrieved.
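To make these two failure modes concrete, here is a minimal sketch of whole-word keyword matching using only the Python standard library. The four-sentence corpus and the `keyword_search` helper are hypothetical, invented purely for illustration; they are not part of the case study.

```python
import re

# Hypothetical toy corpus, invented for illustration only.
docs = [
    "I bought a new car last week.",
    "The automobile industry is growing.",
    "We had a picnic on the bank of the river.",
    "She deposited her salary at the bank.",
]


def keyword_search(query, documents):
    """Return the documents that contain the query as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(query) + r"\b", re.IGNORECASE)
    return [doc for doc in documents if pattern.search(doc)]


# Synonym problem: the "automobile" document is relevant but is not retrieved (low recall).
car_hits = keyword_search("car", docs)

# Word-sense problem: both senses of "bank" are retrieved, even if the user
# only meant the financial institution (low precision).
bank_hits = keyword_search("bank", docs)

print(car_hits)
print(bank_hits)
```

In this toy setup, the query "car" returns only the first sentence, while the query "bank" returns both the river and the finance sentences. An embedding-based approach aims to close exactly this gap by comparing meanings rather than surface strings.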
Let’s set up a case study for frequently asked questions (FAQs), which often sit idle on websites. We will treat these FAQ resources as a semantic search problem. We will be using the FAQ from the World Wide Fund for Nature (WWF), a nature non-governmental organization (https://www.wwf...