Similarity Searching with Vectors
This chapter is all about the R, or retrieval, part of retrieval-augmented generation (RAG). Specifically, we are going to talk about four areas related to similarity search: indexing, distance metrics, similarity algorithms, and vector search services. With this in mind, we will cover the following topics:
- Distance metrics versus similarity algorithms versus vector search
- Vector space
- Code lab 8.1 – Semantic distance metrics
- Different search paradigms – sparse, dense, and hybrid
- Code lab 8.2 – Hybrid search with a custom function
- Code lab 8.3 – Hybrid search with LangChain’s EnsembleRetriever
- Semantic search algorithms such as k-NN and ANN
- Indexing techniques that enhance ANN search efficiency
- Vector search options
By the end of this chapter, you should have a comprehensive understanding of how vector-based similarity searching operates and why it’s...