Vectors
Understanding vectors and the ways they are used in RAG is arguably the most important part of this book. As mentioned previously, vectors are mathematical representations of your external data, and they are often referred to as embeddings. These representations capture semantic information in a format that algorithms can process, which enables tasks such as similarity search, a crucial step in the RAG process.
The dimension of a vector is the number of values it contains. For example, this is a four-dimensional vector:
[0.123, 0.321, 0.312, 0.231]
If you didn’t know we were talking about vectors and you saw this in Python code, you might recognize it as a list of four floating-point numbers, and you wouldn’t be too far off. However, when working with vectors in Python, you should treat them as NumPy arrays rather than lists. NumPy arrays are generally more machine-learning...
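To make the list-versus-array distinction concrete, here is a minimal sketch that builds the four-dimensional vector shown earlier both ways and reads its dimension from the array. The second vector and the use of cosine similarity are assumptions added for illustration. Cosine similarity is one common way to compare two vectors, not necessarily the measure a given RAG pipeline uses.

```python
import numpy as np

# The four-dimensional example vector, first as a plain Python list.
vector_as_list = [0.123, 0.321, 0.312, 0.231]

# The same values as a NumPy array.
vector = np.array(vector_as_list)

# The dimension is the number of values the vector holds.
dimension = vector.shape[0]

# A second vector of the same dimension (hypothetical values, for illustration only).
other = np.array([0.120, 0.300, 0.330, 0.250])

# Cosine similarity: the dot product divided by the product of the two vector lengths.
cosine_similarity = float(
    np.dot(vector, other) / (np.linalg.norm(vector) * np.linalg.norm(other))
)

print(type(vector_as_list).__name__, type(vector).__name__)
print(dimension)
print(round(cosine_similarity, 3))
```

The array reports a dimension of 4, matching the number of values in the list. A cosine similarity close to 1 indicates that the two vectors point in nearly the same direction, which is the intuition behind similarity search.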