Log vectorization
Log vectorization is the process of transforming logs into embeddings. This process requires a couple of steps, such as generating logs for the test and expanding and using a general model to generate vectors.
In addition, we made the arbitrary choice to do everything in Python here, which gives you the ability to re-execute the same examples in a Google Colab notebook for educational purposes.
All the code from this chapter is available in the chapter 7
folder of this book’s GitHub repository: https://github.com/PacktPublishing/Vector-Search-for-Practitioners-with-Elastic/tree/main/chapter7.
Note that instead of applying the first approach and trying to generate vectors directly from the logs, we will adopt the strategy of expanding them to a human-readable description first, allowing us to avoid the intensive process of model training.
We are now going to learn how to generate synthetic logs.
Synthetic log
With synthetic logs, we enable...