Semantic search on our logs
In this section, we will transform the expanded logs into vectors in Elasticsearch and then implement semantic search on top of the vectorized content. Remember that our logs are now stored in human-readable language, so we can apply to them the principles of NLP and semantic search that we saw earlier.
Building a query using log vectorization
The following code takes the sequence of expanded logs to build a bulk indexing query for Elasticsearch:
# Generate the sequence of JSON documents for a bulk index operation
bulk_index_body = []
for index, log in enumerate(batchCompletion):
    document = {
        "_index": "logs",
        "pipeline": "vectorize-log",
        "_source": {
            "text_field": log,
            "log": logs[index]
        }
    }
    bulk_index_body.append(document)
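The same action list can also be built by a small function, which makes the pairing between each expanded log and its original line explicit and easy to check before anything is sent to Elasticsearch. The following is a minimal sketch, not the chapter's own code: the function name build_bulk_actions, its parameters, and the length check are assumptions added for illustration. Only the index name, the pipeline name, and the document fields come from the code above.

```python
from typing import Any, Dict, List, Sequence


def build_bulk_actions(
    expanded_logs: Sequence[str],
    raw_logs: Sequence[str],
    index_name: str = "logs",
    pipeline: str = "vectorize-log",
) -> List[Dict[str, Any]]:
    """Pair each expanded log with its raw log and wrap both in a bulk action.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # Guard against silently misaligned inputs (an added safety check).
    if len(expanded_logs) != len(raw_logs):
        raise ValueError("expanded_logs and raw_logs must have the same length")

    return [
        {
            "_index": index_name,      # target index
            "pipeline": pipeline,      # ingest pipeline applied at index time
            "_source": {
                "text_field": expanded,  # human-readable text to vectorize
                "log": raw,              # original log line, kept for reference
            },
        }
        for expanded, raw in zip(expanded_logs, raw_logs)
    ]


# With a configured client, the result could then be passed to the
# bulk helper of the Python client, for example:
#   from elasticsearch import helpers
#   helpers.bulk(es_client, build_bulk_actions(batchCompletion, logs))
# (es_client is assumed to be an already connected Elasticsearch client.)
```

Using zip instead of indexing into logs keeps the two sequences in step, and the explicit length check surfaces a mismatch early instead of raising an IndexError or dropping entries partway through the batch.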
The code then executes the bulk indexing operation using a Python helper. Note that we do not...