The Data Storage Layer is where data is persisted. Our core persistent store is HDFS but because of its inherent slowness in querying, we need to have a technology on top of it for fast reading/querying. After indexing, the appropriate speed views are created and kept in the Lambda Speed Layer. The Lambda Speed Layer is entrusted with indexing the data stored in HDFS for high performance and scalable querying of required data.
Elasticsearch is our choice of technology capable of doing this effectively. Elasticsearch is one of the de-facto choices for such a capability, and with not much deliberation this technology choice was made.
The following sections of this chapter aim at covering Elasticsearch in detail so that you get a clear picture of this technology as well as get to know the data storage layer in detail.