In the previous chapter, we learned how to preprocess documents with ingest pipeline processors before indexing them. We looked at all of the Ingest APIs and learned how to use the processors. We also discussed conditional execution and error handling in depth.
In this chapter, we'll use a powerful tool, the Aggregation Framework, to perform data analysis. According to the definition from the Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) (https://www.itl.nist.gov/div898/handbook/eda/section1/eda11.htm), Exploratory Data Analysis (EDA) is an approach to data analysis that lets the data reveal its underlying structure and model. We'll use a few examples to illustrate EDA.
By the end of this chapter, we will...