What is the MongoDB aggregation framework?
The MongoDB aggregation framework enables you to perform data processing and manipulation on the documents in one or more MongoDB collections. It allows you to perform data transformations and gather summary data using various operators for filtering, grouping, sorting, and reshaping documents. You construct a pipeline consisting of one or more stages, each applying a specific transformation operation on the documents as they pass through the pipeline. One of the common uses of an aggregation pipeline is to calculate sums and averages, similar to using SQL's GROUP BY
clause in a relational database but tailored to the MongoDB document-oriented structure.
The MongoDB aggregation framework enables users to send an analytics or data processing workload—written using an aggregation language—to the database to execute the workload against the data it holds. The MongoDB aggregation framework has two parts:
- An aggregation API provided by the MongoDB driver that you embed in your application. You define an aggregation pipeline in your application's code and send it to the database for processing.
- The aggregation runtime in the database that receives the pipeline request from the application and executes the pipeline against the persisted data.
Figure 1.1 illustrates these two elements and their relationship:
Figure 1.1: MongoDB aggregation framework
Each driver provides APIs to enable an application to use both the MongoDB Query Language (MQL) and the aggregation framework. In the database, the aggregation runtime reuses the query runtime to efficiently execute the query part of an aggregation workload that typically appears at the start of an aggregation pipeline.