The aggregation framework
The MongoDB aggregation framework is an easy way to compute aggregated values, and it works well with sharding without requiring MapReduce (see Chapter 13, Working with MapReduce). It is flexible and functional, and it makes operational pipelines and computational expressions simple to implement. The framework uses a declarative JSON format and is implemented in C++ instead of JavaScript, which improves performance. The prototype of the aggregate method is as follows:
db.collection.aggregate( [<pipeline>] )
In the following code, we can see a simple count obtained by grouping on the sentiment field with the aggregate method. In this case, the pipeline uses only the $group operator:
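from pymongo import MongoClient

# Connect to the local MongoDB server and select the tweets collection
con = MongoClient()
db = con.Corpus
tweets = db.tweets

# Group the documents by sentiment and count the documents in each group
results = tweets.aggregate([
    {"$group": {"_id": "$sentiment", "count": {"$sum": 1}}}
])

# PyMongo 3.0 and later returns a cursor; older versions returned a dict with a "result" key
for doc in results:
    print(doc)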
In the following screenshot, we can see the result...
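To make the semantics of the $group stage concrete without a running server, the following minimal sketch reproduces the same count in plain Python. The sample documents and the group_count helper are hypothetical and are not part of MongoDB or PyMongo; they only illustrate what the pipeline computes, assuming each tweet document carries a sentiment field.

```python
from collections import Counter

# Hypothetical sample documents; the real tweets collection is not shown here.
sample_tweets = [
    {"text": "great day", "sentiment": "positive"},
    {"text": "so tired", "sentiment": "negative"},
    {"text": "love it", "sentiment": "positive"},
]


def group_count(docs, field):
    """Emulate {"$group": {"_id": "$<field>", "count": {"$sum": 1}}} in memory.

    Hypothetical helper for illustration only. Documents that lack the
    field are grouped under None, much as MongoDB groups them under null.
    """
    counts = Counter(doc.get(field) for doc in docs)
    return [{"_id": key, "count": n} for key, n in counts.items()]


for doc in group_count(sample_tweets, "sentiment"):
    print(doc)
```

Each output document has the same shape as the ones produced by the aggregate call: the _id holds the grouping value and count holds the number of documents in that group. The server-side pipeline does this work inside MongoDB, so the documents do not have to be transferred to the client first.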