Aggregation framework
The MongoDB aggregation framework is a straightforward way to compute aggregated values, and it works with sharded collections without resorting to MapReduce (see Chapter 13, Working with MapReduce). The framework is flexible and functional, and it makes operation pipelines and computational expressions simple to implement. Pipelines are written in a declarative JSON format and executed by native C++ code rather than JavaScript, which improves performance. The prototype of the aggregate method is shown here:
db.collection.aggregate( [<pipeline>] )
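As a rough illustration of the [<pipeline>] argument, the sketch below builds a pipeline as an ordinary Python list of stage documents. The helper names build_count_pipeline and run_pipeline, the min_count parameter, and the choice of the $group, $match, and $sort stages are assumptions made for illustration only. run_pipeline additionally assumes PyMongo 3.0 or later, where aggregate() returns an iterable cursor.

```python
def build_count_pipeline(field, min_count=0):
    """Return a pipeline (a list of stage dicts) that counts documents
    per distinct value of `field`. Hypothetical helper for illustration."""
    pipeline = [
        # Group documents by the field's value and count each group.
        {"$group": {"_id": "$" + field, "count": {"$sum": 1}}},
        # Sort the groups so the largest count comes first.
        {"$sort": {"count": -1}},
    ]
    if min_count > 0:
        # Optionally keep only groups with at least `min_count` documents.
        # The stage goes between $group and $sort because stages run in order.
        pipeline.insert(1, {"$match": {"count": {"$gte": min_count}}})
    return pipeline


def run_pipeline(collection, pipeline):
    """Run the pipeline and return the resulting documents as a list.
    Assumes PyMongo 3.0+, where aggregate() returns a cursor."""
    return list(collection.aggregate(pipeline))
```

Because the pipeline is plain data, it can be inspected or unit-tested before it is ever sent to a server; each stage receives the output of the stage before it.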
The following code performs a simple count by grouping on the sentiment field with the aggregate method. In this case, the pipeline uses only the $group operator:
from pymongo import MongoClient

con = MongoClient()
db = con.Corpus
tweets = db.tweets
results = tweets.aggregate([
    {"$group": {"_id": "$sentiment", "count": {"$sum": 1}}}
])
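# With PyMongo 2.x, aggregate() returns a dict whose "result" key holds the documents;
# with PyMongo 3.0 and later it returns a cursor, so iterate over results directly.
for doc in results["result"]:...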
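To make the shape of the grouped output concrete without a running server, the sketch below emulates the $group stage with {"$sum": 1} in plain Python. The function name group_count and the sample documents (including the sentiment values "positive" and "negative") are made up for illustration and do not come from the Corpus database.

```python
from collections import Counter


def group_count(docs, field):
    """Emulate {"$group": {"_id": "$<field>", "count": {"$sum": 1}}}.

    Documents that lack `field` are grouped under _id None, which
    corresponds to the null group MongoDB produces for a missing field.
    """
    counts = Counter(doc.get(field) for doc in docs)
    return [{"_id": value, "count": n} for value, n in counts.items()]


# Hypothetical sample data, not taken from the original collection.
sample_tweets = [
    {"text": "great day", "sentiment": "positive"},
    {"text": "missed the bus", "sentiment": "negative"},
    {"text": "love this", "sentiment": "positive"},
    {"text": "no label here"},
]

grouped = group_count(sample_tweets, "sentiment")
for doc in grouped:
    print(doc["_id"], doc["count"])
```

Each output document has an _id holding the grouping value and a count holding the number of documents in that group. The order of groups returned by MongoDB is not guaranteed unless a $sort stage follows.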