State retention and the need for Trident
Trident is a distributed real-time analytics framework. It maintains its state in a fault-tolerant way, either internally (for example, in memory) or externally (for example, in Hazelcast). The effect is similar to processing each event exactly once. Trident fits micro-batch processing use cases such as aggregation and filtering.
Let's take an example that explains how to achieve exactly-once semantics. Suppose that you're counting how many people visited your blog and storing the running count in a database. You store a single value representing the count, and every time you process a new tuple you increment that value.
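The naive counter described above can be sketched in a few lines of Java. This is only an illustration, not Trident code: the class name `NaiveVisitCounter`, the `process` and `count` methods, and the in-memory map standing in for the database are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class NaiveVisitCounter {
    // Hypothetical stand-in for the database: one running count per blog.
    private final Map<String, Long> database = new HashMap<>();

    // Called once per tuple delivery; blindly increments the stored count.
    public void process(String blogId) {
        database.merge(blogId, 1L, Long::sum);
    }

    // Reads the current running count (0 if the blog has no visits yet).
    public long count(String blogId) {
        return database.getOrDefault(blogId, 0L);
    }

    public static void main(String[] args) {
        NaiveVisitCounter counter = new NaiveVisitCounter();
        counter.process("my-blog"); // first delivery of a visit tuple
        counter.process("my-blog"); // the same tuple delivered a second time
        System.out.println(counter.count("my-blog")); // prints 2
    }
}
```

Because the stored value carries no record of which tuples produced it, delivering the same tuple twice counts a single visit twice.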
Now, if failures happen, the Storm topology replays tuples. The problem is knowing whether a replayed tuple has already been processed and the count already updated in the database. If so, you should not update it again; if the tuple did not process successfully, then you have to...