In this chapter, we will discuss an example that demonstrates how Apache Apex can be used to process real-time ride service data. We don't have live access to such data; however, historical Yellow Cab trip data is freely available on the New York City government website, and we will use it in this example to simulate real-time ride service data processing.
This example draws on several important concepts in stream processing and Apache Apex, including event-time windowing, out-of-order processing, and streaming windows. In this chapter, we'll cover the following topics:
- The goal
- Datasource
- The pipeline
- Simulation of real-time feed using historical data
- Running the application