The pipeline
The application pipeline consists of six operators as shown in the following diagram:
The first operator, NycTaxiDataReader, reads from the source file(s). The second operator, NycTaxiCsvParser, reads the raw lines from NycTaxiDataReader, parses the data, and passes it to the third operator, NycTaxiZipFareExtractor. NycTaxiZipFareExtractor extracts the zip code from the lat-lon information in the data and prepares the output for WindowedOperator to consume. It also produces watermarks for WindowedOperator. NycTaxiDataServer takes the output from WindowedOperator and serves it over WebSocket by passing it to QueryResult. QueryResult is a PubSubWebSocketOutputOperator, which delivers the results via WebSocket.
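
To make the data flow concrete, here is a minimal plain-Java sketch of the same stages. It does not use the Apex API or the real operator classes. The input layout ("lat,lon,fare"), the sample lines, the Trip class, and the grid-cell key that stands in for the zip lookup are all assumptions made for illustration. The sketch also assumes the windowed stage sums fares per key, treats the whole input as a single window, and leaves out watermarks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java stand-in for the six stages; none of this is the Apex API.
class TaxiPipelineSketch {

    // Hypothetical parsed record (field layout is assumed, not the real schema).
    static final class Trip {
        final double lat, lon, fare;
        Trip(double lat, double lon, double fare) {
            this.lat = lat; this.lon = lon; this.fare = fare;
        }
    }

    // Stage 1 (NycTaxiDataReader): emit raw lines. Made-up sample input.
    static List<String> sampleLines() {
        return Arrays.asList(
            "40.75,-73.99,10.0",
            "40.75,-73.99,5.5",
            "40.64,-73.78,30.0");
    }

    // Stage 2 (NycTaxiCsvParser): raw line -> structured record.
    static Trip parse(String line) {
        String[] f = line.split(",");
        return new Trip(Double.parseDouble(f[0]),
                        Double.parseDouble(f[1]),
                        Double.parseDouble(f[2]));
    }

    // Stage 3 (NycTaxiZipFareExtractor): derive a grouping key from lat-lon.
    // Placeholder: buckets coordinates into a grid cell, not a real zip lookup.
    static String zoneKey(Trip t) {
        return (long) Math.floor(t.lat * 10) + ":" + (long) Math.floor(t.lon * 10);
    }

    // Stage 4 (WindowedOperator): assumed to sum fares per key, one window only.
    static Map<String, Double> run(List<String> lines) {
        Map<String, Double> totals = new TreeMap<>();
        for (String line : lines) {
            Trip t = parse(line);
            totals.merge(zoneKey(t), t.fare, Double::sum);
        }
        return totals;
    }

    // Stages 5-6 (NycTaxiDataServer, QueryResult): print instead of WebSocket.
    public static void main(String[] args) {
        run(sampleLines()).forEach(
            (zone, total) -> System.out.println(zone + " -> " + total));
    }
}
```

In the real application each stage is a separate operator connected by streams, so the stages run concurrently and the aggregation is emitted per window rather than once at the end.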