The problem with using tail
If you have used any of the Flume 0.9 releases, you'll notice that TailSource is no longer part of Flume. TailSource provided a mechanism to tail (http://en.wikipedia.org/wiki/Tail_(Unix)) any file on the system and create a Flume event for each line of the file. Many users were already using the filesystem as a handoff point between the application creating the data (for instance, log4j) and the mechanism responsible for moving those files someplace else (for instance, syslog). So, TailSource was the perfect replacement for the syslog transport without needing to make changes to the application creating the data.
With both sources and sinks, events are added to and removed from a channel as part of a transaction. When you are tailing a file, there is no way to participate properly in a transaction. If a write to the channel failed, or if the channel was simply full (a more likely event than a failure), the data couldn't be "put back...