fileStream
The fileStream
creates an input stream that monitors a Hadoop-compatible filesystem. It reads new files using a given key-value type and input format. Any filenames starting with .
are ignored. Invoking an atomic file rename function, a filename starting with .
is renamed to a usable filename that can be picked up by the fileStream
and have its contents processed:
def fileStream[K: ClassTag, V: ClassTag, F <: NewInputFormat[K, V]: ClassTag] (directory: String): InputDStream[(K, V)]
textFileStream
The textFileStream
command creates an input stream that monitors a Hadoop-compatible filesystem. It reads new files, as text files with the key as Longwritable
, the value as text
, and the input format as TextInputFormat
. Any files that have names starting with .
are ignored:
def textFileStream(directory: String): Dstream[String]
binaryRecordsStream
Using binaryRecordsStream
, an input stream that monitors a Hadoop-compatible filesystem is created. Any filenames starting with .
are ignored...