Distributed video decoding in Hadoop
Most of the popular video compression formats, such as MPEG-2 and MPEG-4, follow a hierarchical structure in the bit-stream. In this subsection, we will assume that the compression format used has a hierarchical structure for its bit-stream. For simplicity, we have divided the decoding task into two different Map-reduce jobs:
Extraction of video sequence level information: From the outset, it can be easily predicted that the header information of all the video dataset can be found in the first block of the dataset. In this phase, the aim of the map-reduce job is to collect the sequence level information from the first block of the video dataset and output the result as a text file in the HDFS. The sequence header information is needed to set the format for the decoder object.
For the video files, a new
FileInputFormat
should be implemented with its own record reader. Each record reader will then provide a<key, value>
pair in this format to each...