MapReduce-based Video Analytic Application

4.2 Methodology

4.2.3 Hadoop System Design Overview

4.2.3.3 MapReduce-based Video Analytic Application

For the proposed experiment, a simple face detection algorithm and a motion detection algorithm were implemented and tested as custom MapReduce jobs. The system uses FFmpeg for video file decoding and encoding, and OpenCV for the execution of the algorithms. These are native C and C++ libraries, whereas Hadoop is a Java-based runtime environment. Therefore the JavaCV wrapper was selected to provide a Java API to Hadoop.

In the proposed research, the MapReduce-based algorithms were implemented by modifying the default Java classes used in the different phases of the MapReduce data flow. Figure 4.4 illustrates the MapReduce data flow, showing the connection between the system phases and the detailed steps of processing one video file (named InputSplit).

Figure 4.4: Hadoop performing a video analytic job

As illustrated in Figure 4.4, when a video file is stored in HDFS it is generally divided into logically separate files (InputSplits) of the same size, which are distributed across the cluster of VM nodes. The known storage locations of the InputSplits are used by the Hadoop system (i.e., the master) to schedule map tasks on the tasktrackers (on the VM nodes) where the data splits reside. It is worth mentioning that a mapper takes the file as its input, so data locality becomes important.

In our case we treat the input video file as a complete file to be processed by one mapper, by overriding the default Hadoop isSplitable() method in the FileInputFormat class. We avoid splitting the input file for the reasons detailed in Section 4.3.1.1, i.e., a compressed video file consists of correlated frames, so random splitting would place dependent frames in different InputSplits and thus produce files that FFmpeg cannot decode.
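The effect of disabling splitting can be illustrated with a minimal sketch in plain Python (it is not the Hadoop Java API). The function name `compute_splits`, the `splittable` flag, the 64 MB split size and the 200 MB file size are assumptions made for illustration only; in Hadoop the corresponding decision is taken by the input format's isSplitable() and getSplits methods, whose real split-size rules are more involved than this fixed-size loop.

```python
from typing import List, Tuple

MB = 1024 * 1024


def compute_splits(file_size: int, split_size: int, splittable: bool) -> List[Tuple[int, int]]:
    """Return (offset, length) pairs describing the logical splits of one file.

    Simplified illustration: fixed-size splits, with a shorter last split.
    """
    if file_size <= 0:
        return []
    if not splittable:
        # The whole file forms a single split, so a single mapper sees every frame.
        return [(0, file_size)]
    splits = []
    offset = 0
    while offset < file_size:
        length = min(split_size, file_size - offset)
        splits.append((offset, length))
        offset += length
    return splits


# Hypothetical 200 MB compressed video and 64 MB split size.
split_video = compute_splits(200 * MB, 64 * MB, splittable=True)
whole_video = compute_splits(200 * MB, 64 * MB, splittable=False)
print(len(split_video), "splits when splitting is allowed")
print(len(whole_video), "split when splitting is disabled")
```

With splitting allowed, the split boundaries fall at arbitrary byte offsets that ignore frame dependencies; with splitting disabled, the decoder receives the file intact.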

When a MapReduce face detection or motion detection task is executed, it typically first calculates the splits for the job by calling getSplits. In the proposed configuration only one InputSplit is considered, as discussed above. The application sends this split to the master jobtracker, which schedules a map task to be processed by a single tasktracker (a VM). The details of the map and reduce phases are as follows:

• In the map phase:

– VideoRecordReader class: The map task uses the RecordReader to decode and extract the sequence of frames from the InputSplit by calling the FFmpeg tool. Each decoded frame is then represented by a key-value pair. The key is a unique frame ID corresponding to the frame number within the sequence, and the value is the data of the corresponding frame. Subsequently the InputSplit, in the form of key-value pairs, is sent to the map function for processing. For instance, a video file consisting of a sequence of frames is transformed into (key, value) pairs; see Figure 4.5.

– Mapper function: Takes the key-value pairs generated in the previous phase and groups them according to the requirements of the video analytic algorithm: as single frames for the face detection algorithm, or as series of frames for the motion detection algorithm. If there is more than one reducer, the output is partitioned by key and sent to the buffer as input for the reduce phase. The map output is referred to as the intermediate output.

• In the reduce phase:

– Shuffle phase: Transfers intermediate data from the mapper nodes to the reducer nodes scheduled by the jobtracker. A reducer takes (key, value) pairs as input, so any node (VM) can perform the reduce task and data locality is not a concern.

– Sort phase: Sorts by key the intermediate inputs that come from the different mappers.

– Reducer function: Each reducer takes all key-value pairs with the same key, merges them, and then applies the face detection or motion detection algorithm to the frames (i.e., the values) according to the instructions in the Java code that implements the computer vision algorithm. Finally the results are sent to the OutputFormat class.

– OutputFormat: Generates output in the form of text, including the frame number in which a face is detected and the locations of the face(s) in the frame. Finally the RecordWriter is used to write the results to HDFS, ready for the application to read.
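The record-reading, partitioning and shuffle/sort steps listed above can be sketched in plain Python as follows. This is a simulation of the data flow under stated assumptions, not Hadoop code: the function names are hypothetical, frame IDs are assumed to start at 0, frames are stood in for by short byte strings, and the modulo rule for choosing a reducer is an assumed simplification of key-based partitioning.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def record_reader(frames: Iterable[bytes]) -> Iterator[Tuple[int, bytes]]:
    """Emit (frame_id, frame_data) pairs; frame_id is the position in the sequence."""
    for frame_id, frame in enumerate(frames):
        yield frame_id, frame


def partition(key: int, num_reducers: int) -> int:
    """Choose a reducer for a key (simple modulo rule, assumed for illustration)."""
    return key % num_reducers


def shuffle_and_sort(pairs: Iterable[Tuple[int, bytes]],
                     num_reducers: int) -> List[Dict[int, List[bytes]]]:
    """Route each pair to its reducer, group values by key, and order keys."""
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for key, value in pairs:
        buckets[partition(key, num_reducers)][key].append(value)
    # Each reducer sees its keys in sorted order.
    return [dict(sorted(bucket.items())) for bucket in buckets]


# Four placeholder "frames" spread over two reducers.
decoded = [b"f0", b"f1", b"f2", b"f3"]
reducer_inputs = shuffle_and_sort(record_reader(decoded), num_reducers=2)
for reducer_id, grouped in enumerate(reducer_inputs):
    print(reducer_id, grouped)
```

Because each reducer receives complete (key, values) groups, the node on which it runs does not need to hold the original video data.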

The output of the MapReduce face detection application is written to a text file giving the coordinates (left, top, width, height) that locate the faces in each video frame (image). The output of the MapReduce motion detection application is also written to a text file, giving the number, time and duration of the detected motions. We compared the accuracy of these applications when run in the Hadoop environment with that obtained on a stand-alone system, and found similar results in both scenarios. This is expected, since in the Hadoop distributed system each machine runs the same application code on every video file and the outputs are then merged.
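As a rough illustration of the text output and of the comparison between the two environments, the following Python sketch formats face locations per frame and checks two outputs for equality. The exact line layout (frame number, a tab, then one tuple per face), the function names and the sample boxes are assumptions for illustration; the source only states that the frame number and the (left, top, width, height) coordinates are written as text.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, width, height)


def format_face_output(detections: Dict[int, List[Box]]) -> List[str]:
    """Render one text line per frame: frame number, then each face box."""
    lines = []
    for frame_id in sorted(detections):
        boxes = " ".join(
            f"({left},{top},{width},{height})"
            for left, top, width, height in detections[frame_id]
        )
        lines.append(f"{frame_id}\t{boxes}")
    return lines


def same_results(lines_a: List[str], lines_b: List[str]) -> bool:
    """Compare two outputs while ignoring the order in which lines were written."""
    return sorted(lines_a) == sorted(lines_b)


# Hypothetical detections, used only to exercise the functions.
hadoop_lines = format_face_output({2: [(10, 20, 30, 40)], 1: [(1, 2, 3, 4), (5, 6, 7, 8)]})
standalone_lines = list(reversed(hadoop_lines))
print("\n".join(hadoop_lines))
print(same_results(hadoop_lines, standalone_lines))
```

An order-insensitive comparison is used because merged reducer outputs need not arrive in the same order as the output of a sequential run.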

Table 4.1 provides the pseudo code for the MapReduce functions of the face detection algorithm (i.e., applied on a frame-by-frame basis) and the motion detection algorithm (i.e., applied on overlapped frames).

Table 4.2: Pseudo code for the implementation of single-frame and overlapped-frame oriented applications based on Hadoop MapReduce.

Map Phase:
    Inputs: <frameID, frame>
    Outputs: <groupID, EncodedFrame>
    if Single-Frame-App // if the application is single-frame oriented
        groupID = frameID
        EncodedFrame = frame
    else
        groupID = get-episod(frameID) // determine which group this frame belongs to
        EncodedFrame = <frameID, frame> // encapsulate each frame with its id
    end

Reduce Phase:
    Inputs: <groupID, encodedFrame-set>
    Outputs: <groupID, output-data>
    if Single-Frame-App
        for each frame in encodedFrame-set do
            output = proc-single-frame(frame) // a custom procedure processing a single frame
            output-data.add(output)
        end
    else
        encodedFrame-array = sort(encodedFrame-set) // restore the order of the frames in one group
        output-data = proc-episode(encodedFrame-array) // a procedure for processing an episode
    end
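The pseudo code above can be exercised with a minimal, runnable Python sketch. It is a single-process simulation under stated assumptions, not the Hadoop implementation: `get_episode` is assumed to assign fixed-length, non-overlapping groups of `EPISODE_LENGTH` frames (the pseudo code gives each frame exactly one groupID but does not define the grouping rule), frame IDs are assumed to start at 0, and the processing procedures passed in are placeholders standing in for the face and motion detection algorithms.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Sequence, Tuple

EPISODE_LENGTH = 3  # hypothetical number of frames per group


def get_episode(frame_id: int) -> int:
    """Assumed grouping rule: consecutive blocks of EPISODE_LENGTH frames."""
    return frame_id // EPISODE_LENGTH


def map_frame(frame_id: int, frame: Any, single_frame_app: bool) -> Tuple[int, Any]:
    """Map phase: emit (groupID, EncodedFrame) for one input frame."""
    if single_frame_app:
        return frame_id, frame
    # Keep the frame ID with the frame so the order can be restored later.
    return get_episode(frame_id), (frame_id, frame)


def reduce_group(group_id: int, encoded_frames: List[Any], single_frame_app: bool,
                 proc_single_frame: Optional[Callable], proc_episode: Optional[Callable]):
    """Reduce phase: process all encoded frames that share one groupID."""
    if single_frame_app:
        return group_id, [proc_single_frame(frame) for frame in encoded_frames]
    ordered = sorted(encoded_frames, key=lambda item: item[0])  # restore frame order
    return group_id, proc_episode(ordered)


def run_job(frames: Sequence[Any], single_frame_app: bool,
            proc_single_frame: Optional[Callable] = None,
            proc_episode: Optional[Callable] = None) -> Dict[int, Any]:
    """Run map, group by key, and reduce in a single process."""
    groups = defaultdict(list)
    for frame_id, frame in enumerate(frames):
        group_id, encoded = map_frame(frame_id, frame, single_frame_app)
        groups[group_id].append(encoded)
    return dict(
        reduce_group(g, groups[g], single_frame_app, proc_single_frame, proc_episode)
        for g in sorted(groups)
    )


# Placeholder procedures: upper-casing a "frame", and joining an episode's frames.
frames = ["a", "b", "c", "d"]
print(run_job(frames, True, proc_single_frame=str.upper))
print(run_job(frames, False, proc_episode=lambda ep: "".join(f for _, f in ep)))
```

In the single-frame case every frame forms its own group, whereas in the episode case the reducer first sorts the encapsulated frames by ID, mirroring the two branches of the pseudo code.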

In document Performance modelling and optimization for video-analytic algorithms in a cloud-like environment using machine learning (Page 82-86)