Object Detection and Classification for Self-Driving Cars


ISSN: 2395-1303 http://www.ijetjournal.org Page 179


Ruchita Sheri, Niraj Jadhav, Rahul Ravi, Ankita Shikhare, Sanjeev Sannakki

Department of Computer Science, KLS Gogte Institute of Technology, Belagavi

ABSTRACT:

With advances in image processing, object detection has become a topic of great interest owing to its wide range of real-time applications. Over the past 10 years, Advanced Driver Assistance Systems (ADAS) have grown rapidly. Recently, not only luxury cars but also some entry-level cars have been equipped with ADAS applications such as the Automated Emergency Braking System (AEBS). ADAS assists drivers by providing advice and warnings when necessary. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these visual recognition tasks and systems. This application performs multiple-object detection and classification in a given video using the Open Source Computer Vision (OpenCV) library. The application also uses the MobileNet architecture, the SSD (Single Shot Detector) framework, and a Caffe (Convolutional Architecture for Fast Feature Embedding) model to obtain the predictions. The system serves as one of the features of an ADAS for collision avoidance by detecting and classifying objects such as vehicles and pedestrians.

Keywords: ADAS, OpenCV, CNN, MobileNet, SSD, Caffe.

I. INTRODUCTION

Driverless cars were once just the stuff of science fiction. But in recent years, they’ve become a reality and they’re now hitting the streets in a number of U.S. cities. Companies like Uber, Google, and Ford recently started testing hundreds of self-driving vehicles on public roads.

Supporters of driverless cars say the vehicles will make roads safer by cutting down on the number of crashes caused by distracted driving or other human errors.

In recent years, deep Convolutional Networks (ConvNets) have become the most popular architecture for large-scale image recognition tasks. The field of computer vision has been pushed towards a fast, scalable, end-to-end learning framework that provides outstanding performance on object recognition, object detection, scene recognition, semantic segmentation, action recognition, object tracking, and many other tasks. With the explosion of computer vision research, the Advanced Driver Assistance System (ADAS) has also become a mainstream technology in the automotive industry. Autonomous vehicles, such as Google’s self-driving cars, are evolving and becoming a reality. A key component is vision-based machine intelligence that can provide information to the control system or the driver to manoeuvre a vehicle properly based on the surroundings and road conditions. Many research works have been reported on traffic sign recognition, lane departure warning, pedestrian detection, and related tasks. This application is confined to detecting objects on the road using the MobileNet architecture and the Single Shot Detector (SSD) framework. Compared with region-based detectors such as R-CNN, SSD is considerably faster, and it offers a good trade-off between speed and accuracy. The application uses a Caffe model to obtain the predictions.

RESEARCH ARTICLE OPEN ACCESS


II. RELATED WORK

An integrated real-time approach was developed for detecting objects in images captured by self-driving vehicles. Object detection is modelled as a regression problem on the predicted bounding boxes and their class probabilities. A unified neural network is applied to the whole image and predicts the bounding boxes and class probabilities at the same time. [1]

An improved frame-difference method was introduced, which can shorten the running time and improve the accuracy of object detection. The experimental results show that, after adding the improved frame-difference method, the detection speed increases by a factor of 21.06 and the image detection accuracy improves by about 8%. The algorithm is robust and can be adapted to different scenes, both indoor and outdoor. [2]

Another method uses Haar-like features of the images and an AdaBoost classifier for detection, which provides a very fast detection rate. To predict the class of a vehicle, a feature-based method is proposed. HOG, SIFT, and SURF are all well-established features for image classification. [3]

Work has also been carried out using the Region Proposal Network (RPN), a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN was trained end-to-end to generate high-quality region proposals, which were used by Fast R-CNN for detection. Furthermore, RPN and Fast R-CNN were merged into a single network by sharing their convolutional features. The accuracy of this approach is high, but its computation is slow. [4]

III. PROPOSED SYSTEM

The system takes an input video, detects the objects in it, and writes the data of the detected objects to a text file. To balance accuracy and speed, the system uses the Single Shot Detector (SSD) framework together with the MobileNet architecture and a Caffe model.

Figure 1 shows the system design; its implementation is explained in the following sections.

Fig. 1 Block diagram

IV. IMPLEMENTATION

The implementation is divided into three modules:

A) Camera Module
B) Processing Module
C) Data Module


A) Camera Module:

OpenCV is a software library for real-time image and video processing that also provides analytics and machine learning capabilities. The camera module takes an input video, which is opened using the OpenCV class VideoCapture. The read() method of the resulting video object is then called repeatedly to retrieve the video frame by frame.
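The frame-reading loop of the camera module can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper names iter_frames and open_video are hypothetical, and the sketch assumes only that the capture object exposes read() and release() in the way OpenCV's VideoCapture does.

```python
from typing import Any, Iterator, Protocol, Tuple


class FrameSource(Protocol):
    """Anything exposing VideoCapture-style read() and release() methods."""

    def read(self) -> Tuple[bool, Any]: ...

    def release(self) -> None: ...


def iter_frames(capture: FrameSource) -> Iterator[Any]:
    """Yield frames one at a time until the source reports that none are left."""
    try:
        while True:
            ok, frame = capture.read()  # ok is False once the stream ends
            if not ok:
                break
            yield frame
    finally:
        capture.release()  # always free the underlying video handle


def open_video(path: str) -> FrameSource:
    """Open a video file with OpenCV (imported lazily; needs opencv-python)."""
    import cv2

    capture = cv2.VideoCapture(path)
    if not capture.isOpened():
        raise IOError(f"could not open video: {path}")
    return capture
```

A typical call would be iter_frames(open_video("input.mp4")), where the file name is a placeholder. Keeping the loop independent of OpenCV makes it easy to substitute a live camera or a test double for the video file.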

B) Processing Module:

The system uses the MobileNet architecture, which is a class of Convolutional Neural Networks (CNNs). MobileNets are based on a streamlined architecture that uses depth-wise separable convolutions, each consisting of a depth-wise convolution followed by a point-wise (1×1) convolution, to build lightweight deep neural networks. This factorization substantially reduces the computation and the number of parameters compared with standard convolutions, which makes the network fast.
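The saving from depth-wise separable convolutions can be made concrete by counting multiply-accumulate operations. The sketch below assumes stride 1 and padding that preserves the feature map size; the function names and the example layer dimensions are illustrative and are not taken from the system described here.

```python
def standard_conv_cost(kernel: int, in_ch: int, out_ch: int, fmap: int) -> int:
    """Multiply-accumulates of a standard convolution on a fmap x fmap map."""
    return kernel * kernel * in_ch * out_ch * fmap * fmap


def depthwise_separable_cost(kernel: int, in_ch: int, out_ch: int, fmap: int) -> int:
    """Multiply-accumulates of a depth-wise plus a point-wise (1x1) convolution."""
    depthwise = kernel * kernel * in_ch * fmap * fmap  # one filter per input channel
    pointwise = in_ch * out_ch * fmap * fmap           # 1x1 conv mixes the channels
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
standard = standard_conv_cost(3, 512, 512, 14)
separable = depthwise_separable_cost(3, 512, 512, 14)
ratio = separable / standard  # equals 1/out_ch + 1/kernel**2
print(f"standard={standard}, separable={separable}, ratio={ratio:.3f}")
```

For a 3×3 kernel the ratio is close to 1/9, so the separable form needs roughly 8 to 9 times fewer operations than the standard convolution.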

The Single Shot Detector (SSD) is used together with the MobileNet architecture for object detection and classification. It runs a convolutional network on the input image only once and computes feature maps. It then runs a small 3×3 convolutional kernel over these feature maps to predict the bounding boxes and class probabilities. Because the feature maps produced by different convolutional layers have different scales, SSD can detect objects of various sizes. SSD achieves a good balance between speed and accuracy.
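To illustrate how many predictions a single pass produces, the sketch below counts the default boxes and the output channels of the 3×3 predictor. The layer configuration is the one reported for the original SSD300 model with a VGG-16 base and is used here only as an assumption; the MobileNet-based model of this system may use different feature map sizes. The function names are hypothetical.

```python
from typing import Iterable, Tuple


def predictor_channels(num_classes: int, boxes_per_cell: int) -> int:
    """Channels of the 3x3 predictor: per default box, class scores plus 4 offsets."""
    return boxes_per_cell * (num_classes + 4)


def total_default_boxes(layers: Iterable[Tuple[int, int]]) -> int:
    """Sum of boxes over all (feature map size, boxes per cell) pairs."""
    return sum(size * size * boxes for size, boxes in layers)


# Assumed configuration: feature map side length and default boxes per cell
# for the six prediction layers of the original SSD300.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

print(total_default_boxes(SSD300_LAYERS))
print(predictor_channels(num_classes=21, boxes_per_cell=6))
```

The large, fine-grained maps contribute most of the boxes and cover small objects, while the small, coarse maps cover large objects.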

Caffe (Convolutional Architecture for Fast Feature Embedding) is the deep learning framework used to train our model. Caffe supports many different types of deep learning architectures geared towards image classification and image segmentation. It supports CNN, R-CNN, and fully connected neural network designs.
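One way to obtain predictions from a Caffe model inside an OpenCV application is OpenCV's dnn module. The sketch below is an assumption about how this step could look, not the authors' implementation: the 300×300 input size and the scale and mean values follow a commonly distributed MobileNet-SSD Caffe model, and the names Detection, decode_detections, and detect are hypothetical.

```python
from typing import List, NamedTuple, Sequence, Tuple


class Detection(NamedTuple):
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def decode_detections(rows, class_names: Sequence[str], frame_w: int,
                      frame_h: int, min_confidence: float = 0.5) -> List[Detection]:
    """Convert rows [image_id, class_id, confidence, x1, y1, x2, y2] to pixel boxes.

    The box coordinates in each row are assumed to be normalized to [0, 1].
    """
    results = []
    for row in rows:
        confidence = float(row[2])
        if confidence < min_confidence:
            continue  # drop weak predictions
        x1 = round(float(row[3]) * frame_w)
        y1 = round(float(row[4]) * frame_h)
        x2 = round(float(row[5]) * frame_w)
        y2 = round(float(row[6]) * frame_h)
        results.append(Detection(class_names[int(row[1])], confidence, (x1, y1, x2, y2)))
    return results


def detect(net, frame, class_names: Sequence[str],
           min_confidence: float = 0.5) -> List[Detection]:
    """Run one forward pass; net is assumed to come from cv2.dnn.readNetFromCaffe."""
    import cv2  # imported lazily; needs opencv-python

    height, width = frame.shape[:2]
    # Assumed preprocessing: resize to 300x300, subtract 127.5, scale by 1/127.5.
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    output = net.forward()  # expected shape (1, 1, N, 7)
    return decode_detections(output[0, 0], class_names, width, height, min_confidence)
```

The decoding step is kept separate from the forward pass so that the confidence threshold can be tuned and checked without loading a model.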

C) Data Module:

In this module the system writes the data of the detected objects to a text file. The data includes the class of each object, the probability of the detection, and the coordinates of its bounding box.

The ADAS can use this data to make further decisions.
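A minimal sketch of the data module is given below. The line layout (class, probability, then the four box coordinates separated by spaces) is an assumption made for illustration, since the exact file format is not specified here, and the function names are hypothetical.

```python
from typing import Iterable, TextIO, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def format_detection(label: str, confidence: float, box: Box) -> str:
    """Render one detection as a single space-separated line (assumed layout)."""
    x1, y1, x2, y2 = box
    return f"{label} {confidence:.2f} {x1} {y1} {x2} {y2}"


def write_detections(out: TextIO, detections: Iterable[Tuple[str, float, Box]]) -> None:
    """Write one line per detected object to an already opened text stream."""
    for label, confidence, box in detections:
        out.write(format_detection(label, confidence, box) + "\n")
```

In use, the stream would be a file opened in append mode, for example open("detections.txt", "a"), where the file name is a placeholder. Writing one object per line keeps the file easy to parse by a downstream decision module.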

V. RESULTS AND DISCUSSION

The working of the project is shown in the following snapshots.

Fig. 2(a) Object detection

Fig. 2(b) Object detection

Fig. 2(c) Object detection


Fig. 3 shows a snapshot of the text file to which the data of the detected objects is written.

Fig. 3 Data sent to the text file

CONCLUSION

Trust in autonomous technology is the key to a driverless future. A vision-based object detection system for on-road obstacles is realized using the Single Shot Detector (SSD) framework and the MobileNet architecture. The proposed system categorizes moving objects such as pedestrians, cars, and motorbikes into their respective classes and locates them by drawing bounding boxes around them. The resulting bounding boxes and classes of the detected objects are useful for the subsequent motion planning and control subsystems of self-driving cars.

ACKNOWLEDGMENT

We owe heartfelt thanks to Visvesvaraya Technological University for supporting the project and providing the necessary resources.

REFERENCES

[1] Seyyed Hamed Naghavi, Cyrus Avaznia, Hamed Talebi, “Integrated real-time object detection for self-driving vehicles”, 10th Iranian Conference on Machine Vision and Image Processing (MVIP), IEEE, Nov. 2017.

[2] Mingzhu Zhu, Hongbo Wang, “Fast detection of moving object based on improved frame-difference method”, 6th International Conference on Computer Science and Network Technology (ICCSNT), IEEE, Oct. 2017.

[3] Md. Shamim Reza Sajib, Saifuddin Md. Tareeq, “A feature based method for real time vehicle detection and classification from on-road videos”, 20th International Conference on Computer and Information Technology (ICCIT), IEEE, June 2016.

[4] Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun, “Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, pp. 1137-1149, June 2016.

[5] David Geronimo, Antonio M. Lopez, Angel D. Sappa, Thorsten Graf, “Survey of Pedestrian Detection for Advanced Driver Assistance Systems”, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, pp. 1239-1258, May 2009.

[6] J. Cezar Silveira Jacques, Claudio Rosito Jung, Soraia Raupp Musse, “Background subtraction and shadow detection in grayscale video sequences”, 18th Brazilian Symposium on Computer Graphics and Image Processing, pp. 189–196, IEEE, 2005.

[7] Shahbe Mat Desa, Qussay A. Salih, “Image subtraction for real time moving object extraction”, International Conference on Computer Graphics, Imaging and Visualization (CGIV), pp. 41–45, IEEE, 2004.

[8] Robert T. Collins, Alan Lipton, Takeo Kanade, Hironobu Fujiyoshi, David Duggins, Yanghai Tsin, David Tolliver, Nobuyoshi Enomoto, Osamu Hasegawa, Peter Burt, “A system for video surveillance and monitoring”, vol. 102, 2000.

[9] Chris Stauffer, W. Eric L. Grimson, “Adaptive background mixture models for real-time tracking”, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, IEEE, 1999.

[10] Alan J Lipton, Hironobu Fujiyoshi, and Raju S Patil, “Moving target classification and tracking from real-time video”, Fourth IEEE Workshop on Applications of Computer Vision, pp. 8–14, IEEE, 1998.


Ms. Ruchita Sheri is currently pursuing her Bachelor’s degree in Computer Science and Engineering at KLS Gogte Institute of Technology. She is good at academics and other curricular activities. She is interested in machine learning and upcoming technologies.

Mr. Niraj Jadhav is currently pursuing his Bachelor’s degree in Computer Science and Engineering at KLS Gogte Institute of Technology. He is good at academics, looks forward to learning new things, and has good programming and analytical skills. He is interested in photography and plays football and volleyball.

Mr. Rahul Ravi is currently pursuing his Bachelor’s degree in Computer Science and Engineering at KLS Gogte Institute of Technology. He spends his time reading and learning about cutting-edge technologies.

Ms. Ankita Shikhare is currently pursuing her Bachelor’s degree in Computer Science and Engineering at KLS Gogte Institute of Technology. She has good programming and logical skills and is interested in exploring new and trending technologies.

Dr. Sanjeev S. Sannakki is currently working as a Professor in the Department of Computer Science & Engineering at KLS Gogte Institute of Technology, Belgaum. He obtained his Master’s degree and Doctor of Philosophy from Visvesvaraya Technological University. He has a total of eighteen years of teaching experience and vast research experience. He has guided many projects at the undergraduate, postgraduate, and Ph.D. levels.