Robust Malware Detection for Internet of Things Using Deep Learning

(1)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 946

Robust Malware Detection for Internet of Things Using

Deep Learning

#1 M.Amit Kumar Goud, Dept of ECM, Sreenidhi Institute of Science & Technology, Ghatkesar, Rangareddy (D.T), T.S, India.

#2 V.BikshaSai, Dept of ECM, Sreenidhi Institute of Science & Technology, Ghatkesar, Rangareddy (D.T), T.S, India

#3 P.Venkat Vinay, Dept of ECM, Sreenidhi Institute of Science & Technology, Ghatkesar, Rangareddy (D.T), T.S, India

Abstract: -

Internet of Things (IoT) in

military settings typically consists of a various vary of net-connected devices and nodes. These IoT devices and nodes area unit a valuable target for cybercriminals, particularly state-sponsored or nation-state actors. a standard attack vector is that the use of malware. during this paper, we tend to gift a deep learning based mostly methodology to discover net Of pa rcel of land Things (IoBT) malware via the device’s Operational Code (OpCode) sequence apply an formula like Random Forest and call Tree learning approach to classify malicious and benign things. We tend to conjointly demonstrate

he strength of our projected approach in malware detection and its property against junk code insertion attacks. Lastly, we tend to build accessible our malware sample onKaaggle, that hopefully can profit futur e analysis efforts

Index Terms—Internet of Things

Malware, Internet Of Battlefield Dataset,

Deep Learning Algorithm, Machine

Learning

1 Introduction

With associate degree calculable markey

share of seventieth to eightieth, robust has

become the foremost fashionable.

Unsurprisingly, cyber-criminals have

followed, increasing their malicious activities to mobile platforms. Mobile threat researchers have recognized associate

degree direful increase of strong malware from 2012 to 2013 and estimate that the

quantity of detected

malicious web is within the vary of a hundred and twenty,000 to 718,000. To anciently notice malware from web on the market from offcial and third-party sources, several efforts have contributed to finding out the character of web of things platforms and their web within the past decade. The sturdy platform the permission system to limit web privileges to secure the sensitive resources of the users. The developer is accountable

for determinative befittingly that permissi ons associate

(2)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 947 te degree application has to get a user’s

approval of the requested permissions to access non-public or otherwise-restricted resources. though the permission system will defend users from web with invasive behaviors, its effectiveness extremely depends on a user’s comprehension of the implications of granting a permission. in

line with recent

studies, several users don't perceive what every permission suggests that and blindly grant them, doubtless permitting associate degree application to access sensitive/private data. Another laws that the user cannot attempt to grant single permissions, whereas denying others. Many users, though associate degree app would possibly request a

suspicious permission

among several ostensibly legi timate permissions, can still conﬁrm the installation. The sturdy security model relies in the main on permissions. As a result, the implementation of those permissions is of interest to North American nation. associate degree sturdy permission may be a restriction limiting access to a locality of the code or to information on the device. The limitation is obligatory to safeguard crucial information and

code that might be used to distort or injury a user’s expertise. Permissions also

are accustomed enable or prohibit applicat ion access to restricted arthropod genusand resources. for instance, the sturdy ‘INTERNET’ permission is needed by things to perform network communications; thus, gap a

network affiliation is restricted by the ‘INTERNET’

permission. moreover, associate

degree application should have the ‘READ CONTACTS’ permission so as to scan entries during a user’s telephone book similarly. to want a permission, the developer specifies them victimization the Manifest file in declaring a "" attribute. The "robust: name" field specifies the name of the permission within the code. A permission is related toone in every of the subsequent four protection levels:

Normal: A low-risk

permission that permits web to access API

calls (‘SET

WALLPAPER’) inflicting no hurt to users. • Dangerous:

A insecure permission that permits web to access potential harmful API calls(‘READ CONTACTS’) like leaky

non-public user information or management over web of things device.Dangerous permissions arexpressly shown to the user before associate degree app is put in and therefore the user should attempt to grant the permissions or not, determinative whether or not the installation continues or fails,respectively. • Signature: A permission that is granted if its requesting application is signed with an equivalent certificate because the application that defines the permission is signed. • Signature-or-system: A permission that is granted as long asits requesting application is within the same Robustsystem image or is signed with an equivalent certificate because

the application that deﬁnes the permission is signed.

(3)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 948 some entrant ar making malicious robust

application as tool to steal the sensitive information and fraud / fraud mobile bank, mobile wallets. There ar such a large amount of malicious net detection tools

and code are accessible. however associat e degree effectively and expeditiously malicious net detection tools required to tackle and handle new advanced malicious things created by entrant or hackers. during this paper we tend to came up with plan of mistreatment machine learning approaches for detection the malicious robust application. 1st we've got to collect dataset of past malicious things as coaching set and with the assistance of Support vector machine algorithmic

program and call tree algorithmic program conjure comparsion

with coaching dataset and trained dataset we are able to predict the malware robust things upto 93.2 % unknown / New malware mobile application.

EXISTING SYSTEM:

Traditionally Numerous malware detection tools have been developed, but some tools are may not able to detect newly created malware application and unknown malware application infected by various Trojan, worns, spyware Detecting of large number of malicious application over millions of robustapplication is still a challenging task using traditional way. In existing, Non machine learning way of detecting the malicious application based on characteristics,properties,behavioural.

Drawabacks: Identification of newly updated or created malicious application is hard to find out.Non Machine learning

approaches are not reliable and efficient In Existing approaches covers only 30 permissions out of 300 app permissions, due to this limited things permissions different types of attacks can occurs.

PROPOSED SYSTEM:

In proposed paper, we implements SIGPID, Significant Permission Identification (SIGPID).The goal of the sigid is to improve the things permissions effectively and efficiently. This SIGID system improves the accuracy and efficient detection of malware application. With help machine learning algorithms such as SVM and Decision Tree algorithms make a comparison between training dataset and trained dataset .Support vector machine algorithms act as a classifier which is used to classify malicious and benign.

Overcome: Improves the percentages of detection malicious application. Machine learning is better efficient than Non machine learning algorithm.Able to detect new malware robust Internet .We only need to consider 22 out of 135 permissions to improve the runtime performance by 85.6%.

(4)

Experimental Design

Our experimental design is made up of two main steps, which can be broken down further. These two portions, corresponding to the building of the decision tree and the development of the mobile application, can be seen below.

Three common data pre-processing steps are :

 Formatting: The data you have selected may not be in a format that is suitable for you to work with. The data may be in a relational database and you would like it in a flat file, or the data may be in a proprietary file format and you would like it in a relational database or a text file.

 Cleaning: Cleaning data is the removal or fixing of missing data. There may be data instances that are incomplete and do not carry the data you believe you need to address the problem. These instances may need to be removed. Additionally, there may be sensitive information in some of the attributes and these attributes may need to be anonymized or removed from the data entirely.

 Sampling: There may be far more selected data available than you need to work with. More data can result in much longer running times for algorithms and larger computational and memory requirements. You can take a smaller representative sample of the selected data that may be much faster for exploring and prototyping solutions before considering the whole dataset.

 Module

1. Permission

2. Combination of Permission 3. Feature Extraction

4. Classification

Permission

(5)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 950 SMS’, and ‘SEND SMS’. They found that

malicious things tend to request more permissions than benign ones. They found no strong correlation between Internet categories and requested permissions, and introduce a method to visualize permissions usage in different app categories. The aim of their work is to classify Robust Internet into several categories such as entertainment, society, tools, and productivity, multimedia and video, communication, puzzle and brain games. Mentions a method that analyses manifest files in Robust application by extracting four types of keyword lists:(1) Permission, (2) Intent filter (action), (3) Intent filter (category), and (4) Process name. This approach determines the malignancy score by classifying individually permissions as malicious or benign.

Combination of Permission

A high-level contextual analysis and an exploration of RobustInternet based on their implementation of permission-based security models by applying network visualization techniques and clustering algorithms. From that, they discovered new potentials in permission-based security models that may provide additional security to the users. This method on network classiﬁcation helps to deﬁne irregular permission combinations requested by abnormal implications of sensitive data on Robustdevices in enterprise settings. They characterized malicious things and the risks they pose to enterprises. Finally, they have proposed several approaches for dealing with security risks for enterprise. From the analysis of third-party Internet , Permission additions dominate the evolution of third-party things, of which Dangerous permissions tend to account for most of the changes. a method for detecting malware

based on three metrics, which evaluate: the occurrences of a speciﬁc subset of system calls, a weighted sum of a subset of permission that the application required, and a set of combinations of permissions.

Feature Extraction

A new method to detect malicious RobustInternet through machine learning techniques by analyzing the extracted permissions from the application itself. Features used to classify are the presence of tags uses-permission and uses-feature into the manifest as well as the number of permissions of each application. These features are the permission requested individually and the «uses-feature» tag.the possibility of detection malicious RobustInternet based on permissions and 20 features from Robustapplication packages.

Classification

(6)

Result

1. Using Random Forest Algorithm we are getting a predication graph.

I. Train and Test Data Splitting for random forest.

 This graph with 20 iteration. In this First graph we are taking only an 20 trees and to predict the train and test validation.

 In this graph blue line cross validation represent the our test data value and red line represents our train data value

 X axis our represent 350 datas in dataset

 Y axis represent mean value(rmse)

II. Final Prediction grap for Random Forest.

2. Using Decision Tree Algorithm we are getting a predication graph

I. Train and Test splitting for decision tree.

 In this First graph we are spliting dataset into train test and and features to predict the error validation.

 In this graph blue line cross validation represent the our test data value and red line represents our train data value

 X axis our represent 250 datas in dataset

 Y axis represent Incorrect data

II. Final Result for Decision Tree

Conclusion and Future Work

(7)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 952 update occurs. We believe that this is an

important step in preventing Robustmalware, because this application brings to the user’s attention all the possibly dangerous Internet , allowing them to scrutinize the Internet that they trust more carefully. This in turn will help users become more security-conscious overall.Even so, this is only a first step. Future work for this project will include increasing the accuracy of the classifier, migrating the Python portions of this project to Java, and integrating more advanced methods of detecting malicious behavior such as looking at API calls (this follows a "defense in depth" strategy). One benefit of the decision tree classifier is its speed. It can serve as a preliminary screen for more advanced but slower methods, to focus the Internet they will inspect. Lastly, taking into account application categories such as being a game or email-client would also help detect suspicious permissions and behaviors.

REFERENCES

[1] E. Bertino, K.-K. R. Choo, D. Georgakopolous, and S. Nepal,“Internet of things (iot): Smart and secure service delivery,” ACMTransactions on Internet Technology, vol. 16, no. 4, p. Article No. 22,2016.

[2] X. Li, J. Niu, S. Kumari, F. Wu, A. K. Sangaiah, and K.-K. R. Choo,“A three-factor anonymous authentication scheme for wireless sensor networks in internet of things environments,” Journal of Network and Computer Applications, 2017.

[3] J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, “Internet of things (iot): A vision, architectural elements, and future directions,” Future generation computer systems, vol. 29, no. 7, pp. 1645–1660, 2013. [4] F. Leu, C. Ko, I. You, K.-K. R. Choo, and C.-L. Ho, “A smartphonebased wearable sensors for monitoring real-time

physiological data,” Computers & Electrical Engineering, 2017.

[5] M. Roopaei, P. Rad, and K.-K. R. Choo, “Cloud of things in smart agriculture: Intelligent irrigation monitoring by thermal imaging,” IEEE Cloud Computing, vol. 4, no. 1, pp. 10–15, 2017.

[6] X. Li, J. Niu, S. Kumari, F. Wu, and K.-K. R. Choo, “A robust biometrics based three-factor authentication scheme for global mobility networks in smart city,” Future Generation Computer Systems, 2017.

[7] L. Atzori, A. Iera, and G. Morabito, “The internet of things: A survey,” Computer networks, vol. 54, no. 15, pp. 2787–2805, 2010.

[8] D. Miorandi, S. Sicari, F. De Pellegrini, and I. Chlamtac, “Internet

of things: Vision, applications and research challenges,” Ad Hoc

Networks, vol. 10, no. 7, pp. 1497–1516, 2012.

[9] A. Kott, A. Swami, and B. J. West, “The internet of battle things,”

Computer, vol. 49, no. 12, pp. 70–75, 2016. [10] C. Tankard, “The security issues of the internet of things,” Computer

Fraud & Security, vol. 2015, no. 9, pp. 11 – 14, 2015.

[11] C. J. DOrazio, K. K. R. Choo, and L. T. Yang, “Data exfiltration

from internet of things devices: ios devices as case studies,” IEEE

Internet of Things Journal, vol. 4, no. 2, pp. 524–535, April 2017.

[12] S. Watson and A. Dehghantanha, “Digital forensics: the missing

piece of the internet of things promise,” Computer Fraud & Security,

vol. 2016, no. 6, pp. 5–8, 2016.

(8)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 953 of things security and forensics: Challenges

and opportunities,”

Future Generation Computer Systems, vol. 78, no. Part 2, pp. 544 –

546, 2018.

[14] E. Bertino and N. Islam, “Botnets and internet of things security,”

Computer, vol. 50, no. 2, pp. 76–79, Feb 2017.

[15] J. Gardiner and S. Nagaraja, “On the security of machine learning

in malware c&c detection: A survey,” ACM Computing Surveys,

vol. 49, no. 3, p. Article No. 59, 2016. [16] J. Peng, K.-K. R. Choo, and H. Ashman, “User profiling in intrusion

detection: A review,” Journal of Network and Computer

Applications, vol. 72, pp. 14–27, 2016. [17] E. M. Rudd, A. Rozsa, M. Gnther, and T. E. Boult, “A survey of

stealth malware attacks, mitigation measures, and steps toward

autonomous open world solutions,” IEEE Communications Surveys

& Tutorials, vol. 19, no. 2, pp. 1145–1172, 2016.

[18] S. Iqbal, M. L. M. Kiah, B. Dhaghighi, M. Hussain, S. Khan, M. K.

Khan, and K.-K. R. Choo, “On cloud security attacks: A taxonomy

and intrusion detection and prevention as a service,” Journal of

Network and Computer Applications, vol. 77, pp. 98–120, 2016.

[19] Y. Ye, T. Li, D. Adjeroh, and S. S. Iyengar, “A survey on malware

detection using data mining techniques,” ACM Computing Surveys,

vol. 50, no. 3, p. Article No. 41, 2017. [20] Z. K. Zhang, M. C. Y. Cho, C. W. Wang, C. W. Hsu, C. K.

Chen, and S. Shieh, “Iot security: Ongoing challenges and research

opportunities,” in 2014 IEEE 7th International Conference on Service-

Oriented Computing and Applications, Nov 2014, pp. 230–234.

[21] K. Gai, M. Qiu, L. Tao, and Y. Zhu, “Intrusion detection techniques

for mobile cloud computing in heterogeneous 5g,” Sec. and Commun.

Netw., vol. 9, no. 16, pp. 3049–3058, Nov. 2016.

[22] P. Faruki, A. Bharmal, V. Laxmi, V. Ganmoor, M. S. Gaur, M. Conti,

and M. Rajarajan, “Android security: A survey of issues, malware

penetration, and defenses,” IEEE Communications Surveys & Tutorials, vol. 17, no. 2, pp. 998–1022, Secondquarter 2015.

[23] Z. Fadlullah, F. Tang, B. Mao, N. Kato, O. Akashi, T. Inoue, and

K. Mizutani, “State-of-the-art deep learning: Evolving machine

intelligence toward tomorrows intelligent network traffic control

systems,” IEEE Communications Surveys & Tutorials, 2017.

[24] N. Milosevic, A. Dehghantanha, and K.-K. R. Choo, “Machine

learning aided android malware classification,” Computers & Electrical Engineering, 2017.

[25] R. Kohavi et al., “A study of cross-validation and bootstrap for

accuracy estimation and model selection,” in Ijcai, vol. 14(2).

Stanford, CA, 1995, pp. 1137–1145.

[26] Y. Bengio and Y. Grandvalet, “No unbiased estimator of the

(9)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 954 research, vol. 5, no. Sep, pp. 1089–1105,

2004.

[27] Z. Yuan, Y. Lu, and Y. Xue, “Droiddetector: android malware characterization

and detection using deep learning,” Tsinghua Science

and Technology, vol. 21, no. 1, pp. 114–123, Feb 2016.

[28] J. Saxe and K. Berlin, “Deep neural network based malware

detection using two dimensional binary program features,” in Malicious

and Unwanted Software (MALWARE), 2015 10th International

Conference on, 2015, pp. 11–20.

[29] D. Bilar, “Opcodes as predictor for malware,” Int. J. Electron. Secur.

Digit. Forensic, vol. 1, no. 2, pp. 156–168, Jan. 2007.

[30] R. Moskovitch, C. Feher, N. Tzachar, E. Berger, M. Gitelman,

S. Dolev, and Y. Elovici, “Unknown malcode detection using

opcode representation,” Intelligence and Security Informatics, pp.

204–215, 2008.

[31] I. Santos, F. Brezo, X. Ugarte-Pedrero, and P. G. Bringas, “Opcode

sequences as representation of executables for data-mining-based

2377-3782 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/ publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TSUSC.2018.2809665, IEEE Transactions on Sustainable Computing

8

unknown malware detection,” Information Sciences, vol. 231, pp.

64–82, 2013.

[32] H. Hashemi, A. Azmoodeh, A. Hamzeh, and S. Hashemi, “Graph

embedding as a new approach for unknown malware detection,”

Journal of Computer Virology and Hacking Techniques, 2016.

[33] M. Siddiqui, M. C. Wang, and J. Lee, “Data mining methods for

malware detection using instruction sequences,” in Proceedings of

the 26th IASTED International Conference on Artificial Intelligence and

Applications, ser. AIA ’08. Anaheim, CA, USA: ACTA Press, 2008,

pp. 358–363.

[34] Y. Tan, Class-Wise Information Gain. John Wiley & Sons, Inc., 2016,

ch. 11, pp. 150–172.

[35] K. Shaerpour, A. Dehghantanha, and R. Mahmod, “Trends in android

malware detection,” The Journal of Digital Forensics, Security

and Law: JDFSL, vol. 8, no. 3, p. 21, 2013. [36] N. Idika and A. P. Mathur, “A survey of malware detection

techniques,” Purdue University, vol. 48, 2007.

[37] P. Vinod, R. Jaipur, V. Laxmi, and M. Gaur, “Survey on malware

detection methods,” in Proceedings of the 3rd Hackers Workshop on

computer and internet security (IITKHACK09), 2009, pp. 74–79.

[38] Z. Bazrafshan, H. Hashemi, S. M. H. Fard, and A. Hamzeh, “A

survey on heuristic malware detection techniques,” in Information

(10)

Available online:https://edupediapublications.org/journals/index.php/IJR/ P a g e | 955 pp. 113–120.

[39] O. E. David and N. S. Netanyahu, “Deepsign: Deep learning

for automatic malware signature generation and classification,”

in Neural Networks (IJCNN), 2015 International Joint Conference on.

IEEE, 2015, pp. 1–8.

[40] R. Pascanu, J. W. Stokes, H. Sanossian, M. Marinescu, and

A. Thomas, “Malware classification with recurrent networks,”

in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE

International Conference on. IEEE, 2015, pp. 1916–1920.

[41] J. Demme, M. Maycock, J. Schmitz, A. Tang, A.Waksman, S. Sethumadhavan, and S. Stolfo, “On the feasibility of online malware detection

with performance counters,” in ACM SIGARCH Computer

Architecture News, vol. 41, no. 3. ACM, 2013, pp. 559–570.

[42] M. S. Alam and S. T. Vuong, “Random forest classification for

detecting android malware,” in 2013 IEEE International Conference

on Green Computing and Communications and IEEE Internet of Things

and IEEE Cyber, Physical and Social Computing, Aug 2013, pp. 663–

669.

[43] A. Azmoodeh, A. Dehghantanha, M. Conti, and K.-K. R. Choo,

“Detecting crypto-ransomware in iot networks based on energy

consumption footprint,” Journal of Ambient Intelligence and Humanized

Computing, 2017.

[44] H. H. Pajouh, R. Javidan, R. Khayami, D. Ali, and K. K. R.