2017 International Conference on Electronic and Information Technology (ICEIT 2017) ISBN: 978-1-60595-526-1
Permissions Based Android Malware Stealing Privacy Data Detection
Zhe-ling ZENG 1,*, Yi-Tao NI 1 and Bo-gang LIN 1
1 FuZhou University, Fujian Fuzhou, China.
*Corresponding author
Keywords: Permission mechanism; Malware; Privacy data
Abstract. A large number of malware samples in the Android application market steal user privacy data, which poses a huge threat to users. To alleviate this threat, this paper proposes a scheme based on privilege-feature detection. First, we filter out, from the built-in permissions of Android, the sensitive privilege-features that are used to carry out attacks. Second, based on the sensitive privilege-features, we design a scoring formula that can be used to determine whether an application is malicious. Last, we implement a prototype tool based on this method. Using this tool to detect malicious applications, the experimental results show that this method is more effective than general antivirus engines.
Introduction
According to the 2016 "China Internet Development Statistics Report", as of June 2016 the number of Chinese mobile Internet users had reached 665 million, of which Android mobile phone users accounted for 67.8% [1]. Because of Android's market share and the open-source nature of the Android operating system, the privacy data of Android users has become a target of malware [10] [11]. According to the report, a total of 14.33 million new Android malware samples were intercepted in 2016, and malware that steals privacy data accounts for 6.1% of the total. The report also counts the user privacy data involved over the year, including contacts, text messages, call records and photos, which number 1.44 billion, 7.18 billion, 5.15 billion and 252.0 million respectively. To alleviate this threat, we have done three tasks:
(1) We manually analyze malware that steals user privacy data and sort out a set of sensitive permissions and sensitive permission combinations.
(2) Based on this database of sensitive permissions, we design a lightweight detection method that can detect malware that steals user privacy data.
(3) We implement a tool according to the designed detection method and use it for experiments. The experimental results show that the tool can detect malware more effectively than general antivirus engines.
Detection Method
To determine whether an application is malware, we summarized the permissions of malware to obtain a sensitive feature database and designed a malware detection method based on that database.
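Collecting the permissions an application requests is the input to everything that follows. The sketch below is one possible way to do it, assuming the manifest has already been decoded to plain XML text (inside an APK it is stored in a binary format). The function name `requested_permissions` and the sample manifest are hypothetical and are not taken from the paper's tool.

```python
import xml.etree.ElementTree as ET

# XML namespace used for android:* attributes in AndroidManifest.xml
ANDROID_NS = "http://schemas.android.com/apk/res/android"


def requested_permissions(manifest_xml: str) -> set[str]:
    """Return the permission names declared via <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    name_attr = f"{{{ANDROID_NS}}}name"
    return {
        elem.attrib[name_attr]
        for elem in root.iter("uses-permission")
        if name_attr in elem.attrib
    }


# Hypothetical decoded manifest, for illustration only
SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.hypothetical">
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <uses-permission android:name="android.permission.INTERNET" />
    <uses-permission android:name="android.permission.RECORD_AUDIO" />
</manifest>
"""

perms = requested_permissions(SAMPLE_MANIFEST)
print(sorted(perms))
```

The result is a plain set of permission strings, which is a convenient representation for the set operations used in the feature database.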
Privacy Feature Database
First, according to the permissions listed on the Google Developer website, we selected 22 permissions related to users' privacy data from the 134 built-in permissions of the Android system [2-4].
Then, we manually tested 46 malicious applications and recorded the permissions these malware samples need to request. We define:
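$$
\begin{aligned}
MA &= \{\, ap \mid ap \in \text{malware} \,\} \\
c &= \{\, e \mid e \in \text{22 permissions} \,\} \\
ap \to e &: \text{permission } e \text{ in application } ap \\
P(c) &= \{\, a \mid a \subseteq c \,\} \quad (P: \text{power set}) \\
A &= \{\, e \mid e \in P(c) \wedge |e| \equiv 1 \wedge \exists\, ap \in MA,\ ap \to e \,\} \\
B &= \{\, e \mid e \in P(c) \wedge |e| \geq 2 \wedge \exists\, ap \in MA,\ ap \to e \,\}
\end{aligned}
$$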
Testing the permissions of the 46 malware samples [10] shows that some malware that steals privacy data needs to request only one permission, while other malware that accesses hardware to obtain real-time data needs to request two permissions. The sensitivity levels of these two ways of stealing privacy data are different, so they need to be analyzed separately. In our database, |A| = 11 and |B| = 13.
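The construction of the sets A and B can be sketched as follows. This is a mechanical enumeration over hypothetical data: the paper's A and B come from manual analysis of which permissions each sample actually uses for its attack, so the real sets are smaller than what a blind enumeration would produce. The function name, the sample names and the choice to stop at pairs are assumptions for illustration.

```python
from itertools import combinations


def build_feature_sets(malware_perms, sensitive, max_size=2):
    """Collect single sensitive permissions (A) and combinations (B).

    malware_perms: dict mapping a sample name to its set of requested permissions
    sensitive:     the set c of privacy-related permissions
    max_size:      largest combination size to enumerate (assumed: pairs)
    """
    A, B = set(), set()
    for perms in malware_perms.values():
        hit = sorted(perms & sensitive)  # sensitive permissions this sample requests
        A.update(frozenset([p]) for p in hit)
        for size in range(2, max_size + 1):
            B.update(frozenset(combo) for combo in combinations(hit, size))
    return A, B


# Hypothetical inputs, not the 22 permissions or 46 samples of the paper
SENSITIVE = {"READ_SMS", "RECORD_AUDIO", "WRITE_EXTERNAL_STORAGE"}
MALWARE = {
    "sample_1": {"READ_SMS", "INTERNET"},
    "sample_2": {"RECORD_AUDIO", "WRITE_EXTERNAL_STORAGE", "INTERNET"},
}

A, B = build_feature_sets(MALWARE, SENSITIVE)
print(len(A), len(B))
```

Using `frozenset` for every element keeps A and B uniform (both are subsets of the power set P(c)) and makes later subset tests against an application's permissions straightforward.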
Score Formula
Based on the sensitive feature database, we designed a scoring formula to determine whether an application is malicious.
Design Scoring Formula
Define:
$$
\begin{aligned}
& V: [0,1]; \quad f_A: A \to V; \quad f_B: B \to V; \\
& \forall a \in A\ \exists f_A(a) \in V; \quad \forall b \in B\ \exists f_B(b) \in V
\end{aligned}
$$
Then:
$$
rank(App) = m_A \times \sum_{a \in P(D) \cap A} r_A(a) + m_B \times \sum_{b \in P(D) \cap B} r_B(b) \tag{1}
$$
Here m_A and m_B are constants representing the weights of the individual permissions and of the permission combinations of an application, respectively, and m_A + m_B ≡ 1.
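A minimal sketch of formula (1) follows. It assumes that D is the set of permissions declared by the application, so that P(D) ∩ A and P(D) ∩ B are the database entries fully contained in D. The per-entry scores and the weights used here are hypothetical placeholders, not values from the paper.

```python
def rank(app_perms, r_A, r_B, m_A, m_B):
    """Score an application following formula (1).

    app_perms: set D of permissions declared by the application (assumed meaning of D)
    r_A, r_B:  dicts mapping frozenset entries of A and B to scores in [0, 1]
    m_A, m_B:  weights with m_A + m_B == 1
    """
    # An entry e belongs to P(D) exactly when e is a subset of D
    sum_a = sum(score for entry, score in r_A.items() if entry <= app_perms)
    sum_b = sum(score for entry, score in r_B.items() if entry <= app_perms)
    return m_A * sum_a + m_B * sum_b


# Hypothetical scores and weights, for illustration only
R_A = {
    frozenset({"READ_SMS"}): 0.5,
    frozenset({"READ_CONTACTS"}): 0.25,
}
R_B = {
    frozenset({"RECORD_AUDIO", "WRITE_EXTERNAL_STORAGE"}): 0.75,
}

app = {"READ_SMS", "RECORD_AUDIO", "WRITE_EXTERNAL_STORAGE", "INTERNET"}
app_score = rank(app, R_A, R_B, m_A=0.5, m_B=0.5)
print(app_score)  # 0.5 * 0.5 + 0.5 * 0.75 = 0.625
```

Permissions outside the database, such as INTERNET in this example, contribute nothing to the score.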
Training Scoring Formula
To determine the constants m_A and m_B, this section trains them with the gradient descent method.
$$
m_t := m_t + \alpha \frac{1}{n} \sum_{k=1}^{n} \left( y^{(k)} - score^{(k)} \right) \left( \sum_{a \in P(D) \cap A} r_A(a) - \sum_{b \in P(D) \cap B} r_B(b) \right) \tag{2}
$$
In the formula, y represents the actual score of the training sample, score represents the value calculated using the scoring formula, k denotes the k-th training sample, and α represents the step length.
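The training step can be sketched as below. The sketch assumes that only m_A is updated and that m_B is recovered from the constraint m_A + m_B = 1, in which case the derivative of the score with respect to m_A is the difference between the two sums. The training samples, step length and iteration count are hypothetical.

```python
def train_weights(samples, alpha=0.5, steps=200, m_A=0.5):
    """Fit m_A (and m_B = 1 - m_A) by batch gradient descent.

    samples: list of (sum_a, sum_b, y) tuples, where sum_a and sum_b are the
             two sums of the scoring formula for one application and y is
             its actual score
    """
    n = len(samples)
    for _ in range(steps):
        step = 0.0
        for sum_a, sum_b, y in samples:
            score = m_A * sum_a + (1.0 - m_A) * sum_b
            # d(score)/d(m_A) = sum_a - sum_b under the constraint m_B = 1 - m_A
            step += (y - score) * (sum_a - sum_b)
        m_A += alpha * step / n
    return m_A, 1.0 - m_A


# Hypothetical training data generated with m_A = 0.3, m_B = 0.7
TRAINING = [
    (1.0, 0.0, 0.3),
    (0.0, 1.0, 0.7),
    (0.5, 0.25, 0.325),
]

m_A, m_B = train_weights(TRAINING)
print(round(m_A, 3), round(m_B, 3))
```

Because the score is linear in m_A, the squared-error objective is a simple quadratic and the iteration converges for a sufficiently small step length.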
Experiment and Result Analysis
Experiment Preparation
The training set, the test set and the constants of the scoring formula need to be prepared before the experiment.
We extracted applications from VirusShare and the app store to build the training set and the test set. In total, 1660 applications were collected and checked on VirusTotal. For the training set, 500 applications were selected from the 830 applications that triggered warnings from VirusTotal antivirus engines, and another 500 applications were selected according to the number of VirusTotal antivirus engines raising warnings. The remaining 660 applications are used as the test set.
The training set was used as training samples to train the constants m_A and m_B with the method described in the Training Scoring Formula section. The trained values are m_A = 0.366 and m_B = 0.634, so the final form of the formula is as follows:
$$
score = 0.366 \times \left( \frac{\sum_i V_i^A}{S_A} \right) + 0.634 \times \left( \frac{\sum_j V_j^B}{S_B} \right) \tag{3}
$$
Analysis of Experimental Results
In the experiment, the tool reported 417 applications from the test set as malware. Of these, 274 were also identified as malicious by VirusTotal and form the set k1; 112 are third-party applications that were not detected by VirusTotal and form the set k2; the remaining 31 are general applications.
First, we analyze the relationship between the scores of the applications in k1 and the number of VirusTotal antivirus engines that flag them, as shown in Table 1.
Table 1. Result of Detecting Malware.
Score / VirusTotal Numbers    1-2    3-9    10-14    15-20    21-35
0.100-0.149                    69     40        9        7        1
0.150-0.249                    47     22       10        2        3
0.250-0.399                    24     11        4        7        3
0.399-1.000                     7      2        2        3        1
Total                         147     75       25       19        8
The number of VirusTotal antivirus engines that flag an application is positively correlated with the score assigned to it in this paper, indicating that the scoring formula is reasonable and effective.
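One simple way to read this trend out of Table 1 is to compute, for each score band, the share of applications flagged by at least 10 VirusTotal engines. The sketch below uses only the counts from the table; the cut-off of 10 engines is an arbitrary choice made for illustration.

```python
# Counts from Table 1, columns ordered as engine buckets 1-2, 3-9, 10-14, 15-20, 21-35
TABLE_1 = {
    "0.100-0.149": [69, 40, 9, 7, 1],
    "0.150-0.249": [47, 22, 10, 2, 3],
    "0.250-0.399": [24, 11, 4, 7, 3],
    "0.399-1.000": [7, 2, 2, 3, 1],
}


def share_flagged_by_10_plus(counts):
    """Fraction of a score band flagged by 10 or more engines (last three buckets)."""
    return sum(counts[2:]) / sum(counts)


shares = {band: share_flagged_by_10_plus(c) for band, c in TABLE_1.items()}
for band, share in shares.items():
    print(f"{band}: {share:.1%}")
# 0.100-0.149: 13.5%
# 0.150-0.249: 17.9%
# 0.250-0.399: 28.6%
# 0.399-1.000: 40.0%
```

The share rises with each score band, which is consistent with the positive correlation described above.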
Table 2. Accuracy of anti-malware engines in VirusTotal.
Anti-malware    Number of Malware    Accuracy
Kingsoft                       78      23.64%
Qihoo-360                     294      89.09%
AVG                           317      96.06%
Kaspersky                     281      85.15%
Tencent                       284      86.06%
This method detects 83.03% of the malicious applications identified by VirusTotal. Its detection accuracy is much higher than that of Kingsoft, comparable to that of Tencent and Kaspersky, and lower than that of AVG and Qihoo-360.
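The accuracy figures can be reproduced with a few lines of arithmetic. The denominator of 330 is not stated in the text; it is inferred here because it is the value that matches every percentage in Table 2 as well as the 83.03% reported for this method (274 detected applications).

```python
TOTAL_MALWARE = 330  # inferred denominator, not stated explicitly in the paper

DETECTED = {
    "Kingsoft": 78,
    "Qihoo-360": 294,
    "AVG": 317,
    "Kaspersky": 281,
    "Tencent": 284,
    "This method": 274,
}

accuracy = {name: hits / TOTAL_MALWARE for name, hits in DETECTED.items()}
for name, value in accuracy.items():
    print(f"{name}: {value:.2%}")
# Kingsoft: 23.64%
# Qihoo-360: 89.09%
# AVG: 96.06%
# Kaspersky: 85.15%
# Tencent: 86.06%
# This method: 83.03%
```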
Then, for the 56 undetected applications, we counted the number of VirusTotal antivirus engines that flag each of them, as shown in Table 3.
Table 3. Result of missing detection.
VirusTotal Number    1-2    3-9    10-14    15-20    21-35
Numbers of App        21     19       10        5        1
Table 3 shows that most of the undetected malware consists of applications flagged by relatively few VirusTotal antivirus engines. The applications in k2 were tested by hand, and 64 of them were found to actually be malware. For example, the application com.dhceiekce.jdceldecdecee can conduct a recording attack, and the application com.dotsgame.wei can conduct a shot attack.
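The distribution of the missed applications can be summarized with a short calculation over the Table 3 counts. The threshold of fewer than 10 engines is an illustrative choice and is not defined in the paper.

```python
# Counts from Table 3: engine bucket -> number of undetected applications
MISSED = {"1-2": 21, "3-9": 19, "10-14": 10, "15-20": 5, "21-35": 1}

total_missed = sum(MISSED.values())
under_10 = MISSED["1-2"] + MISSED["3-9"]  # flagged by fewer than 10 engines
share_under_10 = under_10 / total_missed

print(f"{under_10} of {total_missed} ({share_under_10:.1%})")
# 40 of 56 (71.4%)
```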
This method can detect popular Android malware more effectively than popular domestic and foreign antivirus engines. However, it has little effect on detecting malware that has the ability to steal privacy data but attacks less.
Acknowledgement
Y. Zhou et al. [5] implemented a prototype detection tool, ContentScope, to detect the content leakage and pollution problems that may exist in the Content Provider components of applications on the market. Static taint analysis [6] has been used to find privacy data included in user input, such as user names and passwords. Bianchi et al. [7] suggested that users may be subject to Activity hijacking and other GUI confusion attacks, resulting in the leakage of sensitive user information. Jiang X et al. [8] collected and collated a large number of Android malware samples and analyzed the installation and behavioral mechanisms of various types of malware, together with the detection results for these samples.
Summary
References
[1] Jingping Shen. The 39th China Internet Development Statistics Report [J]. Media, 2017(3): 30-30.
[2] Joshua J. Drake, et al. Android Hacker's Handbook [M]. Posts & Telecom Press, 2015, 64-68.
[3] Nikolay Elenkov. Android Security Internals [M]. Electronic Industry Press, 2016, 19-29.
[4] Felt A P, Chin E, Hanna S, et al. Android permissions demystified [C]// ACM Conference on Computer and Communications Security. ACM, 2011: 627-638.
[6] Yuxiang Li, BoGang Lin. Design of Application Security Policy Consolidation System Based on Android Repackaging [J]. Netinfo Security, 2014(1): 43-47.
[5] J Huang, Z Li, X Xiao, Z Wu, K Lu, et al. SUPOR: Precise and Scalable Sensitive User Input Detection for Android Apps [C]// Proceedings of the 24th USENIX Security Symposium. 2015: 977-992.
[6] Zhi Li, Jinwei Chen, Shijie Chen, et al. Research on Android Information Leakage Based on Static Taint Analysis [J]. Electronic Quality, 2015(10).
[7] Bianchi A, Corbetta J, Invernizzi L, et al. What the App is That? Deception and Countermeasures in the Android User Interface [C]// Security and Privacy (SP), 2015. IEEE, 2015: 931-948.
[8] Jiang X, Zhou Y. Dissecting Android Malware: Characterization and Evolution [C]// IEEE Symposium on Security and Privacy. IEEE Computer Society, 2012: 95-109.
[9] Wenyang Li. Research on Malware Detection Technology in Android [J]. Netinfo Security, 2015(9): 62-65.
[10] Jizhou Fen, Minghui Tian. Research and Consideration of Software Potential Security Defect Test Case [J]. Netinfo Security, 2015(6): 85-90.