<t
'(lr
lv\yl
?"aiteltfort\
PAu
--
It
Inlernational Reviev' otr Commtters and Software (I.RE.CO"S.), Vol- 9, N.
I0
lsslt/.t828-600J October 2011
A
Practical Rule
Based
Technique by
Splitting
SMS
Phishing
from
SMS Spam
for
Better Accuracy
in Mobile
Device
Cik
FeresaMohd
Foozy,
RabiahAhmad, Faizal
M. A.
Abstract
-
Short lllessage Settice/Sl/S/ ts
oneol
the popular communitation serriccs. However, this can contributeto
increasing mobile device dttdcks. Presently, SMS phishing (SMiShing) ailack is alarming to the mobilc phonc users because lhesc attacks usually succeed itt .stealing inJ'ornation and money" Moreover, SMS phishing and spam dre lwo dilJbrent ty'pas oJattack and letel of risk. Thus, it is imporlant to have a SMS phishing corpus. The established SMS
corpus is linited to span and none can be Jband suitable ./br SMS Phishing. This study proposes a technique to split the class of SMS phishingfron SMS spam and produce befier accurocy using the Ba:v'esian teclrnique.
Ihe
resuh shovs lhat the enhanced SMS corpus gets 99.8064.01' atcurale classifcatiort. The study identified classes and generated an intpt'ovenrent of Slt{S Phishing corpuswhich has been labelled
in
three different classes ie., Ham, Spam and Phishingwith
better accuracj. Copyright @ 2014 Praisellorthy
Prize S.r.L -Ail
rights resewed.Kqmords
:
Classification, Derectio,t, Phishing, Secw'it"v-,.1M$ SpanrI.
Introduction
Phishing
is
oneof
the
vadous social engineering attacks thatis
continuously arising on mobile devrces recently. Many researches claim that a phishing attack isdesigned to steal valuable inlbrmation such as credit card
zurd social sccuriry numbers, user IDs and passwords
[].
[2.]. Aucording to Boodae [2], mobile device users ale thrce times more likelyto
entel a web-based phishing aftack than deslmp users. Iu addition, phishing can attack through browser and email[3].
However,it
also can attackuia
Bluetoorh, SMS,Voice Over
IP,
mobile applications and mobile bron'sers.It
is u'idely accepted that attack via SMS phishing or SMiShing is increasing t4]-t61. A smrfi defencc mechanism is highly required to cornbat rhis issue. The big challenge lrowever, are thatcuncnt studies
on
SMS attack detection are mainly focusing on SMS spam detection and filtering"Up
to date,no
study has proposed detection technique for SMiShing. As various wpesof
established SMS spamcorpus
are
availablepublicly
over the
Internet, adetection model is improved by ditlbrentiating phishing class fnrm sparn class by its proposed phishing features.
The basic model
of
splitting and extracting SMS phishing from SMS spam *'as proposed by Beck [7].In this stud,v, the extraction of the SMS phishing was done from SMS spam sample. Additionally, Nazario [8] also identified a set of phishing email in his spam email.
He gave an idea to develop email phishing corpus by using Bayesian sparn classifier then labelling the spam email as phishiog email. This
is
because thei'eis
nopublic
phishing email corpus availablefor
research, therefore a similar yet simple approach was used in this study in detecting the SMS phishing in SMS spam class.Since
a
mobile
device hasa
snrall storage and processor capability, a simple but efficient approach is neededto
sepatate the SMS phishingliom
SMS spam class.To
separateSMS
Phishingfrom
SMS
spamt apractical rule
is
applied based on the technique which contain the lbatures of phishing SMS. The pubiic SMS sparn corpus that u'ill bc uscd has bccn downloadcd fiomUCI
Machine Leaming and the corpuses are collected from Almeida and Hidalgo [9] and Nazario[0].
This SMS spam corpus only contains two classes
of
SMS ham and SMS spam. Sincc SMS phishing attack also has been increasing and ail'ecting the security and privacy of users, there rtas more motivation to develop SMS phishing corpus which contain SMS ham, spam and phishing for SMS phishing detectionstudy-The reason the SMS phishing was not self collected is because the data collection process
will
need extra trme and cost. Beck [?] also has proposed a modified Bayesian technique to separate email phishing from email spam.Thc difference between the technique ofBeck
[7]
and tlre proposed technique in this research is that Beck [?] extracts a sctof
phishing word based on collec'tionof
email phishing. Then,mn
experimenron
other email datasets by using the Bayesian technique. Wheteas the rcchniquc proposcdin
this rcscarch runs tbc modificd SMS datasets that contain ham, spam and phishing classin
WEKA
tools
[l]
to
get
accurate resultsof
thedatasets.
In addition, this proposed fi'amework does not include
the pr-ocess
of
identification and exuactingthe
SMS phishingt'ord.
This
is
because,until
this paper u'as written, there was no SMS phishing available publicly for SMS phishing studies, thus thede
that u'as appliedI
on the SMS spam class wirs to again classily
if
the SMSwill be in spam or in a nen'phishing class,
Current sudies
on
SMS detection are fircusing indetecdng SMS spam and
not
SMS phishing. Several repons has identified that SMS phishing have higher riskthan SMS spam and GetSaf'eOnline
[2],
a
security wetrsite idenrified thatnot
all
spamale
scambut
all phishing are scam. This statement shows the risk levelof
phislug attack is more serious than spam attack. Most of the tips in detecting phishittg attack suggests
to
avoid clicking on any strang€URL
addressin
the SMS. According to S. Abu-Nimeh etal
"fl3l
and Shah[4],
SMS phishingwill
tr-ick users widr a fake URL andby
clicking on the URL. a tiaudulent websitewill
belaunched iurd thc malware
u'ill
be downloaded onto the mobile device. Moreover, SMS phishing atnck has been discussed by Dunham[5]
and Salem et al. 116l.K. Dunham
fi5]
explained that SMS phishing is a new tacticto
spread malwareby
adding the URL links in SMS and influencing the recipient to click on the URL.However,
not
all
SMS wh.ich containsa URL
is phishing. Thus, in rhis study, lhe SMS phishingwill
bescparatcd fiom SMS spam class and not all URL in SMS spam class should be in the phishing category. Some
of
the URL is r,alid and the website exists. Thus, to replace SMS spam that contain URL to the phishing class is not proposed- As an alternative, a rule based technique was dcvelopedto
split the SMS phishing from SMS sparn according to fte spam and phishing theory.Thus, the contribution
of
this paperis
a rule basedtechnique that
will
separate SMS phishing from SMS spam class and then generate an enhanced SMS phishing corpus. The rest ofthis paper is as follows: Section 2, is Related Work thatwill
discuss in deail the current sudyof
SMS Scam Detection. SMiShing and SMS SparnCharacterisdcs and the Rule Based Theory. Section 3, is
on the Methodology of proposing SMS Phishing Corpus. For section 4,
will
be thc Analysis and Result of the SMS Phishing Corpus and classification experiment thatwill
be run to see the accuracy of the classification using the Bayesian Classification technique. Finallyis
section 5,with conclusion and further snrdies.
II.
Releted
Work
There
are
manytools and
soft'*'arethat
can be installedin
the mobile deviceto
ensureall
data areprotected and secure from any attacks but the inforrnation security problem
still
arises. According to Leavitt [17],phishiag attacks are rapidly growing on mobile devices. Moreover,
Joc and Shim
[18]
said
incoming SMS phishing makes dre SMS recipient l'eel their personalprivacy
is
vrolated. The increasingatack
is
because mnbile phonesis
oneof
the
important devices to communicate nowadays. F-Secure[9]
reported that SMS sertices usually has a hidilen charge on replyngSMS phishing. The phishes
will
not only get money but also infbrmation about mobile device vcrsions, contact numbers and ofhers.Copngh e 2014 Praise Wortlry Pri:e S.r.l. - .4ll righls re:en'ed
Cik Feresa Mohd Faom, Rabiah 'lhmad, Faizal
M
'4Convcntionally, SMS phishing
till
be sentin
htgh volumes bur the current rend shows that SMS was sentin
small quantitiesfor
the purposeof
monitoring the securitylevel
of
the mobile device.ln
addition, the problem with SMS phishing detecdon shrdy is the useof
SMS language such as "eontacf'will
be typed as"clcf'
or "cotrlct". Thus,it
will
be haud to idcntify thc specific phishingword
tbr
SMS phishing comparedto
other phishing attacks.il.].
.SMS&,zrrln DetectionGenerally, SMS
architecture
contains Sender/Receiver; Protocol, Tirnestampand
payload(SMS
content) behaviour.Each
of
this
behaviour contains a log ofrhe nrobile device and the most valuable inlbrmation are in the SMS payload which connin texts, numbers and symbols in the message tvhere the styleof
message can be learned and predicted by using machine learning techniques.
I.
W. Yoon et al. 120) and Peizhou et al.[21] shrdied rhe SMS content-based with challenge-rtsponse schemewlrere J. W. Yoon et
al.
l2O) applied cryptography andC.{PCIIA and
Peizhouet al. l2ll
conrbined the blacklisting tecbniqucs and CAPCHAto
dctect SMS spam.In
addition, Gr)mez Hidalgoet al.
[22]
had proposed a content-based SMS spam filteringfor
dual languageof
English and Spanish SMS spam using the Bayesian filter technique. Moreover,T. M.
M:rtrmoud and A. M. Mahtbuz [23] applied whitelisting techniques to detect content of SMS phishing and Q. Xu et al. 124) proposed SMS filtering on non content-based using SVM and KNN techniques.In
addition,Joe
and
Shim[l8] also
uses SVM techniqueswidr
rhesaurusand
appliedin
Windowsenvironment.
Several
experimentswere done
by Ctrrmack etal.
l25f on SMS filtering using 5 typesof
spamtilter
tools
and similarly, Najadatet
al.
1261, obsen'ed severalclassifier
techniquesand
did
aconrparison of accuracy to detect spam SMS.
Machine lcaming technique also has been applied in detecting SMS. Studi€s that has been done by Xiang el
ul. l21l and Cai et
all2Bl
shows an improvement on the spam filter using traditional balanccd Winnow alg{tnthm vvhichis a
linear-threshokl classificadon algorithm to detect SMSin
Chinese language. Moreovet, Wu er a/. [29] proposcd a rcal dmc monitoring and frltcring SMS systemby
using Bayesian filtering and Pinyin Fuzzed keyword pattem machine- lie et a/. [30] also applied the matching pattern technique. In addition, Uysal [31], alsoproposed
a
novel
frarneworkusing
the
Ba1'esian tec'hrrique but with a dual featur-e seluction technique.An
independent mobile devicefiltering
by
Taufiq Nuuzzaman et al. l32l using Naive Bayes technique hasbeen proposed and they prol'ed that spam filtering can be applied independently trn the mobile device, Yadav
[33]-[34] developed
an
application called SMSAssassin that uses the Baycsian technique and blacklist wordsin
thesystem.
T.
M.
Marhmoud antl A. M. Mahtbuz [23] applied ananificial immune system technique and the result shows the proposed technique is better than the Naive Rayesian
algorithm. Charninda
et al.
l35l
proposeda
hybrid solution of neural network and Bayesian frltering.A verilication technique has been proposed by Verma
e/ aL
[36]
to
add
a
verifiiing
mechanismin
the application.In thc Cryptography area, Saxena [37] has proposed a
secure
SMS ptotocol
tbr
SMS
transmissiorrand
acryptographic algorithm
in
theSIM
card. Moreovcr,Pereira
e/ aL
[38]
also
proposeda
lighttreight gryptography algorithm to mitigate SMS securiry issues. In addition, Choi [39] appiied the Common Public Key CY-vptographv techaiquefor
SMS
commnnication etficiency.Vall
and Rosso [40] did experiments that comparedplagiarism detection
tools
u'ith
machine
learning detecrion techniques. Rafiqueet al.
[41]
applied the Hiddcn Markov Modcl (MHH) in SMS protocol tbr rcai dme detection rurd compared four evolutionary algorithm and non evolutionary algorithm classilicarion. Que andFarooq
[42]
also
applicd
MIIII
on
a
bytc
lcvcl distributionof
SMS.In
addition, SMS-Watchdog SMS detection scheme by Yan et al.[43] has been developed using anomaly detection merhods by idendfuing fte SMSuser behaviour in the Windows based.
Until
this
paper was written,only
non
technical studies were done by [13],[5], [6]
and [44],. However, tbr SMS phishing tcchnical analysis, this study is the lirstto separate
fie
SMS phishing from SMS spam corpus.11.2.
SMiShing and SMS Spam CharacteristicsSpam not only gives risk to the user, but also a{fect the organizarion [a5].
ln
addition, phishing attack also has a high impact on users and organizatioos- Horvever,in
rhe SMS research area, the studyonly
fbcuses on detecting SMS spam antl not for SMS phishing.SMS
Spamand
phishing attackshave
diff'erent definitions and this was agreedby
Toolan and Carthy[46], Mistry et
al. [47]
andS.
S.
Chandran and S.Mulugappan [48]. They defined SMS spam as uontaining adveniscment of rnarketing.
Additionally, phishing message is a social engineering method that colrtain messages to trick the user response
to
thc
SMS.Aftcr
that,thc
rnalicious\r'cb
sitcwill
display
and
unforntnatelythe
malware
will
be downloaded. As a consequence ofthal
phishes can get informationof
mobile device, contact number, bank account. track location and others.From the findings of this snrdy, several charactcristic
of
SMiShing attackhas
been identifiedand
these choacteristic were also implementedto
separare SMS phishingfrom
SMS spam. The chancrcristicof
SMS phishing consists of:r
maiicious URLtl3l-tl5l,
[a9]and [,H],r
malicious Telephone Number [44],r
asking to unsubscribe the service [49],Copyright O 2014 Prai.:e lAofihy Pti:e S.r.l. - ,4ll rights resen'ed
Cik Feresa Mohd Fooz,v" Rabiah Ahmad, Faizal M. A
r
self answedng SMS tbr YES as agree to subscribe Is0],r
Announce the SMS recipient as a winner in a contest or competition that was never eittered orr
Asking for help to get profit such as need money toeat or reload prepaid phonx. However, SMS spamwill contain:
r
Marketins campaigns 1401, t261.[i1],
[23],
[]41, [32],[al],
[42] orr
Spread news [24]The similarit-v
of
SMiShing and SMS spam are ltrat bothof
them :ue unwanted SMS and the SMS scndet'snunber is not in the SMS recipient's telephone nunber contact
list.
Ho'rever,it
is
ditlicult to
idcntityif
theelephone
numberis
maliciousor
nol
unless the applicarion connects with the contact directory.For this
srudy,
one
of the
SMS
Phishing characteristics abovewill
be the
parameterfor
thedecision making
of
whetherthe
SMSis
phishing or spam.The
features selectionwill
contain Markering Adverdsementsand Winning
Announcements. The Methodology sectionwill
discuss furtherhow
SMS phishing is separated trom SMS spam data.11.3.
Rule Based TlteotYThe rule based approach is a part ot'ao expert systern arca and scvcral srudies havc applied this mcthod for decision making. Peizhou, Xangming et al.
[2])
sairl t]recomrnon SMS content-based filtering method are rule-based techniques. The study
to
detect phishing attacks using the rule based technique has been done by R.B. Basnet e, al. [51], which generated 15 rule sets that was analyzed using machine learning techniques. Moreover, Shreeraur[52]
proposeda nrle
sctto
detcct phishing attacks using genetic algoritbms.Salenr [16] have proposed a rule set in the studies ftrr the awar-eness program and aniticial intelligcnt tools It) detect phishing emails. Liping et a/. [53] had examined seven feah.rres for phishing email detection. In addition, Pamunuwa et al. [54), proposed to detect phishing email by using IDS.
Alnajim aud Munro [55] studied on phishing detection
in
rvebsites. Zhouei
a/. [56]
also proposedto
detect phishing attacks and Abunous eral.
151) had listed several criteria to detect phishing and the technique that u'as applied was the fuzz-v technique. SpoofGuard, which has been discussed by Huajunct
a/. [58], is a popular anti-phishing tnol. The author also sudied several criteria to detect phishing.There are several tactics to spread phishing email and u.eb sites but there are
still
lessfor
SMS Phishing and Slr'{S spanr. Unifornr Resource Locator(URL;
is
anaddress of website which is one of the important features
to
detect phishing attacks. Moreover,URL
phishing dctection also has been widcly applicd to detect mobile u'eb phishing and ernail phishing. Nevedheless, tbr SMS phishing attacks, detecting phishing througlrURL
is importanr as one of rhe lbaures to detect SMS phishingI
Cik Feresa Mohd Foozv, Rabiah .lhmad, Faizal M- -J
because this attack rvrll hick users by scnding an SMS
uith malicious
tiRL
[13]-[15], [49] and [44].These solution
for
mobile web phishing using URLhas been studied by Niu [59] and Weili et al. [6O].
Niu
[59] proposed to design an important and shortURL
to
be displayed on salbri browser on the iPhone mobile device. Weili el a/. [60] upplied a URL checker intheir anti login intertace liamework. Yadav et a/. [33] also applied
URI
features to detect SMS ham and SMSspam but not SMS phishing.
Howcver. since this study only focuses on scparating
phishing
SMS
fiom
spam class,not
all
URLs
aremalicious. Some URLs contaim a valid format or have existing websites.
Thus,
new
feature selectionssuch
as
Winning AnnouncementSMS and
AdvertisementSMS
was applied to improve accrracy result. Moreo't'er, based on rhe understanding of spam and phishing SMS, these rwo features differcntiate the spam and phishing SMS.nL
Methodologr
The aim
of
this paperis
to
separate SMS phishing fi'om SMS sprm class in SMS sparn Corpus that has bten downloaded from I-ICI Machine Learning. The featuresof
SMS
phishing
are
based
on the
phishing characteristics that has been discussedin
the previous section such as f'ake URLs, u'inning announcement andti'ee gifts and request to response.
The SMS phishing chatactcristics were gained from
various conl'erence papers and joutrrals
from
digitallibruy
IEEE Xplore,
ScienceDirecfACM
DigitalLibrary and Sprirrgetlink.
In
this sh.rdy, English SMS sparn corpus are based on [61], [62] and [22] datasecthat
has heen dourrloadedftom the UCI
Machine Learning [9].The
total
SMS that were downloaded were 5572, however after pre-processing data such as rcdundancy rernoval, 51 66 of SMS spam corpus was applied. There are two classesof
ham and spam SMSin
this datasetsu'hich consist or 4515 SMS ham and 651 SMS spam.
Table
I,
shows the SMS spam specification that has beendownloaded-Fig.
I
below is a generic rule flows how the sep&7rdng phishing SMS to SMS spam is done.TABLEI
ENcusH SMS SP.{-Ni SPEcTFtcATIoN
English SI\IS Corpus gnglish SMS Soecificetion
Fig. | - Gcncric Proccss ofRulc-Bascd Decision Making to S€pdate SMS Phishiug Aom SMS Spam
$tep 1: Pre-Processing the entire SI\{S Spam cotpus such as removing SMS redundancy and tokenize
Step 2: Applied lbur nrles ro rnake decision rnaking:
Rule
l:
lF
the SMSis
an wiuner arurourcement, TTIEN result=lRule
2:
TF the SMSis
a
marketing advertisement TIIENresult:l
STEP 1 storing and handling SMS sparn corpus using Microsoft Office Excel and pre-processing tbe SMS to clean the data tiom noise. STEP 2 is how the rule based
pedorms. Theie
will
be no issue of the generated SMS phishing being misconstrued as a ham SMS because therulcs are only for SMS spam class liom thc UCI Machine Learning website.
For Rule
l,
the screening words ofkeyvuord"rev:art'
and "lr,irr"
is
a word ro initiate user to respond to the SMS. Thus,if
the SMSinfom
ttrc recipient that fteywon, the result
will
beI
as TRUE. For rule 2, if the SMS is an advertisemcnt SMS, it alsowill
be TRUE.IV.
Analysis
and
Findings
From the analysis that was done on the SMS spam corpus that was downloaded
from
theUCI
Machine Learning, there are several SMS phishing that has beenideutifled
on
SMS
spam.After the
pre-proccssing process, there are 4515 SMS hanr and 651 SMS spamand the result aftet Step
I
until Step 3 was applied. 45 | 2[ntematioml Ra'iev on ComPuters ond Sofware, Vol. 9, N. | 0 No ofmessages
No of rvonls No of cless
Avcragc no oiwords pcr
lrrcssage
Collcction mcthods ald
proceduts
Composifon oI Lext
I-alguagc of communication T)Ipe of conmuication
5166 7974a
2 (Ham md Spam)
r4. l 7928
Dorvnload at UCI Machinc Lrarning Mobile Device uer
EngLislr
SMS from Unknown Smder,
.{d\crtisemilt. Pcrsonal cumrurucation
SIUS Spam Comus
RULE
I:
Stinner?:-t
RT.ILI 2: .Ad!cnisctrrint !-4
[image:4.593.311.506.104.377.2] [image:4.593.82.291.584.705.2]Cik Feresa Mohd Fooz,v, Rabiah Ahmad, Faizal M. A
SMS ham, 567 SMS Spam md 84 SMS spam has been
identified (Table II).
Moreover,
this SMS
phishing corpushas
beenanalysed using the WEKA tool [1
l]
for evaluation of theaccuracy classification.
RuhrBrsed RulsBasrd
SMS
Spar lhm
SDrm
Pbishing llam Sprm tshishlng DamcL [9],n0145t5 65i
0
4515567
E4Aftcr
SMS Phishing Datasets has been developed using the proposed rule-based technique, an experiment was done using theWEKA tool
I l]
to
validate the proposed rule based technique. True Positive Rate (TPR), False Positive Rate (l'PR), b'alse Negative Rate IFNR),and True
NegativeRate
(TNR)
of
SMS
Phishing Dausets havc been identifietl using the WEKA tool.To get the result of accuracy level, the SMS phishing fcaturcs was constructcd bascd on litcrafurc rcvicw. In addition. there are several Machine Leaming techniques
in the WEKA tool. however
in
this paper the Bayesian technique was appliedto
check the classificationof
accuracy rate of SMS ham, spam and phishing.Based ou the result in Table
III,
thc accuracy ofthesc SMS Phishing Corpus show high accuracy compared to SMS spam Ctrrpus.TABI.EItr
RESULT SMS DFTF-L"TION USINC BA1'ESIAN TECH}TIOUE
B'@
l'lramelrr
T.. Phlshlng sDrm
Phishing compau'cd
to
SMS spam.It
is
dueto
the separating pl'ocess. The result however look only limited to SMS corpus by Almeida and Hidalgo [9] and Nazario[10]. Further srudy,
will
be tested using Malay SMS corpus for explore the efficiency of this method.Acknorvledgements
The authors would like to thank Universiti Teknikal Malaysia Melaka (UTeM). Universiti Tun Hussein Onn
Malaysia(JTHM) and
Ministry
of
Higher Educarion Malaysiafor
supporting this research. This research is f'unded by MOHE under Long Research Grant Scheme LRGS/20 1 l /FTMK/TKo l i 1R00002.References
[1] Microsoli. (2011, 8th June)- Email and weh scams: How to help
pn)ted your:ell.
I2l
tll
t4l
t5l
t61
t?l
Available
htlp:./,ruw.micro*ti conr/sccuiry
onlinc-privacy/phishhg-sDams.aspx
M. Boodae, "Mobile Uscrs Tluee Timcs Morc Vulncrablc to Phishing Ailacks," il lrstaar vol. 2012. €d, 201 l.
P. Soni, er al, 'A phishing malysis of wcb burd systctrrs," presented at the Proceedings ofthe 201 I International Confetence
ou Comuication, Computing \&\C38; Security. Rou'kela,
O<lisha, India. 201 t.
C. F. V. Foory, et a/., "Pbishing Detecriotr Taxonomy for Mobi-le
Device," .InteDutilnal Jrurnul ol Computet SrietEe LlTues
qJCSII. t;ol. 1 0, 20 13.
,{. Kang. ct dJ., "Security Considemtions for Sman Phone Smi.shing Attacks.'in.4dtunced in CorrPuret Science and
it,-"lppfitations, cd,: Springcr. 20 14, pp. 46?-471.
D. Kim md J- Ryou, "SccurcSMS: prcvmfon of SMS imerception on .{ndroid plarfom," prsf,ted at the Proceedings
of dre 8dr Inrerurional Conference on Ubirruitou Infomntion Muguur md Cmurmicadon, Siern Reap, Cmrbodrq 2014.
K. Ber:k md J. Zhin, "Hnshing Using a Modified Bayesim
Tmhnique," in Sacial Computing (SriulGtm), 2l)10 IEEE Frlse
Positive
0.01.1Correcdy Clessilicd
hcorrcctlv Clessilitd
00
99.8064 %0.t936 %
V.
Condusion
As
a
conclusion,SMS
phishingand
spam have different characterizations and security risks. Since there areno
available SMS Phishing corpusin
the English language, this study is the first to propose a practical rule based techniqueto
sepatate SMS phishingiiom
SMSspam class.
Two new feanrres are also introduced
in
this studysuch
as
the
Winning
AnnouncementSMS
and Advertisement SMS. These two features has been added because ol the main characteristics of SMS phishing and spaun whcre generally, SMS phishingwill
ccmtain anwinning
announcernentand SMS
spam
thar
\r'ill
broadcast markedng advertisement.Additionally, bascd on thc cxpcrimcnts rcsult in Tablc
II,
the proposed rule based with these new features has shown the availabiliry to separat€ the SMS phishing classfrom SMS spam and generated a SMS phishing corpus tbr further study in SMS phishing attack which has been incrcascd nowadays. According
to
the proposed rule based system, the result shows befter accuracy for SMSCopyrigtu e 2014 Praise llortlry Pri:e S r.l. - All ighls reten'ed
Suoild lrta'ndtionul ConJercnce on,2010, pp. 649-655.
[8] J. Nazario, "Phishing Corpus," vol. 201 3, ed, ?005.
[9] T. A. Almeida and J. M. G- Hidalgo. (:012, l3 September 201?).
SMS St am Coilt<tutr Don .Sel, ,Availahle: htt):,//archive.ics. uci- edulrnl/dara sets/SMS+Sptm+Collecdm
[10] J. Nazatio, "Phishing Corpus," 2004-2007.
[ll] 1,. ts'. Mak llall, Geotliey llotmes, Bemhard Ptahringer, Peter Reutemmn. Ian H. wittil, "'lhe WIKA Data Mining Solts'are: .{n Update," S/GXDD Ex plo ru t iort, vol. I l, 2009.
[2] GctSafcOnlinc. (201?, I January 2013). Spun & Scun email.
Available: htrp:/l*rvu'.gcrsar.conline.org/protccring'your' conrputer/spcnr-md-scam-etmil/
[13] S. Abu-Nimeh, er nl., "Disnibured Phishing Detccrion by ,\pplying Variable Selecrion Using Bayesian ,\dditive Regression Trees," in Conmunitatiorc, 2009. ICC '09. IEEE lnternTliaml
('onfercnce on,2009, pp. l-5.
[14] J. Shah. "Onlinc crime nigr"tes to mobile phoues," Sage vol. l,
pp- 22-23,200?.
[15] K. Drnhm- "Chalter 6 - Phishing, SMishing, and Vishirtg," n
Mohile Maht'are Allacks awl Defense, D. Ken, Ed.. rd Boslon: Syngress. 1009. pp. | 25-196.
[16] O. Salcm, e/ a/., "Awarcness Program and AI bascd Tool to Rcducc Risk of Phishing Arucks," iu Cotttputet and Information
Tethnolog; (ClT), 2010 IEEE l0h ltternailonal ConJerence on, 2010,pp. l4l8-1423.
[17] N. Leavin. "lvlobile Security: Finaliy a Serious Problem?," Computer,,tol. 44, pp. I I - 14, ?0 | l.
[18] I. Joc and H. Shim, "An SMS Spam Filtoing Syst€m U:irg Suppot Ve ctu' Machine," iu Flrtare Generatiou Infomarion
Techrclog;. vol. 6485, T.-h. Kim, et ul.,Eds.. cd: Springer Berlirr
l
Hcidclbcrg, 201 0. pp. 5??-58a.
[ | 9] F-Sccure, "Mobilc Thrtat Rcport Ql 20 | 2,. F-Secure L.abs20 | 2.
[20] J. W. Ymn, eI a/., 'Hvbrid spanr filtering for nrobile conrnrunication,r' Contpaters &: *curitv. !$1. 29, pp. 446-459.2010.
[21] H. Pcizhoq a a/., "A Norel Methorl tbr Filtrring Grcup Sending Shorr Mcssagc Spam," in Corltgcttcc ancl Ih'btid htfotmalion Technolog,,, 2008. ICHIT '08. Intenntiotal Corlcrcncc an,2008,
pp.60-65.
[22] J. M. G. Hidatgo, c, rrl, "Cotrtcnt based SMS spam filtedng,"
preienred ut the Pnrcecdings of the 20{]6 ACM symposium on Documcnt cnginccring, Ar:rstcrdmr, Thc Ncthcrlands, 20t)6-{231 T. M. Va.hmoud md A. M. Mahfotu, "SMS Spam Filtcrmg
Tcchruque Bascd on .Arliflicral Imunc Sysictr!," 7JCS1
International Jotrnal of(bmputcr Sciencc Issws, vol. 9,2012.
[2al Q. Xu eraL, "SMS Span Dclcction Ning Comcnt-lcss Fmru'cs,"
Iiltelligtnt Systcnts,Ifftr, vol. PP, pp. l-1,2012.
[25] G. V. Cornack et d/., *Fearue engineering for mobile iSMS)
spm fikoing," presented at the Procedings of the 30th mrual
intemarional ACM SIGIR cDnference on Reserch and
developrnent in infomradon rerrie\al, Amsterdm, The Nethu.lands, 2007.
[26] H. Najadat er al, "Mobile SIr{S Spanr Fihering based on Mixing
Clasit-rcrs. "
[2?] Y. Xiang et a/., "Filtcring nrobile spm by support lictor
machinc " prcscmcd at thc Confmcncc m Computcr Scrcnccs,
Sofiwae tngineenng, lnfomadon lechnology, E-Business md
.A.pplicatiom (3rd; ?0(X ; Cairo, Egypt), Cailo, Eg1'pt,20el.
[2E] C. Iie, er aL, "Spam Filter for Shon lv{*sags Using Wimow," in
,4dvanted Longutge Aucessilg und Weh InJinnation Tuhnologt, 2(x)lt. ALPIT 'o8. fnrunational CanJb'eu:e on,20O8, pp.45,{-459.
[29] W" Ningning. et al. "Real-time monitorhg md filtering systent
for nrobile SMS," in Induslriol Elcctronics and Applianrions, 2008. ICIE.I 2008. 3rd IEEE Confcrercc or, 2008, pp.
l3l9-[]0] J. lfumg et a]., "A Bayesim Approach for Text Fifter on lG
Nctwork " in l/ireJess Cotnrnunicalions .\tenuor*ing an,l :Vohile Computing lIyiCOM), 2010 6lh Inlernatioml Corferetce ot,
2010. pp. l-5.
[31] A. K. Uysal, er aL, "A novel frantcwork for SMS spam filttring."
'n Innovtlions in Intelligenl S1'srems aul Application: (DIISTA), 201 2 lntcrnational Symposium on,201 2, pp. l -4.
[32] M. Taufiq Nuruzaman, c, dJ., "Simple SMS spam fihering on independenr mobile phone," Stcurin: and Cannrlnicailon Nerrorls, vol. 5, pp. 1209-l 220, 201 2.
[33] K. \'adav, e, ,1., "SMsAssassin: ctowdsowcing driven
mohile-based system 1br SL.{S spam tilterhg," Prsenred ar the ntuoaedings of thc l2th Workshop ou Itlobile Conrputing Systms
and Applicatiols, Fhoebix. Aribna,
3011-[]al K. Yadav, el al., "Take Conrrol of You SMSes: Desigring m Usabh Spanr SN'[S Filtcring System," in Mobile Dutu
ll{tmgemen (MDn, 2012 IEEE 13rh lilteildti.nrdl Co4ference oa, 201 2, pp-
352-355-[35] T. Chaninda er a/., "Content based hybrid sms spam frllering
sysrem," 20l4.
[]61 R. K. Vemo, el cL, "Exracdon and Verification oi Vobile
Mssege Integrity," it Corwnuticanbn Dtreil.r and Neu-olk
Te&nologies (CSNT), 201 | lnlemalional Confererce on, 201|,, pp.49-53.
[3?] N. Suma md N. S. Chaudiuvi, "SccueSMS: A securc SMS
protsol l'gr VAS and other applications," Jourrtal oJ.Srsterns and Software, vol. 90, pp. 138-150. 2014.
t38l G. C. C. F. Pcrcir4 el aL. "SMSCrypto: A lightn'right
cqptopraphic trameu'ork for sccurc SMS transmission," Journal o/$,srerru and So/ware, vol. 86, pp. 698-706, 201 3.
[39] J. Choi and H. Kim, "A Novel Approach ibr SMS security,'t lntemational Joumal of Seetri4. & Its Applimriob,l'of. 6, 2011.
[40] E. Valt iurd P. Rosso, "Dctcction of ncar-duplicatc ulcr ginei?ted q)nterrtt: the SN{S spam collection," prcsarrd at the hnceedings
of the 3rd inteinadonal rvorkshop on Search and mining user-gsrerat€d cortcnts, Glasgorl. Scotlan{ UK, 201 l.
[41] M. Z Rafique, et c,f., ".A.pplication of evolutionary algorirhms in dctecting SMS spam a[ access iaycr," presented at the Proceedings
Copyright 9 2014 Praise l(orthy Pri:e S.r.l. - All righr.s reren'ed
of thr 13dl annul confcrcncc on Gcnctic and cvoluraona.ry
cotrpuhtion, Dublin, Trcland, 20 I | .
[42] Nf. Z. R. que and M. Far6oq, "SMS Spanr Dctecriorr By Opcratirrg
On E''te-Level Disciburions Using Hiddcn Mukor' I\{odels
IHMMS)," prcsentcd al tlre Virus Bulletin Conlbrence Scptenlbir
20r0.2010.
[a3] G. Yan, ct a/., "SMs-W4tchdog: Ptofiling Social Bchaviors of
SMS Uscrs tbr Anomaly Dctcction Rccrrrt Adv'anccs in Inhxsion Derection." vol. 5758, E. Kirda e, al, Eds., ed: Sprilger Berliu /
Hcidelbetg, 20tJ9, pp. 202-223.
[44] S. Lee. (2012, Snishing: SMS + Plrr]ing, Prcsent And On The
Ri,rz On '1rulroid. Availablc:
http: ri'wrvw al cnboot. cob./blog./blogs/cndpoint_sccruiry/rchivc/2 0.1 2/l l/14/smishing-sms-phiJung-prcscn!-md-on-drc-rasc-on-android.aspx
[45] Ktuthika Rcnuka, D.. Vislakshi, P., Blcnding lirctly md baycs
classilicr lbr cmail spam clmsilication. {2013) Iternilional R*iew on Cohpl*ets and Saftv'dle (IRECOS), I (9.), pp. 2168-7111.
[46] F. Toola-o andJ. Cuthy, "Feamre selection for SprimandPhishing detecdon," in eCline rResearchers Sunmit (eCrine.1, 2010,2010,
pp. 1-1?.
[.1?] N. Mistry. cr a/., "Preventive -\ctions to Emerging Theats in Snrad Dcviccs Sccuriry," 2t)l 1.
[48] Chmdran, S.S., N{uugappm, S.. Spam detcction md clinination of mcssgcs from twitcr, i2ql3) Intcrtatiotal Revicw on Computers anl So/n'are (RLa'Os}8 (10). pp. 2438-244i.
[a9] L. Kriel, cl al, "Torvalds a computer securq' inducrion nillnl for non-IT citims."
[50] A. Mahaja4 et dl, "Identification of Fake SMS generated using
,{niloid Applications in Android Devices."
tsll R.B- Bamet , zt al., "Rule-Based Phishing Attack Detecuon,"
Interndlirndl Conlbrcrte or Securitv tnl Manugenenl SiMll (2ot 1), 2O1r.
[52] V. Shreerm, ct al, "Anti-phishing derection of phishing anacks
using gcnctic algoridun," in Cannunicalton Control and Computing Iec,hnologies (ICCCCJ|), 201() IEEE Inlenalional
Conlbrence on, 2010, pp. 447-450.
[53] M. Liping, er al, "Automaticdly Genuating Ctassifis for
Ptrislring Enail Prediction," in Pelvasdve Syttens, Algorilhnt,
oazl,Vanr,af.r (ISPAN), 20il9 I(Jrh Intenwrional SymJxrsium on, 2009. pp. ?79-783.
[54] H. Pmunuwn, et al., "An Inmrsion Derecrion System for
Detecting Phishing ,Attacks." in Sc"c/r€ Data Matogemcnt. vol.
4711. W. Jonker and M. Petkovrc, Eds., ed: Springer Berlin i
Heidelberg, 2007. pp. l8l-192.
[55] ,{. Alnajim and M. Munro, "An Approach to the Implenflmdon
of the Anii-Phishing Tool lbr Phishing Websitrs Deteclron," m lntelligent Nemortirg tnd Collahot'an\t S.ytrems, 2009. INCOS 'il9. ln r enw tion tt I L' onfere n c e o n, 2009, pp- I 0,5- I I 2.
[56] C. V- Zhort, et ol., "A Selt:Hcaling, Self-horechng Collablrdtive
lntruion Dctccrion Architcctttre to Trauc'Bask Fast-Flu Phishing Donrains,'in Natunr.t OPerationt tnd Montgenent
Svmatsiun Worlrshops, 2008. I;OVS Workshops 2008. IEEE, 2008. pp. 32 I -32?.
[57] N,f. .\bunous, e, al., "lntelligent Phishing Website Deeclion
System using Fuzzy Tcchniques." in hrJonution and
Commilnrtution Techmlogies: From Theory lD AlrPliL'ation\, 2Nt8. ICTTA 2008. 3rd htemational Confuove nn, 2008- JrJt-
l-6.
[58] H. Hutju, et al, "Coutomeasu: Tecbniques tbr I)cceltlive
Phishing.Attack," in rVew h'ettls in lnlbnnatiott und Scn'ice
Scien*, 2009.,VISS ?9. lntenntioual Confereute ol, ?009, pp.
636-641.
[59] Y. Niu er a/., "iPhish: Phishing Vu]ncrabilitics on Consumcr Elecronics." in LrP.iEL'. 2008.
[60] H. Weili, er a/., "Anti-Phishing by Smart Mobile Device," in
Netb'ork rnd Purallel Compuing lforkshops, 2007. NPC l{orkshops. IFIP Intennional Cot4ference on,2007 . pp. 295-302.
[61] G. V- Cormack, rl dl, "SPdm filtering tirr short missges,"
prcsentcd at thc I\occcdings ofthc sixtcctrth ACM cont-erence on Corrt'creirce on infbrmation arrd krrou'ledgc managenrent, Lisbon. Porugal, 2007.
[62J O" V. Cormack, cl a/,. "Farure engineering for mobile (SMS)
Cik Feresa lulohd Foozv. Rabiah Ahmad. Faizal M. '4
Cik Feresa Mohd Fooz-v, Rabiah Ahmad, Faizal M. A.
spam iiltcring," prcscntcd at thc Procccdings of thr 30th annual
intanational .{CM SIGIR oonfcrcncr cn Rcsarch antl
dcvclopment in irrfomration .ctrieval, Anrstcrdam, The Netherlands, 2007.
Authors' information
Cik tr'ercee l\{ohd tr'uozy is currendy working
with Universiri Tun Husscin Onn Malaysia
(UTHM), Malaysia. Fcrcsa holds a Master's
dcgcc in Computer Scicncc(lnfomarion Security) ltom Univcrsiri Teknologi Malaysin
Malaysia urrl a Bachclo"s degtc il Infbmariou Technology and Muldmedia fi-orn
Univcniti Tun Husscin Onn Malaysia (UTTiM), Malaysia. Shc is cuncntly pursuiag hu PhD at thc Univcrsiti Tcknikal Malaysia Mclaka,
Malaysio-Rrbirh .{hmad is an .Associalc Profesv:r at rhe
Fauulty of Infomrarion Terhnology and Communication, Universiei Teknikal Malaysia Mclaka,Malaysia. She rrceived her PhD in
lnibmadou Stu{tics (hcdth infbnnarics} litrn the Univeniry ot ShEtliel4 UK and M.Sc. linlbrmadon security) liam rfu Royrl Hollo*'ay
Uuivcritv of London. UK. Hcr rcscarch irrtercsts include healthcae sysrem securiry and healdr intbmadcs at narional as well as intemational levels. She has also publishedpapers m accrcditcd information sccurity archi&cmrc. Shc has dclivcrcd papus at various national/inrcmarionaljoumals. Risides that, shc also sewes as a
riviclcr tbr variom confcrrnccs and ioumals,
Mohd Faizal Abdollah is an Assmiate
Prol'cssor at thc Faculty ol lnlbmration
Tcchnology ud Comuication. Univrrsiti
Tcknikal Malaysia Mclaka.Malaysia. Hc receiverl his PhD in Compurer and Nerrvork
Sertriry froor tlnivcrsiti Teknikal Malaysia
MelakaMalaysia, md M.Sc. {ComFuter
Science) li-om the Univetsity Kebangsm Malaysia His rcscarch intcrcar mcludc ncnvork ud mobilc sccuity
and nclvork moltttonng. Hi has dclivererl papcrs at vzu-ious network security conltrences at national as rvcll as intcrnattonal levels, IIc hm
also publishcd papcrs in accrltitcd narionalrintcmational loumals-Bcsidcs tha! hc also scn cs as a rcvicwcr lbr vuiou conlcrcnccs md
iournals.