A practical rule based technique by splitting SMS phishing from SMS spam for better accuracy in mobile device

(1)

<t

_'(lr

lv\yl

?"aite

ltfort\

PAu

--

It

Inlernational Reviev' otr Commtters and Software (I.RE.CO"S.), Vol- 9, N.

I0

lsslt/.t828-600J October 2011

A

Practical Rule

Based

Technique by

Splitting

SMS

Phishing

from

SMS Spam

for

Better Accuracy

in Mobile

Device

Cik

Feresa

Mohd

Foozy,

Rabiah

Ahmad, Faizal

M. A.

Abstract

_-

Short lllessage Settice

_{/Sl/S/ ts}

one

ol

the popular communitation serriccs. However, this can contribute

to

increasing mobile device dttdcks. Presently, SMS phishing (SMiShing) ailack is alarming to the mobilc phonc users because lhesc attacks usually succeed itt .stealing inJ'ornation and money" Moreover, SMS phishing and spam dre lwo dilJbrent ty'pas oJ

attack and letel of risk. Thus, it is imporlant to have a SMS phishing corpus. The established SMS

corpus is linited to span and none can be _{Jband suitable}_./brSMS Phishing. This study proposes a technique to split the class of SMS phishingfron SMS spam and produce befier accurocy using the Ba:v'esian teclrnique.

Ihe

resuh shovs lhat the enhanced SMS corpus gets 99.8064.01' atcurale classifcatiort. The study identified classes and generated an intpt'ovenrent of Slt{S Phishing corpus

which has been labelled

in

three different classes ie., Ham, Spam and Phishing

with

better accuracj. Copyright @ 2014 Praise

llorthy

Prize S.r.L -

Ail

rights resewed.

Kqmords

:

Classification, Derectio,t, Phishing, Secw'it"v-,.1M$ Spanr

I. Introduction

Phishing

is

one

of

the

vadous social engineering attacks that

is

continuously arising on mobile devrces recently. Many researches claim that a phishing attack is

designed to steal valuable inlbrmation such as credit card

zurd social sccuriry numbers, user IDs and passwords

[].

[2.]. Aucording to Boodae [2], mobile device users ale thrce times more likely

to

entel a web-based phishing aftack than deslmp users. Iu addition, phishing can attack through browser and email

_[3].

However,

it

also can attack

uia

Bluetoorh, SMS,

Voice Over

IP,

mobile applications and mobile bron'sers.

It

is u'idely accepted that attack via SMS phishing or SMiShing is increasing t4]-t61. A smrfi defencc mechanism is highly required to cornbat rhis issue. The big challenge lrowever, are that

cuncnt studies

on

SMS attack detection are mainly focusing on SMS spam detection and filtering"

Up

to date,

no

study has proposed detection technique for SMiShing. As various wpes

of

established SMS spam

corpus

are

available

publicly

over the

Internet, a

detection model is improved by ditlbrentiating phishing class fnrm sparn class by its proposed phishing features.

The basic model

of

splitting and extracting SMS phishing from SMS spam *'as proposed _{by Beck [7].}

In this stud,v, the extraction of the SMS phishing was done from SMS spam sample. _{Additionally, Nazario [8]} also identified a set of phishing email in his spam email.

He gave an idea to develop email phishing corpus by using Bayesian sparn classifier then labelling the spam email as phishiog email. This

is

because thei'e

is

no

public

phishing email corpus available

for

research, therefore a similar yet simple approach was used in this study in detecting the SMS phishing in SMS spam class.

Since

a

mobile

device has

a

snrall storage and processor capability, a simple but efficient approach is needed

to

sepatate the SMS phishing

liom

SMS spam class.

To

separate

SMS

Phishing

from

SMS

spamt a

practical rule

is

applied based on the technique which contain the lbatures of phishing SMS. The pubiic SMS sparn corpus that u'ill bc uscd has bccn downloadcd fiom

UCI

Machine Leaming and the corpuses are collected from Almeida and Hidalgo [9] and Nazario

_[0].

This SMS spam corpus only contains two classes

of

SMS ham and SMS spam. Sincc SMS phishing attack also has been increasing and ail'ecting the security and privacy of users, there rtas more motivation to develop SMS phishing corpus which contain SMS ham, spam and phishing for SMS phishing detection

study-The reason the SMS phishing was not self collected is because the data collection process

will

need extra trme and cost. _{Beck [?]}also has proposed a modified Bayesian technique to separate email phishing from email spam.

Thc difference between the technique ofBeck

_[7]

and tlre proposed technique in this research is that Beck [?] extracts a sct

of

phishing word based on collec'tion

of

email phishing. Then,

mn

experimenr

on

other email datasets by using the Bayesian technique. Wheteas the rcchniquc proposcd

in

this rcscarch runs tbc modificd SMS datasets that contain ham, spam and phishing class

in

WEKA

tools

_[l]

to

get

accurate results

of

the

datasets.

In addition, this proposed fi'amework does not include

the pr-ocess

of

identification and exuacting

the

SMS phishing

t'ord.

This

is

because,

until

this paper u'as written, there was no SMS phishing available publicly for SMS phishing studies, thus the

de

that u'as applied

(2)

I

on the SMS spam class wirs to again classily

if

the SMS

will be in spam or in a nen'phishing class,

Current sudies

on

SMS detection are fircusing in

detecdng SMS spam and

not

SMS phishing. Several repons has identified that SMS phishing have higher risk

than SMS spam and GetSaf'eOnline

_[2],

a

security wetrsite idenrified that

not

all

spam

ale

scam

but

all phishing are scam. This statement shows the risk level

of

phislug attack is more serious than spam attack. Most of the tips in detecting phishittg attack suggests

to

avoid clicking on any strang€

URL

address

in

the SMS. According to S. Abu-Nimeh et

al

_"fl3l

and Shah

[4],

SMS phishing

will

tr-ick users widr a fake URL and

by

clicking on the URL. a tiaudulent website

will

be

launched iurd thc malware

u'ill

be downloaded onto the mobile device. Moreover, SMS phishing atnck has been discussed by Dunham

_[5]

and Salem et _{al. 116l.}

K. Dunham

_fi5]

explained that SMS phishing is a new tactic

to

spread malware

by

adding the URL links in SMS and influencing the recipient to click on the URL.

However,

not

all

SMS wh.ich contains

a URL

is phishing. Thus, in rhis study, lhe SMS phishing

will

be

scparatcd fiom SMS spam class and not all URL in SMS spam class should be in the phishing category. Some

of

the URL is r,alid and the website exists. Thus, to replace SMS spam that contain URL to the phishing class is not proposed- As an alternative, a rule based technique was dcveloped

to

split the SMS phishing from SMS sparn according to fte spam and phishing theory.

Thus, the contribution

of

this paper

is

a rule based

technique that

will

separate SMS phishing from SMS spam class and then generate an enhanced SMS phishing corpus. The rest ofthis paper is as follows: Section 2, is Related Work that

will

discuss in deail the current sudy

of

SMS Scam Detection. SMiShing and SMS Sparn

Characterisdcs and the Rule Based Theory. Section 3, is

on the Methodology of proposing SMS Phishing Corpus. For section 4,

will

be thc Analysis and Result of the SMS Phishing Corpus and classification experiment that

will

be run to see the accuracy of the classification using the Bayesian Classification technique. Finally

is

section 5,

with conclusion and further snrdies.

II. Releted

Work

There

are

many

tools and

soft'*'are

that

can be installed

in

the mobile device

to

ensure

all

data are

protected and secure from any attacks but the inforrnation security problem

still

arises. According to Leavitt _[17],

phishiag attacks are rapidly growing on mobile devices. Moreover,

Joc and Shim

_[18]

said

incoming SMS phishing makes dre SMS recipient l'eel their personal

privacy

is

vrolated. The increasing

atack

is

because mnbile phones

is

one

of

the

important devices to communicate nowadays. F-Secure

_[9]

reported that SMS sertices usually has a hidilen charge on replyng

SMS phishing. The phishes

will

not only get money but also infbrmation about mobile device vcrsions, contact numbers and ofhers.

Copngh e 2014 Praise Wortlry Pri:e S.r.l. - .4ll righls re:en'ed

Cik Feresa Mohd Faom, Rabiah _{'lhmad, Faizal}

M

_'4

Convcntionally, SMS phishing

till

be sent

in

htgh volumes bur the current rend shows that SMS was sent

in

small quantities

for

the purpose

of

monitoring the security

level

of

the mobile device.

ln

addition, the problem with SMS phishing detecdon shrdy is the use

of

SMS language such as "eontacf'

will

be typed as

"clcf'

or "cotrlct". Thus,

it

will

be haud to idcntify thc specific phishing

word

tbr

SMS phishing compared

to

other phishing attacks.

il.].

.SMS&,zrrln Detection

Generally, SMS

architecture

contains Sender/Receiver; Protocol, Tirnestamp

and

payload

(SMS

content) behaviour.

Each

of

this

behaviour contains a log ofrhe nrobile device and the most valuable inlbrmation are in the SMS payload which connin texts, numbers and symbols in the message tvhere the style

of

message can be learned and predicted by using machine learning techniques.

I.

W. Yoon et al. 120) and Peizhou et al.[21] shrdied rhe SMS content-based with challenge-rtsponse scheme

wlrere J. W. Yoon et

al.

_{l2O) applied cryptography}and

C.{PCIIA and

Peizhou

et al. l2ll

conrbined the blacklisting tecbniqucs and CAPCHA

to

dctect SMS spam.

In

addition, Gr)mez Hidalgo

et al.

_[22]

had proposed a content-based SMS spam filtering

for

dual language

of

English and Spanish SMS spam using the Bayesian filter technique. Moreover,

T. M.

M:rtrmoud and A. M. Mahtbuz [23] applied whitelisting techniques to detect content of SMS phishing and Q. Xu et al. 124) proposed SMS filtering on non content-based using SVM and KNN techniques.

In

addition,

Joe

and

Shim[l8] also

uses SVM techniques

widr

rhesaurus

and

applied

in

Windows

environment.

Several

experiments

were done

by Ctrrmack et

al.

_{l25f on}SMS filtering using 5 types

of

spam

tilter

tools

and similarly, Najadat

et

al.

1261, obsen'ed several

classifier

techniques

and

did

a

conrparison of accuracy to detect spam SMS.

Machine lcaming technique also has been applied in detecting SMS. Studi€s that has been done by Xiang el

ul. l21l and Cai et

all2Bl

shows an improvement on the spam filter using traditional balanccd Winnow alg{tnthm vvhich

is a

linear-threshokl classificadon algorithm to detect SMS

in

Chinese language. Moreovet, Wu er a/. [29] proposcd a rcal dmc monitoring and frltcring SMS system

by

using Bayesian filtering and Pinyin Fuzzed keyword pattem machine- lie et _{a/. [30]}also applied the matching pattern technique. In _{addition, Uysal [31],}also

proposed

a

novel

frarnework

using

the

Ba1'esian tec'hrrique but with a dual featur-e seluction technique.

An

independent mobile device

filtering

by

Taufiq Nuuzzaman et al. l32l using Naive Bayes technique has

been proposed and they prol'ed that spam filtering can be applied independently trn the mobile device, Yadav

[33]-[34] developed

an

application called SMSAssassin that uses the Baycsian technique and blacklist words

in

the

system.

(3)

T.

M.

Marhmoud antl A. M. Mahtbuz [23] applied an

anificial immune system technique and the result shows the proposed technique is better than the Naive Rayesian

algorithm. Charninda

et al.

_l35l

proposed

a

hybrid solution of neural network and Bayesian frltering.

A verilication technique has been proposed by Verma

e/ aL

[36]

to

add

a

verifiiing

mechanism

in

the application.

In thc Cryptography area, Saxena _[37]has proposed a

secure

SMS ptotocol

tbr

SMS

transmissiorr

and

a

cryptographic algorithm

in

the

SIM

card. Moreovcr,

Pereira

e/ aL

_[38]

also

proposed

_a

lighttreight gryptography algorithm to mitigate SMS securiry issues. In addition, Choi [39] appiied the Common Public Key CY-vptographv techaique

for

SMS

commnnication etficiency.

Vall

and Rosso [40] did experiments that compared

plagiarism detection

tools

u'ith

machine

learning detecrion techniques. Rafique

et al.

[41]

applied the Hiddcn Markov Modcl (MHH) in SMS protocol tbr rcai dme detection rurd compared four evolutionary algorithm and non evolutionary algorithm classilicarion. Que and

Farooq

_[42]

also

applicd

MIIII

on

a

bytc

lcvcl distribution

of

SMS.

In

addition, SMS-Watchdog SMS detection scheme by Yan et al.[43] has been developed using anomaly detection merhods by idendfuing fte SMS

user behaviour in the Windows based.

Until

this

paper was written,

only

non

technical studies were done _{by [13],}

_{[5], [6]}

and _{[44],. However,} tbr SMS phishing tcchnical analysis, this study is the lirst

to separate

fie

SMS phishing from SMS spam corpus.

11.2.

SMiShing and SMS Spam Characteristics

Spam not only gives risk to the user, but also a{fect the organizarion [a5].

ln

addition, phishing attack also has a high impact on users and organizatioos- Horvever,

in

rhe SMS research area, the study

only

fbcuses on detecting SMS spam antl not for SMS phishing.

SMS

Spam

and

phishing attacks

have

diff'erent definitions and this was agreed

by

Toolan and Carthy

[46], Mistry et

al. [47]

and

S.

Chandran and S.

Mulugappan [48]. They defined SMS spam as uontaining adveniscment of rnarketing.

Additionally, phishing message is a social engineering method that colrtain messages to trick the user response

to

thc

SMS.

Aftcr

that,

thc

rnalicious

\r'cb

sitc

will

display

and

unforntnately

the

malware

will

be downloaded. As a consequence of

thal

phishes can get information

of

mobile device, contact number, bank account. track location and others.

From the findings of this snrdy, several charactcristic

of

SMiShing attack

has

been identified

and

these choacteristic were also implemented

to

separare SMS phishing

from

SMS spam. The chancrcristic

of

SMS phishing consists of:

r

maiicious URL

_tl3l-tl5l,

[a9]and [,H],

r

malicious Telephone _{Number [44],}

r

asking to unsubscribe the service _[49],

Copyright O 2014 Prai.:e lAofihy Pti:e S.r.l. - ,4ll rights resen'ed

Cik Feresa Mohd Fooz,v" Rabiah Ahmad, Faizal M. A

r

self answedng SMS tbr YES as agree to subscribe Is0],

r

Announce the SMS recipient as a winner in a contest or competition that was never eittered or

r

Asking for help to get profit such as need money to

eat or reload prepaid phonx. However, SMS spamwill contain:

r

Marketins campaigns 1401, t261.

[i1],

[23],

[]41, [32],

[al],

[42] or

r

Spread news _[24]

The similarit-v

of

SMiShing and SMS spam are ltrat both

of

them :ue unwanted SMS and the SMS scndet's

nunber is not in the SMS recipient's telephone nunber contact

list.

Ho'rever,

it

is

ditlicult to

idcntity

if

the

elephone

number

is

malicious

or

nol

unless the applicarion connects with the contact directory.

For this

srudy,

one

of the

SMS

Phishing characteristics above

will

be the

parameter

for

the

decision making

of

whether

the

SMS

is

phishing or spam.

The

features selection

will

contain Markering Adverdsements

and Winning

Announcements. The Methodology section

will

discuss further

how

SMS phishing is separated trom SMS spam data.

11.3.

Rule Based TlteotY

The rule based approach is a part ot'ao expert systern arca and scvcral srudies havc applied this mcthod for decision making. Peizhou, Xangming et al.

[2])

sairl t]re

comrnon SMS content-based filtering method are rule-based techniques. The study

to

detect phishing attacks using the rule based technique has been done by R.B. Basnet _{e, al. [51], which}generated 15 rule sets that was analyzed using machine learning techniques. Moreover, Shreeraur

_[52]

proposed

a nrle

sct

to

detcct phishing attacks using genetic algoritbms.

Salenr _[16]have proposed a rule set in the studies ftrr the awar-eness program and aniticial intelligcnt tools It) detect phishing emails. Liping et a/. _{[53] had}examined seven feah.rres for phishing email detection. In addition, Pamunuwa et al. [54), proposed to detect phishing email by using IDS.

Alnajim aud _{Munro [55]}studied on phishing detection

in

rvebsites. Zhou

ei

a/. [56]

also proposed

to

detect phishing attacks and Abunous er

al.

151) had listed several criteria to detect phishing and the technique that u'as applied was the fuzz-v technique. SpoofGuard, which has been discussed by Huajun

ct

a/. [58], is a popular anti-phishing tnol. The author also sudied several criteria to detect phishing.

There are several tactics to spread phishing email and u.eb sites but there are

still

less

for

SMS Phishing and Slr'{S spanr. Unifornr Resource Locator

(URL;

is

an

address of website which is one of the important features

to

detect phishing attacks. Moreover,

URL

phishing dctection also has been widcly applicd to detect mobile u'eb phishing and ernail phishing. Nevedheless, tbr SMS phishing attacks, detecting phishing througlr

URL

is importanr as one of rhe lbaures to detect SMS phishing

(4)

I

Cik Feresa Mohd Foozv, Rabiah .lhmad, Faizal M- -J

because this attack rvrll hick users by scnding an SMS

uith malicious

tiRL

_{[13]-[15], [49]}and _[44].

These solution

for

mobile web phishing using URL

has been studied _{by Niu [59]}and Weili et al. [6O].

Niu

[59] proposed to design an important and short

URL

to

be displayed on salbri browser on the iPhone mobile device. Weili el _{a/. [60] upplied}a URL checker in

their anti login intertace liamework. Yadav et a/. [33] also applied

URI

features to detect SMS ham and SMS

spam but not SMS phishing.

Howcver. since this study only focuses on scparating

phishing

SMS

fiom

spam class,

not

all

URLs

are

malicious. Some URLs contaim a valid format or have existing websites.

Thus,

new

feature selections

such

as

Winning Announcement

SMS and

SMS

was applied to improve accrracy result. Moreo't'er, based on rhe understanding of spam and phishing SMS, these rwo features differcntiate the spam and phishing SMS.

nL

Methodologr

The aim

of

this paper

is

to

separate SMS phishing fi'om SMS sprm class in SMS sparn Corpus that has bten downloaded from I-ICI Machine Learning. The features

of

SMS

phishing

are

based

on the

phishing characteristics that has been discussed

in

the previous section such as f'ake URLs, u'inning announcement and

ti'ee gifts and request to response.

The SMS phishing chatactcristics were gained from

various conl'erence papers and joutrrals

from

digital

libruy

IEEE Xplore,

ScienceDirecf

ACM

Digital

Library and Sprirrgetlink.

In

this sh.rdy, English SMS sparn corpus are based _{on [61], [62]}and [22] datasec

that

has heen dourrloaded

ftom the UCI

Machine Learning [9].

The

total

SMS that were downloaded were 5572, however after pre-processing data such as rcdundancy rernoval, 51 66 of SMS spam corpus was applied. There are two classes

of

ham and spam SMS

in

this datasets

u'hich consist or 4515 SMS ham and 651 SMS spam.

Table

I,

shows the SMS spam specification that has been

downloaded-Fig.

I

below is a generic rule flows how the sep&7rdng phishing SMS to SMS spam is done.

TABLEI

ENcusH SMS SP.{-Ni SPEcTFtcATIoN

English SI\IS Corpus gnglish SMS Soecificetion

Fig. | - Gcncric Proccss ofRulc-Bascd Decision Making to S€pdate SMS Phishiug Aom SMS Spam

$tep 1: Pre-Processing the entire SI\{S Spam cotpus such as removing SMS redundancy and tokenize

Step 2: Applied lbur nrles ro rnake decision rnaking:

Rule

l:

lF

the SMS

is

an wiuner arurourcement, TTIEN result=l

Rule

2:

TF the SMS

is

a

marketing advertisement TIIEN

result:l

STEP 1 storing and handling SMS sparn corpus using Microsoft Office Excel and pre-processing tbe SMS to clean the data tiom noise. STEP 2 is how the rule based

pedorms. Theie

will

be no issue of the generated SMS phishing being misconstrued as a ham SMS because the

rulcs are only for SMS spam class liom thc UCI Machine Learning website.

For Rule

l,

the screening words ofkeyvuord

"rev:art'

and "lr,irr"

is

a word ro initiate user to respond to the SMS. Thus,

if

the SMS

infom

ttrc recipient that ftey

won, the result

will

be

I

as TRUE. For rule 2, if the SMS is an advertisemcnt SMS, it also

will

be TRUE.

IV. Analysis

and

Findings

From the analysis that was done on the SMS spam corpus that was downloaded

from

the

UCI

Machine Learning, there are several SMS phishing that has been

ideutifled

on

SMS

spam.

After the

pre-proccssing process, there are 4515 SMS hanr and 651 SMS spam

and the result aftet Step

I

until Step 3 was applied. 45 | 2

[ntematioml Ra'iev on ComPuters ond Sofware, Vol. 9, N. | 0 No ofmessages

No of rvonls No of cless

Avcragc no oiwords pcr

lrrcssage

Collcction mcthods ald

proceduts

Composifon oI Lext

I-alguagc of communication T)Ipe of conmuication

5166 7974a

2 (Ham md Spam)

r4. l 7928

Dorvnload at UCI Machinc Lrarning Mobile Device uer

EngLislr

SMS from Unknown Smder,

.{d\crtisemilt. Pcrsonal cumrurucation

SIUS Spam Comus

RULE

I:

Stinner?:-t

RT.ILI 2: .Ad!cnisctrrint !-4

[image:4.593.311.506.104.377.2] [image:4.593.82.291.584.705.2]

(5)

Cik Feresa Mohd Fooz,v, Rabiah Ahmad, Faizal M. A

SMS ham, 567 SMS Spam md 84 SMS spam has been

identified (Table II).

Moreover,

this SMS

phishing corpus

has

been

analysed using the WEKA tool [1

l]

for evaluation of the

accuracy classification.

RuhrBrsed RulsBasrd

SMS

Spar lhm

SDrm

Pbishing llam Sprm tshishlng DamcL [9],n01

45t5 65i

0

4515

567

E4

Aftcr

SMS Phishing Datasets has been developed using the proposed rule-based technique, an experiment was done using the

WEKA tool

_{I l]}

to

validate the proposed rule based technique. True Positive Rate (TPR), False Positive Rate (l'PR), b'alse _{Negative Rate IFNR),}

and True

Negative

Rate

(TNR)

of

SMS

Phishing Dausets havc been identifietl using the WEKA tool.

To get the result of accuracy level, the SMS phishing fcaturcs was constructcd bascd on litcrafurc rcvicw. In addition. there are several Machine Leaming techniques

in the WEKA tool. however

in

this paper the Bayesian technique was applied

to

check the classification

of

accuracy rate of SMS ham, spam and phishing.

Based ou the result in Table

III,

thc accuracy ofthesc SMS Phishing Corpus show high accuracy compared to SMS spam Ctrrpus.

TABI.EItr

RESULT SMS DFTF-L"TION USINC BA1'ESIAN TECH}TIOUE

B'@

l'lramelrr

T.. Phlshlng sDrm

Phishing compau'cd

to

SMS spam.

It

is

due

to

the separating pl'ocess. The result however look only limited to SMS corpus by Almeida and Hidalgo [9] and Nazario

[10]. Further srudy,

will

be tested using Malay SMS corpus for explore the efficiency of this method.

Acknorvledgements

The authors would like to thank Universiti Teknikal Malaysia Melaka (UTeM). Universiti Tun Hussein Onn

Malaysia(JTHM) and

Ministry

of

Higher Educarion Malaysia

for

supporting this research. This research is f'unded by MOHE under Long Research Grant Scheme LRGS/20 1 l /FTMK/TKo l i 1R00002.

References

[1] Microsoli. (2011, 8th June)- Email and weh scams: How to help

pn)ted your:ell.

I2l

tll

t4l

t5l

t61

t?l

Available

htlp:./,ruw.micro*ti conr/sccuiry

onlinc-privacy/phishhg-sDams.aspx

M. Boodae, "Mobile Uscrs Tluee Timcs Morc Vulncrablc to Phishing Ailacks," il lrstaar vol. 2012. €d, 201 l.

P. Soni, er al, 'A phishing malysis of wcb burd systctrrs," presented at the Proceedings ofthe 201 I International Confetence

ou Comuication, Computing \&\C38; Security. Rou'kela,

O<lisha, India. 201 t.

C. F. V. Foory, et a/., "Pbishing Detecriotr Taxonomy for Mobi-le

Device," .InteDutilnal Jrurnul ol Computet SrietEe LlTues

qJCSII. t;ol. 1 0, 20 13.

,{. Kang. ct dJ., "Security Considemtions for Sman Phone Smi.shing Attacks.'in.4dtunced in CorrPuret Science and

it,-"lppfitations, cd,: Springcr. 20 14, pp. 46?-471.

D. Kim md J- Ryou, "SccurcSMS: prcvmfon of SMS imerception on .{ndroid plarfom," prsf,ted at the Proceedings

of dre 8dr Inrerurional Conference on Ubirruitou Infomntion Muguur md Cmurmicadon, Siern Reap, Cmrbodrq 2014.

K. Ber:k md J. Zhin, "Hnshing Using a Modified Bayesim

Tmhnique," in Sacial Computing (SriulGtm), 2l)10 IEEE Frlse

Positive

0.01.1

Correcdy Clessilicd

hcorrcctlv Clessilitd

00

99.8064 %

0.t936 %

V. Condusion

As

a

conclusion,

SMS

phishing

and

spam have different characterizations and security risks. Since there are

no

available SMS Phishing corpus

in

the English language, this study is the first to propose a practical rule based technique

to

sepatate SMS phishing

iiom

SMS

spam class.

Two new feanrres are also introduced

in

this study

such

as

the

Winning

Announcement

SMS

and Advertisement SMS. These two features has been added because ol the main characteristics of SMS phishing and spaun whcre generally, SMS phishing

will

ccmtain an

winning

announcernent

and SMS

spam

thar

\r'ill

broadcast markedng advertisement.

Additionally, bascd on thc cxpcrimcnts rcsult in Tablc

II,

the proposed rule based with these new features has shown the availabiliry to separat€ the SMS phishing class

from SMS spam and generated a SMS phishing corpus tbr further study in SMS phishing attack which has been incrcascd nowadays. According

to

the proposed rule based system, the result shows befter accuracy for SMS

Copyrigtu e 2014 Praise llortlry Pri:e S r.l. - All ighls reten'ed

Suoild lrta'ndtionul ConJercnce on,2010, pp. 649-655.

[8] J. Nazario, "Phishing Corpus," vol. 201 3, ed, ?005.

[9] T. A. Almeida and J. M. G- Hidalgo. (:012, l3 September 201?).

SMS St am Coilt<tutr Don .Sel, ,Availahle: htt):,//archive.ics. uci- edulrnl/dara sets/SMS+Sptm+Collecdm

[10] J. Nazatio, "Phishing Corpus," 2004-2007.

[ll] 1,. ts'. Mak llall, Geotliey llotmes, Bemhard Ptahringer, Peter Reutemmn. Ian H. wittil, "'lhe WIKA Data Mining Solts'are: .{n Update," S/GXDD Ex plo ru t iort, vol. I l, 2009.

[2] GctSafcOnlinc. (201?, I January 2013). Spun & Scun email.

Available: htrp:/l*rvu'.gcrsar.conline.org/protccring'your' conrputer/spcnr-md-scam-etmil/

[13] S. Abu-Nimeh, er nl., "Disnibured Phishing Detccrion by ,\pplying Variable Selecrion Using Bayesian ,\dditive Regression Trees," in Conmunitatiorc, 2009. ICC '09. IEEE lnternTliaml

('onfercnce on,2009, pp. l-5.

[14] J. Shah. "Onlinc crime nigr"tes to mobile phoues," Sage vol. l,

pp- 22-23,200?.

[15] K. Drnhm- "Chalter 6 - Phishing, SMishing, and Vishirtg," n

Mohile Maht'are Allacks awl Defense, D. Ken, Ed.. rd Boslon: Syngress. 1009. pp. | 25-196.

[16] O. Salcm, e/ a/., "Awarcness Program and AI bascd Tool to Rcducc Risk of Phishing Arucks," iu Cotttputet and Information

Tethnolog; (ClT), 2010 IEEE l0h ltternailonal ConJerence on, 2010,pp. l4l8-1423.

[17] N. Leavin. "lvlobile Security: Finaliy a Serious Problem?," Computer,,tol. 44, pp. I I - 14, ?0 | l.

[18] I. Joc and H. Shim, "An SMS Spam Filtoing Syst€m U:irg Suppot Ve ctu' Machine," iu Flrtare Generatiou Infomarion

Techrclog;. vol. 6485, T.-h. Kim, et ul.,Eds.. cd: Springer Berlirr

(6)

l

Hcidclbcrg, 201 0. pp. 5??-58a.

[ | 9] F-Sccure, "Mobilc Thrtat Rcport Ql 20 | 2,. F-Secure L.abs20 | 2.

[20] J. W. Ymn, eI a/., 'Hvbrid spanr filtering for nrobile conrnrunication,r' Contpaters &amp: *curitv. !$1. 29, pp. 446-459.2010.

[21] H. Pcizhoq a a/., "A Norel Methorl tbr Filtrring Grcup Sending Shorr Mcssagc Spam," in Corltgcttcc ancl Ih'btid htfotmalion Technolog,,, 2008. ICHIT '08. Intenntiotal Corlcrcncc an,2008,

pp.60-65.

[22] J. M. G. Hidatgo, c, rrl, "Cotrtcnt based SMS spam filtedng,"

preienred ut the Pnrcecdings of the 20{]6 ACM symposium on Documcnt cnginccring, Ar:rstcrdmr, Thc Ncthcrlands, 20t)6-{231 T. M. Va.hmoud md A. M. Mahfotu, "SMS Spam Filtcrmg

Tcchruque Bascd on .Arliflicral Imunc Sysictr!," 7JCS1

International Jotrnal of(bmputcr Sciencc Issws, vol. 9,2012.

[2al Q. Xu eraL, "SMS Span Dclcction Ning Comcnt-lcss Fmru'cs,"

Iiltelligtnt Systcnts,Ifftr, vol. PP, pp. l-1,2012.

[25] G. V. Cornack et d/., *Fearue engineering for mobile iSMS)

spm fikoing," presented at the Procedings of the 30th mrual

intemarional ACM SIGIR cDnference on Reserch and

developrnent in infomradon rerrie\al, Amsterdm, The Nethu.lands, 2007.

[26] H. Najadat er al, "Mobile SIr{S Spanr Fihering based on Mixing

Clasit-rcrs. "

[2?] Y. Xiang et a/., "Filtcring nrobile spm by support lictor

machinc " prcscmcd at thc Confmcncc m Computcr Scrcnccs,

Sofiwae tngineenng, lnfomadon lechnology, E-Business md

.A.pplicatiom (3rd; ?0(X ; Cairo, Egypt), Cailo, Eg1'pt,20el.

[2E] C. Iie, er aL, "Spam Filter for Shon lv{*sags Using Wimow," in

,4dvanted Longutge Aucessilg und Weh InJinnation Tuhnologt, 2(x)lt. ALPIT 'o8. fnrunational CanJb'eu:e on,20O8, pp.45,{-459.

[29] W" Ningning. et al. "Real-time monitorhg md filtering systent

for nrobile SMS," in Induslriol Elcctronics and Applianrions, 2008. ICIE.I 2008. 3rd IEEE Confcrercc or, 2008, pp.

l3l9-[]0] J. lfumg et a]., "A Bayesim Approach for Text Fifter on lG

Nctwork " in l/ireJess Cotnrnunicalions .\tenuor*ing an,l :Vohile Computing _lIyiCOM),2010 6lh Inlernatioml Corferetce ot,

2010. pp. l-5.

[31] A. K. Uysal, er aL, "A novel frantcwork for SMS spam filttring."

'n _Innovtlions_in_Intelligenl_S1'srems_aul_Application:(DIISTA), 201 2 lntcrnational Symposium on,201 2, pp. l -4.

[32] M. Taufiq Nuruzaman, c, dJ., "Simple SMS spam fihering on independenr mobile phone," Stcurin: and Cannrlnicailon Nerrorls, vol. 5, pp. 1209-l 220, 201 2.

[33] K. \'adav, e, ,1., "SMsAssassin: ctowdsowcing driven

mohile-based system 1br SL.{S spam tilterhg," Prsenred ar the ntuoaedings of thc l2th Workshop ou Itlobile Conrputing Systms

and Applicatiols, Fhoebix. Aribna,

3011-[]al K. Yadav, el al., "Take Conrrol of You SMSes: Desigring m Usabh Spanr SN'[S Filtcring System," in Mobile Dutu

ll{tmgemen (MDn, 2012 IEEE 13rh lilteildti.nrdl Co4ference oa, 201 2, pp-

352-355-[35] T. Chaninda er a/., "Content based hybrid sms spam frllering

sysrem," 20l4.

[]61 R. K. Vemo, el cL, "Exracdon and Verification oi Vobile

Mssege Integrity," it Corwnuticanbn Dtreil.r and Neu-olk

Te&nologies (CSNT), 201 | lnlemalional Confererce on, 201|,, pp.49-53.

[3?] N. Suma md N. S. Chaudiuvi, _"SccueSMS:A securc SMS

protsol l'gr VAS and other applications," Jourrtal oJ.Srsterns and Software, vol. 90, pp. 138-150. 2014.

t38l G. C. C. F. Pcrcir4 el aL. "SMSCrypto: A lightn'right

cqptopraphic trameu'ork for sccurc SMS transmission," Journal o/$,srerru and So/ware, vol. 86, pp. 698-706, 201 3.

[39] J. Choi and H. Kim, "A Novel Approach ibr SMS security,'t lntemational Joumal of Seetri4. & Its Applimriob,l'of. 6, 2011.

[40] E. Valt iurd P. Rosso, "Dctcction of ncar-duplicatc ulcr ginei?ted q)nterrtt: the SN{S spam collection," prcsarrd at the hnceedings

of the 3rd inteinadonal rvorkshop on Search and mining user-gsrerat€d cortcnts, Glasgorl. Scotlan{ UK, 201 l.

[41] M. Z Rafique, et c,f., ".A.pplication of evolutionary algorirhms in dctecting SMS spam a[ access iaycr," presented at the Proceedings

of thr 13dl annul confcrcncc on Gcnctic and cvoluraona.ry

cotrpuhtion, Dublin, Trcland, 20 I | .

[42] Nf. Z. R. que and M. Far6oq, "SMS Spanr Dctecriorr By Opcratirrg

On E''te-Level Disciburions Using Hiddcn Mukor' I\{odels

IHMMS)," prcsentcd al tlre Virus Bulletin Conlbrence Scptenlbir

20r0.2010.

[a3] G. Yan, ct a/., "SMs-W4tchdog: Ptofiling Social Bchaviors of

SMS Uscrs tbr Anomaly Dctcction Rccrrrt Adv'anccs in Inhxsion Derection." vol. 5758, E. Kirda e, al, Eds., ed: Sprilger Berliu /

Hcidelbetg, 20tJ9, pp. 202-223.

[44] S. Lee. (2012, Snishing: SMS + Plrr]ing, Prcsent And On The

Ri,rz On _'1rulroid. Availablc:

http: ri'wrvw al cnboot. cob./blog./blogs/cndpoint_sccruiry/rchivc/2 0.1 2/l l/14/smishing-sms-phiJung-prcscn!-md-on-drc-rasc-on-android.aspx

[45] Ktuthika Rcnuka, D.. Vislakshi, P., Blcnding lirctly md baycs

classilicr lbr cmail spam clmsilication. {2013) Iternilional R*iew on Cohpl*ets and Saftv'dle (IRECOS), I (9.), pp. 2168-7111.

[46] F. Toola-o andJ. Cuthy, "Feamre selection for SprimandPhishing detecdon," in eCline rResearchers Sunmit (eCrine.1, 2010,2010,

pp. 1-1?.

[.1?] N. Mistry. cr a/., "Preventive -\ctions to Emerging Theats in Snrad Dcviccs Sccuriry," 2t)l 1.

[48] Chmdran, S.S., N{uugappm, S.. Spam detcction md clinination of mcssgcs from twitcr, i2ql3) Intcrtatiotal Revicw on Computers anl So/n'are (RLa'Os}8 (10). pp. 2438-244i.

[a9] L. Kriel, cl al, "Torvalds a computer securq' inducrion nillnl for non-IT citims."

[50] A. Mahaja4 et dl, "Identification of Fake SMS generated using

,{niloid Applications in Android Devices."

tsll R.B- Bamet , zt al., "Rule-Based Phishing Attack Detecuon,"

Interndlirndl Conlbrcrte or Securitv tnl Manugenenl SiMll (2ot 1), 2O1r.

[52] V. Shreerm, ct al, "Anti-phishing derection of phishing anacks

using gcnctic algoridun," in Cannunicalton Control and Computing Iec,hnologies (ICCCCJ|), 201() IEEE Inlenalional

Conlbrence on, 2010, pp. 447-450.

[53] M. Liping, er al, "Automaticdly Genuating Ctassifis for

Ptrislring Enail Prediction," in Pelvasdve Syttens, Algorilhnt,

oazl,Vanr,af.r (ISPAN), 20il9 I(Jrh Intenwrional SymJxrsium on, 2009. pp. ?79-783.

[54] H. Pmunuwn, et al., "An Inmrsion Derecrion System for

Detecting Phishing ,Attacks." in Sc"c/r€ Data Matogemcnt. vol.

4711. W. Jonker and M. Petkovrc, Eds., ed: Springer Berlin i

Heidelberg, 2007. pp. l8l-192.

[55] ,{. Alnajim and M. Munro, "An Approach to the Implenflmdon

of the Anii-Phishing Tool lbr Phishing Websitrs Deteclron," m lntelligent Nemortirg tnd Collahot'an\t S.ytrems, 2009. INCOS 'il9. ln r enw tion tt I L' onfere n c e o n, 2009, pp- I 0,5- I I 2.

[56] C. V- Zhort, et ol., "A Selt:Hcaling, Self-horechng Collablrdtive

lntruion Dctccrion Architcctttre to Trauc'Bask Fast-Flu Phishing Donrains,'in Natunr.t OPerationt tnd Montgenent

Svmatsiun Worlrshops, 2008. I;OVS Workshops 2008. IEEE, 2008. pp. 32 I -32?.

[57] N,f. .\bunous, e, al., "lntelligent Phishing Website Deeclion

System using Fuzzy Tcchniques." in hrJonution and

Commilnrtution Techmlogies: From Theory lD AlrPliL'ation\, 2Nt8. ICTTA 2008. 3rd htemational Confuove nn, 2008- JrJt-

l-6.

[58] H. Hutju, et al, "Coutomeasu: Tecbniques tbr I)cceltlive

Phishing.Attack," in rVew h'ettls in lnlbnnatiott und Scn'ice

Scien*, 2009.,VISS ?9. lntenntioual Confereute ol, ?009, pp.

636-641.

[59] Y. Niu er a/., "iPhish: Phishing Vu]ncrabilitics on Consumcr Elecronics." in LrP.iEL'. 2008.

[60] H. Weili, er a/., "Anti-Phishing by Smart Mobile Device," in

Netb'ork rnd Purallel Compuing lforkshops, 2007. NPC l{orkshops. IFIP Intennional Cot4ference on,2007 . pp. 295-302.

[61] G. V- Cormack, rl dl, "SPdm filtering tirr short missges,"

prcsentcd at thc I\occcdings ofthc sixtcctrth ACM cont-erence on Corrt'creirce on infbrmation arrd krrou'ledgc managenrent, Lisbon. Porugal, 2007.

[62J O" V. Cormack, cl a/,. "Farure engineering for mobile (SMS)

Cik Feresa lulohd Foozv. Rabiah Ahmad. Faizal M. _'4

(7)

Cik Feresa Mohd Fooz-v, Rabiah Ahmad, Faizal M. A.

spam iiltcring," prcscntcd at thc Procccdings of thr 30th annual

intanational .{CM SIGIR oonfcrcncr cn Rcsarch antl

dcvclopment in irrfomration .ctrieval, Anrstcrdam, The Netherlands, 2007.

Authors' information

Cik tr'ercee l\{ohd tr'uozy is currendy working

with Universiri Tun Husscin Onn Malaysia

(UTHM), Malaysia. Fcrcsa holds a Master's

dcgcc in Computer Scicncc(lnfomarion Security) ltom Univcrsiri Teknologi Malaysin

Malaysia urrl a Bachclo"s degtc il Infbmariou Technology and Muldmedia fi-orn

Univcniti Tun Husscin Onn Malaysia (UTTiM), Malaysia. Shc is cuncntly pursuiag hu PhD at thc Univcrsiti Tcknikal Malaysia Mclaka,

Malaysio-Rrbirh .{hmad is an .Associalc Profesv:r at rhe

Fauulty of Infomrarion Terhnology and Communication, Universiei Teknikal Malaysia Mclaka,Malaysia. She rrceived her PhD in

lnibmadou Stu{tics (hcdth infbnnarics} litrn the Univeniry ot ShEtliel4 UK and M.Sc. linlbrmadon security) liam rfu Royrl Hollo*'ay

Uuivcritv of London. UK. Hcr rcscarch irrtercsts include healthcae sysrem securiry and healdr intbmadcs at narional as well as intemational levels. She has also publishedpapers m accrcditcd information sccurity archi&cmrc. Shc has dclivcrcd papus at various national/inrcmarionaljoumals. Risides that, shc also sewes as a

riviclcr tbr variom confcrrnccs and ioumals,

Mohd Faizal Abdollah is an Assmiate

Prol'cssor at thc Faculty ol lnlbmration

Tcchnology ud Comuication. Univrrsiti

Tcknikal Malaysia Mclaka.Malaysia. Hc receiverl his PhD in Compurer and Nerrvork

Sertriry froor tlnivcrsiti Teknikal Malaysia

MelakaMalaysia, md M.Sc. {ComFuter

Science) li-om the Univetsity Kebangsm Malaysia His rcscarch intcrcar mcludc ncnvork ud mobilc sccuity

and nclvork moltttonng. Hi has dclivererl papcrs at vzu-ious network security conltrences at national as rvcll as intcrnattonal levels, IIc hm

also publishcd papcrs in accrltitcd narionalrintcmational loumals-Bcsidcs tha! hc also scn cs as a rcvicwcr lbr vuiou conlcrcnccs md

iournals.