• No results found

Community Detection Using Random Walk Label Propagation Algorithm and PageRank Algorithm over Social Network

N/A
N/A
Protected

Academic year: 2020

Share "Community Detection Using Random Walk Label Propagation Algorithm and PageRank Algorithm over Social Network"

Copied!
5
0
0

Loading.... (view fulltext now)

Full text

(1)

IJEDR1702115 International Journal of Engineering Development and Research (www.ijedr.org) 672

Community Detection Using Random Walk Label

Propagation Algorithm and PageRank Algorithm

over Social Network

1

Monika Kasondra,

2

Prof. Kamal Sutaria,

1M.E. Student, 2Assistent Professor , 1Computer Engineering Depart ment, 1

V.V.P Engineering Co llage, Rajkot, India

_________________________________________________________________________________________ _______________ Abstract Now a days, human bei ng bec ome more social, because it is allows pe ople to e xtend their social life in unprece de nte d ways. People bec ome more friendly with e ach other on social sites like Face book, Twi tter , etc, based on their interest, likes etc. In social network, havi ng same interest have the same c ommunity. So, detecti ng and e valuating the community structure is an essential task in social network analysis. In this paper, we are g oing to discover hybrid appr oach of Random Walk Label Pr opagation algorithm and PageRank algorithm to define d best community over social network. Main issues to define c ommunity ti me c omple xity, scalability, heter ogeneity, to i mpr ove this issues, hybrid appr oach has bee n pr opose d. Here de tect the c ommunities base d on seed node

Index Termscommunity detection, seed node, social network

________________________________________________________________________________________________________

I. IN TRO DUC TION

There are verit ies of the social networks, such as Facebook, Twitter, A ma zone, Lin kedIn, etc. We are growing with the informat ion age. Now a day most of the people connected with this type of social network. W ith the development of the smartphones, more and more people log into their social networks through their s martphones and share te xt and mult ime dia informat ion with their friends online.

The social network is usually modeled as graphs. A social network consists of a set of nodes along with edges connecting the nodes. The nodes represent the object in social networks, such as people, co mmodit ies, etc. The edges represen t the relationships between objects.

The commun ity is an important structure in social networks. Basically, co mmun ities are set of nodes with higher edge density than the null model. In recent year, find seed nodes and how to use the seed nodes for community detec tion has become a hot topic in social network analysis and field of data min ing.

In social network analysis, finding seed nodes have a wide range of applications. E.g., if we can dig out the most influential customers in marketing, then through the commun ity structured by the seed nodes, a product brand can be rapidly promoted.

In this paper, we show the different methods, which describe how can detect communit ies based on the seed nodes over the social network.

II.CO MMUNITY DETEC TIO N

Detecting and evaluating the community structure of real-world graphs constitutes an essential task in the area of graph mining and social network analysis. The network contains many structures and tightly connected groups. Also referred as community, c lusters, modules, etc.

The task of community detection is to find sets of nodes with lots of connections inside the sets and few edges outside the set.

In figure 1, depicted the different communit ies over the social network. Different color defines different commun ities and number specify node id.

There are many available approaches for Co mmunity Detection[5]:

1)Group-based approach

2)Network -based approach

3)Propagation-based approach

4)Hierarchy-based approach

Seed-centric approach

(2)

IJEDR1702115 International Journal of Engineering Development and Research (www.ijedr.org) 673 Figure, 1. Co mmunit ies in social network [4]

III.RELA TED WO RK

Seeding phase [2]

In seeding phase, which consists of four phases: filtering phase, seeding phase, seed set expansion phase and propagation phase.

Figure 2. Seeding phase[1]

Filtering phase

In this phase, to remove regions of the graph that are trivially separable from the rest of the graph, so will not participate in overlapping.

The output of the filtering phase is the biconnected core graph where whiskers (subgraphs connected to the biconnected core) are filtered out. It re moves regions of the graph that are clearly partitionable fro m the re ma inder. More importantly, there is no overlap between any of whiskers. Th is indicates that there is no need to apply overlapping community detection algorith m o n the detached regions.

Seeding phase

After we get biconnected core graph, we find seeds in filtering phase. The goal of seeding strategy is to identify a diversity of vertices that lie within a c luster. The output of this phase is seed nodes over a graph.

Seed set expansion

In this phase, expand the seed set using a personalized Page Rank c lustering scheme. A personalized Page Rank vector, followed by a sweep over all cuts induced by a vector, will identify a set of good conductance within the graph. The set identified via this procedure has a conductance that isn’t too far away from the best conductance of any set containing that vertex.

Propagation phase

After seeding phase, propagation phase will apply. In th is phase, once we get the personalized Page Rank co mmun ities on biconnected core graph, we further e xpand each of the communit ies to the regions that we detached in the filter ing phase. For each detached whisker connected via a bridge, we add that piece to all of the clusters that utilize the other vertex in the b ridge. The output of this final phase of seed is different co mmunit ies of the graph.

In the Random Walk Label Propagation algorithm[2], first this algorith m does not required number of node as well as size of commun ity, so it is random not static. To find the important node over social network it uses the position probability of the random walker. Th is algorith m following the depicted steps:

1) Ca lculate the importance of each node

(3)

IJEDR1702115 International Journal of Engineering Development and Research (www.ijedr.org) 674 Sa me as in PageRank a lgorith m[3], is a useful algorithm fo finding co mmunit ies over social network. To obtain the overall community structure of the network, personalized PageRan k should be executed amounts of times. A personalized PageRan k vector is the stationary distribution of a random wa lk that fo llo ws an ed ge of the graph with probability α. The essence of the induced community is that an inexact personalized PageRank vector, co mputed via an algorith m that “pushes” rank round the graph, will identify good bottlenecks nearby a seed vertex.

IV.PRO POSED WO RK

To find co mmunity here proposed method is depicted.

Figure.3 Proposed System

Here, as per depicted figure, first take input fro m the database dataset, which is in form of graph. Then apply Random Walk Label Propagation Algorithm by conductance parameter. Find all node which having ma ximu m conductance, declare all node as a seed node. Now find the InDegree and OutDegree of all the seed node which are lies in seed set. Then apply PageRank Algorith m over seed set. It will chose ma ximu m InDegree or OutDegree of each seed node which are best seed node and there is high probability having best community. And at last e xplore the neighborhood and find the communities.

The e xperiment had done in different three dataset. We measure here the value o f conductance and then based on this value find the co mmunity.

Dataset Co mmunity

Detected

Conductance I/P fro m

Database

Seed Select ion by Random LPA

Seed Set

Find Degrees of each Seed Node

Apply PageRank a lgo on Seed Node

(4)

IJEDR1702115 International Journal of Engineering Development and Research (www.ijedr.org) 675 Karate Club

Network

4 0.4188

American Football Net work

10 0.6043

Dolphin Network `6 0.5277

Table 1. Conductance value with dataset Based on this value here is a graphical representation.

Figure. 5 Graphical representation

We can see that in Table 1, there is a Karate Club Network having conductance value and detect 4 communit ies which is shown as bellow.

Figure. 6 Co mmun ity Detected

V.CO NCLUSION

In this paper we conclude that, having ma ximu m conductance value we gain best community over social network. Here we tested on three different dataset and find the communities. So this combine approach of Random Walk LPA and PageRank Algorithm gives the best communities .

0 2 4 6 8 10 12

Karate club network

American football network

Dolphin network

Community Detected

(5)

IJEDR1702115 International Journal of Engineering Development and Research (www.ijedr.org) 676

VI.REFER ENC E

[1] https://www.google.co.in/search?q=seed+phase+in+social+network

[2] Su Chang, JiaXiaotao, XieXianzhong, Yu Yue,“A new random-walk based label propagation community detection algorith m”, IEEE – International conferences on web intelligence and intelligent agent technology 2015.

[3] Fei Jiang, Yang Yang, Shuyuan Jin, Jin Xu, “Fast Search to Detect Co mmunities by Truncated Inverse PageRank in Social Networks, IEEE International conference on Mobile service – 2015.

Figure

Figure, 1. Communities in social network  [4]
Figure. 5 Graphical representation

References

Related documents

Similarly, the following Table 9 reports coefficients of model 2 for variables of financial development (LNFD), agriculture sector (LNAS), manufacturing sector (LNMS),

We examine four classes of approaches used to collect reliability feedback data for mass-market software (MMS). We focus on the most commonly-used approaches. Other specialized

ABSTRACT: Helicopter planetary gear train with compact structure, large transmission range, powerful bearing capacity, is one of the most important part of helicopter drive

Attorney Mario Merola attributed the low trial conviction rate in part to minority commu- nities’ distrust of police); John Simerman, Orleans Parish Conviction Rates in Jury

( 3 ) The knobS chromosome, an altered abnormal 10 lacking about one half of the heterochromatic segment and distal euchromatic strand, shows normal 1 : 1 ratios

On the other hand, the results of a long-time selection experiment at the Uni- versity of California (LERNER 1958) did not indicate that a plateau had been reached in

(a) List courses you taught this year and those you taught last year: (If you participated in team-taught course, indicate each of them and what percent of courses you taught.)

assess progress in meeting the defined goals. Student progress in specific classes that are designed to help build specific competencies is one aspect of the assessment process.