TAGora. Semiotic Dynamics in Online Social Communities. V. Loreto.


(1)

V. Loreto

http://www.tagora-project.eu/

TAGora

(2)

TAGora consortium

Prof. L. Steels, Prof. N. Shadbolt, Dr. Harith Alani, Prof. S. Staab, Prof. G. Stumme, Prof. V. Loreto

Coordination

(3)

TAGora Synergies

PHYS-SAPIENZA

SONY-CSL

• complex systems expertise

• modeling and simulation

• data from existing folksonomies

• statistical tools for data analysis

• stochastic models of tagging behavior

• dissemination

• project coordination

UNI-SOTON

• image tag-based navigation system

• music tag-based navigation system

• feature-enhanced navigation maps

• expertise in semiotic dynamics

• collective intelligence for sustainability

• dissemination

UNIK UNI KO-LD

• tagging system for bibliographic data

• data mining for folksonomies

• social network analysis

• dissemination

• recommendation systems

• cross-folksonomy analysis

• trend detection

• network analysis

• dissemination

• peer-to-peer testbed for folksonomies

• representation of folksonomy data

• ontology learning

(4)

a short history of the web

1989-1991

1991-2000

1998

2000

2000-2004

2005

the Semantic Web vision by T. Berners-Lee

WWW is created at CERN

users become content providers,

rise of online communities

“bottom-up” information architecture

mass adoption, users are consumers,

taxonomic approach

Google is born

(5)
(6)
(7)

artwork by R. Munroe

(8)

~ 100 million users

artwork by R. Munroe

(9)
(10)

http://del.icio.us

(11)
(12)
(13)
(14)


(16)

resource

user

{ tags }

(17)

[Tag cloud]

.net 3d advertising ajax animation api apple architecture art article articles audio bittorrent blog blogging blogs book books browser business calendar cms code collaboration color comics community computer computers cooking cool css culture daily database del.icio.us design development diy download downloads dvd economics education electronics email english entertainment environment fashion film finance firefox flash flickr fonts food forum framework free freeware fun funny gadgets gallery game games geek google graphics gtd guide hack hacks hardware health history home hosting howto html humor icons illustration images imported information inspiration interesting internet ipod japan java javascript jobs language learning library life lifehacks links linux list literature mac magazine management map maps marketing math media microsoft mobile money movie movies mp3 music mysql network networking news online opensource osx p2p perl personal philosophy phone photo photography photos photoshop php plugin podcast politics portfolio privacy productivity programming psychology python radio rails recipes reference religion research resource resources reviews rss ruby rubyonrails safari_export science search security seo server service shop shopping social software spyware statistics sysadmin tech technology tips tool tools toread travel tutorial tutorials tv typography ubuntu unix usability useful utilities video videos visualization web web2.0 webdesign webdev wiki windows wordpress work writing xml

del.icio.us

(18)

Navigate the information sea

(19)

Navigate the information sea

(20)
(21)


(23)

FOLKSONOMY

the complexity

community level

http://www.flickr.com/photos/gustavog/9708628/

user level

http://dml.riken.go.jp/~ciro/blog/2005/Feb/14

(24)
(25)


(27)

Main results of the first year

• Extensive data collection from selected collaborative tagging systems (del.icio.us, Flickr and Last.Fm)

• Acquisition of existing datasets from several social websites (IMDB, Netflix, Wikipedia)

• Realization of web-based applications:

  • BibSonomy (www.bibsonomy.org)

  • Ikoru (www.ikoru.net)

(28)
(29)

http://bibsonomy.org (by KDE @ Kassel)

post

{ tags }

resource

user

(30)
(31)
(32)


(33)

Tag co-occurrence: raw data

[Figure: tags co-occurring with "ajax", "xml" and "blog", arranged by time and rank]

ajax: design, development, css, webdev, dhtml, xml, programming, xmlhttprequest, web, javascript

xml: xslt, xmlhttprequest, html, css, java, rss, programming, ajax, web, javascript

blog: tech, politics, art, daily, css, rss, music, news, web, design

“serialize” posts to produce a stream of tags
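A minimal sketch of what "serializing" posts could look like. The post layout `(timestamp, user, resource, tags)` and the function name `serialize_posts` are illustrative assumptions, not taken from the slides: posts containing a chosen tag are ordered in time and their remaining tags are concatenated into one stream.

```python
from typing import Iterable, List, Tuple

# Hypothetical post layout: (timestamp, user, resource, tags).
Post = Tuple[int, str, str, List[str]]


def serialize_posts(posts: Iterable[Post], focus_tag: str) -> List[str]:
    """Flatten the posts that contain focus_tag into a time-ordered tag stream."""
    stream: List[str] = []
    for _, _, _, tags in sorted(posts, key=lambda post: post[0]):
        if focus_tag in tags:
            # Keep the co-occurring tags in the order they appear in the post.
            stream.extend(tag for tag in tags if tag != focus_tag)
    return stream


posts = [
    (2, "bob", "http://example.org/b", ["ajax", "javascript", "web"]),
    (1, "alice", "http://example.org/a", ["xml", "ajax", "programming"]),
    (3, "carol", "http://example.org/c", ["blog", "news"]),
]
stream = serialize_posts(posts, "ajax")
```

The resulting list plays the role of the tag stream on which the frequency-rank and correlation analyses operate.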

(34)


(36)

frequency-rank plot

[Figure: log-log plot of P(R) versus rank R (R from 10^0 to 10^4, P(R) from 10^-7 to 10^-1) for the tags co-occurring with "blog", "ajax" and "xml" (del.icio.us); inset: the same plot including "H5N1" (Connotea); top-ranked tags annotated on the curves: web, news, music, rss, design, javascript]

low rank: leveling off

high rank: $P(R) \sim R^{-\alpha}$, $\alpha > 1$ ($\alpha \approx 5/4$)
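As an illustration of how a frequency-rank table like the one plotted here might be computed from a tag stream, the sketch below counts tags, sorts them by decreasing frequency and normalizes. The function name `frequency_rank` and the alphabetical tie-breaking are assumptions made for the example.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def frequency_rank(stream: Sequence[str]) -> List[Tuple[int, str, float]]:
    """Return (rank R, tag, normalized frequency P(R)), most frequent tag first."""
    counts = Counter(stream)
    total = sum(counts.values())
    # Sort by decreasing count; ties are broken alphabetically (an arbitrary choice).
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, tag, count / total)
            for rank, (tag, count) in enumerate(ranked, start=1)]


example_stream = ["web", "design", "web", "css", "web", "design"]
table = frequency_rank(example_stream)
```

Plotting the third column against the first on log-log axes gives the kind of curve shown on the slide.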

(37)

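the Yule-Simon process

we start with $n_0$ words

at time $t$:

• with probability $p$, a new word is appended

• with probability $1-p$, a word is copied at random from the past

[Diagram: stream of words at positions ..., t-9, t-8, ..., t-1; the next word is new (probability p) or copied from a past position t-x (probability 1-p)]

$P(R) \sim R^{-(1-p)}$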
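The process described on this slide can be simulated in a few lines. This is a sketch under simple assumptions: words are encoded as integers, the initial $n_0$ words are all distinct, and the name `yule_simon` is illustrative.

```python
import random
from typing import List, Optional


def yule_simon(steps: int, p: float, n0: int = 1,
               seed: Optional[int] = None) -> List[int]:
    """Simulate a Yule-Simon stream; words are encoded as integers."""
    rng = random.Random(seed)
    stream = list(range(n0))  # assumption: the n0 initial words are distinct
    next_word = n0
    for _ in range(steps):
        if rng.random() < p:
            # With probability p, append a brand-new word.
            stream.append(next_word)
            next_word += 1
        else:
            # With probability 1 - p, copy a word chosen uniformly from the past.
            stream.append(rng.choice(stream))
    return stream


stream = yule_simon(steps=10_000, p=0.05, n0=10, seed=42)
```

Copying uniformly from the past selects a word with probability proportional to how often it has already occurred, which is the mechanism behind the power-law frequency-rank curve.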

(38)

Tag Correlations

$$C(\Delta t, t_w) = \frac{1}{T - \Delta t} \sum_{t = t_w}^{t_w + T - \Delta t} \delta\big(\mathrm{tag}(t + \Delta t), \mathrm{tag}(t)\big)$$

[Diagram: tag stream at positions t, t+1, t+2, ..., t+8, ...]
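A sketch of how the correlation function on this slide could be estimated from a tag stream. Two assumptions are made here and are not taken from the slide: positions are zero-based list indices, and the sum starts at $t_w + 1$ so that it contains exactly $T - \Delta t$ pairs and the prefactor turns it into an average between 0 and 1.

```python
from typing import Sequence


def tag_correlation(stream: Sequence[str], delta_t: int, t_w: int, T: int) -> float:
    """Fraction of positions in the window whose tag recurs delta_t steps later."""
    if not 0 < delta_t < T:
        raise ValueError("delta_t must satisfy 0 < delta_t < T")
    if t_w < 0 or t_w + T >= len(stream):
        raise ValueError("window [t_w, t_w + T] must lie inside the stream")
    # Assumption: sum over t = t_w + 1 .. t_w + T - delta_t (T - delta_t terms).
    matches = sum(
        stream[t] == stream[t + delta_t]
        for t in range(t_w + 1, t_w + T - delta_t + 1)
    )
    return matches / (T - delta_t)


periodic = list("abababab")
c_even = tag_correlation(periodic, delta_t=2, t_w=0, T=6)
c_odd = tag_correlation(periodic, delta_t=1, t_w=0, T=6)
```

On this period-two toy stream every tag recurs two steps later and never one step later, so the two calls probe the extreme values of the estimator.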

(39)

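Tag Correlations

[Figure: $C(\Delta t, t_w)$ versus $\Delta t$ ($\Delta t$ from 100 to 1000, $C$ from 0.006 to 0.012) for three windows $[t_w, t_w + T]$ labeled 1, 2, 3, each with its asymptote $c(t_w)$; fit: $a(t_w) / [\Delta t + \delta(t_w)] + c(t_w)$]

$$C(\Delta t) \sim \frac{1}{\Delta t + \delta}$$

$$C(\Delta t, t_w) = \frac{1}{T - \Delta t} \sum_{t = t_w}^{t_w + T - \Delta t} \delta\big(\mathrm{tag}(t + \Delta t), \mathrm{tag}(t)\big)$$

$$c(t_w) = \sum_{R=1}^{R_{\max}(t_w)} P_{t_w}^2(R)$$

[Diagram: tag stream at positions t, t+1, t+2, ..., t+8, ...]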

(40)

a Yule-Simon model with memory

we start with $n_0$ words

at time $t$:

• with probability $p$, a new word is appended

• with probability $1-p$, a word is copied from position $t-x$

$x$ is distributed according to a fat-tailed memory kernel $Q(x)$

$$Q_t(x) \sim \frac{1}{x + \tau}$$

[Diagram: stream of words at positions ..., t-9, t-8, ..., t-1; the next word is new (probability p) or copied from position t-x (probability 1-p); axis label: ln x]
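A sketch of the memory variant, assuming the kernel is normalized over the available past positions $x = 1, \dots, t$ and that $\tau \ge 0$. Words are integers, the initial $n_0$ words are distinct, and the name `yule_simon_with_memory` is illustrative. The direct weighted draw costs time proportional to the stream length at every step, so this is only meant for short runs.

```python
import random
from typing import List, Optional


def yule_simon_with_memory(steps: int, p: float, tau: float, n0: int = 1,
                           seed: Optional[int] = None) -> List[int]:
    """Yule-Simon stream where copies favor recent words via Q_t(x) ~ 1 / (x + tau)."""
    rng = random.Random(seed)
    stream = list(range(n0))  # assumption: the n0 initial words are distinct
    next_word = n0
    for _ in range(steps):
        if rng.random() < p:
            # With probability p, append a brand-new word.
            stream.append(next_word)
            next_word += 1
        else:
            t = len(stream)
            # Look back x steps, x = 1..t, with weight proportional to 1 / (x + tau).
            lags = range(1, t + 1)
            weights = [1.0 / (x + tau) for x in lags]
            x = rng.choices(lags, weights=weights, k=1)[0]
            stream.append(stream[t - x])
    return stream


stream = yule_simon_with_memory(steps=500, p=0.05, tau=20.0, n0=5, seed=0)
```

A larger `tau` flattens the kernel over recent positions, so the choice of `tau` controls how strongly recent words are favored over old ones.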

(41)


(42)

frequency-rank plot: exp. vs model

[Figure: log-log plot of P(R) versus rank R (R from 10^0 to 10^4, P(R) from 10^-7 to 10^-1), experimental and theory curves for "blog", "ajax" and "xml"; inset: experimental curves for "blog", "ajax", "xml" (del.icio.us) and "H5N1" (Connotea), with top-ranked tags annotated: web, news, music, rss, design, javascript]

Model parameters shown next to the “blog” and “ajax” curves: p = 0.06, p = 0.03, τ = 20, τ = 100

$$Q_t(x) \sim \frac{1}{x + \tau}$$

C. Cattuto, VL, L. Pietronero, PNAS 104, 1461 (2007)

(43)

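global vocabulary growth

[Figure: log-log plot of the vocabulary size $N(\tau)$ versus $\tau$ ($\tau$ from $10^4$ to $10^8$, $N$ from $10^2$ to $10^7$); inset: $N$ versus year, 2004 to 2006, with ticks at $1 \times 10^6$ and $2 \times 10^6$]

~ 650,000 users

~ $2 \cdot 10^7$ resources

~ $5 \cdot 10^7$ posts

~ $2.5 \cdot 10^6$ tags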

(44)

Tagcloud Overlap Metrics

Tag cloud $T_1$ of resource $R_1$:

advertising agency art arte artist artists blog branding cool creative creativity css design designer designers digital drawing fashion flash gallery graphic graphics ideas identity illustration illustrator ilustração inspiration interactive magazine marketing motion photo photographer photography portfolio portfolios print propaganda reference showcase studio sweden typography vector wallpaper web webdesign webdev website

Tag cloud $T_2$ of resource $R_2$:

art awards blog blogs code community cool css cssgallery design development directory diseño examples free galeria galleries gallery graphics html ideas inspiration interface layout links news portal portfolio programming reference resource resources safari_export showcase standards style template templates tools tutorial typography usability web web2.0 web_design webdesign webdev website webstandards xhtml

$$w_{R_1,R_2} = \frac{\sum_{t \in T_1 \cap T_2} \frac{\min(f_{t1}, f_{t2})}{f_t}}{\sum_{t \in T_1 \cap T_2} \frac{\max(f_{t1}, f_{t2})}{f_t} + \sum_{t \in T_1 - T_2} \frac{f_{t1}}{f_t} + \sum_{t \in T_2 - T_1} \frac{f_{t2}}{f_t}}$$
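A sketch of the overlap weight between two tag clouds. The interpretation of the symbols is an assumption, since the slide does not define them: `f1` and `f2` are taken to hold the frequency of each tag within the first and second cloud, and `f` a reference frequency for each tag (for example its global frequency) used to down-weight very common tags. The function name `tagcloud_overlap` is illustrative.

```python
from typing import Mapping


def tagcloud_overlap(f1: Mapping[str, float], f2: Mapping[str, float],
                     f: Mapping[str, float]) -> float:
    """Weighted overlap of two tag clouds, each term divided by f[t]."""
    shared = f1.keys() & f2.keys()
    numerator = sum(min(f1[t], f2[t]) / f[t] for t in shared)
    denominator = (
        sum(max(f1[t], f2[t]) / f[t] for t in shared)
        + sum(f1[t] / f[t] for t in f1.keys() - f2.keys())  # tags only in cloud 1
        + sum(f2[t] / f[t] for t in f2.keys() - f1.keys())  # tags only in cloud 2
    )
    return numerator / denominator if denominator else 0.0


# Toy frequencies, invented for illustration only.
cloud_1 = {"design": 4, "css": 2, "art": 1}
cloud_2 = {"design": 2, "css": 2, "xhtml": 3}
reference = {"design": 2, "css": 1, "art": 1, "xhtml": 3}
w = tagcloud_overlap(cloud_1, cloud_2, reference)
```

With these toy numbers the numerator is 1 + 2 = 3 and the denominator is 2 + 2 + 1 + 1 = 6, so the weight is 0.5; identical clouds give 1 and disjoint clouds give 0.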

(45)

Resource Networks: Community Structure

[Figure: scatter plot with axes V3, V4 and V5; clusters labeled 1 to 6]

politics

(46)

Resource Networks: Community Structure

[Figure: scatter plot with axes V3, V4 and V5; clusters labeled 1 to 6, each annotated with a tag cloud; cluster annotations: “humor” in politics, news in politics, web design, visual design]

Tag clouds shown next to the clusters:

37signals art blog books css design development font fonts free graphics howto illustration inspiration photo photography photoshop portfolio productivity programming reference software system:unfiled themes tutorial tutorials typography web webdesign wordpress

activism art blog burn bush creativity culture dvd economics flash freeware fun funny government history humor maps media money politics reference research software speechwriter statistics system:unfiled tools usa war windows

art business color css design development flash free fun game games google graphics html inspiration patterns photography photos pricing reference resources search software stock system:unfiled tools web web2.0 webdesign webdev

ajax art awards blog blogger blogs color cool css design flash gallery graphics html images inspiration internet javascript lightbox politics portal portfolio reference system:unfiled templates tools web web2.0 webdesign webdev

ajax art blog books color css design desktop desktops development extension extensions firefox flash graphics icons illustration inspiration programming reference software system:unfiled technology tools typography wallpaper wallpapers web webdesign webdev

activism blog blogs bush colbert comedy conservative culture election fraud freedom funny government grillo humor internet law libertarian maps media news political politics progressive science security system:unfiled usa video voting

(47)

spam detection

raw data

no spam

shuffled

(48)

Expected results of the Project

• Devise methods and algorithms for analysing raw data collected in online social communities.

• Develop suitable modeling and theoretical constructions to understand, predict and control emergent properties.

• Develop and make publicly available innovative applications embodying novel navigation and control concepts.

• Foster the growth of new web communities revolving around the applications developed by the Consortium.

• Create the first extensive and comprehensive body of data on web-based tagging and make it available to the broader IT and scientific community.

(49)

C. Cattuto, V.D.P. Servedio and VL, “A Yule-Simon process with memory”, Europhys. Lett. 76, 208 (2006)

C. Cattuto, VL and L. Pietronero, “Semiotic dynamics and collaborative tagging”, PNAS 104, 1461 (2007)

C. Cattuto, A. Baldassarri, V.D.P. Servedio and VL, “Vocabulary growth in social tagging systems”, http://arxiv.org/abs/0704.3316v1

C. Cattuto et al., “Network Properties of Folksonomies”, AI Communications (2007), in press

http://www.tagora-project.eu/

(50)


Thanks to

Harith Alani

Andrea Baldassarri

Ciro Cattuto

Miranda Grahl

Andreas Hotho

Christoph Schmitz

Vito D.P. Servedio

Steffen Staab

Luc Steels

Gerd Stumme

Martin Szomszor

http://www.tagora-project.eu/
