• No results found

A methodology for workload characterization of file-sharing peer-to-peer networks

N/A
N/A
Protected

Academic year: 2021

Share "A methodology for workload characterization of file-sharing peer-to-peer networks"

Copied!
25
0
0

Loading.... (view fulltext now)

Full text

(1)

A methodology for workload characterization of

file-sharing peer-to-peer networks

Diˆego Nogueira, Leonardo Rocha, Juliano Santos,

Paulo Ara´ujo, Virg´ılio Almeida, Wagner Meira Jr.

Department of Computer Science Federal University of Minas Gerais - Brazil

(2)

What is peer-to-peer (P2P)?

Class of distributed applications on the Internet • peers act as both servers and clients (servent) • servents share computational resources

• particular features:

– dual role of servents

– totally distributed processing nature

(3)

Why characterize P2P networks?

• growth of P2P networks (especially file-sharing) • lack of characterization methodologies

(4)

Gnutella: case study

• file-sharing open P2P network • why Gnutella?

– simple P2P network

– intense traffic, with users all over the world • previous work:

– did not focus on standard statistical distributions

(5)
(6)

Workload characterization methodology

• derives from the classic client-server characterization • divided into:

– Qualitative characterization

∗ conceptual definition of atributes – Quantitative characterization

∗ client-side criteria

(7)

Qualitative characterization - (1/2)

Conceptual definition of the attributes: P2P architecture:

• type of resources shared

• communication protocol (connection + service interface)

P2P application:

(8)

Qualitative characterization - (2/2)

P2P network:

• application implemented by the servents • set of servents

– the resources shared by the servent – the servent’s neighborhood

(9)

Gnutella qualitative characterization

Gnutella architecture

• Shared resource: storage device (hard disk) • Communication protocol:

– connection interface: {ping, pong}

– service interface: {query, queryhit, download, push}

Gnutella application

(10)

Quantitative characterization

Workload characterization of a live P2P network

• collection of traffic and peer behavior data, analysis of the data

Client-side criteria

• demand for resources • interaction pattern • servents’ connectivity

Server-side criteria

• resource availability • service capacity

(11)

Gnutella quantitative characterization

• Data collector

– developed over Gnut

∗ collected data addressed to and through peer ∗ periodically sent random querys

• Experiments

– 2 Linux workstations connected to Brazilian research network – connection to reference servents on Gnutella

(12)

Demand for resources characterization

Identifies:

• the subjects of interest

• popularity of subjects among users • temporal locality among requests

Gnutella:

• Servents’ interests

– 2,992,390 querys received / 94,642 distinct words (including stop words)

(13)
(14)

Interaction pattern characterization

• quality-of-service metric

• used to quantify the overall performance of a P2P network

Gnutella: • Latency

– sent ttl 1 pings

(15)
(16)

Servents’ connectivity characterization

Quantified through:

• average number of neighbors

• network traffic associated with communication • amount of data exchanged

Gnutella:

• Unique servents

– number of servents varied approximately 15% across collection periods

(17)

Resource availability characterization

• how dynamics of P2P affects access to information

• assess effectiveness of network in providing information

• good for comparing data distribution protocols and mechanisms

Gnutella:

• Shared kbytes

– 120,535 addressed servents / 90,282 replied – information from pong messages

(18)
(19)

Service capacity characterization

• quantifies amount of information provided by servents • helps to understand if idle capacity is used efficiently • provides information to improve scalability of servents

Gnutella:

• Servents’ availability

– 98% on for at most 45 min. – 84% no longer than 10 min.

(20)
(21)

Data convergence analysis

• verify how representative the data collected and studied are

• mechanism:

– Perform the same experiments in shorter periods ∗ 6, 12, 18 and 24 hours, for example

(22)

Data convergence analysis

(23)

Conclusions

Definition of a workload characterization methodology

• Qualitative characterization (conceptual definitions)

• Quantitative characterization (client + server side criteria)

Successful application of methodology on Gnutella Interesting results about Gnutella’s traffic and peer

behavior

• Statistical distribution analysis

• Latency distribution follows the Log-Normal distribution • Search traffic 50 times larger than control traffic

(24)

Questions?

(25)

Index

P2P definition Demand for resources

Motivation Interaction pattern

Gnutella Servents’ connectivity

Methodology introduction Resource availability

Qualitative characterization Service capacity

gNet qualitative characterization Data convergence analysis

Quantitative characterization Conclusions

References

Related documents

In the probe design for DSECT system, the effects of the probe design parameters including the number of GMR sensor, excitation coil thickness and probe diameter were investigated

The relationship between seismicity and fault structure in zones A and B on DW (Figure 9) is strikingly simi- lar to what is observed on the western end of the G3 segment of the

Data Domain Replicator software can be used with the encryption option, enabling encrypted data to be replicated using collection, directory, MTree, or application-specific managed

The true extent of rent-seeking is intrinsically di¢cult to measure, though like Besley et al (2010) (and many others) we argue that it is nonetheless worthwhile examining

Log GDP percap. a 44-country sample due to missing data on corporate tax rates. b 35-country sample due to missing data on FDI restrictions and incentives. c 57-country sample due

In addition to weed species listed in the ANNUAL WEEDS and BIENNIAL and PERENNIAL WEEDS Application Rate and Timing tables, these treatments may be used to control or suppress

However, I wanted to see how students themselves define “traditional,” so I sent a question about the definition to the honors listserv for the Southern Polytechnic State

The average goods and services deficit decreased $2.1 billion to $43.1 billion for the three months ending in November..  Average exports of goods and services decreased $0.7