Peer to Peer Network
including JXTA
Euresco Conference at Castelvecchio Pascoli Italy June 16-21 2001 EuroConference on Problem-Solving Environments for Numerical
Mathematics, Science and Engineering Applications
Geoffrey Fox
IPCRES Laboratory for Grid Technology Computer Science, Informatics, Physics
Indiana University Bloomington IN
P2P: Peer-to-Peer Networks
• Peer-to-Peer Networks are considered to be the next "killer application" for the Internet.
• There are several important technology challenges and applications that vary from the sublime to the ridiculous.
• Like most such overhyped concepts, P2P is rather loosely defined and covers a set of rather disparate ideas.
• Perhaps the only common theme is a client-oriented view of the world; P2P can be thought of as Power to the People.
– Clients do most of the work, communicate with each other and really are the most important computers;
– Servers may be around and even essential but remain subservient to the clients.
P2P References
• General discussions on P2P technology can be found at two good web sites:
– http://www.openp2p.com from the O'Reilly group and
– http://www.peer-to-peerwg.org/ as an industry working group originally initiated by Intel.
• There is a remarkable book, Peer-to-Peer: Harnessing the Power of Disruptive Technologies, by Andrew Oram, Nelson Minar, Clay Shirky, Tim O'Reilly (O'Reilly & Associates, March 15, 2001; ISBN: 059600110X).
– This has several articles describing either key systems or key services
• Disruptive Technology (DT): One that does something (perhaps a new capability) far better than previous best practice; it does old things less well at the start but gradually catches up.
– The Web itself is a DT: hyperlinks empower new ways of doing things. Browsers are (initially) clumsier user interface than
Napster I
• Shawn Fanning developed the original Napster application and service in January 1999 while a freshman at
Northeastern University.
• Napster allowed any client to advertise any MP3 files stored on its disk and choose to download MP3 files from other
clients connected to the Napster server network.
• It is said that Shawn was taking a computer-programming course at Northeastern, but had to buy a programming book to build Napster ……..
• Like most good ideas, Napster was designed to solve a real need - in this case to allow Shawn, a musician himself, to share his music with his friends on campus.
Napster II
• The system has become staggeringly popular. According to a legal opinion from last summer, approximately 10,000 music files are shared per second using Napster, more than 100 users attempt to connect to the system every second, and there will be 75 million Napster users by the end of 2000.
http://news.cnet.com/News/Pages/Special/Napster/napster_patel.html
– Current legal debate has increased instantaneous use
• Napster has some other typical P2P services; Instant Messenger, chat rooms, "Buddy lists" and information about "hot music" (music portal) but the key feature is the ability to share files between any Internet connected consenting clients.
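The file-sharing model described above (clients advertise files to a server, then fetch them directly from each other) can be sketched as a small in-memory index. This is only an illustration under stated assumptions: the class name CentralIndex, its methods, and the peer addresses are hypothetical and are not taken from Napster itself.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for a Napster-style index server.

    It stores only who has which file; the file bytes never pass through it.
    """

    def __init__(self):
        self._holders = defaultdict(set)  # lower-cased title -> peer addresses

    def advertise(self, peer, titles):
        """A client announces the files stored on its own disk."""
        for title in titles:
            self._holders[title.lower()].add(peer)

    def lookup(self, title):
        """Return the peers a client could contact directly for this file."""
        return sorted(self._holders.get(title.lower(), set()))

    def disconnect(self, peer):
        """Forget everything a departing client advertised."""
        for holders in self._holders.values():
            holders.discard(peer)


index = CentralIndex()
index.advertise("10.0.0.5:6699", ["Song A.mp3", "Song B.mp3"])
index.advertise("10.0.0.9:6699", ["Song B.mp3"])
holders = index.lookup("song b.mp3")  # the download itself would be peer to peer
```

The server only answers "who has it"; the transfer step is left out because it happens between the two clients.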
Napster III: Key Features
• MP3 files are important as a popular digital encoding for audio
• It is straightforward to "rip" files off an audio CD and look up key meta-data (Artist, Title etc.) in a CDDB database on the Web (http://www.gracenote.com/).
– The lengths of the songs uniquely identify a CD
• The audio and meta-data can be stored and accessed as a single unit.
• Although a server is used to establish the initial connection, the file transfer is done efficiently - directly from client to client; a P2P service.
• This is an improvement over most NFS systems, where use of distributed files is not easy (except possibly for the originator who named the file) as usually all you have is a cryptic filename.
• The added value of meta-data for files lies at the heart of the
Napster Like Services I
• There are some 200 available Napster clones to support this area
http://www.ultimateresourcesite.com/napster/main.htm
• Currently the most popular is Imesh
[http://www.imesh.com], which has some 2 million users and can share any type of file.
• Some of the best known file sharing systems are
– MojoNation [http://www.mojonation.net],
– Freenet [http://freenet.sourceforge.net/] ,
– Gnutella [http://gnutella.wego.com/]
• These three are not server based like Napster but rather support waves of software agents expressing resource
Napster Like Services II
• There are many interesting ideas being explored;
– Breaking shared files into many parts to both increase
bandwidth (parallel I/O) and increase security of content as no one site can access files without cooperation from its peers.
This type of technology is controversial as it makes censorship very hard.
– MojoNation has a load balancing and scheduling algorithm in the form of micro payments to reward those who contribute most to the community of peers.
– Gnutella, which is a family of related products, is usually described as a P2P search engine as its interface is nearer that of a search engine than a Web file system.
• So far we have identified some basic P2P services; file registration, access and search.
• We can categorize other P2P systems in various ways; we will choose distributed computing, collaboration, and core
P2P for Distributed Computing or Web Computing I
• The distributed computing P2P applications are highlighted by the use of millions of Internet clients to analyze data looking for extraterrestrial life (SETI@home http://setiathome.ssl.berkeley.edu/ ) and the newer project examining the folding of proteins (Folding@home http://www.stanford.edu/group/pandegroup/Cosm/ ).
• These are building distributed computing solutions for a special class of applications:
– Those that can be divided into a huge number of essentially
independent computations, and a central server system doles out separate work chunks to each participating client.
– In the parallel computing community, these problems are called "pleasingly or embarrassingly parallel".
• This approach is included in the P2P category because the computing is Peer based even though it does not have the "Peer only communication" characteristic of all aspects of Gnutella and Napster for information
transfer.
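The "central server doles out work chunks" pattern above can be sketched as follows. This is a minimal illustration, not how SETI@home or Folding@home are implemented: threads stand in for remote clients, and the function names and the sum-of-squares workload are assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def make_work_units(data, chunk_size):
    """Server side: cut the data set into independent work chunks."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def analyze(unit):
    """Client side: a stand-in computation that needs no other chunk."""
    return sum(x * x for x in unit)


def run(data, chunk_size=1000, clients=4):
    """Hand chunks to the simulated clients and combine their results."""
    units = make_work_units(data, chunk_size)
    with ThreadPoolExecutor(max_workers=clients) as pool:
        partials = list(pool.map(analyze, units))
    return sum(partials)


total = run(list(range(10_000)))
```

Because no chunk depends on another, the clients never need to talk to each other, which is what makes the problem "embarrassingly parallel".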
Parabon
• Pure Java model
Entropia Financial Modeling II
• Each basic financial instrument can be calculated independently
• Central Server interprets the total simulation
Drug Simulation
• United Devices also does Dru
• Parameter Study: do billions of simulations – each with different parameters
• Search Engine like interface to simulation
P2P for Distributed Computing or Web Computing II
• Other projects of this type include:
– United Devices (http://www.ud.com/home.htm based on SETI@home),
– AppliedMeta (http://www.appliedmeta.com based on the well known Legion project from the University of Virginia),
– Parabon computation (http://www.parabon.com),
– Condor (from Wisconsin http://www.cs.wisc.edu/condor/) and
– Entropia (http://www.entropia.com/).
• Other applications for this type of system include financial modeling, bio-informatics, measurement of web server
performance and the scheduling of different jobs to use idle time on a network of workstations.
• Ian Foster has given a more detailed review of these activities at
http://www.nature.com/nature/webmatters/grid/grid.html
Collaboration with P2P Systems
• Collaborative systems form a rather different type of P2P network.
• We have a community of clients working together and sharing different Internet resources.
• Probably the Instant Messenger (IM) or various forms of chat room are the most used capability in this arena.
– We have a set of clients exchanging messages with each other.
– Unlike the file-sharing case, one typically needs to multicast the same message to multiple clients at the same time, and the best architecture for this is still an active research topic.
Peer to Peer P2P “Illusion” among collaborating clients
[Figure: diagram with five elements labeled "Server"]
Core P2P technologies or services I
• These include P2P management, messaging, security, client grouping as well as the file or more generally object registration, discovery and access capabilities already discussed for Napster.
• Sun Microsystems has two important technology projects.
– Jini (http://www.sun.com/jini/ ) has a simple model for dynamic self-defining objects which act like Napster peers and register with distributed servers, allowing other peers to discover and access them.
– JXTA (from juxtaposition http://www.jxta.org ) is a new project from Bill Joy aiming at core P2P
Core P2P technologies or services II
• Digital Cash, or “Digital Reputation” implemented with Digital Cash, is an important area, as even “free systems” need a mechanism to limit abuse of resources
– “Tragedy of the Commons”
• Digital Cash has many very clever implementations
– Public Key based “coins” issued by digital banks – issues of safety (no overdrafts or forgery) and anonymity (cash is
anonymous, credit cards are not)
– POW (Proofs of Work) to stop DOS (Denial of Service)
• Network Architecture
– Create the Small World effect where only a few hops are needed to get from any A to any B
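The Proofs of Work idea listed above can be illustrated with a hashcash-style puzzle: the requester must find a nonce whose hash has a given number of leading zero bits, which is costly to find but cheap to check. This is a sketch under assumptions; the challenge string, the difficulty value and the function names are hypothetical and not taken from any particular P2P system.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the front of a hash digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(challenge: str, difficulty: int) -> int:
    """Requester: search for a nonce (about 2**difficulty hashes on average)."""
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Provider: a single hash is enough to check the work."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= difficulty


nonce = solve("peer-42:request-7", 12)
accepted = verify("peer-42:request-7", nonce, 12)
```

The asymmetry between solve and verify is the point: flooding a peer with requests becomes expensive for the sender while remaining cheap for the receiver to police.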
Core P2P technologies or services III
• I expect research and commercial experience to identify more base services, as we understand better the common needs of P2P systems
• Management of resources in such a network must be an important challenge; it is our Nirvana - the Web
Operating System.
• Maybe society can live in a Gallimaufry of unstructured knowledge swept back and forth by armies of Gnutella agents.
– However this will not do for what the Gartner Group (http://www.oreillynet.com/pub/d/547 ) and O'Reilly term Enterprise P2P needed by a Fortune 500
organization.
http://www.jxta.org April 25 2001
• Network computer maker Sun Microsystems Inc. on
Wednesday opened a toolbox for programmers to build a decentralized, Napster - like Web that it hopes will speed up and deepen the Internet.
• Rather than anarchy, Sun expects pockets of programmers to build a more secure and reliable network that will include every device imaginable, from computers to kitchen
appliances.
• “Our goal at the end of this is to build a completely reliable system from unreliable parts,” John Gage, chief researcher at Palo Alto, California-based Sun, said in an interview.
• Sun's formula for peer-to-peer computing, which links computers directly to one another without relying on a server to coordinate communication, rests on a small
JXTA for the Press II
• “The idea is that entities on the Net can find each other and then send information back and forth,” said Sun's chief scientist, Bill Joy, who heads Project JXTA, pronounced “juxta,” which released the protocol, or rules, for peer-to-peer networks.
• Sun bets that it will be able to sell more of its Internet-building computers as the Web grows on its software, similar to the strategy it has taken with its Java
programming language.
• Peer-to-peer computing has achieved a kind of cult celebrity because of the notoriety of Napster and a number of companies are looking for commercial applications for the way that it allows individuals to
JXTA for the Press III
• Microchip maker Intel Corp. for instance, has used peer-to-peer technology to link employees' PCs, using the collective power of the herd to design new chips faster.
• Song-swapping service Napster holds a central index
which points members to others who have the songs they want, but members swap directly. Other services avoid a central index, instead passing lists from member to
member in a type of computer gossip.
• But distributed computing, as peer-to-peer is more formally known, can also ease network bottlenecks by making it possible to break down big programs into lots of little ones which can communicate with each other.
• “The role of jxta will be as a solvent to dissolve the
JXTA for the Press IV
• Sun said that manufacturers could start building jxta
into devices from mobile handsets to refrigerators to get them on the web in a year or so.
• The tiny layer of jxta released on Wednesday includes protocols for computers to talk to each other, a way to organize groups within the broader environment, and ways to follow transactions and keep them secure.
• “The monitoring and security will ultimately determine the success of the thing,” Joy told a Webcast.
• Sun's archrival Microsoft Corp. also is focusing its
energy on making transactions secure with its own plans to build the infrastructure of an always-on,
always-connected Web, although as usual Sun dismissed
JXTA Philosophy
• Keep it familiar: where practical, use stuff that's standard and has worked before
• Leverage experts: engage a variety of experts early and often
• Encourage open development: the design, the specs, the code, and whatever the result is.
• Vision for the software
– promote communication (not isolation) among
applications
– develop administrative commands for peers, peer
groups, and groups of peers in the spirit of UNIX pipes and shells
– keep the core small and elegant: make an architectural distinction between core mechanisms and optional
policies
– support multiple platforms and languages, micro
Key JXTA Concepts
• Peers, Messages, Groups, Pipes
• JXTA Service Advertisements – XML metadata
• JXTA Shell
• PDA (Java KVM) Implementation
• Li Gong: JXTA: A Network Programming Environment
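To make "advertisements as XML metadata" concrete, here is a minimal sketch that builds and parses a small XML description of a service. The element and attribute names (Advertisement, type, Name, Group, Pipe) are invented for illustration and are not the actual JXTA advertisement schema.

```python
import xml.etree.ElementTree as ET


def build_advertisement(name, kind, fields):
    """Serialize a resource description as XML (hypothetical layout)."""
    root = ET.Element("Advertisement", {"type": kind})
    ET.SubElement(root, "Name").text = name
    for tag, value in fields.items():
        ET.SubElement(root, tag).text = value
    return ET.tostring(root, encoding="unicode")


def parse_advertisement(xml_text):
    """Recover the metadata that another peer published."""
    root = ET.fromstring(xml_text)
    return {"type": root.get("type"), **{child.tag: child.text for child in root}}


xml_text = build_advertisement(
    "file-search", "service", {"Group": "music", "Pipe": "search-requests"}
)
ad = parse_advertisement(xml_text)
```

Because the description is plain XML text, any peer on any platform or language can read it, which matches the multi-platform goal stated in the JXTA philosophy.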
JXTA Projects in Categories
• Applications
– Configurator: A GUI configuration tool for the JXTA platform
– Instantp2p: JXTA Demonstration GUI
– Shell: JXTA Command Line Shell for interactive access to the JXTA P2P platform
• Core JXTA
– Platform: JXTA P2P platform infrastructure building blocks and protocols.
• This project defines the JXTA core P2P building blocks: peers, peer groups, pipes, codats and peer group policies. Includes specifications and reference implementations for core JXTA protocols (peer and peer group discovery, peer group membership, peer group pipes and codat sharing).
– Security: JXTA P2P Security Project
• JXTA Services
– Cms: JXTA Content Management System
– Monitoring: Monitoring and Metering
JXTA Protocols -- Discovery
• Project JXTA has defined six protocols so far. The developer community may define more over time.
• Peer Discovery Protocol — enables a peer to find
advertisements on other peers, and can be used to find any peer, peer group, or other advertisement.
– This protocol is the default discovery protocol for all peer groups, including the World Peer Group.
– It is conceivable that someone may want to develop a premium discovery mechanism that may or may not choose to leverage this default protocol, but the
inclusion of this default protocol means that all JXTA peers can understand each other at the very basic level.
– Peer discovery can be done with or without specifying a name for either the peer to be located or the group to
JXTA Protocols – Resolver, Information, Membership
• Peer Resolver Protocol — enables a peer to send and
receive generic queries to search for peers, peer groups, pipes, and other information.
– Typically, this protocol is implemented only by those peers that have access to data repositories and offer advanced search capabilities.
• Peer Information Protocol — allows a peer to learn about the capabilities and status of other peers.
– For example, a ping message can be sent to see if a peer is alive.
– A query can also be sent regarding a peer’s properties where each property has a name and a value string.
• Peer Membership Protocol — allows a peer to obtain group membership requirements, to apply for membership and receive a membership credential along with a full group advertisement, to update an existing membership or
JXTA Protocols – Binding, Endpoints
• Pipe Binding Protocol — allows a peer to bind a pipe
advertisement to a pipe endpoint, thus indicating where messages actually go over the pipe.
– In some sense, a pipe can be viewed as an abstract, named message queue that supports a number of
abstract operations such as create, open, close, delete, send, and receive.
– Bind occurs during the open operation, whereas unbind occurs during the close operation.
• Peer Endpoint Protocol — allows a peer to ask a peer router for available routes for sending a message to a destination peer.
– For example, when two communicating peers are not
directly connected to each other, for instance because they do not use the same network transport protocol or because they are separated by firewalls or NATs, peer routers
respond to queries with available route information — that is, a list of gateways along the route.
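The route query described for the Peer Endpoint Protocol can be sketched as a shortest-path search over known links, returning the gateways between two peers. This is an assumption-laden illustration: the function find_route, the link table and the peer names are hypothetical, and the actual JXTA router may work quite differently.

```python
from collections import deque


def find_route(links, source, destination):
    """Return the gateways between source and destination, or None.

    `links` maps each node to the nodes it can reach directly. A breadth-first
    search yields a route with the fewest hops.
    """
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            path.reverse()
            return path[1:-1]  # drop the two end peers, keep the gateways
        for neighbor in links.get(node, ()):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


links = {
    "peerA": ["gateway1"],
    "gateway1": ["peerA", "gateway2"],
    "gateway2": ["gateway1", "peerB"],
    "peerB": ["gateway2"],
}
route = find_route(links, "peerA", "peerB")
```

An empty list means the peers are directly connected, while None means the router knows of no route at all.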