9.1 Conclusions
9.2.2 CompTorrent
CompTorrent has demonstrated the contributions of this thesis; however, there is always scope for further extension and improvement. This section briefly discusses some suggestions for further work with CompTorrent.
9.2.2.1 Protocol and Routing Optimisation
Using XML for all communication between nodes could be wasteful for fine-grained tasks. XML was originally chosen for its ease of extensibility and modification as the research progressed. Compressing the XML is a possibility, but it is unlikely to yield much gain for very small payloads (Goldsmith 2004). A binary structure implementation, where the bare minimum of metadata is exchanged, is also feasible. A solution between binary structures and XML that has gained popularity during the progress of this thesis is YAML, which provides a lighter-weight markup language for structured data (Ingerson, Evans & Ben-Kiki 2001).
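The size difference between an XML message and a fixed binary structure can be illustrated with a minimal sketch. The message fields (`node_id`, `chunk_id`, `status`), the element names and the binary layout are all hypothetical and chosen only for illustration; they are not the CompTorrent wire format.

```python
import struct
import xml.etree.ElementTree as ET


def encode_xml(node_id: int, chunk_id: int, status: int) -> bytes:
    """Encode a hypothetical status message as XML (element names are assumed)."""
    msg = ET.Element("message")
    ET.SubElement(msg, "node").text = str(node_id)
    ET.SubElement(msg, "chunk").text = str(chunk_id)
    ET.SubElement(msg, "status").text = str(status)
    return ET.tostring(msg)


# Assumed fixed layout: two unsigned 32-bit integers and one unsigned byte,
# in network byte order with no padding.
BINARY_LAYOUT = struct.Struct("!IIB")


def encode_binary(node_id: int, chunk_id: int, status: int) -> bytes:
    """Encode the same three fields as a packed binary structure."""
    return BINARY_LAYOUT.pack(node_id, chunk_id, status)


def decode_binary(payload: bytes) -> tuple:
    """Recover (node_id, chunk_id, status) from the packed structure."""
    return BINARY_LAYOUT.unpack(payload)


if __name__ == "__main__":
    xml_bytes = encode_xml(17, 4096, 1)
    bin_bytes = encode_binary(17, 4096, 1)
    print(len(xml_bytes), "bytes as XML;", len(bin_bytes), "bytes as binary")
```

The binary form trades the self-describing, easily extended structure of XML for a smaller payload, which is the trade-off discussed above.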
Examining the relationships between the nature of the computation task and the topology of the overlay network is already showing promise. Applying different routing algorithms is an area in its own right and further work beyond the least common ancestor heuristic which is used now should prove worthwhile. Other routing arrangements used in distributed hash tables such as a Skiplist, Cartesian Coordinate Space, Plaxton Tree and similar could be compared to see if they offer performance benefits whilst considering their cost in terms of implementation complexity and transparency for the user.
9.2.2.2 Interprocess Communication
Another obvious extension would be support for algorithms that are not completely independently parallel. Shared memory and message passing, the classic choices, are two obvious candidates for implementation and testing. Implementing shared memory across nodes in a CompTorrent swarm would also allow for a distributed tracker to be overlaid on the network. This could serve as either a primary or a secondary tracker service, and it would be interesting to see how it could be used to improve the robustness of the system.
9.2.2.3 Optimisation of File Transfer
Optimisation of file transfer is another area that may yield improved results. A lot of work has already been done investigating the efficiency of BitTorrent for file transfers, including some recent work (Piatek et al. 2007) that has further increased performance by some 70% by selective uploading to connected peers based on their behaviour. It will be interesting to see if these ranking algorithms would have a similar result when peers are ranked on their bandwidth contribution as well as their computing contribution. This would expand on existing work of allocating tasks based on the number of data chunks processed, the number of file requests serviced, the time taken to respond, etc.
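One way such a combined ranking might look is sketched below. This is an illustrative assumption only: it is neither the algorithm of Piatek et al. nor the existing CompTorrent allocation scheme. The `PeerStats` fields, the weights and the normalisation by the best-performing peer are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PeerStats:
    """Hypothetical per-peer counters a node might keep."""
    name: str
    chunks_processed: int  # computing contribution
    bytes_uploaded: int    # bandwidth contribution


def rank_peers(peers, w_compute=0.5, w_bandwidth=0.5):
    """Return peers sorted best-first by a weighted, normalised score.

    Each metric is divided by the largest value seen among the peers so
    that computing and bandwidth contributions are on the same 0..1 scale.
    """
    if not peers:
        return []
    max_chunks = max(p.chunks_processed for p in peers) or 1
    max_bytes = max(p.bytes_uploaded for p in peers) or 1

    def score(p):
        compute = p.chunks_processed / max_chunks
        bandwidth = p.bytes_uploaded / max_bytes
        return w_compute * compute + w_bandwidth * bandwidth

    return sorted(peers, key=score, reverse=True)


if __name__ == "__main__":
    swarm = [
        PeerStats("a", chunks_processed=10, bytes_uploaded=0),
        PeerStats("b", chunks_processed=5, bytes_uploaded=100),
        PeerStats("c", chunks_processed=1, bytes_uploaded=10),
    ]
    print([p.name for p in rank_peers(swarm)])
```

Varying `w_compute` and `w_bandwidth` would be one simple way to test whether favouring bandwidth contributors changes overall throughput.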
9.2.2.4 Trackers
The tracker is currently an HTTP service and has a relatively small bandwidth load (subject to the granularity of the task and data). A recent idea involves investigating the possibility of embedding tracker data into unlikely places or protocols. As the tracker is mainly shared memory (lists of connected nodes, completed chunks), it may well be possible to host tracker data on another unrelated service such as Internet Relay Chat. Security-related issues are also an area where much work can be done. An obfuscation technique that has already been proven in concept is embedding tracker data into an image using steganographic techniques. It will be interesting to see if the extra bandwidth required results in any stealth advantage. The mobility of projects between trackers during computation would also be worth examining.
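The idea of hiding tracker data in an image can be illustrated with a minimal least-significant-bit (LSB) sketch. LSB embedding is assumed here purely for illustration; the text does not state which steganographic technique was used in the proof of concept. The cover image is modelled as a flat sequence of pixel bytes, and the 4-byte length header and the payload string are hypothetical.

```python
def embed(pixels: bytes, payload: bytes) -> bytearray:
    """Hide payload in the lowest bit of each pixel byte.

    An assumed 4-byte big-endian length header is stored first so that
    the extractor knows how many payload bytes follow.
    """
    data = len(payload).to_bytes(4, "big") + payload
    bits = [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]
    if len(bits) > len(pixels):
        raise ValueError("cover image too small for payload")
    out = bytearray(pixels)
    for idx, bit in enumerate(bits):
        out[idx] = (out[idx] & 0xFE) | bit  # overwrite only the lowest bit
    return out


def extract(pixels: bytes) -> bytes:
    """Recover a payload previously written by embed()."""

    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for b in range(count):
            value = 0
            for i in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + i] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 4), "big")
    return read_bytes(32, length)


if __name__ == "__main__":
    cover = bytes(range(256)) * 4            # stand-in for raw pixel data
    tracker_data = b"node=10.0.0.1;chunks=1,2,3"  # hypothetical tracker record
    stego = embed(cover, tracker_data)
    print(extract(stego))
```

Each payload bit costs one cover byte in this scheme, which makes the extra bandwidth mentioned above concrete: the image must carry at least eight times as many bytes as the tracker data it hides.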
The tracker could obviously be extended to involve more computation when suggesting nodes for connection. It has been kept as simple as possible in the case of CompTorrent in order to prove that a genuinely decentralised swarm can produce similar results to a client-server system. In practice, especially where a higher confidence in nodes is recognised (i.e. in a controlled, production environment), more tracker involvement would be sensible. However, in a peer-to-peer system this should always be carefully considered along with the level of centralisation the system is to maintain.
9.2.3 Botnets
As previously mentioned in 3.3.3, botnets have been identified as potentially some of the most distributed peer-to-peer computing systems known to be in practical, albeit illegal, use. Further work to identify the techniques used by these networks could complement the work in this thesis as well as provide potential insight into mitigating the effects of these harmful botnets. This is raised here as a general research direction beyond the scope of CompTorrent and general-purpose distributed computing.
9.3 Some Personal Concluding Remarks on Peer-to-Peer as a Controversial Research Topic
At the beginning of Chapter 1 of this thesis, a quote from J. Bronowski was included which referred to the nature of a scientist as being one of dissent. Peer-to-peer computing, as a technique, has come under great, sustained fire from the media, industry groups and politicians as being synonymous with illegal file sharing and the distribution of other prohibited content. There have been many calls for it to be banned and many attempts to block it at an Internet Service Provider level.
This incorrect assumption that a networking or computing technique equals malfeasance and corruption must not be allowed to propagate further. This is hardly the first time that computing has been controversial, yet I feel that it is significant for us here because so many different groups are in active opposition at once.
useful harnessing of resources that might not otherwise have been economically feasible and therefore available to a research group. This allows scientific projects another avenue for procuring computing cycles just like BOINC and Condor (and many others) are doing now. This project, at time of writing, uniquely allows the not necessarily professional and less funded groups or individuals the ability to host a distributed computing project easily and virtually without cost.
In this field, researchers should find it their duty to show how peer-to-peer is not all about “MP3s and piracy”, to continue to speak out against ill-informed debate, and to demonstrate where peer-to-peer as a discrete technology is being used for the real benefit of humanity. The mesh networking scheme used in the One Laptop Per Child project is a real example of peer-to-peer being a part of the success of a humanitarian effort.
It is hoped that this thesis, and others like it in the area of peer-to-peer computing, will show that this technique is overwhelmingly benign and can be used productively and widely for the benefit of all.