Chapter 11. Conclusions
aspects of distribution and communication management which require very low-level support. Also, I think (or rationalise?) that an intimate familiarity with all levels of the system gives a more intuitive sense of the “aesthetics” of distribution. I hope that this will have created a more harmonious and integrated system rather than an awkward simulation of the desired effects running over a reluctant infrastructure. Finally, I may never have the time to indulge myself quite like this again.
The last thing which I would have done differently is more of a strategic error: in both systems I attempted to use a common distribution system for all interaction and communication, from RPCs through graphical updates to audio streams. This has certain advantages in constructional and conceptual simplicity. However, it is also one of the main reasons why I believe the scalability of MASSIVE-2 is currently limited. It is not so much that the metaphors are wrong, but that streamed media in particular are both highly demanding and also amenable to medium-specific handling and optimisations. For example, the encoding rules for audio and video are entirely medium-specific. To have an additional generic layer of marshalling (as there is at present) is a real waste. Also, the despatch mechanism appropriate for object-oriented programming (as used in MASSIVE-2) is much more heavyweight than is required for streamed media in most situations. In MASSIVE-1, part way through development, the audio medium was split out into a separate generic audio service with tailored communication protocols. I think that the same thing should be done with MASSIVE-2. Similarly, if video is integrated it will need careful and appropriate handling (for reasons of performance, rather than correctness).
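The cost of a generic marshalling layer for streamed media can be illustrated with a minimal sketch. This is not the MASSIVE-1 or MASSIVE-2 code: the function names, the JSON encoding used to stand in for a "generic" layer, the 8-byte header layout and the 160-byte audio frame (20 ms of 8 kHz, 8-bit audio) are all assumptions chosen purely for illustration.

```python
import json
import struct

def generic_marshal(seq, timestamp_ms, samples):
    """Hypothetical generic layer: a self-describing message in which
    field names and per-element encoding travel with every packet."""
    message = {
        "type": "audio",
        "seq": seq,
        "timestamp_ms": timestamp_ms,
        "samples": list(samples),
    }
    return json.dumps(message).encode("utf-8")

def audio_marshal(seq, timestamp_ms, samples):
    """Hypothetical medium-specific format: a fixed 8-byte header
    (timestamp, sequence number, sample count) followed by the raw
    payload bytes, which are sent exactly as encoded."""
    header = struct.pack("!IHH", timestamp_ms, seq & 0xFFFF, len(samples))
    return header + bytes(samples)

def audio_unmarshal(packet):
    """Inverse of audio_marshal: returns (seq, timestamp_ms, samples)."""
    timestamp_ms, seq, count = struct.unpack("!IHH", packet[:8])
    return seq, timestamp_ms, packet[8:8 + count]

# Assumed frame: 20 ms of 8 kHz, 8-bit audio is 160 bytes.
samples = bytes(range(160))
generic = generic_marshal(1, 20, samples)
tailored = audio_marshal(1, 20, samples)

print(len(generic), "bytes via the generic layer")
print(len(tailored), "bytes via the tailored audio format")
```

Under these assumptions the tailored packet is the payload plus a fixed 8-byte header, whereas the generic encoding is several times larger and must be parsed element by element on receipt. The exact ratio depends entirely on the generic encoding chosen, so the figures should be read as indicative only.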
11.3.3. The big picture
This thesis concludes by considering how the areas it covers may be viewed within the broader picture of computer science and society. This is considered in terms of inter-disciplinary research, inhabited television and ubiquitous computing.
Inter-disciplinary research
Inter-disciplinary research appears to be extremely difficult to maintain, with barriers of language and terminology, of philosophy and of ideology, adding to the more mundane logistical and administrative barriers which exist between departments and organisations. One of the interesting aspects of the work presented in this thesis is the way that it brings concerns from sociology and psychology together with issues of computer science, from CSCW and CHI to distributed systems and networking. The motivations behind CVEs derive in part from ethnographic observations of everyday work practice. Within the lifetime of the work presented here this has come full circle, with these same ethnographic techniques being applied to the environments which they have inspired (see in particular [Bowers, Pycock and O’Brien, 1996]). These observations yield new insight and understanding and in this way practice progresses. When inter-disciplinary research can be made to work it can be exceptionally productive and innovative. Perhaps one way in which it can be made to work, as in CVEs, is around a common ideal or driving application. Furthermore, within the framework of “Inhabited Television” (see below) artists, writers, producers and directors all become potential collaborators. It is to be hoped that this spirit (and reality) of collaboration will be maintained and enhanced, both within this area of work and also more generally. Computer science is not an end in itself: the highest duty of computer science is in the service of man (or the glory of God, depending on your perspective).
Inhabited TV
If one were to adopt a visionary perspective then perhaps the “killer application” for collaborative virtual environments would be “Inhabited Television”, i.e. making this type of technology and environment available to domestic users in their own homes. The traditional model of television is one of highly centralised production and coordinated large scale distribution to large numbers of passive viewers. This approach is required by the limitations of the technologies traditionally used for television, i.e. high-investment terrestrial broadcasting. The emerging area of Interactive TV (see for example [Salmony, 1995]) supplements this unidirectional flow of content with a (typically very low bandwidth) reverse channel, allowing simple feedback from individual viewers to the content provider. Uses for this back-channel typically include requesting content (e.g. video-on-demand) and simple responses to content such as voting and tele-shopping (ordering goods and services). Inhabited TV seeks to extend this model in two significant respects: firstly it introduces direct communication between the distributed viewers; and secondly it expands the size and nature of the back-channel to support richer interaction and moment-by-moment involvement with the content. That is, it allows isolated viewers to become involved participants: participating in a collective (mutually aware) audience and being able to participate directly in the content itself, for example by making a significant contribution to it. Moving from CVEs as they exist today to the notional future of Inhabited TV presents many major technical and social challenges. For example, the issue of scalability, which has been one of the concerns of this thesis, assumes massive (!) proportions. In the UK alone peak viewing figures for conventional broadcast TV regularly exceed 10 million simultaneous viewers. The global viewing figures for the funeral of Diana, Princess of Wales, in September 1997 were reported to be in excess of 2 billion. There is a long way to go to address this scale of use. There are also other issues such as providing users with effective navigation and rich interaction within a relatively unconstrained (and often social) domestic environment. Also, viewers will need to be able to move between various modes of (non-)participation, for example from passive viewing, through co-aware audience membership, to central participation. There are also profound issues relating to the content of Inhabited TV. What should it be like? How should it be structured? Who will create it and how? Who will control and manage it? And so on.
Mixed realities and ubiquitous computing
Bowers, O’Brien and Pycock [1996] make the point, in observing the process of staging a meeting in MASSIVE-1, that only part of the activity and communication is occurring within the system, i.e. within the virtual world. Each participant is still very much part of their own physical environment such as their office or their home. There seems to have been a tacit assumption in much research related to virtual reality that ideally people will “leave the physical world behind” and step whole-heartedly into the virtual, at least for the duration of use. This is clearly not usually the case. Nor is it even ideal except in relatively constrained training and entertainment applications. It is necessary, rather, to consider the real and virtual worlds evolving and coexisting in parallel at all times.
This may also be seen as a perspective or approach to the philosophy of ubiquitous computing [Weiser, 1993], i.e. the notion that everything, including normally mundane and everyday objects, should not only be computerised but should also be communicating and cooperating in a benign web of natural electronic assistance. Ubiquitous computing can be seen as breaking down the boundary between “real” and “virtual” (i.e. electronic) by infusing the real world with the virtual by technological means. Similarly, the techniques of augmented reality (as in [Bajura et al., 1992], for example) seek to supplement or overlay the real world with virtual artefacts and information. One might say that the ideal is not virtual reality, but mixed reality, a merging of real and virtual objects and of real and virtual spaces. This is an area which has begun to be explored by the author and others (see for example [Milgram and Kishino, 1994] and [Benford et al., 1996]). These approaches and ideas may have a profound influence both on the way in which these technologies evolve and ultimately upon the way in which technology and perhaps even reality are understood by society at large.
Appendix A. User profiling
This appendix presents a quantitative analysis of aspects of user behaviour in MASSIVE-1. This allows an approximate model of expected user behaviour to be built. Such a model is a key element in assessing the network and computational resources required to support varying numbers of users in any CVE system and is used in the network traffic models of chapters 6 and 10. Section A.1 begins by describing the sources of data which have been utilised. Section A.2 then goes on to present results for user movement while section A.3 considers the use of audio. The data presented here is based on the last six meetings held within the ITW project. These meetings were relatively free of technical problems, and more data is available for these than for other meetings. Consequently the results presented are derived from a relatively limited class of CVE usage, i.e. small structured meetings which are dominated by inter-personal communication. Nonetheless, they can provide a starting point for analysis and modelling in other application areas.
Further details of these trials can be found in [Greenhalgh et al., 1997].
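To indicate how a user behaviour model of this kind might feed a network traffic estimate, the following is a minimal sketch only. It is not the model of chapters 6 and 10: the class and function names are hypothetical, every numeric value is a placeholder rather than a measured result, and it assumes that every user receives every other user's movement and audio traffic over unicast with no scoping of awareness.

```python
from dataclasses import dataclass

@dataclass
class UserProfile:
    """Average per-user behaviour. All values used below are
    placeholders for illustration, not results from the trials."""
    move_updates_per_s: float  # mean rate of movement updates
    move_update_bytes: int     # size of one movement update
    speaking_fraction: float   # fraction of time spent sending audio
    audio_bytes_per_s: int     # audio stream rate while speaking

def per_user_send_rate(p):
    """Mean bytes/s generated by one user (movement plus audio)."""
    movement = p.move_updates_per_s * p.move_update_bytes
    audio = p.speaking_fraction * p.audio_bytes_per_s
    return movement + audio

def receive_rate_per_user(p, n_users):
    """Mean bytes/s arriving at one user from the other n_users - 1."""
    return (n_users - 1) * per_user_send_rate(p)

def total_unicast_rate(p, n_users):
    """Mean bytes/s over the whole network with one unicast copy
    of each user's traffic sent to every other user."""
    return n_users * (n_users - 1) * per_user_send_rate(p)

# Placeholder profile: one 100-byte move every 2 s, speaking 25% of
# the time at 8000 bytes/s.
profile = UserProfile(0.5, 100, 0.25, 8000)

for n in (5, 10, 20):
    print(n, receive_rate_per_user(profile, n), total_unicast_rate(profile, n))
```

Under these assumptions the load on each user grows linearly with the number of users, while the total network load grows quadratically, which is why even a coarse behavioural profile is useful when assessing how many users a given system and network could support.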