Content Selection through the Viewfinder - A Proof-of-Concept Implementation

5.3 A Proof-of-Concept Implementation

5.3.1 Content Selection through the Viewfinder

In both of our scenarios, we wanted to enable users to simply take a picture of the content of interest without requiring a separate connection mode. On startup, the device scanned the environment for existing Bluetooth services. Once it discovered the correct one (described by a UUID and a name), it connected to the display. This step is part of the application launch procedure and goes unnoticed by users: technically, the connection is established before any content is selected, but from the user’s point of view the procedure is transparent. While this type of connection is not suitable for multiple displays, it still allows for testing the overall idea of selecting content through the display. In the next section, we will describe how we apply this approach to multiple displays in the environment.
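The matching step described above can be illustrated with a minimal sketch. It does not use a real Bluetooth stack: the scan result is simulated as a list of records, and the UUID, service name, and the names `ServiceRecord` and `find_display_service` are hypothetical, since the text does not state the actual identifiers.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical identifiers: the text does not give the real UUID or service name.
DISPLAY_SERVICE_UUID = "00000000-0000-0000-0000-000000000001"
DISPLAY_SERVICE_NAME = "PublicDisplay"


@dataclass(frozen=True)
class ServiceRecord:
    """One advertised Bluetooth service, as a discovery scan might report it."""
    uuid: str
    name: str
    address: str  # Bluetooth address of the device hosting the service


def find_display_service(records: Iterable[ServiceRecord],
                         uuid: str = DISPLAY_SERVICE_UUID,
                         name: str = DISPLAY_SERVICE_NAME) -> Optional[ServiceRecord]:
    """Return the first record that matches both UUID and name, or None."""
    for record in records:
        if record.uuid.lower() == uuid.lower() and record.name == name:
            return record
    return None


# Simulated scan: one unrelated service and the display's service.
scan = [
    ServiceRecord("00001105-0000-1000-8000-00805f9b34fb", "OBEX Object Push",
                  "00:11:22:33:44:55"),
    ServiceRecord(DISPLAY_SERVICE_UUID, DISPLAY_SERVICE_NAME, "66:77:88:99:AA:BB"),
]
match = find_display_service(scan)
print(match.address if match else "no display found")
```

Requiring both the UUID and the name to match mirrors the description in the text; with a single display in range, either one alone would probably suffice.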

Figure 5.7: Connecting and selecting content through the display: While the application launches, the system already connects to the display using a specialized service name (a). Then the camera preview is shown, which allows users to aim at content of interest (b). After capturing an image, users can decide whether this image has the content in its center (c).

Once the device is connected, a live camera preview is shown. Users can now aim at the content they wish to select. When users are standing further away from the display, the wide-angle lenses built into mobile devices cause multiple content items to be shown simultaneously. While this actually benefits the aforementioned image processing, because several neighboring items are visible, it makes it harder for users to know which item is being selected. To facilitate this, the live video is overlaid with a crosshair that allows pixel-exact selection with respect to the mobile device’s display. Users can now use their mobile device as an absolute and direct pointing device to select content on the display (i.e., the item underneath the crosshair is selected). When they press the “select” button, the mobile device takes a picture at a higher resolution, and thus with more detail, than the live camera preview to ease image processing. The captured image is then sent immediately to the computer driving the display, which initiates the aforementioned content analysis. During this time, the mobile device displays a wait screen letting the user know that the system is gathering the selected content.
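The crosshair rule (the item underneath the center of the captured image is the selected one) can be sketched as a simple hit test. This is an illustration under assumptions, not the system's actual content analysis: the mapping from camera-image coordinates to display coordinates is taken as given (here a made-up scale and offset), and the names `Item`, `crosshair_position`, and `item_under_crosshair` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Item:
    """A content item and its bounding box in display pixels."""
    item_id: str
    x: int
    y: int
    width: int
    height: int

    def contains(self, point: Point) -> bool:
        px, py = point
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def crosshair_position(image_width: int, image_height: int) -> Point:
    """The crosshair sits at the center of the camera image."""
    return image_width / 2, image_height / 2


def item_under_crosshair(image_size: Tuple[int, int],
                         to_display: Callable[[Point], Point],
                         items: Sequence[Item]) -> Optional[Item]:
    """Map the crosshair into display coordinates and return the item it hits."""
    display_point = to_display(crosshair_position(*image_size))
    for item in items:
        if item.contains(display_point):
            return item
    return None  # nothing detected: the user is asked to select again


# Assumed mapping for illustration only; the real one comes from image analysis.
def example_mapping(p: Point) -> Point:
    return p[0] * 0.5 + 400, p[1] * 0.5 + 200


items = [Item("A", 300, 250, 200, 200), Item("B", 500, 250, 200, 200)]
selected = item_under_crosshair((480, 640), example_mapping, items)
print(selected.item_id if selected else "no item")
```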

If the system was not able to detect the selected content, users receive an error message asking them to select the content again in the described fashion. If the content has been detected, the system sends back a list of options describing which actions users can perform on this item. The list of actions is defined for each content item on the computer driving the display. In this way, the mobile device acts as a thin client that only displays information gathered from the public screen. As shown in figure 5.7, a thumbnail of the selected item is shown on top of the list. This preview allows users to determine whether the system actually selected the correct content. If the wrong item was selected, they can go back and select again. The list itself serves two purposes: first, users see the possible actions available for this item; second, they can immediately choose one of these options. To perform one of these actions, the mobile device only needs to present its generic user interfaces (i.e., writing text, showing an image, and playing an audio or video file). If more complex user interfaces are required (e.g., a drawing canvas), these interfaces need to be described by the content itself in order to be rendered on the mobile device. Different hardware (mostly regarding input) can be a limiting factor for this approach. For example, a drawing canvas can benefit from the presence of a touch screen, but has to work on a regular mobile device with keypad input as well.
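The exchange described above can be sketched as a small data structure on the display side. The format below is an assumption, since the text does not specify the message layout; the catalogue entries, file names, and the names `SelectionResponse` and `build_response` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical per-item catalogue kept on the computer driving the display.
# Each content item declares its own thumbnail and list of actions.
CATALOGUE: Dict[str, dict] = {
    "poster-1": {"thumbnail": "poster-1-thumb.jpg",
                 "options": ["download trailer"]},
    "photo-7": {"thumbnail": "photo-7-thumb.jpg",
                "options": ["view picture", "view comments"]},
}


@dataclass
class SelectionResponse:
    """What the thin client receives after sending its captured image."""
    ok: bool
    thumbnail: Optional[str] = None
    options: List[str] = field(default_factory=list)
    error: Optional[str] = None


def build_response(detected_item_id: Optional[str]) -> SelectionResponse:
    """Build the reply for a detection result (None means nothing was found)."""
    entry = CATALOGUE.get(detected_item_id) if detected_item_id else None
    if entry is None:
        return SelectionResponse(
            ok=False, error="Content not detected. Please select it again.")
    return SelectionResponse(ok=True, thumbnail=entry["thumbnail"],
                             options=list(entry["options"]))


print(build_response("poster-1"))
print(build_response(None))
```

Keeping the options on the display side means the client needs no knowledge of the content types; it only renders the thumbnail and the list it receives.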

Figure 5.8: Walkthrough of AdWall: When an item has been selected successfully, users can select from a list of options associated with the content (a). After selecting the option “download trailer”, the content is transferred to the mobile device (b). Users can then view the downloaded trailer associated with the selected movie poster (c).

As the bandwidth of Bluetooth is limited, we chose to transfer only the list of options and the thumbnail after the selection has been made. When users select an option, the additional content needs to be downloaded from the display to be rendered on the mobile device. While the content is being downloaded, the device again shows a screen notifying the user that data transfer is in progress. In early discussions with potential users, we found that this does not harm the interaction as long as the download time does not exceed one minute. Users are familiar with this concept since they use mobile internet, with possibly even longer loading times, throughout their day. Once the content has fully arrived, the device shows its predefined user interface. This interface is well known to the device’s owner, as they may use it for other media on their device. Instead of fully downloading the content, one may stream the data, which is a viable option for audio and video files. Comparing the download times of compressed media files (e.g., 20 seconds for 1 MB) with their playback length (e.g., 1 minute), however, reveals that downloading finishes faster than playback. In the streaming scenario, users have to remain in the vicinity of the display to watch the content until it ends. Downloading the content, on the other hand, allows them to watch it (1) later and (2) multiple times. For this reason, we favored downloading content over streaming it.
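The comparison above can be worked through with the example figures from the text (1 MB in 20 seconds, a clip length of 1 minute, and a one-minute tolerance for waiting). The sketch assumes 1 MB = 1000 kB and a constant transfer rate; the function names are hypothetical.

```python
def throughput_kb_per_s(size_kb: float, seconds: float) -> float:
    """Effective transfer rate implied by a measured download."""
    return size_kb / seconds


def max_size_kb(rate_kb_per_s: float, max_wait_s: float) -> float:
    """Largest file that still downloads within the tolerated waiting time."""
    return rate_kb_per_s * max_wait_s


# Example figures from the text; 1 MB is taken as 1000 kB (assumption).
clip_size_kb = 1000.0
download_s = 20.0
playback_s = 60.0

rate = throughput_kb_per_s(clip_size_kb, download_s)   # kB/s over Bluetooth
speedup = playback_s / download_s                      # download vs. real time
limit_kb = max_size_kb(rate, 60.0)                     # one-minute tolerance

print(f"effective rate: {rate:.0f} kB/s")
print(f"download is {speedup:.0f}x faster than playback")
print(f"largest file within one minute: {limit_kb:.0f} kB")
```

Under these assumptions, the one-minute clip is available after a third of its playback time, and files of up to roughly 3 MB would stay within the waiting time users considered acceptable.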

Naturally, the content could have been shown directly on the large display as well. This would avoid the waiting times resulting from downloading the media files. There are, however, two reasons against this procedure. First, as mentioned before, downloading the content allows users to watch it later and (if they want) repeatedly without being in the vicinity of the large display that hosts the information. Second, multiple users may produce race conditions. For example, the large display may show an image representation of a video clip. If this clip were now played on the large display, others joining the environment later would have missed the beginning and would have to wait until the original user is done watching it. For this reason, we preferred the large display as a point of accessing media over a point of consuming media.

5.3.2 Applications and Implementation

To demonstrate the use of this interaction technique, we implemented two applications called AdWall and PhotoWall. In the first application, the public display showed covers of movies and music albums that are available at a nearby store. To break the rigid role of a public display as a mere information broadcaster, users could select the image representation of an item using the aforementioned system. Once they selected an item, they were presented with options depending on the content. For movies, they were able to view a short trailer (about one minute) advertising it. If users selected a music album, the mobile device presented a list of all tracks. Choosing a track then caused the mobile device to play a 30-second sample of it (in lower quality for copyright protection). This approach is similar to the previews users can access in online stores such as Amazon. If they wanted to listen to another track, they could simply go back without having to select the item again. In this way, users get a personal preview of an album or video, which may help them decide whether they are interested. This application only allows selecting content from the remote display but not manipulating it.

In the second application, the public display showed pictures taken by various people, similar to online photo services such as flickr. In contrast to the first application, users are able to select a picture and read or write comments on it. Written comments are then immediately accessible to other users in the same way. As the public display shows a large number of pictures at the same time, these can be downloaded and displayed on the mobile device for a more detailed view. Selecting a picture results in two options: either viewing the picture or receiving the latest comments. If users choose to download the picture, the image is transferred and stored on the mobile device. Subsequently, the image viewer allows users to take a closer look at it using familiar interactions built into the mobile device. If they want to obtain the comments written for the selected picture, a list of comments is shown. Each comment contains the date and time, the author, and the text itself. Users can read these comments and also write their own. The interface for writing a comment is similar to the one used for writing text messages (e.g., SMS). After writing a comment, users can send it back to the display, allowing everyone else to read it. Although the comments are not directly displayed on the external screen, users are able to manipulate the content that is being sent to others.
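The comment handling described for PhotoWall can be sketched as a small store on the display side. The three fields of a comment (date and time, author, text) follow the text; everything else, including the class names, the rejection of empty comments, and the `limit` parameter, is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class Comment:
    """One comment as described in the text: date and time, author, text."""
    timestamp: datetime
    author: str
    text: str


class CommentStore:
    """Hypothetical store on the display computer, keyed by picture id."""

    def __init__(self) -> None:
        self._by_picture: Dict[str, List[Comment]] = {}

    def add(self, picture_id: str, author: str, text: str,
            timestamp: datetime) -> Comment:
        """Store a comment sent back from a mobile device."""
        if not text.strip():
            raise ValueError("comment text must not be empty")
        comment = Comment(timestamp, author, text)
        self._by_picture.setdefault(picture_id, []).append(comment)
        return comment

    def latest(self, picture_id: str, limit: int = 10) -> List[Comment]:
        """Return the newest comments first, at most `limit` of them."""
        comments = self._by_picture.get(picture_id, [])
        ordered = sorted(comments, key=lambda c: c.timestamp, reverse=True)
        return ordered[:limit]


store = CommentStore()
store.add("photo-7", "alice", "Great shot!", datetime(2010, 1, 5, 14, 30))
store.add("photo-7", "bob", "Where was this taken?", datetime(2010, 1, 5, 15, 0))
for c in store.latest("photo-7"):
    print(c.timestamp.isoformat(), c.author, c.text)
```

Because the store lives with the display rather than on a phone, a comment added by one user is available to the next user who selects the same picture.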

Figure 5.9: Walkthrough of PhotoWall: After selecting an item, users can choose from a list of options (a). After selecting the option “view comments”, previously written comments are shown (b). Users can then write their own comment in a text-based interface (c).

We implemented our prototype on a Bluetooth-enabled mobile phone (Sony Ericsson K800i) featuring a 3-megapixel camera. The phone’s display has a resolution of 240×320 pixels and a diagonal of 2 inches. The pictures taken by the built-in camera had a resolution of 480×640 pixels. We chose this size to allow fast transmission while still preserving details within the picture. The mobile device was further capable of rendering MP3 audio (i.e., music samples) and 3GP video data (i.e., video trailers). We simulated a public display using a 50-inch plasma screen with 1366×768 pixels (see figure 5.10). This is the same display used in the studies described in previous chapters. All software components on the mobile device were written in Java ME (CLDC 1.1 and MIDP 2.0). The components running on the public display were implemented in C# for displaying content and in Java for Bluetooth connectivity. The computer driving the large display featured a 3.0 GHz CPU and 2 GB of main memory to allow for fast image processing.
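To put the chosen picture size into perspective, the following short calculation compares it with the phone's screen and with the camera's nominal sensor resolution. It only uses the figures stated above; reading "3 megapixel" as exactly 3,000,000 pixels is an assumption.

```python
from typing import Tuple

SCREEN = (240, 320)          # phone display in pixels (from the text)
PHOTO = (480, 640)           # captured picture in pixels (from the text)
SENSOR_PIXELS = 3_000_000    # assumption: "3 megapixel" taken as exactly 3e6


def pixel_count(size: Tuple[int, int]) -> int:
    """Number of pixels in an image of the given width and height."""
    return size[0] * size[1]


photo_pixels = pixel_count(PHOTO)
photo_vs_screen = photo_pixels / pixel_count(SCREEN)
sensor_vs_photo = SENSOR_PIXELS / photo_pixels

print(f"captured picture: {photo_pixels} pixels")
print(f"{photo_vs_screen:.0f}x the pixels of the phone's screen")
print(f"about 1/{sensor_vs_photo:.1f} of the nominal sensor resolution")
```

The captured picture thus has four times as many pixels as the phone's screen, while carrying roughly a tenth of the pixels a full-resolution shot would have to send over Bluetooth.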

In document Boring, Sebastian (2010): Interacting "Through the Display": A New Model for Interacting on and Across External Displays. Dissertation, LMU München: Fakultät für Mathematik, Informatik und Statistik (Page 117-121)