7.1.2 MeetHub Search
The research and development of the search component, i.e. the MeetHub Search, were my major contributions to the joint project. The lower two corners on the MeetHub PC interface
with the MeetHub.
"Marquee" in the shared workspace (PC interface)
In the preceding chapter, we presented the TileSearch system, which used image or Wikipedia search results as contextual information scents; these results were returned by querying terms captured from conversational words. The study did not confirm which of the two was better in many respects, so the Marquee in the MeetHub system abandoned neither design. Instead, each Marquee presents a combination of the two kinds as information scents. Automatic search results are returned by searching with query terms built from various "text objects" in the MeetHub system. Figure 7.3(a) illustrates an example of a Marquee, which consists of two image thumbnails and one Wikipedia thumbnail scrolling from right to left. We tested several durations for the scrolling animation, from 10 seconds to 1 minute, and finally chose 25 seconds so that the Marquees moved neither too fast nor too slowly for users to recognize. A user can click a specific thumbnail during its "travel" to view more details in the Web browser.
How did the system build query terms? Building search queries from context is usually done in two phases. First, keywords are extracted from the context (the keyword extraction phase). Second, the keywords are combined according to specific rules to form queries (the query building phase). For the keyword extraction phase, both the RaindropSearch and the TileSearch in Chapter 6 extracted nouns as keywords. In the studies of the two systems, our subjects expressed dissatisfaction with the quality and relevance of the resulting information scents, which may be partially attributable to the simple keyword extraction approach. As discussed in the previous chapter, keyword extraction is a nontrivial task that can require sophisticated statistical models and natural language processing technology. Even then, its effectiveness may vary from context to context. As a result, there is no gold standard for it.
Jean-Louis et al. (2014) presented a comparative study of several online semantic annotators available on the Internet, and the AlchemyAPI performed well in many of the tests in that study. It provides a RESTful API (Web service) that can be easily integrated into existing applications. Therefore, the MeetHub Search employed the AlchemyAPI to find and rank keywords from the text on the shared workspace. Not only individual words but also phrases can be captured as keywords. For example, if the sentence "Every neuron has an electrical voltage on both sides of the membrane that is called the membrane potential" is fed into the AlchemyAPI, the extracted keywords, from high to low relevance, are "membrane potential", "electrical voltage", "neuron" and "sides".
In order to generate information scents from interaction context, we must first identify the user-generated objects that may contain contextual information in the form of text. There are three types of such objects in the MeetHub: (1) text typed into the text-editing widgets, such as text boxes or notes, in the shared workspace; (2) query terms typed directly into the Web browser search box; (3) text in the opened Webpages. The Webpages are not created by the users, but they are presented in the system as a result of user interactions. In addition, the intensity and transactivity of the interactions may also be considered as interaction context. We implemented three approaches to build queries from the keywords captured in the aforementioned text objects. Searching with the queries then yields results that lead to the construction of Marquees.
Typing-triggered Approach (TA)
The typing-triggered approach (TA) considers only text typed in the text box and note widgets as interaction context. The system detects delimiters such as "?", "!", ";" and ".", as well as the "return" keystroke. These symbols usually signal the end of a semantically meaningful text segment, which is then fed into the Alchemy Web service for keyword extraction. A time limit is also considered: if the user types none of the aforementioned delimiters and the text being edited stays idle for 15 seconds, the previously typed content is also fed for keyword extraction. Keyword extraction is not performed on the complete text of a text box or note widget. Instead, each time the system compares the current text with its state at the last keyword extraction, and only the changed text is used.
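The triggering rules above can be sketched roughly as follows. This is a minimal illustration, not the MeetHub implementation: the class name `TypingTrigger`, its methods, and the simplification that text is only appended (so the "changed text" is everything after the last extracted position) are all assumptions.

```python
import re

# Delimiters named in the text; the "return" keystroke is modeled as "\n".
DELIMITERS = re.compile(r"[?!;.\n]")
IDLE_LIMIT_S = 15.0  # idle time after which pending text is flushed


class TypingTrigger:
    """Hypothetical tracker for one text widget (assumes append-only edits)."""

    def __init__(self):
        self.text = ""
        self.sent_upto = 0    # characters already sent for keyword extraction
        self.last_edit = 0.0  # timestamp (seconds) of the most recent edit

    def on_edit(self, text, now):
        """Call on every change; returns a segment to extract, or None."""
        self.text, self.last_edit = text, now
        # If text was deleted, never point past the end of the widget content.
        self.sent_upto = min(self.sent_upto, len(text))
        pending = text[self.sent_upto:]
        ends = [m.end() for m in DELIMITERS.finditer(pending)]
        if not ends:
            return None
        # Flush everything up to and including the last delimiter typed.
        return self._flush(self.sent_upto + ends[-1])

    def on_tick(self, now):
        """Call periodically; flushes pending text after 15 idle seconds."""
        idle = now - self.last_edit
        if idle >= IDLE_LIMIT_S and len(self.text) > self.sent_upto:
            return self._flush(len(self.text))
        return None

    def _flush(self, upto):
        segment = self.text[self.sent_upto:upto].strip()
        self.sent_upto = upto
        return segment or None
```

A caller would pass each returned segment to the keyword extraction service. Handling edits in the middle of already extracted text would need a real diff, which this sketch leaves out.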
Usually one or more keywords or phrases are extracted, but occasionally none are, especially when the text segment consists only of stop words. Each keyword forms a query of its own and is fed into the search engine to retrieve results. The top two image results and the first Wikipedia result make up a Marquee.
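As a rough sketch of this last step, a Marquee can be assembled from the two result lists of one keyword query. The `Marquee` type and the two search callables are hypothetical placeholders for whatever search back ends are used, and returning `None` when nothing is found is an assumption, since the text does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Marquee:
    query: str
    thumbnails: List[str]  # up to two image results, then one Wikipedia result


def build_marquee(
    keyword: str,
    search_images: Callable[[str], List[str]],     # hypothetical image search
    search_wikipedia: Callable[[str], List[str]],  # hypothetical Wikipedia search
) -> Optional[Marquee]:
    """Build one Marquee from a single-keyword query."""
    images = search_images(keyword)[:2]       # top two image results
    articles = search_wikipedia(keyword)[:1]  # first Wikipedia result
    items = images + articles
    # Assumption: no Marquee is shown when both searches come back empty.
    return Marquee(keyword, items) if items else None
```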
Combinational Approach (CA)
The advantage of the TA is that it extracts instant interaction context from user-generated text, but the results may suffer from the absence of global context. In the combinational approach (CA), search queries combine the keywords extracted by the TA with a list of N (maximum = 5) pre-selected keywords that define the context of the task to be performed by users. Whenever a keyword or phrase is extracted, it is mixed with each combination of $\binom{N}{1}$ and $\binom{N}{2}$
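Under one reading of the combination step, each extracted keyword is paired with every single pre-selected keyword and with every pair of them. The sketch below follows that reading; the function name, the space-joined query format, and the order of the generated queries are assumptions.

```python
from itertools import combinations

MAX_TASK_KEYWORDS = 5  # upper bound on N stated in the text


def combinational_queries(extracted, task_keywords):
    """Mix one extracted keyword with all 1- and 2-element combinations
    of the pre-selected task keywords (assumed interpretation)."""
    if len(task_keywords) > MAX_TASK_KEYWORDS:
        raise ValueError("at most 5 pre-selected keywords are allowed")
    queries = []
    for size in (1, 2):
        for combo in combinations(task_keywords, size):
            queries.append(" ".join((extracted,) + combo))
    return queries
```

With this reading, one extracted keyword yields $\binom{N}{1} + \binom{N}{2}$ queries, which is 15 for the maximum of N = 5.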
weight: 4.0. The following types of terms are weighted 3.5 because they result from interactions with shared awareness: (1) search queries explicitly specified by users in the public Web browser on the wall display; (2) terms in the Wordcloud that are clicked by users (described later); (3) queries that were used to compose a "consumed" Marquee. Terms associated with the following situations are weighted 3.0 because they result from private interactions: (1) a user searches for a term on his or her own iPad; (2) a user clicks a suggested term in the Querylist (described later); (3) a user shares a link from the iPad to the wall display. In addition, keywords extracted from the text-editing widgets (as in the TA) have an initial weight of 2.0.
The weight of a keyword does not stay constant. Every time a new Marquee is generated, the weights of all words decrease by 0.5. If a word’s weight drops to zero, the word is removed. Additionally, the intensity and transactivity of keyword contributions are considered. Intensity refers to the recurrence of a certain word or phrase. Transactivity means that a keyword contributed by one user is repeated by another user. If a term is captured again from the same user, its weight increases by half. In the case of transactivity, the weight doubles. At the beginning, the system composes a search query from the single word with the highest weight and generates a Marquee accordingly. After the Marquee scrolls out of view, the next search takes the two words with the largest weights. The process continues, taking one more word each time to compose a query, until (1) the search engine returns no results or (2) the number of words to select exceeds the number of positively weighted words available. In either case, the number of words to select is reset to one, and the process described above repeats.
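The weighting rules can be sketched as a small keyword pool. This is an illustration under stated assumptions rather than the actual implementation: the class and method names are hypothetical, "increases by half" is read as multiplying by 1.5, and a term repeated by a user who has contributed it before always counts as intensity.

```python
class KeywordPool:
    """Hypothetical pool of weighted terms driving Marquee queries."""

    DECAY = 0.5  # weight lost by every term per generated Marquee

    def __init__(self):
        self.weights = {}  # term -> current weight
        self.owners = {}   # term -> set of users who contributed it
        self.size = 1      # number of words in the next query

    def add(self, term, user, initial_weight):
        if term not in self.weights:
            self.weights[term] = initial_weight
            self.owners[term] = {user}
        elif user in self.owners[term]:
            self.weights[term] *= 1.5  # intensity: same user repeats the term
        else:
            self.weights[term] *= 2.0  # transactivity: another user repeats it
            self.owners[term].add(user)

    def next_query(self):
        """Return the `size` highest-weighted terms, then grow `size` by one."""
        ranked = sorted(self.weights, key=self.weights.get, reverse=True)
        if self.size > len(ranked):
            self.size = 1  # more words requested than available: start over
        terms = ranked[:self.size]
        self.size += 1
        return terms

    def on_no_results(self):
        self.size = 1  # the search engine returned nothing: start over

    def on_marquee_generated(self):
        """Decay all weights and drop terms that reach zero."""
        for term in list(self.weights):
            self.weights[term] -= self.DECAY
            if self.weights[term] <= 0:
                del self.weights[term]
                del self.owners[term]
```

The initial weight is passed in by the caller, so the source-dependent values (for example 3.5, 3.0 or 2.0) stay outside the pool.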
Expected Usage and Benefits
When group users type into the text-editing widgets in the shared workspace, we assume their goal is to note things down or to express ideas. Usually there is no immediate, explicit need to search for information at the time of typing. The Marquees aim to capture users’ text interactions and present automatic search results as contextual information scents, which are expected to cue latent information needs.
"Wordcloud" in the Web browser (PC interface)
In the MeetHub Search, we designed a Wordcloud (Figure 7.3) alongside the Web browser on the wall display. When a user views a Web page, the Wordcloud displays up to 15 of the most relevant keywords the page contains, each in a distinct color. The keyword extraction is done via the AlchemyAPI, which returns a list of words or phrases with corresponding relevance scores. The sizes of the keywords are proportional to their relevance scores.
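The sizing rule can be illustrated with a short sketch. The function name, the `(text, relevance)` input format, and the 48-pixel maximum font size are assumptions for illustration; color assignment is left out.

```python
MAX_WORDS = 15  # maximum number of keywords shown, as stated in the text


def wordcloud_layout(keywords, max_font_px=48.0):
    """Map (text, relevance) pairs to (text, font size) pairs.

    Keeps the 15 most relevant keywords and scales each font size in
    proportion to its relevance. Assumes all relevance scores are positive.
    """
    top = sorted(keywords, key=lambda kw: kw[1], reverse=True)[:MAX_WORDS]
    if not top:
        return []
    best = top[0][1]  # the most relevant keyword gets the maximum size
    return [(text, max_font_px * rel / best) for text, rel in top]
```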
The Wordcloud serves as contextual information scents in a way similar to the words in the RaindropSearch presented in Chapter 6: the words are immediately searchable as query terms. A user can click a specific keyword, and the Web browser will navigate to the corresponding Google search result page.
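A clicked keyword can be turned into a search result address along the following lines. The helper name is hypothetical, and the sketch only assumes the common `q` query parameter of the Google search page.

```python
from urllib.parse import urlencode


def google_search_url(keyword):
    """Build the address of the Google result page for one keyword."""
    # urlencode escapes spaces and special characters in the keyword.
    return "https://www.google.com/search?" + urlencode({"q": keyword})
```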
Expected Usage and Benefits
When group users view a particular page, they may want to quickly grasp its main topic, and the Wordcloud may serve this purpose. The "cloud" layout offers instantly comprehensible situational information about the Webpage being viewed. Meanwhile, certain key concepts in the page may spark additional information needs. The benefit is the convenience of searching for potentially useful information with a single mouse click.
"Querylist" in the Web browser (iPad interface)
Due to space constraints, the shared workspace in the iPad interface does not feature Marquees. However, the search results carried by each Marquee, as well as the corresponding query terms, are displayed in the Querylist in the Web search panel alongside the Web browser. As Figure 7.3 illustrates, the query term (Lady Gaga in this example) is displayed as a list item. Clicking on the item expands it, which allows a user to find the images and Wikipedia articles that were previously displayed in the Marquees. The Querylist creates contextual information scents as "search memories", so that a user can be aware of all the searches conducted by the MeetHub system in the past.
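The Querylist behavior described here amounts to a list of past queries, each of which can be expanded to reveal its results. The sketch below is one possible shape for it; the class names and the newest-first ordering are assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QueryEntry:
    query: str
    results: List[str]     # images and Wikipedia articles shown in the Marquee
    expanded: bool = False


class Querylist:
    """Hypothetical "search memory" of queries issued by the system."""

    def __init__(self):
        self.entries: List[QueryEntry] = []

    def record(self, query, results):
        # Assumption: the most recent query appears at the top of the list.
        self.entries.insert(0, QueryEntry(query, list(results)))

    def toggle(self, index):
        """Expand or collapse one item; returns the results now visible."""
        entry = self.entries[index]
        entry.expanded = not entry.expanded
        return entry.results if entry.expanded else []
```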
Expected Usage and Benefits
The Querylist is only visible on iPads. When a user is focused on interacting with the iPad, his or her attention is away from the wall display. As a result, the user is not aware of which contextual information scents have been presented or what others have been doing in the meantime. The Querylist increases awareness of the system’s proactive behaviors and of others’ activity, which is likely to help collaboration as well as self-reflection.