For each pair of read and recommended news items (i_s, i_t), we generate a small set of explanations that go beyond showing only the title and a short abstract of the recommended item. The goal is to let users understand why the news item has been recommended to them and, consequently, to increase their awareness of the recommendation. One of the first problems we face is therefore deciding which kinds of explanations to generate and how. We adopted a pragmatic approach and came up with a set of 16 different explanations, grouped into three classes according to the type of feature used to generate them: text-based, entity-based, and usage-based explanations. In the following, we list the 16 kinds of explanations grouped by class.
5.4.1 Text-based explanations
The explanations in this class are all generated by exploiting the textual content of either the read or the recommended news item.
Similarity The recommended news item is similar to the one you are reading. This explanation considers the similarity between the content of the source and target news items. Similarity is computed by using cosine
5.4. GENERATION OF EXPLANATIONS 119
similarity with TF-IDF weighting, where the inverse document frequencies are defined over a large collection of news items. The Similarity explanation is triggered by a similarity score greater than a predefined threshold τ, fixed to 0.4 in all our experiments. The motivation for this explanation template is the following. News items are generally recommended by showing the title and the first sentence. This presentation method might not be enough for the user; if we also report how similar the two news items are, we inform the reader about one of the possible reasons why she might be interested in the recommended news item.
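The trigger for the Similarity explanation can be illustrated with a minimal sketch. The tokenizer, the smoothed IDF formula, and all function names below are assumptions made for illustration; only the use of TF-IDF weighted cosine similarity and the threshold τ = 0.4 come from the text.

```python
import math
import re
from collections import Counter

TAU = 0.4  # threshold value reported in the text


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens, no stemming or stop-word removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_idf(collection):
    """Return (idf, default_idf) computed over a background collection."""
    n_docs = len(collection)
    doc_freq = Counter()
    for doc in collection:
        doc_freq.update(set(tokenize(doc)))
    # Assumption: smoothed IDF; terms unseen in the collection get the maximum weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, math.log(1 + n_docs) + 1.0


def tfidf_vector(text, idf, default_idf):
    counts = Counter(tokenize(text))
    return {t: tf * idf.get(t, default_idf) for t, tf in counts.items()}


def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_explanation(source, target, idf, default_idf, tau=TAU):
    """Return the similarity score if it exceeds tau, otherwise None."""
    score = cosine(tfidf_vector(source, idf, default_idf),
                   tfidf_vector(target, idf, default_idf))
    return score if score > tau else None
```

In this sketch the IDF table is built once over the background collection and reused for every pair of news items, so only the two term-frequency vectors are computed per pair.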
SimilaritySnippet The explanation consists of a snippet containing the two most similar sentences extracted from the read and recommended news items. Cosine similarity is used in this case as well. The idea is very similar to the previous one: the two sentences added to the explanation play the same role as snippets (i.e., query-biased summaries) in search engine results.
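One way to read the SimilaritySnippet selection is as a search over sentence pairs, one sentence from each news item. The sketch below follows that reading. It assumes a naive regular-expression sentence splitter and, for brevity, raw term counts instead of TF-IDF weights; both choices, like the function names, are illustrative assumptions.

```python
import math
import re
from collections import Counter
from itertools import product


def split_sentences(text):
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(u, v):
    dot = sum(c * v[t] for t, c in u.items())
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def most_similar_pair(source_text, target_text):
    """Return the (source sentence, target sentence) pair with the highest cosine."""
    pairs = product(split_sentences(source_text), split_sentences(target_text))
    return max(pairs, key=lambda p: cosine(bag_of_words(p[0]), bag_of_words(p[1])), default=None)
```

The search is quadratic in the number of sentences, which is acceptable for news items of ordinary length.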
TargetSnippet This explanation is constructed by simply taking the first two sentences appearing in the recommended news item. The idea, here, is to increase the amount of content provided to the user. It is equivalent to increasing the space devoted to the news recommendation to fit the first two sentences. We include this template to evaluate the effect of putting more text in addition to the title of the news item.
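A minimal sketch of the TargetSnippet construction follows. The sentence splitter and the truncation policy are assumptions; the limit of 200 characters is the one stated for snippets at the end of this section.

```python
import re

MAX_CHARS = 200  # snippet length limit stated at the end of this section


def first_two_sentences(text, max_chars=MAX_CHARS):
    # Assumption: a sentence ends with ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    snippet = " ".join(sentences[:2])
    if len(snippet) > max_chars:
        # Assumption: cut at the limit and mark the cut with an ellipsis.
        snippet = snippet[: max_chars - 3].rstrip() + "..."
    return snippet
```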
Tagsplanation [152] Both the read and recommended news items share tags X. The explanation exploits the common tags X associated with the read and recommended news items, if any. Tags are generated by taking the 15 terms with the highest TF-IDF score from each news item.
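The tag generation and the shared-tag test can be sketched as follows. The IDF table is assumed to be precomputed over a background collection and passed in as a dictionary; the default weight for unseen terms, the tie-breaking rule, and the function names are illustrative assumptions. Only the choice of the 15 highest-scoring TF-IDF terms comes from the text.

```python
import re
from collections import Counter


def top_tags(text, idf, k=15, default_idf=1.0):
    """Return the k terms of `text` with the highest TF-IDF score."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {t: tf * idf.get(t, default_idf) for t, tf in counts.items()}
    # Assumption: ties are broken alphabetically to keep the output deterministic.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def shared_tags(source, target, idf, k=15):
    """Tags common to both news items, in the source item's ranking order."""
    target_tags = set(top_tags(target, idf, k))
    return [t for t in top_tags(source, idf, k) if t in target_tags]
```

An empty result from `shared_tags` corresponds to the case in which the Tagsplanation explanation cannot be produced.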
5.4.2 Entity-based explanations
This class contains explanations that use entities, or images associated with them, to convey the reason why a news item has been recommended to a user. Named entities are recognized and extracted with the SuperSense Tagger.1
SharedEntity The recommended news item is interesting because it tells about X as the one you are reading. The explanation tells the user that a given named entity X is shared between the read and recommended news items. We conjecture a user might be interested in some particular
120 CHAPTER 5. ON GENERATING NEWS EXPLANATIONS
fact about an entity (e.g., a person or a movie) since he was reading about that entity in the accessed news item.
TargetEntity The recommended news item is about X. The explanation presents to the user the named entity X extracted from the recommended news item if it does not appear in the read news item. The idea is to give the user the information that might be most important to him, i.e., the entity X that the news item discusses.
DistinctEntities The news item you are reading and the recommended one are about X and Y. The explanation presents the two main distinct entities extracted from the read and recommended news items. The idea is similar to that of the two previous types, but we aim at increasing the informativeness and capturing news diversity by adding one entity to the explanation.
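The three entity templates described so far can be sketched as set operations over the entities of the two news items. The sketch assumes that entity extraction has already produced, for each item, a list of entity names ordered by importance, so that the first element is the "main" entity; this ordering and the function name are assumptions. The template strings are the ones given in the text.

```python
def entity_explanations(source_entities, target_entities):
    """Fill the SharedEntity, TargetEntity and DistinctEntities templates when possible."""
    source_set = set(source_entities)
    shared = [e for e in target_entities if e in source_set]
    target_only = [e for e in target_entities if e not in source_set]
    out = {}
    if shared:
        out["SharedEntity"] = (
            "The recommended news item is interesting because it tells about "
            f"{', '.join(shared)} as the one you are reading"
        )
    if target_only:
        # Assumption: report the most important entity missing from the read item.
        out["TargetEntity"] = f"The recommended news item is about {target_only[0]}"
    if source_entities and target_entities and source_entities[0] != target_entities[0]:
        out["DistinctEntities"] = (
            "The news item you are reading and the recommended one are about "
            f"{source_entities[0]} and {target_entities[0]}"
        )
    return out
```

A template is simply omitted from the result when its precondition does not hold, which matches the remark that not every explanation can be produced for every pair.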
TargetImage The recommended news item is about X. The explanation shows an image X associated with the main named entity represented in the recommended news item. The image shown is taken from Freebase, using the provided API2. The idea is that “a picture is worth a thousand words”: we aim at increasing user awareness of the explanation by adding an image depicting the main subject of the news item.
Images The read and recommended news items are about X and Y, respectively. The explanation shows two images X and Y, associated with entities represented in the read and recommended news items, respectively. If X = Y, i.e., both news items refer to the same named entity, the image is shown only once. The purpose of this kind of explanation is the same as that of SharedEntity or DistinctEntities, but with a focus on images rather than textual descriptions of the entities involved.
SharedPlaces Both the read and recommended news items refer to the places X, Y, ... The explanation proposes the geo-names shared between the source and target news items, if any. The purpose is to let the reader understand the geographic setting of both the news item she is reading and the one she has received as a recommendation.
TargetPlaces The recommended news item refers to X, Y, ... This explanation suggests that the recommended news item refers to some geo-names. The idea is the same as for SharedPlaces, but we let the user think by herself of a possible relationship with the news item she is reading.
Source News: NFL star Junior Seau remembered in surfing ceremony
OCEANSIDE, California (Reuters) - Hundreds of surfers paid their respects to Junior Seau on Sunday with a “paddle-out” ceremony in the Pacific, just beyond the beachfront home where the football great and avid surfer committed suicide at age 43 . . .
Target News: Football great Junior Seau’s brain to be examined
LOS ANGELES (Reuters) - Football great Junior Seau’s brain will be examined for evidence of repetitive injuries from his playing days following the retired linebacker’s suicide in his California beachfront home . . .
Queries: The recommended news item is about: junior, seau
SharedEntity: The recommended news item is interesting because it tells about Junior Seau, San Diego Chargers as the one you are reading
SharedPlaces: Both the read and recommended news item refer to the place: Oceanside, CA, US, San Diego, CA, US
Images: The read and recommended news items are about Junior Seau
Figure 5.1: An example of explanations for sport-related news.
Categories Both the read and recommended news items are categorizable as X, Y, ... The explanation exploits the common categories (e.g., sport, politics) associated with the read and recommended news items, if any. Categories are supplied by the news provider. The idea is to explain to the user that the interest in this news item can be due to the categories it belongs to.
5.4.3 Usage-based explanations
Explanations of this class are extracted by considering how news items are accessed by users. We extract knowledge from toolbar data to explain the recommendations produced through some form of global statistics.
Popularity The recommended news item is interesting because N users have recently read it. The explanation highlights the popularity of the recommended news item. This explanation is independent of the read news item and has the purpose of telling the user that the recommended news item is important among users of the news platform.
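A minimal sketch of how N could be obtained from usage data follows. The log format, a sequence of (user, item, timestamp) tuples, and the one-day window used to define "recently" are hypothetical; the text does not specify either.

```python
from datetime import datetime, timedelta


def popularity_explanation(access_log, item_id, now, window=timedelta(days=1)):
    """Count distinct users who read `item_id` within `window` before `now`.

    `access_log` is a hypothetical iterable of (user_id, item_id, timestamp) tuples.
    """
    readers = {
        user
        for user, item, timestamp in access_log
        if item == item_id and now - window <= timestamp <= now
    }
    if not readers:
        return None
    return (
        "The recommended news item is interesting because "
        f"{len(readers)} users have recently read it"
    )
```

Counting distinct users rather than accesses keeps repeated visits by the same reader from inflating N.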
Source News: Conservative factions dominate Iran’s run-off elections
DUBAI (Reuters) - Iranian President Mahmoud Ahmadinejad, now out of favor with Supreme Leader Ayatollah Ali Khamenei, suffered more setbacks in a run-off parliamentary election seen as a pointer for next year’s presidential race. . .
Target News: Heavy fighting rocks eastern Syria: residents
AMMAN (Reuters) - Heavy fighting between rebels and government troops erupted overnight in the capital of an oil producing province in eastern Syria, residents and activists said on Sunday, the latest escalation of violence in a tribal area bordering Iraq
DistinctEntities: The news items you are reading and the recommended one are about Iranian President Mahmoud Ahmadinejad and government troops
TargetSnippet: AMMAN (Reuters) - Heavy fighting between rebels and government troops erupted overnight in the capital of an oil producing province in eastern Syria
SimilaritySnippet: (Reporting by Khaled Yacoub Oweis; Editing by Andrew Osborn) ...”The fighting subsided early in the morning...
TargetPlaces: The recommended news items refer to: Rif Dimashq, Syria
Figure 5.2: An example of explanations for geo-political news.
Queries The recommended news item is about S. The explanation is given by the set S of terms shared among the past queries for which the recommended news item was retrieved and clicked. The motivation for this explanation is that we want to give users a viewpoint consisting of the terms other users employed to reach that news item.
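The set S can be sketched as an intersection of query term sets. Reading "shared among the past queries" as "present in every query" is one possible interpretation and is an assumption here, as are the tokenizer and the function name.

```python
import re


def shared_query_terms(queries):
    """Terms that occur in every past query that led to a click on the item."""
    term_sets = [set(re.findall(r"[a-z0-9]+", q.lower())) for q in queries]
    term_sets = [terms for terms in term_sets if terms]  # ignore empty queries
    if not term_sets:
        return []
    return sorted(set.intersection(*term_sets))
```

With the queries "junior seau", "Junior Seau brain" and "seau junior death", the sketch returns the terms junior and seau, the same pair shown for the Queries explanation in Figure 5.1. A looser reading, such as keeping terms that appear in most queries, would need a frequency threshold instead of a strict intersection.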
TargetQueryBiasedSnippet The explanation consists of a query-biased snippet X for the recommended news item. The snippet is extracted from the news item by using the openNLP3 framework and the terms shared among the queries for which the news item was clicked.
SourceQueryBiasedSnippet Similar to TargetQueryBiasedSnippet, except that the terms used are those of the queries associated with the read news item and not the recommended one.
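A query-biased snippet can be sketched as a sentence selection driven by query terms. The sketch below replaces the openNLP framework mentioned in the text with a naive regular-expression sentence splitter, and it scores each sentence by the number of distinct query terms it contains; the scoring rule, the hard truncation, and the function names are assumptions. The limits of two sentences and 200 characters are those stated for snippets in this section.

```python
import re


def split_sentences(text):
    # Assumption: a stand-in for a proper sentence detector.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def query_biased_snippet(text, query_terms, n_sentences=2, max_chars=200):
    """Pick the sentences that contain the most query terms, in document order."""
    terms = {t.lower() for t in query_terms}
    sentences = split_sentences(text)

    def overlap(sentence):
        return len(terms & set(re.findall(r"[a-z0-9]+", sentence.lower())))

    ranked = sorted(range(len(sentences)), key=lambda i: (-overlap(sentences[i]), i))
    chosen = sorted(i for i in ranked[:n_sentences] if overlap(sentences[i]) > 0)
    snippet = " ".join(sentences[i] for i in chosen)
    return snippet[:max_chars]  # assumption: hard cut at the character limit
```

The same function covers both variants: for TargetQueryBiasedSnippet it would receive the recommended item and its query terms, for SourceQueryBiasedSnippet the recommended item and the query terms of the read item.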
Please observe that, for each pair of news items, it is not always possible to produce all 16 explanations. For instance, the Images explanation can be produced only if we are able to retrieve the entity images for both the source and the target news items. Finally, note that snippets generated for explanations contain two sentences and do not exceed 200 characters.