Music Recommendation
Listen to the music you like
Information Retrieval, 1st semester 2011/2012, Ricardo Dias, no. 55444
Bibliography
Music Recommendation and Discovery: The Long Tail, Long Fail, and Long Play
Òscar Celma, Springer, 2010, Ch. 1-3, 5
Recommender Systems
Prem Melville, Vikas Sindhwani, Encyclopedia of Machine Learning, 2010
Handbook of Multimedia for Digital Entertainment and Arts
MOTIVATION & CONTEXT
Music Consumption Change
[Figure: music consumption over time]
Digital Era – Portability
[Figure: up to ~20 tracks vs. up to ~40,000 tracks]
Digital Era – Online Services
[Figure: online services, e.g. Amazon]
Music Recommendation Problem
• Overwhelming number of choices of which
music to listen to
– Users feel:
• Paralyzed
• Doubtful
• Need to provide personalized filters and
recommendations to ease users’ decisions
• In the digital era, we can no longer rely only on the recommendation sources used before it:
– Radio stations
– Friends
– Local record dealers
– DJs and music experts
– Etc.
Music Characteristics
• Different from other types of media
– Tracking users’ preferences can be implicit
– Items can be consumed several times (even repeatedly and continuously)
– Instant feedback
– Music consumption depends on context (morning, work, afternoon, etc.)
Music Recommendation Specificities
• Current music recommendation algorithms try to accurately predict what people will want to listen to
– They focus on making accurate predictions about what a user could listen to, or buy, next, independently of how useful the provided recommendations are to the user
THE RECOMMENDATION PROBLEM
Formalization
• The recommendation problem can be split into two subproblems:
• Prediction problem: estimating the items’ likeliness for a given user
• Recommending a list of N items, assuming that the system can predict the likeliness of yet unrated items
Prediction Problem
• $U = \{u_1, u_2, \ldots, u_m\}$: the set of users
• $I = \{i_1, i_2, \ldots, i_n\}$: the set of items that can be recommended
• $I_{u_j}$: the list of items in which user $u_j$ has expressed interest
• Function $P_{u_a, i_j}$: the predicted likeliness of item $i_j$ for the active user $u_a$, where $i_j \notin I_{u_a}$
Recommendation Problem
• Find a list of N items $I_r \subset I$ that the user will like the most
• The ones with the highest $P_{u_a, i_j}$
• The resulting list should not contain items from the user’s interests, i.e. $I_r \cap I_{u_a} = \emptyset$
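The two steps above (predict, then pick the N best unseen items) can be sketched as follows. This is a minimal illustration, assuming the predicted likeliness values are already available as a dictionary; the function name `recommend_top_n` and the sample data are hypothetical.

```python
import heapq

def recommend_top_n(predicted, user_items, n):
    """Return the n items with the highest predicted likeliness,
    skipping items the user has already expressed interest in."""
    candidates = {item: score for item, score in predicted.items()
                  if item not in user_items}
    return heapq.nlargest(n, candidates, key=candidates.get)

# Hypothetical predicted likeliness P(u_a, i_j) for one active user.
predicted = {"song_a": 0.9, "song_b": 0.4, "song_c": 0.7, "song_d": 0.8}
user_items = {"song_a"}  # I_{u_a}: items already in the user's profile
top = recommend_top_n(predicted, user_items, n=2)
```

Here `song_a` has the highest score but is excluded because it already belongs to the user's interests.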
Use Cases
• Common usages of a recommender system:
1. Find good items
2. Find all good items
3. Recommend sequence (e.g. playlist generation)
4. Just browsing
5. Find credible recommender
6. Express self
General Model
• Users and Items
• Two types of
recommendations:
• Top-N predicted items
• Top-N predicted
User Profile Generation
• Two key elements:
• Generation and Maintenance
• Exploitation of the profile using a recommendation system
User Profile Creation
• Empty Profile – The simplest, but…
• Manual – Direct feedback to the system, but…
• Data import – create the profile from an
external representation
• Training Set – Provide feedback to concrete
items, marking them relevant or irrelevant to
user’s interests, but…
• Stereotyping – Assign a user into a cluster of
similar users that are represented by their
User Profile Maintenance
• Explicit Feedback
• Ratings (Problems?)
• Comments and Opinions
• Implicit Feedback
• Monitoring user’s actions (e.g., tracking play,
pause, skip and stop buttons in the media player, etc.)
• Problem?
User Profile Adaptation
• Adapt the system to users’ profile changes:
• Manually
• Adding new information while keeping the old
• Gradually forgetting old interest and promoting
Recommendation Methods
• Standard classification of recommender
systems:
1. Demographic Filtering
2. Collaborative Filtering
3. Content-based Filtering
4. Context-based Filtering
5. Hybrid Approaches
Demographic Filtering
• Used to identify the kind of users that like a
certain item
• Classifies user profiles in clusters based on:
• Personal data (age, gender, marital status, etc.)
• Geographic data (city, country)
Advantages/Limitations
• The simplest recommendation method
• But…
• Recommendations are too general
• Requires effort from the user to generate the profile
Collaborative Filtering
• Predict user preferences for items by learning
past user-item relationships
• CF methods work by building a matrix M with n items and m users that contains the interactions (e.g. ratings, plays, etc.) of the users with the items.
• The value $M_{u_a, i_j}$ represents the “rating” of user $u_a$ for item $i_j$
Collaborative Filtering Approaches
• Item-Based Neighborhood
• User-Based Neighborhood
• Matrix Factorization
Item-Based Neighborhood
• Only users that rated both items $i_j$ and $i_k$ are considered when computing the similarity between the two items
1. Compute the similarity between two items, i and j
– Example: adjusted cosine similarity
2. Predict, for the target user u, a value for the active item i
$S^k(i; u)$: the set of k neighbors of item i that the user u has rated
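The two steps can be sketched as below. The formulas used are the commonly cited ones and are an assumption here, since the slide's own equations did not survive: adjusted cosine subtracts each user's mean rating before comparing two items, and the prediction is the similarity-weighted average of the user's ratings over $S^k(i; u)$. Keeping only positively similar neighbors and falling back to the user's mean are illustrative choices; all names and ratings are hypothetical.

```python
from math import sqrt

def user_means(ratings):
    """Average rating of each user."""
    return {u: sum(r.values()) / len(r) for u, r in ratings.items()}

def adjusted_cosine(ratings, means, i, j):
    """Adjusted cosine similarity between items i and j, computed only
    over users who rated both items."""
    num = den_i = den_j = 0.0
    for u, r in ratings.items():
        if i in r and j in r:
            di, dj = r[i] - means[u], r[j] - means[u]
            num += di * dj
            den_i += di * di
            den_j += dj * dj
    if den_i == 0 or den_j == 0:
        return 0.0
    return num / (sqrt(den_i) * sqrt(den_j))

def predict_item_based(ratings, u, i, k=2):
    """Similarity-weighted average of u's ratings on the k items most similar to i."""
    means = user_means(ratings)
    sims = [(adjusted_cosine(ratings, means, i, j), j)
            for j in ratings[u] if j != i]
    # S^k(i; u): the k most similar items that u has rated (positive similarity only).
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    total = sum(s for s, _ in neighbours)
    if total == 0:
        return means[u]  # fallback (assumption): the user's mean rating
    return sum(s * ratings[u][j] for s, j in neighbours) / total

# Hypothetical explicit ratings on a 1..5 scale.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "bob": {"a": 4, "b": 5, "c": 2},
    "cat": {"a": 1, "b": 2, "c": 5},
    "dan": {"a": 5, "c": 1},  # has not rated item "b"
}
prediction = predict_item_based(ratings, "dan", "b")
```

In this toy data, item `b` is positively similar only to item `a`, so the prediction for `dan` is driven by his rating of `a`.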
User-Based Neighborhood
• Compute the predicted rating value of item i,
for the active user u, taking into account those
users that are similar to u
$\bar{r}_u$: average rating of user u
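A minimal sketch of a user-based prediction follows. It assumes the widely used mean-centered form, $\hat{r}_{u,i} = \bar{r}_u + \sum_v sim(u,v)(r_{v,i} - \bar{r}_v) / \sum_v sim(u,v)$ over the k most similar users, with Pearson correlation as the similarity. Both choices are assumptions, since the slide's equation is not available, and the names and data are hypothetical.

```python
from math import sqrt

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def pearson(ru, rv):
    """Pearson correlation between two users over their co-rated items."""
    common = set(ru) & set(rv)
    if len(common) < 2:
        return 0.0
    mu, mv = mean(ru[i] for i in common), mean(rv[i] for i in common)
    num = sum((ru[i] - mu) * (rv[i] - mv) for i in common)
    den = sqrt(sum((ru[i] - mu) ** 2 for i in common)
               * sum((rv[i] - mv) ** 2 for i in common))
    return num / den if den else 0.0

def predict_user_based(ratings, u, i, k=2):
    """Predict u's rating of item i from the k most similar users who rated i."""
    r_u_bar = mean(ratings[u].values())
    sims = [(pearson(ratings[u], r), v) for v, r in ratings.items()
            if v != u and i in r]
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    norm = sum(s for s, _ in neighbours)
    if norm == 0:
        return r_u_bar  # fallback (assumption): the user's own average
    offset = sum(s * (ratings[v][i] - mean(ratings[v].values()))
                 for s, v in neighbours)
    return r_u_bar + offset / norm

# Hypothetical explicit ratings on a 1..5 scale.
ratings = {
    "ann": {"a": 5, "b": 4, "c": 1},
    "bob": {"a": 4, "b": 5, "c": 2},
    "cat": {"a": 1, "b": 2, "c": 5},
    "dan": {"a": 5, "c": 1},  # has not rated item "b"
}
prediction = predict_user_based(ratings, "dan", "b")
```

Users `ann` and `bob` share `dan`'s taste while `cat` is negatively correlated and is ignored, so the prediction lands above `dan`'s average rating.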
Matrix Factorization
• Useful when the M user-item matrix is sparse
• Reduce dimensionality of the original matrix,
generating matrices U and V that approximate the original one
• Example: SVD – Singular Value Decomposition
• Computes an $n \times k$ matrix $U$ and an $m \times k$ matrix $V$, for a given number k, such that:
Matrix Factorization
• After matrix reduction we can calculate the
predicted rating value for item i for a user u
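As a rough illustration, a rank-k approximation can be obtained with NumPy's `numpy.linalg.svd`, and a predicted rating read from the reconstructed matrix. The sketch assumes rows are users and columns are items, and fills missing ratings with 0; both are simplifying assumptions rather than the method described in the slides.

```python
import numpy as np

def low_rank_approximation(M, k):
    """Rank-k approximation of M built from its truncated SVD."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Hypothetical user-item matrix (rows: users, columns: items).
# Zeros stand for missing ratings, which is a simplifying assumption.
M = np.array([[5, 4, 0, 1],
              [4, 5, 1, 0],
              [1, 0, 5, 4],
              [0, 1, 4, 5]], dtype=float)

M_hat = low_rank_approximation(M, k=2)
predicted = M_hat[0, 2]  # predicted rating of user 0 for item 2
```

Smaller values of k compress the matrix more aggressively; using the full rank reproduces the original matrix.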
Limitations
• Data sparsity and high dimensionality
• Gray sheep problem*
• Cold-start problem (early-rater problem)
• Does not take into account items’ descriptions
• Popularity Bias
Content-based Filtering
• Uses information describing the items
• Process of characterizing item data set can be:
• Manual (annotations by domain experts)
• Automatic (extracting features by analyzing the content)
• Similarity functions:
1. Euclidean
2. Manhattan
3. Chebyshev
4. Mahalanobis
Limitations
• Cold-start problem (only for new users)
• Gray-sheep problem
• Novelty (?)
• Limitations of automatically extracted features
• Subjective information (personal opinions) is not taken into account
Context-based Filtering
• Uses context information to describe and
characterize the items
• Context Information – any information that can be used to characterize a situation or an entity
• Context != Content
• Two main techniques:
• Web mining
• Social Tagging
Web Mining
• 3 different web mining categories:
• Web content mining
• text, hypertext, markup, and multimedia mining
• Web structure mining
• focuses on link analysis (in- and out- links)
• Web usage mining
• uses the information available on session logs. This information can be used to derive user habits and
preferences, link prediction, or item similarity based on co-occurrences in the session log
Social Tagging
• Aims at annotating web content using tags
• Tags are freely chosen keywords, not
constrained to a predefined vocabulary
• Recommender systems can derive social
• When users tag items, we get tuples of the form <user, item, tag>
• These triples form a 3-order tensor
• Two approaches to compute item (and user) similarity:
1. Unfold the 3-order tensor into three two-dimensional matrices (user-tag, item-tag and user-item matrices)
2. Use the 3-order tensor directly
Unfolding the 3-order tensor
• User-Tag (U matrix): $U_{i,j}$ contains the number of times user i applied tag j
• Item-Tag (I matrix): $I_{i,j}$ contains the number of times item i has been tagged with tag j
• User-Item (R binary matrix): $R_{i,j}$ denotes whether user i has tagged item j
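The unfolding can be sketched with plain dictionaries: count (user, tag) and (item, tag) pairs, record which (user, item) pairs occur, and then compare items through their tag vectors, for instance with cosine similarity. The annotations and helper names below are hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical <user, item, tag> annotations.
annotations = [
    ("u1", "song_a", "rock"), ("u1", "song_b", "rock"),
    ("u2", "song_a", "rock"), ("u2", "song_a", "guitar"),
    ("u2", "song_c", "jazz"), ("u3", "song_b", "guitar"),
]

user_tag = Counter((u, t) for u, _, t in annotations)  # U: times user applied tag
item_tag = Counter((i, t) for _, i, t in annotations)  # I: times item got tag
user_item = {(u, i) for u, i, _ in annotations}        # R: user tagged item

def item_vector(item):
    """Sparse row of the item-tag matrix for one item."""
    return {t: c for (i, t), c in item_tag.items() if i == item}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

sim_ab = cosine(item_vector("song_a"), item_vector("song_b"))
sim_ac = cosine(item_vector("song_a"), item_vector("song_c"))
```

Songs `song_a` and `song_b` share the tags `rock` and `guitar` and come out as similar, whereas `song_c` shares no tag with `song_a`.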
Unfolding the 3-order tensor
• Item similarity (using I) or user similarity (using U or R) can be computed using:
• Cosine-based distance
• Dimensionality reduction techniques (SVD, NMF)
• Then recommendations can be made by using:
• R user-item matrix or,
Using the 3-order tensor
• The available techniques are (high-order)
extensions of SVD and NMF
• HOSVD is a higher-order generalization of matrix SVD to tensors
• Non-negative Tensor Factorization (NTF) is a generalization of NMF
Limitations
• Coverage
• Problems with tags:
• Polysemy
• Synonymy
• Usefulness of personal tags
• Sparsity
Hybrid Approaches
• Goal
• Achieve better recommendations by combining some of the previous approaches
• Methods:
• Weighted
• Switching
• Mixed
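Of the methods listed, the weighted one is the simplest to sketch: each technique scores the items and the scores are combined linearly. The weights, the score scale and the function name below are illustrative assumptions.

```python
def weighted_hybrid(score_lists, weights):
    """Linearly combine per-method scores (dicts mapping item -> score)
    and return the items ranked by combined score."""
    combined = {}
    for scores, w in zip(score_lists, weights):
        for item, s in scores.items():
            combined[item] = combined.get(item, 0.0) + w * s
    return sorted(combined, key=combined.get, reverse=True)

# Hypothetical scores from a collaborative and a content-based recommender.
cf_scores = {"song_a": 0.9, "song_b": 0.2, "song_c": 0.5}
content_scores = {"song_a": 0.1, "song_b": 0.8, "song_c": 0.6}
ranking = weighted_hybrid([cf_scores, content_scores], weights=[0.7, 0.3])
```

With these weights the collaborative scores dominate, so `song_a` stays on top even though the content-based method ranks it last.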
Factors Affecting Recommendation
• Novelty and Serendipity
• Explainability (transparency)
• Cold Start Problem
• Data Sparsity and High Dimensionality
• Coverage
• Trust
• Attacks
• Temporal Effects
MUSIC RECOMMENDATION
Use Cases
• Main task of a music recommendation
system:
• Propose interesting music, consisting of a mix of known and unknown artists, as well as the
Use Cases
• Artist Recommendation
• Playlist Generation
• Shuffle, Random Playlists
• Personalized Playlists
User Profile Representation
• Extend user profile with music related
information
• Has not been largely investigated
• Useful to:
• Improve music recommendation
• Share with others your preferences
Type of Listeners
• Each type of listener needs different type of
recommendations
User Profile Representation Proposals
• Most relevant proposals are:
• User Modeling for Information Retrieval Language (UMIRL)
• MPEG-7 standard
User Modeling for Information Retrieval
• Allows one to describe perceptual and
MPEG-7 User Preferences
• User preferences in MPEG-7 include:
• Content filtering
• Searching and browsing preferences
FOAF: User Profiling in the Semantic Web
• Provides conventions and a language “to tell”
a machine the type of things a user says about
herself in her homepage
Item Profile Representation
• Music items:
• Artists
• Songs
Music Information Plane
• Music knowledge management categories:
• Editorial Metadata
• Cultural Metadata
• Acoustic Metadata
Music Description Facets
• Low-level Timbre Descriptors
• Spectral Centroid/Flatness/Skewness, MFCCs, etc.
• Instrumentation
• Rhythm
• Harmony
• Structure
• Intensity
• Genre
• Mood
Recommendation Methods
(examples and specificities)
• Collaborative Filtering (CF)
• Explicit/Implicit Feedback
• Content-Based Filtering
• Context-Based Filtering
• Hybrid Methods
Collaborative Filtering
• CF makes use of the editorial and cultural
information
• Explicit feedback – based on ratings about songs / artists
CF with Explicit Feedback
• Examples:
• Ringo – 1st music recommender based on CF and explicit feedback
• Racofi – based on CF and a set of logic rules based on Horn clauses
• Indiscover
CF with Implicit Feedback
• Main Drawbacks:
• The value that a user assigns to an item is not
always in a predefined range (e.g. from 1..5 or like it/hate it)
• Cannot gather negative feedback
• Recommendations are usually made at the artist level, but listening habits are recorded at the song level, so aggregation is needed
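Two of the drawbacks above can be illustrated together: song-level listening events are aggregated to artist-level play counts, and the unbounded counts are then mapped onto a fixed scale. The linear mapping relative to the user's most played artist is only an illustrative assumption, and the data and function names are hypothetical.

```python
from collections import Counter

def artist_play_counts(listens):
    """Aggregate song-level listening events to artist-level play counts."""
    return Counter(artist for artist, _song in listens)

def to_rating_scale(counts, levels=5):
    """Map play counts to 1..levels relative to the user's most played
    artist. This linear mapping is an illustrative assumption."""
    top = max(counts.values())
    return {a: 1 + round((levels - 1) * c / top) for a, c in counts.items()}

# Hypothetical listening history as (artist, song) events.
listens = ([("Artist X", "s1")] * 8 + [("Artist X", "s2")] * 2
           + [("Artist Y", "s3")] * 5 + [("Artist Z", "s4")])
counts = artist_play_counts(listens)
ratings = to_rating_scale(counts)
```

Note that a low value here only means few plays; it is not negative feedback, which is exactly the limitation listed above.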
Content-Based Filtering
• Uses content extracted from music to provide
recommendations
• Compute similarity among songs, in order to recommend music to the user
• Two ways to describe audio content:
• Manually
• Automatically
Manual Audio Content Description
• Very time consuming
• Scalability problems
• But…
• Annotations can be more accurate than automatic ones
• Example: Pandora
• Analysts annotate 400 parameters per song, using a ten-point scale per attribute
Automatic Audio Content Description
• Early work on audio similarity is based on low-level descriptors, such as Mel Frequency Cepstral Coefficients (MFCCs)
• Foote proposed a music indexing system based on MFCC histograms
• Audio features are usually aggregated using their mean and variance, or modeled as a Gaussian Mixture Model (GMM)
• Analyzes the audio signal and automatically extracts a set of features:
• Tzanetakis extracted a set of features representing the spectrum, rhythm and harmony (chord structure); these were then merged into a single vector and used to determine song similarity
• Cataltepe et al. presented a music recommendation system based on audio similarity, in which the user’s listening history is taken into account
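Once each song is described by a single feature vector, song similarity reduces to a distance between vectors. The sketch below implements three of the similarity functions listed earlier (Euclidean, Manhattan, Chebyshev) and ranks songs by distance to a query song. The feature values are hypothetical and assumed to be already normalized; Mahalanobis is omitted because it also needs a covariance matrix.

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    return max(abs(x - y) for x, y in zip(a, b))

def most_similar(query, songs, distance=euclidean, n=2):
    """Rank the other songs by increasing distance to the query song."""
    others = [s for s in songs if s != query]
    return sorted(others, key=lambda s: distance(songs[query], songs[s]))[:n]

# Hypothetical, already normalized feature vectors (one per song).
songs = {
    "song_a": [0.9, 0.2, 0.5],
    "song_b": [0.8, 0.3, 0.5],
    "song_c": [0.1, 0.9, 0.4],
    "song_d": [0.5, 0.5, 0.5],
}
nearest = most_similar("song_a", songs)
```

Passing `distance=manhattan` or `distance=chebyshev` swaps the measure without changing the ranking logic.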
Context-Based Filtering Techniques
• Uses cultural information to compute artist or
song similarity
• Mainly based on web mining techniques, or
mining data from collaborative tagging
• Example:
• M3 (Music for My Mood) uses context (season, month, day of the week, weather, temperature) and case-based reasoning to recommend music
Hybrid Methods
• Allow a system to minimize the issues that a single method can have
• How the cascade approach works:
• One technique is applied first, obtaining a ranked list of items. Then, a second technique refines or re-ranks the results obtained in the first step
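The cascade idea can be sketched in a few lines: the first technique selects a shortlist, and the second technique only re-orders that shortlist. The scores, the shortlist size and the function name are hypothetical.

```python
def cascade(first_scores, second_scores, shortlist_size, n):
    """Cascade hybrid: the first technique produces a shortlist,
    the second technique re-ranks only that shortlist."""
    shortlist = sorted(first_scores, key=first_scores.get,
                       reverse=True)[:shortlist_size]
    reranked = sorted(shortlist,
                      key=lambda item: second_scores.get(item, 0.0),
                      reverse=True)
    return reranked[:n]

# Hypothetical scores from a collaborative and a content-based recommender.
cf_scores = {"song_a": 0.9, "song_b": 0.8, "song_c": 0.7, "song_d": 0.1}
content_scores = {"song_a": 0.2, "song_b": 0.6, "song_c": 0.9, "song_d": 1.0}
result = cascade(cf_scores, content_scores, shortlist_size=3, n=2)
```

`song_d` has the best content-based score but never reaches the second stage, because the first technique did not shortlist it.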
Hybrid Method
• Example:
• Tiemann et al. investigate ensemble learning methods for hybrid music recommender
algorithms. Their approach combines social and
content-based methods, where each one
produces a weak learner. Then using a
combination rule, it unifies the output of the weak learners.
EVALUATION
Evaluation
• Three different strategies
• System-centric
• Network-centric
• User-centric
System-centric Evaluation
• Evaluation measures how accurately the system can predict the actual values that users have previously assigned
System-centric Evaluation
• Most approaches based on the leave-n-out
method
• Similar to the classic n-fold cross validation
• Dataset divided into two (usually disjoint)
sets:
• Training and Test
• Accuracy evaluation based only on a user’s
dataset
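A leave-n-out split for a single user can be sketched as follows: n of the user's ratings are held out as the test set and the rest form the training set. Random selection with a fixed seed is an assumption made for reproducibility; the profile is hypothetical.

```python
import random

def leave_n_out(user_ratings, n, seed=0):
    """Hold out n randomly chosen ratings of one user as the test set."""
    rng = random.Random(seed)
    held_out = set(rng.sample(sorted(user_ratings), n))
    train = {i: r for i, r in user_ratings.items() if i not in held_out}
    test = {i: r for i, r in user_ratings.items() if i in held_out}
    return train, test

# Hypothetical ratings of one user.
profile = {"song_a": 5, "song_b": 3, "song_c": 4, "song_d": 1, "song_e": 2}
train, test = leave_n_out(profile, n=2)
```

The recommender is then trained on `train` and its predictions are compared with the held-out values in `test`.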
System-centric Evaluation
• Metrics:
• Predictive accuracy
• Mean Absolute Error, Root Mean Square Error
• Decision based
• Mean Average Precision, Recall, F-measure, Accuracy, ROC
• Rank based
• Spearman’s ρ, Kendall’s τ, Half-life Utility, Discounted Cumulative Gain
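A few of these metrics are short enough to sketch directly. MAE and RMSE compare predicted and actual ratings on the test set; precision and recall compare a recommended list with the set of relevant items. The sample values are hypothetical.

```python
from math import sqrt

def mae(predicted, actual):
    """Mean Absolute Error over the items in the test set."""
    return sum(abs(predicted[i] - actual[i]) for i in actual) / len(actual)

def rmse(predicted, actual):
    """Root Mean Square Error over the items in the test set."""
    return sqrt(sum((predicted[i] - actual[i]) ** 2 for i in actual) / len(actual))

def precision_recall(recommended, relevant):
    """Precision and recall of a recommended list against the relevant items."""
    hits = len(set(recommended) & set(relevant))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical test-set ratings and the system's predictions for them.
actual = {"song_a": 4, "song_b": 2, "song_c": 5}
predicted = {"song_a": 3, "song_b": 2, "song_c": 3}
error_mae = mae(predicted, actual)
error_rmse = rmse(predicted, actual)
p, r = precision_recall(["song_a", "song_b", "song_d", "song_e"],
                        ["song_a", "song_c"])
```

RMSE penalizes the large error on `song_c` more heavily than MAE does, which is why it comes out higher here.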
System-centric Evaluation Limitations
• Cannot evaluate recommendations concerning:
1. Coverage
2. Novelty
3. Transparency (explainability)
4. Trustworthiness (confidence)
5. Perceived Quality
Network-centric Evaluation
• Evaluation aims at measuring the topology of
the item (or user) similarity network
• The similarity network is the basis to provide
the recommendations
• Important to analyze and understand the
underlying topology of the similarity network
• Measures:
• Coverage
Network-centric Evaluation
• In terms of:
• Navigation
• Average Shortest Path, Giant Component
• Connectivity
• Degree Distribution, Degree-Degree Correlation, Mixing Patterns
• Clustering
• Local/Global Clustering Coefficient
• Centrality
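Some of these measures can be computed on a small similarity network with the standard library alone. The sketch below covers the average shortest path (via breadth-first search), the local clustering coefficient and node degrees, assuming an undirected graph stored as adjacency sets. The graph itself is hypothetical.

```python
from collections import deque

def average_shortest_path(graph):
    """Mean shortest-path length over all connected ordered pairs (BFS)."""
    total = pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in graph[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def local_clustering(graph, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    nbs = list(graph[node])
    k = len(nbs)
    if k < 2:
        return 0.0
    links = sum(1 for a in range(k) for b in range(a + 1, k)
                if nbs[b] in graph[nbs[a]])
    return 2 * links / (k * (k - 1))

# Hypothetical undirected artist similarity network (adjacency sets).
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
avg_path = average_shortest_path(graph)
degrees = {node: len(nbs) for node, nbs in graph.items()}
```

The degree distribution follows from `degrees`; node `c` acts as the hub that connects `d` to the rest of the network.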
Network-centric Evaluation
• Limitations:
• Accuracy of the recommendations cannot be measured
• Transparency (explainability) and trustworthiness (confidence) of the recommendations cannot be measured
• The perceived quality (i.e. usefulness and
effectiveness) of the recommendations cannot be measured
User-centric Evaluation
• Evaluation focuses on the user’s perceived quality and usefulness of the recommendations
User-centric Evaluation
• Copes with the limitations of both:
• System- and Network-centric approaches
• Evaluates:
• Novelty
User-centric Evaluation
• Gathering Feedback (Explicit, Implicit)
• Perceived Quality
• Novelty
Perceived Quality
• Easiest way to measure?
• Explicitly ask the users
• User needs information about:
• Item (e.g. metadata, preview, etc.)
• Reasons why the item was recommended
• Then can rate the quality of each recommended item (or the whole list)
Novelty
• Ask users if they recognize the predicted items
or not
• Combining novelty and perceived quality we
can infer if:
• User likes to receive and discover unknown items
• Prefers more conservative and familiar
A/B Testing
• Present two different versions of an algorithm
(or two algorithms)
• Evaluate which one performs the best
• Performance measured by the impact the
new algorithm has on the visitors’ behavior,
User-centric Evaluation
• Limitations:
• Need of user intervention in the evaluation process
• Gathering feedback from the user can be tedious for some users
Evaluation summary
• Combining the three methods we can cover
Evaluation summary
• System-centric
• Evaluates performance accuracy of the algorithm
• Network-centric
• Analyses the structure of the similarity network
• User-centric
• Measures users’ satisfaction with the recommendations they receive
DATASETS FOR EVALUATION
Last.fm Dataset – 1K users
• Contains <user, timestamp, artist, song>
tuples
• Represents the listening habits of ~1,000 users
• Collected from the Last.fm API
• User.getRecentTracks()
• Statistics:
• ~108,000 artists with MusicBrainz ID
• ~70,000 artists without MusicBrainz ID
Last.fm Dataset – 360K users
• Contains tuples <user, artist, plays> from 360,000 users
• Collected from the Last.fm API
• User.getTopArtists()
• Statistics:
• ~190,000 artists with MusicBrainz ID
The Million Song Dataset
• One Million Songs!!!
• 280 GB of data
• ~45,000 unique artists
• ~8,000 unique terms
• > 2 million asymmetric similarity relationships
• Acoustic features
• Pitch, Timbre, Loudness, etc.
• Links to other sources to obtain more information
NEXTONE PLAYER
NEXTONE PLAYER: A Music Recommendation System Based on User Behavior
Yajie Hu and Mitsunori Ogihara
Session-based CF for Music Recommendation
Session-based Collaborative Filtering for Predicting the Next Song
Sung Eun Park, Sangkeun Lee, Sang-goo Lee