Chapter Summary - Learning implicit recommenders from massive unobserved feedback

In this chapter, we have presented a novel ranking predictor, Lambda Factorization Machines (LambdaFM). LambdaFM is a negative sampling based algorithm, although it is analyzed from the top-N ranking perspective. In contrast with PRFM, LambdaFM has more advanced samplers, which can effectively find informative negative examples. LambdaFM is unique in two aspects: (1) it is capable of optimizing various top-N item ranking metrics in implicit feedback settings; (2) it can flexibly incorporate various context information for context-aware recommendations. Different from the original lambda strategy, which is tailored for the CE loss function, we have proved that the proposed sampling surrogates are more general and applicable to a set of well-known ranking loss functions. Furthermore, we have built a family of PRFM and LambdaFM algorithms, shedding light on how they perform in real tasks. In our evaluation, we have shown that LambdaFM largely outperforms the state-of-the-art counterparts in terms of four standard ranking measures. The methodologies and conclusions in this chapter support thesis statements (1) and (2).

3.7 Chapter Summary

other research fields with positive-only data, e.g., the word embedding and visual semantic embedding tasks, where the user-item relation in the item recommendation task can be regarded as the word-word relation in the word embedding task (and the image-class relation in the visual semantic embedding task). Understanding this, it is possible to adapt a model from one domain to another with only slight changes. For example, in Guo et al. (2018a), we successfully adapt the ranking and negative sampling method in LambdaFM for the word embedding task; and in Guo et al. (2018b), we observe that the adaptive sampler proposed in Rendle and Freudenthaler (2014) for item recommendation may also be adapted to improve the repeated sampling process in Weston et al. (2011) for image recognition, although Weston et al. (2011) is not based on the BPR loss, which is the main claim in Rendle and Freudenthaler (2014). The main reason that a specific model can be used in very different scenarios is probably that the data in the three research fields have similar distributions. Note, however, that an algorithm may perform slightly differently across them, since the characteristics of these datasets are not exactly the same. Empirically, the data in the visual semantic embedding task is much sparser than that in the word embedding task. Moreover, different learning algorithms are also impacted differently by these sampling methods. Here, we intend to clarify the similarities and differences between these works. This observation suggests that many specific models developed in one of these fields are likely to benefit the others with minor (or no) changes. We believe this will open a new direction of research to bridge these fields (Yuan et al., 2018b).

Boosting Factorization Machines

In this chapter, we design an ensemble method, called Boosting Factorization Machines (BoostFM), that applies LambdaFM (Yuan et al., 2016b) and PRFM (Qiang et al., 2013) as component recommenders. From this perspective, BoostFM is also a negative sampling based model for implicit feedback scenarios. BoostFM combines the strengths of boosting and factorization machines during the process of item ranking. Specifically, BoostFM is an adaptive boosting framework that linearly combines multiple homogeneous component recommenders, each of which is built on an individual FM model and constructed repeatedly through a re-weighting scheme. To demonstrate its effectiveness, we perform experiments on three publicly available datasets and compare BoostFM (with uniform and static sampling) to state-of-the-art baseline models.

This chapter is mainly based on our previous work “BoostFM: Boosted Factorization Machines for Top-N Feature-based Recommendation” (Yuan et al., 2017), published in The 22nd Annual Meeting of The Intelligent User Interfaces Community (IUI) 2017 with DOI: http://dx.doi.org/10.1145/3025171.3025211.

4.1 Introduction

Ensemble learning has become a prevalent method to boost machine learning results by combining several models. In this chapter, we make contributions on ensemble-based recommendation models. Specifically, we apply boosting techniques to improve Factorization Machines (FM) in implicit feedback scenarios. Boosting techniques were first employed to improve the performance of classification by integrating a set of weak classifiers (i.e., the classification accuracy rate should be larger than 0.5) into a stronger one with better performance (Jiang et al., 2013). Note that since the employed component recommenders all perform significantly better than random guessing, we do not specify this requirement. Previous research has shown that boosting techniques usually come with better convergence properties and stability (Bertoni et al., 1997; Chowdhury et al., 2015). So far, the most common implementation of boosting is AdaBoost (Freund and Schapire, 1997), although some newer boosting variants have been reported (Freund, 2001; Freund et al., 2003; Xu and Li, 2007). Boosting techniques have recently been introduced to solve recommendation problems, with better results than single collaborative filtering (CF) algorithms (Jiang et al., 2013; Wang et al., 2014; Liu et al.; Chowdhury et al., 2015). However, all existing solutions are based on the basic matrix factorization model, which fails to incorporate more general context information. Moreover, in our work the learning process of each component recommender is optimized for top-N recommendation with implicit feedback, which differs from most previous work, which is either optimized for rating prediction (Jiang et al., 2013) or optimized for ranking but on explicit rating datasets (Chowdhury et al., 2015; Wang et al., 2014).

In this chapter, we propose BoostFM, a boosting approach for top-N context-aware CF, which combines the most well-known boosting framework, AdaBoost, with FM. Specifically, we first choose Factorization Machines to build the component recommender, and multiple homogeneous component recommenders are then linearly combined to create a strong recommender. The coefficient of each component recommender is calculated from a weight function based on a certain performance metric. At each boosting round, we devise a ranking objective function to optimize the component recommender, following PRFM and LambdaFM. That is, each component recommender in BoostFM is also based on the negative sampling method. In addition, during learning, we develop a re-weighting strategy that assigns a dynamic weight to each observed context-item interaction, forcing the optimization to concentrate more on the interactions with poor evaluation performance.
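The boosting loop described above can be illustrated with a minimal sketch. It is not the BoostFM implementation: this introduction does not give the exact weight function or re-weighting rule, so the sketch assumes AdaRank-style formulas for both. The component recommender is a toy latent-factor model trained with a pairwise logistic loss and uniform negative sampling, standing in for an FM, and the per-interaction metric is a simple per-pair ranking accuracy. All function names (`train_component`, `pair_metric`, `boost`) are hypothetical.

```python
import math
import random


def train_component(pairs, weights, n_users, n_items, rng, dim=4, steps=3000, lr=0.05):
    """Toy stand-in for one component recommender (NOT a real FM).

    Observed pairs are drawn in proportion to `weights`, so interactions that
    the ensemble currently ranks badly are visited more often.
    """
    U = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    seen = set(pairs)
    for _ in range(steps):
        u, i = rng.choices(pairs, weights=weights, k=1)[0]
        j = rng.randrange(n_items)            # uniform negative sampling
        if (u, j) in seen:
            continue
        x = sum(U[u][k] * (V[i][k] - V[j][k]) for k in range(dim))
        g = 1.0 / (1.0 + math.exp(x))         # gradient scale of -log(sigmoid(x))
        for k in range(dim):
            uk, ik, jk = U[u][k], V[i][k], V[j][k]
            U[u][k] += lr * g * (ik - jk)
            V[i][k] += lr * g * uk
            V[j][k] -= lr * g * uk
    return lambda u, i: sum(a * b for a, b in zip(U[u], V[i]))


def pair_metric(score, u, i, n_items, seen):
    """Assumed metric: fraction of unobserved items ranked below item i, in [0, 1]."""
    negs = [j for j in range(n_items) if (u, j) not in seen]
    if not negs:
        return 1.0
    s = score(u, i)
    return sum(s > score(u, j) for j in negs) / len(negs)


def boost(pairs, n_users, n_items, rounds=3, seed=0):
    rng = random.Random(seed)
    seen = set(pairs)
    weights = [1.0 / len(pairs)] * len(pairs)  # start from uniform weights
    ensemble = []                              # (coefficient, component scorer)

    def combined(u, i):
        # Strong recommender: linear combination of the components so far.
        return sum(b * f(u, i) for b, f in ensemble)

    for _ in range(rounds):
        f = train_component(pairs, weights, n_users, n_items, rng)
        m = [pair_metric(f, u, i, n_items, seen) for u, i in pairs]
        # Assumed AdaRank-style coefficient from the weighted metric values.
        num = sum(w * (1 + e) for w, e in zip(weights, m))
        den = sum(w * (1 - e) for w, e in zip(weights, m))
        beta = 0.5 * math.log(num / max(den, 1e-12))
        ensemble.append((beta, f))
        # Assumed re-weighting: badly ranked interactions get larger weights.
        raw = [math.exp(-pair_metric(combined, u, i, n_items, seen)) for u, i in pairs]
        z = sum(raw)
        weights = [r / z for r in raw]
    return combined, weights, [b for b, _ in ensemble]
```

Calling `boost(pairs, n_users, n_items)` on a list of observed `(user, item)` index pairs returns the combined scorer, the final interaction weights, and the per-round coefficients. Swapping `train_component` for a PRFM or LambdaFM trainer and `pair_metric` for the chosen ranking measure would bring the sketch closer to the method outlined in this chapter.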
