Conclusion - Towards Recommender Engineering: tools and experiments for identifying recommender

1. Replace each qualified dependency (𝑞, 𝜏) with a dependency on a synthesized type 𝜏_𝑞 ⊆ 𝜏. These synthetic types may be realized as actual types in the runtime environment, or they may exist only as bookkeeping entities in the DI container.

2. Modify the initial component request (𝑞, 𝜏, 𝜒) to be (𝑞_⊥, 𝜏_𝑞, 𝜒) if 𝑞 ≠ 𝑞_⊥.

3. Modify the bindings as follows:

• Bind each synthetic qualifier type 𝜏_𝑞 to a constructor 𝑐_𝑞(𝜏) of type 𝜏_𝑞 with 𝒟(𝑐_𝑞(𝜏)) = ⟨𝜏⟩.

• For each binding 𝑏 = (𝑞̄, 𝜏̄, 𝜒̃) → 𝑡, substitute the binding (𝜏̄, 𝜒̃ ++ ⟨𝑞̃⟩) → 𝑡, where 𝑞̃ matches any synthetic constructor 𝑐_𝑞(𝑡) whose corresponding qualifier is matched by 𝑞̄.

We show how to modify bindings; any computable binding function can be modified similarly to look for contexts terminating in 𝑡_𝑞.

This reduction works by replacing each qualified dependency (𝑞, 𝜏) resulting in a constructor 𝑐 with a constructor chain 𝑐_𝑞(𝑡) ⟶ 𝑐. Any policy that examines the qualifier attached to a dependency type can instead look at the context to see whether the type is being configured to satisfy the dependency of a synthetic constructor.

After this reduction, the only qualifier in use is 𝑞_⊥, and the only qualifier matcher is ⊤, so qualifiers can be removed entirely. Qualifiers are therefore not only unneeded; they add no expressive power beyond context-sensitive policy.
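The reduction can be illustrated with a small sketch. This is not Grapht code: the `Dep` class, the `tau@q` naming scheme for synthetic types, and the `make:` prefix for synthetic constructors are all hypothetical, chosen only to make step 1 and the first bullet of step 3 concrete. The `context_matches` helper stands in for a qualifier matcher rewritten as a test on the end of the context.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dep:
    """A dependency on type_name; qualifier None plays the role of q_bottom."""
    qualifier: Optional[str]
    type_name: str


def synthetic_type(qualifier: str, type_name: str) -> str:
    """Name of the synthesized type tau_q (hypothetical naming scheme)."""
    return f"{type_name}@{qualifier}"


def reduce_qualifiers(constructors: dict[str, list[Dep]]) -> dict[str, list[Dep]]:
    """Rewrite qualified dependencies as dependencies on synthetic types.

    `constructors` maps a constructor name to its dependency list. Each
    qualified dependency (q, tau) becomes an unqualified dependency on tau_q,
    and a synthetic constructor for tau_q is added whose only dependency is tau.
    """
    reduced: dict[str, list[Dep]] = {}
    for name, deps in constructors.items():
        new_deps = []
        for dep in deps:
            if dep.qualifier is None:
                new_deps.append(dep)
                continue
            tau_q = synthetic_type(dep.qualifier, dep.type_name)
            new_deps.append(Dep(None, tau_q))
            # Synthetic constructor c_q(tau) with D(c_q(tau)) = <tau>.
            reduced[f"make:{tau_q}"] = [Dep(None, dep.type_name)]
        reduced[name] = new_deps
    return reduced


def context_matches(context: list[str], qualifier: str, type_name: str) -> bool:
    """True when the innermost context element is the synthetic constructor
    for (qualifier, type_name); this replaces matching on the qualifier."""
    target = f"make:{synthetic_type(qualifier, type_name)}"
    return bool(context) and context[-1] == target
```

After `reduce_qualifiers` runs, no dependency carries a qualifier, so a policy that previously matched on a qualifier can instead ask whether the context ends in the corresponding synthetic constructor.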

4.7 Conclusion

This chapter has described an approach to dependency injection based on a mathematical model of dependencies and their solutions, together with a Java implementation of this model.


Grapht provides static analysis capabilities, context-sensitive policy, and extensive defaulting capabilities, allowing it to meet the needs of LensKit better than existing solutions. Our approach to context-sensitive policy allows expressive matching on deep context with easy configuration. For configuring recommender applications, this allows LensKit’s individual components to be reconfigured into arbitrarily complex hybrid configurations, enabling extensive code reuse. One of LensKit’s design goals is to provide an extensive collection of building blocks that can be combined into sophisticated algorithms, and the ability to configure them without extensive and verbose object instantiation code is crucial to that aim. We have also shown that context-sensitive policy is strictly more powerful than the dependency qualifiers provided by many current dependency injection frameworks; while we expect that qualifiers will live on due to their convenience, they can be viewed as syntactic sugar on top of a more expressive paradigm.

There are a variety of extensions to dependency injection that may be worth considering in the future. One is weighted dependency injection: under this scheme, constructors or bindings have associated weights expressing the cost of using them, and the injector tries to find the lowest-cost solution to the component request. This problem is likely NP-hard.

Opportunistic dependency injection is a simplified extension that is likely more practical. In opportunistic DI, some optional dependencies are marked as “opportunistic”, meaning that they will only be instantiated and used if required by some other component as a non-opportunistic dependency. They differ from normal optional dependencies in that an optional dependency will be supplied if it is possible to satisfy the dependency given the binding function, while an opportunistic dependency is only supplied if the configured constructor is invoked to satisfy some other dependency in the DIP. The key use case for this extension is when a component A can operate more efficiently if an expensive component B is available, but the efficiency gain alone is not sufficient to warrant the cost of instantiating B. If some other component requires B, however, then A can take advantage of it under opportunistic DI. In LensKit this comes up with some of the data structures used for iterative training of models such as the FunkSVD model. The structures used to make the FunkSVD model training process efficient can be used by many other components to decrease time and memory requirements, but it is not worth the cost of computing them just to compute the mean of the ratings in the system.
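One way to picture opportunistic DI is as a two-pass resolution, sketched below. This is an illustration only, not a Grapht feature: it identifies components by name, ignores context and binding functions, and the `resolve` function and the component names in the usage note are hypothetical.

```python
from __future__ import annotations


def resolve(
    root: str,
    required: dict[str, list[str]],
    opportunistic: dict[str, list[str]],
) -> tuple[set[str], dict[str, list[str]]]:
    """Return (components to instantiate, opportunistic dependencies wired).

    `required` and `opportunistic` map a component name to the names of the
    components it depends on in each mode.
    """
    # Pass 1: collect everything reachable from the root via required edges.
    needed: set[str] = set()
    stack = [root]
    while stack:
        comp = stack.pop()
        if comp in needed:
            continue
        needed.add(comp)
        stack.extend(required.get(comp, []))

    # Pass 2: an opportunistic dependency is supplied only when its target is
    # already being instantiated for some non-opportunistic reason.
    wired: dict[str, list[str]] = {}
    for comp in needed:
        available = [d for d in opportunistic.get(comp, []) if d in needed]
        if available:
            wired[comp] = available
    return needed, wired
```

For example, with `required = {"Recommender": ["MeanRating", "FunkSVDModel"], "FunkSVDModel": ["PackedRatings"]}` and `opportunistic = {"MeanRating": ["PackedRatings"]}`, resolving `"Recommender"` wires `PackedRatings` into `MeanRating`, while resolving `"MeanRating"` alone instantiates nothing else.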

Grapht has proven to be a valuable tool in making LensKit flexible and easy to use. We hope that its well-defined model and straightforward implementation will make it a useful platform for future developments in dependency injection.

Chapter 5

Configuring and Tuning Recommender Algorithms

This chapter presents several offline experiments we have run using LensKit. These experiments serve two primary purposes: to improve our understanding of the behavior of different algorithms and algorithm configurations, and to validate LensKit by reproducing and extending previous results. The diversity of the experiments we present here, and their accompanying source code, also demonstrates the flexibility and usefulness of LensKit for a variety of recommender research tasks; the experiments are also independent research contributions in their own right.¹

¹ This work was done in collaboration with Michael Ludwig, Jack Kolb, Lingfei He, John T. Riedl, and Joseph A. Konstan. Portions have been published in [Eks+11]; other portions are currently in preparation. Jack Kolb and Lingfei He were particularly involved in the work on tuning baselines and item-item CF.

We first present a comparative evaluation of several design decisions for collaborative filtering algorithms, in the spirit of previous comparisons within a single algorithm [HKR02; Sar+01]. We examine LensKit’s user-user, item-item, and regularized gradient descent SVD algorithms. These experiments extend previous comparative evaluations to larger data sets and multiple algorithm families, and serve to demonstrate the versatility of LensKit and its capability of expressing a breadth of algorithms and configurations. In considering some configurations omitted in prior work, we have also found new best performers for algorithmic choices, particularly for the user-user similarity function and the normalization for cosine similarity in item-item CF. This set of experiments serves to show LensKit’s versatility in recommender experimentation and to fill in gaps in our current understanding of how to tune and configure commonly used collaborative filtering algorithms. These experiments also provide insight into possible strategies for systematically tuning recommender system parameters (or the difficulties of doing so, in the case of FunkSVD).

We conclude this chapter with some results on the impact of rank-based evaluation on recommender configuration and design.

5.1 Data and Experimental Setup

These experiments use several common data sets:

ML-100K The MovieLens 100K data set, consisting of 100K user ratings of movies from the MovieLens movie recommendation service.

ML-1M The MovieLens 1M data set.

ML-10M The MovieLens 10M data set. This data set also has 100K ‘tag applications’, events where users apply a tag to a movie.

Y!M The Yahoo! Music data set, containing user ratings of songs on the Yahoo! Music service and made available through the Yahoo! WebScope program. Unlike the MovieLens data sets, which have a single file of rating data, this data set is pre-split into 9 train/test segments. We do not re-combine the data, but use each train-test split as-is from Yahoo!.

Y!M Subset A subset of one of the training sets in the Yahoo! Music data set. The subset was produced by sampling 10% of the items and retaining all their ratings. We use a subset so that we can experiment with the sparser domain while maintaining reasonable experimental throughput. LensKit is capable of running on the full data set, but it takes substantial time to build and evaluate such models, making it difficult to conduct extensive experiments.

Table 5.1 summarizes the size and sparsity of these data sets.

Data Set     Range          Ratings       Users       Items     |𝑅|/|𝑈|   |𝑅|/|𝐼|   Density
ML-100K      [1, 5]/1       100,000       943         1682      106.04    59.45     6.305%
ML-1M        [1, 5]/1       1,000,209     6040        3706      165.60    269.89    4.468%
ML-10M       [0.5, 5]/0.5   10,000,054    69,878      10,677    143.11    936.60    1.340%
Y!M          [1, 5]/1       717,872,016   1,823,179   136,736   393.75    5250.06   0.288%
Y!M Subset   [1, 5]/1       7,713,682     197,930     13,673    38.97     564.15    0.285%

Table 5.1: Rating data sets
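The derived columns of Table 5.1 follow directly from the rating, user, and item counts: density is |𝑅| / (|𝑈| · |𝐼|). The short sketch below recomputes the ML-100K row; the `summarize` function is a hypothetical helper, not part of LensKit.

```python
def summarize(ratings: int, users: int, items: int) -> tuple[float, float, float]:
    """Return (|R|/|U|, |R|/|I|, density as a percentage)."""
    return (
        ratings / users,
        ratings / items,
        100.0 * ratings / (users * items),
    )


# ML-100K counts taken from Table 5.1.
per_user, per_item, density = summarize(100_000, 943, 1682)
print(f"{per_user:.2f} {per_item:.2f} {density:.3f}%")  # 106.04 59.45 6.305%
```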

Most of our results use the ML-100K and ML-1M data sets. ML-100K allows us to directly replicate and compare with prior work, while ML-1M provides significantly more data while remaining small enough for good experimental throughput. We also ran some configurations on ML-10M and Yahoo! Music. Unless otherwise specified, all charts are over the ML-1M data set.

For each data set, we performed 5-fold cross-validation with LensKit’s default method described in section 3.8.2. Ten randomly selected ratings were withheld from each user’s profile for the test set, and the data sets only contain users who have rated at least 20 items. The Y!M data set is distributed by Yahoo! in 10 train-test sets, with each test set containing 10 ratings from each test user; we used the provided train/test splits and did not re-crossfold for this experiment.
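The splitting procedure can be sketched as follows. Section 3.8.2 is not reproduced here, so this is an assumption about the general shape of the method rather than LensKit's implementation: users are partitioned into folds, and for each fold the test users have a fixed number of randomly chosen ratings withheld while all other ratings go to training. The function name and parameters are hypothetical.

```python
import random
from collections import defaultdict


def user_crossfold(ratings, n_folds=5, holdout=10, seed=0):
    """Yield one (train, test) pair of rating lists per fold.

    `ratings` is an iterable of (user, item, rating) tuples. Every user is a
    test user in exactly one fold.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for row in ratings:
        by_user[row[0]].append(row)

    # Partition the users into n_folds disjoint groups.
    users = sorted(by_user)
    rng.shuffle(users)
    folds = [users[i::n_folds] for i in range(n_folds)]

    for test_users in folds:
        test_set = set(test_users)
        train, test = [], []
        for user, rows in by_user.items():
            if user in test_set:
                # Withhold `holdout` random ratings; the rest stay in training.
                shuffled = rows[:]
                rng.shuffle(shuffled)
                test.extend(shuffled[:holdout])
                train.extend(shuffled[holdout:])
            else:
                train.extend(rows)
        yield train, test
```

Because every user in these data sets has at least 20 ratings, withholding 10 always leaves each test user with some training data.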

For each train-test set, we built a recommender algorithm and evaluated its prediction performance using MAE, RMSE, and nDCG.
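For reference, the three metrics can be sketched in a few lines. MAE and RMSE are standard; the nDCG shown here uses a log2(rank + 1) discount with the actual rating as gain, which is one common formulation and is an assumption, since the exact variant used in these experiments is not stated in this section. All function names are hypothetical.

```python
import math


def mae(pred, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(pred, actual)) / len(actual)


def rmse(pred, actual):
    """Root mean squared error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def dcg(gains):
    """Discounted cumulative gain with a log2(rank + 1) discount (assumed form)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(pred, actual):
    """Rank items by predicted rating, score by actual rating, and normalize
    by the DCG of the ideal (actual-rating) ordering."""
    ranked = sorted(zip(pred, actual), key=lambda pa: pa[0], reverse=True)
    by_pred = [a for _, a in ranked]
    best = dcg(sorted(actual, reverse=True))
    return dcg(by_pred) / best if best > 0 else 0.0
```

MAE and RMSE measure how close the predicted values are to the withheld ratings, while nDCG depends only on the order the predictions induce, so a predictor can improve on one kind of metric without improving on the other.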
