
Note that it directly follows that Observations 3.1, 3.2, and 3.3, which state the “independence” of the range parameterizations from the distance parameterizations, carry over to d.

Simjour [189] improved the running times provided in this chapter for the parameters k, d_m, d_a, and d by decreasing the base of the exponential term. The underlying algorithmic ideas rely on transforming Kemeny Score to Weighted Feedback Arc Set and then applying and extending known algorithms for Weighted Feedback Arc Set. Recently, Karpinski and Schudy [142] provided a fixed-parameter algorithm with respect to d whose running time is subexponential in d. Clearly, this implies subexponential running times with respect to k, d_m, and d_a as well.

Above guaranteed value. Parameterization above guaranteed value was introduced by Mahajan and Raman [160]. For Kemeny Score, the point is that

$$L := \sum_{\{a,b\} \subseteq C} \min\{\pi(a, b), \pi(b, a)\},$$

where π(a, b) denotes the number of votes that rank a higher than b, is an obvious lower bound for the Kemeny score k. Hence, it is interesting to parameterize above this guaranteed lower bound, more precisely, by the parameter “k − L”. Applying a parameter-preserving reduction from Kemeny Score to a weighted variant of Directed Feedback Vertex Set, Mahajan et al. [161] observed fixed-parameter tractability with respect to k − L.
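To make the lower bound L concrete, the following minimal sketch computes π, L, and (by brute force over all permutations) the optimal Kemeny score for a tiny election. The function names and the representation of votes as strings of candidate symbols are assumptions made for illustration only; the brute-force search is not the algorithm of Mahajan et al. [161].

```python
from itertools import combinations, permutations

def pairwise_counts(votes):
    """Return pi with pi[(a, b)] = number of votes ranking a higher than b."""
    candidates = votes[0]
    pi = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for vote in votes:
        pos = {c: i for i, c in enumerate(vote)}
        for a, b in combinations(candidates, 2):
            if pos[a] < pos[b]:
                pi[(a, b)] += 1
            else:
                pi[(b, a)] += 1
    return pi

def lower_bound(votes):
    """L = sum over candidate pairs {a, b} of min(pi(a, b), pi(b, a))."""
    pi = pairwise_counts(votes)
    return sum(min(pi[(a, b)], pi[(b, a)])
               for a, b in combinations(votes[0], 2))

def kemeny_score(ranking, votes):
    """Sum of KT-distances between `ranking` and all votes."""
    pi = pairwise_counts(votes)
    # Placing a before b disagrees with every vote that ranks b higher than a.
    return sum(pi[(b, a)] for a, b in combinations(ranking, 2))

def optimal_kemeny_score(votes):
    """Brute force over all rankings; only feasible for very few candidates."""
    return min(kemeny_score(r, votes) for r in permutations(votes[0]))

# A Condorcet cycle: every ranking must contradict one pairwise majority.
votes = ["abc", "bca", "cab"]
L = lower_bound(votes)
k = optimal_kemeny_score(votes)
print(L, k, k - L)
```

In this example every pair contributes 1 to L, but no ranking can agree with all three pairwise majorities at once, so the optimal score exceeds L by 1. The parameter k − L measures exactly this excess.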

3.9 Conclusion

We initiated a multivariate complexity analysis of Kemeny Score including the identification of meaningful parameterizations such as the “average KT-distance” and “candidate range”. Our corresponding results are displayed in Table 3.1 (Section 3.2).

In the meantime, our results have been extended, and some of them improved, by several authors [142, 161, 189]. An overview of the state of the art of the parameterized complexity of Kemeny Score and the two generalizations allowing for ties or incomplete information is given in Table 3.2. There are numerous challenges for future studies:

• In several applications, it is useful to compute not just one optimal Kemeny consensus but to enumerate all of them. Simjour [189] provided a search-based algorithm for enumerating all Kemeny consensuses and showed that the exponential part of the running time is at most 36^d. Can this result be improved and extended to other parameterizations?

• A long-standing open question regards the computational complexity of Kemeny Score with three votes [77, 2]. From the many-one reduction provided by Dwork et al. [77], NP-hardness follows only for an even number of votes, that is, for all odd fixed values such as 3, 5, 7, ... the computational complexity is still open.

• Regarding Kemeny Score with Ties, the running times for many parameters as provided in Table 3.2 still exhibit a high combinatorial explosion and hence call for improvement. Note that the results for the “average KT-distance” rely on an approach based on data reduction (see Chapter 4). An additional algorithm that is not based on data reduction and complements this result would be desirable. While the strategy provided in this chapter does not immediately transfer to the case with ties (see Subsection 3.7.1), it might be possible to adapt the algorithms provided by Simjour [189].

It might also be interesting to investigate whether special cases of Kemeny Score with Ties such as p-ratings and top-m lists [1] allow for more efficient algorithms.

In addition to the above questions regarding Kemeny Score directly, there are several closely related problems for which it might be interesting to investigate how far the results obtained for Kemeny Score can be transferred.

• Kemeny Score is a median problem: it seeks a preference list minimizing the sum of distances to the votes. Analogously, one can seek a preference list minimizing the maximum distance (that is, search for the center instead of the median).

Due to an application in graph drawing, this problem is known as Crossing Permutation and its computational complexity has been investigated by Biedl et al. [32]. Some first results regarding the parameterized complexity have been obtained by Schwarz [186]. One possible interpretation of Crossing Permutation in the context of voting concerns scenarios in which it is mandatory to protect minorities. Then, one might look for an outcome of an election that minimizes the damage for the “most aggrieved voter”.

• Fagin et al. [92] introduced various distance measures between “top k lists”, for example, the Hausdorff Kendall distance. For every such distance between two lists one can define a consensus problem analogous to Kemeny Score.

• The Metric s-Median problem can be stated as follows (see Shindler [188] for a survey). Given a set N of points in some metric space and some integers s and k, it asks whether there is a size-s subset K ⊂ N such that the sum of the distances of all points in N to their nearest element of K is at most k. Since the KT-distance is a metric, Kemeny Score can be considered as a special case of this problem with s = 1, that is, searching for one consensus. It might be interesting to identify scenarios where one is looking for a set of consensus rankings and to investigate the computational complexity of the corresponding problems. From a voting point of view, this directly leads to a “multiple winner” scenario.
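The difference between the median objective (sum of distances, as in Kemeny Score) and the center objective (maximum distance, as in Crossing Permutation) discussed in the list above can be illustrated with a small brute-force sketch. The function names and the encoding of votes as strings are assumptions for illustration; the exhaustive search is only meant for a handful of candidates and is not an algorithm from the cited works.

```python
from itertools import combinations, permutations

def kt_distance(r1, r2):
    """Kendall tau distance: number of candidate pairs ordered differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b]))

def best_ranking(votes, aggregate):
    """Exhaustively find a ranking minimizing aggregate(distances to votes).

    aggregate=sum gives the median objective, aggregate=max the center one.
    """
    def cost(ranking):
        return aggregate(kt_distance(ranking, v) for v in votes)
    best = min(permutations(votes[0]), key=cost)
    return best, cost(best)

votes = ["abc", "abc", "cba"]
median, median_cost = best_ranking(votes, sum)  # follows the majority
center, center_cost = best_ranking(votes, max)  # protects the minority vote
print(median, median_cost)
print(center, center_cost)
```

Here the median coincides with the two identical votes, leaving the third vote at the maximum possible distance, whereas any center ranking keeps every vote within distance 2. This matches the “most aggrieved voter” interpretation.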

In the following chapter, we further extend our results for the parameter average KT-distance by developing a new data reduction methodology.

Chapter 4

Partial kernelization for Kemeny

In the previous chapter, we started a multivariate analysis of Kemeny Score, which will be further extended in this chapter. We provide some polynomial-time data reduction rules with performance guarantee for Kemeny Score. More specifically, we show that the number of candidates in a reduced instance only depends on the “average KT-distance” and another, newly introduced parameter. Then, fixed-parameter tractability with respect to these parameters follows from the fixed-parameter algorithm with respect to the “number of candidates” from Section 3.3. Although for the parameter “average KT-distance” d_a our results do not improve the bound on the worst-case running time of 2^{O(d_a)} · poly [142] (see also Table 3.2), efficient polynomial-time data reduction clearly complements the previous results. Experiments showing the practical value of data reduction for the computation of Kemeny rankings are provided in Chapter 5.

Methodology. The results of this chapter rely on a new methodological framework for intractable median problems such as Kemeny Score. In median problems one is given a set of objects and the task is to find a “consensus object” that minimizes the sum of distances to the given input objects. The framework was exhibited for Kemeny Score with and without ties, and the problems Swap Median Permutation and Consensus Clustering [25]. Here, we only focus on Kemeny Score.

Our algorithmic framework shows that if the input objects are sufficiently “similar on average”, then there are provably effective data reduction rules.

Within our framework, two points deserve particular attention. The first is the identification of polynomial-time solvable special cases of the underlying problems. The second is a novel concept of kernelization based on polynomial-time data reduction that does not yield problem kernels in the classical sense of parameterized algorithmics but still allows for “partial problem kernels”. The basic idea can be explained as follows. In multi-dimensional problems, a partial kernelization reduces at least one dimension such that its size only depends on the parameter value. In our case, the reduced dimension refers to the “number of candidates” and the parameter to the “average KT-distance”. The concept of partial kernelization promises to be useful beyond the problems and parameterizations for which the framework has been exhibited [25].
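As a reminder of what the parameter measures, the following sketch computes an average KT-distance for a small election. It assumes the usual definition, namely the KT-distance averaged over all pairs of distinct votes; the precise definition used in this work is given in Chapter 3 and may differ in rounding. The function names and the string encoding of votes are illustrative assumptions.

```python
from itertools import combinations

def kt_distance(r1, r2):
    """Kendall tau distance: number of candidate pairs ordered differently."""
    pos1 = {c: i for i, c in enumerate(r1)}
    pos2 = {c: i for i, c in enumerate(r2)}
    return sum(1 for a, b in combinations(r1, 2)
               if (pos1[a] < pos1[b]) != (pos2[a] < pos2[b]))

def average_kt_distance(votes):
    """Average KT-distance over all pairs of distinct votes (assumed definition)."""
    n = len(votes)
    if n < 2:
        return 0.0
    total = sum(kt_distance(u, v) for u, v in combinations(votes, 2))
    # combinations() visits each unordered pair once; there are n(n-1)/2 of them.
    return 2 * total / (n * (n - 1))

votes = ["abcd", "abdc", "bacd"]
print(average_kt_distance(votes))
```

A small value means the votes are “similar on average”, which is the situation in which the data reduction rules of this chapter are provably effective.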

On the way to proving our results with respect to the parameter “average distance”, we introduce another measurement of dissimilarity, the “number of dirty elements”, which can be considered as an alternative parameterization. We also show fixed-parameter tractability with respect to this parameterization. As we will see, both parameterizations are closely related. In comparison, the “average distance” seems to be the more intuitive parameter and the easier one to understand, whereas the “dirtiness” parameterization seems to yield stronger results.

Results. Our results for Kemeny Score are summarized as follows. We introduce a concept of “dirtiness” for candidates and pairs of candidates. This concept is used to identify a polynomial-time solvable special case and allows for efficient data reduction rules resulting in a linear-size partial kernel with respect to the “average KT-distance” d_a. More specifically, our new data reduction rules can transform every instance into an equivalent one that contains fewer than 11 · d_a candidates. We further classify different “degrees of dirtiness” and, depending on this degree, obtain either a linear or a quadratic partial kernel with respect to the “number of dirty pairs”. Finally, we briefly discuss analogous results for Kemeny Score with Ties, settling the so far open question of fixed-parameter tractability with respect to d_a.

4.1 Framework and basic definitions

The outline of our framework adapted to Kemeny Score reads as follows.

Step 1. Identify a polynomial-time solvable special case by defining a concept of “dirtiness” for candidates and proving that an instance without dirty candidates can be easily solved.

Step 2. Show that the number of dirty candidates is bounded from above by a polynomial only depending on the average KT-distance.

Step 3. Develop polynomial-time data reduction rules such that in a reduced instance the number of nondirty candidates is bounded from above by a polynomial only depending on the number of dirty candidates and, thus, also only depending on the average distance.

Step 4. Exploit the fact that Kemeny Score is fixed-parameter tractable with respect to the number of candidates (see Section 3.3).

This framework yields fixed-parameter tractability with respect to both parameters “average KT-distance” and “number of dirty candidates”. In general, fixed-parameter tractability would also follow for nonpolynomial functions in Steps 2 and 3, but all our results provide polynomial bounds. A special feature of our framework is that in Step 3 we perform a “partial kernelization”, a concept of general interest. Herein, the term “partial” refers to the fact that only the number of candidates is reduced, but not the number of votes. This leads to the following general definition.
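To give a first feeling for Step 1, the sketch below identifies dirty pairs and dirty candidates in a small election. The formal notion of dirtiness is only introduced later in this chapter; the sketch assumes, purely for illustration, that a pair of candidates is non-dirty if at least a fraction s of the votes agree on its relative order, and that a candidate is dirty if it appears in some dirty pair. The threshold s, the function names, and the string encoding of votes are all hypothetical choices.

```python
from fractions import Fraction
from itertools import combinations

def dirty_pairs(votes, s):
    """Pairs {a, b} for which neither order is supported by >= s * n votes.

    The majority threshold s is an assumption of this sketch, not the
    definition used in the text.
    """
    n = len(votes)
    positions = [{c: i for i, c in enumerate(v)} for v in votes]
    dirty = []
    for a, b in combinations(votes[0], 2):
        a_first = sum(1 for pos in positions if pos[a] < pos[b])
        if max(a_first, n - a_first) < s * n:
            dirty.append((a, b))
    return dirty

def dirty_candidates(votes, s):
    """Candidates occurring in at least one dirty pair (assumed notion)."""
    return sorted({c for pair in dirty_pairs(votes, s) for c in pair})

votes = ["abcd", "abcd", "bacd", "badc"]
print(dirty_pairs(votes, Fraction(3, 4)))       # only {a, b} is contested
print(dirty_candidates(votes, Fraction(3, 4)))
print(dirty_pairs(votes, 1))                    # unanimity as the threshold
```

With a stricter threshold more pairs become dirty, which hints at why different “degrees of dirtiness” can lead to different kernel bounds.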

Definition 4.1. Let (I, k) be an instance of a parameterized problem P, where I ∈ Σ denotes the input instance and k a parameter. Let d : Σ → N be a computable function such that P is fixed-parameter tractable with respect to d(I). The problem
