© 2012 Prof. Dr. Franz J. Brandenburg
Ranking Problems with Incomplete Information
Fixed Parameter Tractability of Distance Problems
Franz J. Brandenburg
University of Passau, Germany
Survey
• the problem
• the motivation
• the solution
• new open problems
Similarity of Permutations
Definition:
Given two total orders or permutations σ and τ on {1,2,...,n}.
How can we measure their dissimilarity?
1) count mismatches --> Kendall-tau
2) count moves --> Spearman footrule
3) others (Hamming, count exchanges of x's and y's)
Examples (figures): rotate the middle; swap the extremes
Kendall tau Distance
Definition:
Given two total orders or permutations σ and τ on {1,2,...,n}.
The Kendall-tau (or Kemeny) distance K(σ, τ) is
K(σ, τ) = #{(x, y) | x < y and (σ(x) – σ(y))•(τ(x) – τ(y)) < 0}
= # disagreements: x <_σ y and y <_τ x
= # the "dirty" pairs
= # inversions (adjacent swaps) needed to transform σ into τ
= bubbleSort distance
Examples (figures): rotate the middle, K(σ, τ) = 2; swap the extremes, K(σ, τ) = 6
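The definition can be turned into a direct quadratic-time count. The following is a minimal sketch, assuming each order is given as a Python list of elements from first to last; the function name `kendall_tau` and this list representation are illustrative and not taken from the slides.

```python
from itertools import combinations

def kendall_tau(sigma, tau):
    """Count the pairs {x, y} that sigma and tau order differently (O(n^2))."""
    pos_s = {x: i for i, x in enumerate(sigma)}  # position of x in sigma
    pos_t = {x: i for i, x in enumerate(tau)}    # position of x in tau
    return sum(
        1
        for x, y in combinations(sigma, 2)
        # a pair is "dirty" if the two position differences have opposite signs
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

identity = [1, 2, 3, 4]
print(kendall_tau(identity, [4, 3, 2, 1]))  # every pair disagrees: 6
print(kendall_tau(identity, [2, 1, 3, 4]))  # one adjacent swap: 1
```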
Kendall-tau
Kendall-tau distance
named after Maurice Kendall in 1938 (statistics), invented by Gustav Fechner in 1897
Kendall-tau distance = two-layer crossing problem
There is an O(n log n) algorithm to compute
- the number of crossings of n lines
- the Kendall-tau distance of two total orders
Open problem
Is there an O(n) algorithm?
Compute the inversion numbers (D.E. Knuth 1968): inv(i) = #{ j | j > i and j left of i}
Updates in O(log n) in a search tree.
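One way to obtain the O(n log n) bound is to sum the inversion numbers while scanning the sequence from left to right. The sketch below swaps in a Fenwick tree (binary indexed tree) for the search tree mentioned above; it assumes the input is a permutation of 1..n and measures the distance to the identity. The name `inversion_count` is illustrative.

```python
def inversion_count(seq):
    """Sum of inv(i) = #{j | j > i and j left of i}, in O(n log n).

    Assumes seq is a permutation of 1..n; the result is K(seq, id).
    """
    n = len(seq)
    tree = [0] * (n + 1)  # Fenwick tree indexed by value 1..n

    def add(v):
        # record that value v has been seen
        while v <= n:
            tree[v] += 1
            v += v & -v

    def prefix(v):
        # number of recorded values that are <= v
        total = 0
        while v > 0:
            total += tree[v]
            v -= v & -v
        return total

    inversions = 0
    for seen, value in enumerate(seq):
        # 'seen' elements lie to the left; those larger than value are inversions
        inversions += seen - prefix(value)
        add(value)
    return inversions

print(inversion_count([8, 1, 2, 3, 7, 4, 5, 6]))  # 10
```

For two arbitrary total orders, one can relabel the elements so that one order becomes the identity and then apply the same routine.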
Spearman-Footrule distance
Spearman footrule distance (not to be confused with Spearman's rho, which uses squared differences)
named after C. Spearman (1904) (correlation ranking in statistics)
compute displacements of two permutations
move an element by k units
the L1 vector norm F(σ, τ) = ∑_i |σ(i) – τ(i)|
Lemma
For total orders the Spearman footrule distance can be computed in O(n)
Examples (figures): value 8; value 2
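The linear-time computation amounts to one pass over the positions. A minimal sketch, assuming both orders are lists over the same elements (the name `footrule` is illustrative):

```python
def footrule(sigma, tau):
    """Sum of displacements |position in sigma - position in tau| (O(n))."""
    pos_t = {x: i for i, x in enumerate(tau)}  # position of each element in tau
    return sum(abs(i - pos_t[x]) for i, x in enumerate(sigma))

identity = [1, 2, 3, 4, 5, 6, 7, 8]
print(footrule([8, 1, 2, 3, 7, 4, 5, 6], identity))  # 18
print(footrule([1, 8, 2, 3, 4, 5, 6, 7], identity))  # 12
```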
Diaconis-Graham
Theorem
The Diaconis-Graham inequality (1977): K(σ, τ) ≤ F(σ, τ) ≤ 2•K(σ, τ)
each crossing/mismatch/swap induces a displacement;
each displacement is repaired by two crossings.
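The inequality can be checked empirically on small instances. The sketch below compares every permutation of {1,...,n} with the identity, which is a sanity check and not a proof; all function names are illustrative.

```python
from itertools import combinations, permutations

def kendall_tau(sigma, tau):
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(
        1
        for x, y in combinations(sigma, 2)
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

def footrule(sigma, tau):
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(abs(i - pos_t[x]) for i, x in enumerate(sigma))

def check_diaconis_graham(n):
    """True if K <= F <= 2K holds for all permutations of 1..n against id."""
    identity = list(range(1, n + 1))
    for perm in permutations(identity):
        k = kendall_tau(perm, identity)
        f = footrule(perm, identity)
        if not (k <= f <= 2 * k):
            return False
    return True

print(check_diaconis_graham(6))  # True, as the theorem predicts
```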
Incomplete Information
total order
x < y or y < x for every pair of candidates x and y
ties x ~ y
x and y are equivalent (an equivalence relation): "I don't care" about x and y
bucket orders with equivalent items in a bucket and a total order for the buckets
partial order
x ? y: x and y are unrelated, "apples and oranges"
? is not transitive
interval orders, hierarchical orders (trees)
Generalization
Given:
a set of candidates X = {x_1,...,x_n} or simply {1,...,n}
a partial order π on X, i.e. X is partially ordered by >
say x > y if x has a higher ranking, i.e. if there is a preference for x
properties:
transitive: x > y and y > z imply x > z
partial: many pairs are unrelated
Example: The Australian Open 2012
[Figure: the players Djokovic, Nadal, Murray, Federer]
the partial order implies that Djokovic beats Federer
Murray and Nadal / Federer are incomparable
Partial Orders
Given: A set of candidates X and a partial order π
Representation of π:
a DAG (directed acyclic graph) with transitive edges
vertices X = {1,...,n}
directed edges x ---> y if x > y in π
the DAG displays only the generating edges, the transitive reduction
e.g. 1 and 8 are unrelated, and 8 > 2, 8 > 3, 8 > 5, 2 and 3 are unrelated
[Figure: DAG of the running example on the vertices 1,...,8]
Extensions
How shall we compare two partial orders?
... via their sets of extensions
Ext(π) = {total orders σ | σ does not disagree with π}
π(i) < π(j) implies σ(i) < σ(j)
Ext(π) = {any order obtained from the DAG of π by topological sorting}
Example:
Ext(π = Ø) = all permutations
Ext(π) = {all shuffles of
left (8,2,3,5), (8,3,2,5) and
right (1,7,4,6), (1,4,7,6), (1,4,6,7)}
[Figure: DAG of the running example]
topSort
An extension of π is a topological sorting
topsort: do {
get any source x (no incoming edges);
print x;
delete x;
} while (there are vertices)
Ext(π) = the set of all topsort runs; all possibilities for "any"
Theorem (Brightwell, Winkler, ACM STOC 1991) Computing |Ext(π)| is #P-complete.
– breadth first: 8, 1, 2, 3, 7, 4, 5, 6
– heap (min-heap): 1, 4, 6, 7, 8, 2, 3, 5
– best: 1, 8, 2, 3, 4, 5, 6, 7
[Figure: DAG of the running example]
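The topsort loop can be sketched in Python with the choice of "any source" made explicit as a policy. The edge list below is an assumption: it is reconstructed from the extensions listed for the running example (left part 8, 2, 3, 5 and right part 1, 4, 6, 7), not copied from the figure, and the vertex order mimics a left-to-right reading of the drawing. The names `topsort`, `VERTICES` and `EDGES` are illustrative.

```python
import heapq
from collections import deque

# Assumed edges x -> y, meaning "x comes before y" in every extension.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 7), (1, 4), (4, 6)]
VERTICES = [8, 1, 2, 3, 5, 6, 4, 7]

def topsort(vertices, edges, policy="fifo"):
    """Kahn-style topological sort; policy is 'fifo' (breadth first) or 'heap'."""
    succ = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for x, y in edges:
        succ[x].append(y)
        indegree[y] += 1
    sources = [v for v in vertices if indegree[v] == 0]
    if policy == "heap":
        heapq.heapify(sources)
    else:
        sources = deque(sources)
    order = []
    while sources:
        # "get any source x": here the policy decides which one
        x = heapq.heappop(sources) if policy == "heap" else sources.popleft()
        order.append(x)
        for y in succ[x]:  # "delete x": its successors lose one incoming edge
            indegree[y] -= 1
            if indegree[y] == 0:
                if policy == "heap":
                    heapq.heappush(sources, y)
                else:
                    sources.append(y)
    if len(order) != len(indegree):
        raise ValueError("the graph has a cycle")
    return order

print(topsort(VERTICES, EDGES, "fifo"))  # [8, 1, 2, 3, 7, 4, 5, 6]
print(topsort(VERTICES, EDGES, "heap"))  # [1, 4, 6, 7, 8, 2, 3, 5]
```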
Distance Measures
Given: a partial order π and a total (partial) order τ
What is their distance?
nearest neighbor distance
KNN(π, τ) = min {K(σ, τ) | σ is an extension of π}
Interpretation: the positive view,
there is some extension of π at distance ≤ k
Hausdorff (farthest neighbor) distance
KFN(π, τ) = max {K(σ, τ) | σ is an extension of π}
Interpretation: the negative view,
all extensions are within distance ≤ k
breadth first: 8, 1, 2, 3, 7, 4, 5, 6, K(σ, id) = 10
heap (min-heap): 1, 4, 6, 7, 8, 2, 3, 5, K(σ, id) = 11
best: 1, 8, 2, 3, 4, 5, 6, 7, K(σ, id) = 6
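For a small instance both distances can be found by enumerating all extensions. The sketch below does this by backtracking over every choice of source; it is exponential in general and only meant to illustrate the definitions. The edge list is an assumption reconstructed from the extensions of the running example, and the names `extensions` and `inversions` are illustrative.

```python
# Assumed edges x -> y ("x before y") of the running example.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 7), (1, 4), (4, 6)]

def extensions(vertices, edges):
    """Yield every topological order, i.e. every element of Ext(pi)."""
    succ = {v: [] for v in vertices}
    indegree = {v: 0 for v in vertices}
    for x, y in edges:
        succ[x].append(y)
        indegree[y] += 1
    order, placed = [], set()

    def extend():
        if len(order) == len(succ):
            yield tuple(order)
            return
        for x in succ:
            if x not in placed and indegree[x] == 0:
                placed.add(x)
                order.append(x)
                for y in succ[x]:
                    indegree[y] -= 1
                yield from extend()
                for y in succ[x]:  # undo the choice and try the next source
                    indegree[y] += 1
                order.pop()
                placed.remove(x)

    yield from extend()

def inversions(order):
    """Kendall-tau distance to the identity (quadratic count)."""
    n = len(order)
    return sum(1 for i in range(n) for j in range(i + 1, n) if order[i] > order[j])

all_ext = list(extensions(range(1, 9), EDGES))
distances = [inversions(e) for e in all_ext]
knn, kfn = min(distances), max(distances)
print(len(all_ext), knn)  # 420 extensions, nearest neighbor distance 6
```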
Distance Measures
nearest neighbors = closest red-blue pair, or min min
farthest neighbors = farthest red-blue pair, or max max
Hausdorff distance = max {min distance{red,blue}}
center distance = min {max distance{red,blue}}
Measures
Hausdorff distance (Felix Hausdorff 1868-1942) is a metric.
nearest neighbor, farthest neighbor, center distance are not!
since d(X,Y) = 0 does not imply X = Y
and there is no triangle inequality
In R², for points p = (x,y),
all four distances are in Θ(n log n).
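The four set distances can be written down directly for small point sets. The sketch below uses brute force over all red-blue pairs (quadratic, not the n log n algorithms referred to above). Two readings are assumptions: the Hausdorff distance is taken as the symmetric max of both directed max-min values, and the center distance as the min over red points of the max distance to a blue point. All names are illustrative.

```python
import math

def nearest(red, blue):
    """min min: distance of the closest red-blue pair."""
    return min(math.dist(r, b) for r in red for b in blue)

def farthest(red, blue):
    """max max: distance of the farthest red-blue pair."""
    return max(math.dist(r, b) for r in red for b in blue)

def directed_hausdorff(a, b):
    """max over a of the distance to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)

def hausdorff(red, blue):
    """Symmetric Hausdorff distance (assumed reading of 'max min')."""
    return max(directed_hausdorff(red, blue), directed_hausdorff(blue, red))

def center(red, blue):
    """min over red of the distance to its farthest blue point (assumed reading)."""
    return min(max(math.dist(r, b) for b in blue) for r in red)

red = [(0, 0), (1, 0)]
blue = [(0, 3), (4, 0)]
print(nearest(red, blue), farthest(red, blue), hausdorff(red, blue))  # 3.0 4.0 3.0
```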
for Partial Orders
id = 1 2 3 4 5 6 7 8
breadth first: 8 1 2 3 7 4 5 6, Kendall-tau = 10, Spearman = 18
min-heap: 1 4 6 7 8 2 3 5, Kendall-tau = 11, Spearman = 22
opt: 1 8 2 3 4 5 6 7, Kendall-tau = 6, Spearman = 12
[Figure: DAG of the running example]
Applications
• ranking problems
– in sport:
who is the champion? ranking
– in metasearch:
aggregate data from several search engines, top-k lists
– voting systems
Sport
• Who was the best Formula 1 driver 2011?
• Sebastian Vettel, Ger 392 pts
• Jenson Button, Eng 270 pts
• Mark Webber, Aus 258 pts
Schema: weighted Borda scores, sum points (25,18,...,1)
• Who is the best tennis player? The best possible ranking?
– the winner of the Australian Open?
but Djokovic did not play Federer
– the aggregate winner of the four Grand Slams?
evaluate the data from four trees: incomplete data
The ranking list is by weighted Borda scores
• An alternative: US-sports
• phase 1: scores
• phase 2: finals
Meta Search
• meta search machines (incl. google)
aggregate the rankings (results) of many single searchers
searcher_1 = (2,4,3,5,1,...)
searcher_2 = (4,1,3,5,2,...)
searcher_3 = (5,3,2,1,4,...)
e.g. for hotels, flights, rental cars,...
but in practice many providers do give you the best offer.
They make the $$$
How do they aggregate?
How do they compute the ranking?
the top - k list?
the first page?
Rank Aggregation
The rank aggregation problem:
Given: a collection of total orders / permutations over n elements (σ_1, σ_2, ..., σ_m)
Problem: the best compromise (Kemeny score)
a permutation or a linear order σ* such that
distance(σ*, σ_1, σ_2, ..., σ_m) = ∑_i distance(σ*, σ_i) ––> MIN
treat every voter as fair as possible (Biedl, Brandenburg, Deng, Disc. Math. 2009):
max_i {distance(σ*, σ_i)} ––> MIN
As a decision problem:
Given k:
Is there a σ* with distance(σ*, σ_1, σ_2, ..., σ_m) ≤ k ?
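Both objectives (the sum and the max version) can be illustrated by exhaustive search over all permutations, which is feasible only for very few candidates. The sketch below uses the Kendall-tau distance; the votes are made-up data, and the names `aggregate` and `combine` are illustrative.

```python
from itertools import combinations, permutations

def kendall_tau(sigma, tau):
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(
        1
        for x, y in combinations(sigma, 2)
        if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0
    )

def aggregate(votes, combine=sum):
    """Brute-force compromise ranking; combine=sum (Kemeny) or max (fairness)."""
    candidates = sorted(votes[0])

    def score(ranking):
        return combine(kendall_tau(ranking, vote) for vote in votes)

    best = min(permutations(candidates), key=score)
    return list(best), score(best)

votes = [[1, 2, 3], [1, 2, 3], [3, 2, 1]]  # hypothetical voters
print(aggregate(votes))          # ([1, 2, 3], 3)
print(aggregate(votes, max)[1])  # 2
```

The decision version then simply compares the returned score with k.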
Facts
Theorem:
The rank aggregation problem under the Kendall-tau distance is
(1) NP-hard
- for many voters (Bartholdi, Tovey, Trick, 1989)
- for even numbers > 4 (Dwork et al, WWW 2001)
by a complex reduction from feedback arc set
(small corrections by Biedl, Brandenburg, Deng, Disc. Math. 2009)
- in the max version (Biedl et al 2009)
(2) in O(n) for two voters (you and your boy/girlfriend/husband...)
... take any of the two or any in between (crossing)
Facts (2)
(3) Kemeny score (rank aggregation with Kendall tau) is fixed parameter tractable for several parameters:
Kemeny score
number of candidates
max and average range of candidate positions
(Betzler, Fellows, Guo, Niedermeier, Rosamond, TCS 410(45) 2009)
In contrast:
(4) rank aggregation under the Spearman footrule distance is in O(n³) for any number of voters (and even with local weights)
(Dwork et al, 2001) by weighted matching:
for i,j = 1,...,n set w_{i,j} = What does it cost to place i at j?
... and matching does the rest.
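The matching idea can be sketched as follows: build the cost matrix w, where w[i][j] is the total displacement over all voters if candidate i is placed at position j, and solve a min-cost assignment. The assignment solver below is a textbook O(n³) Hungarian method with potentials, used here as a stand-in for any weighted bipartite matching routine; the votes are made-up data and all names are illustrative.

```python
def min_cost_assignment(cost):
    """Hungarian method on a square matrix; returns column chosen for each row."""
    n = len(cost)
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials
    v = [0] * (n + 1)    # column potentials
    p = [0] * (n + 1)    # p[j] = row matched to column j (1-based, 0 = none)
    way = [0] * (n + 1)  # previous column on the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        while j0:  # flip the augmenting path back to the start
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment

def footrule_aggregate(votes):
    """Ranking minimizing the sum of footrule distances to all votes."""
    candidates = sorted(votes[0])
    n = len(candidates)
    positions = [{x: i for i, x in enumerate(vote)} for vote in votes]
    # w[i][j]: what does it cost to place candidate i at position j?
    w = [[sum(abs(pos[c] - j) for pos in positions) for j in range(n)]
         for c in candidates]
    assignment = min_cost_assignment(w)
    ranking = [None] * n
    for i, j in enumerate(assignment):
        ranking[j] = candidates[i]
    return ranking, sum(w[i][j] for i, j in enumerate(assignment))

votes = [[1, 2, 3], [1, 2, 3], [3, 2, 1]]  # hypothetical voters
print(footrule_aggregate(votes))  # ([1, 2, 3], 4)
```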
Results on Distances
Given: a partial order π and a total order id = (1,...,n)
nearest neighbor distance K(π, id):
a total order σ in Ext(π) such that K(σ, id) ––> MIN
Theorem (Brandenburg, Gleissner, Hofmeier, Walcom 2012 / J. Comb. Opt. to appear)
The nearest neighbor distance problem is NP-hard,
Kendall tau: by reduction from one-sided crossing minimization in the version OSCM-4-stars
Spearman: by reduction from clique
[Figure: two-layer drawing with a fixed and a mobile layer]
Idea: the lower level is π with 4 elements per point,
no relations between points, and extra blockers and about n² crossings
NP-hard: What's next
Approximation
if we cannot solve the problem exactly (if P ≠ NP), can we solve it up to some small error?
Theorem
The nearest neighbor distance problem of a partial and a total order
is 2-approximable for the Kendall tau distance
... by a reduction to a constrained feedback arc set problem on tournaments and an adaptation/improvement of the 3-approximation of van Zuylen/Schalekamp
using Quicksort on the feedback arc set problem
is 4-approximable for the Spearman footrule distance
... using the Diaconis-Graham inequality
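One plausible reading of how the factor 4 follows from the factor 2 is the chain below. The symbols are assumptions introduced here: σ_A is the extension returned by the 2-approximation for Kendall tau, and σ_K and σ_F are optimal extensions for the Kendall tau and the footrule distance, respectively.

```latex
% sigma_A: output of the 2-approximation for Kendall tau (assumed name)
% sigma_K, sigma_F: optimal extensions for K and for F (assumed names)
\begin{align*}
F(\sigma_A,\tau) &\le 2\,K(\sigma_A,\tau) && \text{(Diaconis-Graham)}\\
                 &\le 4\,K(\sigma_K,\tau) && \text{(2-approximation)}\\
                 &\le 4\,K(\sigma_F,\tau) && \text{(optimality of } \sigma_K \text{ for } K\text{)}\\
                 &\le 4\,F(\sigma_F,\tau) && \text{(Diaconis-Graham)}
\end{align*}
```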
FPT
Given: a partial order π and a total order τ over {1,...,n}
a parameter k
Problem: find an extension σ of π in polynomial time such that
– distance(σ, τ) ≤ k
– or show that all extensions of π have distance at least k+1
Theorem (Brandenburg, Hofmeier, Gleißner, Walcom 12, J. Comb. Opt)
The distance problems for a partial order π and a total order τ
– for nearest neighbor Kendall tau distance
– for nearest neighbor Spearman footrule distance
are fixed parameter tractable
with a linear kernel.
Transform the problem into a small version of size 2k.
Intuition
.... the distance problem is NP-hard, but
Suppose there are 1000 elements and k = 100.
Then
at most 100 "critical" pairs may cross
at least 800 elements are not involved.
GOAL:
find these "800" elements and remove them
solve the problem only on the "critical" pairs
- naively by exhaustive search on all ≤ k! extensions of π.
TODO: Improve upon the search
[Figure: ...x ..y... in π and ...y ..x... in τ]
Our Key: a Derivation
Given: a partial order π and a total order τ over X = {1,...,n}
The derivation of π in direction τ is a binary relation over X.
For two elements x,y
let x < y in the derivation of π by
(i) agreement: x < y both in π and τ
(ii) overrule: x < y in π but x > y in τ
(iii) takeover: x and y unrelated in π and x < y in τ
Example
1 < 4,6,7 by agreement
1 < 2,3,5,8 by takeover
8 < 2,3,5 by overrule
1,4,6,7 < 8 by takeover
cycle: 8 < 2 < 4 < 8
(overrule, takeover, takeover)
... and 8 is involved in all eight cycles, whereas 1 is in none.
[Figure: DAG of the running example]
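The three rules and the cycles of the running example can be reproduced with a short script. This is a sketch under assumptions: τ is the identity, π is given by an edge list reconstructed from the listed extensions (x -> y meaning "x before y"), and the names `closure`, `derivation` and `triangles` are illustrative.

```python
from itertools import combinations, permutations

# Assumed edges of the running example, x -> y meaning "x before y" in pi.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 7), (1, 4), (4, 6)]
N = 8

def closure(n, edges):
    """Transitive closure of pi as a set of pairs (Warshall's algorithm)."""
    before = set(edges)
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if (i, k) in before and (k, j) in before:
                    before.add((i, j))
    return before

def derivation(n, edges):
    """Map each pair (x, y) with 'x < y in the derivation' to its rule."""
    pi = closure(n, edges)
    rule = {}
    for x, y in combinations(range(1, n + 1), 2):  # x before y in tau = id
        if (x, y) in pi:
            rule[(x, y)] = "agreement"
        elif (y, x) in pi:
            rule[(y, x)] = "overrule"   # pi wins against tau
        else:
            rule[(x, y)] = "takeover"   # unrelated in pi, so follow tau
    return rule

def triangles(n, rule):
    """All directed 3-cycles x < y < z < x, each listed from its smallest element."""
    return {
        (x, y, z)
        for x, y, z in permutations(range(1, n + 1), 3)
        if x == min(x, y, z)
        and (x, y) in rule and (y, z) in rule and (z, x) in rule
    }

rule = derivation(N, EDGES)
cycles = triangles(N, rule)
print(len(cycles))                      # 8
print(all(8 in c for c in cycles))      # True: every cycle uses 8
print(any(1 in c for c in cycles))      # False: 1 is in no cycle
```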
Derivation of π
Lemma (Brandenburg, Gleissner, Hofmeier, Walcom 2012, J. Comb. Opt.)
The derivation of π is complete, i.e. defined for all x,y.
The derivation of π may have cycles.
A cycle is made from one overrule and two takeovers.
Proof: Completeness is by definition.
Cycles by a case analysis for (x,y,z).
It follows from the transitivity of π and τ
that an agreement cannot be part of a cycle
and that two overrules x < y, y < z imply x < z
by the transitivity of a partial order.
Cycle Rule
Given: a partial order π and a total order τ over X = {1,...,n}
a parameter k
Cycle rule for the reduction:
For every element x
remove x and keep k
if x is not in a cycle (in fact, not in a triangle x < y < z < x) of the derivation of π
and there is no overrule on x.
Example
There is no cycle and no overrule on 1.
All cycles use 8 -- 2, 8 -- 3, or 8 -- 5.
[Figure: DAG of the running example]
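Applied to the running example, the cycle rule can be sketched as below. Assumptions: τ is the identity, the edge list is reconstructed from the listed extensions, "no overrule on x" is read as "x is not an endpoint of an overruled pair", and the function names are illustrative.

```python
from itertools import permutations

# Assumed edges of the running example, x -> y meaning "x before y" in pi.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 7), (1, 4), (4, 6)]

def closure(n, edges):
    """Transitive closure of pi as a set of pairs."""
    before = set(edges)
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                if (i, k) in before and (k, j) in before:
                    before.add((i, j))
    return before

def cycle_rule(n, edges):
    """Split 1..n into removable elements and the remaining kernel."""
    pi = closure(n, edges)
    elements = range(1, n + 1)

    def before(x, y):
        # derivation: follow pi where it decides, otherwise follow tau = id
        if (x, y) in pi:
            return True
        if (y, x) in pi:
            return False
        return x < y

    # endpoints of overrules: pi says x before y although x > y in tau
    overruled = {v for (x, y) in pi if x > y for v in (x, y)}
    in_triangle = set()
    for x, y, z in permutations(elements, 3):
        if before(x, y) and before(y, z) and before(z, x):
            in_triangle.update((x, y, z))
    removable = [x for x in elements
                 if x not in in_triangle and x not in overruled]
    kernel = [x for x in elements if x not in removable]
    return removable, kernel

removable, kernel = cycle_rule(8, EDGES)
print(removable)  # [1]
print(kernel)     # [2, 3, 4, 5, 6, 7, 8]
```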
Proof
Lemma
The cycle rule preserves the Kendall-tau distance.
Proof: (sketch)
Consider a nearest neighbor σ of τ in Ext(π) with K(σ, τ) ≤ k.
For every x which is removed define
pred(x) = {y | y < x in the derivation of π}
succ(x) = {z | x < z in the derivation of π}.
Claim 1: There is an extension π* of π such that y < x for every y ∈ pred(x) and x < z for every z ∈ succ(x).
Then x serves as a "separator".
Otherwise, consider the first y ∈ pred(x) with y < x in the derivation of π and x < y in π*.
Then ... by some case analysis ... y and its left neighbor can be swapped, which contradicts "y is the first".
This needs some more work (see our papers).
[Figure: pred(x), x, succ(x) in π* and in τ]
Proof
In the running example, 1 can be removed.
Place 1 at the first position in the extension of π.
τ = 1 2 3 4 5 6 7 8
σ = 1 8 2 3 4 5 6 7, then K(σ, τ) = 6
[Figure: DAG of the running example]
Kernel
Lemma
The nearest neighbor Kendall tau distance between π and τ is ≤ k if, after the cycle rule,
i.e. after the removal of all "separators" x,
there is an instance with at most 2k elements which has K(π_2k, τ_2k) ≤ k.
Find this solution by exhaustive search on the (2k)! extensions of π_2k.
Conclusion
The distance problem is FPT.
TODO
Improve the search for K(π_2k, τ_2k) ≤ k.
some open problems
for parameterized complexity
Bucket Orders
with x < y and ties x ~ y. The buckets are totally ordered.
Theorem (Fagin et al 2006)
There are O(n log n) algorithms to compute
the distances (Kendall-tau and Spearman) between bucket orders.
Theorem
The rank aggregation problem is
– NP-hard for total orders under Kendall-tau (Dwork et al 2001)
– in P for many total orders under Spearman (Dwork et al 2001).
– NP-hard for many bucket orders under Spearman
(Brandenburg, Gleissner, Hofmeier, FAW-AAIM 2011)
OPEN: Is it FPT?
1-planarity
Definition (G. Ringel, 1965)
A graph G is 1-planar
if it can be drawn such that each edge is crossed at most once (by all other edges)
Properties
an edge coloring:
black; with crossings: red x blue
a 6-vertex coloring (Borodin 1984)
#edges ≤ 4n-8 (Pach, Toth 1997, and others)
not closed under edge contraction
there are infinitely many minimal non-1-planar graphs (Korzhik, 2007)
test is NP-hard (Korzhik, Mohar Graph Drawing 2008, LNCS 5166)
1-planar + Rotation System
Definition
a rotation system (embedding) of a graph G = (V,E)
is the cyclic order of the edges (neighbors) of v for each vertex v
The crossing pair system of a graph G = (V,E)
is G together with all pairs (e,e') of crossing edges.
Lemma
Given a crossing pair system:
Test for 1-planarity is in O(n),
and there is a straight-line drawing of G on a polynomial size grid.
Claim (under work) (Auer, Brandenburg, Gleißner, Reislhuber)
Given a rotation system:
Test for 1-planarity is NP-hard
.... by a reduction from planar 3-SAT