


© 2012 Prof. Dr. Franz J. Brandenburg

Ranking Problems with Incomplete Information

Fixed Parameter Tractability of Distance Problems

Franz J. Brandenburg

University of Passau, Germany


Survey

• the problem

• the motivation

• the solution

• new open problems


Similarity of Permutations

Definition:

Given two total orders or permutations σ and τ on {1,2,...,n}.

How can we measure their dissimilarity?

1) count mismatches --> Kendall-tau

2) count moves --> Spearman footrule

3) others (Hamming, count exchanges of x's and y's)

Examples (figures): rotate the middle; swap the extremes


Kendall tau Distance

Definition:

Given two total orders or permutations σ and τ on {1,2,...,n}.

The Kendall-tau (or Kemeny) distance K(σ, τ) is

K(σ, τ) = #{(x, y) | x < y and (σ(x) – σ(y))•(τ(x) – τ(y)) < 0}

= # disagreements: x <_σ y and y <_τ x

= # the "dirty" pairs

= # inversions (swaps of adjacent elements) to transform σ into τ

= bubble sort distance

Examples (figures): rotate the middle, K(σ, τ) = 2; swap the extremes, K(σ, τ) = 6


Kendall-tau

Kendall-tau distance

named after Maurice Kendall (1938, statistics); invented by Gustav Fechner in 1897

Kendall-tau distance = two-layer crossing problem

There is an O(n log n) algorithm to compute

- the number of crossings of n lines

- the Kendall-tau distance of two total orders

Open problem: Is there an O(n) algorithm?

Compute the inversion numbers (D.E. Knuth 1968):

inv(i) = #{ j | j > i and j left of i}

Updates in O(log n) in a search tree.
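
The O(n log n) bound can be illustrated with a merge-sort based inversion count. This is only a sketch in Python and not the search-tree method named above; the function name kendall_tau and the representation of a total order as a list of its elements are assumptions made for the illustration.

```python
def kendall_tau(sigma, tau):
    """Kendall tau distance of two total orders, each given as a list of elements."""
    # Relabel sigma by the positions its elements have in tau;
    # the distance is then the number of inversions of this sequence.
    pos = {item: i for i, item in enumerate(tau)}
    seq = [pos[item] for item in sigma]

    def sort_count(a):
        # Returns (sorted copy of a, number of inversions in a).
        if len(a) <= 1:
            return a, 0
        mid = len(a) // 2
        left, inv_left = sort_count(a[:mid])
        right, inv_right = sort_count(a[mid:])
        merged, inv = [], inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] jumps over every element still waiting in left
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(seq)[1]


identity = [1, 2, 3, 4, 5, 6, 7, 8]
print(kendall_tau([8, 1, 2, 3, 7, 4, 5, 6], identity))  # 10
```

The order 8, 1, 2, 3, 7, 4, 5, 6 is the breadth-first order used later in the slides, where its distance to the identity is also given as 10.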


Spearman-Footrule distance

Spearman footrule distance (not to be confused with Spearman's rho, which is based on squared displacements)

named after C. Spearman (1904) (rank correlation in statistics)

compute the displacements between two permutations: moving an element by k units costs k

the L1 vector norm: F(σ, τ) = ∑_i |σ(i) – τ(i)|

Lemma

For total orders the Spearman footrule distance can be computed in O(n).

Examples (figures): value 8; value 2
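
The linear-time claim of the lemma can be sketched as follows in Python. The function name footrule and the list representation of an order are assumptions for this illustration; the dictionary lookups make the running time linear on average.

```python
def footrule(sigma, tau):
    """Spearman footrule distance: sum of the displacements of all elements."""
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(abs(pos_s[x] - pos_t[x]) for x in pos_s)


identity = [1, 2, 3, 4, 5, 6, 7, 8]
print(footrule([8, 1, 2, 3, 7, 4, 5, 6], identity))  # 18
```

The value 18 agrees with the Spearman value stated later in the slides for the breadth-first order.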


Diaconis-Graham

Theorem

The Diaconis-Graham inequality (1977): K(σ, τ) ≤ F(σ, τ) ≤ 2•K(σ, τ)

each crossing/mismatch/swap induces a displacement

each displacement is repaired by two crossings.
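
A small brute-force check can make the inequality concrete. The sketch below, in Python, tests K ≤ F ≤ 2•K against the identity for all permutations of a small n; the helper names and the quadratic computation of K are assumptions chosen for brevity, and the check is of course no proof.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    # count the pairs that are ordered differently in sigma and tau
    return sum(1 for x, y in combinations(sigma, 2)
               if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0)


def footrule(sigma, tau):
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(abs(pos_s[x] - pos_t[x]) for x in pos_s)


def diaconis_graham_holds(n):
    """Check K <= F <= 2K for every permutation of 1..n against the identity."""
    ident = tuple(range(1, n + 1))
    return all(
        kendall_tau(p, ident) <= footrule(p, ident) <= 2 * kendall_tau(p, ident)
        for p in permutations(ident)
    )


print(diaconis_graham_holds(6))
```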


Incomplete Information

total order

– x < y or y < x for every pair of candidates x and y

ties x ~ y

– x and y are equivalent (an equivalence relation): "I don't care" between x and y

– bucket orders with equivalent items in a bucket and a total order for the buckets

partial order

– x ? y: x and y are unrelated ("apples and oranges")

– ? is not transitive

– interval orders, hierarchical orders (trees)


Generalization

Given:

a set of candidates X = {x_1,...,x_n} or simply {1,...,n}

a partial order π on X, i.e. X is partially ordered by >

say x > y if x has a higher ranking, i.e. if there is a preference for x

properties:

transitive: x > y and y > z ⇒ x > z

partial: many pairs are unrelated

Example: The Australian Open 2012

Figure (diagram of the partial order): Djokovic at the top; Nadal and Murray below; Federer below Nadal

the partial order implies that Djokovic beats Federer

Murray and Nadal / Federer are incomparable


Partial Orders

Given: A set of candidates X and a partial order π

Representation of π:

a DAG (directed acyclic graph) with transitive edges

vertices X = {1,...,n}

directed edges x ---> y if x > y in π

the DAG displays only the generating edges, the transitive reduction

e.g. 1 and 8 are unrelated, and 8 > 2, 8 > 3, 8 > 5, 2 and 3 unrelated

Figure: DAG of the running example on the vertices 1,...,8 (edges 8 ---> 2, 8 ---> 3, 2 ---> 5, 3 ---> 5 and 1 ---> 4, 4 ---> 6, 1 ---> 7, as implied by the shuffles listed for Ext(π))


Extensions

How shall we compare two partial orders?

... via their sets of extensions

Ext(π) = {total orders σ | σ does not disagree with π}

π(i) < π(j) ⇒ σ(i) < σ(j)

Ext(π) = {any order obtained from the DAG of π by topological sorting}

Example:

Ext(π = Ø) = all permutations

Ext(π) = {all shuffles of

left (8,2,3,5), (8,3,2,5) and

right (1,7,4,6), (1,4,7,6), (1,4,6,7)}

Figure: DAG of the running example


topSort

An extension of π is a topological sorting

topsort:

do {
    get any source x (no incoming edges);
    print x;
    delete x;
} while (there are vertices)

Ext(π) = the set of all topsort runs; all possibilities for "any"

Theorem (Brightwell, Winkler, ACM STOC 1991)

Computing |Ext(π)| is #P-complete.

Topsort runs on the running example:

– breadth first: 8, 1, 2, 3, 7, 4, 5, 6

– heap (min-heap): 1, 4, 6, 7, 8, 2, 3, 5

– best: 1, 8, 2, 3, 4, 5, 6, 7

Figure: DAG of the running example
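
The set of all topsort runs can be enumerated by backtracking over every possible choice of "any source". The Python sketch below does this for the running example. The edge list EDGES is an assumption: it is the DAG implied by the shuffles listed for Ext(π), not data taken directly from the figure, and the function name all_extensions is hypothetical.

```python
# Assumed DAG of the running example; an edge (x, y) means x comes before y.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]


def all_extensions(vertices, edges):
    """Return every topological sorting of the DAG as a tuple."""
    vertices = list(vertices)
    succ = {v: [] for v in vertices}
    indeg = {v: 0 for v in vertices}
    for x, y in edges:
        succ[x].append(y)
        indeg[y] += 1
    order, placed = [], set()

    def run():
        if len(order) == len(vertices):
            yield tuple(order)
            return
        for x in vertices:
            if x not in placed and indeg[x] == 0:  # x is a source
                order.append(x)
                placed.add(x)
                for y in succ[x]:                  # "delete x"
                    indeg[y] -= 1
                yield from run()
                for y in succ[x]:                  # undo for the next choice
                    indeg[y] += 1
                order.pop()
                placed.remove(x)

    return list(run())


extensions = all_extensions(range(1, 9), EDGES)
print(len(extensions))  # 420
```

The count 420 is 70 interleavings of the two components times 2 orders of the left and 3 orders of the right component. Enumeration is only feasible for tiny instances, in line with the #P-completeness of counting.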


Distance Measures

Given: a partial order π and a total (partial) order τ

What is their distance?

nearest neighbor distance

KNN(π,τ) = min {K(σ,τ) | σ is an extension of π}

Interpretation: the positive view,

there is some extension of π at distance ≤ k

Hausdorff (farthest neighbor) distance

KFN(π,τ) = max {K(σ,τ) | σ is an extension of π}

Interpretation: the negative view,

all extensions are within distance ≤ k

breadth first 8, 1, 2, 3, 7, 4, 5, 6: K(σ, id) = 10

heap (min-heap) 1, 4, 6, 7, 8, 2, 3, 5: K(σ, id) = 11

best 1, 8, 2, 3, 4, 5, 6, 7: K(σ, id) = 6
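
For the running example both distances can be found by exhaustive search. The Python sketch below filters all permutations of 1,...,8 by the order constraints and takes the minimum and maximum number of inversions against the identity. The edge list EDGES is an assumption (the DAG implied by the shuffles listed for Ext(π)), and the helper names are hypothetical.

```python
from itertools import combinations, permutations

# Assumed DAG of the running example; an edge (x, y) means x comes before y.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]


def is_extension(order, edges):
    pos = {x: i for i, x in enumerate(order)}
    return all(pos[x] < pos[y] for x, y in edges)


def inversions(order):
    """Kendall tau distance of order to the identity."""
    return sum(1 for a, b in combinations(order, 2) if a > b)


distances = [inversions(p) for p in permutations(range(1, 9))
             if is_extension(p, EDGES)]
k_nn, k_fn = min(distances), max(distances)
print("KNN(pi, id) =", k_nn)
print("KFN(pi, id) =", k_fn)
```

The minimum is the value 6 attained by the order 1, 8, 2, 3, 4, 5, 6, 7.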


Distance Measures

nearest neighbors = closest red-blue pair, or min min

farthest neighbors = farthest red-blue pair, or max max

Hausdorff distance = max {min distance{red,blue}}

center distance = min {max distance{red,blue}}
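
The four set distances can be written down directly for two point sets in the plane. The Python sketch below is a quadratic brute-force illustration, not one of the O(n log n) algorithms; the function name set_distances and the sample points are assumptions. The Hausdorff value is taken as the larger of the two directed max-min values, and the center distance is computed from the red side.

```python
from math import dist


def set_distances(red, blue):
    """Brute-force nearest, farthest, Hausdorff and center distance."""
    table = [[dist(p, q) for q in blue] for p in red]
    columns = list(zip(*table))  # distances seen from each blue point
    return {
        "nearest": min(min(row) for row in table),   # min min
        "farthest": max(max(row) for row in table),  # max max
        "hausdorff": max(max(min(row) for row in table),
                         max(min(col) for col in columns)),
        "center": min(max(row) for row in table),    # min max
    }


red = [(0, 0), (1, 0)]
blue = [(0, 3), (4, 0)]
result = set_distances(red, blue)
print(result)
```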


Measures

Hausdorff distance (Felix Hausdorff 1868-1942) is a metric.

nearest neighbor, farthest neighbor, center distance are not!

since d(X,Y) = 0 does not imply X=Y

and there is no triangle inequality

In R^2, for points p = (x,y),

all four distances can be computed in O(n log n).


for Partial Orders

1 2 3 4 5 6 7 8 (id)

8 1 2 3 7 4 5 6 (breadth first): Kendall-tau = 10, Spearman = 18

1 2 3 4 5 6 7 8 (id)

1 4 6 7 8 2 3 5 (min-heap): Kendall-tau = 11, Spearman = 22

1 2 3 4 5 6 7 8 (id)

1 8 2 3 4 5 6 7 (opt): Kendall-tau = 6, Spearman = 12

Figure: DAG of the running example


Applications

• ranking problems

– in sport:

who is the champion; ranking

– in metasearch

aggregate data from several search engines; top-k lists

– voting systems


Sport

• Who was the best Formula 1 driver 2011?

• Sebastian Vettel, Ger 392 pts

• Jenson Button, Eng 270 pts

• Mark Webber, Aus 258 pts

Scheme: weighted Borda scores, sum of points (25,18,...,1)

• Who is the best tennis player? The best possible ranking?

– the winner of the Australian Open?

but Djokovic did not play Federer

– the aggregate winner of the four Grand Slams?

evaluate the data from four trees: incomplete data

The ranking list is by weighted Borda scores

• An alternative: US-sports

• phase 1: scores

• phase 2: finals


Meta Search

• meta search machines (incl. Google)

aggregate the rankings (results) of many single searchers

searcher_1 = (2,4,3,5,1,...)

searcher_2 = (4,1,3,5,2,...)

searcher_3 = (5,3,2,1,4,...)

e.g.

for hotels, flights, rental cars,...

but in practice many providers do give you the best offer.

They make the $$$

How do they aggregate?

How do they compute the ranking?

the top-k list?

the first page?


Rank Aggregation

The rank aggregation problem:

Given: a collection of total orders / permutations over n elements (σ_1, σ_2, ..., σ_m)

Problem: the best compromise (Kemeny score)

a permutation or a linear order σ* such that

distance(σ*, σ_1, σ_2, ..., σ_m) = ∑_i distance(σ*, σ_i) ––> MIN

treat every voter as fairly as possible (Biedl, B., Deng, Disc. Math 2009)

max_i {distance(σ*, σ_i)} ––> MIN

As a decision problem:

Given k:

Is there a σ* with distance(σ*, σ_1, σ_2, ..., σ_m) ≤ k ?
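
Both objectives can be illustrated by brute force on a toy instance. In the Python sketch below the three votes are the visible prefixes of the searcher lists from the meta search slide, treated as complete rankings of five items, which is an assumption made only for the illustration; the function names are hypothetical, and the search over all n! candidates is feasible only for very small n.

```python
from itertools import combinations, permutations


def kendall_tau(sigma, tau):
    pos_s = {x: i for i, x in enumerate(sigma)}
    pos_t = {x: i for i, x in enumerate(tau)}
    return sum(1 for x, y in combinations(sigma, 2)
               if (pos_s[x] - pos_s[y]) * (pos_t[x] - pos_t[y]) < 0)


def aggregate(votes, combine=sum):
    """Brute-force best compromise.

    combine=sum minimizes the Kemeny score, combine=max the fair (max) version.
    """
    def cost(candidate):
        return combine(kendall_tau(candidate, v) for v in votes)

    best = min(permutations(votes[0]), key=cost)
    return best, cost(best)


# Toy data (assumption): truncated searcher lists read as full rankings.
votes = [(2, 4, 3, 5, 1), (4, 1, 3, 5, 2), (5, 3, 2, 1, 4)]
print(aggregate(votes))        # sum version
print(aggregate(votes, max))   # max version
```

The decision version for a given k then amounts to comparing the returned cost with k.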


Facts

Theorem:

The rank aggregation problem under the Kendall-tau distance is

(1) NP-hard

- for many voters (Bartholdi, Tovey, Trick, 1989)

- for even numbers > 4 (Dwork et al, WWW 2001)

by a complex reduction from feedback arc set

(small corrections by Biedl, Brandenburg, Deng, Disc. Math. 2009)

- in the max version (Biedl et al 2009)

(2) in O(n) for two voters (you and your boy/girlfriend/husband...)

... take either of the two or any order in between (crossing)


Facts (2)

(3) Kemeny score (rank aggregation with Kendall tau) is fixed parameter tractable for several parameters

– Kemeny score

– number of candidates

– max and average range of candidate positions

(Betzler, Fellows, Guo, Niedermeier, Rosamond, TCS 410(45) 2009)

In contrast:

(4) rank aggregation under the Spearman footrule distance is in O(n³) for any number of voters (and even with local weights)

(Dwork et al, 2001) by weighted matching:

for i,j = 1,...,n set w_{i,j} = What does it cost to place i at j?

... and matching does the rest.
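
The weights w_{i,j} can be sketched as follows in Python. The cost of placing element i at position j is taken as the sum of its displacements over all voters. To stay self-contained, the sketch solves the resulting assignment problem by brute force over all placements, which is an assumption made for brevity; the O(n³) bound requires a polynomial weighted matching algorithm instead. All names are hypothetical.

```python
from itertools import permutations


def footrule_aggregate(votes):
    """Footrule-optimal aggregation via the placement costs w[i, j]."""
    items = list(votes[0])
    n = len(items)
    positions = [{x: j for j, x in enumerate(v)} for v in votes]
    # w[i, j]: total displacement over all voters if element i is placed at j
    w = {(i, j): sum(abs(pos[i] - j) for pos in positions)
         for i in items for j in range(n)}

    def cost(order):
        return sum(w[i, j] for j, i in enumerate(order))

    # Brute-force stand-in for a minimum weight perfect matching.
    best = min(permutations(items), key=cost)
    return best, cost(best)


votes = [(2, 4, 3, 5, 1), (4, 1, 3, 5, 2), (5, 3, 2, 1, 4)]  # toy data
print(footrule_aggregate(votes))
```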


Results on Distances

Given: a partial order π and a total order id = (1,...,n)

nearest neighbor distance K(π, id):

a total order σ in Ext(π) such that K(σ, id) ––> MIN

Theorem (Brandenburg, Gleißner, Hofmeier, WALCOM 2012 / J. Comb. Opt., to appear)

The nearest neighbor distance problem is NP-hard,

Kendall tau: by reduction from one-sided crossing minimization in the version OSCM-4-stars

Spearman: by reduction from clique

Figure (two-level drawing): one level is fixed, the other is mobile

Idea: the lower level is π with 4 elements per point,

no relations between points, and extra blockers and about n² crossings


NP-hard: What's next?

Approximation

if we cannot solve the problem exactly (if P ≠ NP),

can we solve it up to some small error?

Theorem

The nearest neighbor distance problem of a partial and a total order

is 2-approximable for the Kendall tau distance

... by a reduction to a constrained feedback arc set problem on tournaments and an adaptation/improvement of the 3-approximation of Schalekamp/van Zuylen

using Quicksort on the feedback arc set problem

is 4-approximable for the Spearman footrule distance

... using the Diaconis-Graham inequality


FPT

Given: a partial order π and a total order τ over {1,...,n}

a parameter k

Problem: find an extension σ of π in polynomial time such that

– distance(σ, τ) ≤ k

– or show that all extensions of π have distance at least k+1

Theorem (Brandenburg, Gleißner, Hofmeier, WALCOM 2012, J. Comb. Opt.)

The distance problems for a partial order π and a total order τ

– for nearest neighbor Kendall tau distance

– for nearest neighbor Spearman footrule distance

are fixed parameter tractable with a linear kernel.

Transform the problem into a small version of size 2k.


Intuition

... the distance problem is NP-hard, but

Suppose there are 1,000 elements and k = 100.

Then

at most 100 "critical" pairs may cross; these involve at most 200 elements, so

at least 800 elements are not involved.

GOAL:

find these "800" elements and remove them

solve the problem only on the "critical" pairs

- naively, by exhaustive search on all ≤ k! extensions of π.

TODO: Improve upon the search

Figure: ...x ..y... in π; ...y ..x... in τ


Our Key: a Derivation

Given: a partial order π and a total order τ over X = {1,...,n}

The derivation ∂_τ(π) of π in direction τ is a binary relation over X.

For two elements x,y let x < y in ∂_τ(π) by

(i) agreement: x < y both in π and τ

(ii) overrule: x < y in π but x > y in τ

(iii) takeover: x ? y (unrelated) in π and x < y in τ

Example (with τ = id)

1 < 4,6,7 by agreement

1 < 2,3,5,8 by takeover

8 < 2,3,5 by overrule

1,4,6,7 < 8 by takeover

cycle: 8 < 2 < 4 < 8 (overrule, takeover, takeover)

... there are eight cycles, all of which involve 8, and none of which involves 1.

Figure: DAG of the running example
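
The derivation of the running example can be computed mechanically. The Python sketch below takes τ = id, labels every pair as agreement, overrule or takeover, and lists the triangles. The edge list EDGES is an assumption (the DAG implied by the shuffles listed for Ext(π)), "x < y in π" is read as "x comes before y in every extension", and all function names are hypothetical.

```python
from itertools import permutations

# Assumed DAG of the running example; an edge (x, y) means x < y in pi.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]
X = list(range(1, 9))


def closure(vertices, edges):
    """Transitive closure of the generating edges (Warshall's scheme)."""
    lt = set(edges)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if (i, k) in lt and (k, j) in lt:
                    lt.add((i, j))
    return lt


def derivation(vertices, pi):
    """Map each pair (x, y) with x < y in the derivation to its rule (tau = id)."""
    rel = {}
    for x, y in permutations(vertices, 2):
        if (x, y) in pi:
            rel[x, y] = "agreement" if x < y else "overrule"
        elif (y, x) not in pi and x < y:
            rel[x, y] = "takeover"
    return rel


def triangles(vertices, rel):
    """Directed 3-cycles x < y < z < x, each listed once (smallest element first)."""
    return [(x, y, z) for x, y, z in permutations(vertices, 3)
            if x == min(x, y, z)
            and (x, y) in rel and (y, z) in rel and (z, x) in rel]


rel = derivation(X, closure(X, EDGES))
cycles = triangles(X, rel)
print(len(cycles))  # 8
print(cycles)
```

Every listed triangle contains 8 and none contains 1, in line with the example.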


Derivation ∂_τ(π)

Lemma (Brandenburg, Gleißner, Hofmeier, WALCOM 2012, J. Comb. Opt.)

∂_τ(π) is complete, i.e. defined for all x,y.

∂_τ(π) may have cycles.

A cycle is made from one overrule and two takeovers.

Proof: Completeness is by definition.

Cycles: by a case analysis for (x,y,z).

It follows from the transitivity of π and τ

that an agreement cannot be part of a cycle

and that two overrules x < y, y < z imply x < z

by the transitivity of a partial order.


Cycle Rule

Given: a partial order π and a total order τ over X = {1,...,n}

a parameter k

Cycle rule for the reduction:

For every element x:

remove x and keep k

if x is not in a cycle, in fact not in a triangle x < y < z < x of ∂_τ(π),

and there is no overrule on x.

Example

There is no cycle and no overrule on 1.

All cycles use 8 -- 2, 8 -- 3, or 8 -- 5.

Figure: DAG of the running example
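
The cycle rule can be applied mechanically to the running example. The Python sketch below, for τ = id, keeps every element that lies on a triangle of the derivation or takes part in an overrule and removes the rest. The edge list EDGES is an assumption (the DAG implied by the shuffles listed for Ext(π)), and the function names are hypothetical.

```python
from itertools import permutations

# Assumed DAG of the running example; an edge (x, y) means x < y in pi.
EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (1, 4), (4, 6), (1, 7)]
X = list(range(1, 9))


def closure(vertices, edges):
    lt = set(edges)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                if (i, k) in lt and (k, j) in lt:
                    lt.add((i, j))
    return lt


def cycle_rule(vertices, edges):
    """Return (removed, kept) elements for tau = identity."""
    pi = closure(vertices, edges)
    # Derivation: pi decides related pairs, the identity decides the rest.
    d = {(x, y) for x, y in permutations(vertices, 2)
         if (x, y) in pi or ((y, x) not in pi and x < y)}
    # Elements taking part in an overrule (x < y in pi but x > y in id).
    overruled = {v for (x, y) in pi if x > y for v in (x, y)}
    # Elements on some triangle x < y < z < x of the derivation.
    in_triangle = {v for x, y, z in permutations(vertices, 3)
                   if (x, y) in d and (y, z) in d and (z, x) in d
                   for v in (x, y, z)}
    keep = overruled | in_triangle
    removed = [v for v in vertices if v not in keep]
    kept = [v for v in vertices if v in keep]
    return removed, kept


removed, kept = cycle_rule(X, EDGES)
print(removed)  # [1]
print(kept)     # [2, 3, 4, 5, 6, 7, 8]
```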


Proof

Lemma

The cycle rule preserves the Kendall-tau distance.

Proof: (sketch)

Consider a nearest neighbor σ of τ in Ext(π) with K(σ, τ) ≤ k.

For every x which is removed define

pred(x) = {y | y < x in ∂_τ(π)}

succ(x) = {z | x < z in ∂_τ(π)}.

Claim 1: There is an extension π* of π such that y < x for every y ∈ pred(x) and x < z for every z ∈ succ(x).

Then x serves as a "separator".

Otherwise, consider the first y ∈ pred(x) with y < x in ∂_τ(π) and x < y in π*.

Then ... by some case analysis ... y and its left neighbor can be swapped, which contradicts "y is the first".

This needs some more work (see our papers).

Figure: pred(x) x succ(x) in π*; pred(x) x succ(x) in τ


Proof

In the running example, 1 can be removed.

Place 1 at the first position in the extension of π.

1 2 3 4 5 6 7 8 (id)

1 8 2 3 4 5 6 7: then K(σ, id) = 6

Figure: DAG of the running example


Kernel

Lemma

The nearest neighbor Kendall tau distance between π and τ is ≤ k if, after the cycle rule,

i.e. after the removal of all "separators" x,

there is an instance with at most 2k elements which has K(π_2k, τ_2k) ≤ k.

Find this solution by exhaustive search on the (2k)! extensions of π_2k.

Conclusion

The distance problem is FPT.

TODO

Improve the search for K(π_2k, τ_2k) ≤ k.
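
The exhaustive search on the kernel can be sketched for the running example. In the Python code below the kernel consists of the elements 2,...,8 that remain after element 1 is removed, with the constraints restricted accordingly; both lists are assumptions derived from the DAG implied by the shuffles listed for Ext(π), and the function name is hypothetical.

```python
from itertools import combinations, permutations

# Assumed kernel of the running example after the cycle rule removed element 1.
KERNEL = [2, 3, 4, 5, 6, 7, 8]
KERNEL_EDGES = [(8, 2), (8, 3), (2, 5), (3, 5), (4, 6)]  # (x, y): x before y


def nearest_neighbor_distance(elements, edges):
    """Minimum number of inversions over all extensions, by exhaustive search."""
    best = None
    for order in permutations(elements):
        pos = {x: i for i, x in enumerate(order)}
        if all(pos[x] < pos[y] for x, y in edges):  # order extends the constraints
            k = sum(1 for a, b in combinations(order, 2) if a > b)
            best = k if best is None else min(best, k)
    return best


k = 6
distance = nearest_neighbor_distance(KERNEL, KERNEL_EDGES)
print(distance, distance <= k, len(KERNEL) <= 2 * k)
```

For k = 6 the kernel has 7 ≤ 2k elements and the search returns distance 6, the same value as for the full instance.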


some open problems

for parameterized complexity


Bucket Orders

Theorem

(Fagin et al 2006)

There are O(n log n) algorithms to compute

the distances (Kendall-tau and Spearman) between bucket orders,

i.e. orders with x < y and ties x ~ y. The buckets are totally ordered.

Theorem

The rank aggregation problem is

– NP-hard for total orders under Kendall-tau (Dwork et al 2001)

– in P for many total orders under Spearman (Dwork et al 2001)

– NP-hard for many bucket orders under Spearman

(Brandenburg, Gleißner, Hofmeier, FAW-AAIM 2011)

OPEN: Is it FPT?


1-planarity

Definition (G. Ringel, 1965)

A graph G is 1-planar

if each edge is crossed at most once (by all other edges)

Properties

an edge coloring:

black, with crossings red x blue

a 6-vertex coloring (Borodin 1984)

#edges ≤ 4n-8 (Pach, Toth 1997, and others)

not closed under edge contraction

there are infinitely many minimal non-1-planar graphs (Korzhik, 2007)

test is NP-hard (Korzhik, Mohar, Graph Drawing 2008, LNCS 5166)


1-planar + Rotation System

Definition

a rotation system (embedding) of a graph G = (V,E)

is the cyclic order of the edges (neighbors) of v for each vertex v

The crossing pair system of a graph G = (V,E)

is G together with all pairs (e,e') of crossing edges.

Lemma

Given a crossing pair system:

Test for 1-planarity is in O(n),

and there is a straight-line drawing of G on a polynomial size grid.

Claim (under work) (Auer, Brandenburg, Gleißner, Reislhuber)

Given a rotation system:

Test for 1-planarity is NP-hard

... by a reduction from planar 3-SAT


Parameterized Complexity

Given: a graph G = (V,E) and a parameter k

Problem: Is G 1-planar with at most k pairs of crossing edges?

Given: G with a rotation system and k

Problem: Is G 1-planar with at most k pairs of crossing edges?

Given: a directed graph G = (V,E) and a parameter k

Problem: Is G upward 1-planar with at most k pairs of crossing edges?

i.e. G has a 1-planar drawing such that all edges are upward (y-monotone)

I need your help!

Thank you
