Sampling Orbits - Parallel Uniform Generation of Unlabelled Graphs

Chapter 2 Parallel Uniform Generation of Unlabelled Graphs

2.2 Sampling Orbits

The parallel algorithms for generating unlabelled graphs uniformly at random presented in the following sections are based on a general procedure DW for generating an orbit $\omega$ of the set $\Omega$ under the action of a permutation group $G$, first described in [DW83].

(1) Select a conjugacy class $C \subseteq G$ with probability

\[ \Pr[C] = \frac{|C|\,|\mathrm{Fix}(g)|}{m\,|G|} \]

(by Theorem 16.2, $g$ can be any member of $C$ since $|\mathrm{Fix}(g)|$ is the same for all $g \in C$; also, $m$ is the number of orbits).

(2) Select uniformly at random $\alpha \in \mathrm{Fix}(g)$ and return its orbit.
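The two steps of DW can be tried out on a very small instance. The following Python sketch is an illustration only, not the procedure analysed in this chapter: it takes the set acted upon to be the labelled graphs on $n$ vertices and the group to be $S_n$, enumerates both by brute force (so it is only usable for $n$ up to about 5), and obtains $m$ from the orbit-counting lemma instead of taking it as an input. The names `dw_tables`, `dw_sample` and `canonical` are hypothetical helpers introduced here.

```python
import random
from collections import Counter
from fractions import Fraction
from itertools import combinations, permutations


def cycle_type(perm):
    """Sorted cycle lengths of a permutation given as a tuple i -> perm[i]."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        length, v = 0, start
        while v not in seen:
            seen.add(v)
            v = perm[v]
            length += 1
        if length:
            lengths.append(length)
    return tuple(sorted(lengths))


def relabel(graph, perm):
    """Image of a graph (frozenset of vertex pairs) under a permutation."""
    return frozenset(tuple(sorted((perm[u], perm[v]))) for u, v in graph)


def canonical(graph, group):
    """Orbit representative: smallest sorted edge list over all relabellings."""
    return min(tuple(sorted(relabel(graph, g))) for g in group)


def dw_tables(n):
    """Brute-force tables: the group, Fix(g) per class, class weights and m."""
    group = list(permutations(range(n)))
    pairs = list(combinations(range(n), 2))
    graphs = [frozenset(s) for r in range(len(pairs) + 1)
              for s in combinations(pairs, r)]
    sizes = Counter(cycle_type(g) for g in group)        # |C| for each class
    reps = {cycle_type(g): g for g in group}             # one g per class
    fixed = {ct: [x for x in graphs if relabel(x, g) == x]
             for ct, g in reps.items()}                  # Fix(g)
    weights = {ct: sizes[ct] * len(fixed[ct]) for ct in reps}
    m = Fraction(sum(weights.values()), len(group))      # number of orbits
    return group, fixed, weights, m


def dw_sample(tables, rng=random):
    """One run of DW; returns the canonical representative of the orbit."""
    group, fixed, weights, _ = tables
    types = list(weights)
    # Step (1): Pr[C] is proportional to |C| * |Fix(g)|.
    ct = rng.choices(types, weights=[weights[t] for t in types])[0]
    # Step (2): uniform element of Fix(g); report its orbit.
    return canonical(rng.choice(fixed[ct]), group)
```

For example, `dw_tables(4)[3]` evaluates to 11, the number of unlabelled graphs on four vertices, and repeated calls to `dw_sample(dw_tables(4))` return each of the 11 orbit representatives with equal probability.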

The important observation in the analysis of DW (stated in Theorem 15.2) is the fact that if we list the pairs $(g, \alpha)$, where $g$ is a permutation and $\alpha \in \omega \cap \mathrm{Fix}(g)$, then each orbit $\omega$ occurs exactly $|G|$ times in this listing.
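The uniformity of the output follows from this observation in a few lines. As a sketch, write $g_C$ for an arbitrary representative of the class $C$ (a notation introduced here) and use the fact that $|\omega \cap \mathrm{Fix}(g)|$, like $|\mathrm{Fix}(g)|$, is constant on conjugacy classes:

```latex
\begin{align*}
\Pr[\text{DW returns } \omega]
  &= \sum_{C} \Pr[C]\,\frac{|\omega\cap\mathrm{Fix}(g_C)|}{|\mathrm{Fix}(g_C)|}
   = \sum_{C} \frac{|C|\,|\omega\cap\mathrm{Fix}(g_C)|}{m\,|G|} \\
  &= \frac{1}{m\,|G|}\sum_{g\in G} |\omega\cap\mathrm{Fix}(g)|
   = \frac{|G|}{m\,|G|} = \frac{1}{m}.
\end{align*}
```

The sum over $g \in G$ counts exactly the pairs of the listing that belong to $\omega$, and there are $|G|$ of them by the observation above.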

Notice that the number of orbits is an input to the algorithm above. Therefore DW is indeed a nice and clean example of the relationship between counting and generation mentioned above. If the number of orbits is known, then DW, using this number, returns an orbit distributed uniformly over the set of all orbits.

Algorithm DW can be adapted to generate unlabelled graphs of given order, assuming that their number is known in advance. In this case $\Omega = \mathcal{G}^n$, the set of all labelled graphs with $n$ vertices, and $G = S_n$, the symmetric group on $n$ elements. The action of $S_n$ on $\mathcal{G}^n$ is a mapping which, given a graph and a permutation, relabels the vertices of the graph according to the permutation. The parameter $m$ is the number of unlabelled graphs on $n$ vertices. A sequential algorithm running in $O(n^2)$ expected time is obtained by noticing that, although the number of conjugacy classes is super-polynomial (see Theorem 19 below), the probability distribution defined in step (1) assigns very high probability to conjugacy classes containing permutations moving very few elements. Figure 2.1 (to be read in row major order) shows this distribution for $n = 5, 7, 8, 10$.

[Figure 2.1: Probability distributions on conjugacy classes for $n = 5, 7, 8, 10$. Panels: $n = 5$ and $n = 7$ (top row), $n = 8$ and $n = 10$ (bottom row).]

More formally we give the following definition:

Definition 4 If $C_i$ and $C_j$ are two conjugacy classes in $S_n$, then $C_i \prec C_j$ if the number of elements fixed by the permutations in $C_i$ is larger than the number of elements fixed by the permutations in $C_j$. The partial order $\prec$ is called weak ascending lexicographic order (w.a.l. order for short). In particular $C_i \equiv C_j$ if the permutations in the two conjugacy classes fix the same number of elements.

Ideally we would like to prove that $\Pr[C_i]$ is a decreasing function of the number of positions that are not fixed. But this is clearly false in all the examples in Figure 2.1. However, it is possible to prove monotonicity on a subset of all conjugacy classes and then, with a little more effort, to establish that the probability of choosing a small (in w.a.l. order) conjugacy class in this subset is much larger than that of choosing any other conjugacy class.
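The distribution of step (1) can be recomputed exactly for small $n$. The sketch below assumes the standard cycle-counting facts for $S_n$ acting on labelled graphs, stated here independently of Theorem 16: a permutation with cycle lengths $c_1, \ldots, c_r$ fixes $2^q$ graphs, where $q = \sum_i \lfloor c_i/2 \rfloor + \sum_{i<j} \gcd(c_i, c_j)$, and a class with $a_i$ cycles of length $i$ has $n!/\prod_i i^{a_i} a_i!$ elements. The function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, gcd


def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of cycle lengths."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for part in range(min(n, largest), 0, -1):
        for rest in partitions(n - part, part):
            yield (part,) + rest


def class_size(cycles):
    """Number of permutations of S_n with the given cycle lengths."""
    denom = 1
    for length in set(cycles):
        mult = cycles.count(length)
        denom *= length ** mult * factorial(mult)
    return factorial(sum(cycles)) // denom


def edge_cycles(cycles):
    """Number of cycles of the induced permutation on unordered vertex pairs."""
    q = sum(c // 2 for c in cycles)
    q += sum(gcd(a, b) for i, a in enumerate(cycles) for b in cycles[i + 1:])
    return q


def class_probabilities(n):
    """Return (m, {cycle type: Pr[C]}) for S_n acting on labelled graphs."""
    weights = {c: class_size(c) * 2 ** edge_cycles(c) for c in partitions(n)}
    total = sum(weights.values())          # equals m * n! (orbit counting)
    m = total // factorial(n)
    return m, {c: Fraction(w, total) for c, w in weights.items()}


def mass_by_moved(n):
    """Total probability of the classes moving exactly i elements, per i."""
    _, probs = class_probabilities(n)
    mass = {}
    for cycles, p in probs.items():
        moved = n - cycles.count(1)
        mass[moved] = mass.get(moved, 0) + p
    return mass
```

For $n = 5$ this gives $m = 34$, and the class of transpositions receives probability $16/51$ against $64/255$ for the identity, so the probability is not monotone in the number of moved elements. For $n = 10$ the identity class alone has probability above $0.8$.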

Definition 5 A conjugacy class is called dominant if its cycle type is of the form $[n-2k, k, 0, \ldots, 0]$ for some $k \in \{0, \ldots, \lfloor n/2 \rfloor\}$. $D_n$ is the family of all dominant conjugacy classes in $S_n$.

It follows from the definition that if $C \in D_n$ then $\Pr[C]$, as defined in DW, only depends on the number of cycles of length two of the permutations belonging to $C$.

Lemma 2 For all $n$ sufficiently large, if $C \in D_n$ then $\Pr[C]$ is a decreasing function of the number of cycles of length two in the permutations in $C$.

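Proof. Theorem 16 gives an expression for the cardinality of the set $\mathrm{Fix}(g)$. If $C$ is dominant then $l(1) = n-k$ for some $k \in \{0, \ldots, \lfloor n/2 \rfloor\}$, $l(2) = k$, all other $l(i)$ are null and

\[ q(C) = \frac{1}{2}\left\{ \varphi(1)(n-k)^2 + \varphi(2)k^2 - (n-k) + k \right\}. \]

Function $\varphi(i)$, as defined in Theorem 16.3, is the well known Euler function giving the number of positive integers not exceeding $i$ that are prime to $i$. In particular (see, for example, [Rio58, p. 62]) $\varphi(1) = \varphi(2) = 1$. Hence

\[ q(C) = \frac{1}{2}\left\{ n^2 - 2nk + k^2 + k^2 - n + k + k \right\} = \binom{n}{2} - k(n-1) + k^2. \]

Thus it is possible to write

\[ m\Pr[C] = \frac{2^{\binom{n}{2} - kn + k^2}}{(n-2k)!\,k!} \]

and the proof will be completed by showing that $f_n : \mathbb{N} \to \mathbb{R}$ defined by

\[ f_n(k) \stackrel{\mathrm{def}}{=} \frac{2^{\binom{n}{2} - kn + k^2}}{(n-2k)!\,k!} \]

is a decreasing function of $k$. For any fixed $n$, the ratio

\[ \frac{f_n(k)}{f_n(k+1)} = \frac{(k+1)\,2^{n-1-2k}}{(n-2k)(n-2k-1)}. \]

If $k \le n/2 - \log[(\sqrt{2}+\epsilon)n]$ for any $\epsilon > 0$ (logarithms are to base 2), then

\[ \frac{f_n(k)}{f_n(k+1)} \ge \frac{2^{2\log[(\sqrt{2}+\epsilon)n] - 1}}{n(n-1)} = \frac{[(\sqrt{2}+\epsilon)n]^2}{2n(n-1)} \]

which is larger than one for any $n \ge 2$. If $n/2 - \log[(\sqrt{2}+\epsilon)n] < k \le \lfloor n/2 \rfloor - 1$ then

\[ \frac{f_n(k)}{f_n(k+1)} \ge \frac{n/2 - \log[(\sqrt{2}+\epsilon)n]}{2\,(2\log[(\sqrt{2}+\epsilon)n] - 2)(2\log[(\sqrt{2}+\epsilon)n] - 3)} = \frac{n - 2\log[(\sqrt{2}+\epsilon)n]}{8\,(\log[(\sqrt{2}+\epsilon)n] - 1)(2\log[(\sqrt{2}+\epsilon)n] - 3)} \]

which is larger than one for all sufficiently large $n$, since the numerator grows linearly in $n$ while the denominator grows like $\log^2 n$.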
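The closed form of the ratio and the monotonicity claim can be checked numerically with exact rational arithmetic. The following Python sketch is a sanity check over finitely many values of $n$, not a substitute for the proof; the helper names `f`, `ratio` and `is_decreasing` are introduced here.

```python
from fractions import Fraction
from math import comb, factorial


def f(n, k):
    """f_n(k) = 2^(C(n,2) - k*n + k^2) / ((n - 2k)! * k!), as an exact rational."""
    exponent = comb(n, 2) - k * n + k * k
    return Fraction(2) ** exponent / (factorial(n - 2 * k) * factorial(k))


def ratio(n, k):
    """Closed form of f_n(k) / f_n(k+1), valid for 0 <= k <= floor(n/2) - 1."""
    return Fraction((k + 1) * 2 ** (n - 1 - 2 * k),
                    (n - 2 * k) * (n - 2 * k - 1))


def is_decreasing(n):
    """True if f_n(0) > f_n(1) > ... > f_n(floor(n/2))."""
    return all(f(n, k) > f(n, k + 1) for k in range(n // 2))
```

In this check the function $f_n$ is strictly decreasing for every $n$ from 6 to 40, while for $n = 5$ it fails at $k = 0$, where $f_5(0) = 128/15 < f_5(1) = 32/3$.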

Lemma 3 For every integer $k$ such that $k = O(n/\log n)$, if $C_1$ is a dominant conjugacy class with associated cycle type $[n-2k, k, 0, \ldots, 0]$, then $\Pr[C] \le \Pr[C_1]$ for every conjugacy class $C$ with $C_1 \prec C$.

Proof. Let $C_1$ be a dominant conjugacy class with associated cycle type $[n-2k, k, 0, \ldots, 0]$. By Lemma 2 we only need to prove that $\Pr[C_1] \ge \Pr[C]$ for every conjugacy class $C$ whose permutations move $2k+1$ or $2k+2$ elements. We state the argument explicitly for permutations moving $2k+1$ elements. Following [HP73] let $g_n^{(i)} = m \sum \Pr[C]$, where the sum is over all conjugacy classes whose permutations move exactly $i$ elements. Harary and Palmer [HP73] prove that

\[ g_n^{(i)} \le \frac{2^{\binom{n}{2} + (i - ni + i^2/2)/2}}{(n-i)!}. \]

For $i = 2k+1$ we have

\[ g_n^{(2k+1)} \le \frac{2^{\binom{n}{2} - k(n-2) + k^2 - \frac{n}{2} + \frac{3}{4}}}{(n-2k-1)!}. \]

By the proof of Lemma 2

\[ m\Pr[C_1] = \frac{2^{\binom{n}{2} - kn + k^2}}{(n-2k)!\,k!}. \]

Hence if $C$ is a conjugacy class whose permutations move $2k+1$ elements

\[ \Pr[C] \le \sum_{C' \text{ moving } 2k+1} \Pr[C'] \le (n-2k)\,k!\,2^{2k - \frac{n}{2} + \frac{3}{4}}\,\Pr[C_1]. \]

The result follows for sufficiently large $n$. The argument for conjugacy classes whose permutations move $2k+2$ objects is the same and the result follows since $g_n^{(i)}$ is a decreasing function of $i$. $\Box$
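The factor in the last inequality comes from dividing the bound on $g_n^{(2k+1)}$ by the expression for $m\Pr[C_1]$. Spelled out (a routine check, added here for convenience):

```latex
\begin{align*}
\frac{g_n^{(2k+1)}}{m\Pr[C_1]}
  &\le \frac{2^{\binom{n}{2} - k(n-2) + k^2 - \frac{n}{2} + \frac{3}{4}}}{(n-2k-1)!}
       \cdot \frac{(n-2k)!\,k!}{2^{\binom{n}{2} - kn + k^2}} \\
  &= (n-2k)\,k!\;2^{2k - \frac{n}{2} + \frac{3}{4}}.
\end{align*}
```

For any fixed $k$ the right-hand side tends to zero as $n$ grows, because the term $2^{-n/2}$ dominates the linear factor $n-2k$.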

A simple way to implement the first step of algorithm DW is to select a random real number $\xi$ between zero and one, to list conjugacy classes in w.a.l. order and pick the first conjugacy class for which the sum of $\frac{|C|\,|\mathrm{Fix}(g)|}{m\,|G|}$ over all conjugacy classes listed so far is larger than the threshold $\xi$. Let $t_n$ be the random variable counting the number of conjugacy classes listed by this algorithm. The following result is proved in [DW83].

Theorem 17 $1 \le E(t_n) \le 3$ for every $n \in \mathbb{N}$.
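The threshold-based selection can be sketched as follows for $S_n$ acting on labelled graphs. This is a minimal Python illustration under stated assumptions: classes are listed lazily by increasing number of moved elements, with ties broken arbitrarily (the w.a.l. order leaves them unspecified); $|C|$ and $|\mathrm{Fix}(g)|$ are computed from the standard cycle-counting formulas for $S_n$; and $m$ is passed in as an input. All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial, gcd


def partitions_min2(total, largest=None):
    """Partitions of `total` into parts >= 2 (the non-trivial cycles)."""
    if largest is None:
        largest = total
    if total == 0:
        yield ()
        return
    for part in range(min(total, largest), 1, -1):
        for rest in partitions_min2(total - part, part):
            yield (part,) + rest


def classes_in_wal_order(n):
    """Cycle types of S_n, by increasing number of moved elements."""
    for moved in range(n + 1):
        for parts in partitions_min2(moved):
            yield parts + (1,) * (n - moved)


def class_weight(cycles):
    """|C| * |Fix(g)| for the class with the given cycle lengths."""
    denom = 1
    for length in set(cycles):
        mult = cycles.count(length)
        denom *= length ** mult * factorial(mult)
    size = factorial(sum(cycles)) // denom
    q = sum(c // 2 for c in cycles)
    q += sum(gcd(a, b) for i, a in enumerate(cycles) for b in cycles[i + 1:])
    return size * 2 ** q


def select_class(n, m, xi):
    """Return (cycle type, number of classes listed) for a threshold xi in [0, 1)."""
    threshold = xi * m * factorial(n)      # compare unnormalised sums
    running = 0
    for listed, cycles in enumerate(classes_in_wal_order(n), start=1):
        running += class_weight(cycles)
        if running > threshold:
            return cycles, listed
    raise ValueError("xi must be smaller than 1 and m must be the orbit count")


def expected_listed(n, m):
    """Exact expectation of the number of classes listed, for this tie-breaking."""
    acc = sum(j * class_weight(c)
              for j, c in enumerate(classes_in_wal_order(n), start=1))
    return Fraction(acc, m * factorial(n))
```

With $n = 5$ and $m = 34$, the call `expected_listed(5, 34)` evaluates to $50/17 \approx 2.94$ for this particular ordering, which lies within the bounds of Theorem 17.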

The second step of DW can be implemented deterministically in $O(n^2)$ sequential time so that the overall algorithm has expected running time $O(n^2)$.

The same algorithm can be simulated in parallel. The following result appears in [Pu97] where, also, other parallel uniform generation algorithms for labelled graphs and subgraphs are presented.

Theorem 18 There exists an algorithm running in $O(\log n)$ expected time using $n^2$ processors of a CREW PRAM which generates uniformly at random an unlabelled graph on $n$ vertices, assuming their number is known in advance.

This result can be improved by using the algorithms described in Section 2.5 instead of some of the procedures embedded in the proof of Theorem 18 to obtain an optimal EREW algorithm. However, there are two major problems which are inherent to Dixon and Wilf's algorithmic solution. First of all, the number of unlabelled graphs of order $n$ is assumed to be known. Although an exact formula for this number exists, its computation uses a listing of all conjugacy classes in $S_n$. The reader is referred to Section 2.4 for further details on the complexity of listing conjugacy classes. Secondly, it is not easy to convert DW into an RNC algorithm which succeeds with high probability.

By the Markov inequality and monotonicity of probability, Theorem 17 implies that, for every $\lambda > 2$, with probability at least $1 - 1/\lambda$, $t_n$ is smaller than $3\lambda$. It might then be argued that fixing $\lambda$ (for example to some polynomial function of the number of vertices), selecting a random $\xi \in [0, 1]$ and listing at most $3\lambda$ conjugacy classes during step (1), provides an algorithm which always runs in polynomial time and returns a graph with probability at least $1/2$. In Section 2.6 a parallelisation of DW based on this idea is described. The resulting algorithm belongs to RNC and achieves optimal work and very low failure probability. Unfortunately it will be shown that the distribution over the output graphs is not completely uniform. In a sense, using DW, exact uniformity seems possible only if the whole process is allowed to run for a super-polynomial number of steps from time to time.
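The probability bound used in this argument is a direct application of the Markov inequality to $t_n$. As a short worked step, with $\lambda$ denoting the truncation parameter (the symbol is an assumption of this note) and $E(t_n) \le 3$ taken from Theorem 17:

```latex
\[
\Pr[t_n \ge 3\lambda] \;\le\; \frac{E(t_n)}{3\lambda} \;\le\; \frac{3}{3\lambda} \;=\; \frac{1}{\lambda},
\qquad\text{so}\qquad
\Pr[t_n < 3\lambda] \;\ge\; 1 - \frac{1}{\lambda} \;>\; \frac{1}{2}
\quad\text{for } \lambda > 2 .
\]
```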

In Sections 2.7 and 2.8 a combination of the main ideas in DW and a different technique presented in Section 2.3 will result in RNC algorithms with exactly uniform output probability. Lower failure probability will be traded off for higher efficiency.
