Embedding rainbow trees with applications to graph labelling and decomposition

(1)

BIROn - Birkbeck Institutional Research Online

Montgomery, R. and Pokrovskiy, Alexey and Sudakov, B. (2019) Embedding

rainbow trees with applications to graph labelling and decomposition. Journal

of the European Mathematical Society , ISSN 1435-9855. (In Press)

Downloaded from:

Usage Guidelines:

Please refer to usage guidelines at

or alternatively

(2)

arXiv:1803.03316v2 [math.CO] 9 Dec 2018

Embedding rainbow trees with applications to graph labelling and

decomposition

R. Montgomery

∗

_{, A. Pokrovskiy}

†

_{, and B. Sudakov}

‡

Abstract

A subgraph of an edge-coloured graph is called rainbow if all its edges have distinct colours. The study of rainbow subgraphs goes back more than two hundred years to the work of Euler on Latin squares. Since then rainbow structures have been the focus of extensive research and have found applications in the areas of graph labelling and decomposition. An edge-colouring is locallyk-bounded if each vertex is contained in at most k edges of the same colour. In this paper we prove that any such edge-colouring of the complete graphKncontains a rainbow copy of every tree with at most (1₋o(1))n/kvertices. As a locallyk-bounded

edge-colouring of Knmay have only (n₋1)/k distinct colours, this is essentially tight.

As a corollary of this result we obtain asymptotic versions of two long-standing conjectures in graph theory. Firstly, we prove an asymptotic version of Ringel’s conjecture from 1963, showing that anyn-edge tree packs into the complete graphK2n+o(n) to cover all buto(n

2_{) of its edges. Secondly, we show that all}

trees have an almost-harmonious labelling. The existence of such a labelling was conjectured by Graham and Sloane in 1980. We also discuss some additional applications.

1 Introduction

In this paper, we study the appearance of large rainbow trees in edge-coloured complete graphs and present applications of our result to several old open problems in graph theory.

A rainbow subgraph of an edge-coloured graph is one whose edges have diﬀerent colours. The study of rainbow subgraphs of edge-coloured graphs goes back more than two hundred years to the work of Euler on Latin squares. ALatin square of ordernis ann_×narray ﬁlled withnsymbols such that each symbol appears once in every row and column. Apartial transversal is a collection of cells of the Latin square which do not share the same row, column or symbol. Starting with Euler (see [14, 28]), transversals in Latin squares have been extensively studied. The most famous open problem in this area is the Ryser-Brualdi-Stein Conjecture (see [11, 36, 37]), which says that every Latin square has a partial transversal of ordern₋1 and a full transversal (a partial transversal of ordern) if n is odd. To every Latin square one can assign an edge-colouring of the complete bipartite graph Kn,n by colouring the edge ij by the symbol in the cell (i, j). This is a proper

colouring, i.e., one in which any edges which share a vertex have distinct colours. Identifying the cell (i, j) with the edgeij, a partial transversal corresponds to a rainbow matching. Thus, ﬁnding transversals is a special case of ﬁnding rainbow subgraphs. Another reason to study rainbow subgraphs arises in Ramsey theory, more precisely in the canonical version of Ramsey’s theorem proved by Erd˝os and Rado [35]. Here the goal is to show that locally-bounded edge-colourings of the complete graphKn contain rainbow copies of certain graphs.

An edge-colouring islocally k-bounded if each vertex is in at mostk edges of any one colour.

The most natural problem in the study of rainbow structures is to determine which graphs are guaranteed to have a rainbow copy in any properly coloured complete graphKn. As, whennis even,Kncan be (n−

1)-edge-coloured, we may ask in general only for rainbow subgraphs with at mostn−1 edges. This leads to a natural question: which n-vertex trees have a rainbow copy in any proper colouring of Kn? Hahn [24] conjectured

in 1980 the special case that there would always be a rainbow copy of the n-vertex path. Disproving this conjecture, Maamoun and Meyniel [31] constructed a proper colouring ofKn with no rainbow Hamilton path.

∗_{Trinity College, Cambridge, UK, CB2 1TQ. [email protected]}

†_{Department of Mathematics, ETH, 8092 Zurich, [email protected]}

‡_{Department of Mathematics, ETH, 8092 Zurich, [email protected]. Research supported in part by}

(3)

Nevertheless, it is widely believed that any properly coloured Kn contains a rainbow path covering all but

exceptionally few vertices. In particular, Andersen [6] in 1989 conjectured that one can always ﬁnd a rainbow path coveringn₋1 vertices. The progress on this conjecture was initially slow, despite the eﬀorts of various researchers, see, for example, [2, 22, 23, 18, 12]. Only recently did Alon, Pokrovskiy and Sudakov [4] show that any properly colouredKn contains a rainbow path withn−O(n3/4) vertices (as part of a rainbow cycle).

Turning to more general trees, there are no previous general results to show that rainbow copies of large trees can be found in properly coloured complete graphs. The example of Maamoun and Meyniel [31] can be extended to show that there are proper colourings ofKn which do not contain rainbow copies of certain

n-vertex trees (see [7]). However, it is still possible that nearly-spanning trees exist in all proper colourings of complete graphs. In this paper, we prove the ﬁrst result of this type — we show that any properly colouredKn

contains a rainbow copy of any tree withn−o(n) vertices. We will, in fact, prove the following more general result of which this is a special case. Recall that an edge-colouring ofKnis locallyk-bounded if each vertex is

in at mostkedges of any one colour.

Theorem 1.1. If ε,1/k_≫1/n >0, then any locally k-bounded edge-colouring ofKn contains a rainbow copy

of every tree with at most (1−ε)n/k vertices.

As a locallyk-bounded edge-colouring ofKnmay have only (n−1)/kdistinct colours, Theorem 1.1 is tight

up to the constantεfor eachk. Let us note that one of the distinguishing features of our result is that we place no conditions on the trees other than the number of vertices they may have. In comparison, all of the graph packing and labelling results which we mention below require a maximum degree bound which is conjectured to be unnecessary. The techniques we introduce for embedding trees with high degree vertices may oﬀer new approaches to these problems.

An important tool in our methods is to demonstrate that, in properly coloured complete graphs, a large rainbow matching can typically be found into a random vertex set using a random set of colours chosen with the same density, in fact such a matching can cover almost all the vertices in the random vertex set. This allows most of a large tree to be embedded if it can be decomposed into certain large matchings. Contrastingly, we need deterministic methods to embed vertices in the tree with high degree. The interplay between the deterministic and random part of the embedding forms a key part of our methods. This is sketched in more detail in Section 2.

Applications to graph decompositions, labellings and orthogonal double-covers. Theorem 1.1 has applications in diﬀerent areas of graph theory. We present three applications of it in the areas of graph decompositions, graph labellings, and orthogonal double-covers. With each application we prove an asymptotic form of a well-known conjecture. In each case, we consider some special colouring ofKn (coming from a

graph-theoretic problem), and apply Theorem 1.1.

Graph decompositions. In graph decompositions one typically asks when the edge set of some graphGcan be partitioned into disjoint copies of another graph H. This is a vast topic with many exciting results and conjectures (see, for example, [17, 39, 40]). One of the oldest and best known conjectures in this area, posed by Ringel in 1963 [33], deals with the decomposition of complete graphs into edge-disjoint copies of a tree.

Conjecture 1.2 (Ringel). Any tree withn+ 1 vertices packs2n+ 1 times into the complete graph K2n+1.

(4)

Piguet [3] almost-perfectly packed arbitrary spanning graphs with maximum degreeO(n/logn) and constant degeneracy1 _{into large complete graphs.}

Here we obtain the ﬁrst asymptotic solution for this problem applicable with no degree restriction.

Theorem 1.3. Ifε≫1/n >0, then any (n+ 1)-vertex tree packs at least2n+ 1times into the complete graph

K(2+ε)n.

To see the connection between Theorem 1.1 and Conjecture 1.2 consider the following edge-colouring of the complete graph with vertex set_{0,1, . . . ,2n_}, which we call thenear distance (ND-) colouring. Colour the edgeij by colourk, wherek_∈[n], if eitheri=j+k orj =i+kwith addition modulo 2n+ 1. It is easy to show (as below) that if an (n+ 1)-vertex tree has a rainbow embedding into the ND-colouring ofK2n+1 then

Conjecture 1.2 holds for that tree.

Proof of Theorem 1.3. Letℓ= (1 +ε/3)n. Consider the ND-colouring of the complete graphK2ℓ+1, deﬁned

above. This is a locally 2-bounded colouring ofK2ℓ+1, which thus, by Theorem 1.1, contains a rainbow copy

of T, S0 say. Now, for each i ∈ [ℓ], let Si be the tree with vertex set {v+i : v ∈ V(S0)} and edge set

{{v+i, w+i_}:vw_∈E(S0)}.

Note that the colour ofvw_∈E(K2ℓ+1) is the same as the colour of the edge{v+i, w+i}for eachi∈[2ℓ],

and under the translationx_7→x+ 1 the edgevw moves around all 2ℓ+ 1 edges with the same colour. Thus, each treeSi is rainbow, and all the treesSi are disjoint.

Graph labelling. Graph labelling originated in methods introduced by R´osa [34] in 1967 as a potential path towards proving Ringel’s conjecture. In the intervening decades, a large body of work has steadily developed concerning diﬀerent labellings and their applications (see [17]). One old, well-known, conjecture in this area concerns the harmonious labelling. This labelling was introduced by Graham and Sloane [19] and arose naturally out of the study of additive bases. Given an Abelian group Γ and a graph G, we say that a labellingf :V(G)_→Γ is Γ-harmonious if the mapf′ _:_E₍_G₎

→Γ deﬁned byf′₍_xy_{) =}_f₍_x_{) +}_f₍_y_{) is injective.}

In the case when Γ is a group of integers modulonwe omit it from our notation and simply call such a labelling

harmonious. In the particular case of ann-vertex tree, Graham and Sloane asked for a harmonious labelling using Z_n−₁ _{in which each label is used on some vertex, so that a single label is used on two vertices – where}

this exists we call the tree harmonious. More generally, we also call a graph with m edges and at most m

verticesharmonious if it has an injective harmonious labelling withZ_m_{. According to an unpublished result of}

Erd˝os [19], almost all graphs are not harmonious. On the other hand, Graham and Sloane [19] in 1980 made the following conjecture for trees.

Conjecture 1.4 (Graham and Sloane). All trees are harmonious.

This conjecture is known for many very special classes of trees such as caterpillars, trees with_≤31 vertices, palm trees and ﬁreworks (see Chapter 2 of [17] and the references therein). ˙Zak conjectured [41] an asymptotic weakening of this conjecture — that every tree has an injectiveZ_n₊_o₍_n₎_{-harmonious labelling.}

Note that, for any injective labelling of the vertices of the complete graph by elements of an Abelian group, the edge-colouring which is obtained by taking sums of labels of vertices is proper. Therefore we can use Theorem 1.1 to study such colourings. In particular, we can obtain the following general result which shows that every tree is almost harmonious.

Theorem 1.5. Everyn-vertex treeT has an injectiveΓ-harmonious labelling for any Abelian groupΓof order

n+o(n).

When the group Γ is cyclic, this theorem proves ˙Zak’s conjecture from [41].

Proof of Theorem 1.5. Suppose_|Γ_|=ℓ= (1 +ε)n for someε >0 and letT be a tree onnvertices. Identify the vertices ofKℓwith the elements of Γ and consider an edge-colouring that colours the edgeij byi+j. This

is a proper colouring, so, whennis large, by Theorem 1.1 it contains a rainbow copy ofT, which corresponds to a harmonious labelling. By taking Γ =Z_ℓ_{, we deduce that for any}_{ε >}_{0 there exists}_n₀ _{such that any tree}

withn_≥n0 vertices has an injective harmonious labelling with at most (1 +ε)nlabels.

(5)

Orthogonal double covers. Theorem 1.1 can also be used to obtain an asymptotic solution for another old graph decomposition problem. An orthogonal double cover of the complete graphKn by a graphG is a

collectionG1, . . . , Gnof subgraphs ofKn such that eachGi is a copy ofG, every edge ofKn belongs to exacly

two of the copies and any two copies have exactly one edge in common. The study of orthogonal double covers was originally motivated by problems in statistical design theory (see [13], Chapter 2). Since the number of copies in the double cover isn, it follows thatGmust haven₋1 edges. The central problem here is to determine for which graphsGthere is a an orthogonal double cover ofKn byG. In this full generality, this extends the

existence question for both biplanes and symmetric 2-designs (see [25]), and so must be considered diﬃcult. About 20 years ago, Gronau, Mullin, and Rosa [20] made the following conjecture about trees.

Conjecture 1.6(Gronau, Mullin, and Rosa). LetT be ann-vertex tree which is not a path on3edges. Then,

Kn has an orthogonal double cover by copies ofT.

This conjecture is known to hold for certain classes of trees including stars, trees with diameter_≤3, comets, and trees with _≤13 vertices (see [20, 30]). To see the connection between Conjecture 1.6 and Theorem 1.1, we will consider a colouring of a complete graph on 2k vertices, where edges are coloured by the sum of their endpoints in the abelian group Zk₂_{. By considering such a colouring we can show that Conjecture 1.6 is}

asymptotically true whenevernis a power of 2.

Theorem 1.7. Let n= 2k _{and let}_T _{be a tree on}_n

−o(n)vertices. ThenKn containsncopies ofT such that

every edge of Kn belongs to at most two copies and any two copies have at most one edge in common.

Proof. IdentifyV(Kn) with the groupZk2. Colour each edgeij withi+j ∈Zk2. By Theorem 1.1, Kn has a

rainbow copy S of T. For all x∈ Zk₂_{, deﬁne a permutation} _φ_x _: _V₍_K_n₎ _→_V₍_K_n_{) by} _φ_x₍_v_{) =} _x₊_v _(with

addition inZk₂_{). Use}_φ_x₍_S_{) to denote the subgraph of}_K_n _{with edges}_{_φ_x₍_a₎_φ_x₍_b_{) :}_ab_∈_E₍_S₎_}_{. Notice that,}

since φx is a permutation of V(Kn), φx(S) is a tree isomorphic to T. We claim that the family of n trees

{φx(S) :x∈Zk2} satisﬁes the theorem.

Notice that since φx(a) +φx(b) =a+x+b+x=a+b, the permutationsφxpreserve the colours of edges.

This implies that the trees Tx are all rainbow. Notice that the only edges ﬁxed by the permutations φx are

those coloured byx. Finally, notice thatφx−y◦φy=φx. Combining these, we have that if a colourc edge is

in two treesφx(S) andφy(S), thenx−y=cin Zk2. This implies that the treesφx(S) cover any edge at most

twice, and any pair of them have at most one edge in common.

The rest of this paper is organized as follows. In the next section we sketch the proof of Theorem 1.1. In Section 3 we deﬁne our notation, and recall some probabilistic results. In Sections 4 – 10 we prove Theorem 1.1.

2 Sketch of the proof of Theorem 1.1

In this section we sketch a proof of Theorem 1.1. For simplicity assume that Kn is properly coloured, i.e.

that k = 1. The proof splits the tree into a sequence of subforests, and iteratively extends the embedding to each subforest. The splitting of the tree goes as follows. We show that every tree has a short sequence

T =Tℓ ⊇Tℓ−1⊇ · · · ⊇ T1 ⊇ T0 where |T0| =o(n), and each forestTi is constructed from the previous one

using one of the following three operations: (1) Add large stars whose centers are inTi−1.

(2) Add paths of length 3 whose endvertices are inTi−1.

(3) Add a large matching one side of which is inTi−1.

There are two additional properties we can ensure: (1) only needs to be performed once, when going fromT0

to T1, and in (2), the total number of vertices contained in all the paths that need to be added is o(n). See

Lemma 4 for a precise statement of this splitting.

To ﬁnd a rainbow embedding of T, we start with a rainbow embedding of T0 and iteratively extend it

(6)

techniques for performing (1) – (3): (1) is done deterministically, whereas (2) and (3) are done probabilistically. This interplay between deterministic and probabilistic techniques is one of the main new ideas introduced in this paper.

Stars: To ﬁnd large stars in (1), we use the deterministic technique of “switchings”. The particular technique that we use originated in the papers of Woolbright [38] and Brouwer, de Vries, Wieringa [10] about transversals in Latin squares. The idea is to consider a maximal rainbow family of stars, and show that it must be large enough for our purposes. If the family of stars is not large enough, then it is possible to perform local manipulations to turn it into a larger rainbow family of stars (contradicting the maximality). See Section 8 for further details.

Paths and matchings: To ﬁnd paths and matchings in (2) and (3), we use the probabilistic method. That is, we choose a random set of verticesXand a random set of coloursC, and then try to ﬁnd the paths/matching using only vertices inX and colours inC. Here “random set” means that we choose every vertex/colour inKn

independently at random with some ﬁxed probabilityp.

To show that it is possible to ﬁnd rainbow paths/matchings using random setsX andC, we show that with high probability the subgraph ofKnonX andChas certain pseudorandom properties. There are two relevant

pseudorandom properties. Firstly, we show that the subgraphGofKn of edges with colours inChas roughly

the same edge-distribution as an Erd˝os-Renyi random graph. Speciﬁcally, every vertex has degree aboutpnin

Gand every pair of disjoint vertex setsA, B have about p|A||B|edges between them. Such a result was ﬁrst proved by Alon, Pokrovskiy, and Sudakov in [4]. We use a generalization of their result to locally bounded colourings (see Lemma 5.1). Secondly, we show that for random setsX _⊆V(Kn) andC⊆C(Kn), the number

of colours of C betweenX and any set A_⊆V(Kn) is at least (1−o(1))|A|. See Lemma 6.1 for the precise

statement.

Once we have established these pseudorandom properties, we use them to embed the paths and matchings for (2) and (3). Embedding paths is easy — since we only look foro(n) of them, there is enough room to ﬁnd them greedily (see Lemma 9.2). Embedding matchings is harder, since there is less extra room. Matchings are embedded using a switching argument (like that used for stars), but one which exploits our pseudorandom properties (see Lemma 7.2).

Combining: There is one diﬃculty left — how do we combine the deterministic arguments for stars with the random ones for paths/matchings? The issue here is that when we deterministically embed the stars, we may not have control over which colours and vertices we use. The vertices and colours used for (2) and (3) need to be disjoint from those used for the stars in (1) and need to be random subsets of V(Kn) and C(Kn)

respectively. These two requirements are incompatible with our aim to choose the vertices and colours in (1) deterministically.

We get around this issue by randomizing the stars we build in (1). Specifically we deterministically embed stars which are bigger than we need, and then randomly delete each vertex with fixed probability. The result is that we find the stars we want, with sufficient randomness in the unused vertices and colours. There is a complication that arises with this argument — there will be a dependency between the colours used on the stars and their vertices. Because of this the setting of several of the lemmas in this paper is the following: in a properly coloured Kn, we randomly choose a sets X and C of vertices and colours. The vertices are

chosen independently of each other with probability p. The colours are chosen independently of each other with probability p. However, there may be arbitrary dependencies between the vertices and the colours. It turns out that the methods we use for (3) still work with these added dependencies, allowing us to combine (1) – (3) to embed the whole treeT. More detail is given on this in Section 10.2.

3 Preliminaries

(7)

3.1 Definitions and notation

We use standard graph theory notation as well as the following deﬁnitions and notation relevant to a chosen colour classC ofG. We say a graph is C-rainbow if each of its edges has a diﬀerent colour inC. A C-edge

is one with colour in C. We say a vertex x is a colour-C neighbour of v in a graph G if x _∈ NG(v), the

neighbourhood of xin G, andxv has colour in C. We denote the set of colour-C neighbours of v in Gby

NC(v), and deﬁneNC(A) = (∪x∈ANC(x))\Afor eachA⊂V(G). For a colourc, we makeNc(x) =N{c}(x)

and other similar abbreviations. LetC(G) be the set of the colours of the edges ofG. For an edgee_∈E(G), letc(e) be the colour ofe.

We also use the following two important deﬁnitions when splitting trees.

Definition 3.1. Abare path P in a treeT is a path whose interior vertices all have degree 2 inT. WhereP

is a bare path inT,T₋P is the graphT with the edges ofP and the interior vertices ofP deleted.

Definition 3.2. In a treeT, we sayLis a set ofnon-neighbouring leaves ifLis a set of leaves which pairwise share no neighbours. In other words, removingLfromT removes a matching.

We use common asymptotic notation for any strictly positive real functions f and g, as follows. If

f(n)/g(n) _→ 0 as n _{→ ∞}, then we say f = o(g) and g = ω(f). If there is some constant C for which |f(n)| ≤C|g(n)| for alln, then we say f =O(g) and g= Ω(f). In addition, we use thehierachy x≫y ≫z

if there exists some non-zero decreasing functions f and g such that if y ≤ f(x) and z ≤ g(y) then all the subsequent inequalities we need concerningx, y andz hold. Where we say a property almost surely holds or holds with probability 1₋o(1) in conjunction with a hierachy x_≫y _≫1/n, we mean that, for each ε >0, and xandy withx≪y, for all sufficiently largenthe property holds with probability at least 1−ε. In all our lemmas we assume thatnis an integer which is sufficiently large (e.g. bigger than 106 _{is sufficient).}

3.2 Concentration inequalities

We make use of Chernoﬀ’s lemma in the following form (see, for example, [26]).

Lemma 3.3. If X is a binomial variable with standard parameters n and p, denoted X = Bin(n, p), and ε

satisfies0< ε≤3/2, then

P₍_|_X₋E_X_{| ≥}_εE_X₎_≤_{2 exp} ₋_ε2E_X/₃

.

Often, the random variables we consider will depend on both random vertices and colours, with some dependencies between the vertices and colours. Here, we will use Azuma’s inequality, for which we need the following deﬁnition. Given a probability space Ω =Qn

i=1Ωi and a random variableX : Ω→Rwe make the

following deﬁnition. If there is a constantksuch that changingω_∈Ω in any one coordinate changesX(ω) by at mostk, then we say thatX isk-Lipschitz.

Lemma 3.4 (Azuma’s Inequality). Suppose that X:Qn

i=1Ωi→R isk-Lipschitz. Then P₍_|_X₋E_X_|_{> t}₎_≤_{2 exp(}₋_t2_/k2_n₎_.

4 Tree splitting

We will split each tree we seek to embed by initially removing large matchings iteratively (that is, removing non-neighbouring leaves, see Deﬁnition 3.2). When this is no longer possible, either few vertices remain, or we have many disjoint long bare paths (see Deﬁnition 3.1), as shown by Lemma 4.1. Where these long paths occur, we will remove a path with length 3 from each end of each path, before iteratively removing matchings to remove the remaining edges in the middle of the paths. Few vertices will then remain, giving us the required splitting, in Lemma 4.2.

(8)

Proof. For the appropriater, letQ1, . . . , Qr be the maximal bare paths inT. Note that each edge is in some

pathQi, and that these paths are edge-disjoint. Replacing each path Qi by an edge gives a tree,S say, with

redges, no vertices of degree 2, and at mostℓ leaves. SinceS hasr+ 1₋ℓ vertices of degree_≥3, the sum of the degrees givesℓ+ 3(r+ 1−ℓ)≤P

v∈V(S)dS(v) = 2r, which impliesr≤2ℓ.

For each i_∈ [r], ﬁnd within Qi as many vertex-disjoint length r subpaths as possible while avoiding the

endvertices of Qi (thus the subpaths from diﬀerent Qi are vertex-disjoint), say the paths Qi,j, with j ∈ [ji],

where due to the maximalityji=⌊(|Qi|−2)/(m+ 1)⌋ ≥(|Qi|−m−2)/(m+ 1). Removing the internal vertices

of a subpath with lengthmfromQi removesm−1 vertices. Thus, for eachi∈[r], we have

|Qi− ∪j∈[ji]Qi,j|=|Qi| −(m−1)ji≤

(m+ 1)_|Qi| −(m−1)(|Qi| −m−2)

m+ 1 ≤2m+ 2_|Qi|

m+ 1. (1) Note that

T−

[

i∈[r]

[

j∈[ji]

Qi,j ≤

X

i∈[r]

Qi−

[

j∈[ji]

Qi,j ≤

X

i∈[r]

(2m+ 2_|Qi|/(m+ 1))

≤2mr+ 2(|T|+ 2r)/(m+ 1)≤6mℓ+ 2|T|/(m+ 1).

The ﬁrst inequality comes fromT =∪i∈[r]Qi, while the second comes from (1). The third inequality comes

from e(T) = P

i∈[r](|Qi| −1) = |T| −1, and the last inequality comes from r ≤ 2ℓ. Thus, the set of paths

{Qi,j:i∈[r], j∈[ji]} have the property required.

Using Lemma 4.1, we can now ﬁnd the desired splitting of an arbitrary tree.

Lemma 4.2. Given integers D and n, µ > 0 and a tree T with at most n vertices, there are integers ℓ _≤

104_Dµ−2 _and_j

∈ {2, . . . , ℓ_}and a sequence of subgraphs T0⊂T1⊂. . .⊂Tℓ=T such that P1 for each i∈[ℓ]\ {1, j},Ti is formed fromTi−1 by adding non-neighbouring leaves,

P2 Tj is formed fromTj−1 by adding at mostµn vertex-disjoint bare paths with length3,

P3 T1 is formed fromT0 by adding vertex-disjoint stars with at least D leaves each, and

P4 _|T0| ≤µn.

Proof. Letm=_⌈100/µ_⌉. Iteratively remove sets of µn/16mDnon-neighbouring leaves fromT as many times as possible. Eventually we obtain a subtree which has< µn/16mDnon-neighbouring leaves. Call this subtree

Tm−4, and letTm−3⊂. . .⊂Tℓ=T be the intermediate subtrees, i.e. whereTi is formed fromTi−1by adding µn/16mD non-neighbouring leaves. Note that ℓ ≤ m+ 16mD/µ ≤ 104_Dµ−2_{. Fix a maximal collection of}

vertex-disjoint bare paths with length min Tm−4. Remove the subpath with length 3 from each end of each

path fromTm−4, and call the resulting forestTm−5(this is possible becausem≥8). Note that we removed at

most 2n/m≤µnsuch paths. For each i=m−5, m−6, . . . ,2 in turn, remove a leaf from each disconnected bare path of lengthi₋1 inTi to get the forest Ti−1. Note that, to get the forestT1, we have removed every

internal vertex except for one from each path in the maximal collection of vertex-disjoint bare paths with length

m.

Finally, for every vertex v∈T1 which has≥Dleaves inT1, remove all of the leaves ofv. Call the resulting

forest T0. We will show that|T0| ≤µn, whereupon the sequenceT0⊂. . . ⊂Tℓ =T satisﬁesP1 to P4 with

j=m₋4.

ConsiderS =Tm−4 and remove any leaves around a vertex if it has at least D leaves. Call the resulting

tree S′_{. By the choice of}_ℓ_{, at most} _µn/₁₆_mD _{vertices in} _S _{have some incident leaf. Each leaf in} _S′ _{had an}

incident leaf inS (which was then removed) or was incident to a vertex inS which had at mostD₋1 leaves. Thus,S′ _{has at most}_µn/₁₆_m_leaves.

By Lemma 4.1, there are vertex-disjoint bare pathsP1, . . . , Pr, for somer, with lengthm, so that|S′−P1− . . .₋Pr| ≤6µn/16 + 2n/m≤µn/2. As at mostµn/16mD vertices inS have incident leaves, all but at most

µn/16mD of these pathsPi are bare paths inS. At leastr−n/16mDmany vertex-disjoint bare paths with

lengthmmust have been removed fromS =Tm−4, except for one vertex in each, to getT1, so thereforeT1can

have at mostµn/2 +r+m_·µn/16mD_≤µnvertices which are not inV(S)_\V(S′_{) (using}_r

≤n/m_≤µn/100). As every vertex inV(S)_\V(S′_{) is removed from}_T

(9)

5 Choosing random colours

Consider a properly coloured complete graphKn, and form a random set of colours by choosing every colour

independently with probabilityp. LetGbe the subgraph ofKn consisting of the edges of the chosen colours.

Alon, Pokrovskiy and Sudakov [4] showed that, with high probability, the subgraph G has a similar edge distribution to the binomial random graph. Speciﬁcally they showed that, for any two sets of verticesAandB, the number of edges in the chosen colours betweenA andB is concentrated around the expectation p_|A_||B_|. In this section we will prove a generalization of this result on proper colourings to colourings ofKn which are

locallyk-bounded. The method we use is similar to the proof from [4].

Lemma 5.1. Letε >0andk∈N_{be constant and let}_p_≥_n−1/100_{. Let}_K_n _{have a locally}_k_{-bounded colouring}

and supposeGis a subgraph ofKn chosen by including the edges of each colour independently at random with

probability p. Then, with probability1₋o(n−1₎_{, for any disjoint sets}_{A, B}

⊂V(G), with _|A_|,_|B_{| ≥}n3/4_,

eG(A, B)−p|A||B|≤εp|A||B|.

Proof. We will ﬁrst show that, with high probability, the property we want holds for pairs of sets with many colours between them.

Claim 5.2. With probability1₋o(n−1₎_{the following holds.}

P For any disjoint setsA, B⊂V(Kn), with|A| ≥ |B| ≥n1/10, which have at least(1−εp/8)|A||B|different

colours between AandB inKn, we have |eG(A, B)−p|A||B|| ≤εp|A||B|/2.

Proof. For any such setsAandB, we can select a rainbow subgraphRofKn[A, B] with (1−εp/8)|A||B|edges.

Notice thate(R_∩G)_∼Bin((1₋εp/8)_|A_||B_|, p). By Lemma 3.3 applied withεp/8 forε, with probability at least 1−exp(−ε2_p3_|_A_||_B_|_/₁₀3_{), we have (1}₋_εp/₄₎_p_|_A_||_B_{| ≤}_e₍_R_∩_G₎_≤_{(1 +}_εp/₄₎_p_|_A_||_B_|_{, which in combination}

with e(Kn[A, B]−R) ≤ εp|A||B|/8 implies that P holds for A and B. Note that, for such sets, we have

p3

|B_{| ≥}n7/100₌_ω_(log_n_{), so when}

|A_{| ≥ |}B_{| ≥}n1/10_{, condition}_P_{holds for}_A_and_B _{with probability at least}

1−exp(−|A| ·ω(logn)).

Thus,Pholds with probability at least

1₋

n X

b=n1/10

n X

a=b

n a

n b

exp(₋a_·ω(logn)) = 1₋o(n−1).

Assuming thatPholds, we now show that the property in the lemma holds. LetA, B_⊂V(Kn) be disjoint

sets with|A| ≥ |B| ≥n3/4. Letℓ=⌈10√kn/ε2p2⌉ ≤n0.6+o(1).

Claim 5.3. There are partitionsA=A1∪. . .∪AℓandB =B1∪. . .∪Bℓsuch there are at mostε2p2|A||B|/100

edges abbetweenAandB for which there is another edgea′_b′ _with_c₍_ab_{) =}_c₍_a′_b′₎_and_{ab, a}′_b′

∈E(Kn[Ai, Bj])

for somei, j_∈[ℓ].

Proof. Pick random partitionsA=A1∪. . .∪AℓandB =B1∪. . .∪Bℓ, by choosing the part of each element

independently and uniformly at random. Fix any colour-cedgee, and letAi,Bjbe the classes it goes between.

The probability that there is another colour-cedge betweenAi andBj sharing a vertex witheis≤2(k−1)/ℓ

(since etouches at most 2(k−1) other colour-cedges). The probability there is a colour-cedge disjoint from

ebetweenAi andBj is≤kn/2ℓ2(since that are at most kn/2 colour-cedges in total). Combining these, the

probability there is another colour-cedge betweenAi andBj is at most

2(k₋1)

ℓ + kn

2ℓ2 ≤ kn

ℓ2 ≤ ε2_p2

100.

Thus, the expected number of edges which have a non-unique colour across their classes is at mostε2_p2

|A_||B_|/100.

Fix such a partition then for which there are at mostε2_p2

(10)

Claim 5.4. Let H be the subgraph of Kn[A, B] with all the edges between any pair Ai, Bj which does not

satisfy Premoved. Then

|eH(A, B)− |A||B|| ≤εp|A||B|/4. (2)

Proof. First, delete any edge adjacent to a classAi or Bj which contains at mostn1/10 vertices. For largen,

the number of edges deleted in this ﬁrst stage is at most

ℓ_·n1/10₍

|A_|+_|B_|)_≤n0.7+o(1)₍

|A_|+_|B_|)_≤εp_|A_||B_|/8. (3)

Secondly, delete edges between any class Ai and Bj if there are at most (1−εp/8)|Ai||Bj| diﬀerent colours

between Ai and Bj in Kn. Note that, at each such deletion, we lose |Ai||Bj| edges, at least εp|Ai||Bj|/8 of

which have non-unique colours between their classes. Thus, ifI is the set of pairs (i, j) for which we deleted edges betweenAi andBj for this reason, we have

X

(i,j)∈I

εp_|Ai||Bj|/8≤ε2p2|A||B|/100.

Thus, we have deleted at mostP

(i,j)∈I|Ai||Bj| ≤εp|A||B|/8 edges in this second stage, and, by (3), at most

εp_|A_||B_|/4 edges in total, proving (2).

Let I′ _{be the set of pairs (}_{i, j}_{) for which there remain edges between} _A

i and Bj in H. By P, for each

(i, j)_∈I′_{, we have}

|eG∩H(A, B)−p·eH(A, B)|=

X

(i,j)∈I′

eG(Ai, Bj)−p|Ai||Bj| ≤

X

(i,j)∈I′

|eG(Ai, Bj)−p|Ai||Bj||

≤ X

(i,j)∈I′

εp|Ai||Bj|/2≤εp|A||B|/2. (4)

The ﬁrst inequality is the triangle inequality. The second inequality is P. The third inequality follows as

A1∪. . .∪AℓandB1∪. . .∪Bℓ partitionAandB respectively. The required property then holds forAandB,

as follows.

|eG(A, B)−p|A||B|| ≤ |eG(A, B)−eG∩H(A, B)|+|eG∩H(A, B)−p·eH(A, B)|+p|eH(A, B)− |A||B||

≤2_|eH(A, B)− |A||B||+εp|A||B|/2≤εp|A||B|.

The ﬁrst inequality is the triangle inequality. The third inequality comes from (2). The second inequality comes from (4) and asEG(A, B)\EG∩H(A, B)⊆EKn(A, B)\EH(A, B).

6 Colours and random vertex sets

Consider a locally k-bounded coloured complete graph Kn, and let C be a random set of colours chosen

independently with probabilityp. Consider two sets of verticesA andB. In the previous section, we showed that the number of colour-C edges between Aand X is about p|A||B|. Since every colour can occur ≤k|A| times between A and B, this implies that the number of colours of C occuring between A and B is at least (1₋o(1))p_|B_|/k. Our goal here is to prove that ifB is arandom set of of vertices, then this number increases to (1₋o(1))_|B_|/k. In fact, the following main result of this section shows that this holds for every large subset

B of a random set of verticesX.

Lemma 6.1. Let k_∈N_and _{ε >}₀ _{be constant, and}_p_≥_n−1/104_{. Let}_K_n _{have a locally} _k_{-bounded colouring.}

Let X _⊂V(Kn) andC ⊂C(Kn) be random subsets where, for eachx∈V(Kn)andc ∈C(Kn), P(x∈X) =

P₍_c_∈_C_{) =}_p_{, all the events}_{_x_∈_X_}_{are independent and all the events}_{_c_∈_C_}_{are independent (but the event}

{x∈X} might depend on the events {c∈ C}). Then, with probability 1−o(n−1₎_{, for each} _A_⊂_V₍_K

n)\X

andB _⊂X with _|A_{| ≥}n3/4 _and

|B_{| ≥}εpn, there are at least(1₋ε)_|B_|/k colours which appear in Gbetween

(11)

To prove this lemma we consider a setA and a random setX, and analyse how many times each colour can appear between them. Without any assumptions onX, we know that every colour appears_≤k_|A_|times betweenAandX (from local k-boundedness). The following lemma shows that ifX is chosen randomly with probabilityp, thenmost colours appear only at most (1 +o(1))pk|A|times betweenAandX.

Lemma 6.2. Let k be constant andε, p _≥n−1/100_{. Let}_K

n have a locally k-bounded colouring and let X be

a random subset ofV(Kn) with each vertex included independently at random with probability p. Then, with

probability 1₋o(n−1₎_{, for each}_A

⊂V(Kn) with n1/20≤ |A| ≤n1/4, for all but at most εncolours there are

at most(1 +ε)pk|A|edges of that colour between AandX. Proof. Let ℓ =_⌈kn1/2

⌉. Fix A_⊂ V(G) with n1/20

≤ |A_{| ≤}n1/4_{. For each} _c

∈ C(Kn), letAc = Nc(A), so

that_|Ac| ≤k|A|. LetC′(Kn) be the colours with more thanpk|A|edges betweenAandV(Kn) inKn, so that

|C′₍_K

n)| ≤n/pk.

Claim 6.3. There is a partition C′₍_K

n) =C1∪. . .∪Cℓ so that, for eachi∈[ℓ]anda, b∈Ci,Aa andAb are

disjoint.

Proof. Create an auxillary graph with vertex setC′₍_K

n) whereab is an edge exactly ifAa∩Ab6=∅. For any

colour c _∈ C(Kn), there are at most k|A|2 ≤ ℓ edges betweenA and Ac, and hence at most ℓ−1 diﬀerent

colours fromC′₍_K

n)\ {c} on such edges. The auxillary graph then must have maximum degree at mostℓ−1,

and is thusℓ-colourable, so the claim holds.

Now, consider X, the random subset of V(Kn) with each vertex included independently at random with

probabilityp. For eachi_∈[ℓ], let

Bi={c∈Ci:|Ac∩X|>(1 +ε)pk|A|}

andB=_∪i∈[ℓ]Bi. To show that|B| ≤εn, we will use the following claim.

Claim 6.4. For eachi∈[ℓ], if|Ci| ≥εn/2ℓ, then P(|Bi| ≥εp|Ci|/2)≤exp(−ε2pn1/2/200k).

Proof. Recall that |A| ≥ n1/20 and ε, p ≥ n−1/100, so that ε2pk|A| = ω(−log(εp)). Note that |Ac∩X| ∼

Bin(_|Ac|, p). By Lemma 3.3, and as|Ac| ≤k|A|, for eachc∈Ci,

P₍_c_∈_B_i₎_≤_{2 exp(}₋_ε2_pk_|_A_|_/₃₎_≤_εp/₄_,

for suﬃciently large n. Furthermore, the events {c ∈ Bi}, c ∈ Ci, are independent (by the disjointness of

the sets Ac forc∈Ci). This implies that|Bi|is stochastically dominated by Bin(|Ci|, εp/4). By Lemma 3.3

applied with 1/2 forε, if_|Ci| ≥εn/2ℓ, then

P₍_|_B_i_{| ≥}_εp_|_C_i_|_/₂₎_≤_{2 exp(}₋₍_εp_|_C_i_|_/₄₎_/₁₂₎_≤_exp(₋_ε2_pn/₁₀₀_ℓ₎_≤_exp(₋_ε2_pn1/2_/₂₀₀_k₎_.

Note that from our choice of parameters we have thatε2_pn1/2_/₂₀₀_{k > n}2/5_{. Therefore the probability that}

for some subset A _⊂V(Kn), with n1/20 ≤ |A| ≤ n1/4, there is an index i ∈ [ℓ] such that |Ci| ≥εn/2ℓ and

|Bi| ≥εp|Ci|/2 is at most

n1/4

X

a=n1/20

_n

a

·ℓ_·exp(₋n2/5) =o(n−1).

Therefore, with high probability, we can assume that for every subset Aand every Ci with |Ci| ≥εn/2ℓthe

correspondingBi satisﬁes|Bi| ≤εp|Ci|/2. Then,

|B_| = X

i∈[ℓ]

|Bi| ≤

X

i:|Ci|≥εn/2ℓ

|Bi|+

X

i:|Ci|<εn/2ℓ

|Ci|

≤ εp₂ X

i∈[ℓ]

|Ci|+ℓ·

εn

2ℓ ≤ εn

2k + εn

2 < εn,

where we have used thatP

(12)

We can now show that the property in Lemma 6.2 is likely also to hold for any subsetA which is not too small.

Corollary 6.5. Letk be constant andε, p_≥n−1/103

. LetKn have a locally k-bounded colouring and let X be

a random subset ofV(Kn) with each vertex included independently at random with probability p. Then, with

probability 1−o(n−1), for each A ⊂V(Kn)\X with |A| ≥ n1/4, for all but at most εn colours there are at

most(1 +ε)pk|A|edges of that colour between AandX.

Proof. By Lemma 6.2 applied with ε2_p/_{4 for}_ε_{, with probability 1}₋_o₍_n−1_{), for each subset}_B _⊂_V₍_G_{) with} n1/20

≤ |B_{| ≤}n1/4_{, for all but at most} _ε2_pn/_{4 colours there are at most (1 +}_ε2₎_pk

|B_| edges of that colour betweenB andX.

LetA⊂V(G) satisfy|A| ≥n1/4_{and choose}_A₌_A

1∪. . .∪Aℓwithℓ=⌊|A|/n1/20⌋andn1/20≤ |Ai| ≤2n1/20

for eachi_∈[ℓ]. For eachi_∈[ℓ], letCi be the set of colours for which there are more than (1 +ε2)pk|Ai|edges

of that colour betweenAiandX, so that|Ci| ≤ε2pn/4. LetC′ be the set of colours for which there are more

than (1 +ε)pk|A| edges of that colour betweenA andX. We need then to show that|C′_{| ≤}_εn_.

Note that, if c_∈C′_{, then}

(1 +ε)pk_|A_{| ≤}X

a∈A

|Nc(a)∩X| ≤

X

i:c∈Ci

k_|Ai|+ X

i:c /∈Ci

(1 +ε2)pk_|Ai|

≤ |{i:c_∈Ci}| ·2kn1/20+ (1 +ε2)pk|A|, (5)

The ﬁrst equality comes fromc∈C′. The second inequality comes from the fact that the number of colour-c

edges between Ai andX is at most (1 +ε2)pk|Ai|for i 6∈Ci, and at most k|Ai| for all other i(by the local

k-boundedness ofKn). The third inequality comes from|Ai| ≤2n1/20andPi∈[ℓ]|Ai|=|A|. This implies that

εpk|A|/2≤(ε−ε2)pk|A|(5)≤ |{i:c∈Ci}| ·2kn1/20≤ |{i:c∈Ci}| ·2k|A|/ℓ.

Thus, ifc_∈C′_{, then}

|{i:c_∈Ci}| ≥εpℓ/4. This gives

|C′| ·εpℓ/4≤ X

c∈C′

|{i:c∈Ci}|=

X

i∈[ℓ]

|C′∩Ci| ≤

X

i∈[ℓ]

|Ci|.

Therefore,

|C′| ≤

P

i∈[ℓ]|Ci| εpℓ/4 ≤

ℓ·ε2_pn/₄ εpℓ/4 =εn, as required.

Combining the property in Corollary 6.5 with Lemma 5.1, we can now complete the proof of the main result of this section.

Proof of Lemma 6.1 . The desired property in the lemma strengthens asεdecreases, so we may assume that

ε≤1/2. By Corollary 6.5 applied withε2p2/4k for ε, with probability 1−o(n−1), for each A⊂V(G) with |A| ≥n3/4_{, for all but at most}_ε2_p2_n/₄_k_{colours there are at most (1 +}_ε2₎_pk_|_A_|_{edges of that colour between}_A

andX inKn. With probability 1−o(n−1), by Lemma 5.1 applied with ε2forε, for every two disjoint subsets

A, B_⊂V(G) with_|A_|,_|B_{| ≥}n3/4_{, we have}_e

G(A, B)≥(1−ε2)p|A||B|.

Now, for each A ⊂ V(Kn)\X and B ⊂ X with |A| ≥ n3/4 and |B| ≥ εpn ≥ n3/4, there are at least

(1₋ε2₎_p

|A_||B_| edges between A andB in G. Delete all edges between A and B in G whose colour appears more than (1 +ε2₎_pk

|A_|times betweenAandB. As each colour appears betweenAandB inKn at mostk|A|

times, this removes at mostk|A| ·ε2_p2_n/₄_k_edges.

Each remaining colour between AandB inGoccurs between AandB at most (1 +ε2₎_pk

|A_| times inG. Therefore, betweenAandB inGthe number of diﬀerent remaining colours is at least

(1−ε2₎_p_|_A_||_B_{| −}_ε2_p2_n_|_A_|_/₄

(1 +ε2₎_pk_|_A_| ≥(1−ε/2)|B|/k−ε 2_pn/₂_k

≥(1−ε)|B|/k,

(13)

7 Finding a large matching

In this section we show that one can ﬁnd an almost spanning rainbow matching between any set A with appropriate size and a random set X. Note that in the following lemma the random set of colours and the random set of vertices are not required to be independent of each other. This is important for the application of this statement in our methods. More detail on this is given in Section 10.2.

Lemma 7.1. Let k ∈ N _and _{ε >} ₀ _{be constant, and suppose} _K_n _{has a locally} _k_{-bounded colouring and} p_≥n−1/104

. Let X_⊂V(Kn) andC⊂C(Kn) be random subsets where, for eachx∈V(Kn)andc∈C(Kn),

P₍_x _∈ _X_{) =} P₍_c _∈ _C_{) =} _p_{, and all the events} _{_x _∈ _X_} _{are independent and all the events} _{_c _∈ _C_} _are

independent. Then, with probability 1−o(n−1₎_{, for each set} _A _⊂ _V₍_K

n)\X with |A| ≤ pn/k, there is a

C-rainbow matching inKn of size at least |A| −εpnfrom AintoX.

To prove this lemma we first show, using fairly standard edge swapping arguments, how to find a rainbow matching in a bipartite graph which has edges of many different colours between any two large subsets of vertices. Note that we established such a property in our setting in the previous section.

Lemma 7.2. Let ε _≫ η _≫ 1/n > 0. Let G be a bipartite graph with classes X and Y, with _|X_| = n and

|Y_|=kn, which has a locally k-bounded colouring. Suppose that between any two subsets A_⊂X and B _⊂Y

with size at leastηn there are at least (1−η)|B|/k colours in Gwhich appear between A andB. Then, there exists a rainbow matching inGwith at least(1₋ε)nedges.

Proof. LetM be a maximal rainbow matching in G. Let A0=X\V(M) andB0=Y \V(M). Suppose, for

later contradiction, that|A0| ≥εn.

Letting ℓ =⌈4/ε⌉and s = 5ℓ+1_{, deﬁne a sequence of colour sets}_C

0 ⊂ C1 ⊂. . . ⊂ Cℓ, and sequences of

vertex setsA1⊂. . .⊂Aℓ⊆X andB1⊂. . .⊂Bℓ⊆Y recursively as follows, for eachi≥1.

Ci−1={colours with at leastsdisjoint edges betweenAi−1 andBi−1} Ai=A0∪ {u∈X:∃v∈Y s.t.uvis an edge ofM with colour from Ci−1} Bi=B0∪ {v∈Y :∃u∈X s.t.uvis an edge ofM with colour from Ci−1}

Observe that, for each 0_≤i_≤ℓ,M ₋(Ai∪Bi) is a matching, and sok|Ai| ≤ |Bi|. LetC−1=∅.

Claim 7.3. For each 0 _≤i _≤ ℓ and any set of vertices U _⊆Ai∪Bi with |U| ≤5ℓ+1−i, there is a rainbow

matching MU in G, withV(MU)∩U =∅ andC(MU) =C(M), which contains every edge in M with colour

outsideCi−1.

Proof. The proof is by induction oni. For i= 0, the claim holds, takingMU =M for anyU. Suppose then

thati >0 and the claim holds fori−1.

LetU ⊆Ai∪Bi with|U| ≤5ℓ+1−i. LetU1=U ∩(Ai−1∪Bi−1) andU2=U\U1. By the deﬁnition ofAi

andBi, there is a set of at most|U2| edgesM2 ⊆M such thatU2⊆V(M2) andC(M2)⊆Ci−1\Ci−2. Note

thatV(M2)∩(Ai−1∪Bi−1) =∅.

By the deﬁnition of Ci−1, for every m ∈ M2, there are at least s disjoint edges between Ai−1 and Bi−1

with colour the same as m. Sinces= 5ℓ+1

≥5_·5ℓ+1−i

≥ |U1|+ 2|M2|, we can choose an edge,em say, with

colour the same asm, for allm_∈M2, such thatM2′ ={em:m∈M2} is a matching,V(M2′)⊂Ai−1∪Bi−1

andV(M2′)∩U1=∅. Note that|M2′|=|M2|andC(M2′) =C(M2)⊆Ci−1\Ci−2.

LetU′ ₌_V₍_M′

2)∪U1, so thatU′ ⊆Ai−1∪Bi−1 and|U′| ≤ |U1|+ 2|U2| ≤5ℓ+2−i. By induction, there is

a matching MU′ with C(M_U′) = C(M) which is disjoint from U′ and contains every edge in M with colour

outside Ci−2. Note that, therefore, M2 ⊂ MU′. Let MU = (MU′ \M2)∪M2′. This replaces each edge m ∈ M2 ⊂ MU′ with an edge of the same colour, em, which is disjoint from U, while keeping the edges in

the matching independent. Thus, C(MU) =C(MU′) =C(M). As the edges moved had colour in Ci−1, MU

contains every edge inMU′ with colour outsideCi−1, and thus every edge inM with colour outsideCi−1.

(14)

Proof. Suppose that there is an edgexy fromAi toBi whose colour does not occur onM. Claim 7.3 applied

withU =_{x, y_} gives a maximal rainbow matchingMU in GwithC(MU) =C(M) andV(MU)∩ {x, y}=∅.

Letting M′ ₌ _M

U +xy gives a rainbow matching in G with size |M|+ 1, contradicting the maximality of

M.

Letµ= 1/2ℓ, which is a non-zero decreasing function ofε, so thereforeµ≫η.

Claim 7.5. Let 0_≤ i_≤ℓ and let F be a set of at most (1₋µ)_|Ai| colours. Then, there is a colour c 6∈F

which occurs on at least sdisjoint edges between Ai andBi.

Proof. Let t = ⌈2ks/µε⌉, and note this is a function of ε and k. Noting that |Ai| ≥ |A0| ≥ εn, take in Ai disjoint sets S1, . . . , St, each with size at least ηn. By the assumption of the lemma, there are at least

(1₋η)_|Bi|/k ≥ (1−η)|Ai| colours between Sj and Bi, for eachj ∈ [t]. Therefore, there are at least (1−

η)_|Ai| − |F| ≥(µ−η)|Ai| ≥µεn/2 colours outsideF betweenSj andBi for eachj∈[t]. By Claim 7.4, there

are at mostncolours betweenAi andBi. Therefore, there is a colour, csay, outside F which occurs between

Bi and at least (µεn/2)·t/n≥ksdiﬀerent setsSj. As the colouring ofGis locallyk-bounded and the setsSj

are disjoint, there are thus at leastsdisjoint edges with colourcbetweenAi andBi.

Claim 7.6. For each0_≤i_≤ℓ,_|Ci| ≥(1−µ)|Ai|.

Proof. If|Ci|<(1−µ)|Ai|, then, by Claim 7.5 with F =Ci, there is some colourc6∈Ci occuring at least s

times betweenAi andBi. This colour should be inCi, a contradiction.

Note that all colours ofCi−1 occur between Ai−1 and Bi−1, and so, by Claim 7.4, occur on M. Since M

is rainbow, the deﬁnition of Ai implies that for each i ∈ [ℓ] we have |Ai| = |A0|+|Ci−1|. Combined with

Claim 7.6, this gives|Ai| ≥ (1−µ)|Ai−1|+|A0| for each i∈ [ℓ], and hence |Aℓ| ≥ Pjℓ−=01(1−µ)j|A0|. Note

that, asµ= 1/2ℓ, (1₋µ)j

≥1/2 for eachj _∈[ℓ], and therefore_|Aℓ| ≥ℓ|A0|/2. Thus,|Aℓ| ≥ℓ·εn/2 > n, a

contradiction, so that the original matchingM must satisfy the property in the lemma.

We now put this together with Lemma 6.1 to complete the proof of the main result of this section.

Proof of Lemma 7.1. Letη be a ﬁxed constant not dependent onnwhich satisﬁesε≫η >0. With probability 1₋o(n−1_{), by Lemma 6.1 applied with}_η/₂_k_for_ε_{, for each}_A

⊂V(Kn)\X andB⊂X with|A|,|B| ≥ηpn/2k,

there are at least (1₋η)_|B_|/k colours inC betweenA andB. With probability 1₋o(n−1_{), by Lemma 3.3,}

(1−η/2)pn≤ |X| ≤(1 +η/2)pn.

We claim that the property in the lemma holds. Let A _⊂ V(Kn)\X with |A| ≤ pn/k. Add vertices

to A from V(Kn)\X, or delete vertices from A, to get a set A′ with |A′| = ⌊(1−η/2)pn/k⌋ =: m and

|A_\A′

| ≤ηpn/k_≤εpn/2. LetX′_{be a subset of}_X _{of size}_km_{. Note that for any subsets}_A′′

⊂A′ _and_B

⊂X′

with |A′′_|_,_|_B_{| ≥}_ηm_{, there are at least (1}₋_η₎_|_B_|_/k _{colours in} _C _between _A′′ _and _B _(since _ηm _≥ _ηpn/₂_k_).

Thus, by Lemma 7.2, there is aC-rainbow matching with at least (1₋ε/4)m _{≥ |}A′

| −εpn/2 edges between

A′ _and_X′_{. As}

|A_\A′

| ≤εpn/2 at least_|A_{| −}εpnof the edges in thisC-rainbow matching must lie between

AandX.

We remark that, in the particular case whereKn has an equal number of edges of each colour, Lemma 7.1

can be proved from Lemma 5.1 and Corollary 6.5 using a standard implementation of the R¨odl nibble (see, for example, [5]).

8 Star finding

(15)

Lemma 8.1. Let 0 < ε < 1/100and ℓ ≤ε2_n/₂_{. Let} _G _{be an} _n_{-vertex graph with minimum degree at least}

(1₋ε)n which contains an independent set on the distinct verticesv1, . . . , vℓ. Let d1, . . . , dℓ ≥1 be integers

satisfying P

i∈[ℓ]di≤(1−3ε)n. Let G be edge-coloured so that no two edges touching the same vertexvi are

coloured the same.

Then, G contains vertex-disjoint stars S1, . . . , Sℓ so that, for each i∈[ℓ],Si is a star rooted at vi withdi

leaves, and_∪i∈[ℓ]Si is rainbow.

Proof. Letm=_⌈ε−1

⌉+ 1. Choose disjoint collectively-rainbow starsS1, . . . , Sℓ inGso that, for eachi∈[ℓ],

Si is rooted atvi and has at most di+m leaves, andPi∈[ℓ]|Si| is maximised. Either these stars satisfy the

lemma, or, without loss of generality, S1 has fewer than d1 leaves. Supose then, that S1 has fewer than d1

leaves.

LetA0=∅. Deﬁne the setsC1⊂C2⊂. . .⊂Cm⊂C(G) and A1⊂A2 ⊂. . .⊂Am recursively by letting

Cj be the set of colours inC(G) which do not appear on (∪i∈[ℓ]Si)−Aj−1 and lettingAj =NCj(v1). Claim 8.2. For eachs_∈[m],As⊂ ∪i∈[ℓ]V(Si).

Proof. Suppose, for contradiction, that there is some u1 ∈ As with u1 6∈ ∪i∈[ℓ]V(Si). By the deﬁnition of

As, we havec(v1u1) = c1 for some colour c1 ∈ Cs. Let u1, . . . , uj ∈V(G) andc1, . . . , cj ∈C(G) be longest

sequences of distinct vertices and colours satisfying the following properties. (1) c(v1uk) =ck fork= 1, . . . , j.

(2) c(ukvr) =ck−1 fork= 2, . . . , j, for somevrwithvruk∈E(Sr).

(3) For eachk= 1, . . . , j,ck ∈Cs+1−k.

We claim that j = s, so that cj ∈ C1 and the colour cj does not appear on ∪i∈[ℓ]Si. Suppose otherwise.

Then, by (3) we have cj ∈ Cs+1−j. By the deﬁnitions of Cs+1−j and C1, there is some r and some edge uvr ∈ Si∈[ℓ]E(Si) with u ∈ As−j and c(uvr) = cj. By the deﬁnition of As−j, the edgev1u is present and

has c(v1u) ∈ Cs−j. Notice that since cj =6 c1, . . . , cj−1 (by (1)) and Sℓ_i₌₁Si is rainbow, this implies that

u_{6∈ {}u1, . . . , uj} (by (2)). Since the colours of the edges atv1 are all distinct, we have c(v1u)6∈ {c1, . . . , cj}

(using (1)). Letcj+1=c(v1u) and uj+1=u. This is a longer sequence satisfying (1) – (3), contradicting the

maximality ofj.

Note that (3) implies that j _≤m+ 1. For each i _≥2, let S′

i =Si− {u2, . . . , uj} and let S1′ be the star

formed fromS1by adding the edgesv1u1, . . . , v1uj. In total, to get the starsSi′,i∈[ℓ], we added an edgev1uj

of colourcj, a colour not appearing on ∪i∈[ℓ]Si, and, for each colourci, i∈ {1, . . . , j−1} we removed some

edge (adjacent toui+1) of that colour from some starSi′ and then added an edge of that colour atv1 (v1ui).

Thus, the new stars are still collectively rainbow. AsS1had fewer thand1leaves,S1′ has at mostd1+mleaves.

Therefore, asP

i∈[ℓ]|Si′|= 1 + P

i∈[ℓ]|Si|, this violates the choice of the starsSi,i∈[ℓ].

For eachj_∈[m], we have by Claim 8.2, that

|C(G)_\Cj|=| ∪i∈[ℓ]E(Si)| − |Aj−1| ≤

X

i∈[ℓ]

(di+m)− |Aj−1| ≤n−3εn+ℓm− |Aj−1| ≤n−2εn− |Aj−1|.

Asv1has at least (1−ε)nneighbours, and thus is adjacent to edges of at least (1−ε)ndiﬀerent colours, we

then have, for eachj∈[m],

|Aj| ≥(1−ε)n− |C(G)\Cj| ≥ |Aj−1|+εn.

Therefore,_|Am| ≥m·εn > n, a contradiction.

(16)

Corollary 8.3. Let 0< ε <1/100andℓ≤ε2_n/₂_{. Let}_G_{be an} _n_{-vertex graph with minimum degree at least}

satisfyingP

i∈[ℓ]di≤(1−3ε)n, and supposeG has a locallyk-bounded edge-colouring.

Then, G contains disjoint starsS1, . . . , Sℓ so that, for each i∈[ℓ],Si is a star rooted atvi with di leaves,

and_∪i∈[ℓ]Si has at mostk edges of each colour.

Proof. For eachi∈[ℓ], recolour the edges next toviinGby, for each colourc∈C(G), letting the edges next to

vi with colourc have distinct colours from{(c,1), . . . ,(c, k)}. In this new colouring, no two edges of the same

colour can meet at some vertexvi. Applying Lemma 8.1, ﬁnd disjoint starsS1, . . . , Sℓso that, for eachi∈[ℓ],

Si is a star rooted atvi withdi leaves, and the stars are collectively rainbow in the new colouring. Observing

that each colour must appear at mostktimes on_∪i∈[ℓ]Si in the original colouring completes the proof.

We wish to ﬁnd vertex-disjoint collectively-rainbow stars in a locallyk-bounded colouring. We show they exist within the stars found by Corollary 8.3, as follows.

Corollary 8.4. Let 0< ε <1/100andℓ≤ε2_n/₂_{. Let}_G_{be an} _n_{-vertex graph with minimum degree at least}

satisfyingP

i∈[ℓ]di≤(1−3ε)n/k, and supposeGhas a locally k-bounded edge-colouring.

Then, G contains disjoint starsS1, . . . , Sℓ so that, for each i∈[ℓ],Si is a star rooted atvi with di leaves,

and_∪i∈[ℓ]Si is rainbow.

Proof. By Corollary 8.3, Gcontains disjoint starsS1, . . . , Sℓ so that, for each i∈[ℓ], Si is a star rooted atvi

with kdi leaves, and∪i∈[ℓ]Si has at mostk edges of each colour. Create an auxillary bipartite graphH with

vertex classes [ℓ] andC(G), where there is an edge betweeni∈[ℓ] andc∈C(G) exactly when there is an edge with colourcin Si.

Given any setA_⊂[ℓ], there areP

i∈Akdiedges in∪i∈ASi, and therefore at leastPi∈Adi diﬀerent colours.

Thus, for any set A _⊂ [ℓ], we have _|NH(A)| ≥ Pi∈Adi. As Hall’s generalised matching condition (see, for

example, [8, Chapter 3, Corollary 11]) is satisﬁed, there is a collection of disjoint starsS′

i,i∈[ℓ], inH, where,

for eachi_∈[ℓ], the starS′

i is rooted atiand hasdi leaves.

For eachi_∈[ℓ], pick a neighbour inSi for each colour in Si′, and call the resulting starSi′′. The starsSi′′,

i∈[ℓ], then satisfy the corollary.

9 Rainbow connecting paths

In this section we prove that if we choose a random set X and a random set of colours C, then, with high probability, we can connect any small collection of pairs of vertices by collectively-rainbow vertex-disjoint paths of length 3, whose edges have colours in C and whose intermediate vertices are inX. Note that unlike some previous sections here we do assume that the random choices forX andC are completely independent. First we need the following simple proposition.

Proposition 9.1. Let 1/k, p, q _≫ 1/n > 0 and suppose Kn has a locally k-bounded colouring. Let X ⊂

V(Kn)andC⊂C(Kn)be subsets with each element chosen independently at random with probabilitypandq

respectively. Almost surely, each vertex has at leastpqn/2 colour-C neighbours inX.

Proof. For eachx_∈V(Kn), letdx be the number of colour-C neighbours inX. Note that dx is k-Lipschitz,

and E_d_x ₌_pq₍_n₋_{1). Thus, by Azuma’s inequality (Lemma 3.4) with} _t₌_pqn/_{3, we have} P₍_d_x_{< pqn/}₂₎_≤

2 exp(−p2_q2_n/₉_k2_{) =}_o₍_n−1_{). Thus,}_d

x≥pqn/2 for eachx∈V(Kn) with probability 1−o(1).

Lemma 9.2. Let 1/k, p_≫µ_≫1/n >0 and supposeKn has a locally k-bounded colouring. LetX ⊂V(Kn)

andC_⊂C(Kn)be subsets with each element chosen independently at random with probabilityp. Almost surely,

for each pair of distinct vertices u, v _∈V(Kn)\X there are at least µn internally vertex-disjoint collectively

C-rainbowu, v-paths with length 3 and internal vertices in X.

Proof. Create a random partitionC=C1∪C2 by assigning each element to a class uniformly at random. By

(17)

Q1 Each vertex has at least 100k2_µ1/3_n_colour_C

1-neighbours inX.

Note that 10kµ1/3_n

≥n3/4_{. Almost surely, by Lemma 5.1, we have the following property.}

Q2 Between every pair of disjoint subsets A, B _⊂ V(Kn) with |A|,|B| ≥ 10kµ1/3n there are at least

p|A||B|/2≥4kµn2colour-C2edges.

Suppose then, for contradiction, there are some pair of distinct vertices u, v _∈ V(Kn) and at most µn

internally vertex-disjoint collectively-C-rainbow u, v-paths with length 3 and internal vertices inX. Fixing a maximal set of such paths,_P, letU _⊂X be their set of internal vertices andC′ _{their set of edge colours. Note}

that|U| ≤2µnand|C′_{| ≤}₃_µn_.

By Q1, we have_|NC1\C′(u, X\U)| ≥100k

2_µ1/3_n

−2µn₋3kµn_≥10kµ1/3_n_{. Let} _A

⊂NC1\C′(u, X\U)

satisfy_|A_|= 10kµ1/3_n_{, and let}_C′′ _{be the set of colours between}_u_and_A_{. Using}_Q1_{again, we have}

|NC1\(C′∪C′′)(v, X\(U∪A))| ≥100k

2_µ1/3_n

−2µn₋3kµn_{− |}A_{| −}k_|A_{| ≥}10kµ1/3n.

LetB_⊂NC1\(C′∪C′′)(v, X\(U∪A)) satisfy|B|= 10kµ

1/3_n_{. By}_Q2_{, there are at least 4}_kµn2_C

2-edges between AandB, at mostkn_{· |}C′

| ≤3kµn2 _{of which have their colour in}_C′_{. Thus, there is some} _x

∈Aandy_∈B so

thatuxyv is a (C\C′_{)-rainbow path with internal vertices in} _X_\_U_{. This contradicts the choice of}_P_.

10 Almost-spanning trees

We have now developed the tools that we will need to take an almost-spanning treeTand embed it into a locally

k-bounded edge-colouredKn. We will do this using (carefully chosen) random partitionsV(Kn) =X0∪. . .∪Xℓ

and C(Kn) =C0∪. . .∪Cℓ, with both ℓ and the distributions depending onT. The vertices in X0 and the

colours inC0are used at various points in the embedding to ﬁnd small parts of the treeT. This will be possible

greedily, but to ease the checking for these parts of the proof we give appropriate embedding results for small trees (or forests) in Section 10.1. The two random partitions will not be independent, and depend both onT

and each other, and on certain almost-sure properties holding. This complicates our use of the probabilistic method to ﬁnd the partition we use, and so we explain these particulars carefully in Section 10.2. We then put together our proof of Theorem 1.1 in Section 10.3.

10.1 Embedding small parts of

_T

When there are many spare colours and vertices, we can construct small rainbow trees, ﬁnish rainbow matchings and ﬁnd collectively-rainbow vertex-disjoint connecting paths, as we do in the following three propositions.

Proposition 10.1. Suppose we have anm-vertex tree T and a graph Gwith a locally k-bounded colouring in which δ(G)≥3km. Then, there is a rainbow copy ofT inG.

Proof. LetT′ _{be a maximal subtree of}_T _{which has a rainbow copy,}_S′ _{say, in} _G_{. Suppose, for contradiction,}

that_|S′

|< mso thatS′ _{has a vertex,}_s_{say, to which a leaf can be appended to ﬁnd a copy of a larger subtree}

thanT′_in_T_{. The edges of}_S′_{have at most}_m_{colours collectively, so}_s_{has at most}_km_colour-_C₍_S′_{) neighbours.}

Thus, smust have at least 3km_{− |}S′

| −km > 0 colour-(C(G)_\C(S′_{)) neighbours in} _V₍_G₎

\V(S′_{). Such a}

neighbour allowsS′ _{to be extended to a larger rainbow copy of a subtree of}_T _than_S′_{, a contradiction.}

Proposition 10.2. Suppose we have a graph Gwith a locallyk-bounded colouring and disjoint sets X, Y, Z_⊂ V(G)and disjoint sets of coloursC, C′_⊂_C₍_G₎_{, such that there is a rainbow}_C_{-matching with at least}_|_X_{| −}_m

edges fromX intoY, and each vertex inGhas at least 2kmcolour-C′ _{neighbours in}_Z_.

Then, there is a(C_∪C′₎_{-rainbow matching with}

|X_|edges fromX intoY_∪Zwhich uses at most mcolours inC′ _{and at most}_m_{vertices in} _Z_.

Proof. LetM0 be a C-rainbow matching with |X| −m edges fromX into Y. Greedily, pick matchingsM0 ⊂ M1⊂. . . ⊂Mmso that, for eachi∈ {0, . . . , m},Mi is a (C∪C′)-rainbow matching with|X| −m+iedges