RETRACTED ARTICLE: Strong limiting behavior in binary search trees

(1)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

R E S E A R C H

Open Access

Strong limiting behavior in binary search tress

Peishu Chen

*

_{and Weicai Peng}

A retraction article was published for this article. It is available from the following link: http://www.journaloﬁnequalitiesandapplications.com/content/2013/1/416.

*_{Correspondence:} [email protected] Department of Mathematics, Chaohu University, Chaohu, 238000, P.R. China

Abstract

In a binary search tree of sizen, each node has no more than two children, we denote the number of the node withkchildren by

ξ

n,k. In this paper, we study the strong limit behavior of the random variables

ξ

n,kand

σ

n,m, where

σ

n,mrepresents the number of subtrees of sizem. The results can imply some known results.

MSC: 60F05; 05C80

Keywords: binary search tree; strong limit properties; nodes; subtrees

1 Introduction

A binary tree is either empty or composed of a root node together with left and right subtrees which are themselves binary trees. A binary search treeTfor a set of keys from a total order, say{, , . . . ,n}, is a binary tree in which each node has a key value and all the keys of the left subtree are less than the key at the root, and all the keys of the right subtree are greater than the key at the root,i.e., the ﬁrst key is associated with the root, the next key is placed in the left child of the root if it is smaller than the key of the root and it is sent to the right child of the root if it is larger than the key of the root. In this way, we proceed further by inserting key by key. This property holds recursively for the left and right subtrees of the treeT.

Usually, it is assumed that every permutation of{, , . . . ,n}is equally likely and has the same probability /n!. Hence, any parameter of the binary search trees may be considered as a random variable.

The binary tree model turns out to be appropriate in formal language theory, com-puter algebra,etc., whereas the binary search tree model is of importance in sorting and searching algorithms and a lot of combinatorial algorithms. See,e.g., [–] for a detailed description. There are several papers devoted to the study of properties of the param-eters. Kirschenhofer (see []) considered the height of leaves; Panholzer and Prodinger (see []) studied the number of ascendants and the number of descendants of any ﬁxed node. Devroye and Neininger (see []) obtained the tail bounds and the order of higher moments for the path distance between any couples of nodes. Mahmoud and Neininger (see []) arrived at a Gaussian limit law for the distance between randomly selected pairs of nodes in random binary search trees and identiﬁed the rate of convergence. Svante (see []) got the exact and asymptotic formulas for moments and an asymptotic distribution

(2)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

for the diﬀerence between the left and right total pathlenghts. Devroye (see []) analyzed the properties of some parameter in binary search trees by applying the Stein’s method. There were also many authors, who were interested in the height of binary search trees and drew a variety of properties such as the asymptotic expected value, the variance and the limiting distribution of the height (see [–]). Prodinger (see []) computed the probability that a random binary tree withnnodes hadinodes with  children. Rote (see []) gave three combinatorial proofs for the number of the binary trees having a given number of nodes with , , and  children. Liuet al.(see []) have studied the limiting theorems for the nodes in binary search trees. Suet al.(see []) have studied some limit properties on the subtrees of random binary search trees.

LetTndenote a random binary search tree of sizen. In the binary search tree, every node has two children at most, we denote the number of the node withk (= , , ) children byξn,k. Letσn,mbe the number of subtrees of sizeminTn. In [], Liuet al.have studied the limit properties ofξn,kandσn,min the sense of probably. In this paper, by computing the exact expression of the fourth moment ofξn,k andσn,m, we obtain the strong limit properties (in the sense of a.e.) of them by the Chebyshev inequality and the Borel-Cantelli lemma. Obviously, the results can imply the case in [].

InTn, we haveσn,=ξn,, σn,k=  asn<k, andσk,k= . LetLn andRnbe the size of the left and right subtrees ofTn. It is clear thatLn+Rn=n– . From the procession of constructing binary search trees, we can see that LnandRn are two random variables with uniform distribution on{, , , . . . ,n– }. We writeX=DY ifXandY have the same distribution. It is easy to ﬁnd that

Rn=n–  –Ln D =Ln.

In the following, we will show several properties ofξn,k, andσn,m, the proofs can be seen

in [] and [].

Lemma (see []) Letξn,be the number of nodes having no children in a binary search

tree of size n.For any positive integer n≥,we have

ξn,

D

=ξLn,+ξn*––Ln,, ()

whereξLn,

D

=ξ_n*_––_L_n,,and the two random variablesξLn,andξn*––Ln,are the independent

conditioning on Ln.

Lemma (see []) In a binary search tree of size n,when n≥,we have

Eξn,=Eξn,=

n+ 

 , Eξn,= n– 

 ,

and

Varξn,=Varξn,=

n+ 

 , Varξn,= (n+ )

 .

When n>m,

Eσn,m=

(n+ )

(3)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

when n> m,

Varσn,m=

m(m– )(n+ )

(m+ )(m+ )_(_m_{+ )}:= (n+ )d 

m, ()

where EX andVarX denote the expectation and variance of X,respectively.

In the following, by computing the fourth moment ofξn,kandσn,m, we obtain the strong

limit properties (in the sense of a.e.) of them by the Chebyshev inequality and the Borel-Cantelli lemma.

2 Main result

Theorem  In a binary search tree of size n,for any integer k= , , ,we have

lim n→∞

ξn,k n =



 a.e. ()

Proof From the Theorem  of [], we already haveξn,=ξn,+  andξn,+ξn,+ξn,=n.

Then we have by Lemma 

E

ξn,–

n+  



=E

n+  – ξn,–

n+  



= E

ξn,–

n+  



, ()

E

ξn,–

n–  



=E

ξn,–  –

n–  



=E

ξn,–

n+  



. ()

In the following, we will show the exact expression ofE[ξn,–n+_ ], then we can get ()

by the Borel-Cantelli lemma.

E

ξn,–

n+  



=E

ξLn,+ξn*––Ln,–

n+  



=E _n_–

j=

ξj,–

j+   +ξ

*

n–j–,–

n–j 



P(Ln=j)

= n

n–

j=

E

ξj,–

j+   +ξ

*

n–j–,–

n–j 



= n

n–

j=

E

ξj,–

j+  



+ E

ξj,–

j+  



ξ_n*_–_j_–,–n–j 

+ E

ξj,–

j+  



ξ_n*_–_j_–,–n–j 



+ E

ξj,–

j+  

ξ_n*_–_j_–,–n–j 



+E

ξ_n*_–_j_–,–n–j 



(4)

RE TR A C TED 30th A UGUST 2013 doi: 10.1186/1029-242X -2013-416

By the independence ofξj,andξn*–j–,, andξj,D=ξn*–j–,, we have

E

ξj,–

j+  



ξ_n*_–_j_–,–n–j 

=E

ξj,–

j+  



E

ξ_n*_–_j_–,–n–j 

= , ()

E

ξj,–

j+  

ξ_n*_–_j_–,–n–j 



=E

ξj,–

j+  

E

ξ_n*_–_j_–,–n–j 



= . ()

Byξj,=Dξn*–j–,, (), (), and (), we have

E

ξn,–

n+    = n n– j= E

ξj,–

j+    + n n– j= E

ξj,–

j+  



E

ξ_n*_–_j_–,–n–j   = n n– j= E

ξj,–

j+    + n n– j=

(j+ )(n–j)

 . ()

Using () once again, we arrive at

E

ξn,–

n+    = n n– j= E

ξj,–

j+  



+ nE

ξn–,–

n   + n n– j=

(j+ )(n–j)  = n n– j= E

ξj,–

j+    + n n– j=

(j+ )(n–  –j)  +

 nE

ξn–,–

n   – n n– j=

(j+ )(n–  –j)  +  n n– j=

(j+ )(n–j) 

=n–  n

 n– 

n–

j=

E

ξj,–

j+  



+  n– 

n–

j=

(j+ )(n–  –j) 

+ nE

ξn–,–

n   + n n– j=

j+   +

 

=n+  n E

ξn–,–

n 



+n+  .

Hence, we have

E[ξn,–n+_ ]

n+  =

E[ξn–,–n_]

n +



. ()

By (), we obtain

E[ξn,–n+_ ]

n+  =· · ·=

E[ξ,– ]

 +

n–   =

n– 

 . ()

By (),

E

ξn,

n+ –  



= n–  (n+ ) ≤



(5)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

By the Chebyshev inequality and (), for arbitraryε> , we have

Pξn, n+ –

  ≥ε

≤E[

ξn,

n+ –  ]



ε ≤

 (n+ )_ε.

Then

∞

n=

Pξn, n+ –

  ≥ε

<∞. ()

By the Borel-Cantelli lemma and (), for arbitraryε> , we have

P

n≥k ξn,

n+ –   ≥ε

→ (k→ ∞). ()

By (), we have () holds in the casek= ; by (), (), and (), we have () holds in the casek=  andk= . This completes the proof.

In the following, we will show the strong limit property ofσn,min a random binary search tree of sizen.

Theorem  Let n,m be two positive integers,σn,mbe the number of subtrees of size m(<n) in random binary search trees of size n.Then

lim n→∞

σn,m n =



(m+ )(m+ ) a.e. ()

Proof Whenj<m,σj,m≡;σm,m= ; whenn>m, by () and (), we have

E

σn,m–

(n+ ) (m+ )(m+ )



=E

σLn,m–

(Ln+ ) (m+ )(m+ )

+

σ_n*_––_L_n,_m– (Rn+ ) (m+ )(m+ )



= n

n–

j=n–m E

σj,m–

(j+ ) (m+ )(m+ )

+

 – (n–j) (m+ )(m+ )



+  n

n–m–

j=m E

σj,m–

(j+ ) (m+ )(m+ )

+

σ_n*_––_j_,_m– (n–j) (m+ )(m+ )



= n

n–

j=m E

σj,m–

(j+ ) (m+ )(m+ )



+ n

n–m–

j=m E

σj,m–

(j+ ) (m+ )(m+ )



E

σ_n*_––_j_,_m– (n–j) (m+ )(m+ )



– n

n–

j=n–k

(n–j) (k+ )(k+ )E

σj,k–

(j+ ) (k+ )(k+ )

(6)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

+ n

n–

j=n–m

(n–j) (m+ )(m+ )



E

σj,m–

(j+ ) (m+ )(m+ )



+ n

n–

j=n–m

(n–j) (m+ )(m+ )



=n+  n E

σn–,m–

n (m+ )(m+ )



+d



m n

_n_–

j=m

(j+ )(n–j) – n–

j=m

(j+ )(n–  –j)

– 

n(m+ )(m+ ) _n_–

j=n–m (n–j)E

σj,m–

(j+ ) (m+ )(m+ )



– n–

j=n––m

(n–  –j)E

σj,m–

(j+ ) (m+ )(m+ )



+d



m n

_n_–

j=n–m

(j+ ) (m+ )(m+ )



(j+ ) – n–

j=n––m

(j+ ) (m+ )(m+ )



(j+ )

. ()

The second term of (),

d

m n

_n_–

j=m

(j+ )(n–j) – n–

j=m

(j+ )(n–  –j)

= (n– m– )d_m+(n–m)(m+ )

n d



m

=O(n+ ), ()

wherexn=O(n) means thatxn/n→constant asn→ ∞. Then byσn––m,m/n≤, the third term of (),

 n

_n_–

j=n––m

n–  –j (m+ )(m+ )E

σj,m–

(j+ ) (m+ )(m+ )



– n–

j=n–m

n–j (m+ )(m+ )E

σj,m–

(j+ ) (m+ )(m+ )



= –

n(m+ )(m+ )

× _n_–

j=n–m E

σj,m–

(j+ ) (m+ )(m+ )



–E

σn–,m–

n (m+ )(m+ )



+ m n(m+ )(m+ )E

σn––m,m–

(n–m) (m+ )(m+ )



= 

(m+ )(m+ ) n–

j=n–m E

–σj,m

n +

(j+ ) n(m+ )(m+ )

σj,m–

(j+ ) (m+ )(m+ )



+ 

(m+ )(m+ )E

–σn–,m

n +

 (m+ )(m+ )

σn–,m–

n (m+ )(m+ )



+ 

(7)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

×E

σn––m,m

n –

(n–m) n(m+ )(m+ )

σn––m,m–

(n–m) (m+ )(m+ )



<  (m+ )(m+ )

n–

j=n–m

 + (j+ ) n(m+ )(m+ )

E

σj,m–

(j+ ) (m+ )(m+ )



+ 

(m+ )(m+ )E

 + 

(m+ )(m+ )

σn–,m–

n (m+ )(m+ )



+ 

(m+ )(m+ )E

σn––m,m n – 

σn––m,m–

(n–m) (m+ )(m+ )



< d



m [(m+ )(m+ )]

n–

j=n–m (j+ )

n +

nd_m [(m+ )(m+ )]+

(n–m)d_m (m+ )(m+ )

=O(n+ ). ()

The last term of (),

d

m n

_n_–

j=n–m

(j+ ) (m+ )(m+ )



(j+ ) – n–

j=n––m

(j+ ) (m+ )(m+ )



(j+ )

=d



m n

n (m+ )(m+ )



n–

(n–m) (m+ )(m+ )



(n–m)

<d



m[n– (n–m)]

n[(m+ )(m+ )] =O(n+ ). ()

By (), (), (), and (), we have

E

σn,m–

(n+ ) (m+ )(m+ )



<n+  n E

σn–,m–

n (m+ )(m+ )



+O(n+ ). ()

Repeating (), we have

E[σn,m–₍_m(_+)(n+)_m_+)] n+  <· · ·<

E[σm+,m–₍_m(_+)(m_m+)_+)]

m+  +O(). ()

By (), fornlarge enough and ﬁxedm, we have

E

σn,m n+ –

 (m+ )(m+ )



<O

 n

. ()

By the Chebyshev inequality, the Borel-Cantelli lemma and (), for arbitraryε> , we have

P n≥m

σn,m n+ –

 (m+ )(m+ )

≥ε

→ (m→ ∞). ()

(8)

RE

TR

A

C

TED

30th

A

UGUST

2013

doi:

10.1186/1029-242X

-2013-416

Competing interests

The author declares that they have no competing interests.

Acknowledgements

This work is supported by Anhui Scientiﬁc Research Foundation (KJ2012A205, KJ2012B117, KJ2013B162) and Chaohu University Scientiﬁc Research Foundation.

Received: 14 June 2012 Accepted: 6 November 2012 Published: 20 February 2013

References

1. Kemp, R: Fundamentals of the Average Case Analysis of Particular Algorithms. Wiley-Teubner, Stuttgart (1984) 2. Knuth, D: The Art of Computer Programming: Sorting and Searching, vol. 3. Addison-Wesley, Reading (1973) 3. Mahmoud, H: Evolution of Random Search Trees. Wiley, New York (1992)

4. Sedgewick, R, Flajolet, P: An Introduction to the Analysis of Algorithms. Addison-Wesley, Reading (1996) 5. Kirschenhofer, P: On the height of leaves in binary trees. J. Comb. Inf. Syst. Sci.8, 44-60 (1983)

6. Panholzer, A, Prodinger, H: Descendants and ascendants in binary trees. Discrete Math. Theor. Comput. Sci.1, 247-266 (1997)

7. Devroye, L, Neininger, R: Distances and ﬁnger search in random binary search trees. SIAM J. Comput.33, 647-658 (2004)

8. Mahmoud, H, Neininger, R: Distribution of distances in random binary search trees. Ann. Appl. Probab.13(1), 253-276 (2003)

9. Svante, J: Left and right pathlenghts in random binary trees. Algorithmica46, 419-429 (2006)

10. Devroye, L: Applications of Stein’s method in the analysis of random binary search trees. In: Chen, L, Barbour, A (eds.) Stein’s Method and Applications. Institute for Mathematical Sciences Lecture Notes Series, vol. 5, pp. 247-297. World Scientiﬁc, Singapore (2005)

11. Drmota, M: An analytic approach to the height of binary search trees. J. Assoc. Comput. Mach.29, 89-119 (2001) 12. Drmota, M: An analytic approach to the height of binary search trees II. J. Assoc. Comput. Mach.50, 333-374 (2003) 13. Devroye, L: A note on the height of binary search trees. J. Assoc. Comput. Mach.33, 489-498 (1986)

14. Devroye, L: Branching processes in the analysis of the height of trees. Acta Inform.24, 277-298 (1987) 15. Drmota, M: The variance of the height of binary search trees. Theor. Comput. Sci.270, 913-919 (2002)

16. Flajolet, P, Martinez, C, Gourdon, X: Patterns in random binary search trees. Random Struct. Algorithms11, 223-244 (1997)

17. Robson, JM: On the concentration of the height of binary search trees. In: ICALP 97 Proceedings. Lecture Notes in Computer Science, vol. 1256, pp. 441-448 (1997)

18. Robson, JM: Constant bounds on the moments of the height of binary search trees. Theor. Comput. Sci.276, 435-444 (2002)

19. Su, C, Liu, J, Feng, QQ: A note on the distance in random recursive trees. Stat. Probab. Lett.76, 1748-1755 (2006) 20. Su, C, Liu, J, Hu, ZS: The law of large numbers of the size of complete interval trees. Adv. Math. China36(2), 181-188

(2007) (in Chinese)

21. Prodinger, H: A note on the distribution of the three types of nodes in uniform binary trees. Sémin. Lothar. Comb.38, 38b (1996) (electronic)

22. Rote, G: Binary trees having a given number of nodes with 0, 1, and 2 children. Sémin. Lothar. Comb.38, 38b (1997) (electronic)

23. Liu, J, Su, C, Chen, Y: Limiting barticle for the nodes in binary search trees. Sci. China Ser. A, Math.51(1), 101-114 (2008)

24. Su, C, Miao, BQ, Feng, QQ: On the subtrees of random binary search trees. Chin. J. Appl. Probab. Stat.22(3), 304-310 (2006) (in Chinese)

doi:10.1186/1029-242X-2013-60