Elliptic integrals, the arithmetic-geometric mean and the Brent-Salamin algorithm for π

(1)

Elliptic integrals, the arithmetic-geometric mean and the

Brent-Salamin algorithm for

π

Notes by G.J.O. Jameson

Contents

1. The integrals I(a, b) and K(b) 2. The arithmetic-geometric mean

3. The evaluation ofI(a, b): Gauss’s theorem

4. Estimates for I(a,1) for large a and I(1, b) for smallb 5. The integrals J(a, b) and E(b)

6. The integrals L(a, b)

7. Further evaluations in terms of the agm; the Brent-Salamin algorithm 8. Further results on K(b) and E(b).

1. The integrals I(a, b) and K(b) Fora, b >0, define I(a, b) = Z π/2 0 1 (a2_cos2_θ₊_b2_sin2_θ)1/2 dθ. (1) An important equivalent form is:

1.1. We have I(a, b) = Z ∞ 0 1 (x2₊_a2₎1/2_(x2₊_b2₎1/2 dx. (2)

Proof. With the substitution x=btanθ, the stated integral becomes

Z π/2 0 bsec2_θ (a2₊_b2_tan2_θ)1/2_b_sec_θ dθ = Z π/2 0 1 (a2₊_b2_tan2_θ)1/2_cos_θ dθ = Z π/2 0 1 (a2_cos2_θ₊_b2_sin2_θ)1/2 dθ. We list a number of facts that follow immediately from (1) or (2):

(I1) I(a, a) = π 2a; (I2) I(b, a) =I(a, b); (I3) I(ca, cb) = 1

cI(a, b), hence aI(a, b) = I(1, b/a);

(I4) I(a, b) decreases with a and with b; (I5) if a≥b, then π

2a ≤I(a, b)≤ π 2b.

(2)

Note that the substitutionx= 1/y in (2) gives I(a, b) = Z ∞ 0 1 (1 +a2_x2₎1/2_{(1 +}_b2_x2₎1/2 dx. (3) 1.2. We have I(a, b)≤ π 2(ab)1/2.

Proof. This follows from the integral Cauchy-Schwarz inequality, applied to (2). Note further that, by the inequality of the means,

π 2(ab)1/2 ≤ π 4 1 a + 1 b .

This upper bound forI(a, b) is also obtained directly by applying the inequality of the means to the integrand in (2).

Clearly,I(a,0) is undefined, and one would expect that I(a, b)→ ∞ as b →0+. This is, indeed, true. In fact, the behaviour ofI(1, b) forb close to 0 is quite interesting; we return to it in section 4.

The “complete elliptic integral of the first kind” is defined by K(b) =

Z π/2

0

1

(1−b2_sin2_θ)1/2 dθ. (4)

For 0 ≤b ≤ 1, the “conjugate”b∗ is defined by b∗ = (1−b2₎1/2_{, so that} _b2₊_b∗2 _{= 1. Then} cos2_θ₊_b2_sin2_θ _{= 1}₋_b∗2_sin2_{θ, so that} _{I(1, b) =}_K_(b∗_).

(An “incomplete” elliptic integral is one in which the integral is over a general interval [0, α]. We do not consider such integrals in these notes.)

Clearly,K(0) = π₂ and K(b) increases with b. Another equivalent form is:

1.3. We have K(b) = Z 1 0 1 (1−t2₎1/2₍₁₋_b2_t2₎1/2 dt. (5)

Proof. Denote this integral byI. Substituting t = sinθ, we have I =

Z π/2

0

1

cosθ(1−b2_sin2_θ)1/2 cosθ dθ =K(b).

1.4PROPOSITION. For 0≤b <1, 2 πK(b) = ∞ X n=0 d2nb2n = 1 +1₄b2+ ₆₄9b4+· · ·, (6)

(3)

where d0 = 1 and

d2n=

12_.32_{. . .}_(2n₋₁₎2 22_.42_{. . .}_(2n)2 .

Proof. By the binomial series,

(1−b2sin2θ)−1/2 = 1 + ∞ X n=1 a2nb2nsin2nθ, where a2n = (−1)n ₋1 2 n = 1 n!( 1 2)( 3 2)· · ·(n− 1 2) = 1.3. . . .(2n−1) 2.4. . . .(2n) . For fixedb <1, the series is uniformly convergent for 0≤θ≤ π

2. Now Rπ/2 0 sin 2n_{θ dθ}₌ π 2a2n: identity (6) follows.

In a sense, this series can be regarded as an evaluation of I(a, b), but a more effective one will be described in section 3.

1.5COROLLARY. For 0≤b <1,

1 + 1₄b2 ≤ 2

πK(b)≤1 + b2 4(1−b2₎.

Proof. Just note that 1 +1₄b2 ≤P∞

n=0d2nb2n≤1 + 1₄(b2+b4+. . .).

Hence 2_πK(b) ≤ 1 + 1₂b2 _{for 0} _≤ _b _≤ _√1

2. These are convenient inequalities for K(b), but when translated into an inequality for I(1, b) = K(b∗), the right-hand inequality says

2

πI(1, b)≤1 + (1−b

2_)/(4b2_{) =} 3

4 + 1/(4b

2_{), and it is easily checked that this is weaker than} the elementary estimate _π4I(1, b)≤1 + 1/b given above.

The special case I(√2,1)

The integral I(√2,1) equates to a beta integral, and consequently to numerous other integrals. We mention some of them, assuming familiarity with the beta and gamma func-tions. 1.6PROPOSITION. We have I(√2,1) = Z 1 0 1 (1−x4₎1/2 dx (7) = Z ∞ 1 1 (x4 ₋₁₎1/2 dx (8) = 1₄B(1₄,1₂) (9) = Γ( 1 4) 2 4√2π. (10)

(4)

Proof. For (7), substitutex= sinθ, so that 1−x4 = cos2θ(1 + sin2θ). Then Z 1 0 1 (1−x4₎1/2 dx= Z π/2 0 1 (1 + sin2θ)1/2 dθ = Z π/2 0 1 (cos2_θ_{+ 2 sin}2_θ)1/2 dθ =I( √ 2,1). Substituting x= 1/y, we also have

Z 1 0 1 (1−x4₎1/2 dx = Z ∞ 1 1 (1−y−4₎1/2 1 y2 dy = Z ∞ 1 1 (y4 ₋₁₎1/2 dy. Substituting x=t1/4 in (7), we obtain I(√2,1) = Z 1 0 1 (1−t)1/2( 1 4t −3/4 )dt= 1₄B(1₄,1₂).

Now using the identities B(a, b) = Γ(a)Γ(b)/Γ(a+b) and Γ(a)Γ(1−a) = π/(sinaπ), we have Γ(1₂) = √π and Γ(1₄)Γ(3₄) =π/(sinπ₄) = π√2, hence

B(1₄,1₂) = Γ( 1 4)Γ( 1 2) Γ(3₄) = √ πΓ(1₄)2 π√2 = Γ(1₄)2 √ 2π . 1.7. We have Z π/2 0 1 (sinθ)1/2 dθ = 2I( √ 2,1), (11) Z 1 0 (1−x4)1/2 dx= Z 1 0 (1−x2)1/4dx = 2₃I(√2,1). (12)

Proof. For (11), substitute x = (sinθ)1/2 in (7). Denote the two integrals in (12) by I1, I2. The substitutions x= t1/4 and x = t1/2 give I1 = ₄1B(1₄,3₂) and I2 = 1₂B(1₂,5₄). The identity (a+b)B(a, b+ 1) =bB(a, b) now gives I1 =I2 = 1₆B(1₄,1₂).

2. The arithmetic-geometric mean

Given positive numbers a, b, the arithmetic mean is a1 = 1₂(a+b) and the geometric

mean is b1 = (ab)1/2. If a=b, then a1 =b1 =a. If a > b, thena1 > b1, since 4(a2₁−b2₁) = (a+b)2−4ab= (a−b)2 >0.

Witha > b, consider the iteration given by a0 =a, b0 =b and an+1 = 1₂(an+bn), bn+1 = (anbn)1/2.

At each stage, the two new numbers are the arithmetic and geometric means of the previous two. The following facts are obvious:

(5)

(M1) an > bn;

(M2) an+1 < an and bn+1 > bn;

(M3) an+1−bn+1 < an+1−bn = 1₂(an−bn), hence an−bn< ₂1n(a−b).

It follows that (an) and (bn) converge to a common limit, which is called the arithmetic-geometric mean of a and b. We will denote it byM(a, b).

Convergence is even more rapid than the estimate implied by (M3), as we will show below. First, we illustrate the procedure by the calculation of M(√2,1) and M(100,1):

n bn an 0 1 1.414214 1 1.189217 1.207107 2 1.198124 1.198157 3 1.198140 1.198140 n bn an 0 1 100 1 10 50.5 2 22.4722 30.25 3 26.0727 26.3611 4 26.2165 26.2169 5 26.2167 26.2167

Of course,anandbnnever actually become equal, but more decimal places would be required

to show the difference at the last stage shown. In fact, in the first example,a3−b3 <4×10−10, as we will see shortly.

Clearly, M(a, b) = M(an, bn) and bn < M(a, b) < an for all n. Also, M(ca, cb) =

cM(a, b) for c >0, so M(a, b) =aM(1, b/a) = bM(a/b,1).

For further analysis of the process, definecn by a2n−b2n=c2n. 2.1. With cn defined in this way, we have

(M4) cn+1 = 1₂(an−bn), (M5) an =an+1+cn+1, bn=an+1−cn+1, (M6) c2 n= 4an+1cn+1, (M7) cn+1 ≤ c2 n 4b.

Proof. For (M4), we have

c2_n₊₁ =a2_n₊₁−b2_n₊₁ = 1₄(an+bn)2−anbn= 1₄(an−bn)2.

(M5) follows, since an+1 = 1₂(an +bn), and (M6) is given by c2n = (an +bn)(an− bn) =

(2an+1)(2cn+1). Finally, (M7) follows from (M6), since an+1 > b. (M5) shows how to derive an and bn from an+1 and bn+1, reversing the agm iteration.

(6)

(M7) shows thatcn(hence also an−bn) converges to 0quadratically, hence very rapidly. We

can derive an explicit bound for cn as follows:

2.2. Let R >0 and let k be such that ck ≤4b/R. Then cn+k ≤ 4b/R2

n

for all n ≥0. The same holds with b replaced by bk.

Proof. The hypothesis equates to the statement for n = 0. Assuming it now for a general n, we have by (M7) cn+k+1 < 1 4b (4b)2 R2n+1 = 4b R2n+1.

In the case M(√2,1), we have c0 =b = 1. TakingR= 4, we deduce cn≤4/42

n

for all n. For a slightly stronger variant, note that c2 =a1−a2 < ₁₀₀1 , hence cn+2 ≤ 4/(4002

n ) = 4/(202n+1 ) forn ≥0, orcn≤4/(202 n−1 ) forn ≥2. In particular, c4 ≤4×20−8 <2×10−10. We finish this section with a variant of the agm iteration in the form of a single sequence instead of a pair of sequences.

2.3. Let b >0. Define a sequence (kn) by: k0 =b and kn+1 = 2kn1/2 1 +kn . Then M(1, b) = ∞ Y n=0 1 2(1 +kn).

Proof. Let (an), (bn) be the sequences generated by the iteration for M(1, b) (with

a0 = 1, b0 =b, whether or not b <1), and let kn =bn/an. Thenk0 =b and kn+1 = bn+1 an+1 = 2(anbn) 1/2 an+bn = 2k 1/2 n 1 +kn . So the sequence (kn) defined by this iteration is (bn/an). Also,

1 2(1 +kn) = an+bn 2an = an+1 an , so n−1 Y r=0 1 2(1 +kr) = an→M(1, b) as n→ ∞.

Quadratic convergence of (kn) to 1 is easily established directly:

1−kn+1 =

1−2k1n/2+kn

1 +kn

(7)

Another way to present this iteration is as follows. Assume now that b < 1. Recall that for 0 ≤k≤1, the conjugate k∗ is defined by k2+k∗2 = 1. Clearly, if t= 2k1/2/(1 +k) (where 0 ≤k ≤1), then t∗ = (1−k)/(1 +k). So we have k_n∗₊₁ = (1−kn)/(1 +kn), hence

1 +k∗_n₊₁ = 2/(1 +kn), so that M(1, b) = ∞ Y n=1 1 1 +k∗ n .

3. The evaluation of I(a, b): Gauss’s theorem

We now come to the basic theorem relatingI(a, b) and M(a, b). It was established by Gauss in 1799. There are two distinct steps, which we present as separate theorems. In the notation of section 2, we show first that I(a, b) =I(a1, b1). It is then a straightforward deduction that I(a, b) = π/[2M(a, b)]. Early proofs of the first step, including that of Gauss, were far from simple. Here we present a proof due to Newman [New1] based on an ingenious substitution.

3.1THEOREM. Let a≥b >0, and let a1 = ₂1(a+b), b1 = (ab)1/2. Then

I(a, b) = I(a1, b1). (1)

Proof. We write I(a1, b1) as an integral on (−∞,∞), using the fact that the integrand is even: I(a1, b1) = 1 2 Z ∞ −∞ 1 (x2₊_a2 1)1/2(x2+b21)1/2 dx. Substitute 2x=t− ab t . Then x→ ∞ ast → ∞and x→ −∞ as t→0+_{. Also,} dx

dt = 1 2(1 +ab/t 2_{) and} 4(x2+b2₁) = 4x2 + 4ab=t2+ 2ab+a 2_b2 t2 = t+ab t 2 , hence (x2+b2₁)1/2 = 1₂ t+ ab t . Further, 4(x2 +a2₁) = 4x2+ (a+b)2 =t2 +a2+b2+a 2_b2 t2 = 1 t2(t 2 +a2)(t2+b2).

(8)

Hence I(a1, b1) = 1 2 Z ∞ 0 t 1 2(t 2₊_a2₎1/2_(t2₊_b2₎1/2 1 2(t+ ab t) 1 2 1 + ab t2 dt = Z ∞ 0 1 (t2₊_a2₎1/2_(t2₊_b2₎1/2 dt = I(a, b).

Lord [Lo2] gives an interesting alternative proof based on the area in polar coordinates.

3.2THEOREM. We have

I(a, b) = π

2M(a, b) (2)

Proof. Let (an), (bn) be the sequences generated by the iteration for M(a, b). By

Theorem 3.1, I(a, b) = I(an, bn) for all n. Hence

π 2an

≤I(a, b)≤ π

2bn

for all n. Since both an and bn converge to M(a, b), the statement follows.

We describe some immediate consequences; more will follow later.

3.3. We have

π

a+b ≤I(a, b)≤ π

2(ab)1/2. (3)

Proof. This follows from Theorem 3.1, since π/(2a1)≤I(a1, b1)≤π/(2b1). The right-hand inequality in (3) repeats 1.2, but I do not know an elementary proof of the left-hand inequality.

From the expressions equivalent toI(√2,1) given in 1.6, we have:

3.4. Let M0 =M( √ 2,1) (≈1.198140). Then Z 1 0 1 (1−x4₎1/2 dx=I( √ 2,1) = π 2M0 (≈1.311029), B(1₄,1₂) = 2π M0 (≈5.244116), Γ(1₄) = (2π) 3/4 M₀1/2 (≈3.625610).

Example. From the valueM(100,1)≈26.2167 found in section 2, we have I(100,1)≈

(9)

Finally, we present several ways in which Theorem 3.1 can be rewritten in terms of K(b). Recall that I(1, b) = K(b∗), where b2 +b∗2 = 1, and that ifk = (1−b)/(1 +b), then k∗ = 2k1/2_{/(1 +}_b).

3.5. For 0≤b <1,

K(b) = I(1 +b,1−b). (4)

Proof. By Theorem 3.1,

I(1 +b,1−b) =I[1,(1−b2)1/2] =I(1, b∗) = K(b).

3.6. For 0< b <1, K(b) = 1 1 +b K 2b1/2 1 +b , (5) K(b∗) = 2 1 +b K 1−b 1 +b . (6)

Proof. By (4) and the propertyI(ca, cb) = 1_cI(a, b), K(b) = I(1 +b,1−b) = 1 1 +b I 1, 1−b 1 +b = 1 1 +b K 2b1/2 1 +b . Also, by Theorem 3.1, K(b∗) = I(1, b) =I 1 +b 2 , b 1/2 = 2 1 +b I 1, 2b 1/2 1 +b = 2 1 +b K 1−b 1 +b .

Conversely, any of (4), (5), (6) implies (1) (to show this for (4), start by choosing c and k so that ca= 1 +k and cb= 1−k). For another proof of Theorem 3.1, based on this observation and the series expansion of (1 +ke2iθ)−1/2)(1 +ke−2iθ)−1/2, see [Bor2, p. 12].

4. Estimates for I(a,1) for large a and I(1, b) for small b

In this section, we will show thatI(1, b) is closely approximated by log(4/b) forb close to 0. Equivalently, aI(a,1) is approximated by log 4a for large a, and consequently M(a,1) is approximated by

F(a) =: π 2

a log 4a.

By way of illustration, F(100) = 26.2172 to four d.p., while M(100,1)≈26.2167.

This result has been known for a long time, with varying estimates of the accuracy of the approximation. For example, a version appears in [WW, p. 522], published in 1927.

(10)

A highly effective method was introduced by Newman in [New2], but only sketched rather briefly, with the statement given in the formaI(a,1) = log 4a+O(1/a2). Following [Jam], we develop Newman’s method in more detail to establish a more precise estimation. The method requires nothing more than some elementary integrals and the binomial and logarithmic series, and a weaker form of the result is obtained with very little effort. The starting point is: 4.1. We have Z (ab)1/2 0 1 (x2₊_a2₎1/2_(x2₊_b2₎1/2 dx= Z ∞ (ab)1/2 1 (x2₊_a2₎1/2_(x2₊_b2₎1/2 dx.

Proof. With the substitution x=ab/y, the left-hand integral becomes

Z ∞ (ab)1/2 1 (a_y2b22 +a2)1/2( a2_b2 y2 +b2)1/2 ab y2 dy= Z ∞ (ab)1/2 1 (b2₊_y2₎1/2_(a2₊_y2₎1/2 dy. Hence I(1, b) = 2 Z b1/2 0 H(x)dx, (1) where H(x) = 1 (x2_{+ 1)}1/2_(x2₊_b2₎1/2. (2) Now (1 +y)−1/2 >1−1

2y for 0< y <1, as is easily seen by multiplying out (1 +y)(1− 1 2y) 2_, so for 0< x < 1, (1− 1 2x 2 ) 1 (x2₊_b2₎1/2 < H(x)< 1 (x2 ₊_b2₎1/2. (3)

Now observe that the substitution x=by gives

Z b1/2 0 1 (x2₊_b2₎1/2 dx= Z 1/b1/2 0 1 (y2_{+ 1)}1/2 dy= sinh −1 1 b1/2. 4.2LEMMA. We have log 2 b1/2 <sinh −1 1 b1/2 <log 2 b1/2 + 1 4b. (4)

Proof. Recall that sinh−1x= log[x+ (x2 _{+ 1)}1/2_{], which is clearly greater than log 2x.} Also, since (1 +b)1/2 _<_{1 +} 1 2b, 1 b1/2 + 1 b + 1 1/2 = 1 b1/2[1 + (1 +b) 1/2_]_< 1 b1/2(2 + 1 2b) = 2 b1/2(1 + 1 4b).

(11)

This is already enough to prove the following weak version of our result:

4.3PROPOSITION. For 0< b≤1 and a ≥1,

log4 b − 1 2b < I(1, b)<log 4 b + 1 2b, (5) log 4a− 1 2a < aI(a,1)<log 4a+ 1 2a. (6)

Proof. The two statements are equivalent, because aI(a,1) =I(1,_a1). By (1) and (3), 2 sinh−1 1 b1/2 −R1(b)< I(1, b)<2 sinh −1 1 b1/2, where R1(b) = Z b1/2 0 x2 (x2₊_b2₎1/2 dx. (7) Since (x2₊_b2₎1/2 _{> x, we have} R1(b)< Z b1/2 0 x2 x dx= Z b1/2 0 x dx= 1₂b. Both inequalities in (5) now follow from (4), since 2 log_b12/2 = log

4

b.

This result is sufficient to show thatM(a,1)−F(a)→0 asa→ ∞. To see this, write aI(a,1) = log 4a+r(a). Then

1 I(a,1) −

a

log 4a =−

ar(a)

log 4a[log 4a+r(a)], which tends to 0 as a→ ∞, since |ar(a)| ≤ 1

2.

Some readers may be content with this version, but for those with the appetite for it, we now refine the method to establish closer estimates. All we need to do is improve our estimations by the insertion of further terms. The upper estimate in (3) only used (1 +y)−1/2 <1. Instead, we now use the stronger bound

1 (1 +y)1/2 <1− 1 2y+ 3 8y 2

for 0 < y < 1. These are the first three terms of the binomial expansion, and the stated inequality holds because the terms of the expansion alternate in sign and decrease in mag-nitude. So we have instead of (3):

(1− 1 2x 2 ) 1 (x2₊_b2₎1/2 < H(x)<(1− 1 2x 2 + 3₈x4) 1 (x2 ₊_b2₎1/2. (8) We also need a further degree of accuracy in the estimates for sinh−1 _b11/2 and R1(b).

(12)

4.4LEMMA. For 0< b≤1, we have log 2 b1/2 + 1 4b− 3 32b 2 _<_sinh−1 1 b1/2 <log 2 b1/2 + 1 4b− 1 16b 2 ₊ 1 32b 3_. ₍₉₎

Proof. Again by the binomial series, we have 1 + 1₂b− 1 8b 2 _<_{(1 +}_b)1/2 _<_{1 +} 1 2b− 1 8b 2₊ 1 16b 3_. ₍₁₀₎ So, as in Lemma 4.2, 1 b1/2 + 1 b + 1 1/2 < 1 b1/2(2 + 1 2b− 1 8b 2₊ 1 16b 3_{) =} 2 b1/2(1 + 1 4b− 1 16b 2₊ 1 32b 3_).

The right-hand inequality in (9) now follows from log(1 +x)≤x. Also, 1 b1/2 + 1 b + 1 1/2 > 1 b1/2(2 + 1 2b− 1 8b 2_{) =} 2 b1/2(1 +B), where B = 1₄b− 1 16b 2 _(so_{B <} 1

4b). By the log series, log(1 +x)> x− 1 2x 2 _{for 0}_{< x <}_{1, so} log(1 +B)> B− 1 2B 2 > 1₄b− 1 16b 2₋ 1 32b 2 = 1₄b− 3 32b 2 .

4.5LEMMA. For 0< b≤1, we have:

R1(b)< 1₂b+₄1b2− 1₂b2log 2 b1/2, (11) R1(b)> 1₂b+₄1b2− 1₂b2log 2 b1/2 − 3 16b 3_, ₍₁₂₎ R1(b)< 1₂b− 1₅b2. (13)

Proof: We can evaluateR1(b) explicitly by the substitution x=bsinht:

R1(b) = Z c(b) 0 b2sinh2t dt= 1₂b2 Z c(b) 0 (cosh 2t−1)dt = 1₄b2sinh 2c(b)− 1 2b 2_c(b),

where c(b) = sinh−1(1/b1/2_{). Now}

sinh 2c(b) = 2 sinhc(b) coshc(b) = 2 b1/2 1 b + 1 1/2 = 2 b(1 +b) 1/2_, so R1(b) = 1₂b(1 +b)1/2− 1₂b2c(b). Statement (11) now follows from (1 +b)1/2 _< _{1 +} 1

2b and c(b) > log 2

b1/2. Statement (12)

(13)

Unfortunately, (13) does not quite follow from (11). We prove it directly from the integral, as follows. Write x2/(x2 +b2)1/2 = g(x). Since g(x) < x, we have R_bb1/2g(x)dx <

1 2(b−b

2_{). For 0}_{< x < b, we have by the binomial expansion again}

g(x) = x 2 b(1 +x2_/b2₎1/2 ≤ x2 b 1− x 2 2b2 + 3x4 8b4 , and hence Z b 0 g(x)dx <(1₃ − 1 10 + 3 56)b 2 < ₁₀3b2.

Together, these estimates give R1(b)< 1₂b−1₅b2.

We can now state the full version of our Theorem.

4.6THEOREM. For 0< b≤1 and a≥1, we have

log4 b < I(1, b)<(1 + 1 4b 2 ) log4 b, (14) log 4a < aI(a,1)< 1 + 1 4a2 log 4a. (15)

Moreover, the following lower bounds also apply:

I(1, b)>(1 + 1₄b2) log4 b − 7 16b 2_, ₍₁₆₎ aI(a,1)> 1 + 1 4a2 log 4a− 7 16a2. (17) Hence I(1, b) = (1 +1₄b2) log4 b +O(b 2₎ _for ₀_{< b <}_1, aI(a,1) = 1 + 1 4a2

log 4a+O(1/a2) for a >1.

Note. As this shows, the estimation log 4a+O(1/a2) in [New2] is not quite correct.

Proof: lower bounds. By (8), (9) and (13), I(1, b) > 2 sinh−1 1 b1/2 −R1(b) > log4 b + 1 2b− 3 16b 2₋₍1 2b− 1 5b 2₎ > log4 b, (with a spare term 1

80b

2_{). Of course, the key fact is the cancellation of the term} 1

2b. Also, using (11) instead of (13), we have

I(1, b) > log 4 b + 1 2b− 3 16b 2₋ (1₂b+ 1₄b2− 1 4b 2 log4 b) = (1 +1₄b2) log4 b − 7 16b 2 .

(14)

Upper bound: By (8), I(1, b)<2 sinh−1 1 b1/2 −R1(b) + 3 4R2(b), where R2(b) = Z b1/2 0 x4 (x2₊_b2₎1/2 dx < Z b1/2 0 x3dx= 1₄b2. So by (9) and (12), I(1, b) < log4 b + 1 2b− 1 8b 2₊ 1 16b 3 ₋₍1 2b+ 1 4b 2₋ 1 4b 2_log 4 b − 3 16b 3_{) +} 3 16b 2 = (1 + 1₄b2) log 4 b − 3 16b 2₊1 4b 3_. If b < 3₄, then ₁₆3b2 ≥ 1 4b 3_{, so} _{I(1, b)}_<_{(1 +} 1 4b 2_{) log} 4 b. Forb ≥ 3

4, we reason as follows. By 1.2, I(1, b)≤

π 4(1+ 1 b). Writeh(b) = (1+ 1 4b 2_{) log}4 b− π 4(1 + 1 b). We find that h( 3

4)≈1.9094−1.8326>0. One verifies by differentiation thath(b) is increasing for 0 < b≤1 (we omit the details), henceh(b)>0 for 3₄ ≤b ≤1. By Gauss’s theorem and the inequality 1/(1 +x) > 1−x, applied with x = 1/(4a2), statement (15) translates into the following pair of inequalities for M(a,1):

4.7COROLLARY. Let F(a) = (πa)/(2 log 4a). Then for a ≥1,

1− 1

4a2

F(a)< M(a,1)< F(a). (18)

Since M(a, b) = b M(a/b, 1) we can derive the following bounds for M(a, b) (where a > b) in general: π 2 a log(4a/b) 1− b 2 4a2 < M(a, b)< π 2 a log(4a/b). (19)

Note: We mention very briefly an older, perhaps better known, approach (cf. [Bow, p. 21–22]). Use the identity I(1, b) = K(b∗), and note thatb∗ →1 whenb →0. Define

A(b∗) =

Z π/2

0

b∗sinθ

(1−b∗2_sin2_θ)1/2 dθ

and B(b∗) = K(b∗)−A(b∗). One shows by direct integration thatA(b∗) = log[(1 +b∗)/b] and B(1) = log 2, so that when b→0, we have A(b∗)−log2_b →0 and henceK(b∗)−log4_b →0. This method can be developed to give an estimation with error term O(b2log 1_b), [Bor1, p. 355–357], but the details are distinctly more laborious, and (even with some refinement) the end result does not achieve the accuracy of our Theorem 4.6.

(15)

By a different method, we now establish exact upper and lower bounds for I(1, b)−

log(1/b) on (0,1]. The lower bound amounts to a second proof that I(1, b)>log(4/b).

4.8. The expression I(1, b)−log(1/b) increases with b for 0< b≤1. Hence

log 4≤I(1, b)−log 1 b ≤

π

2 for 0< b≤1; (20)

log 4 ≤aI(a,1)−loga≤ π

2 for a ≥1; (21) π 2 a loga+ π₂ ≤M(a,1)≤ π 2 a

loga+ log 4 for a≥1. (22)

Proof. We show that K(b)−log(1/b∗) decreases with b, so that I(1, b)−log(1/b) = K(b∗)−log(1/b) increases withb. Once this is proved, the inequalities follow, sinceI(1, b)−

log(1/b) takes the value π/2 at b= 1 and (by 4.3) tends to log 4 as b→0+. By 1.4, for 0≤b <1, we have K(b) = π₂ P∞ n=0d2nb 2n_{, where}_d 0 = 1 and d2n= 12_.32_{. . .}_(2n₋₁₎2 22_.42_{. . .}_(2n)2 . Hence K0(b) =πP∞

n=1nd2nb2n−1. Meanwhile, log(1/b∗) =−1₂log(1−b2), so

d dblog 1 b∗ = b 1−b2 = ∞ X n=1 b2n−1.

By well-known inequalities for the Wallis product, nπd2n <1, so K0(b)< _dbd log(1/b∗). Application to the calculation of logarithms and π. Theorem 4.6 has pleasing appli-cations to the calculation of logarithms (assuming π known) and π (assuming logarithms known), exploiting the rapid convergence of the agm iteration. Consider first the problem of calculating logx(where x >1). Choose n, in a way to be discussed below, and let 4a=xn_,

so thatnlogx= log 4a, which is approximated byaI(a,1). We calculate M =M(a,1), and hence I(a,1) =π/(2M). Then logx is approximated by (a/n)I(a,1).

How accurate is this approximation? By Theorem 4.6, log 4a =aI(a,1)−r(a), where 0< r(a)< 1

4a2 log 4a. Hence

logx= a nI(a,1)− r(a) n , in which 0< r(a) n < logx 4a2 = 4 logx x2n .

We choose n so that this is within the desired degree of accuracy. It might seem anomalous that logx, which we are trying to calculate, appears in this error estimation, but a rough estimate for it is always readily available.

(16)

For illustration, we apply this to the calculation of log 2, taking n = 12, so that a= 210= 1024. We find that M =M(1024,1)≈193.38065, so our approximation is

π 2

1024

12M ≈0.6931474,

while in fact log 2 = 0.6931472 to seven d.p.. The discussion above (just taking log 2 < 1) shows that the approximation overestimates log 2, with error no more than 4/224_{= 1/2}20_.

We now turn to the question of approximating π. The simple-minded method is as follows. Choose a suitably large a. By Theorems 3.2 and 4.6,

π

2 =M(a,1)I(a,1) =M(a,1) log 4a

a +s(a), where

s(a) = M(a,1)r(a)

a < M(a,1) log 4a

4a3 .

So an approximation to π is _a2M(a,1) log 4a, with error no more than 2s(a). Using our original example with a= 100, this gives 3.14153 to five d.p., with the error estimated as no more than 8×10−5 _{(the actual error is about 6.4}_×₁₀−5_).

For a better approximation, one would repeat with a larger a. However, there is a way to generate a sequence of increasingly close approximations from a single agm iteration, which we now describe (cf. [Bor1, Proposition 3]).

In the original agm iteration, let cn = (a2n−bn2)1/2. Note that cn converges rapidly

to 0. By Gauss’s theorem, π₂ = M(a0, c0)I(a0, c0). To apply Theorem 4.6, we need to equate I(a0, c0) to an expression of the form I(dn, en), where en/dn tends to 0. This is

achieved as follows. Recall from 2.1 that an = an+1 +cn+1 and c2n = 4an+1cn+1, so the arithmetic mean of 2an+1 and 2cn+1 is an, while the geometric mean is cn. So by Theorem

3.1, I(an, cn) =I(2an+1,2cn+1), and hence

I(a0, c0) =I(2nan,2ncn) = 1 2n_a n I 1, cn an . Since cn/an tends to 0, Theorem 4.6 (or even 4.3) applies to show that

I(a0, c0) = lim n→∞ 1 2n_a n log 4an cn .

Now take a0 = 1 and b0 = √1₂. Then c0 = b0, so M(a0, b0) = M(a0, c0), and (an)

converges to this value. Since π= 2M(a0, c0)I(a0, c0), we have established the following:

4.9PROPOSITION. With an, bn, cn defined in this way, we have

π = lim n→∞ 1 2n−1 log 4an cn .

(17)

Denote this approximation bypn. With enough determination, one can derive an error

estimation, but we will just illustrate the speed of convergence by performing the first three steps. Values are rounded to six significant figures. To calculate cn, we use cn=c2n−1/(4an)

rather than 1₂(an−1−bn−1), since this would require much greater accuracy in the values of an−1 and bn−1. n an bn cn pn 0 1 0.707106 0.707106 1 0.853553 0.840896 0.146447 3.14904 2 0.847225 0.847201 0.00632849 3.14160 3 0.847213 0.847213 0.0000118181 3.14159

More accurate calculation gives the value p3 = 3.14159265, showing that it actually agrees with π to eight decimal places.

While this expression does indeed converge rapidly to π, it has the disadvantage that it requires the calculation of logarithms at each stage. This is overcome by the Gauss-Brent-Salamin algorithm, which we describe in section 7.

5. The integrals J(a, b) and E(b) Fora, b≥0, define

J(a, b) =

Z π/2

0

(a2cos2θ+b2sin2θ)1/2dθ. (1) Note that 4J(a, b) is the perimeter of the ellipse x2/a2+y2/b2 = 1. Some immediate facts:

(E1) J(a, a) = Z π/2 0 a dθ= πa 2 , (E2) J(a,0) = Z π/2 0 acosθ dθ =a, (E3) for c >0, J(ca, cb) = cJ(a, b);

(E4) J(b, a) =J(a, b) (substitute θ= π₂ −φ); (E5) J(a, b) increases with a and with b; (E6) if a≥b, then πb

2 ≤J(a, b)≤ πa

2 .

(E6) follows from (E1) and (E5), since J(b, b)≤J(a, b)≤J(a, a).

Since x+ _x1 ≥ 2 for x > 0, we have I(a, b) +J(a, b) ≥ π. For 0 < b ≤ 1, we have J(1, b)≤ π

(18)

We start with some further inequalities forJ(a, b). Given that J(a, b) is the length of the quarter-ellipse, it would seem geometrically more or less obvious that

(a2+b2)1/2 ≤J(a, b)≤a+b, (2)

since this says that the curve is longer than the straight-line path between the same points, but shorter than two sides of the rectangle. An analytic proof of the right-hand inequality is very easy: since (x2₊_y2₎1/2 _≤_x₊_y _{for positive} _x, _{y, we have (a}2_cos2_θ₊_b2_sin2_θ)1/2 _≤ acosθ+bsinθ for each θ. Integration on [0, π/2] givesJ(a, b)≤a+b. We will improve this inequality below.

For the left-hand inequality in (2), and some further estimations, we use the Cauchy-Schwarz inequality, in both its discrete and its integral form.

5.1. We have

J(a, b)≥(a2+b2)1/2. (3)

Proof. By the Cauchy-Schwarz inequality,

a2cosθ+b2sinθ =a(acosθ) +b(bsinθ)≤(a2+b2)1/2(a2cos2θ+b2sin2θ)1/2. Integrating on [0, π/2], we obtain

a2+b2 ≤(a2+b2)1/2J(a, b),

hence (3).

A slight variation of this reasoning gives a second lower bound, improving upon (E6).

5.2. We have

J(a, b)≥ π

4(a+b). (4)

Proof. By the Cauchy-Schwarz inequality,

acos2θ+bsin2θ = (acosθ) cosθ+ (bsinθ) sinθ

≤ (a2cos2θ+b2sin2θ)1/2(cos2θ+ sin2θ)1/2 = (a2cos2θ+b2sin2θ)1/2.

Integration gives (4), since Rπ/2

0 cos

2_{θ dθ}₌Rπ/2

0 sin

2_{θ dθ}₌ π

4.

(19)

5.3. We have

J(a, b)≤ π

2√2(a

2₊_b2₎1/2_. ₍₅₎

Equality holds when a=b.

Proof. By the Cauchy-Schwarz inequality for integrals (with both sides squared),

J(a, b)2 ≤ Z π/2 0 1 ! Z π/2 0 (a2cos2θ+b2sin2θ)dθ= π 2 π 4 (a 2 ₊_b2_).

Since π/(2√2)≈1.1107, J(a, b) is actually modelled fairly well by (a2+b2)1/2. For u = (a, b), write kuk = (a2 ₊_b2₎1/2_{: this is the} _{Euclidean norm} _on

R2. In this notation, we have J(a, b) = Z π/2 0 k(acosθ, bsinθ)kdθ.

The triangle inequality for vectors says that ku1+u2k ≤ ku1k+ku2k. We show that J(a, b) also satisfies the triangle inequality (so in fact is also a norm on _R2_).

5.4. Let uj = (aj, bj) (j = 1,2), where aj, bj ≥0. Then

J(u1+u2)≤J(u1) +J(u2). (6)

Proof. For eachθ, we have

k[(a1+a2) cosθ,(b1+b2) sinθ]k ≤ k(a1cosθ, b1sinθ)k+k(a2cosθ, b2sinθ)k. Integrating on [0,π₂], we deduce that

J(a1+a2, b1+b2)≤J(a1, b1) +J(a2, b2).

By suitable choices ofu1andu2, we can read off various inequalities forJ(a, b). Firstly, J(a, b)≤J(a,0) +J(0, b) =a+b, as seen in (2). Secondly:

Second proof of (4). Since (a+b, a+b) = (a, b) + (b, a), we have (a+b)π

2 =J(a+b, a+b)≤J(a, b) +J(b, a) = 2J(a, b).

Thirdly, we can compare J(a, b) with the linear function of b that agrees with J(a, b) at b = 0 and b = a. The resulting upper bound improves simultaneously on those in (E6) and (2).

(20)

5.5. For 0≤b≤a, we have

J(a, b)≤a+π 2 −1

b. (7)

Equality holds when b= 0 and when b=a. Proof. Since (a, b) = (a−b,0) + (b, b),

J(a, b)≤J(a−b,0) +J(b, b) = (a−b) + π

2b.

Inequality (7) reflects the fact that J(a, b) is a convexfuncton ofb for fixeda; this also follows easily from (6).

We now consider equivalent and related integrals.

5.6. We have J(a, b) = b2 Z ∞ 0 (x2₊_a2₎1/2 (x2₊_b2₎3/2 dx. (8)

Proof. Substituting x=btanθ, we obtain

Z ∞ 0 (x2+a2)1/2 (x2₊_b2₎3/2 dx. = Z π/2 0 (a2+b2tan2θ)1/2 b3_sec3_θ bsec 2_{θ dθ} = 1 b2 Z π/2 0 (a2+b2tan2θ)1/2cosθ dθ = 1 b2J(a, b). The “complete elliptic integral of the second kind” is

E(b) =

Z π/2

0

(1−b2sin2θ)1/2dθ. (9)

Since cos2_θ₊_b2_sin2_θ _{= 1}₋_b∗2_sin2_{θ, we have} _J_{(1, b) =}_E(b∗_).

Clearly,E(0) =π/2, E(1) = 1 and E(b) decreases with b. Also, K(b)−E(b) = Z π/2 0 1−(1−b2_sin2_θ) (1−b2_sin2_θ)1/2 dθ= Z π/2 0 b2_sin2_θ (1−b2_sin2_θ)1/2 dθ. (10) We use the usual mathematical notationE0(b) for the derivative _dbdE(b). We alert the reader to the fact that some of the literature on this subject, including [Bor2], writesb0where we have b∗ and then usesE0(b) to mean E(b0).

5.7. We have

E0(b) = 1

(21)

Proof. This follows at once by differentiating under the integral sign in (9) and com-paring with (10). 5.8. We have E(b) = Z 1 0 (1−b2_t2₎1/2 (1−t2₎1/2 dt. (12)

Proof. Denote this integral byI. The substitution t= sinθ gives I =

Z π/2

0

(1−b2sin2θ)1/2

cosθ cosθ dθ =E(b).

5.9. For 0≤b ≤1, 1− 1 2b 2 _≤ 2 πE(b)≤1− 1 4b 2_. ₍₁₃₎

Proof. For 0≤x≤1, we have 1−x≤(1−x)1/2 ≤1−1

2x. Hence E(b)≥ Z π/2 0 (1−b2sin2θ)dθ = π 2(1− 1 2b 2_),

and similarly for the upper bound.

When applied to J(1, b) = E(b∗), (13) gives weaker inequalities than (4) and (5). We now give the power series expression. Recall from 1.4 that 2

πK(b) = P∞ n=0d2nb 2n_{, where} d0 = 1 and d2n= 12_.32_{. . .}_(2n₋₁₎2 22_.42_{. . .}_(2n)2 . 5.10PROPOSITION. For 0≤b <1, 2 πE(b) = 1− ∞ X n=1 d2n 2n−1b 2n_{= 1}₋1 4b 2₋ 3 64b 4₊_{· · ·} ₍₁₄₎

Proof. By the binomial series,

(1−b2sin2θ)1/2 = 1 + ∞ X n=1 a2nb2nsin2nθ, where a2n = (−1)n 1 2 n = (−1) n n! ( 1 2)(− 1 2)· · ·( 1 2 −n+ 1) =− 1.3. . . .(2n−3) 2.4. . . .(2n) . For fixed b < 1, the series is uniformly convergent for 0 ≤ θ ≤ π

2. The statement follows, since Z π/2 0 sin2nθ dθ = π 2 1.3. . . .(2n−1) 2.4. . . .(2n) .

(22)

6. The integral L(a, b)

Most of the material in this section follows the excellent exposition in [Lo2].

Further results onI(a, b) andJ(a, b) are most conveniently derived and expressed with the help of another integral. For a >0 and b≥0, define

L(a, b) =

Z π/2

0

cos2_θ

(a2_cos2_θ₊_b2_sin2_θ)1/2 dθ. (1) The first thing to note is that L(b, a)6=L(a, b). In fact, we have:

6.1. For a≥0 and b >0, L(b, a) = Z π/2 0 sin2θ (a2_cos2_θ₊_b2_sin2_θ)1/2 dθ. (2)

Proof. The substitutionθ = π₂ −φ gives

L(b, a) = Z π/2 0 cos2θ (b2_cos2_θ₊_a2_sin2_θ)1/2 dθ = Z π/2 0 sin2φ (a2_cos2_φ₊_b2_sin2_φ)1/2 dφ. Hence, immediately: 6.2. For a, b >0, L(a, b) +L(b, a) = I(a, b), (3) a2L(a, b) +b2L(b, a) =J(a, b). (4) 6.3COROLLARY. For a, b >0,

(a2 −b2)L(a, b) = J(a, b)−b2I(a, b). (5)

Some immediate facts: (L1) L(a,0) = 1

a; L(0, b) is undefined; (L2) L(a, a) = π

4a;

(L3) L(ca, cb) = 1_cL(a, b) for c >0; (L4) L(a, b) decreases with a and with b; (L5) L(a, b)≤ 1

a for all b; (L6) if a≥b, then π

4a ≤L(a, b)≤ π

(23)

6.4. We have

∂

∂bJ(a, b) =bL(b, a). (6)

Proof. Differentiate under the integral sign in the expression for J(a, b). So if we write J1(b) for J(1, b), then J₁0(b) =bL(b,1). In particular, J₁0(1) =L(1,1) = π/4. 6.5. We have L(a, b) = Z ∞ 0 b2 (x2₊_a2₎1/2_(x2₊_b2₎3/2 dx, (7) L(b, a) = Z ∞ 0 x2 (x2₊_a2₎1/2_(x2₊_b2₎3/2 dx, (8)

Proof. With the substitution x=btanθ, the integral in (7) becomes

Z π/2 0 b3sec2θ (a2₊_b2_tan2_θ)1/2_b3_sec3_θ dθ = Z π/2 0 cosθ (a2₊_b2_tan2_θ)1/2 dθ = Z π/2 0 cos2_θ (a2_cos2_θ₊_b2_sin2_θ)1/2 dθ.

The integral in (8) now follows from (3) and the integral (1.2) for I(a, b).

6.6. If a > b, then L(a, b)< L(b, a).

Proof. By (7), applied to bothL(a, b) and L(b, a), L(b, a)−L(a, b) =

Z ∞

0

a2(x2+b2)−b2(x2+a2) (x2₊_a2₎3/2_(x2₊_b2₎3/2 dx.

This is positive, since the numerator is (a2₋_b2_)x2_.

6.7. We have ∂ ∂bI(a, b) = − 1 bL(a, b), (9) b(a2−b2)∂ ∂bI(a, b) =b 2_{I(a, b)}₋_{J(a, b).} ₍₁₀₎

Proof. (9) is obtained by differentiating the integral expression (1.2) for I(a, b) under the integral sign and comparing with (7). (The integral expressions in terms of θ do not deliver this result so readily.) Using (5) to substitute for L(a, b), we obtain (10).

Hence, writingI1(b) for I(1, b), we have I₁0(1) =−L(1,1) =−π/4.

(24)

Identities (9) and (10) are an example of a relationship that is expressed more pleasantly in terms of L(a, b) rather than J(a, b). More such examples will be encountered below.

The analogue of the functions K(b) and E(b) is the pair of integrals L(1, b∗) = Z π/2 0 cos2_θ (1−b2_sin2_θ)1/2 dθ, L(b ∗ ,1) = Z π/2 0 sin2θ (1−b2_sin2_θ)1/2 dθ. Corresponding results apply. For example, the substitution t = sinθ gives

L(1, b∗) =

Z 1

0

(1−t2)1/2 (1−b2_t2₎1/2 dt

(compare 1.3 and 5.8), and power series in terms ofbare obtained by modifying 1.5. However, we will not introduce a special notation for these functions.

Recall Theorem 4.6, which stated: log4_b < I(1, b)< (1 +1₄b2_{) log}4

b for 0< b < 1. We

now derive a corresponding estimation for J(1, b) (actually using the upper bound from 4.8).

6.8LEMMA. For 0< b <1, log1 b +c1 ≤L(b,1)≤log 1 b +c2, (11) where c1 = log 4−1, c2 =π/4.

Proof. By 4.8, we have log1_b + log 4 ≤ I(1, b) ≤ log _b1 + π₂. Also, π₄ ≤ L(1, b) ≤ 1.

Statement (11) follows, by (3).

6.9PROPOSITION. For 0< b <1,

J(1, b) = 1 + 1₂b2log1

b +r(b), (12)

where c3b2 ≤r(b)< c4b2, with c3 = log 2− 1₄ and c4 = 1₄ +π₈.

Proof. WriteJ1(t) = J(1, t). By (6),J₁0(t) =tL(t,1), soJ1(b)−1 =R₀btL(t,1)dt. Now

Z b 0 (−tlogt)dt = −1 2t 2 logtb₀+ Z b 0 1 2t dt = 1₂b2log 1 b + 1 4b 2_.

By (11), a lower bound for J(1, b) is obtained by adding Rb

0 c1t dt = 1 2c1b

2_{, and an upper}

bound by adding 1₂c2b2.

Since J(a,1) =aJ(1,1_a), we can derive the following estimation fora >1, J(a,1) = a+loga

2a +r1(a), where c3/a≤r1(a)≤c4/a.

(25)

It follows from (12) that J₁0(0) = 0 (this is not a case of (6)). Note that 1_bJ₁0(b) = L(b,1)→ ∞ asb →0+, so J1 does not have a second derivative at 0.

By (4), we haveL(1, b) +b2_L(b,_{1) =}_J_{(1, b). By (11) and (12), it follows that}_{L(1, b) =} 1− 1

2b 2_log1

b +O(b

2_{). A corresponding expression for} _L(b,_{1) can be derived by combining} this with Theorem 4.6.

Values at (√2,1)

Recall from 1.6 that

I(√2,1) = Z 1 0 1 (1−x4₎1/2 dx = 1 4B( 1 4, 1 2). 6.10. We have L(√2,1) = Z 1 0 x2 (1−x4₎1/2 dx= 1 4B( 1 2, 3 4). (13)

Proof. Substitutex= sinθ, so that 1−x4 = cos2θ(1 + sin2θ) = cos2θ(cos2θ+ 2 sin2θ). Then Z 1 0 x2 (1−x4₎1/2 dx= Z π/2 0 sin2θ (cos2_θ_{+ 2 sin}2_θ)1/2 dθ =L( √ 2,1). Also, the substitution x=t1/4 gives

Z 1 0 x2 (1−x4₎1/2 dx= Z 1 0 t1/2 (1−t)1/2 1 4t −3/4_dt ₌ 1 4 Z 1 0 1 t1/4₍₁₋_t)1/2 dt= 1 4B( 1 2, 3 4).

Now using the identities Γ(a+1) =aΓ(a), Γ(1 2) =

√

πandB(a, b) = Γ(a)Γ(b)/Γ(a+b), we deduce:

6.11. We have

L(√2,1)I(√2,1) = π

4. (14)

Proof. By the identities just quoted, B(1₂,1₄)B(1₂,3₄) = Γ( 1 2)Γ( 1 4) Γ(3₄) Γ(1₂)Γ(3₄) Γ(5₄) = Γ(1₂)2Γ(1₄) Γ(5₄) = 4π, hence (14). Now I(√2,1) = π/(2M0), where M0 =M( √ 2,1). We deduce: 6.12. Let M0 =M( √ 2,1). Then L(√2,1) = M0 2 (≈0.599070), (15)

(26)

L(1,√2) = π 2M0 − M0 2 (≈0.711959), (16) J(√2,1) = π 2M0 +M0 2 (≈1.910099). (17)

Proof. (15) is immediate. Then (16) follows from (3), and (17) from (4), in the form

J(√2,1) = 2L(√2,1) +L(1,√2).

Two related integrals

We conclude with a digression on two related integrals that can be evaluated explicitly.

6.13. For 0< b <1, Z π/2 0 cosθ (1−b2_sin2_θ)1/2 dθ= sin−1b b , Z π/2 0 sinθ (1−b2_sin2_θ)1/2 dθ = 1 2blog 1 +b 1−b.

Proof. Denote these two integrals by F1(b), F2(b) respectively. The substitution x = bsinθ gives F1(b) = 1 b Z b 0 1 (1−x2₎1/2 dx= sin−1b b .

Now let b∗ = (1−b2₎1/2_{, so that} _b2₊_b∗2 _{= 1. Then 1}₋_b2_sin2_θ₌_b∗2₊_b2_cos2_{θ. Substitute} x=bcosθ to obtain F2(b) = 1 b Z b 0 1 (b∗2₊_x2₎1/2 dx= 1 b sinh −1 b b∗.

Now sinh−1u= log[u+ (1 +u2₎1/2_{] and 1 +}_b2_/b∗2 _{= 1/b}∗2_{, so}

F2(b) = 1 b log

1 +b b∗ .

Now observe that

1 +b b∗ = 1 +b (1−b2₎1/2 = (1 +b)1/2 (1−b)1/2.

The integral F2(b) can be used as the basis for an alternative method for the results section 4. Another application is particularly interesting. Writing t for b, note that

F2(t) = ∞ X n=0 t2n 2n+ 1, so that Z 1 0 F2(t)dt= ∞ X n=0 1 (2n+ 1)2.

(27)

By reversing the implied double integral, one obtains an attractive proof of the well-known identity P∞ n=1 1 n2 =π 2_{/6 [JL].}

7. Further evaluations in terms of the agm; the Brent-Salamin algorithm

We continue to follow the exposition of [Lo2]. Recall that Gauss’s theorem (Theorem 3.1) says that I(a1, b1) = I(a, b), where a1 = ₂1(a+b) and b1 = (ab)1/2. There are corre-sponding identities for L and J, but they are not quite so simple. We now establish them, using the same substitution as in the proof of Gauss’s theorem.

7.1THEOREM. Let a6=b and a1 = ₂1(a+b), b1 = (ab)1/2. Then L(b1, a1) = a+b a−b [L(b, a)−L(a, b)]. (1) L(a1, b1) = 2 a−b [aL(a, b)−bL(b, a)]. (2)

Proof. By (6.7), written as an integral on (−∞,∞), L(b1, a1) = 1 2 Z ∞ −∞ a2 1 (x2₊_a2 1)3/2(x2+b21)1/2 dx. As in Theorem 3.1, we substitute 2x=t− ab t . As before, we find (x2+b2₁)1/2 = 1₂ t+ ab t , 4(x2 +a2₁) = 4x2+ (a+b)2 =t2 +a2+b2+a 2_b2 t2 = 1 t2(t 2 +a2)(t2+b2), so that L(b1, a1) = 1 8(a+b) 2 Z ∞ 0 8t3 (t2₊_a2₎3/2_(t2₊_b2₎3/2 1 2(t+ ab t) 1 2 1 + ab t2 dt = (a+b)2 Z ∞ 0 t2 (t2₊_a2₎3/2_(t2₊_b2₎3/2 dt. But by (6.7) again, L(b, a)−L(a, b) = Z ∞ 0 a2_(t2₊_b2₎₋_b2_(t2₊_a2₎ (t2₊_a2₎3/2_(t2₊_b2₎3/2 dt = (a2−b2) Z ∞ 0 t2 (t2₊_a2₎3/2_(t2₊_b2₎3/2 dt.

(28)

This proves (1). To deduce (2), write I(a, b) = I, L(a, b) = L and L(b, a) = L∗. Then I =L+L∗, also I =I(a1, b1) = L(a1, b1) +L(b1, a1), so (a−b)L(a1, b1) = (a−b)[I−L(b1, a1)] = (a−b)(L+L∗) + (a+b)(L−L∗) = 2aL+ 2bL∗. 7.2COROLLARY. We have

2J(a1, b1) =abI(a, b) +J(a, b). (3)

Proof. By (6.4), J(a1, b1) = a21L(a1, b1) + b21L(b1, a1). Continue to write I, L, L∗ as above. Substituting (1) and (2), we obtain

2(a−b)J(a1, b1) = 1₂(a+b)22(aL−bL∗) + 2ab(a+b)(L∗−L)

= (a+b)[a(a+b)−2ab]L+ (a+b)[2ab−b(a+b)]L∗ = a(a+b)(a−b)L+b(a+b)(a−b)L∗,

hence

2J(a1, b1) = a(a+b)L+b(a+b)L∗ =ab(L+L∗) + (a2L+b2L∗) = abI +J.

It is possible, but no shorter, to prove (3) directly from the integral expression in 5.6, without reference to L(a, b).

It follows from (1) and (2) that ifa > b, thenL(a, b)< L(b, a) (as already seen in 6.6) and aL(a, b)> bL(b, a).

Note. If a and b are interchanged, then a1 and b1 remain the same: they are not

interchanged! The expressions in (1), (2) and (3) are not altered.

We now derive expressions forL(a, b),L(b, a) andJ(a, b) in terms of the agm iteration. In this iteration, recall that cn is defined byc2n=a2n−b2n: for this to make sense whenn = 0,

we require a > b. We saw in 2.2 that for a certain k, we have cn ≤4b/42

n−k

for n≥k, from which it follows that limn→∞2nc2_n= 0 and the series P∞_n₌₀2nc2_n is convergent.

7.3 THEOREM. Let a > b > 0, and let an, bn, cn be defined by the agm iteration for

M(a, b), so that c2 =a2−b2. Let S =P∞ n=02

n−1_c2

n. Then

(29)

c2L(a, b) = (c2−S)I(a, b), (5)

J(a, b) = (a2−S)I(a, b). (6)

Proof. Note first that (5) and (6) follow from (4), because L(a, b) +L(b, a) = I(a, b) and J(a, b) = a2_{I(a, b)}₋_c2_{L(b, a) (see 6.3).}

Write I(a, b) = I, and recall that I(an, bn) = I for all n. Also, write L(bn, an) = Ln.

Note that L(bn, an)−L(an, bn) = 2Ln−I. Recall from 2.2 that cn+1 = 1₂(an−bn). So by

(1), applied to an, bn, we have (for each n≥0

4c2_n₊₁Ln+1 = (an−bn)2Ln+1 = (a2n−b2n)(2Ln−I) =c2n(2Ln−I),

hence

2nc2_nLn−2n+1cn2+1Ln+1 = 2n−1c2nI.

Adding these differences for 0≤n ≤N, we have

c2₀L0−2N+1c2N+1LN+1 = N X n=0 2n−1c2_n ! I.

Taking the limit as N → ∞, we have c2

0L0 =SI, i.e. (4).

Since I(a, b) = π/(2M(a, b)), this expresses L(a, b) and J(a, b) in terms of the agm iteration.

Recall from 6.11 that L(√2,1)I(√2,1) =π/4. With Theorem 7.3, we can deduce the following expressions for π:

7.4THEOREM. Let M(√2,1) =M0 and let an, bn, cn be defined by the agm iteration for M0. Let S = P∞ n=02n −1_c2 n. Then π= M 2 0 1−S. (7)

Hence π= limn→∞πn, where

πn=

a2_n₊₁ 1−Sn

, (8)

where Sn=Pn_r₌₀2r−1c2r.

Proof. By (5), L(√2,1) = (1−S)I(√2,1). So, by 6.11, π 4 = (1−S)I( √ 2,1)2 = (1−S) π 2 4M2 0 ,

(30)

The sequence in (8) is called the Brent-Salamin iteration forπ. However, the idea was already known to Gauss.

A minor variant is to use an rather than an+1. Another, favoured by some writers (including [Bor2]) is to work with M₀0 = M(1,√1

2) = M0/

√

2. The resulting sequences are a0_n, b0_n,c0_n, where a0_n=an/ √ 2 (etc.). Then S=P∞ n=02 n_c02 n and π = 2M 02 0 /(1−S).

Of course, the rapid convergence of the agm iteration ensures rapid convergence of (8). We illustrate this by performing the first three steps. This repeats the calculation of M0 in section 2 (with slightly greater accuracy), but this time we record cn and c2n. To calculate

cn, we use the identity cn=c2n−1/(4an).

n an bn cn c2n

0 1.41421356 1 1 1

1 1.20710678 1.18920712 0.20710678 0.04289322

2 1.19815695 1.19812352 0.00894983 0.00008010

3 1.19814023 1.19814023 0.00001671 2.79×10−10

From this, we calculate S3 = 0.54305342, hence 1− S3 = 0.45694658 and π3 = 3.14159264 to eight d.p. (while π≈3.14159265).

We now give a reasonably simple error estimation, which is essentially equivalent to the more elaborate one in [Bor2, p. 48]. Recall from 2.2 that cn+1 ≤4/(202

n

) for all n ≥1.

7.5. With πn defined by (8), we have πn < π and

π−πn < 2n+9 202n+1. (9) Proof. We have π−πn= M2 0 1−S − a2 n+1 1−Sn = Xn (1−S)(1−Sn) , where Xn =M02(1−Sn)−an2+1(1−S) = (S−Sn)M02−(1−S)(a 2 n+1−M 2 0).

Now Sn< S, so 1−Sn>1−S. From the calculations above, it is clear that 1−S > 2₅, so

1

(1−S)(1−Sn)

< 25₄ <23.

Now consider Xn. We have 0< a2n+1−M02 < a2n+1 −b2n+1 = c2n+1. Meanwhile, 1 < M02 <2 and S−Sn= ∞ X r=n+1 2r−1c2_r >2nc2_n₊₁.

(31)

At the same time, since 2rc2_r₊₁ < ₂12r−1c2_r, we have S−Sn < 2n+1c2n+1. Hence Xn > 0 (so πn < π) and Xn< M02(S−Sn)<2(S−Sn)<2n+2c2n+1 < 2n+6 202n+1. Inequality (9) follows.

A slight variation of the proof that Xn>0 shows that πn increases withn.

8. Further results on K(b) and E(b)

In this section, we translate some results from sections 6 and 7 into statements in terms ofK(b) andE(b). First we give expressions for the derivatives, with some some applications.

We have already seen in 5.7 that E0(b) = 1

b[E(b)−K(b)]. The expression for K

0_{(b) is}

not quite so simple:

8.1. For 0< b <1,

bb∗2K0(b) = E(b)−b∗2K(b). (1)

Proof. Recall thatb∗ = (1−b2₎1/2_{. By 6.7, with} _a_{= 1,} bb∗2 d

dbI(1, b) =b

2_{I(1, b)}₋_J_{(1, b).}

With b replaced by b∗, this says

b∗b2 d

db∗K(b) = b

∗2_K(b)₋_E(b). Now db∗/db=−b/b∗, so by the chain rule

K0(b) =−b b∗ d db∗K(b) = − b b∗ 1 b2_b∗[b ∗2_K(b)₋_{E(b)] =}₋ 1 bb∗2[b ∗2_K(b)₋_E(b)].

Alternatively, 8.1 can be proved from the power series for K(b) and E(b). Note that these power series also show that K0(0) =E0(0) = 0.

The expressions forK0(b) andE0(b) can be combined to give the following more pleasant identities:

8.2COROLLARY. For 0< b <1,

b∗2[K0(b)−E0(b)] =bE(b), (2)

(32)

Proof. We havebE0(b) =E(b)−K(b)]. With (1), this gives bb∗2[K0(b)−E0(b)] = (1−b∗2)E(b) =b2E(b),

hence (2), and (3) similarly.

8.3THEOREM (Legendre’s identity). For 0< b <1,

E(b)K(b∗) +E(b∗)K(b)−K(b)K(b∗) = π

2. (4)

Proof. WriteK(b)−E(b) =F(b). Then (4) is equivalent to E(b)E(b∗)−F(b)F(b∗) = π

2.

NowE0(b) =−F(b)/band, by (2), F0(b) =−(b/b∗2)E(b). Hence, again using the chain rule, we have d dbE(b)E(b ∗ ) =−1 bF(b)E(b ∗ ) +E(b) b b∗2F(b ∗ ) and d dbF(b)F(b ∗ ) = b b∗2E(b)F(b ∗ ) +F(b) − b b∗ b∗ b2E(b ∗ ) = b b∗2E(b)F(b ∗ )−1 bF(b)E(b ∗ ). Hence E(b)E(b∗)−F(b)F(b∗) is constant (say c) on (0,1).

To find c, we consider the limit as b → 0+_{. Note first that 0} _{< F}_(b) _{< K(b) for} 0 < b < 1. Now E(0) = π₂ and E(1) = 1. By 1.5 and 5.9, F(b) < π₂b2 for 0 < b < 1₂. Also, by 1.2, F(b∗) < K(b∗) = I(1, b) ≤ π/(2b1/2). Hence F(b)F(b∗) → 0 as b → 0+, so

c=E(0)E(1) =π/2.

Taking b = b∗ = √1

2 and writing E( 1

√

2) = E1, and similarly K1, M1, so that K1 = π/(2M1), this gives 2E1K1−K12 = π 2, so that E1 = π 4K1 +K1 2 = M1 2 + π 4M1 , as found in 6.12 (note that M0 =M(

√

2,1) = √2M1 and J(

√

2,1) = √2E1).

In 3.5 and 3.6, we translated the identity I(a1, b1) = I(a, b) into identities in terms of K(b). In particular, I(1 +b,1−b) = K(b). The corresponding identity for J was given in 7.2:

(33)

We now translate this into statements involving K and E. The method given here can be compared with [Bor2, p. 12]. Recall that if k = (1−b)/(1 +b), then k∗ = 2b1/2/(1 +b): denote this by g(b).

8.4. We have

2E(b)−b∗2K(b) = (1 +b)E[g(b)]. (6)

Proof. By (5), witha and b replaced by 1 +b and 1−b,

b∗2I(1 +b,1−b) +J(1 +b,1−b) = 2J(1, b∗) = 2E(b). As mentioned above, I(1 +b,1−b) = K(b). Further

J(1 +b,1−b) = (1 +b)J 1, 1−b 1 +b = (1 +b)E[g(b)]. 8.5. We have E(b∗) +bK(b∗) = (1 +b)E 1−b 1 +b . (7) Proof. By (5), E(b∗) +bK(b∗) = J(1, b) +bI(1, b) = 2J 1₂(1 +b), b1/2 = (1 +b)J 1, 2b 1/2 1 +b = (1 +b)E 1−b 1 +b .

Note that (5) and (6) can be arranged as expressions for E(b) and E(b∗) respectively.

Outline of further results obtained from differential equations

Let us restate the identities forE0(b) and K0(b), writing out b∗2 _{as 1}₋_b2_:

bE0(b) = E(b)−K(b), (8)

b(1−b2)K0(b) + (1−b2)K(b) = E(b). (9)

We can eliminate E(b) by differentiating (9), using (8) to substitute for E0(b), and then (9) again to substitute forE(b). The result is the following identity, amounting to a second-order differential equation satisfied by K(b):

(34)

Similarly, we find that

b(1−b2)E00(b) + (1−b2)E0(b) +bE(b) = 0. (11)

By following this line of enquiry further, we can obtain series expansions for K(b∗) = I(1, b) and E(b∗) = J(1, b). Denote these functions by K∗(b) and E∗(b). To derive the differential equations satisfied by them, we switch temporarily to notation more customary for differential equations. Suppose we know a second-order differential equation satisfied by y(x) for 0 < x <1, and let z(x) =y(x∗), where x∗ = (1−x2₎1/2_{. By the chain rule, we find} that y0(x∗) =−(x∗/x)z0(x) and y00(x∗) = x ∗2 x2 z 00_(x)₋ 1 x3z 0_(x).

Substituting in the equation satisfied by y(x∗), we derive one satisfied by z(x). In this way, we find that K∗(x) satisfies

x(1−x2)y00(x) + (1−3x2)y0(x)−xy(x) = 0. (12) Interestingly, this is the same as the equation (10) satisfied by K(x). Meanwhile, E∗(x) satisfies

x(1−x2)y00(x)−(1 +x2)y0(x) +xy(x) = 0 (13) (which is not the same as (11)).

Now look for solutions of (12) of the form y(x) = P∞

n=0anxn+ρ. We substitute in the

equation and equate all the coefficients to zero. The coefficient of xρ−1 _is _ρ2_{. Taking} _ρ_{= 0,} we obtain the solution y0(x) =

P∞

n=0d2nx2n, whered0 = 1 and d2n =

12_.32_{. . . .}_(2n₋₁₎2 22_.42_{. . . .}_(2n)2 .

Of course, we knew this solution already: by 1.4, y0(x) = _π2K(x). A second solution can be found by the method of Frobenius. We give no details here, but the solution obtained is

y1(x) =y0(x) logx+ 2 ∞ X n=1 d2ne2nx2n, where e2n= n X r=1 1 2r−1− 1 2r .

Now relying on the appropriate uniqueness theorem for differential equations, we conclude thatK∗(x) =c0y0(x) +c1y1(x) for somec0 andc1. To determine these constants, we use 4.3, which can be stated as follows: K∗(x) = log 4−logx+r(x), where r(x)→0 asx→0+. By

(35)

1.4, we have 0 < y0(x)−1< x2 forx < 1₂. Hence [y0(x)−1] logx, therefore alsoy1(x)−logx, tends to 0 as x→0+. So

K∗(x) =c0y0(x) +c1y1(x) =c0+c1logx+s(x),

where s(x) → 0 as x → 0+. It follows that c1 = −1 and c0 = log 4. We summarise the conclusion, reverting to the notation b (compare [Bor2, p. 9]):

8.6THEOREM. Let 0< b <1. With d2n, e2n as above, we have

I(1, b) = K(b∗) = 2 πK(b) log 4 b −2 ∞ X n=1 d2ne2nb2n (14) = log4 b + ∞ X n=1 d2n log 4 b −2e2n b2n. (15)

It is interesting to compare (15) with Theorem 4.6. The first term of the series in (15) is 1₄b2_(log 4 b −1), so we can state I(1, b) = (1 +1₄b2) log4 b − 1 4b 2 +O b4log1 b

for small b, which is rather more accurate than the corresponding estimate in 4.6. Also, e2n <log 2 for all n, since it is increasing with limit log 2. Hence log4_b −2e2n >0 for all n,

so (15) gives the lower bound

I(1, b)>(1 + 1₄b2) log4 b −

1 4b

2_,

slightly stronger than the lower bound in 4.6. However, the upper bound in 4.6 is (1 + 1₄b2_{) log(4/b), and some detailed work is needed to recover this from (15).}

For the agm, the estimation above translates into M(a,1) =F(a)− F(a)

4a2

log 4a−1

log 4a +O(1/a

3₎ _as _a_{→ ∞}_,

where F(a) = (πa)/(2 log 4a).

One could go through a similar procedure to find a series expansion ofJ(1, b). However, it is less laborious (though still fairly laborious!) to use 6.7 in the form

J(1, b) =b2I(1, b)−b(1−b2)d

dbI(1, b) and substitute (15). The result is:

(36)

8.7THEOREM. For 0< b <1, J(1, b) = 1 + ∞ X n=1 f2n log4 b −g2n b2n, (16) where f2n = 12.32. . . .(2n−3)2(2n−1) 22_.42_{. . . .}_(2n₋₂₎2_(2n) , g2n= 2 n−1 X r=1 1 2r−1 − 1 2r + 1 2n−1 − 1 2n. The first term in the series in (16) is (1₂log4_b−1

4)b

2_{. Denoting this by}_{q(b), we conclude} that J(1, b)>1 +q(b) (since all terms of the series are positive), and

J(1, b) = 1 +q(b) +O(b4log 1/b) for small b. This can be compared with the estimation in 6.9.

References

[Bor1] J.M. Borwein and P.B. Borwein, The arithmetic-geometric mean and fast com-putation of elementary functions, SIAM Review 26 (1984), 351–365, reprinted in

Pi: A Source Book (Springer, 1999), 537–552.

[Bor2] J.M. Borwein and P.B. Borwein, Pi and the AGM, (Wiley, 1987).

[Bow] F. Bowman, Introduction to Elliptic Functions, (Dover Publications, 1961). [Jam] G.J.O. Jameson, An approximation to the arithmetic-geometric mean, Math.

Gaz. 98 (2014).

[JL] G.J.O. Jameson and Nick Lord, Evaluation of P∞ n=1

1

n2 by a double integral, Math. Gaz. 97 (2013).

[Lo1] Nick Lord, Recent calculations of π: the Gauss-Salamin algorithm, Math. Gaz. 76 (1992), 231–242.

[Lo2] Nick Lord, Evaluating integrals using polar areas, Math. Gaz. 96 (2012), 289–296. [New1] D.J. Newman, Rational approximation versus fast computer methods, in Lectures

on approximation and value distribution, pp. 149–174, S´em. Math. Sup. 79

(Presses Univ. Montr´eal, 1982).

[New2] D.J. Newman, A simplified version of the fast algorithm of Brent and Salamin,

Math. Comp. 44 (1985), 207–210, reprinted in Pi: A Source Book (Springer, 1999), 553–556.

[WW] E.T. Whittaker and G.N. Watson, A Course of Modern Analysis (Cambridge Univ. Press, 1927).

Dept. of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF