
Part II From Local to Global: The Problem of Upscaling

5.6 Tests on empirical data

5.6.1 Comparison with Harte’s method

Another very popular upscaling procedure was proposed by Harte (Harte, Zillio et al., 2008; Harte, Smith et al., 2009; Kitzes and Harte, 2015) on the basis of the maximum entropy principle (MaxEnt). Here we briefly describe Harte's method and then compare its performance with that of our framework.

Harte considered a system described by four state variables: the whole forest area $A$, the total number of species $S$, the total number of individuals $N$ and the total metabolic rate $E$. He then defined the joint probability distribution $R(n, \epsilon)$ that a species has $n$ individuals and that one of its individuals, chosen at random, has metabolic rate $\epsilon$, and he maximised its information entropy under three constraints: the normalisation condition, the constraint on the mean number of individuals per species (equal to $N/S$) and the constraint on the mean energy per species (equal to $E/S$). Through the method of Lagrange multipliers, he found the following expression for $R(n, \epsilon)$:

$$R(n, \epsilon) = \frac{e^{-(\lambda_1 + \lambda_2 \epsilon) n}}{Z(\lambda_1, \lambda_2)}, \tag{5.35}$$

where $Z(\lambda_1, \lambda_2)$ is the partition function and $\lambda_1$ and $\lambda_2$ are the multipliers associated with the constraints on $N/S$ and $E/S$. Imposing these constraints, and under some particular assumptions (Harte, Zillio et al., 2008), one finds the following relations for $\lambda_1$ and $\lambda_2$:

$$\frac{S}{N} \sum_{n=1}^{N} e^{-\lambda_1 n} = \sum_{n=1}^{N} \frac{e^{-\lambda_1 n}}{n}, \tag{5.36}$$

$$\lambda_2 = \frac{S}{E}. \tag{5.37}$$

By substituting eqs. (5.36) and (5.37) into eq. (5.35) and integrating the latter over the metabolic rate, one finds that the species-abundance distribution (SAD) is a log-series truncated at $N$,

$$P_{\text{Harte}}(n|1) = \lambda \frac{e^{-\lambda_1 n}}{n}, \tag{5.38}$$

where $\lambda$ is the normalisation constant, which satisfies

$$\lambda = \frac{N}{S} \, \frac{1 - e^{-\lambda_1}}{e^{-\lambda_1} - e^{-\lambda_1 (N+1)}}.$$
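As a minimal numerical sketch of eqs. (5.36) and (5.38), and not part of Harte's original presentation, the Python code below solves eq. (5.36) for $\lambda_1$ by bisection and then builds the truncated log-series. The function names, the bracketing interval and the example values $N = 1000$, $S = 50$ are illustrative assumptions, not data from the forests analysed here.

```python
import math


def mean_abundance(lam1, N):
    """Ratio sum_n e^{-lam1 n} / sum_n e^{-lam1 n}/n, which eq. (5.36) sets equal to N/S."""
    num = sum(math.exp(-lam1 * n) for n in range(1, N + 1))
    den = sum(math.exp(-lam1 * n) / n for n in range(1, N + 1))
    return num / den


def solve_lambda1(N, S, lo=1e-12, hi=50.0, tol=1e-13):
    """Solve eq. (5.36) for lam1 by bisection (assumes the root lies in [lo, hi])."""
    target = N / S
    if not mean_abundance(hi, N) < target < mean_abundance(lo, N):
        raise ValueError("N/S is not reachable with lam1 in [lo, hi]")
    # mean_abundance decreases monotonically with lam1, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_abundance(mid, N) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def harte_sad(N, S):
    """Return lam1, the normalisation constant and the SAD of eq. (5.38) for n = 1..N."""
    lam1 = solve_lambda1(N, S)
    x = math.exp(-lam1)
    lam = (N / S) * (1.0 - x) / (x - x ** (N + 1))
    sad = [lam * math.exp(-lam1 * n) / n for n in range(1, N + 1)]
    return lam1, lam, sad


# Illustrative values only (hypothetical, not an empirical forest).
lam1, lam, sad = harte_sad(N=1000, S=50)
print(lam1, lam, sum(sad))
```

If $\lambda_1$ solves eq. (5.36), the probabilities should sum to one and the mean abundance should equal $N/S$, which gives a quick consistency check on the closed form for $\lambda$.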

With MaxEnt, Harte also obtained a form for the spatial abundance distribution $P(k|n, p)$, which describes the probability that a species has $k$ individuals in a sample covering a fraction $p$ of the total area $A$, given that it has total abundance $n$ in the whole forest area. Under the normalisation constraint and the constraint on the mean number of individuals in the sample (which is $np$ if one assumes that the number of individuals scales linearly with the surveyed area), Harte found that $P(k|n, p)$ is given by a geometric distribution truncated at $n$,

$$P(k|n, p) = P_{\text{geom}}(k|n, p) = \lambda' e^{-\lambda_3 k},$$

where $\lambda'$ is the normalisation constant and $\lambda_3$ is the Lagrange multiplier associated with the constraint on the mean number of individuals at $p$. Under some particular assumptions (Harte, Zillio et al., 2008), Harte found the following relation

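$$np = \frac{\sum_{k=0}^{n} k \, e^{-\lambda_3 k}}{\sum_{k=0}^{n} e^{-\lambda_3 k}} = \frac{1}{1 - e^{-\lambda_3 (n+1)}} \left[ \frac{e^{-\lambda_3}}{1 - e^{-\lambda_3}} - e^{-\lambda_3 (n+1)} \left( n + \frac{1}{1 - e^{-\lambda_3}} \right) \right], \tag{5.39}$$

from which one can numerically compute the value of $\lambda_3$.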
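The numerical step mentioned above can be sketched as follows. This is an illustrative bisection in Python, not Harte's own implementation: the function names are hypothetical, and the sketch is restricted to $0 < p \le 1/2$, where $\lambda_3 \ge 0$ and the exponentials cannot overflow. It also compares the direct sums with the closed form on the right-hand side of eq. (5.39).

```python
import math


def mean_in_sample(lam3, n):
    """Mean of the truncated geometric distribution on k = 0..n (direct sums)."""
    weights = [math.exp(-lam3 * k) for k in range(n + 1)]
    return sum(k * w for k, w in enumerate(weights)) / sum(weights)


def mean_closed_form(lam3, n):
    """Right-hand side of eq. (5.39); undefined at lam3 = 0."""
    x = math.exp(-lam3)
    bracket = x / (1.0 - x) - x ** (n + 1) * (n + 1.0 / (1.0 - x))
    return bracket / (1.0 - x ** (n + 1))


def solve_lambda3(n, p, hi=50.0, tol=1e-13):
    """Solve mean_in_sample(lam3, n) = n * p by bisection, for 0 < p <= 1/2."""
    if not 0.0 < p <= 0.5:
        raise ValueError("this sketch only covers 0 < p <= 1/2 (lam3 >= 0)")
    target = n * p
    lo = 0.0
    # The mean decreases monotonically with lam3, starting from n/2 at lam3 = 0.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_in_sample(mid, n) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_geom_zero(n, p):
    """P_geom(0 | n, p) = lam' = 1 / sum_{k=0}^{n} e^{-lam3 k}."""
    lam3 = solve_lambda3(n, p)
    return 1.0 / sum(math.exp(-lam3 * k) for k in range(n + 1))


# Hypothetical example: a species with n = 10 individuals, sampled at p = 0.1.
print(solve_lambda3(n=10, p=0.1), p_geom_zero(n=10, p=0.1))
```

For $p > 1/2$ one could exploit the symmetry $k \to n - k$, which maps $\lambda_3$ to $-\lambda_3$; that case is left out to keep the sketch short.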

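He thus found his key expression for the species-area relationship:

$$S_p = S \sum_{n=1}^{N} \left( 1 - P_{\text{geom}}(0|n, p) \right) P_{\text{Harte}}(n|1) = -S \sum_{n=1}^{N} \left[ 1 - \left( \sum_{k=0}^{n} e^{-\lambda_3 k} \right)^{-1} \right] \frac{1}{\log(\lambda_1)} \frac{e^{-\lambda_1 n}}{n}. \tag{5.40}$$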
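The two forms in eq. (5.40) can be connected by the short computation below, given as a reading aid. The first line follows directly from the normalisation of the truncated geometric distribution. The second line assumes $N \to \infty$ and $\lambda_1 \ll 1$, an approximation that the text does not state explicitly but that appears to underlie the factor $-1/\log(\lambda_1)$.

```latex
% Normalisation of the truncated geometric distribution fixes P_geom(0|n,p):
\sum_{k=0}^{n} \lambda' e^{-\lambda_3 k} = 1
\;\Longrightarrow\;
P_{\text{geom}}(0|n,p) = \lambda'
  = \left( \sum_{k=0}^{n} e^{-\lambda_3 k} \right)^{-1}
  = \frac{1 - e^{-\lambda_3}}{1 - e^{-\lambda_3 (n+1)}} .

% Normalisation of the log-series for N -> infinity and small lambda_1:
\lambda \sum_{n=1}^{\infty} \frac{e^{-\lambda_1 n}}{n}
  = -\lambda \log\!\left( 1 - e^{-\lambda_1} \right) = 1
\;\Longrightarrow\;
\lambda = -\frac{1}{\log\!\left( 1 - e^{-\lambda_1} \right)}
  \approx -\frac{1}{\log(\lambda_1)} .
```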

Let us see how Harte exploited this result to upscale species richness from a sample of area $a/2$ to the doubled area $a$. For this special case, $p = 1/2$ and $P_{\text{geom}}(0|n, p)$ has the simple form (Harte, Smith et al., 2009)

$$P_{\text{geom}}(0|n, 1/2) = \frac{1}{n+1},$$

since, from eq. (5.39), the $\lambda_3$ parameter equals zero, so that the distribution is uniform over $k = 0, \dots, n$.

Then, by inserting this form into eq. (5.40) and performing the computations, the total number of species in $a/2$ is given by

$$S(a/2) = S(a) \, e^{\lambda_1(a)} - N(a) \, \frac{1 - e^{-\lambda_1(a)}}{e^{-\lambda_1(a)} - e^{-\lambda_1(a) (N(a)+1)}} \left( 1 - \frac{e^{-\lambda_1(a) N(a)}}{N(a) + 1} \right), \tag{5.41}$$

where we have made explicit the dependence of the number of individuals, the number of species and the Lagrange multiplier $\lambda_1$ on the area.
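The computation behind eq. (5.41) can be sketched as follows. This is a sketch of the intermediate steps, which the text leaves implicit. It assumes $P_{\text{geom}}(0|n, 1/2) = 1/(n+1)$ and the exact normalisation $\lambda \sum_{n=1}^{N} e^{-\lambda_1 n}/n = 1$ of eq. (5.38); the shorthand $x = e^{-\lambda_1(a)}$ and $N = N(a)$ is introduced here only for brevity.

```latex
\begin{align*}
S(a/2) &= S(a) \sum_{n=1}^{N} \left( 1 - \frac{1}{n+1} \right) \lambda \frac{x^{n}}{n}
        = S(a) \, \lambda \sum_{n=1}^{N} \frac{x^{n}}{n+1} \\
       &= \frac{S(a) \, \lambda}{x}
          \left[ \sum_{m=1}^{N} \frac{x^{m}}{m} - x + \frac{x^{N+1}}{N+1} \right]
        = \frac{S(a)}{x} - S(a) \, \lambda \left( 1 - \frac{x^{N}}{N+1} \right),
\end{align*}
% with S(a) \lambda = N (1 - x) / (x - x^{N+1}) from the expression for \lambda
% given below eq. (5.38).
```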

Since, by hypothesis, the number of individuals in $a$ is given by $N(a) = 2N(a/2)$, eq. (5.41) contains only two unknowns: the total number of species in $a$, $S(a)$, and the value of the Lagrange multiplier $\lambda_1(a)$. One can therefore numerically solve eq. (5.41) together with eq. (5.36) and obtain the species richness in the area $a$. By iterating the procedure, one can upscale the biodiversity to areas that are power-of-two multiples of the anchor area $p^* A$.
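The doubling step just described can be sketched in Python as below. This is a minimal illustration under stated assumptions, not the code used for the results in this section: the function names, the bisection bracket and the anchor values (60 species among 1000 individuals) are hypothetical. For a trial $\lambda_1$, eq. (5.36) fixes $S(a)$, and eq. (5.41), taken in the form $S(a/2) = S(a)\,e^{\lambda_1(a)} - \dots$, then gives the implied richness in $a/2$, which is matched to the observed value.

```python
import math


def richness_from_lambda1(lam1, N):
    """S implied by eq. (5.36) for a given lam1 and N."""
    num = sum(math.exp(-lam1 * n) / n for n in range(1, N + 1))
    den = sum(math.exp(-lam1 * n) for n in range(1, N + 1))
    return N * num / den


def half_area_richness(S, N, lam1):
    """Right-hand side of eq. (5.41): richness in a/2 given S(a), N(a), lam1(a)."""
    x = math.exp(-lam1)
    norm = (1.0 - x) / (x - x ** (N + 1))
    return S / x - N * norm * (1.0 - x ** N / (N + 1))


def upscale_once(S_half, N_half, lo=1e-9, hi=10.0, tol=1e-13):
    """Solve eqs. (5.36) and (5.41) for S(a) and lam1(a), with N(a) = 2 N(a/2)."""
    N = 2 * N_half

    def residual(lam1):
        S = richness_from_lambda1(lam1, N)
        return half_area_richness(S, N, lam1) - S_half

    if not residual(lo) < 0.0 < residual(hi):
        raise ValueError("no root for lam1 in [lo, hi]")
    # The residual increases with lam1, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    lam1 = 0.5 * (lo + hi)
    return richness_from_lambda1(lam1, N), lam1


def upscale(S0, N0, doublings):
    """Iterate the doubling step; returns [(N, S), ...] starting from the anchor."""
    out = [(N0, S0)]
    S, N = S0, N0
    for _ in range(doublings):
        S, _lam1 = upscale_once(S, N)
        N *= 2
        out.append((N, S))
    return out


# Hypothetical anchor sample: 60 species among 1000 individuals.
for N, S in upscale(60.0, 1000, doublings=3):
    print(N, round(S, 1))
```

The upper bracket is kept moderate on purpose: for large $\lambda_1$ the two terms of eq. (5.41) are both of order $N e^{\lambda_1}$ and nearly cancel, which would degrade the floating-point accuracy of the residual.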

We apply Harte's procedure to four empirical forests (see table 5.8). For each of them, we sub-sample a fraction $p = 0.1$ of the individuals and predict the species richness at the $p = 1$ scale, where the true value of $S^*$ is known (second column of table 5.8). Because Harte's upscaling procedure only allows one to scale up by successive factors of two (Harte, Smith et al., 2009), we cannot obtain an estimate exactly at $p = 1$. The last two columns of table 5.8 refer to the predictions at $p = 0.8$ and $p = 1.6$, which represent, respectively, a lower and an upper bound for the species richness at the desired scale. For the first three forests, Harte's method does not perform as well as the others, with a typical error of around 20%. For the last forest, its performance is comparable to that of Chao$_{\text{wor}}$, while being slightly worse than that of both the NB and LS methods. These two latter methods yield the very same results because the best fit of the empirical SAD with a negative binomial resulted in an $r$ parameter very close to zero. The SAD, in this case, does in fact resemble a log-series (the hypothesis on which Harte's method is based).
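The scales $p = 0.8$ and $p = 1.6$ quoted above follow from doubling the starting fraction $p = 0.1$ three and four times. As a small illustration (the helper name is hypothetical), the bracketing scales for any starting fraction can be computed as:

```python
import math


def bracketing_scales(p_start, p_target=1.0):
    """Scales p_start * 2**m and p_start * 2**(m + 1) with lower <= p_target < upper."""
    m = math.floor(math.log2(p_target / p_start))
    return p_start * 2 ** m, p_start * 2 ** (m + 1)


lower, upper = bracketing_scales(0.1)
print(lower, upper)  # scales just below and just above p = 1
```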

Finally, we compare the NB, LS, Chao$_{\text{wor}}$ and Harte methods on the BCI empirical data when a contiguous area is sampled (see table 5.9). More precisely, we sub-sample a fraction $p = 0.25$ and $p = 0.5$ of the individuals and predict the species richness at the $p = 1$ scale. In both cases the NB method outperforms Harte's, whose estimates are comparable to those of the LS method. This is in accordance with theoretical expectations, since both the LS and Harte procedures are based on the assumption of a log-series SAD.

In document Mathematical modelling and statistics of biodiversity (Page 155-157)