Chapter 4. Bayesian network models of habitat suitability

4.1.3. The use of expert knowledge in BBNs

4.1.3.2. Aggregation of expert knowledge

When knowledge is elicited from multiple experts, some form of aggregation is required to combine their answers. There are two main types of aggregation: behavioural and mathematical. Behavioural aggregation occurs during, rather than after, the elicitation session and attempts to generate agreement among the experts by having them interact in some way (Clemen and Winkler, 1999). It is therefore only suitable for interactive groups or Delphi situations, through which a consensus is elicited from the group as a whole (O'Hagan et al., 2006).

Behavioural aggregation has the advantage that an aggregated result, based on a consensus, is produced during the session (if the experts are able and willing to reach one), instead of requiring separately elicited beliefs to be combined afterwards (O'Hagan et al., 2006). However, it can foster a group-think situation in which no one truly thinks but unconsciously complies, and it can be very time consuming if group-think does not facilitate unconscious agreement (Meyer and Booker, 1991). It can also suppress differences between the experts' answers and the reasons for those differences, both of which can be critical to the understanding, analysis, and use of these data (Meyer and Booker, 1991).

Mathematical aggregation involves eliciting answers from each expert individually and independently of the others, and then mathematically combining the results into a single estimate (O'Hagan et al., 2006). This form of aggregation has the advantage of not having to be planned as early, or as closely in conjunction with the elicitation situations, as behavioural aggregation. It also means that different mathematical schemes can be applied in succession to the individuals' data, whereas behavioural aggregation can usually only be done once (Meyer and Booker, 1991). However, like any type of aggregation, mathematical aggregation obscures the differences between the experts' answers (Meyer and Booker, 1991). Meyer and Booker (1991) also caution that it is easy to perform mathematical aggregation incorrectly, for example by combining the estimates from experts who have made such different assumptions in answering the question that they have essentially answered different questions. In addition, they suggest that mathematical aggregation can lead to the creation of a single answer that all of the experts would reject.

In a review of the literature on combining probability distributions from experts in risk analysis, Clemen and Winkler (1999) suggest that mathematical aggregation outperforms intuitive aggregation (based on the decision-maker's assessment, rather than formal aggregation) and that mathematical and behavioural approaches tend to be similar in performance, with mathematical rules having a slight edge. O'Hagan et al. (2006) also report that, in general, group elicitation often performs less well than individual elicitations followed by a simple mathematical averaging of the individual judgements.

4.1.3.2.1. Aggregation estimators

The most commonly used method of combining a set of answers is to calculate a single summary value (estimator) based on all the values in a data set, with the most popular estimators for central tendency being the mean, median and geometric mean (Meyer and Booker, 1991). The mean is the average of the values and gives equal weight to each datum. This has a serious implication: if only a few experts provide answers and one expert gives an answer that is far away in value from the rest, then that extreme value will greatly influence the mean, which may not be desirable, especially if the extreme value appears questionable or unreasonable (Meyer and Booker, 1991). However, it is possible to run different versions of the models, with and without the extreme values.
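The sensitivity of the mean to a single extreme answer can be illustrated with a minimal sketch. The five probabilities below are invented for illustration and are not taken from any elicitation in this chapter; the helper name `mean_with_and_without` is likewise hypothetical.

```python
from statistics import mean

# Hypothetical elicited probabilities from five experts (illustrative only).
# The last value is far from the rest and plays the role of the extreme answer.
expert_estimates = [0.20, 0.25, 0.30, 0.25, 0.90]


def mean_with_and_without(values, suspect_index):
    """Return (mean of all values, mean excluding the value at suspect_index)."""
    kept = [v for i, v in enumerate(values) if i != suspect_index]
    return mean(values), mean(kept)


mean_all, mean_without = mean_with_and_without(expert_estimates, suspect_index=4)
print(f"mean with the extreme value:    {mean_all:.3f}")
print(f"mean without the extreme value: {mean_without:.3f}")
```

With so few experts, one answer moves the aggregate noticeably, which is why comparing both versions of a model can be informative.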

To overcome the influence of extreme values when forming an aggregation estimate, Meyer and Booker (1991) suggest using the median or the geometric mean (the average of the data values on a logarithmic scale), because the central values of the data set tend to influence both of these estimators while the extreme values do not. The median is the 50th percentile value: half of the data are larger than the median and half are smaller. If the data set has an odd sample size, the median is the central value of the ordered data points; if the data set has an even sample size, the median is the average, or halfway point, of the two centre values (Meyer and Booker, 1991).
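The estimators described above can be compared on a small data set. This is a sketch using Python's standard `statistics` module and invented probabilities. It assumes strictly positive values, as the geometric mean requires.

```python
import math
from statistics import geometric_mean, mean, median

# Hypothetical elicited probabilities (illustrative only); 0.90 is the extreme value.
odd_sample = [0.20, 0.25, 0.30, 0.25, 0.90]
even_sample = [0.20, 0.25, 0.30, 0.90]

# Odd sample size: the median is the central value of the ordered data.
median_odd = median(odd_sample)

# Even sample size: the median is halfway between the two centre values.
median_even = median(even_sample)

# The geometric mean is the arithmetic mean taken on a logarithmic scale
# and then transformed back to the original scale.
geo_mean = geometric_mean(odd_sample)
geo_mean_by_hand = math.exp(mean(math.log(v) for v in odd_sample))

print(f"arithmetic mean: {mean(odd_sample):.3f}")
print(f"geometric mean:  {geo_mean:.3f}")
print(f"median (odd n):  {median_odd:.3f}")
print(f"median (even n): {median_even:.3f}")
```

In this invented data set the median is unaffected by the extreme value, while the geometric mean is pulled towards it less than the arithmetic mean is.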

Another alternative is to use a weighted average or mean, where each datum (expert answer) is given its own individual weight. The advantage of this aggregation method is that the analyst can control which values (or experts) have the most influence on the estimator (Meyer and Booker, 1991). However, the biggest disadvantage is that the weights must be determined for each expert, which can be difficult (Meyer and Booker, 1991). There are several different ways of determining weights, but Meyer and Booker (1991) suggest using equal weights unless unusual circumstances call for different weights. O'Hagan et al. (2006) also suggest that the general message of the literature is that simple aggregation methods (e.g. a simple average) work well in comparison with more complex methods. This is also suggested by Clemen and Winkler (1999), in a review of the literature, who argue that simple rules are also practical because of their ease of use and robust performance.
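A weighted average with equal weights as the default can be sketched as follows. The function name `weighted_mean`, the example answers and the example weights are all hypothetical; how the weights should be chosen is exactly the difficulty noted above and is not addressed here.

```python
def weighted_mean(values, weights=None):
    """Weighted average of expert answers.

    If no weights are supplied, every expert receives the same weight,
    which reduces to the ordinary (unweighted) mean.
    """
    if weights is None:
        weights = [1.0] * len(values)
    if len(weights) != len(values):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one weight must be positive")
    # Dividing by the total means the weights need not sum to one beforehand.
    return sum(w * v for w, v in zip(weights, values)) / total


# Hypothetical answers from five experts (illustrative only).
answers = [0.20, 0.25, 0.30, 0.25, 0.90]

equal_weight_estimate = weighted_mean(answers)
# Hypothetical weights chosen by the analyst to reduce the influence of the last expert.
down_weighted_estimate = weighted_mean(answers, weights=[1, 1, 1, 1, 0.25])

print(f"equal weights:        {equal_weight_estimate:.3f}")
print(f"last expert at 0.25x: {down_weighted_estimate:.3f}")
```

Normalising inside the function is a design choice that lets the analyst state relative weights without having to rescale them.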

There is limited coverage in the literature on the aggregation of probability values from experts for ecological BBNs. However, Uusitalo et al. (2005) combined the probability distributions of experts in their BBN for estimating the salmon smolt capacity of rivers using a simple average, based on evidence that simple combination rules outperform group judgements and compare well with more complex combination rules. They also considered their experts to be exchangeable, in the sense that their probabilities were treated equally and symmetrically. Martin et al. (2005) also used an unweighted average to determine the mean response elicited from their experts (on the impact of grazing on birds). By taking an unweighted average they avoided difficulties concerned with rating the comparative 'accuracy' of each expert's opinion. They also note that, to some extent, each expert's ability to provide this measure was contained within the expert data, since experts only provided responses for which they were confident.
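An unweighted average of several experts' probability distributions, often called an equal-weight linear opinion pool, can be sketched as below. This is an illustration only, not the procedure or code used by Uusitalo et al. (2005) or Martin et al. (2005). The state names, the three expert distributions and the function name `average_distributions` are hypothetical.

```python
import math


def average_distributions(distributions):
    """Equal-weight average of discrete probability distributions.

    Each distribution is a list of probabilities over the same ordered states
    and must sum to one. The result is again a valid distribution.
    """
    n_states = len(distributions[0])
    for dist in distributions:
        if len(dist) != n_states:
            raise ValueError("all distributions must cover the same states")
        if not math.isclose(sum(dist), 1.0, abs_tol=1e-9):
            raise ValueError("each distribution must sum to one")
    n_experts = len(distributions)
    return [sum(dist[k] for dist in distributions) / n_experts
            for k in range(n_states)]


# Hypothetical states of a habitat suitability node and three experts'
# elicited probabilities for one parent configuration (illustrative only).
states = ["low", "medium", "high"]
expert_distributions = [
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.7, 0.2, 0.1],
]

pooled = average_distributions(expert_distributions)
for state, probability in zip(states, pooled):
    print(f"{state}: {probability:.3f}")
```

Treating every expert identically in this way corresponds to the exchangeability assumption described above. In a BBN the same averaging would be repeated for each row of a conditional probability table.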