Chapter 3 Entropy Maximisation

3.4 Numerical Example

This section provides an illustrative example to demonstrate some features of the entropy concept, especially in relation to probability construction and subsequent update with new data or information.

Problem setup: Consider the system of container movements between a seaport and various warehouses in the metropolitan region. The only information available to the study team is that road, rail and barge are the only modes available for transporting 120 TEUs of cargo from the port to the warehouses. The research team was asked to determine the mode share, that is, the quantity of containers carried by each mode.

Solution approach: Since the researchers wanted to avoid assuming or adding more knowledge than they had, they employed the principle of entropy maximisation to solve the problem. For simplicity, Jaynes’ entropy will be used to find the mode shares directly. If 𝑑 = 120 and 𝑑1, 𝑑2, 𝑑3 are the quantities of cargo carried by road, rail and barge respectively, with corresponding mode shares 𝑝1, 𝑝2, 𝑝3 (so that 𝑝𝑖 = 𝑑𝑖/𝑑), then from Equation (3.14) we maximise:

\[
H(p_1, p_2, p_3) = -\sum_{i=1}^{3} p_i \ln(p_i)
\]

Subject to:

\[
p_1 + p_2 + p_3 = 1, \qquad p_i \ge 0; \quad i = 1, 2, 3
\]

Forming the Lagrangian and applying the first-order condition for a maximum of 𝐻(𝑝1, 𝑝2, 𝑝3) with respect to each 𝑝𝑖 gives the following equation:

\[
-\ln(p_i) - 1 - \varphi = 0; \quad i = 1, 2, 3
\]

where πœ‘ is the Lagrangian multiplier associated with the equality constraint. Solving for 𝑝𝑖 by enforcing the equality constraints we have the maximum of 𝐻(𝑝1, 𝑝2, 𝑝3) occurring at 𝑝1 = 𝑝2 = 𝑝3 = 1/3 and the corresponding flows are 𝑑1 = 𝑑2 = 𝑑3 = 40. It can also be shown using Equation (3.10) that 𝑑1 = 𝑑2 = 𝑑3 = 40 has the highest number of possible ways of occurring (entropy).

Now suppose that new information becomes available in terms of the average cost of transport by each mode: the average cost of road is 𝑐1 = $5, the average cost of rail is 𝑐2 = $8, the average cost of barge is 𝑐3 = $10, and the average cost over all modes is 𝑐 = $7. As discussed in Section 3.3, there are two ways of updating the prior probabilities 𝓅1 = 𝓅2 = 𝓅3 = 1/3: the absolute entropy update (AEU) approach and the relative entropy update (REU) approach. Each of these updating methods is applied to the problem, starting with the REU and then the AEU.

Let 𝑝1, 𝑝2, 𝑝3 be the updated (posterior) probabilities.

Relative entropy update (REU): This approach minimises the Kullback-Leibler function in (3.18), with the new information entering as a constraint, as follows:

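\[
KL(p_1, p_2, p_3) = \min \left( p_1 \log\frac{p_1}{𝓅_1} + p_2 \log\frac{p_2}{𝓅_2} + p_3 \log\frac{p_3}{𝓅_3} \right)
\]

subject to:

\[
c_1 p_1 + c_2 p_2 + c_3 p_3 = c, \qquad p_1 + p_2 + p_3 = 1, \qquad p_i \ge 0; \quad i = 1, 2, 3
\]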

Solving the above problem, we have the following posterior probability distributions governed by the parameter 𝛽:

\[
p_i = \frac{𝓅_i \, e^{-\beta c_i}}{\sum_{j=1}^{3} 𝓅_j \, e^{-\beta c_j}}; \quad i = 1, 2, 3
\]

Since 𝓅1 = 𝓅2 = 𝓅3 = 1/3, it implies that:

\[
p_i = \frac{e^{-\beta c_i}}{\sum_{j=1}^{3} e^{-\beta c_j}}; \quad i = 1, 2, 3
\]

With some algebraic manipulation (or use of a root-finding algorithm), the parameter value 𝛽 = 0.1562 minimises the relative entropy function and satisfies all the constraints. This produces 𝑝1 = 0.48, 𝑝2 = 0.30 and 𝑝3 = 0.22. Thus 𝑑1 = 58, 𝑑2 = 36 and 𝑑3 = 26, indicating that the new information has increased the share of road by 14.7 percentage points (from 33.3% to 48%).
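The root-finding step mentioned above can be sketched with a plain bisection on the average-cost constraint. This is an illustrative sketch, not the procedure used in the study: the bracket [0, 1] and the function names `shares` and `expected_cost` are assumptions, and the bisection relies on the expected cost falling as 𝛽 grows.

```python
from math import exp

costs = [5.0, 8.0, 10.0]  # average cost of road, rail, barge ($)
mean_cost = 7.0           # average cost over all modes ($)
total_teu = 120           # containers to be allocated

def shares(beta, costs):
    """Posterior shares p_i proportional to exp(-beta * c_i) (uniform prior)."""
    weights = [exp(-beta * c) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]

def expected_cost(beta):
    """Left-hand side of the cost constraint: c1*p1 + c2*p2 + c3*p3."""
    return sum(p * c for p, c in zip(shares(beta, costs), costs))

# Bisection on an assumed bracket: expected_cost(0) is above the target
# and expected_cost(1) is below it, and the function decreases in beta.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if expected_cost(mid) > mean_cost:
        lo = mid  # shares not yet tilted enough towards cheap modes
    else:
        hi = mid
beta = 0.5 * (lo + hi)

p = shares(beta, costs)
flows = [round(total_teu * pi) for pi in p]
print("beta  =", round(beta, 4))
print("p     =", [round(pi, 2) for pi in p])
print("flows =", flows)
```

Any bracketing root finder would serve equally well here; bisection is used only because it needs nothing beyond the standard library.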

Absolute entropy update (AEU): This approach discards the prior probabilities and reconstructs new probabilities using the new information together with the existing information. Thus, we maximise:

\[
H(p_1, p_2, p_3) = -\sum_{i=1}^{3} p_i \ln(p_i)
\]

subject to:

\[
c_1 p_1 + c_2 p_2 + c_3 p_3 = c, \qquad p_1 + p_2 + p_3 = 1, \qquad p_i \ge 0; \quad i = 1, 2, 3
\]

Solving the above problem produces the following probability distributions governed by:

\[
p_i = \frac{e^{-\beta c_i}}{\sum_{j=1}^{3} e^{-\beta c_j}}; \quad i = 1, 2, 3
\]

Again, the value 𝛽 = 0.1562 maximises 𝐻(𝑝1, 𝑝2, 𝑝3) and satisfies both the old and the new constraints, giving 𝑝1 = 0.48, 𝑝2 = 0.30 and 𝑝3 = 0.22. With 𝛽 held at this value, the probability or share of each mode can readily be computed for any change in the cost of each mode:

\[
p_i = \frac{e^{-0.1562\, c_i}}{\sum_{j=1}^{3} e^{-0.1562\, c_j}}; \quad i = 1, 2, 3 \tag{3.19}
\]
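Equation (3.19) is straightforward to evaluate for alternative costs. The sketch below keeps 𝛽 fixed at 0.1562, as the equation does; the function name `mode_shares` and the scenario in which the road cost rises from $5 to $7 are hypothetical and serve only as an illustration.

```python
from math import exp

BETA = 0.1562  # calibrated value taken from the text

def mode_shares(costs, beta=BETA):
    """Evaluate Equation (3.19): p_i = exp(-beta*c_i) / sum_j exp(-beta*c_j)."""
    weights = [exp(-beta * c) for c in costs]
    z = sum(weights)
    return [w / z for w in weights]

# Costs from the example: road, rail, barge.
baseline = mode_shares([5.0, 8.0, 10.0])

# Hypothetical scenario (not from the text): road cost rises to $7.
scenario = mode_shares([7.0, 8.0, 10.0])

for name, b, s in zip(["road", "rail", "barge"], baseline, scenario):
    print(f"{name:5s} baseline {b:.3f}  scenario {s:.3f}")
```

Because the shares depend on costs only through the exponential weights, raising the cost of one mode lowers its share and raises the shares of the other two.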

In summary, we started with a lack of information about the container system, other than the constraint that the number of containers carried by the three modes should add up to 120. Clearly, there are many values of 𝑑1, 𝑑2 and 𝑑3 that satisfy this constraint: we could have chosen 𝑑1 = 120; 𝑑2 = 0; 𝑑3 = 0 or 𝑑1 = 20; 𝑑2 = 70; 𝑑3 = 30, and both would satisfy the constraint. Neither choice of values seems particularly appropriate, because each goes beyond what we know; we would be assuming something that could turn out to be true or false. To avoid introducing potentially false information, we employed the principle of maximum entropy to select the probability distribution (or modal demands) that is consistent with the constraint and contains the least added information (maximum entropy). The resulting probability distribution turns out to be the uniform distribution with equal mode shares.

The researchers were then given a new set of information in terms of the average cost of using each mode. This required the probability distribution to be updated, which was done using two updating methods: AEU and REU. Both methods were shown to yield the same results.
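The agreement between the two methods in this example can be traced to the uniform prior. A short derivation, assuming natural logarithms in the Kullback-Leibler function and using the normalisation constraint, is:

```latex
\[
\sum_{i=1}^{3} p_i \ln\frac{p_i}{1/3}
  = \ln 3 \sum_{i=1}^{3} p_i + \sum_{i=1}^{3} p_i \ln(p_i)
  = \ln 3 - H(p_1, p_2, p_3)
\]
```

Since ln 3 is a constant, minimising the relative entropy against the uniform prior is the same problem as maximising 𝐻(𝑝1, 𝑝2, 𝑝3) under the same constraints. With a non-uniform prior the two objectives differ, so this argument does not by itself establish agreement in that case.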

The numerical example also demonstrates the subjective nature of the probabilities constructed by the researchers to cope with their lack of knowledge about the system. It shows that a new research team with a new or additional set of information about the system is likely to generate different probability distributions. It is also clear that if two research teams have the same set of information, they will end up constructing the same probability distributions. Additionally, all the information does not have to be available to the two teams at the same time: each can receive a different amount of information at a given time, produce different probability distributions while the information is still arriving, and update them as new information comes along. In the end, both teams will generate the same probability distributions if each turns out to have access to the same information.

It has also been demonstrated that having additional information ought to yield, on average, a better probability distribution, in the sense that it carries less entropy or uncertainty.
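For this example, the drop in entropy can be checked directly by comparing the uniform prior with the updated shares. The sketch below recomputes the posterior from 𝛽 = 0.1562 and the costs given in the text; the helper name `entropy` is illustrative.

```python
from math import exp, log

def entropy(p):
    """Entropy H = -sum p_i ln(p_i), in nats; zero shares contribute nothing."""
    return -sum(pi * log(pi) for pi in p if pi > 0)

prior = [1 / 3, 1 / 3, 1 / 3]  # before the cost information

beta = 0.1562                  # value reported in the text
costs = [5.0, 8.0, 10.0]       # road, rail, barge ($)
weights = [exp(-beta * c) for c in costs]
posterior = [w / sum(weights) for w in weights]

h_prior = entropy(prior)
h_post = entropy(posterior)
print(f"H(prior)     = {h_prior:.4f}")
print(f"H(posterior) = {h_post:.4f}")
print("entropy reduced:", h_post < h_prior)
```

The prior entropy equals ln 3, the largest value possible for three outcomes, so any posterior that departs from equal shares must have lower entropy.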