Chapter 3 Entropy Maximisation
3.4 Numerical Example
This section provides an illustrative example to demonstrate some features of the entropy concept, especially in relation to probability construction and subsequent update with new data or information.
Problem setup: Consider the system of container movements between a seaport and various warehouses in the metropolitan region. The only information available to the study team is that 120 TEUs of cargo must be transported from the port to the warehouses and that the only modes of transport available are road, rail and barge. The research team was asked to determine the mode share, or equivalently the quantity of containers carried by each mode.
Solution approach: Since the researchers wanted to avoid assuming or adding more knowledge than they had, they employed the principle of entropy maximisation to solve the problem. For simplicity, Jaynes' entropy will be used to find the mode shares directly. If $N = 120$ and $N_1, N_2, N_3$ are the quantities of cargo carried by road, rail and barge respectively, with corresponding mode shares $p_1, p_2, p_3$, then from Equation (3.14) we maximise:
$$H(p_1, p_2, p_3) = -\sum_{i=1}^{3} p_i \ln(p_i)$$
subject to:
$$p_1 + p_2 + p_3 = 1, \qquad p_i \geq 0; \quad i = 1, 2, 3$$
Forming the Lagrangian and enforcing the first-order condition for a maximum of $H(p_1, p_2, p_3)$ with respect to each $p_i$ gives:
$$-\ln(p_i) - 1 - \lambda = 0; \quad i = 1, 2, 3$$
where $\lambda$ is the Lagrangian multiplier associated with the equality constraint. The condition implies $p_i = e^{-1-\lambda}$, the same value for every mode, so enforcing the equality constraint places the maximum of $H(p_1, p_2, p_3)$ at $p_1 = p_2 = p_3 = 1/3$, and the corresponding flows are $N_1 = N_2 = N_3 = 40$. It can also be shown using Equation (3.10) that $N_1 = N_2 = N_3 = 40$ has the highest number of possible ways of occurring (entropy).
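The claim that the equal split can occur in the largest number of ways can be checked by brute force. The sketch below assumes that Equation (3.10) is the usual multinomial count $W = N!/(N_1!\,N_2!\,N_3!)$, since its exact form is not reproduced in this section; the function name `count_ways` is illustrative only.

```python
from math import factorial

N = 120  # total TEUs to allocate across road, rail and barge

def count_ways(n1, n2, n3):
    """Multinomial count N!/(n1! n2! n3!); assumed form of Equation (3.10)."""
    total = n1 + n2 + n3
    return factorial(total) // (factorial(n1) * factorial(n2) * factorial(n3))

# Enumerate every non-negative integer allocation (n1, n2, n3) that sums to N
allocations = [(n1, n2, N - n1 - n2)
               for n1 in range(N + 1)
               for n2 in range(N - n1 + 1)]

# The allocation with the largest number of arrangements
best = max(allocations, key=lambda a: count_ways(*a))
print(best)  # (40, 40, 40)
```

Under that assumption, an extreme allocation such as (120, 0, 0) can occur in only one way, which is why it sits at the opposite end of the ranking.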
Now suppose that new information becomes available in terms of the average cost of transport by each mode: the average cost of road is $c_1 = \$5$, the average cost of rail is $c_2 = \$8$, the average cost of barge is $c_3 = \$10$, and the average cost over all modes is $\bar{c} = \$7$. As discussed in Section 3.3, there are two ways of updating the prior probabilities $q_1 = q_2 = q_3 = 1/3$: the absolute entropy update (AEU) approach and the relative entropy update (REU) approach. Each of these updating methods is applied to the problem, starting with the REU and then the AEU.
Let $p_1, p_2, p_3$ be the updated (posterior) probabilities.
Relative entropy update (REU): This approach uses the Kullback-Leibler function in (3.18), with only the new information converted into a constraint, as follows:
$$\min_{p_1, p_2, p_3} \; KL(p_1, p_2, p_3) = p_1 \log\frac{p_1}{q_1} + p_2 \log\frac{p_2}{q_2} + p_3 \log\frac{p_3}{q_3}$$
subject to:
$$p_1 c_1 + p_2 c_2 + p_3 c_3 = \bar{c}, \qquad p_1 + p_2 + p_3 = 1, \qquad p_i \geq 0; \quad i = 1, 2, 3$$
Solving the above problem, we have the following posterior probability distribution governed by the parameter $\beta$:
$$p_i = \frac{q_i e^{-\beta c_i}}{\sum_{j=1}^{3} q_j e^{-\beta c_j}}; \quad i = 1, 2, 3$$
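The form of this solution follows from a standard Lagrangian argument. As a sketch, write $q_i$ for the prior probabilities, $c_i$ for the modal costs and $\bar{c}$ for the overall average cost, take $\log$ to be the natural logarithm, and let $\beta$ be the multiplier on the cost constraint and $\mu$ the multiplier on the normalisation constraint ($\mu$ is a label introduced here for illustration). The non-negativity constraints are ignored because the exponential solution satisfies them automatically:

$$
\begin{aligned}
\mathcal{L} &= \sum_{i=1}^{3} p_i \log\frac{p_i}{q_i} + \beta\Big(\sum_{i=1}^{3} p_i c_i - \bar{c}\Big) + \mu\Big(\sum_{i=1}^{3} p_i - 1\Big) \\
\frac{\partial \mathcal{L}}{\partial p_i} &= \log\frac{p_i}{q_i} + 1 + \beta c_i + \mu = 0 \;\Rightarrow\; p_i = q_i\, e^{-1-\mu}\, e^{-\beta c_i} \\
\sum_{i=1}^{3} p_i = 1 &\;\Rightarrow\; e^{-1-\mu} = \frac{1}{\sum_{j=1}^{3} q_j e^{-\beta c_j}}
\end{aligned}
$$

Substituting the last line into the second recovers the posterior expression above; the value of $\beta$ is then fixed by the cost constraint.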
Since $q_1 = q_2 = q_3 = 1/3$, it follows that:
$$p_i = \frac{e^{-\beta c_i}}{\sum_{j=1}^{3} e^{-\beta c_j}}; \quad i = 1, 2, 3$$
With some algebraic manipulation (or use of a root-finding algorithm), the parameter value $\beta = 0.1562$ is found to satisfy the cost constraint, so that the resulting distribution minimises the relative entropy subject to all the constraints. This produces $p_1 = 0.48$, $p_2 = 0.30$ and $p_3 = 0.22$. Thus $N_1 = 58$, $N_2 = 36$ and $N_3 = 26$ (rounded to whole containers), indicating that the new information has increased the share of road by 14.7 percentage points (from 33.3% to 48%).
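The root-finding step can be illustrated with a few lines of code. The following is a minimal sketch using bisection on the cost constraint; the bracket $[0, 1]$, the iteration count and all variable names are choices made here for illustration and are not taken from the text.

```python
from math import exp

costs = [5.0, 8.0, 10.0]   # average cost of road, rail, barge ($)
target_cost = 7.0          # average cost over all modes ($)
N = 120                    # total TEUs

def shares(beta):
    """Shares p_i proportional to exp(-beta * c_i), i.e. a uniform prior."""
    weights = [exp(-beta * c) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

def mean_cost(beta):
    """Expected cost under the shares implied by beta."""
    return sum(p * c for p, c in zip(shares(beta), costs))

# Bisection: mean_cost decreases as beta increases.
# mean_cost(0) is about 7.67 (above 7) and mean_cost(1) is about 5.2 (below 7).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if mean_cost(mid) > target_cost:
        lo = mid   # cost still too high, so beta must increase
    else:
        hi = mid
beta = 0.5 * (lo + hi)

p = shares(beta)
flows = [round(N * pi) for pi in p]   # whole containers per mode
print(round(beta, 4), [round(pi, 2) for pi in p], flows)
```

Run as written, this should reproduce the parameter value, shares and flows reported above to the stated rounding.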
Absolute entropy update (AEU): This approach discards the prior probabilities and reconstructs new probabilities using the new information together with the existing information. Thus, we maximise:
$$H(p_1, p_2, p_3) = -\sum_{i=1}^{3} p_i \ln(p_i)$$
subject to:
$$p_1 c_1 + p_2 c_2 + p_3 c_3 = \bar{c}, \qquad p_1 + p_2 + p_3 = 1, \qquad p_i \geq 0; \quad i = 1, 2, 3$$
Solving the above problem produces the following probability distribution:
$$p_i = \frac{e^{-\beta c_i}}{\sum_{j=1}^{3} e^{-\beta c_j}}; \quad i = 1, 2, 3$$
Again, the value $\beta = 0.1562$ maximises $H(p_1, p_2, p_3)$ while satisfying both the old and the new constraints, giving $p_1 = 0.48$, $p_2 = 0.30$ and $p_3 = 0.22$. The probability or share of each mode can readily be computed for any change in the cost of each mode:
$$p_i = \frac{e^{-0.1562\, c_i}}{\sum_{j=1}^{3} e^{-0.1562\, c_j}}; \quad i = 1, 2, 3 \tag{3.19}$$
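Equation (3.19) can be evaluated directly for alternative cost scenarios. The sketch below assumes, as the equation does, that the parameter stays fixed at 0.1562 when the costs change; the function name `mode_shares` and the rail-cost scenario are hypothetical and serve only as an illustration.

```python
from math import exp

BETA = 0.1562  # parameter value reported in the text

def mode_shares(costs, beta=BETA):
    """Evaluate Equation (3.19): p_i = exp(-beta*c_i) / sum_j exp(-beta*c_j)."""
    weights = [exp(-beta * c) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]

# Costs from the example: road $5, rail $8, barge $10
base = mode_shares([5, 8, 10])

# Hypothetical scenario (not from the text): rail cost falls from $8 to $6
scenario = mode_shares([5, 6, 10])

print([round(p, 2) for p in base])
print([round(p, 2) for p in scenario])
```

In the hypothetical scenario the rail share rises at the expense of the other two modes, as would be expected when only the rail cost falls.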
In summary, we started with a lack of information about the container system, other than the constraint that the numbers of containers carried by the three modes should add up to 120. Clearly, there are many values of $N_1$, $N_2$ and $N_3$ that satisfy this constraint: we could have chosen $N_1 = 120$, $N_2 = 0$, $N_3 = 0$ or $N_1 = 20$, $N_2 = 70$, $N_3 = 30$, and both would satisfy the constraint. Neither choice seems particularly appropriate, because each goes beyond what we know: we would be assuming something that could turn out to be true or false. To avoid introducing potentially false information, we employed the principle of maximum entropy to select the probability distribution, or modal demands, that is consistent with the constraint and contains the least added information (maximum entropy). The resulting probability distribution turns out to be the uniform distribution with equal mode shares. The researchers were then given a new set of information in terms of the average cost of using each mode. This required the probability distribution to be updated, which was done using two updating methods, the AEU and the REU. Both methods were shown to yield the same results, as expected when the prior is uniform.
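The agreement between the two updates can be checked numerically. The sketch below treats the AEU as an exponential-family solution with unit weights and the REU as the same solution weighted by the prior; the helper `update` and the non-uniform prior (0.5, 0.3, 0.2) are hypothetical, introduced only to show that the two updates need not agree when the prior is not uniform.

```python
from math import exp

costs = [5.0, 8.0, 10.0]   # average cost of road, rail, barge ($)
target_cost = 7.0          # average cost over all modes ($)

def update(weights):
    """Return p_i proportional to weights[i] * exp(-beta * c_i), with beta
    chosen by bisection so that the expected cost equals target_cost."""
    def dist(beta):
        w = [q * exp(-beta * c) for q, c in zip(weights, costs)]
        total = sum(w)
        return [x / total for x in w]

    def mean_cost(beta):
        return sum(p * c for p, c in zip(dist(beta), costs))

    # mean_cost falls from about 10 to about 5 over this bracket
    lo, hi = -5.0, 5.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if mean_cost(mid) > target_cost:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

aeu = update([1.0, 1.0, 1.0])          # AEU: maximum entropy, no prior weighting
reu_uniform = update([1/3, 1/3, 1/3])  # REU with the uniform prior of the example
reu_skewed = update([0.5, 0.3, 0.2])   # REU with a hypothetical non-uniform prior

print([round(p, 3) for p in aeu])
print([round(p, 3) for p in reu_uniform])
print([round(p, 3) for p in reu_skewed])
```

With the uniform prior the two posteriors coincide, whereas the hypothetical skewed prior leads to a slightly different posterior even though the cost constraint is still met.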
The numerical example also demonstrates the subjective nature of the probabilities constructed by the researchers to cope with their lack of knowledge about the system. It shows that a new research team with a new or additional set of information about the system is likely to generate different probability distributions. It is also clear that if two research teams have the same set of information, they will construct the same probability distributions. Additionally, the information does not all have to be available to the two teams at the same time: each can receive a different amount of information at a given time, produce different probability distributions as the information arrives, and update them as new information comes along. In the end, both teams will generate the same probability distributions if each turns out to have access to the same information.
It has also been demonstrated that additional information ought to yield, on average, a better probability distribution, in the sense that the distribution has lower entropy (less uncertainty).
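For this example, the reduction in entropy can be computed directly. The sketch below uses the rounded posterior shares reported above and natural logarithms, so the entropies are in nats; it illustrates this single case only and says nothing about the average behaviour.

```python
from math import log

def entropy(p):
    """Shannon/Jaynes entropy -sum p_i ln(p_i), skipping zero probabilities."""
    return -sum(x * log(x) for x in p if x > 0)

prior = [1/3, 1/3, 1/3]          # maximum-entropy shares before the cost data
posterior = [0.48, 0.30, 0.22]   # rounded shares after the update

h_prior = entropy(prior)         # equals ln(3)
h_post = entropy(posterior)
print(round(h_prior, 4), round(h_post, 4))
```

The prior entropy is $\ln 3 \approx 1.099$ and the posterior entropy is roughly 1.047, so the cost information reduces the uncertainty in this case.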