
Modelling clay uncertainty in a mining operation using decision tree classification

5.2 Methodology

5.2.2 Modelling clay uncertainty in a mining operation using decision tree classification


Decision trees are predictive models structured in the form of a tree, which makes them more visual than conventional statistical regression models. They are a form of machine learning that analyses past data and uses it to predict future events with similar characteristics. The input data, also referred to as the predictors, are split into progressively smaller subsets until the target variable can be predicted. The theory behind decision tree classification is based on the ID3 algorithm (Sayad, 2017). This algorithm uses what is referred to as a "greedy search": at each node it evaluates the possible branches and selects the best split, without backtracking to revise earlier choices. It selects splits using entropy and information gain (also referred to here as the gain ratio), as implemented in the RapidMiner (2017) and Orange (Demsar et al., 2013) software programs, which will be utilised in the present study.

The decision tree was chosen as the best model for this research for the following reasons (Peng et al., 2007):

• It can generalise to unobserved instances whose features are correlated with the target variable.

• It is efficient and intuitive in its computation.

• The resultant tree provides a conceptual representation that is transparent and easy to understand.

5.2.2.1 Building a decision tree using entropy and information gain ratio

The decision tree is built from the top down, as shown in Fig. 5.1. The dataset is subdivided into subsets of similar values (homogeneous properties), and the entropy of each subset is then calculated. A completely homogeneous subset has an entropy of zero, whereas a subset that is equally divided between the classes has an entropy of one.

Therefore, Eq. (5.1) is the underlying equation for the entropy of a single predictor:

$$E(S) = -\sum_{i=1}^{n} p_i \log_2 p_i \tag{5.1}$$

and Eq. (5.2) is for more than one predictor:

$$E(T, X) = \sum_{c \in X} P(c)\,E(c) \tag{5.2}$$
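The two entropy equations can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used by RapidMiner or Orange; the function names `entropy` and `conditional_entropy` and the toy labels are assumptions introduced here.

```python
import math
from collections import Counter


def entropy(labels, base=2.0):
    """Entropy of a list of class labels, following Eq. (5.1)."""
    total = len(labels)
    result = 0.0
    for count in Counter(labels).values():
        p = count / total
        result -= p * math.log(p, base)
    return result


def conditional_entropy(target, predictor, base=2.0):
    """Weighted entropy of the target after splitting on a predictor, Eq. (5.2)."""
    total = len(target)
    groups = {}
    for t, x in zip(target, predictor):
        groups.setdefault(x, []).append(t)
    # P(c) is the share of samples in group c; E(c) is the entropy inside it.
    return sum(len(g) / total * entropy(g, base) for g in groups.values())


# Hypothetical toy labels, not taken from the grade control data.
homogeneous = ["Low", "Low", "Low", "Low"]
balanced = ["Low", "Low", "High", "High"]
predictor = ["a", "a", "b", "b"]

print(entropy(homogeneous))                      # completely homogeneous
print(entropy(balanced))                         # equally divided
print(conditional_entropy(balanced, predictor))  # predictor separates the classes
```

With these toy labels the homogeneous list has zero entropy, the equally divided list has an entropy of one, and the split on a predictor that separates the classes perfectly leaves no remaining entropy.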

APPLICATION OF PREDICTIVE DATA MINING 120

Note that $\log_2$ can be replaced with $\log_N$, where $N$ is the number of variable classes, to normalise the entropy. The entropy of the target variable, which is the ore processing risk, is calculated in the next sections.
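The effect of changing the base follows from the change-of-base rule for logarithms. The short derivation below uses $E_N(S)$ as an assumed shorthand, not used elsewhere in the text, for the entropy computed with $\log_N$ over $N$ classes:

$$E_N(S) = -\sum_{i=1}^{N} p_i \log_N p_i = \frac{-\sum_{i=1}^{N} p_i \log_2 p_i}{\log_2 N} = \frac{E(S)}{\log_2 N}$$

For $N$ equally likely classes, $p_i = 1/N$, so $E(S) = \log_2 N$ and $E_N(S) = 1$. The base-$N$ entropy therefore lies between 0 and 1 regardless of the number of classes.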

To clearly explain the mathematical model of the decision tree classification used to predict clay uncertainty in the present case study, 35 samples were randomly selected from the training data, which comprised 832 mining blocks. The main variables were Iron (Fe), Silica (SiO2), Alumina (Al2O3), Magnesium (MgO), Calcium (CaO) and Phosphorous (P). The influence of these variables on clay pods was tested because geologists have long noticed associations between them, particularly between SiO2, Al2O3 and clay. As shown in Table 5.1, the target variable is the overall processing risk and the variables to the left of the table are the predictors. To illustrate the concepts of entropy and information gain, the numerical values of the predictors shown in Table 5.1 are converted into categorical data based on the risk rating of each block, which depends on the ease of processing the ore. As mentioned previously, the key features of each block associated with process risk are lump (%), fine alumina and water reactive clay values, if intercepted. These characteristics have been classified as shown in Table 5.2 in accordance with the ore rating that is commonly used in most grade control processes in iron ore mines.

Riskex’s risk score calculator (Fig. 5.2) was then used to obtain the rating for the sample blocks. The numerical values of the data were thus encoded into categorical data, as shown in Table 5.3, which is used in the calculation logic explained in three steps in the following subsections.

Table 5.1, Grade control sample data for illustrating the ID3 algorithm

***Note that the training data had 832 mining blocks; what is shown here is a small sample for illustration purposes.


Table 5.2, Ore block processing risk rating

Ore Block Feature | Probability | Exposure | Consequence | Riskex Risk Score Calculator Rating
--- | --- | --- | --- | ---
Lump % <32.99 | Quite possible | Frequent | Important | 65.5
Lump % <35.99 | Unusual but possible | Frequent | Important | 23.7
Lump % <42.99 | Conceivable | Frequent | Noticeable | 2.1
Lump % >43 | Unusual but possible | Frequent | Important | 23.7
Al & Fines Al % <3 | Conceivable | Frequent | Noticeable | 2.1
Al & Fines Al % >3 | Unusual but possible | Frequent | Important | 23.7
Water Reactive Clay % > 0.5 | Quite possible | Frequent | Important | 60.7
Fe, Si & Loi % | Conceivable | Frequent | Noticeable | 2.1

Processing Risk | Riskex Risk Score Calculator Rating
--- | ---
Low | 2.1
Moderate | 2
Substantial | 65.2
High | 338.1
Very High | 775

Fig. 5.2, Riskex’s risk score calculator (Riskex, 2017).


Table 5.3, Conversion of numerical grade control data to categorical for explanation of entropy and information gain concepts

***Note that the training data had 832 mining blocks; what is shown here is a sample of 35 blocks used to explain and illustrate the entropy and gain ratio concepts.

5.2.2.2 Entropy of ore processing risk as per Eq. (5.1)

Eq. (5.1) will be applied in this section to calculate the entropy of the ore processing risk based on Table 5.4, with the logarithm taken to base 5, the number of risk classes.

Table 5.4, Overall Ore Processing Risk Frequency table.

Overall Ore Processing Risk | Very High | High | Substantial | Moderate | Low
--- | --- | --- | --- | --- | ---
Frequency | 6 | 5 | 2 | 4 | 18


Entropy (Processing Risk) = Entropy (6, 5, 2, 4, 18)

= - { (0.171 log5 0.171) + (0.143 log5 0.143) + (0.057 log5 0.057) + (0.114 log5 0.114) + (0.514 log5 0.514) }

= 0.829
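The arithmetic above can be checked with a short Python sketch. The class frequencies (6, 5, 2, 4, 18) are those given in the text; the variable names are assumptions made for illustration.

```python
import math

# Frequencies of the overall ore processing risk in the 35-block sample.
counts = {"Very High": 6, "High": 5, "Substantial": 2, "Moderate": 4, "Low": 18}

total = sum(counts.values())  # 35 blocks
n_classes = len(counts)       # 5 classes, used as the logarithm base

# Entropy with the logarithm taken to base 5, as in the worked calculation.
entropy_risk = -sum(
    (c / total) * math.log(c / total, n_classes) for c in counts.values()
)

print(round(entropy_risk, 3))  # 0.829
```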

Information gain measures the decrease in entropy once the sample data have been split on an attribute; in this study it is referred to as the information gain ratio. The algorithm loops through the attributes until the one with the highest gain ratio is found, and this attribute is then used to construct the decision tree. Information gain is the most popular splitting criterion among classification tree algorithms and is the default in both the RapidMiner (2017) and Orange (Demsar et al., 2013) software programs, which will be used in this research. The equation for the gain is shown below.

Gain (T, X) = Entropy (T) – Entropy (T, X)

5.2.2.3 Entropy and gain ratio of predictors

To explain how the entropies and gain ratios were obtained, the entropy and gain ratio of lump (%) are shown below as an example:

E(Processing Risk, Lump (%)) = P(Very High)*E(0,0,0,0,0) + P(High)*E(0,0,0,0,0) + P(Substantial)*E(6,4,0,0,0) + P(Moderate)*E(0,1,2,4,0) + P(Low)*E(0,0,0,0,18)

= 0 + 0 + (10/35)(0.418) + (7/35)(0.594) + (18/35)(0) = 0.238

Gain (Processing Risk, Lump (%)) = 0.8269 – 0.238 = 0.522
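The weighted entropy for lump (%) can be reproduced with the sketch below. The frequency tuples are taken from the expression above, ordered as (Very High, High, Substantial, Moderate, Low) processing risk; the helper name `entropy_from_counts` and the dictionary layout are assumptions made for illustration.

```python
import math


def entropy_from_counts(counts, base):
    """Entropy of a tuple of class frequencies; empty classes contribute zero."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total, base) for c in counts if c > 0)


# Processing-risk frequencies within each lump (%) category.
lump_split = {
    "Very High": (0, 0, 0, 0, 0),
    "High": (0, 0, 0, 0, 0),
    "Substantial": (6, 4, 0, 0, 0),
    "Moderate": (0, 1, 2, 4, 0),
    "Low": (0, 0, 0, 0, 18),
}

n_blocks = sum(sum(freq) for freq in lump_split.values())  # 35
base = 5  # number of risk classes

# Column totals give the overall processing-risk frequencies (6, 5, 2, 4, 18).
target_counts = tuple(sum(col) for col in zip(*lump_split.values()))

entropy_target = entropy_from_counts(target_counts, base)
entropy_after_split = sum(
    sum(freq) / n_blocks * entropy_from_counts(freq, base)
    for freq in lump_split.values()
)
gain = entropy_target - entropy_after_split

print(round(entropy_after_split, 3))  # 0.238
print(round(gain, 3))
```

Under these assumptions the sketch reproduces the weighted entropy of 0.238. The plain difference between the two entropies comes out at about 0.59, so the 0.522 reported in the text may rest on rounding or on a normalisation step that is not shown in this section.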

Table 5.5 shows the frequencies and a summary of the calculations for ore processing risk using the other predictors.


Table 5.5, Predictor frequencies, calculated entropies and gain ratios.

5.2.2.4 Identifying the decision node

As shown in Section 5.2.2.3, the element with the highest gain ratio, 0.522, is iron ore lump (%) (Table 5.5). Thus, lump (%) will be used as the decision node, as summarised in Fig. 5.3 and Fig. 5.4. It should be noted that any attribute with a gain of zero will not be a priority during the algorithm search. This process is iterative and is best performed by a machine learning tool such as the RapidMiner (2017) or Orange (Demsar et al., 2013) software programs. The decision rule in this analysis that will be used to create operational flexibility is: ‘If the overall processing risk is Very High, High, Substantial or Moderate, then the option must be created; otherwise no action is required’.
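The iterative node selection and the decision rule can be sketched as follows. This is a simplified ID3-style illustration, not the RapidMiner or Orange implementation; the six rows, the attribute names `lump` and `alumina`, and the function names are hypothetical and are not taken from Table 5.3.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log(n / total, 2) for n in Counter(labels).values()
    )


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after the split."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def build_tree(rows, attributes, target):
    """Greedy top-down construction: pick the best split, never backtrack."""
    labels = [row[target] for row in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    # Assumption: stop when no attribute reduces the entropy any further.
    if information_gain(rows, best, target) < 1e-12:
        return majority
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: build_tree(
                [r for r in rows if r[best] == value], remaining, target
            )
            for value in sorted({r[best] for r in rows})
        }
    }


def option_required(risk):
    """Decision rule: create the option unless the risk is Low."""
    return risk in {"Very High", "High", "Substantial", "Moderate"}


# Hypothetical categorical blocks for illustration only.
rows = [
    {"lump": "Low", "alumina": "Low", "risk": "Low"},
    {"lump": "Low", "alumina": "High", "risk": "Low"},
    {"lump": "Moderate", "alumina": "Low", "risk": "Moderate"},
    {"lump": "Moderate", "alumina": "High", "risk": "Moderate"},
    {"lump": "Substantial", "alumina": "Low", "risk": "High"},
    {"lump": "Substantial", "alumina": "High", "risk": "Very High"},
]

tree = build_tree(rows, ["lump", "alumina"], "risk")
print(tree)
print(option_required("Moderate"), option_required("Low"))
```

In this toy data the lump attribute has the larger gain and becomes the root node, and the alumina attribute is only used to separate the blocks that lump alone cannot classify.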


Fig. 5.4, Predictive decision tree classification model for clay material uncertainty in a mine plan based on ID3 algorithm.