
3. METHODOLOGY

3.5 Methodology in Reliability Engineering

3.5.2 The methodology for this study

This study investigates whether Bayesian networks can be used to model failures in the industrial domain. The dataset used in this study is a single-variable dataset of times to failure for industrial pumps.

When only time-to-failure data are available, failures in industrial pumps are traditionally modeled with homogeneous or non-homogeneous counting processes (Dudenhoeffer, 1994). A process {𝑁(𝑡), 𝑡 ≥ 0} is a counting process if:

1. 𝑁(𝑡) ≥ 0,

2. 𝑁(𝑡) is an integer,

3. If 𝑖 < 𝑗, then 𝑁(𝑖) ≤ 𝑁(𝑗),

4. For 𝑖 < 𝑗, 𝑁(𝑗) − 𝑁(𝑖) is the number of events occurring in the interval (𝑖, 𝑗].

If the failure rate is constant, the failures of a pump can be modeled with a homogeneous Poisson process. A counting process {𝑁(𝑡), 𝑡 ≥ 0} is a homogeneous Poisson process with rate 𝜆 > 0 if:

1. 𝑁(0) = 0

2. The numbers of events occurring in disjoint time intervals are independent. This property is also known as independent increments.

3. The number of events in any interval of length 𝑡 = 𝑡_𝑗 − 𝑡_𝑖 is Poisson distributed:

$$P[N(t_j) - N(t_i) = n] = e^{-\lambda (t_j - t_i)} \frac{(\lambda (t_j - t_i))^n}{n!}, \quad t_i < t_j, \; n = 0, 1, \ldots \tag{73}$$

Moreover, the conditional probability that the system will survive until time 𝑡_𝑗, given that it is operating at time 𝑡_𝑖, is

$$R(t_i, t_j) = e^{-\lambda (t_j - t_i)}, \quad t_i < t_j \tag{74}$$
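As a numerical illustration of the homogeneous Poisson process, the following minimal Python sketch evaluates the probability of observing 𝑛 failures in an interval and the corresponding survival probability, which is simply the 𝑛 = 0 case. The function names and the rate value are hypothetical and chosen only for illustration; they are not taken from the pump dataset.

```python
import math


def poisson_increment_pmf(n: int, rate: float, t_i: float, t_j: float) -> float:
    """P[N(t_j) - N(t_i) = n] for a homogeneous Poisson process with rate `rate`."""
    if t_j <= t_i:
        raise ValueError("expected t_i < t_j")
    mean = rate * (t_j - t_i)  # expected number of events in (t_i, t_j]
    return math.exp(-mean) * mean ** n / math.factorial(n)


def reliability(rate: float, t_i: float, t_j: float) -> float:
    """Probability of no failure in (t_i, t_j], i.e. the n = 0 case."""
    return poisson_increment_pmf(0, rate, t_i, t_j)


# Hypothetical rate: 0.002 failures per operating hour (illustrative only).
rate = 0.002
print(poisson_increment_pmf(1, rate, 0.0, 500.0))  # exactly one failure in 500 h
print(reliability(rate, 0.0, 500.0))               # survival over the same 500 h
```

Because the rate is constant, both quantities depend only on the length of the interval and not on where the interval starts.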

A non-homogeneous Poisson process is similar to the homogeneous Poisson process, except that the rate of occurrence is a function of the age of the system, i.e., 𝜆 = 𝑓(𝑡). Pump failure data normally indicate that the failure rate increases with the age of the pump.
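To make the effect of an age-dependent rate concrete, the sketch below assumes a power-law intensity 𝜆(𝑡) = (𝛽/𝜂)(𝑡/𝜂)^(𝛽−1). This functional form is a common choice for repairable systems, but it is an assumption made here for illustration; the text above does not specify 𝑓(𝑡). Under this assumption the expected number of failures in an interval is the difference of the cumulative intensity (𝑡/𝜂)^𝛽 at its endpoints, and the parameter values are hypothetical.

```python
import math


def cumulative_intensity(t: float, beta: float, eta: float) -> float:
    """Integral of the assumed power-law intensity from 0 to t: (t / eta) ** beta."""
    return (t / eta) ** beta


def expected_failures(t_i: float, t_j: float, beta: float, eta: float) -> float:
    """Expected number of failures in (t_i, t_j] under the assumed intensity."""
    return cumulative_intensity(t_j, beta, eta) - cumulative_intensity(t_i, beta, eta)


def nhpp_reliability(t_i: float, t_j: float, beta: float, eta: float) -> float:
    """Probability of no failure in (t_i, t_j] for the non-homogeneous process."""
    return math.exp(-expected_failures(t_i, t_j, beta, eta))


# Hypothetical parameters: beta > 1 gives a failure rate that grows with age.
beta, eta = 2.0, 1000.0
print(expected_failures(0.0, 1000.0, beta, eta))     # first 1000 h of operation
print(expected_failures(1000.0, 2000.0, beta, eta))  # next 1000 h: more failures expected
print(nhpp_reliability(1000.0, 2000.0, beta, eta))
```

With 𝛽 = 1 the sketch reduces to the homogeneous case with constant rate 1/𝜂, which is a convenient sanity check.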

Methods of this study

Bayesian networks are generally used to model the interaction between several variables. In this study, the data consist of consecutive failure times of several similar pumps. Usually, the maintenance policy for these pumps is corrective maintenance, and the quality of maintenance is close to perfect. Therefore, it is normally assumed that the consecutive failure times are independent of each other.

As mentioned before, the time to failure of a system depends not only on time; the operating environment and the usage pattern also affect the failure times of the equipment. Therefore, the assumption of independent consecutive times to failure (TTF) can be questioned, because the usage and environmental conditions do not change after corrective maintenance on the equipment.

Some of the TTFs in the dataset are censored because the observation of pump failures stopped at a certain time and failures after that time were not recorded. Since these unobserved times are censored failure times (CF), there might be a relationship between their duration and the recorded failure times.

To investigate the effects of these conditions on the failure times, this study attempts to create Bayesian network models from TTF and CF data while relaxing the assumption of independence of TTF distributions. A Bayesian learning algorithm is used to find a dependence relationship between consecutive TTFs and CFs, i.e., a causal graph that shows a causal relationship between consecutive failures and CFs. The Bayesian learning algorithm finds the relations between the variables using an association metric. These algorithms are described in section 3.1.5.

The methods used for this case study are mostly described along with the implementation in section 4.2; only a summary is provided here. To create the model, the dataset must first be modified. The single-variable TTF dataset, which represents the consecutive TTFs for multiple pumps, is reorganized into separate TTF variables, i.e., time to first failure, time to second failure, and so on. This is possible because the pumps are made by the same manufacturer with the same mechanical design, so the different pumps can be treated as instances of a single unit pump. With multiple instances for each TTF, a distribution for each of them becomes available. Moreover, the censored TTFs are grouped into separate CFs based on the TTF they are related to.
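The reorganization step can be sketched as follows. The record layout, the pump identifiers, the durations, and the column names TTF1, TTF2, ... and CF1, CF2, ... are all hypothetical and serve only to illustrate the idea; the actual data preparation is the one described in section 4.2.

```python
from collections import defaultdict

# Hypothetical records in chronological order per pump:
# (pump_id, duration_hours, failure_observed)
records = [
    ("P1", 1200.0, True),
    ("P1", 800.0, True),
    ("P1", 300.0, False),  # observation stopped before the third failure
    ("P2", 1500.0, True),
    ("P2", 450.0, False),  # observation stopped before the second failure
    ("P3", 900.0, True),
    ("P3", 700.0, True),
    ("P3", 650.0, True),
]


def to_wide(records):
    """Turn the single-variable sequence into one row per pump.

    The k-th interval of a pump is stored as TTF<k> if it ended in an
    observed failure, or as CF<k> if it was censored.
    """
    interval_index = defaultdict(int)
    rows = defaultdict(dict)
    for pump_id, duration, failure_observed in records:
        interval_index[pump_id] += 1
        k = interval_index[pump_id]
        column = f"TTF{k}" if failure_observed else f"CF{k}"
        rows[pump_id][column] = duration
    return dict(rows)


wide = to_wide(records)
for pump_id, row in wide.items():
    print(pump_id, row)
```

Each column then collects one value per pump, so the values in a column such as TTF1 can be treated as samples from the distribution of the time to first failure.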

The algorithms for structural learning and inference used in this study are designed for discrete variables. The time-to-failure data are continuous and must therefore be discretized. The methods for discretization are discussed in section 3.1.10. Since the genetic algorithm based method is the optimal method, it is used for discretization.
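To show what discretization does to a continuous TTF variable, the sketch below uses equal-frequency binning. This is a deliberately simpler technique than the genetic algorithm based method used in this study and is substituted here only for illustration; the sample values and function names are hypothetical.

```python
import bisect


def equal_frequency_edges(values, n_bins):
    """Return n_bins - 1 cut points so that each bin holds roughly the same number of values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(k * n) // n_bins] for k in range(1, n_bins)]


def discretize(value, edges):
    """Map a continuous value to a bin index in 0 .. len(edges)."""
    return bisect.bisect_right(edges, value)


# Hypothetical times to first failure, in hours.
ttf1 = [100.0, 250.0, 400.0, 650.0, 900.0, 1200.0, 1500.0, 2100.0, 3000.0]

edges = equal_frequency_edges(ttf1, n_bins=3)
labels = [discretize(v, edges) for v in ttf1]
print(edges)
print(labels)
```

A genetic algorithm based method would instead search over candidate cut points to optimize a quality criterion, but the result has the same form: a set of cut points that map each continuous time to a discrete state.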

The EQ algorithm, described in section 3.1.5, is then used to learn the structure of the Bayesian network. The threshold for the minimum association metric is set to a low value, so that even a small dependency between the variables of the system is considered. The use of association metrics in EQ is described in section 3.1.5, and the metrics themselves are described in section 3.1.2.
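The role of a low association threshold can be illustrated with a small sketch. The code below is not the EQ algorithm: it only computes pairwise mutual information between discretized variables and keeps the pairs whose value reaches a threshold, which is an assumed stand-in for an association metric. The data, the variable names, and the threshold value are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in count_xy.items()
    )


def candidate_pairs(data, threshold):
    """Return the variable pairs whose mutual information is at least `threshold`."""
    pairs = []
    for a, b in combinations(sorted(data), 2):
        mi = mutual_information(data[a], data[b])
        if mi >= threshold:
            pairs.append((a, b, mi))
    return pairs


# Hypothetical discretized states, one entry per pump.
data = {
    "TTF1": [0, 0, 1, 1],
    "TTF2": [0, 0, 1, 1],  # moves together with TTF1
    "TTF3": [0, 1, 0, 1],  # unrelated to TTF1 and TTF2
}

for a, b, mi in candidate_pairs(data, threshold=0.01):
    print(a, b, round(mi, 3))
```

Lowering the threshold admits weaker associations as candidate relations, while raising it leaves only the strongest ones.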

To select the network, the contingency table fit is used as the metric. Depending on the minimum acceptable structural coefficient chosen for the structural learning algorithm, the resulting network may range from a fully connected network to a completely unconnected one. As described in section 3.1.6, if the acceptable structural coefficient is low, the final network will be close to a fully connected network, and vice versa. The process of creating the model is described in detail in section 4.2.