REDUNDANT SYSTEMS - SYSTEM RELIABILITY EVALUATION AND ALLOCATION


4.7 REDUNDANT SYSTEMS

A redundant system contains one or more standby components or subsystems in the system configuration. These standby units enable the system to continue functioning when the primary unit fails. The system fails only when some or all of the standby units fail as well. Hence, redundancy is a system design technique that can increase system reliability. Such a technique is used widely in critical systems. A simple example is an automobile equipped with a spare tire. Whenever a tire fails, it is replaced with the spare tire so that the vehicle is still drivable.

A more complicated example is described in W. Wang and Loman (2002). A power plant designed by General Electric consists of n active generators and one or more standby generators. Normally, each of the n generators runs at 100(n − 1)/n percent of its full load, and together they supply 100% of the load to end users; n − 1 generators running at full load can cover the load completely. When any one of the active generators fails, the remaining n − 1 generators make up the power loss so that the output is still 100%. Meanwhile, the standby generator is activated and ramps to 100(n − 1)/n percent, while the other n − 1 generators ramp back down to 100(n − 1)/n percent.
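The load-sharing arithmetic in this example can be checked with a short Python sketch. The function name `load_fraction` and the plant size n = 4 are illustrative assumptions; they are not taken from Wang and Loman (2002).

```python
def load_fraction(n: int) -> float:
    """Fraction of full load carried by each of n active generators
    when n - 1 generators at full load can cover the whole demand."""
    if n < 2:
        raise ValueError("need at least two active generators")
    return (n - 1) / n


n = 4                                # hypothetical number of active generators
per_unit = load_fraction(n)          # normal load on each generator
demand = n * per_unit                # total demand, in units of one generator's capacity
after_failure = demand / (n - 1)     # load on each survivor before the standby ramps up
print(per_unit, demand, after_failure)
```

With one generator lost, each survivor carries its full rated load until the standby unit takes its share.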

If a redundant unit is fully energized when the system is in use, the redundancy is called active or hot standby. Parallel and k-out-of-n:G systems described in the preceding sections are typical examples of active standby systems. If a redundant unit is fully energized only when the primary unit fails, the redundancy is known as passive standby. When the primary unit is successfully operational, the redundant unit may be kept in reserve. Such a unit is said to be in cold standby.

A cold standby system needs a sensing mechanism to detect failure of the primary unit and a switching actuator to activate the redundant unit when a failure occurs. In the following discussion we use the term switching system to include both the sensing mechanism and the switching actuator. On the other hand, if the redundant unit is partially loaded in the waiting period, the redundancy is a warm standby. A warm standby unit usually is subjected to a reduced level of stress and may fail before it is fully activated. According to the classification scheme above, the spare tire and redundant generators described earlier are in cold standby. In the remainder of this section we consider cold standby systems with a perfect or imperfect switching system. Figure 4.15 shows a cold standby system consisting of n components and a switching system; in this figure, component 1 is the primary component and S represents the switching system.

FIGURE 4.15 Cold standby system (components 1 through n with switching system S)


4.7.1 Cold Standby Systems with a Perfect Switching System

If the switching system is 100% reliable, system reliability is determined by the n components. Let Ti denote the time to failure of component i (i = 1, 2, ..., n) and T denote that of the entire system. Obviously,

$$T = \sum_{i=1}^{n} T_i. \tag{4.27}$$

If T1, T2, ..., Tn are independently and exponentially distributed with failure rate λ, T follows a gamma distribution with parameters n and λ. The probability density function (pdf) is

$$f(t) = \frac{\lambda^{n}}{\Gamma(n)}\, t^{\,n-1} e^{-\lambda t}, \tag{4.28}$$

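where Γ(·) is the gamma function, defined in Section 2.5. The system reliability is

$$R(t) = \int_t^{\infty} \frac{\lambda^{n}}{\Gamma(n)}\, \tau^{\,n-1} e^{-\lambda \tau}\, d\tau = e^{-\lambda t} \sum_{i=0}^{n-1} \frac{(\lambda t)^{i}}{i!}. \tag{4.29}$$

The mean time to failure of the system is given by the gamma distribution as

$$\text{MTTF} = \frac{n}{\lambda}. \tag{4.30}$$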

Alternatively, (4.30) can also be derived from (4.27). Specifically,

$$\text{MTTF} = E(T) = \sum_{i=1}^{n} E(T_i) = \sum_{i=1}^{n} \frac{1}{\lambda} = \frac{n}{\lambda}.$$

If there is only one standby component, the system reliability is obtained from (4.29) by setting n = 2. Then we have

$$R(t) = (1 + \lambda t)\, e^{-\lambda t}. \tag{4.31}$$
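For integer n, the gamma survival function has the closed form e^{−λt} Σ_{i=0}^{n−1} (λt)^i / i!, which gives (4.31) when n = 2. The following Python sketch evaluates that sum and the MTTF n/λ; the function names are illustrative assumptions, not notation from the text.

```python
import math


def cold_standby_reliability(t: float, lam: float, n: int) -> float:
    """R(t) for n identical exponential units in cold standby, perfect switch.

    Uses the closed form exp(-lam*t) * sum_{i=0}^{n-1} (lam*t)**i / i!.
    """
    if n < 1 or lam <= 0 or t < 0:
        raise ValueError("require n >= 1, lam > 0, t >= 0")
    x = lam * t
    term, total = 1.0, 1.0           # i = 0 term of the sum
    for i in range(1, n):
        term *= x / i                # builds (lam*t)**i / i! incrementally
        total += term
    return math.exp(-x) * total


def cold_standby_mttf(lam: float, n: int) -> float:
    """MTTF = n / lam, the mean of the gamma distribution."""
    return n / lam
```

Calling `cold_standby_reliability(t, lam, 2)` should agree with (1 + λt)e^{−λt}, and n = 1 recovers the single-unit exponential reliability.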

Example 4.7 A small power plant is equipped with two identical generators, one active and the other in cold standby. Whenever the active generator fails, the redundant generator is switched to working condition without interruption. The life of the two generators can be modeled with the exponential distribution with λ= 3.6 × 10⁻⁵ failures per hour. Calculate the power plant reliability at 5000 hours and the mean time to failure.

SOLUTION Substituting the data into (4.31) yields

$$R(5000) = (1 + 3.6 \times 10^{-5} \times 5000)\, e^{-3.6 \times 10^{-5} \times 5000} = 0.9856.$$

By setting n = 2 in (4.30), we obtain the mean time to failure as

$$\text{MTTF} = \frac{2}{3.6 \times 10^{-5}} = 5.56 \times 10^{4} \text{ hours}.$$
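As an independent check on these figures, the system life can be simulated directly as the sum of the component lives, following (4.27). The sketch below is a simple Monte Carlo estimate; the function name, the trial count, and the seed are arbitrary assumptions, and the estimates only approximate the closed-form values.

```python
import random


def simulate_cold_standby(lam, n, t, trials=200_000, seed=1):
    """Estimate R(t) and MTTF by sampling system life T = T1 + ... + Tn."""
    rng = random.Random(seed)
    survived = 0
    total_life = 0.0
    for _ in range(trials):
        # Each unit starts aging only when the previous one fails (cold standby).
        life = sum(rng.expovariate(lam) for _ in range(n))
        total_life += life
        if life >= t:
            survived += 1
    return survived / trials, total_life / trials


r_hat, mttf_hat = simulate_cold_standby(lam=3.6e-5, n=2, t=5000.0)
print(f"R(5000) ~ {r_hat:.4f}, MTTF ~ {mttf_hat:.3e} hours")
```

The estimates should land close to 0.9856 and 5.56 × 10⁴ hours, with sampling error that shrinks as the number of trials grows.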

If the n components are not identically and exponentially distributed, the computation of system reliability is rather complicated. Now let’s consider a simple case where the cold standby system is comprised of two components.

The system will survive time t if either of the following two events occurs:

• The primary component (whose life is T1) does not fail in time t; that is, T1 ≥ t.

• If the primary component fails at time τ (τ < t), the cold standby component (whose life is T2) continues the function and does not fail in the remaining time (t − τ). Probabilistically, the event is described by (T1 < t) · (T2 ≥ t − τ).

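Since the above two events are mutually exclusive, the system reliability is

$$\begin{aligned} R(t) &= \Pr[(T_1 \ge t) + (T_1 < t)\cdot(T_2 \ge t - \tau)] \\ &= \Pr(T_1 \ge t) + \Pr[(T_1 < t)\cdot(T_2 \ge t - \tau)] \\ &= R_1(t) + \int_0^t f_1(\tau)\, R_2(t - \tau)\, d\tau, \end{aligned} \tag{4.32}$$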

where Ri and fi are, respectively, the reliability function and pdf of component i. In most situations, evaluation of (4.32) requires a numerical method. As a special case, when the two components are identically and exponentially distributed, (4.32) reduces to (4.31).
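One way to carry out the numerical evaluation of (4.32) is composite Simpson integration of the convolution term. The sketch below is a minimal example of that approach, not a method prescribed by the text; the function name and the Weibull parameters for the standby unit are hypothetical.

```python
import math


def standby_reliability(t, r1, f1, r2, steps=2000):
    """Evaluate R(t) = R1(t) + integral_0^t f1(tau) * R2(t - tau) dtau
    with the composite Simpson rule (an even number of steps is enforced)."""
    if steps % 2:
        steps += 1
    h = t / steps
    acc = f1(0.0) * r2(t) + f1(t) * r2(0.0)          # endpoint terms
    for k in range(1, steps):
        tau = k * h
        weight = 4 if k % 2 else 2                   # Simpson weights 4, 2, 4, ...
        acc += weight * f1(tau) * r2(t - tau)
    return r1(t) + acc * h / 3.0


# Hypothetical components: exponential primary, Weibull standby.
lam1 = 3.6e-5                     # primary failure rate, per hour
shape, scale = 1.5, 4.0e4         # assumed Weibull parameters for the standby

r1 = lambda t: math.exp(-lam1 * t)
f1 = lambda t: lam1 * math.exp(-lam1 * t)
r2 = lambda t: math.exp(-((t / scale) ** shape))

print(standby_reliability(5000.0, r1, f1, r2))
```

Passing the same exponential reliability function for both components should reproduce (4.31), which is a convenient sanity check before trying other distributions.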

4.7.2 Cold Standby Systems with an Imperfect Switching System

A switching system consists of a failure detection mechanism and a switching actuator, and thus may be complicated in nature. In practice, it is subject to failure.

Now we consider a two-component cold standby system. By modifying (4.32), we can obtain the system reliability as

$$R(t) = R_1(t) + \int_0^t R_0(\tau)\, f_1(\tau)\, R_2(t - \tau)\, d\tau, \tag{4.33}$$

where R0(τ) is the reliability of the switching system at time τ. In the following discussion we assume that the two components are identically and exponentially distributed with parameter λ, and deal with two cases in which R0(τ) is static or dynamic.


For some switching systems, such as human operators, the reliability may not change over time. In these situations, R0(τ) is static, or independent of time. Let R0(τ) = p0. Then (4.33) can be written as

$$R(t) = e^{-\lambda t} + p_0 \int_0^t \lambda e^{-\lambda \tau}\, e^{-\lambda (t - \tau)}\, d\tau = (1 + p_0 \lambda t)\, e^{-\lambda t}. \tag{4.34}$$

Note the similarity and difference between (4.31) for a perfect switching system and (4.34) for an imperfect one. Equation (4.34) reduces to (4.31) when p0 = 1.

The mean time to failure of the system is

$$\text{MTTF} = \int_0^{\infty} R(t)\, dt = \frac{1 + p_0}{\lambda}. \tag{4.35}$$
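The effect of a static switch reliability can be explored numerically with a short Python sketch of (4.34) and (4.35). The function names and the trial values of p0 are assumptions chosen for illustration; λ is taken from Example 4.7.

```python
import math


def reliability_static_switch(t: float, lam: float, p0: float) -> float:
    """Equation (4.34): R(t) = (1 + p0*lam*t) * exp(-lam*t)."""
    if not 0.0 <= p0 <= 1.0:
        raise ValueError("p0 must be a probability")
    return (1.0 + p0 * lam * t) * math.exp(-lam * t)


def mttf_static_switch(lam: float, p0: float) -> float:
    """Equation (4.35): MTTF = (1 + p0) / lam."""
    return (1.0 + p0) / lam


lam = 3.6e-5                          # failure rate from Example 4.7, per hour
for p0 in (1.0, 0.95, 0.5, 0.0):      # assumed switch reliabilities
    print(p0, reliability_static_switch(5000.0, lam, p0), mttf_static_switch(lam, p0))
```

At p0 = 1 the results match the perfect-switch case, and at p0 = 0 the standby unit contributes nothing, so the system behaves like a single component.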

Now we consider the situation where R0(τ) is dynamic, or dependent on time. Most modern switching systems contain both hardware and software and are complicated in nature. They can fail in different modes before the primary components break. If such failure occurs, the standby components will never be activated to undertake the function of the failed primary components. Since switching systems deteriorate over time, it is realistic to assume that the reliability of such systems is a function of time. If the life of a switching system is exponentially distributed with parameter λ0, from (4.33) the reliability of the entire system is

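$$R(t) = e^{-\lambda t} + \int_0^t e^{-\lambda_0 \tau}\, \lambda e^{-\lambda \tau}\, e^{-\lambda (t - \tau)}\, d\tau = e^{-\lambda t}\left[1 + \frac{\lambda}{\lambda_0}\left(1 - e^{-\lambda_0 t}\right)\right]. \tag{4.36}$$

The mean time to failure is

$$\text{MTTF} = \int_0^{\infty} R(t)\, dt = \frac{1}{\lambda} + \frac{1}{\lambda_0} - \frac{\lambda}{\lambda_0(\lambda + \lambda_0)}. \tag{4.37}$$

As will be shown in Example 4.8, an imperfect switching system reduces the reliability and MTTF of the entire system. To help better understand this, we first denote by r0 the ratio of the reliability at time 1/λ with an imperfect switching system to that with a perfect system, by r1 the ratio of the MTTF with an imperfect switching system to that with a perfect one, and by δ the ratio of λ to λ0. Then from (4.30), (4.31), (4.36), and (4.37), we have

$$r_0 = \frac{1}{2}\left[1 + \delta\left(1 - e^{-1/\delta}\right)\right], \qquad r_1 = \frac{1}{2}\left(1 + \frac{\delta}{1 + \delta}\right).$$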

Figure 4.16 plots r0 and r1 for various values of δ. It can be seen that the unreliability of the switching system has a stronger effect on MTTF than on the reliability of the entire system. Both quantities are reduced substantially when λ0 is greater than 10% of λ. The effects diminish as λ0 decreases, and become nearly negligible when λ0 is less than 1% of λ.

FIGURE 4.16 Plots of r0 and r1 for different values of δ (horizontal axis: δ, 0 to 100; vertical axis: r0 and r1, 0.65 to 1)
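The trend described for Figure 4.16 can be tabulated with a few lines of Python. Working from the definitions of r0, r1, and δ together with (4.30), (4.31), (4.36), and (4.37), the ratios reduce to r0 = ½[1 + δ(1 − e^{−1/δ})] and r1 = ½[1 + δ/(1 + δ)]; the sketch below assumes these forms, and the δ values are chosen only for illustration.

```python
import math


def r0(delta: float) -> float:
    """Reliability ratio at t = 1/lam: imperfect versus perfect switching."""
    return 0.5 * (1.0 + delta * (1.0 - math.exp(-1.0 / delta)))


def r1(delta: float) -> float:
    """MTTF ratio: imperfect versus perfect switching."""
    return 0.5 * (1.0 + delta / (1.0 + delta))


for delta in (1, 10, 100):            # delta = lam / lam0
    print(f"delta={delta:>3}  r0={r0(delta):.4f}  r1={r1(delta):.4f}")
```

For each δ the MTTF ratio comes out below the reliability ratio, and both approach 1 as δ grows, which is consistent with the discussion of the figure.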

Example 4.8 Refer to Example 4.7. Suppose that the switching system is subject to failure following the exponential distribution with λ0 = 2.8 × 10⁻⁵ failures per hour. Calculate the power plant reliability at 5000 hours and the mean time to failure.

SOLUTION Substituting the data into (4.36) yields

$$R(5000) = e^{-3.6 \times 10^{-5} \times 5000}\left[1 + \frac{3.6 \times 10^{-5}}{2.8 \times 10^{-5}}\left(1 - e^{-2.8 \times 10^{-5} \times 5000}\right)\right] = 0.9756.$$

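The mean time to failure is obtained from (4.37) as

$$\text{MTTF} = \frac{1}{3.6 \times 10^{-5}} + \frac{1}{2.8 \times 10^{-5}} - \frac{3.6 \times 10^{-5}}{2.8 \times 10^{-5}\,(3.6 \times 10^{-5} + 2.8 \times 10^{-5})} = 4.34 \times 10^{4} \text{ hours}.$$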

Comparing these results with those in Example 4.7, we note the adverse effects of the imperfect switching system.
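The calculations in Example 4.8 can be reproduced with the Python sketch below. The function names are illustrative assumptions, and the MTTF is written in the algebraically equivalent form 1/λ + 1/(λ + λ0) rather than the three-term expression used in the solution.

```python
import math


def reliability_dynamic_switch(t, lam, lam0):
    """Equation (4.36): exponential components (lam) and switch (lam0)."""
    return math.exp(-lam * t) * (1.0 + (lam / lam0) * (1.0 - math.exp(-lam0 * t)))


def mttf_dynamic_switch(lam, lam0):
    """Equation (4.37), simplified to 1/lam + 1/(lam + lam0)."""
    return 1.0 / lam + 1.0 / (lam + lam0)


lam, lam0, t = 3.6e-5, 2.8e-5, 5000.0
print(f"R({t:.0f}) = {reliability_dynamic_switch(t, lam, lam0):.4f}")
print(f"MTTF = {mttf_dynamic_switch(lam, lam0):.3e} hours")
```

Letting λ0 shrink toward zero moves both results toward the perfect-switch values of Example 4.7.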

In document Life Cycle Reliability Engineering (Page 91-96)