To begin our illustrations, we will use a relatively small power system with four units. The risk-analysis steps will be used to determine which unit is the most risk-critical to the system and which components in that unit are creating that criticality. We will determine when the overhaul needs to be performed and which components need to be overhauled to mitigate this risk and produce the highest net value to the system.
Analysis Process
Gather
A download of the last six years of NERC-GADS submissions was performed, as indicated in Figure 3-1. There is one line for each forced outage or derate event. Six years were selected by the power company because they were all that were available and because this period could minimize the number of components that have already been replaced, so that minimal effort would be expended in determining whether already-replaced components are still in the group of those selected as risk-critical.
EPRI Licensed Material
System Examples
3-2
Figure 3-1
A Portion of the System NERC-GADS Submissions Forced Outage Event Data for the Small System (ET – Event Type, MDC – Maximum Dependable Capacity)
Process
The raw data were sorted by event type to ensure that only U1-U3 and D1-D3 outages¹ were included. They were then sorted by equivalent hours in ascending order, and all rows with zero equivalent hours were deleted. All columns and rows were sorted by “Unit/Cause Code” and “Year” in ascending order. Then only the columns headed “Unit/Cause Code,” “Year,” “Net Maximum Capacity,” and “Equivalent Hours” were copied into the left side of a new worksheet, as shown in Figure 3-2.
1 U1 – Unplanned (Forced) Outage – Immediate
U2 – Unplanned (Forced) Outage – Delayed
U3 – Unplanned (Forced) Outage – Postponed
D1 – Unplanned (Forced) Derate – Immediate
D2 – Unplanned (Forced) Derate – Delayed
D3 – Unplanned (Forced) Derate – Postponed
MO – Maintenance Outage
SF – Startup Failure
Figure 3-2
Processed NERC-GADS Data for Each Forced Outage Event on the Left, and the Consolidated Data for Each Plant/Unit/Cause Code by Year on the Right for the Small System
On the right side of Figure 3-2 is the consolidated forced outage event data by “Plant/Unit/Cause Code” by “Year.” Note that the number of forced outage occurrences is counted and the equivalent hours are totaled.
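The filtering and consolidation steps described above can be sketched outside the spreadsheet as follows. This is a minimal illustration, not the workbook's actual logic: the field names (`unit_cause_code`, `year`, `event_type`, `equivalent_hours`) and the sample rows are hypothetical and are not taken from Figure 3-1.

```python
from collections import defaultdict

# Event types kept for the forced-outage analysis (U1-U3 and D1-D3).
FORCED_TYPES = {"U1", "U2", "U3", "D1", "D2", "D3"}


def consolidate(events):
    """Return {(unit_cause_code, year): (occurrences, total_equivalent_hours)}."""
    totals = defaultdict(lambda: [0, 0.0])
    for ev in events:
        if ev["event_type"] not in FORCED_TYPES:
            continue  # drop MO, SF, and any other non-forced event types
        if ev["equivalent_hours"] <= 0:
            continue  # drop rows with zero equivalent hours
        key = (ev["unit_cause_code"], ev["year"])
        totals[key][0] += 1                        # count the occurrence
        totals[key][1] += ev["equivalent_hours"]   # total the equivalent hours
    # Sort by unit/cause code, then year, in ascending order.
    return {k: (n, h) for k, (n, h) in sorted(totals.items())}


# Hypothetical rows, for illustration only.
sample_events = [
    {"unit_cause_code": "D1-3440", "year": 2000, "event_type": "U1", "equivalent_hours": 70.0},
    {"unit_cause_code": "D1-3440", "year": 2000, "event_type": "D1", "equivalent_hours": 80.0},
    {"unit_cause_code": "D1-3440", "year": 2000, "event_type": "MO", "equivalent_hours": 200.0},
    {"unit_cause_code": "D1-9630", "year": 2001, "event_type": "U2", "equivalent_hours": 12.5},
    {"unit_cause_code": "D1-9630", "year": 2001, "event_type": "U3", "equivalent_hours": 0.0},
]
consolidated = consolidate(sample_events)
```

In this sketch the maintenance outage and the zero-hour row are discarded, leaving one consolidated entry per unit/cause code and year.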
Risk Rank and Risk Plot
The consolidated data are copied and pasted into the risk-rank workbook as shown on the left side in Figure 3-3. Note that the “Total Annual MWH Loss” for “Plant/Unit/Cause Code” for each year is calculated. Under the “Tools” menu, “Aggregate” is selected, and the aggregated data by “Plant/Unit/Cause Code” for all years appears on the right.
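As a rough sketch of this step, the snippet below assumes that the “Total Annual MWH Loss” is the product of net capacity (MW) and total equivalent hours for the year, and that the “Aggregate” step sums occurrences and MWh loss over all years for each code. Both are assumptions made for illustration; the workbook's own formulas are not given in the text, and the row values are hypothetical.

```python
def annual_mwh_loss(capacity_mw, equivalent_hours):
    """Assumed definition: MWh lost = net capacity (MW) x equivalent hours."""
    return capacity_mw * equivalent_hours


def aggregate_all_years(rows):
    """Sum occurrences and MWh loss over all years for each code.

    rows: iterable of (code, year, capacity_mw, occurrences, equivalent_hours).
    Returns {code: (total_occurrences, total_mwh_loss)}.
    """
    out = {}
    for code, _year, capacity_mw, occurrences, hours in rows:
        n, mwh = out.get(code, (0, 0.0))
        out[code] = (n + occurrences, mwh + annual_mwh_loss(capacity_mw, hours))
    return out


# Hypothetical consolidated rows, for illustration only.
yearly_rows = [
    ("D1-3440", 2000, 100.0, 2, 150.0),
    ("D1-3440", 2001, 100.0, 1, 50.0),
    ("D1-9630", 2001, 100.0, 1, 12.5),
]
aggregated = aggregate_all_years(yearly_rows)
```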
Figure 3-3
Small System Input Data in the “Raw Data” Tab in the Risk-Rank Workbook Is on the Left, and the Plant/Unit/Cause Code for All Years Is Shown on the Right
The aggregated data are then copied and pasted into the “Rank” tab, where risk is calculated and then all the data are sorted by risk in descending order, as shown in Figure 3-4. At this point, we have the “Plant/Unit/Cause Codes” sorted by risk for the system.
Figure 3-4
Risk-Ranked System Data for the Small System
All four columns of the risk-ranked data are now pasted into the risk-plot workbook, as shown in Figure 3-5. Note that the risk rank and cumulative risk of each “Plant/Unit/Cause Code” are determined and presented in the right-most columns.
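The rank and cumulative-risk columns can be illustrated with a short sketch. It takes a risk value per code as given (the text does not state the risk formula, so none is assumed here), sorts in descending order, and accumulates. The codes and risk values are hypothetical.

```python
def rank_by_risk(risk_by_code):
    """Return rows of (rank, code, risk, cumulative_risk, cumulative_fraction)."""
    ordered = sorted(risk_by_code.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(risk for _, risk in ordered)
    table, running = [], 0.0
    for rank, (code, risk) in enumerate(ordered, start=1):
        running += risk
        fraction = running / total if total > 0 else 0.0
        table.append((rank, code, risk, running, fraction))
    return table


# Hypothetical risk values, for illustration only.
risk_table = rank_by_risk({"A2-1000": 500.0, "D1-3440": 3000.0, "D1-9630": 1500.0})
```

The cumulative fraction column is an added convenience for judging how quickly the cumulative risk flattens out with rank.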
Figure 3-5
Risk-Ranked Data Inserted Into Risk-Plot Workbook
Figure 3-6
Log-Log Risk Plot of Small System Forced Outage Data by Plant/Unit/Component Cause Code With the Line of Constant Risk Just to the Left of the 27 Risk-Critical Points
Figure 3-7
Select
Based on the rapid reduction of incremental cumulative risk with increasing rank at the 27th-ranked component, the top 27 risk components were chosen as risk-critical for this system at this time. To produce the final version of Figure 3-7, the component identifiers are changed to a single blank for all components, the plot points are relabeled, and then the identifiers for the top 27 components are restored and the points relabeled again. These labeled points assist in the placement of the line of constant risk.
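In the example above, the cutoff at rank 27 was chosen by inspecting the plot. One way to approximate that judgment numerically is to stop at the first component whose incremental contribution to cumulative risk falls below a chosen fraction of the total. The sketch below assumes such a threshold rule; the threshold and the risk values are hypothetical and are not part of the documented process.

```python
def count_risk_critical(sorted_risks, min_increment):
    """Count leading components whose share of total risk is at least min_increment.

    sorted_risks: risk values already sorted in descending order.
    min_increment: assumed cutoff, expressed as a fraction of total risk.
    """
    total = sum(sorted_risks)
    if total <= 0:
        return 0
    count = 0
    for risk in sorted_risks:
        if risk / total < min_increment:
            break  # incremental cumulative risk has dropped below the cutoff
        count += 1
    return count


# Hypothetical values: shares are 0.60, 0.30, 0.08, 0.012, and 0.008.
n_critical = count_risk_critical([3000.0, 1500.0, 400.0, 60.0, 40.0], 0.05)
```

A rule of this kind is only a screening aid; the analyst's reading of the diminishing-risk plot remains the basis for the selection.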
Copying the four columns of the risk-critical components into another worksheet, sorting them by component identifier, and summing the total risk from these components for each unit, as shown in Figure 3-8, provides a method for selecting Unit D1 as the unit needing an outage business plan using the Boiler OIO. Note that Unit D1 is the unit with the highest risk.
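The per-unit summation can be sketched as follows, assuming each risk-ranked row carries a unit name, a cause code, and a risk value. The tuple layout, the unit names other than D1, and the numbers are hypothetical.

```python
def unit_risk_totals(ranked_rows, top_n):
    """Sum risk by unit over the top_n risk-critical rows.

    ranked_rows: list of (unit, cause_code, risk), sorted by risk descending.
    """
    totals = {}
    for unit, _cause_code, risk in ranked_rows[:top_n]:
        totals[unit] = totals.get(unit, 0.0) + risk
    return totals


# Hypothetical risk-ranked rows, for illustration only.
ranked_rows = [
    ("D1", "3440", 3000.0),
    ("A2", "1000", 1500.0),
    ("D1", "9630", 1200.0),
    ("B1", "1700", 400.0),
]
totals_by_unit = unit_risk_totals(ranked_rows, top_n=3)
highest_risk_unit = max(totals_by_unit, key=totals_by_unit.get)
```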
Figure 3-8
The Small System Risk-Critical Components Sorted by Plant/Unit/Component Cause Code
Estimate
The probability-of-failure curves for the run case or “without overhaul” case were generated for the 10 components that were selected as critical in Figure 3-8 for Unit D1. Only the last six years of forced outage data were used to generate the probability curves, because these were all that were available. The cause codes in Table 3-1 were dropped because an overhaul plan was not appropriate for them or they had already been addressed.
4580 Generator end bells and bolting
9290 Other fuel-quality problems
1700 Feedwater controls
8600 Flue gas additive
Figure 3-9 shows the annual probability change calculated for Cause Code 9630, “Opacity,” in workbook ProbCalc. The data for the three columns on the left came from columns A, B, and C of the system risk-rank workbook (see Figure 3-3). The “Operation Year” and “Probability Change by Year” for this component were copied and pasted into the “Fit of History” tab of Baycom11 in Figure 3-10. After entering a base year, a Weibull curve fit is performed on these data by clicking “Fit of History” on the “Tools” menu, and the history and curve fit data are produced as shown in Figure 3-11. If the fit is not satisfactory, then modify the base year in Baycom11, cell G2, until a satisfactory fit is obtained. Once all component curves are fitted, then the Weibull alpha, beta, and base year values for this “without overhaul” probability curve can be input for each component into the Boiler OIO.
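The fitting method used inside Baycom11 is not described here, so the following is only a sketch of one common approach: a least-squares fit on the linearized two-parameter Weibull cumulative distribution, F(t) = 1 - exp(-(t/beta)^alpha), with t measured in years from the base year. It assumes the annual probability changes have already been accumulated into cumulative probabilities; the function names and the synthetic history are hypothetical.

```python
import math


def weibull_cdf(t, shape, scale):
    """Cumulative probability of failure at time t (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    return -math.expm1(-((t / scale) ** shape))


def fit_weibull(years, cumulative_prob, base_year):
    """Least-squares fit of ln(-ln(1 - F)) = shape*ln(t) - shape*ln(scale)."""
    xs, ys = [], []
    for year, f in zip(years, cumulative_prob):
        t = year - base_year
        if t <= 0 or not (0.0 < f < 1.0):
            continue  # these points cannot be linearized and are skipped
        xs.append(math.log(t))
        ys.append(math.log(-math.log(1.0 - f)))
    n = len(xs)
    if n < 2:
        raise ValueError("at least two usable history points are required")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    shape = sxy / sxx                      # slope gives alpha (shape)
    intercept = mean_y - shape * mean_x
    scale = math.exp(-intercept / shape)   # intercept gives beta (scale)
    return shape, scale


# Synthetic history generated from known parameters, for illustration only.
true_shape, true_scale, base_year = 2.0, 10.0, 1990
years = list(range(1995, 2001))
history = [weibull_cdf(y - base_year, true_shape, true_scale) for y in years]
shape, scale = fit_weibull(years, history, base_year)
```

Changing `base_year` shifts every t and therefore changes the fitted parameters, which is consistent with adjusting the base year until the fit is judged satisfactory.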
Figure 3-9
ProbCalc Workbook Calculating the Change in Probability by Year for Cause Code 9630, Opacity
Figure 3-10
Input Worksheet in Baycom11, “Fit of History” Tab, Where the Weibull Shape (Alpha) and Scale (Beta) Parameters Are Calculated for Input Into the Boiler OIO for Cause Code 9630, Opacity
Figure 3-11
The Cumulative Probability-of-Failure Plot in Baycom11 That Compares the Weibull Fitted Curve to a Curve Linking the Failure History Points for Cause Code 9630, Opacity
There were three tube failures for component 1000, “Furnace Wall,” which are too few to perform a Weibull curve fit. For this component, a probabilistic opinion interview was conducted using the same process as the software tool STACKER. The results are shown in Figure 3-12 for the run and overhaul cases.
Figure 3-12
Results From the Probabilistic Opinion Interview for the Projected Probability of Failure for Unit D1 Furnace Wall
The “Operation Year” and “Probability Change by Year” for this component were copied and pasted into the “Fit of Interview” tab of Baycom11, as shown in Figure 3-13. After entering a base year, a Weibull curve fit is performed on these data by clicking “Fit of Interview” on the “Tools” menu, and the interview and curve fit data are produced as shown in Figure 3-14. If the fit is not satisfactory in the judgment of the analyst, then modify the base year in Baycom11, cell G2, until a satisfactory fit is obtained.
Figure 3-13
Input Worksheet In Baycom11, “Fit of Interview” Tab, Where the Weibull Shape (Alpha) and Scale (Beta) Parameters Are Calculated for Input Into the Boiler OIO for Cause Code 1000, Furnace Wall
Figure 3-14
The Cumulative Probability-of-Failure Plot “Without Overhaul” in Baycom11 That Compares the Weibull Fitted Curve to a Curve Linking the Interview Points for Cause Code 1000, Furnace Wall
The fitted cumulative probability-of-failure “without overhaul” curves versus future year for the 10 component cause codes on Unit D1 are shown in Figure 3-15.
Figure 3-15
The Resulting Fitted Probability-of-Failure “Without Overhaul” Curves for the 10 Risk-Critical Components Included in the Boiler OIO Analysis
component between the outage and the retirement of the unit, except for component 1000, “Furnace Wall.” This can be a good first analysis approach, unless the value of the outage is considered marginal; then the “with overhaul” probability needs to be generated. This probability usually comes from an opinion interview. For each component with the no-forced-outage assumption in the “With Overhaul” tab, the Weibull shape parameter is input as 12 and the Weibull scale parameter is input as 1000 to make the resulting probability-of-failure curve effectively zero. For component 1000, “Furnace Wall,” the results from the opinion interview were used for the “with overhaul” case.
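A quick numerical check shows why a shape of 12 and a scale of 1000 produce an essentially flat, zero curve. The sketch assumes the two-parameter Weibull cumulative form F(t) = 1 - exp(-(t/scale)^shape) with t in years from the base year over a 20-year window; the exact form and time units used by the Boiler OIO are not stated in the text.

```python
import math


def weibull_cdf(t, shape, scale):
    """Cumulative probability of failure at time t (0 for t <= 0)."""
    if t <= 0:
        return 0.0
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-((t / scale) ** shape))


# Shape 12 and scale 1000 are the "no forced outage" inputs described above.
no_failure_curve = [weibull_cdf(t, 12.0, 1000.0) for t in range(1, 21)]
max_probability = max(no_failure_curve)
```

Under these assumptions the largest value, at year 20, is on the order of 1e-21, so the expected forced-outage cost after the overhaul is negligible for these components.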
Figure 3-16
Boiler OIO Input “With Overhaul” for Unit D1
The inputs for the “Without Overhaul” tab are shown in Figure 3-17. These inputs were derived from the history data for these 10 components and processed with ProbCalc and Baycom11.
Figure 3-17
Boiler OIO Input “Without Overhaul” for Unit D1
The operation parameters by year (unit replacement power cost, projected capacity factor, and service factor) were input in their respective tabs, as well as the financial assumptions for time value of money and taxes. The annual budget limits, forced outage rate limit, and probability of safety flag limit are input in their respective columns in the “Summary” tab, as shown in Figure 3-18. After loading all the data, the “Launch Optimization” button is clicked and the outage business plan for Unit D1 is produced. Figure 3-18 shows the overhaul year that will produce the highest net present value (NPV) within the constraints and when and if the safety limit is exceeded. To the right top, the totals of the present-value cash flows for this analysis period (in this case, 20 years) for the outage performed in 2003 are shown. These after-tax, present-value totals come from columns D, F, and K, respectively. The current-value totals before taxes are shown to the left of each of these columns. To the far right is a series of total expected NPVs for the overhaul being conducted in each of the respective years of the analysis period.
Figure 3-18
Boiler OIO Summary Worksheet for Unit D1
Figure 3-19
Boiler OIO NPV Versus Overhaul Year Results for Unit D1
Examine
Upon examination of Boiler OIO inputs for the two components for Unit D1, it was determined that the projected service factor is constant at around 75%, and the projected replacement energy value is rising linearly with time. Note that the probability-of-failure curve for Cause Code 3440, “High-Pressure Heater Tube Leaks,” shown in Figure 3-15, is rising fairly linearly. Examination of the expected consequential cash flows for this component (Figure 3-20) indicates that it is the highest consequential cost component. For this reason, the NPV curve shape is driven by the shape of the probability curve for component 3440, high-pressure heater tube leaks, because the service factor is constant and the replacement energy value is linear. The large service factor, the large number of tubes in the high-pressure heater, the large forced outage duration for these tube failures, and a unit expected retirement date significantly beyond the end of the analysis window result in a large NPV for all years.
Figure 3-20
Boiler OIO Cost to Overhaul Worksheet Showing the Annual Consequential Cost for the 10 Components Selected for Unit D1
Figure 3-15 shows the Weibull projected curves for the run case or “without overhaul” case for Cause Code 9630, “Opacity.” Note that the projected probability curve reached a cumulative value of one in 2003 for the run case. It would be expected that a component that continues to be run would have a continually rising probability-of-failure curve, resulting in a higher calculated cost without overhaul. This is a current weakness in using this form of Weibull curve projection for large and rapidly rising probabilities for components that are developing significant degradation, as shown in Figure 3-11. This situation with the probability curves for these components will be addressed in future developments of this process. For now, the recommendation would be to project the future probability of failure from a probabilistic opinion interview using STACKER with plant personnel.
Conclusion
Unit D1 needs an overhaul as soon as possible to reap a $123,000,000 net present value savings.
Some delay in the overhaul will not have a serious consequence because the NPV versus overhaul year curve is of such low slope.
The high-pressure heater, with a reasonably high and increasing annual probability of failure, dominated the value for the timing of an overhaul because of its large tube population and the large duration (77 hours) of a forced outage from a tube leak.
selected for overhaul timing optimization. Again, we will determine when the overhaul is needed for these units and what components need to be involved in the overhauls.
Analysis Process
Gather
The last five years of NERC-GADS submissions were downloaded as shown in Figure 3-21. The last five years of forced outage data were selected by the power company to represent the recent problems on this system.
Figure 3-21
A Portion of the System NERC-GADS Submissions Forced Outage Event Data for the Large System
Process
The raw data were processed by removing all but U1-U3, D1-D3, and MO type of outages². The zero equivalent hour entries were deleted. The columns headed “Unit/Cause Code,” “Year,” “Net Maximum Capacity,” and “Equivalent Hours” were sorted by “Unit/Cause Code” and “Year” in descending order. The data for these individual forced outage events are shown in the left side of Figure 3-22. On the right of this figure is the consolidated forced outage event data by “Unit/Cause Code” by “Year” showing the total annual number of occurrences and total equivalent hours.
2 U1 – Unplanned (Forced) Outage – Immediate
U2 – Unplanned (Forced) Outage – Delayed
U3 – Unplanned (Forced) Outage – Postponed
D1 – Unplanned (Forced) Derate – Immediate
D2 – Unplanned (Forced) Derate – Delayed
D3 – Unplanned (Forced) Derate – Postponed
MO – Maintenance Outage
SF – Startup Failure
Figure 3-22
Processed NERC-GADS Data for Each Forced Outage Event on the Left and the Consolidated Data for Each Plant/Unit/Cause Code by Year on the Right for the Large System
all years is shown on the right after running the aggregation macro.
Figure 3-23
Large System Input Data in the “Raw Data” Tab in the Risk-Rank Workbook Is on the Left, and the Plant/Unit/Cause Code for All Years Is Shown on the Right
The aggregated data are copied and “paste special/value” pasted into the “Rank” Tab, where they are then sorted by risk in descending order as shown in Figure 3-24.
All four columns of data are now copied and “paste special/value” pasted into the risk-plot workbook as shown in Figure 3-25. Note that the risk rank and cumulative risk of each “Plant/Unit/Cause Code” are determined in the right-most columns.
Figure 3-25
Risk-Ranked Data Inserted Into Risk-Plot Workbook
From this risk-ranked data, a log-log risk plot was produced as shown in Figure 3-26, and the diminishing-risk plot is shown in Figure 3-27 by clicking the “Label Plot Points” button.
Figure 3-26
Log-Log Risk Plot of Large System Forced Outage Data by Plant/Unit/Component Cause Code
Figure 3-27
Diminishing-Risk Plot for Large System Showing up to the 25th Ranked Component as the Highest Contributors to Incremental Cumulative Risk