
Vol. 58 No. 1
January 2006
Jim Renfroe, Executive Vice President, Production Optimization, Halliburton
The intensity of today’s global energy appetite requires our industry to become increasingly demanding in terms of innovative, cost-effective technologies and day-to-day performance. These requirements are growing almost exponentially as we go after deeper and less accessible reserves. As we stretch to achieve new limits, reliable performance in all facets of operations is pivotal to reducing downtime, improving efficiencies, and optimizing production. In short, now more than ever, reliability is a business imperative—and achieving maximum reliability does not happen without intentional focus and effort. There are no random acts of reliability.
Reliability has to be designed and built into oilfield technology, equipment, and processes by service companies that have made it a strategic imperative. Once achieved, reliability enables all of us to focus a bit more sharply on maximizing financial performance while never diverting our attention from minimizing exposure and risk.
What is reliability? Essentially, it provides an answer to the question, “How much reliance should you place on a particular product and/or technology?” Stated another way, it is the probability that a device will perform its intended function under known conditions for a specified time. I define it as consistently providing the expected results over and over. That is reliability.
Suppliers have a unique opportunity to improve reliability because it is an attribute influenced and affected at all stages of product and/or service delivery, including:
It is during the procurement process that operators have a unique opportunity to choose reliable technology to positively affect their overall economic equation. After all, service, maintenance, and repair often contribute the majority of total life-cycle costs. Factoring reliability into the procurement equation can result in economic improvements.
When reliability is a core value, it echoes throughout the organization—from engineering to manufacturing to operations—with all involved consistently working to eliminate even the smallest factors that lead to downtime and lost revenue. Reliability can be ascertained through metrics like functional details and life metrics. But a word of caution: metrics are open to interpretation.
What, then, are some factors to consider when assessing metrics? How can you be more assured of translating probabilities of reliability into actualities? How can metrics represent more accurate performance indicators to better assist in economic calculations during procurement?
Functional Details. Functional details relate to the boundaries of intended use, such as temperature, pressure, and load. Primary functional details are easy to quantify and qualify and are typical of most engineering specifications. But are there other design parameters to consider? For example, a design specifies a small, 5/8-in.-diameter electric motor and gear train that will manipulate a small, high-pressure shuttle valve in a downhole tool for use at pressures as high as 15,000 psi and at temperatures up to 149°C (300°F). A functional specification that covers the environment and the load required to shuttle the valve back and forth is easy to match. However, in this example, a proper functional specification should also note that the motor gear-train assembly will be powered to a hard stop. Without that requirement, the motor gear-train unit would likely not be designed to withstand the inertial load of a hard stop.
Life Metrics. Life-metric requirements define how long successful function must be retained to complete the planned mission. In this case, the electric motor and gear-train assembly should be specified with life metrics such as 99% reliability for 1,000 hours of service or “X” cycles of maintenance-free service. Combining and documenting functional details with life metrics provides a greater probability of achieving the desired mission.
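To give a feel for how demanding a life metric such as "99% reliability for 1,000 hours" can be, the sketch below converts it into an equivalent MTBF. It assumes a constant failure rate, so that R(t) = exp(-t / MTBF); that model is an illustrative assumption and is not stated in this article, and the function names are hypothetical.

```python
import math


def required_mtbf(target_reliability: float, mission_hours: float) -> float:
    """MTBF needed to meet a reliability target over a mission.

    Assumes a constant failure rate, i.e. R(t) = exp(-t / MTBF).
    This is an illustrative model, not one taken from the article.
    """
    if not 0.0 < target_reliability < 1.0:
        raise ValueError("target_reliability must be strictly between 0 and 1")
    if mission_hours <= 0.0:
        raise ValueError("mission_hours must be positive")
    return -mission_hours / math.log(target_reliability)


def reliability(t_hours: float, mtbf_hours: float) -> float:
    """Probability of surviving t_hours under the same constant-rate model."""
    return math.exp(-t_hours / mtbf_hours)


# The life metric quoted above: 99% reliability for 1,000 hours of service.
mtbf = required_mtbf(0.99, 1000.0)
print(f"Equivalent MTBF under a constant failure rate: {mtbf:,.0f} hours")
print(f"Check, R(1,000 h) = {reliability(1000.0, mtbf):.4f}")
```

Under this assumption the equivalent MTBF is on the order of 100 times the mission length, which suggests why a life metric tied to a specific mission can be more informative than a bare MTBF figure.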
So, in the case of the 5/8-in.-diameter electric motor and gear train, both the functional specifications and the life metrics would deviate from what can be characterized as “commoditized indicators” of reliability. When interpreting metrics, the challenge is understanding that specific functional requirements may not necessarily be captured in traditional measures.
Mean Time Between Failures (MTBF) or Mean Time to Failure (MTTF). MTBF or MTTF can be misleading if used as the sole reliability measure. MTBF applies to repairable, maintainable devices that are returned to service after each failure, much as a car is kept in service by replacing the oil and filter every 4,800 km (3,000 miles). MTTF applies to nonrepairable items that are discarded when they reach end of life, such as light bulbs. While this may be obvious, it is important to note that both MTBF and MTTF represent the mean value (average), not the median value (midpoint of the entire data set), of the time at which a device loses functionality. The mean and median coincide when the data set conforms to a symmetrical distribution, such as the bell curve. For data sets that conform to asymmetrical distributions, the mean and median are generally no longer the same point in the data set. In these distributions, a group of fatigue-related incidents may be occurring at a faster rate either before or after the median time.
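The gap between mean and median can be made concrete with a small sketch. It assumes lifetimes follow a Weibull distribution, a model commonly used in reliability work but not named in this article; the scale value and the three shape values are hypothetical and chosen only for illustration.

```python
import math


def weibull_mean(eta: float, beta: float) -> float:
    """Mean life of a Weibull distribution with scale eta and shape beta."""
    return eta * math.gamma(1.0 + 1.0 / beta)


def weibull_median(eta: float, beta: float) -> float:
    """Time by which half of the population has failed."""
    return eta * math.log(2.0) ** (1.0 / beta)


ETA_HOURS = 1000.0  # hypothetical scale parameter, not from the article

# beta < 1: early failures dominate; beta = 1: constant failure rate;
# beta = 3: wear-out behavior with a nearly symmetrical distribution.
for beta in (0.8, 1.0, 3.0):
    mean = weibull_mean(ETA_HOURS, beta)
    median = weibull_median(ETA_HOURS, beta)
    print(
        f"shape={beta}: mean={mean:7.1f} h, "
        f"median={median:7.1f} h, median/mean={median / mean:.2f}"
    )
```

For the skewed cases in this sketch the median falls well below the mean, so half of the population has already failed at a time noticeably earlier than the quoted mean; for the nearly symmetrical case the two values almost coincide.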
Another misconception is that MTBF or MTTF represents a failure-free period or the time to the first failure. Because both measures are averages of the time to a fatigue event, individual items will reach that event both before and after the stated value. Therefore, anyone using MTBF or MTTF as the only reliability metric should understand the risk that a given item falls outside the “average” band.
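A short sketch shows why MTBF should not be read as a failure-free period. It assumes exponentially distributed lifetimes (a constant failure rate), which is an illustrative assumption rather than a claim from this article, and the MTBF value and population size are hypothetical.

```python
import math
import random


def fraction_failed_by(t_hours: float, mtbf_hours: float) -> float:
    """Share of units expected to have failed by t_hours.

    Assumes exponentially distributed lifetimes (constant failure rate),
    an illustrative model that the article does not specify.
    """
    return 1.0 - math.exp(-t_hours / mtbf_hours)


MTBF_HOURS = 1000.0  # hypothetical value, not from the article

# Analytic share of units that fail before reaching the MTBF itself.
analytic = fraction_failed_by(MTBF_HOURS, MTBF_HOURS)

# Cross-check with a seeded simulation of 100,000 hypothetical units.
rng = random.Random(0)
lifetimes = [rng.expovariate(1.0 / MTBF_HOURS) for _ in range(100_000)]
simulated = sum(t < MTBF_HOURS for t in lifetimes) / len(lifetimes)
first_failure = min(lifetimes)

print(f"Analytic share failed before MTBF:  {analytic:.1%}")
print(f"Simulated share failed before MTBF: {simulated:.1%}")
print(f"Earliest simulated failure: {first_failure:.3f} hours")
```

Under this model a majority of units fail before the MTBF is reached, and the earliest failure in a large population arrives almost immediately, so the average says little about when the first failure should be expected.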
To illustrate this risk, Fig. 1 charts an evaluation of four different model numbers of a downhole tool. Unit 1 had the highest MTBF (2,609), followed by Unit 2 (1,898), Unit 4 (1,046), and Unit 3 (610). Typically, one would expect Unit 1 to be the most reliable regardless of service time because it has the highest MTBF; however, Unit 3 is slightly more reliable up to 220 hours, and Unit 2 is slightly more reliable up to about 840 hours. This may be important if the usage life or the time between maintenance activities is under 840 hours. In some instances, the reliability curves can exhibit significant shape changes that cannot be detected by MTBF or MTTF values.

Fig. 1—Evaluation of downhole tool.
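The kind of crossover described for Fig. 1 can be reproduced with a minimal sketch. The two units below are hypothetical and do not use the data behind Fig. 1; the Weibull model, the shape parameters, and the helper names are all assumptions made for illustration. Unit A has the higher mean life but a constant failure rate, while Unit B has a lower mean life but fails mainly through wear-out.

```python
import math


def weibull_scale_from_mean(mean_life: float, beta: float) -> float:
    """Scale parameter that gives a Weibull distribution the requested mean."""
    return mean_life / math.gamma(1.0 + 1.0 / beta)


def weibull_reliability(t: float, eta: float, beta: float) -> float:
    """Probability that a unit is still working at time t."""
    return math.exp(-((t / eta) ** beta))


def crossover_time(unit_a, unit_b, t_lo, t_hi, tol=1e-6):
    """Bisection search for the time at which two reliability curves meet.

    unit_a and unit_b are (eta, beta) tuples. The bracket [t_lo, t_hi]
    must contain a sign change of R_a(t) - R_b(t).
    """
    def diff(t):
        return weibull_reliability(t, *unit_a) - weibull_reliability(t, *unit_b)

    if diff(t_lo) * diff(t_hi) > 0:
        raise ValueError("bracket does not contain a crossover")
    while t_hi - t_lo > tol:
        mid = 0.5 * (t_lo + t_hi)
        if diff(t_lo) * diff(mid) <= 0:
            t_hi = mid  # crossover lies in the lower half
        else:
            t_lo = mid  # crossover lies in the upper half
    return 0.5 * (t_lo + t_hi)


# Hypothetical units (not the Fig. 1 data): (scale, shape) pairs.
unit_a = (weibull_scale_from_mean(2600.0, 1.0), 1.0)  # higher mean life
unit_b = (weibull_scale_from_mean(1900.0, 3.0), 3.0)  # lower mean life

t_cross = crossover_time(unit_a, unit_b, 100.0, 5000.0)
print(f"Curves cross at about {t_cross:,.0f} hours")
for t in (500.0, 3000.0):
    r_a = weibull_reliability(t, *unit_a)
    r_b = weibull_reliability(t, *unit_b)
    print(f"t={t:6.0f} h: R_A={r_a:.3f}, R_B={r_b:.3f}")
```

In this sketch the unit with the lower mean life is the more reliable choice for any mission shorter than the crossover time, which is the pattern the MTBF figures alone cannot reveal.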
Achieving a highly reliable system requires relentless focus and evaluation of failure-free operating envelopes, including:
Reliability in action means reliability designed and built into all the components of a device, process, or system. Properly interpreting reliability metrics can make the difference in both minimizing exposure and realizing the potential of your assets.