What are your thoughts on the following troubleshooting proposal:
“We’re going to analyze our maintenance records to map out which components were replaced for which fault messages – and then, in the future, whenever Message X appears, we will know that in the past 9 times out of 10 we replaced component Y, so we can go directly to replacing that component.”
It would be nice if pre-emptive maintenance or effective troubleshooting were that simple, but it is not, for several reasons.
The Challenge With Traditional Troubleshooting Practices
A highly reliable system like an aircraft operates almost entirely on the flat bottom part of the well-known reliability “bathtub curve”, which is characterized by random failures. (The front and back parts of that bathtub curve represent failures due to infant mortality and wear-out, respectively.) Random failures are surprises, almost by definition, unlike wear-out failures, which are increasingly expected with age.
To help us out, Built-In Test and On-Board Diagnostic Systems are designed to detect failures and produce fault messages. Automatically isolating the cause of a failure is a different matter, and it is difficult in a complex system. For some messages, the on-board diagnostic systems can reliably identify a single component that needs to be replaced. Other fault messages have an “ambiguity group”, meaning that there is more than one possible cause of the message, and a technician has to diagnose further. These are the messages for which some people hope or believe the maintenance action history holds the answer, nine times out of ten.
Understanding The 9 Times Out of 10 Approach
- First, relatively few components are consistently guilty 90% of the time. A component that truly is guilty nine times out of ten would be replaced on a maintenance schedule before it fails, so only the units that fail abnormally early would fail in service. And, eventually, that component would either be made more reliable, or designed right out of the system. (That’s what reliability engineering does so well.)
- Second, the parts replacement records to be mined seldom tell the whole story. Was the aircraft fixed permanently, or did the problem return later? Was that part replacement the only maintenance action taken for that issue? Was the removed component confirmed faulty in the repair shop? Has the data already been polluted by this ridiculous proposed practice of replacing parts that are most often replaced? (Talk about a self-fulfilling prophecy!)
- Third, such an analysis is much more likely to reveal an array of components with relatively similar failure rates. Selecting a part for replacement just because statistically it has been replaced most often is a recipe for aggravating the infamous No-Fault-Found problem for any parts that are less than 100% guilty.
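To put rough numbers on that third point, here is a minimal sketch in Python. The component names and replacement counts are invented for illustration, and the sketch makes the generous assumption that the historical records reflect the true causes (the second point explains why they often do not). It estimates how often the "replace the historically most common part" shortcut would pull a part that was not at fault.

```python
def first_guess_miss_rate(replacement_counts):
    """Fraction of events in which the most frequently replaced part
    was NOT the part replaced, i.e. how often the shortcut guesses wrong."""
    total = sum(replacement_counts.values())
    if total == 0:
        raise ValueError("no replacement history to analyze")
    top = max(replacement_counts.values())
    return (total - top) / total


# Hypothetical history for a message with one dominant cause (the "9 out of 10" case).
dominant = {"component_Y": 9, "everything_else": 1}

# Hypothetical history for a message whose ambiguity group has similar failure rates.
similar = {"valve": 4, "sensor": 3, "controller": 3}

print(f"dominant cause: {first_guess_miss_rate(dominant):.0%} wrong first guesses")
print(f"similar causes: {first_guess_miss_rate(similar):.0%} wrong first guesses")
```

Under these assumed counts, the shortcut is wrong 10% of the time for the dominant case but 60% of the time for the more typical ambiguity group, and every wrong first guess is a candidate No-Fault-Found removal.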
No, there is no free lunch here, I’m afraid. The right solution requires ‘something extra’ to identify what has failed this time, not last time: additional evidence that differentiates among the possible causes of the reported fault – simple things like additional messages, specific test results, prior behaviour of the aircraft, and recent, seemingly unrelated, maintenance actions.
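One way to picture how that ‘something extra’ helps is a simple Bayesian update, sketched below in Python. This is an illustration under assumed numbers, not a description of how any particular diagnostic product works: the component names, the prior shares (taken from a hypothetical replacement history), and the likelihoods of seeing a companion fault message under each cause are all made up.

```python
def update_with_evidence(prior, likelihood):
    """Bayes' rule: weight each candidate cause by how likely the new
    evidence is under that cause, then renormalize to sum to 1."""
    unnormalized = {cause: prior[cause] * likelihood[cause] for cause in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every candidate cause")
    return {cause: weight / total for cause, weight in unnormalized.items()}


# Hypothetical prior from maintenance history: the valve was replaced most often.
prior = {"valve": 0.4, "sensor": 0.3, "controller": 0.3}

# Hypothetical evidence: a second, companion fault message is also present.
# Assumed probability of seeing that companion message under each cause.
companion_message_likelihood = {"valve": 0.1, "sensor": 0.8, "controller": 0.2}

posterior = update_with_evidence(prior, companion_message_likelihood)
for cause, probability in sorted(posterior.items(), key=lambda item: -item[1]):
    print(f"{cause}: {probability:.0%}")
```

With these assumed figures, history alone points to the valve (40%), but once the companion message is taken into account the sensor becomes the leading suspect at about 71%. The evidence from this event, not the tally from past events, decides which part to pull.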
Using Modern Technology To Overcome Aircraft Maintenance Troubleshooting Challenges
There are modern technologies available now that capture and deliver that information in the Line Maintenance environment – and some of the best of those solutions are offered by ATP. The SpotLight® solution looks like good old-fashioned troubleshooting guidance, and it’s fast and smart. This interactive and collaborative troubleshooting software is a powerful problem-solving tool. It seamlessly merges and delivers curated and validated field experience alongside OEM diagnostic guidance to improve troubleshooting accuracy. New experience is captured, reviewed, and fed back into the system, enabling all support crews to consistently perform like experts.
The SpotLight solution consists of two components: a diagnostic database that contains symptoms, causes and solutions for equipment defects, and a diagnostic reasoning engine that uses this database to optimize the troubleshooting process. It gathers information from multiple sources into a single location that can be updated quickly to reflect new and emerging failure modes and trends. It empowers line maintenance and customer support personnel to quickly isolate the root cause of defects and performance issues, helping them improve “first-time-fix” rates and reduce No-Fault-Found parts.