How to effectively plan for FMEA challenges to improve quality and save costs

A commercial developer of precision special-purpose test equipment asked us for guidance and review of their newly developed Failure Modes and Effects Analysis (FMEA) program for one of their automated test systems. In this case, reliability mattered because any failure meant lost time and revenue. If the test equipment failed during the week-long test sequence, the test had to start from the beginning, severely hurting the customer’s production schedule and threatening the reputation of the test equipment manufacturer. The analysis was a unique challenge because the FMEA focused on the software that controlled the test equipment, rather than on the test equipment hardware, which was largely commercial off-the-shelf (COTS).

Thanks to the early efforts of the military and aerospace industries to prove the merits of FMEA, the analysis has been shown to be applicable to commercial organizations as a way to save money on long-term products and projects, as in the example above. FMEA finds unacceptable consequences of every kind of component failure, identifies undesirable consequences that may be acceptable if they are unlikely, and improves system reliability by exposing weak links in the reliability chain: places where the aggregate reliability of the product might be improved by alternate components or by design changes.

An FMEA can be a tedious undertaking, and without the right expertise it can be error-prone. Fortunately, this risk can be mitigated by a structured process that guides the effort and keeps the analysts on the right track.

Step 1:

You must first decide what the lowest level of abstraction will be for the analysis. For a hardware FMEA, the lowest-level components could be electronic parts like transistors, hardware modules like signal conditioning circuits that occupy part of a circuit board, whole circuit boards, and so forth. For a software FMEA, these components could be variables, messages and control signals among modules, the modules themselves, etc.
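As a rough illustration of this choice for a software FMEA, the sketch below tags each item in an inventory with its abstraction level and selects the items at the agreed lowest level, which are the ones whose failure modes the analysts would postulate. The level names, the inventory entries, and the helper `items_to_analyze` are all hypothetical and are not taken from the project described above.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    """Candidate abstraction levels for a software FMEA (illustrative only)."""
    VARIABLE = "variable"
    MESSAGE = "message"
    MODULE = "module"


@dataclass(frozen=True)
class Component:
    name: str
    level: Level


# Hypothetical inventory for a test-controller program.
INVENTORY = [
    Component("chamber_temp_setpoint", Level.VARIABLE),
    Component("start_test_command", Level.MESSAGE),
    Component("sequence_scheduler", Level.MODULE),
    Component("data_logger", Level.MODULE),
]


def items_to_analyze(inventory, lowest_level):
    """Return the components that sit at the chosen lowest level of abstraction."""
    return [c for c in inventory if c.level is lowest_level]


# Example: the team agrees to analyze at module level.
module_items = items_to_analyze(INVENTORY, Level.MODULE)
```

Choosing a finer level, such as individual variables, lengthens the list of items and therefore the analysis, which is why the decision is worth making explicitly before work starts.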
Step 2:

You must then establish analysis rules that the stakeholders and analysts agree make sense for the particular FMEA, given its ultimate application. The idea is to ensure that each analyst considers the same possible failure modes of the same kinds of low-level components, provides consistent descriptions of the components and their failure modes, and describes the effects of mid-level and system-level failures in a consistent way. We have found that capturing mid-level and system-level failure effects in tables shared by all analysts makes it faster to proceed and avoids different people finding different ways to describe the same thing. Better yet, the table of system-level effects contains a severity rating for each effect (on a simple scale where the most severe effects are usually represented as 1 and benign effects as 4), so the severity figures assigned by different people to the same system-level effect are agreed upon in advance and consistent. The tables also make quality review of the FMEA easier and faster if worksheet cells are linked to table entries, which keeps coordination among the team efficient.
Step 3:

You must have a full understanding of the system to be analyzed. Especially for large or complex systems, someone should be appointed as analysis manager, and that person should assign parts of the analysis to appropriately qualified people if the effort is to run efficiently. Each person should become an expert on their assigned portion of the system and remain productive in their domain. In a system with hardware and software, the system-level consequences of hardware failures that are manifested by software should be determined by the appropriate software person, and vice versa.
Step 4:

The fourth step is performing the analysis and reviewing progress along the way. Fortunately, following the first three steps makes the effort easier, more efficient, and of better quality – where better quality means better consistency, thoroughness and usefulness. After all, the purpose of the FMEA is to gain clarity on the reliability of the system and the available options to bolster that reliability.
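
To make the link between the shared effects table from Step 2 and the review in this step more concrete, here is a minimal sketch. It assumes the system-level effects table is a simple lookup keyed by an effect ID, and that each worksheet row stores that ID rather than free text. The IDs, effect descriptions, severities, and function names are invented for illustration; only the 1 (most severe) to 4 (benign) scale comes from the text.

```python
from dataclasses import dataclass

# Shared table of system-level effects with pre-agreed severities
# (1 = most severe, 4 = benign). IDs and wording are hypothetical.
SYSTEM_EFFECTS = {
    "SE-1": ("Test sequence aborts and must restart", 1),
    "SE-2": ("Test data recorded incorrectly", 2),
    "SE-3": ("Operator warning displayed, test continues", 3),
    "SE-4": ("No noticeable effect", 4),
}


@dataclass
class WorksheetRow:
    component: str
    failure_mode: str
    effect_id: str  # link into SYSTEM_EFFECTS instead of a free-text effect

    @property
    def severity(self) -> int:
        """Severity is looked up, never typed in, so analysts cannot disagree."""
        return SYSTEM_EFFECTS[self.effect_id][1]


def unlinked_rows(rows):
    """Review aid: return rows whose effect_id is not in the shared table."""
    return [r for r in rows if r.effect_id not in SYSTEM_EFFECTS]


def most_severe_first(rows):
    """Sort the correctly linked rows so severity-1 effects come first."""
    linked = [r for r in rows if r.effect_id in SYSTEM_EFFECTS]
    return sorted(linked, key=lambda r: r.severity)


# Hypothetical worksheet entries; the last one carries a bad link on purpose.
rows = [
    WorksheetRow("data_logger", "writes stale value", "SE-2"),
    WorksheetRow("sequence_scheduler", "skips a step", "SE-1"),
    WorksheetRow("start_test_command", "message lost", "SE-9"),
]
```

In this sketch, `unlinked_rows(rows)` flags the entry that points at an effect nobody agreed on, and `most_severe_first(rows)` gives reviewers the highest-severity items to look at first. A spreadsheet with cells linked to a shared table achieves the same thing without any code.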

We have found that this approach works well for every kind of FMEA, and we continually pursue process improvements. We are confident that if you have a product that is in production or has a long lifespan, an FMEA can ultimately save lifecycle cost and increase customer satisfaction by reducing defects and the delays they cause. Explore HBM Prenscia Engineering Services and Solutions for more information and technical papers on this and many other related topics.

Omnicon is pleased to have presented two introductory workshops on this topic, as well as on FTA and FHA, during this year’s ARDC conference in Portland, Oregon. It was a wonderful opportunity to explore these concepts, share best practices with other experts, and find out how they are applied in daily practice.
Join ARDC Connect to download all ARDC presentations, including our “FMEA, FTA and FHA” workshop material, and to stay in touch with the reliability and durability community.