It is inevitable that failures will occur and it is only a matter of time before we are confronted with their effects. Our concern is our ability to anticipate failures and to respond when they occur. How soon is too soon to respond to a change or shift in the process? Do we shut down the process at the very instant a defect is discovered? How do we know what conditions warrant an immediate response?
The quality of a product is directly dependent on the manufacturing process used to produce it and, as we know all too well, tooling, equipment, and machines are subject to wear, tear, and infinitely variable operating parameters. As a result, it is imperative to understand those process parameters and conditions that must be monitored and to develop effective responses or corrective actions to mitigate any negative direct or indirect effects.
Statistical process control techniques have been used by many companies to monitor and manage product quality for years. Average-Range and Individual-Moving Range charts, to name a few, have been used to identify trends that are indicative of process changes. When certain control limits or conditions are exceeded, production is stopped and appropriate corrective actions are taken to resolve the concern. Typically the corrective actions are recorded directly on the control chart.
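As a rough illustration of how an Individual-Moving Range chart turns measurements into a stop condition, the sketch below computes limits from a short baseline and flags readings outside them. The constants 2.66 and 3.267 are the standard chart factors for a moving range of two consecutive points. The function names and the sample readings are hypothetical, and this is a minimal sketch rather than a full set of control chart rules.

```python
from statistics import mean

def imr_limits(baseline):
    """Compute Individuals-Moving Range chart limits from baseline readings."""
    if len(baseline) < 2:
        raise ValueError("need at least two measurements")
    # Moving range: absolute difference between consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    x_bar = mean(baseline)
    mr_bar = mean(moving_ranges)
    return {
        "center": x_bar,
        "ucl": x_bar + 2.66 * mr_bar,   # upper limit for individual readings
        "lcl": x_bar - 2.66 * mr_bar,   # lower limit for individual readings
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,       # upper limit for the moving range
    }

def out_of_control(readings, limits):
    """Return the indexes of readings outside the individual limits."""
    return [
        i for i, value in enumerate(readings)
        if value > limits["ucl"] or value < limits["lcl"]
    ]

if __name__ == "__main__":
    limits = imr_limits([10, 12, 11, 13, 12])   # hypothetical baseline data
    print(limits)
    print(out_of_control([12, 16, 7, 11], limits))
```

A point outside the limits is only the simplest signal. Runs and trends inside the limits can also indicate a process change, and they are not checked here.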
Process parameters and product characteristics may be closely correlated; however, few companies make the transition to relying on process parameters alone. One reason for this is the lack of available data, particularly at launch, to establish effective operating ranges for process parameters. While techniques such as Design of Experiments can be used, the limited data set rarely provides an adequate sample size to determine conclusive parameter ranges for long-term use.
Learning In Real-Time
It is always in our best interest to use the limited data that is available to establish a measurement baseline. The absence of extensive history does not exempt us from making “calculated” adjustments to our process parameters. The objective of measuring and monitoring our processes and product characteristics is to learn how our processes are behaving in real time. In too many cases, however, operating ranges have not evolved with the product development cycle.
Although we may not have established the full operating range, any changes outside of historically observed settings should be cause for review and possibly cause for concern. Again, the objective is to learn from any changes or deviations that are not within the scope of the current operating condition.
A trigger event occurs whenever a condition exceeds established process parameters or operating conditions. This includes failure to follow prescribed or standardized work instructions. If we fail to understand why the “new” condition developed, why it is needed, or why it must be accepted, we jeopardize process integrity and may lose the opportunity for learning.
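One way to picture this is to treat the historically observed settings as the working operating range and flag anything outside it. The sketch below does only that. The parameter names and values are hypothetical, and it does not cover the other trigger mentioned above, a failure to follow standardized work instructions.

```python
def observed_ranges(history):
    """Build {parameter: (min, max)} from a list of past setting records."""
    ranges = {}
    for record in history:
        for name, value in record.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges

def trigger_events(current, ranges):
    """Return (parameter, value, reason) for each setting that needs review."""
    events = []
    for name, value in current.items():
        if name not in ranges:
            # A parameter with no history at all is also worth a review.
            events.append((name, value, "no history"))
            continue
        low, high = ranges[name]
        if value < low or value > high:
            events.append((name, value, f"outside observed range {low}..{high}"))
    return events

if __name__ == "__main__":
    # Hypothetical settings recorded on earlier runs.
    history = [
        {"temp": 200, "pressure": 50},
        {"temp": 210, "pressure": 48},
    ]
    ranges = observed_ranges(history)
    print(trigger_events({"temp": 215, "pressure": 49}, ranges))
```

A flagged setting is a prompt for review, not a verdict. Once the change is understood and accepted, the new value can be added to the history so the range evolves with the process.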
Our ability to detect or sense “abnormal” process conditions is critical to maintaining effective process controls. A disciplined approach is required to ensure that any deviations from normal operating conditions are thoroughly reviewed and understood with applicable levels of accountability.
An immediate response is required whenever a Trigger Event occurs to facilitate the greatest opportunity for learning. “Cold Case” investigations based on speculation tend to align facts with a given theory rather than determining a theory based solely on the facts themselves.
Recurring variances or previously observed deviations within the normal process may be cause for further investigation and review. As mentioned in previous posts, “Variance – OEE’s Silent Partner” and “OEE in an Imperfect World”, one of our objectives is to reduce or eliminate variance in our processes.
Interactions and Coupling
When we consider the definition of normal operating conditions, we must be cognizant of possible interactions. Two conditions that are harmless when observed during separate events may create chaos if they occur at the same time. I have observed multiple equipment failures where we subsequently learned that two machines on the same electrical grid cycled at the exact same time. One machine continued to cycle without incident while a catastrophic failure occurred on the other.
Although the chance of cycling the machines at the exact same moment was slim and deemed not to be a concern, reality proved otherwise. Note that monitoring each machine separately showed no signs of abnormal operation or excessive power spikes. One of the machines (a welder) was moved to a different location in the plant operating on a separate power grid. No failures were observed following the separation.
Another situation occurred where multiple machines were attached to a common hydraulic system. Under normal circumstances up to 70% of the machines were operating at any given time. On some occasions it was noted that an increase in quality defects occurred with a corresponding decrease in throughput although no changes were made to the machines. In retrospect, the team learned that almost all of the machines (90%) were running on those occasions. Later investigation showed that the hydraulic system could not maintain a consistent system pressure when all machines were in operation. To overcome this condition, boosters were added to each of the hydraulic drops to stabilize the local pressure at the machine.
To summarize our findings here, we need to make sure we understand the system as a whole as well as the isolated, machine-specific parameters. Any potential interactions or effects of process coupling must be considered in the overall analysis.
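A system-level check of this kind can be sketched with nothing more than the run intervals of each machine on the shared system. The example below finds the peak number of machines running at once and compares it with the 70% level that was normal in the hydraulic case above. The function names, the interval format, and the sample data are assumptions made for illustration.

```python
def peak_concurrency(intervals):
    """Return the largest number of (start, end) intervals active at once."""
    events = []
    for start, end in intervals:
        events.append((start, 1))    # machine starts running
        events.append((end, -1))     # machine stops running
    # At equal timestamps, stops sort before starts, so back-to-back
    # runs are not counted as overlapping.
    events.sort(key=lambda event: (event[0], event[1]))
    running = 0
    peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak

def exceeds_observed_load(intervals, machine_count, observed_fraction=0.70):
    """True when the peak share of running machines is above the usual level."""
    return peak_concurrency(intervals) / machine_count > observed_fraction

if __name__ == "__main__":
    # Hypothetical shift: 9 of 10 machines on one hydraulic system overlap.
    runs = [(0, 10)] * 9
    print(peak_concurrency(runs), exceeds_observed_load(runs, machine_count=10))
```

Each machine can look normal in isolation while this check still fires, which is the reason to monitor the shared resource as well as the individual machines.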
I recommend using a simple reporting system to gather the facts and relevant data. The objective is to gain sufficient data to allow for an effective review and assessment of the trigger condition and to better understand why it occurred.
It is important to note that a trigger event does not automatically imply that product is non-conforming. It is very possible, especially during new product launches, that the full range of operating parameters has not yet been realized. As such, we simply want to ensure that we are not changing parameters arbitrarily without exercising due diligence to ensure that all effects of the change are understood.
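The simple reporting system recommended above could start as a single record per trigger event. The sketch below is one possible shape for that record. Every field name is a hypothetical choice, not a prescribed format, and the default for product_affected reflects the point that a trigger event does not automatically mean non-conforming product.

```python
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class TriggerReport:
    """One record per trigger event, captured while the facts are fresh."""
    machine: str
    parameter: str
    observed_value: float
    expected_low: float
    expected_high: float
    reported_by: str
    occurred_at: datetime
    # A trigger event does not by itself imply non-conforming product.
    product_affected: bool = False
    notes: list = field(default_factory=list)

    def add_note(self, text):
        """Append an observation gathered during the review."""
        self.notes.append(text)

    def summary(self):
        """One-line description of what was seen against what was expected."""
        return (
            f"{self.machine}: {self.parameter}={self.observed_value} "
            f"(expected {self.expected_low}..{self.expected_high})"
        )

if __name__ == "__main__":
    # Hypothetical event used only to show the record in use.
    report = TriggerReport(
        machine="Press 4",
        parameter="temp",
        observed_value=215.0,
        expected_low=200.0,
        expected_high=210.0,
        reported_by="operator",
        occurred_at=datetime(2011, 2, 8, 9, 0),
    )
    report.add_note("Setting raised after material lot change.")
    print(report.summary())
```

Keeping the record this small lowers the barrier to filling it in at the moment the event occurs, when the evidence is still available.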
After a 10-month federal investigation into the cause of “Sudden Unintended Acceleration”, the results were finally released on February 8, 2011, stating that no electronic source was found to cause the problem. According to a statement released by Toyota, “Toyota welcomes the findings of NASA and NHTSA regarding our Electronic Throttle Control System with intelligence (ETCS-i) and we appreciate the thoroughness of their review.”
The findings do, however, implicate some form of mechanical failure and do not necessarily rule out driver error. A mechanical failure is a foreseeable cause for concern and was seriously considered as part of Toyota’s initial investigation, whose findings also included a concern with floor mats. While the problem is very real, the root cause may remain a mystery. Although the timeline for this problem has extended for more than a year, it demonstrates the importance of gathering as much vital evidence as possible as events are unfolding.
A Follow Up to Sustainability
When a product has reached maximum market penetration it becomes vulnerable. According to USA Today, “Activision announced it was cancelling a 2011 release of its massive music series Guitar Hero and breaking up the franchise’s business unit citing profitability as a concern.”
I find it hard to imagine all of the Guitar Hero games now becoming obsolete and eventual trash. The life span of the product has exceeded the company’s ability to support it. This is a sad state of affairs.
Until Next Time – STAY lean!