Messages from the machine: Maintenance process goes smart
Predictive asset maintenance uses data to predict equipment failures, pinpoint causes
By Ashe Menon, National Oilwell Varco
An earlier paper (SPE/IADC 128865) discussed how to make predictive asset maintenance a reality in the drilling industry. Further research has since been conducted, and a field-proven solution for implementing predictive asset maintenance has been developed.
From the beginning it was known that the “playbook” of conventional, tribal knowledge and technology infrastructure had to be thrown out. Given known limitations of operator/contractor demands, of communication via satellite and of space/support at a rig level, a solution was developed that can operate on the rig in “near time” and let the equipment tell the user when it needs attention.
New processes and solutions were applied to three rigs, and the testing process involved applying statistical correlation analysis to understand “data.” Different types of sensors were also analyzed, as well as how they can work together to provide more predictability for failures and more understanding of the impact that a human has on the success or failure of the system. The application not only provided a measurable reduction in downtime but also had an impact on behavior of the rig crew.
Statistical analysis software was applied, and it became apparent that the lack of standardization in the data collection process is preventing the industry from making significant leaps in this area. A correlation analysis of 10 years of operational and maintenance data from five drilling contractors was conducted, and the result was clear: The No. 1 cause of downtime was “other.”
The evidence showed that there is a need to clean up the data collection process from the field when there is a problem. This will help eliminate people filling out unplanned work orders or service tickets where the problem would have read “top drive not working.” Instead, predictive asset maintenance would drill down to the component of the top drive that actually caused the problem. This necessitated the adoption of ISO 14224 to classify asset hierarchy; this would help to better categorize failures and troubleshooting mechanisms for equipment specific to the drilling industry.
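To make the contrast with a free-text ticket concrete, the sketch below records each failure against a position in an equipment hierarchy and then totals downtime per component. It is only an illustration: the level names (equipment, subunit, component), the field names and the sample values are assumptions made up for this example, not taken from ISO 14224 or from the field data.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureRecord:
    """One structured failure report (hypothetical fields, for illustration)."""
    equipment: str         # e.g. "top drive"
    subunit: str           # e.g. "gearbox"
    component: str         # e.g. "bearing"
    failure_mode: str      # e.g. "wear"
    downtime_hours: float


def downtime_by_component(records):
    """Total downtime per (equipment, subunit, component), largest first."""
    totals = Counter()
    for r in records:
        totals[(r.equipment, r.subunit, r.component)] += r.downtime_hours
    return totals.most_common()


# Made-up sample records, not real rig data.
records = [
    FailureRecord("top drive", "gearbox", "bearing", "wear", 6.0),
    FailureRecord("top drive", "gearbox", "bearing", "overheating", 4.5),
    FailureRecord("top drive", "hydraulics", "seal", "leak", 2.0),
]

for location, hours in downtime_by_component(records):
    print(" / ".join(location), hours)
```

With records like these, a ranking of downtime causes points to a specific component rather than to "top drive not working" or "other."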
The aforementioned test laid the foundation for further testing to be conducted to make condition monitoring (CM), condition-based maintenance (CBM) and predictive maintenance (PdM) a reality. In simple terms, CM would monitor the condition of the asset, which could involve adding sensors (temperature, pressure, vibration, etc), developing new sensors (debris monitors) or using existing data (RPM, torque, horsepower, etc).
CBM involves using the data obtained in the previous step to conduct maintenance only when required. For example, metal particles in the oil mean that the gearbox should be inspected. PdM means getting smarter with the data that is available, helping users make better decisions.
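The gearbox example can be written as a simple condition-based rule. This is a minimal sketch: the `OilSample` type, the `cbm_action` function and the 50 ppm limit are hypothetical names and values chosen for illustration, and a real limit would come from the equipment maker or from oil-analysis history.

```python
from dataclasses import dataclass

# Hypothetical alarm limit, for illustration only.
DEBRIS_LIMIT_PPM = 50.0


@dataclass
class OilSample:
    asset: str
    metal_particles_ppm: float


def cbm_action(sample: OilSample, limit_ppm: float = DEBRIS_LIMIT_PPM) -> str:
    """Recommend maintenance from the measured condition, not the calendar."""
    if sample.metal_particles_ppm > limit_ppm:
        return f"inspect gearbox on {sample.asset}"
    return "no action"


print(cbm_action(OilSample("top drive", 72.0)))
print(cbm_action(OilSample("top drive", 12.0)))
```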
Armed with new knowledge, crews wouldn’t have to stop everything and tear the equipment apart to inspect the gearbox. This is because historical data has shown the equipment typically runs approximately three months after the initial signs of wear, so the user knows there is no need to worry right now. The machine is simply monitored more closely for changes that might alter the original plan.
The bane of solutions like these is the “false alarm.” Trust in such systems is inversely proportional to the number of false alarms they generate. To address this, a significant amount of time was spent vetting the alarms and understanding the impact of ambient noise and the variability of what would be considered “normal operation.” A piece of equipment can sound different under different operating conditions, such as the type of rig it is on or the kind of formation it is drilling. It was therefore necessary to make sure that the system was smart enough to learn its new baseline.
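One way to picture a baseline that is learned per operating condition is sketched below. It keeps a running mean and variance (Welford's method) for each context, such as a rig type and formation pair, and raises an alarm only when a reading falls outside a band around that context's own baseline. The class name, the three-sigma band and the warm-up count are assumptions for illustration; the article does not describe the algorithm actually used.

```python
import math
from collections import defaultdict


class ContextBaseline:
    """Running mean/variance per operating context (Welford's method)."""

    def __init__(self, k_sigma=3.0, warmup=20):
        self.k_sigma = k_sigma
        self.warmup = max(warmup, 2)  # need at least 2 samples for a variance
        # context -> [count, mean, sum of squared deviations]
        self.stats = defaultdict(lambda: [0, 0.0, 0.0])

    def update(self, context, value):
        """Return True if value is abnormal for this context.

        Normal readings are folded into the baseline; alarms are not,
        so a developing fault does not become the new "normal".
        """
        s = self.stats[context]
        n, mean, m2 = s
        alarm = False
        if n >= self.warmup:
            std = math.sqrt(m2 / (n - 1))
            alarm = abs(value - mean) > self.k_sigma * max(std, 1e-9)
        if not alarm:
            n += 1
            delta = value - mean
            mean += delta / n
            m2 += delta * (value - mean)
            s[0], s[1], s[2] = n, mean, m2
        return alarm


baseline = ContextBaseline(k_sigma=3.0, warmup=5)
soft = ("jackup", "soft formation")   # hypothetical context labels
hard = ("jackup", "hard formation")
for v in [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]:
    baseline.update(soft, v)

print(baseline.update(soft, 1.02))  # close to the learned baseline
print(baseline.update(soft, 5.0))   # far outside the learned baseline
print(baseline.update(hard, 5.0))   # new context: still learning, no alarm
```

Keeping a separate baseline per context is what stops a move to a harder formation from triggering alarms simply because the equipment now "sounds" different.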
This doesn’t just prevent false alarms but also provides understanding on “remaining life.” A piece of equipment working hard in extremely difficult formations would have a significantly different useful life compared with one that is operating in soft formations.
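As a rough illustration of why operating conditions change remaining life, the sketch below charges each operating hour against a rated life, weighted by how severe the formation is. The linear model, the severity weights and the 5,000-hour rated life are all assumptions invented for this example; they are not values from the field tests.

```python
# Hypothetical weights: hours of rated life consumed per operating hour.
SEVERITY = {"soft": 0.5, "medium": 1.0, "hard": 2.0}


def remaining_life_hours(rated_life_hours, usage_log):
    """Estimate remaining life from a log of (formation, hours) entries."""
    consumed = sum(SEVERITY[formation] * hours for formation, hours in usage_log)
    return max(rated_life_hours - consumed, 0.0)


# Same operating hours, different formations, different remaining life.
print(remaining_life_hours(5000, [("soft", 1000)]))
print(remaining_life_hours(5000, [("hard", 1000)]))
```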
Identifying the different failure modes and the criticality of each failure helped to determine whether to add a sensor. Low criticality was not the reason for leaving a sensor out; a sensor was added only if the failure could not be identified from existing signals in the control system.
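The decision rule described above can be sketched in a few lines. The failure modes, signal names and criticality labels here are hypothetical; the point is only that the decision depends on whether existing signals reveal the failure, not on its criticality.

```python
# Hypothetical failure-mode table for one asset class.
FAILURE_MODES = {
    "motor overheating": {"criticality": "high", "existing_signals": ["current", "torque"]},
    "gearbox wear":      {"criticality": "high", "existing_signals": []},
    "blower degradation": {"criticality": "low", "existing_signals": []},
}


def needs_new_sensor(mode):
    """A sensor is added only if no existing control-system signal shows the failure."""
    return len(FAILURE_MODES[mode]["existing_signals"]) == 0


for mode in FAILURE_MODES:
    print(mode, "->", "add sensor" if needs_new_sensor(mode) else "use existing signals")
```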
This was an important part of the process because it meant one fewer thing for rig personnel to maintain and because additional sensors meant additional risks for dropped objects. Further, efforts to avoid adding sensors involved additional analyses of data (RPM, torque, power, current, etc). This helped to identify data patterns as predictors of future failures, driving the development of supervised and unsupervised learning modes for different asset classes.
In supervised learning, the system learns a failure pattern from a past failure. In unsupervised learning, the system has not seen the failure before but recognizes that something is different. If a pattern flagged by unsupervised learning progresses into a failure, the system saves it as a supervised profile to prevent future failures. This pattern is also compared against historical data and failures on other rigs to ensure that it is not an anomaly. Once deemed valid, this profile is shared across all rigs that have the same asset class in a process known as transfer learning.
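The life cycle of a pattern, from unknown anomaly to shared failure profile, can be sketched as simple bookkeeping. Patterns are reduced to string labels here, and the class and function names are hypothetical. Sharing is shown as a plain copy of validated profiles between rigs, which stands in for whatever mechanism the real system uses.

```python
from dataclasses import dataclass, field


@dataclass
class ProfileLibrary:
    """Per-rig store of failure profiles, keyed by asset class."""
    supervised: dict = field(default_factory=dict)  # known failure patterns
    candidates: dict = field(default_factory=dict)  # anomalies not yet explained

    def classify_anomaly(self, asset_class, pattern):
        """Label a pattern that has already been flagged as abnormal."""
        if pattern in self.supervised.get(asset_class, set()):
            return "known failure pattern"
        self.candidates.setdefault(asset_class, set()).add(pattern)
        return "unknown anomaly"

    def confirm_failure(self, asset_class, pattern):
        """Promote an anomaly that progressed into a failure to a supervised profile."""
        self.candidates.get(asset_class, set()).discard(pattern)
        self.supervised.setdefault(asset_class, set()).add(pattern)


def share_profiles(source, fleet, asset_class):
    """Copy validated profiles to other rigs that have the same asset class."""
    validated = source.supervised.get(asset_class, set())
    for rig in fleet:
        rig.supervised.setdefault(asset_class, set()).update(validated)


rig_a, rig_b = ProfileLibrary(), ProfileLibrary()
print(rig_a.classify_anomaly("top drive", "rising torque at constant RPM"))
rig_a.confirm_failure("top drive", "rising torque at constant RPM")
share_profiles(rig_a, [rig_b], "top drive")
print(rig_b.classify_anomaly("top drive", "rising torque at constant RPM"))
```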
Another challenge was ensuring the system works in a disconnected mode. Rigs rely on satellite for connectivity, and data-intensive systems typically require an “always on” high-bandwidth connection. The system was therefore built with compression technologies that allowed the user to send only required data; it also has sufficient intelligence on the rig so all data does not have to be sent all the time.
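The article does not say which compression technologies were used. As one common example of sending only required data, the sketch below applies a deadband (report-by-exception) filter on the rig: a reading is transmitted only when it differs from the last transmitted value by more than a set amount. The function name, the deadband value and the sample readings are assumptions for illustration.

```python
def deadband_filter(samples, deadband):
    """Keep a (timestamp, value) sample only if the value moved by more
    than `deadband` since the last sample that was kept."""
    sent = []
    last = None
    for timestamp, value in samples:
        if last is None or abs(value - last) > deadband:
            sent.append((timestamp, value))
            last = value
    return sent


# Made-up readings: six samples collected on the rig.
readings = [(0, 100.0), (1, 100.2), (2, 100.1), (3, 103.0), (4, 103.1), (5, 99.0)]
to_send = deadband_filter(readings, deadband=1.0)
print(to_send)
print(f"{len(to_send)} of {len(readings)} samples sent")
```

The full-resolution data can stay on the rig for local analysis, while only the changes that matter cross the satellite link.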
Figures 1 and 2 show how the system becomes smarter. It starts by letting the user know there is a potential problem and where to look. Over time, it starts letting the user know what to do. When combined with vibration sensors and debris monitors, the level of predictability increases exponentially.
Part of the motivation for doing this is the recurring concern that the overall experience level of personnel in the industry is going down. Compared with a pilot who might fly a single-engine Cessna, drilling personnel are operating rigs that might cost more than a half-billion dollars. Industry must provide its personnel with as much help as possible and make it simpler for them to do their jobs.
We were able to send a man to the moon in 1969 even though the entire processing power of the onboard computers at that time was probably less than that of some high-end laptops today. As former NASA flight director Gene Kranz said, “Failure is not an option.” We have the tools and technologies to make really smart machines, and we no longer have an excuse not to do it.
This article is based on a presentation at the 2013 IADC Advanced Rig Technology Conference & Exhibition, 17-18 September, Stavanger, Norway.