Software change management on modern drilling rigs must go beyond ‘checking the box’
Management buy-in, detailed deployment plan among best practices for rigorous software MOC program
By Christopher Goetz, Kingston Systems
For the drilling industry, managing software has long been a peripheral concern in rig operations. However, as new rigs with increasingly complex control systems are delivered, effective software change management is critical in avoiding downtime. Software is ubiquitous on modern rigs, controlling everything from dynamic positioning to ram activation and drilling automation. Software regression, malware and cyber attacks are critical issues that cause nonproductive time (NPT) and could lead to a lost-time incident (LTI) or an HSE incident.
Question yourself. Why is a permit to work (PTW) needed for a simple welding job, while a vendor can make untested drawworks code changes without the operator’s knowledge of the event or full understanding of the implications to safe operations? Is our maintenance and safety focus on the wrong area? No, we need to expand that focus to include software and control systems.
Software management of change (SMOC) with cybersecurity protocols is the solution. SMOC is a set of procedures and policies designed to control, track and understand changes to software systems for the purpose of increasing predictability, disaster recovery, auditability and overall reliability. This article reinforces the benefits and covers key implementation tips and best practices.
Case studies
Operators are struggling to consistently implement solid SMOC programs, which are often put in place with a “check the box” mentality. Consider these field observations from Kingston Systems:
1. Lack of planning: A software upgrade designed for a rig with a shorter derrick was installed, and it changed the upper stop limit set point without anyone realizing it. The result was a collision between the top drive and the top of the drill pipe. The pipe was bent out of position and was in danger of popping out of the vertical pipe handler gripper arm.
2. Poor testing plan: A software change was made to zone management settings and was retested between two machines. The interaction with a third machine was not tested and caused a collision resulting in injury and two months of critical machinery downtime.
3. Upgrade installation failure: A software change request (SCR) was filed and approved. Time was allocated under the PTW process, and other users were locked out of the network and from access to affected machinery. Unfortunately, the technician was unknowingly provided with a bad release package. To further complicate the situation, no offsite support was available. When contact with the home office was reestablished, the missing files were sent but blocked by antivirus software. Eventually, an alternate route for software delivery was found. After the installation was completed, it was found to be incorrectly programmed. Several hours were lost performing tasks that should have been done offline, a loss that better planning and communication would have prevented.
4. Not following process: A drilling contractor may have all SMOC policies and procedures in place, but the culture is often not there to execute. During a troubleshooting investigation, the rig chief electrician and ET were observed, laptop open, making changes to the PLC ladder logic for the drawworks. They were doing so without a PTW and without following the corporate SMOC process. While the rig management faced some challenging questions from their client, the NPT continued, and the vendor was additionally challenged with a significant version control quandary.
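The first case turned on a set point that an upgrade changed silently. One way to catch that kind of regression is to compare set points exported before and after the change against the list of changes the upgrade was expected to make. The sketch below is illustrative only: it assumes set points can be exported as name/value pairs, and the set point names and values are hypothetical, not taken from any vendor's system.

```python
from typing import Dict, Iterable, Optional, Tuple

Change = Tuple[Optional[float], Optional[float]]


def diff_setpoints(before: Dict[str, float], after: Dict[str, float]) -> Dict[str, Change]:
    """Return {name: (old, new)} for every set point added, removed or changed."""
    changes: Dict[str, Change] = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)  # None if the set point did not exist before
        new = after.get(name)   # None if the set point was removed
        if old != new:
            changes[name] = (old, new)
    return changes


def unexpected_changes(before: Dict[str, float], after: Dict[str, float],
                       expected: Iterable[str]) -> Dict[str, Change]:
    """Keep only the changes that the change request did not declare."""
    declared = set(expected)
    return {name: change
            for name, change in diff_setpoints(before, after).items()
            if name not in declared}


# Hypothetical exports taken before and after an upgrade.
before = {"top_drive_upper_stop_m": 52.0, "drawworks_max_speed_mps": 1.2}
after = {"top_drive_upper_stop_m": 48.5, "drawworks_max_speed_mps": 1.2}

# The change request declared no set point changes, so any difference is flagged.
flagged = unexpected_changes(before, after, expected=[])
for name, (old, new) in flagged.items():
    print(f"UNEXPECTED: {name} changed from {old} to {new}")
```

A check like this does not replace post-change testing on the equipment; it only narrows down where to look before the machine is returned to service.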
The solution
Management of change (MOC) is a well-established concept that can be effectively applied to software. A good software MOC program consists of a set of policies supported by procedures and tools to control and track changes to software and its configuration. To be effective, it must be funded and supported by executive-level management and tied to existing maintenance programs. SMOC allows management to make educated decisions about what program changes are being made, why and what they will affect. It also enforces implementation planning, testing and recovery procedures for changes, along with an audit trail. The basic components and, thus, the blueprint for a typical SMOC program are:
• A corporate process: A corporately established policy that defines roles, workflows, procedures and boundaries for shore and offshore management, crew and vendors. Management buy-in is critical to ensure success.
• Training: Responsibility for SMOC goes well beyond the ET. Other key personnel – the driller, for example – need to understand their fiduciary responsibility to prevent unwarranted system access and to control post-change testing. A best practice is to have a complete training regimen on roles and competencies prepared for all roles, from vendors to managers.
• Registry (Figure 2): An inventory of software assets that defines the versioning on each asset. The first step in managing something is knowing what you have, and a registry accomplishes this task.
• Change request form: A form capturing the reason for the change, the pre- and post-change testing required and a recovery plan. The testing component is especially important because it forces implementers to stop and think about their actions and the implications.
• Corporate and rig-specific rollout: No application or mandated process will be successful without a clear rollout plan. The rollout plan should incorporate resources for the initial additional workload, training time, role adjustment, funding, as well as a capacity to audit the implementation and reward successful behaviors.
• Cybersecurity: Tied to the corporate operational technology cybersecurity defense plan, the SMOC should include applicable steps, including ISA 62443 assessments, on each rig asset to identify and mitigate cyber attack conduits.
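The registry and change request form described above can be sketched as simple data structures. This is a minimal illustration, not a description of any particular SMOC tool: the class names, field names and the drawworks example are assumptions made for the sketch. It shows the registry refusing a change unless the request states a reason, pre- and post-change tests and a recovery plan and has been approved, and recording each applied change in an audit trail.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SoftwareAsset:
    asset_id: str
    description: str
    version: str


@dataclass
class ChangeRequest:
    asset_id: str
    new_version: str
    reason: str
    pre_tests: List[str]
    post_tests: List[str]
    recovery_plan: str
    approved: bool = False

    def missing_fields(self) -> List[str]:
        """List the required parts of the form that were left empty."""
        missing = []
        if not self.reason.strip():
            missing.append("reason")
        if not self.pre_tests:
            missing.append("pre_tests")
        if not self.post_tests:
            missing.append("post_tests")
        if not self.recovery_plan.strip():
            missing.append("recovery_plan")
        return missing


@dataclass
class Registry:
    assets: Dict[str, SoftwareAsset] = field(default_factory=dict)
    audit_trail: List[str] = field(default_factory=list)

    def register(self, asset: SoftwareAsset) -> None:
        self.assets[asset.asset_id] = asset

    def apply(self, request: ChangeRequest) -> None:
        """Record a version change only if the request is complete and approved."""
        if request.asset_id not in self.assets:
            raise KeyError(f"unknown asset: {request.asset_id}")
        missing = request.missing_fields()
        if missing:
            raise ValueError(f"incomplete change request: {missing}")
        if not request.approved:
            raise PermissionError("change request has not been approved")
        asset = self.assets[request.asset_id]
        self.audit_trail.append(
            f"{asset.asset_id}: {asset.version} -> {request.new_version} ({request.reason})"
        )
        asset.version = request.new_version


# Hypothetical usage: one asset and one complete, approved change request.
registry = Registry()
registry.register(SoftwareAsset("drawworks-plc", "Drawworks PLC program", "2.3.0"))
request = ChangeRequest(
    asset_id="drawworks-plc",
    new_version="2.4.0",
    reason="Vendor fix for a reported fault",
    pre_tests=["Back up running program", "Verify release package offline"],
    post_tests=["Function test drawworks", "Test zone management with adjacent machines"],
    recovery_plan="Reload version 2.3.0 from offline backup",
    approved=True,
)
registry.apply(request)
print(registry.audit_trail[-1])
```

In practice the approval step, roles and storage would follow the corporate process; the point of the sketch is that an incomplete form stops the change before it reaches the equipment.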
Best practices
Implementing SMOC is not without its challenges. The common pitfalls below each come with a best practice that, if followed, can help to increase the program’s effectiveness and the rig’s level of protection (Figure 1).
1. Lack of corporate support: It can be consistently seen that corporate management is quick to add workload and requirements for activities but slow to add funding and technical or service support for the key policies and procedures that must be in place for SMOC to take hold. This lack of a clear mandate sends a signal to maintenance crews that they must figure this out on their own. A defined corporate mandate with funding and support for the initial design, roll-out and long-term maintenance of SMOC is the solution.
2. Deployment support: The change cannot happen overnight. As with any new process, there is a deployment front-end load. A large amount of activity must be done before SMOC can be effective. To be done correctly, the contractor should have a dedicated deployment team that attacks this front-end workload, provides training and coaching and gets the system up and running. Additionally, make plans to audit and continually review and renew the program at least annually. These points often do not happen. Instead, the new SMOC process, roles and tools are emailed out, and the expectation is that SMOC should be in place immediately. Corporate is often surprised when, one year later, there has been little positive motion, and SMOC is still not functioning. Deploying SMOC is a large endeavor. The cultural and personnel needs have to be addressed as part of the deployment plan. Safety is now integrated into daily operations.
3. Roles and responsibilities not enforced: If the SMOC gets little attention from corporate management, the defined roles and responsibilities carry little weight in the field. While all field personnel have a defined role in SMOC, in practice these responsibilities are ignored, and the job falls to a single individual. Often, the role is taken onboard with resentment due to the poor role definition, support and training. Thus, the workload becomes unmanageable, and it is poorly and inconsistently applied. Kingston Systems has seen wide variance in execution levels between facilities and even between crews on a single facility. A cultural adjustment is required to make SMOC work. This requires a team-based deployment and support plan.
4. No training: A consistent gap Kingston Systems sees in audits is the lack of even the most basic training to relay the who, how, what and whys of SMOC. The rigor of following SMOC requires a cultural change and requires that everyone understands their role in making the process work. The lack of training compounds the lack of corporate and deployment support. Without these, the roles and responsibilities are misconstrued, and the application becomes weak and inconsistent. An inconsistent SMOC program falls critically short of meeting its mission of protecting code and securing hardware.
5. Vendor inclusion: Vendors of the control systems have been slow to support the owners in their SMOC deployment efforts. However, this is changing. Several vendors have started presenting better SMOC solutions and have been more open about sharing versioning and other components with drilling contractors and equipment owners.
6. Incorporate cybersecurity: The threat of cyber attacks on operational technology (OT) is not coming – it is here. Most SMOC programs give only cursory mention to cybersecurity via basic security measures and outlining roles. The better approach is to ensure that the OT cybersecurity plan and SMOC policy and procedures are integrated and working in unison. One way to do this is to perform detailed security risk assessments on each rig in scope. The current best template approach for this is the cybersecurity risk assessment program outlined by ISA 62443-3-2.
The reward
A good SMOC program provides increased system uptime via stability, predictability and accountability. It ensures only tested and well-understood software changes are installed and that any changes made can be recovered in a timely manner. A strong SMOC will deliver improved vendor and client relationships, better disaster recovery, a strong maintenance crew and reduced NPT. It will pay for itself many times over with reduced NPT alone. Go beyond the “check the box” approach and implement a best-in-class SMOC program.
This article is based on a presentation at the 2015 IADC Asset Integrity & Reliability Conference & Exhibition, 16-17 September, Houston.