Centralized change management at Calpine Case Study

  • Case Study
  • Change management
  • ITIL

February 25, 2021

Calpine Corporation is America’s largest generator of electricity from natural gas and geothermal resources. Calpine recently transformed its IT infrastructure from a fragmented collection of localized systems into a single, centralized system. This paper discusses how Calpine implemented a centralized change management system to carry out that transformation across its sites.

Introduction

Calpine Corporation [1] is America’s largest generator of electricity from natural gas and geothermal resources with robust commercial, industrial, and residential retail operations in key competitive power markets. Founded in 1984, Calpine uses advanced technologies to generate power in an efficient, cost-effective, and environmentally responsible manner. With 76 power plants in operation or under construction, Calpine’s fleet has the capacity to generate nearly 26,000 megawatts of electricity, which is enough to power approximately 20 million homes. The fleet capitalizes on trends in the nation’s most robust power markets.

The power plant fleet is operated to the highest standards of safety, reliability and efficiency; values that lie at the heart of Calpine’s company mission. This unwavering focus on operational excellence benefits employees, customers, and communities.

Complex Business Needs

Calpine is a complex business composed of many business areas, such as power operations, commercial operations, retail operations, and supply chain management. Each business area performs a different, yet vitally important, role that must align with the other business areas to co-create value.

Calpine’s North American operations are divided into three regions: East, Texas, and West. Each region of the United States of America (USA) has entities that support reliability efforts, each with its own policies. For example, the Electric Reliability Council of Texas (ERCOT), an Independent System Operator (ISO) that manages the flow of electric power in Texas, requires a greater amount of electricity in the summer because the state’s very hot climate drives greater use of air conditioning. The changeable weather in spring and autumn also causes the load on the electricity system to fluctuate, as both demand and wind output change.

Calpine must also meet market needs, as well as North American Electric Reliability Corporation (NERC) [2] standards. NERC is a regulatory authority that assures the reliability and efficiency of energy supply across North America, including parts of Mexico. The depth and variety of regulatory requirements and business areas within Calpine’s purview necessitate a strong IT infrastructure. Calpine has ensured this by adopting ITIL best practice to maintain its core business and technologies. The use of ITIL ensures that IT service delivery to all of Calpine’s business areas is streamlined and follows the same high standard.

Background to the change

Various factors played a part in the implementation of the centralized change management system at Calpine. These include, but are not limited to: an existing diverse IT infrastructure, multiple company acquisitions, data centre migrations, implementation of a new ITSM system, implementation of a new configuration management database (CMDB), implementation of a new service desk, consolidation of disparate processes, and updates to existing processes.

Calpine’s IT infrastructure, applications, and processes are diverse, with multiple data centres (on-site, as well as cloud), offices, and plants across the USA. Each plant and business possessed different software, architecture, systems, environment, and culture. As such, change management needs varied across each area, such as an upgrade to an Enterprise Resource Planning (ERP) system in the corporate environment or the implementation of real-time communications from plant locations to external entities.

Calpine’s decision to expand into the energy retail market resulted in the acquisition of Champion Energy Services in 2015, followed by Noble Americas Energy and North American Power in 2016. These acquisitions necessitated changes to the existing IT infrastructure, including integrating the acquired businesses into Calpine’s ITSM system. Calpine’s previous ITSM system was replaced with a more robust ITIL technology platform. Consequently, the ITSM processes needed to be redesigned.

Calpine also moved a large part of its corporate IT footprint to data centres in the cloud, using the Infrastructure as a Service (IaaS) model. This involved tracking the IT systems, their performance and maintenance details, as well as the accessibility and accuracy of the migrating data.

Communication was key to the success of these initiatives, with key stakeholders involved early to prevent unnecessary delays. Calpine adopted the ITIL best practice of tailoring the method of communication to the audience. For example, a mixture of email, instant messaging, text messages, and telephone calls was used to communicate with stakeholders so that they would receive information even if they were out of the office or away from their desks.

Figure 3.1 The creation of a single, centralized system

Rectified infrastructure

Calpine’s IT infrastructure was fragmented, with many plants having their own localized systems and software. Calpine decided to rectify this by creating a single, centralized system. A project of this scale would take time, so it was imperative that all systems and data remained accessible and usable throughout.


Implementing ITIL Principles

ITIL principles were adopted through the creation of structured change management, incident management, and problem management processes. The change management process was redesigned around a well-defined, end-to-end change lifecycle workflow with distinct stages: new, assess, authorize, schedule, implement, review, and closed. The redesign also included the changes described in the following paragraphs.
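The lifecycle stages listed above can be pictured as a simple state machine. The following is a minimal sketch in Python, not Calpine's actual workflow: it assumes that stages are visited strictly in the order listed, and the names Stage, next_stage, and advance are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    """Change lifecycle stages, in the order named in the text."""
    NEW = "new"
    ASSESS = "assess"
    AUTHORIZE = "authorize"
    SCHEDULE = "schedule"
    IMPLEMENT = "implement"
    REVIEW = "review"
    CLOSED = "closed"


# Assumption: a change moves through the stages strictly in this order.
ORDER = list(Stage)


def next_stage(current: Stage) -> Stage:
    """Return the stage that follows `current`; a closed change cannot advance."""
    if current is Stage.CLOSED:
        raise ValueError("a closed change request cannot advance")
    return ORDER[ORDER.index(current) + 1]


def advance(current: Stage, target: Stage) -> Stage:
    """Move to `target` only if it is the immediate next stage."""
    if target is not next_stage(current):
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

In this sketch, a request in the new stage can only move to assess, so no change can skip authorization or review. A real workflow would probably also allow a change to be sent back to an earlier stage or cancelled.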

The change request (CR) form was amended so that the following fields became mandatory, to capture more information: business services, assignment group, change window, and planning details (justification, implementation plan, test plan, back-out plan, and so on). As a result, the change manager will not process an incomplete or inaccurate CR form and will return it to the requester.
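A mandatory-field check of this kind might look like the sketch below. It is an illustration only: the field identifiers and the missing_fields helper are hypothetical, and a real ITSM platform would normally enforce required fields in the form itself.

```python
# Hypothetical field identifiers, based on the fields named in the text.
REQUIRED_FIELDS = (
    "business_services",
    "assignment_group",
    "change_window",
    "justification",
    "implementation_plan",
    "test_plan",
    "backout_plan",
)


def missing_fields(cr: dict) -> list:
    """List the required fields that are absent or blank in a change request."""
    return [f for f in REQUIRED_FIELDS if not str(cr.get(f) or "").strip()]
```

A request for which missing_fields returns a non-empty list would be returned to the requester rather than processed.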

An automated risk assessment and calculation feature was integrated into the change management process. This was to measure the level of risk associated with the change, at different stages of the change process. The risk was calculated from questions and input values, including the number of users, the impacted systems, the outage time, prior implementation history, and so on.
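To make the idea concrete, here is a minimal sketch of a questionnaire-based risk calculation in Python. The inputs mirror those named above, but every threshold, weight, and name (RiskInputs, risk_score, risk_level) is an assumption made for illustration and not Calpine's actual scoring model.

```python
from dataclasses import dataclass


@dataclass
class RiskInputs:
    users_affected: int
    systems_impacted: int
    outage_minutes: int
    previously_implemented: bool  # prior implementation history


def risk_score(r: RiskInputs) -> int:
    """Sum per-question points; all thresholds and weights are illustrative."""
    score = 0
    score += 3 if r.users_affected > 500 else 2 if r.users_affected > 50 else 1
    score += 3 if r.systems_impacted > 5 else 2 if r.systems_impacted > 1 else 1
    score += 3 if r.outage_minutes > 60 else 2 if r.outage_minutes > 0 else 0
    score += 0 if r.previously_implemented else 2
    return score


def risk_level(score: int) -> str:
    """Map a numeric score to a level (assumed cut-offs)."""
    if score >= 9:
        return "high"
    if score >= 5:
        return "medium"
    return "low"
```

Because the score depends only on the answers supplied, it can be recalculated at each stage of the change process as those answers are refined.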

A multi-layered approval process was also incorporated, to involve the stakeholders associated with the change, such as business and technical approvers, the change advisory board (CAB), and others. A structured communications mechanism was adopted to notify all impacted users. All changes were required to be implemented during the defined maintenance window, at weekends or after business hours. Any deviation from this process required extra review and approval.
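The maintenance window rule can be expressed as a small check. This sketch assumes business hours of 08:00 to 18:00, Monday to Friday, which the paper does not specify; the function names are also hypothetical.

```python
from datetime import datetime, time

# Assumption: business hours are 08:00-18:00, Monday to Friday.
BUSINESS_START = time(8, 0)
BUSINESS_END = time(18, 0)


def in_maintenance_window(start: datetime) -> bool:
    """True if the change starts at a weekend or outside business hours."""
    if start.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return True
    return not (BUSINESS_START <= start.time() < BUSINESS_END)


def needs_extra_approval(start: datetime) -> bool:
    """Changes scheduled outside the window require extra review and approval."""
    return not in_maintenance_window(start)
```

For example, a change scheduled for a Thursday at 10:00 would be flagged for extra approval, whereas the same change at 22:00 or on a Saturday would not.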

A no-change period, or deployment freeze period, was created to limit changes during holidays. This was to reduce the impact on the production environment at times when fewer support resources are available. Each change was validated, tested, and closed to better track the change and create references for similar changes or issues in the future. Changes that were adopted as an enhancement were analysed and tested in a non-production environment. The implementation plan, test results, business approval, impact reports, and other relevant reports were documented in a localized site or attached to the change request.

Figure 4.1 Using the centralized system to find and prevent further issues

Prevention or cure?

ITIL principles were adopted through the creation of a single change management process. This resulted in a reduction of power outages at power plants. It also prevented further issues from arising. When an issue in one plant was resolved, others were notified of the fault, which could be fixed before it could cause an issue. After all, prevention is better than the cure.


An emergency change process with CAB approval was created to accelerate the implementation of urgent changes and the resolution of system outages. Low-risk, low-impact, recurrent changes, by contrast, were handled via the standard change process. Consequently, changes are better planned, documented, communicated, authorized, executed, tested, and closed, resulting in a more stable IT environment. Incremental changes will be made to the change management process, as per Calpine’s continual improvement policy.
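The distinction between the three change types can be summarized in a few lines. The sketch below is an assumption about how requests might be routed, based only on the criteria mentioned in this paper; the change_type function and its parameters are hypothetical.

```python
def change_type(urgent: bool, low_risk: bool, low_impact: bool, recurrent: bool) -> str:
    """Pick a change type from the criteria described in the text.

    Assumption: urgency takes precedence over every other criterion.
    """
    if urgent:
        return "emergency"  # accelerated path, still with CAB approval
    if low_risk and low_impact and recurrent:
        return "standard"   # handled via the standard change process
    return "normal"         # full assessment and multi-layered approval
```

A rule of this kind, shown to requesters at submission time, is one way to reduce confusion over which type of change to raise.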

Calpine also adopted an enhanced incident management process with a new service desk vendor. Service desk staff were trained to rate incidents as P1, P2, P3, or P4. An issue that affects a single person or team is logged as P3 or P4. An incident that has a critical impact on the business is logged as P1 and a bridge call is opened so that it can be worked on immediately. The issue is then logged, analysed, and, where possible, resolved during the call. If the issue is not resolved, a workaround is adopted and a problem ticket is logged to allow for further analysis. When a resolution is found, it is adopted as a change through the change management process.
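The rating rules above could be sketched as follows. Only two rules come from the text: critical business impact means P1, and single-person or single-team issues are P3 or P4. The split between P3 and P4, the use of P2 for wider non-critical issues, and all names are assumptions made for illustration.

```python
from enum import Enum


class Scope(Enum):
    PERSON = "person"
    TEAM = "team"
    MULTIPLE_TEAMS = "multiple teams"


def classify(scope: Scope, critical_business_impact: bool) -> str:
    """Return an incident priority from P1 (highest) to P4 (lowest)."""
    if critical_business_impact:
        return "P1"
    if scope is Scope.PERSON:
        return "P4"  # assumption: a single person maps to the lowest priority
    if scope is Scope.TEAM:
        return "P3"  # assumption: a single team maps to P3
    return "P2"      # assumption: wider, non-critical issues map to P2


def needs_bridge(priority: str) -> bool:
    """A bridge call is opened immediately for P1 incidents."""
    return priority == "P1"
```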

To find a permanent solution, a problem ticket is created, as outlined in the updated problem management process. The system is then searched to discover whether the same problem is occurring elsewhere. If multiple locations are reporting the same problem, the team responsible at one location coordinates with the others, and a detailed root cause analysis (RCA) is conducted to find a solution. When the resolution has been found and tested, the support team at each location creates and implements the change to fix the problem.
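The search for the same problem at other locations can be illustrated with a small grouping routine. This is a sketch only: it assumes each problem ticket is a dictionary with hypothetical "signature" and "site" keys, whereas a real ITSM platform would match tickets through its own search and CMDB relationships.

```python
from collections import defaultdict


def sites_by_problem(tickets):
    """Group the reporting sites by problem signature."""
    sites = defaultdict(set)
    for ticket in tickets:
        sites[ticket["signature"]].add(ticket["site"])
    return sites


def multi_site_problems(tickets):
    """Return the problems reported from more than one site, with sorted site lists."""
    return {
        signature: sorted(found_at)
        for signature, found_at in sites_by_problem(tickets).items()
        if len(found_at) > 1
    }
```

Problems that appear in the result are candidates for a coordinated root cause analysis, with one location leading and the others applying the tested fix.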

The service desk practice was integrated with the change management and problem management practices. The three practices are used in reporting, classifying, and resolving issues, forming a continuous cycle that is used to resolve most issues. As an energy company, Calpine operates around the clock and must minimize system outages. Therefore, the ITSM system and processes must be very robust.

The validation process ensures that any problems with a change are logged, fixed, and documented. Validation takes place after the change has been adopted: any further issues are ticketed and logged according to the impact that they would have. When the validation process is complete, the change case is closed, and any feedback is documented and used in future enhancements. Validation ensures that the change process is robust and open to improvement.

Figure 4.2 Integrating the service desk, change management, and problem management practices



Barriers to change and how they were overcome

Calpine faced many obstacles when implementing the new change management system. For instance, every new change met some resistance; however, these issues were resolved using the measures discussed below.

Firstly, employees had to be retrained in how to use the new system. Experts in the new system visited workplaces to provide workshops and training, including one-to-one training, to ensure that employees could easily use the new system. A knowledge base was created to contain documents that outlined the change processes. Experts were made available for end-user support and training for a time.

Secondly, the new process delayed the approval of changes, as it now required detailed analysis within a multi-layered approval system, and many of the stakeholders were not aware of the new process. This issue was resolved by communicating the changes to stakeholders via multiple communications, follow-ups, and (in some cases) escalations. This resulted in the widespread adoption of the new process, so that delayed approvals are no longer an issue.

There was also some confusion among change requesters about whether to submit change requests as normal, emergency, or standard changes. This resulted in multiple cancellations and the recreation of change requests. However, the issue was quickly resolved as errors were automatically grounded and requesters have since corrected their approach to change requests.

Moreover, collaborating with multiple implementation teams presented a significant challenge. For instance, on occasion one or more teams were unavailable or not in agreement. As a result, some teams withdrew or the change was not adopted. Poorly defined changes also created confusion. This was resolved by providing clearly defined, stepwise implementation plans, back-out plans, and test plans, which were agreed upon by all of the stakeholders.

Previously, there were instances where changes were implemented without testing or analysing their impact on other areas. In response, procedures were created that require every change to be tested, analysed, and approved before it can be adopted. This new process faced resistance from some teams, who were unhappy with the delays that resulted from these additional steps.

Previously, many approvers were unaware of the proposed changes before the CAB meeting, where they would discuss and analyse the changes for the first time, resulting in significant delays. The new process ensured that the CAB approvers were informed of the changes before the meeting. This reduced the time needed to analyse, discuss, and approve the changes during the meeting.

The involvement of multiple stakeholders led to possible delays and lapses in communication. To overcome this, various forms of communication were used, including email, instant messaging, SMS, telephone calls, video calling, and so on. The method of communication varied to meet the stakeholders’ preferences and availability at that time.

An ongoing issue was that impacted users were unaware of changes, as they did not read the change team’s weekly communications. This issue was addressed by using Calpine’s corporate communications channels such as company-wide newsletters, bulletin boards, brief video conferences, Glassdoor notifications, stakeholder meetings, and so on to communicate changes. Nonetheless, some teams still ignored the change.

The change was eventually accepted using a top-down approach. For instance, managers enacted the change within their teams, leading to organization-wide acceptance. Through a mixture of training, communication, the creation of a knowledge base, and buy-in from managers and senior staff, Calpine was able to have the changes accepted by the employees.

Figure 5.1 The business continuity plan and recent events

Continuity in all forms

If the headquarters in Texas is affected, the retail business will temporarily transfer to an office in another city. This forms part of Calpine’s business continuity plan, which is part of the service continuity management practice, one of the 34 practices within ITIL 4. Calpine was able to invoke the processes involved in this practice during the recent Covid-19 pandemic, so that it could rapidly arrange for employees to work from home where possible.


Benefits of the new approach

A single, centralized change system provides several benefits when compared to the previous localized system. For instance, if an issue is found in one location, it can be analysed to discover if the same issue affects another site. Fewer people are needed to monitor a centralized system, leading to a reduction in overhead costs. Moreover, a single person or team will deal with the same issue across multiple locations, resulting in quicker decisions and greater clarity. It will also be quicker to create an impact metric of an issue, resolve it, and perform a root cause analysis of the issue. Overall, a single system is easier and cheaper to manage.

Previously, teams at each location would often devise a different solution for the same or a similar issue, compared to teams in other locations. Some solutions resolved the issue whereas others did not, instead creating new issues. This was resolved by implementing a centralized change management system, where users log details of their changes and can view the details for each site. When multiple issues are reported from various sites, the support team can devise an approach, which can be tested in one location and applied to other locations if successful.

The data centre and cloud migration of Calpine systems also benefited significantly from the new approach. The change management system captured system details before and after the migration. The structured approval process collected the approvals, logs, test evidence, planning and discussion records, performance metrics, and so on from the stakeholders. The information was then available at a single location for convenient review and audit. This resulted in a successful data centre migration to the cloud.

The United States’ varied geography and climate create other issues, mainly related to natural disasters. Certain parts of the country are affected by annual hurricane or wildfire seasons that temporarily disrupt local data centres or offices. In these events, the disaster recovery centre takes control and temporarily transfers everything from the downed data centre to another. These migrations require efficient planning and a controlled change management process.

For example, if Calpine’s headquarters in Texas is affected, the business will temporarily transfer to an office in another city. This forms part of its business continuity plan, which is part of the service continuity management practice, one of the 34 practices within ITIL 4. Calpine was able to implement the processes involved in this practice during hurricanes, as well as the recent Covid-19 pandemic [3], so that it could rapidly arrange for employees to work from home where possible.

Summary

Calpine needed to modernize and centralize its ITSM processes, to incorporate its newly acquired businesses and improve the efficiency and effectiveness of the overall business. The varied geography, climate, and regulations of the continental United States accentuated the importance of a single, centralized system.

Calpine started the change management process by integrating all systems and services into its ITIL technology platform and moving its data centre into the cloud. The change management system was improved by adopting a centralized ITIL-based system, integrated with a new service desk and problem management system. The service desk function was improved by training all staff to analyse and log tickets based on urgency. The service desk was also integrated with the problem management and change management practices, to ensure that a single solution could be adopted, either to resolve or prevent issues in multiple locations.

The risk assessment process was formalized to better understand and document changes. It was important that all of the changes had the employees’ buy-in. Calpine ensured this by creating a structured approval process, as well as a communication mechanism that used each stakeholder’s preferred channel of communication.

Consequently, Calpine has created a single, robust, centralized change management system that is well equipped to deal with a wide range of issues. This includes exceptional issues, such as the recent Covid-19 pandemic, as well as the annual hurricane and wildfire seasons.

End notes

1. Calpine.com, (2021). ‘Calpine homepage’. Available at: https://www.calpine.com/About-Us [Accessed 19 January 2021].

2. Nerc.com, (2020). ‘NERC homepage’. Available at: https://www.nerc.com/Pages/default.aspx [Accessed 26 August 2020].

3. https://www.calpine.com/about-us/news/calpine-covid-19-update [Accessed 25 August 2020].

About the author

Nayeem Khan is a passionate planner and implementer with a successful track record. He is always ready to improve and implement different frameworks and procedures to achieve better results. He is a service management professional, with a broad range of technical and business experience across multiple domains.

He is currently a senior system analyst at Calpine Corporation, an energy producing and retailing company based in the United States, with power plants across North America. As a change manager at Calpine, he has improved and adopted the centralized change management system across the company. He has a master’s degree in computer applications from Bharathiyar University in India and has over 17 years of change, configuration, build, and release management experience across different domains. He is also involved with many volunteering and charitable endeavours.
