The U.S. power grid faces numerous stability and security risks, such as transmission failures, adversarial threats, and extreme weather. These disturbances—known as contingencies—pose operational challenges that are increasingly difficult to predict and manage. For example, wildfires can quickly destroy or disable critical infrastructure components like transmission lines and generators. Regional grid operators need to understand where the fire could spread and decide how to dispatch electricity safely during the emergency.
Tackling this control optimization problem requires integrating transient stability (TS) analysis of regional grid operations during a contingency event with security-constrained alternating current optimal power flow (SC-ACOPF) procedures. But state-of-the-art software simplifies TS analyses and doesn’t take full advantage of high-performance computing (HPC).
Accordingly, researchers at LLNL’s Center for Applied Scientific Computing are developing a sophisticated optimization framework that combines HPC, machine learning (ML) models, and mathematical algorithms. SLOPE-Grid, which stands for Scalable Learning and Optimization for Secure and Economic Grid Operations, builds on an earlier HPC-driven grid project called gollnlp.
The three-year project is a Scientific Discovery through Advanced Computing (SciDAC) partnership funded by the Department of Energy’s Advanced Scientific Computing Research (ASCR) program and Office of Electricity. Alongside collaborators at Dartmouth College and Argonne National Laboratory, the Livermore team includes principal investigator Cosmin Petra, J.P. Watson, Nai-Yuan Chiang, Claudio Santiago, and Jingyi Wang.
Contingency Planning
SLOPE-Grid speeds up the time to solution by dividing the optimization workflow into offline and real-time phases. In the offline loop, advanced mathematical solvers build simulations of critical TS behaviors that can occur during a range of contingencies. The simulation data is used to train probabilistic surrogate ML models that approximate TS risks. Because all of these steps take place in advance, the rest of the workflow—including the SC-ACOPF process—can run quickly at the point of decision making, such as determining where to divert or shut down electricity. (See Figure 1.)
Petra explains, “Simulating just 60 seconds of transient activity can take hours or days to compute, so we do this beforehand with machine learning. The surrogate works in real time based on the data generated offline.” Both phases include error estimation and uncertainty quantification of the surrogates’ accuracy to improve predictions.
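The offline and real-time split described above can be illustrated with a toy sketch. This is not SLOPE-Grid code: the simulator formula, the function names, and the nearest-neighbor surrogate are all hypothetical stand-ins chosen to keep the example self-contained, and the spread it reports is only a crude proxy for the uncertainty quantification mentioned above.

```python
import math
import random
import statistics


def simulate_transient_risk(load, outage_severity):
    """Hypothetical stand-in for an expensive transient stability simulation.

    Returns a risk score between 0 and 1. A real TS solver would be far
    costlier; this closed-form expression is an assumption for illustration.
    """
    return 1.0 / (1.0 + math.exp(-(4.0 * load + 3.0 * outage_severity - 4.0)))


def build_training_set(n_samples, seed=0):
    """Offline phase: sample contingencies and run the slow simulator on each."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        load = rng.random()       # normalized system load (assumed feature)
        severity = rng.random()   # normalized outage severity (assumed feature)
        data.append(((load, severity), simulate_transient_risk(load, severity)))
    return data


def surrogate_predict(training_set, query, k=5):
    """Real-time phase: k-nearest-neighbor risk estimate plus a spread.

    The standard deviation of the neighbors' risk values is a rough
    uncertainty indicator, not a calibrated probability.
    """
    neighbors = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    risks = [risk for _, risk in neighbors]
    return statistics.fmean(risks), statistics.pstdev(risks)


if __name__ == "__main__":
    training = build_training_set(2000)                 # done in advance
    mean_risk, spread = surrogate_predict(training, (0.9, 0.8))  # done at decision time
    print(f"estimated risk {mean_risk:.3f} (spread {spread:.3f})")
```

The point of the split is that the simulator is called only inside `build_training_set`, so the decision-time query never waits on a simulation. A production surrogate would presumably be a trained probabilistic model rather than a lookup over stored samples.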
The project’s biggest challenge is also the source of its biggest efficiency gain. Regional power grids may consist of thousands of electrical components subject to millions of contingency variables. The SLOPE-Grid team is working to scale the ML models to accommodate realistic power grid sizes without excessively large numbers of training data points.
Wildfire Response
Using LLNL’s Dane computing cluster, the team generated predictive simulations with the SLOPE-Grid framework, incorporating wildfire data and curated data resembling grid components. Interactive visualizations of these simulations are available on the project website as examples of how wildfires can impact California’s power grid. (See Figure 2.) Grid operators can leverage this information as contingencies develop instead of relying solely on retrospective analysis.
