+ draft grid flexibility paper for wg review and revise #169
jmcook1186 wants to merge 1 commit into main from
Conversation
|
Hi @jmcook1186 , can you please confirm whether we need to review all sections, or only the software-specific sections that leverage SCI (Software as the Enabling Layer)? |
|
hey @jmcook1186 and everyone else - I have added a link to the rendered version of the response, for easier readability. Joseph - I'm tagging you because I ended up editing your initial issue text slightly to do this, and I wanted you to be aware. |
|
Thanks @mrchrisadams ! @navveenb please review the whole piece - at least to flag any objections - it will go out authored by the PWG if approved. |
|
**Abstract**

In the third paragraph of the Abstract, it feels like a bit of a search for an opposing viewpoint to cite the Knittel, Senga, and Wang (2025) paper. It might be more accurate to say that optimizing for cost by shifting loads to off-peak hours can increase emissions. But is that necessary to highlight here? I see there's a "cost-emissions" discussion in Section 6. A better counterpoint for the abstract, if it's addressed later in the paper, might be that this sort of approach to flexibility is best suited for AI training workloads. [citation needed]

US-focused, but should the Tyler Norris study (Duke 2025, I think, mentioned later) be cited in the Abstract? Or is it just assumed at this point?

**Section 1: Introduction**

In the fourth paragraph, grid interconnection is mentioned as a constraint, citing research that focuses on North America. Is it as much of a constraint in the UK?

**Section 3: Sources of Flexibility**

I think the "Electrons" and "Molecules" approach is OK, but both might be better categorized as "behind-the-meter" or "co-generation" depending on the context? Later in 3.2 and 3.3, there could be a little more specificity about whether these resources are behind-the-meter, backup, or neither. Right now it's not clear to me.

There is also a range of downsides and challenges to data centres bringing their own fossil-fuel-powered generation, including the long wait for gas turbines, the cost of natural gas, pipeline availability, the expertise required to build and maintain these resources, etc.

I think this section is also missing the Emerald AI-style flexibility, which as noted in Section 4.2 is to reduce GPU throughput, aka "underclocking" the chips during AI training to slow them down a bit, reducing electricity usage for a period of time.

**Section 4.2: Workload Orchestration Platforms**

We could also consider mentioning NeuralWatt here, though they differentiate from Emerald AI's approach by focusing on AI inference rather than training: https://neuralwatt.com/ They are smaller and newer than Emerald AI (and are getting less attention), but the founders are former Microsoft data center energy folks.

**Section 7.2: Gas Supply**

**Section 9.1: Operator Willingness**

**Section 9.3: Social Licence** |
|
Thanks for this @ryansholin ! I'll add a few comments inline based on your feedback 👍 , as I found them quite helpful |
|
| This context reframes the data center question. The issue is not simply *whether* data centers can be accommodated but *how much, how fast, and on what terms*. Compute capacity is economically important, but a tenfold increase in national data center capacity within five years — the trajectory implied by current UK connection applications — is not self-evidently compatible with climate targets, grid security, or affordable energy for other consumers. The question of scale deserves scrutiny, not just the question of connection design. A data center fleet connected through flexible, grid-responsive, renewables-aligned infrastructure can be a participant in the electrification transformation. One that grows without constraint, locked into firm, inflexible, fossil-backed connections, risks overwhelming it. |
|
| Three forces are driving the specific crisis in GB. First, **scale**: global data center consumption is projected to reach 945 TWh by 2030, equivalent to Japan's total electricity consumption ([IEA, 2025](https://www.iea.org/reports/energy-and-ai/executive-summary)). Second, **grid interconnection is a binding constraint**: new grid assets take 5–10 years to build versus 18–24 months for a data center ([Sidewalk Infrastructure Partners, 2025](https://www.datacenterflexibility.com)). Third, **AI workloads are changing load profiles**: training clusters exhibit bursty, high-intensity demand that creates both challenges and opportunities. |
response from @ryansholin
Section 1: Introduction
In the fourth paragraph, grid interconnection is mentioned as a constraint, citing research that focuses on North America. Is it as much of a constraint in the UK?
Yes - it's a significant constraint, and the UK has a massive backlog as well, but it might not be of precisely the same proportions. My understanding is that capacity is distributed differently (the North of England has more generation capacity, but most datacentre projects are in the congested South, for example, and there is congestion North to South).
The call for input explicitly references it.
|
| #### 3.1.1 Temporal Workload Shifting |
|
| Not all workloads require immediate execution. By deferring non-urgent work (batch processing, model training, analytics) to periods of low grid stress or high renewable generation, data centers modulate demand profiles. Google's carbon-intelligent computing system has used Virtual Capacity Curves in production since ~2020 ([Radovanovic et al., 2023](https://ieeexplore.ieee.org/document/9770383)). Flexibility windows of 30 minutes to 3 hours are typical ([Takci, Day & Qadrdan, 2025b](https://arxiv.org/abs/2511.07159)). |
| Not all workloads require immediate execution. By deferring non-urgent work (batch processing, model training, analytics) to periods of low grid stress or high renewable generation, data centers modulate demand profiles. Google's carbon-intelligent computing system has used Virtual Capacity Curves in production since ~2020 ([Radovanovic et al., 2023](https://ieeexplore.ieee.org/document/9770383)). Flexibility windows of 30 minutes to 3 hours are typical ([Takci, Day & Qadrdan, 2025b](https://arxiv.org/abs/2511.07159)). It's also important to note that not all jobs can be practically shifted in time. A method for labelling workloads would be very useful, making it immediately obvious which can be run flexibly. |
update to address suggestions from @ryansholin , cc @mrchrisadams
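To make the labelling idea concrete, here is a minimal sketch of what a carbon-aware scheduler acting on such labels might look like. This is purely illustrative: the `Job` fields, the `schedule` function, and the 200 gCO2/kWh threshold are all hypothetical, not taken from the paper or from any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    deferrable: bool        # label: can this job be shifted in time?
    max_delay_hours: float  # flexibility window; 0 means "run now"

def schedule(jobs, carbon_intensity_gco2_kwh, threshold=200):
    """Split jobs into run-now and deferred using a grid carbon signal.

    `threshold` (gCO2/kWh) is an illustrative value, not from the paper.
    """
    run_now, deferred = [], []
    for job in jobs:
        if job.deferrable and job.max_delay_hours > 0 and carbon_intensity_gco2_kwh > threshold:
            deferred.append(job)
        else:
            run_now.append(job)
    return run_now, deferred

jobs = [
    Job("user-facing-api", deferrable=False, max_delay_hours=0),
    Job("nightly-batch", deferrable=True, max_delay_hours=3),
]
```

The point of the sketch is that once jobs carry an explicit flexibility label, the scheduling decision itself is trivial; the hard organisational work is getting the labels assigned in the first place.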
|
| - NGP/Emerald AI (2026). "Power Flexible AI Factories." White paper. [Link](https://www.ngpartners.com/stories/emerald-ai-whitepaper) |
|
| - Norris et al. (2025). "Rethinking load growth: Assessing the potential for integration of large flexible loads in US power systems." Duke University. |
| - Norris et al. (2025). "Rethinking load growth: Assessing the potential for integration of large flexible loads in US power systems." Duke University. [Link](https://dukespace.lib.duke.edu/items/bb350296-d7a1-4d8f-acb0-2fba9b1f03de) |
adds link to Norris 2025 study
|
| ### 3.1 Demand Response |
|
| Rather than building new generation to meet data centre loads, demand response uses computational workload flexibility to balance supply and demand. A 2025 Duke University study estimates that curtailment of only 0.25–1% of annual consumption during critical hours could enable the grid to absorb up to 100 GW of new load without major upgrades. |
| Rather than building new generation to meet data centre loads, demand response uses computational workload flexibility to balance supply and demand. A 2025 Duke University study (Norris et al. 2025) estimates that curtailment of only 0.25–1% of annual consumption during critical hours could enable the grid to absorb up to 100 GW of new load without major upgrades. |
Adds explicit citation to Norris et al (already in reference list)
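To give a feel for what the 0.25–1% curtailment figure means in practice, here is a back-of-envelope calculation. The 100 MW flat-load facility is a hypothetical example, not a number from the Norris study.

```python
# Back-of-envelope: what does curtailing 0.25-1% of annual consumption
# mean in hours, for a hypothetical 100 MW facility running flat load?
CAPACITY_MW = 100
HOURS_PER_YEAR = 8760
annual_mwh = CAPACITY_MW * HOURS_PER_YEAR  # 876,000 MWh

for fraction in (0.0025, 0.01):
    curtailed_mwh = annual_mwh * fraction
    hours_equivalent = curtailed_mwh / CAPACITY_MW  # hours of full curtailment
    print(f"{fraction:.2%} -> {curtailed_mwh:,.0f} MWh, {hours_equivalent:.1f} h/yr")
```

Under these assumptions, the curtailment commitment works out to roughly 22–88 hours of full-load curtailment per year, which helps explain why the study finds such a small sacrifice unlocks so much headroom.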
| - **Biogas/RNG**: can achieve carbon-negative profiles from landfill methane capture. |
| - **Green hydrogen fuel cells**: zero operational emissions. |
|
| A climate-aligned hierarchy should incentivise these in order: demand response, electrons, non-fossil molecules, with fossil generation as a last resort carrying mandatory transition timelines. |
| Furthermore, on-site fossil-fuel generation also faces significant practical barriers: gas turbine lead times have extended to several years, capital costs have roughly doubled in the past 18 months, new gas pipeline connections face substantial backlogs, and the specialist expertise required to build and maintain generation assets at data centre scale is scarce (SemiAnalysis, 2025). Renewable natural gas supply is also limited, constraining the viability of biogas/RNG as a large-scale transitional fuel. These supply-side constraints further strengthen the case for prioritising demand response and battery storage over molecule-based approaches. |
Suggested edit in response to @ryansholin comment, cc @mrchrisadams
| **Emerald AI Conductor** connects grid signals to workload management, classifying jobs into flexibility tiers (Flex 0–3) allowing 0–50% throughput reduction over 3–6 hours, with 15-minute graceful ramps. In the EPRI DCFlex pilot at Oracle's Arizona facility, it achieved 25% reduction in AI cluster power during peak demand while maintaining performance ([Emerald AI](https://emeraldai.co); [ACEEE, 2025](https://www.aceee.org/blog-post/2025/10/data-center-efficiency-and-load-flexibility-can-reduce-power-grid-strain-and)). |
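As a rough illustration of what a tier scheme like this implies for ramping, consider the sketch below. Only the 0–50% range over 3–6 hours is stated in the text; the per-tier percentages and the `minimum_power_kw` helper are illustrative assumptions, not Emerald AI's actual semantics.

```python
# Hypothetical mapping of flexibility tiers to maximum throughput cuts.
# Only the 0-50% range over 3-6 hours is stated in the text; the
# intermediate per-tier values are illustrative assumptions.
FLEX_TIERS = {
    0: 0.00,  # Flex 0: no reduction (e.g. latency-critical inference)
    1: 0.15,
    2: 0.30,
    3: 0.50,  # Flex 3: up to the 50% maximum cited in the text
}

def minimum_power_kw(rated_kw, tier):
    """Lowest power a cluster may be ramped down to for a given tier."""
    return rated_kw * (1.0 - FLEX_TIERS[tier])
```

On these assumptions, a 1,000 kW cluster of Flex 3 jobs could be ramped down to 500 kW during a grid event, while Flex 0 jobs contribute no flexibility at all.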
|
| **EPRI DCFlex**, launched October 2024, has expanded from 14 to 45 participants including Google, Meta, Microsoft, NVIDIA, and Oracle ([Tilton, 2025](https://spectrum.ieee.org/dcflex-data-center-flexibility)). Major cloud providers already operate sophisticated schedulers (Google's [Borg](https://research.google/pubs/large-scale-cluster-management-at-google-with-borg/), Kubernetes), but extending these to incorporate grid signals is organisationally challenging: compute teams are measured on utilisation, not carbon. |
|
| NeuralWatt (https://neuralwatt.com/) demonstrates that software-driven GPU power optimisation can deliver meaningful flexibility headroom: real-time adaptive tuning on NVIDIA H100 systems increased inference throughput by 33% while reducing idle GPU power draw by over 40%, effectively allowing eight GPUs to operate within the power envelope of six (NeuralWatt/Crusoe, 2025). By reclaiming stranded power within existing grid connections, such platforms complement workload-level orchestration and reduce the capacity requirements that drive the connections queue. |
Addressing suggestion by @ryansholin, cc @mrchrisadams
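The eight-GPUs-in-the-envelope-of-six arithmetic can be sketched as follows. The eight-into-six result comes from the text above; the ~700 W per-GPU figure and the helper function are illustrative assumptions, not NeuralWatt's actual mechanism.

```python
def gpus_within_envelope(envelope_w, per_gpu_w):
    """How many GPUs fit under a fixed power envelope."""
    return int(envelope_w // per_gpu_w)

# Illustrative numbers: assume ~700 W per uncapped GPU (roughly an H100
# at full board power). The eight-into-six result is from the text; the
# wattages are assumptions.
ENVELOPE_W = 6 * 700               # connection sized for six uncapped GPUs
CAPPED_PER_GPU_W = ENVELOPE_W / 8  # per-GPU cap needed to fit eight
```

In other words, capping each GPU at roughly 525 W under this assumption frees enough headroom within the same grid connection for two additional accelerators, which is the "stranded power" the paragraph above describes.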
|
| ### 9.3 Social Licence |
|
| Data centres share grid infrastructure with residential and commercial consumers. A flexibility programme capturing financial benefit from high-price load reduction without corresponding benefit to local users during genuine constraint does not have a sound social licence. A programme responding to genuine system stress — including local distribution constraints — with equitably shared financial benefits represents a genuine social good and is more likely to sustain planning consent. The Connect pillar should include local constraint management as an eligible flexibility service, with revenue sharing designed for community benefit. |
| Data centres share grid infrastructure with residential and commercial consumers. A flexibility programme capturing financial benefit from high-price load reduction without corresponding benefit to local users during genuine constraint does not have a sound social licence. A programme responding to genuine system stress — including local distribution constraints — with equitably shared financial benefits represents a genuine social good and is more likely to sustain planning consent. Emerging models suggest how this might work in practice. In the US, WattCarbon's "Repowering California" programme enables large energy consumers, including data centres, to procure grid capacity from distributed community energy resources rather than relying solely on their own flexibility (WattCarbon, 2025). UK equivalents are taking shape: mySociety and the Social Investment Business have mapped community organisations positioned to participate in DNO flexibility tenders (mySociety, 2026), while Piclo's (https://www.piclo.com/) independent flexibility marketplace already connects distributed energy resources with National Grid and NESO. These approaches point toward a model in which data centre flexibility and community energy participation are complementary — strengthening the social licence case for large-scale connections. The Connect pillar should include local constraint management as an eligible flexibility service, with revenue sharing designed for community benefit. |
addressing commentary from @ryansholin , cc @mrchrisadams
| - Emerald AI. "Conductor: Flexibility Management Platform." [Link](https://emeraldai.co) |
|
| - Tilton, J. (2025). "Big Tech Tests Data Center Flexibility for Local Power Grids." *IEEE Spectrum*, 12 June 2025. [Link](https://spectrum.ieee.org/dcflex-data-center-flexibility) |
|
Adds missing citations to WattCarbon, mySociety
| - mySociety (2026). "Mapping energy data to help community spaces take part in electricity network flexibility." [Link](https://www.mysociety.org/2026/02/17/mapping-energy-data-to-help-community-spaces-take-part-in-electricity-network-flexibility/) |
| - WattCarbon (2025). "Repowering California." [Link](https://blog.wattcarbon.com/p/repowering-california) |
|
hey @ryansholin, thank you for your feedback. @jmcook1186 and I met yesterday to incorporate your feedback, and the draft has been updated to reflect it. I'll be submitting the updated version this afternoon (CET).
Adds a white paper response to the UK Ofgem call for input (deadline 13th March) on the topic of the role of software in data center grid flexibility.
This draft is intended to bootstrap the working group into a review-and-refine cycle to clarify their position before submission to Ofgem.
Please review with specific changes by 11th March (1 week from now) to give us time to incorporate feedback.
We will have to conduct this review async as we do not have a scheduled call between now and the submission deadline.
See the rendered version of the response
As always, members are free to OBJECT. If we reach the deadline with no explicit objections, we will proceed to submission.
Cheers!
#167