docs: Add Carbon Efficient design document #4686
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,197 @@ | ||
# Carbon Aware Karpenter: Optimizing Kubernetes Cluster Autoscaling for Environmental Sustainability | ||
*Author: [@JacobValdemar](https://github.com/JacobValdemar)* | ||
|
||
## Context & Problem | ||
There is growing concern about the environmental impact of Kubernetes clusters. Karpenter's opportunities within environmental sustainability are referenced in multiple comments that back [`karpenter-core`'s move to CNCF](https://github.com/kubernetes/org/issues/4258). | ||
|
||
I am currently working on my master's thesis in Computer Engineering (Master of Science in Engineering) at Aarhus University located in Denmark. The objective of the thesis is to enable Karpenter to minimize carbon emissions from Kubernetes clusters that run on cloud infrastructure (scoped to AWS). | ||
|
||
RFC: https://github.com/aws/karpenter/issues/4630 | ||
|
||
## Fundamentals of Green Software | ||
I will try to keep it simple. The reader should be familiar with the following. | ||
|
||
A cluster's emissions are made up of two elements: embodied emissions and operational emissions. To get the total emissions, add the two together. | ||
|
||
- **Embodied carbon emissions**: Manufacturing emissions (CO₂e) amortized over instance lifetime (usually 4 years) divided by how long we use the instance | ||
- **Operational carbon emissions**: Carbon emitted by electricity grid to produce electricity for the instance in the region where it is used, multiplied by PUE | ||
Review comment: Would this be different for regions that focus on more sustainable energy sources like hydroelectric or geothermal energy? Or are you forgoing the energy sources, and talking purely about the byproduct of the heat and other leftover elements in electricity grids? |
||
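The two definitions above can be sketched as a small calculation. This is a minimal illustration rather than the Boaviztapi methodology: the `Footprint` type, its field names, and the example numbers are hypothetical, and the sketch assumes that "amortized" means the manufacturing emissions are scaled by the fraction of the hardware lifetime during which the instance is used.

```go
package main

import "fmt"

// Footprint holds illustrative inputs for one instance.
// All fields and values are assumptions made for this sketch.
type Footprint struct {
	ManufacturingKgCO2e float64 // manufacturing emissions attributed to the instance
	LifetimeHours       float64 // amortization period of the underlying hardware
	AvgPowerWatts       float64 // assumed average power draw of the instance
	GridKgCO2ePerKWh    float64 // carbon intensity of the regional electricity grid
	PUE                 float64 // power usage effectiveness of the data center
}

// EmbodiedKgCO2e scales manufacturing emissions by the share of the
// lifetime during which the instance is used.
func (f Footprint) EmbodiedKgCO2e(hoursUsed float64) float64 {
	return f.ManufacturingKgCO2e * hoursUsed / f.LifetimeHours
}

// OperationalKgCO2e converts energy use into grid emissions, multiplied by PUE.
func (f Footprint) OperationalKgCO2e(hoursUsed float64) float64 {
	energyKWh := f.AvgPowerWatts / 1000 * hoursUsed
	return energyKWh * f.GridKgCO2ePerKWh * f.PUE
}

// TotalKgCO2e adds the two elements together.
func (f Footprint) TotalKgCO2e(hoursUsed float64) float64 {
	return f.EmbodiedKgCO2e(hoursUsed) + f.OperationalKgCO2e(hoursUsed)
}

func main() {
	f := Footprint{
		ManufacturingKgCO2e: 350.4,
		LifetimeHours:       4 * 365 * 24, // 4 years
		AvgPowerWatts:       100,
		GridKgCO2ePerKWh:    0.2,
		PUE:                 1.5,
	}
	fmt.Printf("embodied:    %.2f kgCO2e\n", f.EmbodiedKgCO2e(100))
	fmt.Printf("operational: %.2f kgCO2e\n", f.OperationalKgCO2e(100))
	fmt.Printf("total:       %.2f kgCO2e\n", f.TotalKgCO2e(100))
}
```

In a region with a cleaner grid, only `GridKgCO2ePerKWh` changes, so the operational term shrinks while the embodied term stays the same.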
|
||
There is a lot more to Green Software. If you want to learn more, I recommend visiting [Green Software Practitioner](https://learn.greensoftware.foundation/) (a Green Software Foundation project; the foundation is an affiliate of the Linux Foundation). | ||
|
||
## Solution | ||
|
||
### Feature Gate | ||
|
||
The feature is proposed to be controlled using a [feature gate](https://karpenter.sh/docs/concepts/settings/#feature-gates). | ||
|
||
| **Feature** | **Default** | **Config Key** | **Stage** | **Since** | **Until** | | ||
| :---------: | :---------: | :-----------------------------: | :-------: | :-------------: | :-------: | | ||
| CarbonAware | false | featureGates.carbonAwareEnabled | Alpha | v0.32.0/v0.33.0 | | | ||
Review comment: Small detail: we're planning to drop the config key so this feature flag would probably have the same format a-la Kubernetes upstream. If we take the last solution, this feature flag might look like |
||
|
||
### Carbon emissions data source | ||
Currently the best option is to create estimates based on the methodology used in [Boaviztapi](https://github.com/Boavizta/boaviztapi). | ||
|
||
[Try out Boaviztapi on the Datavizta demo website](https://datavizta.boavizta.org/cloudimpact). | ||
|
||
#### Licensing | ||
Boaviztapi is licensed under [`GNU Affero General Public License v3.0`](https://github.com/Boavizta/boaviztapi/blob/main/LICENSE). Therefore, as far as I know, we must license their data under the same license if used in the Karpenter repository. | ||
Review comment: Will there be conflicts in licensing their data according to these requirements and the CNCF guidance for licensing the karpenter-core source itself?

Review comment: @jackfrancis So I am in no way a lawyer, but I guess that there won't be licensing issues. We probably just have to put a different license notice at the top of files that are based on their work. Currently, I would expect the data with that license only to be in cloud provider repos (e.g. https://github.com/aws/karpenter) and not karpenter-core.

Review comment: Famous last words.

Review comment: We'd definitely need to check with AWS legal for this (while AWS owns it) and consider how this interacts with the CNCF guidance since this would probably live in the karpenter-core repo if it was gen-ed. I could also see this living in its own separate repo that provided configuration plugins to the pricing overrides that we are thinking about here

Review comment: The particular license that Boaviztapi chooses to use suggests that it's using the "copyleft malware" approach (for Good, not Evil, of course) to ensure that its definitions of freedom (as in both beer and speech) are enforced upon downstream projects. So that's the only reason to bring this up. AWS and CNCF are probably going to combine w/ such a license in their own unique interesting ways. Non-zero chance there will be friction. |
||
|
||
#### Limitations | ||
There is a discrepancy between the instance types known to Karpenter and those known to Boaviztapi. As it stands, it is not possible to get carbon emissions data for all instance types. This is mostly the case for new instance types such as m7g. Around 290 out of 700 instance types are missing data. See the full comparison in [this Gist](https://gist.github.com/JacobValdemar/e1342013c0f5c980126f6a1feb66b4a1). | ||
|
||
|
||
I will attempt to eliminate this discrepancy, but it might not be possible. It will probably not always be possible to have an up-to-date list of estimated carbon emissions for all instance types, as AWS continues to release new ones. We should consider what to do with instance types that we do not have carbon emission estimates for. | ||
|
||
|
||
Approaches to handle this: | ||
Review comment: Why not exclude instances that are not emission-priced when the

Review comment: @jackfrancis I agree. I also think an exclusion is the best option in case we can't estimate the emissions accurately. The method I thought about for excluding them was to assign them an absurdly high price so they will never be picked voluntarily. However, I want to improve the dataset that we depend on, so that we can use as many instance types as possible. See this issue for reference: Boavizta/boaviztapi#232

Review comment: My personal take is that if I were a customer I would prefer not to use something at all that isn't carbon-priced.
|
||
1. Estimate extremely high emissions to effectively filter out unknown instance types (recommended) | ||
2. Estimate zero emissions | ||
Review comment: Zero is probably wrong. There may be a third option here which would be to do some estimation function. What on? I haven't thought through it that hard; however, I agree with @jackfrancis that the easy way out is to just exclude for now. |
||
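The first approach (a very high fallback estimate) could be sketched as follows. The function names, the map-based data set, and the numbers are hypothetical and only illustrate the idea; they are not taken from Karpenter or Boaviztapi.

```go
package main

import (
	"fmt"
	"math"
)

// unknownEmissions is the fallback for instance types without data.
// math.MaxFloat64 is an assumption of this sketch; any value larger
// than every real estimate would have the same effect.
const unknownEmissions = math.MaxFloat64

// emissionsOrFallback returns the estimate for an instance type, or the
// fallback when the (hypothetical) data set has no entry for it.
func emissionsOrFallback(estimates map[string]float64, instanceType string) float64 {
	if v, ok := estimates[instanceType]; ok {
		return v
	}
	return unknownEmissions
}

// pickLowestEmissions returns the candidate with the lowest estimate.
// Unknown types are only chosen when no candidate has data.
func pickLowestEmissions(estimates map[string]float64, candidates []string) string {
	if len(candidates) == 0 {
		return ""
	}
	best := candidates[0]
	for _, c := range candidates[1:] {
		if emissionsOrFallback(estimates, c) < emissionsOrFallback(estimates, best) {
			best = c
		}
	}
	return best
}

func main() {
	// Illustrative values only.
	estimates := map[string]float64{"m5.large": 0.0077, "m5.xlarge": 0.0154}
	fmt.Println(pickLowestEmissions(estimates, []string{"m7g.large", "m5.xlarge", "m5.large"}))
	fmt.Println(pickLowestEmissions(estimates, []string{"m7g.large"}))
}
```

Note the difference from a strict exclusion: with a fallback value, an unknown instance type can still be launched when it is the only candidate, whereas filtering the candidates up front would leave nothing to launch.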
|
||
### Launch strategy | ||
To enable emission-based prioritization, the launch strategy should be changed from `lowest-price` to `prioritized`. | ||
Review comment: We probably want to do |
||
|
||
### Changes to consolidation (karpenter-core) | ||
Single Machine Consolidation (`singlemachineconsolidation.go`) and Multi Machine Consolidation (`multimachineconsolidation.go`), as well as `consolidation.go`, currently consolidate nodes to reduce costs. We want to change this when Carbon Aware is enabled: they should consolidate to minimize carbon emissions. | ||
|
||
### Changes to Provisioning | ||
Currently, provisioning (roughly) filters instances based on requirements, sorts instances by price, and launches the cheapest instance. We want to change this when Carbon Aware is enabled: it should sort instances by carbon emissions and launch the instance with the lowest Global Warming Potential[^1]. | ||
Review comment: Might we want a balanced approach to this that factors in price and emissions?

Review comment: I think it could be interesting, like the carbon tax discussed in option 3, but I think that is something we should consider adding later. |
||
|
||
### Option 1: Use Carbon Aware provisioning and concolidation methods | ||
|
||
|
||
#### Consolidation | ||
Create two new consolidation methods `carbonawaresinglemachineconsolidation.go` and `carbonawaremultimachineconsolidation.go` that will be used when Carbon Aware is enabled. | ||
|
||
<details> | ||
|
||
<summary>Change to `karpenter-core/pkg/controllers/deprovisioning/controller.go`</summary> | ||
|
||
```diff | ||
-func NewController(clk clock.Clock, kubeClient client.Client, provisioner *provisioning.Provisioner, | ||
-    cp cloudprovider.CloudProvider, recorder events.Recorder, cluster *state.Cluster) *Controller { | ||
+func NewController(ctx context.Context, clk clock.Clock, kubeClient client.Client, provisioner *provisioning.Provisioner, | ||
+    cp cloudprovider.CloudProvider, recorder events.Recorder, cluster *state.Cluster) *Controller { | ||
|
||
+    if settings.FromContext(ctx).CarbonAwareEnabled { | ||
+        return &Controller{ | ||
+            clock:         clk, | ||
+            kubeClient:    kubeClient, | ||
+            cluster:       cluster, | ||
+            provisioner:   provisioner, | ||
+            recorder:      recorder, | ||
+            cloudProvider: cp, | ||
+            lastRun:       map[string]time.Time{}, | ||
+            deprovisioners: []Deprovisioner{ | ||
+                NewExpiration(clk, kubeClient, cluster, provisioner, recorder), | ||
+                NewDrift(kubeClient, cluster, provisioner, recorder), | ||
+                NewEmptiness(clk), | ||
+                NewEmptyMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
+                NewCarbonAwareMultiMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
+                NewCarbonAwareSingleMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
+            }, | ||
+        } | ||
+    } | ||
|
||
     return &Controller{ | ||
         clock:         clk, | ||
         kubeClient:    kubeClient, | ||
         cluster:       cluster, | ||
         provisioner:   provisioner, | ||
         recorder:      recorder, | ||
         cloudProvider: cp, | ||
         lastRun:       map[string]time.Time{}, | ||
         deprovisioners: []Deprovisioner{ | ||
             NewExpiration(clk, kubeClient, cluster, provisioner, recorder), | ||
             NewDrift(kubeClient, cluster, provisioner, recorder), | ||
             NewEmptiness(clk), | ||
             NewEmptyMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
             NewMultiMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
             NewSingleMachineConsolidation(clk, cluster, kubeClient, provisioner, cp, recorder), | ||
         }, | ||
     } | ||
 } | ||
``` | ||
</details> | ||
|
||
#### Provisioning | ||
In `karpenter-core`, create a new method `types.go/OrderByCarbonEmissions` and use that in `nodeclaimtemplate.go/ToMachine` and `nodeclaimtemplate.go/ToNodeClaim` instead of `types.go/OrderByPrice` when Carbon Aware is enabled. | ||
|
||
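A rough sketch of what an `OrderByCarbonEmissions` counterpart to `OrderByPrice` might look like is shown below. The `InstanceType` struct here is a simplified, hypothetical stand-in and not the real `karpenter-core` type, and the `Order` helper and its `carbonAware` parameter are assumptions standing in for the feature gate check.

```go
package main

import (
	"fmt"
	"sort"
)

// InstanceType is a simplified, hypothetical stand-in for the real type.
type InstanceType struct {
	Name          string
	PricePerHour  float64 // USD per hour
	KgCO2ePerHour float64 // estimated emissions per hour
}

// orderBy returns a sorted copy, leaving the input slice untouched.
func orderBy(its []InstanceType, key func(InstanceType) float64) []InstanceType {
	sorted := append([]InstanceType(nil), its...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return key(sorted[i]) < key(sorted[j])
	})
	return sorted
}

// OrderByPrice sorts from cheapest to most expensive.
func OrderByPrice(its []InstanceType) []InstanceType {
	return orderBy(its, func(it InstanceType) float64 { return it.PricePerHour })
}

// OrderByCarbonEmissions sorts from lowest to highest estimated emissions.
func OrderByCarbonEmissions(its []InstanceType) []InstanceType {
	return orderBy(its, func(it InstanceType) float64 { return it.KgCO2ePerHour })
}

// Order selects the ordering based on whether Carbon Aware is enabled.
func Order(its []InstanceType, carbonAware bool) []InstanceType {
	if carbonAware {
		return OrderByCarbonEmissions(its)
	}
	return OrderByPrice(its)
}

func main() {
	// Illustrative values only.
	its := []InstanceType{
		{Name: "a", PricePerHour: 0.10, KgCO2ePerHour: 0.01},
		{Name: "b", PricePerHour: 0.05, KgCO2ePerHour: 0.03},
	}
	fmt.Println("by price: ", Order(its, false)[0].Name)
	fmt.Println("by carbon:", Order(its, true)[0].Name)
}
```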
In `karpenter`, create a new method `CarbonAwareCreate` in `pkg/providers/instance/instance.go` that is used in `pkg/cloudprovider/cloudprovider.go/Create` instead of `pkg/providers/instance/instance.go/Create` when Carbon Aware is enabled. | ||
|
||
#### Considerations | ||
1. 👍 Current consolidation methods are unaffected. | ||
1. 👎 There might be copy-paste of code from the original consolidation methods to the carbon aware consolidators. | ||
|
||
### Option 2: Use Carbon Aware filtering/sorting methods | ||
|
||
#### Consolidation | ||
Create carbon aware implementations of low-level functions like `filterByPrice`, `filterOutSameType`, `getCandidatePrices`, etc. that are used when Carbon Aware is enabled. Callers of these functions might assume that they are receiving prices, when in reality they are receiving carbon emissions data. | ||
|
||
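One way to picture this option is a filter whose cost function can be swapped. The sketch below is a simplified illustration: `Candidate`, `costFunc`, and `filterByCost` are hypothetical names, and the real `filterByPrice` in `karpenter-core` has a different signature.

```go
package main

import "fmt"

// Candidate is a hypothetical, simplified replacement option.
type Candidate struct {
	Name          string
	PricePerHour  float64
	KgCO2ePerHour float64
}

// costFunc extracts the value that consolidation tries to minimize.
type costFunc func(Candidate) float64

func byPrice(c Candidate) float64  { return c.PricePerHour }
func byCarbon(c Candidate) float64 { return c.KgCO2ePerHour }

// filterByCost keeps candidates whose cost is strictly below limit.
// With byPrice it behaves like a price filter; with byCarbon the same
// logic filters on emissions instead.
func filterByCost(cs []Candidate, limit float64, cost costFunc) []Candidate {
	var kept []Candidate
	for _, c := range cs {
		if cost(c) < limit {
			kept = append(kept, c)
		}
	}
	return kept
}

func main() {
	// Illustrative values only.
	cs := []Candidate{
		{Name: "a", PricePerHour: 0.10, KgCO2ePerHour: 0.02},
		{Name: "b", PricePerHour: 0.05, KgCO2ePerHour: 0.03},
	}
	fmt.Println(filterByCost(cs, 0.08, byPrice))
	fmt.Println(filterByCost(cs, 0.025, byCarbon))
}
```

The sketch also shows the risk named in the considerations for this option: the `limit` argument silently changes units depending on which cost function is passed.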
|
||
#### Provisioning | ||
Use same changes to provisioning as in [option 1](#option-1-use-carbon-aware-provisioning-and-concolidation-methods). | ||
|
||
#### Considerations | ||
1. 👍 Less code copy-paste. | ||
1. 👍 Improvements to original consolidation methods also improve the Carbon Aware feature. | ||
1. 👎 Has a risk of breaking undocumented invariants. | ||
1. 👎 Adds complexity to the original consolidation methods. | ||
|
||
### Option 3: Override instance price with carbon price (recommended) | ||
Minimize carbon emissions by defining a price per kgCO₂e and overriding the instance price with the carbon price (USD/kgCO₂e). Using the `prioritized` launch strategy, carbon emissions will be minimized during provisioning. Consolidation will unknowingly consolidate to minimize carbon emissions. | ||
|
||
The carbon price will depend on `region` and `instanceType` and assume constant resource utilization (e.g. always 80% utilization). The carbon price will be generated in a "hack" and included as consts (same method as used for generating initial pricing[^2]). The carbon price / emission estimates can be updated with new versions. | ||
Review comment: Can you clarify how the "assume constant resource utilization" concept affects the carbon price?

Review comment: @jackfrancis This is one place where my knowledge of Karpenter is limited, so please correct me if this is wrong: as far as I know, Karpenter cannot provide information about current resource utilization (CPU/RAM), nor about expected utilization for nodes being considered for creation by provisioning or consolidation.

To calculate the operational Global Warming Potential (kgCO2e), we must know how much power (W) an instance ("server") consumes. The power consumption is not static, but depends on how large the workload (%) is (e.g. see image). Power consumption is not a linear function of the workload. Therefore, CPU power consumption profiles (based on empirical data, i.e. server testing) are used to estimate how much power a CPU consumes depending on the workload. When calculating the operational carbon emissions, we should therefore know the current/expected workload to be able to accurately estimate the operational footprint. However, if we do not know the workload (%), we must make an assumption about how large it is to complete the calculation.

I can elaborate further on this in the call today (Wednesday) if you would like, because the calculation is a bit complex to explain in a comment ;) You can read more about the calculation in the Boavizta API (v.dev) docs here. [Image: illustration of the concept (AGPL-3.0, Boavizta API)]
|
||
|
||
Another feature (added later) can be to add carbon price to instance price to simulate a [carbon tax](https://en.wikipedia.org/wiki/Carbon_tax). Administrators could configure a custom carbon price or use a default. | ||
|
||
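The override and the later carbon tax variant differ only in how the carbon price is combined with the instance price. A minimal sketch, assuming hypothetical names (`PricingMode`, `CarbonPrice`, `EffectivePrice`) and illustrative numbers:

```go
package main

import "fmt"

// PricingMode selects how the carbon price is applied (hypothetical type).
type PricingMode int

const (
	Override PricingMode = iota // carbon price replaces the instance price
	Add                         // carbon price is added, simulating a carbon tax
)

// CarbonPrice converts an emissions estimate into USD per hour using a
// configurable price per kgCO2e.
func CarbonPrice(kgCO2ePerHour, usdPerKgCO2e float64) float64 {
	return kgCO2ePerHour * usdPerKgCO2e
}

// EffectivePrice is the value the rest of the pricing logic would see.
func EffectivePrice(instancePrice, carbonPrice float64, mode PricingMode) float64 {
	if mode == Add {
		return instancePrice + carbonPrice
	}
	return carbonPrice
}

func main() {
	// Illustrative values: 0.02 kgCO2e/hour at 0.5 USD/kgCO2e.
	cp := CarbonPrice(0.02, 0.5)
	fmt.Printf("override: %.4f USD/hour\n", EffectivePrice(0.096, cp, Override))
	fmt.Printf("add:      %.4f USD/hour\n", EffectivePrice(0.096, cp, Add))
}
```

In override mode the chosen price per kgCO₂e does not affect the ranking of instance types, since it scales every estimate equally. It only matters once the carbon price is added to the real price.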
#### Considerations | ||
1. 👍 Change is constrained to the pricing domain, so most of Karpenter's logic remains unaffected. | ||
1. 👍👍 A simulated carbon tax could be appealing for *Beta* or *General Availability*[^3] as it combines the real price with the carbon price. | ||
1. 👎 Adds complexity to the *price* concept. Price is not *just* price, but rather becomes an optimization function. | ||
1. 👎 Depending on implementation, the `karpenter_cloudprovider_instance_type_price_estimate` metric *may* represent more than just price when Carbon Aware is enabled. | ||
|
||
### Option 4: Enable custom instance price overrides | ||
Enable administrators to configure custom instance price overrides, e.g. in a ConfigMap. A configuration using emission factors (varying with region and instance type) masked as prices can be pre-generated. Administrators then copy-paste a Carbon Aware `priceOverride` into their environment. | ||
|
||
```yaml | ||
priceOverrides: | ||
  - instanceType: "m5.large" | ||
    region: "eu-west-1" | ||
    # Review comment: I think there's some misunderstanding of region/zone going on. | ||
    capacityType: OnDemand | ||
    price: 0.007712 | ||
    # Review comment: I'd want to lean toward calling this | ||
    # Review comment: I agree. The term | ||
  - instanceType: "m5.xlarge" | ||
    region: "eu-west-1" | ||
    capacityType: OnDemand | ||
    price: 0.015424 | ||
``` | ||
|
||
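Once loaded, such overrides amount to a lookup keyed by instance type, region, and capacity type, with a fallback to the regular price. The sketch below assumes hypothetical `OverrideKey` and `PriceOverrides` types and leaves out ConfigMap parsing; the default price in `main` is illustrative.

```go
package main

import "fmt"

// OverrideKey identifies one entry of the priceOverrides list (hypothetical type).
type OverrideKey struct {
	InstanceType string
	Region       string
	CapacityType string
}

// PriceOverrides maps keys to the overriding "price".
type PriceOverrides map[OverrideKey]float64

// PriceFor returns the override when one exists, otherwise defaultPrice.
func (o PriceOverrides) PriceFor(key OverrideKey, defaultPrice float64) float64 {
	if p, ok := o[key]; ok {
		return p
	}
	return defaultPrice
}

func main() {
	overrides := PriceOverrides{
		{InstanceType: "m5.large", Region: "eu-west-1", CapacityType: "OnDemand"}:  0.007712,
		{InstanceType: "m5.xlarge", Region: "eu-west-1", CapacityType: "OnDemand"}: 0.015424,
	}
	known := OverrideKey{InstanceType: "m5.large", Region: "eu-west-1", CapacityType: "OnDemand"}
	unknown := OverrideKey{InstanceType: "m5.large", Region: "us-east-1", CapacityType: "OnDemand"}
	fmt.Println(overrides.PriceFor(known, 0.1))   // override applies
	fmt.Println(overrides.PriceFor(unknown, 0.1)) // falls back to the default
}
```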
<details> | ||
<summary>Alternative interface</summary> | ||
|
||
Alternatively, a more flexible interface could be: | ||
Review comment: Oddly, I think of this as more rigid. We risk inventing a DSL for pricing in yaml. |
||
|
||
```yaml | ||
priceModification: | ||
  operator: Add # Add or Override | ||
  modifications: | ||
    - instanceType: "m5.large" | ||
      region: "eu-west-1" | ||
      capacityType: OnDemand | ||
      price: 0.007712 | ||
    - instanceType: "m5.xlarge" | ||
      region: "eu-west-1" | ||
      capacityType: OnDemand | ||
      price: 0.015424 | ||
``` | ||
</details> | ||
|
||
A ConfigMap with price overrides for all combinations of instance types and regions would be very large. 632 instances * 29 regions = 18,328 pairs. Four lines per pair gives a file with 73,312 lines. The file/ConfigMap would be approximately 2 MB in size, which exceeds the [`1 MiB` limit on ConfigMap size in Kubernetes](https://kubernetes.io/docs/concepts/configuration/configmap/#motivation). | ||
Review comment: EC2 launches new instance types regularly, and we should expect this to grow. |
||
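The size estimate can be reproduced with a few lines of arithmetic. This sketch assumes that every entry is about as long as the first `m5.large` entry in the example above, which is only a rough approximation. The result lands in the same ballpark as the figure quoted above and well above the limit.

```go
package main

import "fmt"

const (
	instanceTypes       = 632
	regions             = 29
	linesPerPair        = 4
	configMapLimitBytes = 1 << 20 // 1 MiB
)

// sampleEntry is one override entry as it would appear in the YAML file.
// Using it as the average entry size is an assumption of this sketch.
const sampleEntry = "  - instanceType: \"m5.large\"\n" +
	"    region: \"eu-west-1\"\n" +
	"    capacityType: OnDemand\n" +
	"    price: 0.007712\n"

// estimate returns the number of pairs, the number of lines, and the
// approximate file size in bytes.
func estimate() (pairs, lines, sizeBytes int) {
	pairs = instanceTypes * regions
	lines = pairs * linesPerPair
	sizeBytes = pairs * len(sampleEntry)
	return pairs, lines, sizeBytes
}

func main() {
	pairs, lines, sizeBytes := estimate()
	fmt.Printf("pairs=%d lines=%d size=%.2f MB\n", pairs, lines, float64(sizeBytes)/1e6)
	fmt.Println("exceeds 1 MiB limit:", sizeBytes > configMapLimitBytes)
}
```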
|
||
#### Considerations | ||
1. 👍 Simple solution. | ||
1. 👍 Can be used for other purposes. | ||
1. 👎👎 ConfigMap cannot contain all data. | ||
1. 👎 Hard to discover the carbon aware "feature". | ||
1. 👎 Carbon emission price cannot be combined with actual price. | ||
1. 👎 Carbon emissions are completely static without possibility to improve it in the future. | ||
1. 👎 Feature cannot be enabled as a toggle. | ||
1. 👎 Depending on implementation, the `karpenter_cloudprovider_instance_type_price_estimate` metric *may* represent more than just price when Carbon Aware is enabled. | ||
|
||
[^1]: The potential impact of greenhouse gases on global warming. Measured in terms of CO₂e. | ||
[^2]: See [prices_gen.go](/hack/code/prices_gen.go) and [zz_generated.pricing.go](/pkg/providers/pricing/zz_generated.pricing.go) | ||
[^3]: https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages |
Review comment: Can we clarify that "instance" in this sense refers to a percentage of a physical machine?

Review comment: @jackfrancis Good point. That is totally not obvious when reading the text. When we say instance lifetime we of course mean the lifetime of the physical machine that the instance is part of 👍