Kubernetes v1.36 Enhances Memory QoS with Tiered Protection and Opt-In Reservations

From Touriddu, the free encyclopedia of technology

Introduction

As Kubernetes evolves, ensuring predictable performance for containerized workloads remains a top priority. In version 1.36, the Memory QoS feature (still in alpha) receives significant enhancements, giving cluster operators finer control over how the kernel handles container memory under pressure. Originally introduced in v1.22 and refined in v1.27, this feature leverages the cgroup v2 memory controller to provide smarter guidance to the kernel. This article explores the key updates in v1.36: opt-in memory reservation, tiered protection by Quality of Service (QoS) class, new observability metrics, and a kernel‑version warning for memory.high.

What’s New in v1.36

The Memory QoS feature has been re‑architected to separate throttling from memory reservation. Previously, enabling the feature gate immediately configured memory.min for every container with a memory request, creating a hard reservation whose memory the kernel would never reclaim. In v1.36, administrators can adopt a more graduated approach.

Opt‑In Memory Reservation with memoryReservationPolicy

Starting in v1.36, the kubelet introduces a new configuration field, memoryReservationPolicy, which controls whether the kubelet writes memory.min or memory.low for pods. This field offers two options:

  • None (default): The kubelet does not write memory.min or memory.low for any containers. Throttling via memory.high still works, controlled by the memoryThrottlingFactor (default 0.9). This lets operators enable throttling first to observe workload behavior without hard guarantees.
  • TieredReservation: The kubelet writes tiered memory protection based on the Pod’s QoS class, providing differentiated levels of kernel‑enforced memory reservation.
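Assuming the standard KubeletConfiguration file format, a configuration enabling the feature might look like the following sketch (the memoryReservationPolicy field is new in this release; memoryThrottlingFactor and the MemoryQoS feature gate predate it):

```yaml
# KubeletConfiguration fragment (sketch): enable throttling only at first,
# then switch memoryReservationPolicy to TieredReservation once the node's
# headroom has been observed.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  MemoryQoS: true
memoryThrottlingFactor: 0.9      # memory.high throttling (default)
memoryReservationPolicy: None    # no memory.min / memory.low written yet
```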

Tiered Protection by QoS Class

When memoryReservationPolicy is set to TieredReservation, the kubelet applies distinct cgroup v2 parameters for each QoS class:

  • Guaranteed Pods: Receive hard protection via memory.min. For example, a Guaranteed Pod requesting 512 MiB of memory will have memory.min set to 536870912 bytes (512 MiB). The kernel will not reclaim memory within this boundary under any conditions; if no unprotected reclaimable memory is available, it invokes the OOM killer on other processes to free pages.
  • Burstable Pods: Receive soft protection via memory.low. For the same 512 MiB request on a Burstable Pod, the cgroup file shows memory.low = 536870912. The kernel avoids reclaiming this memory under normal memory pressure but may reclaim it if the alternative is a system‑wide OOM.
  • BestEffort Pods: Receive neither memory.min nor memory.low. Their memory remains fully reclaimable, offering the most flexibility to the kernel under pressure.
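The tiering above can be sketched as a small mapping function. This is a hypothetical illustration of the behavior described here, not actual kubelet source:

```python
def tiered_reservation(qos_class: str, request_bytes: int) -> dict:
    """Map a Pod's QoS class and memory request to the cgroup v2
    protection file the kubelet writes under TieredReservation.

    Illustrative sketch only; function and structure are assumptions.
    """
    if qos_class == "Guaranteed":
        return {"memory.min": request_bytes}   # hard protection
    if qos_class == "Burstable":
        return {"memory.low": request_bytes}   # soft protection
    return {}                                  # BestEffort: fully reclaimable


# A 512 MiB request is 512 * 1024 * 1024 = 536870912 bytes:
print(tiered_reservation("Guaranteed", 512 * 1024 * 1024))
# {'memory.min': 536870912}
```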

This tiered approach ensures that the most critical workloads receive the strongest guarantee, while burstable pods get a preference that can be relaxed in extreme situations.

Comparison with Previous Behavior

In earlier versions (v1.27 and before), enabling the MemoryQoS feature gate immediately set memory.min for every container with a memory request, regardless of QoS class. memory.min is a hard reservation – the kernel will not reclaim that memory, even under extreme pressure. Consider a node with 8 GiB of RAM where Burstable Pod requests total 7 GiB. In earlier versions, that 7 GiB would be locked as memory.min, leaving very little headroom for the kernel, system daemons, or BestEffort workloads. This increased the risk of OOM kills and made the node inflexible.

With v1.36’s tiered reservation, Burstable requests map to memory.low instead of memory.min. Under normal pressure, the kernel still protects that memory, but under extreme pressure it can reclaim a portion to avoid system‑wide OOM. Only Guaranteed Pods use memory.min, which keeps the hard‑reservation footprint lower. With the memoryReservationPolicy field, operators can first enable throttling (None), observe workload behavior, and then opt into reservation (TieredReservation) when the node has enough headroom.
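For reference, the throttling side is unchanged: memory.high is derived from the request, the limit, and the throttling factor. The sketch below follows the formula from the upstream Memory QoS design (KEP‑2570); treat the exact rounding as an assumption:

```python
def memory_high(request: int, limit: int, factor: float = 0.9,
                page_size: int = 4096) -> int:
    """Compute memory.high as request + factor * (limit - request),
    rounded down to a page boundary. `limit` is limits.memory, or node
    allocatable memory when no limit is set. Sketch per KEP-2570.
    """
    raw = request + factor * (limit - request)
    return int(raw // page_size) * page_size


# 512 MiB request, 1 GiB limit, default factor 0.9:
print(memory_high(512 * 1024 * 1024, 1024 * 1024 * 1024))
```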

Observability Metrics

To help administrators monitor the impact of Memory QoS, v1.36 exposes two alpha‑stability metrics on the kubelet’s /metrics endpoint:

  • kubelet_memory_qos_node_memory_min_bytes: Total amount of memory.min reserved across all pods on the node.
  • kubelet_memory_qos_node_memory_low_bytes: Total amount of memory.low reserved across all Burstable pods on the node.

These metrics allow operators to track how much memory is hard‑reserved vs. soft‑reserved, making it easier to tune the memoryReservationPolicy and memoryThrottlingFactor for optimal resource utilization.
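Since these are gauges in the standard Prometheus text exposition format, they can be scraped and parsed with ordinary tooling. A minimal sketch follows; the sample payload and values are illustrative assumptions:

```python
def parse_memory_qos_metrics(metrics_text: str) -> dict:
    """Extract the two Memory QoS gauges from Prometheus text output.

    Minimal parser for illustration; real deployments would scrape the
    kubelet's /metrics endpoint with Prometheus itself.
    """
    wanted = {
        "kubelet_memory_qos_node_memory_min_bytes",
        "kubelet_memory_qos_node_memory_low_bytes",
    }
    values = {}
    for line in metrics_text.splitlines():
        if line.startswith("#"):
            continue  # skip HELP/TYPE comment lines
        parts = line.split()
        if len(parts) == 2 and parts[0] in wanted:
            values[parts[0]] = float(parts[1])
    return values


sample = """\
# TYPE kubelet_memory_qos_node_memory_min_bytes gauge
kubelet_memory_qos_node_memory_min_bytes 1.073741824e+09
# TYPE kubelet_memory_qos_node_memory_low_bytes gauge
kubelet_memory_qos_node_memory_low_bytes 5.36870912e+08
"""
print(parse_memory_qos_metrics(sample))
```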

Kernel‑Version Warning for memory.high

One subtle but important addition in v1.36 is a startup warning when the host kernel is older than 5.11. The memory.high cgroup file, used for throttling, only works reliably as a limiter starting from kernel 5.11 (and was completely absent in some older kernels). The kubelet now checks the kernel version and emits a warning if the feature might behave unexpectedly. This helps administrators avoid confusing behavior when deploying Memory QoS on nodes running an older kernel.
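The version comparison the kubelet performs can be sketched as follows; the function name and parsing details are assumptions for illustration:

```python
def memory_high_reliable(kernel_release: str, minimum=(5, 11)) -> bool:
    """Return True if the kernel release string (as reported by
    `uname -r`) meets the minimum version at which memory.high
    throttling is considered reliable. Illustrative sketch of the
    startup check described above.
    """
    # e.g. "5.10.0-23-amd64" -> (5, 10)
    numeric = kernel_release.split("-")[0]
    major, minor = (int(x) for x in numeric.split(".")[:2])
    return (major, minor) >= minimum


print(memory_high_reliable("5.10.0-23-amd64"))  # False: kubelet would warn
print(memory_high_reliable("6.1.0-18-amd64"))   # True
```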

Conclusion

Kubernetes v1.36 marks a significant step forward for Memory QoS. By separating throttling from reservation, introducing tiered protection based on QoS class, and providing observability metrics, the feature gives administrators the flexibility to balance performance and resource efficiency. The opt‑in approach allows safe rollout: enable throttling first, monitor, then add hard reservations for Guaranteed Pods when confidence grows. Combined with the kernel‑version warning, these changes make the alpha feature more production‑ready than ever. As with all alpha features, careful testing in non‑critical environments is advised before wide deployment.