KB: Kubernetes Pod CPU/RAM Requests and Limits


In Kubernetes, requests and limits control how CPU and memory are allocated to containers. A request tells the scheduler how much of a resource to reserve for a container, so the pod is placed on a node with enough available capacity. A limit caps the maximum amount of that resource the container may use. A limit must always be equal to or greater than the corresponding request; otherwise, Kubernetes rejects the pod specification and the container will not run.
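
For illustration, below is a minimal sketch of how these fields look in a pod spec; the pod name, image, and the specific values are placeholders rather than recommendations:

apiVersion: v1
kind: Pod
metadata:
  name: example-app            # placeholder name
spec:
  containers:
    - name: example-app
      image: nginx:1.25        # placeholder image
      resources:
        requests:              # reserved for scheduling decisions
          cpu: 250m            # 0.25 of a core
          memory: 64Mi
        limits:                # hard ceiling for this container
          cpu: 500m            # must be >= the request
          memory: 256Mi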

 

Keep Requests Minimal:
Keeping resource requests minimal is a Kubernetes best practice for optimizing node utilization and reducing cost. When requests reflect only what containers truly need, nodes are not left holding large amounts of reserved-but-idle capacity, and you avoid paying for over-provisioned resources. This also gives the scheduler more freedom to place and balance workloads across the cluster.

However, minimal requests do not mean your containers will lack resources. Kubernetes allows containers to use additional resources up to their defined limits if available, ensuring they can handle variable workloads without being constrained by minimal requests. This approach provides a balance where your applications have the necessary resources while minimizing costs. By carefully managing requests and limits, you optimize both performance and cost-efficiency in your Kubernetes cluster.

 

Scaling:
In addition, the AKS cluster can scale automatically by adding nodes when demand requires it, ensuring that pods and containers have sufficient resources even when they compete for node capacity. This auto-scaling capability helps maintain performance and resource availability during peak loads or unexpected demand spikes. Furthermore, our CloudOps team continuously monitors the cluster's health to ensure that pods do not run short of resources, proactively addressing any issues that may impact resource allocation and overall cluster performance.

CPU:
CPU resources in Kubernetes are measured in millicores: a value of "2000m" represents two full cores, while "250m" represents ¼ of a core. Note that a CPU request larger than the core count of the largest node in the cluster cannot be satisfied and will prevent the pod from being scheduled. It is generally advisable to keep CPU requests at or below "1" (one core) to improve scheduling flexibility and reliability, and to scale out with more replicas if more capacity is needed. In many cases, a value as low as 25m is a good choice.
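
As a quick reference, the excerpt below shows the resources block of a container spec (not a complete manifest) with a CPU quantity written in each of the two equivalent notations; the values themselves are illustrative only:

resources:
  requests:
    cpu: 250m      # millicore notation: one quarter of a core
  limits:
    cpu: "1"       # whole-core notation, equivalent to 1000m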

Memory:
Memory resources in Kubernetes are specified in bytes, typically using mebibyte (Mi) values, although allocations can range from bytes to petabytes. As with CPU, a memory request that exceeds the capacity of any node will prevent the pod from being scheduled. Unlike CPU, memory cannot be compressed: a container that exceeds its CPU limit is merely throttled, but a container that exceeds its memory limit is terminated (OOM-killed).
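
Likewise, the excerpt below shows the resources block of a container spec (not a complete manifest) using the binary Mi suffix for memory; the request and limit values are illustrative assumptions:

resources:
  requests:
    memory: 64Mi     # 64 mebibytes (64 * 1024 * 1024 bytes)
  limits:
    memory: 256Mi    # a container exceeding this value is OOM-killed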

Starting Numbers:

  • Memory:
    A common standard starting number for memory requests in Kubernetes is 64Mi (mebibytes). This value provides a reasonable baseline for many applications and allows flexibility to scale up or down based on actual resource usage patterns.

  • CPU:
    For CPU, a common industry-standard starting number is 100m (millicores), which is equivalent to 0.1 of a CPU core. This starting point provides a modest allocation of CPU resources for most applications while allowing for scalability and efficient resource utilization, as shown in the example manifest below.
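
Putting these starting numbers together, a sketch of a Deployment that uses them as requests might look like the following; the name, image, replica count, and the limit values are assumptions for illustration, since only the starting requests come from the guidance above:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 2                    # scale out with replicas rather than large CPU requests
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
        - name: example-app
          image: nginx:1.25      # placeholder image
          resources:
            requests:
              cpu: 100m          # baseline starting request (0.1 core)
              memory: 64Mi       # baseline starting request
            limits:
              cpu: 500m          # example ceiling, tune to the workload
              memory: 256Mi      # example ceiling, tune to the workload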

