About "technical information" in the order form


What to do if the maximum resources in the order form are insufficient?

  • Just make the order and try it - you might be surprised! You are not running an operating system there, only containers.
  • If - after actually trying to run your workloads - the resources really do seem to be insufficient, send an email to grp-openshift-owner@helsinki.fi. We can hand-tune them for you, within reason.

About storage:

  • The specified storage is ONLY about persistent data storage capacity! The container image sizes do NOT count against this!
  • If you're running "stateless" containers only - i.e. no PersistentVolumeClaims - you do not need to check the persistent storage box for your project!
  • If you later find you do actually want or need to use persistent storage in your project, send us an email at grp-openshift-owner@helsinki.fi .
    We will add it for you. No need to re-submit your project!
  • The persistent storage is backed by vSphere dynamic provisioning. This has certain limitations:
    • Short version: The ONLY supported AccessMode is ReadWriteOnce (see the example claim after this list).
    • Long version: While it is possible for multiple pods to simultaneously mount a single PVC (including for reading and writing - don't be fooled by the AccessMode name), all those pods must and will be scheduled on the same underlying VM node. If you try to force the pods to be scheduled on different nodes (e.g. with anti-affinities), Kubernetes will simply refuse to schedule more than one pod. This is because there is no networked filesystem magic whatsoever applied here: OpenShift simply attaches the virtual disk device to a single node at the VMware layer, mounts that disk in the VM, and then points the containers in those pods towards that disk.
      This means that if you are building a service that should be "always on", you will need to work around the fact that cluster updates will cause downtime for every PersistentVolumeClaim in your project.
    • We do know and agree that this is not a completely satisfactory state of things. We do want to make this better if time / resources permit.
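
For reference, a claim that follows the rule above could look roughly like the sketch below. The name and size are made up for illustration, and when no StorageClass is given the project's default one is used.

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: example-data            # hypothetical name, pick your own
    spec:
      accessModes:
        - ReadWriteOnce             # the only AccessMode supported on this cluster
      resources:
        requests:
          storage: 5Gi              # counts against the project's persistent storage quota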


What do these values actually mean? How do I work with them? Or, "CPU and RAM in Kubernetes: how to work with requests, limits and quotas"

The official docs about container resource management are here: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/

For simplicity, we are looking at three types of resources that need to be controlled:

  • CPU cycles
  • RAM capacity
  • Disk capacity (PersistentVolumes)

There are four main concepts that work together to facilitate resource usage controls in Kubernetes:

  • ResourceQuota (this is the thing you can ask for your project in the Onify order form)
  • LimitRange
  • Limits
  • Requests
  • (and the default values for limits and requests - set up in the LimitRange object for your project - once we get around to fixing the order form)


Here, we attempt to explain how these things work.

  • ResourceQuota states, at the namespace/project level, how much CPU, RAM or disk the pods/containers may use combined (a rough sketch follows right below). But the story of resource usage limiting does not end there!
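
As a rough illustration only - the actual quota object is created and managed by cluster administration, and the numbers below are invented - a ResourceQuota could look something like this:

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: project-quota           # hypothetical name
    spec:
      hard:
        requests.cpu: "2"           # sum of CPU requests over all pods in the project
        requests.memory: 4Gi        # sum of RAM requests over all pods in the project
        limits.cpu: "4"             # sum of CPU limits over all pods in the project
        limits.memory: 8Gi          # sum of RAM limits over all pods in the project
        requests.storage: 20Gi      # total PersistentVolumeClaim capacity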

Request and Limit values should be attached to every single pod, at the very least for RAM and CPU, in order to allow Kubernetes to keep the system running smoothly.

  • Request: This is the amount of CPU/RAM that Kubernetes guarantees every container in the pod may get. If Kubernetes sees that the cluster does not have the resources to honor this guarantee, the pod will not be scheduled. In other words, this is the lower bound of resources for the containers.
  • Limit: This is the maximum amount of CPU/RAM that Kubernetes may give to each container in the pod. This is a very useful value in many ways: the pod declares this amount as a self-restriction mechanism to protect the cluster from itself (bugs or surprise loads in server software), and the Kubernetes cluster gets usable information about how much resources the program running inside the container may actually need in order to be useful, while still being able to restrict huge spikes.
    • A usable strategy might be to set absolutely minimal Request values for CPU / RAM - just enough for the pods to barely run - and use the Limit values to cap the container-level resource usage. Maybe? (See the sketch after this list.)
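
A sketch of that strategy is below; the name, image and numbers are purely illustrative. The requests and limits go into the container spec inside your Deployment's pod template:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: example-app                     # hypothetical name
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: example-app
      template:
        metadata:
          labels:
            app: example-app
        spec:
          containers:
            - name: app
              image: example-registry/example-app:latest   # placeholder image
              resources:
                requests:
                  cpu: 100m                 # minimal guaranteed share, just enough to run
                  memory: 128Mi
                limits:
                  cpu: "1"                  # CPU usage above this gets throttled
                  memory: 512Mi             # exceeding this gets the container killed (OOM)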

Lastly, let's look at the LimitRange object, which is used by cluster administrators to streamline the management of resource usage limitations:

  • A LimitRange is applied to a project / namespace and, like the quota, it is not and should not be rewritable by project/namespace admins in a proper multi-tenant cluster. With a LimitRange, cluster administration can force minimum & maximum amounts for pod/container-level requests and limits, as well as set some sensible defaults that let project/namespace admins and developers just use the cluster.
    • If you find the default values set up in your project's LimitRange cumbersome, you can and should discuss the matter with cluster administrators at grp-openshift-owner@helsinki.fi !
    • If the maximum & minimum limits are suitable but you just want to use different request & limit values for the pods that you deploy, the quickest way is to set the limit and request yourself in the pod spec (usually located in the template part of your Deployment / DeploymentConfig object).

It is up to cluster administration to make sure the ResourceQuota and the LimitRange are sensibly coherent with each other. For example, the default limit and default request values should not be farther apart than the allowed maxLimitRequestRatio: that mistake would force developers to state at least one of those values themselves, because otherwise Kubernetes will refuse to schedule the pods since the applied default values violate the max ratio. Likewise, setting a higher maximum per-container Limit than the ResourceQuota in effect for the namespace is somewhat nonsensical, if still doable, as long as the Request falls below the ResourceQuota. And so on. If you find such (or any) problems in your default limits / requests / quota, please do not hesitate to contact cluster admins! We are here to actually help you... A sketch of what a coherent LimitRange might look like is below.
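
The numbers here are invented, and the real object in your project is set up by cluster administration; note how the default limit (500m) and default request (100m) stay within the maxLimitRequestRatio of 10.

    apiVersion: v1
    kind: LimitRange
    metadata:
      name: project-limits          # hypothetical name
    spec:
      limits:
        - type: Container
          defaultRequest:           # request applied when a container states none
            cpu: 100m
            memory: 256Mi
          default:                  # limit applied when a container states none
            cpu: 500m
            memory: 512Mi
          min:
            cpu: 50m
            memory: 64Mi
          max:
            cpu: "2"
            memory: 2Gi
          maxLimitRequestRatio:
            cpu: "10"               # a container's limit may be at most 10x its request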

WHY IS RESOURCEQUOTA INSUFFICIENT?

If the cluster starts to experience CPU / memory shortage, Kubernetes will need a way to prioritise the pods. Pods that do not have requests / limits attached to them are, from kube-scheduler's point of view, unreasonable. "Unreasonable" here means that there is no way for the scheduler to make any sensible decisions about what to put where. Therefore, pods without limits set will always be the first ones to get evicted from the cluster.

With CPU, Kubernetes may throttle the pods and the processes will simply run slower. With RAM this is not possible, for obvious reasons, and evictions will happen instead.

The full story and details about how Kubernetes selects pods for eviction in case of resource shortage will have to wait for a later time.

OBJECT-COUNT resources

Then there are limits on how many ConfigMaps, Secrets, pods, and other Kubernetes objects a namespace can hold (see the example quota below). There is no Kubernetes-side hard limit on how big a ConfigMap or Secret can be, but the etcd backend datastore can only hold objects of about 1 MB, and other parts of the plumbing in the apiserver may impose their own limits. Word on the street seems to be that around 1 MB is the current upper limit for a single ConfigMap/Secret/other object. Some more information here: https://stackoverflow.com/a/53015758/1889463
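
Object-count limits live in the same ResourceQuota mechanism. A purely illustrative sketch (the real values for your project are decided by cluster administration):

    apiVersion: v1
    kind: ResourceQuota
    metadata:
      name: object-counts           # hypothetical name
    spec:
      hard:
        pods: "20"                  # maximum number of pods in the namespace
        configmaps: "30"
        secrets: "30"
        count/deployments.apps: "10"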

What does the number "0.5", "1.0", "1.5" or "2.0" etc. for CPU actually mean?


It goes directly into the ResourceQuota object's spec.hard.requests.cpu value for your new project. It caps the TOTAL amount of CPU that the containers under your project may request at any single point in time.
Whether you want to run 10 containers using 1/10 of your quota each, or a single container using all of your quota, is completely up to you.
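
To make the arithmetic concrete, suppose the order form number was 2.0, so hard.requests.cpu is "2". Both of the following container resource fragments (illustrative values only) fit inside that quota:

    # Option A: ten replicas, each container requesting 200m (10 x 0.2 = 2.0 CPU)
    resources:
      requests:
        cpu: 200m

    # Option B: a single container requesting the whole quota (1 x 2.0 = 2.0 CPU)
    resources:
      requests:
        cpu: "2"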

What the number technically, ultimately means is a ... hairy subject that cannot be exactly guaranteed or explained. It should correspond roughly 1:1 to vCPU cores in the VM nodes OpenShift runs on, but there are some caveats:

  • The physical VMware hosts these VM nodes run on are not exactly uniform in terms of CPU power, so what "1 vCPU" means for a process in a VM is not really set in stone.
  • How well the VMs get scheduled onto the VMware platform hosts is another question altogether (and one that is out of the control of OpenShift Container Platform administration), which adds another layer of complication when thinking about how much CPU your containers will actually get.
  • CPU is a throttlable resource, which means that in the event of a CPU shortage, Kubernetes will start to shave off CPU cycles from pods if it can. In extreme cases, it can also evict (shut down) pods.



Contact information

The recommended contact for questions is:

https://helsinkifi.slack.com #kontit 

All changes related to resources for the project/namespace should be emailed to:

grp-openshift-owner@helsinki.fi (platform administration and development)
tike-ohjelmistotuotanto@helsinki.fi (program development)