Best way to find out and set application resource requests/limits on Kubernetes

Hope you can help me with this!

What is the best approach to determine and set resource requests and limits per pod?

I was thinking of setting an expected traffic level and coding some load tests, then starting a single pod with some "low limits" and running the load test until it gets OOMKilled. I would then tune memory again (something like overclocking) until I find a bottleneck, then attack CPU until everything is "stable", and so on. I would use that "limit" as the "request value" and use double the "request value" as the "limit" (or a safe value based on the results). Finally, I would scale out to a fixed number of pods for average traffic and set pod autoscaling rules for peak production values.
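To make the rule above concrete, here is a minimal sketch of the arithmetic, assuming each load test run is recorded as a memory limit plus whether the pod survived. The function name `size_from_load_tests`, the tuple layout, and the sample numbers are hypothetical and only there for illustration; the factor of 2 is the "double the request" idea from the paragraph above.

```python
def size_from_load_tests(runs, limit_factor=2):
    """Pick a memory request and limit from load test outcomes.

    runs: list of (memory_limit_mib, survived) tuples, one per load test
          (hypothetical record format, not produced by any specific tool).
    """
    # Highest limit at which the pod was still OOMKilled (0 if none failed).
    highest_failed = max((limit for limit, ok in runs if not ok), default=0)
    # Only trust surviving runs above every failing limit.
    stable = [limit for limit, ok in runs if ok and limit > highest_failed]
    if not stable:
        raise ValueError("no stable run found; retry with higher limits")
    request = min(stable)  # lowest limit that survived the load test
    return {"request_mib": request, "limit_mib": request * limit_factor}


# Made-up example: the pod was OOMKilled at 128 and 256 MiB, stable from 384 MiB.
example_runs = [(128, False), (256, False), (384, True), (512, True)]
print(size_from_load_tests(example_runs))
```

The same shape would apply to CPU, with throttling or latency targets standing in for OOM kills as the failure signal.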

Is this a good approach? What tools and metrics do you recommend? I'm using prometheus-operator for monitoring and vegeta for load testing.

What about vertical pod autoscaling? Have you used it? Is it production ready?

BTW: I'm using an AWS managed solution deployed with a Terraform module.

Thanks for reading


The VerticalPodAutoscaler is more about making sure that a Pod can run. It starts the Pod low and doubles memory each time it gets OOMKilled, which can potentially lead to a Pod hogging resources. It is also limited in that it doesn't take account of under-performance: if your app is under-resourced it might still respond, but not within a timeframe you consider acceptable.

I think you are taking a good approach as you are looking at the application under load and assessing what it needs to perform as you want it to. I doubt I can suggest any tools you aren't already aware of but if it helps there is some more discussion in and the threads that link from it

Posted on by Ryan Dawson

I usually start my pods with no resource requests or limits set. Then I leave them running for a bit under normal load to collect metrics on resource consumption.

I then set the memory and CPU requests to 10% above the maximum consumption I saw in the test period, and the limits to 25% above the requests.
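As a rough illustration of that arithmetic, here is a small sketch. It assumes the observed usage has already been exported as plain integers (CPU in millicores, memory in MiB); the function name `size_from_observed`, the dictionary keys, and the sample values are made up for the example. It rounds up using integer math so the result never lands below the intended headroom.

```python
def add_percent(value, percent):
    """Return value increased by percent, rounded up to a whole unit."""
    return (value * (100 + percent) + 99) // 100


def size_from_observed(cpu_millicores, memory_mib,
                       request_headroom_pct=10, limit_headroom_pct=25):
    """Derive requests and limits from observed usage samples.

    cpu_millicores, memory_mib: lists of integer samples collected while
    the pod ran under normal load (hypothetical input format).
    """
    if not cpu_millicores or not memory_mib:
        raise ValueError("need at least one sample of each resource")
    # Requests: peak observed usage plus 10% headroom.
    requests = {
        "cpu_m": add_percent(max(cpu_millicores), request_headroom_pct),
        "memory_mib": add_percent(max(memory_mib), request_headroom_pct),
    }
    # Limits: requests plus 25% headroom.
    limits = {key: add_percent(value, limit_headroom_pct)
              for key, value in requests.items()}
    return requests, limits


# Made-up samples: CPU peaks at 400m, memory peaks at 512 MiB.
requests, limits = size_from_observed([120, 250, 400, 310], [380, 512, 470])
print(requests)  # {'cpu_m': 440, 'memory_mib': 564}
print(limits)    # {'cpu_m': 550, 'memory_mib': 705}
```

The two percentages are parameters on purpose, since the 10% and 25% figures are just one example strategy.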

This is just an example strategy, as there is no one-size-fits-all approach for this.

Posted on by whites11