Best way to determine and set application resource requests/limits on Kubernetes

Hope you can help me with this!

What is the best approach to determine and set resource requests and limits per pod?

I was thinking of estimating the expected traffic and writing some load tests, then starting a single pod with low limits and running the load test until it gets OOMKilled. I would then raise the memory (something like overclocking) until I find a bottleneck, then do the same for CPU until everything is "stable", and so on. I would then use that limit as the request value and set the limit to double the request (or a safe value based on the results). Finally, I would scale out to a fixed number of pods sized for average traffic and set pod autoscaling rules for peak production values.
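To make the arithmetic of this plan concrete, here is a rough Python sketch. The function names, the per-pod throughput figure, and the example numbers are all hypothetical; only the "limit is double the request" rule and the "fixed pod count for average traffic" idea come from the plan above.

```python
import math


def resources_from_load_test(stable_memory_mib, stable_cpu_millicores, limit_factor=2):
    """Use the lowest stable values found under load as requests,
    and limit_factor times those values as limits (hypothetical helper)."""
    return {
        "requests": {
            "memory": f"{stable_memory_mib}Mi",
            "cpu": f"{stable_cpu_millicores}m",
        },
        "limits": {
            "memory": f"{stable_memory_mib * limit_factor}Mi",
            "cpu": f"{stable_cpu_millicores * limit_factor}m",
        },
    }


def baseline_replicas(average_rps, rps_per_pod):
    """Fixed pod count for average traffic, rounded up so capacity is never short."""
    return math.ceil(average_rps / rps_per_pod)


# Example values are made up: one pod was stable at 256 MiB / 250m CPU
# and handled about 200 requests per second in the load test.
resources = resources_from_load_test(256, 250)
replicas = baseline_replicas(average_rps=900, rps_per_pod=200)
```

The resulting dictionary has the same shape as the `resources` section of a container spec, so the values can be copied into a manifest by hand.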

Is this a good approach? What tools and metrics do you recommend? I'm using prometheus-operator for monitoring and vegeta for load testing.

What about vertical pod autoscaling? Have you used it? Is it production ready?

BTW: I'm using the AWS managed solution, deployed with a Terraform module.

Thanks for reading

Answers

The VerticalPodAutoscaler is more about making sure that a Pod can run at all. It starts the Pod low and raises its memory each time it gets OOMKilled, which can potentially lead to a Pod hogging resources. It is also limited in that it doesn't take account of under-performance: if your app is under-resourced it might still respond, but not within a timeframe you consider acceptable.

I think you are taking a good approach, as you are looking at the application under load and assessing what it needs to perform the way you want it to. I doubt I can suggest any tools you aren't already aware of, but if it helps, there is some more discussion in https://stackoverflow.com/questions/53950168/how-to-set-the-right-cpu-millicores-for-a-container and the threads linked from it.

Posted by Ryan Dawson

I usually start my pods with no requests or limits set. Then I leave them running for a while under normal load to collect metrics on resource consumption.
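If the metrics end up in Prometheus, the peak consumption over the observation period could be read with queries along the following lines. This is only a sketch: it assumes the usual cAdvisor metrics `container_memory_working_set_bytes` and `container_cpu_usage_seconds_total` are being scraped with `namespace`, `pod` and `container` labels, and the helper name, namespace and pod pattern are made up for illustration.

```python
def peak_usage_queries(namespace, pod_regex, window="7d"):
    """Build PromQL strings for peak memory and peak CPU over a time window.

    Assumes cAdvisor metrics with namespace/pod/container labels are available.
    """
    # container!="" skips the pod-level aggregate series that cAdvisor also exposes.
    selector = f'namespace="{namespace}", pod=~"{pod_regex}", container!=""'
    memory = (
        f"max_over_time(container_memory_working_set_bytes{{{selector}}}[{window}])"
    )
    # CPU is a counter, so take a 5m rate first and then its maximum via a subquery.
    cpu = (
        f"max_over_time(rate(container_cpu_usage_seconds_total{{{selector}}}[5m])"
        f"[{window}:5m])"
    )
    return {"memory_bytes": memory, "cpu_cores": cpu}


# Hypothetical namespace and pod name pattern.
queries = peak_usage_queries("shop", "api-.*")
```

The strings can be pasted into the Prometheus expression browser or a Grafana panel; the memory result is in bytes and the CPU result is in cores.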

I then set the memory and CPU requests to 10% above the maximum consumption observed during the test period, and the limits to 25% above the requests.
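As a worked example of that rule, a small Python sketch follows. The function name and the sample peak values are hypothetical; the 10% and 25% margins are the ones described above, and results are rounded up to whole MiB and millicores.

```python
def size_from_observed_peak(peak_memory_mib, peak_cpu_millicores,
                            request_margin_pct=10, limit_margin_pct=25):
    """Requests = observed peak + 10%, limits = requests + 25% (rounded up)."""

    def add_margin(value, pct):
        # Integer ceiling division avoids floating point rounding surprises.
        return -(-value * (100 + pct) // 100)

    request_memory = add_margin(peak_memory_mib, request_margin_pct)
    request_cpu = add_margin(peak_cpu_millicores, request_margin_pct)
    return {
        "requests": {"memory": f"{request_memory}Mi", "cpu": f"{request_cpu}m"},
        "limits": {
            "memory": f"{add_margin(request_memory, limit_margin_pct)}Mi",
            "cpu": f"{add_margin(request_cpu, limit_margin_pct)}m",
        },
    }


# Made-up observation: the pod peaked at 400 MiB and 200 millicores.
sizing = size_from_observed_peak(400, 200)
```

With these sample numbers the requests come out at 440Mi and 220m, and the limits at 550Mi and 275m.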

This is just an example strategy, as there is no one-size-fits-all approach for this.

Posted by whites11