Collector resource requests are very low #964
Hi @Harmelodic, Thanks for reaching out.
Are you saying the graphs are not optimal? Or the YAML spec itself is not optimal? And can you elaborate on what you mean by "optimal"?
Sure - #943. This will be released in the coming weeks on GKE. Note: there are a few gotchas with using VPAs on a DaemonSet experiencing heterogeneous resource usage, which is why it's an opt-in feature.
See above answer. If you're interested, you can definitely configure the VPA example to adjust according to CPU usage. cc @bernot-dev. Hope that helps.
Heyo! Thanks for the info! 🤗
Yeah, I kinda assumed that the previous information explained that by inference. My bad ❤️

What I mean is that, since the resource requests on the DaemonSet are so low (4m & 32M), but the usage by the Pod is so high (relatively), it introduces cluster scaling issues and workload prioritisation issues.

By cluster scaling issues, I basically mean that if we want to scale our cluster effectively, it is important to set resource requests on all workloads, where the requested resource is pretty much what the container needs to fulfil its job (plus a little bit of a buffer for flexibility).

By prioritisation issues, it's probably worth giving some context/examples: let's say we have a cluster that is packed full of applications that use a bunch of CPU and memory and produce a large amount of metrics, and on that cluster is Managed Prometheus (with this `collector` DaemonSet). In this context, it leads to:
There are then two scenarios I need to think about:

- Scenario 1: I prioritise my applications over metric collection.
- Scenario 2: I prioritise metric collection over my applications.

Out of these two scenarios (at least for the use cases I'm dealing with), there are very few cases where my applications getting most/all available node resources is a higher priority than ensuring all metrics get collected, especially since the attractiveness of Managed Prometheus as a product is "I don't have to worry about metric collection or storage".

Since you, as Google folks, cannot define what a Managed-Prometheus customer's priorities should be, minimising the resource requests for collection is understandable, since the customer might want to prioritise their applications over metric collection. However, for Managed-Prometheus customers that prioritise metric collection over application resource usage, this makes Managed Prometheus a bittersweet solution for metric collection & storage, since metrics aren't guaranteed to be collected, and those customers need to spend time thinking about and working with what was supposed to be a managed solution.

Possible improvements that come to mind:
Again, thanks for linking to the PR! Once it gets released into the Regular channel, I'm sure we'll take a look at the VPA. Incidentally, I'm not sure what you mean by "heterogeneous resource usage"? Also, what are the gotchas?
There are known limitations of Vertical Pod Autoscaling. The "heterogeneous resource usage" comment refers to situations where load is not distributed equally across the pods of the workload being autoscaled. In a Prometheus context, imagine that you have a cluster where only some of your workloads expose metrics, and those workloads are all scheduled on a single node. In that instance, the Prometheus pod on the node where all of the metrics are being scraped could be working hard (high CPU/memory usage), while the Prometheus pods on other nodes would be essentially idle.

This would be a poor fit for Vertical Pod Autoscaling over the DaemonSet of collectors because the same recommendation would apply to all of the pods in the DaemonSet, even though they have unequal load. In extreme instances, this problem could also result in pods becoming unschedulable.

Some of the known limitations have plans to be addressed, and it's possible we could see better performance across our use cases in the future through enhancements to Kubernetes. For now, we cannot guarantee that VPA will produce better results for all of our customers, so we are leaving the feature opt-in.
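The "same recommendation for every pod" problem can be illustrated with a toy calculation. This is purely illustrative: real VPA recommendations are histogram/percentile based, not a simple maximum, and the numbers below are invented.

```python
# Toy model of the "heterogeneous resource usage" problem described above.
# A VPA produces one recommendation per container name, so every pod in a
# DaemonSet receives the same request, sized for the busiest pod.

def daemonset_cpu_recommendation(per_pod_usage_m):
    """Single CPU request (millicores) covering the busiest pod, plus a
    15% safety margin. Illustrative only; real VPA uses usage histograms."""
    return round(max(per_pod_usage_m) * 1.15)

# One hot pod scraping almost all metrics (400m), four nearly idle pods (5m).
usage_m = [400, 5, 5, 5, 5]
request_m = daemonset_cpu_recommendation(usage_m)  # 460m for *every* pod

# The idle pods collectively reserve capacity they will never use, which is
# how a one-size recommendation can make pods unschedulable on full nodes.
wasted_m = sum(request_m - u for u in usage_m)
print(request_m, wasted_m)  # 460 1880
```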
Thank you very much for this info! 🙏 Incidentally, ...
Thankfully, we are not in this situation, since (a) all workloads produce metrics and (b) we spread our workloads as evenly as possible across the cluster. So, we'll definitely take a look at the VPA as soon as we have it available to our cluster 👍

However, whilst we don't have the VPA right now, and in case the VPA isn't a fit (for a different reason), or at least to satisfy other customers' needs, it'd still be good to consider making some improvements to the existing resource configuration 😊 ❤️
For GMP users, there is no "one size fits all" solution for resource configuration. GMP is on by default when you create a GKE cluster, but some customers do not use Prometheus metrics or do not use GMP. Of those who do use GMP, they range from very low usage to extremely high usage. Any static resource request level we set will be suboptimal for some customers, either requesting too much or too little.

Setting resource requests too high is problematic because it could artificially displace user workloads, and the effect is multiplied over the number of nodes because the collectors run in a DaemonSet. As an extreme example, if we set the collector to request 1 CPU and 1 GB of memory, it would actually take up 3 CPUs and 3 GB of memory on a small 3-node cluster, which would be unacceptable for all but heavy users of GMP, and may not meet their needs, either. This problem persists across different request levels, with different proportions of users being affected.

We ultimately chose to set requests to what we would expect the GMP components to consume at idle, because it will never waste resources. In most cases, allowing "bursting" (using resources greater than the requested amounts) is preferable when the nodes are not at full capacity. VPA can be a step in the right direction for some workloads, and we are continuing to explore additional options that will allow us to better meet the varied needs of our users.
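The node-count multiplication described above is simple arithmetic, sketched here with the numbers from the comment itself (1 CPU / 1 GB per pod on a 3-node cluster):

```python
# DaemonSets schedule one pod per node, so the cluster-wide cost of a
# per-pod request is multiplied by the node count.
nodes = 3
cpu_request = 1.0  # CPUs per collector pod (the extreme example above)
mem_request = 1.0  # GB per collector pod

total_cpu = nodes * cpu_request  # 3.0 CPUs reserved cluster-wide
total_mem = nodes * mem_request  # 3.0 GB reserved cluster-wide
print(total_cpu, total_mem)  # 3.0 3.0
```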
We were looking into the resource usage of Pods across our cluster and saw that the `collector` DaemonSet has quite low resource requests, and the Pods are consistently using well above those requests.

Looking into the DaemonSet config, we can see that the requested resources for the `prometheus` container in the `collector` Pods are:

This results in the "CPU Request % Used" and the "Memory Request % Used" graphs in GKE showing very high usage percentages:
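Based on the figures quoted in this thread (4m CPU & 32M memory), the relevant portion of the DaemonSet spec presumably looks something like the sketch below. The field layout is standard Kubernetes; the actual manifest may differ.

```yaml
# Hypothetical reconstruction from the 4m / 32M figures quoted in this thread.
containers:
  - name: prometheus
    resources:
      requests:
        cpu: 4m        # very low CPU request
        memory: 32M    # very low memory request
      # no limits shown here, so the container can burst above its requests
```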
This is not optimal.
Please consider making resource requests configurable per cluster, or more dynamic (using a VPA).
Incidentally, we noticed that there is a VPA example in this repo - however, this example does not cover CPU and doesn't appear to be applied (when looking through the manifests in the GCP Managed Prometheus documentation).
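For illustration, a VerticalPodAutoscaler that also covers CPU could look like the sketch below. The `gmp-system` namespace and `collector` DaemonSet/container names are assumptions based on this thread, while `autoscaling.k8s.io/v1` is the standard VPA API group:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: collector
  namespace: gmp-system      # assumed namespace for the GMP collector
spec:
  targetRef:
    apiVersion: apps/v1
    kind: DaemonSet
    name: collector          # assumed DaemonSet name
  updatePolicy:
    updateMode: "Auto"       # let the VPA apply its recommendations
  resourcePolicy:
    containerPolicies:
      - containerName: prometheus
        controlledResources: ["cpu", "memory"]  # cover CPU, unlike the repo example
        minAllowed:
          cpu: 4m            # floor at the current request levels
          memory: 32M
```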