[receiver/kubeletstat] Review cpu.utilization naming #27885

Open · TylerHelmuth opened this issue Oct 20, 2023 · 37 comments

@TylerHelmuth (Member) commented Oct 20, 2023

Component(s)

receiver/kubeletstats

Is your feature request related to a problem? Please describe.

The Kubeletstats receiver currently uses *.cpu.utilization as the name for the CPU metrics that report the CPUStats UsageNanoCores value.

I believe that UsageNanoCores reports the actual amount of CPU being used, not the ratio of the amount used to a total limit. If this is true, then our use of utilization does not meet semantic convention expectations.

I would like to have a discussion about what exactly UsageNanoCores represents and if our metric naming needs updating.

Related to the discussion that started in #24905.

@TylerHelmuth (Member, Author)

/cc @jinja2 @dmitryax @povilasv

@povilasv (Contributor) commented Oct 23, 2023

Did some digging:

Kubernetes Docs state:

// Total CPU usage (sum of all cores) averaged over the sample window.
// The "core" unit can be interpreted as CPU core-nanoseconds per second.
// +optional
UsageNanoCores *uint64 `json:"usageNanoCores,omitempty"`

Looks like it's getting these metrics from CRI, and if CRI doesn't provide the stats, the kubelet computes them using this formula:

		nanoSeconds := newStats.Timestamp - cachedStats.Timestamp

		usageNanoCores := uint64(float64(newStats.UsageCoreNanoSeconds.Value-cachedStats.UsageCoreNanoSeconds.Value) /
			float64(nanoSeconds) * float64(time.Second/time.Nanosecond))

Ref: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/stats/cri_stats_provider.go#L791-L800

Where:

// Cumulative CPU usage (sum across all cores) since object creation.
UsageCoreNanoSeconds *UInt64Value `protobuf:"bytes,2,opt,name=usage_core_nano_seconds,json=usageCoreNanoSeconds,proto3" json:"usage_core_nano_seconds,omitempty"`

🤔

Playing a bit with the formula:

Limit is the total available CPU time.

Let's say we collect every 1 second, and the app uses all of the available CPU time, i.e. 1 second.

nanoSeconds := now() - (now() - 1s) = 1s = 1,000,000,000 nanoseconds

usageNanoCores := (2,000,000,000 - 1,000,000,000) / 1,000,000,000 * 1,000,000,000 = 1,000,000,000

or simplified:

usageNanoCores := (2s - 1s) / 1s * float64(time.Second/time.Nanosecond) = uint64(1 * float64(time.Second/time.Nanosecond)) = 1,000,000,000

Based on this example, the result is the actual usage of 1,000,000,000 nanoseconds, or 1 second.

So this metric's unit seems to be nanoseconds, not a percentage.

If my calculations are correct, I think we should rename it to cpu.usage with the proper unit (nanoseconds)?
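
For reference, a minimal self-contained sketch of the quoted formula with the example numbers plugged in (purely illustrative, not the kubelet's actual code), showing that the result comes out as an absolute rate rather than a ratio:

```go
package main

import (
	"fmt"
	"time"
)

// usageNanoCores mirrors the kubelet derivation quoted above: the delta of the
// cumulative usage_core_nano_seconds counter divided by the elapsed wall-clock
// time, scaled back to a "per second" rate.
func usageNanoCores(newUsage, oldUsage uint64, newTS, oldTS time.Time) uint64 {
	nanoSeconds := newTS.UnixNano() - oldTS.UnixNano()
	return uint64(float64(newUsage-oldUsage) / float64(nanoSeconds) * float64(time.Second/time.Nanosecond))
}

func main() {
	now := time.Now()
	// Example from above: the app consumed 1s of CPU over a 1s window.
	got := usageNanoCores(2_000_000_000, 1_000_000_000, now, now.Add(-time.Second))
	fmt.Println(got) // 1000000000, i.e. one full core's worth of CPU per second
}
```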

@TylerHelmuth (Member, Author)

@povilasv thank you!

dmitryax pushed a commit that referenced this issue Jan 12, 2024
**Description:** Starts the name change process for `*.cpu.utilization` metrics.

**Link to tracking Issue:** Related to #24905, related to #27885

@ChrsMark (Member) commented Mar 27, 2024

FYI @TylerHelmuth @povilasv: in SemConv we have merged open-telemetry/semantic-conventions#282, which adds the container.cpu.time metric for now.

For the dockerstats receiver we have #31649 which will try to align the implementation with the added SemConvs.

Do we have a summary so far of what is missing from the kubeletstats receiver in terms of naming changes (like this current issue)?

Shall we try to adapt the kubeletstats receiver along with #31649? Happy to help with that.

At the moment the implementation of the receiver provides the following:

  • Are we planning to keep them all?
  • Are those all aligned with https://github.com/open-telemetry/semantic-conventions/blob/71c2e8072596fb9a4ceb68303c83f5389e0beb5f/docs/general/metrics.md#instrument-naming?

From https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/25901/files#diff-3343de7bfda986546ce7cb166e641ae88c0b0aecadd016cb253cd5a0463ff464R352-R353 I see we are going to remove/deprecate container.cpu.utilization?
Could we keep it instead as an optional metric and find a proper way to calculate it? I see it was mentioned at #25901 (comment) but I'm not sure how it was resolved. I guess that would be possible by adding a cpuNodeLimit (retrieved from the Node resource), similarly to how the configured resource limit is used. I drafted a very WiP implementation for this to illustrate the point -> ChrsMark@27ce769
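
Roughly, the idea would be something like the following sketch (the names and the nodeCPUCores parameter are hypothetical, not the receiver's actual internals):

```go
// Package cpuutil is a sketch only: derive a utilization ratio for a
// container/pod against the node's CPU capacity.
package cpuutil

// nanoCoresPerCore: one CPU core expressed in nanocores.
const nanoCoresPerCore = 1e9

// UtilizationAgainstNode returns the usage reported in nanocores as a fraction
// of the node's total CPU capacity (e.g. 0.5 means half of the node's CPU).
func UtilizationAgainstNode(usageNanoCores uint64, nodeCPUCores float64) float64 {
	if nodeCPUCores <= 0 {
		return 0
	}
	return float64(usageNanoCores) / (nodeCPUCores * nanoCoresPerCore)
}
```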

@TylerHelmuth (Member, Author) commented Apr 2, 2024

@ChrsMark in my opinion yes to all questions. We want to be aligned with the spec (although I'd love to reduce the number of iterations in the receivers to gain that alignment, how long till we're stable lol).

I don't have a lot of time to dedicate to getting the kubeletstatsreceiver up-to-date with the non-stable spec. At this point I was planning to wait for things to stabilize before making any more changes besides the work we started in this issue.

@ChrsMark (Member) commented Apr 3, 2024

Thanks @TylerHelmuth, I see the point of not chasing after an unstable schema/spec.

Just to clarify regarding container.cpu.utilization though: shall we abort its deprecation and try to calculate it properly, as was mentioned at #25901 (comment)? This can happen based on the cpuNodeLimit, and it seems to be doable based on some quick research I did: ChrsMark@27ce769

@TylerHelmuth (Member, Author)

@ChrsMark yes, I'd be fine with keeping the metric if we can calculate it correctly. We'd still need to go through some sort of feature-gate process to make it clear to users that the metric has changed and that, if they want the old value, they need to use the new .usage metric.

@ChrsMark (Member)

@TylerHelmuth @povilasv I have drafted a patch to illustrate the point at #32295. My findings look promising :).

If we agree on the idea I can move the PR forward to fix the details and open it for review. Let me know what you think.

@TylerHelmuth (Member, Author)

Seems reasonable. @jinja2 please take a look

@povilasv (Contributor)

This looks reasonable to me as well. I would add the informer too so we don't call Get Node every time we scrape data; in practice, Node CPU capacity doesn't change.
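
For illustration, a minimal client-go sketch of watching a single Node through a shared informer and reading its CPU capacity from the local cache (the node name and resync period are placeholders, not the receiver's actual wiring):

```go
package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	nodeName := "my-node" // placeholder: normally taken from config / downward API

	// Shared informer limited to the single node we care about, so the cache
	// stays tiny and we avoid a GET against the API server on every scrape.
	factory := informers.NewSharedInformerFactoryWithOptions(client, 10*time.Minute,
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = "metadata.name=" + nodeName
		}))
	nodeLister := factory.Core().V1().Nodes().Lister()

	stopCh := make(chan struct{})
	factory.Start(stopCh)
	factory.WaitForCacheSync(stopCh)

	// On each scrape, read the capacity from the informer cache.
	node, err := nodeLister.Get(nodeName)
	if err != nil {
		panic(err)
	}
	cpuQty := node.Status.Capacity[corev1.ResourceCPU]
	fmt.Printf("node CPU capacity: %d millicores\n", cpuQty.MilliValue())
}
```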

@ChrsMark (Member)

Thanks for the feedback, folks!
I have adjusted #32295 to use an informer instead of getting the Node on every scrape. Something I noticed is that we have several places with different informer-based implementations, like in k8sclusterreceiver and k8sattributesprocessor. Maybe it would make sense to extract that common logic into a shared lib and re-use it from the various receivers and processors (even in resourcedetectionprocessor), but we can file a different issue for this.
For now we can continue any discussion at #32295.

andrzej-stencel pushed a commit that referenced this issue Jul 10, 2024
…ion` metrics (#33591)

**Description:**
Similar to #32295 and #33390, this PR adds the `k8s.{container,pod}.memory.node.utilization` metrics.


**Link to tracking Issue:** #27885

**Testing:** Added unit test.

**Documentation:** Added.

### Manual testing

1. Using the following target Pod:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-demo
spec:
  containers:
  - name: memory-demo-ctr
    image: polinux/stress
    resources:
      requests:
        memory: "8070591Ki"
      limits:
        memory: "9070591Ki"
    command: ["stress"]
    args: ["--vm", "1", "--vm-bytes", "800M", "--vm-hang", "4"]
```
2. 

![memGood](https://github.com/open-telemetry/opentelemetry-collector-contrib/assets/11754898/fae04b30-59ca-4d70-8446-f54b5a085cf7)

On a node with 32.5G of memory, the 800MB container/Pod consumes `0.8/32.5 = 0.0246... ≈ 0.025`.

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
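
In essence the new metric is just usage divided by the node's capacity; a tiny illustrative sketch of that arithmetic (hypothetical helper name, not the receiver's code):

```go
// Package memutil is a sketch only.
package memutil

// MemoryNodeUtilization illustrates the k8s.pod.memory.node.utilization idea:
// the pod's memory usage divided by the node's memory capacity.
// E.g. 0.8 GB used on a 32.5 GB node -> 0.8/32.5 ≈ 0.025.
func MemoryNodeUtilization(podUsageBytes, nodeCapacityBytes float64) float64 {
	if nodeCapacityBytes <= 0 {
		return 0
	}
	return podUsageBytes / nodeCapacityBytes
}
```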
@ChrsMark (Member)

Any thoughts about k8s.node.cpu.utilization? Should this be calculated properly instead of deprecated? This one should be straightforward compared to the container/pod ones, since the node does not have the respective request/limit-based metrics.

So it would just be

k8s.node.cpu.utilization = k8s.node.cpu.usage / num_of_cores

@povilasv (Contributor)

Sounds good to me, if we can compute it properly

@ChrsMark (Member) commented Jul 15, 2024

I wonder, though, if we should first deprecate the metrics and then re-introduce this specific one, as was pointed out at #27885 (comment).

Otherwise we can fix this one directly.

@povilasv (Contributor)

Personally, I would be okay if we just fix it, but I'm not sure what others think.

@ChrsMark (Member) commented Jul 22, 2024

The thing is that changing the way it is calculated requires enabling the Node informer. So if we change this now as-is, any existing users' configurations will become invalid and the metric will be reported as 0. This is highlighted by the failing tests at #34191.

So what I would suggest here is:

  1. Update the README file, add a warning in the release notes, and update the printed warning that the value will "swap" in a specific future release. Maybe write a blog post too.
  2. Make the *.cpu.usage metrics enabled by default: [receiver/kubeletstats] Enable by default the .cpu.usage metrics #34217
  3. Make the *.cpu.utilization metrics disabled by default.
  4. Remove entirely the k8s.node.cpu.utilization, k8s.pod.cpu.utilization and container.cpu.utilization metrics.
  5. Add back the k8s.node.cpu.utilization with [receiver/kubeletstats] Emit k8s.node.cpu.utilization as ratio against node's capacity #34191, which will be disabled by default.

The above should be done in concrete/standalone PRs and can be split across different releases to ensure a gradual switch process for the users.

WDYT?

/cc @TylerHelmuth @dmitryax

@TylerHelmuth (Member, Author)

> So if we change this now as-is, any existing users' configurations will become invalid and the metric will be reported as 0.

Ya that'd be bad.

I like this plan as it works toward the original goal (using the proper name) and allows us to work on an actual utilization metric separately. My only concern is that I really didn't want to do breaking semantic convention changes to the k8s components until the k8s semconv was stable, so that all the changes could be hidden behind one feature flag.

That isn't happening anytime soon tho. I think starting with enabling cpu.usage by default is a good next step.

@dmitryax (Member) commented Aug 7, 2024

@ChrsMark, I like that outlined plan. Thanks for putting it together! However, I'd prefer combining 1 and 2 in one release so we don't change the cardinality. cc @TylerHelmuth

@TylerHelmuth (Member, Author) commented Aug 7, 2024

@dmitryax If we do 1 and 2 in one release we need more evangelism around the change. The usage metrics and their warnings have been around for a long time, but the change wasn't very public and I worry users aren't adopting it.

I think we'd need to update the README, add a warning in the release notes, and update the printed warning to say that the value will "swap" in a specific future release. Maybe write a blog post too.

@ChrsMark (Member) commented Aug 8, 2024

> I think we'd need to update the README, add a warning in the release notes, and update the printed warning to say that the value will "swap" in a specific future release. Maybe write a blog post too.

@TylerHelmuth @dmitryax Should we then first do those in a step 0 in order to later combine step 1 and step 2? I'd be fine with that and taking care of it.

@TylerHelmuth (Member, Author)

Sure

@ChrsMark (Member)

@TylerHelmuth cool! Any preferences on what the target release of this change/deprecation should be? Based on the release-schedule I would go with v0.111.0 (2024-10-07).

@TylerHelmuth (Member, Author)

Is there a way for us to add a feature gate or something that causes the collector to fail on start if utilization is used? Breaking semantic changes are so difficult because a user could upgrade the collector and not realize it just started emitting new telemetry, and in this case it is a metric that has been a core metric of this receiver for 4+ years.

It would be great if we could add a setup where the collector fails to start if utilization is in use, and the user could disable the feature gate to get back to a running state, but at least they'd have to acknowledge the change is coming on a specific date.

@ChrsMark (Member)

> It would be great if we could add a setup where the collector fails to start if utilization is in use, and the user could disable the feature gate to get back to a running state, but at least they'd have to acknowledge the change is coming on a specific date.

Hmm, having the Collector fail completely for one metric sounds a bit concerning to me, tbh. Has this approach been used in other components too?

How about making the switch completely behind a feature gate then?

This way we enable users to switch whenever they want prior to a specific release, and after we make the switch, the feature gate to continue using the utilization metrics will still be there for some period. We can define a plan like the following:

  • Introduce it in v0.x.0 as alpha - disabled by default.
  • Move it to beta in v0.y.0 - enabled by default. (for configs that still enable the utilization it can fail)
  • Move it to stable in v0.z.0 - cannot be disabled.
  • Remove it three releases after stable.

(we have something similar in hostmetricsreceiver)

Wouldn't that cover most of what we have discussed here?
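
For reference, a rough sketch of what such a gate could look like using the Collector's featuregate package (the gate ID and the start-time check are hypothetical, not the actual PR):

```go
package kubeletstatsreceiver

import (
	"errors"

	"go.opentelemetry.io/collector/featuregate"
)

// Hypothetical gate: when enabled, the receiver refuses to start if any of the
// deprecated *.cpu.utilization metrics is still enabled in the config.
var disableCPUUtilizationGate = featuregate.GlobalRegistry().MustRegister(
	"receiver.kubeletstats.disableCPUUtilizationMetrics",
	featuregate.StageAlpha,
	featuregate.WithRegisterDescription("When enabled, the deprecated *.cpu.utilization metrics cannot be enabled."),
	featuregate.WithRegisterReferenceURL("https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/27885"),
)

// validateMetrics sketches the start-time check; cpuUtilizationEnabled would
// come from the receiver's metrics builder config.
func validateMetrics(cpuUtilizationEnabled bool) error {
	if disableCPUUtilizationGate.IsEnabled() && cpuUtilizationEnabled {
		return errors.New("*.cpu.utilization metrics are deprecated: enable the *.cpu.usage metrics instead, or disable the feature gate to keep the old behavior temporarily")
	}
	return nil
}
```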

@TylerHelmuth (Member, Author) commented Aug 13, 2024

Yes, that's what I'm thinking. If the feature gate is enabled and a utilization metric is enabled, the collector will fail to start.

@ChrsMark (Member)

Thanks @TylerHelmuth! I will be off for the following weeks, but I will pick it up once I'm back.

@ChrsMark (Member)

@TylerHelmuth @dmitryax I have tried the feature gate approach at #35139. Please have a look once you get the chance. You can find more details about the feature gate's plan in the PR's description.
