
fix(customresourcestate): handle nil path gracefully in StateSet metrics#2884

Open
Br1an67 wants to merge 1 commit into kubernetes:main from Br1an67:fix/issue-2482-crd-status-nil-handling

Conversation

@Br1an67 Br1an67 commented Mar 8, 2026

Fixes #2482

What this PR does / why we need it:

When a custom resource's status fields don't yet exist at resource creation time, StateSet metrics previously logged an error for each resource instance. This produced error spam such as:

registry_factory.go:685] "cr_test" err="[status,phase]: expected value for path to be string, got <nil>"

Status fields are not guaranteed to exist at resource creation, so this behavior was inconsistent with known types where nil values are handled gracefully (e.g., Gauge with NilIsZero).

This PR modifies compiledStateSet.values() to return empty results instead of an error when the path resolves to nil, consistent with how Gauge handles nil values.
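
The behavior described above can be illustrated with a small stand-alone sketch. This is not the kube-state-metrics implementation: `stateSetValues` is a hypothetical helper, `eachValue` is a simplified stand-in for the type of the same name in `pkg/customresourcestate`, and the 1/0 encoding of matching and non-matching states is an assumption made for illustration. Only the nil check and the error message format are taken from the diff in this PR.

```go
package main

import "fmt"

// eachValue is a simplified, hypothetical stand-in for the type of the
// same name in pkg/customresourcestate.
type eachValue struct {
	Labels map[string]string
	Value  float64
}

// stateSetValues is a hypothetical helper that mimics the nil handling
// described in this PR. resolved is whatever was found at the configured
// path; it is nil when the field does not exist yet.
func stateSetValues(path, labelName string, list []string, resolved interface{}) ([]eachValue, []error) {
	if resolved == nil {
		// Missing field: report "no data" rather than an error.
		return []eachValue{}, nil
	}
	current, ok := resolved.(string)
	if !ok {
		// A present but non-string value is still reported as an error.
		return []eachValue{}, []error{
			fmt.Errorf("%s: expected value for path to be string, got %T", path, resolved),
		}
	}
	out := make([]eachValue, 0, len(list))
	for _, state := range list {
		v := 0.0
		if state == current {
			v = 1 // assumption: the matching state is encoded as 1, all others as 0
		}
		out = append(out, eachValue{Labels: map[string]string{labelName: state}, Value: v})
	}
	return out, nil
}

func main() {
	states := []string{"Pending", "Running"}

	// Field not populated yet: no values, no errors.
	vals, errs := stateSetValues("[status,phase]", "phase", states, nil)
	fmt.Println(len(vals), len(errs))

	// Field present with the wrong type: one error, no values.
	vals, errs = stateSetValues("[spec,replicas]", "phase", states, float64(3))
	fmt.Println(len(vals), errs[0])

	// Field present as a string: one sample per configured state.
	vals, errs = stateSetValues("[status,phase]", "phase", states, "Running")
	fmt.Println(len(vals), len(errs), vals[1].Value)
}
```

In this sketch a missing field and a wrongly typed field stay distinguishable: only the former is silenced, so genuine misconfiguration would still surface in the logs.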

How does this change affect the cardinality of KSM: Does not change cardinality

Which issue(s) this PR fixes:
Fixes #2482

When CustomResourceDefinition status fields don't exist at resource
creation time, StateSet metrics would previously log an error for each
resource instance. This caused error spam for CRDs with many resources.

Now, when the path resolves to nil (field doesn't exist), StateSet
returns empty results instead of an error, consistent with how Gauge
handles nil values when NilIsZero is false.

linux-foundation-easycla bot commented Mar 8, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: Br1an67 / name: Br1an (24da992)


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Br1an67
Once this PR has been reviewed and has the lgtm label, please assign mrueg for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 8, 2026

This issue is currently awaiting triage.

If kube-state-metrics contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Instrumentation Mar 8, 2026
@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 8, 2026

mrueg commented Mar 9, 2026

@Br1an67 can you sign the CLA?

@Br1an67 Br1an67 force-pushed the fix/issue-2482-crd-status-nil-handling branch from 17ed36a to 24da992 Compare March 18, 2026 00:26
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Mar 18, 2026
@mrueg mrueg requested a review from Copilot March 18, 2026 22:51

Copilot AI left a comment

Pull request overview

Adjusts custom resource StateSet metric evaluation to avoid error spam when a configured path resolves to nil (common for CRD status fields during resource creation).

Changes:

  • Update compiledStateSet.values() to return no results/no error when the resolved value is nil.
  • Add a unit test covering the “StateSet nil path” behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
pkg/customresourcestate/registry_factory.go Treat nil resolved StateSet values as “no data” instead of an error.
pkg/customresourcestate/registry_factory_test.go Add regression test to ensure missing StateSet paths don’t produce errors.

if comparable == nil {
	return []eachValue{}, nil
}
return []eachValue{}, []error{fmt.Errorf("%s: expected value for path to be string, got %T", c.path, comparable)}
Comment on lines +336 to +351
{name: "stateset nil path", each: &compiledStateSet{
	compiledCommon: compiledCommon{
		path: mustCompilePath(t, "does", "not", "exist"),
	},
	LabelName: "phase",
	List:      []string{"foo", "bar"},
}, wantResult: []eachValue{}, wantErrors: nil},
{name: "stateset non-string value", each: &compiledStateSet{
	compiledCommon: compiledCommon{
		path: mustCompilePath(t, "spec", "replicas"),
	},
	LabelName: "phase",
	List:      []string{"1", "2"},
}, wantResult: []eachValue{}, wantErrors: []error{
	errors.New("[spec,replicas]: expected value for path to be string, got float64"),
}},

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

Status: Needs Triage

Development

Successfully merging this pull request may close these issues.

CustomResourceDefinitions status fields cause spam of errors that cannot be fixed

4 participants