[receiver/kubeletstats] Failing test: TestScraperWithNodeUtilization #33681

Closed
crobert-1 opened this issue Jun 20, 2024 · 4 comments

@crobert-1
Member

Component(s)

receiver/kubeletstats

Describe the issue you're reporting

Failing CI/CD action

Failure output:

=== RUN   TestScraperWithNodeUtilization
    scraper_test.go:139: 
        	Error Trace:	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/receiver/kubeletstatsreceiver/scraper_test.go:139
        	Error:      	Not equal: 
        	            	expected: 18
        	            	actual  : 0
        	Test:       	TestScraperWithNodeUtilization
--- FAIL: TestScraperWithNodeUtilization (0.00s)
@crobert-1 crobert-1 added the needs triage New item requiring triage label Jun 20, 2024
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@ChrsMark
Member

I suspect this is a timing issue: the node informer, which is backed by the fake k8s client, may not have been updated by the time the first scrape runs.

I filed #33685 so that the metrics assertion only runs after a first valid scrape. Hopefully this will cover the flakiness here.

@TylerHelmuth TylerHelmuth removed the needs triage New item requiring triage label Jun 21, 2024
TylerHelmuth added a commit that referenced this issue Jul 1, 2024
…hared Informer sync (#33685)

**Description:**

This PR introduces a "retry" check on the test that relies on the shared
node informer. This retry will ensure that the first valid scrape will
be after the informer has been "triggered" to handle the node's
addition.

**Link to tracking Issue:** Related to #33681.

**Testing:**

**Documentation:**

---------

Signed-off-by: ChrsMark <chrismarkou92@gmail.com>
Co-authored-by: Tyler Helmuth <12352919+TylerHelmuth@users.noreply.github.com>
@ChrsMark
Member

Hey @crobert-1, is there any way we can check whether this has actually been fixed?

@crobert-1
Member Author

I checked failures of the e2e-tests workflow on main, and we haven't seen this error since the PR was merged. I think we're good to close.

Thanks for fixing this, @ChrsMark!
