e2e/kind - kourier-tls tests are flakey #15052

Closed
dprotaso opened this issue Mar 27, 2024 · 7 comments · Fixed by knative-extensions/net-kourier#1236

@dprotaso
Member

In a lot of PRs I'm seeing that the kourier-tls tests are particularly flaky:

TestRevisionTimeout
TestRevisionTimeout/writes_first_byte_before_timeout

https://github.com/knative/serving/actions/runs/8458644026/job/23173569075?pr=14866
https://github.com/knative/serving/actions/runs/8455087778

TestDestroyPodInflight

https://github.com/knative/serving/actions/runs/8443066955/job/23126078591

TestSvcToSvcViaActivator/both-disabled (502 error)
https://github.com/knative/serving/actions/runs/8237583116/job/22526966476

@dprotaso
Member Author

/assign @ReToCode

@ReToCode
Member

ReToCode commented Mar 28, 2024

Just some notes:

  • The request goes to the activator. The activator gets a 500 error, but the preceding probe check was fine (so the SVC was up and running at probe time).
  • The request should take more than 15s, but somehow finishes well before that: 4859 ms.
  • Trying to get some more logs from the queue-proxy (QP).
  • It does not seem to be a problem when the test is run on its own (I ran it 100 times without errors).
revision_timeout_test.go:125: Probing to force at least one pod http://revision-timeout-writes-first-byte-before-pehcwozf.serving-tests.example.com/
    spoof.go:110: Spoofing revision-timeout-writes-first-byte-before-pehcwozf.serving-tests.example.com -> 172.18.255.1
    spoof.go:110: Spoofing revision-timeout-writes-first-byte-before-pehcwozf.serving-tests.example.com -> 172.18.255.1
revision_timeout_test.go:48: URL: http://revision-timeout-writes-first-byte-before-pehcwozf.serving-tests.example.com,/ initialSleep: 0s, sleep: 15s, request elapsed 4859 ms
    revision_timeout_test.go:140: Failed request with initialSleep 0s, sleep 15s, with revision timeout 10s, expecting status 200: failed roundtripping: response: <nil> did not pass checks: unexpected EOF
--- FAIL: TestRevisionTimeout/writes_first_byte_before_timeout (11.99s)
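The mismatch noted above (a 15s sleep, yet the request ended after 4859 ms) can be spotted mechanically when scanning many CI logs. The following is only a minimal sketch: the helper name `endedEarly` and the regular expression are hypothetical and merely assume the `sleep: …, request elapsed … ms` wording seen in the log line above; they are not part of the Knative test suite.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"time"
)

// elapsedRe assumes the log wording shown above:
// "... sleep: 15s, request elapsed 4859 ms".
var elapsedRe = regexp.MustCompile(`sleep: (\S+), request elapsed (\d+) ms`)

// endedEarly (hypothetical helper) reports whether the request finished
// before the configured sleep duration had passed.
func endedEarly(line string) (bool, error) {
	m := elapsedRe.FindStringSubmatch(line)
	if m == nil {
		return false, fmt.Errorf("no timing info in line")
	}
	sleep, err := time.ParseDuration(m[1])
	if err != nil {
		return false, err
	}
	ms, err := strconv.Atoi(m[2])
	if err != nil {
		return false, err
	}
	return time.Duration(ms)*time.Millisecond < sleep, nil
}

func main() {
	// Timing portion of the log line quoted in this issue.
	line := "initialSleep: 0s, sleep: 15s, request elapsed 4859 ms"
	early, err := endedEarly(line)
	if err != nil {
		panic(err)
	}
	fmt.Println("request ended before sleep completed:", early)
}
```

A request that ends before its own sleep has elapsed points to the connection being cut somewhere on the path, rather than to the revision timeout itself.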

Ah, this could be it:

3scale-kourier-gateway-76c5cfb4ff-28shk   1/1     Running   1 (37s ago)   9m2s
3scale-kourier-gateway-76c5cfb4ff-4264r   1/1     Running   1 (34s ago)   9m2s
3scale-kourier-gateway-76c5cfb4ff-87bhc   1/1     Running   1 (39s ago)   9m2s
3scale-kourier-gateway-76c5cfb4ff-dtdr4   1/1     Running   1 (38s ago)   9m2s
3scale-kourier-gateway-76c5cfb4ff-jqqhv   1/1     Running   1 (41s ago)   9m2s
3scale-kourier-gateway-76c5cfb4ff-z797x   1/1     Running   1 (39s ago)   9m2s
      Last State:     Terminated
        Reason:       OOMKilled
        Exit Code:    137
        Started:      Wed, 27 Mar 2024 21:16:11 +0000
        Finished:     Wed, 27 Mar 2024 21:24:32 +0000
      Ready:          True
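To check for this across all gateway pods without reading `kubectl describe` output by hand, something along these lines could work. It is a sketch only: it decodes the standard `status.containerStatuses[].lastState.terminated` fields from `kubectl get pods -o json`, while the function name `oomKilled`, the embedded sample document, and the container name `kourier-gateway` are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podList mirrors only the fields of `kubectl get pods -o json` used here.
type podList struct {
	Items []struct {
		Metadata struct {
			Name string `json:"name"`
		} `json:"metadata"`
		Status struct {
			ContainerStatuses []struct {
				Name         string `json:"name"`
				RestartCount int    `json:"restartCount"`
				LastState    struct {
					Terminated *struct {
						Reason   string `json:"reason"`
						ExitCode int    `json:"exitCode"`
					} `json:"terminated"`
				} `json:"lastState"`
			} `json:"containerStatuses"`
		} `json:"status"`
	} `json:"items"`
}

// sample is illustrative data, not captured output. The container name is assumed.
const sample = `{"items":[
 {"metadata":{"name":"3scale-kourier-gateway-76c5cfb4ff-28shk"},
  "status":{"containerStatuses":[{"name":"kourier-gateway","restartCount":1,
   "lastState":{"terminated":{"reason":"OOMKilled","exitCode":137}}}]}},
 {"metadata":{"name":"healthy-pod"},
  "status":{"containerStatuses":[{"name":"app","restartCount":0,"lastState":{}}]}}
]}`

// oomKilled (hypothetical helper) lists every container whose previous run
// was terminated with reason OOMKilled.
func oomKilled(raw []byte) ([]string, error) {
	var pods podList
	if err := json.Unmarshal(raw, &pods); err != nil {
		return nil, err
	}
	var hits []string
	for _, p := range pods.Items {
		for _, c := range p.Status.ContainerStatuses {
			if t := c.LastState.Terminated; t != nil && t.Reason == "OOMKilled" {
				hits = append(hits, fmt.Sprintf("%s/%s (exit %d, restarts %d)",
					p.Metadata.Name, c.Name, t.ExitCode, c.RestartCount))
			}
		}
	}
	return hits, nil
}

func main() {
	hits, err := oomKilled([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, h := range hits {
		fmt.Println(h)
	}
}
```

Exit code 137 is 128 + 9, meaning the container was ended by SIGKILL, which is what the kernel OOM killer sends. In practice the sample constant would be replaced by the JSON read from the cluster.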

@ReToCode
Member

Testing: #15060

@ReToCode
Member

ReToCode commented Apr 3, 2024

Should be fine once the automated update from kourier lands in Serving.

@dprotaso
Member Author

dprotaso commented Apr 3, 2024

Curious what about TLS makes it consume more memory?

@ReToCode
Member

ReToCode commented Apr 4, 2024

Each Knative Service now has its own certificate that gets put into Envoy's config. I assume that is the cause (500m was already pretty much on the edge before). We'll run some more downstream testing in our QE department to verify the difference between full TLS on and off.

@dprotaso
Member Author

dprotaso commented Apr 4, 2024

Changes merged in #15097, so I'm going to close this out. Thanks for the turnaround, @ReToCode.

@dprotaso dprotaso closed this as completed Apr 4, 2024