mongodb petsets extended tests: dns record of replica cannot be resolved #12588
Comments
Hm. The pods aren't supposed to report ready until they can at least resolve their own hostname. I'd think that once they can do that, their hostname could be resolved by other pods too. Regardless, it may mean we need more robust startup logic that waits for the DNS name to be resolvable.
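The kind of startup wait described here could be sketched as a small retry loop around a DNS lookup. This is only an illustration, not the logic the mongodb image actually uses: the function name `wait_for_dns`, its parameters, and the default timeout are assumptions, and the `resolve`, `sleep`, and `clock` arguments exist only so the loop can be exercised without real DNS.

```python
import socket
import time


def wait_for_dns(hostname, timeout=60.0, interval=1.0,
                 resolve=socket.gethostbyname, sleep=time.sleep,
                 clock=time.monotonic):
    """Block until `hostname` resolves, or raise TimeoutError.

    Hypothetical helper: the name and defaults are illustrative only.
    """
    deadline = clock() + timeout
    while True:
        try:
            return resolve(hostname)  # the resolved IPv4 address
        except OSError:
            # socket.gaierror is a subclass of OSError; treat any lookup
            # failure as "record not published yet" and retry.
            if clock() >= deadline:
                raise TimeoutError(
                    f"{hostname} did not resolve within {timeout}s")
            sleep(interval)
```

A pod could call this on its own hostname before starting replication setup, and again on a peer's hostname before trying to register with it.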
Seen another failure where the test seems to query the mongodb cluster before it is ready: https://ci.openshift.redhat.com/jenkins/job/origin_extended_image_tests/828/consoleText We currently only wait before the mongodb pods are
I think adding a readiness probe will cause problems with the replica init, since the pods contact each other before they are "up/ready". There is the tolerate-unready annotation, but then I think you're back to the original problem. Solving this may require fixing the test to explicitly wait for a condition.
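One condition the test could wait for is the replica set reporting a healthy topology, rather than the pods merely existing. A minimal sketch, assuming the test has some way of fetching the output of MongoDB's `replSetGetStatus` command (`rs.status()` in the shell) as a dictionary; the function name `replica_set_ready` and the `expected_members` parameter are hypothetical.

```python
def replica_set_ready(status, expected_members):
    """Return True when the replica set looks fully formed.

    `status` is assumed to be the document returned by replSetGetStatus,
    whose `members` entries carry a `stateStr` such as "PRIMARY" or
    "SECONDARY".
    """
    members = status.get("members", [])
    if len(members) != expected_members:
        return False
    states = [m.get("stateStr") for m in members]
    # Exactly one primary, and every other member a secondary.
    return (states.count("PRIMARY") == 1
            and states.count("SECONDARY") == expected_members - 1)
```

The test would evaluate this repeatedly, up to some deadline, before reading records from a secondary.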
I think https://ci.openshift.redhat.com/jenkins/job/origin_extended_image_tests/837/consoleFull is another example of this issue. Pod 1 starts at 9.50.34.542, attempts to register against pod 0 at 9.50.34.764, gives up and goes home at 9.50.35.372 because 0 can't find 1 in the DNS. Almost certainly that's not giving enough time for k8s' DNS to settle.
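To put numbers on that window, the three timestamps can be subtracted directly. The snippet assumes they read as hours.minutes.seconds.milliseconds, which is a guess at the log format; `to_ms` is a throwaway helper.

```python
def to_ms(stamp):
    """Convert 'H.M.S.mmm' (assumed format) to milliseconds since midnight."""
    hours, minutes, seconds, millis = (int(part) for part in stamp.split("."))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


started = to_ms("9.50.34.542")     # pod 1 starts
registered = to_ms("9.50.34.764")  # pod 1 attempts to register against pod 0
gave_up = to_ms("9.50.35.372")     # pod 1 gives up

print(gave_up - registered)  # 608 ms between the attempt and giving up
print(gave_up - started)     # 830 ms from pod start to giving up
```

Under that reading, the DNS record for pod 1 had well under a second to become resolvable.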
@bparees is this a case of sending a PR to https://github.com/sclorg/mongodb-container or opening an issue there, or something else?
@jim-minter my read of the logs is that this pod:
could not find itself:
Which I had hoped to have fixed earlier by forcing mongo to connect to itself via DNS before considering itself started and beginning the process of setting up replication.
That said, both of your suggestions sound reasonable; if you are in a position to do so, you can just submit the PR and tag me.
Will do.
Seen here: https://ci.openshift.redhat.com/jenkins/job/origin_extended_image_tests/825/testReport/junit/(root)/Extended/_image_ecosystem__mongodb__Slow__openshift_mongodb_replication__with_petset__creating_from_a_template_should_process_and_create_the__https___raw_githubusercontent_com_sclorg_mongodb_container_master_examples_petset_mongodb_petset_persistent_yaml__template/
The test fails on reading a record on the second replica, mongodb-replicaset-1. This, in turn, seems to be caused by the first replica, mongodb-replicaset-0, not being able to resolve mongodb-replicaset-1's hostname.
Can't seem to reproduce this locally.