Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kuttl test might try to assert the configuration in an old keystone pod #230

Open
gibizer opened this issue Apr 12, 2023 · 2 comments
Open

Comments

@gibizer
Copy link
Contributor

gibizer commented Apr 12, 2023

During the reconfiguration test case the test code changes the config of the keystone service via the CRD then gets the keystone pods with oc get pods -n openstack -l service=keystone -o name Due to the reconfiguration the keystone service pod is restarted. So there is a chance that the still terminating old Pod is returned by oc get pods -n openstack -l service=keystone -o name see below in the logs.

    logger.go:42: 19:01:09 | change_keystone_config/2-patch_keystone_deploy | test step completed 2-patch_keystone_deploy
    logger.go:42: 19:01:09 | change_keystone_config/3- | starting test step 3-
I0412 19:01:10.681551  822864 request.go:655] Throttling request took 1.049088433s, request: GET:https://api.crc.testing:6443/apis/authorization.openshift.io/v1?timeout=32s
    logger.go:42: 19:01:12 | change_keystone_config/3- | running command: [sh -c $KEYSTONE_KUTTL_DIR/../common/scripts/check_debug_in_keystone_pod_logs.sh]
    logger.go:42: 19:01:12 | change_keystone_config/3- | + found=0
    logger.go:42: 19:01:12 | change_keystone_config/3- | + not_found=1
    logger.go:42: 19:01:12 | change_keystone_config/3- | + [  = --reverse ]
    logger.go:42: 19:01:12 | change_keystone_config/3- | + oc get pods -n openstack -l service=keystone -o name
    logger.go:42: 19:01:13 | change_keystone_config/3- | + pod=pod/keystone-65f4f767d9-7flm6
    logger.go:42: 19:01:13 | change_keystone_config/3- | + oc logs -n openstack pod/keystone-65f4f767d9-7flm6
    logger.go:42: 19:01:13 | change_keystone_config/3- | + grep DEBUG
    logger.go:42: 19:01:13 | change_keystone_config/3- | + head -n 1
    logger.go:42: 19:01:13 | change_keystone_config/3- | Defaulted container "keystone-api" out of: keystone-api, init (init)
    logger.go:42: 19:01:13 | change_keystone_config/3- | + debug=[Wed Apr 12 17:01:05.827833 2023] [wsgi:error] [pid 25:tid 53] 2023-04-12 17:01:05.827 25 DEBUG mod_wsgi [-] ******************************************************************************** log_opt_values /usr/lib/python3.9/site-packages/oslo_config/cfg.py:2589\x1b[00m
    logger.go:42: 19:01:13 | change_keystone_config/3- | + [ -n [Wed Apr 12 17:01:05.827833 2023] [wsgi:error] [pid 25:tid 53] 2023-04-12 17:01:05.827 25 DEBUG mod_wsgi [-] ******************************************************************************** log_opt_values /usr/lib/python3.9/site-packages/oslo_config/cfg.py:2589\x1b[00m ]
    logger.go:42: 19:01:13 | change_keystone_config/3- | + exit 0
    logger.go:42: 19:01:13 | change_keystone_config/3- | test step completed 3-
    logger.go:42: 19:01:13 | change_keystone_config/4-change_keystone_config | starting test step 4-change_keystone_config
    logger.go:42: 19:01:13 | change_keystone_config/4-change_keystone_config | running command: [sh -c oc patch keystoneapi -n openstack keystone --type='json' -p='[{"op": "replace", "path": "/spec/customServiceConfig", "value": "[DEFAULT]\ndebug = false"}]'
        ]
    logger.go:42: 19:01:13 | change_keystone_config/4-change_keystone_config | keystoneapi.keystone.openstack.org/keystone patched
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | running command: [sh -c $KEYSTONE_KUTTL_DIR/../common/scripts/check_debug_in_keystone_pod_logs.sh --reverse]
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + found=0
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + not_found=1
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + [ --reverse = --reverse ]
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + found=1
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + not_found=0
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + oc get pods -n openstack -l service=keystone -o name
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + pod=pod/keystone-65f4f767d9-7flm6
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | pod/keystone-75d7c46d99-7jmqr
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + oc logs -n openstack pod/keystone-65f4f767d9-7flm6
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | pod/keystone-75d7c46d99-7jmqr
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + grep DEBUG
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + head -n 1
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | error: arguments in resource/name form may not have more than one slash
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + debug=
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + [ -n  ]
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | + exit 0
    logger.go:42: 19:01:16 | change_keystone_config/4-change_keystone_config | test step completed 4-change_keystone_config
@gibizer
Copy link
Contributor Author

gibizer commented Apr 12, 2023

So this is probably a race condition between the test changing the CRD and the operator + k8s reacting to that change and replacing the Pod. Eventually the only running pod will be the one with the new configuration. But the cluster might need time to get there. More time than what the kuttl test takes to assert that the reconfiguration take effect on the Pod. So in theory the above code can lead to the case where the kuttl test fails the assert as it still sees the old Pod only as the cluster haven't reacted to the CRD changes yet.

@gibizer
Copy link
Contributor Author

gibizer commented Apr 12, 2023

In this particular case the race lead to false positive assert. The oc logs command returned an error due to multiple pod names are passed to it, but the assert script exited with 0 status.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant