[Tooling] create_and_set_aggregate self.aggregates_client.add_host is Failing on TripleO Train #2

Open
Yarboa opened this issue Apr 14, 2021 · 8 comments

Comments

@Yarboa
Contributor

Yarboa commented Apr 14, 2021

Director CI is failing on 16.2 in the following tests:
nfv_tempest_plugin.tests.scenario.test_nfv_advanced_usecases.TestAdvancedScenarios

The tests fail on the hypervisor hostname suffix when adding hosts to an aggregate:
packages/nfv_tempest_plugin/tests/scenario/baremetal_manager.py", line 269, in create_and_set_aggregate
    self.aggregates_client.add_host(aggr['aggregate']['id'], host=host)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/services/compute/aggregates_client.py", line 107, in add_host
    post_body)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 300, in post
    return self.request('POST', url, extra_headers, headers, body, chunked)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/services/compute/base_compute_client.py", line 48, in request
    method, url, extra_headers, headers, body, chunked)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 704, in request
    self._error_checker(resp, resp_body)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 810, in _error_checker
    raise exceptions.NotFound(resp_body, resp=resp)
tempest.lib.exceptions.NotFound: Object not found
Details: {'code': 404, 'message': 'Compute host computeovsdpdksriov-1.novalocal could not be found.'}

@Yarboa Yarboa changed the title create_and_set_aggregate self.aggregates_client.add_host is Failing on TripleO Train [Tooling] create_and_set_aggregate self.aggregates_client.add_host is Failing on TripleO Train Apr 14, 2021
@eshulman2
Collaborator

eshulman2 commented Apr 14, 2021

We can just override the "dhcp_domain" parameter in our deployment with an empty string. Alternatively, we can create a parsing function that looks like this:

def parsing(full_name):
    return full_name.split('.')[0]
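
A slightly extended sketch of the same idea, in case stripping the domain alone is not enough: match the name reported for the hypervisor against the host names nova actually registered, comparing short names only. The function names and the ".localdomain" sample host below are hypothetical and only for illustration; this is not code from the plugin.

```python
def short_hostname(full_name):
    """Return the host part of a name, e.g. 'compute-0.novalocal' -> 'compute-0'."""
    return full_name.split('.')[0]


def match_service_host(hypervisor_hostname, service_hosts):
    """Find the registered host whose short name matches the hypervisor's.

    Returns the matching entry from service_hosts, or None if none matches.
    """
    wanted = short_hostname(hypervisor_hostname)
    for host in service_hosts:
        if short_hostname(host) == wanted:
            return host
    return None


# Assumed sample data: the hypervisor name from the traceback and a
# hypothetical registered host that differs only in its domain suffix.
registered = ['computeovsdpdksriov-1.localdomain']
host = match_service_host('computeovsdpdksriov-1.novalocal', registered)
```

With that sample data, host holds the registered name rather than the ".novalocal" one, which is the value that would then be passed to add_host.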

@MaxBab
Collaborator

MaxBab commented Apr 14, 2021

It's an OpenStack bug related to the naming of the hypervisor hosts.
https://bugzilla.redhat.com/show_bug.cgi?id=1949385

@SeanMooney

That is incorrect.

It's not a compute/nova bug; you have not configured ooo correctly.
novalocal is the default DNS domain used by nova,
while ooo uses localdomain as the default cloud domain.

You need to align the two.

https://opendev.org/openstack/tripleo-heat-templates/src/commit/d58efb58e0c39b2ca1585d87fe6d542484b33ad0/overcloud.j2.yaml#L139-L144

I don't think this is an OpenStack bug, and it's definitely not a nova bug, but it might be an ooo one.

@MaxBab
Collaborator

MaxBab commented Apr 14, 2021

Hi @SeanMooney

We never configured the "CloudDomain" parameter in our deployments, so it is always taken from the defaults.
If it is taken from the default, the same default value should be applied across the whole deployed environment.

As I mentioned in the BZ - https://bugzilla.redhat.com/show_bug.cgi?id=1949385
The output of "openstack hypervisor list" and the controller nova-scheduler logs differ.
The suffix is "novalocal" in one and "localdomain" in the other.

We are using exactly the same THT for the 16.1 and 16.2 deployments.
The issue popped up in 16.2. In 16.1 everything works.

So I still think it's a bug.
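
The suffix difference described above can be checked mechanically. The following is only a rough sketch, assuming the host names have already been collected into plain lists (for example from the hypervisor list and from the scheduler logs); the function names and sample host names are hypothetical.

```python
from collections import defaultdict


def domains_by_short_name(*hostname_lists):
    """Map each short host name to the set of domain suffixes seen for it."""
    seen = defaultdict(set)
    for hostnames in hostname_lists:
        for name in hostnames:
            # partition() leaves domain empty when the name has no dot.
            short, _, domain = name.partition('.')
            seen[short].add(domain)
    return dict(seen)


def mismatched_hosts(*hostname_lists):
    """Return the short names that appear with more than one domain suffix."""
    seen = domains_by_short_name(*hostname_lists)
    return sorted(short for short, domains in seen.items() if len(domains) > 1)


# Assumed sample input, not taken from a real deployment.
from_hypervisor_list = ['computeovsdpdksriov-1.novalocal']
from_scheduler_logs = ['computeovsdpdksriov-1.localdomain',
                       'controller-0.localdomain']
suspects = mismatched_hosts(from_hypervisor_list, from_scheduler_logs)
```

Any name that ends up in suspects is known to the two sources under different domains.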

@SeanMooney

Yep, I just responded to the BZ.

"openstack hypervisor list" is not the correct command to use; you should be using
"openstack compute service list --service nova-compute"
to get the host to add.

Normally ooo configures the hostname on the overcloud hosts to be the same as the value it puts in the nova.conf host field;
however, that behavior has been broken a few times.
I would guess that the way the /etc/hosts and /etc/hostname files are generated on the overcloud host has changed in some way.

In any case, this is a problem in the deployment tooling and in the API that is being used, not in nova.
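
For the test code, taking the host from the compute service list could look roughly like the sketch below. It assumes services_body is the already-parsed body of a compute "list services" response, i.e. a dict with a "services" list whose entries carry "binary", "host", "status" and "state" keys; the function name and the sample data are hypothetical, and how the body is fetched (for example through a tempest services client) is left to the caller.

```python
def compute_service_hosts(services_body, only_enabled_up=False):
    """Return the host names of nova-compute services from a list-services body.

    If only_enabled_up is True, skip services that are disabled or down.
    """
    hosts = []
    for svc in services_body.get('services', []):
        if svc.get('binary') != 'nova-compute':
            continue
        if only_enabled_up and (svc.get('status') != 'enabled'
                                or svc.get('state') != 'up'):
            continue
        hosts.append(svc['host'])
    return hosts


# Assumed sample response body, for illustration only.
sample_body = {'services': [
    {'binary': 'nova-scheduler', 'host': 'controller-0.localdomain',
     'status': 'enabled', 'state': 'up'},
    {'binary': 'nova-compute', 'host': 'computeovsdpdksriov-0.localdomain',
     'status': 'enabled', 'state': 'up'},
    {'binary': 'nova-compute', 'host': 'computeovsdpdksriov-1.localdomain',
     'status': 'disabled', 'state': 'down'},
]}
all_hosts = compute_service_hosts(sample_body)
usable_hosts = compute_service_hosts(sample_body, only_enabled_up=True)
```

The names returned here are the ones nova registered for its compute services, so they are the values to pass to add_host.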

@MaxBab
Collaborator

MaxBab commented Apr 14, 2021

@SeanMooney

OK, so you are saying that the approach of taking the compute host details from the hypervisor list is incorrect, and that they should be taken from the "compute service list" instead.

Let's continue the discussion in the BZ to avoid duplicates.

Thanks.

@SeanMooney

Actually, I think there are also duplicate BZs:
https://bugzilla.redhat.com/show_bug.cgi?id=1949385

I need to look at that one too, but sure, let's move this to Bugzilla.

It really does look like a regression of the compute hostname again.
This has been broken before...

@Yarboa
Contributor Author

Yarboa commented Jun 1, 2021

Thanks @SeanMooney, we certainly need to do that.
