Skip to content
This repository has been archived by the owner on Sep 12, 2022. It is now read-only.

Fix unshelved instance failing to get fixed ip #599

Merged
merged 1 commit into from
Apr 2, 2018

Conversation

cdosborn
Copy link
Contributor

@cdosborn cdosborn commented Mar 29, 2018

Description

Problem

On unshelve instances would sometimes fail to get a fixed ip

Solution

Rather than update ip on current port, delete all ports and create a new one

I changed the name of add_fixed_ip, because we are taking a quite different approach here. We delete all an instance's ports, and then instruct nova to attach the instance to a subnet. Nova takes care of creating a port.

This fixes an outstanding issue in nova[0]. Prior to this change we kept ports around and then removed/added fixed ips. However, when an instance and its port are created, they each have an availability zone(AZ), when we unshelve an instance its AZ may change. If it does change, openstack fails to update the device owner AZ of the port and attaching a fixed ip fails because no ports exist in the AZ.

https://bugs.launchpad.net/nova/+bug/1759924 [0]

Checklist before merging Pull Requests

  • Reviewed and approved by at least one other contributor.

@cdosborn cdosborn self-assigned this Mar 29, 2018
@cdosborn cdosborn changed the base branch from master to v31 March 29, 2018 21:09
celery_logger.exception("Neutron did not accept request - %s."
% bad_request.message)
if 'no fixed ip' in bad_request.message.lower():
fixed_ip = add_fixed_ip(driverCls, provider, identity,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only necessary change was to replace this call to add_fixed_ip, I couldn't overlook how incredibly hacky this all was, and got rid of it.

This was probably a failed attempt to fix the bug fixed in this pr.

@cdosborn
Copy link
Contributor Author

I tested this against an unshelved instance that was in the wrong availability zone on the IU provider, which is where the bug was first observed.

@cdosborn cdosborn force-pushed the fix-fail-to-add-ip-on-unshelve branch from cb38f2b to 4d6b413 Compare March 29, 2018 21:35
@c-mart
Copy link
Contributor

c-mart commented Mar 29, 2018

I like the approach here iff we're continuing to remove fixed IP addresses from instances when we shelve (and suspend, etc.) them.

@coveralls
Copy link

coveralls commented Mar 29, 2018

Coverage Status

Coverage increased (+0.02%) to 36.969% when pulling af002eb on cdosborn:fix-fail-to-add-ip-on-unshelve into 42ef4f0 on cyverse:v31.

@cdosborn cdosborn force-pushed the fix-fail-to-add-ip-on-unshelve branch from 4d6b413 to 5130640 Compare March 29, 2018 22:50
I changed the name of add_fixed_ip, because we are taking a quite different
approach here. We delete all an instance's ports, and then instruct nova to
attach the instance to a subnet. Nova takes care of creating a port.

This fixes an outstanding issue in nova[0]. Our original approach is to keep
ports around and then just update the fixed ips. However, when an instance and
its port are created, they each have an availability zone, when we unshelve an
instance its AZ may change. If it does change, openstack fails to update the
device owner AZ of the port and attaching a fixed ip fails because no ports
exist in the AZ.

We also used to exit from add_fixed_ip if the instance already had a fixed ip.
However, we can no longer do this. Instances affected by this shelving bug,
will never have their fixed ips removed, so if we exited early. Instances that
need a new port (on the new AZ) would not receive a new port.

https://bugs.launchpad.net/nova/+bug/1759924[0]
@cdosborn
Copy link
Contributor Author

We are still removing the fixed ips. However, instances affected by the nova bug will actually fail to remove the fixed ips, and restore_port compensates for that.

@cdosborn cdosborn force-pushed the fix-fail-to-add-ip-on-unshelve branch from 5130640 to af002eb Compare March 29, 2018 22:59
@cdosborn
Copy link
Contributor Author

cdosborn commented Mar 29, 2018

My base branch was off, there is only a single commit.

@cdosborn cdosborn merged commit 3cfb237 into cyverse:v31 Apr 2, 2018
@cdosborn cdosborn mentioned this pull request Apr 3, 2018
1 task
@julianpistorius julianpistorius added this to the v32 milestone Apr 6, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants