Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Contiv network is not deleted when docker network is deleted #1029

Merged
merged 8 commits into from
Oct 31, 2017

Conversation

kahou82
Copy link
Member

@kahou82 kahou82 commented Oct 22, 2017

Steps to reproduce:

  1. Create a docker network with docker network CLI
  2. Ensure docker network and contiv network are created (docker network ls and netctl network ls)
  3. Remove docker network.
  4. Check docker network ls, docker network is deleted
  5. Check netctl networ ls, contiv network is still test.

The root cause is due to the dockernetworkstate mapping is planned to delete after
contiv network is delete. However, the determination of contiv network deletion is
determined by the existance of the dockernetworkstate. Thefore, if the state is still
there, the contiv network will be unable to delete.

The fix is to delete the dockernetworkstate before proceed to contiv network deletion

Also refactor docker plugin update by separating the plugin and ipam driver

Signed-off-by: Kahou Lei kalei@cisco.com

Steps to reproduce:
1. Create a docker network with docker network CLI
2. Ensure docker network and contiv network are created (docker network ls and netctl network ls)
3. Remove docker network.
4. Check docker network ls, docker network is deleted
5. Check netctl networ ls, contiv network is still test.

The root cause is due to the dockernetworkstate mapping is planned to delete after
contiv network is delete. However, the determination of contiv network deletion is
determined by the existance of the dockernetworkstate. Thefore, if the state is still
there, the contiv network will be unable to delete.

The fix is to delete the dockernetworkstate before proceed to contiv network deletion

Signed-off-by: Kahou Lei kalei@cisco.com
@kahou82
Copy link
Member Author

kahou82 commented Oct 22, 2017

DO NOT MERGE: I am writing unit test.

// delete the dnet oper state
err = docknet.DeleteDockNetState(dnet.TenantName, dnet.NetworkName, dnet.ServiceName)
if err != nil {
log.Errorf("Couldn't delete docknet for nwID %s", networkID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Log the error here to aid with debugging, please:

log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err)

@rchirakk
Copy link
Contributor

rchirakk commented Oct 23, 2017

PTAL @gkvijay ^^

@kahou82
Copy link
Member Author

kahou82 commented Oct 24, 2017

build PR

@kahou82
Copy link
Member Author

kahou82 commented Oct 24, 2017

build PR

@kahou82
Copy link
Member Author

kahou82 commented Oct 24, 2017

build PR

log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err.Error())
}
log.Infof("Deleted docker network mapping for %v", networkID)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we be failing if we get err in either of these two cases? under what conditions can docker and contiv networks be out of sync (when there is not a bug) . . . err==nil or not seems too generic, can we check for "not present" vs any other err e.g. "error talking to docker"?

}
log.Infof("Deleted docker network mapping for %v", networkID)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we deleted docker network, found a contiv netowrk, but failed to delete the contiv netowrk, should we restore the docker network? (i dunno, just asking) . . L771

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no hook for me to do that at the moment to restore the network.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think we must secretly revert what the user does (docker network rm in this case). If contiv network deletion fails, I think an error to the user is good instead of reverting what the user did.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

log.Infof("docker v2plugin name (%s) updated to %s", netDriverName, pluginName)
netDriverName = pluginName
ipamDriverName = pluginName
func UpdatePluginName(netdriver string, ipamDriver string) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netDriver?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Surprised go vet didn't complain about that

log.Infof("docker v2plugin (%s) updated to %s and ipam (%s) updated to %s",
netDriverName, netdriver, ipamDriverName, ipamDriver)
netDriverName = netdriver
ipamDriverName = ipamDriver
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change is missing from the commit description

if delErr != nil {
t.Fatalf("Unable to delete docker network. Error: %v", delErr)
t.Fail()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we check for the network to be absent from both docker and netctl here?

func TestUpdatePluginName(t *testing.T) {
expectNetDriver := "bridge"
expectIPAMDriver := "default"
UpdatePluginName("bridge", "default")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no docknet prefix?

expectIPAMDriver := "default"
UpdatePluginName("bridge", "default")

if expectNetDriver != netDriverName {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is netDriverName in scope?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this Go test file is in the docknet package (see top of file), everything in the docknet package is available here.

This var is defined in docknet.go here: https://github.com/kahou82/netplugin/blob/37a3b829c00226b8af742f80e6c46889c20fd4d2/netmaster/docknet/docknet.go#L44

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

t.Fail()
}

if expectIPAMDriver != ipamDriverName {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor

@chrisplo chrisplo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my first go code review, maybe my concerns are invalid

@kahou82
Copy link
Member Author

kahou82 commented Oct 28, 2017

build PR

1 similar comment
@kahou82
Copy link
Member Author

kahou82 commented Oct 28, 2017

build PR

@kahou82
Copy link
Member Author

kahou82 commented Oct 29, 2017

@chrisplo can you review? It is one of the CVD blockers.

.idea/vcs.xml Outdated
@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8"?>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please git rm this file (or the whole directory). It shouldn't be included in git. You could add it to .gitignore so it's not tracked

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

instInfo := core.InstanceInfo{DbURL: "etcd://127.0.0.1:2379"}
func initFakeStateDriver() {
instInfo := core.InstanceInfo{}
d, _ := utils.NewStateDriver("fakedriver", &instInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check the error and assert that it is nil

(you never know!)

initFakeStateDriver()

// Delete Docker network by using helper
delErr := deleteNetworkHelper("non_existing_network_id")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors can just be named "err" unless you legitimately need multiple error variables in the same scope at the same time

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// delete the dnet oper state
err = docknet.DeleteDockNetState(dnet.TenantName, dnet.NetworkName, dnet.ServiceName)
if err != nil {
log.Errorf("Couldn't delete docknet for nwID %s: %s", networkID, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I would do something like:

msg := fmt.Sprintf("my %s", "error message")
log.Error(msg)
return errors.New(msg)

Return as much useful info to the caller as possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}
log.Infof("Deleted docker network mapping for %v", networkID)
} else {
log.Errorf("Couldn't find Docker network %s: %s", networkID, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// initStateDriver initialize etcd state driver
func initFakeStateDriver() {
instInfo := core.InstanceInfo{}
d, _ := utils.NewStateDriver("fakedriver", &instInfo)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check error here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@kahou82
Copy link
Member Author

kahou82 commented Oct 29, 2017

build PR

1 similar comment
@kahou82
Copy link
Member Author

kahou82 commented Oct 29, 2017

build PR

Copy link
Contributor

@rchirakk rchirakk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM other than minor comments

if err != nil {
msg := fmt.Sprintf("Could not delete docknet for nwID %s: %s", networkID, err.Error())
log.Errorf(msg)
return errors.New(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another way to simplify,

err := fmt.Errorf()
log.Error(err)
return err

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to stick with the current implementation

}
log.Infof("Deleted docker network mapping for %v", networkID)
} else {
msg := fmt.Sprintf("Could not find Docker network %s: %s", networkID, err.Error())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt package takes care of calling Error()
fmt.Sprintf("xxx %s", err)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going to stick with the current implementation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i understand now that due to other bad behavior (not erroring in v2plugin mode when requesting to create the network) that we can get out of sync, I'm just looking for a ticket to block the netctl when in v2plugin mode, then we can clean up this lax behavior here

} else {
msg := fmt.Sprintf("Could not find Docker network %s: %s", networkID, err.Error())
log.Errorf(msg)
return errors.New(msg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't return, log error message & proceed to delete network.

Copy link
Member

@vhosakot vhosakot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

if err != nil {
t.Fatalf("Unable to delete docker network. Error: %v", err)
t.Fail()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

checking the return is non-error is good, but that didn't actually test that the network was deleted, can you check w/ docker/netctl for current status?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Take out return statement when docker network is not there.
@kahou82
Copy link
Member Author

kahou82 commented Oct 30, 2017

build PR

return errors.New("Failed to delete network")
msg := fmt.Sprintf("Failed to delete network: %s", err.Error())
log.Errorf(msg)
return errors.New(msg)
}
log.Infof("Deleted contiv network %v", networkID)
} else {
log.Infof("Could not find contiv network %v", networkID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should still be bubbling up an error here (like bad request to user for deleting a network that didn't exist), but happy to see this as a new ticket

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, I don't think return error here will bubbling up to the docker network create command. We have to have another ticket to investigate it.

@chrisplo chrisplo dismissed their stale review October 30, 2017 20:35

concerns addressed or discussed and understood

@kahou82
Copy link
Member Author

kahou82 commented Oct 30, 2017

build PR

@kahou82
Copy link
Member Author

kahou82 commented Oct 31, 2017

Going to merge as @chrisplo and @rchirakk +1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants