
Replace docker commands with docker sdk #12

Merged: 10 commits into development from use-docker-sdk on Apr 26, 2019

Conversation

@iwilltry42 (Member) commented Apr 15, 2019

Hey,

I started to replace calls to the docker executable with calls to the Docker API, using the Docker client SDK.

The only point where it wasn't really straightforward was the `docker cp` command, since the docker CLI does some extensive checks there...

What do you think about this?

Benefits:

  • more flexibility
  • no dependency on docker (or at some point docker-compose) executables
  • no parsing of command outputs to gather information

Drawbacks:

  • more code
  • a bit more code complexity

Why?

This will enable us to spawn sets of worker nodes (as requested in an issue), without having to rely on the docker-compose executable.
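
To make the comparison concrete, here is a minimal sketch of what the SDK-based approach looks like, assuming the standard Docker Go SDK (github.com/docker/docker/client); this is illustrative and not code from this PR. Instead of shelling out to `docker ps` and parsing text, the client returns typed structs:

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/docker/docker/api/types"
	"github.com/docker/docker/client"
)

func main() {
	ctx := context.Background()

	// Build a client from the environment (DOCKER_HOST, etc.);
	// no docker executable required.
	cli, err := client.NewClientWithOpts(client.FromEnv)
	if err != nil {
		log.Fatal(err)
	}

	// Equivalent of `docker ps`: typed results, no output parsing.
	containers, err := cli.ContainerList(ctx, types.ContainerListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, c := range containers {
		fmt.Println(c.ID[:12], c.Image, c.Names)
	}
}
```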

@iwilltry42 (Member, Author)

Since I forgot to add all the functionality (e.g. adding env vars), this PR will receive some updates.
But as the change is quite large, you might want to enjoy it in smaller pieces :)

@zeerorg (Collaborator) commented Apr 17, 2019

Got the error below when testing the PR:

2019/04/17 10:20:40 Creating cluster [k3s_default]
2019/04/17 10:20:47 ERROR: failed to create cluster
ERROR: couldn't create container k3d-k3s_default-server
Error: No such image: docker.io/rancher/k3s:v0.4.0

@zeerorg (Collaborator) commented Apr 17, 2019

Can you remove k3s:v0.4.0 from your docker images and try this again?

@zeerorg (Collaborator) commented Apr 17, 2019

When I stop the k3s cluster, only the server stops, but the worker nodes keep running.

@zeerorg (Collaborator) commented Apr 17, 2019

`kubectl get nodes` only shows the server node and not the worker nodes that were created.

@iwilltry42 (Member, Author)

Thanks for the feedback @zeerorg, I will check this tomorrow. I haven't run into the problem with workers not shutting down yet.
I will investigate these issues.
This PR is definitely not ready yet; I only opened it this early to give you more time to check out the switch from executing docker CLI commands to using the Docker API via the client SDK.

@zeerorg (Collaborator) commented Apr 17, 2019

Not a problem, I just wanted to take a closer look since it's a big change, that's all. You're doing really great work!

@iwilltry42 (Member, Author) commented Apr 18, 2019

@zeerorg

> Got the error below when testing the PR:
>
> 2019/04/17 10:20:40 Creating cluster [k3s_default]
> 2019/04/17 10:20:47 ERROR: failed to create cluster
> ERROR: couldn't create container k3d-k3s_default-server
> Error: No such image: docker.io/rancher/k3s:v0.4.0

Fixed in commit 7d5964c.
The problem was that the ReadCloser returned by docker.ImagePull was not read in non-verbose mode, which apparently makes Docker not pull the image at all. Also, verbose mode couldn't be enabled there, because the code was using c.Bool instead of c.GlobalBool.
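
For reference, a minimal sketch of what such a fix looks like (illustrative, not the exact code from 7d5964c; assumes `cli`, `ctx`, and a `verbose` flag in scope, plus imports of `io`, `io/ioutil`, `os`, and `github.com/docker/docker/api/types`):

```go
// ImagePull returns a ReadCloser streaming JSON progress messages;
// the pull only completes once that stream has been read to the end.
reader, err := cli.ImagePull(ctx, "docker.io/rancher/k3s:v0.4.0", types.ImagePullOptions{})
if err != nil {
	return err
}
defer reader.Close()

if verbose {
	// Show the pull progress on the terminal.
	_, err = io.Copy(os.Stdout, reader)
} else {
	// Drain the stream silently so the pull actually finishes.
	_, err = io.Copy(ioutil.Discard, reader)
}
return err
```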

@iwilltry42 (Member, Author)

> When I stop the k3s cluster, only the server stops, but the worker nodes keep running.

@zeerorg this happened when more than one cluster was active at a time and you used delete -a, right?
In that case the getClusters function got confused by the label matching.
I fixed this in commit 92039ed 👍
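
A sketch of what exact label matching can look like with the SDK's filters package (`github.com/docker/docker/api/types/filters`); the label keys here are placeholders, not necessarily the ones k3d actually sets:

```go
// List only the containers belonging to one specific cluster by
// filtering on labels server-side ("app" and "cluster" are
// illustrative label keys).
f := filters.NewArgs()
f.Add("label", "app=k3d")
f.Add("label", "cluster="+clusterName)

containers, err := cli.ContainerList(ctx, types.ContainerListOptions{
	All:     true, // include stopped containers as well
	Filters: f,
})
```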

@iwilltry42 (Member, Author)

> `kubectl get nodes` only shows the server node and not the worker nodes that were created.

@zeerorg the containers couldn't communicate with each other.
We now create a docker network for each cluster so that this works.
Changes are in commit 6c072fb.
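
Roughly, per-cluster network creation with the SDK looks like this (a sketch only, not the code from 6c072fb; the network name, labels, `clusterName`, and `containerID` are illustrative):

```go
// Create one network per cluster so server and worker containers can
// reach each other by name.
resp, err := cli.NetworkCreate(ctx, "k3d-"+clusterName, types.NetworkCreate{
	Driver: "bridge",
	Labels: map[string]string{"app": "k3d", "cluster": clusterName},
})
if err != nil {
	return err
}

// Containers can then be attached to it, e.g. after creation:
if err := cli.NetworkConnect(ctx, resp.ID, containerID, nil); err != nil {
	return err
}
```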

@iwilltry42 (Member, Author)

Lots of significant changes here, please test again :)

FYI @zeerorg & @ibuildthecloud

@iwilltry42 (Member, Author)

`get-kubeconfig` now also uses the docker SDK, so I was able to remove `run.go`.

@iwilltry42 (Member, Author)

@zeerorg and @ibuildthecloud I think this should be on par with the old version now (+ some new features)

@zeerorg (Collaborator) commented Apr 23, 2019

Almost all the things have been fixed. Really nice 👍
Some things you could look at:

  1. `export KUBECONFIG...` should come after the workers are started
     ![image](https://user-images.githubusercontent.com/13547997/56580963-88505300-65f1-11e9-9ec7-35b6f9d02e83.png)

  2. The `--wait` flag should also include waiting for workers to come up and respect the timeout value.

@iwilltry42 (Member, Author) commented Apr 23, 2019

> Almost all the things have been fixed. Really nice 👍
> Some things you could look at:
>
> 1. `export KUBECONFIG...` should come after the workers are started
>    ![image](https://user-images.githubusercontent.com/13547997/56580963-88505300-65f1-11e9-9ec7-35b6f9d02e83.png)
>
> 2. The `--wait` flag should also include waiting for workers to come up and respect the timeout value.

@zeerorg, thanks for the review.

  1. Fixed in 068172a ✔️
  2. Hmm... I'm not sure we should only consider the cluster up once all the workers are up and running, since the API server will already be up and kubectl will already work even if the workers aren't ready yet. Anyway, I'd put this out of scope for this PR, or what do you think? I'd rather phrase it as a new issue and a starting point for a new PR 👍

@zeerorg (Collaborator) commented Apr 24, 2019

Yeah, we can keep this out and include it in the next update. Another thing we could do is parallelize worker creation, so that if someone creates a lot of workers they are up and running more quickly.

@zeerorg (Collaborator) left a comment


LGTM

@iwilltry42 (Member, Author)

> Yeah, we can keep this out and include it in the next update. Another thing we could do is parallelize worker creation, so that if someone creates a lot of workers they are up and running more quickly.

Sure thing. Parallelizing the work is a really good idea 👍 I had already thought about using goroutines in case we decided to include the workers in the timeout. There are some more ideas, but I'm happy to merge this PR first :) Maybe I can get another approval today, but since we're already actively using it without any problems, I guess we're good to go.
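
For the record, the goroutine idea could look roughly like this (a sketch only; `createWorker`, `workerCount`, `ctx`, and `clusterName` are hypothetical names, not identifiers from this PR; assumes imports of `fmt`, `log`, and `sync`):

```go
// Spawn all workers concurrently and wait for them to be created.
var wg sync.WaitGroup
errs := make(chan error, workerCount)

for i := 0; i < workerCount; i++ {
	wg.Add(1)
	go func(n int) {
		defer wg.Done()
		// createWorker is a placeholder for the per-worker creation logic.
		if err := createWorker(ctx, clusterName, n); err != nil {
			errs <- fmt.Errorf("worker %d: %v", n, err)
		}
	}(i)
}

wg.Wait()
close(errs)
for err := range errs {
	log.Println(err)
}
```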

@iwilltry42 iwilltry42 merged commit 705ef69 into development Apr 26, 2019
@iwilltry42 iwilltry42 deleted the use-docker-sdk branch April 27, 2019 20:33
@iwilltry42 iwilltry42 mentioned this pull request Apr 29, 2019