Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add e2e testing #16

Closed
bowei opened this issue Oct 11, 2017 · 35 comments
Closed

Add e2e testing #16

bowei opened this issue Oct 11, 2017 · 35 comments
Assignees
Labels
area/infra help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@bowei
Copy link
Member

bowei commented Oct 11, 2017

From @bprashanth on November 10, 2016 18:48

kubernetes-retired/contrib#1441 (comment)

e2e testing: If we could figure out a way to setup an e2e builder that runs https://github.com/kubernetes/kubernetes/blob/master/test/e2e/ingress.go#L58 for every commit just like the cadvisor repo https://github.com/google/cadvisor, that would be great. I'm sure our test-infra people would be more than willing to help with this problem (file an issue like kubernetes/test-infra#939, but more descriptive maybe :)

@porridge fyi

Copied from original issue: kubernetes/ingress-nginx#5

@bowei bowei added help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. area/infra labels Oct 11, 2017
@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on November 11, 2016 3:23

kubernetes/ingress-nginx#12 adds initial e2e structure in directories hack and test.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 15, 2016 10:32

FWIW, the test-e2e target is broken at the moment.
I don't really understand what is the git root directory that the message mentions.

porridge@beczulka:~/Desktop/coding/go/src/k8s.io/ingress$ time make test-e2e 
go get github.com/onsi/ginkgo/ginkgo
2016/12/15 11:28:10 e2e.go:128: Called from invalid working directory: must run from git root directory: /home/porridge/Desktop/coding/go/src/k8s.io/ingress
exit status 1
make: *** [test-e2e] Error 1

real	0m0.756s
user	0m0.636s
sys	0m0.180s
porridge@beczulka:~/Desktop/coding/go/src/k8s.io/ingress$ 

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 15, 2016 12:58

FTR, kubernetes/ingress-nginx#62 fixes the breakage.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 22, 2016 13:19

The next problem is that the test-e2e target tries to push the image, tracked in kubernetes/ingress-nginx#79

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 22, 2016 13:45

After that, and deploying the nginx controller, the problem is that the controller cannot talk to the API server:

I1222 13:19:12.959734       6 launch.go:84] &{NGINX 0.8.4 git-f0762ba git@github.com:porridge/ingress.git}
I1222 13:19:12.961047       6 nginx.go:102] starting NGINX process...
F1222 13:19:12.962171       6 launch.go:109] no service with name default/default-http-backend found: Get http://localhost:8080/api/v1/namespaces/default/services/default-http-backend: dial tcp [::1]:8080: getsockopt: connection refused

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 22, 2016 14:11

That's apparently because /var/run/secrets/kubernetes.io/serviceaccount/token does not exist, so restclient.InClusterConfig() fails, and the config returned by clientConfig.ClientConfig() is wrong.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 22, 2016 14:55

I hacked launch.go to list the environment, and hardcoded kubeconfig.Host = "10.0.0.1:443"

KUBERNETES_SERVICE_PORT_HTTPS=443
KUBERNETES_PORT_443_TCP=tcp://10.0.0.1:443
DEFAULT_HTTP_BACKEND_SERVICE_PORT=80
DEFAULT_HTTP_BACKEND_PORT=tcp://10.0.0.90:80
KUBERNETES_SERVICE_PORT=443
KUBERNETES_PORT_443_TCP_ADDR=10.0.0.1
KUBERNETES_PORT=tcp://10.0.0.1:443
DEFAULT_HTTP_BACKEND_PORT_80_TCP=tcp://10.0.0.90:80
DEFAULT_HTTP_BACKEND_PORT_80_TCP_PORT=80
DEFAULT_HTTP_BACKEND_PORT_80_TCP_ADDR=10.0.0.90
KUBERNETES_PORT_443_TCP_PORT=443
KUBERNETES_PORT_443_TCP_PROTO=tcp
DEFAULT_HTTP_BACKEND_SERVICE_HOST=10.0.0.90
DEFAULT_HTTP_BACKEND_PORT_80_TCP_PROTO=tcp
KUBERNETES_SERVICE_HOST=10.0.0.1

Yet it still does not work:

F1222 14:51:47.434441       7 launch.go:114] no service with name default/default-http-backend found: Get http://10.0.0.1:443/api/v1/namespaces/default/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

@bprashanth can you point me to a working setup of this kind somewhere in the k8s ecosystem? https://github.com/kubernetes/community/blob/master/contributors/devel/local-cluster/docker.md starts with a big bold "Stop, use minikube instead".

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on December 22, 2016 15:0

@porridge this should work if you set the master address like
export KUBERNETES_MASTER=http://172.17.4.99:8080
before launching the controller binary

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 22, 2016 16:12

To summarize our chat, hostNetwork: true on the pod spec seems to solve it.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @bprashanth on December 22, 2016 17:3

It would be great to not have to run with host networking in the e2e since most people don't in reality, but I think it's fine for a first cut with a TODO.

so to confirm, hostNetwork is just a workaround for setting KUBERNETES_MASTER, which defaults to localhost instead?

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @bprashanth on December 22, 2016 18:6

@porridge do these docs help https://github.com/kubernetes/ingress/blob/master/docs/dev/setup.md ? (try local-up-cluster, or minikube as described, if they've released a version with the nginx addon)

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on December 27, 2016 20:4

@bprashanth I think it was a bit more complicated:

  1. First of all, the hypercube-based cluster brought up by ingress/hack/e2e-internal/e2e-up.sh was broken in the sense that /var/run/secrets/kubernetes.io/serviceaccount in every pod was an empty directory (no token file, not even the ..* dir/symlink).
  2. this in turn confused the config loading code into thinking it's not running in a cluster, and therefore needs to connect to localhost, which didn't work
  3. I tried to force it to connect to 10.0.0.1:8080 and :443 but that failed with getsockopt: connection timed out
  4. finally, this was worked around with hostNetwork with aledbf's help.

After your suggestion, I tried with the cluster brought up by kubernetes/hack/local-up-cluster.sh just now, but it failed the same way as (3) above:

F1227 19:51:24.140035       6 launch.go:123] no service with name default/default-http-backend found: Get https://10.0.0.1:443/api/v1/namespaces/default/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

Turns out this (as well as [3]) was caused by my laptop's overzealous firewall.

I haven't tracked down the cause for (1) though.

Perhaps I should try minikube, as I fear trying to teach ufw what to let through will be hard, given that I'm somewhat confused about this networking myself.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

@porridge -- the security token issue seems to be related to setting rshared on the mount point for kubelet:

I was able to get the hyperkube containerize kubelet running after doing:

$ mount --bind /var/lib/kubelet /var/lib/kubelet
$ mount --make-rshared /var/lib/kubelet

and passing --volume=/var/lib/kubelet:/var/lib/kubelet:rshared in the docker run command

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on February 3, 2017 17:55

I made some baby steps towards using minikube.
I'm going to investigate that further, given the problems with hyperkube-based cluster (the fact that it seems soft-deprecated, incompatibility with ufw, and the rshared issue on kubelet).

One questions is what I should do with hack/e2e* - since it is currently not much apart from a skeleton around hyperkube setup - should I try extending it to support minikube as well (--deployment)?

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

@porridge -- if you want an example using hyperkube, you can take a look at this:

https://github.com/kubernetes/dns/tree/master/pkg/e2e

It starts an API server + controller manager in a container. There are some things that need to be resolved with the containerized mounter, but it works...

The kickoff script is here: https://github.com/kubernetes/dns/build/e2e-test.sh

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on February 4, 2017 12:47

@bowei ingress/hack already contains code to start a hyperkube-based cluster, so I'm not sure we need more examples. However I'm not sure we should be going that way given the problems I mentioned:

Of course I don't know yet how much better minikube is going to be, but the fact that it's more hermetic is promising.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

hyperkube has only a dependency on docker, which means we were able to run the e2e tests as part of a travis ci test, which is helpful. The code above does take care of the mounting issue.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on February 4, 2017 19:20

@porridge I think @bowei is right. We should use the code from the dns repo to bootstrap the e2e test suite from a repo that is already using it (so we don't waste time in the setup)

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

@aledbf if there is common interest, this may be worth splitting off into a common framework. Then we can mutually benefit from the work. Let me know which direction you guys decide to go...

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on February 4, 2017 19:43

if there is common interest, this may be worth splitting off into a common framework

Yes please :)
I think this is one of the missing pieces, a common bootstrap for e2e test suites

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on February 4, 2017 19:43

@bprashanth please comment about this ^^

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @bprashanth on February 6, 2017 18:25

Yeah of course, what are we thinking - a bootstrap hyperkube for travis incubator?

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

I'm happy to propose the project and drive it

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @bprashanth on February 6, 2017 18:42

Thanks! Suggest an email to kubernetes-dev per https://github.com/kubernetes/community/blob/master/incubator.md#existing-code-in-kubernetes, the test infra team might have some thoughts (or recommend putting it in https://github.com/kubernetes/test-infra or https://github.com/kubernetes/repo-infra)

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on February 6, 2017 19:39

I'd appreciate it if you could keep me in the loop.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on February 24, 2017 10:5

@bowei was there any movement on this?

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

Sorry about the delay -- will send something post-code freeze...

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on February 24, 2017 17:7

@Beeps sorry it took me this long to test this on the e2e cluster from the main kubernetes repo like you suggested in December, but it does not work either - by the looks of it, for the same reason as with ingress repo's own local cluster case:

$ $GOPATH/k8s.io/kubernetes/cluster/kubectl.sh --namespace=kube-system logs -f nginx-ingress-controller-484927508-skkxg
I0224 17:01:14.408940       7 launch.go:94] &{NGINX 0.9.0-beta.2 git-7013a52 git@github.com:ixdy/kubernetes-ingress.git}
I0224 17:01:14.409029       7 launch.go:97] Watching for ingress class: nginx
I0224 17:01:14.409432       7 nginx.go:112] starting NGINX process...
I0224 17:01:14.410528       7 launch.go:223] Creating API server client for https://10.0.0.1:443
F0224 17:01:29.439727       7 launch.go:111] no service with name kube-system/default-http-backend found: Get https://10.0.0.1:443/api/v1/namespaces/kube-system/services/default-http-backend: dial tcp 10.0.0.1:443: getsockopt: connection timed out

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on March 30, 2017 11:36

Status update: I (temporarily) ditched the attempts to get this running on my laptop, and moved my dev environment onto a workstation, to rule out the interference from the local firewall.

There I was able to successfully play with the nginx controller on a local cluster brought up with k8s.io/kubernetes/hack/local-up-cluster.sh

Starting the same controller on the cluster launched with k8s.io/ingress/hack/e2e.go -v --up --test=false --down=false fails with:

F0330 11:25:55.592077       7 launch.go:251] Error while initializing connection to Kubernetes apiserver. This most likely means that the cluster is misconfigured (e.g., it has invalid apiserver certificates or
 service accounts configuration). Reason: invalid configuration: no configuration has been provided

I think I'm going to start working on some first e2e test cases, using that former cluster for the time being, while waiting for @bowei to start the effort towards common bootstrap (now that 1.6 is released, hint hint).

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on March 31, 2017 11:6

One interesting problem - worth thinking about when designing this common e2e test infrastructure:

I when trying to import "k8s.io/kubernetes/test/e2e/framework" and use ginkgo at the same time, I found the hard way that the ginkgo imported directly clashed with the vendored copy of ginkgo from the kubernetes repo:

panic: /tmp/ginkgo687931842/test.test flag redefined: ginkgo.seed
goroutine 1 [running]:
flag.(*FlagSet).Var(0xc420018180, 0x3532640, 0x3660aa0, 0xc42027f910, 0xb, 0x23f7307, 0x2a)
        /usr/lib/google-golang/src/flag/flag.go:793 +0x420
flag.(*FlagSet).Int64Var(0xc420018180, 0x3660aa0, 0xc42027f910, 0xb, 0x58de340d, 0x23f7307, 0x2a)
        /usr/lib/google-golang/src/flag/flag.go:618 +0x71
github.com/onsi/ginkgo/config.Flags(0xc420018180, 0xc4207f7e30, 0x7, 0xc420000101)
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/config/config.go:66 +0xee
github.com/onsi/ginkgo.init.1()
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/ginkgo_dsl.go:54 +0x59
github.com/onsi/ginkgo.init()
        /usr/local/google/home/porridge/projects/go/src/github.com/onsi/ginkgo/ginkgo_dsl.go:570 +0x9d
k8s.io/ingress/controllers/nginx/test.init()
        /usr/local/google/home/porridge/projects/go/src/k8s.io/ingress/controllers/nginx/test/test_suite_test.go:17 +0x50
main.init()
        k8s.io/ingress/controllers/nginx/test/_test/_testmain.go:46 +0x53

Apparently, this is not because of version difference, but because flags are global, and the two ginkgo packages are treated as unrelated ones, so their (identical) init() functions are both invoked as explained in onsi/ginkgo#234 (comment).

For lack of better ideas, I worked this around for now by removing k8s.io/kubernetes/vendor/github.com/onsi.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @onsi on April 1, 2017 22:17

srsly the flags package being global is both incredibly convenient and also a painful source of these sorts of rough edges. Not sure how to help @porridge :/ perhaps there's some way for Ginkgo to detect if Ginkgo has already parsed flags? If you could try that out locally until you get it to work and submit a PR that would be great.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @porridge on April 7, 2017 15:50

@onsi I took a look and:

  • the panic is at flag declaration time, not parsing time
  • luckily it is possible to test whether a flag with a given name has been declared
  • unfortunately the fact that ginkgo exposes flag value as fields of a public struct variable, makes it hard/impossible to fix this problem without changing the API.

Please take a look at onsi/ginkgo@master...porridge:flag-tolerant which I think is a reasonable compromise between backwards-compatibility and making is possible for vendored ginkgo to work at all.
Obviously it's not the full change, it only shows what needs to be done using one of the flags as an example.

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @stibi on September 13, 2017 9:43

Hi, is the "help wanted" label still valid here? If yes, I'd like to join the ride.

Is this work kubernetes/ingress-nginx#1331 also related to this issue?

@bowei
Copy link
Member Author

bowei commented Oct 11, 2017

From @aledbf on September 13, 2017 12:56

Is this work #1331 also related to this issue?

Yes

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 9, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/infra help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.
Projects
None yet
Development

No branches or pull requests

5 participants