Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flaky test: TestCoreDNSApplyChanges #4240

Closed
pascalgn opened this issue Feb 7, 2024 · 4 comments · Fixed by #4245
Closed

Flaky test: TestCoreDNSApplyChanges #4240

pascalgn opened this issue Feb 7, 2024 · 4 comments · Fixed by #4245
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@pascalgn
Copy link
Contributor

pascalgn commented Feb 7, 2024

Issue for TestCoreDNSApplyChanges test failure, as it's currently affecting multiple PRs

@pascalgn pascalgn added the kind/bug Categorizes issue or PR as related to a bug. label Feb 7, 2024
@pascalgn
Copy link
Contributor Author

pascalgn commented Feb 7, 2024

I had a look and I'm not exactly sure why the test is failing in CI.

There are two problems that I've identified so far:

  1. The validateServices method isn't really working, it doesn't check the expected services as one would think. If I change the code in coredns_test.go to the following, the test still passes for me
    expectedServices4 := map[string]*Service{
    	"/skydns/local/domain2": {Host: "site.local"},
    	"/skydns/local/xx1":     {Host: "5.5.5.5"},         // <-- no reference to domain1 anymore
    	"/skydns/local/xx2":     {Host: "6.6.6.6"},         // <-- no reference to domain1 anymore
    	"/skydns/local/domain1": {Host: "7.7.7.7"},
    }
  2. The expected behaviour, at least in the test setup, is not actually happening. The failing part of the test was part of 3602c47 which says "Multiple A records support for the same FQDN" but I added some logging to validateServices and saw that all A records point to the same IP, instead of 3 different ones:
    coredns_test.go:325:   srv: /skydns/local/domain1/3a4dab26 7.7.7.7
    coredns_test.go:325:   srv: /skydns/local/domain2/18d03fa3 site.local
    coredns_test.go:325:   srv: /skydns/local/domain1/1d5f2afc 7.7.7.7
    coredns_test.go:325:   srv: /skydns/local/domain1/30517d1e 7.7.7.7
    
    (That's why the test is passing locally, it always takes /skydns/local/domain1 from the expectedServices map, which gives 7.7.7.7, which is what all records have)

@szuecs
Copy link
Contributor

szuecs commented Feb 7, 2024

Interesting this sounds like a common Go iteration issue if you hold only the reference to the last one of a list....
Did the Go version changed in the test suite?

@pascalgn
Copy link
Contributor Author

pascalgn commented Feb 8, 2024

I had another look and it could be this code in coredns_test.go, where we directly store the pointers to the given service in the map.

func (c fakeETCDClient) SaveService(service *Service) error {
	c.services[service.Key] = service
	return nil
}

I have no idea about Go, so I don't know which guarantees there are around pointers, but during debugging I saw that in this loop, after the 3rd run, the fakeETCDClient contains 3 pointers which all point to the same memory, so they all have Host=7.7.7.7, even though the "Add/set key..." logging output shows the 3 different IPs

		for _, service := range services {
			log.Infof("Add/set key %s to Host=%s, Text=%s, TTL=%d", service.Key, service.Host, service.Text, service.TTL)
			if !p.dryRun {
				err := p.client.SaveService(&service)

I'm trying to create a PR, but, as I said, no idea about Go

@pascalgn
Copy link
Contributor Author

pascalgn commented Feb 8, 2024

@szuecs OK, I created a PR. Please have a look, I hope it will fix the issues! 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
2 participants