Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AWS Route 53 provider: fix various problems with handling of alias records #1860

Merged

Conversation

alfredkrohmer
Copy link
Contributor

@alfredkrohmer alfredkrohmer commented Nov 9, 2020

This PR supersedes #1356. It's rebased against master, has some commits squashed and I will keep rebasing until it's merged.

Description

The management of Route 53 alias records is utterly broken in its current state, see issues #1105 and #1175.

This PR introduces a new way to handle alias records.

  • A new ModifyEndpoint methods is introduced in a new (optional) EndpointModifyingProvider interface that providers can implement. It allows the AWS provider to convert CNAME endpoints (generated from various sources) to alias A records before a change plan is calculated. This allows proper deletion and update handling of alias records. Due to this change, the provider-specific check for the aws/evaluate-target-health provider-specific property can be removed from the plan calculation. See here for a lenghtly explanation on how this works.
  • The isAWSAlias function is completely reimplemented to fix creation of alias records for targets that are not a load balancer.

Fixes #1105. Fixes #1175.

Checklist

  • Unit tests updated

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 9, 2020
@alfredkrohmer
Copy link
Contributor Author

/assign @njuettner

@igor-nikiforov
Copy link

@devkid, Does it handle #1809?

@alfredkrohmer
Copy link
Contributor Author

@igor-nikiforov unfortunately not, that's in a different code path.

@seanmalloy
Copy link
Member

/kind bug

Copy link
Member

@linki linki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devkid Thanks for fixing the issues with ALIAS records.

I support the idea of the EndpointModifier approach. It removes unrelated code from the ideally provider-independent Plan phase into each provider. I've ran this version against our e2e tests to see if it succeeds in addition to the unit tests here. It passes just fine (don't mind the unrelated failing test) but note that we don't use custom values for provider-specific stuff like EvaluateTargetHealth, WeightedRecords and so on there, I think.

Regarding the code: I don't understand the purpose of all the changes. Maybe you can shed some more light on the different parts, such as the removal of the recordsCache and the need to rewrite the ALIAS detection. We should make sure that we don't break the existing (working) behaviour.

@@ -394,15 +398,13 @@ func (p *AWSProvider) DeleteRecords(ctx context.Context, endpoints []*endpoint.E

func (p *AWSProvider) doRecords(ctx context.Context, action string, endpoints []*endpoint.Endpoint) error {
zones, err := p.Zones(ctx)
p.ModifyEndpoints(endpoints)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should come after the if err != nil { below.

@@ -509,20 +502,43 @@ func (p *AWSProvider) newChanges(action string, endpoints []*endpoint.Endpoint,
return changes
}

func (p *AWSProvider) ModifyEndpoints(endpoints []*endpoint.Endpoint) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please summarize the modification logic in a comment or in the code. The code alone is hard to understand.

@Raffo
Copy link
Contributor

Raffo commented Nov 22, 2020

@linki similar to what I commented in #1849 and #1861 , I'm using this for a refactor. I do think the idea is good as well and can solve #1849 as well. I don't think the additional interface is a good idea though even though simpler than just modifying the provider interface. I think we should embrace the change that we want to do to the provider. I will submit soon a PR that should likely replace this one and #1849. Can you give us a way to at least run your end to end tests on the ExternalDNS repo?

@linki
Copy link
Member

linki commented Nov 23, 2020

Sounds good to me.

I think we don't have an automated setup for PR builds from this repository at the moment but I'll check again.
Meanwhile I can build and trigger it manually.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2020
@njuettner
Copy link
Member

@devkid thank you for your effort 🙏🏻 , can we push this again? Sorry it didn't get much attention the last months. I saw there are still some open comments, do you mind address them so we can merge it hopefully 🤞🏻 ?

@alfredkrohmer
Copy link
Contributor Author

I'll look into it.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 22, 2021
@alfredkrohmer
Copy link
Contributor Author

@njuettner rebased

Copy link
Member

@njuettner njuettner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@devkid thanks again for open a PR, do you mind adding a test case which uncovers this issue? It seems like we missed a test which shows this issue for handling alias records?

Happy to merge this PR once we have it.

@alfredkrohmer
Copy link
Contributor Author

@njuettner I already had adjusted the test for isAWSAlias before (as well as all the other tests that used alias records), now I've also added a test for the new AdjustEndpoints. Is that enough? Otherwise, please advise what else you want tested.

Copy link
Member

@njuettner njuettner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 7, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: devkid, njuettner

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 7, 2021
@k8s-ci-robot k8s-ci-robot merged commit e2eefbe into kubernetes-sigs:master Apr 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
8 participants