Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config #2108

Merged

Conversation

danwinship
Contributor

@danwinship danwinship commented Sep 21, 2020

To install a dual-stack cluster you must set the IPv6DualStack feature gate at install time. MCO currently does not process feature gates provided as install-time manifests, so the bootstrap configs will not match the post-install configs, and the operator becomes Degraded because the masters are stuck, unable to move to the expected config.

As it happens, kubelet does not use the IPv6DualStack feature gate unless you are using dockershim, so to work around this problem for now, just ignore that feature gate when generating the kubelet config, so that the "real" generated config will match the config expected by the bootstrap code.

(Not yet tested, but I think this will fix things in 4.6, and then it can be reverted and fixed correctly in 4.7, since kubelet will need the feature gate then.)
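The workaround described above amounts to dropping one gate before the kubelet config is generated. A minimal Go sketch of that idea follows; `ignoredKubeletGates`, `filterKubeletGates`, and the `ExampleGate` entry are hypothetical names chosen for illustration and are not taken from the MCO source.

```go
package main

import (
	"fmt"
	"sort"
)

// ignoredKubeletGates lists feature gates that are dropped before the
// kubelet config is generated. Hypothetical name, for illustration only.
var ignoredKubeletGates = map[string]bool{
	"IPv6DualStack": true,
}

// filterKubeletGates returns a copy of gates without the ignored entries.
// The input map is left unmodified.
func filterKubeletGates(gates map[string]bool) map[string]bool {
	out := make(map[string]bool, len(gates))
	for name, enabled := range gates {
		if ignoredKubeletGates[name] {
			continue
		}
		out[name] = enabled
	}
	return out
}

func main() {
	// "ExampleGate" is a made-up gate standing in for any other gate.
	clusterGates := map[string]bool{
		"IPv6DualStack": true,
		"ExampleGate":   true,
	}

	filtered := filterKubeletGates(clusterGates)

	// Sort the names so the printed order is deterministic.
	names := make([]string, 0, len(filtered))
	for name := range filtered {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Because the filter only removes the one gate, every other gate still reaches the kubelet config unchanged, which keeps the workaround easy to revert later.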

To install a dual-stack cluster you must set the IPv6DualStack feature
gate at install time. MCO currently does not process feature gates
provided as install-time manifests and will generate a Degraded
cluster.

As it happens, kubelet does not use the IPv6DualStack feature gate
unless you are using dockershim, so to work around this problem for
now, just ignore that feature gate when generating the kubelet config,
so that the "real" generated config will match the config expected by
the bootstrap code.
@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 21, 2020
@openshift-ci-robot
Contributor

@danwinship: An error was encountered adding this pull request to the external tracker bugs for bug 1881057 on the Bugzilla server at https://bugzilla.redhat.com:

JSONRPC error 32000: There was an error reported for a GitHub REST call. URL: https://api.github.com/repos/openshift/machine-config-operator/pulls/2108 Error: 403 Forbidden at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Type/GitHub.pm line 111. at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Type/GitHub.pm line 111. eval {...} called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Type/GitHub.pm line 98 Bugzilla::Extension::ExternalBugs::Type::GitHub::_do_rest_call('Bugzilla::Extension::ExternalBugs::Type::GitHub=HASH(0x55894c...', 'https://api.github.com/repos/openshift/machine-config-operato...', 'GET') called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Type/GitHub.pm line 62 Bugzilla::Extension::ExternalBugs::Type::GitHub::get_data('Bugzilla::Extension::ExternalBugs::Type::GitHub=HASH(0x55894c...', 'Bugzilla::Extension::ExternalBugs::Bug=HASH(0x55894c404300)') called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Bug.pm line 302 eval {...} called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Bug.pm line 302 Bugzilla::Extension::ExternalBugs::Bug::update_ext_info('Bugzilla::Extension::ExternalBugs::Bug=HASH(0x55894c404300)', 1) called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/Bug.pm line 125 Bugzilla::Extension::ExternalBugs::Bug::create('Bugzilla::Extension::ExternalBugs::Bug', 'HASH(0x55894cc21678)') called at /var/www/html/bugzilla/extensions/ExternalBugs/Extension.pm line 877 Bugzilla::Extension::ExternalBugs::bug_start_of_update('Bugzilla::Extension::ExternalBugs=HASH(0x55894cd05ea8)', 'HASH(0x55894cbc95c0)') called at /var/www/html/bugzilla/Bugzilla/Hook.pm line 21 Bugzilla::Hook::process('bug_start_of_update', 'HASH(0x55894cbc95c0)') called at /var/www/html/bugzilla/Bugzilla/Bug.pm line 1170 Bugzilla::Bug::update('Bugzilla::Bug=HASH(0x55894cb6e5c8)') called at /loader/0x558945dcc880/Bugzilla/Extension/ExternalBugs/WebService.pm line 88 
Bugzilla::Extension::ExternalBugs::WebService::add_external_bug('Bugzilla::WebService::Server::JSONRPC::Bugzilla::Extension::E...', 'HASH(0x55894ce7a470)') called at (eval 2143) line 1 eval ' $procedure->{code}->($self, @params) ;' called at /usr/share/perl5/vendor_perl/JSON/RPC/Legacy/Server.pm line 220 JSON::RPC::Legacy::Server::_handle('Bugzilla::WebService::Server::JSONRPC::Bugzilla::Extension::E...', 'HASH(0x55894ccfd0d8)') called at /var/www/html/bugzilla/Bugzilla/WebService/Server/JSONRPC.pm line 295 Bugzilla::WebService::Server::JSONRPC::_handle('Bugzilla::WebService::Server::JSONRPC::Bugzilla::Extension::E...', 'HASH(0x55894ccfd0d8)') called at /usr/share/perl5/vendor_perl/JSON/RPC/Legacy/Server.pm line 126 JSON::RPC::Legacy::Server::handle('Bugzilla::WebService::Server::JSONRPC::Bugzilla::Extension::E...') called at /var/www/html/bugzilla/Bugzilla/WebService/Server/JSONRPC.pm line 70 Bugzilla::WebService::Server::JSONRPC::handle('Bugzilla::WebService::Server::JSONRPC::Bugzilla::Extension::E...') called at /var/www/html/bugzilla/jsonrpc.cgi line 31 ModPerl::ROOT::Bugzilla::ModPerl::ResponseHandler::var_www_html_bugzilla_jsonrpc_2ecgi::handler('Apache2::RequestRec=SCALAR(0x55894cb85c10)') called at /usr/lib64/perl5/vendor_perl/ModPerl/RegistryCooker.pm line 207 eval {...} called at /usr/lib64/perl5/vendor_perl/ModPerl/RegistryCooker.pm line 207 ModPerl::RegistryCooker::run('Bugzilla::ModPerl::ResponseHandler=HASH(0x55894c1ccde0)') called at /usr/lib64/perl5/vendor_perl/ModPerl/RegistryCooker.pm line 173 ModPerl::RegistryCooker::default_handler('Bugzilla::ModPerl::ResponseHandler=HASH(0x55894c1ccde0)') called at /usr/lib64/perl5/vendor_perl/ModPerl/Registry.pm line 32 ModPerl::Registry::handler('Bugzilla::ModPerl::ResponseHandler', 'Apache2::RequestRec=SCALAR(0x55894cb85c10)') called at /var/www/html/bugzilla/mod_perl.pl line 139 Bugzilla::ModPerl::ResponseHandler::handler('Bugzilla::ModPerl::ResponseHandler', 'Apache2::RequestRec=SCALAR(0x55894cb85c10)') 
called at (eval 2143) line 0 eval {...} called at (eval 2143) line 0
Please contact an administrator to resolve this issue, then request a bug refresh with /bugzilla refresh.

In response to this:

WIP: Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. label Sep 21, 2020
@openshift-ci-robot
Contributor

@danwinship: This pull request references Bugzilla bug 1881057, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

WIP: Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Sep 21, 2020
@cgwalters
Member

Hmm...so the feature gate isn't implemented in crio? Or is it more we're "abusing" the kubelet config flag in other places in the platform - i.e. are there operators that are reading this flag in the platform right now?

Member

@cgwalters cgwalters left a comment

Tentative approval

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 21, 2020
@danwinship
Contributor Author

danwinship commented Sep 21, 2020

> Hmm...so the feature gate isn't implemented in crio?

AFAIK crio doesn't know about feature gates, and doesn't treat "dual stack" as a configurable option; if the CNI plugin returns dual-stack IPs to crio then crio returns dual-stack IPs to kubelet. (And kubelet likewise will just set dual-stack IPs on the Pod if it gets dual-stack IPs from crio.)

> Or is it more we're "abusing" the kubelet config flag in other places in the platform - i.e. are there operators that are reading this flag in the platform right now?

No, nothing else is looking at kubelet's config. The issue is that OCP only lets you specify a single set of feature gates which then gets used by every component that cares about feature gates. So we can't enable IPv6DualStack in kube-apiserver without also causing it to be enabled in kubelet, which then triggers the MCO bug. So this just works around the problem by having MCO's kubelet FeatureGate listener ignore it.
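To illustrate the point about a single shared gate set, here is a small Go sketch under stated assumptions: `gatesFor` and `render` are hypothetical helpers, and modelling the bootstrap config as an empty gate set is a simplification of "bootstrap never saw the install-time gate". It is not how the MCO actually compares configs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// gatesFor returns the view of the shared, cluster-wide gate set that a
// given component acts on. Only the kubelet view drops anything here.
func gatesFor(component string, shared map[string]bool) map[string]bool {
	view := make(map[string]bool, len(shared))
	for name, enabled := range shared {
		if component == "kubelet" && name == "IPv6DualStack" {
			continue // workaround: the kubelet config ignores this gate
		}
		view[name] = enabled
	}
	return view
}

// render produces a deterministic string form so two gate sets can be
// compared for equality.
func render(gates map[string]bool) string {
	parts := make([]string, 0, len(gates))
	for name, enabled := range gates {
		parts = append(parts, fmt.Sprintf("%s=%t", name, enabled))
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

func main() {
	// One gate set for the whole cluster, as described in the comment above.
	shared := map[string]bool{"IPv6DualStack": true}

	// Simplifying assumption: the bootstrap config was rendered with no gates.
	bootstrap := map[string]bool{}

	fmt.Println("kube-apiserver:", render(gatesFor("kube-apiserver", shared)))
	fmt.Println("kubelet matches bootstrap:",
		render(gatesFor("kubelet", shared)) == render(bootstrap))
}
```

In this model kube-apiserver still sees the gate, while the kubelet view is identical to the bootstrap one, so there is no config mismatch for the masters to get stuck on.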

@danwinship danwinship changed the title WIP: Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config Sep 21, 2020
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 21, 2020
@danwinship
Contributor Author

With this PR, machine-config becomes Available and not Degraded.

@danwinship
Contributor Author

/retest

@cgwalters
Member

#2108 (comment) is great info, makes sense.
/approve
/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Sep 21, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, danwinship

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Contributor

openshift-ci-robot commented Sep 21, 2020

@danwinship: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/okd-e2e-aws | 908356b | link | /test okd-e2e-aws |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit e93a9c8 into openshift:master Sep 21, 2020
@openshift-ci-robot
Contributor

@danwinship: All pull requests linked via external trackers have merged:

Bugzilla bug 1881057 has been moved to the MODIFIED state.

In response to this:

Bug 1881057: ignore the IPv6DualStack feature gate for the kubelet config

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-urgent Referenced Bugzilla bug's severity is urgent for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
5 participants