Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix status report of Antrea-native policies #3074

Merged
merged 1 commit into from
Dec 3, 2021

Conversation

tnqn
Copy link
Member

@tnqn tnqn commented Dec 1, 2021

For an Antrea-native policy with multiple rules that have different AppliedTo, some rules may not be effective on a Node when the policy applies to the Node. In this case, antrea-agent didn't make any actual dataplane change and didn't report new status to antrea-controller if the only changed part of a policy event is about the ineffective rules, even though the generation of the policy may have changed. This leads to a policy staying in realizing state because of the missing status report.

This patch fixes it by triggering status report at least once as long as a policy is actually updated. In addition, it removes some useless return value and changes the order of event logs to better reflect the cause of policy reconciliation.

Signed-off-by: Quan Tian qtian@vmware.com

Fixes #3075

@tnqn
Copy link
Member Author

tnqn commented Dec 1, 2021

/test-e2e

@codecov-commenter
Copy link

codecov-commenter commented Dec 1, 2021

Codecov Report

Merging #3074 (cde4d67) into main (b7da020) will increase coverage by 2.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3074      +/-   ##
==========================================
+ Coverage   59.79%   61.84%   +2.05%     
==========================================
  Files         292      292              
  Lines       24732    34550    +9818     
==========================================
+ Hits        14788    21369    +6581     
- Misses       8317    11408    +3091     
- Partials     1627     1773     +146     
Flag Coverage Δ
e2e-tests 53.62% <100.00%> (?)
kind-e2e-tests 47.95% <100.00%> (+1.11%) ⬆️
unit-tests 40.17% <70.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/agent/controller/networkpolicy/cache.go 85.11% <100.00%> (-1.44%) ⬇️
...ntroller/networkpolicy/networkpolicy_controller.go 68.85% <100.00%> (-1.57%) ⬇️
pkg/agent/cniserver/pod_configuration_linux.go 26.31% <0.00%> (-40.36%) ⬇️
pkg/controller/networkpolicy/endpoint_querier.go 61.46% <0.00%> (-29.97%) ⬇️
pkg/ipam/ipallocator/allocator.go 56.33% <0.00%> (-27.88%) ⬇️
pkg/agent/cniserver/ipam/antrea_ipam_controller.go 40.47% <0.00%> (-24.91%) ⬇️
pkg/ipam/poolallocator/allocator.go 37.80% <0.00%> (-24.62%) ⬇️
.../registry/networkpolicy/clustergroupmember/rest.go 64.28% <0.00%> (-23.95%) ⬇️
pkg/controller/egress/controller.go 60.18% <0.00%> (-23.64%) ⬇️
pkg/agent/util/ethtool/ethtool_linux.go 46.66% <0.00%> (-23.34%) ⬇️
... and 278 more

@tnqn tnqn marked this pull request as ready for review December 1, 2021 17:50
@tnqn
Copy link
Member Author

tnqn commented Dec 1, 2021

/test-all

antoninbas
antoninbas previously approved these changes Dec 1, 2021
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

c.policyMapLock.Lock()
defer c.policyMapLock.Unlock()
return c.updateNetworkPolicyLocked(policy)
}

func (c *ruleCache) updateNetworkPolicyLocked(policy *v1beta.NetworkPolicy) error {
func (c *ruleCache) updateNetworkPolicyLocked(policy *v1beta.NetworkPolicy) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe add a function comment to explain what the return value stands for

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

For an Antrea-native policy with multiple rules that have different
AppliedTo, some rules may not be effective on a Node when the policy
applies to the Node. In this case, antrea-agent didn't make any actual
dataplane change and didn't report new status to antrea-controller if
the only changed part of a policy event is about the ineffective rules,
even though the generation of the policy may have changed. This leads
to a policy staying in realizing state because of the missing status
report.

This patch fixes it by triggering status report at least once as long as
a policy is actually updated. In addition, it removes some useless
return value and changes the order of event logs to better reflect the
cause of policy reconciliation.

Signed-off-by: Quan Tian <qtian@vmware.com>
@tnqn
Copy link
Member Author

tnqn commented Dec 2, 2021

/test-all

@tnqn tnqn merged commit 96f5581 into antrea-io:main Dec 3, 2021
@tnqn tnqn deleted the fix-netpol-status branch December 3, 2021 01:15
@tnqn tnqn added the action/backport Indicates a PR that requires backports. label Jan 25, 2022
antoninbas pushed a commit that referenced this pull request Oct 20, 2022
For an Antrea-native policy with multiple rules that have different
AppliedTo, some rules may not be effective on a Node when the policy
applies to the Node. In this case, antrea-agent didn't make any actual
dataplane change and didn't report new status to antrea-controller if
the only changed part of a policy event is about the ineffective rules,
even though the generation of the policy may have changed. This leads
to a policy staying in realizing state because of the missing status
report.

This patch fixes it by triggering status report at least once as long as
a policy is actually updated. In addition, it removes some useless
return value and changes the order of event logs to better reflect the
cause of policy reconciliation.

Signed-off-by: Quan Tian <qtian@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/backport Indicates a PR that requires backports.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Antrea-native NetworkPolicy status is not correct in some cases
3 participants