Add Versioned API and Antctl for NetworkPolicyEvaluation (effective policy rule prediction) #5740

Merged (1 commit) on Mar 4, 2024

Contributor

@qiyueyao qiyueyao commented Nov 21, 2023

  • Added antctl support for querying effective policy rule by networkpolicyevaluation
    antctl query networkpolicyevaluation -S ns1/pod1 -D ns2/pod2
  • Added versioned API for NetworkPolicy evaluation POST call
    curl -d "@<test json file>" -H "Content-Type: application/json" -X POST <k8s-apiserver>:8001/apis/controlplane.antrea.io/v1beta2/networkpolicyevaluation

Adds the above API and antctl query, which return the predicted effective NetworkPolicy rule affecting traffic from ns1/pod1 to ns2/pod2. When several rules match, the solution picks the highest-priority rule that satisfies the query.
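
For readers who prefer Go over curl, the POST call above can be sketched roughly as follows. This is a minimal illustration rather than code from the PR: the helper names `evaluationURL` and `postEvaluation` are hypothetical, the base address assumes a local proxy to the API server, and the request body is passed through as opaque bytes because its schema is not shown in the description.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// evaluationPath is the versioned resource path quoted in the PR description.
const evaluationPath = "/apis/controlplane.antrea.io/v1beta2/networkpolicyevaluation"

// evaluationURL (hypothetical helper) joins a base address with the resource
// path, tolerating trailing slashes on the base.
func evaluationURL(base string) string {
	return strings.TrimRight(base, "/") + evaluationPath
}

// postEvaluation (hypothetical helper) sends an already-serialized JSON body
// and returns the raw response bytes. It does not interpret the payload.
func postEvaluation(client *http.Client, base string, body []byte) ([]byte, error) {
	resp, err := client.Post(evaluationURL(base), "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	data, err := io.ReadAll(resp.Body)
	if err != nil {
		return nil, err
	}
	if resp.StatusCode < 200 || resp.StatusCode > 299 {
		return nil, fmt.Errorf("unexpected status %s: %s", resp.Status, data)
	}
	return data, nil
}

func main() {
	// Only prints the target URL; no request is sent here.
	fmt.Println(evaluationURL("http://127.0.0.1:8001/"))
}
```

Calling `postEvaluation(http.DefaultClient, base, body)` with the contents of the test JSON file would mirror the curl invocation.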

@qiyueyao qiyueyao marked this pull request as ready for review December 8, 2023 01:28
@qiyueyao qiyueyao changed the title [WIP] Add Antctl NetworkPolicy Rule Prediction Analysis Query Add Antctl NetworkPolicy Rule Prediction Analysis Query Dec 8, 2023
@qiyueyao qiyueyao force-pushed the ods-antctl branch 5 times, most recently from 9bf6c0f to e859616 Compare December 13, 2023 22:42
@qiyueyao qiyueyao requested a review from tnqn December 14, 2023 01:09
@Dyanngg Dyanngg added this to the Antrea v1.15 release milestone Dec 14, 2023
@Dyanngg Dyanngg added the area/component/antctl Issues or PRs releated to the command line interface component label Dec 14, 2023

// HandleFunc creates a http.HandlerFunc which uses an AgentNetworkPolicyInfoQuerier
// to query network policy rules in current agent.
func HandleFunc(eq networkpolicy.EndpointQuerier) http.HandlerFunc {
Contributor

I was just curious as to whether some thought was given regarding implementing all this logic in antctl, rather than in the controller API server? I imagine that in theory antctl query networkpolicyanalysis and antctl query endpoint could share a single API (with extra information compared to what /endpoints provide today) and that the rule comparison / ordering could happen in antctl instead of in the server. If however, we think that there is a need for a dedicated API providing the end result (e.g., because it needs to be consumed by some other component), then that would explain why we need a dedicated API.

Contributor

Yes we do intend to have this API consumed by some downstream components

Member

@tnqn Dec 21, 2023

Since this tends to be a public API and not only consumed by antctl, should we make a well-defined, structured, and versioned resource API? The API looks helpful to me and could perhaps be extended to test Pod-to-IP as well.
If it's a non-resource API like the current one, it may be hard to evolve and handle compatibility with its consumers.
I think it could be an API under the controlplane API group, like the group membership APIs. Kubernetes has quite a few similar APIs that are used as ephemeral queries, such as AdmissionReview, SubjectAccessReview, TokenReview, TokenRequest, and CertificateSigningRequest.
There will be plenty of benefits by making it a resource API:

  1. It can be exposed via APIService so any consumers running outside the cluster (including antctl) can access it publicly.
  2. Generated client code makes it easier to consume, instead of rewriting the parser code in every client.
  3. It's versioned so easier to support different versions of clients.
  4. The request and response can be better structured when extending it.

For example, the data type in my mind:

type NetworkPolicyAccessReview struct {
	metav1.TypeMeta
	Request  *NetworkPolicyAccessRequest
	Response *NetworkPolicyAccessResponse
}

type Entity struct {
	Namespace string
	Pod       string
	// It can be added when we can support IP check.
	IP string
}

type NetworkPolicyAccessRequest struct {
	Source      Entity
	Destination Entity
	// It can be added when we can support port level check.
	Protocol        Protocol
	SourcePort      int
	DestinationPort int
}

type NetworkPolicyAccessResponse struct {
	// The expected action.
	Action Action
	// The reference of the effective NetworkPolicy.
	NetworkPolicy NetworkPolicyReference
	Direction     cpv1beta.Direction
	RuleIndex     int
	// The content of the effective rule. Type is runtime.Object because it can be different types.
	Rule runtime.Object
}
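
To make the "better structured" point concrete, here is a rough, self-contained sketch of how a request shaped like the proposal above might serialize. Everything in it is an assumption for illustration: the types are simplified local copies without Kubernetes dependencies, and the JSON field names and the `encodeRequest` helper are hypothetical, not the names used in the merged API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Entity is a simplified, hypothetical copy of the proposed type.
type Entity struct {
	Namespace string `json:"namespace"`
	Pod       string `json:"pod"`
}

// Request mirrors the Source/Destination pair from the proposal.
type Request struct {
	Source      Entity `json:"source"`
	Destination Entity `json:"destination"`
}

// Response keeps only two illustrative fields from the proposal.
type Response struct {
	Action    string `json:"action"`
	RuleIndex int    `json:"ruleIndex"`
}

// Review bundles request and response in one object, following the
// request/response pattern of the review-style APIs listed above.
type Review struct {
	Request  *Request  `json:"request,omitempty"`
	Response *Response `json:"response,omitempty"`
}

// encodeRequest (hypothetical helper) builds a Review that carries only the
// request half and serializes it to JSON.
func encodeRequest(srcNS, srcPod, dstNS, dstPod string) ([]byte, error) {
	r := Review{Request: &Request{
		Source:      Entity{Namespace: srcNS, Pod: srcPod},
		Destination: Entity{Namespace: dstNS, Pod: dstPod},
	}}
	return json.Marshal(r)
}

func main() {
	b, err := encodeRequest("ns1", "pod1", "ns2", "pod2")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

A client would send the request half and read the same object back with the response half filled in, which is what lets generated clients replace hand-written parsers.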

Contributor Author

Changed it to a resource API and updated the antctl command to call it.

transformedResponse: reflect.TypeOf(controllernetworkpolicy.EndpointQueryResponse{}),
transformedResponse: reflect.TypeOf(endpointServer.EndpointQueryResponse{}),
},
{use: "networkpolicyanalysis",
Contributor

Should we name it "effectivepolicyrule"?

Contributor Author

Currently implementing this comment #5740 (comment) , so perhaps this will be "networkpolicyaccess"?

Contributor

I assume that comment is about making the API definition generic? But the antctl command here is just for a single operation of returning the effective rule, or you plan to extend the command to support generic policy queries later?

Contributor Author

Named the API networkpolicyaccessreview and named the antctl command effectivepolicyrule.

Contributor

Let us hear what @antoninbas and @tnqn may suggest too.

Member

@tnqn Jan 29, 2024

Sorry, I used NetworkPolicyAccessReview in #5740 (comment) just as an example and didn't think too much about the naming. Now I feel it sounds different from what it actually represents. How about naming it NetworkPolicyEnforcement, which may also be used as the cmd name, i.e. antctl query networkpolicyenforcement?

Contributor Author

Since the API and command only query the current configuration rather than enforce it, perhaps "enforcement" would be misleading? How about NetworkPolicyEffectiveRule, NetworkPolicyPrimeRule, NetworkPolicyEvaluation, or NetworkPolicyAnalysis?

Contributor

@Dyanngg Jan 31, 2024

I agree with Qiyue that without the query verb in antctl, NetworkPolicyEnforcement as an API name could be misleading, as it does not return the actual policy enforcement state. I personally would vote for NetworkPolicyEvaluation, but would love to hear what @tnqn and @jianjuns think as well.

Member

NetworkPolicyEvaluation sounds good to me too.

Contributor Author

Renamed the types.

@qiyueyao qiyueyao changed the title Add Antctl NetworkPolicy Rule Prediction Analysis Query Add Versioned API and Antctl for Effective NetworkPolicy Rule Prediction Jan 24, 2024
@qiyueyao qiyueyao changed the title Add Versioned API and Antctl for Effective NetworkPolicy Rule Prediction Add Versioned API and Antctl for NetworkPolicyEvaluation (effective policy rule prediction) Feb 6, 2024
return ns, pod
}

func NewNetworkPolicyEvaluation(args map[string]string) (runtime.Object, error) {
Contributor

I thought this package was meant for output transforms. For parameter transform (which is a new concept), maybe it should be in a separate package?

Contributor Author

Created a new package parameter under antctl.
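
As an illustration of what a parameter transform does, the sketch below turns the `-S`/`-D` style arguments into a structured value. It is not the PR's implementation: `parsePodRef`, `evaluationArgs`, and `newEvaluationArgs` are hypothetical names, the map keys `source` and `destination` are assumed, falling back to the `default` namespace for a bare Pod name is an assumption, and the real function returns a `runtime.Object` rather than a plain struct.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePodRef (hypothetical) splits "namespace/pod" into its parts.
// Assumption: a bare "pod" falls back to the "default" namespace.
func parsePodRef(s string) (ns, pod string, err error) {
	parts := strings.Split(s, "/")
	switch {
	case len(parts) == 1 && parts[0] != "":
		return "default", parts[0], nil
	case len(parts) == 2 && parts[0] != "" && parts[1] != "":
		return parts[0], parts[1], nil
	default:
		return "", "", fmt.Errorf("invalid pod reference %q, want NAMESPACE/POD", s)
	}
}

// evaluationArgs is a hypothetical stand-in for the request object.
type evaluationArgs struct {
	SrcNamespace, SrcPod string
	DstNamespace, DstPod string
}

// newEvaluationArgs converts the raw flag map into a structured request.
// The keys "source" and "destination" are assumed for this sketch.
func newEvaluationArgs(args map[string]string) (*evaluationArgs, error) {
	srcNS, srcPod, err := parsePodRef(args["source"])
	if err != nil {
		return nil, fmt.Errorf("source: %w", err)
	}
	dstNS, dstPod, err := parsePodRef(args["destination"])
	if err != nil {
		return nil, fmt.Errorf("destination: %w", err)
	}
	return &evaluationArgs{
		SrcNamespace: srcNS, SrcPod: srcPod,
		DstNamespace: dstNS, DstPod: dstPod,
	}, nil
}

func main() {
	a, err := newEvaluationArgs(map[string]string{"source": "ns1/pod1", "destination": "ns2/pod2"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", *a)
}
```

Keeping this conversion separate from the output transforms matches the distinction raised in the comment above: one direction shapes the request, the other shapes the response.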

@qiyueyao qiyueyao force-pushed the ods-antctl branch 3 times, most recently from 47108e1 to 319cab6 Compare February 9, 2024 02:06
return ns, pod
}

func NewNetworkPolicyEvaluation(args map[string]string) (runtime.Object, error) {
Member

should it be moved to transform/networkpolicy like the type of the response? otherwise this file would include all parameters of discrete commands.

Contributor Author

@qiyueyao Feb 14, 2024

The function was moved here as a new package parameter based on this comment, if I understood it correctly.

Contributor

We should probably find a better package name / location in a future PR though

Contributor Author

Sure, do you have any suggestion or insight? To include it inside transform or refactor some of the existing structs?

Contributor

Thinking about it some more, maybe we could keep it in transform/networkpolicy, but as a separate file to clearly mark the difference? Maybe we could have transform/networkpolicy/request.go and transform/networkpolicy/response.go?

Contributor Author

Sounds good, will open a PR for this.

@@ -214,6 +216,7 @@ func installAPIGroup(s *APIServer, c completedConfig) error {
cpv1beta2Storage["appliedtogroups"] = appliedToGroupStorage
cpv1beta2Storage["networkpolicies"] = networkPolicyStorage
cpv1beta2Storage["networkpolicies/status"] = networkPolicyStatusStorage
cpv1beta2Storage["networkpolicyevaluation"] = networkPolicyEvaluationStorage
Member

should we use "networkpolicies/evaluation" as the path to indicate their relationship? Like "pods/log", "pods/attach", "pods/exec" APIs?

Contributor Author

I ran some tests and found that, since NetworkPolicyEvaluation is currently a resource, using networkpolicies/evaluation would require making evaluation a subresource of networkpolicies. Unlike status and scale, however, NetworkPolicyEvaluation is not tied to a particular NetworkPolicy at input, so I could not provide a NetworkPolicy name for the subresource. I think it may have to stay a separate resource, like appliedtogroups.

Member

makes sense

Contributor Author

qiyueyao commented Feb 15, 2024

Depends on #5989
Merged and rebased.

Comment on lines +283 to +292
if len(commonRules) > 0 {
	commonRule = commonRules[0]
	// filter Antrea-native policy rules with Pass action
	// if pass rule currently has the highest precedence, skip the remaining rules
	// until the next K8s rule or Baseline rule, or return the pass rule otherwise
	isPass := func(ruleInfo *controlplane.NetworkPolicyRule) bool {
		return ruleInfo.Action != nil && *ruleInfo.Action == crdv1beta1.RuleActionPass
	}
	if isPass(commonRule.Rule) {
		for _, rule := range commonRules[1:] {
			if rule.Policy.SourceRef.Type == controlplane.K8sNetworkPolicy ||
				(rule.Policy.TierPriority != nil && *rule.Policy.TierPriority == BaselineTierPriority && !isPass(rule.Rule)) {
				commonRule = rule
				break
			}
		}
	}
}
return
Member

Is the following equivalent?

for _, rule := range commonRules {
	if rule.Policy.SourceRef.Type == controlplane.K8sNetworkPolicy ||
		*rule.Policy.TierPriority == BaselineTierPriority ||
		rule.Rule.Action == nil || *rule.Rule.Action != crdv1beta1.RuleActionPass {
		return rule
	}
}
return nil

Contributor Author

It seems not. We discussed two cases: 1) If a Pass rule comes first but no qualifying rule is found after it, the first Pass rule should be returned. 2) If a Pass rule comes first, we need to skip all ACNP/ANNP rules until a K8s rule or Baseline rule appears, whereas the snippet above returns the next non-Pass rule.

Member

makes sense
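
The two cases discussed in this thread can be checked with a small standalone sketch. It uses simplified, hypothetical types (`rule`, `kind`, `pickEffective`) in place of the controlplane structures, and assumes the input is already sorted from highest to lowest precedence, as `commonRules` appears to be in the reviewed snippet.

```go
package main

import "fmt"

type kind int

const (
	antreaNative kind = iota // ACNP/ANNP rule in a non-Baseline tier
	k8sPolicy                // Kubernetes NetworkPolicy rule
	baseline                 // Antrea-native rule in the Baseline tier
)

// rule is a hypothetical, simplified stand-in for a matched policy rule.
type rule struct {
	name string
	kind kind
	pass bool // true when the rule action is Pass
}

// pickEffective expects rules sorted from highest to lowest precedence.
// If the first rule is a Pass, evaluation skips ahead to the first K8s rule
// or non-Pass Baseline rule; if none exists, the Pass rule itself is returned.
func pickEffective(rules []rule) *rule {
	if len(rules) == 0 {
		return nil
	}
	chosen := &rules[0]
	if !chosen.pass {
		return chosen
	}
	for i := 1; i < len(rules); i++ {
		r := &rules[i]
		if r.kind == k8sPolicy || (r.kind == baseline && !r.pass) {
			return r
		}
	}
	return chosen
}

func main() {
	rules := []rule{
		{name: "acnp-pass", kind: antreaNative, pass: true},
		{name: "acnp-drop", kind: antreaNative},
		{name: "k8s-allow", kind: k8sPolicy},
	}
	fmt.Println(pickEffective(rules).name)
}
```

Here the skipped `acnp-drop` rule is the difference from the loop proposed in the review, which would stop at the next non-Pass rule regardless of its kind.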

Adds a versioned API and antctl query for NetworkPolicy
evaluation that returns the predicted effective NetworkPolicy
rule, which affects traffic from ns1/pod1 to ns2/pod2.

Signed-off-by: Qiyue Yao <yaoq@vmware.com>
Member

@tnqn left a comment

LGTM

Contributor

@Dyanngg left a comment

LGTM, thanks for addressing all the comments on this giant PR

Member

tnqn commented Feb 29, 2024

@jianjuns @antoninbas do you have other comments?

@antoninbas
Contributor

@tnqn I took another quick look, LGTM

@antoninbas antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Mar 1, 2024
Member

tnqn commented Mar 4, 2024

/test-all

Member

tnqn commented Mar 4, 2024

Ignoring the following tests:

@tnqn tnqn merged commit 5396f58 into antrea-io:main Mar 4, 2024
49 of 55 checks passed
@qiyueyao qiyueyao deleted the ods-antctl branch March 4, 2024 19:59
Labels
action/release-note Indicates a PR that should be included in release notes. area/component/antctl Issues or PRs releated to the command line interface component
6 participants