Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Multicast] Add Multicast statistics API #3354

Merged
merged 1 commit into from
May 18, 2022

Conversation

ceclinux
Copy link
Contributor

@ceclinux ceclinux commented Feb 24, 2022

This PR adds a field Multicast for control-plane API NodeStatsSummary
and field MulticastGroup for public statistics API group.

Signed-off-by: ceclinux src655@gmail.com

For #3294

@ceclinux ceclinux changed the title Add Multicast statistics API and controlplane stats API [Multicast] Add Multicast statistics API and controlplane stats API Feb 24, 2022
@ceclinux ceclinux changed the title [Multicast] Add Multicast statistics API and controlplane stats API [Multicast] Add Multicast statistics API Feb 24, 2022
@ceclinux ceclinux force-pushed the multicast_stats_api branch 2 times, most recently from e2b7ec3 to 2291d90 Compare February 24, 2022 03:01
@codecov-commenter
Copy link

codecov-commenter commented Feb 24, 2022

Codecov Report

Merging #3354 (594715f) into main (3a51abe) will decrease coverage by 21.32%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##             main    #3354       +/-   ##
===========================================
- Coverage   63.38%   42.06%   -21.33%     
===========================================
  Files         280      325       +45     
  Lines       39750    51694    +11944     
===========================================
- Hits        25197    21745     -3452     
- Misses      12593    28330    +15737     
+ Partials     1960     1619      -341     
Flag Coverage Δ
integration-tests 38.22% <ø> (?)
kind-e2e-tests ?
unit-tests 43.73% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/apis/controlplane/types.go 0.00% <ø> (ø)
pkg/log/log.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/ovs/openflow/default.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/util/runtime/runtime.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/agent/openflow/groups.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/ovs/openflow/ofctrl_pipeline.go 0.00% <0.00%> (-95.84%) ⬇️
pkg/ovs/openflow/logs.go 10.34% <0.00%> (-89.66%) ⬇️
pkg/agent/controller/networkpolicy/reject.go 0.00% <0.00%> (-85.30%) ⬇️
pkg/agent/nodeportlocal/k8s/annotations.go 0.00% <0.00%> (-84.45%) ⬇️
pkg/apis/controlplane/register.go 0.00% <0.00%> (-83.34%) ⬇️
... and 313 more

@ceclinux ceclinux force-pushed the multicast_stats_api branch 2 times, most recently from 149a55b to 949db7e Compare March 8, 2022 13:36
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question: why adding MulticastGroupMember/MulticastGroupMemberList in both pkg/apis/stats/types.go and pkg/apis/stats/v1alpha1/types.go? I think adding in pkg/apis/stats/v1alpha1/types.go is enough.

MulticastStats []MulticastStats
}

// MulticastStats contains the information for Multicast Groups that a pod has joined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/MulticastStats contains the information for Multicast Groups that a pod has joined/MulticastStats contains the multicast groups that a Pod has joined/s

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated

@ceclinux
Copy link
Contributor Author

One question: why adding MulticastGroupMember/MulticastGroupMemberList in both pkg/apis/stats/types.go and pkg/apis/stats/v1alpha1/types.go? I think adding in pkg/apis/stats/v1alpha1/types.go is enough.

From the @tnqn reply in #1221

@ceclinux please take a look at https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md which explains why an internal version is required. Actually it's not a must right now but will be needed once we have another API version.

After reading the docs provided, I think we should have the internal structs changes(/stats/v1alpha1/types.go) and versioned api changes(pkg/apis/stats/v1alpha1/types.go) in this pr.

@ceclinux
Copy link
Contributor Author

/test-multicast-e2e

@ceclinux ceclinux mentioned this pull request Mar 23, 2022
12 tasks
@ceclinux ceclinux requested a review from liu4480 April 13, 2022 02:52
@ceclinux ceclinux added this to the Antrea v1.7 release milestone Apr 24, 2022
@ceclinux ceclinux requested a review from wenyingd April 26, 2022 02:10
pkg/apis/controlplane/v1beta2/types.go Outdated Show resolved Hide resolved
pkg/apis/controlplane/v1beta2/types.go Outdated Show resolved Hide resolved
pkg/apis/controlplane/v1beta2/types.go Outdated Show resolved Hide resolved
// MulticastGroupMember contains the mapping between Multicast Group and Pods.
type MulticastGroupMember struct {
metav1.TypeMeta
metav1.ObjectMeta
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use standard K8s API for this, what would be the name of the object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the the name of the object would be same as group IP

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also think it is not necessary to have a single field for name also namespace.

Pod *PodReference `json:"pod,omitempty" protobuf:"bytes,1,opt,name=pod"`

// List of multicast IPs that the pod has joined.
GroupJoinedList []IPAddress `json:"groupJoinedList,omitempty" protobuf:"bytes,2,rep,name=groupJoinedList"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe rename it "GroupList"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. The whole struct has changed.

}

// MulticastStats contains the multicast groups that a Pod has joined.
type MulticastStats struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks this struc is only for the group list per Pod? But the public API is shown pod list per group in MulticastGroupMember. Why use a different orgnization here?

Do you plan to show the multicast traffic stats in this API on the scenario no ANP is applied to the Pod?

Copy link
Contributor Author

@ceclinux ceclinux Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks this struc is only for the group list per Pod? But the public API is shown pod list per group in MulticastGroupMember. Why use a different orgnization here?

Updated. Please check #3354 (comment)

Do you plan to show the multicast traffic stats in this API on the scenario no ANP is applied to the Pod?

No sure I understand your question. the group pods relation will be affected by multicast np. If igmp is block by np, the pod is not considered joined to the group in the final antctl get multicaststats representation. no ANP and ANP with allow action to the igmp are considered to be the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean do you want to show the packet number in the MulticastStats as you only list the Pod count of a Group in this struct? Or the the multicast packet numbers are only shown in the result of ANP stats? But if a Multicast group is not applied with any ANP, we can't get the traffic numbers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to check traffic statistics not covered by ANP, you can check the command antctl get multicaststats running in antrea-agent, as described here #3294.

@@ -325,6 +325,16 @@ type NodeStatsSummary struct {
AntreaClusterNetworkPolicies []NetworkPolicyStats `json:"antreaClusterNetworkPolicies,omitempty" protobuf:"bytes,3,rep,name=antreaClusterNetworkPolicies"`
// The TrafficStats of Antrea NetworkPolicies collected from the Node.
AntreaNetworkPolicies []NetworkPolicyStats `json:"antreaNetworkPolicies,omitempty" protobuf:"bytes,4,rep,name=antreaNetworkPolicies"`
// List of PodMulticast statistics collected from the Node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Multicast stats API should start from version v1alpha1 if you don't change the feature gate version.

Besides, does this new kind need to add into the function addConversionFuncs https://github.com/antrea-io/antrea/blob/main/pkg/apis/controlplane/v1beta2/conversion.go#L28

Copy link
Contributor Author

@ceclinux ceclinux Apr 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Multicast stats API should start from version v1alpha1 if you don't change the feature gate version.

I think the the API changed I made at controlplane in this pr fit the case described here Adding Unstable Features to Stable Versions https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#adding-unstable-features-to-stable-versions

Besides, does this new kind need to add into the function addConversionFuncs https://github.com/antrea-io/antrea/blob/main/pkg/apis/controlplane/v1beta2/conversion.go#L28

I checked the auto-generated func Convert_v1beta2_MulticastStats_To_controlplane_MulticastStats and Convert_controlplane_MulticastStats_To_v1beta2_MulticastStats, both LGTM. Could you explain why we need add addConversionFuncs? Actually addConversionFuncs confuses me because delete the one kind from this func doesn't affect the generated code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn Could you take a look at these questions? Not sure my description was correct.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could follow https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md#new-field-in-existing-api-version?
The field could be added to the struct, but whether it will be collected, filled, and consumed depends on whether MulticastStats feature gate is enabled.
If we want to reuse the NodeStatsSummary API to report multicast stats data, we cannot really have another version starting from v1alpha1.

@ceclinux ceclinux force-pushed the multicast_stats_api branch 3 times, most recently from 8c3e41d to 0ade78f Compare April 27, 2022 09:27
@ceclinux ceclinux requested review from tnqn and wenyingd April 27, 2022 09:28
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall

@@ -342,6 +342,16 @@ type NodeStatsSummary struct {
AntreaClusterNetworkPolicies []NetworkPolicyStats `json:"antreaClusterNetworkPolicies,omitempty" protobuf:"bytes,3,rep,name=antreaClusterNetworkPolicies"`
// The TrafficStats of Antrea NetworkPolicies collected from the Node.
AntreaNetworkPolicies []NetworkPolicyStats `json:"antreaNetworkPolicies,omitempty" protobuf:"bytes,4,rep,name=antreaNetworkPolicies"`
// List of PodMulticast statistics collected from the Node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could we make the comment consistent with the one in pkg/apis/controlplane/types.go?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@@ -85,6 +85,34 @@ type NetworkPolicyStats struct {
TrafficStats TrafficStats `json:"trafficStats,omitempty" protobuf:"bytes,2,opt,name=trafficStats"`
}

// +genclient
// +resourceName=multicastgroupmembers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this is not required. It was added to xxxstats struct because its plural automatically generated is not correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// MulticastGroupMember contains the mapping between Multicast Group and Pods.
type MulticastGroupMember struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we plan to have more information for MulticastGroup that can be exposed by this API?
If no, current API name and group work for me.
If yes, I wonder if we should just use single API for all information specific to MulticastGroup, name it MulticastGroup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, we don't have plan to expose more group-level multicast information.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be called MulticastGroupMembers, but I also feel a more generic name like MulticastGroup or MulticastGroupInfo is better for future extensions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

Changed struct name to MulticastGroup. Trade explicitness for extensibility seems to be a good deal for me.

@ceclinux
Copy link
Contributor Author

ceclinux commented May 11, 2022

Currently, my implementation is antrea-agents sending whole multicast stats to the controller, which aligns with the control plane API change in the pr.
I am thinking about whether I should shift implementation to an incremental style for multicast stats, maybe something like

// MulticastStats contains the multicast groups that a Pod has joined.
type MulticastStats struct {
	// Group is the IP of Multicast Group.
	Group string `json:"group,omitempty" protobuf:"bytes,1,opt,name=group"`
	// PodsPatches is the list of MulticastGroupPatch that describes changes of pod members.
	PodsPatches []MulticastGroupPatch `json:"podsPatches,omitempty" protobuf:"bytes,2,rep,name=podsPatches"`
}

// MulticastGroupPatch describes the incremental update for the pod members with a group.
type MulticastGroupPatch struct {
	AddedGroupMembers   []PodReference `protobuf:"bytes,1,rep,name=addedGroupMembers"`
	RemovedGroupMembers []PodReference `protobuf:"bytes,2,rep,name=removedGroupMembers"`
}

pros and cons:

  1. whole multicast stats style: easier, may hit performance issue.
  2. incremental style: harder to code, need to take extra effort to solve antrea-agent restart and antrea-controller restart issue, better performance.

@tnqn @wenyingd @liu4480

@wenyingd
Copy link
Contributor

pros and cons:

  1. whole multicast stats style: easier, may hit performance issue.
  2. incremental style: harder to code, need to take extra effort to solve antrea-agent restart and antrea-controller restart issue, better performance.

@tnqn @wenyingd @liu4480

I prefer agent to send the full update ( containing all latest statistics ) every time.

@tnqn
Copy link
Member

tnqn commented May 12, 2022

Incremental style may not just work: antrea-controller doesn't persist data, using incremental update will also introduce the problem when it's time to do a full sync (how does agent know antrea-controller restarts).

A periodical sync of a list of Pods that have joined a group seems fine to me. There were only 110 Pods on a Node at most by K8s default configuration, even they are all for multicast, the data was just few KBs per group, considering each Pod object in K8s also has few KBs.

@tnqn tnqn added api-review Categorizes an issue or PR as actively needing an API review. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API. action/release-note Indicates a PR that should be included in release notes. labels May 12, 2022
@tnqn tnqn requested review from antoninbas and jianjuns May 12, 2022 04:12
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall, it's a bit strange to me that we use the stats terminology and the existing stats API to report multicast group membership information


// MulticastStats contains the multicast groups that a Pod has joined.
type MulticastStats struct {
// Group is the IP of Multicast Group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Group is the IP of Multicast Group.
// Group is the IP of the multicast group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

type MulticastStats struct {
// Group is the IP of Multicast Group.
Group string
// Pods is the list of Pods that has joined the multicast group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Pods is the list of Pods that has joined the multicast group.
// Pods is the list of Pods that have joined the multicast group.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Multicast []MulticastStats
}

// MulticastStats contains the multicast groups that a Pod has joined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// MulticastStats contains the multicast groups that a Pod has joined.
// MulticastStats contains the list of Pods that has joined a multicast group, for a given Node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

Comment on lines 343 to 344
// The stats related to multicast collected from the Node.
Multicast []MulticastStats
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me stats implies things like counters. Right now it seems that we only provide group membership information. Is that going to change in the future? If not, this should probably be renamed. @jianjuns @tnqn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand it correctly, you were implying MulticastStats should not be put in NodeStatsSummary API because it doesn't include real stats(only group-pod membership) for now. It should be added to another controlplane API.

For me creating a new API may seem like overkill. Also, We may add extra stats-like fields in this struct in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Antonin means the struct should not be named as MulticastStats as it has nothing to do with stats. If there is no plan to add statistics to the struct in future, probably name it MulticastGroupInfo.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the same API (NodeStatsSummary) is good, even though in a future version maybe the API should be renamed to something more generic (e.g., NodeSummary). But yes, you don't have any stats in MulticastStats at the moment, and MulticastGroupInfo sounds like a more appropriate name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks for the explanation.

@ceclinux ceclinux force-pushed the multicast_stats_api branch 2 times, most recently from 403c2c6 to 1e0024f Compare May 15, 2022 09:27
@@ -340,6 +340,16 @@ type NodeStatsSummary struct {
AntreaClusterNetworkPolicies []NetworkPolicyStats
// The TrafficStats of Antrea NetworkPolicies collected from the Node.
AntreaNetworkPolicies []NetworkPolicyStats
// The stats related to multicast collected from the Node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revise the comments too: The stats ... -> Multicast group information

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks

Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits.

@@ -340,6 +340,16 @@ type NodeStatsSummary struct {
AntreaClusterNetworkPolicies []NetworkPolicyStats
// The TrafficStats of Antrea NetworkPolicies collected from the Node.
AntreaNetworkPolicies []NetworkPolicyStats
// Multicast group information related to multicast collected from the Node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should remove "related to multicast"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Multicast []MulticastGroupInfo `json:"multicast,omitempty" protobuf:"bytes,5,rep,name=multicast"`
}

// MulticastGroupInfo contains the list of Pods that has joined a multicast group, for a given Node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

has -> have

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

// +genclient:nonNamespaced
// +k8s:deepcopy-gen:interfaces=k8s.io/apimachinery/pkg/runtime.Object

// MulticastGroup contains the mapping between Multicast Group and Pods.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel no need to capitalize "Multicast Group".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

jianjuns
jianjuns previously approved these changes May 17, 2022
Copy link
Contributor

@jianjuns jianjuns left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Others might have more comments.

antoninbas
antoninbas previously approved these changes May 17, 2022
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as well

@ceclinux ceclinux requested a review from tnqn May 18, 2022 01:57
@ceclinux ceclinux dismissed stale reviews from antoninbas and jianjuns via ee0d87c May 18, 2022 06:27
@tnqn
Copy link
Member

tnqn commented May 18, 2022

/test-all

tnqn
tnqn previously approved these changes May 18, 2022
@tnqn
Copy link
Member

tnqn commented May 18, 2022

/test-e2e

@tnqn
Copy link
Member

tnqn commented May 18, 2022

@ceclinux I just realized the commit message is out of date, please update the PR description with correct one, I will use it when merging the commit.

This PR adds a field Multicast for control-plane API NodeStatsSummary
and field MulticastGroup for public statistics API group.

Signed-off-by: ceclinux <src655@gmail.com>
@ceclinux
Copy link
Contributor Author

@ceclinux I just realized the commit message is out of date, please update the PR description with correct one, I will use it when merging the commit.

updated.

@tnqn
Copy link
Member

tnqn commented May 18, 2022

/skip-all

@tnqn tnqn merged commit c23a20c into antrea-io:main May 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. api-review Categorizes an issue or PR as actively needing an API review. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants