-
Notifications
You must be signed in to change notification settings - Fork 362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added initial support for P2P-based image distribution #2303
Conversation
Codecov ReportAttention: ❗ Your organization needs to install the Codecov GitHub app to enable full functionality.
... and 165 files with indirect coverage changes 📢 Thoughts on this report? Let us know!. |
You have done a great job and deserve praise. Including technology selection, coding and comparative experiments, it is relatively complete. However, the amount of code is a bit too large. For the convenience of the Reviewer, please take some time to describe the main methods of each file in this comments. |
Thank you for your comment. I'll briefly describe what is going on behind the scenes. ClientA piece of program that acts as a P2P client is required on each target server. The implementation of this client can be found at
P2P DistributorThis part includes modification to sealer itself. Most logics are located in The distributor complies with the
I also created some methods to assist the implementation, namely:
|
Does this PR affect the original distribution logic? In other words, if I do not enable P2P mode, is the program path the same as before your modification? |
No, the PR does not affect the original logic. Every occurrence of the P2P distributor is guarded by an if statement, and if the condition is not met (i.e. the user does not enable the P2P distributor), the program does exactly what it does without the P2P distributor. The aforementioned logic can be found at |
Pls check: " |
All fixed. Please restart CI to check if the warnings persist. |
Still some error. |
pls rebase upstream/main into your branch |
Done. Now the latest commit in upstream/main is on my branch. |
pkg/cluster-runtime/installer.go
Outdated
@@ -148,8 +149,14 @@ func (i *Installer) Install() error { | |||
} | |||
|
|||
// distribute rootfs | |||
if err := i.Distributor.Distribute(all, rootfs); err != nil { | |||
return err | |||
if i.P2PDistributor != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we use distibutor as the interface and initialize different instances, there's no need to make an if else decision here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a temporary solution. In the future, I plan to introduce multicast distribution as well, so multiple distributors will be running in one cluster. SFTP-based distributor will be the fallback method (when other methods won't work due to environmental limitations or failure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even so, we should solve this problem at the implementation level, not at the use level.
maybe we could have an intelligent distributor implementation that invokes several other distributors and have the ability to fallback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even so, we should solve this problem at the implementation level, not at the use level.
maybe we could have an intelligent distributor implementation that invokes several other distributors and have the ability to fallback.
Okay, I'll unify the two distributors here for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Is there anything else that I need to pay attention to? If no I'll be squashing my commits shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just do it 😄
@dynos01 In general, a PR should contain only one commit, and if multiple commits are included, they should contain different features. |
I'll squash all the commits once I complete everything. Thanks for reviewing! |
@starnop Thank you for reviewing my code. Before I squash and push the final commit, let me know if there's anything else that should be fixed. |
36c0c73
to
11fa836
Compare
cmd/dist-receiver/main.go
Outdated
} | ||
|
||
if err := plugins.Inject(); err != nil { | ||
return fmt.Errorf("error initializing plugins: %s", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initializing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo here, all occurrences fixed
cmd/dist-receiver/main.go
Outdated
func main() { | ||
bootstrap := flag.String("bootstrap", "", "Specify the bootstrap node") | ||
cidArg := flag.String("cid", "", "Specify the CID") | ||
fileName := flag.String("file", "", "Specify the file name") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if file is a good parameter name so that the user can quickly understand what value should be passed for this parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so do with the comments for the file
flag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, this part is only intended to be invoked by sealer, not a user. Nevertheless, I've updated them for better clarity.
cmd/dist-receiver/main.go
Outdated
bootstrap := flag.String("bootstrap", "", "Specify the bootstrap node") | ||
cidArg := flag.String("cid", "", "Specify the CID") | ||
fileName := flag.String("file", "", "Specify the file name") | ||
targetDir := flag.String("dir", "", "Specify the target directory") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
cmd/dist-receiver/main.go
Outdated
} | ||
|
||
err = files.WriteTo(rootNode, *fileName) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err := files.WriteTo(rootNode, *fileName); err!=nil {
xxxxx
}
Please also take a look at similar code globally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, replaced every similar place in my code. However I do note that there are still quite some similar snippets in sealer's codebase, which I'll be not fixing now since they are not a part of my commit.
cmd/dist-receiver/main.go
Outdated
panic(fmt.Errorf("could not get file with CID: %s", err)) | ||
} | ||
|
||
if err := os.RemoveAll(*fileName); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you please have a comment here?
and, in my opinion is it necessary to clean this file? What about overwriting the original file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this should be OK, since the following line will truncate the file if it is already present. Removed this part.
var p2p bool | ||
switch runFlags.Distributor { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we start writing the same code a second time, we can consider whether we should extract it as a public method
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's perfectly reasonable. However this part will be rewritten soon with the introduction of the other distribution methods I mentioned yesterday, I'd keep it simple here for now. The p2p variable itself will be gone then. There will be a set of logics which will be abstracted as a public method.
pkg/infradriver/ssh_infradriver.go
Outdated
for _, host := range cluster.Spec.Hosts { | ||
if err = mergo.Merge(&host.SSH, &cluster.Spec.SSH); err != nil { | ||
for i, host := range cluster.Spec.Hosts { | ||
if err = mergo.Merge(&cluster.Spec.Hosts[i].SSH, &cluster.Spec.SSH); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the need for this code change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is not related to my commit, but it does cause lint errors, so I fixed it to pass CI.
https://github.com/sealerio/sealer/actions/runs/6611224807/job/17954685797
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In theory, this PR should only contain your own code, you can get the latest code through git rebase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I'll drop these changes then.
pkg/registry/configurator.go
Outdated
@@ -52,7 +52,8 @@ func NewConfigurator(deployHosts []net.IP, | |||
containerRuntimeInfo containerruntime.Info, | |||
regConfig v2.Registry, | |||
infraDriver infradriver.InfraDriver, | |||
distributor imagedistributor.Distributor) (Configurator, error) { | |||
distributor imagedistributor.Distributor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
go fmt code
Generally, in code style, we don't use ) as the first character of a line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops sorry, I was mainly using Rust which encourages this style. Fixed.
pkg/registry/installer.go
Outdated
@@ -50,7 +50,8 @@ type Installer interface { | |||
func NewInstaller(currentDeployHost []net.IP, | |||
regConfig *v2.LocalRegistry, | |||
infraDriver infradriver.InfraDriver, | |||
distributor imagedistributor.Distributor) Installer { | |||
distributor imagedistributor.Distributor, | |||
) Installer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
BTW, in general, before the code is reviewed by the reviewers, we should develop a good habit, that is, review our own code firstly, and try to ensure that there are no unnecessary changes. Of course, good modifications are welcome
utils/ssh/ssh.go
Outdated
@@ -88,13 +88,13 @@ func NewSSHClient(ssh *v1.SSH, alsoToStdout bool) Interface { | |||
|
|||
// GetHostSSHClient is used to executed bash command and no std out to be printed. | |||
func GetHostSSHClient(hostIP net.IP, cluster *v2.Cluster) (Interface, error) { | |||
for _, host := range cluster.Spec.Hosts { | |||
for i, host := range cluster.Spec.Hosts { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
@dynos01 please try to run |
Well... I tried, but
Currently, none of these errors belong to my commit. However they are strange: I reviewed some errors and realized the errors appeared to be false alarms. How is this possible? Investigating on... I've pushed a new commit with all the changes you requested before. |
@dynos01 I have submit a new PR to fix the lint error for main branch. Please rebase the code after it was merged. |
Got it. I'll wait for your PR to be merged. |
Pls Rebase |
@starnop I'm trying to solve the issue of Based on my tests, With the upgrade of And of course, there are a few places Since I should be focusing on my own code, would you mind helping me:
And I'll push my fixed code. If everything goes well, this time all Github Actions should pass. Thank you! |
A separate PR might be better 😄 |
@dynos01 BTW, please rebase the code before you push it. |
Signed-off-by: Yinuo Deng <i@dyn.im>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
LGTM |
Describe what this PR does / why we need it
This PR provides P2P-based image distribution on top of the current SFTP-based approach, as described in #2278.
P2P-based content distribution has gained great popularity in the recent years. By using P2P-based content distribution, the load on the source provider significantly decreases as the entire system scales up. For example, the source would like to distribute 1GiB of content to 10 receivers, then it has to send 10GiB in total. However, with P2P used, the theoretical lower bound of sent bytes is only 1GiB. Other 9GiB of data is transmitted between the receivers.
Does this pull request fix one issue?
Describe how you did it
The design relies on Kubo (formerly known as IPFS), a popular P2P network implementation in Go. Since P2P-based distribution is just introduced, it is not used by default. If the user chooses to use it, then Sealer will:
Describe how to verify it
dist-receiver
to the destination servers (usually to/usr/bin
)--p2p-distribution
when runningsealer run
Special notes for reviews
I have completed preliminary tests on the result (as the following figure shows). P2P-based distribution gradually gains advantage when the number of servers increase. (The tests are done under 1Gbit/s network offered by a cloud provider)
Due to the nature of how P2P works, Sealer will use slightly more memory when using P2P-based distributions. In my experienments, with 10 destinations (peers) added, Sealer uses about 70MiB more of memory compared with SFTP-based distribution. However, The increase shouldn't be linear, as the base algorithm P2P networks use, Kademlia, uses a data structure called "K-bucket" to manage peers. This data structure does not keep track of every peer it discovers. When the number of peers approaches infinite, the memory usage will remain almost constant.
Since the integration of an alternative distribution system involves fundamental changes, P2P-based distribution shouldn't be used as the default method right now, if this PR gets merged.
This PR also bumps some dependencies. For quicker review, please refer to
pkg/imagedistributor/p2p_distributor.go
andcmd/dist-receiver/main.go
, where the main logics can be found.