add support for root filesystem image transfer via uftp, and md5sum c… #3560

Open
wants to merge 8 commits into
base: master

@rich350 rich350 commented Jul 28, 2017

This supersedes pull request #3545. I removed my changes from master, created the branch rich350_uftp, and added my changes to that branch.

xcatbot commented Jul 28, 2017

CI CHECK RESULT:
> PR FORMAT WARNING: Missing milestone. Missing labels.
> BUILD SUCCESSFUL
> INSTALL XCAT SUCCESSFUL
> CODE SYNTAX CORRECT
> FAST REGRESSION TEST Successful: Total cases 215, Passed 215, Failed 0

debug "sleeping $SLEEP seconds"
sleep $SLEEP # wait for additional nodes that might also want this image
debug "copying $ROOTIMG"
$UFTP -I $INTERFACE -R 500000 -B 104857600 -b 8800 -D /rootimg.cpio.gz $ROOTIMG
Contributor

how about -R -1?

       -R txrate
              The transmission speed in Kbps. Specifying -1 for this value
              results in data being sent as fast as the network interface will
              allow. Using a value of -1 is recommended only if the network
              path between the server and all clients is as fast as the
              server's local interface, and works best in a gigabit
              environment. Default is 1000 Kbps. Ignored if -C is given any
              value other than "none".

It seems the block_size should be less than the MTU:


       -b block_size
              Specifies the size of a data block. This value should be around
              100-200 bytes less than the path MTU to provide ample room for
              all headers and extensions, up to and including the IP and UDP
              headers. Prior to version 4.0, this option specified the MTU
              and calculated the block size based on that. Default is 1300.
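
As a rough illustration of the man page guidance, the block size could be derived from the MTU instead of being hard-coded. This is only a sketch: the helper name `block_size_for_mtu` and the 200-byte headroom are assumptions (the man page suggests a margin of 100-200 bytes), not something taken from the PR.

```shell
# Hypothetical helper: pick a uftp block size (-b) from an MTU.
# The 200-byte default headroom is an assumption, the upper end of
# the 100-200 byte margin that the man page recommends.
block_size_for_mtu() {
    mtu=$1
    headroom=${2:-200}
    echo $((mtu - headroom))
}

block_size_for_mtu 1500   # standard Ethernet MTU
block_size_for_mtu 9000   # jumbo frames
```

With a 1500-byte MTU this gives 1300, which matches the uftp default, while 8800 only fits under a 9000-byte (jumbo frame) MTU. On Linux the local interface MTU can be read from /sys/class/net/$INTERFACE/mtu, but that is the MTU of the local interface, not necessarily of the whole path.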

rich350 commented Jul 31, 2017

-R -1 gives similar results:

-rw-r--r-- 1 root root 3541421170 Jul 28 22:12 /install/netboot/crest-compute/rhels7.3/20170726/rootimg.cpio.gz

uftp -I enP3p5s0f0 -R -1 -D /rootimg.cpio.gz /install/netboot/crest-compute/rhels7.3/20170726/rootimg.cpio.gz

Transfer status:
Host: 0xAC1ECC04 Status: Completed time: 32.555 seconds
Total elapsed time: 32.555 seconds
Overall throughput: 106231.95 KB/s
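
To compare this figure with the -R setting, which the man page gives in Kbps, the measured transfer can be converted to a bit rate. The following is a minimal sketch, assuming 1 Kbps = 1000 bits per second (the man page excerpt does not say which convention uftp uses); `achieved_kbps` is a hypothetical helper name.

```shell
# Hypothetical helper: average bit rate of a transfer in Kbps.
# Assumes 1 Kbps = 1000 bits per second.
# awk is used because POSIX shell arithmetic is integer-only.
achieved_kbps() {
    awk -v bytes="$1" -v secs="$2" \
        'BEGIN { printf "%.0f\n", bytes * 8 / secs / 1000 }'
}

# File size (bytes) and elapsed time (seconds) from the transfer above.
achieved_kbps 3541421170 32.555
```

Under that assumption the transfer averaged roughly 870,000 Kbps, which is close to what a 1Gb/s interface can carry.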

It would be interesting to see what happens when the sender has a 10Gb/s interface and the receiver has a 1Gb/s interface. I suspect this would overflow the buffers on the switch unless -R 950000 is used.
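
One way to express this is to cap the rate at a fraction of the slowest link in the path. The sketch below is an illustration only: `safe_rate_kbps` and the 95 percent factor are assumptions that mirror the -R 950000 suggestion for 1Gb/s receivers, and are not part of the PR.

```shell
# Hypothetical helper: derive a uftp -R value (Kbps) from the speed
# of the slowest link in Mb/s, leaving some headroom.
# The 95 percent default mirrors the -R 950000 suggestion for a
# 1Gb/s receiver; it is an assumption, not a measured optimum.
safe_rate_kbps() {
    slowest_mbps=$1
    percent=${2:-95}
    echo $((slowest_mbps * 1000 * percent / 100))
}

# 10Gb/s sender, 1Gb/s receivers: the receivers set the limit.
safe_rate_kbps 1000
```

For a 1Gb/s receiver this yields 950000. On Linux the local link speed in Mb/s can usually be read from /sys/class/net/<interface>/speed; the speed of the receivers would have to be known or configured separately.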

Regarding block size, it actually does work with -b 8800 and MTU=1500. Perhaps IP fragmentation and reassembly of the UDP datagrams is handling this. In any event, I agree that the block size should be less than the MTU. Since most compute nodes probably have a 1Gb/s interface, the safest setting would be the default block size and -R 950000.

Richard Ray added 2 commits July 31, 2017 17:26
packimage.pm now associates a unique port with an image, which is passed
to xcatroot, so uftpd knows on which port to listen.  The port is added
to the request to uftp-listener, which parses it and starts the transfer
only to nodes listening on that port.  Tested on crest1 and crest2 here.
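
The commit message does not show how packimage.pm picks the port. Purely as an illustration of the idea of one stable port per image, the sketch below keeps a small registry file that maps image paths to ports. Everything in it is hypothetical: the helper name `port_for_image`, the registry format, and the base port 20000. It also ignores concurrent writers and assumes image paths without whitespace.

```shell
# Hypothetical sketch: assign each image a stable, distinct port by
# recording "image port" pairs in a registry file.
# The body runs in a subshell "( ... )" so its variables stay local.
port_for_image() (
    registry=$1
    image=$2
    base=${3:-20000}   # assumed base port, not taken from the PR
    touch "$registry"
    # Reuse the port if this image is already registered.
    port=$(awk -v img="$image" '$1 == img { print $2; exit }' "$registry")
    if [ -z "$port" ]; then
        # Otherwise take the next port after those already assigned.
        count=$(wc -l < "$registry")
        port=$((base + count))
        printf '%s %s\n' "$image" "$port" >> "$registry"
    fi
    echo "$port"
)

reg_file=$(mktemp)
port_for_image "$reg_file" imageA   # first image gets the base port
port_for_image "$reg_file" imageB   # second image gets the next port
port_for_image "$reg_file" imageA   # same image, same port again
rm -f "$reg_file"
```

The point of the registry is that a repeated lookup for the same image returns the same port, so uftpd on the nodes and uftp-listener on the server can agree on it.
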
"sum|s" => \$domd5sum,
"uftp|u" => \$douftp,
"port|d=s" => \$uftpport,
"delay|t=s" => \$uftpdelay,
Contributor

Hi @rich350, what is the argument "delay|t=s" => \$uftpdelay for?

Author

This is an amount of time to wait to make sure all booting nodes have gone through their hardware initialization, loaded the kernel and initrd, started xcatroot, and started uftpd. For example, let's say we are booting lots of nodes at the same time to the same image. Each node will initialize, but maybe a few nodes take longer to load a particular driver for whatever reason. So, with each boot there will be a fastest node, and a slowest node. The fastest node will start uftpd and be the first to send a request to uftp-listener for the image. If uftp-listener immediately starts the transfer after the first request, the slowest node might not have started uftpd yet, and would therefore miss the transfer. So, uftpdelay is the amount of time that uftp-listener should wait before starting the transfer, once it receives the first request from the node that booted the fastest.
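
A small worked example of that window, as a simplified sketch: request times are given in whole seconds after the first request, and a node is treated as ready if its request arrives within the delay. The helper name `classify_nodes` and the sample times are made up for illustration.

```shell
# Hypothetical illustration: given the delay (seconds) and each
# node's request time measured from the first request, report which
# nodes are ready when the transfer starts. Whole seconds only.
classify_nodes() {
    delay=$1
    shift
    for t in "$@"; do
        if [ "$t" -le "$delay" ]; then
            echo "t=${t}s: joins multicast transfer"
        else
            echo "t=${t}s: misses transfer"
        fi
    done
}

# Delay of 30 seconds; four nodes request at 0, 4, 12 and 45 seconds.
classify_nodes 30 0 4 12 45
```

Here the first three nodes are covered by the 30 second window and the node at 45 seconds is not. A shorter delay starts the transfer sooner but covers fewer slow nodes.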

Since there is a timeout built into the uftp transfer logic in xcatroot, both uftp-listener and xcatroot need to be aware of uftpdelay, so it is passed as a boot parameter. I added the timeout so that if some node is really slow, doesn't get uftpd started before the transfer starts, and then sends its request after the transfer has started, it will time out and fall back to wget. The default of 30 seconds is probably overkill and could probably be safely reduced to 15 or even 10 seconds, but this will depend on the particular hardware, networking, number of nodes, etc. Since this will vary by site, I went with a safe default that can be tuned.
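
The timeout-then-fallback pattern can be sketched as follows. This is not the xcatroot code: `fetch_image` is a hypothetical helper, the two commands are stand-ins for the uftpd-based receive and the wget download, and it assumes a `timeout` utility (GNU coreutils or BusyBox) is available.

```shell
# Hypothetical sketch of "try multicast, fall back to unicast".
# fetch_image LIMIT PRIMARY FALLBACK
#   LIMIT    seconds to wait for the primary command
#   PRIMARY  command standing in for the uftpd-based receive
#   FALLBACK command standing in for the wget download
fetch_image() {
    limit=$1
    primary=$2
    fallback=$3
    if timeout "$limit" sh -c "$primary"; then
        echo "primary"
    elif sh -c "$fallback"; then
        echo "fallback"
    else
        echo "failed"
        return 1
    fi
}

# Stand-in commands: the "transfer" takes longer than the limit,
# so the fallback command runs instead.
fetch_image 1 'sleep 3' 'true'
```

In xcatroot the limit would presumably be derived from uftpdelay plus some margin; the exact value used by the PR is not shown in this conversation.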

immarvin commented Aug 1, 2018

Hi @rich350, have you signed the CLA? http://xcat-docs.readthedocs.io/en/stable/developers/license/contributors.html

We can only accept your PR after you sign the CLA. Thanks.

rich350 commented Aug 1, 2018 via email

@whowutwut
Member

@rich350 Thanks! Yes, you are covered under the CCLA. We just need to make that connection and tag your GitHub ID so we know. I'll discard the individual CLA.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.


Richard Ray does not seem to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
