Client: don't start jobs that will exceed RAM #5642

Merged
merged 1 commit into from
Jun 6, 2024

davidpanderson
Contributor

Fixes #5641

@AenBleidd AenBleidd merged commit 72ae94b into master Jun 6, 2024
127 checks passed
@AenBleidd AenBleidd deleted the dpa_ewss branch June 6, 2024 22:28
AenBleidd added a commit to AenBleidd/boinc that referenced this pull request Jun 16, 2024
Client: don't start jobs that will exceed RAM
@Chamiu

Chamiu commented Jun 16, 2024

Oh?

Up to 8.0.2, it could process 4 WUs of DENIS@home at 16GBx4%, but after I installed 8.0.3dev, 1 of 2 WUs was waiting for memory.
One DENIS@home WU needs 1.7MB.
8.0.3dev needs 16GBx6%.

8.0.3dev wants 50% more RAM than 8.0.2 at start.
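
As a quick check of the arithmetic in the figures above, here is a minimal sketch. It assumes that "16GBx4%" and "16GBx6%" mean 4% and 6% of 16 GB of installed RAM; the function and variable names are made up for illustration.

```python
# Figures quoted in the comment above. Reading "16GBx4%" as
# "4% of 16 GB of RAM" is an assumption, not something the thread confirms.
TOTAL_RAM_GB = 16


def ram_share_gb(percent, total_gb=TOTAL_RAM_GB):
    """Return the amount of RAM, in GB, that `percent` of the total represents."""
    return total_gb * percent / 100


old_gb = ram_share_gb(4)  # figure reported for 8.0.2
new_gb = ram_share_gb(6)  # figure reported for 8.0.3dev
increase_percent = (new_gb - old_gb) / old_gb * 100

print(f"8.0.2: {old_gb:.2f} GB, 8.0.3dev: {new_gb:.2f} GB, "
      f"increase: {increase_percent:.0f}%")
```

Under that reading, going from 4% to 6% is the 50% increase described.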

Should I verify it?
0. Set it to run 2 WUs at all times.

  1. Downgrade to 8.0.2.
  2. Check memory usage (BOINC Manager and 2 WUs) in Activity Monitor.app.
  3. Upgrade to 8.0.3dev.
  4. Check memory usage (BOINC Manager and 2 WUs) in Activity Monitor.app.

Environment : macOS Sonoma.5 on MacBook Air Late 2020 (Apple M1)

Does only macOS behave in the opposite way?
Can't it be verified without checking the situation on each OS?

@davidpanderson
Contributor Author

I don't understand the above.

What is rsc_memory_bound for those jobs?
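
For context on why this value matters, here is a minimal sketch of the general idea in the PR title: do not start a job if the jobs started so far plus the new one would exceed the available RAM. This is not the client's actual code. The `Job` class, the `select_jobs_to_start` function, and the idea of using each job's declared memory bound as its estimate are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    memory_bound_bytes: int  # hypothetical stand-in for the job's rsc_memory_bound


def select_jobs_to_start(jobs, ram_budget_bytes):
    """Greedily pick jobs, in order, whose declared memory bounds fit the budget.

    Returns (started, waiting). Jobs that do not fit are left waiting
    rather than started.
    """
    started, waiting = [], []
    remaining = ram_budget_bytes
    for job in jobs:
        if job.memory_bound_bytes <= remaining:
            started.append(job)
            remaining -= job.memory_bound_bytes
        else:
            waiting.append(job)  # would wait for memory instead of starting
    return started, waiting


if __name__ == "__main__":
    GB = 1024 ** 3
    # Hypothetical numbers: two jobs of 1.2 GB each against a 2 GB budget.
    jobs = [Job("wu_1", int(1.2 * GB)), Job("wu_2", int(1.2 * GB))]
    started, waiting = select_jobs_to_start(jobs, 2 * GB)
    print("started:", [j.name for j in started])
    print("waiting:", [j.name for j in waiting])
```

With a scheme like this, a large declared memory bound would keep a job waiting even if its real usage is much smaller, which is why the declared value is worth checking.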

@Chamiu

Chamiu commented Jun 16, 2024

[Image: BOINC 8.0.2]
[Image: BOINC 8.0.3dev]

@RichardHaselgrove
Contributor

Assistance please. I have a Linux Mint 20.3 machine running BOINC 8.0.2 under systemd from @LocutusOfBorg PPA. I'm working with Glenn to help him adjust to the BOINC ecosystem (he has an HPC background), and am trying to test this PR for CPDN.

I have downloaded linux_client_5e0d0db47cad2f1f27df4ae69175d5b813c21e90.zip from the artifacts for this PR - but it won't run on this machine: systemd won't start it, but no other error message or reason given.

Have I found the right artifact? Is there another one I could try, or should I resort to a private build?

@AenBleidd
Member

@RichardHaselgrove, please use our own Linux installer: https://boinc.berkeley.edu/linux_install.php
Please keep in mind that this particular PR is in the 8.0.3 alpha version only.

@RichardHaselgrove
Contributor

OK, I'll try that when I've reported all current work, and re-confirmed the mapping between Ubuntu and Mint version numbers.

Of course I know that 8.0.3 is alpha and bespoke for CPDN - that's why we want to test it.

We might need a Plan C ...

@RichardHaselgrove
Contributor

Well, that was a complete disaster.

  1. re the instructions to install OpenCL: the response was
Package opencl-icd is a virtual package provided by:
  libnvidia-compute-418-server 418.226.00-0ubuntu0.20.04.2
  mesa-opencl-icd 21.2.6-0ubuntu0.1~20.04.2
  nvidia-opencl-icd-340 340.108-0ubuntu5.20.04.2
  libnvidia-compute-535-server 535.161.08-0ubuntu2.20.04.1
  libnvidia-compute-535 535.171.04-0ubuntu0.20.04.1
  libnvidia-compute-470-server 470.239.06-0ubuntu0.20.04.1
  libnvidia-compute-470 470.239.06-0ubuntu0.20.04.1
  libnvidia-compute-450-server 450.248.02-0ubuntu0.20.04.1
  libnvidia-compute-390 390.157-0ubuntu0.20.04.1
  pocl-opencl-icd 1.4-6
  intel-opencl-icd 20.13.16352-1
  beignet-opencl-icd 1.3.2-7build1
You should explicitly select one to install.

E: Package 'opencl-icd' has no installation candidate
  2. installed boinc-client boinc-manager

I now have the shared libraries boinc and boincmgr in /usr/bin (as before), but they are both datestamped Sun 19 Apr 2020 16:07:33 BST - I think that equates to about v7.16.6. And I can't start either of them.

Reverting to the PPA overnight.

@RichardHaselgrove
Contributor

And we're running again. The same files are now datestamped Tue 28 May 2024 04:18:35 BST, and 2 GPU and 4 CPU tasks are running. Manager and client agree that the version is 8.0.2.

Now about that Plan C ...

@AenBleidd
Member

@RichardHaselgrove, that's because we install the binaries to /usr/local/bin, since our packages are not officially distributed with the corresponding distro.
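
One way to see which copies are actually present is to look in both locations mentioned in this thread and compare datestamps. The following is a minimal sketch. The `find_binaries` helper is hypothetical, and the file names `boinc` and `boincmgr` and the two directories are taken from the comments above.

```python
import os
from datetime import datetime


def find_binaries(names=("boinc", "boincmgr"),
                  dirs=("/usr/bin", "/usr/local/bin")):
    """Return (path, modification time) for each named file found in dirs."""
    found = []
    for directory in dirs:
        for name in names:
            path = os.path.join(directory, name)
            if os.path.isfile(path):
                mtime = datetime.fromtimestamp(os.path.getmtime(path))
                found.append((path, mtime))
    return found


if __name__ == "__main__":
    for path, mtime in find_binaries():
        print(f"{path}  (modified {mtime:%Y-%m-%d %H:%M:%S})")
```

If copies show up in both directories, which one runs depends on the order of the directories in `PATH` and on which path the service definition points at.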

@RichardHaselgrove
Contributor

RichardHaselgrove commented Jun 18, 2024

Well, where did it get v7.16.6 from, then? And why wouldn't that even run? If there are hidden gotchas like that in the 'new' instructions, you need to be explicit about them.

image
No files!

@AenBleidd
Member

@RichardHaselgrove, for me the instructions are quite clear, but I'm a developer, and I see things from a slightly different perspective. So let's make the instructions better together ;)

  1. For the user there should not be any difference where the binaries are located. We're putting them in that particular location because this is a recommended location for the packages that are not provided with a distro.
  2. Everything should work OOTB ('should' doesn't mean it does), at least in my case all the files are located in the folder I specified.
  3. If in your case it's different, then that means you haven't installed our package. I do not know if the PPA provides version 8.0.2, but you can try to update to 8.0.3 from the alpha and see if you get it (please keep in mind that if you had boinc running as a daemon before the upgrade, you need to either restart the boinc-client service manually or restart the machine).

@RichardHaselgrove
Contributor

OK, but not tonight - I've only just got back from a long road trip, and I'm knackered - barely thinking straight. Which is not the right time to do complicated software installs.

The PPA currently offers v8.0.2, so that's working as it's supposed to. I'm very used to running stop / disable / enable / start, to keep systemd happy around system restarts.

Perhaps the biggest take-home from today is that we need to consider both brand-new installs, and upgrades from old and messy working boxes - that'll be the difficult one to write.

@AenBleidd
Member

> Perhaps the biggest take-home from today is that we need to consider both brand-new installs, and upgrades from old and messy working boxes - that'll be the difficult one to write.

This is something I have tried to address already in 8.0.2.
I have checked this manually some time ago, and it was working fine for me.
But if you find anything additional - then we can improve that together.
Thank you for your help!

Development

Successfully merging this pull request may close these issues.

Client starts set of jobs too large for system memory
4 participants